macrolactone-toolkit/docs/SUMMARY.md
lingyuzeng c0ead42384 feat(toolkit): add classification and migration
Implement the standard/non-standard/not-macrolactone classification layer
and integrate it into analyzer, fragmenter, and CLI outputs.

Port the remaining legacy package capabilities into new visualization and
workflow modules, restore batch/statistics/SDF scripts on top of the flat
CSV workflow, and update active docs to the new package API.
2026-03-18 23:56:41 +08:00


Macro Split Documentation Summary

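All official interfaces of this repository live under src/macro_lactone_toolkit/. The core capabilities are:

  • MacroLactoneAnalyzer
    • Molecule-level classification: standard_macrolactone / non_standard_macrocycle / not_macrolactone
    • Detection of 12- to 20-membered macrolactones
    • Batch statistics, DataFrame classification, dynamic SMARTS, basic physicochemical properties
  • MacrolactoneFragmenter
    • Standard macrolactone ring numbering
    • Side-chain cleavage
    • Flat JSON/CSV output
  • macro_lactone_toolkit.visualization
    • SVG/PNG images of numbered molecules
    • SVG/PNG images of fragments
  • macro_lactone_toolkit.workflows
    • Batch fragmentation from CSV
    • FragmentationResult to DataFrame conversion
    • JSON export
    • Export of numbered images plus an annotation CSV
  • macro_lactone_toolkit.splicing
    • Preprocessing of generic macrolactone scaffolds
    • Fragment activation and splicing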
Recommended way to get started:

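from macro_lactone_toolkit import MacroLactoneAnalyzer, MacrolactoneFragmenter
from macro_lactone_toolkit.workflows import fragment_csv, results_to_dataframe

# Illustrative input (not from the original doc): an unsubstituted 14-membered lactone ring.
smiles = "O=C1CCCCCCCCCCCCO1"

# Classify a single molecule.
analyzer = MacroLactoneAnalyzer()
classification = analyzer.classify_macrocycle(smiles)

# Number the ring and cleave side chains for a single molecule.
fragmenter = MacrolactoneFragmenter()
result = fragmenter.fragment_molecule(smiles, parent_id="mol_001")

# Batch-fragment a CSV file, then flatten the results into a DataFrame.
results = fragment_csv("molecules.csv")
fragments_df = results_to_dataframe(results)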
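
When classifying many molecules, it can help to tally how many fall into each of the three categories listed above. The following is a minimal sketch that works on plain label strings only; it does not call the toolkit, and it assumes (this is not confirmed by this summary) that the classification step can be reduced to one of the three label strings per molecule. The names `CATEGORIES`, `tally_labels`, and `example_labels` are hypothetical.

```python
from collections import Counter

# The three category labels named in this summary.
CATEGORIES = (
    "standard_macrolactone",
    "non_standard_macrocycle",
    "not_macrolactone",
)


def tally_labels(labels):
    """Count labels per category; raise if a label is not one of the three."""
    unknown = sorted(set(labels) - set(CATEGORIES))
    if unknown:
        raise ValueError(f"unexpected classification labels: {unknown}")
    counts = Counter(labels)
    # Always report all three categories, including those with zero hits.
    return {category: counts.get(category, 0) for category in CATEGORIES}


# Hypothetical labels, standing in for per-molecule classification output.
example_labels = [
    "standard_macrolactone",
    "not_macrolactone",
    "standard_macrolactone",
]
summary = tally_labels(example_labels)
```

Reporting zero counts explicitly keeps the summary shape stable across batches, which makes it easier to compare runs.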

Recommended script workflow:

python scripts/batch_process.py --input molecules.csv --output fragments.csv --errors-output errors.csv
python scripts/analyze_fragments.py --input fragments.csv --output-dir analysis
python scripts/generate_sdf_and_statistics.py --input fragments.csv --output-dir sdf_output
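
The three commands above can also be chained from Python so that a failure in one step stops the rest. This is a minimal sketch, not part of the toolkit: the helper names `build_pipeline_commands` and `run_pipeline` are hypothetical, and the script paths and flags are copied verbatim from the commands listed above. It assumes it is run from the repository root.

```python
import subprocess
import sys


def build_pipeline_commands(
    input_csv="molecules.csv",
    fragments_csv="fragments.csv",
    errors_csv="errors.csv",
    analysis_dir="analysis",
    sdf_dir="sdf_output",
):
    """Return the three script invocations as argv lists, in execution order."""
    python = sys.executable  # reuse the interpreter running this helper
    return [
        [python, "scripts/batch_process.py",
         "--input", input_csv,
         "--output", fragments_csv,
         "--errors-output", errors_csv],
        [python, "scripts/analyze_fragments.py",
         "--input", fragments_csv,
         "--output-dir", analysis_dir],
        [python, "scripts/generate_sdf_and_statistics.py",
         "--input", fragments_csv,
         "--output-dir", sdf_dir],
    ]


def run_pipeline(commands):
    """Run each command in order; check=True raises on the first non-zero exit."""
    for argv in commands:
        subprocess.run(argv, check=True)


# Usage (from the repository root):
#     run_pipeline(build_pipeline_commands())
```

Passing argv lists instead of a shell string avoids quoting problems when file names contain spaces.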

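All active docs and scripts are based on macro_lactone_toolkit.*. Historical notebook (.ipynb) snapshots are kept for archival reference but no longer serve as documentation of the current API.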