lingyuzeng c0ead42384 feat(toolkit): add classification and migration

Implement the standard/non-standard/not-macrolactone classification layer
and integrate it into analyzer, fragmenter, and CLI outputs.

Port the remaining legacy package capabilities into new visualization and
workflow modules, restore batch/statistics/SDF scripts on top of the flat
CSV workflow, and update active docs to the new package API.

2026-03-18 23:56:41 +08:00

1.7 KiB


Macro Split Documentation Summary

All official interfaces of this repository live under src/macro_lactone_toolkit/. The core capabilities are:

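MacroLactoneAnalyzer
- Molecule-level classification: standard_macrolactone / non_standard_macrocycle / not_macrolactone
- Detection of 12- to 20-membered macrolactones
- Batch statistics, DataFrame classification, dynamic SMARTS, basic physicochemical properties
MacrolactoneFragmenter
- Standard macrolactone ring numbering
- Side-chain cleavage
- Flat JSON/CSV output
macro_lactone_toolkit.visualization
- SVG/PNG images of numbered molecules
- SVG/PNG images of fragments
macro_lactone_toolkit.workflows
- Batch fragmentation from CSV
- Conversion of FragmentationResult to a DataFrame
- JSON export
- Export of numbered images together with an annotation CSV
macro_lactone_toolkit.splicing
- Generic macrolactone scaffold preprocessing
- Fragment activation and splicing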
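The three classification labels and the 12 to 20 ring-size window listed above can be illustrated with a small standalone sketch. It uses only the standard library and does not call macro_lactone_toolkit; the helper names `is_supported_ring_size` and `tally_labels` are hypothetical and exist only for this example.

```python
from collections import Counter

# Label strings copied from the capability list above.
LABELS = ("standard_macrolactone", "non_standard_macrocycle", "not_macrolactone")

# Ring-size window stated above (12- to 20-membered rings, inclusive).
MIN_RING_SIZE, MAX_RING_SIZE = 12, 20


def is_supported_ring_size(ring_size):
    """Return True when the ring size falls inside the 12-20 window."""
    return MIN_RING_SIZE <= ring_size <= MAX_RING_SIZE


def tally_labels(labels):
    """Count how often each known label occurs (hypothetical helper)."""
    labels = list(labels)
    unknown = sorted(set(labels) - set(LABELS))
    if unknown:
        raise ValueError(f"unknown classification labels: {unknown}")
    counts = Counter(labels)
    # Always report all three labels, including those with zero hits.
    return {label: counts.get(label, 0) for label in LABELS}
```

In practice the labels would come from the analyzer; here they are passed in as plain strings so that the sketch stays independent of the package.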

Recommended starting point:

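from macro_lactone_toolkit import MacroLactoneAnalyzer, MacrolactoneFragmenter
from macro_lactone_toolkit.workflows import fragment_csv, results_to_dataframe

# Example input (illustrative only): a simple 14-membered lactone ring.
smiles = "O=C1CCCCCCCCCCCCO1"

# Classify a single molecule.
analyzer = MacroLactoneAnalyzer()
classification = analyzer.classify_macrocycle(smiles)

# Fragment a single molecule.
fragmenter = MacrolactoneFragmenter()
result = fragmenter.fragment_molecule(smiles, parent_id="mol_001")

# Fragment every molecule in a CSV file and collect the fragments in a DataFrame.
results = fragment_csv("molecules.csv")
fragments_df = results_to_dataframe(results)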
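To make the idea of a "flat" output concrete, the sketch below turns nested per-molecule records into one row per fragment and writes them as CSV. It is a standard-library illustration only: the record layout and the column names (`parent_id`, `position`, `fragment_smiles`) are assumptions made for this example and are not the toolkit's actual schema.

```python
import csv
import io

# Hypothetical column names; the real toolkit schema may differ.
COLUMNS = ["parent_id", "position", "fragment_smiles"]


def flatten(records):
    """Turn nested per-molecule records into one flat row per fragment."""
    rows = []
    for record in records:
        for fragment in record["fragments"]:
            rows.append({
                "parent_id": record["parent_id"],
                "position": fragment["position"],
                "fragment_smiles": fragment["smiles"],
            })
    return rows


def rows_to_csv(rows):
    """Serialize flat rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up nested input: one parent molecule with two side-chain fragments.
nested = [
    {
        "parent_id": "mol_001",
        "fragments": [
            {"position": 3, "smiles": "C"},
            {"position": 7, "smiles": "CO"},
        ],
    }
]
flat_rows = flatten(nested)
csv_text = rows_to_csv(flat_rows)
```

Keeping the parent identifier on every row is what lets a flat file be filtered or grouped per molecule without any nested structure.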

Recommended script workflow:

python scripts/batch_process.py --input molecules.csv --output fragments.csv --errors-output errors.csv
python scripts/analyze_fragments.py --input fragments.csv --output-dir analysis
python scripts/generate_sdf_and_statistics.py --input fragments.csv --output-dir sdf_output
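The three commands above can also be chained from Python so that a failure in one step stops the rest. This is a minimal sketch: the script paths and flags are copied from the commands above, while the `build_pipeline` and `run_pipeline` helpers are hypothetical and not part of the repository.

```python
import subprocess
import sys


def build_pipeline(input_csv="molecules.csv", fragments_csv="fragments.csv",
                   errors_csv="errors.csv", analysis_dir="analysis",
                   sdf_dir="sdf_output"):
    """Return the three script invocations as argument lists, in run order."""
    python = sys.executable  # reuse the interpreter that runs this sketch
    return [
        [python, "scripts/batch_process.py", "--input", input_csv,
         "--output", fragments_csv, "--errors-output", errors_csv],
        [python, "scripts/analyze_fragments.py", "--input", fragments_csv,
         "--output-dir", analysis_dir],
        [python, "scripts/generate_sdf_and_statistics.py", "--input", fragments_csv,
         "--output-dir", sdf_dir],
    ]


def run_pipeline(commands):
    """Run each command in order; check=True raises on the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


# Usage, from the repository root:
# run_pipeline(build_pipeline())
```

Separating command construction from execution keeps the step list easy to inspect or log before anything is run.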

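All active documentation and scripts are based on macro_lactone_toolkit.*. Historical notebook (.ipynb) snapshots are kept for archival reference but no longer serve as current API documentation.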
