macrolactone-toolkit/README.md
lingyuzeng 5e7b236f31 feat(toolkit): ship macro_lactone_toolkit package
Unify macrolactone detection, numbering, fragmentation, and
splicing under the installable macro_lactone_toolkit package.

- replace legacy src.* modules with the new package layout
- add analyze/number/fragment CLI entrypoints and pixi tasks
- migrate tests, README, and scripts to the new package API
2026-03-18 22:06:45 +08:00

# macro_lactone_toolkit
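`macro_lactone_toolkit` is an installable Python package for detecting valid 12- to 20-membered macrolactones, numbering their ring atoms, cleaving side chains, and performing simple splice-back reassembly.
## Core capabilities
- Automatically detects valid 12- to 20-membered macrolactones by default; an explicit `ring_size` can also be specified
- Ring numbering follows a fixed rule:
  - Position 1 = the lactone carbonyl carbon
  - Position 2 = the ester oxygen in the ring
  - Positions 3 to N = numbered consecutively in one consistent direction
- Side-chain cleavage outputs two sets of SMILES:
  - `fragment_smiles_labeled`, e.g. `[5*]`
  - `fragment_smiles_plain`, e.g. `*`
- The bond between each dummy atom and its attachment atom keeps the original bond type
- Provides a proper CLI:
  - `macro-lactone-toolkit analyze`
  - `macro-lactone-toolkit number`
  - `macro-lactone-toolkit fragment`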
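
The numbering rule above can be illustrated with a small standalone sketch. This is not the toolkit's implementation: the function name `number_ring` and its inputs (a list of ring atom indices in bonding order, plus the indices of the carbonyl carbon and the ester oxygen) are hypothetical and chosen only to show how the rule fixes both the starting atom and the direction.

```python
def number_ring(ring_atoms, carbonyl_c, ester_o):
    """Map ring atom indices to positions 1..N (hypothetical helper).

    ring_atoms: atom indices in cyclic bonding order, in either direction.
    carbonyl_c: index of the lactone carbonyl carbon (position 1).
    ester_o:    index of the ring ester oxygen (position 2).
    """
    n = len(ring_atoms)
    start = ring_atoms.index(carbonyl_c)
    # Pick the walking direction that visits the ester oxygen right after
    # the carbonyl carbon, so that it receives position 2.
    if ring_atoms[(start + 1) % n] == ester_o:
        step = 1
    elif ring_atoms[(start - 1) % n] == ester_o:
        step = -1
    else:
        raise ValueError("ester oxygen must be adjacent to the carbonyl carbon")
    # Positions 3..N simply continue in the same direction.
    return {ring_atoms[(start + step * i) % n]: i + 1 for i in range(n)}


# A made-up 12-membered ring: atom 14 is the carbonyl carbon, atom 13 the ester oxygen.
ring = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21]
positions = number_ring(ring, carbonyl_c=14, ester_o=13)
print(positions[14], positions[13], positions[12])  # 1 2 3
```

Because the ester oxygen must sit at position 2, the direction is never ambiguous: only one of the two ways around the ring satisfies the rule.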
## Environment
`pixi` is recommended. The project is pinned to Python 3.12 and supports `osx-arm64` and `linux-64`.
```bash
pixi install
pixi run pytest
pixi run python -c "import macro_lactone_toolkit"
```
## Python API
```python
from macro_lactone_toolkit import MacroLactoneAnalyzer, MacrolactoneFragmenter

# Find the ring sizes that qualify as valid macrolactone rings
analyzer = MacroLactoneAnalyzer()
valid_ring_sizes = analyzer.get_valid_ring_sizes("O=C1CCCCCCCCCCCCCCO1")

# Number the ring atoms, then cleave side chains from a methyl-substituted ring
fragmenter = MacrolactoneFragmenter()
numbering = fragmenter.number_molecule("O=C1CCCCCCCCCCCCCCO1")
result = fragmenter.fragment_molecule("O=C1CCCC(C)CCCCCCCCCCO1", parent_id="mol_001")
```
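
To make the relationship between `fragment_smiles_labeled` and `fragment_smiles_plain` concrete, here is a minimal sketch using only the standard library. It assumes that the plain form equals the labeled form with the numeric dummy label removed; the toolkit may well derive both strings from the molecule object instead. The helper names `strip_dummy_labels` and `dummy_labels` are hypothetical and not part of the package API.

```python
import re

# Matches a numbered dummy atom such as [5*] or [12*].
_LABELED_DUMMY = re.compile(r"\[(\d+)\*\]")


def strip_dummy_labels(labeled_smiles):
    """Turn a labeled fragment SMILES into its plain counterpart (assumption)."""
    return _LABELED_DUMMY.sub("*", labeled_smiles)


def dummy_labels(labeled_smiles):
    """Return the numeric labels carried by the dummy atoms, in string order."""
    return [int(label) for label in _LABELED_DUMMY.findall(labeled_smiles)]


plain_single = strip_dummy_labels("[5*]C")    # methyl side chain, single bond
plain_double = strip_dummy_labels("[12*]=O")  # the double bond is left untouched
print(plain_single, plain_double, dummy_labels("[5*]C"))  # *C *=O [5]
```

The second example shows why the bond symbol matters: the bond to the dummy atom is written explicitly, so dropping the label does not change the bond type.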
## CLI
Single-molecule analysis:
```bash
pixi run macro-lactone-toolkit analyze --smiles "O=C1CCCCCCCCCCCCCCO1"
pixi run macro-lactone-toolkit number --smiles "O=C1CCCCCCCCCCCCCCO1"
pixi run macro-lactone-toolkit fragment --smiles "O=C1CCCC(C)CCCCCCCCCCO1" --parent-id mol_001
```
CSV batch processing:
```bash
pixi run macro-lactone-toolkit fragment \
--input molecules.csv \
--output fragments.csv \
--errors-output fragment_errors.csv
```
By default the `smiles` column is read. If an `id` column exists, it is used as `parent_id`; otherwise `row_<index>` is generated automatically.
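
The column defaults can be sketched as follows. This is an illustration with the standard `csv` module, not the CLI's actual code: the function name `read_molecules` is hypothetical, and the zero-based `<index>` is an assumption, since the source does not say where the row index starts.

```python
import csv
import io


def read_molecules(handle):
    """Yield (parent_id, smiles) pairs from a CSV file handle (hypothetical helper)."""
    reader = csv.DictReader(handle)
    if reader.fieldnames is None or "smiles" not in reader.fieldnames:
        raise ValueError("input CSV needs a 'smiles' column")
    has_id = "id" in reader.fieldnames
    for index, row in enumerate(reader):
        # Assumption: generated IDs count data rows from 0.
        parent_id = row["id"] if has_id else f"row_{index}"
        yield parent_id, row["smiles"]


with_ids = list(read_molecules(io.StringIO("id,smiles\nmol_001,O=C1CCCCCCCCCCCCCCO1\n")))
without_ids = list(read_molecules(io.StringIO("smiles\nCCO\nCCC\n")))
print(with_ids[0][0], [pid for pid, _ in without_ids])  # mol_001 ['row_0', 'row_1']
```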
## Legacy Scripts
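The `scripts/` directory is kept only as thin wrappers or migration notices and no longer hosts the core implementation. The official interfaces are `macro_lactone_toolkit.*` and the `macro-lactone-toolkit` CLI.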