# macro_lactone_toolkit
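
`macro_lactone_toolkit` is an installable Python package for detecting valid 12- to 20-membered macrolactones, numbering their ring atoms, cleaving side chains, and performing simple splice-back reassembly.

## Core capabilities
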
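- Automatically detects valid 12- to 20-membered macrolactones by default; `ring_size` can also be specified explicitly
- Ring numbering follows a fixed rule:
  - Position 1 = the lactone carbonyl carbon
  - Position 2 = the ester oxygen in the ring
  - Positions 3 to N = numbered consecutively in the same direction
- Side-chain cleavage emits two SMILES forms for each fragment:
  - `fragment_smiles_labeled`, for example `[5*]`
  - `fragment_smiles_plain`, for example `*`
  - The dummy atom keeps the original bond type to its attachment atom
- Provides a formal CLI:
  - `macro-lactone-toolkit analyze`
  - `macro-lactone-toolkit number`
  - `macro-lactone-toolkit fragment`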
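
The numbering rule listed above can be illustrated with a minimal, standard-library sketch. It is not the package's implementation: `number_ring` is a hypothetical helper, and it assumes the ring atoms are already available in cyclic order together with the indices of the carbonyl carbon and the ring ester oxygen.

```python
from typing import Dict, List


def number_ring(ring_atoms: List[int], carbonyl_c: int, ester_o: int) -> Dict[int, int]:
    """Map atom index -> 1-based ring position for one lactone ring.

    Hypothetical helper for illustration only.
    ring_atoms: atom indices in cyclic order (last atom bonded to the first).
    carbonyl_c: index of the lactone carbonyl carbon (becomes position 1).
    ester_o:    index of the ring ester oxygen (becomes position 2).
    """
    n = len(ring_atoms)
    start = ring_atoms.index(carbonyl_c)
    # Choose the traversal direction that reaches the ester oxygen next.
    if ring_atoms[(start + 1) % n] == ester_o:
        step = 1
    elif ring_atoms[(start - 1) % n] == ester_o:
        step = -1
    else:
        raise ValueError("ester oxygen is not adjacent to the carbonyl carbon in the ring")
    # Keep walking in that direction so positions 3..N are consecutive.
    return {ring_atoms[(start + step * k) % n]: k + 1 for k in range(n)}


# Toy 12-membered ring: atom 7 is the carbonyl carbon, atom 6 the ring oxygen.
ring = list(range(12))
positions = number_ring(ring, carbonyl_c=7, ester_o=6)
print(positions[7], positions[6], positions[5])  # 1 2 3
```

Because the direction is fixed by the first step from the carbonyl carbon to the ester oxygen, the same ring always receives the same numbering regardless of the order in which its atoms were listed.
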
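
To make the difference between the two SMILES forms concrete, here is a small sketch that derives a plain-dummy string from a labeled one. `strip_dummy_labels` is a hypothetical helper, not part of the package API, and it assumes the two forms differ only in the numeric label on the dummy atom; the toolkit's own output may also differ in atom ordering.

```python
import re

# Matches a dummy atom that carries a numeric label, such as [5*] or [12*].
_LABELED_DUMMY = re.compile(r"\[\d+\*\]")


def strip_dummy_labels(fragment_smiles_labeled: str) -> str:
    """Replace every labeled dummy atom ([N*]) with a plain dummy atom (*).

    Hypothetical helper for illustration only. Bond symbols next to the
    dummy atom are left untouched, so the original bond type is preserved.
    """
    return _LABELED_DUMMY.sub("*", fragment_smiles_labeled)


print(strip_dummy_labels("[5*]C"))    # *C
print(strip_dummy_labels("[13*]=O"))  # *=O
```
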
## Environment

`pixi` is the recommended way to set up the environment. The project is pinned to Python 3.12 and supports the `osx-arm64` and `linux-64` platforms.

```bash
pixi install
pixi run pytest
pixi run python -c "import macro_lactone_toolkit"
```

## Python API

```python
from macro_lactone_toolkit import MacroLactoneAnalyzer, MacrolactoneFragmenter

# Find which ring sizes form a valid macrolactone in this molecule.
analyzer = MacroLactoneAnalyzer()
valid_ring_sizes = analyzer.get_valid_ring_sizes("O=C1CCCCCCCCCCCCCCO1")

# Number the ring atoms, then cleave the side chains of a substituted ring.
fragmenter = MacrolactoneFragmenter()
numbering = fragmenter.number_molecule("O=C1CCCCCCCCCCCCCCO1")
result = fragmenter.fragment_molecule("O=C1CCCC(C)CCCCCCCCCCO1", parent_id="mol_001")
```

## CLI

Single-molecule analysis:

```bash
pixi run macro-lactone-toolkit analyze --smiles "O=C1CCCCCCCCCCCCCCO1"
pixi run macro-lactone-toolkit number --smiles "O=C1CCCCCCCCCCCCCCO1"
pixi run macro-lactone-toolkit fragment --smiles "O=C1CCCC(C)CCCCCCCCCCO1" --parent-id mol_001
```

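
When the CLI is driven from a Python script, building the argument list explicitly avoids shell-quoting problems with the parentheses and `=` signs that SMILES strings contain. The sketch below is only an illustration: `build_fragment_command` is a hypothetical helper, and it uses only the `fragment` subcommand and the `--smiles` and `--parent-id` flags shown above.

```python
import shlex
from typing import List


def build_fragment_command(smiles: str, parent_id: str) -> List[str]:
    """Return the argv list for one `fragment` call (hypothetical helper)."""
    return [
        "pixi", "run", "macro-lactone-toolkit", "fragment",
        "--smiles", smiles,
        "--parent-id", parent_id,
    ]


argv = build_fragment_command("O=C1CCCC(C)CCCCCCCCCCO1", parent_id="mol_001")
# shlex.join renders a copy-pasteable shell command with safe quoting.
print(shlex.join(argv))
```

The list can be passed as-is to `subprocess.run(argv, check=True)`, which bypasses the shell entirely.
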
CSV batch processing:

```bash
pixi run macro-lactone-toolkit fragment \
  --input molecules.csv \
  --output fragments.csv \
  --errors-output fragment_errors.csv
```

By default the `smiles` column is read. If an `id` column exists, its value is used as `parent_id`; otherwise `row_<index>` is generated automatically.

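
The column convention for batch input can be sketched with the standard `csv` module. This is not the package's reader: `iter_molecules` is a hypothetical helper, and the 0-based row index and the fallback for an empty `id` cell are assumptions, since the text does not say how `<index>` is counted.

```python
import csv
import io
from typing import Iterator, TextIO, Tuple


def iter_molecules(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (parent_id, smiles) pairs from a CSV file object.

    Hypothetical helper for illustration only.
    """
    reader = csv.DictReader(handle)
    for index, row in enumerate(reader):
        # Assumption: a missing or empty `id` falls back to row_<index>, 0-based.
        parent_id = row.get("id") or f"row_{index}"
        yield parent_id, row["smiles"]


# "CCO" and "CCC" are placeholder SMILES, not macrolactones.
with_ids = io.StringIO("id,smiles\nmol_001,CCO\n")
without_ids = io.StringIO("smiles\nCCO\nCCC\n")
print(list(iter_molecules(with_ids)))     # [('mol_001', 'CCO')]
print(list(iter_molecules(without_ids)))  # [('row_0', 'CCO'), ('row_1', 'CCC')]
```
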
## Legacy Scripts
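
The `scripts/` directory is kept only as thin wrappers or migration notices and no longer holds the core implementation. The supported interfaces are `macro_lactone_toolkit.*` and the `macro-lactone-toolkit` CLI.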