diff --git a/README.md b/README.md
index 8828c36..de8f97e 100644
--- a/README.md
+++ b/README.md
@@ -1,292 +1,67 @@
-# Macrolactone Fragmenter
+# macro_lactone_toolkit
 
-[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
-[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
-[![Pixi](https://img.shields.io/badge/pixi-managed-blue)](https://pixi.sh)
+`macro_lactone_toolkit` 是一个正式可安装的 Python 包，用于 12-20 元有效大环内酯的识别、环编号、侧链裂解和简单拼接回组装。
 
-一个专业的大环内酯（Macrolactone, 12-20 元环）侧链断裂和分析工具，提供从分子编号、侧链断裂到可视化和数据导出的完整工作流。
+## 核心能力
 
-## ✨ 主要特性
+- 默认自动识别 12-20 元有效大环内酯，也允许显式指定 `ring_size`
+- 环编号规则固定为：
+  - 位置 1 = 内酯羰基碳
+  - 位置 2 = 环上的酯键氧
+  - 位置 3-N = 沿统一方向连续编号
+- 侧链裂解同时输出两套 SMILES：
+  - `fragment_smiles_labeled`，例如 `[5*]`
+  - `fragment_smiles_plain`，例如 `*`
+- dummy 原子与连接原子的原始键型保持一致
+- 提供正式 CLI：
+  - `macro-lactone-toolkit analyze`
+  - `macro-lactone-toolkit number`
+  - `macro-lactone-toolkit fragment`
 
-- 🔢 **智能环原子编号** - 支持 12-20 元环，基于内酯结构的固定编号系统
-- ✂️ **自动侧链断裂** - 智能识别并断裂所有侧链，支持单个或批量处理
-- 📊 **强大的可视化** - SVG 显示 + PNG 保存，支持自定义 DPI 和尺寸
-- 💾 **多种导出格式** - JSON、CSV、DataFrame，预留 SQLModel 数据库接口
-- 🚀 **批量处理** - 直接处理 CSV 文件，支持 2000+ 分子的大规模分析
-- 📦 **易于安装** - 使用 pixi 或 pip 安装，支持开发模式
+## 环境
 
-## 🚀 快速开始
-
-### 安装
-
-**使用 Pixi（推荐）**
+推荐使用 `pixi`，项目已固定到 Python 3.12，并支持 `osx-arm64` 与 `linux-64`。
 
 ```bash
-# 安装 pixi
-curl -fsSL https://pixi.sh/install.sh | bash
-
-# 克隆项目并安装依赖
-git clone https://github.com/yourusername/macro_split.git
-cd macro_split
 pixi install
-
-# 激活环境
-pixi shell
-```
-
-**使用 Pip**
-
-```bash
-# 首先安装 RDKit（必须通过 conda）
-conda install -c conda-forge rdkit
-
-# 安装项目
-pip install -e .  # 开发模式
-```
-
-### 基本使用
-
-```python
-from src.macrolactone_fragmenter import MacrolactoneFragmenter
-
-# 创建 16 元环处理器
-fragmenter = MacrolactoneFragmenter(ring_size=16)
-
-# 处理单个分子
-smiles = "CCC1OC(=O)C[C@H](O)C(C)[C@@H](O)..."
-result = fragmenter.process_molecule(smiles, parent_id="mol_001")
-
-# 查看结果
-print(f"总碎片数: {len(result.fragments)}")
-for frag in result.fragments[:5]:
-    print(f"位置 {frag.cleavage_position}: {frag.fragment_smiles}")
-
-# 可视化
-svg = fragmenter.visualize_molecule(result.parent_smiles)
-
-# 导出数据
-fragmenter.save_to_json(result, "output/fragments.json")
-df = fragmenter.to_dataframe(result)
-```
-
-### 批量处理
-
-```python
-# 批量处理 CSV 文件
-results = fragmenter.process_csv(
-    "data/molecules.csv",
-    smiles_column="smiles",
-    id_column="unique_id",
-    max_rows=100
-)
-
-# 转换为 DataFrame 并保存
-df_all = fragmenter.batch_to_dataframe(results)
-df_all.to_csv("output/all_fragments.csv", index=False)
-```
-
-## 📖 文档
-
-完整的文档请访问：[在线文档](https://yourusername.github.io/macro_split/)
-
-或本地查看：
-
-```bash
-pixi run mkdocs serve
-# 访问 http://localhost:8000
-```
-
-### 文档内容
-
-- 📘 **[快速开始](docs/getting-started.md)** - 5 分钟上手指南
-- 📗 **[安装指南](docs/installation.md)** - 详细的安装说明
-- 📙 **[用户指南](docs/user-guide/index.md)** - 完整的使用说明
-- 📕 **[API 参考](docs/api/index.md)** - 自动生成的 API 文档
-- 📔 **[教程与示例](docs/tutorials/index.md)** - 实际使用案例
-
-## 🎯 核心功能
-
-### MacrolactoneFragmenter 类
-
-高级封装类，提供一站式解决方案：
-
-```python
-fragmenter = MacrolactoneFragmenter(ring_size=16)
-
-# 一站式处理
-result = fragmenter.process_molecule(smiles)
-
-# 单位置断裂
-fragments = fragmenter.cleave_at_position(mol, position=5)
-
-# 可视化（SVG + PNG）
-fragmenter.visualize_molecule(smiles, save_path="mol.png", dpi=600)
-
-# 批量处理
-results = fragmenter.process_csv("molecules.csv")
-```
-
-### 环编号系统
-
-基于内酯结构的固定编号：
-
-- **位置 1**: 羰基碳（C=O 中的 C）
-- **位置 2**: 酯键氧（环上的 O）
-- **位置 3-N**: 按顺序编号环上剩余原子
-
-### 数据结构
-
-使用 dataclass 存储碎片信息，支持 JSON 导入导出：
-
-```python
-@dataclass
-class Fragment:
-    fragment_smiles: str         # 碎片 SMILES（含 dummy 原子 *）
-    parent_smiles: str           # 母分子 SMILES
-    cleavage_position: int       # 断裂位置（1-N）
-    fragment_id: str             # 碎片 ID
-    parent_id: str               # 母分子 ID
-    atom_count: int              # 原子数
-    molecular_weight: float      # 分子量
-```
-
-## 📂 项目结构
-
-```
-macro_split/
-├── src/                         # 核心源代码
-│   ├── macrolactone_fragmenter.py  # ⭐ 高级封装类
-│   ├── ring_numbering.py        # 环编号系统
-│   ├── ring_visualization.py    # 可视化工具
-│   ├── fragment_dataclass.py    # 碎片数据类
-│   ├── fragment_cleaver.py      # 侧链断裂
-│   └── visualizer.py            # 统计可视化
-├── notebooks/                   # Jupyter Notebook 示例
-│   └── filter_molecules.ipynb   # ⭐ 完整使用案例
-├── docs/                        # MkDocs 文档
-├── scripts/                     # 执行脚本
-├── tests/                       # 单元测试
-├── pyproject.toml               # 项目配置
-├── setup.py                     # 打包脚本
-├── pixi.toml                    # Pixi 环境配置
-└── mkdocs.yml                   # 文档配置
-```
-
-## 🔧 环境管理
-
-本项目使用 [Pixi](https://pixi.sh/) 进行环境管理。Pixi 是一个现代化的包管理工具，具有以下优势：
-
-- ✅ 自动安装 RDKit（无需手动配置 conda）
-- ✅ 环境隔离，不污染系统
-- ✅ 跨平台支持（Linux、macOS、Windows）
-- ✅ 快速且可重现的依赖管理
-
-### Pixi 常用命令
-
-```bash
-# 安装依赖
-pixi install
-
-# 激活环境
-pixi shell
-
-# 添加新包
-pixi add package-name
-
-# 运行命令
-pixi run python script.py
-pixi run jupyter notebook
-
-# 查看已安装包
-pixi list
-```
-
-## 🧪 运行示例
-
-### 方式 1: Jupyter Notebook
-
-```bash
-pixi run jupyter notebook notebooks/filter_molecules.ipynb
-```
-
-查看第 15 章节的 MacrolactoneFragmenter 完整演示。
-
-### 方式 2: Python 脚本
-
-```bash
-pixi run python examples/basic_usage.py
-```
-
-### 方式 3: 交互式 Python
-
-```bash
-pixi shell
-python
->>> from src.macrolactone_fragmenter import MacrolactoneFragmenter
->>> fragmenter = MacrolactoneFragmenter(ring_size=16)
->>> # 开始使用...
-```
-
-## 📊 性能
-
-- **处理速度**: ~100 分子/分钟
-- **已测试**: 2000+ 个 16 元环大环内酯
-- **支持环大小**: 12-20 元环
-- **输出格式**: JSON, CSV, DataFrame, PNG, SVG
-
-## 🤝 贡献
-
-欢迎贡献！请查看 [贡献指南](docs/development/contributing.md)。
-
-### 开发环境设置
-
-```bash
-# Fork 并克隆项目
-git clone https://github.com/YOUR_USERNAME/macro_split.git
-cd macro_split
-
-# 安装开发依赖
-pixi install
-
-# 运行测试
 pixi run pytest
-
-# 检查代码风格
-pixi run black src/
-pixi run flake8 src/
+pixi run python -c "import macro_lactone_toolkit"
 ```
 
-## 📝 许可证
+## Python API
 
-本项目基于 [MIT License](LICENSE) 开源。
+```python
+from macro_lactone_toolkit import MacroLactoneAnalyzer, MacrolactoneFragmenter
 
-## 🔗 相关链接
+analyzer = MacroLactoneAnalyzer()
+valid_ring_sizes = analyzer.get_valid_ring_sizes("O=C1CCCCCCCCCCCCCCO1")
 
-- **文档**: [在线文档](https://yourusername.github.io/macro_split/)
-- **GitHub**: [macro_split](https://github.com/yourusername/macro_split)
-- **PyPI**: [macrolactone-fragmenter](https://pypi.org/project/macrolactone-fragmenter/)（即将发布）
-- **问题反馈**: [Issues](https://github.com/yourusername/macro_split/issues)
+fragmenter = MacrolactoneFragmenter()
+numbering = fragmenter.number_molecule("O=C1CCCCCCCCCCCCCCO1")
+result = fragmenter.fragment_molecule("O=C1CCCC(C)CCCCCCCCCCO1", parent_id="mol_001")
+```
 
-## 📧 联系方式
+## CLI
 
-如有问题或建议，请：
+单分子分析：
 
-- 提交 [Issue](https://github.com/yourusername/macro_split/issues)
-- 参与 [Discussions](https://github.com/yourusername/macro_split/discussions)
-- 查看[文档](https://yourusername.github.io/macro_split/)
+```bash
+pixi run macro-lactone-toolkit analyze --smiles "O=C1CCCCCCCCCCCCCCO1"
+pixi run macro-lactone-toolkit number --smiles "O=C1CCCCCCCCCCCCCCO1"
+pixi run macro-lactone-toolkit fragment --smiles "O=C1CCCC(C)CCCCCCCCCCO1" --parent-id mol_001
+```
 
-## 🌟 致谢
+CSV 批处理：
 
-- [RDKit](https://www.rdkit.org/) - 化学信息学核心库
-- [Pixi](https://pixi.sh/) - 现代化的包管理工具
-- [MkDocs Material](https://squidfunk.github.io/mkdocs-material/) - 漂亮的文档主题
+```bash
+pixi run macro-lactone-toolkit fragment \
+  --input molecules.csv \
+  --output fragments.csv \
+  --errors-output fragment_errors.csv
+```
 
----
+默认读取 `smiles` 列；若存在 `id` 列则将其作为 `parent_id`，否则自动生成 `row_<index>`。
 
-<div align="center">
+## Legacy Scripts
 
-**⭐ 如果这个项目对您有帮助，请给它一个 Star！⭐**
-
-Made with ❤️ by Macro Split Team
-
-</div>
+`scripts/` 目录保留为薄封装或迁移提示，不再承载核心实现。正式接口以 `macro_lactone_toolkit.*` 与 `macro-lactone-toolkit` CLI 为准。
diff --git a/pixi.toml b/pixi.toml
index 646f6a1..c751e27 100644
--- a/pixi.toml
+++ b/pixi.toml
@@ -1,24 +1,23 @@
 [workspace]
 authors = ["hotwa <pylyzeng@gmail.com>"]
 channels = ["conda-forge"]
-name = "macro_split"
-platforms = ["linux-64"]
+name = "macro_lactone_toolkit"
+platforms = ["osx-arm64", "linux-64"]
 version = "0.1.0"
 
 [tasks]
+test = "pytest"
+lint = "python -m compileall src"
+smoke-import = "python -c \"import macro_lactone_toolkit\""
+macro-lactone-toolkit = "python -m macro_lactone_toolkit.cli"
 
 [dependencies]
+python = "3.12.*"
+pip = ">=24,<26"
+pytest = ">=8.3,<9"
 rdkit = ">=2025.9.1,<2026"
 pandas = ">=2.3.3,<3"
 numpy = ">=2.3.4,<3"
-matplotlib = ">=3.10.7,<4"
-seaborn = ">=0.13.2,<0.14"
-jupyter = ">=1.1.1,<2"
-tqdm = ">=4.67.1,<5"
-mkdocs = ">=1.6.1,<2"
-mkdocs-material = ">=9.6.23,<10"
-mkdocstrings = ">=0.30.1,<0.31"
-mkdocstrings-python = ">=1.18.2,<2"
 
 [pypi-dependencies]
-dataclasses-json = ">=0.6.7, <0.7"
+macro_lactone_toolkit = { path = ".", editable = true }
diff --git a/pyproject.toml b/pyproject.toml
index 70b0c9b..ba8e662 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,74 +1,28 @@
 [build-system]
-requires = ["setuptools>=61.0", "wheel"]
+requires = ["setuptools>=69", "wheel"]
 build-backend = "setuptools.build_meta"
 
 [project]
-name = "macrolactone-fragmenter"
-version = "1.2.0"
-description = "A tool for analyzing and fragmenting macrolactone (12-20 membered ring) side chains"
+name = "macro_lactone_toolkit"
+version = "0.1.0"
+description = "Toolkit for macrolactone detection, numbering, fragmentation, and splicing."
 readme = "README.md"
-requires-python = ">=3.8"
-license = {text = "MIT"}
+requires-python = ">=3.12"
+license = { text = "MIT" }
 authors = [
-    {name = "Macro Split Team", email = "your.email@example.com"},
+  { name = "Macro Split Team" },
 ]
-keywords = ["chemistry", "cheminformatics", "macrolactone", "rdkit", "fragmentation"]
-classifiers = [
-    "Development Status :: 4 - Beta",
-    "Intended Audience :: Science/Research",
-    "Topic :: Scientific/Engineering :: Chemistry",
-    "License :: OSI Approved :: MIT License",
-    "Programming Language :: Python :: 3",
-    "Programming Language :: Python :: 3.8",
-    "Programming Language :: Python :: 3.9",
-    "Programming Language :: Python :: 3.10",
-    "Programming Language :: Python :: 3.11",
-]
-
 dependencies = [
-    "pandas>=1.3.0",
-    "numpy>=1.20.0",
-    "matplotlib>=3.3.0",
-    "seaborn>=0.11.0",
-    "dataclasses-json>=0.5.0",
-    "tqdm>=4.60.0",
+  "numpy>=1.26",
+  "pandas>=2.2",
 ]
 
-[project.optional-dependencies]
-dev = [
-    "jupyter>=1.0.0",
-    "pytest>=7.0.0",
-    "black>=22.0.0",
-    "flake8>=4.0.0",
-]
-docs = [
-    "mkdocs>=1.4.0",
-    "mkdocs-material>=9.0.0",
-    "mkdocstrings[python]>=0.20.0",
-]
-
-[project.urls]
-Homepage = "https://github.com/yourusername/macro_split"
-Documentation = "https://macro-split.readthedocs.io"
-Repository = "https://github.com/yourusername/macro_split"
-Issues = "https://github.com/yourusername/macro_split/issues"
-
 [project.scripts]
-macro-fragmenter = "src.macrolactone_fragmenter:main"
+macro-lactone-toolkit = "macro_lactone_toolkit.cli:main"
 
-[tool.setuptools]
-packages = ["src"]
-
-[tool.setuptools.package-data]
-src = ["*.py"]
-
-[tool.black]
-line-length = 100
-target-version = ['py38', 'py39', 'py310', 'py311']
+[tool.setuptools.packages.find]
+where = ["src"]
+include = ["macro_lactone_toolkit*"]
 
 [tool.pytest.ini_options]
 testpaths = ["tests"]
-python_files = "test_*.py"
-python_classes = "Test*"
-python_functions = "test_*"
-
diff --git a/scripts/README.md b/scripts/README.md
index 4cbe5ba..78c4a18 100644
--- a/scripts/README.md
+++ b/scripts/README.md
@@ -1,317 +1,11 @@
-# 批量处理脚本使用说明
+# scripts
 
-## 概述
+这些脚本现在都是基于 `macro_lactone_toolkit.*` 的薄封装或迁移提示。
 
-本文件夹包含两个批量处理脚本，用于对大环内酯分子进行编号、裂解和保存。
-
-## 脚本列表
-
-### 1. `batch_process_ring16.py` - 处理16元环分子
-
-**功能：**
-- 处理已过滤的1241个16元环分子
-- 对每个分子进行环原子编号
-- 进行侧链断裂
-- 保存JSON文件
-
-**输入文件：**
-- `ring16/temp_filtered_complete.csv`
-
-**输出文件夹：**
-- `output/ring16_fragments/`
-  - `ring16_mol_{idx}/` - 每个分子的文件夹
-    - `ring16_mol_{idx}_all_fragments.json` - 所有碎片信息
-    - `{fragment_id}.json` - 单个碎片文件
-    - `metadata.json` - 分子元数据
-  - `processing_log_*.txt` - 处理日志
-  - `error_log_*.txt` - 错误日志
-  - `processing_statistics.json` - 统计信息
-
-**使用方法：**
-
-```bash
-cd /home/zly/project/macro_split
-
-# 方法1: 直接运行Python脚本
-python scripts/batch_process_ring16.py
-
-# 方法2: 作为可执行文件运行
-chmod +x scripts/batch_process_ring16.py
-./scripts/batch_process_ring16.py
-```
-
-**预期运行时间：** 10-30分钟（取决于计算机性能）
-
----
-
-### 2. `batch_process_multi_rings.py` - 处理12-20元环分子
-
-**功能：**
-- 处理12-20元环的所有大环内酯分子
-- 按环大小自动分类（12元、13元、14元...20元）
-- 检测并剔除含有多个内酯键的分子
-- 对每种环大小分别编号和裂解
-- 保存到对应的文件夹
-
-**输入文件：**
-- `data/ring12_20/temp.csv`
-
-**输出文件夹：**
-- `output/ring12_fragments/` - 12元环碎片
-- `output/ring13_fragments/` - 13元环碎片
-- `output/ring14_fragments/` - 14元环碎片
-- ...
-- `output/ring20_fragments/` - 20元环碎片
-
-每个文件夹包含：
-- `ring{N}_mol_{idx}/` - 每个分子的文件夹
-  - `ring{N}_mol_{idx}_all_fragments.json`
-  - `{fragment_id}.json`
-  - `metadata.json`
-- `processing_log_*.txt` - 处理日志
-- `error_log_*.txt` - 错误日志
-- `multiple_lactone_log_*.txt` - 多内酯键分子记录
-- `processing_statistics.json` - 统计信息
-
-**使用方法：**
-
-```bash
-cd /home/zly/project/macro_split
-
-# 确保输入文件存在
-ls data/ring12_20/temp.csv
-
-# 运行脚本
-python scripts/batch_process_multi_rings.py
-```
-
-**预期运行时间：** 根据数据集大小，可能需要30分钟-2小时
-
----
-
-## 输出文件说明
-
-### JSON文件格式
-
-#### 1. `{parent_id}_all_fragments.json`
-
-包含所有碎片的完整信息：
-
-```json
-{
-  "parent_id": "ring16_mol_0",
-  "parent_smiles": "...",
-  "fragments": [
-    {
-      "fragment_smiles": "CC(C)C",
-      "parent_smiles": "...",
-      "cleavage_position": 5,
-      "fragment_id": "ring16_mol_0_frag_0",
-      "parent_id": "ring16_mol_0",
-      "atom_count": 4,
-      "molecular_weight": 58.12
-    },
-    ...
-  ]
-}
-```
-
-#### 2. `{fragment_id}.json`
-
-单个碎片的信息：
-
-```json
-{
-  "fragment_smiles": "CC(C)C",
-  "parent_smiles": "...",
-  "cleavage_position": 5,
-  "fragment_id": "ring16_mol_0_frag_0",
-  "parent_id": "ring16_mol_0",
-  "atom_count": 4,
-  "molecular_weight": 58.12
-}
-```
-
-#### 3. `metadata.json`
-
-分子的元数据：
-
-```json
-{
-  "parent_id": "ring16_mol_0",
-  "molecule_id": "CHEMBL94657",
-  "molecule_name": "PATUPILONE",
-  "smiles": "C/C(=C\\c1csc(C)n1)...",
-  "ring_size": 16,
-  "num_fragments": 11,
-  "processing_date": "2025-11-06T10:30:00"
-}
-```
-
-#### 4. `processing_statistics.json`
-
-处理统计信息：
-
-```json
-{
-  "ring_size": 16,
-  "total": 1241,
-  "success": 1200,
-  "failed_parse": 5,
-  "failed_numbering": 10,
-  "failed_validation": 8,
-  "failed_cleavage": 18,
-  "total_fragments": 12500
-}
-```
-
----
-
-## 日志文件
-
-### 1. `processing_log_*.txt`
-
-记录所有处理过程：
-
-```
-[2025-11-06 10:30:00] 开始批量处理 1241 个16元环分子
-[2025-11-06 10:30:05] 分子 0 处理成功
-...
-[2025-11-06 11:00:00] 处理完成: 成功 1200/1241
-```
-
-### 2. `error_log_*.txt`
-
-记录所有错误：
-
-```
-[2025-11-06 10:35:12] ❌ 分子 125 (CHEMBL12345) SMILES解析失败
-[2025-11-06 10:36:45] ❌ 分子 256 (CHEMBL67890) 编号验证失败
-```
-
-### 3. `multiple_lactone_log_*.txt`
-
-记录含有多个内酯键的分子（仅在multi_rings脚本中）：
-
-```
-[2025-11-06 10:40:00] 分子 350 (CHEMBL11111, Molecule Name) 有多个内酯键，已剔除。内酯碳索引: [15, 28, 42]
-```
-
----
-
-## 常见问题
-
-### Q1: 如何检查处理进度？
-
-**A:** 脚本会显示实时进度条。你也可以查看日志文件：
-
-```bash
-tail -f output/ring16_fragments/processing_log_*.txt
-```
-
-### Q2: 如何查看处理结果？
-
-**A:** 查看统计文件：
-
-```bash
-cat output/ring16_fragments/processing_statistics.json
-```
-
-### Q3: 处理失败了怎么办？
-
-**A:** 
-1. 查看 `error_log_*.txt` 了解失败原因
-2. 脚本会跳过失败的分子，继续处理其他分子
-3. 可以手动处理失败的分子
-
-### Q4: 如何重新处理？
-
-**A:** 
-1. 删除输出文件夹：`rm -rf output/ring16_fragments`
-2. 重新运行脚本
-
-### Q5: 内存不足怎么办？
-
-**A:** 
-- 脚本已优化为逐个处理分子，内存占用很小
-- 如果仍有问题，可以修改脚本分批处理
-
----
-
-## 后续分析
-
-处理完成后，可以进行以下分析：
-
-### 1. 统计碎片多样性
-
-```python
-import json
-from pathlib import Path
-from collections import Counter
-
-# 读取所有碎片
-all_fragments = []
-for mol_dir in Path('output/ring16_fragments').iterdir():
-    if mol_dir.is_dir() and mol_dir.name.startswith('ring16_mol_'):
-        fragments_file = mol_dir / f"{mol_dir.name}_all_fragments.json"
-        if fragments_file.exists():
-            with open(fragments_file) as f:
-                data = json.load(f)
-                all_fragments.extend([frag['fragment_smiles'] for frag in data['fragments']])
-
-# 统计
-fragment_counts = Counter(all_fragments)
-print(f"总碎片数: {len(all_fragments)}")
-print(f"独特碎片数: {len(fragment_counts)}")
-print(f"\n最常见的10个碎片:")
-for smiles, count in fragment_counts.most_common(10):
-    print(f"  {smiles}: {count}次")
-```
-
-### 2. 分析位置分布
-
-```python
-from collections import defaultdict
-
-position_fragments = defaultdict(list)
-
-for mol_dir in Path('output/ring16_fragments').iterdir():
-    if mol_dir.is_dir() and mol_dir.name.startswith('ring16_mol_'):
-        fragments_file = mol_dir / f"{mol_dir.name}_all_fragments.json"
-        if fragments_file.exists():
-            with open(fragments_file) as f:
-                data = json.load(f)
-                for frag in data['fragments']:
-                    position = frag['cleavage_position']
-                    position_fragments[position].append(frag['fragment_smiles'])
-
-# 统计每个位置的碎片多样性
-for pos in sorted(position_fragments.keys()):
-    unique_frags = set(position_fragments[pos])
-    print(f"位置 {pos:2d}: {len(position_fragments[pos])} 个碎片, {len(unique_frags)} 种不同结构")
-```
-
----
-
-## 性能优化
-
-如果需要处理大量数据，可以考虑：
-
-1. **并行处理：** 使用 `multiprocessing` 模块
-2. **分批处理：** 将数据集分成多个批次
-3. **增量处理：** 只处理新增的分子
-
----
-
-## 联系与支持
-
-如有问题，请查看：
-- 错误日志文件
-- `BRIDGE_RING_ANALYSIS.md` - 详细技术文档
-- `QUICK_GUIDE.md` - 快速参考指南
-
----
-
-**最后更新：** 2025-11-06  
-**版本：** 1.0
+- `batch_process.py`: 等价于 `macro-lactone-toolkit fragment`
+- `batch_process_ring16.py`: 等价于 `macro-lactone-toolkit fragment --ring-size 16`
+- `batch_process_multi_rings.py`: 自动识别模式的批处理封装
+- `analyze_fragments.py`: 等价于 `macro-lactone-toolkit analyze`
+- `tylosin_splicer.py`: 使用 `macro_lactone_toolkit.splicing.*` 做简单拼接
 
+核心实现与正式接口都在 `src/macro_lactone_toolkit/` 中。
diff --git a/scripts/analyze_fragments.py b/scripts/analyze_fragments.py
old mode 100755
new mode 100644
index bfb9e93..4febb62
--- a/scripts/analyze_fragments.py
+++ b/scripts/analyze_fragments.py
@@ -1,496 +1,10 @@
-#!/usr/bin/env python3
-"""
-碎片分析脚本
-分析指定环大小的分子碎片：
-- 统计每个位置的碎片频率
-- 分析碎片理化性质
-- 找出裂解最多的位置（前5）
-- 找出变化最大的位置
-"""
+from __future__ import annotations
 
 import sys
-from pathlib import Path
 
-# 添加项目根目录到 Python 路径
-script_dir = Path(__file__).resolve().parent
-project_root = script_dir.parent
-sys.path.insert(0, str(project_root))
-
-import json
-import pandas as pd
-import numpy as np
-from collections import Counter, defaultdict
-from rdkit import Chem
-from rdkit.Chem import Descriptors, rdMolDescriptors
-from rdkit import DataStructs
-from rdkit.Chem import AllChem
-import matplotlib.pyplot as plt
-import seaborn as sns
-from scipy import stats as scipy_stats
+from macro_lactone_toolkit.cli import main
 
 
-def load_fragments_from_ring_size(ring_size):
-    """加载指定环大小的所有碎片"""
-    output_dir = Path(f'output/ring{ring_size}_fragments')
-    
-    if not output_dir.exists():
-        print(f"❌ 未找到 {ring_size}元环的输出文件夹")
-        return None
-    
-    all_fragments = []
-    
-    for mol_dir in output_dir.iterdir():
-        if mol_dir.is_dir() and mol_dir.name.startswith(f'ring{ring_size}_mol_'):
-            fragments_file = mol_dir / f"{mol_dir.name}_all_fragments.json"
-            if fragments_file.exists():
-                with open(fragments_file, 'r', encoding='utf-8') as f:
-                    data = json.load(f)
-                    all_fragments.extend(data['fragments'])
-    
-    return all_fragments
-
-
-def calculate_fragment_properties(smiles):
-    """计算碎片的理化性质"""
-    mol = Chem.MolFromSmiles(smiles)
-    if mol is None:
-        return None
-    
-    properties = {
-        'smiles': smiles,
-        'molecular_weight': Descriptors.MolWt(mol),
-        'num_atoms': mol.GetNumAtoms(),
-        'num_heavy_atoms': mol.GetNumHeavyAtoms(),
-        'num_rotatable_bonds': Descriptors.NumRotatableBonds(mol),
-        'num_hbd': Descriptors.NumHDonors(mol),  # 氢键供体
-        'num_hba': Descriptors.NumHAcceptors(mol),  # 氢键受体
-        'logp': Descriptors.MolLogP(mol),
-        'tpsa': Descriptors.TPSA(mol),  # 极性表面积
-        'num_rings': Descriptors.RingCount(mol),
-        'num_aromatic_rings': Descriptors.NumAromaticRings(mol),
-    }
-    
-    return properties
-
-
-def calculate_shannon_entropy(fragment_list):
-    """
-    计算香农熵，用于衡量碎片多样性
-    熵越高，表示多样性越大
-    """
-    if not fragment_list:
-        return 0
-    
-    counts = Counter(fragment_list)
-    total = len(fragment_list)
-    
-    entropy = 0
-    for count in counts.values():
-        p = count / total
-        if p > 0:
-            entropy -= p * np.log2(p)
-    
-    return entropy
-
-
-def calculate_structural_diversity(smiles_list):
-    """
-    计算结构多样性
-    使用Tanimoto相似度的平均值
-    相似度越低，多样性越高
-    """
-    if len(smiles_list) < 2:
-        return 0
-    
-    # 生成指纹
-    fps = []
-    for smiles in smiles_list:
-        mol = Chem.MolFromSmiles(smiles)
-        if mol:
-            fp = AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=1024)
-            fps.append(fp)
-    
-    if len(fps) < 2:
-        return 0
-    
-    # 计算所有配对的Tanimoto相似度
-    similarities = []
-    for i in range(len(fps)):
-        for j in range(i+1, len(fps)):
-            sim = DataStructs.TanimotoSimilarity(fps[i], fps[j])
-            similarities.append(sim)
-    
-    # 返回平均相似度（1-相似度 = 多样性）
-    avg_similarity = np.mean(similarities)
-    diversity = 1 - avg_similarity
-    
-    return diversity
-
-
-def analyze_position_statistics(fragments, ring_size):
-    """分析每个位置的统计信息"""
-    
-    # 按位置分组
-    position_data = defaultdict(list)
-    for frag in fragments:
-        pos = frag['cleavage_position']
-        position_data[pos].append(frag)
-    
-    # 统计信息
-    results = []
-    
-    for pos in sorted(position_data.keys()):
-        frags = position_data[pos]
-        smiles_list = [f['fragment_smiles'] for f in frags]
-        
-        # 基础统计
-        total_count = len(smiles_list)
-        unique_count = len(set(smiles_list))
-        most_common = Counter(smiles_list).most_common(3)
-        
-        # 多样性指标
-        shannon_entropy = calculate_shannon_entropy(smiles_list)
-        structural_diversity = calculate_structural_diversity(list(set(smiles_list)))
-        
-        # 理化性质统计
-        props_list = []
-        for smiles in set(smiles_list):
-            props = calculate_fragment_properties(smiles)
-            if props:
-                props_list.append(props)
-        
-        if props_list:
-            mw_mean = np.mean([p['molecular_weight'] for p in props_list])
-            mw_std = np.std([p['molecular_weight'] for p in props_list])
-            atom_mean = np.mean([p['num_atoms'] for p in props_list])
-            atom_std = np.std([p['num_atoms'] for p in props_list])
-        else:
-            mw_mean = mw_std = atom_mean = atom_std = 0
-        
-        results.append({
-            'position': pos,
-            'total_count': total_count,
-            'unique_count': unique_count,
-            'uniqueness_ratio': unique_count / total_count if total_count > 0 else 0,
-            'shannon_entropy': shannon_entropy,
-            'structural_diversity': structural_diversity,
-            'mw_mean': mw_mean,
-            'mw_std': mw_std,
-            'atom_mean': atom_mean,
-            'atom_std': atom_std,
-            'most_common': most_common
-        })
-    
-    return pd.DataFrame(results)
-
-
-def plot_position_analysis(df_stats, ring_size, output_dir):
-    """绘制位置分析图表"""
-    
-    fig, axes = plt.subplots(2, 3, figsize=(18, 12))
-    fig.suptitle(f'{ring_size}元环碎片位置分析', fontsize=16, fontweight='bold')
-    
-    positions = df_stats['position'].values
-    
-    # 1. 碎片总数分布
-    ax = axes[0, 0]
-    bars = ax.bar(positions, df_stats['total_count'], color='steelblue', edgecolor='black', alpha=0.7)
-    ax.set_xlabel('环位置', fontsize=12)
-    ax.set_ylabel('碎片总数', fontsize=12)
-    ax.set_title('各位置碎片总数', fontsize=14, fontweight='bold')
-    ax.grid(axis='y', alpha=0.3)
-    
-    # 标注前5
-    top5_indices = df_stats.nlargest(5, 'total_count').index
-    for idx in top5_indices:
-        pos = df_stats.loc[idx, 'position']
-        count = df_stats.loc[idx, 'total_count']
-        ax.text(pos, count, str(int(count)), ha='center', va='bottom', fontsize=9, fontweight='bold')
-    
-    # 2. 独特碎片数
-    ax = axes[0, 1]
-    ax.bar(positions, df_stats['unique_count'], color='coral', edgecolor='black', alpha=0.7)
-    ax.set_xlabel('环位置', fontsize=12)
-    ax.set_ylabel('独特碎片数', fontsize=12)
-    ax.set_title('各位置独特碎片数', fontsize=14, fontweight='bold')
-    ax.grid(axis='y', alpha=0.3)
-    
-    # 3. 香农熵（多样性）
-    ax = axes[0, 2]
-    bars = ax.bar(positions, df_stats['shannon_entropy'], color='mediumseagreen', edgecolor='black', alpha=0.7)
-    ax.set_xlabel('环位置', fontsize=12)
-    ax.set_ylabel('香农熵', fontsize=12)
-    ax.set_title('各位置碎片多样性（香农熵）', fontsize=14, fontweight='bold')
-    ax.grid(axis='y', alpha=0.3)
-    
-    # 标注熵最高的位置
-    top_entropy_idx = df_stats['shannon_entropy'].idxmax()
-    pos = df_stats.loc[top_entropy_idx, 'position']
-    entropy = df_stats.loc[top_entropy_idx, 'shannon_entropy']
-    ax.text(pos, entropy, f'{entropy:.2f}', ha='center', va='bottom', fontsize=9, fontweight='bold')
-    
-    # 4. 结构多样性
-    ax = axes[1, 0]
-    ax.bar(positions, df_stats['structural_diversity'], color='mediumpurple', edgecolor='black', alpha=0.7)
-    ax.set_xlabel('环位置', fontsize=12)
-    ax.set_ylabel('结构多样性', fontsize=12)
-    ax.set_title('各位置结构多样性（指纹相似度）', fontsize=14, fontweight='bold')
-    ax.grid(axis='y', alpha=0.3)
-    
-    # 5. 分子量分布
-    ax = axes[1, 1]
-    ax.errorbar(positions, df_stats['mw_mean'], yerr=df_stats['mw_std'],
-                fmt='o-', color='darkred', ecolor='lightcoral', capsize=5, capthick=2, alpha=0.7)
-    ax.set_xlabel('环位置', fontsize=12)
-    ax.set_ylabel('分子量 (Da)', fontsize=12)
-    ax.set_title('各位置碎片平均分子量', fontsize=14, fontweight='bold')
-    ax.grid(axis='y', alpha=0.3)
-    
-    # 6. 原子数分布
-    ax = axes[1, 2]
-    ax.errorbar(positions, df_stats['atom_mean'], yerr=df_stats['atom_std'],
-                fmt='s-', color='darkgreen', ecolor='lightgreen', capsize=5, capthick=2, alpha=0.7)
-    ax.set_xlabel('环位置', fontsize=12)
-    ax.set_ylabel('原子数', fontsize=12)
-    ax.set_title('各位置碎片平均原子数', fontsize=14, fontweight='bold')
-    ax.grid(axis='y', alpha=0.3)
-    
-    plt.tight_layout()
-    
-    # 保存图表
-    plot_path = output_dir / f'ring{ring_size}_position_analysis.png'
-    plt.savefig(plot_path, dpi=300, bbox_inches='tight')
-    print(f"✓ 图表已保存: {plot_path}")
-    
-    plt.show()
-
-
-def plot_fragment_properties_distribution(fragments, ring_size, output_dir):
-    """绘制碎片理化性质分布"""
-    
-    # 计算所有碎片的性质
-    all_props = []
-    unique_smiles = set([f['fragment_smiles'] for f in fragments])
-    
-    for smiles in unique_smiles:
-        props = calculate_fragment_properties(smiles)
-        if props:
-            all_props.append(props)
-    
-    df_props = pd.DataFrame(all_props)
-    
-    fig, axes = plt.subplots(2, 3, figsize=(18, 12))
-    fig.suptitle(f'{ring_size}元环碎片理化性质分布', fontsize=16, fontweight='bold')
-    
-    # 1. 分子量分布
-    ax = axes[0, 0]
-    ax.hist(df_props['molecular_weight'], bins=30, color='steelblue', edgecolor='black', alpha=0.7)
-    ax.set_xlabel('分子量 (Da)', fontsize=12)
-    ax.set_ylabel('频数', fontsize=12)
-    ax.set_title('分子量分布', fontsize=14, fontweight='bold')
-    ax.axvline(df_props['molecular_weight'].median(), color='red', linestyle='--', 
-               label=f'中位数: {df_props["molecular_weight"].median():.1f}')
-    ax.legend()
-    ax.grid(axis='y', alpha=0.3)
-    
-    # 2. 原子数分布
-    ax = axes[0, 1]
-    ax.hist(df_props['num_atoms'], bins=range(1, int(df_props['num_atoms'].max())+2), 
-            color='coral', edgecolor='black', alpha=0.7)
-    ax.set_xlabel('原子数', fontsize=12)
-    ax.set_ylabel('频数', fontsize=12)
-    ax.set_title('原子数分布', fontsize=14, fontweight='bold')
-    ax.axvline(df_props['num_atoms'].median(), color='red', linestyle='--',
-               label=f'中位数: {df_props["num_atoms"].median():.0f}')
-    ax.legend()
-    ax.grid(axis='y', alpha=0.3)
-    
-    # 3. LogP分布
-    ax = axes[0, 2]
-    ax.hist(df_props['logp'], bins=30, color='mediumseagreen', edgecolor='black', alpha=0.7)
-    ax.set_xlabel('LogP', fontsize=12)
-    ax.set_ylabel('频数', fontsize=12)
-    ax.set_title('LogP分布（亲脂性）', fontsize=14, fontweight='bold')
-    ax.axvline(df_props['logp'].median(), color='red', linestyle='--',
-               label=f'中位数: {df_props["logp"].median():.2f}')
-    ax.legend()
-    ax.grid(axis='y', alpha=0.3)
-    
-    # 4. TPSA分布
-    ax = axes[1, 0]
-    ax.hist(df_props['tpsa'], bins=30, color='mediumpurple', edgecolor='black', alpha=0.7)
-    ax.set_xlabel('TPSA (Ų)', fontsize=12)
-    ax.set_ylabel('频数', fontsize=12)
-    ax.set_title('极性表面积分布', fontsize=14, fontweight='bold')
-    ax.axvline(df_props['tpsa'].median(), color='red', linestyle='--',
-               label=f'中位数: {df_props["tpsa"].median():.1f}')
-    ax.legend()
-    ax.grid(axis='y', alpha=0.3)
-    
-    # 5. 可旋转键数分布
-    ax = axes[1, 1]
-    ax.hist(df_props['num_rotatable_bonds'], 
-            bins=range(int(df_props['num_rotatable_bonds'].max())+2),
-            color='darkgoldenrod', edgecolor='black', alpha=0.7)
-    ax.set_xlabel('可旋转键数', fontsize=12)
-    ax.set_ylabel('频数', fontsize=12)
-    ax.set_title('可旋转键数分布（柔性）', fontsize=14, fontweight='bold')
-    ax.grid(axis='y', alpha=0.3)
-    
-    # 6. 氢键统计
-    ax = axes[1, 2]
-    width = 0.35
-    x = np.arange(2)
-    hbd_mean = df_props['num_hbd'].mean()
-    hba_mean = df_props['num_hba'].mean()
-    ax.bar(x[0]-width/2, hbd_mean, width, label='氢键供体 (HBD)', color='lightcoral', edgecolor='black')
-    ax.bar(x[1]+width/2, hba_mean, width, label='氢键受体 (HBA)', color='lightblue', edgecolor='black')
-    ax.set_ylabel('平均数量', fontsize=12)
-    ax.set_title('氢键能力统计', fontsize=14, fontweight='bold')
-    ax.set_xticks(x)
-    ax.set_xticklabels(['HBD', 'HBA'])
-    ax.legend()
-    ax.grid(axis='y', alpha=0.3)
-    
-    plt.tight_layout()
-    
-    # 保存图表
-    plot_path = output_dir / f'ring{ring_size}_properties_distribution.png'
-    plt.savefig(plot_path, dpi=300, bbox_inches='tight')
-    print(f"✓ 图表已保存: {plot_path}")
-    
-    plt.show()
-    
-    return df_props
-
-
-def generate_analysis_report(ring_size):
-    """生成完整的分析报告"""
-    
-    print("="*80)
-    print(f"{ring_size}元环碎片分析")
-    print("="*80)
-    print()
-    
-    # 加载碎片
-    print("加载碎片数据...")
-    fragments = load_fragments_from_ring_size(ring_size)
-    
-    if fragments is None or len(fragments) == 0:
-        print(f"❌ 没有找到碎片数据")
-        return
-    
-    print(f"✓ 加载了 {len(fragments)} 个碎片")
-    print()
-    
-    # 创建输出文件夹
-    output_dir = Path(f'output/ring{ring_size}_analysis')
-    output_dir.mkdir(parents=True, exist_ok=True)
-    
-    # 1. 位置统计分析
-    print("分析各位置统计信息...")
-    df_stats = analyze_position_statistics(fragments, ring_size)
-    
-    # 保存统计表格
-    stats_csv = output_dir / f'ring{ring_size}_position_statistics.csv'
-    df_stats.to_csv(stats_csv, index=False, encoding='utf-8')
-    print(f"✓ 统计表格已保存: {stats_csv}")
-    print()
-    
-    # 2. 打印关键结果
-    print("="*80)
-    print("关键发现")
-    print("="*80)
-    print()
-    
-    # 裂解最多的前5个位置
-    top5_positions = df_stats.nlargest(5, 'total_count')
-    print("📊 裂解最多的前5个位置:")
-    for i, row in enumerate(top5_positions.itertuples(), 1):
-        print(f"  {i}. 位置 {row.position}: {row.total_count} 个碎片 "
-              f"({row.unique_count} 种独特结构)")
-    print()
-    
-    # 多样性最大的前5个位置（基于香农熵）
-    top5_diversity = df_stats.nlargest(5, 'shannon_entropy')
-    print("🌈 多样性最大的前5个位置（香农熵）:")
-    for i, row in enumerate(top5_diversity.itertuples(), 1):
-        print(f"  {i}. 位置 {row.position}: 熵={row.shannon_entropy:.3f}, "
-              f"独特比={row.uniqueness_ratio:.2%}, "
-              f"{row.unique_count}/{row.total_count} 独特")
-    print()
-    
-    # 结构多样性最大的前5个位置
-    top5_structural = df_stats.nlargest(5, 'structural_diversity')
-    print("🔬 结构多样性最大的前5个位置（指纹相似度）:")
-    for i, row in enumerate(top5_structural.itertuples(), 1):
-        print(f"  {i}. 位置 {row.position}: 多样性={row.structural_diversity:.3f}, "
-              f"{row.unique_count} 种结构")
-    print()
-    
-    # 碎片大小变化最大的位置
-    top5_mw_variance = df_stats.nlargest(5, 'mw_std')
-    print("⚖️ 分子量变化最大的前5个位置:")
-    for i, row in enumerate(top5_mw_variance.itertuples(), 1):
-        print(f"  {i}. 位置 {row.position}: 平均={row.mw_mean:.1f} Da, "
-              f"标准差={row.mw_std:.1f} Da")
-    print()
-    
-    # 3. 绘制图表
-    print("生成可视化图表...")
-    plot_position_analysis(df_stats, ring_size, output_dir)
-    print()
-    
-    # 4. 理化性质分析
-    print("分析碎片理化性质...")
-    df_props = plot_fragment_properties_distribution(fragments, ring_size, output_dir)
-    
-    # 保存理化性质表格
-    props_csv = output_dir / f'ring{ring_size}_fragment_properties.csv'
-    df_props.to_csv(props_csv, index=False, encoding='utf-8')
-    print(f"✓ 理化性质表格已保存: {props_csv}")
-    print()
-    
-    # 5. 总结统计
-    print("="*80)
-    print("总体统计")
-    print("="*80)
-    print(f"总碎片数: {len(fragments)}")
-    print(f"独特碎片数: {len(set([f['fragment_smiles'] for f in fragments]))}")
-    print(f"平均分子量: {df_props['molecular_weight'].mean():.2f} ± {df_props['molecular_weight'].std():.2f} Da")
-    print(f"平均原子数: {df_props['num_atoms'].mean():.2f} ± {df_props['num_atoms'].std():.2f}")
-    print(f"分子量范围: {df_props['molecular_weight'].min():.1f} - {df_props['molecular_weight'].max():.1f} Da")
-    print(f"原子数范围: {int(df_props['num_atoms'].min())} - {int(df_props['num_atoms'].max())}")
-    print()
-    
-    # 6. 生成文本报告
-    report_path = output_dir / f'ring{ring_size}_analysis_report.txt'
-    with open(report_path, 'w', encoding='utf-8') as f:
-        f.write(f"{ring_size}元环碎片分析报告\n")
-        f.write("="*80 + "\n\n")
-        f.write(f"生成时间: {pd.Timestamp.now()}\n\n")
-        
-        f.write("位置统计摘要:\n")
-        f.write(df_stats.to_string())
-        f.write("\n\n")
-        
-        f.write("理化性质摘要:\n")
-        f.write(df_props.describe().to_string())
-        f.write("\n")
-    
-    print(f"✓ 完整报告已保存: {report_path}")
-    print()
-    print("="*80)
-    print("分析完成！")
-    print("="*80)
-
-
-if __name__ == '__main__':
-    import argparse
-    
-    parser = argparse.ArgumentParser(description='分析大环内酯碎片')
-    parser.add_argument('ring_size', type=int, help='环大小（如：16）')
-    
-    args = parser.parse_args()
-    
-    generate_analysis_report(args.ring_size)
+if __name__ == "__main__":
+    sys.argv = ["macro-lactone-toolkit", "analyze", *sys.argv[1:]]
+    main()
diff --git a/scripts/batch_process.py b/scripts/batch_process.py
index 02d9a41..d9f6709 100644
--- a/scripts/batch_process.py
+++ b/scripts/batch_process.py
@@ -1,165 +1,10 @@
-"""
-Batch processing script for analyzing all macrolactones in the dataset.
-"""
+from __future__ import annotations
+
 import sys
-sys.path.append('..')
 
-import pandas as pd
-from pathlib import Path
-from tqdm import tqdm
-import json
-from rdkit import Chem
-
-from src.fragment_cleaver import process_molecule
-from src.fragment_dataclass import MoleculeFragments
-
-
-def batch_process_molecules(csv_path: str, output_base_dir: str, 
-                            max_molecules: int = None):
-    """
-    Process all molecules in the CSV file.
-    
-    Args:
-        csv_path: Path to the CSV file containing SMILES
-        output_base_dir: Base directory for output
-        max_molecules: Maximum number of molecules to process (None for all)
-    """
-    # Read CSV
-    df = pd.read_csv(csv_path)
-    print(f"Loaded {len(df)} molecules from {csv_path}")
-    
-    if max_molecules:
-        df = df.head(max_molecules)
-        print(f"Processing first {max_molecules} molecules")
-    
-    # Create output directory
-    output_dir = Path(output_base_dir)
-    output_dir.mkdir(parents=True, exist_ok=True)
-    
-    # Statistics
-    successful = 0
-    failed = 0
-    failed_molecules = []
-    all_fragments = []
-    
-    # Process each molecule
-    for idx, row in tqdm(df.iterrows(), total=len(df), desc="Processing molecules"):
-        smiles = row['smiles']
-        molecule_id = row.get('IDs', f'molecule_{idx}')
-        
-        try:
-            # Process molecule
-            mol_fragments = process_molecule(smiles, idx)
-            
-            if mol_fragments is None or len(mol_fragments.fragments) == 0:
-                failed += 1
-                failed_molecules.append({
-                    'index': idx,
-                    'id': molecule_id,
-                    'reason': 'No fragments extracted'
-                })
-                continue
-            
-            # Create output directory for this molecule
-            mol_output_dir = output_dir / mol_fragments.parent_id
-            mol_output_dir.mkdir(parents=True, exist_ok=True)
-            
-            # Save complete molecule fragments
-            mol_fragments_path = mol_output_dir / f"{mol_fragments.parent_id}_all_fragments.json"
-            mol_fragments.to_json_file(str(mol_fragments_path))
-            
-            # Save individual fragments
-            for frag in mol_fragments.fragments:
-                frag_path = mol_output_dir / f"{frag.fragment_id}.json"
-                frag.to_json_file(str(frag_path))
-                
-                # Collect for overall statistics
-                all_fragments.append({
-                    'parent_id': frag.parent_id,
-                    'fragment_id': frag.fragment_id,
-                    'cleavage_position': frag.cleavage_position,
-                    'fragment_smiles': frag.fragment_smiles,
-                    'atom_count': frag.atom_count,
-                    'molecular_weight': frag.molecular_weight,
-                    'parent_smiles': frag.parent_smiles
-                })
-            
-            successful += 1
-            
-        except Exception as e:
-            failed += 1
-            failed_molecules.append({
-                'index': idx,
-                'id': molecule_id,
-                'error': str(e)
-            })
-            print(f"\nError processing molecule {idx} ({molecule_id}): {e}")
-    
-    # Save overall statistics
-    stats = {
-        'total_molecules': len(df),
-        'successful': successful,
-        'failed': failed,
-        'total_fragments': len(all_fragments),
-        'failed_molecules': failed_molecules
-    }
-    
-    stats_path = output_dir / 'processing_stats.json'
-    with open(stats_path, 'w') as f:
-        json.dump(stats, f, indent=2)
-    
-    # Save all fragments as CSV for easy analysis
-    if all_fragments:
-        fragments_df = pd.DataFrame(all_fragments)
-        fragments_csv_path = output_dir / 'all_fragments.csv'
-        fragments_df.to_csv(fragments_csv_path, index=False)
-        print(f"\n✓ Saved all fragments to: {fragments_csv_path}")
-    
-    # Print summary
-    print(f"\n{'='*60}")
-    print(f"PROCESSING COMPLETE")
-    print(f"{'='*60}")
-    print(f"Total molecules: {len(df)}")
-    print(f"Successfully processed: {successful}")
-    print(f"Failed: {failed}")
-    print(f"Total fragments extracted: {len(all_fragments)}")
-    print(f"{'='*60}")
-    print(f"\nResults saved to: {output_dir}")
-    print(f"Statistics saved to: {stats_path}")
-    
-    return fragments_df if all_fragments else None
+from macro_lactone_toolkit.cli import main
 
 
 if __name__ == "__main__":
-    import argparse
-    
-    parser = argparse.ArgumentParser(
-        description="Batch process macrolactones to extract side chain fragments"
-    )
-    parser.add_argument(
-        '--input',
-        type=str,
-        default='../ring16/temp.csv',
-        help='Input CSV file path'
-    )
-    parser.add_argument(
-        '--output',
-        type=str,
-        default='../output/fragments',
-        help='Output directory path'
-    )
-    parser.add_argument(
-        '--max',
-        type=int,
-        default=None,
-        help='Maximum number of molecules to process (default: all)'
-    )
-    
-    args = parser.parse_args()
-    
-    batch_process_molecules(
-        csv_path=args.input,
-        output_base_dir=args.output,
-        max_molecules=args.max
-    )
-
+    sys.argv = ["macro-lactone-toolkit", "fragment", *sys.argv[1:]]
+    main()
diff --git a/scripts/batch_process_multi_rings.py b/scripts/batch_process_multi_rings.py
old mode 100755
new mode 100644
index 64ad1cc..d9f6709
--- a/scripts/batch_process_multi_rings.py
+++ b/scripts/batch_process_multi_rings.py
@@ -1,423 +1,10 @@
-#!/usr/bin/env python3
-"""
-批量处理12-20元环大环内酯分子
-处理文件: data/ring12_20/temp.csv
-输出文件夹: output/ring{N}_fragments/ (N=12,13,14...20)
-"""
+from __future__ import annotations
 
 import sys
-from pathlib import Path
 
-# 添加项目根目录到 Python 路径
-# 获取脚本所在目录的父目录（项目根目录）
-script_dir = Path(__file__).resolve().parent
-project_root = script_dir.parent
-sys.path.insert(0, str(project_root))
-
-from rdkit import Chem
-import pandas as pd
-import json
-from tqdm import tqdm
-from datetime import datetime
-from collections import defaultdict
-
-from src.ring_numbering import get_ring_atoms
-from src.fragment_cleaver import cleave_side_chains
+from macro_lactone_toolkit.cli import main
 
 
-def find_lactone_carbons(mol, ring_size):
-    """
-    找到指定环大小的环上的内酯羰基碳原子
-    返回所有环上内酯碳的索引列表
-    """
-    # 获取指定环大小的所有环原子
-    ring_pattern = Chem.MolFromSmarts(f"[r{ring_size}]")
-    if ring_pattern is None:
-        return []
-    
-    ring_matches = mol.GetSubstructMatches(ring_pattern)
-    ring_atoms = list(set([match[0] for match in ring_matches]))
-    
-    if len(ring_atoms) == 0:
-        return []
-    
-    # 使用更严格的SMARTS模式：环上的氧连接到环上的羰基碳
-    # [r{ring_size};#8] 环上的氧，连接到 [r{ring_size};#6] 环上的碳，该碳有双键氧 (=O)
-    lactone_pattern = Chem.MolFromSmarts(f"[r{ring_size};#8][r{ring_size};#6](=[#8])")
-    
-    if lactone_pattern is None:
-        return []
-    
-    matches = mol.GetSubstructMatches(lactone_pattern)
-    
-    # 提取环上的内酯碳索引 (match[1]是羰基碳)
-    lactone_carbons = []
-    for match in matches:
-        if len(match) >= 2:
-            carbonyl_carbon = match[1]  # 羰基碳
-            # 确保这个碳确实在指定大小的环上
-            if carbonyl_carbon in ring_atoms:
-                lactone_carbons.append(carbonyl_carbon)
-    
-    return list(set(lactone_carbons))  # 去重
-
-
-def order_ring_atoms_clockwise(mol, ring_atoms, start_atom):
-    """顺时针排列环原子"""
-    if start_atom not in ring_atoms:
-        raise ValueError("Start atom is not in the ring")
-    
-    ordered = [start_atom]
-    visited = {start_atom}
-    current = start_atom
-    
-    while len(ordered) < len(ring_atoms):
-        atom = mol.GetAtomWithIdx(current)
-        
-        # 找到下一个环原子
-        next_atom = None
-        for neighbor in atom.GetNeighbors():
-            neighbor_idx = neighbor.GetIdx()
-            if neighbor_idx in ring_atoms and neighbor_idx not in visited:
-                next_atom = neighbor_idx
-                break
-        
-        if next_atom is None:
-            # 尝试从起点反向遍历
-            if len(ordered) < len(ring_atoms):
-                atom = mol.GetAtomWithIdx(start_atom)
-                for neighbor in atom.GetNeighbors():
-                    neighbor_idx = neighbor.GetIdx()
-                    if neighbor_idx in ring_atoms and neighbor_idx not in visited:
-                        ordered.insert(0, neighbor_idx)
-                        visited.add(neighbor_idx)
-                        current = neighbor_idx
-                        break
-                else:
-                    break
-            else:
-                break
-        else:
-            ordered.append(next_atom)
-            visited.add(next_atom)
-            current = next_atom
-    
-    return ordered
-
-
-def assign_ring_numbering_general(mol, ring_size):
-    """
-    为任意大小的环分配编号
-    返回: (numbering_dict, lactone_carbon) 或 (None, None) 或 ('multiple_lactones', lactone_list)
-    """
-    # 找到内酯碳
-    lactone_carbons = find_lactone_carbons(mol, ring_size)
-    
-    if len(lactone_carbons) == 0:
-        return None, None
-    
-    if len(lactone_carbons) > 1:
-        return 'multiple_lactones', lactone_carbons
-    
-    lactone_c = lactone_carbons[0]
-    
-    # 获取环原子
-    ring_atoms = get_ring_atoms(mol)
-    
-    # 如果环原子数不匹配
-    if len(ring_atoms) != ring_size:
-        return None, None
-    
-    # 排列环原子
-    try:
-        ordered_atoms = order_ring_atoms_clockwise(mol, ring_atoms, lactone_c)
-    except Exception:
-        return None, None
-    
-    # 创建编号字典
-    numbering = {}
-    for position, atom_idx in enumerate(ordered_atoms, start=1):
-        numbering[atom_idx] = position
-    
-    return numbering, lactone_c
-
-
-def validate_numbering_general(numbering, ring_size):
-    """验证编号是否正确"""
-    if numbering is None or numbering == 'multiple_lactones':
-        return False
-    
-    if len(numbering) != ring_size:
-        return False
-    
-    positions = set(numbering.values())
-    if positions != set(range(1, ring_size + 1)):
-        return False
-    
-    return True
-
-
-def get_ring_size_from_smiles(smiles):
-    """从SMILES获取分子中的环大小"""
-    mol = Chem.MolFromSmiles(smiles)
-    if mol is None:
-        return []
-    
-    ring_sizes = set()
-    for ring_size in range(12, 21):  # 12-20
-        pattern = Chem.MolFromSmarts(f"[r{ring_size}]")
-        if pattern is not None:
-            matches = mol.GetSubstructMatches(pattern)
-            if matches:
-                unique_atoms = list(set([match[0] for match in matches]))
-                if len(unique_atoms) == ring_size:
-                    ring_sizes.add(ring_size)
-    
-    return sorted(list(ring_sizes))
-
-
-def process_single_molecule(idx, row, ring_size, output_base_dir, log_file, error_log_file, multiple_lactone_log):
-    """处理单个分子"""
-    try:
-        # 获取分子信息
-        molecule_id = row.get('IDs', f'mol_{idx}')
-        molecule_name = row.get('molecule_pref_name', 'Unknown')
-        smiles = row['smiles']
-        parent_id = f"ring{ring_size}_mol_{idx}"
-        
-        # 解析SMILES
-        mol = Chem.MolFromSmiles(smiles)
-        if mol is None:
-            log_message(f"分子 {idx} ({molecule_id}) SMILES解析失败", error_log_file, 'error')
-            return None, 'parse_failed'
-        
-        # 分配编号
-        numbering, lactone_c = assign_ring_numbering_general(mol, ring_size)
-        
-        # 检查环上是否有多个内酯键
-        if numbering == 'multiple_lactones':
-            log_message(
-                f"分子 {idx} ({molecule_id}, {molecule_name}) 环上有{len(lactone_c)}个内酯键，已剔除。环上内酯碳索引: {lactone_c}",
-                multiple_lactone_log,
-                'info'
-            )
-            return None, 'multiple_lactones'
-        
-        if numbering is None:
-            log_message(f"分子 {idx} ({molecule_id}) 编号失败", error_log_file, 'error')
-            return None, 'numbering_failed'
-        
-        # 验证编号
-        if not validate_numbering_general(numbering, ring_size):
-            log_message(f"分子 {idx} ({molecule_id}) 编号验证失败", error_log_file, 'error')
-            return None, 'validation_failed'
-        
-        # 侧链断裂
-        try:
-            mol_fragments = cleave_side_chains(mol, smiles, parent_id, numbering)
-        except Exception as e:
-            log_message(f"分子 {idx} ({molecule_id}) 裂解失败: {str(e)}", error_log_file, 'error')
-            return None, 'cleavage_failed'
-        
-        # 创建分子专属文件夹
-        mol_output_dir = output_base_dir / parent_id
-        mol_output_dir.mkdir(parents=True, exist_ok=True)
-        
-        # 保存整个MoleculeFragments对象
-        mol_fragments_path = mol_output_dir / f"{parent_id}_all_fragments.json"
-        mol_fragments.to_json_file(str(mol_fragments_path))
-        
-        # 保存每个碎片
-        for frag in mol_fragments.fragments:
-            frag_path = mol_output_dir / f"{frag.fragment_id}.json"
-            frag.to_json_file(str(frag_path))
-        
-        # 保存分子信息到metadata
-        metadata = {
-            'parent_id': parent_id,
-            'molecule_id': molecule_id,
-            'molecule_name': molecule_name,
-            'smiles': smiles,
-            'ring_size': ring_size,
-            'num_fragments': len(mol_fragments.fragments),
-            'processing_date': datetime.now().isoformat()
-        }
-        
-        metadata_path = mol_output_dir / 'metadata.json'
-        with open(metadata_path, 'w', encoding='utf-8') as f:
-            json.dump(metadata, f, indent=2, ensure_ascii=False)
-        
-        return len(mol_fragments.fragments), 'success'
-        
-    except Exception as e:
-        log_message(f"分子 {idx} 未预期错误: {str(e)}", error_log_file, 'error')
-        return None, 'unexpected_error'
-
-
-def log_message(message, log_file, log_type='info'):
-    """记录日志信息"""
-    timestamp = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
-    log_entry = f"[{timestamp}] {message}\n"
-    
-    with open(log_file, 'a', encoding='utf-8') as f:
-        f.write(log_entry)
-    
-    if log_type == 'error':
-        tqdm.write(f"❌ {message}")
-    elif log_type == 'info' and 'multiple' not in log_file.name:
-        tqdm.write(f"✓ {message}")
-
-
-def main():
-    print("="*80)
-    print("12-20元环大环内酯批量处理程序")
-    print("="*80)
-    print()
-    
-    # 读取数据
-    input_file = Path('data/ring12_20/temp.csv')
-    if not input_file.exists():
-        print(f"❌ 错误: 找不到输入文件 {input_file}")
-        return
-    
-    df = pd.read_csv(input_file)
-    print(f"✓ 读取数据集: {len(df)} 个分子")
-    print(f"  文件: {input_file}")
-    print()
-    
-    # 第一步：按环大小分类
-    print("步骤 1: 按环大小分类分子...")
-    print()
-    
-    ring_size_groups = defaultdict(list)
-    
-    for idx in tqdm(range(len(df)), desc="分类进度"):
-        smiles = df.iloc[idx]['smiles']
-        ring_sizes = get_ring_size_from_smiles(smiles)
-        
-        for ring_size in ring_sizes:
-            if 12 <= ring_size <= 20:
-                ring_size_groups[ring_size].append(idx)
-    
-    print()
-    print("分类结果:")
-    for ring_size in sorted(ring_size_groups.keys()):
-        print(f"  {ring_size}元环: {len(ring_size_groups[ring_size])} 个分子")
-    print()
-    
-    # 第二步：对每种环大小分别处理
-    print("步骤 2: 处理各环大小分子...")
-    print()
-    
-    all_stats = {}
-    
-    for ring_size in sorted(ring_size_groups.keys()):
-        indices = ring_size_groups[ring_size]
-        
-        print(f"\n处理 {ring_size}元环 ({len(indices)} 个分子)...")
-        print("-" * 60)
-        
-        # 创建输出文件夹
-        output_base_dir = Path(f'output/ring{ring_size}_fragments')
-        output_base_dir.mkdir(parents=True, exist_ok=True)
-        
-        # 创建日志文件
-        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
-        log_file = output_base_dir / f'processing_log_{timestamp}.txt'
-        error_log_file = output_base_dir / f'error_log_{timestamp}.txt'
-        multiple_lactone_log = output_base_dir / f'multiple_lactone_log_{timestamp}.txt'
-        
-        # 统计信息
-        stats = {
-            'ring_size': ring_size,
-            'total': len(indices),
-            'success': 0,
-            'failed_parse': 0,
-            'failed_numbering': 0,
-            'failed_validation': 0,
-            'failed_cleavage': 0,
-            'failed_unexpected': 0,
-            'multiple_lactones': 0,
-            'total_fragments': 0
-        }
-        
-        log_message(f"开始处理 {len(indices)} 个{ring_size}元环分子", log_file)
-        
-        # 处理每个分子
-        for idx in tqdm(indices, desc=f"Ring-{ring_size}"):
-            num_fragments, status = process_single_molecule(
-                idx, df.iloc[idx], ring_size,
-                output_base_dir, log_file, error_log_file, multiple_lactone_log
-            )
-            
-            if status == 'success':
-                stats['success'] += 1
-                stats['total_fragments'] += num_fragments
-            elif status == 'parse_failed':
-                stats['failed_parse'] += 1
-            elif status == 'numbering_failed':
-                stats['failed_numbering'] += 1
-            elif status == 'validation_failed':
-                stats['failed_validation'] += 1
-            elif status == 'cleavage_failed':
-                stats['failed_cleavage'] += 1
-            elif status == 'multiple_lactones':
-                stats['multiple_lactones'] += 1
-            else:
-                stats['failed_unexpected'] += 1
-        
-        # 保存统计结果
-        stats_file = output_base_dir / 'processing_statistics.json'
-        with open(stats_file, 'w', encoding='utf-8') as f:
-            json.dump(stats, f, indent=2)
-        
-        all_stats[ring_size] = stats
-        
-        # 打印该环大小的统计
-        print()
-        print(f"{ring_size}元环处理结果:")
-        print(f"  成功: {stats['success']}/{stats['total']} ({stats['success']/stats['total']*100:.1f}%)")
-        print(f"  碎片数: {stats['total_fragments']}")
-        if stats['success'] > 0:
-            print(f"  平均碎片: {stats['total_fragments']/stats['success']:.2f} 个/分子")
-        print(f"  多个内酯键: {stats['multiple_lactones']}")
-        
-        log_message(f"处理完成: 成功 {stats['success']}/{stats['total']}", log_file)
-    
-    # 第三步：总结
-    print()
-    print("="*80)
-    print("所有环大小处理完成！")
-    print("="*80)
-    print()
-    
-    print("总体统计:")
-    total_molecules = 0
-    total_success = 0
-    total_fragments = 0
-    total_multiple_lactones = 0
-    
-    for ring_size, stats in sorted(all_stats.items()):
-        total_molecules += stats['total']
-        total_success += stats['success']
-        total_fragments += stats['total_fragments']
-        total_multiple_lactones += stats['multiple_lactones']
-    
-    print(f"  总分子数: {total_molecules}")
-    print(f"  总成功数: {total_success} ({total_success/total_molecules*100:.1f}%)")
-    print(f"  总碎片数: {total_fragments}")
-    if total_success > 0:
-        print(f"  平均碎片: {total_fragments/total_success:.2f} 个/分子")
-    print(f"  多内酯键分子: {total_multiple_lactones}")
-    print()
-    
-    print("各环大小输出文件夹:")
-    for ring_size in sorted(all_stats.keys()):
-        output_dir = Path(f'output/ring{ring_size}_fragments')
-        print(f"  {ring_size}元环: {output_dir}")
-    print()
-
-
-if __name__ == '__main__':
+if __name__ == "__main__":
+    sys.argv = ["macro-lactone-toolkit", "fragment", *sys.argv[1:]]
     main()
-
diff --git a/scripts/batch_process_ring16.py b/scripts/batch_process_ring16.py
old mode 100755
new mode 100644
index 7b1f0d5..52c9a7c
--- a/scripts/batch_process_ring16.py
+++ b/scripts/batch_process_ring16.py
@@ -1,259 +1,10 @@
-#!/usr/bin/env python3
-"""
-批量处理16元环大环内酯分子
-处理文件: ring16/temp_filtered_complete.csv (1241个分子)
-输出文件夹: output/ring16_fragments/
-"""
+from __future__ import annotations
 
 import sys
-from pathlib import Path
 
-# 添加项目根目录到 Python 路径
-# 获取脚本所在目录的父目录（项目根目录）
-script_dir = Path(__file__).resolve().parent
-project_root = script_dir.parent
-sys.path.insert(0, str(project_root))
-
-from rdkit import Chem
-import pandas as pd
-import json
-from tqdm import tqdm
-from datetime import datetime
-
-from src.ring_numbering import (
-    assign_ring_numbering,
-    validate_numbering,
-    get_ring_atoms
-)
-from src.fragment_cleaver import cleave_side_chains
+from macro_lactone_toolkit.cli import main
 
 
-def log_message(message, log_file, log_type='info'):
-    """记录日志信息"""
-    timestamp = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
-    log_entry = f"[{timestamp}] {message}\n"
-    
-    with open(log_file, 'a', encoding='utf-8') as f:
-        f.write(log_entry)
-    
-    if log_type == 'error':
-        print(f"❌ {message}")
-    else:
-        print(f"✓ {message}")
-
-
-def process_single_molecule(idx, row, output_base_dir, log_file, error_log_file, multiple_lactone_log):
-    """处理单个分子"""
-    try:
-        # 获取分子信息
-        molecule_id = row['IDs']
-        molecule_name = row['molecule_pref_name']
-        smiles = row['smiles']
-        parent_id = f"ring16_mol_{idx}"
-        
-        # 解析SMILES
-        mol = Chem.MolFromSmiles(smiles)
-        if mol is None:
-            log_message(f"分子 {idx} ({molecule_id}) SMILES解析失败", error_log_file, 'error')
-            return None, 'parse_failed'
-        
-        # 检查环上是否有多个内酯键
-        from src.ring_numbering import find_lactone_carbon, get_ring_atoms
-        
-        # 获取16元环上的所有原子
-        ring_atoms = get_ring_atoms(mol)
-        
-        if len(ring_atoms) > 0:
-            # 使用更严格的SMARTS模式：环上的氧连接到环上的羰基碳
-            # [r16;O] 环上的氧，连接到 [r16;#6] 环上的碳，该碳有双键氧 (=O)
-            lactone_pattern = Chem.MolFromSmarts("[r16;#8][r16;#6](=[#8])")
-            
-            if lactone_pattern:
-                matches = mol.GetSubstructMatches(lactone_pattern)
-                
-                # 提取环上的内酯碳索引（match[1]是羰基碳）
-                lactone_carbons_on_ring = []
-                for match in matches:
-                    if len(match) >= 2:
-                        carbonyl_carbon = match[1]  # 羰基碳
-                        # 确保这个碳确实在16元环上
-                        if carbonyl_carbon in ring_atoms:
-                            lactone_carbons_on_ring.append(carbonyl_carbon)
-                
-                # 去重
-                lactone_carbons_on_ring = list(set(lactone_carbons_on_ring))
-                
-                # 只有当环上有多个内酯键时才过滤
-                if len(lactone_carbons_on_ring) > 1:
-                    log_message(
-                        f"分子 {idx} ({molecule_id}, {molecule_name}) 环上有{len(lactone_carbons_on_ring)}个内酯键，已过滤。环上内酯碳索引: {lactone_carbons_on_ring}",
-                        multiple_lactone_log,
-                        'info'
-                    )
-                    return None, 'multiple_lactones'
-        
-        # 分配编号
-        numbering = assign_ring_numbering(mol)
-        if len(numbering) == 0:
-            log_message(f"分子 {idx} ({molecule_id}) 编号失败", error_log_file, 'error')
-            return None, 'numbering_failed'
-        
-        # 验证编号
-        if not validate_numbering(mol, numbering):
-            log_message(f"分子 {idx} ({molecule_id}) 编号验证失败", error_log_file, 'error')
-            return None, 'validation_failed'
-        
-        # 侧链断裂
-        try:
-            mol_fragments = cleave_side_chains(mol, smiles, parent_id, numbering)
-        except Exception as e:
-            log_message(f"分子 {idx} ({molecule_id}) 裂解失败: {str(e)}", error_log_file, 'error')
-            return None, 'cleavage_failed'
-        
-        # 创建分子专属文件夹
-        mol_output_dir = output_base_dir / parent_id
-        mol_output_dir.mkdir(parents=True, exist_ok=True)
-        
-        # 保存整个MoleculeFragments对象
-        mol_fragments_path = mol_output_dir / f"{parent_id}_all_fragments.json"
-        mol_fragments.to_json_file(str(mol_fragments_path))
-        
-        # 保存每个碎片
-        for frag in mol_fragments.fragments:
-            frag_path = mol_output_dir / f"{frag.fragment_id}.json"
-            frag.to_json_file(str(frag_path))
-        
-        # 保存分子信息到metadata
-        metadata = {
-            'parent_id': parent_id,
-            'molecule_id': molecule_id,
-            'molecule_name': molecule_name,
-            'smiles': smiles,
-            'ring_size': 16,
-            'num_fragments': len(mol_fragments.fragments),
-            'processing_date': datetime.now().isoformat()
-        }
-        
-        metadata_path = mol_output_dir / 'metadata.json'
-        with open(metadata_path, 'w', encoding='utf-8') as f:
-            json.dump(metadata, f, indent=2, ensure_ascii=False)
-        
-        return len(mol_fragments.fragments), 'success'
-        
-    except Exception as e:
-        log_message(f"分子 {idx} 未预期错误: {str(e)}", error_log_file, 'error')
-        return None, 'unexpected_error'
-
-
-def main():
-    print("="*80)
-    print("16元环大环内酯批量处理程序")
-    print("="*80)
-    print()
-    
-    # 读取数据
-    input_file = Path('ring16/temp_filtered_complete.csv')
-    if not input_file.exists():
-        print(f"❌ 错误: 找不到输入文件 {input_file}")
-        return
-    
-    df = pd.read_csv(input_file)
-    print(f"✓ 读取数据集: {len(df)} 个分子")
-    print(f"  文件: {input_file}")
-    print()
-    
-    # 创建输出文件夹
-    output_base_dir = Path('output/ring16_fragments')
-    output_base_dir.mkdir(parents=True, exist_ok=True)
-    
-    # 创建日志文件
-    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
-    log_file = output_base_dir / f'processing_log_{timestamp}.txt'
-    error_log_file = output_base_dir / f'error_log_{timestamp}.txt'
-    multiple_lactone_log = output_base_dir / f'multiple_lactone_log_{timestamp}.txt'
-    
-    print(f"✓ 输出文件夹: {output_base_dir}")
-    print(f"✓ 日志文件: {log_file}")
-    print()
-    
-    # 统计信息
-    stats = {
-        'total': len(df),
-        'success': 0,
-        'failed_parse': 0,
-        'failed_numbering': 0,
-        'failed_validation': 0,
-        'failed_cleavage': 0,
-        'failed_unexpected': 0,
-        'multiple_lactones': 0,
-        'total_fragments': 0
-    }
-    
-    # 批量处理
-    print("开始批量处理...")
-    print()
-    
-    log_message(f"开始批量处理 {len(df)} 个16元环分子", log_file)
-    
-    for idx in tqdm(range(len(df)), desc="处理进度"):
-        num_fragments, status = process_single_molecule(
-            idx, df.iloc[idx], output_base_dir, log_file, error_log_file, multiple_lactone_log
-        )
-        
-        if status == 'success':
-            stats['success'] += 1
-            stats['total_fragments'] += num_fragments
-        elif status == 'parse_failed':
-            stats['failed_parse'] += 1
-        elif status == 'numbering_failed':
-            stats['failed_numbering'] += 1
-        elif status == 'validation_failed':
-            stats['failed_validation'] += 1
-        elif status == 'cleavage_failed':
-            stats['failed_cleavage'] += 1
-        elif status == 'multiple_lactones':
-            stats['multiple_lactones'] += 1
-        else:
-            stats['failed_unexpected'] += 1
-    
-    print()
-    print("="*80)
-    print("处理完成！")
-    print("="*80)
-    print()
-    
-    # 打印统计结果
-    print("统计结果:")
-    print(f"  总分子数: {stats['total']}")
-    print(f"  成功处理: {stats['success']} ({stats['success']/stats['total']*100:.2f}%)")
-    print(f"  总碎片数: {stats['total_fragments']}")
-    if stats['success'] > 0:
-        print(f"  平均碎片: {stats['total_fragments']/stats['success']:.2f} 个/分子")
-    print()
-    
-    print("失败情况:")
-    print(f"  SMILES解析失败: {stats['failed_parse']}")
-    print(f"  编号失败: {stats['failed_numbering']}")
-    print(f"  验证失败: {stats['failed_validation']}")
-    print(f"  裂解失败: {stats['failed_cleavage']}")
-    print(f"  多个内酯键: {stats['multiple_lactones']}")
-    print(f"  其他错误: {stats['failed_unexpected']}")
-    total_failed = stats['total'] - stats['success']
-    print(f"  总失败: {total_failed} ({total_failed/stats['total']*100:.2f}%)")
-    print()
-    
-    # 保存统计结果
-    stats_file = output_base_dir / 'processing_statistics.json'
-    with open(stats_file, 'w', encoding='utf-8') as f:
-        json.dump(stats, f, indent=2)
-    
-    print(f"✓ 统计结果已保存: {stats_file}")
-    print(f"✓ 输出文件夹: {output_base_dir}")
-    print()
-    
-    log_message(f"处理完成: 成功 {stats['success']}/{stats['total']}", log_file)
-
-
-if __name__ == '__main__':
+if __name__ == "__main__":
+    sys.argv = ["macro-lactone-toolkit", "fragment", "--ring-size", "16", *sys.argv[1:]]
     main()
-
diff --git a/scripts/generate_sdf_and_statistics.py b/scripts/generate_sdf_and_statistics.py
index 01c7ba5..4dc6d6a 100644
--- a/scripts/generate_sdf_and_statistics.py
+++ b/scripts/generate_sdf_and_statistics.py
@@ -1,203 +1,12 @@
-#!/usr/bin/env python3
-"""
-统计分析795个成功处理的分子，并生成SDF文件
-
-功能：
-1. 统计哪些位置被替换（断裂）最多
-2. 为每个分子生成3D SDF文件（使用ETKDGv3）
-"""
-
-import sys
-from pathlib import Path
-from collections import Counter, defaultdict
-import json
-from tqdm import tqdm
-
-# 添加项目根目录到 Python 路径
-script_dir = Path(__file__).resolve().parent
-project_root = script_dir.parent
-sys.path.insert(0, str(project_root))
-
-from rdkit import Chem
-from rdkit.Chem import AllChem
-
-from src.fragment_dataclass import MoleculeFragments
+from macro_lactone_toolkit import FragmentationResult
 
 
-def load_all_fragments(output_dir):
-    """加载所有分子的碎片信息"""
-    fragments_dir = Path(output_dir) / "ring16_fragments"
-    mol_dirs = sorted([d for d in fragments_dir.glob("ring16_mol_*") if d.is_dir()])
-    
-    all_fragments = []
-    for mol_dir in tqdm(mol_dirs, desc="加载碎片信息"):
-        fragments_file = mol_dir / f"{mol_dir.name}_all_fragments.json"
-        if fragments_file.exists():
-            try:
-                mol_fragments = MoleculeFragments.from_json_file(str(fragments_file))
-                all_fragments.append((mol_dir, mol_fragments))
-            except Exception as e:
-                print(f"警告: 无法加载 {fragments_file}: {e}")
-    
-    return all_fragments
+def main() -> None:
+    raise SystemExit(
+        "Legacy helper retired. Use 'macro-lactone-toolkit fragment' to export fragments, "
+        "then generate SDF/statistics in downstream analysis code."
+    )
 
 
-def analyze_cleavage_positions(all_fragments):
-    """统计分析哪些位置被替换最多"""
-    position_counter = Counter()
-    position_molecules = defaultdict(list)  # 记录每个位置出现在哪些分子中
-    
-    for mol_dir, mol_fragments in all_fragments:
-        positions = mol_fragments.get_cleavage_positions()
-        for pos in positions:
-            position_counter[pos] += 1
-            position_molecules[pos].append(mol_dir.name)
-    
-    # 打印统计结果
-    print("\n" + "="*80)
-    print("位置替换统计（按替换次数排序）")
-    print("="*80)
-    print(f"{'位置':<8} {'替换次数':<12} {'占比':<10} {'分子数':<10}")
-    print("-"*80)
-    
-    total_replacements = sum(position_counter.values())
-    for pos in sorted(position_counter.keys()):
-        count = position_counter[pos]
-        percentage = count / total_replacements * 100 if total_replacements > 0 else 0
-        mol_count = len(set(position_molecules[pos]))
-        print(f"{pos:<8} {count:<12} {percentage:>6.2f}%   {mol_count:<10}")
-    
-    print("-"*80)
-    print(f"{'总计':<8} {total_replacements:<12} {'100.00%':<10} {len(all_fragments):<10}")
-    print("="*80)
-    
-    # 保存统计结果
-    stats = {
-        'position_counts': dict(position_counter),
-        'position_molecules': {str(k): list(set(v)) for k, v in position_molecules.items()},
-        'total_replacements': total_replacements,
-        'total_molecules': len(all_fragments)
-    }
-    
-    stats_file = Path(output_dir) / "ring16_fragments" / "cleavage_position_statistics.json"
-    with open(stats_file, 'w', encoding='utf-8') as f:
-        json.dump(stats, f, indent=2, ensure_ascii=False)
-    
-    print(f"\n✓ 统计结果已保存: {stats_file}")
-    
-    return position_counter, position_molecules
-
-
-def generate_3d_sdf(mol, output_path, max_attempts=100):
-    """使用ETKDGv3生成3D SDF文件"""
-    try:
-        # 添加氢原子
-        mol = Chem.AddHs(mol)
-        
-        # 使用ETKDGv3参数
-        params = AllChem.ETKDGv3()
-        
-        for attempt in range(max_attempts):
-            try:
-                status = AllChem.EmbedMolecule(mol, params)
-                if status == 0:
-                    # UFF优化
-                    AllChem.UFFOptimizeMolecule(mol)
-                    
-                    writer = Chem.SDWriter(str(output_path))
-                    writer.write(mol)
-                    writer.close()
-                    return mol, True
-            except Exception as e:
-                if attempt == max_attempts - 1:
-                    raise e
-                continue
-        
-        return None, False
-    except Exception as e:
-        print(f"生成3D SDF失败: {e}")
-        return None, False
-
-
-def generate_sdf_files(all_fragments, output_dir):
-    """为所有分子生成3D SDF文件"""
-    fragments_dir = Path(output_dir) / "ring16_fragments"
-    
-    print("\n" + "="*80)
-    print("生成3D SDF文件")
-    print("="*80)
-    
-    success_count = 0
-    failed_count = 0
-    failed_molecules = []  # 记录失败的分子
-    
-    for mol_dir, mol_fragments in tqdm(all_fragments, desc="生成3D SDF"):
-        smiles = mol_fragments.parent_smiles
-        
-        # 先检查能否从SMILES创建分子
-        test_mol = Chem.MolFromSmiles(smiles)
-        if test_mol is None:
-            failed_count += 1
-            failed_molecules.append((mol_dir.name, "无法从SMILES创建分子"))
-            continue
-        
-        base_name = mol_dir.name
-        sdf_3d_path = mol_dir / f"{base_name}_3d.sdf"
-        
-        # 重试机制：最多尝试10次，每次从原始SMILES重新创建分子
-        max_retries = 10
-        success = False
-        
-        for retry in range(max_retries):
-            # 每次重试都从原始SMILES重新创建分子
-            mol = Chem.MolFromSmiles(smiles)
-            if mol is None:
-            continue
-        
-            mol_3d, success = generate_3d_sdf(mol, sdf_3d_path, max_attempts=100)
-            if success and mol_3d is not None:
-                success_count += 1
-                break
-        
-        if not success:
-            failed_count += 1
-            failed_molecules.append((mol_dir.name, f"10次重试后仍失败"))
-    
-    print(f"\n✓ 成功生成3D SDF: {success_count}/{len(all_fragments)}")
-    if failed_count > 0:
-        print(f"⚠ 失败: {failed_count}")
-        print("\n失败的分子列表:")
-        for mol_name, reason in failed_molecules:
-            print(f"  - {mol_name}: {reason}")
-
-
-def main():
-    global output_dir
-    output_dir = Path('output')
-    
-    print("="*80)
-    print("16元环大环内酯统计分析及SDF生成程序")
-    print("="*80)
-    print()
-    
-    # 1. 加载所有碎片信息
-    print("步骤1: 加载所有分子的碎片信息...")
-    all_fragments = load_all_fragments(output_dir)
-    print(f"✓ 加载了 {len(all_fragments)} 个分子的碎片信息")
-    
-    # 2. 统计分析位置替换
-    print("\n步骤2: 统计分析位置替换...")
-    position_counter, position_molecules = analyze_cleavage_positions(all_fragments)
-    
-    # 3. 生成3D SDF文件
-    print("\n步骤3: 生成3D SDF文件...")
-    generate_sdf_files(all_fragments, output_dir)
-    
-    print("\n" + "="*80)
-    print("处理完成！")
-    print("="*80)
-
-
-if __name__ == '__main__':
+if __name__ == "__main__":
     main()
-
diff --git a/scripts/tylosin_splicer.py b/scripts/tylosin_splicer.py
index 71f04d4..17dbc70 100644
--- a/scripts/tylosin_splicer.py
+++ b/scripts/tylosin_splicer.py
@@ -1,10 +1,34 @@
-#!/usr/bin/env python
-"""
-Tylosin Splicing System - Main Entry Point
-"""
+from __future__ import annotations
+
+import argparse
+
+from rdkit import Chem
+
+from macro_lactone_toolkit.splicing import (
+    activate_fragment,
+    prepare_macrolactone_scaffold,
+    splice_molecule,
+)
+
+
+def main() -> None:
+    parser = argparse.ArgumentParser(description="Splice a fragment onto a prepared macrolactone scaffold.")
+    parser.add_argument("--scaffold-smiles", required=True)
+    parser.add_argument("--fragment-smiles", required=True)
+    parser.add_argument("--position", required=True, type=int)
+    parser.add_argument("--ring-size", type=int, default=None)
+    parser.add_argument("--strategy", default="smart")
+    args = parser.parse_args()
+
+    scaffold, _ = prepare_macrolactone_scaffold(
+        args.scaffold_smiles,
+        positions=[args.position],
+        ring_size=args.ring_size,
+    )
+    fragment = activate_fragment(args.fragment_smiles, strategy=args.strategy)
+    product = splice_molecule(scaffold, fragment, position=args.position)
+    print(Chem.MolToSmiles(product, isomericSmiles=True))
 
-def main():
-    print("Hello from Tylosin Splicer")
 
 if __name__ == "__main__":
     main()
diff --git a/setup.py b/setup.py
deleted file mode 100644
index 07f32e6..0000000
--- a/setup.py
+++ /dev/null
@@ -1,75 +0,0 @@
-"""
-Setup script for macrolactone-fragmenter package.
-
-This package provides tools for analyzing and fragmenting macrolactone 
-(12-20 membered ring) side chains.
-
-Note: RDKit must be installed separately via conda:
-    conda install -c conda-forge rdkit
-
-For development installation:
-    pip install -e .
-    
-For regular installation:
-    pip install .
-"""
-
-from setuptools import setup, find_packages
-from pathlib import Path
-
-# Read the contents of README file
-this_directory = Path(__file__).parent
-long_description = (this_directory / "README.md").read_text(encoding="utf-8")
-
-setup(
-    name="macrolactone-fragmenter",
-    version="1.2.0",
-    author="Macro Split Team",
-    author_email="your.email@example.com",
-    description="A tool for analyzing and fragmenting macrolactone side chains",
-    long_description=long_description,
-    long_description_content_type="text/markdown",
-    url="https://github.com/yourusername/macro_split",
-    packages=find_packages(),
-    classifiers=[
-        "Development Status :: 4 - Beta",
-        "Intended Audience :: Science/Research",
-        "Topic :: Scientific/Engineering :: Chemistry",
-        "License :: OSI Approved :: MIT License",
-        "Programming Language :: Python :: 3",
-        "Programming Language :: Python :: 3.8",
-        "Programming Language :: Python :: 3.9",
-        "Programming Language :: Python :: 3.10",
-        "Programming Language :: Python :: 3.11",
-    ],
-    python_requires=">=3.8",
-    install_requires=[
-        "pandas>=1.3.0",
-        "numpy>=1.20.0",
-        "matplotlib>=3.3.0",
-        "seaborn>=0.11.0",
-        "dataclasses-json>=0.5.0",
-        "tqdm>=4.60.0",
-    ],
-    extras_require={
-        "dev": [
-            "jupyter>=1.0.0",
-            "pytest>=7.0.0",
-            "black>=22.0.0",
-            "flake8>=4.0.0",
-        ],
-        "docs": [
-            "mkdocs>=1.4.0",
-            "mkdocs-material>=9.0.0",
-            "mkdocstrings[python]>=0.20.0",
-        ],
-    },
-    entry_points={
-        "console_scripts": [
-            # Can add command-line tools here
-        ],
-    },
-    include_package_data=True,
-    zip_safe=False,
-)
-
diff --git a/src/__init__.py b/src/__init__.py
deleted file mode 100644
index 0f5d380..0000000
--- a/src/__init__.py
+++ /dev/null
@@ -1,6 +0,0 @@
-"""
-16-Membered Macrolactone Side Chain Analysis Package
-"""
-
-__version__ = "1.0.0"
-
diff --git a/src/fragment_cleaver.py b/src/fragment_cleaver.py
deleted file mode 100644
index 79d3e1b..0000000
--- a/src/fragment_cleaver.py
+++ /dev/null
@@ -1,219 +0,0 @@
-"""
-Side chain cleavage and fragment extraction for macrolactones.
-"""
-from rdkit import Chem
-from rdkit.Chem import Descriptors, rdMolDescriptors
-from typing import List, Tuple, Optional, Dict
-from .ring_numbering import assign_ring_numbering, get_ring_position
-from .fragment_dataclass import Fragment, MoleculeFragments
-
-
-def identify_side_chains(mol: Chem.Mol, ring_atoms: List[int]) -> List[Tuple[int, int]]:
-    """
-    Identify side chains attached to the ring.
-    
-    Args:
-        mol: RDKit molecule object
-        ring_atoms: List of atom indices in the 16-membered ring
-    
-    Returns:
-        List of tuples (ring_atom_idx, side_chain_first_atom_idx)
-    """
-    side_chains = []
-    ring_atom_set = set(ring_atoms)
-    
-    for ring_atom_idx in ring_atoms:
-        atom = mol.GetAtomWithIdx(ring_atom_idx)
-        
-        for neighbor in atom.GetNeighbors():
-            neighbor_idx = neighbor.GetIdx()
-            
-            # If neighbor is not in the ring, it's a side chain
-            if neighbor_idx not in ring_atom_set:
-                side_chains.append((ring_atom_idx, neighbor_idx))
-    
-    return side_chains
-
-
-def extract_side_chain_fragment(mol: Chem.Mol, ring_atom_idx: int, side_chain_start_idx: int, 
-                                 ring_atoms: List[int]) -> Optional[str]:
-    """
-    Extract a side chain fragment as a SMILES string.
-    
-    Args:
-        mol: RDKit molecule object
-        ring_atom_idx: Index of the ring atom where side chain is attached
-        side_chain_start_idx: Index of the first atom in the side chain
-        ring_atoms: List of all ring atom indices
-    
-    Returns:
-        SMILES string of the fragment, or None if extraction failed
-    """
-    # Create an editable copy
-    emol = Chem.RWMol(mol)
-    
-    # Find the bond between ring and side chain
-    bond = mol.GetBondBetweenAtoms(ring_atom_idx, side_chain_start_idx)
-    if bond is None:
-        return None
-    
-    # Get all atoms in the side chain by BFS from side_chain_start_idx
-    # but not crossing into the ring
-    ring_atom_set = set(ring_atoms)
-    visited = set()
-    queue = [side_chain_start_idx]
-    side_chain_atoms = []
-    
-    while queue:
-        current_idx = queue.pop(0)
-        if current_idx in visited:
-            continue
-        
-        visited.add(current_idx)
-        side_chain_atoms.append(current_idx)
-        
-        atom = mol.GetAtomWithIdx(current_idx)
-        for neighbor in atom.GetNeighbors():
-            neighbor_idx = neighbor.GetIdx()
-            
-            # Only continue into non-ring atoms
-            if neighbor_idx not in ring_atom_set and neighbor_idx not in visited:
-                queue.append(neighbor_idx)
-    
-    if not side_chain_atoms:
-        return None
-    
-    # Create a new molecule with only the side chain atoms
-    # We'll use Chem.RWMol to build it
-    fragment_mol = Chem.RWMol()
-    
-    # Map old atom indices to new ones
-    old_to_new = {}
-    
-    # Add atoms
-    for old_idx in side_chain_atoms:
-        atom = mol.GetAtomWithIdx(old_idx)
-        new_atom = Chem.Atom(atom.GetAtomicNum())
-        new_atom.SetFormalCharge(atom.GetFormalCharge())
-        new_atom.SetIsAromatic(atom.GetIsAromatic())
-        new_idx = fragment_mol.AddAtom(new_atom)
-        old_to_new[old_idx] = new_idx
-    
-    # Add bonds
-    for old_idx in side_chain_atoms:
-        atom = mol.GetAtomWithIdx(old_idx)
-        for neighbor in atom.GetNeighbors():
-            neighbor_idx = neighbor.GetIdx()
-            
-            # Only add bonds within the side chain
-            if neighbor_idx in old_to_new and old_idx < neighbor_idx:
-                bond = mol.GetBondBetweenAtoms(old_idx, neighbor_idx)
-                fragment_mol.AddBond(
-                    old_to_new[old_idx],
-                    old_to_new[neighbor_idx],
-                    bond.GetBondType()
-                )
-    
-    # Convert to molecule and get SMILES
-    try:
-        fragment_mol = fragment_mol.GetMol()
-        Chem.SanitizeMol(fragment_mol)
-        fragment_smiles = Chem.MolToSmiles(fragment_mol)
-        return fragment_smiles
-    except:
-        # If sanitization fails, return None
-        return None
-
-
-def cleave_side_chains(mol: Chem.Mol, parent_smiles: str, parent_id: str, 
-                       numbering: Dict[int, int]) -> MoleculeFragments:
-    """
-    Cleave all side chains from a macrolactone and create fragment records.
-    
-    Args:
-        mol: RDKit molecule object
-        parent_smiles: SMILES string of the parent molecule
-        parent_id: Identifier for the parent molecule (e.g., "macro_0")
-        numbering: Ring numbering dictionary from assign_ring_numbering
-    
-    Returns:
-        MoleculeFragments object containing all fragments
-    """
-    mol_fragments = MoleculeFragments(
-        parent_id=parent_id,
-        parent_smiles=parent_smiles,
-        ring_numbering={str(k): v for k, v in numbering.items()}  # Convert to str keys for JSON
-    )
-    
-    ring_atoms = list(numbering.keys())
-    side_chains = identify_side_chains(mol, ring_atoms)
-    
-    fragment_counter = 0
-    
-    for ring_atom_idx, side_chain_start_idx in side_chains:
-        # Get the ring position
-        cleavage_position = get_ring_position(numbering, ring_atom_idx)
-        
-        if cleavage_position is None:
-            continue
-        
-        # Extract the fragment
-        fragment_smiles = extract_side_chain_fragment(
-            mol, ring_atom_idx, side_chain_start_idx, ring_atoms
-        )
-        
-        if fragment_smiles is None:
-            continue
-        
-        # Create fragment molecule for properties
-        frag_mol = Chem.MolFromSmiles(fragment_smiles)
-        if frag_mol is None:
-            continue
-        
-        # Create Fragment object
-        fragment = Fragment(
-            fragment_smiles=fragment_smiles,
-            parent_smiles=parent_smiles,
-            cleavage_position=cleavage_position,
-            fragment_id=f"{parent_id}_frag_{fragment_counter}",
-            parent_id=parent_id,
-            atom_count=frag_mol.GetNumAtoms(),
-            molecular_weight=round(Descriptors.MolWt(frag_mol), 2)
-        )
-        
-        mol_fragments.add_fragment(fragment)
-        fragment_counter += 1
-    
-    return mol_fragments
-
-
-def process_molecule(smiles: str, molecule_idx: int) -> Optional[MoleculeFragments]:
-    """
-    Process a single molecule: assign numbering and cleave side chains.
-    
-    Args:
-        smiles: SMILES string of the molecule
-        molecule_idx: Index number for naming (e.g., 0 -> "macro_0")
-    
-    Returns:
-        MoleculeFragments object or None if processing failed
-    """
-    parent_id = f"macro_{molecule_idx}"
-    
-    # Parse molecule
-    mol = Chem.MolFromSmiles(smiles)
-    if mol is None:
-        print(f"Error: Could not parse SMILES for {parent_id}")
-        return None
-    
-    # Assign ring numbering
-    numbering = assign_ring_numbering(mol)
-    if not numbering:
-        print(f"Error: Could not assign ring numbering for {parent_id}")
-        return None
-    
-    # Cleave side chains
-    mol_fragments = cleave_side_chains(mol, smiles, parent_id, numbering)
-    
-    return mol_fragments
-
diff --git a/src/fragment_dataclass.py b/src/fragment_dataclass.py
deleted file mode 100644
index 19ec446..0000000
--- a/src/fragment_dataclass.py
+++ /dev/null
@@ -1,95 +0,0 @@
-"""
-Fragment dataclass for storing side chain information.
-"""
-from dataclasses import dataclass, field
-from typing import List, Optional
-from dataclasses_json import dataclass_json
-import json
-
-
-@dataclass_json
-@dataclass
-class Fragment:
-    """
-    Data class to store fragment information from macrocycle side chain cleavage.
-    
-    Attributes:
-        fragment_smiles: SMILES string of the cleaved fragment (contains dummy atom '*' 
-                        at the attachment point for molzip compatibility)
-        parent_smiles: SMILES string of the parent molecule
-        cleavage_position: Position on the ring where cleavage occurred (1-N, where N is ring size)
-        fragment_id: Unique identifier for this fragment
-        parent_id: Identifier of the parent molecule (e.g., "macro_0")
-        atom_count: Number of atoms in the fragment
-        molecular_weight: Molecular weight of the fragment
-    
-    Note:
-        The fragment_smiles contains a dummy atom (*) at the attachment point, which allows
-        for easy reconstruction using RDKit's molzip functionality. This format is compatible
-        with SQLModel and PostgreSQL storage while maintaining chemical information.
-    """
-    fragment_smiles: str
-    parent_smiles: str
-    cleavage_position: int
-    fragment_id: str
-    parent_id: str
-    atom_count: Optional[int] = None
-    molecular_weight: Optional[float] = None
-    
-    def to_json_file(self, filepath: str):
-        """Save fragment to JSON file."""
-        with open(filepath, 'w') as f:
-            json.dump(self.to_dict(), f, indent=2, ensure_ascii=False)
-    
-    @classmethod
-    def from_json_file(cls, filepath: str):
-        """Load fragment from JSON file."""
-        with open(filepath, 'r') as f:
-            data = json.load(f)
-        return cls.from_dict(data)
-
-
-@dataclass_json
-@dataclass
-class MoleculeFragments:
-    """
-    Collection of all fragments from a single macrocycle molecule.
-    
-    Attributes:
-        parent_id: Identifier of the parent molecule
-        parent_smiles: SMILES string of the parent molecule
-        ring_numbering: Dictionary mapping atom indices to ring positions (1-16)
-        fragments: List of Fragment objects
-    """
-    parent_id: str
-    parent_smiles: str
-    ring_numbering: dict = field(default_factory=dict)
-    fragments: List[Fragment] = field(default_factory=list)
-    
-    def add_fragment(self, fragment: Fragment):
-        """Add a fragment to the collection."""
-        self.fragments.append(fragment)
-    
-    def to_json_file(self, filepath: str):
-        """Save all fragments to JSON file."""
-        with open(filepath, 'w') as f:
-            json.dump(self.to_dict(), f, indent=2, ensure_ascii=False)
-    
-    @classmethod
-    def from_json_file(cls, filepath: str):
-        """Load fragments from JSON file."""
-        with open(filepath, 'r') as f:
-            data = json.load(f)
-        return cls.from_dict(data)
-    
-    def get_cleavage_positions(self) -> List[int]:
-        """Get list of all cleavage positions."""
-        return [frag.cleavage_position for frag in self.fragments]
-    
-    def get_fragment_by_position(self, position: int) -> Optional[Fragment]:
-        """Get fragment by cleavage position."""
-        for frag in self.fragments:
-            if frag.cleavage_position == position:
-                return frag
-        return None
-
diff --git a/src/macro_lactone_analyzer.py b/src/macro_lactone_analyzer.py
deleted file mode 100644
index be980d1..0000000
--- a/src/macro_lactone_analyzer.py
+++ /dev/null
@@ -1,494 +0,0 @@
-#!/usr/bin/env python
-# -*- encoding: utf-8 -*-
-"""
-@file               :macro_lactone_analyzer.py
-@Description:       :大环内酯分析器 - 环数识别、验证和分类
-                    :
-                    :主要功能：
-                    1. 识别分子中所有12-20元环的大小
-                    2. 验证是否为有效的大环内酯（含有酯键）
-                    3. 支持单分子和批量分析
-                    4. 分类为单环和桥环分子
-                    5. 支持动态SMARTS模式匹配
-                    :
-@Date               :2025-11-09 (更新：2025-11-09)
-@Author             :参考 lyzeng 的设计
-@version            :2.0
-"""
-
-from dataclasses import dataclass, field
-from typing import List, Dict, Optional, Union, Tuple, Set
-from pathlib import Path
-import pandas as pd
-from rdkit import Chem
-from rdkit.Chem import Descriptors, Crippen, QED
-
-# 导入项目内的辅助函数
-from .ring_visualization import get_ring_atoms_by_size
-
-
-@dataclass
-class MacroLactoneAnalyzer:
-    """
-    大环内酯分析器 - 用于识别、验证和分类大环内酯分子。
-
-    属性:
-        smiles_list (List[str]): SMILES字符串列表，可通过add_smiles方法添加。
-        ester_patterns (List[str]): 用于识别酯键的SMARTS模式列表。
-    """
-    smiles_list: List[str] = field(default_factory=list)
-    ester_patterns: List[str] = field(default_factory=list)
-
-    def __post_init__(self):
-        """初始化后设置默认的酯键SMARTS模式"""
-        if not self.ester_patterns:
-            self.ester_patterns = [
-                "[C](=O)[O]",  # 通用酯键
-                "C(=O)O",     # 简单酯键
-            ]
-
-    @staticmethod
-    def detect_ring_sizes(mol: Chem.Mol) -> List[int]:
-        """
-        识别分子中所有12-20元环的大小
-
-        Args:
-            mol: RDKit分子对象
-
-        Returns:
-            环大小列表，按升序排列
-        """
-        if mol is None:
-            return []
-
-        ring_sizes = []
-        ring_info = mol.GetRingInfo()
-        rings = ring_info.AtomRings()
-
-        for ring in rings:
-            ring_size = len(ring)
-            if 12 <= ring_size <= 20:
-                ring_sizes.append(ring_size)
-
-        # 去重并排序
-        return sorted(list(set(ring_sizes)))
-
-    @staticmethod
-    def has_ester_on_ring(mol: Chem.Mol, ring_atoms: List[int], ester_patterns: Optional[List[str]] = None) -> bool:
-        """
-        检查环上是否有酯键（大环内酯特征）
-
-        Args:
-            mol: RDKit分子对象
-            ring_atoms: 环原子索引列表
-            ester_patterns: 可选的酯键SMARTS模式列表
-
-        Returns:
-            是否为大环内酯
-        """
-        if mol is None or not ring_atoms:
-            return False
-
-        ring_atoms_set = set(ring_atoms)
-        patterns = ester_patterns or ["[C](=O)[O]", "C(=O)O"]
-
-        for pattern_str in patterns:
-            pattern = Chem.MolFromSmarts(pattern_str)
-            if pattern is None:
-                continue
-
-            matches = mol.GetSubstructMatches(pattern)
-            for match in matches:
-                if len(match) >= 3:
-                    carbonyl_c = match[0]
-                    ester_o = match[2]
-                    # 检查酯键原子是否在环上
-                    if carbonyl_c in ring_atoms_set and ester_o in ring_atoms_set:
-                        return True
-
-        return False
-
-    @staticmethod
-    def is_valid_macrolactone(mol: Chem.Mol, size: int, ester_patterns: Optional[List[str]] = None) -> bool:
-        """
-        检查分子是否为指定大小的有效大环内酯
-
-        Args:
-            mol: RDKit分子对象
-            size: 环大小
-            ester_patterns: 可选的酯键SMARTS模式列表
-
-        Returns:
-            是否为有效大环内酯
-        """
-        if mol is None:
-            return False
-
-        # 找到指定大小的环
-        ring_atoms = get_ring_atoms_by_size(mol, size)
-        if not ring_atoms:
-            return False
-
-        # 检查环上是否有酯键
-        return MacroLactoneAnalyzer.has_ester_on_ring(mol, ring_atoms, ester_patterns)
-
-    @staticmethod
-    def analyze_smiles(smiles: str,
-                      ring_range: range = range(12, 21),
-                      ester_patterns: Optional[List[str]] = None) -> Optional[Dict[str, Union[int, List[int], bool]]]:
-        """
-        对单个SMILES字符串进行大环内酯检测和分析。
-
-        Args:
-            smiles: 分子的SMILES表示
-            ring_range: 需要检测的环大小范围，默认为12至20
-            ester_patterns: 可选的酯键SMARTS模式列表
-
-        Returns:
-            包含分析结果的字典：
-            - 'ring_sizes': 检测到的所有环大小列表
-            - 'valid_sizes': 有效大环内酯的环大小列表
-            - 'is_macrolactone': 是否为大环内酯（布尔值）
-            - 'has_ester': 是否含有酯键（布尔值）
-            - 'is_bridge': 是否为桥环（布尔值）
-            如果未匹配到任何大环内酯或解析失败则返回None
-        """
-        if not isinstance(smiles, str):
-            raise TypeError("输入必须为SMILES字符串。")
-
-        mol = Chem.MolFromSmiles(smiles)
-        if mol is None:
-            raise ValueError(f"无法解析SMILES: {smiles}")
-
-        # 检测环大小
-        ring_sizes = MacroLactoneAnalyzer.detect_ring_sizes(mol)
-
-        # 过滤出有效的内酯环
-        valid_sizes = []
-        for size in ring_range:
-            if size in ring_sizes:
-                if MacroLactoneAnalyzer.is_valid_macrolactone(mol, size, ester_patterns):
-                    valid_sizes.append(size)
-
-        # 分析结果
-        result = {
-            'ring_sizes': ring_sizes,
-            'valid_sizes': valid_sizes,
-            'is_macrolactone': len(valid_sizes) > 0,
-            'has_ester': any(
-                MacroLactoneAnalyzer.has_ester_on_ring(
-                    mol, get_ring_atoms_by_size(mol, size), ester_patterns
-                ) for size in ring_sizes
-            ),
-            'is_bridge': len(valid_sizes) > 1
-        }
-
-        return result if result['is_macrolactone'] else None
-
-    @staticmethod
-    def dynamic_smarts_match(smiles: str, ring_size: int) -> Optional[List[int]]:
-        """
-        使用动态构造的SMARTS模式匹配大环内酯
-
-        参考用户的analyze_smiles方法实现
-
-        Args:
-            smiles: SMILES字符串
-            ring_size: 环大小
-
-        Returns:
-            匹配到的原子索引列表，未匹配到则返回None
-        """
-        mol = Chem.MolFromSmiles(smiles)
-        if mol is None:
-            return None
-
-        # 动态构造SMARTS模式：[r{ring_size}]([#8][#6](=[#8]))
-        smarts = f'[r{ring_size}]([#8][#6](=[#8]))'
-        query = Chem.MolFromSmarts(smarts)
-        if query is None:
-            return None
-
-        matches = mol.GetSubstructMatches(query)
-        return matches[0] if matches else None
-
-    def get_single_ring_info(self,
-                            smiles: str,
-                            ring_range: range = range(12, 21),
-                            ester_patterns: Optional[List[str]] = None) -> Optional[Dict[str, Union[int, List[int], bool]]]:
-        """
-        对单个SMILES字符串返回大环内酯的详细信息。
-
-        Args:
-            smiles: 分子的SMILES表示
-            ring_range: 环大小检测范围，默认为12至20
-            ester_patterns: 可选的酯键SMARTS模式列表
-
-        Returns:
-            包含分析结果的字典；若未匹配到则返回None
-        """
-        return self.analyze_smiles(smiles, ring_range, ester_patterns)
-
-    def analyze_list(self,
-                    smiles_list: Optional[List[str]] = None,
-                    ring_range: range = range(12, 21),
-                    ester_patterns: Optional[List[str]] = None) -> Dict[str, Union[int, List[str]]]:
-        """
-        对SMILES字符串列表进行统计分析。
-
-        Args:
-            smiles_list: SMILES字符串列表，如果为None则使用实例的smiles_list
-            ring_range: 环大小检测范围
-            ester_patterns: 可选的酯键SMARTS模式列表
-
-        Returns:
-            统计结果字典，包含：
-            - 'total': 总分子数
-            - 'macrolactones': 大环内酯分子数
-            - 'bridge_rings': 桥环分子数
-            - 'ring_size_stats': 各环大小分子数统计
-            - 'valid_molecules': 有效大环内酯的SMILES列表
-            - 'bridge_molecules': 桥环分子的SMILES列表
-        """
-        target_list = smiles_list if smiles_list is not None else self.smiles_list
-
-        if not target_list:
-            return {'total': 0, 'macrolactones': 0}
-
-        stats = {
-            'total': len(target_list),
-            'macrolactones': 0,
-            'bridge_rings': 0,
-            'ring_size_stats': {size: 0 for size in ring_range},
-            'valid_molecules': [],
-            'bridge_molecules': [],
-            'failed_molecules': []
-        }
-
-        for smiles in target_list:
-            try:
-                result = self.analyze_smiles(smiles, ring_range, ester_patterns)
-                if result:
-                    stats['macrolactones'] += 1
-                    if result['is_bridge']:
-                        stats['bridge_rings'] += 1
-                        stats['bridge_molecules'].append(smiles)
-                    else:
-                        size = result['valid_sizes'][0]
-                        stats['ring_size_stats'][size] += 1
-                        stats['valid_molecules'].append(smiles)
-            except Exception as e:
-                stats['failed_molecules'].append((smiles, str(e)))
-
-        return stats
-
-    def classify_molecules(self,
-                          df: pd.DataFrame,
-                          smiles_column: str = 'smiles',
-                          id_column: Optional[str] = None,
-                          ring_range: range = range(12, 21),
-                          ester_patterns: Optional[List[str]] = None) -> Tuple[Dict[int, pd.DataFrame], pd.DataFrame]:
-        """
-        将DataFrame中的分子按环大小分类（参考notebook中的classify_molecules_by_ring_size函数）
-
-        Args:
-            df: 包含SMILES的DataFrame
-            smiles_column: SMILES列名
-            id_column: ID列名（可选）
-            ring_range: 环大小检测范围
-            ester_patterns: 可选的酯键SMARTS模式列表
-
-        Returns:
-            Tuple of (ring_size_to_df_dict, ambiguous_df)
-            - ring_size_to_df_dict: 环大小 -> DataFrame映射
-            - ambiguous_df: 有多个可能环数的分子DataFrame
-        """
-        print(f"开始对 {len(df)} 个分子进行环数识别和分类...")
-
-        # 初始化结果
-        ring_size_dfs = {size: [] for size in ring_range}
-        ambiguous_molecules = []
-
-        stats = {
-            'processed': 0,
-            'single_ring': {size: 0 for size in ring_range},
-            'ambiguous': 0,
-            'no_valid_ring': 0,
-            'failed': 0
-        }
-
-        for idx, row in df.iterrows():
-            smiles = row[smiles_column]
-            mol_id = row[id_column] if id_column else f"row_{idx}"
-
-            try:
-                mol = Chem.MolFromSmiles(smiles)
-                if mol is None:
-                    stats['no_valid_ring'] += 1
-                    continue
-
-                # 检测环大小
-                ring_sizes = self.detect_ring_sizes(mol)
-
-                # 过滤出有效的内酯环
-                valid_sizes = []
-                for size in ring_sizes:
-                    if self.is_valid_macrolactone(mol, size, ester_patterns):
-                        valid_sizes.append(size)
-
-                # 分类
-                if len(valid_sizes) == 1:
-                    # 单个有效环大小
-                    size = valid_sizes[0]
-                    row_copy = row.copy()
-                    row_copy['detected_ring_sizes'] = valid_sizes
-                    row_copy['ring_size'] = size
-                    ring_size_dfs[size].append(row_copy)
-                    stats['single_ring'][size] += 1
-                elif len(valid_sizes) > 1:
-                    # 多个可能环大小（桥环情况）
-                    row_copy = row.copy()
-                    row_copy['detected_ring_sizes'] = valid_sizes
-                    ambiguous_molecules.append(row_copy)
-                    stats['ambiguous'] += 1
-                else:
-                    # 没有有效环
-                    stats['no_valid_ring'] += 1
-
-            except Exception as e:
-                stats['failed'] += 1
-                print(f"警告：分子 {mol_id} 处理失败: {e}")
-
-            stats['processed'] += 1
-
-            # 进度显示
-            if stats['processed'] % 100 == 0:
-                print(f"已处理 {stats['processed']}/{len(df)} 个分子...")
-
-        # 转换为DataFrame
-        ring_size_to_df = {}
-        for size in ring_range:
-            if ring_size_dfs[size]:
-                ring_size_to_df[size] = pd.DataFrame(ring_size_dfs[size])
-            else:
-                ring_size_to_df[size] = pd.DataFrame()
-
-        ambiguous_df = pd.DataFrame(ambiguous_molecules) if ambiguous_molecules else pd.DataFrame()
-
-        # 打印统计结果
-        print("\n" + "="*60)
-        print("环数识别和分类结果:")
-        print("="*60)
-        print(f"总处理分子数: {stats['processed']}")
-        print(f"无有效环分子: {stats['no_valid_ring']}")
-        print(f"桥环分子（多个环数）: {stats['ambiguous']}")
-        if stats['failed'] > 0:
-            print(f"处理失败分子: {stats['failed']}")
-        print("\n各环大小分子数:")
-        for size in ring_range:
-            count = stats['single_ring'][size]
-            if count > 0:
-                print(f"  {size}元环: {count} 个分子")
-        print("="*60)
-
-        return ring_size_to_df, ambiguous_df
-
-    def add_smiles(self, smiles: Union[str, List[str]]):
-        """
-        添加一个或多个SMILES字符串到当前分析器中。
-
-        Args:
-            smiles: 单个SMILES字符串或字符串列表
-
-        Raises:
-            TypeError: 如果输入既不是字符串也不是字符串列表
-        """
-        if isinstance(smiles, str):
-            self.smiles_list.append(smiles)
-        elif isinstance(smiles, list):
-            for s in smiles:
-                if not isinstance(s, str):
-                    raise TypeError("列表中的每个元素必须为SMILES字符串。")
-                self.smiles_list.append(s)
-        else:
-            raise TypeError("输入必须为SMILES字符串或字符串列表。")
-
-    def calculate_molecular_properties(self, smiles: str) -> Optional[Dict[str, float]]:
-        """
-        计算分子的基本性质
-
-        Args:
-            smiles: SMILES字符串
-
-        Returns:
-            包含分子性质的字典，如果计算失败则返回None
-        """
-        try:
-            mol = Chem.MolFromSmiles(smiles)
-            if mol is None:
-                return None
-
-            return {
-                'mol_weight': Descriptors.MolWt(mol),
-                'logP': Crippen.MolLogP(mol),
-                'qed': QED.qed(mol),
-                'tpsa': Descriptors.TPSA(mol),
-                'num_atoms': mol.GetNumAtoms()
-            }
-        except Exception as e:
-            print(f"警告：计算分子性质失败: {e}")
-            return None
-
-
-# 示例用法和测试
-if __name__ == "__main__":
-    # 创建分析器实例
-    analyzer = MacroLactoneAnalyzer()
-
-    # 测试SMILES
-    test_smiles = [
-        "O=C1CCCCCCCC(=O)OCC/C=C/C=C/1",  # 16元环大环内酯
-        "CC(=O)OC1=CC=CC=C1C(=O)O",      # 非大环内酯
-    ]
-
-    # 单分子分析
-    print("="*60)
-    print("单分子分析测试")
-    print("="*60)
-    for smiles in test_smiles:
-        result = analyzer.get_single_ring_info(smiles)
-        print(f"\nSMILES: {smiles[:50]}...")
-        if result:
-            print(f"  ✓ 有效大环内酯")
-            print(f"  环大小: {result['valid_sizes']}")
-            print(f"  是否桥环: {result['is_bridge']}")
-        else:
-            print(f"  ✗ 非大环内酯或无有效环")
-
-    # 批量分析
-    print("\n" + "="*60)
-    print("批量分析测试")
-    print("="*60)
-    analyzer.add_smiles(test_smiles)
-    stats = analyzer.analyze_list()
-    print(f"\n统计结果:")
-    print(f"  总分子数: {stats['total']}")
-    print(f"  大环内酯: {stats['macrolactones']}")
-    print(f"  桥环: {stats['bridge_rings']}")
-    print(f"  环大小统计: {stats['ring_size_stats']}")
-
-    # DataFrame分类测试
-    print("\n" + "="*60)
-    print("DataFrame分类测试")
-    print("="*60)
-    test_df = pd.DataFrame({
-        'ID': ['mol1', 'mol2', 'mol3'],
-        'smiles': test_smiles + ["O=C1CCCCCCCC(=O)OCC/C=C/C=C/1"]
-    })
-
-    ring_size_dfs, ambiguous_df = analyzer.classify_molecules(test_df, smiles_column='smiles', id_column='ID')
-    print(f"\n分类结果:")
-    for size, df_size in ring_size_dfs.items():
-        if not df_size.empty:
-            print(f"  {size}元环: {len(df_size)} 个分子")
-    print(f"桥环分子: {len(ambiguous_df)} 个")
diff --git a/src/macro_lactone_toolkit/__init__.py b/src/macro_lactone_toolkit/__init__.py
new file mode 100644
index 0000000..133d893
--- /dev/null
+++ b/src/macro_lactone_toolkit/__init__.py
@@ -0,0 +1,25 @@
+from .analyzer import MacroLactoneAnalyzer
+from .errors import (
+    AmbiguousMacrolactoneError,
+    FragmentationError,
+    MacrolactoneDetectionError,
+    MacrolactoneError,
+    RingNumberingError,
+)
+from .fragmenter import MacrolactoneFragmenter
+from .models import FragmentationResult, RingNumberingResult, SideChainFragment
+
+__all__ = [
+    "AmbiguousMacrolactoneError",
+    "FragmentationError",
+    "FragmentationResult",
+    "MacroLactoneAnalyzer",
+    "MacrolactoneDetectionError",
+    "MacrolactoneError",
+    "MacrolactoneFragmenter",
+    "RingNumberingError",
+    "RingNumberingResult",
+    "SideChainFragment",
+]
+
+__version__ = "0.1.0"
diff --git a/src/macro_lactone_toolkit/_core.py b/src/macro_lactone_toolkit/_core.py
new file mode 100644
index 0000000..7fbee07
--- /dev/null
+++ b/src/macro_lactone_toolkit/_core.py
@@ -0,0 +1,243 @@
+from __future__ import annotations
+
+from collections import deque
+from dataclasses import dataclass
+from typing import Iterable
+
+from rdkit import Chem
+
+from .errors import MacrolactoneDetectionError, RingNumberingError
+from .models import RingNumberingResult
+
+
+VALID_RING_SIZES = tuple(range(12, 21))
+
+
+@dataclass(frozen=True)
+class DetectedMacrolactone:
+    ring_size: int
+    ring_atoms: tuple[int, ...]
+    carbonyl_carbon_idx: int
+    ester_oxygen_idx: int
+
+
+def ensure_mol(mol_input: str | Chem.Mol) -> tuple[Chem.Mol, str]:
+    if isinstance(mol_input, Chem.Mol):
+        mol = Chem.Mol(mol_input)
+    else:
+        mol = Chem.MolFromSmiles(mol_input)
+        if mol is None:
+            raise MacrolactoneDetectionError(f"Unable to parse SMILES: {mol_input}")
+    return mol, Chem.MolToSmiles(mol, isomericSmiles=True)
+
+
+def find_macrolactone_candidates(
+    mol: Chem.Mol,
+    ring_size: int | None = None,
+) -> list[DetectedMacrolactone]:
+    candidates: list[DetectedMacrolactone] = []
+    seen: set[tuple[tuple[int, ...], int, int]] = set()
+
+    for ring in mol.GetRingInfo().AtomRings():
+        if len(ring) not in VALID_RING_SIZES:
+            continue
+        if ring_size is not None and len(ring) != ring_size:
+            continue
+
+        ester_match = _find_ring_ester_atoms(mol, ring)
+        if ester_match is None:
+            continue
+
+        carbonyl_carbon_idx, ester_oxygen_idx = ester_match
+        key = (tuple(sorted(ring)), carbonyl_carbon_idx, ester_oxygen_idx)
+        if key in seen:
+            continue
+        seen.add(key)
+        candidates.append(
+            DetectedMacrolactone(
+                ring_size=len(ring),
+                ring_atoms=tuple(ring),
+                carbonyl_carbon_idx=carbonyl_carbon_idx,
+                ester_oxygen_idx=ester_oxygen_idx,
+            )
+        )
+
+    return sorted(
+        candidates,
+        key=lambda candidate: (
+            candidate.ring_size,
+            tuple(sorted(candidate.ring_atoms)),
+            candidate.carbonyl_carbon_idx,
+            candidate.ester_oxygen_idx,
+        ),
+    )
+
+
+def build_numbering_result(mol: Chem.Mol, candidate: DetectedMacrolactone) -> RingNumberingResult:
+    ring_atoms = list(candidate.ring_atoms)
+    ring_atom_set = set(ring_atoms)
+    adjacency = {
+        atom_idx: sorted(
+            neighbor.GetIdx()
+            for neighbor in mol.GetAtomWithIdx(atom_idx).GetNeighbors()
+            if neighbor.GetIdx() in ring_atom_set
+        )
+        for atom_idx in ring_atoms
+    }
+
+    ordered_atoms = [candidate.carbonyl_carbon_idx, candidate.ester_oxygen_idx]
+    visited = set(ordered_atoms)
+    previous_atom = candidate.carbonyl_carbon_idx
+    current_atom = candidate.ester_oxygen_idx
+
+    while len(ordered_atoms) < candidate.ring_size:
+        next_atoms = [
+            atom_idx
+            for atom_idx in adjacency[current_atom]
+            if atom_idx != previous_atom and atom_idx not in visited
+        ]
+        if len(next_atoms) != 1:
+            raise RingNumberingError("Unable to determine a unique ring traversal order.")
+        next_atom = next_atoms[0]
+        ordered_atoms.append(next_atom)
+        visited.add(next_atom)
+        previous_atom, current_atom = current_atom, next_atom
+
+    if candidate.carbonyl_carbon_idx not in adjacency[current_atom]:
+        raise RingNumberingError("Ring traversal did not close on the lactone carbonyl carbon.")
+
+    position_to_atom = {position: atom_idx for position, atom_idx in enumerate(ordered_atoms, start=1)}
+    atom_to_position = {atom_idx: position for position, atom_idx in position_to_atom.items()}
+
+    return RingNumberingResult(
+        ring_size=candidate.ring_size,
+        ring_atoms=sorted(ring_atoms),
+        ordered_atoms=ordered_atoms,
+        carbonyl_carbon_idx=candidate.carbonyl_carbon_idx,
+        ester_oxygen_idx=candidate.ester_oxygen_idx,
+        atom_to_position=atom_to_position,
+        position_to_atom=position_to_atom,
+    )
+
+
+def collect_side_chain_atoms(
+    mol: Chem.Mol,
+    start_atom_idx: int,
+    ring_atom_indices: Iterable[int],
+) -> list[int]:
+    ring_atom_set = set(ring_atom_indices)
+    queue: deque[int] = deque([start_atom_idx])
+    visited: set[int] = set()
+    side_chain_atoms: list[int] = []
+
+    while queue:
+        atom_idx = queue.popleft()
+        if atom_idx in visited or atom_idx in ring_atom_set:
+            continue
+        visited.add(atom_idx)
+        side_chain_atoms.append(atom_idx)
+
+        atom = mol.GetAtomWithIdx(atom_idx)
+        for neighbor in atom.GetNeighbors():
+            neighbor_idx = neighbor.GetIdx()
+            if neighbor_idx not in visited and neighbor_idx not in ring_atom_set:
+                queue.append(neighbor_idx)
+
+    return side_chain_atoms
+
+
+def is_intrinsic_lactone_neighbor(
+    mol: Chem.Mol,
+    candidate: DetectedMacrolactone,
+    ring_atom_idx: int,
+    neighbor_idx: int,
+) -> bool:
+    if ring_atom_idx != candidate.carbonyl_carbon_idx:
+        return False
+    bond = mol.GetBondBetweenAtoms(ring_atom_idx, neighbor_idx)
+    neighbor = mol.GetAtomWithIdx(neighbor_idx)
+    return bond is not None and bond.GetBondType() == Chem.BondType.DOUBLE and neighbor.GetAtomicNum() == 8
+
+
+def build_fragment_smiles(
+    mol: Chem.Mol,
+    side_chain_atoms: list[int],
+    side_chain_start_idx: int,
+    ring_atom_idx: int,
+    dummy_isotope: int,
+) -> str:
+    fragment_mol = Chem.RWMol()
+    old_to_new: dict[int, int] = {}
+
+    for old_idx in side_chain_atoms:
+        old_atom = mol.GetAtomWithIdx(old_idx)
+        new_atom = Chem.Atom(old_atom)
+        new_idx = fragment_mol.AddAtom(new_atom)
+        old_to_new[old_idx] = new_idx
+
+    for old_idx in side_chain_atoms:
+        atom = mol.GetAtomWithIdx(old_idx)
+        for neighbor in atom.GetNeighbors():
+            neighbor_idx = neighbor.GetIdx()
+            if neighbor_idx in old_to_new and old_idx < neighbor_idx:
+                bond = mol.GetBondBetweenAtoms(old_idx, neighbor_idx)
+                fragment_mol.AddBond(
+                    old_to_new[old_idx],
+                    old_to_new[neighbor_idx],
+                    bond.GetBondType(),
+                )
+
+    bond_to_ring = mol.GetBondBetweenAtoms(ring_atom_idx, side_chain_start_idx)
+    if bond_to_ring is None:
+        raise RingNumberingError("Missing ring-to-side-chain bond during fragmentation.")
+
+    dummy_atom = Chem.Atom(0)
+    dummy_atom.SetIsotope(dummy_isotope)
+    dummy_idx = fragment_mol.AddAtom(dummy_atom)
+    fragment_mol.AddBond(
+        old_to_new[side_chain_start_idx],
+        dummy_idx,
+        bond_to_ring.GetBondType(),
+    )
+
+    fragment = fragment_mol.GetMol()
+    Chem.SanitizeMol(fragment)
+    return Chem.MolToSmiles(fragment, isomericSmiles=True)
+
+
+def _find_ring_ester_atoms(mol: Chem.Mol, ring_atoms: Iterable[int]) -> tuple[int, int] | None:
+    ring_atom_set = set(ring_atoms)
+    matches: list[tuple[int, int]] = []
+
+    for atom_idx in ring_atom_set:
+        atom = mol.GetAtomWithIdx(atom_idx)
+        if atom.GetAtomicNum() != 6:
+            continue
+
+        has_carbonyl_oxygen = any(
+            bond.GetBondType() == Chem.BondType.DOUBLE and bond.GetOtherAtom(atom).GetAtomicNum() == 8
+            for bond in atom.GetBonds()
+        )
+        if not has_carbonyl_oxygen:
+            continue
+
+        for neighbor in atom.GetNeighbors():
+            neighbor_idx = neighbor.GetIdx()
+            bond = mol.GetBondBetweenAtoms(atom_idx, neighbor_idx)
+            if (
+                neighbor_idx in ring_atom_set
+                and neighbor.GetAtomicNum() == 8
+                and bond is not None
+                and bond.GetBondType() == Chem.BondType.SINGLE
+            ):
+                ring_neighbors = [
+                    ring_neighbor.GetIdx()
+                    for ring_neighbor in neighbor.GetNeighbors()
+                    if ring_neighbor.GetIdx() in ring_atom_set
+                ]
+                if len(ring_neighbors) == 2:
+                    matches.append((atom_idx, neighbor_idx))
+
+    if not matches:
+        return None
+    return sorted(matches)[0]
diff --git a/src/macro_lactone_toolkit/analyzer.py b/src/macro_lactone_toolkit/analyzer.py
new file mode 100644
index 0000000..5158696
--- /dev/null
+++ b/src/macro_lactone_toolkit/analyzer.py
@@ -0,0 +1,27 @@
+from __future__ import annotations
+
+from rdkit import Chem
+
+from ._core import ensure_mol, find_macrolactone_candidates
+
+
+class MacroLactoneAnalyzer:
+    """Identify valid 12-20 membered macrolactones."""
+
+    def get_valid_ring_sizes(self, mol_input: str | Chem.Mol) -> list[int]:
+        mol, _ = ensure_mol(mol_input)
+        candidates = find_macrolactone_candidates(mol)
+        return sorted({candidate.ring_size for candidate in candidates})
+
+    def analyze_molecule(self, mol_input: str | Chem.Mol) -> dict:
+        mol, smiles = ensure_mol(mol_input)
+        candidates = find_macrolactone_candidates(mol)
+        valid_ring_sizes = sorted({candidate.ring_size for candidate in candidates})
+        is_ambiguous = len(valid_ring_sizes) > 1 or len(candidates) > 1
+        return {
+            "smiles": smiles,
+            "valid_ring_sizes": valid_ring_sizes,
+            "candidate_count": len(candidates),
+            "is_ambiguous": is_ambiguous,
+            "selected_ring_size": valid_ring_sizes[0] if len(valid_ring_sizes) == 1 and len(candidates) == 1 else None,
+        }
diff --git a/src/macro_lactone_toolkit/cli.py b/src/macro_lactone_toolkit/cli.py
new file mode 100644
index 0000000..da6ff1c
--- /dev/null
+++ b/src/macro_lactone_toolkit/cli.py
@@ -0,0 +1,160 @@
+from __future__ import annotations
+
+import argparse
+import json
+from dataclasses import asdict
+from pathlib import Path
+
+import pandas as pd
+
+from .analyzer import MacroLactoneAnalyzer
+from .errors import MacrolactoneError
+from .fragmenter import MacrolactoneFragmenter
+
+
+def main() -> None:
+    parser = build_parser()
+    args = parser.parse_args()
+    if not hasattr(args, "func"):
+        parser.print_help()
+        raise SystemExit(1)
+    args.func(args)
+
+
+def build_parser() -> argparse.ArgumentParser:
+    parser = argparse.ArgumentParser(prog="macro-lactone-toolkit")
+    subparsers = parser.add_subparsers(dest="command")
+
+    analyze = subparsers.add_parser("analyze")
+    _add_common_input_arguments(analyze)
+    analyze.add_argument("--ring-size", type=int, default=None)
+    analyze.set_defaults(func=run_analyze)
+
+    number = subparsers.add_parser("number")
+    number.add_argument("--smiles", required=True)
+    number.add_argument("--ring-size", type=int, default=None)
+    number.set_defaults(func=run_number)
+
+    fragment = subparsers.add_parser("fragment")
+    _add_common_input_arguments(fragment)
+    fragment.add_argument("--ring-size", type=int, default=None)
+    fragment.add_argument("--parent-id", default=None)
+    fragment.add_argument("--errors-output", default=None)
+    fragment.set_defaults(func=run_fragment)
+
+    return parser
+
+
+def run_analyze(args: argparse.Namespace) -> None:
+    analyzer = MacroLactoneAnalyzer()
+    if args.smiles:
+        payload = analyzer.analyze_molecule(args.smiles)
+        _write_output(payload, args.output)
+        return
+
+    rows = _read_csv_rows(args.input, args.smiles_column, args.id_column)
+    payload = []
+    for row in rows:
+        analysis = analyzer.analyze_molecule(row["smiles"])
+        analysis["parent_id"] = row["parent_id"]
+        payload.append(analysis)
+    _write_output(payload, args.output)
+
+
+def run_number(args: argparse.Namespace) -> None:
+    fragmenter = MacrolactoneFragmenter(ring_size=args.ring_size)
+    payload = fragmenter.number_molecule(args.smiles).to_dict()
+    _write_json(payload, None)
+
+
+def run_fragment(args: argparse.Namespace) -> None:
+    fragmenter = MacrolactoneFragmenter(ring_size=args.ring_size)
+
+    if args.smiles:
+        result = fragmenter.fragment_molecule(args.smiles, parent_id=args.parent_id)
+        _write_output(result.to_dict(), args.output)
+        return
+
+    rows = _read_csv_rows(args.input, args.smiles_column, args.id_column)
+    fragment_rows: list[dict] = []
+    error_rows: list[dict] = []
+
+    for row in rows:
+        try:
+            result = fragmenter.fragment_molecule(row["smiles"], parent_id=row["parent_id"])
+        except MacrolactoneError as exc:
+            error_rows.append(
+                {
+                    "parent_id": row["parent_id"],
+                    "smiles": row["smiles"],
+                    "error_type": type(exc).__name__,
+                    "error_message": str(exc),
+                }
+            )
+            continue
+
+        for fragment in result.fragments:
+            fragment_rows.append(
+                {
+                    "parent_id": result.parent_id,
+                    "ring_size": result.ring_size,
+                    **fragment.to_dict(),
+                }
+            )
+
+    if args.output:
+        _write_output(fragment_rows, args.output)
+    else:
+        _write_json({"fragments": fragment_rows, "errors": error_rows}, None)
+
+    if args.errors_output:
+        _write_output(error_rows, args.errors_output)
+
+
+def _add_common_input_arguments(parser: argparse.ArgumentParser) -> None:
+    group = parser.add_mutually_exclusive_group(required=True)
+    group.add_argument("--smiles")
+    group.add_argument("--input")
+    parser.add_argument("--smiles-column", default="smiles")
+    parser.add_argument("--id-column", default="id")
+    parser.add_argument("--output", default=None)
+
+
+def _read_csv_rows(input_path: str, smiles_column: str, id_column: str) -> list[dict]:
+    dataframe = pd.read_csv(input_path)
+    rows = []
+    for index, row in dataframe.iterrows():
+        parent_id = row[id_column] if id_column in dataframe.columns else f"row_{index}"
+        rows.append(
+            {
+                "parent_id": str(parent_id),
+                "smiles": row[smiles_column],
+            }
+        )
+    return rows
+
+
+def _write_output(payload: list[dict] | dict, output_path: str | None) -> None:
+    if output_path is None:
+        _write_json(payload, None)
+        return
+
+    path = Path(output_path)
+    if path.suffix.lower() == ".csv":
+        dataframe = pd.DataFrame(payload)
+        dataframe.to_csv(path, index=False)
+        return
+
+    _write_json(payload, path)
+
+
+def _write_json(payload: list[dict] | dict, output_path: Path | None) -> None:
+    text = json.dumps(payload, indent=2, ensure_ascii=False)
+    if output_path is None:
+        print(text)
+    else:
+        output_path.write_text(text + "\n", encoding="utf-8")
+
+
+if __name__ == "__main__":
+    main()
diff --git a/src/macro_lactone_toolkit/errors.py b/src/macro_lactone_toolkit/errors.py
new file mode 100644
index 0000000..6102157
--- /dev/null
+++ b/src/macro_lactone_toolkit/errors.py
@@ -0,0 +1,18 @@
+class MacrolactoneError(Exception):
+    """Base class for macrolactone toolkit errors."""
+
+
+class MacrolactoneDetectionError(MacrolactoneError):
+    """Raised when a valid macrolactone cannot be detected."""
+
+
+class AmbiguousMacrolactoneError(MacrolactoneDetectionError):
+    """Raised when multiple valid macrolactone candidates are detected."""
+
+
+class RingNumberingError(MacrolactoneError):
+    """Raised when ring numbering cannot be assigned."""
+
+
+class FragmentationError(MacrolactoneError):
+    """Raised when side-chain fragmentation fails."""
diff --git a/src/macro_lactone_toolkit/fragmenter.py b/src/macro_lactone_toolkit/fragmenter.py
new file mode 100644
index 0000000..11fb4f8
--- /dev/null
+++ b/src/macro_lactone_toolkit/fragmenter.py
@@ -0,0 +1,114 @@
+from __future__ import annotations
+
+from rdkit import Chem
+from rdkit.Chem import Descriptors
+
+from ._core import (
+    build_fragment_smiles,
+    build_numbering_result,
+    collect_side_chain_atoms,
+    ensure_mol,
+    find_macrolactone_candidates,
+    is_intrinsic_lactone_neighbor,
+)
+from .analyzer import MacroLactoneAnalyzer
+from .errors import AmbiguousMacrolactoneError, FragmentationError, MacrolactoneDetectionError
+from .models import FragmentationResult, RingNumberingResult, SideChainFragment
+
+
+class MacrolactoneFragmenter:
+    """Number and fragment 12-20 membered macrolactones."""
+
+    def __init__(self, ring_size: int | None = None):
+        if ring_size is not None and not 12 <= ring_size <= 20:
+            raise ValueError("ring_size must be between 12 and 20")
+        self.ring_size = ring_size
+        self.analyzer = MacroLactoneAnalyzer()
+
+    def number_molecule(self, mol_input: str | Chem.Mol) -> RingNumberingResult:
+        mol, _ = ensure_mol(mol_input)
+        candidate = self._select_candidate(mol)
+        return build_numbering_result(mol, candidate)
+
+    def fragment_molecule(
+        self,
+        mol_input: str | Chem.Mol,
+        parent_id: str | None = None,
+    ) -> FragmentationResult:
+        mol, parent_smiles = ensure_mol(mol_input)
+        candidate = self._select_candidate(mol)
+        numbering = build_numbering_result(mol, candidate)
+        parent_id = parent_id or "molecule_0"
+
+        ring_atom_set = set(candidate.ring_atoms)
+        fragments: list[SideChainFragment] = []
+
+        for position, ring_atom_idx in numbering.position_to_atom.items():
+            ring_atom = mol.GetAtomWithIdx(ring_atom_idx)
+            for neighbor in ring_atom.GetNeighbors():
+                neighbor_idx = neighbor.GetIdx()
+                if neighbor_idx in ring_atom_set:
+                    continue
+                if is_intrinsic_lactone_neighbor(mol, candidate, ring_atom_idx, neighbor_idx):
+                    continue
+
+                side_chain_atoms = collect_side_chain_atoms(mol, neighbor_idx, ring_atom_set)
+                if not side_chain_atoms:
+                    continue
+
+                try:
+                    fragment_smiles_labeled = build_fragment_smiles(
+                        mol=mol,
+                        side_chain_atoms=side_chain_atoms,
+                        side_chain_start_idx=neighbor_idx,
+                        ring_atom_idx=ring_atom_idx,
+                        dummy_isotope=position,
+                    )
+                    fragment_smiles_plain = build_fragment_smiles(
+                        mol=mol,
+                        side_chain_atoms=side_chain_atoms,
+                        side_chain_start_idx=neighbor_idx,
+                        ring_atom_idx=ring_atom_idx,
+                        dummy_isotope=0,
+                    )
+                except Exception as exc:  # pragma: no cover - sanitized by tests
+                    raise FragmentationError(str(exc)) from exc
+
+                plain_fragment = Chem.MolFromSmiles(fragment_smiles_plain)
+                if plain_fragment is None:
+                    raise FragmentationError("Generated fragment_smiles_plain could not be read by RDKit.")
+
+                fragment_id = f"{parent_id}_frag_{len(fragments)}"
+                fragments.append(
+                    SideChainFragment(
+                        parent_id=parent_id,
+                        fragment_id=fragment_id,
+                        cleavage_position=position,
+                        attachment_atom_idx=ring_atom_idx,
+                        fragment_smiles_labeled=fragment_smiles_labeled,
+                        fragment_smiles_plain=fragment_smiles_plain,
+                        atom_count=sum(1 for atom in plain_fragment.GetAtoms() if atom.GetAtomicNum() != 0),
+                        molecular_weight=round(Descriptors.MolWt(plain_fragment), 4),
+                    )
+                )
+
+        return FragmentationResult(
+            parent_id=parent_id,
+            parent_smiles=parent_smiles,
+            ring_size=numbering.ring_size,
+            numbering=numbering,
+            fragments=fragments,
+        )
+
+    def _select_candidate(self, mol: Chem.Mol):
+        candidates = find_macrolactone_candidates(mol, ring_size=self.ring_size)
+        if not candidates:
+            requested = f"{self.ring_size}-membered " if self.ring_size is not None else ""
+            raise MacrolactoneDetectionError(f"No valid {requested}macrolactone was detected.")
+
+        valid_ring_sizes = sorted({candidate.ring_size for candidate in candidates})
+        if len(candidates) > 1 or len(valid_ring_sizes) > 1:
+            raise AmbiguousMacrolactoneError(
+                "Multiple valid macrolactone candidates were detected. Pass ring_size explicitly."
+            )
+        return candidates[0]
diff --git a/src/macro_lactone_toolkit/models.py b/src/macro_lactone_toolkit/models.py
new file mode 100644
index 0000000..36d927c
--- /dev/null
+++ b/src/macro_lactone_toolkit/models.py
@@ -0,0 +1,52 @@
+from __future__ import annotations
+
+from dataclasses import asdict, dataclass, field
+
+
+@dataclass(frozen=True)
+class RingNumberingResult:
+    ring_size: int
+    ring_atoms: list[int]
+    ordered_atoms: list[int]
+    carbonyl_carbon_idx: int
+    ester_oxygen_idx: int
+    atom_to_position: dict[int, int]
+    position_to_atom: dict[int, int]
+
+    def to_dict(self) -> dict:
+        return asdict(self)
+
+
+@dataclass(frozen=True)
+class SideChainFragment:
+    parent_id: str
+    fragment_id: str
+    cleavage_position: int
+    attachment_atom_idx: int
+    fragment_smiles_labeled: str
+    fragment_smiles_plain: str
+    atom_count: int
+    molecular_weight: float
+
+    def to_dict(self) -> dict:
+        return asdict(self)
+
+
+@dataclass(frozen=True)
+class FragmentationResult:
+    parent_id: str
+    parent_smiles: str
+    ring_size: int
+    numbering: RingNumberingResult
+    fragments: list[SideChainFragment] = field(default_factory=list)
+    warnings: list[str] = field(default_factory=list)
+
+    def to_dict(self) -> dict:
+        return {
+            "parent_id": self.parent_id,
+            "parent_smiles": self.parent_smiles,
+            "ring_size": self.ring_size,
+            "numbering": self.numbering.to_dict(),
+            "fragments": [fragment.to_dict() for fragment in self.fragments],
+            "warnings": list(self.warnings),
+        }
diff --git a/src/macro_lactone_toolkit/splicing/__init__.py b/src/macro_lactone_toolkit/splicing/__init__.py
new file mode 100644
index 0000000..729fc3e
--- /dev/null
+++ b/src/macro_lactone_toolkit/splicing/__init__.py
@@ -0,0 +1,9 @@
+from .engine import splice_molecule
+from .fragment_prep import activate_fragment
+from .scaffold_prep import prepare_macrolactone_scaffold
+
+__all__ = [
+    "activate_fragment",
+    "prepare_macrolactone_scaffold",
+    "splice_molecule",
+]
diff --git a/src/macro_lactone_toolkit/splicing/engine.py b/src/macro_lactone_toolkit/splicing/engine.py
new file mode 100644
index 0000000..aabefff
--- /dev/null
+++ b/src/macro_lactone_toolkit/splicing/engine.py
@@ -0,0 +1,51 @@
+from __future__ import annotations
+
+from rdkit import Chem
+
+
+def splice_molecule(scaffold: Chem.Mol, fragment: Chem.Mol, position: int) -> Chem.Mol:
+    combined = Chem.CombineMols(scaffold, fragment)
+    rwmol = Chem.RWMol(combined)
+
+    scaffold_atom_count = scaffold.GetNumAtoms()
+    total_atoms = rwmol.GetNumAtoms()
+
+    scaffold_dummy_idx = -1
+    for atom_idx in range(scaffold_atom_count):
+        atom = rwmol.GetAtomWithIdx(atom_idx)
+        if atom.GetAtomicNum() == 0 and atom.GetIsotope() == position:
+            scaffold_dummy_idx = atom_idx
+            break
+    if scaffold_dummy_idx == -1:
+        raise ValueError(f"Scaffold dummy atom with isotope {position} not found")
+
+    fragment_dummy_idx = -1
+    for atom_idx in range(scaffold_atom_count, total_atoms):
+        atom = rwmol.GetAtomWithIdx(atom_idx)
+        if atom.GetAtomicNum() == 0:
+            fragment_dummy_idx = atom_idx
+            break
+    if fragment_dummy_idx == -1:
+        raise ValueError("Fragment does not contain a dummy atom")
+
+    scaffold_dummy = rwmol.GetAtomWithIdx(scaffold_dummy_idx)
+    fragment_dummy = rwmol.GetAtomWithIdx(fragment_dummy_idx)
+    if scaffold_dummy.GetDegree() != 1:
+        raise ValueError("Scaffold dummy atom must have exactly one neighbor")
+    if fragment_dummy.GetDegree() != 1:
+        raise ValueError("Fragment dummy atom must have exactly one neighbor")
+
+    scaffold_anchor_idx = scaffold_dummy.GetNeighbors()[0].GetIdx()
+    fragment_anchor_idx = fragment_dummy.GetNeighbors()[0].GetIdx()
+    fragment_bond = rwmol.GetBondBetweenAtoms(fragment_dummy_idx, fragment_anchor_idx)
+    if fragment_bond is None:
+        raise ValueError("Fragment dummy atom is not bonded to an anchor atom")
+
+    rwmol.AddBond(scaffold_anchor_idx, fragment_anchor_idx, fragment_bond.GetBondType())
+
+    for atom_idx in sorted((scaffold_dummy_idx, fragment_dummy_idx), reverse=True):
+        rwmol.RemoveAtom(atom_idx)
+
+    product = rwmol.GetMol()
+    Chem.SanitizeMol(product)
+    return product
diff --git a/src/macro_lactone_toolkit/splicing/fragment_prep.py b/src/macro_lactone_toolkit/splicing/fragment_prep.py
new file mode 100644
index 0000000..abc31b3
--- /dev/null
+++ b/src/macro_lactone_toolkit/splicing/fragment_prep.py
@@ -0,0 +1,49 @@
+from __future__ import annotations
+
+from rdkit import Chem
+
+
+def activate_fragment(smiles: str, strategy: str = "smart") -> Chem.Mol:
+    mol = Chem.MolFromSmiles(smiles)
+    if mol is None:
+        raise ValueError(f"Invalid SMILES string: {smiles}")
+
+    target_idx = -1
+
+    if strategy == "smart":
+        for smarts in ("[N;!H0]", "[O;H1]", "[S;H1]"):
+            pattern = Chem.MolFromSmarts(smarts)
+            if pattern is None:
+                continue
+            matches = mol.GetSubstructMatches(pattern)
+            if matches:
+                target_idx = matches[0][0]
+                break
+        if target_idx == -1:
+            strategy = "random"
+
+    if strategy == "random":
+        carbon_candidates: list[int] = []
+        other_candidates: list[int] = []
+        for atom in mol.GetAtoms():
+            if atom.GetTotalNumHs() == 0:
+                continue
+            if atom.GetAtomicNum() == 6:
+                carbon_candidates.append(atom.GetIdx())
+            else:
+                other_candidates.append(atom.GetIdx())
+        candidates = carbon_candidates or other_candidates
+        if not candidates:
+            raise ValueError("No suitable atoms with hydrogens found to attach to.")
+        target_idx = candidates[0]
+
+    if target_idx == -1:
+        raise ValueError("Could not identify a target atom for activation.")
+
+    rwmol = Chem.RWMol(mol)
+    dummy_idx = rwmol.AddAtom(Chem.Atom(0))
+    rwmol.AddBond(target_idx, dummy_idx, Chem.BondType.SINGLE)
+
+    result = rwmol.GetMol()
+    Chem.SanitizeMol(result)
+    return result
diff --git a/src/macro_lactone_toolkit/splicing/scaffold_prep.py b/src/macro_lactone_toolkit/splicing/scaffold_prep.py
new file mode 100644
index 0000000..7f4d30d
--- /dev/null
+++ b/src/macro_lactone_toolkit/splicing/scaffold_prep.py
@@ -0,0 +1,63 @@
+from __future__ import annotations
+
+from typing import Iterable
+
+from rdkit import Chem
+
+from .._core import collect_side_chain_atoms, ensure_mol, find_macrolactone_candidates, is_intrinsic_lactone_neighbor
+from ..fragmenter import MacrolactoneFragmenter
+
+
+def prepare_macrolactone_scaffold(
+    mol_input: str | Chem.Mol,
+    positions: Iterable[int],
+    ring_size: int | None = None,
+) -> tuple[Chem.Mol, dict[int, int]]:
+    positions = list(positions)
+    mol, _ = ensure_mol(mol_input)
+    fragmenter = MacrolactoneFragmenter(ring_size=ring_size)
+    numbering = fragmenter.number_molecule(mol)
+    candidate = find_macrolactone_candidates(mol, ring_size=numbering.ring_size)[0]
+    ring_atom_set = set(numbering.ordered_atoms)
+
+    atoms_to_remove: set[int] = set()
+    dummy_specs: list[tuple[int, int, Chem.BondType]] = []
+
+    for position in positions:
+        if position not in numbering.position_to_atom:
+            raise ValueError(f"Position {position} not found in ring numbering.")
+        ring_atom_idx = numbering.position_to_atom[position]
+        ring_atom = mol.GetAtomWithIdx(ring_atom_idx)
+        for neighbor in ring_atom.GetNeighbors():
+            neighbor_idx = neighbor.GetIdx()
+            if neighbor_idx in ring_atom_set:
+                continue
+            if is_intrinsic_lactone_neighbor(mol, candidate, ring_atom_idx, neighbor_idx):
+                continue
+            side_chain_atoms = collect_side_chain_atoms(mol, neighbor_idx, ring_atom_set)
+            atoms_to_remove.update(side_chain_atoms)
+            bond = mol.GetBondBetweenAtoms(ring_atom_idx, neighbor_idx)
+            if bond is not None:
+                dummy_specs.append((ring_atom_idx, position, bond.GetBondType()))
+
+        if not any(spec_position == position for _, spec_position, _ in dummy_specs):
+            dummy_specs.append((ring_atom_idx, position, Chem.BondType.SINGLE))
+
+    rwmol = Chem.RWMol(mol)
+    for ring_atom_idx, position, bond_type in dummy_specs:
+        dummy_atom = Chem.Atom(0)
+        dummy_atom.SetIsotope(position)
+        dummy_idx = rwmol.AddAtom(dummy_atom)
+        rwmol.AddBond(ring_atom_idx, dummy_idx, bond_type)
+
+    for atom_idx in sorted(atoms_to_remove, reverse=True):
+        rwmol.RemoveAtom(atom_idx)
+
+    scaffold = rwmol.GetMol()
+    Chem.SanitizeMol(scaffold)
+    dummy_map = {
+        atom.GetIsotope(): atom.GetIdx()
+        for atom in scaffold.GetAtoms()
+        if atom.GetAtomicNum() == 0 and atom.GetIsotope() in positions
+    }
+    return scaffold, dummy_map
diff --git a/src/macrolactone_fragmenter.py b/src/macrolactone_fragmenter.py
deleted file mode 100644
index c730876..0000000
--- a/src/macrolactone_fragmenter.py
+++ /dev/null
@@ -1,856 +0,0 @@
-"""
-Macrolactone Fragmenter - 大环内酯侧链断裂和分析器
-
-专门用于 12-20 元环大环内酯的：
-- 环原子编号（羰基C为1，酯键O为2）
-- 侧链断裂（单个位置或全部）
-- 分子可视化（带编号）
-- 碎片可视化
-- 数据导出（JSON、DataFrame、数据库）
-
-Author: Auto-generated
-Date: 2025-11-07
-"""
-
-from typing import Optional, Union, List, Tuple, Dict
-from pathlib import Path
-import pandas as pd
-from rdkit import Chem
-from rdkit.Chem import Descriptors, AllChem, Draw
-from rdkit.Chem.Draw import rdMolDraw2D
-from IPython.display import display, SVG
-
-from .fragment_dataclass import Fragment, MoleculeFragments
-from .ring_visualization import (
-    get_macrolactone_numbering,
-    identify_side_chains,
-    extract_side_chain_fragment,
-    cleave_side_chain_at_position,
-    draw_mol_with_ring_numbering
-)
-
-
-class MacrolactoneFragmenter:
-    """
-    大环内酯（Macrolactone）侧链断裂和分析器
-    
-    专门用于 12-20 元环大环内酯的：
-    - 环原子编号（羰基C为1，酯键O为2）
-    - 侧链断裂（单个位置或全部）
-    - 分子可视化（带编号）
-    - 碎片可视化
-    - 数据导出（JSON、DataFrame、数据库）
-    
-    使用场景：
-        >>> # 处理16元环大环内酯
-        >>> fragmenter = MacrolactoneFragmenter(ring_size=16)
-        >>> result = fragmenter.process_molecule(smiles)
-        >>> 
-        >>> # 可视化
-        >>> svg = fragmenter.visualize_molecule(result.parent_smiles)
-        >>> display(SVG(svg))
-        >>> 
-        >>> # 保存结果
-        >>> fragmenter.save_to_json(result, "output.json")
-        >>> 
-        >>> # 批量处理
-        >>> results = fragmenter.process_csv("molecules.csv", smiles_column="smiles")
-    
-    编号规则：
-        - 位置1: 羰基碳（C=O中的C，在环上）
-        - 位置2: 酯键氧（环上的O）
-        - 位置3-N: 从位置2开始，按照O→C反方向顺序编号环上剩余原子
-    
-    Attributes:
-        ring_size: 环的大小（12-20）
-        smarts_pattern: 用于识别大环内酯的SMARTS模式
-    """
-    
-    # 预设常用SMARTS模式（通用酯键模式适用于12-20元环）
-    DEFAULT_SMARTS = {
-        # 使用通用酯键模式，适用于所有12-20元环大环内酯
-        # 格式：[ring_size][C](=O)[O] 表示环上的碳连接羰基和氧
-        12: "[C](=O)[O]",  # 通用酯键模式
-        13: "[C](=O)[O]",  # 通用酯键模式
-        14: "[C](=O)[O]",  # 通用酯键模式
-        15: "[C](=O)[O]",  # 通用酯键模式
-        16: "[C](=O)[O]",  # 通用酯键模式（原模式："O=C1CCCCCCCC(=O)OCC/C=C/C=C/1"）
-        17: "[C](=O)[O]",  # 通用酯键模式
-        18: "[C](=O)[O]",  # 通用酯键模式
-        19: "[C](=O)[O]",  # 通用酯键模式
-        20: "[C](=O)[O]",  # 通用酯键模式
-    }
-    
-    def __init__(
-        self,
-        ring_size: int = 16,
-        custom_smarts: Optional[str] = None
-    ):
-        """
-        初始化大环内酯断裂器
-        
-        Args:
-            ring_size: 环的大小（12-20），默认16
-            custom_smarts: 自定义SMARTS模式（可选）
-                          如果不提供，使用预设模式
-        
-        Raises:
-            ValueError: 如果ring_size不在12-20范围内
-            ValueError: 如果custom_smarts无效
-            ValueError: 如果ring_size没有预设SMARTS且未提供custom_smarts
-        """
-        # 验证ring_size
-        if not (12 <= ring_size <= 20):
-            raise ValueError(
-                f"Ring size must be between 12 and 20, got {ring_size}"
-            )
-        
-        self.ring_size = ring_size
-        
-        # 确定SMARTS模式
-        if custom_smarts is not None:
-            # 使用自定义SMARTS
-            self.smarts_pattern = custom_smarts
-            # 验证SMARTS有效性
-            test_mol = Chem.MolFromSmarts(self.smarts_pattern)
-            if test_mol is None:
-                raise ValueError(
-                    f"Invalid SMARTS pattern: {self.smarts_pattern}"
-                )
-        else:
-            # 使用预设SMARTS
-            if ring_size not in self.DEFAULT_SMARTS:
-                raise ValueError(
-                    f"No default SMARTS pattern for ring size {ring_size}. "
-                    f"Available sizes: {list(self.DEFAULT_SMARTS.keys())}. "
-                    f"Please provide custom_smarts parameter."
-                )
-            self.smarts_pattern = self.DEFAULT_SMARTS[ring_size]
-    
-    # ============ 核心处理方法 ============
-    
-    def process_molecule(
-        self,
-        mol_input: Union[str, Chem.Mol],
-        parent_id: Optional[str] = None
-    ) -> MoleculeFragments:
-        """
-        一站式处理：编号 + 断裂所有侧链
-        
-        Args:
-            mol_input: SMILES字符串或RDKit Mol对象
-            parent_id: 可选的分子ID（如果不提供，自动生成）
-        
-        Returns:
-            MoleculeFragments对象，包含所有断裂的侧链
-        
-        Raises:
-            ValueError: 如果分子解析失败
-            ValueError: 如果不是有效的大环内酯（找不到环或酯键）
-        """
-        # 解析输入
-        mol, smiles = self._parse_mol_input(mol_input)
-        
-        # 自动生成parent_id
-        if parent_id is None:
-            parent_id = f"macro_{self.ring_size}_{id(mol)}"
-        
-        # 断裂所有侧链
-        mol_fragments = self.cleave_all_sidechains(mol, parent_id)
-        
-        return mol_fragments
-    
-    def number_ring(
-        self,
-        mol: Chem.Mol
-    ) -> Tuple[List[int], Dict[int, int], int, int]:
-        """
-        对环原子进行编号
-        
-        编号规则：
-        - 位置1: 羰基碳（C=O中的C）
-        - 位置2: 酯键氧（环上的O）
-        - 位置3-N: 按照O→C反方向顺序编号
-        
-        Args:
-            mol: RDKit分子对象
-        
-        Returns:
-            Tuple of:
-                - ring_atoms: 环原子索引列表
-                - ring_numbering: 原子索引到位置编号的映射 {atom_idx: position}
-                - carbonyl_carbon: 羰基碳原子索引
-                - ester_oxygen: 酯键氧原子索引
-        
-        Raises:
-            ValueError: 如果找不到指定大小的环
-            ValueError: 如果找不到酯键（不是有效的大环内酯）
-        """
-        ring_atoms, ring_numbering, ordered_atoms, carbonyl_carbon, ester_oxygen = \
-            get_macrolactone_numbering(mol, ring_size=self.ring_size)
-        
-        if ring_numbering is None:
-            raise ValueError(
-                f"Failed to number the ring. "
-                f"This molecule may not be a valid {self.ring_size}-membered macrolactone. "
-                f"Expected to find a {self.ring_size}-membered ring with an ester bond."
-            )
-        
-        return ring_atoms, ring_numbering, carbonyl_carbon, ester_oxygen
-    
-    def cleave_at_position(
-        self,
-        mol: Chem.Mol,
-        position: int,
-        parent_id: Optional[str] = None
-    ) -> List[Fragment]:
-        """
-        在指定位置断裂侧链
-        
-        Args:
-            mol: RDKit分子对象
-            position: 环上位置编号（1到ring_size）
-            parent_id: 母分子ID（可选）
-        
-        Returns:
-            该位置的所有侧链碎片列表（通常0-1个Fragment）
-            如果该位置没有侧链，返回空列表
-        
-        Raises:
-            ValueError: 如果position超出范围
-            ValueError: 如果分子编号失败
-        """
-        # 验证position
-        if not (1 <= position <= self.ring_size):
-            raise ValueError(
-                f"Position must be between 1 and {self.ring_size}, got {position}"
-            )
-        
-        # 获取编号
-        ring_atoms, ring_numbering, carbonyl_carbon, ester_oxygen = self.number_ring(mol)
-        
-        # 获取SMILES
-        parent_smiles = Chem.MolToSmiles(mol)
-        if parent_id is None:
-            parent_id = f"macro_{self.ring_size}_{id(mol)}"
-        
-        # 断裂指定位置
-        fragment_smiles_list = cleave_side_chain_at_position(
-            mol, ring_atoms, ring_numbering, position
-        )
-        
-        # 创建Fragment对象
-        fragments = []
-        for i, frag_smiles in enumerate(fragment_smiles_list):
-            if frag_smiles:
-                frag_mol = Chem.MolFromSmiles(frag_smiles)
-                if frag_mol:
-                    fragment = Fragment(
-                        fragment_smiles=frag_smiles,
-                        parent_smiles=parent_smiles,
-                        cleavage_position=position,
-                        fragment_id=f"{parent_id}_pos{position}_{i}",
-                        parent_id=parent_id,
-                        atom_count=frag_mol.GetNumAtoms(),
-                        molecular_weight=round(Descriptors.MolWt(frag_mol), 2)
-                    )
-                    fragments.append(fragment)
-        
-        return fragments
-    
-    def cleave_all_sidechains(
-        self,
-        mol: Chem.Mol,
-        parent_id: Optional[str] = None,
-        skip_positions: Optional[List[int]] = None
-    ) -> MoleculeFragments:
-        """
-        断裂所有侧链（默认位置3到ring_size）
-        
-        Args:
-            mol: RDKit分子对象
-            parent_id: 母分子ID（可选）
-            skip_positions: 要跳过的位置列表（默认跳过1和2，即酯键位置）
-        
-        Returns:
-            MoleculeFragments对象，包含所有断裂的侧链
-        
-        Raises:
-            ValueError: 如果分子编号失败
-        """
-        # 获取编号
-        ring_atoms, ring_numbering, carbonyl_carbon, ester_oxygen = self.number_ring(mol)
-        
-        # 获取SMILES
-        parent_smiles = Chem.MolToSmiles(mol)
-        if parent_id is None:
-            parent_id = f"macro_{self.ring_size}_{id(mol)}"
-        
-        # 创建MoleculeFragments对象
-        mol_fragments = MoleculeFragments(
-            parent_id=parent_id,
-            parent_smiles=parent_smiles,
-            ring_numbering={str(k): v for k, v in ring_numbering.items()}
-        )
-        
-        # 确定要跳过的位置（默认跳过1和2）
-        if skip_positions is None:
-            skip_positions = [1, 2]
-        
-        # 断裂所有侧链
-        fragment_counter = 0
-        for position in range(1, self.ring_size + 1):
-            if position in skip_positions:
-                continue
-            
-            fragment_smiles_list = cleave_side_chain_at_position(
-                mol, ring_atoms, ring_numbering, position
-            )
-            
-            for frag_smiles in fragment_smiles_list:
-                if frag_smiles:
-                    frag_mol = Chem.MolFromSmiles(frag_smiles)
-                    if frag_mol:
-                        fragment = Fragment(
-                            fragment_smiles=frag_smiles,
-                            parent_smiles=parent_smiles,
-                            cleavage_position=position,
-                            fragment_id=f"{parent_id}_frag_{fragment_counter}",
-                            parent_id=parent_id,
-                            atom_count=frag_mol.GetNumAtoms(),
-                            molecular_weight=round(Descriptors.MolWt(frag_mol), 2)
-                        )
-                        mol_fragments.add_fragment(fragment)
-                        fragment_counter += 1
-        
-        return mol_fragments
-    
-    # ============ 可视化方法 ============
-    
-    def visualize_molecule(
-        self,
-        mol_input: Union[str, Chem.Mol],
-        show_numbering: bool = True,
-        size: Tuple[int, int] = (800, 800),
-        title: str = "",
-        save_path: Optional[Union[str, Path]] = None,
-        dpi: int = 600
-    ) -> str:
-        """
-        可视化分子结构
-        
-        Args:
-            mol_input: SMILES字符串或RDKit Mol对象
-            show_numbering: 是否显示环原子编号
-            size: 图像大小（宽，高）
-            title: 图像标题
-            save_path: 保存PNG的路径（可选）。如果提供，保存PNG图片
-            dpi: PNG图片的DPI（仅在save_path提供时有效）
-        
-        Returns:
-            SVG字符串（可在Jupyter中用display(SVG(...))显示）
-        
-        Raises:
-            ValueError: 如果分子解析失败
-            ValueError: 如果分子编号失败（show_numbering=True时）
-        """
-        # 解析输入
-        mol, smiles = self._parse_mol_input(mol_input)
-        
-        if show_numbering:
-            # 获取编号
-            ring_atoms, ring_numbering, _, _ = self.number_ring(mol)
-            
-            # 生成SVG
-            svg = draw_mol_with_ring_numbering(
-                mol,
-                ring_numbering=ring_numbering,
-                ring_atoms=ring_atoms,
-                size=size,
-                title=title
-            )
-        else:
-            # 不显示编号，只高亮环原子
-            try:
-                ring_atoms, _, _, _ = self.number_ring(mol)
-                
-                # 确保有2D坐标
-                AllChem.Compute2DCoords(mol)
-                
-                drawer = rdMolDraw2D.MolDraw2DSVG(size[0], size[1])
-                drawer.SetFontSize(6)
-                
-                draw_options = drawer.drawOptions()
-                draw_options.addAtomIndices = False
-                
-                # 高亮环原子
-                highlight_atoms = list(ring_atoms)
-                atom_colors = {idx: (0.8, 0.9, 1.0) for idx in ring_atoms}
-                
-                drawer.DrawMolecule(
-                    mol,
-                    highlightAtoms=highlight_atoms,
-                    highlightAtomColors=atom_colors
-                )
-                
-                drawer.FinishDrawing()
-                svg = drawer.GetDrawingText()
-            except:
-                # 如果编号失败，只显示基本分子
-                AllChem.Compute2DCoords(mol)
-                drawer = rdMolDraw2D.MolDraw2DSVG(size[0], size[1])
-                drawer.DrawMolecule(mol)
-                drawer.FinishDrawing()
-                svg = drawer.GetDrawingText()
-        
-        # 如果提供了save_path，保存PNG
-        if save_path is not None:
-            self._save_png(mol, save_path, size, dpi, show_numbering)
-        
-        return svg
-    
-    def visualize_fragment(
-        self,
-        fragment: Fragment,
-        size: Tuple[int, int] = (400, 400),
-        save_path: Optional[Union[str, Path]] = None,
-        dpi: int = 600
-    ) -> str:
-        """
-        可视化单个碎片
-        
-        Args:
-            fragment: Fragment对象
-            size: 图像大小
-            save_path: 保存PNG的路径（可选）
-            dpi: PNG图片的DPI
-        
-        Returns:
-            SVG字符串
-        
-        Raises:
-            ValueError: 如果碎片SMILES无效
-        """
-        frag_mol = Chem.MolFromSmiles(fragment.fragment_smiles)
-        if frag_mol is None:
-            raise ValueError(f"Invalid fragment SMILES: {fragment.fragment_smiles}")
-        
-        # 确保有2D坐标
-        AllChem.Compute2DCoords(frag_mol)
-        
-        # 生成SVG
-        drawer = rdMolDraw2D.MolDraw2DSVG(size[0], size[1])
-        drawer.SetFontSize(6)
-        
-        title = f"Position {fragment.cleavage_position}"
-        drawer.DrawMolecule(frag_mol)
-        drawer.FinishDrawing()
-        svg = drawer.GetDrawingText()
-        
-        # 如果提供了save_path，保存PNG
-        if save_path is not None:
-            save_path = Path(save_path)
-            save_path.parent.mkdir(parents=True, exist_ok=True)
-            
-            # 计算实际尺寸
-            scale_factor = dpi / 72.0
-            actual_width = int(size[0] * scale_factor)
-            actual_height = int(size[1] * scale_factor)
-            
-            drawer_png = rdMolDraw2D.MolDraw2DCairo(actual_width, actual_height)
-            font_size = max(6.0, min(40.0, 12.0 * scale_factor))
-            drawer_png.SetFontSize(font_size)
-            
-            drawer_png.DrawMolecule(frag_mol)
-            drawer_png.FinishDrawing()
-            
-            png_data = drawer_png.GetDrawingText()
-            with open(save_path, 'wb') as f:
-                f.write(png_data)
-        
-        return svg
-    
-    def visualize_all_fragments(
-        self,
-        mol_fragments: MoleculeFragments,
-        max_display: int = 10,
-        save_dir: Optional[Union[str, Path]] = None,
-        dpi: int = 600
-    ):
-        """
-        可视化所有碎片（在Jupyter中显示）
-        
-        Args:
-            mol_fragments: MoleculeFragments对象
-            max_display: 最多显示的碎片数量（防止输出过多）
-            save_dir: 保存PNG图片的目录（可选）。如果提供，所有碎片图片会保存到此目录
-            dpi: PNG图片的DPI
-        
-        Note:
-            此方法会直接在Jupyter中显示图片，不返回值
-        """
-        fragments = mol_fragments.fragments[:max_display]
-        
-        if save_dir is not None:
-            save_dir = Path(save_dir)
-            save_dir.mkdir(parents=True, exist_ok=True)
-        
-        for i, frag in enumerate(fragments, 1):
-            print(f"\n{'='*60}")
-            print(f"Fragment {i}/{len(fragments)}")
-            print(f"Position: {frag.cleavage_position}")
-            print(f"SMILES: {frag.fragment_smiles}")
-            print(f"Atoms: {frag.atom_count}, MW: {frag.molecular_weight}")
-            print('='*60)
-            
-            # 确定保存路径
-            frag_save_path = None
-            if save_dir is not None:
-                frag_save_path = save_dir / f"{frag.fragment_id}.png"
-            
-            # 生成SVG并显示
-            svg = self.visualize_fragment(frag, save_path=frag_save_path, dpi=dpi)
-            display(SVG(svg))
-        
-        if len(mol_fragments.fragments) > max_display:
-            print(f"\n... and {len(mol_fragments.fragments) - max_display} more fragments")
-    
-    # ============ 数据导出方法 ============
-    
-    def save_to_json(
-        self,
-        mol_fragments: MoleculeFragments,
-        filepath: Union[str, Path]
-    ):
-        """
-        保存到JSON文件
-        
-        Args:
-            mol_fragments: MoleculeFragments对象
-            filepath: JSON文件路径
-        """
-        filepath = Path(filepath)
-        filepath.parent.mkdir(parents=True, exist_ok=True)
-        mol_fragments.to_json_file(str(filepath))
-    
-    def to_dataframe(
-        self,
-        mol_fragments: MoleculeFragments
-    ) -> pd.DataFrame:
-        """
-        转换为DataFrame（方便分析）
-        
-        Args:
-            mol_fragments: MoleculeFragments对象
-        
-        Returns:
-            包含所有碎片信息的DataFrame，列包括：
-            - fragment_id
-            - parent_id
-            - parent_smiles
-            - cleavage_position
-            - fragment_smiles
-            - atom_count
-            - molecular_weight
-        """
-        data = []
-        for frag in mol_fragments.fragments:
-            data.append({
-                'fragment_id': frag.fragment_id,
-                'parent_id': frag.parent_id,
-                'parent_smiles': frag.parent_smiles,
-                'cleavage_position': frag.cleavage_position,
-                'fragment_smiles': frag.fragment_smiles,
-                'atom_count': frag.atom_count,
-                'molecular_weight': frag.molecular_weight
-            })
-        
-        return pd.DataFrame(data)
-    
-    # ============ 批处理方法 ============
-    
-    def process_csv(
-        self,
-        csv_path: Union[str, Path],
-        smiles_column: str = "smiles",
-        id_column: Optional[str] = "unique_id",
-        max_rows: Optional[int] = None
-    ) -> List[MoleculeFragments]:
-        """
-        批量处理CSV文件中的分子
-        
-        Args:
-            csv_path: CSV文件路径
-            smiles_column: SMILES列名
-            id_column: ID列名（可选）。如果为None，自动生成ID
-            max_rows: 最多处理的行数（可选）。用于测试或限制处理量
-        
-        Returns:
-            MoleculeFragments对象列表
-        
-        Raises:
-            FileNotFoundError: 如果CSV文件不存在
-            ValueError: 如果CSV中没有指定的列
-        """
-        csv_path = Path(csv_path)
-        if not csv_path.exists():
-            raise FileNotFoundError(f"CSV file not found: {csv_path}")
-        
-        # 读取CSV
-        df = pd.read_csv(csv_path)
-        
-        # 验证列存在
-        if smiles_column not in df.columns:
-            raise ValueError(
-                f"Column '{smiles_column}' not found in CSV. "
-                f"Available columns: {df.columns.tolist()}"
-            )
-        
-        if id_column is not None and id_column not in df.columns:
-            raise ValueError(
-                f"Column '{id_column}' not found in CSV. "
-                f"Available columns: {df.columns.tolist()}"
-            )
-        
-        # 限制行数
-        if max_rows is not None:
-            df = df.head(max_rows)
-        
-        # 批量处理
-        results = []
-        failed_count = 0
-        
-        for idx, row in df.iterrows():
-            smiles = row[smiles_column]
-            mol_id = row[id_column] if id_column is not None else f"mol_{idx}"
-            
-            try:
-                mol_fragments = self.process_molecule(smiles, parent_id=mol_id)
-                results.append(mol_fragments)
-            except Exception as e:
-                failed_count += 1
-                print(f"Failed to process {mol_id}: {e}")
-                continue
-        
-        print(f"\nProcessed {len(results)} molecules successfully")
-        if failed_count > 0:
-            print(f"Failed: {failed_count} molecules")
-        
-        return results
-    
-    def batch_to_dataframe(
-        self,
-        results: List[MoleculeFragments]
-    ) -> pd.DataFrame:
-        """
-        将批处理结果转换为单个DataFrame
-        
-        Args:
-            results: MoleculeFragments对象列表
-        
-        Returns:
-            包含所有分子、所有碎片的DataFrame
-        """
-        all_data = []
-        for mol_fragments in results:
-            df = self.to_dataframe(mol_fragments)
-            all_data.append(df)
-        
-        if all_data:
-            return pd.concat(all_data, ignore_index=True)
-        else:
-            return pd.DataFrame()
-    
-    # ============ 数据库接口（预留SQLModel）============
-    
-    def save_to_database(
-        self,
-        mol_fragments: MoleculeFragments,
-        session  # SQLModel Session
-    ):
-        """
-        保存到数据库（预留接口）
-        
-        预期Schema:
-        
-        ```python
-        from sqlmodel import SQLModel, Field, Relationship
-        from typing import Optional, List
-        
-        class MacrolactoneParent(SQLModel, table=True):
-            \"\"\"母分子表\"\"\"
-            id: Optional[int] = Field(default=None, primary_key=True)
-            parent_id: str = Field(index=True, unique=True)
-            parent_smiles: str
-            ring_size: int
-            ring_numbering: str  # JSON字符串
-            
-            # 关系
-            fragments: List["FragmentSidechain"] = Relationship(back_populates="parent")
-        
-        class FragmentSidechain(SQLModel, table=True):
-            \"\"\"侧链碎片表\"\"\"
-            id: Optional[int] = Field(default=None, primary_key=True)
-            fragment_id: str = Field(index=True)
-            parent_id: str = Field(foreign_key="macrolactoneparent.parent_id")
-            cleavage_position: int
-            fragment_smiles: str
-            atom_count: int
-            molecular_weight: float
-            
-            # 关系
-            parent: Optional[MacrolactoneParent] = Relationship(back_populates="fragments")
-        ```
-        
-        使用示例:
-        ```python
-        from sqlmodel import create_engine, Session
-        
-        engine = create_engine("postgresql://user:pass@localhost/db")
-        
-        with Session(engine) as session:
-            fragmenter.save_to_database(mol_fragments, session)
-        ```
-        
-        Args:
-            mol_fragments: MoleculeFragments对象
-            session: SQLModel Session对象
-        
-        TODO: 实现SQLModel集成
-        """
-        raise NotImplementedError(
-            "Database integration not yet implemented. "
-            "Will use SQLModel with schema: MacrolactoneParent + FragmentSidechain. "
-            "See method docstring for expected schema."
-        )
-    
-    @classmethod
-    def load_from_database(
-        cls,
-        parent_id: str,
-        session  # SQLModel Session
-    ) -> MoleculeFragments:
-        """
-        从数据库加载（预留接口）
-        
-        Args:
-            parent_id: 母分子ID
-            session: SQLModel Session对象
-        
-        Returns:
-            MoleculeFragments对象
-        
-        TODO: 实现SQLModel集成
-        """
-        raise NotImplementedError("Database loading not yet implemented")
-    
-    # ============ 私有辅助方法 ============
-    
-    def _parse_mol_input(
-        self,
-        mol_input: Union[str, Chem.Mol]
-    ) -> Tuple[Chem.Mol, str]:
-        """
-        解析分子输入（SMILES或Mol对象）
-        
-        Args:
-            mol_input: SMILES字符串或RDKit Mol对象
-        
-        Returns:
-            Tuple of (RDKit Mol对象, SMILES字符串)
-        
-        Raises:
-            ValueError: 如果输入无效
-        """
-        if isinstance(mol_input, str):
-            # SMILES字符串
-            mol = Chem.MolFromSmiles(mol_input)
-            if mol is None:
-                raise ValueError(f"Invalid SMILES: {mol_input}")
-            smiles = mol_input
-        elif isinstance(mol_input, Chem.Mol):
-            # RDKit Mol对象
-            mol = mol_input
-            smiles = Chem.MolToSmiles(mol)
-        else:
-            raise ValueError(
-                f"mol_input must be SMILES string or RDKit Mol object, "
-                f"got {type(mol_input)}"
-            )
-        
-        return mol, smiles
-    
-    def _save_png(
-        self,
-        mol: Chem.Mol,
-        save_path: Union[str, Path],
-        size: Tuple[int, int],
-        dpi: int,
-        show_numbering: bool
-    ):
-        """
-        保存分子为PNG图片
-        
-        Args:
-            mol: RDKit分子对象
-            save_path: 保存路径
-            size: 图像大小
-            dpi: DPI
-            show_numbering: 是否显示编号
-        """
-        save_path = Path(save_path)
-        save_path.parent.mkdir(parents=True, exist_ok=True)
-        
-        # 计算实际尺寸
-        scale_factor = dpi / 72.0
-        actual_width = int(size[0] * scale_factor)
-        actual_height = int(size[1] * scale_factor)
-        
-        drawer = rdMolDraw2D.MolDraw2DCairo(actual_width, actual_height)
-        
-        # 设置字体大小
-        font_size = max(6.0, min(40.0, 12.0 * scale_factor))
-        drawer.SetFontSize(font_size)
-        
-        draw_options = drawer.drawOptions()
-        draw_options.addAtomIndices = False
-        
-        if show_numbering:
-            try:
-                ring_atoms, ring_numbering, _, _ = self.number_ring(mol)
-                
-                # 高亮环原子
-                highlight_atoms = list(ring_atoms)
-                atom_colors = {idx: (0.8, 0.9, 1.0) for idx in ring_atoms}
-                
-                # 设置编号
-                mol_copy = Chem.Mol(mol)
-                for atom_idx in ring_atoms:
-                    if atom_idx in ring_numbering:
-                        atom = mol_copy.GetAtomWithIdx(atom_idx)
-                        atom.SetProp("atomNote", str(ring_numbering[atom_idx]))
-                
-                drawer.DrawMolecule(
-                    mol_copy,
-                    highlightAtoms=highlight_atoms,
-                    highlightAtomColors=atom_colors
-                )
-            except:
-                # 如果编号失败，只绘制基本分子
-                drawer.DrawMolecule(mol)
-        else:
-            drawer.DrawMolecule(mol)
-        
-        drawer.FinishDrawing()
-        
-        # 保存PNG
-        png_data = drawer.GetDrawingText()
-        with open(save_path, 'wb') as f:
-            f.write(png_data)
-    
-    def __repr__(self) -> str:
-        return (
-            f"MacrolactoneFragmenter(ring_size={self.ring_size}, "
-            f"smarts='{self.smarts_pattern[:50]}...')"
-        )
-
diff --git a/src/ring_numbering.py b/src/ring_numbering.py
deleted file mode 100644
index 6b76224..0000000
--- a/src/ring_numbering.py
+++ /dev/null
@@ -1,211 +0,0 @@
-"""
-Ring numbering system for 16-membered macrolactones.
-"""
-from rdkit import Chem
-from typing import Dict, Tuple, Optional, List
-
-
-def find_lactone_carbon(mol: Chem.Mol) -> Optional[int]:
-    """
-    Find the lactone carbonyl carbon in the 16-membered ring.
-    Uses SMARTS pattern: [r16;#8][r16;#6](=[#8]) to ensure both O and C are on the ring.
-    
-    Args:
-        mol: RDKit molecule object
-    
-    Returns:
-        Atom index of the lactone carbonyl carbon (C=O in the ring)
-        Returns None if not found
-    """
-    # First, get the 16-membered ring atoms
-    ring_atoms = get_ring_atoms(mol)
-    if len(ring_atoms) == 0:
-        return None
-    
-    ring_atoms_set = set(ring_atoms)
-    
-    # SMARTS pattern to match ester group where both O and C are on the ring
-    # [r16;#8] = ring atom that is oxygen
-    # [r16;#6](=[#8]) = ring atom that is carbon with double bond to oxygen
-    lactone_pattern = Chem.MolFromSmarts("[r16;#8][r16;#6](=[#8])")
-    
-    if lactone_pattern is None:
-        return None
-    
-    matches = mol.GetSubstructMatches(lactone_pattern)
-    
-    if not matches:
-        return None
-    
-    # Get the first match
-    # matches[0] contains: (ring_O, ring_C, O_double)
-    # We want the carbon (index 1) which should be in the ring
-    for match in matches:
-        if len(match) >= 2:
-            carbonyl_carbon = match[1]  # The carbonyl carbon
-            # Double check that it's in the ring (should be guaranteed by SMARTS, but verify)
-            if carbonyl_carbon in ring_atoms_set:
-                return carbonyl_carbon
-    
-    return None
-
-
-def get_ring_atoms(mol: Chem.Mol) -> List[int]:
-    """
-    Get all atoms in the 16-membered ring.
-    
-    Args:
-        mol: RDKit molecule object
-    
-    Returns:
-        List of atom indices in the 16-membered ring
-    """
-    # Find atoms in 16-membered rings
-    # Note: Use r16 directly, not r{16} which is not supported in all RDKit versions
-    ring_pattern = Chem.MolFromSmarts("[r16]")
-    
-    if ring_pattern is None:
-        return []
-    
-    matches = mol.GetSubstructMatches(ring_pattern)
-    
-    # Extract unique atom indices
-    ring_atoms = list(set([match[0] for match in matches]))
-    
-    return ring_atoms
-
-
-def order_ring_atoms_clockwise(mol: Chem.Mol, ring_atoms: List[int], start_atom: int) -> List[int]:
-    """
-    Order ring atoms in a clockwise manner starting from a specific atom.
-    
-    Args:
-        mol: RDKit molecule object
-        ring_atoms: List of atom indices in the ring
-        start_atom: Starting atom index
-    
-    Returns:
-        Ordered list of atom indices
-    """
-    if start_atom not in ring_atoms:
-        raise ValueError("Start atom is not in the ring")
-    
-    ordered = [start_atom]
-    visited = {start_atom}
-    current = start_atom
-    
-    while len(ordered) < len(ring_atoms):
-        atom = mol.GetAtomWithIdx(current)
-        
-        # Find the next ring atom among neighbors
-        next_atom = None
-        for neighbor in atom.GetNeighbors():
-            neighbor_idx = neighbor.GetIdx()
-            if neighbor_idx in ring_atoms and neighbor_idx not in visited:
-                next_atom = neighbor_idx
-                break
-        
-        if next_atom is None:
-            # If we can't find a forward path, we might need to go backwards from start
-            if len(ordered) < len(ring_atoms):
-                # Try going backwards from the start
-                atom = mol.GetAtomWithIdx(start_atom)
-                for neighbor in atom.GetNeighbors():
-                    neighbor_idx = neighbor.GetIdx()
-                    if neighbor_idx in ring_atoms and neighbor_idx not in visited:
-                        # Insert at the beginning
-                        ordered.insert(0, neighbor_idx)
-                        visited.add(neighbor_idx)
-                        current = neighbor_idx
-                        break
-                else:
-                    break
-            else:
-                break
-        else:
-            ordered.append(next_atom)
-            visited.add(next_atom)
-            current = next_atom
-    
-    return ordered
-
-
-def assign_ring_numbering(mol: Chem.Mol) -> Dict[int, int]:
-    """
-    Assign fixed numbering (1-16) to atoms in the 16-membered ring.
-    Numbering starts from the lactone carbonyl carbon (position 1) and proceeds clockwise.
-    
-    Args:
-        mol: RDKit molecule object
-    
-    Returns:
-        Dictionary mapping atom index to ring position (1-16)
-        Returns empty dict if no valid ring found
-    """
-    # Find the lactone carbon (will be position 1)
-    lactone_c = find_lactone_carbon(mol)
-    
-    if lactone_c is None:
-        print("Warning: Could not find lactone carbon")
-        return {}
-    
-    # Get all 16-ring atoms
-    ring_atoms = get_ring_atoms(mol)
-    
-    if len(ring_atoms) != 16:
-        print(f"Warning: Expected 16 ring atoms, found {len(ring_atoms)}")
-        # Still proceed if we found some ring atoms
-        if len(ring_atoms) == 0:
-            return {}
-    
-    # Order atoms clockwise starting from lactone carbon
-    ordered_atoms = order_ring_atoms_clockwise(mol, ring_atoms, lactone_c)
-    
-    # Create numbering dictionary
-    numbering = {}
-    for position, atom_idx in enumerate(ordered_atoms, start=1):
-        numbering[atom_idx] = position
-    
-    return numbering
-
-
-def get_ring_position(numbering: Dict[int, int], atom_idx: int) -> Optional[int]:
-    """
-    Get the ring position for a given atom index.
-    
-    Args:
-        numbering: Dictionary from assign_ring_numbering
-        atom_idx: Atom index
-    
-    Returns:
-        Ring position (1-16) or None if not in ring
-    """
-    return numbering.get(atom_idx)
-
-
-def validate_numbering(mol: Chem.Mol, numbering: Dict[int, int]) -> bool:
-    """
-    Validate that the numbering is correct.
-    
-    Args:
-        mol: RDKit molecule object
-        numbering: Dictionary from assign_ring_numbering
-    
-    Returns:
-        True if valid, False otherwise
-    """
-    if len(numbering) != 16:
-        return False
-    
-    # Check that positions are 1-16
-    positions = set(numbering.values())
-    if positions != set(range(1, 17)):
-        return False
-    
-    # Check that position 1 is the lactone carbon
-    lactone_c = find_lactone_carbon(mol)
-    if lactone_c is None or numbering.get(lactone_c) != 1:
-        return False
-    
-    return True
-
diff --git a/src/ring_visualization.py b/src/ring_visualization.py
deleted file mode 100644
index 7dce6e1..0000000
--- a/src/ring_visualization.py
+++ /dev/null
@@ -1,1175 +0,0 @@
-"""
-Ring numbering and visualization utilities for macrolactones.
-
-This module provides functions for:
-1. Assigning ring numbering starting from C=O carbonyl carbon
-2. Visualizing molecules with ring numbering in Jupyter notebooks
-3. Supporting both general molecules and macrolactones (12-20 membered rings)
-"""
-from rdkit import Chem
-from rdkit.Chem import Draw, AllChem
-from rdkit.Chem.Draw import rdMolDraw2D
-from IPython.display import SVG, display
-from typing import Dict, Tuple, Optional, List, Union
-from collections import deque
-from pathlib import Path
-import pandas as pd
-
-# 导入分析器类
-try:
-    from src.macro_lactone_analyzer import MacroLactoneAnalyzer
-except ImportError:
-    # 如果导入失败，定义一个简单的替代类
-    class MacroLactoneAnalyzer:
-        def is_valid_macrolactone(self, mol, ring_size):
-            # 简单检查是否有指定大小的环
-            ring_atoms = get_ring_atoms_by_size(mol, ring_size)
-            return ring_atoms is not None
-
-# 原子序数映射，支持元素符号和原子序数字符串
-ATOMIC_NUMBERS = {
-    "H": 1, "C": 6, "N": 7, "O": 8, "S": 16, "P": 15, 
-    "F": 9, "Cl": 17, "Br": 35, "I": 53
-}
-
-def normalize_atom_types(atom_types: Optional[List[str]]) -> Optional[List[int]]:
-    """
-    标准化原子类型列表，将元素符号转换为原子序数
-    
-    Args:
-        atom_types: 原子类型列表，可以是元素符号["C", "N"]或原子序数["6", "7"]
-    
-    Returns:
-        标准化的原子序数列表，如果输入为None则返回None
-    """
-    if atom_types is None:
-        return None
-    
-    normalized = []
-    for atom_type in atom_types:
-        if atom_type.isdigit():
-            # 已经是原子序数字符串
-            normalized.append(int(atom_type))
-        elif atom_type.upper() in ATOMIC_NUMBERS:
-            # 元素符号，转换为原子序数
-            normalized.append(ATOMIC_NUMBERS[atom_type.upper()])
-        else:
-            raise ValueError(f"不支持的原子类型: {atom_type}")
-    
-    return normalized
-
-def validate_ring_atom_composition(
-    mol: Chem.Mol, 
-    ring_atoms: List[int], 
-    carbonyl_carbon_idx: int, 
-    ester_oxygen_idx: int,
-    allowed_atom_types: Optional[List[int]] = None
-) -> Tuple[bool, str]:
-    """
-    验证环原子组成是否符合要求
-    
-    Args:
-        mol: RDKit分子对象
-        ring_atoms: 环原子索引列表
-        carbonyl_carbon_idx: 羰基碳索引（位置1）
-        ester_oxygen_idx: 酯氧索引（位置2）
-        allowed_atom_types: 允许的原子类型列表（原子序数）
-    
-    Returns:
-        (is_valid, reason): 是否有效及原因说明
-    """
-    if allowed_atom_types is None:
-        return True, "不限制原子类型"
-    
-    # 检查环中每个原子（除了酯键氧）
-    invalid_atoms = []
-    
-    for atom_idx in ring_atoms:
-        # 跳过酯键氧原子（位置2）
-        if atom_idx == ester_oxygen_idx:
-            continue
-            
-        atom = mol.GetAtomWithIdx(atom_idx)
-        atomic_num = atom.GetAtomicNum()
-        
-        if atomic_num not in allowed_atom_types:
-            symbol = atom.GetSymbol()
-            invalid_atoms.append(f"{symbol}(索引:{atom_idx})")
-    
-    if invalid_atoms:
-        return False, f"环中包含不允许的原子类型: {', '.join(invalid_atoms)}"
-    
-    return True, f"环原子组成符合要求，只允许: {[Chem.GetPeriodicTable().GetElementSymbol(num) for num in allowed_atom_types]}"
-
-
-def get_ring_atoms_by_size(mol: Chem.Mol, ring_size: int) -> Optional[List[int]]:
-    """
-    Find atoms in a ring of specified size.
-    
-    Args:
-        mol: RDKit molecule object
-        ring_size: Size of the ring to find (e.g., 16 for 16-membered ring)
-    
-    Returns:
-        List of atom indices in the ring, or None if not found
-    """
-    ring_info = mol.GetRingInfo()
-    rings = ring_info.AtomRings()
-    
-    for ring in rings:
-        if len(ring) == ring_size:
-            return list(ring)
-    
-    return None
-
-
-def find_ester_smarts(mol: Chem.Mol, ring_atoms: List[int]) -> Tuple[Optional[List[int]], Optional[str]]:
-    """
-    Find ester group atoms using SMARTS patterns.
-    
-    Args:
-        mol: RDKit molecule object
-        ring_atoms: List of atom indices in the ring
-    
-    Returns:
-        Tuple of (ester_atoms_list, pattern_string) or (None, None) if not found
-    """
-    ring_atoms_set = set(ring_atoms)
-    
-    # Try different SMARTS patterns
-    patterns = [
-        "[r16][C](=O)[O]",  # Original pattern for 16-membered ring
-        "[C](=O)[O]",       # General pattern
-        "C(=O)O",          # Simple pattern
-    ]
-    
-    for pattern_str in patterns:
-        pattern = Chem.MolFromSmarts(pattern_str)
-        if pattern is None:
-            continue
-        
-        matches = mol.GetSubstructMatches(pattern)
-        for match in matches:
-            # Check if first atom is in the ring
-            if match[0] in ring_atoms_set:
-                return list(match), pattern_str
-    
-    return None, None
-
-
-def get_carbonyl_carbon_in_ring(mol: Chem.Mol, ester_atoms: List[int], ring_atoms: List[int]) -> Optional[int]:
-    """
-    Find the C=O carbonyl carbon atom in the ring.
-    
-    Args:
-        mol: RDKit molecule object
-        ester_atoms: List of ester group atom indices
-        ring_atoms: List of ring atom indices
-    
-    Returns:
-        Atom index of carbonyl carbon, or None if not found
-    """
-    ring_atoms_set = set(ring_atoms)
-    
-    for idx in ester_atoms:
-        atom = mol.GetAtomWithIdx(idx)
-        # Find C atom in the ring
-        if atom.GetSymbol() == 'C' and idx in ring_atoms_set:
-            # Check if this C has a double bond to O (carbonyl feature)
-            for neighbor in atom.GetNeighbors():
-                neighbor_idx = neighbor.GetIdx()
-                bond = mol.GetBondBetweenAtoms(idx, neighbor_idx)
-                if bond.GetBondType() == Chem.BondType.DOUBLE and neighbor.GetSymbol() == 'O':
-                    return idx
-    
-    return None
-
-
-def get_ester_oxygen_in_ring(mol: Chem.Mol, ester_atoms: List[int], ring_atoms: List[int]) -> Optional[int]:
-    """
-    Find the ester oxygen atom in the ring.
-    
-    Args:
-        mol: RDKit molecule object
-        ester_atoms: List of ester group atom indices
-        ring_atoms: List of ring atom indices
-    
-    Returns:
-        Atom index of ester oxygen in ring, or None if not found
-    """
-    ring_atoms_set = set(ring_atoms)
-    
-    for idx in ester_atoms:
-        atom = mol.GetAtomWithIdx(idx)
-        if atom.GetSymbol() == 'O' and idx in ring_atoms_set:
-            return idx
-    
-    return None
-
-
-def order_ring_atoms_from_start(
-    mol: Chem.Mol, 
-    ring_atoms: List[int], 
-    start_atom_idx: int, 
-    target_atom_idx: Optional[int] = None
-) -> List[int]:
-    """
-    Order ring atoms starting from a specific atom, optionally prioritizing path to target atom.
-    
-    Args:
-        mol: RDKit molecule object
-        ring_atoms: List of ring atom indices
-        start_atom_idx: Starting atom index
-        target_atom_idx: Optional target atom index to prioritize path towards
-    
-    Returns:
-        Ordered list of atom indices
-    """
-    ring_atoms_set = set(ring_atoms)
-    
-    if start_atom_idx not in ring_atoms_set:
-        return ring_atoms
-    
-    ordered = [start_atom_idx]
-    remaining = ring_atoms_set - {start_atom_idx}
-    current = start_atom_idx
-    
-    # If target atom specified, prioritize path towards it
-    if target_atom_idx and target_atom_idx in ring_atoms_set:
-        queue = deque([(start_atom_idx, [start_atom_idx])])
-        visited = {start_atom_idx}
-        
-        while queue:
-            node, path = queue.popleft()
-            if node == target_atom_idx:
-                # Found path, add atoms in path
-                for atom_idx in path[1:]:  # Skip first (already in ordered)
-                    if atom_idx in remaining:
-                        ordered.append(atom_idx)
-                        remaining.remove(atom_idx)
-                break
-            
-            atom = mol.GetAtomWithIdx(node)
-            for neighbor in atom.GetNeighbors():
-                neighbor_idx = neighbor.GetIdx()
-                if neighbor_idx in ring_atoms_set and neighbor_idx not in visited:
-                    visited.add(neighbor_idx)
-                    queue.append((neighbor_idx, path + [neighbor_idx]))
-    
-    # Continue adding remaining ring atoms
-    while remaining:
-        atom = mol.GetAtomWithIdx(current)
-        found_next = False
-        
-        for neighbor in atom.GetNeighbors():
-            neighbor_idx = neighbor.GetIdx()
-            if neighbor_idx in remaining:
-                ordered.append(neighbor_idx)
-                remaining.remove(neighbor_idx)
-                current = neighbor_idx
-                found_next = True
-                break
-        
-        if not found_next:
-            if remaining:
-                next_atom = remaining.pop()
-                ordered.append(next_atom)
-                current = next_atom
-    
-    return ordered
-
-
-def create_ring_numbering(
-    mol: Chem.Mol, 
-    ring_atoms: List[int], 
-    carbonyl_carbon_idx: int, 
-    ester_oxygen_idx: int
-) -> Tuple[Dict[int, int], List[int]]:
-    """
-    Create ring numbering mapping starting from C=O carbonyl carbon.
-    
-    Args:
-        mol: RDKit molecule object
-        ring_atoms: List of ring atom indices
-        carbonyl_carbon_idx: Index of C=O carbonyl carbon (will be position 1)
-        ester_oxygen_idx: Index of ester oxygen in ring (for ordering direction)
-    
-    Returns:
-        Tuple of (numbering_dict, ordered_atoms_list)
-        numbering_dict: Maps atom index to position (1-N)
-        ordered_atoms_list: Ordered list of atom indices
-    """
-    ordered_atoms = order_ring_atoms_from_start(
-        mol, ring_atoms, carbonyl_carbon_idx, ester_oxygen_idx
-    )
-    numbering = {}
-    for i, atom_idx in enumerate(ordered_atoms, start=1):
-        numbering[atom_idx] = i
-    
-    return numbering, ordered_atoms
-
-
-def get_macrolactone_numbering(
-    mol: Union[Chem.Mol, str], 
-    ring_size: int = 16,
-    allowed_ring_atom_types: Optional[List[str]] = None
-) -> Tuple[Optional[List[int]], Optional[Dict[int, int]], Optional[List[int]], Optional[int], Optional[int], Tuple[bool, str]]:
-    """
-    Get ring numbering for a macrolactone molecule with optional atom type filtering.
-    
-    This function performs the complete numbering workflow:
-    1. Find ring of specified size
-    2. Find ester group
-    3. Find carbonyl carbon and ester oxygen
-    4. Validate ring atom composition (if filtering requested)
-    5. Create numbering mapping
-    
-    Args:
-        mol: RDKit molecule object or SMILES string
-        ring_size: Size of the macrolactone ring (12-20, default 16)
-        allowed_ring_atom_types: Allowed atom types for ring atoms (excluding ester oxygen)
-            - None: No restriction (default behavior)
-            - ["C"]: Only carbon atoms allowed (except ester oxygen)
-            - ["C", "N"]: Carbon and nitrogen atoms allowed (except ester oxygen)
-            - Can use element symbols or atomic number strings: ["6", "7"]
-    
-    Returns:
-        Tuple of (ring_atoms, ring_numbering, ordered_atoms, carbonyl_carbon, ester_oxygen, (is_valid, validation_reason))
-        Returns (None, None, None, None, None, (False, reason)) if numbering fails
-    """
-    # Convert SMILES to molecule if needed
-    if isinstance(mol, str):
-        mol = Chem.MolFromSmiles(mol)
-        if mol is None:
-            return None, None, None, None, None, (False, "无法解析SMILES字符串")
-    
-    # 标准化原子类型
-    try:
-        allowed_atom_numbers = normalize_atom_types(allowed_ring_atom_types)
-    except ValueError as e:
-        return None, None, None, None, None, (False, f"原子类型参数错误: {str(e)}")
-    
-    # Find ring of specified size
-    ring_atoms = get_ring_atoms_by_size(mol, ring_size)
-    if ring_atoms is None:
-        return None, None, None, None, None, (False, f"未找到{ring_size}元环")
-    
-    # Find ester group
-    ester_atoms, pattern = find_ester_smarts(mol, ring_atoms)
-    if ester_atoms is None:
-        return None, None, None, None, None, (False, "未在环中找到酯键")
-    
-    # Find carbonyl carbon and ester oxygen
-    carbonyl_carbon = get_carbonyl_carbon_in_ring(mol, ester_atoms, ring_atoms)
-    ester_oxygen = get_ester_oxygen_in_ring(mol, ester_atoms, ring_atoms)
-    
-    if carbonyl_carbon is None or ester_oxygen is None:
-        return None, None, None, None, None, (False, "无法识别羰基碳或酯氧原子")
-    
-    # 验证环原子组成（如果指定了允许的原子类型）
-    if allowed_atom_numbers is not None:
-        is_valid, validation_reason = validate_ring_atom_composition(
-            mol, ring_atoms, carbonyl_carbon, ester_oxygen, allowed_atom_numbers
-        )
-        if not is_valid:
-            return None, None, None, None, None, (False, validation_reason)
-    else:
-        validation_reason = "不限制原子类型"
-    
-    # Create numbering
-    ring_numbering, ordered_atoms = create_ring_numbering(
-        mol, ring_atoms, carbonyl_carbon, ester_oxygen
-    )
-    
-    return ring_atoms, ring_numbering, ordered_atoms, carbonyl_carbon, ester_oxygen, (True, validation_reason)
-
-
-def get_macrolactone_numbering_legacy(
-    mol: Union[Chem.Mol, str], 
-    ring_size: int = 16
-) -> Tuple[Optional[List[int]], Optional[Dict[int, int]], Optional[List[int]], Optional[int], Optional[int]]:
-    """
-    向后兼容版本的get_macrolactone_numbering函数
-    
-    Args:
-        mol: RDKit molecule object or SMILES string
-        ring_size: Size of the macrolactone ring (12-20, default 16)
-    
-    Returns:
-        Tuple of (ring_atoms, ring_numbering, ordered_atoms, carbonyl_carbon, ester_oxygen)
-        Returns (None, None, None, None, None) if numbering fails
-    """
-    ring_atoms, ring_numbering, ordered_atoms, carbonyl_carbon, ester_oxygen, (is_valid, _) = \
-        get_macrolactone_numbering(mol, ring_size, allowed_ring_atom_types=None)
-    
-    if not is_valid:
-        return None, None, None, None, None
-    
-    return ring_atoms, ring_numbering, ordered_atoms, carbonyl_carbon, ester_oxygen
-
-
-def is_pure_carbon_macrolactone(mol: Union[Chem.Mol, str], ring_size: int) -> Tuple[bool, str]:
-    """
-    便捷函数：检查是否为纯碳大环内酯（除酯键氧外只含碳原子）
-    
-    Args:
-        mol: RDKit molecule object or SMILES string
-        ring_size: Size of the macrolactone ring
-    
-    Returns:
-        (is_valid, reason): 是否为纯碳大环内酯及原因
-    """
-    _, _, _, _, _, (is_valid, reason) = get_macrolactone_numbering(
-        mol, ring_size, allowed_ring_atom_types=["C"]
-    )
-    return is_valid, reason
-
-
-def is_carbon_nitrogen_macrolactone(mol: Union[Chem.Mol, str], ring_size: int) -> Tuple[bool, str]:
-    """
-    便捷函数：检查是否为碳氮大环内酯（除酯键氧外只含碳和氮原子）
-    
-    Args:
-        mol: RDKit molecule object or SMILES string
-        ring_size: Size of the macrolactone ring
-    
-    Returns:
-        (is_valid, reason): 是否为碳氮大环内酯及原因
-    """
-    _, _, _, _, _, (is_valid, reason) = get_macrolactone_numbering(
-        mol, ring_size, allowed_ring_atom_types=["C", "N"]
-    )
-    return is_valid, reason
-
-
-def visualize_macrolactone_with_auto_coloring(
-    mol: Union[Chem.Mol, str],
-    ring_size: Optional[int] = None,
-    allowed_ring_atom_types: Optional[List[str]] = None,
-    size: Tuple[int, int] = (800, 800),
-    title: str = "",
-    return_svg: bool = False,
-    show_atom_labels: bool = True
-) -> Union[str, None]:
-    """
-    可视化大环内酯，自动根据原子类型进行颜色编码
-    
-    Args:
-        mol: RDKit分子对象或SMILES字符串
-        ring_size: 环大小（12-20），如果为None则自动检测
-        allowed_ring_atom_types: 允许的环原子类型（除酯氧外）
-            - None: 不限制，显示所有原子类型
-            - ["C"]: 只允许碳原子（默认筛选条件）
-            - ["C", "N"]: 允许碳和氮原子
-        size: 图像大小
-        title: 图像标题
-        return_svg: 是否返回SVG字符串
-        show_atom_labels: 是否显示原子编号标签
-    
-    Returns:
-        SVG字符串（如果return_svg=True）或显示图像
-    """
-    # 转换SMILES到分子对象
-    if isinstance(mol, str):
-        mol_obj = Chem.MolFromSmiles(mol)
-        if mol_obj is None:
-            if return_svg:
-                return None
-            else:
-                print("❌ 无法解析SMILES字符串")
-                return None
-    else:
-        mol_obj = mol
-    
-    # 自动检测环大小（如果未指定）
-    if ring_size is None:
-        ring_sizes = []
-        for size in range(12, 21):
-            ring_atoms = get_ring_atoms_by_size(mol_obj, size)
-            if ring_atoms:
-                # 检查是否为有效大环内酯
-                analyzer = MacroLactoneAnalyzer()
-                if analyzer.is_valid_macrolactone(mol_obj, size):
-                    ring_sizes.append(size)
-        
-        if not ring_sizes:
-            if return_svg:
-                return None
-            else:
-                print("❌ 未找到12-20元大环内酯")
-                return None
-        
-        # 使用找到的第一个环大小
-        ring_size = ring_sizes[0]
-        if len(ring_sizes) > 1:
-            print(f"⚠️  找到多个环大小: {ring_sizes}，使用 {ring_size}")
-    
-    # 获取环编号信息
-    ring_atoms, ring_numbering, ordered_atoms, carbonyl_carbon, ester_oxygen, (is_valid, validation_reason) = \
-        get_macrolactone_numbering(mol_obj, ring_size, allowed_ring_atom_types)
-    
-    if not is_valid or not ring_atoms:
-        if return_svg:
-            return None
-        else:
-            print(f"❌ 无法获取环编号: {validation_reason}")
-            return None
-    
-    # 创建分子副本并设置原子标签
-    mol_copy = Chem.Mol(mol_obj)
-    AllChem.Compute2DCoords(mol_copy)
-    
-    if show_atom_labels:
-        for atom_idx in ring_atoms:
-            if atom_idx in ring_numbering:
-                atom = mol_copy.GetAtomWithIdx(atom_idx)
-                atom.SetProp("atomNote", str(ring_numbering[atom_idx]))
-    
-    # 定义原子颜色方案
-    def get_atom_color(symbol: str, is_ester_oxygen: bool = False) -> Tuple[float, float, float]:
-        """获取原子颜色"""
-        if is_ester_oxygen:
-            return (1.0, 0.4, 0.4)  # 酯氧用深红色
-        elif symbol == 'C':
-            return (0.7, 0.8, 1.0)  # 碳用蓝色
-        elif symbol == 'N':
-            return (1.0, 0.8, 1.0)  # 氮用粉色
-        elif symbol == 'O':
-            return (1.0, 0.7, 0.7)  # 氧用红色
-        elif symbol == 'S':
-            return (1.0, 1.0, 0.6)  # 硫用黄色
-        elif symbol == 'P':
-            return (0.8, 0.6, 1.0)  # 磷用紫色
-        else:
-            return (0.8, 1.0, 0.8)  # 其他用绿色
-    
-    # 设置原子颜色
-    atom_colors = {}
-    atom_type_stats = {}
-    
-    for atom_idx in ring_atoms:
-        atom = mol_obj.GetAtomWithIdx(atom_idx)
-        symbol = atom.GetSymbol()
-        is_ester_oxygen = (atom_idx == ester_oxygen)
-        
-        # 设置颜色
-        color = get_atom_color(symbol, is_ester_oxygen)
-        atom_colors[atom_idx] = color
-        
-        # 统计原子类型（不包括酯氧）
-        if not is_ester_oxygen:
-            atom_type_stats[symbol] = atom_type_stats.get(symbol, 0) + 1
-    
-    # 绘制分子
-    # 确保size是元组格式
-    if isinstance(size, int):
-        size = (size, size)
-    elif isinstance(size, (list, tuple)) and len(size) == 1:
-        size = (size[0], size[0])
-    elif not isinstance(size, (list, tuple)) or len(size) != 2:
-        size = (800, 800)  # 默认大小
-    
-    drawer = rdMolDraw2D.MolDraw2DSVG(int(size[0]), int(size[1]))
-    drawer.SetFontSize(12)  # 设置合适的字体大小（最小为6）
-    
-    # 注意：某些RDKit版本不支持DrawTitle，暂时注释掉
-    # if title:
-    #     drawer.DrawTitle(title)
-    
-    drawer.DrawMolecule(mol_copy, 
-                       highlightAtoms=ring_atoms,
-                       highlightAtomColors=atom_colors)
-    drawer.FinishDrawing()
-    svg = drawer.GetDrawingText()
-    
-    # 显示统计信息
-    if not return_svg:
-        display(SVG(svg))
-        
-        print(f"\n📊 分子信息:")
-        print(f"   环大小: {ring_size} 元环")
-        print(f"   羰基碳位置: {ring_numbering.get(carbonyl_carbon, 'N/A')} (深红色标记)")
-        print(f"   酯氧位置: {ring_numbering.get(ester_oxygen, 'N/A')} (深红色标记)")
-        print(f"   环原子组成: {atom_type_stats}")
-        print(f"   筛选条件: {validation_reason}")
-        
-        # 显示颜色说明
-        print(f"\n🎨 颜色说明:")
-        print(f"   深红色: 酯键氧原子 (位置2)")
-        print(f"   蓝色: 碳原子")
-        print(f"   粉色: 氮原子")
-        print(f"   红色: 氧原子")
-        print(f"   黄色: 硫原子")
-        print(f"   紫色: 磷原子")
-        print(f"   绿色: 其他原子")
-    
-    return svg if return_svg else None
-
-
-def batch_visualize_macrolactones(
-    data_file: Path,
-    ring_sizes: List[int] = None,
-    allowed_ring_atom_types: Optional[List[str]] = None,
-    max_examples_per_size: int = 3,
-    output_dir: Optional[Path] = None
-) -> Dict[int, List[Dict]]:
-    """
-    批量可视化大环内酯分子
-    
-    Args:
-        data_file: CSV数据文件路径
-        ring_sizes: 要测试的环大小列表，默认为12-20
-        allowed_ring_atom_types: 允许的环原子类型
-        max_examples_per_size: 每种环大小最大示例数
-        output_dir: 输出目录，如果指定则保存SVG文件
-    
-    Returns:
-        按环大小分组的可视化结果字典
-    """
-    if ring_sizes is None:
-        ring_sizes = list(range(12, 21))
-    
-    print(f"🔍 开始批量可视化大环内酯")
-    print(f"   数据文件: {data_file}")
-    print(f"   环大小范围: {ring_sizes}")
-    print(f"   筛选条件: {allowed_ring_atom_types or '无限制'}")
-    
-    # 加载数据
-    if not data_file.exists():
-        print(f"❌ 数据文件不存在: {data_file}")
-        return {}
-    
-    df = pd.read_csv(data_file)
-    print(f"✓ 加载数据: {len(df)} 个分子")
-    
-    results = {}
-    
-    for ring_size in ring_sizes:
-        print(f"\n🔄 处理 {ring_size} 元环...")
-        
-        size_results = []
-        found_count = 0
-        
-        for idx, row in df.iterrows():
-            if found_count >= max_examples_per_size:
-                break
-            
-            smiles = row.get('smiles', '')
-            if not smiles:
-                continue
-            
-            try:
-                # 测试分子
-                mol = Chem.MolFromSmiles(smiles)
-                if not mol:
-                    continue
-                
-                # 检查是否为指定大小的有效大环内酯
-                analyzer = MacroLactoneAnalyzer()
-                if not analyzer.is_valid_macrolactone(mol, ring_size):
-                    continue
-                
-                # 应用筛选条件（如果指定）
-                if allowed_ring_atom_types:
-                    is_valid, validation_reason = get_macrolactone_numbering(
-                        mol, ring_size, allowed_ring_atom_types
-                    )[5]
-                    if not is_valid:
-                        continue
-                
-                # 获取详细信息
-                ring_atoms, ring_numbering, ordered_atoms, carbonyl_carbon, ester_oxygen, (is_valid, validation_reason) = \
-                    get_macrolactone_numbering(mol, ring_size, allowed_ring_atom_types)
-                
-                if is_valid:
-                    # 分析原子组成
-                    composition = analyze_ring_atom_composition(smiles, ring_size)
-                    
-                    result = {
-                        'index': idx,
-                        'smiles': smiles,
-                        'molecule_id': row.get('molecule_id', f'mol_{idx}'),
-                        'ring_size': ring_size,
-                        'composition': composition,
-                        'validation_reason': validation_reason,
-                        'carbonyl_carbon_pos': ring_numbering.get(carbonyl_carbon, 'N/A'),
-                        'ester_oxygen_pos': ring_numbering.get(ester_oxygen, 'N/A')
-                    }
-                    
-                    size_results.append(result)
-                    found_count += 1
-                    
-                    print(f"   ✓ 找到示例 {found_count}: 分子{idx} ({composition})")
-                    
-                    # 保存可视化（如果指定了输出目录）
-                    if output_dir:
-                        output_dir.mkdir(parents=True, exist_ok=True)
-                        filename = f"ring{ring_size}_example{found_count}_mol{idx}.svg"
-                        output_path = output_dir / filename
-                        
-                        svg = visualize_macrolactone_with_auto_coloring(
-                            mol, ring_size, allowed_ring_atom_types, 
-                            return_svg=True, 
-                            title=f"{ring_size}-元环示例 {found_count}"
-                        )
-                        
-                        if svg:
-                            with open(output_path, 'w', encoding='utf-8') as f:
-                                f.write(svg)
-                            print(f"     💾 保存到: {output_path}")
-                
-            except Exception as e:
-                print(f"   ⚠️  处理分子 {idx} 时出错: {str(e)}")
-                continue
-        
-        results[ring_size] = size_results
-        print(f"   📊 {ring_size} 元环: 找到 {len(size_results)} 个示例")
-    
-    # 显示总体统计
-    print(f"\n📈 总体统计:")
-    total_found = sum(len(results[size]) for size in ring_sizes)
-    print(f"   总示例数: {total_found}")
-    
-    for ring_size in ring_sizes:
-        count = len(results[ring_size])
-        print(f"   {ring_size} 元环: {count} 个示例")
-    
-    return results
-
-
-def test_all_ring_sizes_with_filtering(
-    filtered_data_dir: Path,
-    allowed_ring_atom_types: Optional[List[str]] = ["C"],  # 默认只允许碳原子
-    output_dir: Optional[Path] = None
-) -> Dict[int, Dict]:
-    """
-    测试所有环大小（12-20）的筛选效果
-    
-    Args:
-        filtered_data_dir: 包含filtered CSV文件的目录
-        allowed_ring_atom_types: 允许的环原子类型
-        output_dir: 输出目录
-    
-    Returns:
-        详细的测试结果
-    """
-    print(f"🧪 开始测试所有环大小的筛选效果")
-    print(f"   数据目录: {filtered_data_dir}")
-    print(f"   筛选条件: {allowed_ring_atom_types or '无限制'}")
-    
-    all_results = {}
-    
-    for ring_size in range(12, 21):
-        print(f"\n{'='*60}")
-        print(f"🔍 测试 {ring_size} 元环")
-        print(f"{'='*60}")
-        
-        # 查找对应的数据文件
-        data_file = filtered_data_dir / f'macrolactone_ring{ring_size}_filtered.csv'
-        
-        if not data_file.exists():
-            print(f"⚠️  未找到 {ring_size} 元环数据文件: {data_file}")
-            all_results[ring_size] = {
-                'data_file_exists': False,
-                'total_molecules': 0,
-                'valid_macrolactones': 0,
-                'filtered_molecules': 0,
-                'examples': []
-            }
-            continue
-        
-        # 批量测试
-        batch_results = batch_visualize_macrolactones(
-            data_file, 
-            ring_sizes=[ring_size],
-            allowed_ring_atom_types=allowed_ring_atom_types,
-            max_examples_per_size=5,
-            output_dir=output_dir / f'ring{ring_size}_examples' if output_dir else None
-        )
-        
-        # 统计结果
-        size_results = batch_results.get(ring_size, [])
-        
-        # 加载完整数据进行统计
-        try:
-            df = pd.read_csv(data_file)
-            total_molecules = len(df)
-            
-            # 统计有效大环内酯数量
-            valid_count = 0
-            filtered_count = 0
-            
-            for _, row in df.iterrows():
-                smiles = row.get('smiles', '')
-                if not smiles:
-                    continue
-                
-                try:
-                    mol = Chem.MolFromSmiles(smiles)
-                    if mol:
-                        analyzer = MacroLactoneAnalyzer()
-                        if analyzer.is_valid_macrolactone(mol, ring_size):
-                            valid_count += 1
-                            
-                            if allowed_ring_atom_types:
-                                is_valid, _ = get_macrolactone_numbering(
-                                    mol, ring_size, allowed_ring_atom_types
-                                )[5]
-                                if is_valid:
-                                    filtered_count += 1
-                            else:
-                                filtered_count += 1
-                except:
-                    continue
-            
-            all_results[ring_size] = {
-                'data_file_exists': True,
-                'total_molecules': total_molecules,
-                'valid_macrolactones': valid_count,
-                'filtered_molecules': filtered_count,
-                'filter_rate': filtered_count / valid_count * 100 if valid_count > 0 else 0,
-                'examples': size_results
-            }
-            
-            print(f"\n📊 {ring_size} 元环统计:")
-            print(f"   总分子数: {total_molecules}")
-            print(f"   有效大环内酯: {valid_count}")
-            print(f"   通过筛选: {filtered_count}")
-            print(f"   筛选通过率: {filtered_count/valid_count*100:.1f}%" if valid_count > 0 else "   筛选通过率: 0%")
-            
-        except Exception as e:
-            print(f"❌ 统计 {ring_size} 元环时出错: {str(e)}")
-            all_results[ring_size] = {
-                'data_file_exists': True,
-                'error': str(e),
-                'examples': size_results
-            }
-    
-    # 显示总体统计
-    print(f"\n{'='*60}")
-    print(f"📈 所有环大小测试总结")
-    print(f"{'='*60}")
-    
-    total_molecules = sum(result.get('total_molecules', 0) for result in all_results.values())
-    total_valid = sum(result.get('valid_macrolactones', 0) for result in all_results.values())
-    total_filtered = sum(result.get('filtered_molecules', 0) for result in all_results.values())
-    
-    print(f"总分子数: {total_molecules}")
-    print(f"总有效大环内酯: {total_valid}")
-    print(f"总通过筛选: {total_filtered}")
-    print(f"总体筛选通过率: {total_filtered/total_valid*100:.1f}%" if total_valid > 0 else "总体筛选通过率: 0%")
-    
-    return all_results
-
-
-def analyze_ring_atom_composition(mol: Union[Chem.Mol, str], ring_size: int) -> Dict[str, int]:
-    # 首先获取环编号信息
-    ring_atoms, ring_numbering, ordered_atoms, carbonyl_carbon, ester_oxygen, (is_valid, _) = \
-        get_macrolactone_numbering(mol, ring_size)
-    
-    if not is_valid or not ring_atoms:
-        return {}
-    
-    # 转换SMILES到分子对象（如果需要）
-    if isinstance(mol, str):
-        mol = Chem.MolFromSmiles(mol)
-        if mol is None:
-            return {}
-    
-    # 统计原子类型
-    composition = {}
-    
-    for atom_idx in ring_atoms:
-        # 跳过酯键氧原子
-        if atom_idx == ester_oxygen:
-            continue
-            
-        atom = mol.GetAtomWithIdx(atom_idx)
-        symbol = atom.GetSymbol()
-        composition[symbol] = composition.get(symbol, 0) + 1
-    
-    return composition
-
-
-def draw_mol_with_ring_numbering(
-    mol: Union[Chem.Mol, str],
-    ring_numbering: Optional[Dict[int, int]] = None,
-    ring_atoms: Optional[List[int]] = None,
-    size: Tuple[int, int] = (800, 800),
-    title: str = "",
-    ring_size: int = 16,
-    return_svg: bool = False
-) -> Optional[str]:
-    """
-    Draw molecule with ring numbering displayed.
-    
-    This function can work in two modes:
-    1. If ring_numbering is provided, use it directly
-    2. If ring_numbering is None, automatically compute it for macrolactones
-    
-    Args:
-        mol: RDKit molecule object or SMILES string
-        ring_numbering: Optional pre-computed numbering dictionary
-        ring_atoms: Optional pre-computed ring atoms list
-        size: Image size (width, height)
-        title: Optional title for the image
-        ring_size: Ring size for auto-numbering (default 16)
-        return_svg: If True, return SVG string instead of displaying
-    
-    Returns:
-        SVG string if return_svg=True, None otherwise (displays in notebook)
-    """
-    # Convert SMILES to molecule if needed
-    if isinstance(mol, str):
-        mol = Chem.MolFromSmiles(mol)
-        if mol is None:
-            print("Error: Could not parse SMILES")
-            return None
-    
-    # Auto-compute numbering if not provided
-    if ring_numbering is None:
-        ring_atoms, ring_numbering, _, _, _ = get_macrolactone_numbering(mol, ring_size)
-        if ring_numbering is None:
-            print("Error: Could not compute ring numbering")
-            return None
-    
-    # Get ring atoms if not provided
-    if ring_atoms is None:
-        ring_atoms = list(ring_numbering.keys())
-    
-    # Create drawer
-    drawer = rdMolDraw2D.MolDraw2DSVG(size[0], size[1])
-    drawer.SetFontSize(6)  # Minimum font size is 6
-    
-    draw_options = drawer.drawOptions()
-    draw_options.addAtomIndices = False
-    
-    # Highlight ring atoms
-    highlight_atoms = list(ring_atoms)
-    atom_colors = {}
-    for atom_idx in ring_atoms:
-        atom_colors[atom_idx] = (0.8, 0.9, 1.0)  # Light blue
-    
-    # Create copy and set atom notes (ring numbering)
-    mol_copy = Chem.Mol(mol)
-    for atom_idx in ring_atoms:
-        if atom_idx in ring_numbering:
-            atom = mol_copy.GetAtomWithIdx(atom_idx)
-            atom.SetProp("atomNote", str(ring_numbering[atom_idx]))
-    
-    # Draw molecule
-    drawer.DrawMolecule(
-        mol_copy,
-        highlightAtoms=highlight_atoms,
-        highlightAtomColors=atom_colors
-    )
-    
-    drawer.FinishDrawing()
-    svg = drawer.GetDrawingText()
-    
-    if return_svg:
-        return svg
-    else:
-        display(SVG(svg))
-        return None
-
-
-def visualize_molecule_with_numbering(
-    mol: Union[Chem.Mol, str],
-    ring_size: int = 16,
-    size: Tuple[int, int] = (800, 800),
-    title: str = ""
-) -> Tuple[Optional[Dict[int, int]], Optional[List[int]]]:
-    """
-    Convenience function to visualize a molecule with automatic ring numbering.
-    
-    This is the main function to use in Jupyter notebooks for quick visualization.
-    
-    Args:
-        mol: RDKit molecule object or SMILES string
-        ring_size: Ring size for macrolactones (default 16)
-        size: Image size (width, height)
-        title: Optional title
-    
-    Returns:
-        Tuple of (ring_numbering_dict, ring_atoms_list)
-    """
-    # Get numbering
-    ring_atoms, ring_numbering, ordered_atoms, carbonyl_carbon, ester_oxygen = \
-        get_macrolactone_numbering(mol, ring_size)
-    
-    if ring_numbering is None:
-        print("Error: Could not compute ring numbering")
-        return None, None
-    
-    # Display
-    print(f"Ring size: {ring_size}")
-    print(f"Carbonyl C position: {ring_numbering.get(carbonyl_carbon, 'N/A')}")
-    print(f"Ester O position: {ring_numbering.get(ester_oxygen, 'N/A')}")
-    print(f"Numbering range: 1-{len(ring_numbering)}")
-    
-    draw_mol_with_ring_numbering(
-        mol, ring_numbering, ring_atoms, size, title, ring_size
-    )
-    
-    return ring_numbering, ring_atoms
-
-
-def identify_side_chains(mol: Chem.Mol, ring_atoms: List[int]) -> List[Tuple[int, int]]:
-    """
-    Identify side chains attached to the ring.
-    
-    Args:
-        mol: RDKit molecule object
-        ring_atoms: List of atom indices in the ring
-    
-    Returns:
-        List of tuples (ring_atom_idx, side_chain_first_atom_idx)
-    """
-    side_chains = []
-    ring_atom_set = set(ring_atoms)
-    
-    for ring_atom_idx in ring_atoms:
-        atom = mol.GetAtomWithIdx(ring_atom_idx)
-        for neighbor in atom.GetNeighbors():
-            neighbor_idx = neighbor.GetIdx()
-            # If neighbor is not in the ring, it's a side chain
-            if neighbor_idx not in ring_atom_set:
-                side_chains.append((ring_atom_idx, neighbor_idx))
-    
-    return side_chains
-
-
-def extract_side_chain_fragment(
-    mol: Chem.Mol, 
-    ring_atom_idx: int, 
-    side_chain_start_idx: int, 
-    ring_atoms: List[int]
-) -> Optional[str]:
-    """
-    Extract a side chain fragment as a SMILES string with dummy atom (*) at attachment point.
-    
-    Args:
-        mol: RDKit molecule object
-        ring_atom_idx: Index of the ring atom where side chain is attached
-        side_chain_start_idx: Index of the first atom in the side chain
-        ring_atoms: List of all ring atom indices
-    
-    Returns:
-        SMILES string of the fragment (contains dummy atom *), or None if extraction failed
-    """
-    ring_atom_set = set(ring_atoms)
-    visited = set()
-    queue = [side_chain_start_idx]
-    side_chain_atoms = []
-    
-    # Use BFS to collect all atoms in the side chain
-    while queue:
-        current_idx = queue.pop(0)
-        if current_idx in visited:
-            continue
-        
-        visited.add(current_idx)
-        side_chain_atoms.append(current_idx)
-        
-        atom = mol.GetAtomWithIdx(current_idx)
-        for neighbor in atom.GetNeighbors():
-            neighbor_idx = neighbor.GetIdx()
-            # Only continue into non-ring atoms
-            if neighbor_idx not in ring_atom_set and neighbor_idx not in visited:
-                queue.append(neighbor_idx)
-    
-    if not side_chain_atoms:
-        return None
-    
-    # Get the bond type to the ring
-    bond_to_ring = mol.GetBondBetweenAtoms(ring_atom_idx, side_chain_start_idx)
-    if bond_to_ring is None:
-        return None
-    bond_type = bond_to_ring.GetBondType()
-    
-    # Create a new molecule with only the side chain atoms
-    fragment_mol = Chem.RWMol()
-    old_to_new = {}
-    
-    # Add atoms
-    for old_idx in side_chain_atoms:
-        atom = mol.GetAtomWithIdx(old_idx)
-        new_atom = Chem.Atom(atom.GetAtomicNum())
-        new_atom.SetFormalCharge(atom.GetFormalCharge())
-        new_atom.SetIsAromatic(atom.GetIsAromatic())
-        new_idx = fragment_mol.AddAtom(new_atom)
-        old_to_new[old_idx] = new_idx
-    
-    # Add bonds (within side chain)
-    for old_idx in side_chain_atoms:
-        atom = mol.GetAtomWithIdx(old_idx)
-        for neighbor in atom.GetNeighbors():
-            neighbor_idx = neighbor.GetIdx()
-            if neighbor_idx in old_to_new and old_idx < neighbor_idx:
-                bond = mol.GetBondBetweenAtoms(old_idx, neighbor_idx)
-                fragment_mol.AddBond(
-                    old_to_new[old_idx],
-                    old_to_new[neighbor_idx],
-                    bond.GetBondType()
-                )
-    
-    # Add dummy atom (atomic number 0, displays as * in SMILES)
-    # Dummy atom connects to the atom that was originally connected to the ring
-    attachment_point = old_to_new[side_chain_start_idx]
-    dummy_atom_idx = fragment_mol.AddAtom(Chem.Atom(0))
-    
-    # Add bond between dummy atom and attachment point, keeping original bond type
-    fragment_mol.AddBond(attachment_point, dummy_atom_idx, bond_type)
-    
-    # Convert to molecule and get SMILES
-    try:
-        fragment_mol = fragment_mol.GetMol()
-        Chem.SanitizeMol(fragment_mol)
-        fragment_smiles = Chem.MolToSmiles(fragment_mol)
-        return fragment_smiles
-    except Exception as e:
-        return None
-
-
-def cleave_side_chain_at_position(
-    mol: Chem.Mol, 
-    ring_atoms: List[int], 
-    ring_numbering: Dict[int, int], 
-    position: int
-) -> List[str]:
-    """
-    Cleave side chains at a specified ring position.
-    
-    Args:
-        mol: RDKit molecule object
-        ring_atoms: List of ring atom indices
-        ring_numbering: Ring numbering dictionary mapping atom index to position
-        position: Position number to cleave (1-N, where N is ring size)
-    
-    Returns:
-        List of SMILES strings, one for each side chain (contains dummy atom *, 
-        compatible with molzip for reconstruction)
-    """
-    # Find the ring atom index for this position
-    ring_atom_idx = None
-    for atom_idx, pos in ring_numbering.items():
-        if pos == position:
-            ring_atom_idx = atom_idx
-            break
-    
-    if ring_atom_idx is None:
-        return []
-    
-    # Find all side chains attached to this ring atom
-    side_chains = identify_side_chains(mol, ring_atoms)
-    fragments = []
-    
-    for ring_atom, side_start in side_chains:
-        if ring_atom == ring_atom_idx:
-            fragment_smiles = extract_side_chain_fragment(mol, ring_atom, side_start, ring_atoms)
-            if fragment_smiles:
-                fragments.append(fragment_smiles)
-    
-    return fragments
-
diff --git a/src/splicing/__init__.py b/src/splicing/__init__.py
deleted file mode 100644
index e69de29..0000000
diff --git a/src/splicing/engine.py b/src/splicing/engine.py
deleted file mode 100644
index 3a4e3a9..0000000
--- a/src/splicing/engine.py
+++ /dev/null
@@ -1,79 +0,0 @@
-from rdkit import Chem
-
-def splice_molecule(scaffold: Chem.Mol, fragment: Chem.Mol, position: int) -> Chem.Mol:
-    """
-    Connects a scaffold and a fragment by replacing a specific dummy atom on the scaffold
-    and a dummy atom on the fragment with a single bond.
-
-    Args:
-        scaffold: The scaffold molecule containing labeled dummy atoms (e.g., [1*]).
-        fragment: The fragment molecule containing a dummy atom (*).
-        position: The isotope number of the dummy atom on the scaffold to attach to.
-
-    Returns:
-        Chem.Mol: The spliced molecule.
-
-    Raises:
-        ValueError: If the specified dummy atom is not found on the scaffold
-                    or if the fragment does not contain a dummy atom.
-    """
-    # 1. Combine molecules
-    # Note: CombineMols preserves atom indices of mol1 (scaffold), and appends mol2 (fragment).
-    # Atoms 0 to N-1 are scaffold, N to N+M-1 are fragment.
-    combined = Chem.CombineMols(scaffold, fragment)
-    rw_mol = Chem.RWMol(combined)
-    
-    scaffold_atom_count = scaffold.GetNumAtoms()
-    total_atoms = rw_mol.GetNumAtoms()
-    
-    # 2. Find Scaffold Dummy
-    scaffold_dummy_idx = -1
-    for i in range(scaffold_atom_count):
-        atom = rw_mol.GetAtomWithIdx(i)
-        if atom.GetAtomicNum() == 0 and atom.GetIsotope() == position:
-            scaffold_dummy_idx = i
-            break
-            
-    if scaffold_dummy_idx == -1:
-        raise ValueError(f"Scaffold dummy atom with isotope {position} not found")
-        
-    # 3. Find Fragment Dummy
-    # We search in the fragment part (indices >= scaffold_atom_count)
-    fragment_dummy_idx = -1
-    for i in range(scaffold_atom_count, total_atoms):
-        atom = rw_mol.GetAtomWithIdx(i)
-        # We assume any dummy atom in the fragment is the attachment point.
-        # Usually fragment has isotope 0 on its dummy.
-        if atom.GetAtomicNum() == 0:
-            fragment_dummy_idx = i
-            break
-            
-    if fragment_dummy_idx == -1:
-        raise ValueError("Fragment does not contain a dummy atom")
-        
-    # 4. Identify Neighbors (Anchors)
-    scaffold_dummy = rw_mol.GetAtomWithIdx(scaffold_dummy_idx)
-    if scaffold_dummy.GetDegree() != 1:
-        raise ValueError(f"Scaffold dummy atom at index {scaffold_dummy_idx} must have exactly one neighbor")
-    scaffold_anchor_idx = scaffold_dummy.GetNeighbors()[0].GetIdx()
-    
-    fragment_dummy = rw_mol.GetAtomWithIdx(fragment_dummy_idx)
-    if fragment_dummy.GetDegree() != 1:
-        raise ValueError(f"Fragment dummy atom at index {fragment_dummy_idx} must have exactly one neighbor")
-    fragment_anchor_idx = fragment_dummy.GetNeighbors()[0].GetIdx()
-    
-    # 5. Add Bond
-    rw_mol.AddBond(scaffold_anchor_idx, fragment_anchor_idx, Chem.BondType.SINGLE)
-    
-    # 6. Remove Dummy Atoms
-    # Remove the higher index first to preserve the lower index
-    # We know fragment_dummy_idx > scaffold_dummy_idx because fragment atoms were appended.
-    # However, just to be safe, we sort them.
-    indices_to_remove = sorted([scaffold_dummy_idx, fragment_dummy_idx], reverse=True)
-    for idx in indices_to_remove:
-        rw_mol.RemoveAtom(idx)
-        
-    # 7. Sanitize
-    Chem.SanitizeMol(rw_mol)
-    
-    return rw_mol.GetMol()
diff --git a/src/splicing/fragment_prep.py b/src/splicing/fragment_prep.py
deleted file mode 100644
index 788851d..0000000
--- a/src/splicing/fragment_prep.py
+++ /dev/null
@@ -1,87 +0,0 @@
-import random
-from rdkit import Chem
-
-def activate_fragment(smiles: str, strategy: str = "smart") -> Chem.Mol:
-    """
-    Convert a small molecule fragment into an attachable R-group by adding a dummy atom (*).
-
-    Args:
-        smiles: SMILES string of the fragment.
-        strategy: 'smart' (prioritize heteroatoms) or 'random' (any atom with H).
-
-    Returns:
-        Chem.Mol: The activated fragment with a dummy atom attached.
-    """
-    mol = Chem.MolFromSmiles(smiles)
-    if mol is None:
-        raise ValueError(f"Invalid SMILES string: {smiles}")
-
-    target_idx = -1
-
-    if strategy == "smart":
-        # Order of preference: Amine, Alcohol/Phenol, Thiol
-        # Amine: [N;!H0] - Nitrogen with at least one H
-        # Alcohol/Phenol: [O;H1] - Oxygen with 1 H (usually 2 bonds total)
-        # Thiol: [S;H1]
-        
-        smarts_patterns = [
-            "[N;!H0]", # Primary/Secondary amine
-            "[O;H1]",  # Alcohol/Phenol
-            "[S;H1]"   # Thiol
-        ]
-        
-        for smarts in smarts_patterns:
-            pattern = Chem.MolFromSmarts(smarts)
-            if pattern:
-                matches = mol.GetSubstructMatches(pattern)
-                if matches:
-                    # Pick the first match
-                    target_idx = matches[0][0]
-                    break
-        
-        if target_idx == -1:
-            # Fallback to random if no smart match found
-            strategy = "random"
-
-    if strategy == "random":
-        # Find all atoms with at least one H
-        candidates = []
-        carbon_candidates = []
-        
-        for atom in mol.GetAtoms():
-            # GetTotalNumHs includes implicit and explicit Hs
-            if atom.GetTotalNumHs() > 0:
-                candidates.append(atom.GetIdx())
-                if atom.GetSymbol() == 'C':
-                    carbon_candidates.append(atom.GetIdx())
-        
-        if not candidates:
-            raise ValueError("No suitable atoms with hydrogens found to attach to.")
-        
-        # Prefer Carbon atoms if available
-        if carbon_candidates:
-            # Pick the first one for deterministic behavior
-            target_idx = carbon_candidates[0]
-        else:
-            target_idx = candidates[0]
-
-    if target_idx == -1:
-        # Should be caught by the candidates check, but just in case
-        raise ValueError("Could not identify a target atom for activation.")
-
-    # Perform attachment
-    rwmol = Chem.RWMol(mol)
-    
-    # Add dummy atom
-    dummy_idx = rwmol.AddAtom(Chem.Atom('*'))
-    
-    # Add bond to target atom
-    rwmol.AddBond(target_idx, dummy_idx, Chem.BondType.SINGLE)
-    
-    # Sanitize to fix implicit H counts and ensure validity
-    try:
-        Chem.SanitizeMol(rwmol)
-    except Exception as e:
-        raise ValueError(f"Failed to sanitize molecule after activation: {e}")
-
-    return rwmol.GetMol()
diff --git a/src/splicing/scaffold_prep.py b/src/splicing/scaffold_prep.py
deleted file mode 100644
index 3ce30d9..0000000
--- a/src/splicing/scaffold_prep.py
+++ /dev/null
@@ -1,123 +0,0 @@
-from rdkit import Chem
-from typing import List, Dict, Tuple, Set, Optional
-from src.ring_numbering import assign_ring_numbering
-
-def get_subtree_indices(mol: Chem.Mol, start_atom_idx: int, forbidden_idx: int) -> Set[int]:
-    """
-    Get all atom indices in the subtree starting at start_atom_idx,
-    moving away from forbidden_idx.
-    Used to identify side chain atoms to remove.
-    """
-    subtree = set()
-    stack = [start_atom_idx]
-    
-    while stack:
-        current = stack.pop()
-        if current in subtree:
-            continue
-        subtree.add(current)
-        
-        atom = mol.GetAtomWithIdx(current)
-        for neighbor in atom.GetNeighbors():
-            n_idx = neighbor.GetIdx()
-            # Traverse neighbors except the one leading back to the ring (forbidden)
-            # and those already visited
-            if n_idx != forbidden_idx and n_idx not in subtree:
-                stack.append(n_idx)
-                
-    return subtree
-
-def prepare_tylosin_scaffold(smiles: str, positions: List[int]) -> Tuple[Chem.Mol, Dict[int, int]]:
-    """
-    Prepare the Tylosin scaffold by removing side chains at specified positions
-    and marking them with dummy atoms.
-    
-    Args:
-        smiles: SMILES string of the scaffold/molecule.
-        positions: List of ring positions (1-16) to prepare (add sockets).
-        
-    Returns:
-        Tuple of (Modified Molecule, Dict mapping position -> new_dummy_atom_idx)
-        The returned molecule has dummy atoms ('*') at the specified positions,
-        marked with Isotope = position number.
-    """
-    mol = Chem.MolFromSmiles(smiles)
-    if not mol:
-        raise ValueError(f"Invalid SMILES: {smiles}")
-        
-    # 1. Ring numbering to identify target atoms
-    ring_map = assign_ring_numbering(mol) # atom_idx -> ring_pos
-    if not ring_map:
-        raise ValueError("Could not assign ring numbering. Is this a 16-membered lactone?")
-        
-    # Reverse map for easy lookup: ring_pos -> atom_idx
-    pos_to_atom = {v: k for k, v in ring_map.items()}
-    
-    atoms_to_remove = set()
-    dummies_to_add = [] # List of (ring_atom_idx, position)
-    
-    # 2. Identify edits
-    for pos in positions:
-        if pos not in pos_to_atom:
-            raise ValueError(f"Position {pos} not found in ring numbering.")
-            
-        ring_atom_idx = pos_to_atom[pos]
-        ring_atom = mol.GetAtomWithIdx(ring_atom_idx)
-        
-        # Identify non-ring neighbors (side chains)
-        # Note: neighbors in ring have indices in ring_map
-        side_chain_neighbors = []
-        for n in ring_atom.GetNeighbors():
-            if n.GetIdx() not in ring_map:
-                side_chain_neighbors.append(n.GetIdx())
-        
-        # If side chains exist, mark them for removal
-        if side_chain_neighbors:
-            for sc_idx in side_chain_neighbors:
-                subtree = get_subtree_indices(mol, sc_idx, forbidden_idx=ring_atom_idx)
-                atoms_to_remove.update(subtree)
-        
-        # Plan to add dummy at this ring atom
-        dummies_to_add.append((ring_atom_idx, pos))
-
-    # 3. Apply edits using RWMol
-    rwmol = Chem.RWMol(mol)
-    
-    # Step A: Add dummy atoms
-    # We add them before deletion to use stable ring indices.
-    # Note: Adding atoms does not change existing indices.
-    
-    for ring_idx, pos in dummies_to_add:
-        # Create dummy atom
-        dummy = Chem.Atom('*')
-        dummy.SetIsotope(pos) # Mark with position
-        
-        new_idx = rwmol.AddAtom(dummy)
-        
-        # Add bond to ring atom
-        rwmol.AddBond(ring_idx, new_idx, Chem.BondType.SINGLE)
-        
-    # Step B: Remove side chain atoms
-    # Sort descending to preserve lower indices during deletion
-    sorted_remove = sorted(list(atoms_to_remove), reverse=True)
-    for idx in sorted_remove:
-        rwmol.RemoveAtom(idx)
-        
-    # 4. Finalize
-    mol_final = rwmol.GetMol()
-    
-    try:
-        Chem.SanitizeMol(mol_final)
-    except Exception:
-        # Sometime dummies trigger sanitization errors, but usually partial sanitization works.
-        # We'll ignore strict sanitization errors for the scaffold as it has dummies.
-        pass
-        
-    # 5. Build result map (position -> atom_index in new mol)
-    # Since indices shifted, we find dummies by their isotope markers.
-    final_dummy_map = {}
-    for atom in mol_final.GetAtoms():
-        if atom.GetSymbol() == '*' and atom.GetIsotope() in positions:
-            final_dummy_map[atom.GetIsotope()] = atom.GetIdx()
-            
-    return mol_final, final_dummy_map
diff --git a/src/visualizer.py b/src/visualizer.py
deleted file mode 100644
index e1ecdb8..0000000
--- a/src/visualizer.py
+++ /dev/null
@@ -1,222 +0,0 @@
-"""
-Visualization utilities for macrolactone analysis.
-"""
-from rdkit import Chem
-from rdkit.Chem import Draw
-from typing import Dict, List
-import matplotlib.pyplot as plt
-import seaborn as sns
-import pandas as pd
-import numpy as np
-
-
-def draw_molecule_with_numbering(mol: Chem.Mol, numbering: Dict[int, int], 
-                                 highlight_atoms: List[int] = None,
-                                 figsize=(10, 10)) -> None:
-    """
-    Draw molecule with ring atom numbering.
-    
-    Args:
-        mol: RDKit molecule object
-        numbering: Ring numbering dictionary
-        highlight_atoms: Optional list of atom indices to highlight
-        figsize: Figure size tuple
-    """
-    # Create atom labels
-    atom_labels = {}
-    for atom_idx, position in numbering.items():
-        atom_labels[atom_idx] = str(position)
-    
-    # Draw with labels
-    img = Draw.MolToImage(
-        mol,
-        size=(figsize[0]*100, figsize[1]*100),
-        highlightAtoms=highlight_atoms or [],
-        atomLabels=atom_labels
-    )
-    
-    plt.figure(figsize=figsize)
-    plt.imshow(img)
-    plt.axis('off')
-    plt.tight_layout()
-    plt.show()
-
-
-def plot_substitution_frequency(position_counts: Dict[int, int], 
-                                save_path: str = None,
-                                figsize=(12, 6)) -> None:
-    """
-    Plot bar chart of substitution frequency at each ring position.
-    
-    Args:
-        position_counts: Dictionary mapping ring position to count
-        save_path: Optional path to save the figure
-        figsize: Figure size tuple
-    """
-    positions = sorted(position_counts.keys())
-    counts = [position_counts[pos] for pos in positions]
-    
-    plt.figure(figsize=figsize)
-    
-    # Create bar plot
-    bars = plt.bar(positions, counts, color='steelblue', edgecolor='black', alpha=0.7)
-    
-    # Color bars by frequency (gradient)
-    max_count = max(counts) if counts else 1
-    for bar, count in zip(bars, counts):
-        normalized = count / max_count
-        bar.set_color(plt.cm.YlOrRd(normalized))
-    
-    plt.xlabel('Ring Position', fontsize=14, fontweight='bold')
-    plt.ylabel('Number of Substitutions', fontsize=14, fontweight='bold')
-    plt.title('Side Chain Substitution Frequency by Ring Position', 
-             fontsize=16, fontweight='bold')
-    plt.xticks(positions, fontsize=12)
-    plt.yticks(fontsize=12)
-    plt.grid(axis='y', alpha=0.3, linestyle='--')
-    
-    # Add value labels on bars
-    for i, (pos, count) in enumerate(zip(positions, counts)):
-        plt.text(pos, count + max_count*0.01, str(count), 
-                ha='center', va='bottom', fontsize=10, fontweight='bold')
-    
-    plt.tight_layout()
-    
-    if save_path:
-        plt.savefig(save_path, dpi=300, bbox_inches='tight')
-        print(f"Figure saved to {save_path}")
-    
-    plt.show()
-
-
-def plot_substitution_heatmap(position_fragment_data: pd.DataFrame,
-                              save_path: str = None,
-                              figsize=(14, 8)) -> None:
-    """
-    Plot heatmap showing fragment diversity at each position.
-    
-    Args:
-        position_fragment_data: DataFrame with position and fragment information
-        save_path: Optional path to save the figure
-        figsize: Figure size tuple
-    """
-    # Create pivot table
-    pivot = position_fragment_data.pivot_table(
-        index='fragment_smiles',
-        columns='cleavage_position',
-        values='count',
-        fill_value=0
-    )
-    
-    plt.figure(figsize=figsize)
-    
-    sns.heatmap(pivot, cmap='YlOrRd', annot=False, 
-                cbar_kws={'label': 'Frequency'},
-                linewidths=0.5, linecolor='gray')
-    
-    plt.xlabel('Ring Position', fontsize=14, fontweight='bold')
-    plt.ylabel('Fragment Type', fontsize=14, fontweight='bold')
-    plt.title('Fragment Diversity Across Ring Positions', 
-             fontsize=16, fontweight='bold')
-    plt.tight_layout()
-    
-    if save_path:
-        plt.savefig(save_path, dpi=300, bbox_inches='tight')
-        print(f"Figure saved to {save_path}")
-    
-    plt.show()
-
-
-def plot_circular_substitution_pattern(position_counts: Dict[int, int],
-                                       save_path: str = None,
-                                       figsize=(10, 10)) -> None:
-    """
-    Plot circular/radar plot of substitution patterns around the ring.
-    
-    Args:
-        position_counts: Dictionary mapping ring position to count
-        save_path: Optional path to save the figure
-        figsize: Figure size tuple
-    """
-    positions = list(range(1, 17))  # All 16 positions
-    counts = [position_counts.get(pos, 0) for pos in positions]
-    
-    # Append first value to close the circle
-    counts_circle = counts + [counts[0]]
-    
-    # Calculate angles
-    angles = np.linspace(0, 2 * np.pi, 16, endpoint=False).tolist()
-    angles_circle = angles + [angles[0]]
-    
-    fig, ax = plt.subplots(figsize=figsize, subplot_kw=dict(projection='polar'))
-    
-    # Plot
-    ax.plot(angles_circle, counts_circle, 'o-', linewidth=2, color='steelblue', 
-            markersize=8, markerfacecolor='orange', markeredgecolor='black', markeredgewidth=1.5)
-    ax.fill(angles_circle, counts_circle, alpha=0.25, color='steelblue')
-    
-    # Set labels
-    ax.set_xticks(angles)
-    ax.set_xticklabels([str(i) for i in positions], fontsize=12)
-    ax.set_ylim(0, max(counts) * 1.1 if counts else 1)
-    
-    ax.set_title('Circular Substitution Pattern\n(16-Membered Ring)', 
-                fontsize=16, fontweight='bold', pad=20)
-    ax.grid(True, linestyle='--', alpha=0.6)
-    
-    plt.tight_layout()
-    
-    if save_path:
-        plt.savefig(save_path, dpi=300, bbox_inches='tight')
-        print(f"Figure saved to {save_path}")
-    
-    plt.show()
-
-
-def plot_fragment_statistics(fragments_df: pd.DataFrame,
-                             save_path: str = None,
-                             figsize=(15, 5)) -> None:
-    """
-    Plot statistics about fragment properties.
-    
-    Args:
-        fragments_df: DataFrame with fragment information
-        save_path: Optional path to save the figure
-        figsize: Figure size tuple
-    """
-    fig, axes = plt.subplots(1, 3, figsize=figsize)
-    
-    # Plot 1: Fragment size distribution
-    axes[0].hist(fragments_df['atom_count'], bins=20, color='skyblue', 
-                edgecolor='black', alpha=0.7)
-    axes[0].set_xlabel('Number of Atoms', fontsize=12, fontweight='bold')
-    axes[0].set_ylabel('Frequency', fontsize=12, fontweight='bold')
-    axes[0].set_title('Fragment Size Distribution', fontsize=14, fontweight='bold')
-    axes[0].grid(axis='y', alpha=0.3)
-    
-    # Plot 2: Molecular weight distribution
-    axes[1].hist(fragments_df['molecular_weight'], bins=20, color='lightcoral', 
-                edgecolor='black', alpha=0.7)
-    axes[1].set_xlabel('Molecular Weight (Da)', fontsize=12, fontweight='bold')
-    axes[1].set_ylabel('Frequency', fontsize=12, fontweight='bold')
-    axes[1].set_title('Fragment MW Distribution', fontsize=14, fontweight='bold')
-    axes[1].grid(axis='y', alpha=0.3)
-    
-    # Plot 3: Fragments per molecule
-    frags_per_mol = fragments_df.groupby('parent_id').size()
-    axes[2].hist(frags_per_mol, bins=range(int(frags_per_mol.min()), 
-                                           int(frags_per_mol.max())+2), 
-                color='lightgreen', edgecolor='black', alpha=0.7)
-    axes[2].set_xlabel('Fragments per Molecule', fontsize=12, fontweight='bold')
-    axes[2].set_ylabel('Number of Molecules', fontsize=12, fontweight='bold')
-    axes[2].set_title('Fragments per Macrolactone', fontsize=14, fontweight='bold')
-    axes[2].grid(axis='y', alpha=0.3)
-    
-    plt.tight_layout()
-    
-    if save_path:
-        plt.savefig(save_path, dpi=300, bbox_inches='tight')
-        print(f"Figure saved to {save_path}")
-    
-    plt.show()
-
diff --git a/tests/__init__.py b/tests/__init__.py
new file mode 100644
index 0000000..3e0ea00
--- /dev/null
+++ b/tests/__init__.py
@@ -0,0 +1 @@
+# Tests package marker for helper imports.
diff --git a/tests/helpers.py b/tests/helpers.py
new file mode 100644
index 0000000..4efbc75
--- /dev/null
+++ b/tests/helpers.py
@@ -0,0 +1,96 @@
+from __future__ import annotations
+
+from dataclasses import dataclass
+from typing import Mapping
+
+from rdkit import Chem
+
+
+@dataclass(frozen=True)
+class BuiltMacrolactone:
+    mol: Chem.Mol
+    smiles: str
+    position_to_atom: dict[int, int]
+
+
+def build_macrolactone(
+    ring_size: int,
+    side_chains: Mapping[int, str] | None = None,
+) -> BuiltMacrolactone:
+    if not 12 <= ring_size <= 20:
+        raise ValueError("ring_size must be between 12 and 20")
+
+    side_chains = dict(side_chains or {})
+    rwmol = Chem.RWMol()
+
+    position_to_atom: dict[int, int] = {
+        1: rwmol.AddAtom(Chem.Atom("C")),
+        2: rwmol.AddAtom(Chem.Atom("O")),
+    }
+    for position in range(3, ring_size + 1):
+        position_to_atom[position] = rwmol.AddAtom(Chem.Atom("C"))
+
+    carbonyl_oxygen_idx = rwmol.AddAtom(Chem.Atom("O"))
+
+    rwmol.AddBond(position_to_atom[1], position_to_atom[2], Chem.BondType.SINGLE)
+    for position in range(2, ring_size):
+        rwmol.AddBond(
+            position_to_atom[position],
+            position_to_atom[position + 1],
+            Chem.BondType.SINGLE,
+        )
+    rwmol.AddBond(position_to_atom[ring_size], position_to_atom[1], Chem.BondType.SINGLE)
+    rwmol.AddBond(position_to_atom[1], carbonyl_oxygen_idx, Chem.BondType.DOUBLE)
+
+    for position, side_chain in side_chains.items():
+        if position not in position_to_atom:
+            raise ValueError(f"Invalid ring position: {position}")
+        _add_side_chain(rwmol, position_to_atom[position], side_chain)
+
+    mol = rwmol.GetMol()
+    Chem.SanitizeMol(mol)
+    return BuiltMacrolactone(
+        mol=mol,
+        smiles=Chem.MolToSmiles(mol, isomericSmiles=True),
+        position_to_atom=position_to_atom,
+    )
+
+
+def build_ambiguous_smiles() -> str:
+    mol_12 = build_macrolactone(12).mol
+    mol_14 = build_macrolactone(14).mol
+    combined = Chem.CombineMols(mol_12, mol_14)
+    return Chem.MolToSmiles(combined, isomericSmiles=True)
+
+
+def canonicalize(smiles_or_mol: str | Chem.Mol) -> str:
+    if isinstance(smiles_or_mol, Chem.Mol):
+        mol = smiles_or_mol
+    else:
+        mol = Chem.MolFromSmiles(smiles_or_mol)
+        if mol is None:
+            raise ValueError(f"Unable to parse SMILES: {smiles_or_mol}")
+    return Chem.MolToSmiles(mol, isomericSmiles=True)
+
+
+def _add_side_chain(rwmol: Chem.RWMol, ring_atom_idx: int, side_chain: str) -> None:
+    if side_chain == "methyl":
+        carbon_idx = rwmol.AddAtom(Chem.Atom("C"))
+        rwmol.AddBond(ring_atom_idx, carbon_idx, Chem.BondType.SINGLE)
+        return
+
+    if side_chain == "ethyl":
+        carbon_1_idx = rwmol.AddAtom(Chem.Atom("C"))
+        carbon_2_idx = rwmol.AddAtom(Chem.Atom("C"))
+        rwmol.AddBond(ring_atom_idx, carbon_1_idx, Chem.BondType.SINGLE)
+        rwmol.AddBond(carbon_1_idx, carbon_2_idx, Chem.BondType.SINGLE)
+        return
+
+    if side_chain == "exocyclic_alkene":
+        carbon_1_idx = rwmol.AddAtom(Chem.Atom("C"))
+        carbon_2_idx = rwmol.AddAtom(Chem.Atom("C"))
+        rwmol.AddBond(ring_atom_idx, carbon_1_idx, Chem.BondType.DOUBLE)
+        rwmol.AddBond(carbon_1_idx, carbon_2_idx, Chem.BondType.SINGLE)
+        return
+
+    raise ValueError(f"Unsupported side chain: {side_chain}")
diff --git a/tests/test_cli.py b/tests/test_cli.py
new file mode 100644
index 0000000..9b047db
--- /dev/null
+++ b/tests/test_cli.py
@@ -0,0 +1,74 @@
+from __future__ import annotations
+
+import json
+import subprocess
+import sys
+
+import pandas as pd
+
+from .helpers import build_ambiguous_smiles, build_macrolactone
+
+
+def run_cli(*args: str) -> subprocess.CompletedProcess[str]:
+    return subprocess.run(
+        [sys.executable, "-m", "macro_lactone_toolkit.cli", *args],
+        capture_output=True,
+        text=True,
+        check=False,
+    )
+
+
+def test_cli_smoke_commands():
+    built = build_macrolactone(16, {5: "methyl"})
+
+    analyze = run_cli("analyze", "--smiles", built.smiles)
+    assert analyze.returncode == 0, analyze.stderr
+    analyze_payload = json.loads(analyze.stdout)
+    assert analyze_payload["valid_ring_sizes"] == [16]
+
+    number = run_cli("number", "--smiles", built.smiles)
+    assert number.returncode == 0, number.stderr
+    number_payload = json.loads(number.stdout)
+    assert number_payload["ring_size"] == 16
+    assert number_payload["position_to_atom"]["1"] >= 0
+
+    fragment = run_cli("fragment", "--smiles", built.smiles, "--parent-id", "cli_1")
+    assert fragment.returncode == 0, fragment.stderr
+    fragment_payload = json.loads(fragment.stdout)
+    assert fragment_payload["parent_id"] == "cli_1"
+    assert fragment_payload["ring_size"] == 16
+    assert fragment_payload["fragments"][0]["fragment_smiles_labeled"]
+
+
+def test_cli_fragment_csv_skips_ambiguous_and_records_errors(tmp_path):
+    valid = build_macrolactone(14, {4: "methyl"})
+    ambiguous = build_ambiguous_smiles()
+    input_path = tmp_path / "molecules.csv"
+    output_path = tmp_path / "fragments.csv"
+    errors_path = tmp_path / "errors.csv"
+
+    pd.DataFrame(
+        [
+            {"id": "valid_1", "smiles": valid.smiles},
+            {"id": "ambiguous_1", "smiles": ambiguous},
+        ]
+    ).to_csv(input_path, index=False)
+
+    completed = run_cli(
+        "fragment",
+        "--input",
+        str(input_path),
+        "--output",
+        str(output_path),
+        "--errors-output",
+        str(errors_path),
+    )
+
+    assert completed.returncode == 0, completed.stderr
+
+    fragments = pd.read_csv(output_path)
+    errors = pd.read_csv(errors_path)
+
+    assert set(fragments["parent_id"]) == {"valid_1"}
+    assert errors.loc[0, "parent_id"] == "ambiguous_1"
+    assert errors.loc[0, "error_type"] == "AmbiguousMacrolactoneError"
diff --git a/tests/test_detection_and_numbering.py b/tests/test_detection_and_numbering.py
new file mode 100644
index 0000000..1cbc4a8
--- /dev/null
+++ b/tests/test_detection_and_numbering.py
@@ -0,0 +1,73 @@
+import pytest
+from rdkit import Chem
+
+from macro_lactone_toolkit import (
+    AmbiguousMacrolactoneError,
+    MacroLactoneAnalyzer,
+    MacrolactoneDetectionError,
+    MacrolactoneFragmenter,
+)
+
+from .helpers import build_ambiguous_smiles, build_macrolactone
+
+
+@pytest.mark.parametrize("ring_size", [12, 14, 16, 20])
+def test_analyzer_detects_supported_ring_sizes(ring_size: int):
+    built = build_macrolactone(ring_size)
+    analyzer = MacroLactoneAnalyzer()
+
+    assert analyzer.get_valid_ring_sizes(built.smiles) == [ring_size]
+
+
+def test_analyzer_rejects_non_lactone_macrocycle():
+    analyzer = MacroLactoneAnalyzer()
+
+    assert analyzer.get_valid_ring_sizes("C1CCCCCCCCCCC1") == []
+
+
+def test_fragmenter_auto_numbers_ring_with_expected_positions():
+    built = build_macrolactone(16, {5: "methyl"})
+    result = MacrolactoneFragmenter().number_molecule(built.mol)
+
+    assert result.ring_size == 16
+    assert result.position_to_atom == built.position_to_atom
+    assert set(result.position_to_atom) == set(range(1, 17))
+    assert result.atom_to_position == {
+        atom_idx: position for position, atom_idx in built.position_to_atom.items()
+    }
+
+    carbonyl_atom = built.mol.GetAtomWithIdx(result.position_to_atom[1])
+    assert carbonyl_atom.GetSymbol() == "C"
+    assert any(
+        bond.GetBondType() == Chem.BondType.DOUBLE and bond.GetOtherAtom(carbonyl_atom).GetSymbol() == "O"
+        for bond in carbonyl_atom.GetBonds()
+    )
+
+    ester_oxygen = built.mol.GetAtomWithIdx(result.position_to_atom[2])
+    assert ester_oxygen.GetSymbol() == "O"
+
+
+def test_fragmenter_requires_explicit_ring_size_for_ambiguous_molecule():
+    ambiguous_smiles = build_ambiguous_smiles()
+
+    with pytest.raises(AmbiguousMacrolactoneError):
+        MacrolactoneFragmenter().number_molecule(ambiguous_smiles)
+
+
+def test_fragmenter_raises_for_missing_macrolactone():
+    with pytest.raises(MacrolactoneDetectionError):
+        MacrolactoneFragmenter().number_molecule("CCO")
+
+
+def test_explicit_ring_size_selects_requested_ring():
+    built = build_macrolactone(14)
+    result = MacrolactoneFragmenter(ring_size=14).number_molecule(built.smiles)
+
+    assert result.ring_size == 14
+
+
+def test_explicit_ring_size_rejects_wrong_ring():
+    built = build_macrolactone(12)
+
+    with pytest.raises(MacrolactoneDetectionError):
+        MacrolactoneFragmenter(ring_size=16).number_molecule(built.smiles)
diff --git a/tests/test_env_integration.py b/tests/test_env_integration.py
deleted file mode 100644
index 8ed3779..0000000
--- a/tests/test_env_integration.py
+++ /dev/null
@@ -1,39 +0,0 @@
-import sys
-import os
-from pathlib import Path
-
-# Add SIME to path
-SIME_PATH = "/home/zly/project/SIME"
-if SIME_PATH not in sys.path:
-    sys.path.append(SIME_PATH)
-
-# Add project root to path so we can import 'src'
-PROJECT_ROOT = str(Path(__file__).parent.parent)
-if PROJECT_ROOT not in sys.path:
-    sys.path.append(PROJECT_ROOT)
-
-def test_imports():
-    """Verify that we can import from both local project and SIME."""
-    print(f"sys.path: {sys.path}")
-    
-    # 1. Test local import from src
-    try:
-        # Correct function name based on file inspection
-        from src.ring_numbering import assign_ring_numbering
-        assert callable(assign_ring_numbering)
-        print("Successfully imported src.ring_numbering.assign_ring_numbering")
-    except ImportError as e:
-        print(f"Failed to import src.ring_numbering: {e}")
-        raise
-
-    # 2. Test SIME import
-    try:
-        from utils.mole_predictor import ParallelBroadSpectrumPredictor
-        assert ParallelBroadSpectrumPredictor is not None
-        print("Successfully imported ParallelBroadSpectrumPredictor from utils.mole_predictor")
-    except ImportError as e:
-        print(f"Failed to import from SIME: {e}")
-        raise
-
-if __name__ == "__main__":
-    test_imports()
diff --git a/tests/test_fragment_prep.py b/tests/test_fragment_prep.py
index 5245f9c..592f7f7 100644
--- a/tests/test_fragment_prep.py
+++ b/tests/test_fragment_prep.py
@@ -1,95 +1,42 @@
 import pytest
 from rdkit import Chem
-from src.splicing.fragment_prep import activate_fragment
+
+from macro_lactone_toolkit.splicing.fragment_prep import activate_fragment
+
 
 def test_activate_smart_ethanol():
-    """Test 'smart' activation on Ethanol (CCO). Should attach to Oxygen."""
-    smiles = "CCO"
-    mol = activate_fragment(smiles, strategy="smart")
-    
-    # Check if we have a dummy atom
+    mol = activate_fragment("CCO", strategy="smart")
+
     assert mol is not None
-    assert mol.GetNumAtoms() == 4  # C, C, O, *
-    
-    # Check if the dummy atom is attached to Oxygen
-    # Find the dummy atom
-    dummy_atom = None
-    for atom in mol.GetAtoms():
-        if atom.GetSymbol() == '*':
-            dummy_atom = atom
-            break
-    
-    assert dummy_atom is not None
-    
-    # Check neighbors of dummy atom
-    neighbors = dummy_atom.GetNeighbors()
-    assert len(neighbors) == 1
-    assert neighbors[0].GetSymbol() == 'O'
-    
-    # Check output SMILES format
-    out_smiles = Chem.MolToSmiles(mol)
-    assert '*' in out_smiles
+    assert mol.GetNumAtoms() == 4
+    dummy_atom = next(atom for atom in mol.GetAtoms() if atom.GetAtomicNum() == 0)
+    assert dummy_atom.GetNeighbors()[0].GetSymbol() == "O"
+    assert "*" in Chem.MolToSmiles(mol)
+
 
 def test_activate_smart_amine():
-    """Test 'smart' activation on Ethylamine (CCN). Should attach to Nitrogen."""
-    smiles = "CCN"
-    mol = activate_fragment(smiles, strategy="smart")
-    
-    assert mol is not None
-    
-    # Find the dummy atom
-    dummy_atom = None
-    for atom in mol.GetAtoms():
-        if atom.GetSymbol() == '*':
-            dummy_atom = atom
-            break
-            
-    assert dummy_atom is not None
-    neighbors = dummy_atom.GetNeighbors()
-    assert neighbors[0].GetSymbol() == 'N'
+    mol = activate_fragment("CCN", strategy="smart")
+
+    dummy_atom = next(atom for atom in mol.GetAtoms() if atom.GetAtomicNum() == 0)
+    assert dummy_atom.GetNeighbors()[0].GetSymbol() == "N"
+
 
 def test_activate_random_pentane():
-    """Test 'random' activation on Pentane (CCCCC). Should attach to a Carbon."""
-    smiles = "CCCCC"
-    # Seed is not easily passed to the function unless we add it to the signature or fix it inside.
-    # For this test, any Carbon is fine.
-    mol = activate_fragment(smiles, strategy="random")
-    
-    assert mol is not None
-    assert mol.GetNumAtoms() == 6 # 5 C + 1 *
-    
-    dummy_atom = None
-    for atom in mol.GetAtoms():
-        if atom.GetSymbol() == '*':
-            dummy_atom = atom
-            break
-            
-    assert dummy_atom is not None
-    neighbors = dummy_atom.GetNeighbors()
-    assert neighbors[0].GetSymbol() == 'C'
+    mol = activate_fragment("CCCCC", strategy="random")
+
+    assert mol.GetNumAtoms() == 6
+    dummy_atom = next(atom for atom in mol.GetAtoms() if atom.GetAtomicNum() == 0)
+    assert dummy_atom.GetNeighbors()[0].GetSymbol() == "C"
+
 
 def test_activate_smart_fallback():
-    """Test 'smart' fallback when no heteroatoms are found (e.g. Propane)."""
-    smiles = "CCC"
-    # Should fall back to finding a terminal carbon or random
-    # The requirement says "fall back to a terminal Carbon" or random. 
-    # Let's assume the implementation picks a terminal carbon if possible, or just behaves like random on C.
-    mol = activate_fragment(smiles, strategy="smart")
-    
-    assert mol is not None
-    dummy_atom = None
-    for atom in mol.GetAtoms():
-        if atom.GetSymbol() == '*':
-            dummy_atom = atom
-            break
-    
-    assert dummy_atom is not None
-    neighbor = dummy_atom.GetNeighbors()[0]
-    assert neighbor.GetSymbol() == 'C'
-    # Verify it's a valid molecule
+    mol = activate_fragment("CCC", strategy="smart")
+
+    dummy_atom = next(atom for atom in mol.GetAtoms() if atom.GetAtomicNum() == 0)
+    assert dummy_atom.GetNeighbors()[0].GetSymbol() == "C"
     assert Chem.SanitizeMol(mol) == Chem.SanitizeFlags.SANITIZE_NONE
 
+
 def test_invalid_smiles():
     with pytest.raises(ValueError):
         activate_fragment("NotASmiles", strategy="smart")
-
diff --git a/tests/test_fragmentation.py b/tests/test_fragmentation.py
new file mode 100644
index 0000000..02a1a58
--- /dev/null
+++ b/tests/test_fragmentation.py
@@ -0,0 +1,53 @@
+from rdkit import Chem
+
+from macro_lactone_toolkit import MacrolactoneFragmenter
+
+from .helpers import build_macrolactone
+
+
+def test_fragmentation_returns_empty_list_without_sidechains():
+    built = build_macrolactone(12)
+    result = MacrolactoneFragmenter().fragment_molecule(built.smiles, parent_id="plain")
+
+    assert result.fragments == []
+
+
+def test_fragmentation_emits_labeled_and_plain_smiles_round_trip():
+    built = build_macrolactone(16, {5: "ethyl", 8: "methyl"})
+    result = MacrolactoneFragmenter().fragment_molecule(built.smiles, parent_id="mol_001")
+
+    assert result.parent_id == "mol_001"
+    assert result.ring_size == 16
+    assert {fragment.cleavage_position for fragment in result.fragments} == {5, 8}
+
+    for fragment in result.fragments:
+        labeled = Chem.MolFromSmiles(fragment.fragment_smiles_labeled)
+        plain = Chem.MolFromSmiles(fragment.fragment_smiles_plain)
+
+        assert labeled is not None
+        assert plain is not None
+        assert Chem.MolToSmiles(labeled, isomericSmiles=True)
+        assert Chem.MolToSmiles(plain, isomericSmiles=True)
+        assert any(
+            atom.GetAtomicNum() == 0 and atom.GetIsotope() == fragment.cleavage_position
+            for atom in labeled.GetAtoms()
+        )
+        assert any(
+            atom.GetAtomicNum() == 0 and atom.GetIsotope() == 0
+            for atom in plain.GetAtoms()
+        )
+
+
+def test_fragmentation_preserves_attachment_bond_type():
+    built = build_macrolactone(16, {6: "exocyclic_alkene"})
+    result = MacrolactoneFragmenter().fragment_molecule(built.smiles, parent_id="bond_type")
+
+    fragment = next(fragment for fragment in result.fragments if fragment.cleavage_position == 6)
+    labeled = Chem.MolFromSmiles(fragment.fragment_smiles_labeled)
+    plain = Chem.MolFromSmiles(fragment.fragment_smiles_plain)
+
+    for mol in (labeled, plain):
+        dummy_atom = next(atom for atom in mol.GetAtoms() if atom.GetAtomicNum() == 0)
+        neighbor = dummy_atom.GetNeighbors()[0]
+        bond = mol.GetBondBetweenAtoms(dummy_atom.GetIdx(), neighbor.GetIdx())
+        assert bond.GetBondType() == Chem.BondType.DOUBLE
diff --git a/tests/test_imports.py b/tests/test_imports.py
new file mode 100644
index 0000000..9aed8e4
--- /dev/null
+++ b/tests/test_imports.py
@@ -0,0 +1,5 @@
+import macro_lactone_toolkit
+
+
+def test_public_imports_smoke():
+    assert macro_lactone_toolkit is not None
diff --git a/tests/test_ring_numbering.py b/tests/test_ring_numbering.py
deleted file mode 100644
index 170f0d5..0000000
--- a/tests/test_ring_numbering.py
+++ /dev/null
@@ -1,223 +0,0 @@
-"""
-测试环编号功能 - 验证原子编号是否固定
-"""
-import sys
-sys.path.insert(0, '/home/zly/project/macro_split')
-
-from rdkit import Chem
-from rdkit.Chem import Draw, AllChem
-from rdkit.Chem.Draw import rdMolDraw2D
-from src.ring_visualization import (
-    get_macrolactone_numbering,
-    get_ring_atoms_by_size
-)
-
-
-def test_ring_numbering_consistency(smiles: str, ring_size: int = 16, num_tests: int = 5):
-    """
-    测试环编号的一致性 - 多次运行确保编号固定
-    """
-    print("=" * 70)
-    print("测试环编号一致性")
-    print("=" * 70)
-    print(f"\nSMILES: {smiles[:80]}...")
-    print(f"环大小: {ring_size}")
-    print(f"测试次数: {num_tests}")
-
-    # 解析分子
-    mol = Chem.MolFromSmiles(smiles)
-    if mol is None:
-        print("❌ 无法解析SMILES")
-        return False
-
-    print(f"✓ 分子解析成功，共 {mol.GetNumAtoms()} 个原子")
-
-    # 检测环大小
-    ring_atoms = get_ring_atoms_by_size(mol, ring_size)
-    if ring_atoms is None:
-        for size in range(12, 21):
-            ring_atoms = get_ring_atoms_by_size(mol, size)
-            if ring_atoms:
-                ring_size = size
-                print(f"⚠️  使用检测到的{size}元环")
-                break
-
-    if ring_atoms is None:
-        print("❌ 未找到12-20元环")
-        return False
-
-    print(f"✓ 找到{ring_size}元环，包含 {len(ring_atoms)} 个原子")
-
-    # 多次测试编号一致性
-    all_numberings = []
-    all_carbonyl_carbons = []
-    all_ester_oxygens = []
-
-    for i in range(num_tests):
-        result = get_macrolactone_numbering(mol, ring_size)
-        ring_atoms_result, ring_numbering, ordered_atoms, carbonyl_carbon, ester_oxygen, (is_valid, reason) = result
-
-        if not is_valid:
-            print(f"❌ 第{i+1}次测试失败: {reason}")
-            return False
-
-        all_numberings.append(ring_numbering.copy())
-        all_carbonyl_carbons.append(carbonyl_carbon)
-        all_ester_oxygens.append(ester_oxygen)
-
-    # 验证一致性
-    print("\n" + "-" * 50)
-    print("编号一致性检查:")
-    print("-" * 50)
-
-    is_consistent = True
-
-    if len(set(all_carbonyl_carbons)) == 1:
-        print(f"✓ 羰基碳位置一致: 原子索引 {all_carbonyl_carbons[0]}")
-    else:
-        print(f"❌ 羰基碳位置不一致: {all_carbonyl_carbons}")
-        is_consistent = False
-
-    if len(set(all_ester_oxygens)) == 1:
-        print(f"✓ 酯氧位置一致: 原子索引 {all_ester_oxygens[0]}")
-    else:
-        print(f"❌ 酯氧位置不一致: {all_ester_oxygens}")
-        is_consistent = False
-
-    first_numbering = all_numberings[0]
-    for i, numbering in enumerate(all_numberings[1:], 2):
-        if numbering != first_numbering:
-            print(f"❌ 第{i}次编号与第1次不一致")
-            is_consistent = False
-            break
-
-    if is_consistent:
-        print(f"✓ 所有{num_tests}次测试的编号完全一致")
-
-    # 显示详细编号信息
-    print("\n" + "-" * 50)
-    print("环原子编号详情:")
-    print("-" * 50)
-
-    numbering = all_numberings[0]
-    carbonyl_carbon = all_carbonyl_carbons[0]
-    ester_oxygen = all_ester_oxygens[0]
-
-    sorted_items = sorted(numbering.items(), key=lambda x: x[1])
-
-    print(f"{'位置':<6} {'原子索引':<10} {'元素':<6} {'说明'}")
-    print("-" * 40)
-
-    for atom_idx, position in sorted_items:
-        atom = mol.GetAtomWithIdx(atom_idx)
-        symbol = atom.GetSymbol()
-        note = ""
-        if atom_idx == carbonyl_carbon:
-            note = "← 羰基碳 (C=O)"
-        elif atom_idx == ester_oxygen:
-            note = "← 酯键氧"
-        print(f"{position:<6} {atom_idx:<10} {symbol:<6} {note}")
-
-    return is_consistent
-
-
-def save_visualization(smiles: str, output_path: str, ring_size: int = 16):
-    """保存分子可视化图片"""
-    print("\n" + "=" * 70)
-    print("保存可视化图片")
-    print("=" * 70)
-
-    mol = Chem.MolFromSmiles(smiles)
-    if mol is None:
-        print("❌ 无法解析SMILES")
-        return
-
-    for size in range(12, 21):
-        ring_atoms = get_ring_atoms_by_size(mol, size)
-        if ring_atoms:
-            ring_size = size
-            break
-
-    result = get_macrolactone_numbering(mol, ring_size)
-    ring_atoms, ring_numbering, ordered_atoms, carbonyl_carbon, ester_oxygen, (is_valid, reason) = result
-
-    if not is_valid:
-        print(f"❌ 无法获取编号: {reason}")
-        return
-
-    mol_copy = Chem.Mol(mol)
-    AllChem.Compute2DCoords(mol_copy)
-
-    for atom_idx in ring_atoms:
-        if atom_idx in ring_numbering:
-            atom = mol_copy.GetAtomWithIdx(atom_idx)
-            atom.SetProp("atomNote", str(ring_numbering[atom_idx]))
-
-    atom_colors = {}
-    for atom_idx in ring_atoms:
-        atom = mol.GetAtomWithIdx(atom_idx)
-        symbol = atom.GetSymbol()
-
-        if atom_idx == carbonyl_carbon:
-            atom_colors[atom_idx] = (1.0, 0.6, 0.0)
-        elif atom_idx == ester_oxygen:
-            atom_colors[atom_idx] = (1.0, 0.4, 0.4)
-        elif symbol == 'C':
-            atom_colors[atom_idx] = (0.7, 0.85, 1.0)
-        elif symbol == 'O':
-            atom_colors[atom_idx] = (1.0, 0.7, 0.7)
-        elif symbol == 'N':
-            atom_colors[atom_idx] = (0.8, 0.7, 1.0)
-        else:
-            atom_colors[atom_idx] = (0.8, 1.0, 0.8)
-
-    drawer = rdMolDraw2D.MolDraw2DSVG(1000, 1000)
-    drawer.SetFontSize(14)
-    drawer.DrawMolecule(mol_copy, highlightAtoms=list(ring_atoms), highlightAtomColors=atom_colors)
-    drawer.FinishDrawing()
-    svg = drawer.GetDrawingText()
-
-    svg_path = output_path.replace('.png', '.svg')
-    with open(svg_path, 'w', encoding='utf-8') as f:
-        f.write(svg)
-    print(f"✓ SVG已保存到: {svg_path}")
-
-    try:
-        drawer_png = rdMolDraw2D.MolDraw2DCairo(1000, 1000)
-        drawer_png.SetFontSize(14)
-        drawer_png.DrawMolecule(mol_copy, highlightAtoms=list(ring_atoms), highlightAtomColors=atom_colors)
-        drawer_png.FinishDrawing()
-        drawer_png.WriteDrawingText(output_path)
-        print(f"✓ PNG已保存到: {output_path}")
-    except Exception as e:
-        print(f"⚠️  PNG保存失败: {e}")
-
-    print("\n颜色说明:")
-    print("  橙色: 羰基碳 (位置1)")
-    print("  红色: 酯键氧 (位置2)")
-    print("  浅蓝色: 环上碳原子")
-
-
-def main():
-    smiles = "O[C@H]1[C@H]([C@H]([C@H](OC[C@@H]2[C@@H](CC)OC(C[C@H]([C@H](C)[C@H]([C@@H](CC=O)C[C@@H](C)C(/C=C/C(/C)=C/2)=O)O[C@H]2[C@@H]([C@H]([C@@H]([C@@H](C)O2)O[C@H]2C[C@](C)([C@@H]([C@@H](C)O2)O)O)[N@](C)C)O)O)=O)O[C@@H]1C)OC)OC"
-
-    print("\n大环内酯环编号测试\n")
-    is_consistent = test_ring_numbering_consistency(smiles, ring_size=16, num_tests=5)
-
-    output_path = "/home/zly/project/macro_split/output/test_ring_numbering.png"
-    save_visualization(smiles, output_path, ring_size=16)
-
-    print("\n" + "=" * 70)
-    print("测试总结")
-    print("=" * 70)
-    if is_consistent:
-        print("✅ 所有测试通过！环原子编号是固定的。")
-    else:
-        print("❌ 测试失败：环原子编号不一致")
-
-    return is_consistent
-
-
-if __name__ == "__main__":
-    success = main()
-    sys.exit(0 if success else 1)
diff --git a/tests/test_scaffold_prep.py b/tests/test_scaffold_prep.py
deleted file mode 100644
index d1d5a8d..0000000
--- a/tests/test_scaffold_prep.py
+++ /dev/null
@@ -1,84 +0,0 @@
-import pytest
-from rdkit import Chem
-from src.splicing.scaffold_prep import prepare_tylosin_scaffold
-from src.ring_numbering import assign_ring_numbering
-
-def test_prepare_tylosin_scaffold():
-    # Construct a 16-membered lactone with side chains
-    # Numbering logic (assumed based on implementation):
-    # 1: C=O
-    # 2-6: CH2
-    # 7: CH(CH3)  <- Methyl side chain
-    # 8-14: CH2
-    # 15: CH(CC)  <- Ethyl side chain
-    # 16: O
-    
-    # SMILES:
-    # O=C1 (pos 1)
-    # CCCCC (pos 2-6)
-    # C(C) (pos 7, with Methyl)
-    # CCCCCCC (pos 8-14)
-    # C(CC) (pos 15, with Ethyl)
-    # O1 (pos 16)
-    
-    smiles = "O=C1CCCCC(C)CCCCCCCCC(CC)O1"
-    
-    # Verify initial assumption about numbering
-    mol = Chem.MolFromSmiles(smiles)
-    numbering = assign_ring_numbering(mol)
-    
-    # Find atom indices for pos 7 and 15 to ensure our SMILES construction is correct for the test
-    pos_map = {v: k for k, v in numbering.items()}
-    assert 7 in pos_map, "Position 7 not found in ring"
-    assert 15 in pos_map, "Position 15 not found in ring"
-    assert 5 in pos_map, "Position 5 not found in ring"
-    
-    atom7 = mol.GetAtomWithIdx(pos_map[7])
-    atom15 = mol.GetAtomWithIdx(pos_map[15])
-    atom5 = mol.GetAtomWithIdx(pos_map[5])
-    
-    # Check side chains exist
-    # Atom 7 should have 3 neighbors (2 ring, 1 methyl)
-    assert len(atom7.GetNeighbors()) == 3
-    # Atom 15 should have 3 neighbors (2 ring, 1 ethyl)
-    assert len(atom15.GetNeighbors()) == 3
-    # Atom 5 should have 2 neighbors (2 ring, 2 implicit H)
-    assert len(atom5.GetNeighbors()) == 2
-    
-    # Execute scaffold prep
-    target_positions = [5, 7, 15]
-    res_mol, dummy_map = prepare_tylosin_scaffold(smiles, target_positions)
-    
-    assert res_mol is not None
-    assert len(dummy_map) == 3
-    
-    # Verify dummies
-    for pos in target_positions:
-        assert pos in dummy_map
-        dummy_idx = dummy_map[pos]
-        dummy_atom = res_mol.GetAtomWithIdx(dummy_idx)
-        assert dummy_atom.GetSymbol() == "*"
-        assert dummy_atom.GetIsotope() == pos
-        
-        # Check that dummy is connected to the correct ring position
-        neighbors = dummy_atom.GetNeighbors()
-        assert len(neighbors) == 1
-        
-    # Verify side chains removed
-    # New atom counts.
-    # Original: 16 (ring) + 1 (O=) + 1 (Me) + 2 (Et) = 20 heavy atoms.
-    # Removed: Me (1), Et (2). Total -3.
-    # Added: 3 dummies. Total +3.
-    # Net: 20.
-    assert res_mol.GetNumAtoms() == 20
-    
-    # Check that the specific side chains are gone.
-    # Count carbons.
-    # Original C count: 1 (C=O) + 14 (CH2/CH) + 1(Me) + 2(Et) = 18 C.
-    # New C count: 1 (C=O) + 14 (Ring C) = 15 C.
-    # Dummies are *. O are O.
-    c_count = sum(1 for a in res_mol.GetAtoms() if a.GetSymbol() == 'C')
-    assert c_count == 15, f"Expected 15 Carbons, found {c_count}"
-    
-    dummy_count = sum(1 for a in res_mol.GetAtoms() if a.GetSymbol() == '*')
-    assert dummy_count == 3
diff --git a/tests/test_splicing_engine.py b/tests/test_splicing_engine.py
index e3a6edf..09ee986 100644
--- a/tests/test_splicing_engine.py
+++ b/tests/test_splicing_engine.py
@@ -1,77 +1,51 @@
 import pytest
 from rdkit import Chem
-from src.splicing.engine import splice_molecule
+
+from macro_lactone_toolkit import MacrolactoneFragmenter
+from macro_lactone_toolkit.splicing.engine import splice_molecule
+from macro_lactone_toolkit.splicing.scaffold_prep import prepare_macrolactone_scaffold
+
+from .helpers import build_macrolactone, canonicalize
+
 
 def test_splice_benzene_methyl():
-    """
-    Test splicing a benzene scaffold (isotope 1) with a methyl fragment.
-    Scaffold: c1ccccc1[1*] (Phenyl radical-ish dummy)
-    Fragment: C* (Methyl radical-ish dummy)
-    Result: Cc1ccccc1 (Toluene)
-    """
     scaffold = Chem.MolFromSmiles("c1ccccc1[1*]")
     fragment = Chem.MolFromSmiles("C*")
-    
-    assert scaffold is not None
-    assert fragment is not None
-    
+
     product = splice_molecule(scaffold, fragment, position=1)
-    
-    # Expected result: Toluene
-    expected_smiles = "Cc1ccccc1"
-    expected_mol = Chem.MolFromSmiles(expected_smiles)
-    expected_canonical = Chem.MolToSmiles(expected_mol, isomericSmiles=True)
-    
-    product_canonical = Chem.MolToSmiles(product, isomericSmiles=True)
-    
-    assert product_canonical == expected_canonical
+
+    assert canonicalize(product) == canonicalize("Cc1ccccc1")
+
 
 def test_splice_missing_isotope():
-    """Test that error is raised if the requested position is not found on scaffold."""
-    scaffold = Chem.MolFromSmiles("c1ccccc1[2*]") # Isotope 2
+    scaffold = Chem.MolFromSmiles("c1ccccc1[2*]")
     fragment = Chem.MolFromSmiles("C*")
-    
+
     with pytest.raises(ValueError, match="Scaffold dummy atom with isotope 1 not found"):
         splice_molecule(scaffold, fragment, position=1)
 
+
 def test_splice_no_fragment_dummy():
-    """Test that error is raised if fragment has no dummy atom."""
     scaffold = Chem.MolFromSmiles("c1ccccc1[1*]")
-    fragment = Chem.MolFromSmiles("C") # Methane, no dummy
-    
+    fragment = Chem.MolFromSmiles("C")
+
     with pytest.raises(ValueError, match="Fragment does not contain a dummy atom"):
         splice_molecule(scaffold, fragment, position=1)
 
-def test_complex_splicing():
-    """
-    Test splicing with more complex structures.
-    Scaffold: Pyridine derivative n1cccc1CC[1*]
-    Fragment: Cyclopropane C1CC1*
-    Result: n1cccc1CCC1CC1
-    """
-    scaffold = Chem.MolFromSmiles("n1cccc1CC[1*]")
-    fragment = Chem.MolFromSmiles("*C1CC1")
-    
-    product = splice_molecule(scaffold, fragment, position=1)
-    
-    expected = Chem.MolFromSmiles("n1cccc1CCC1CC1")
-    
-    assert Chem.MolToSmiles(product) == Chem.MolToSmiles(expected)
 
-def test_scaffold_with_multiple_different_dummies():
-    """
-    Test splicing when scaffold has multiple dummies with different isotopes.
-    Scaffold: [1*]c1ccccc1[2*]
-    Fragment: C*
-    Target: Splicing at 1 should leave [2*] intact.
-    """
-    scaffold = Chem.MolFromSmiles("[1*]c1ccccc1[2*]")
-    fragment = Chem.MolFromSmiles("C*")
-    
-    # Splice at 1
-    product = splice_molecule(scaffold, fragment, position=1)
-    
-    # Expected: Cc1ccccc1[2*]
-    expected = Chem.MolFromSmiles("Cc1ccccc1[2*]")
-    
-    assert Chem.MolToSmiles(product) == Chem.MolToSmiles(expected)
+def test_prepare_scaffold_and_reassemble_fragment():
+    built = build_macrolactone(16, {5: "ethyl"})
+    result = MacrolactoneFragmenter(ring_size=16).fragment_molecule(built.smiles, parent_id="reassemble")
+    fragment = next(fragment for fragment in result.fragments if fragment.cleavage_position == 5)
+
+    scaffold, dummy_map = prepare_macrolactone_scaffold(
+        built.smiles,
+        positions=[5],
+        ring_size=16,
+    )
+
+    assert 5 in dummy_map
+
+    product = splice_molecule(scaffold, Chem.MolFromSmiles(fragment.fragment_smiles_labeled), position=5)
+
+    assert canonicalize(product) == canonicalize(built.mol)