first add
## MolE Broad-Spectrum Antimicrobial Prediction API

Test case: example_usage.py
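
## Features

1. **High-performance parallel processing** - Multi-process parallel computation speeds up prediction for large batches of molecules
2. **Multiple interfaces** - Available as a Python API, a command-line tool, and a web service
3. **Modular design** - Easy to integrate into other projects
4. **Flexible configuration** - Supports custom model paths, thresholds, and other parameters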

## Installation

```bash
pip install -e .
```

## Usage

### 1. Python API

```python
from broad_spectrum_parallel import predict_smiles, MoleculeInput

# Predict one or more SMILES strings
results = predict_smiles(["CCO", "CCN"], ["ethanol", "ethylamine"])

for result in results:
    print(f"{result.chem_id}: broad_spectrum={result.broad_spectrum}, inhibited_strains={result.ginhib_total}")
```
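
As a rough sketch of how the returned objects might be post-processed, the snippet below ranks results by the number of inhibited strains and collects the broad-spectrum hits. It relies only on the three attributes used in the example (`chem_id`, `broad_spectrum`, `ginhib_total`). The helper name `summarize` and the `SimpleNamespace` stand-ins with made-up values are illustrative assumptions, not part of the package.

```python
from types import SimpleNamespace


def summarize(results):
    """Return (ranked, broad_hits) for an iterable of prediction results.

    Each item only needs the attributes used in the example above:
    chem_id, broad_spectrum and ginhib_total.
    """
    # Most inhibited strains first; the sort is stable, so ties keep input order.
    ranked = sorted(results, key=lambda r: r.ginhib_total, reverse=True)
    # Assumes broad_spectrum is a 0/1 flag, so truthiness is enough here.
    broad_hits = [r.chem_id for r in ranked if r.broad_spectrum]
    return ranked, broad_hits


# Stand-in objects with made-up values, used only to exercise the helper.
demo = [
    SimpleNamespace(chem_id="cmpd_a", broad_spectrum=0, ginhib_total=0),
    SimpleNamespace(chem_id="cmpd_b", broad_spectrum=1, ginhib_total=12),
]
ranked, broad_hits = summarize(demo)
print([r.chem_id for r in ranked], broad_hits)
```

With real output, `demo` would be replaced by the list returned from `predict_smiles`.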

### 2. Command-line tool

```bash
# Basic usage
predict_antimicrobial input.tsv output.tsv --smiles_input --smiles_colname smiles --chemid_colname chem_id

# Aggregate prediction scores
predict_antimicrobial input.tsv output.tsv --smiles_input --aggregate_scores
```
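
The commands above expect an `input.tsv` whose column names match `--smiles_colname` and `--chemid_colname`. A minimal sketch for producing such a file with the standard library follows. The layout (tab-separated, one header row) is an assumption inferred from the flags and the `.tsv` extension, and `write_input_tsv` is a hypothetical helper, not something the package provides.

```python
import csv
from pathlib import Path


def write_input_tsv(path, molecules):
    """Write (chem_id, smiles) pairs to a tab-separated file with a header row."""
    with Path(path).open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t")
        # Header names match --chemid_colname and --smiles_colname above.
        writer.writerow(["chem_id", "smiles"])
        writer.writerows(molecules)
    return Path(path)
```

For example, `write_input_tsv("input.tsv", [("ethanol", "CCO"), ("ethylamine", "CCN")])` would create a two-compound input file.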

### 3. Web API service

```bash
# Start the service
uvicorn broad_spectrum_parallel.api:app --host 0.0.0.0 --port 8000
```

The `http://localhost:8000/predict` endpoint can then be called with a POST request:

```bash
curl -X POST "http://localhost:8000/predict" \
  -H "Content-Type: application/json" \
  -d '{"smiles": ["CCO", "CCN"]}'
```
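
The same request can be sent from Python. The sketch below uses only the standard library and mirrors the curl call: same URL, same `Content-Type` header, same `{"smiles": [...]}` body. The function names are illustrative. The response schema is not documented in this README, so `predict` simply returns whatever JSON the service sends back.

```python
import json
import urllib.request

PREDICT_URL = "http://localhost:8000/predict"  # endpoint from the curl example


def build_predict_request(smiles, url=PREDICT_URL):
    """Build (but do not send) a JSON POST request for a list of SMILES."""
    body = json.dumps({"smiles": list(smiles)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(smiles, url=PREDICT_URL, timeout=60):
    """Send the request and return the decoded JSON response as-is."""
    request = build_predict_request(smiles, url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the service running, `predict(["CCO", "CCN"])` would issue the request shown in the curl example.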
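
## Interpreting the results

The run output shows that each compound returns 8 key metrics, including the following:

1. Antimicrobial potential scores (log scale):

   - `apscore_total`: -11.758 - overall antimicrobial score
   - `apscore_gnegative`: -11.648 - score against Gram-negative bacteria
   - `apscore_gpositive`: -11.848 - score against Gram-positive bacteria

2. Inhibited-strain counts:

   - `ginhib_total`: 0 - total number of inhibited strains
   - `ginhib_gnegative`: 0 - number of inhibited Gram-negative strains
   - `ginhib_gpositive`: 0 - number of inhibited Gram-positive strains

3. Broad-spectrum call:

   - `broad_spectrum`: 0 - whether the compound is a broad-spectrum antimicrobial (requires inhibiting ≥10 strains)

🧪 Example interpretation

Taking ethanol (CCO) as an example:

- Very low antimicrobial score (-11.758): the predicted antimicrobial activity is very weak
- No strains inhibited (0): at the configured threshold, it does not effectively inhibit any of the tested strains
- Not broad-spectrum (0): it does not meet the minimum criterion for broad-spectrum activity

This result is expected: ethanol does have germicidal activity, but by drug-discovery standards it is not considered an effective antimicrobial candidate.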
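
The broad-spectrum criterion quoted above (at least 10 inhibited strains) can be written as a one-line rule. This is a sketch of that stated rule only. The package's own implementation may differ, for example in how the threshold is configured, and the names used here are illustrative.

```python
BROAD_SPECTRUM_MIN_STRAINS = 10  # threshold quoted above (inhibits >= 10 strains)


def is_broad_spectrum(ginhib_total, min_strains=BROAD_SPECTRUM_MIN_STRAINS):
    """Return 1 if enough strains are predicted to be inhibited, else 0."""
    return int(ginhib_total >= min_strains)


print(is_broad_spectrum(0))   # ethanol in the example above: 0 strains inhibited
print(is_broad_spectrum(10))  # exactly at the threshold
```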

## Available strain information

```shell
(mole) root@DESK4090:/srv/project/mole_antimicrobial_potential/broad_spectrum_parallel# micromamba run -n mole python -c "
> import pandas as pd
> import numpy as np
>
> # Load the strain screening data
> maier_screen = pd.read_csv('data/01.prepare_training_data/maier_screening_results.tsv.gz', sep='\t', index_col=0)
> print(f'Total strains: {len(maier_screen.columns)}')
> print(f'Total compounds: {len(maier_screen.index)}')
> print(f'First 10 strains:')
> for i, strain in enumerate(maier_screen.columns[:10]):
>     print(f'{i+1}. {strain}')
>
> # Load the Gram stain information
> gram_info = pd.read_excel('raw_data/maier_microbiome/strain_info_SF2.xlsx',
>                           skiprows=[0, 1, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54],
>                           index_col='NT data base')
> print(f'\nGram stain information:')
> print(gram_info['Gram stain'].value_counts())
> "
Total strains: 40
Total compounds: 1197
First 10 strains:
1. Akkermansia muciniphila (NT5021)
2. Bacteroides caccae (NT5050)
3. Bacteroides fragilis (ET) (NT5033)
4. Bacteroides fragilis (NT) (NT5003)
5. Bacteroides ovatus (NT5054)
6. Bacteroides thetaiotaomicron (NT5004)
7. Bacteroides uniformis (NT5002)
8. Bacteroides vulgatus (NT5001)
9. Bacteroides xylanisolvens (NT5064)
10. Bifidobacterium adolescentis (NT5022)
/root/micromamba/envs/mole/lib/python3.10/site-packages/openpyxl/worksheet/_reader.py:329: UserWarning: Unknown extension is not supported and will be removed
  warn(msg)

Gram stain information:
Gram stain
positive    22
negative    18
Name: count, dtype: int64
```
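
The strain labels in the output end with an identifier such as `(NT5021)`, and the Gram stain table is indexed by `NT data base`. Assuming those identifiers are what links the two tables (an assumption; this README does not say so), a small helper like the hypothetical `split_strain_label` below could separate the species name from the identifier before a lookup.

```python
import re

# Species text (which may contain its own parentheses) followed by a final "(NT####)".
STRAIN_LABEL = re.compile(r"^(?P<species>.+?)\s*\((?P<nt_id>NT\d+)\)$")


def split_strain_label(label):
    """Split 'Bacteroides caccae (NT5050)' into ('Bacteroides caccae', 'NT5050')."""
    match = STRAIN_LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized strain label: {label!r}")
    return match.group("species"), match.group("nt_id")
```

Labels with an extra qualifier, such as `Bacteroides fragilis (ET) (NT5033)`, keep the qualifier as part of the species name.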

## Downloading the weights

mole
https://www.alipan.com/s/DNuDo8iEn89
Extraction code: mh90

After downloading, place the weights in: pretrained_model/model_ginconcat_btwin_100k_d8000_l0.0001
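
A quick way to confirm the weights landed in the right place is to check for that directory before running predictions. The sketch below does only that. It assumes the path is relative to the project root and does not verify the files themselves, since their names are not listed here. `weights_available` is a hypothetical helper.

```python
from pathlib import Path

# Directory name taken from the instruction above.
WEIGHTS_DIR = Path("pretrained_model") / "model_ginconcat_btwin_100k_d8000_l0.0001"


def weights_available(project_root="."):
    """Return True if the weights directory exists and contains at least one entry."""
    target = Path(project_root) / WEIGHTS_DIR
    return target.is_dir() and any(target.iterdir())
```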

## Original paper and GitHub repository

https://www.nature.com/articles/s41467-025-58804-4

https://github.com/rolayoalarcon/mole_antimicrobial_potential