mm644706215
2025-10-16 17:26:35 +08:00
parent b1d437a06d
commit ea218a3a39
49 changed files with 694742 additions and 2 deletions

File diff suppressed because one or more lines are too long



@@ -0,0 +1,3 @@
website: https://macrolact.collaborationspharma.com/
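The results filtered through the MacrolactoneDB web page are not accurate: I restricted the ring size to 16, but the downloaded CSV still contains 14-membered ring entries, so the data has to be filtered again manually.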


@@ -0,0 +1,3 @@
Your Filtered Macrolactone Database
11036 compounds have been filtered from MacrolactoneDB based on your specified inputs.


@@ -0,0 +1,89 @@
Target Organisms
Homo sapiens 815
Homo sapiens, None 180
Plasmodium falciparum 161
Hepatitis C virus, None 112
Homo sapiens, Plasmodium falciparum 63
Oryctolagus cuniculus 62
Mus musculus 60
Toxoplasma gondii 39
Homo sapiens, Rattus norvegicus 27
Mus musculus, Homo sapiens 24
None, Rattus norvegicus 23
Human immunodeficiency virus 1 20
Hepatitis C virus 18
Rattus norvegicus 17
Homo sapiens, Sus scrofa 11
Homo sapiens, Chlorocebus aethiops 10
Serratia marcescens 9
Escherichia coli 8
Oryctolagus cuniculus, Homo sapiens 7
Streptococcus pneumoniae 6
Oryctolagus cuniculus, Staphylococcus aureus, Raoultella planticola, Bacillus subtilis, Mus musculus, Micrococcus luteus, None, Escherichia coli, Plasmodium falciparum, Streptococcus pneumoniae, Homo sapiens, Escherichia coli K-12, Toxoplasma gondii 6
Plasmodium falciparum K1 5
Bacillus anthracis 5
Mus musculus, Homo sapiens, None 5
Bacillus anthracis, Homo sapiens 4
Candida albicans, Cryptococcus neoformans, Aspergillus fumigatus 4
Mus musculus, None 4
Plasmodium falciparum, Homo sapiens, None 4
None, Homo sapiens, Plasmodium falciparum 3
Bacillus subtilis, Homo sapiens 3
Oryctolagus cuniculus, Homo sapiens, None 3
Sus scrofa, Mus musculus, None, Plasmodium falciparum, Homo sapiens, Rattus norvegicus 2
Homo sapiens, None, Rattus norvegicus 2
Cryptococcus neoformans 2
Homo sapiens, None, Chlorocebus aethiops 2
Staphylococcus aureus 2
Candida albicans, Cryptococcus neoformans, Mycobacterium intracellulare, Aspergillus fumigatus 2
Mus musculus, None, Human immunodeficiency virus 1 2
Escherichia coli (strain K12) 2
Plasmodium falciparum 3D7, Homo sapiens 2
Aspergillus fumigatus 1
Sus scrofa 1
Saccharomyces cerevisiae S288c, Human immunodeficiency virus 1, Human herpesvirus 1, Plasmodium falciparum, None, Homo sapiens, Rattus norvegicus 1
Hepatitis C virus, Homo sapiens, None 1
Plasmodium falciparum 3D7 1
Bacillus subtilis 1
Mus musculus, Homo sapiens, None, Saccharomyces cerevisiae 1
Chlorocebus aethiops 1
Homo sapiens, Escherichia coli K-12, None 1
Hepatitis C virus, Homo sapiens, None, Rattus norvegicus 1
None, Homo sapiens, Human herpesvirus 1 1
Homo sapiens, None, Trypanosoma brucei brucei 1
Homo sapiens, None, Cryptococcus neoformans 1
Homo sapiens, Rattus norvegicus, Human immunodeficiency virus 1 1
None, Plasmodium falciparum, Escherichia coli, Streptococcus pneumoniae, Naegleria fowleri, Homo sapiens, Streptococcus, Toxoplasma gondii 1
Giardia intestinalis, Trypanosoma cruzi, Equus caballus, Bos taurus, Mus musculus, None, Plasmodium falciparum, Chlorocebus aethiops, Homo sapiens 1
Plasmodium falciparum NF54, Trypanosoma cruzi, Trypanosoma brucei rhodesiense, Rattus norvegicus 1
None, Homo sapiens, Plasmodium falciparum K1, Plasmodium falciparum 1
Saccharomyces cerevisiae S288c, Homo sapiens, None, Saccharomyces cerevisiae, Phytophthora sojae 1
Bacillus subtilis, Homo sapiens, Schistosoma mansoni, Saccharomyces cerevisiae, Giardia intestinalis 1
Streptococcus, Homo sapiens, None 1
Mus musculus, Homo sapiens, Rattus norvegicus 1
Homo sapiens, Spinacia oleracea 1
Human immunodeficiency virus 1, Mus musculus, None, Hepatitis C virus, Homo sapiens, Rattus norvegicus 1
None, Plasmodium falciparum, Trypanosoma brucei rhodesiense 1
Hepatitis C virus, None, Rattus norvegicus 1
Homo sapiens, Equus caballus 1
Plasmodium falciparum NF54, Trypanosoma cruzi, Trypanosoma brucei rhodesiense 1
Schistosoma mansoni, Influenza A virus 1
Leishmania chagasi, Trypanosoma cruzi 1
Candida albicans, Cryptococcus neoformans 1
None, Plasmodium falciparum 1
Caenorhabditis elegans 1
Bos taurus, Sus scrofa 1
Plasmodium falciparum, Enterococcus faecium 1
Homo sapiens, Gallus gallus 1
Homo sapiens, Escherichia coli 1
Plasmodium falciparum, Homo sapiens, None, Rattus norvegicus, Schistosoma mansoni 1
Homo sapiens, None, Influenza A virus 1
Mycobacterium tuberculosis, None 1
Escherichia coli, Homo sapiens, Toxoplasma gondii, None, Streptococcus pneumoniae 1
Bacillus subtilis, Oryctolagus cuniculus, Homo sapiens, Schistosoma mansoni, Giardia intestinalis 1
Homo sapiens, None, Rattus norvegicus, Escherichia coli O157:H7 1
Giardia intestinalis, Schistosoma mansoni, Mus musculus, None, Homo sapiens, Saccharomyces cerevisiae 1
Trypanosoma cruzi 1
Influenza A virus 1
Escherichia coli K-12 1
Human herpesvirus 4 (strain B95-8) 1



Binary file not shown (size: 199 KiB).


289579
Data/MacrolactoneDB/ring16/temp.sdf Executable file

File diff suppressed because one or more lines are too long

0
Data/ery_core.txt Normal file → Executable file

36
Data/fragment/README.md Normal file
View File

@@ -0,0 +1,36 @@
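## Screening data from the [Cell](https://www.cell.com/cell/abstract/S0092-8674(25)00855-4) paper
## Data input: raw fragment libraries
- `Frags-Enamine-18M.csv`: 18M fragments from the Enamine REAL database (SMILES must be extracted)
- `GDB11-27M.csv`: 27M fragments from the GDB-11 database (SMILES must be extracted)

Download: [Zenodo link](https://zenodo.org/records/15191826)
## Screening logic of the original paper (targeting Neisseria gonorrhoeae)
1. Data input: raw fragment libraries
   - Source files: `Frags-Enamine-18M.csv` and `GDB11-27M.csv`, as described above.
2. Model prediction: pretrained Chemprop models
   - Purpose: a pretrained Chemprop model for Neisseria gonorrhoeae or Staphylococcus aureus predicts the antibacterial activity of each fragment (score range 0 to 1).
   - Rationale: Chemprop is based on a graph neural network (GNN) and has been trained on large compound libraries (such as the 38,765 compounds from the Broad Institute), which gives it relatively high accuracy in predicting structure-activity relationships.
   - The paper validated the model's predictions on fragments of known antibiotics (see Figure S1A), supporting its reliability.
3. Multi-criteria filtering
   The screening logic applies the following conditions (these must be implemented in code):
   1. Activity threshold:
      - GDB library fragments: predicted score > 0.05.
      - Enamine library fragments: predicted score > 0.1 (because of their better synthesizability).
   2. Toxicity filter:
      - Use the pretrained HepG2, HSkMC, and IMR-90 cytotoxicity models and remove fragments with a predicted score > 0.5.
   3. Structural filter:
      - Exclude fragments containing PAINS/Brenk substructures (prone to false positives or metabolic instability).
      - Require a Tanimoto similarity < 0.5 to the 559 known antibiotics (to ensure structural novelty).
4. Output
   - The final set contains 1,156,945 fragments (targeting Neisseria gonorrhoeae), stored in the supplementary data or in the Zenodo repository.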
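
The filtering conditions in step 3 can be sketched roughly as follows. This is only an illustrative sketch, not the paper's code: the `Fragment` record, its field names, and the helper functions are hypothetical, the predicted scores and the PAINS/Brenk flag are assumed to have been computed upstream (for example with Chemprop and a cheminformatics toolkit), and fingerprints are represented simply as sets of on-bit indices. Two interpretations are also assumptions: a fragment is dropped if any of the three toxicity models scores above 0.5, and the similarity condition is applied to the maximum similarity over all known antibiotics.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List

# Thresholds quoted in the README; everything else here is an assumption.
ACTIVITY_CUTOFF = {"GDB": 0.05, "Enamine": 0.1}
TOXICITY_CUTOFF = 0.5
SIMILARITY_CUTOFF = 0.5


@dataclass
class Fragment:
    """Hypothetical record holding precomputed values for one fragment."""
    smiles: str
    library: str                 # "GDB" or "Enamine"
    activity: float              # predicted antibacterial score, 0 to 1
    toxicity: Dict[str, float]   # cell line name -> predicted toxicity score
    has_alert: bool              # True if a PAINS/Brenk substructure matched
    fingerprint: FrozenSet[int]  # on-bit indices of a binary fingerprint


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto coefficient of two bit sets: |a & b| / |a | b|."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def passes_filters(frag: Fragment, known: List[FrozenSet[int]]) -> bool:
    """Apply the activity, toxicity, alert, and novelty conditions in turn."""
    if frag.activity <= ACTIVITY_CUTOFF[frag.library]:
        return False
    # Assumption: one toxic prediction in any cell line is enough to reject.
    if any(score > TOXICITY_CUTOFF for score in frag.toxicity.values()):
        return False
    if frag.has_alert:
        return False
    # Assumption: the fragment must be dissimilar to every known antibiotic.
    return all(tanimoto(frag.fingerprint, fp) < SIMILARITY_CUTOFF for fp in known)


def filter_fragments(frags: Iterable[Fragment],
                     known: Iterable[FrozenSet[int]]) -> List[Fragment]:
    """Return the fragments that satisfy all conditions, in input order."""
    known_list = list(known)
    return [f for f in frags if passes_filters(f, known_list)]
```

Keeping the thresholds in module-level constants makes it easy to adjust them per library or per target organism without touching the filter logic.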

0
Data/image.png Normal file → Executable file

Before: 94 KiB | After: 94 KiB

9
Data/my_sugars.txt Executable file

@@ -0,0 +1,9 @@
[*R*][C@@H](O[C@@H]1O[C@H](C)[C@@H](O[C@@H]2O[C@H](C)[C@@H](O)[C@@](O)(C)C2)[C@H](N(C)C)[C@H]1O)[*R*]
[*R*][C@@H](CO[C@@H]1O[C@H](C)[C@@H](O)[C@@H](OC)[C@H]1OC)[*R*]
[*R*][C@H](O[C@H]9C[C@@](C)(OC)[C@@H](O)[C@H](C)O9)[*R*]
[*R*][C@H](O[C@@H]9O[C@H](C)C[C@@H]([C@H]9O)N(C)C)[*R*]
[*R*][C@H](O[C@@H]9O[C@H](C)C[C@@H]([C@H]9OC(C)=O)N(C)C)[*R*]
[*R*][C@H](O[C@H]9C[C@H](OC)O[C@@H](C)[C@@H]9OC(C)=O)[*R*]
[*R*][C@H](O[C@H]9C[C@H](OC)[C@@H](O)[C@H](C)O9)[*R*]
[*R*][C@H](O[C@H]9C[C@@H](O)[C@H](O)[C@@H](C)O9)[*R*]
[*R*][C@H](O[C@@H]9O[C@H](C)C[C@H](NC)[C@H]9O)[*R*]

0
Data/selected_extenders.txt Normal file → Executable file

2
Data/split_position.md Executable file
View File

@@ -0,0 +1,2 @@
Bond index 31: 17(C) -> 32(O), bond type: SINGLE
Bond index 6: 6(C) -> 7(C), bond type: SINGLE

0
Data/sugars Normal file → Executable file