update
File diff suppressed because one or more lines are too long
2752
Data/MacrolactoneDB/Macrolactone_Filtered_16Ring_Active_435.csv
Normal file
File diff suppressed because one or more lines are too long
3
Data/MacrolactoneDB/README.md
Normal file
@@ -0,0 +1,3 @@
website: https://macrolact.collaborationspharma.com/

The filter on the MacrolactoneDB web page is not accurate: I restricted the ring size to 16, but the downloaded CSV still contains 14-membered rings, so the results have to be filtered again locally.
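A minimal sketch of the local re-filtering step the README asks for, assuming RDKit is installed and that the SMILES strings have already been read from the downloaded CSV (the column layout of that file is not shown here). The helper names `lactone_ring_sizes` and `keep_ring_size` are hypothetical. The sketch treats a ring as a lactone ring when it contains both the carbonyl carbon and the ester oxygen of an ester group, and it relies on RDKit's ring perception, which may not report the macrocycle for every fused or bridged system; whether MacrolactoneDB defines ring size the same way is an assumption.

```python
from rdkit import Chem

# Ester pattern: carbonyl carbon, carbonyl oxygen, ester oxygen, alkyl carbon.
ESTER = Chem.MolFromSmarts("[CX3](=O)[OX2][#6]")


def lactone_ring_sizes(smiles):
    """Return the sizes of all perceived rings that contain an ester C-O bond."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:  # unparsable SMILES: report no lactone ring
        return set()
    rings = mol.GetRingInfo().AtomRings()
    sizes = set()
    # Each match lists atom indices in the order of the SMARTS atoms.
    for carbonyl_c, _carbonyl_o, ester_o, _alkyl_c in mol.GetSubstructMatches(ESTER):
        for ring in rings:
            if carbonyl_c in ring and ester_o in ring:
                sizes.add(len(ring))
    return sizes


def keep_ring_size(smiles_list, size=16):
    """Keep only the SMILES that have a lactone ring of exactly `size` atoms."""
    return [s for s in smiles_list if size in lactone_ring_sizes(s)]
```

Calling `keep_ring_size(smiles, size=16)` on the downloaded entries would drop the 14-membered rings mentioned above, provided the ring perception matches the database's own definition.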
3
Data/MacrolactoneDB/ring12_20/README.md
Normal file
@@ -0,0 +1,3 @@
Your Filtered Macrolactone Database

11036 compounds have been filtered from MacrolactoneDB based on your specified inputs.
89
Data/MacrolactoneDB/ring12_20/counts.txt
Normal file
@@ -0,0 +1,89 @@
Target Organisms
Homo sapiens 815
Homo sapiens, None 180
Plasmodium falciparum 161
Hepatitis C virus, None 112
Homo sapiens, Plasmodium falciparum 63
Oryctolagus cuniculus 62
Mus musculus 60
Toxoplasma gondii 39
Homo sapiens, Rattus norvegicus 27
Mus musculus, Homo sapiens 24
None, Rattus norvegicus 23
Human immunodeficiency virus 1 20
Hepatitis C virus 18
Rattus norvegicus 17
Homo sapiens, Sus scrofa 11
Homo sapiens, Chlorocebus aethiops 10
Serratia marcescens 9
Escherichia coli 8
Oryctolagus cuniculus, Homo sapiens 7
Streptococcus pneumoniae 6
Oryctolagus cuniculus, Staphylococcus aureus, Raoultella planticola, Bacillus subtilis, Mus musculus, Micrococcus luteus, None, Escherichia coli, Plasmodium falciparum, Streptococcus pneumoniae, Homo sapiens, Escherichia coli K-12, Toxoplasma gondii 6
Plasmodium falciparum K1 5
Bacillus anthracis 5
Mus musculus, Homo sapiens, None 5
Bacillus anthracis, Homo sapiens 4
Candida albicans, Cryptococcus neoformans, Aspergillus fumigatus 4
Mus musculus, None 4
Plasmodium falciparum, Homo sapiens, None 4
None, Homo sapiens, Plasmodium falciparum 3
Bacillus subtilis, Homo sapiens 3
Oryctolagus cuniculus, Homo sapiens, None 3
Sus scrofa, Mus musculus, None, Plasmodium falciparum, Homo sapiens, Rattus norvegicus 2
Homo sapiens, None, Rattus norvegicus 2
Cryptococcus neoformans 2
Homo sapiens, None, Chlorocebus aethiops 2
Staphylococcus aureus 2
Candida albicans, Cryptococcus neoformans, Mycobacterium intracellulare, Aspergillus fumigatus 2
Mus musculus, None, Human immunodeficiency virus 1 2
Escherichia coli (strain K12) 2
Plasmodium falciparum 3D7, Homo sapiens 2
Aspergillus fumigatus 1
Sus scrofa 1
Saccharomyces cerevisiae S288c, Human immunodeficiency virus 1, Human herpesvirus 1, Plasmodium falciparum, None, Homo sapiens, Rattus norvegicus 1
Hepatitis C virus, Homo sapiens, None 1
Plasmodium falciparum 3D7 1
Bacillus subtilis 1
Mus musculus, Homo sapiens, None, Saccharomyces cerevisiae 1
Chlorocebus aethiops 1
Homo sapiens, Escherichia coli K-12, None 1
Hepatitis C virus, Homo sapiens, None, Rattus norvegicus 1
None, Homo sapiens, Human herpesvirus 1 1
Homo sapiens, None, Trypanosoma brucei brucei 1
Homo sapiens, None, Cryptococcus neoformans 1
Homo sapiens, Rattus norvegicus, Human immunodeficiency virus 1 1
None, Plasmodium falciparum, Escherichia coli, Streptococcus pneumoniae, Naegleria fowleri, Homo sapiens, Streptococcus, Toxoplasma gondii 1
Giardia intestinalis, Trypanosoma cruzi, Equus caballus, Bos taurus, Mus musculus, None, Plasmodium falciparum, Chlorocebus aethiops, Homo sapiens 1
Plasmodium falciparum NF54, Trypanosoma cruzi, Trypanosoma brucei rhodesiense, Rattus norvegicus 1
None, Homo sapiens, Plasmodium falciparum K1, Plasmodium falciparum 1
Saccharomyces cerevisiae S288c, Homo sapiens, None, Saccharomyces cerevisiae, Phytophthora sojae 1
Bacillus subtilis, Homo sapiens, Schistosoma mansoni, Saccharomyces cerevisiae, Giardia intestinalis 1
Streptococcus, Homo sapiens, None 1
Mus musculus, Homo sapiens, Rattus norvegicus 1
Homo sapiens, Spinacia oleracea 1
Human immunodeficiency virus 1, Mus musculus, None, Hepatitis C virus, Homo sapiens, Rattus norvegicus 1
None, Plasmodium falciparum, Trypanosoma brucei rhodesiense 1
Hepatitis C virus, None, Rattus norvegicus 1
Homo sapiens, Equus caballus 1
Plasmodium falciparum NF54, Trypanosoma cruzi, Trypanosoma brucei rhodesiense 1
Schistosoma mansoni, Influenza A virus 1
Leishmania chagasi, Trypanosoma cruzi 1
Candida albicans, Cryptococcus neoformans 1
None, Plasmodium falciparum 1
Caenorhabditis elegans 1
Bos taurus, Sus scrofa 1
Plasmodium falciparum, Enterococcus faecium 1
Homo sapiens, Gallus gallus 1
Homo sapiens, Escherichia coli 1
Plasmodium falciparum, Homo sapiens, None, Rattus norvegicus, Schistosoma mansoni 1
Homo sapiens, None, Influenza A virus 1
Mycobacterium tuberculosis, None 1
Escherichia coli, Homo sapiens, Toxoplasma gondii, None, Streptococcus pneumoniae 1
Bacillus subtilis, Oryctolagus cuniculus, Homo sapiens, Schistosoma mansoni, Giardia intestinalis 1
Homo sapiens, None, Rattus norvegicus, Escherichia coli O157:H7 1
Giardia intestinalis, Schistosoma mansoni, Mus musculus, None, Homo sapiens, Saccharomyces cerevisiae 1
Trypanosoma cruzi 1
Influenza A virus 1
Escherichia coli K-12 1
Human herpesvirus 4 (strain B95-8) 1
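The counts file lists the same set of organisms in several orders (for example `Plasmodium falciparum, Homo sapiens, None` and `None, Homo sapiens, Plasmodium falciparum` appear as separate rows), so totals per organism or per combination are not directly readable from it. The following is a rough sketch of how the rows could be re-aggregated; it assumes each data row is a comma-separated organism list followed by whitespace and an integer count, and the function names are hypothetical.

```python
from collections import Counter

SAMPLE = """Target Organisms
Homo sapiens 815
Homo sapiens, None 180
Plasmodium falciparum, Homo sapiens, None 4
None, Homo sapiens, Plasmodium falciparum 3
Human herpesvirus 4 (strain B95-8) 1
""".splitlines()


def parse_counts(lines):
    """Yield (organism_names, count) for every 'name[, name ...] count' row."""
    for raw in lines:
        parts = raw.strip().rsplit(None, 1)  # split off the trailing count
        if len(parts) != 2 or not parts[1].isdigit():
            continue  # skips the 'Target Organisms' header and blank lines
        names = tuple(name.strip() for name in parts[0].split(","))
        yield names, int(parts[1])


def per_organism_totals(lines, skip=("None",)):
    """Sum the counts of every row in which an organism appears."""
    totals = Counter()
    for names, count in parse_counts(lines):
        for name in set(names):
            if name not in skip:
                totals[name] += count
    return totals


def per_combination_totals(lines):
    """Merge rows that list the same organisms in a different order."""
    totals = Counter()
    for names, count in parse_counts(lines):
        totals[frozenset(names)] += count
    return totals


if __name__ == "__main__":
    for name, total in per_organism_totals(SAMPLE).most_common():
        print(f"{total:6d}  {name}")
```

Splitting on the last whitespace rather than the first keeps names such as `Human immunodeficiency virus 1` intact. The `skip` default drops the literal `None` entries, which is a choice made for this sketch rather than something the file prescribes.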
51
Data/MacrolactoneDB/ring12_20/processed_embedding_small.csv
Normal file
File diff suppressed because one or more lines are too long
11037
Data/MacrolactoneDB/ring12_20/temp.csv
Normal file
File diff suppressed because one or more lines are too long
BIN
Data/MacrolactoneDB/ring12_20/umap_visualization_small.png
Normal file
Binary file not shown.
After: Size 199 KiB
2023
Data/MacrolactoneDB/ring16/temp.csv
Executable file
File diff suppressed because one or more lines are too long
289579
Data/MacrolactoneDB/ring16/temp.sdf
Executable file
File diff suppressed because one or more lines are too long
0
Data/ery_core.txt
Normal file → Executable file
36
Data/fragment/README.md
Normal file
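@@ -0,0 +1,36 @@
## Screening data from the [Cell](https://www.cell.com/cell/abstract/S0092-8674(25)00855-4) paper



## Data input: raw fragment libraries

Frags-Enamine-18M.csv: 18M fragments from the Enamine REAL database (SMILES must be extracted).
GDB11-27M.csv: 27M fragments from the GDB-11 database (SMILES must be extracted).

Download: [Zenodo link](https://zenodo.org/records/15191826)

## Screening logic of the original paper (targeting Neisseria gonorrhoeae)

(1) Data input: raw fragment libraries
File sources:
Frags-Enamine-18M.csv: 18M fragments from the Enamine REAL database (SMILES must be extracted).
GDB11-27M.csv: 27M fragments from the GDB-11 database (SMILES must be extracted).
(2) Model prediction: pretrained Chemprop models
Purpose of the models:
Pretrained Chemprop models (for Neisseria gonorrhoeae or Staphylococcus aureus) predict an antibacterial activity score for each fragment (range 0-1).
Why the models are reasonable:
Chemprop models are based on graph neural networks (GNN) and were trained on a large compound library (e.g. the Broad Institute's 38,765 compounds), which gives them relatively high predictive accuracy for structure-activity relationships.
The paper validated the model's ability to predict known antibiotic fragments (see Figure S1A), which supports its reliability.
(3) Multi-dimensional filter conditions
The screening logic includes the following conditions (to be implemented in code):

1. Activity threshold:
GDB library fragments: predicted score > 0.05;
Enamine library fragments: predicted score > 0.1 (because they are easier to synthesize).
2. Toxicity filter:
Using pretrained HepG2, HSkMC, and IMR-90 cytotoxicity models, discard fragments with a predicted score > 0.5.
3. Structural filter:
Exclude fragments containing PAINS/Brenk substructures (prone to false positives or metabolic instability).
Tanimoto similarity to the 559 known antibiotics < 0.5 (to ensure structural novelty).
(4) Result output
The final set contains 1,156,945 fragments (targeting Neisseria gonorrhoeae), stored in the supplementary data or the Zenodo repository.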
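The filter conditions in step (3) are marked as still needing a code implementation. The sketch below shows one way the decision logic could look once the model scores are available. It is not the paper's code: the `Fragment` record, its field names, and the representation of a fingerprint as a set of on-bit indices are assumptions made for illustration, the PAINS/Brenk check is reduced to a precomputed boolean, and only the numeric cutoffs come from the README.

```python
from dataclasses import dataclass, field

# Cutoffs taken from the README; everything else here is hypothetical.
ACTIVITY_CUTOFF = {"GDB": 0.05, "Enamine": 0.1}
TOXICITY_CUTOFF = 0.5
SIMILARITY_CUTOFF = 0.5


@dataclass
class Fragment:
    smiles: str
    library: str                 # "GDB" or "Enamine"
    activity: float              # predicted antibacterial score, 0-1
    toxicity: dict               # cell line name -> predicted score, 0-1
    has_alert: bool              # True if a PAINS/Brenk substructure was found
    fingerprint: frozenset = field(default_factory=frozenset)  # on-bit indices


def tanimoto(a, b):
    """Tanimoto coefficient of two bit sets: |a & b| / |a | b|."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def passes_filters(frag, antibiotic_fps):
    """Apply the activity, toxicity, alert, and novelty filters in turn."""
    if frag.activity <= ACTIVITY_CUTOFF[frag.library]:
        return False
    if any(score > TOXICITY_CUTOFF for score in frag.toxicity.values()):
        return False
    if frag.has_alert:
        return False
    # Novelty: similarity to every known antibiotic must stay below the cutoff.
    return all(tanimoto(frag.fingerprint, fp) < SIMILARITY_CUTOFF
               for fp in antibiotic_fps)


def screen(fragments, antibiotic_fps):
    """Return the fragments that survive all filters."""
    return [f for f in fragments if passes_filters(f, antibiotic_fps)]
```

In practice the scores would come from the pretrained Chemprop models and the fingerprints and substructure alerts from a cheminformatics toolkit. Comparing each fragment against all 559 antibiotics is cheap per fragment, but with tens of millions of fragments it is worth applying the score thresholds first, as `passes_filters` does, so the similarity check only runs on the survivors.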
0
Data/image.png
Normal file → Executable file
Before: Size 94 KiB. After: Size 94 KiB.
9
Data/my_sugars.txt
Executable file
@@ -0,0 +1,9 @@
[*R*][C@@H](O[C@@H]1O[C@H](C)[C@@H](O[C@@H]2O[C@H](C)[C@@H](O)[C@@](O)(C)C2)[C@H](N(C)C)[C@H]1O)[*R*]
[*R*][C@@H](CO[C@@H]1O[C@H](C)[C@@H](O)[C@@H](OC)[C@H]1OC)[*R*]
[*R*][C@H](O[C@H]9C[C@@](C)(OC)[C@@H](O)[C@H](C)O9)[*R*]
[*R*][C@H](O[C@@H]9O[C@H](C)C[C@@H]([C@H]9O)N(C)C)[*R*]
[*R*][C@H](O[C@@H]9O[C@H](C)C[C@@H]([C@H]9OC(C)=O)N(C)C)[*R*]
[*R*][C@H](O[C@H]9C[C@H](OC)O[C@@H](C)[C@@H]9OC(C)=O)[*R*]
[*R*][C@H](O[C@H]9C[C@H](OC)[C@@H](O)[C@H](C)O9)[*R*]
[*R*][C@H](O[C@H]9C[C@@H](O)[C@H](O)[C@@H](C)O9)[*R*]
[*R*][C@H](O[C@@H]9O[C@H](C)C[C@H](NC)[C@H]9O)[*R*]
0
Data/selected_extenders.txt
Normal file → Executable file
2
Data/split_position.md
Executable file
@@ -0,0 +1,2 @@
Bond index 31: 17(C) -> 32(O), bond type: SINGLE
Bond index 6: 6(C) -> 7(C), bond type: SINGLE
0
Data/sugars
Normal file → Executable file