qsar
Getting started
DOI https://doi.org/10.1016/j.ejmech.2022.114495
Design, synthesis and activity against drug-resistant bacteria evaluation of C-20, C-23 modified 5-O-mycaminosyltylonolide derivatives
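A analogs: 22 activity data points; B analogs: 7 activity data points; C analogs: 47 activity data points
Search criteria: at least 85% similarity to Structure2D_A1.mol
Search results: A_85.csv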
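The notes above do not say which tool or similarity metric produced A_85.csv. As a minimal sketch, assuming the threshold refers to Tanimoto similarity on binary fingerprints (a common choice for structure similarity searches), the 85% cut-off can be illustrated in plain Python. The function names and the on-bit indices below are hypothetical and only serve the illustration.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    if not bits_a and not bits_b:
        return 0.0
    shared = len(bits_a & bits_b)
    return shared / (len(bits_a) + len(bits_b) - shared)


def filter_similar(query_bits, library, threshold=0.85):
    """Return the names of library entries whose similarity to the query meets the threshold."""
    return [
        name
        for name, bits in library.items()
        if tanimoto(query_bits, bits) >= threshold
    ]


# Hypothetical fingerprints, given as sets of on-bit indices (not real data).
query = {1, 4, 7, 9, 12, 15, 18, 20}
library = {
    "analog_1": {1, 4, 7, 9, 12, 15, 18, 20},      # identical to the query
    "analog_2": {1, 4, 7, 9, 12, 15, 18, 21},      # 7 shared bits out of 9 in the union
    "analog_3": {1, 4, 7, 9, 12, 15, 18, 20, 22},  # 8 shared bits out of 9 in the union
}

print(filter_similar(query, library))  # ['analog_1', 'analog_3']
```

In practice the fingerprints would come from a cheminformatics toolkit rather than hand-written sets; only the thresholding logic is shown here.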
env
python -m pip install --upgrade pip
pip config set global.index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
pip install unimol_tools
pip install huggingface_hub
result
(analyse) (base) root@DESK4090:/mnt/c/project/qsar/MIC# python qsar_1D.py
[1D-QSAR][Linear Regression] MSE:32.3949 R2:0.6525
Model saved to 1d_qsar_linear_regression_model.pkl
[1D-QSAR][Stochastic Gradient Descent] MSE:230009980374197965960989638656.0000 R2:-2467672699617844819673481216.0000
Model saved to 1d_qsar_stochastic_gradient_descent_model.pkl
[1D-QSAR][K-Nearest Neighbors] MSE:30.2081 R2:0.6759
Model saved to 1d_qsar_k-nearest_neighbors_model.pkl
[1D-QSAR][Decision Tree] MSE:27.7150 R2:0.7027
Model saved to 1d_qsar_decision_tree_model.pkl
[1D-QSAR][Random Forest] MSE:26.5204 R2:0.7155
Model saved to 1d_qsar_random_forest_model.pkl
[1D-QSAR][XGBoost] MSE:27.7147 R2:0.7027
Model saved to 1d_qsar_xgboost_model.pkl
[1D-QSAR][Multi-layer Perceptron] MSE:143.3505 R2:-0.5379
Model saved to 1d_qsar_multi-layer_perceptron_model.pkl
---
[2D-QSAR][Linear Regression] MSE:30.1093 R2:0.6770
Model saved to 2d_qsar_linear_regression_model.pkl
[2D-QSAR][Stochastic Gradient Descent] MSE:33.7336 R2:0.6381
Model saved to 2d_qsar_stochastic_gradient_descent_model.pkl
[2D-QSAR][K-Nearest Neighbors] MSE:48.8179 R2:0.4763
Model saved to 2d_qsar_k-nearest_neighbors_model.pkl
[2D-QSAR][Decision Tree] MSE:30.2360 R2:0.6756
Model saved to 2d_qsar_decision_tree_model.pkl
[2D-QSAR][Random Forest] MSE:28.7916 R2:0.6911
Model saved to 2d_qsar_random_forest_model.pkl
[2D-QSAR][XGBoost] MSE:30.2351 R2:0.6756
Model saved to 2d_qsar_xgboost_model.pkl
[2D-QSAR][Multi-layer Perceptron] MSE:30.1715 R2:0.6763
Model saved to 2d_qsar_multi-layer_perceptron_model.pkl
---
[3D-QSAR][Stochastic Gradient Descent] MSE:64.5768 R2:0.3072
Model saved to 3d_qsar_stochastic_gradient_descent_model.pkl
[3D-QSAR][K-Nearest Neighbors] MSE:38.6921 R2:0.5849
Model saved to 3d_qsar_k-nearest_neighbors_model.pkl
[3D-QSAR][Decision Tree] MSE:30.2360 R2:0.6756
Model saved to 3d_qsar_decision_tree_model.pkl
[3D-QSAR][Random Forest] MSE:30.8310 R2:0.6692
Model saved to 3d_qsar_random_forest_model.pkl
[3D-QSAR][XGBoost] MSE:30.2362 R2:0.6756
Model saved to 3d_qsar_xgboost_model.pkl
[3D-QSAR][Multi-layer Perceptron] MSE:29.9844 R2:0.6783
Model saved to 3d_qsar_multi-layer_perceptron_model.pkl
---
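The logs report MSE and R2 for every model, including negative R2 values for the 1D SGD and MLP runs. How qsar_1D.py computes them is not shown here; the sketch below assumes the standard definitions (the same ones used by scikit-learn's `mean_squared_error` and `r2_score`) and uses hypothetical activity values to show why R2 drops below zero when a model predicts worse than the mean of the test targets.

```python
def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical activity values, not taken from the dataset.
y_true = [2.0, 4.0, 8.0, 16.0]
good = [3.0, 4.0, 7.0, 15.0]   # close to the targets
bad = [16.0, 8.0, 4.0, 2.0]    # worse than predicting the mean everywhere

for name, pred in [("good", good), ("bad", bad)]:
    print(f"{name}: MSE={mse(y_true, pred):.4f} R2={r2(y_true, pred):.4f}")
```

Under these definitions MSE / (1 - R2) equals the variance of the test targets. That ratio is about 93.2 for every row in the logs above, which is consistent with all models having been scored on the same test set.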
unimol qsar
{'mse': 59.72037918598548, 'mae': 5.179289798987539, 'pearsonr': 0.638764928149331, 'spearmanr': 0.6006870492749102, 'r2': 0.35928715315601223}
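The Uni-Mol result adds `mae`, `pearsonr` and `spearmanr` to the metrics used above. As a rough sketch, assuming these keys carry their usual meaning (mean absolute error, Pearson correlation, and Spearman rank correlation), they can be reproduced in plain Python. The helper names and the sample values are hypothetical.

```python
def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    norm_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (norm_x * norm_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values: the ordering is preserved, the magnitudes are not.
y_true = [2.0, 4.0, 8.0, 16.0]
y_pred = [1.0, 5.0, 6.0, 30.0]

print(mae(y_true, y_pred), pearson(y_true, y_pred), spearman(y_true, y_pred))
```

Spearman only looks at ordering, so it can stay high while MAE and MSE are large. That is worth keeping in mind when comparing the correlation values with the error values in the result above.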
DeePMD-kit
NotebookLM
PyTorch code
3D-QSAR tutorial
MolPipeline
DOI https://doi.org/10.1021/acs.jcim.4c00863
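MolToBinary: converts a molecule into binary-format features. These features can be molecular fingerprints, which are typically used for similarity calculations.
MolToConcatenatedVector: concatenates several feature vectors to produce a richer feature representation.
MolToSmiles: converts a molecule object into a SMILES (Simplified Molecular Input Line Entry System) string. SMILES is a string format for describing molecular structures and is well suited to standardized representation of structure data.
MolToMACCSFP: computes MACCS keys (structural key) fingerprints. This type of fingerprint is a standard feature for molecular similarity calculations and modeling.
MolToMorganFP: computes Morgan fingerprints (also known as circular fingerprints) with a selectable number of bits and radius. These fingerprints encode topological features of the molecule and are widely used for machine learning in cheminformatics.
MolToNetCharge: computes the net charge of a molecule. Charge information is important for understanding a molecule's chemical properties and reactivity.
Mol2PathFP: computes path-based fingerprints. These describe the molecular structure through the bond paths in the molecule and can be used for similarity analysis and model training.
MolToInchi and MolToInchiKey: convert a molecule into its InChI (International Chemical Identifier) and InChIKey. These standardized encodings are commonly used as unique molecule identifiers in chemical databases.
MolToRDKitPhysChem: computes physicochemical properties of a molecule, such as molecular weight, TPSA (topological polar surface area), and the numbers of hydrogen bond donors and acceptors. These properties are basic features commonly used in machine learning models.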
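To make the idea behind a concatenated feature vector concrete, here is a minimal sketch in plain Python. It deliberately does not call MolPipeline or RDKit; `bits_to_vector`, `concatenate`, and all numeric values are hypothetical stand-ins for a fingerprint block and a physicochemical descriptor block.

```python
def bits_to_vector(on_bits, n_bits):
    """Expand a set of on-bit indices into a fixed-length 0/1 vector."""
    vector = [0] * n_bits
    for index in on_bits:
        if not 0 <= index < n_bits:
            raise ValueError(f"bit index {index} is outside 0..{n_bits - 1}")
        vector[index] = 1
    return vector


def concatenate(*blocks):
    """Join several feature blocks into one flat vector of floats."""
    return [float(value) for block in blocks for value in block]


# Hypothetical 8-bit fingerprint and placeholder descriptor values
# (for example molecular weight, TPSA, hydrogen bond donor count).
fingerprint = bits_to_vector({0, 3, 5}, n_bits=8)
physchem = [733.9, 180.1, 4.0]

features = concatenate(fingerprint, physchem)
print(len(features), features)
```

Descriptor blocks usually live on a very different scale from 0/1 fingerprint bits, so scale-sensitive models such as SGD, KNN, or an MLP generally benefit from standardizing the descriptors before training.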