
qsar

Getting started

chembl

DOI https://doi.org/10.1016/j.ejmech.2022.114495

Design, synthesis and activity against drug-resistant bacteria evaluation of C-20, C-23 modified 5-O-mycaminosyltylonolide derivatives

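Analog A: 22 activity data points. Analog B: 7 activity data points. Analog C: 47 activity data points.

Search query: Structure2D_A1.mol, similarity of 85% or higher

Search results: A_85.csv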
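
To make the 85% cutoff concrete, here is a minimal sketch of a similarity filter. It assumes the cutoff refers to a Tanimoto coefficient on fingerprint bits, which these notes do not state explicitly. The fingerprints, the names `analog_1` to `analog_3`, and both helper functions are hypothetical illustrations, not the ChEMBL implementation.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints: treat as dissimilar
    return len(a & b) / union


def filter_by_similarity(query_bits, library, threshold=0.85):
    """Return (name, score) pairs with score >= threshold, best match first."""
    hits = []
    for name, bits in library.items():
        score = tanimoto(query_bits, bits)
        if score >= threshold:
            hits.append((name, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy fingerprints (hypothetical, not derived from Structure2D_A1.mol).
query = set(range(20))
library = {
    "analog_1": set(range(19)),         # 19 shared bits, 20 in the union
    "analog_2": set(range(18)) | {99},  # 18 shared bits, 21 in the union
    "analog_3": set(range(10)),         # 10 shared bits, 20 in the union
}

hits = filter_by_similarity(query, library)
for name, score in hits:
    print(f"{name}\t{score:.3f}")
```

With real structures the bit sets would come from a cheminformatics toolkit; only the thresholding logic is shown here.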

env

unimol install

python -m pip install --upgrade pip
pip config set global.index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
pip install unimol_tools
pip install huggingface_hub

result

(analyse) (base) root@DESK4090:/mnt/c/project/qsar/MIC# python qsar_1D.py 
[1D-QSAR][Linear Regression]    MSE:32.3949     R2:0.6525
Model saved to 1d_qsar_linear_regression_model.pkl
[1D-QSAR][Stochastic Gradient Descent]  MSE:230009980374197965960989638656.0000 R2:-2467672699617844819673481216.0000
Model saved to 1d_qsar_stochastic_gradient_descent_model.pkl
[1D-QSAR][K-Nearest Neighbors]  MSE:30.2081     R2:0.6759
Model saved to 1d_qsar_k-nearest_neighbors_model.pkl
[1D-QSAR][Decision Tree]        MSE:27.7150     R2:0.7027
Model saved to 1d_qsar_decision_tree_model.pkl
[1D-QSAR][Random Forest]        MSE:26.5204     R2:0.7155
Model saved to 1d_qsar_random_forest_model.pkl
[1D-QSAR][XGBoost]      MSE:27.7147     R2:0.7027
Model saved to 1d_qsar_xgboost_model.pkl
[1D-QSAR][Multi-layer Perceptron]       MSE:143.3505    R2:-0.5379
Model saved to 1d_qsar_multi-layer_perceptron_model.pkl
---
[2D-QSAR][Linear Regression]    MSE:30.1093     R2:0.6770
Model saved to 2d_qsar_linear_regression_model.pkl
[2D-QSAR][Stochastic Gradient Descent]  MSE:33.7336     R2:0.6381
Model saved to 2d_qsar_stochastic_gradient_descent_model.pkl
[2D-QSAR][K-Nearest Neighbors]  MSE:48.8179     R2:0.4763
Model saved to 2d_qsar_k-nearest_neighbors_model.pkl
[2D-QSAR][Decision Tree]        MSE:30.2360     R2:0.6756
Model saved to 2d_qsar_decision_tree_model.pkl
[2D-QSAR][Random Forest]        MSE:28.7916     R2:0.6911
Model saved to 2d_qsar_random_forest_model.pkl
[2D-QSAR][XGBoost]      MSE:30.2351     R2:0.6756
Model saved to 2d_qsar_xgboost_model.pkl
[2D-QSAR][Multi-layer Perceptron]       MSE:30.1715     R2:0.6763
Model saved to 2d_qsar_multi-layer_perceptron_model.pkl
---
[3D-QSAR][Stochastic Gradient Descent]  MSE:64.5768     R2:0.3072
Model saved to 3d_qsar_stochastic_gradient_descent_model.pkl
[3D-QSAR][K-Nearest Neighbors]  MSE:38.6921     R2:0.5849
Model saved to 3d_qsar_k-nearest_neighbors_model.pkl
[3D-QSAR][Decision Tree]        MSE:30.2360     R2:0.6756
Model saved to 3d_qsar_decision_tree_model.pkl
[3D-QSAR][Random Forest]        MSE:30.8310     R2:0.6692
Model saved to 3d_qsar_random_forest_model.pkl
[3D-QSAR][XGBoost]      MSE:30.2362     R2:0.6756
Model saved to 3d_qsar_xgboost_model.pkl
[3D-QSAR][Multi-layer Perceptron]       MSE:29.9844     R2:0.6783
Model saved to 3d_qsar_multi-layer_perceptron_model.pkl
---
unimol qsar
{'mse': 59.72037918598548, 'mae': 5.179289798987539, 'pearsonr': 0.638764928149331, 'spearmanr': 0.6006870492749102, 'r2': 0.35928715315601223}
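
For reading the numbers above, the following plain-Python sketch restates the standard definitions behind the reported keys (`mse`, `mae`, `pearsonr`, `spearmanr`, `r2`). It is not the code used by `qsar_1D.py` or unimol_tools, and the function names and toy values are assumptions made for illustration. It also shows why R2 can be negative: R2 = 1 - SS_res / SS_tot, so any model whose squared error exceeds that of always predicting the mean scores below zero.

```python
import math


def mean(values):
    return sum(values) / len(values)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def regression_metrics(y_true, y_pred):
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)
    y_mean = mean(y_true)
    ss_tot = sum((t - y_mean) ** 2 for t in y_true)
    return {
        "mse": ss_res / len(y_true),
        "mae": sum(abs(r) for r in residuals) / len(y_true),
        "pearsonr": pearson(y_true, y_pred),
        "spearmanr": pearson(ranks(y_true), ranks(y_pred)),  # Pearson on ranks
        "r2": 1 - ss_res / ss_tot,
    }


# Toy values (hypothetical, not from the MIC dataset).
y_true = [1.0, 2.0, 3.0, 4.0]
good = regression_metrics(y_true, [1.5, 2.0, 2.5, 5.0])
bad = regression_metrics(y_true, [4.0, 3.0, 2.0, 1.0])  # reversed order: R2 below zero
print(good)
print(bad)
```

A strongly negative R2, as in the 1D-QSAR Stochastic Gradient Descent and Multi-layer Perceptron rows, therefore means the model did worse than a constant mean prediction on the evaluation set.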

deepMD-kit

notebookLM

pytorch code

3D-QSAR tutorial

molpipeline

DOI:https://doi.org/10.1021/acs.jcim.4c00863

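MolToBinary: converts a molecule object into a binary (serialized) representation. This appears to be a serialization of the molecule itself rather than a fingerprint; for similarity calculations, use one of the fingerprint elements below.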

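MolToConcatenatedVector: concatenates several feature vectors into one, producing a richer feature representation.

MolToSmiles: converts a molecule object into a SMILES (Simplified Molecular Input Line Entry System) string. SMILES is a string format for describing molecular structure and is well suited to the standardized representation of structure data.

MolToMACCSFP: computes MACCS keys (structural key) fingerprints. This fingerprint type is a standard feature for structural similarity calculations and modeling.

MolToMorganFP: computes Morgan fingerprints (also called circular fingerprints), with a configurable number of bits and radius. These fingerprints encode molecular topology and are widely used for machine learning in cheminformatics.

MolToNetCharge: computes the net charge of a molecule. Charge information is important for understanding a molecule's chemical properties and reactivity.

Mol2PathFP: computes path-based fingerprints. These describe the structure through the bond paths in the molecule and can be used for similarity analysis and model training.

MolToInchi and MolToInchiKey: convert a molecule to InChI (International Chemical Identifier) and InChIKey. These standardized encodings are typically used as unique molecule identifiers in chemical databases.

MolToRDKitPhysChem: computes physicochemical properties of a molecule, such as molecular weight, TPSA (topological polar surface area), and the numbers of hydrogen-bond donors and acceptors. These properties are common baseline features in machine learning models.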
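
The idea behind MolToConcatenatedVector can be shown with a small plain-Python sketch: several per-molecule feature blocks (for example a fingerprint, physicochemical descriptors, and net charge) are joined into one flat vector. This is not MolPipeline's implementation; the helper `concatenate_features` and all numeric values are hypothetical and serve only as an illustration.

```python
def concatenate_features(*blocks):
    """Join several per-molecule feature blocks into one flat list of floats."""
    vector = []
    for block in blocks:
        vector.extend(float(value) for value in block)
    return vector


# Hypothetical feature blocks for a single molecule (made-up values).
fingerprint_bits = [0, 1, 1, 0, 1, 0, 0, 1]  # a tiny binary fingerprint
physchem = [741.9, 180.3, 4, 13]             # e.g. MW, TPSA, H-bond donors, acceptors
net_charge = [1]

features = concatenate_features(fingerprint_bits, physchem, net_charge)
print(len(features), features)
```

Because the blocks live on very different scales (bits versus molecular weight), scaling the concatenated vector before fitting scale-sensitive models is usually worth considering.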
