An ML classification model built to find potential ligand molecules for the pantothenate synthetase protein, with the aim of inhibiting Mycobacterium tuberculosis replication
- Pantothenate synthetase is an enzyme found in Mycobacterium tuberculosis. This enzyme plays a crucial role in the biosynthesis of coenzyme A (CoA), a molecule essential for multiple metabolic pathways, including the synthesis and degradation of fatty acids. Inhibition of pantothenate synthetase can disrupt the biosynthesis of CoA, leading to the impairment of essential metabolic processes in M. tuberculosis.
- This makes pantothenate synthetase an attractive target for the development of novel anti-tuberculosis drugs. Inhibitors of pantothenate synthetase have the potential to selectively target the bacteria without harming the host cells.
- Building a machine learning classification model and screening DrugBank
- Virtual screening of potentially active molecules using Maestro
- Performing MD simulation of selected molecules using GROMACS
- Collecting pantothenate synthetase bioactivity data from ChEMBL
- Selecting data on the basis of IC50 values
- Pre-processing data (handling missing values, removing invalid SMILES, converting SMILES to canonical SMILES)
- Dividing molecules into active and inactive on the basis of IC50 values (IC50 <= 1000 nM: active; IC50 >= 10000 nM: inactive)
- Generating Morgan fingerprints using RDKit
- Applying PCA to reduce the dimensionality of the data
- Training various machine learning classification models and selecting the best ones
- Screening DrugBank molecules using the trained models to get potentially active molecules
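
The activity-labelling rule in the steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's actual code: the function name `label_by_ic50` and the record layout are hypothetical, the example records are made up, and dropping molecules in the intermediate range (between 1000 nM and 10000 nM) is an assumption, since the steps do not say how those are handled.

```python
from typing import Optional

ACTIVE_MAX_NM = 1000.0      # IC50 <= 1000 nM  -> active
INACTIVE_MIN_NM = 10000.0   # IC50 >= 10000 nM -> inactive


def label_by_ic50(ic50_nm: Optional[float]) -> Optional[str]:
    """Return 'active', 'inactive', or None for missing/intermediate values."""
    if ic50_nm is None:
        return None  # missing measurement
    if ic50_nm <= ACTIVE_MAX_NM:
        return "active"
    if ic50_nm >= INACTIVE_MIN_NM:
        return "inactive"
    return None  # assumption: intermediate (or NaN) values stay unlabelled


# Hypothetical (SMILES, IC50 in nM) records; not real ChEMBL data.
records = [("CCO", 250.0), ("CCN", 5000.0), ("CCC", 20000.0), ("CCCl", None)]

# Keep only the molecules that received a label.
labelled = [
    (smiles, label_by_ic50(ic50))
    for smiles, ic50 in records
    if label_by_ic50(ic50) is not None
]
print(labelled)
```

Leaving a gap between the two thresholds keeps borderline compounds out of both classes, which tends to give the classifier a cleaner separation at the cost of a smaller training set.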
- Result: After screening, we obtained 103 potentially active molecules, and virtual screening was then performed on these molecules.
- Learning: In this project I learned how to use machine learning algorithms when very little data is available.
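
The modelling steps listed above (PCA, training several classifiers, selecting the best one, then screening a library) could look roughly like the following scikit-learn sketch. Everything specific here is an assumption made for illustration: the two classifiers, the number of PCA components, the balanced-accuracy metric, and the random bit vectors that stand in for real Morgan fingerprints. Stratified cross-validation is shown because it is one common way to compare models when only a small labelled set is available.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for fingerprints: 60 molecules x 256 bits (not real data).
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(60, 256)).astype(float)
y = np.array([0, 1] * 30)  # 0 = inactive, 1 = active (balanced toy labels)

# Hypothetical candidate models; the project's actual choices are not stated.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Stratified folds keep the active/inactive ratio stable in each split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

scores = {}
for name, clf in candidates.items():
    # PCA sits inside the pipeline so it is refit on each training fold only.
    model = make_pipeline(PCA(n_components=10, random_state=0), clf)
    fold_scores = cross_val_score(model, X, y, cv=cv, scoring="balanced_accuracy")
    scores[name] = fold_scores.mean()

# Refit the best-scoring pipeline on all labelled data.
best_name = max(scores, key=scores.get)
best_model = make_pipeline(PCA(n_components=10, random_state=0), candidates[best_name])
best_model.fit(X, y)

# Screen a synthetic "library" of 5 molecules with the selected model.
X_library = rng.integers(0, 2, size=(5, 256)).astype(float)
predicted = best_model.predict(X_library)
print(best_name, predicted)
```

Placing PCA inside the pipeline avoids leaking information from the validation folds into the dimensionality reduction, which matters more as the dataset gets smaller.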