Computational-Drug-Discovery-Neural-Network

This GitHub repository contains code for a computational drug discovery project utilizing neural networks. The goal is to explore bioactivity data for a herpesvirus target protein (the data files are named for the herpesvirus 5 capsid protein P40) and to classify compounds as active, inactive, or intermediate based on their bioactivity values. Molecular descriptors are also calculated to support the analysis.

Data Collection

The data is collected from the ChEMBL database using the ChEMBL web service package. The herpesvirus target protein is searched for in ChEMBL, and its bioactivity data is retrieved.
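The retrieval step might look roughly like the sketch below. It assumes the "ChEMBL web service package" is `chembl_webresource_client`, that IC50 is the activity type of interest, and that only three fields are needed downstream. The helper names (`search_targets`, `fetch_ic50_records`, `keep_fields`) and the query string are hypothetical, not taken from the notebook.

```python
# Fields assumed to be needed later in the workflow (an assumption, not from the notebook).
KEEP_FIELDS = ("molecule_chembl_id", "canonical_smiles", "standard_value")


def search_targets(query):
    """Return ChEMBL target records matching a free-text query."""
    # Imported lazily because loading the client may contact the ChEMBL API.
    from chembl_webresource_client.new_client import new_client
    return list(new_client.target.search(query))


def fetch_ic50_records(target_chembl_id):
    """Return IC50 activity records for one ChEMBL target ID."""
    from chembl_webresource_client.new_client import new_client
    activities = new_client.activity.filter(
        target_chembl_id=target_chembl_id
    ).filter(standard_type="IC50")
    return list(activities)


def keep_fields(records, fields=KEEP_FIELDS):
    """Reduce each activity record to the selected fields (missing ones become None)."""
    return [{name: record.get(name) for name in fields} for record in records]


# Usage (requires network access; the query string is illustrative):
# targets = search_targets("herpesvirus")
# records = fetch_ic50_records(targets[0]["target_chembl_id"])
# rows = keep_fields(records)
```

Which search hit is the right target has to be checked by hand; taking the first result, as in the commented usage, is only a placeholder.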

Handling Missing Data

Any compounds with a missing value in the standard_value column are dropped from the dataset.
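A minimal sketch of this step, assuming the data is held in a pandas DataFrame. The compound IDs and values below are made up for illustration.

```python
import pandas as pd

# Toy table standing in for the retrieved bioactivity data (made-up IDs and values).
df = pd.DataFrame({
    "molecule_chembl_id": ["CHEMBL_A", "CHEMBL_B", "CHEMBL_C"],
    "standard_value": [120.0, None, 15000.0],
})

# Drop rows where standard_value is missing, then renumber the index.
clean = df.dropna(subset=["standard_value"]).reset_index(drop=True)
```

Restricting `dropna` to `subset=["standard_value"]` keeps rows that are incomplete in other columns only.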

Data Preprocessing

The bioactivity data is preprocessed, and each compound is labeled as active, inactive, or intermediate based on its IC50 value.
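The labeling rule can be sketched as a simple threshold function. The cutoffs used here (1,000 nM and 10,000 nM) are an assumption chosen for illustration; this README does not state the thresholds the notebook uses.

```python
def bioactivity_class(ic50_nm, active_max=1000.0, inactive_min=10000.0):
    """Label a compound from its IC50 in nM.

    The default thresholds are illustrative assumptions, not values
    confirmed by this repository.
    """
    if ic50_nm <= active_max:
        return "active"
    if ic50_nm >= inactive_min:
        return "inactive"
    return "intermediate"


# Example on made-up IC50 values (nM).
labels = [bioactivity_class(v) for v in (50.0, 5000.0, 20000.0)]
```

With these defaults, a lower IC50 (a more potent compound) maps to "active", and values between the two cutoffs map to "intermediate".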

Calculate Lipinski's Descriptors

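Lipinski's descriptors (molecular weight, LogP, and the numbers of hydrogen-bond donors and acceptors) are calculated for the compounds. These molecular properties are widely used in drug discovery and medicinal chemistry to assess drug-likeness.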
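One way to compute these four properties is with RDKit, as sketched below. RDKit is an assumption here, since the README does not name the library used, and `lipinski_descriptors` is a hypothetical helper.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors, Lipinski


def lipinski_descriptors(smiles):
    """Compute the four Lipinski properties for one SMILES string."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    return {
        "MW": Descriptors.MolWt(mol),                 # molecular weight
        "LogP": Descriptors.MolLogP(mol),             # estimated octanol-water partition coefficient
        "NumHDonors": Lipinski.NumHDonors(mol),       # hydrogen-bond donors
        "NumHAcceptors": Lipinski.NumHAcceptors(mol), # hydrogen-bond acceptors
    }


# Ethanol as a small, easily checked example.
ethanol = lipinski_descriptors("CCO")
```

Applied to every `canonical_smiles` value, the resulting dictionaries can be collected into a table and joined to the bioactivity data.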

Convert IC50 to pIC50

IC50 values are converted to pIC50, the negative base-10 logarithm of the IC50 expressed in molar units (pIC50 = -log10(IC50)), so that the values are spread more evenly and are easier to analyze.
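As a worked example of the conversion, the sketch below assumes the IC50 values are reported in nanomolar (nM). If the data uses other units, the offset changes accordingly. The function name `pic50_from_nm` is hypothetical.

```python
import math


def pic50_from_nm(ic50_nm):
    """Convert an IC50 given in nM to pIC50.

    pIC50 = -log10(IC50 in mol/L). With IC50 in nM this equals
    9 - log10(IC50 in nM), because 1 nM = 1e-9 mol/L.
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive to take a logarithm")
    return 9.0 - math.log10(ic50_nm)


# Made-up IC50 values (nM): smaller IC50 gives larger pIC50.
pic50_values = [pic50_from_nm(v) for v in (1.0, 1000.0, 10000.0)]
```

Here 1 nM maps to 9, 1,000 nM to 6, and 10,000 nM to 5, so a range spanning four orders of magnitude collapses to a few units on the pIC50 scale.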

Exploratory Data Analysis

Frequency plots, scatter plots, and box plots are used to explore the distribution of bioactivity classes and molecular properties.

Statistical Analysis

The Mann-Whitney U test is performed to assess whether there is a significant difference between the distributions of active and inactive compounds for various molecular properties.
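A minimal sketch of the comparison using `scipy.stats.mannwhitneyu`. The two samples are made-up molecular weight values, the 0.05 significance level is an assumption, and `compare_groups` is a hypothetical helper.

```python
from scipy.stats import mannwhitneyu


def compare_groups(active, inactive, alpha=0.05):
    """Two-sided Mann-Whitney U test between two samples of one property."""
    statistic, p_value = mannwhitneyu(active, inactive, alternative="two-sided")
    return {
        "statistic": float(statistic),
        "p_value": float(p_value),
        "significant": bool(p_value < alpha),  # alpha is an illustrative choice
    }


# Made-up molecular weights for the two classes (not from the dataset).
active_mw = [412.5, 398.1, 455.0, 430.7, 421.3]
inactive_mw = [280.4, 301.9, 265.2, 310.6, 295.8]
result = compare_groups(active_mw, inactive_mw)
```

The same function can be called once per property (for example pIC50, MW, LogP, and the donor and acceptor counts) to produce one result row each.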

Descriptor Calculation and Dataset Preparation

PaDEL-Descriptor software is used to calculate molecular descriptors and prepare the dataset for further analysis.
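Descriptor software of this kind typically reads structures from a SMILES file. The sketch below writes such a file, assuming one compound per line with the SMILES string and the compound ID separated by a tab. The exact input layout and the PaDEL-Descriptor invocation used in the notebook are not shown in this README, and the helper names are hypothetical.

```python
def smi_lines(records):
    """Build 'SMILES<TAB>ID' lines, skipping records without a structure."""
    lines = []
    for record in records:
        smiles = record.get("canonical_smiles")
        if smiles:
            lines.append(f"{smiles}\t{record['molecule_chembl_id']}")
    return lines


def write_smi(records, path):
    """Write the records to a .smi file, one compound per line."""
    with open(path, "w") as handle:
        for line in smi_lines(records):
            handle.write(line + "\n")


# Made-up records for illustration.
example_records = [
    {"molecule_chembl_id": "CHEMBL_A", "canonical_smiles": "CCO"},
    {"molecule_chembl_id": "CHEMBL_B", "canonical_smiles": None},
    {"molecule_chembl_id": "CHEMBL_C", "canonical_smiles": "c1ccccc1"},
]
example_lines = smi_lines(example_records)
```

Calling `write_smi(example_records, "molecules.smi")` would produce a file like the `molecules.smi` listed in this repository, though its actual contents may be laid out differently.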

Feel free to explore the code and datasets in this repository to understand the drug discovery process for the herpesvirus target protein. If you have any questions or suggestions, please open an issue or contribute to the project. Happy drug discovery!

Folders and files

Exploratory Data Anlysis
Mann-Whitney results
Drug-Discovery.ipynb
README.md
bioactivity_data.csv
bioactivity_preprocessed_data.csv
descriptors.csv
herpesvirus5_capsid_protein_P40_bioactivity_data_3class_pIC50.csv
herpesvirus5_capsid_protein_P40_bioactivity_data_3class_pIC50_pubchem_fp.csv
molecules.smi

BehRoooz/Computational-Drug-Discovery-Neural-Network
