Augmenting a training dataset of the generative diffusion model for molecular docking with artificial binding pockets

Article

Data preparation

The model is trained to dock small molecules into a predefined binding pocket, so the input PDB file is expected to include only pocket residues. As a general recommendation, include all residues within 5-6 Å of any heavy atom of a known ligand (typically 15-30 residues), or a pocket of equivalent size when the binding site is defined by another method. See examples/extract_pocket.py for a basic pocket extraction script.
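
The distance rule above can be illustrated with a minimal, dependency-free sketch. This is not the repository's examples/extract_pocket.py; the function name `extract_pocket`, its arguments, and the assumption that the known ligand is stored as HETATM records in the same fixed-column PDB file are all hypothetical choices made for illustration. A residue is kept here if any of its atoms lies within the cutoff of any ligand heavy atom.

```python
import math


def _is_hydrogen(line):
    """True if the element field (columns 77-78) marks the atom as hydrogen."""
    return line[76:78].strip().upper() == "H"


def _xyz(line):
    """Read Cartesian coordinates from the fixed PDB columns 31-54."""
    return float(line[30:38]), float(line[38:46]), float(line[46:54])


def _residue_key(line):
    """Chain ID plus residue number (with insertion code) identifies a residue."""
    return line[21], line[22:27]


def extract_pocket(pdb_lines, ligand_resname, cutoff=5.0):
    """Return the ATOM lines of every residue that has at least one atom
    within `cutoff` Å of any heavy atom of the ligand `ligand_resname`.

    Assumption: the ligand is present as HETATM records in the same file.
    """
    records = [ln for ln in pdb_lines if ln.startswith(("ATOM", "HETATM"))]
    ligand = [
        _xyz(ln)
        for ln in records
        if ln.startswith("HETATM")
        and ln[17:20].strip() == ligand_resname
        and not _is_hydrogen(ln)
    ]
    protein = [ln for ln in records if ln.startswith("ATOM")]

    # Collect residues with any atom close enough to any ligand heavy atom.
    selected = {
        _residue_key(ln)
        for ln in protein
        if any(math.dist(_xyz(ln), atom) <= cutoff for atom in ligand)
    }
    return [ln for ln in protein if _residue_key(ln) in selected]
```

The returned lines can be written to a new file and passed to the model as the pocket PDB. The brute-force distance loop is adequate for a single structure; a spatial index would be a reasonable replacement for large batches.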

Usage

From source code

Install the recommended dependencies compatible with your hardware and operating system
- Git LFS
- Python >= 3.8
- PyTorch >= 2.0
- PyTorch Geometric (including torch_scatter and torch_cluster)
- reduce
Clone the repository, navigate to the cloned folder, and pull the model weights
git clone https://github.com/vtarasv/pocket-cfdm.git
cd pocket-cfdm/
git lfs pull
Install required packages
pip install -r requirements.txt
Run the inference
python predict.py --pdb my_pocket.pdb --sdf my_ligands.sdf --save_path my_ligands_docked.sdf --samples 16 --batch_size 16 --no_filter
Increasing the samples argument generates more alternative poses per docked molecule (better prediction quality at additional computational cost).
Consider decreasing batch_size if you encounter GPU memory errors.
By default, the results include only poses of acceptable quality. The no_filter flag writes all generated poses regardless of their quality.
The first run of the script takes some time because it precomputes the required data distributions and saves them to the cache.
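
For scripted runs over several pockets or ligand sets, the command line above can be assembled programmatically. The sketch below uses only the flags shown in this README; the helper names `build_predict_command` and `run_predict` are hypothetical and not part of the repository, and the script is assumed to be launched from the cloned pocket-cfdm/ folder.

```python
import subprocess
import sys


def build_predict_command(pdb, sdf, save_path, samples=16, batch_size=16,
                          no_filter=False):
    """Assemble the predict.py argument list from the flags documented above."""
    cmd = [
        sys.executable, "predict.py",
        "--pdb", str(pdb),
        "--sdf", str(sdf),
        "--save_path", str(save_path),
        "--samples", str(samples),
        "--batch_size", str(batch_size),
    ]
    if no_filter:
        # Write every generated pose, not only those of acceptable quality.
        cmd.append("--no_filter")
    return cmd


def run_predict(**kwargs):
    """Run predict.py and raise CalledProcessError if it exits with an error."""
    return subprocess.run(build_predict_command(**kwargs), check=True)
```

For example, `run_predict(pdb="my_pocket.pdb", sdf="my_ligands.sdf", save_path="my_ligands_docked.sdf", batch_size=8)` runs the same inference as the command above with a smaller batch size and with the default quality filter left on.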

Docker image

Pull the docker image
docker pull vtarasv/pocket-cfdm
Run the inference code using docker
docker run -it --rm --gpus all -v '/home/':'/home/' vtarasv/pocket-cfdm -m predict --pdb /home/user/temp/my_pocket.pdb --sdf /home/user/temp/my_ligands.sdf --save_path /home/user/temp/my_ligands_docked.sdf

