DockFlow

DIEGO ENRY GOMES edited this page Dec 5, 2017 · 18 revisions

Description

DockFlow aims to make docking campaigns and virtual screening easy tasks, accessible to everyone. Its main advantage is that it organizes results properly and manages errors, so that the user's experience is as productive as possible.
It is also fully integrated into the ChemFlow environment, which makes it efficient to rescore docking poses with other scoring functions, run MD simulations, or automatically produce tables and graphs for a thorough analysis.

Usage

Input files

This version of DockFlow implements PLANTS to run the docking experiments. For this reason, it is only able to read mol2 files.
DockFlow requires 4 things from the user :

  • A directory that will contain all the files and folders mentioned below : the run folder.
  • A directory containing the receptor in a mol2 file.
  • A directory containing the ligands in mol2 files.
    ℹ️ Each mol2 file can contain one or several ligands. For maximum performance when running in parallel, the ligands should be distributed evenly across the files.
  • A configuration file : DockFlow.config.
    ℹ️ The configuration file should either be generated by running ConfigFlow, which guides you through a graphical interface, or copied from the $CHEMFLOW_HOME/config_files directory.
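To check how evenly the ligands are spread over the input files before launching a parallel run, one option is to count the molecule records in each mol2 file. The following is a minimal sketch and not part of DockFlow: the function names are hypothetical, and it assumes every ligand starts with the standard `@<TRIPOS>MOLECULE` record on its own line.

```python
from pathlib import Path
from typing import Dict

MOL2_RECORD = "@<TRIPOS>MOLECULE"  # header that opens each molecule in a mol2 file


def count_ligands(mol2_text: str) -> int:
    """Count the molecule records found in the text of a mol2 file."""
    return sum(1 for line in mol2_text.splitlines()
               if line.strip() == MOL2_RECORD)


def ligands_per_file(ligand_dir: str) -> Dict[str, int]:
    """Map each *.mol2 file name in ligand_dir to its number of ligands."""
    return {
        path.name: count_ligands(path.read_text())
        for path in sorted(Path(ligand_dir).glob("*.mol2"))
    }
```

Calling `ligands_per_file` on the ligand directory returns one count per file; a large gap between the smallest and largest count suggests the library should be re-split.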

Configuration file : DockFlow.config

  1. Mandatory parameters

Start by stating the absolute paths to your receptor file and ligand directory.
PLANTS defines the binding site as a sphere, so you will need to provide the x, y and z coordinates of the center of the sphere as well as its radius.
Since DockFlow's search algorithm and scoring function are fast, we recommend asking for at least 25 docking poses per ligand.
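If a reference ligand (for example a co-crystallized one) is available as a mol2 file, a sphere center and radius can be estimated from its atom coordinates. The sketch below is one possible approach, not something DockFlow does for you: the function names are hypothetical, the 5 Å padding is an arbitrary assumption to adjust for your system, and the parser only reads the x, y, z columns of the `@<TRIPOS>ATOM` section.

```python
import math
from typing import List, Tuple

Coord = Tuple[float, float, float]


def read_mol2_coords(mol2_text: str) -> List[Coord]:
    """Return the x, y, z coordinates listed in the @<TRIPOS>ATOM section(s)."""
    coords = []
    in_atoms = False
    for line in mol2_text.splitlines():
        if line.startswith("@<TRIPOS>"):
            # Any record header switches the ATOM section on or off.
            in_atoms = line.strip() == "@<TRIPOS>ATOM"
            continue
        fields = line.split()
        if in_atoms and len(fields) >= 6:
            # mol2 atom line: atom_id atom_name x y z atom_type ...
            coords.append((float(fields[2]), float(fields[3]), float(fields[4])))
    return coords


def bounding_sphere(coords: List[Coord], padding: float = 5.0) -> Tuple[Coord, float]:
    """Centroid of the atoms, and the largest atom-to-centroid distance plus padding."""
    if not coords:
        raise ValueError("no atoms found")
    n = len(coords)
    center = tuple(sum(c[i] for c in coords) / n for i in range(3))
    radius = max(math.dist(center, c) for c in coords) + padding
    return center, radius
```

The returned center and radius are the four numbers to enter for the binding site in the configuration file.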

Scoring function :

Name      Description
plp95     Piecewise Linear Potential (PLP) from Gehlhaar DK et al.
plp       PLANTS version of the PLP
chemplp   PLANTS version of the PLP implementing some of GOLD's terms

  2. Optional parameters

You can provide other parameters for PLANTS, such as :

  • docking with a structural water molecule
  • or any other specification supported by PLANTS.

Finally, and most importantly, you can choose how to run the experiment :

  • local : run on the current computer in serial,
  • parallel : run locally using GNU parallel for a more efficient use of your computer resources,
  • mazinger : run on a compute cluster equipped with PBS.

You can also override the settings of the current DockFlow.config file directly from the command line interface.
To list the available options, run DockFlow -h; for more extensive help, run DockFlow -hh.

Running

Once all of the above requirements are met, you can launch DockFlow from the run folder :

DockFlow

Results

All of DockFlow's results are located inside the "docking" directory. They consist essentially of 4 types of files :

  1. Docking poses
    Each mol2 file given as input, located inside your ligand directory, will lead to the creation of a directory of the same name inside "docking". Each of these directories will contain the docking poses generated by PLANTS (1 mol2 file per pose).
  2. Log files
    Each of these directories contains a "plants.job" file, which stores information, warnings and errors from PLANTS for the user.
  3. Docking score and rank
    All the docking scores are appended to the "ranking.csv" file inside "docking".
    A copy of this CSV table, sorted by score, is available in "ranking_sorted.csv".
    A complete decomposition of the energy into the terms of the PLANTS scoring function is available in "features.csv".
  4. Binding site residues
    Information on the residues present in the binding site (defined by the user in the configuration file) is available in "protein_binding_site_fixed.mol2".
    This file can be used to run a protein-ligand interaction fingerprint analysis with PyPLIF if needed (not included in ChemFlow for now).
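For post-processing, the ranking table can be reduced to the best pose of each ligand. The sketch below is illustrative only: the layout of "ranking.csv" is not described on this page, so the column names are passed in as arguments, and the example assumes a comma-separated file with a header row and a score where lower values are more favourable. Check these assumptions against your own output before relying on it.

```python
import csv
import io
from typing import Dict, Iterable, List


def load_rows(csv_text: str) -> List[dict]:
    """Parse comma-separated text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def best_pose_per_ligand(rows: Iterable[dict],
                         ligand_col: str,
                         score_col: str) -> Dict[str, dict]:
    """Keep, for each ligand, the row with the lowest score.

    Assumes lower scores are better; ligand_col and score_col are the
    (hypothetical) header names used in your CSV file.
    """
    best: Dict[str, dict] = {}
    for row in rows:
        ligand = row[ligand_col]
        score = float(row[score_col])
        if ligand not in best or score < float(best[ligand][score_col]):
            best[ligand] = row
    return best
```

With a file on disk, `load_rows(open(path).read())` supplies the rows, using whatever path and header names your run produced.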

FAQ

I have a chemical library available as a single mol2 or sdf file. Do you have any tool to split it into several mol2 files before running DockFlow on a cluster ?

  • splitmol can split any sdf, smi or mol2 file into smaller files. It uses Open Babel to perform the splitting.
    Run splitmol -h for more information.
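To illustrate what an even split means for a multi-ligand mol2 file, here is a minimal pure-Python sketch. It is not how splitmol works (splitmol relies on Open Babel); the function names are hypothetical, and the code assumes each ligand begins with an `@<TRIPOS>MOLECULE` record on its own line. Any text before the first record is dropped.

```python
from typing import List

MOL2_RECORD = "@<TRIPOS>MOLECULE"  # header that opens each molecule in a mol2 file


def split_molecules(mol2_text: str) -> List[str]:
    """Split mol2 text into one string per molecule record."""
    molecules: List[str] = []
    current: List[str] = []
    for line in mol2_text.splitlines(keepends=True):
        is_header = line.strip() == MOL2_RECORD
        if is_header and current:
            # A new record starts: close the molecule collected so far.
            molecules.append("".join(current))
            current = []
        if current or is_header:
            current.append(line)
    if current:
        molecules.append("".join(current))
    return molecules


def chunk_evenly(items: List[str], n_chunks: int) -> List[List[str]]:
    """Distribute items over n_chunks lists whose sizes differ by at most one."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    base, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks
```

Joining the strings of each chunk and writing them to separate mol2 files in the ligand directory gives one input file per parallel job.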