Use PyConforMap to generate a simple scatter plot to map conformational landscapes of intrinsically disordered proteins, and quantify conformational diversity.

hshadman/2d_conformational_landscape_map

DOI

PyConforMap: Draw pretty maps of your polymer or disordered protein conformational ensembles!

This repository provides an easy-to-use Python module called PyConforMap that generates scatter plots of instantaneous shape ratio (Rs) against relative radius of gyration (Rg/Rgmean).

PLEASE READ ALL DOCUMENTATION

There are two main metrics: the relative radius of gyration (Rg/Rgmean) and the instantaneous shape ratio (Rs). Rs is computed as Rs = Ree^2/Rg^2, where Ree and Rg are the instantaneous end-to-end distance and the instantaneous radius of gyration, respectively.

Rg/Rgmean is a measure of the (relative) size of a protein or polymer chain, and Rs is a measure of its shape. Rs is expected to be low (~2 or lower) for compact structures and high (~12 or higher) for highly extended structures. Together, a single Rg/Rgmean value and its corresponding Rs value define an instantaneous conformation of a polymer. When all the Rg/Rgmean and Rs values of a polymer are plotted together, they constitute what we call a 2D map of the conformational landscape of that polymer.
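As a rough illustration of the two metrics defined above, the sketch below computes (Rg/Rgmean, Rs) for a handful of made-up snapshots. The function name `conformation_coordinates` is hypothetical and is not part of the module. The sketch also assumes that Rgmean is the arithmetic mean of the instantaneous Rg values over the ensemble; see 'code_input_output.md' for how the module itself defines it.

```python
import numpy as np


def conformation_coordinates(rg2, ree2):
    """Return (Rg/Rgmean, Rs) for arrays of squared Rg and squared Ree."""
    rg2 = np.asarray(rg2, dtype=float)
    ree2 = np.asarray(ree2, dtype=float)
    rg = np.sqrt(rg2)
    # Assumption: Rgmean is the plain average of instantaneous Rg values.
    relative_rg = rg / rg.mean()
    # Instantaneous shape ratio, Rs = Ree^2 / Rg^2.
    rs = ree2 / rg2
    return relative_rg, rs


# Three made-up snapshots: compact, intermediate, and extended.
rg2 = [4.0, 9.0, 16.0]
ree2 = [8.0, 54.0, 192.0]
relative_rg, rs = conformation_coordinates(rg2, ree2)
print(relative_rg)
print(rs)
```

In this toy input the three snapshots have Rs values of 2, 6 and 12, spanning the compact-to-extended range described above.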

The PyConforMap Module Generates Two-Dimensional Scatter Plots

This module generates 2D scatter plots of Rs against Rg/Rgmean for two data sets: a protein/polymer simulation (data and protein label/identity provided by the user) and a Gaussian Walk (GW) polymer model simulation (data for 720000 snapshots of a GW model of length 100 are included with the repository, in the file 'GW_chainlen100.csv'). Each point on the scatter plot, whether it belongs to the GW or to the given protein/polymer, represents one conformation snapshot and has coordinates (Rg/Rgmean, Rs).

The GW model is intended to be a reference model: its conformational landscape map, i.e. the set of all its (Rg/Rgmean, Rs) points, serves as a 'universal' map against which the maps of other proteins/polymers are compared. From the 2D scatter plot, the module automatically calculates fC, the fraction of the GW points that are 'close' (i.e. within a pre-defined radius) to at least one protein/polymer point. fC represents the conformational diversity of the protein/polymer provided, and can be used to rank the conformational diversities of different proteins/polymers.

The Python module can also be used to run a new GW simulation with a different chain length and number of snapshots, should the user wish to do so.

On the scatter plot, it is important that the protein/polymer points do not significantly exceed the boundaries defined by the reference (GW) points. Most of the protein/polymer points should be 'close' (i.e. within a pre-defined radius) to at least one GW point.
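To make the fC definition more concrete, here is a minimal sketch of one way such a quantity could be computed. It is not the module's own code. The function names (`simulate_gw`, `map_points`, `fraction_covered`) are hypothetical, and so is the radius value of 0.1. The sketch also assumes that 'close' means plain Euclidean distance in the (Rg/Rgmean, Rs) plane and that chain length counts beads rather than bonds. A nearest-neighbour lookup with scipy's `cKDTree` stands in for whatever search the module actually uses.

```python
import numpy as np
from scipy.spatial import cKDTree


def simulate_gw(n_snapshots, n_beads, seed=0):
    """Return (Rg^2, Ree^2) arrays for independent 3D Gaussian walks."""
    rng = np.random.default_rng(seed)
    # Each bond vector is drawn from a standard normal distribution.
    steps = rng.normal(size=(n_snapshots, n_beads - 1, 3))
    # Bead positions: first bead at the origin, then cumulative bond sums.
    origin = np.zeros((n_snapshots, 1, 3))
    positions = np.concatenate([origin, np.cumsum(steps, axis=1)], axis=1)
    centered = positions - positions.mean(axis=1, keepdims=True)
    rg2 = (centered ** 2).sum(axis=2).mean(axis=1)
    ree2 = ((positions[:, -1] - positions[:, 0]) ** 2).sum(axis=1)
    return rg2, ree2


def map_points(rg2, ree2):
    """Stack (Rg/Rgmean, Rs) coordinates into an (n, 2) array."""
    rg = np.sqrt(rg2)
    return np.column_stack([rg / rg.mean(), ree2 / rg2])


def fraction_covered(gw_points, protein_points, radius):
    """Fraction of GW points within `radius` of at least one protein point."""
    tree = cKDTree(protein_points)
    nearest_distance, _ = tree.query(gw_points, k=1)
    return float(np.mean(nearest_distance <= radius))


gw = map_points(*simulate_gw(n_snapshots=2000, n_beads=100, seed=1))
# Stand-in "protein": a second, much smaller GW ensemble (illustrative only).
protein = map_points(*simulate_gw(n_snapshots=200, n_beads=100, seed=2))
f_c = fraction_covered(gw, protein, radius=0.1)  # hypothetical radius
print(f"fC = {f_c:.3f}")
```

Because Rs and Rg/Rgmean span different numerical ranges, the choice of radius (and whether the axes are rescaled before measuring distance) strongly affects the result, so fC values are only comparable when computed with the same settings.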

The Module Code Requires One Input File

The required input is a CSV file (for a given protein/polymer simulation) with two columns. The first column contains Rg^2 values and the second column contains Ree^2 values. In this user-provided file, each row represents one protein/polymer conformation snapshot from the simulation. An example input is the 'example_protein.csv' file (included with the repository).
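The two-column layout described above could be read and sanity-checked along the following lines. This is only a sketch: `load_snapshots` is a hypothetical helper, not a function of the module, and the column names in the sample text are invented. Whether a real input file carries a header row is not stated here, so the helper takes that as an explicit flag and reads the columns by position rather than by name.

```python
import io

import pandas as pd


def load_snapshots(source, has_header=True):
    """Read Rg^2 (first column) and Ree^2 (second column) from a CSV source."""
    table = pd.read_csv(source, header=0 if has_header else None)
    if table.shape[1] != 2:
        raise ValueError(f"expected 2 columns, found {table.shape[1]}")
    # Columns are taken by position, so their names do not matter.
    rg2 = table.iloc[:, 0].to_numpy(dtype=float)
    ree2 = table.iloc[:, 1].to_numpy(dtype=float)
    if (rg2 <= 0).any() or (ree2 < 0).any():
        raise ValueError("Rg^2 must be positive and Ree^2 non-negative")
    return rg2, ree2


# In-memory stand-in for a user file; the header names are made up.
sample = io.StringIO("Rg2,Ree2\n4.0,8.0\n9.0,54.0\n")
rg2, ree2 = load_snapshots(sample)
print(rg2, ree2)
```

To read a real file, pass its path (for example 'example_protein.csv') in place of the in-memory sample, setting `has_header` to match the file.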

Files Included with Repository

The 'code_input_output.md' file provides technical details (input arguments, expected outputs) of the module. The 'pyconformap.py' file contains the source code for the module. The 'illustrated_example.ipynb' jupyter notebook file shows examples to illustrate implementation of the code. The 'GW_chainlen100.csv' is the reference GW simulation and 'example_protein.csv' is the simulation of an example protein.

Packages Required

The module requires the pandas, numpy, matplotlib, scipy and more_itertools Python packages, along with the itertools and random modules, which are part of the Python standard library. All of them are imported automatically when the 'pyconformap.py' file is executed, as shown in the illustrated examples.

Publication

PyConforMap is a companion to this publication.

How to Cite

If you use this module, please cite us using the provided DOI.

Contact Information

If you have comments, suggestions or a bug report, please feel free to email me at hossain.shadman17@gmail.com, or contact me through the social media links provided on the home page.