Predicting the effect of mutations on protein folding and protein-protein interaction.
ELASPIC2 has been integrated into the original ELASPIC web server, available at: http://elaspic.kimlab.org.
The following notebooks can be used to explore the basic functionality of ELASPIC2.
See other notebooks in the notebooks/ directory for more detailed information about how ELASPIC2 models are trained and validated.
ELASPIC2 is accessible through a REST API, documented at: http://elaspic.kimlab.org/api/v2/docs.
The following code snippet shows how the REST API can be used from within Python.
import json
import time
import requests
ELASPIC2_JOBS_API = "http://elaspic.kimlab.org/api/v2/jobs/"
mutation_info = {
"protein_structure_url": "https://files.rcsb.org/download/1MFG.pdb",
"protein_sequence": (
"GSMEIRVRVEKDPELGFSISGGVGGRGNPFRPDDDGIFVTRVQPEGPASKLLQPGDKIIQANGYSFINI"
"EHGQAVSLLKTFQNTVELIIVREVSS"
),
"mutations": "G1A,G1C",
"ligand_sequence": "EYLGLDVPV",
}
# Submit a job
job_request = requests.post(ELASPIC2_JOBS_API, json=mutation_info).json()
while True:
    # Wait for the job to finish
    time.sleep(10)
    job_status = requests.get(job_request["web_url"]).json()
    if job_status["status"] in ["error", "success"]:
        break
# Collect results
job_result = requests.get(job_status["web_url"]).json()
# Delete job (optional)
requests.delete(job_request["web_url"]).raise_for_status()
# Show results
print(job_result)
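The polling loop above runs until the job reports "error" or "success", so it never returns if the job stalls. A minimal sketch of a bounded wait follows. It assumes only the "status" field and the two terminal values used in the snippet; the helper name `wait_for_job` and its `fetch`, `poll_interval` and `timeout` parameters are hypothetical and not part of the ELASPIC2 API.

```python
import time
from typing import Callable, Optional


def _default_fetch(url: str) -> dict:
    """GET `url` and decode the JSON body, raising on HTTP errors."""
    # Imported here so the helper can be exercised offline with a stub `fetch`.
    import requests

    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.json()


def wait_for_job(
    job_url: str,
    poll_interval: float = 10.0,
    timeout: float = 3600.0,
    fetch: Optional[Callable[[str], dict]] = None,
) -> dict:
    """Poll `job_url` until the job status is "error" or "success".

    Hypothetical helper: `fetch` lets callers swap in their own HTTP client.
    Raises TimeoutError if no terminal status is seen within `timeout` seconds.
    """
    if fetch is None:
        fetch = _default_fetch

    deadline = time.monotonic() + timeout
    while True:
        job_status = fetch(job_url)
        # Terminal status values are taken from the snippet above.
        if job_status["status"] in ("error", "success"):
            return job_status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job at {job_url} did not finish within {timeout} s")
        time.sleep(poll_interval)
```

With this helper, the `while True` loop can be replaced by `job_status = wait_for_job(job_request["web_url"])`.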
Finally, ELASPIC2 can be used through a command-line interface.
python -m elaspic2 \
--protein-structure tests/structures/1MFG.pdb \
--protein-sequence GSMEIRVRVEKDPELGFSISGGVGGRGNPFRPDDDGIFVTRVQPEGPASKLLQPGDKIIQANGYSFINIEHGQAVSLLKTFQNTVELIIVREVSS \
--ligand-sequence EYLGLDVPV \
--mutations G1A.G1C
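Both the REST payload and the command-line call take mutations written as wild-type residue, position, mutant residue (for example G1A). The sketch below checks such a string against the protein sequence before a job is submitted. It is an illustration only: it assumes that positions are 1-based indices into the protein sequence (consistent with G1A and a sequence that starts with G, but not confirmed here), it accepts both the "," and "." separators that appear in the examples, and the function name `check_mutations` is hypothetical.

```python
import re

# Assumed mutation format: one-letter wild type, 1-based position, one-letter mutant.
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def check_mutations(sequence: str, mutations: str) -> list:
    """Split `mutations` on "," or "." and validate each entry against `sequence`.

    Returns a list of (wild_type, position, mutant) tuples.
    Raises ValueError for malformed entries or wild-type mismatches.
    """
    checked = []
    for mutation in re.split(r"[.,]", mutations):
        match = MUTATION_RE.match(mutation.strip())
        if match is None:
            raise ValueError(f"Cannot parse mutation {mutation!r}")
        wild_type = match.group(1)
        position = int(match.group(2))
        mutant = match.group(3)
        if not 1 <= position <= len(sequence):
            raise ValueError(f"Position {position} is outside the sequence")
        # Assumption: positions index the protein sequence starting at 1.
        if sequence[position - 1] != wild_type:
            raise ValueError(
                f"{mutation!r} expects {wild_type} at position {position}, "
                f"found {sequence[position - 1]}"
            )
        checked.append((wild_type, position, mutant))
    return checked
```

For example, `check_mutations(mutation_info["protein_sequence"], mutation_info["mutations"])` would raise early if a wild-type residue did not match the sequence.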
Docker images that contain ELASPIC2 and all dependencies are available at: https://gitlab.com/elaspic/elaspic2/container_registry.
Conda-pack tarballs containing ELASPIC2 and all dependencies are available at: http://conda-envs.proteinsolver.org/elaspic2/.
Simply download and extract the tarball into a desired directory and run conda-unpack to unpack.
wget http://conda-envs.proteinsolver.org/elaspic2/elaspic2-latest.tar.gz
mkdir ~/elaspic2
tar -xzf elaspic2-latest.tar.gz -C ~/elaspic2
source ~/elaspic2/bin/activate
conda-unpack
ELASPIC2 can be installed using conda. However, the torch-geometric dependencies have to be installed separately.
Replace cudatoolkit=10.1 and cu101 with the desired CUDA version.
conda create -n elaspic2 -c pytorch -c ostrokach-forge -c conda-forge -c defaults elaspic2 "cudatoolkit=10.1"
conda activate elaspic2
pip install "torch-scatter==latest+cu101" -f https://pytorch-geometric.com/whl/torch-1.7.0.html
pip install "torch-sparse==latest+cu101" -f https://pytorch-geometric.com/whl/torch-1.7.0.html
pip install "torch-cluster==latest+cu101" -f https://pytorch-geometric.com/whl/torch-1.7.0.html
pip install "torch-spline-conv==latest+cu101" -f https://pytorch-geometric.com/whl/torch-1.7.0.html
pip install "torch-geometric==1.6.1"
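Because the CUDA version appears in two forms in the commands above (10.1 and cu101), it is easy to update one and miss the other. The sketch below derives both from a single version string and prints the same command sequence. The function names are hypothetical, and the sketch does not check that wheels exist for the chosen CUDA and torch combination.

```python
def cuda_tag(cuda_version: str) -> str:
    """Turn a CUDA version such as "10.1" into a wheel tag such as "cu101"."""
    major, minor = cuda_version.split(".")[:2]
    return f"cu{int(major)}{int(minor)}"


def conda_install_commands(cuda_version: str, torch_version: str = "1.7.0") -> list:
    """Build the install commands listed above for the given CUDA version."""
    tag = cuda_tag(cuda_version)
    wheel_index = f"https://pytorch-geometric.com/whl/torch-{torch_version}.html"
    commands = [
        "conda create -n elaspic2 -c pytorch -c ostrokach-forge "
        f'-c conda-forge -c defaults elaspic2 "cudatoolkit={cuda_version}"',
        "conda activate elaspic2",
    ]
    # Package names and the "latest+<tag>" pin are copied from the commands above.
    for package in ("torch-scatter", "torch-sparse", "torch-cluster", "torch-spline-conv"):
        commands.append(f'pip install "{package}==latest+{tag}" -f {wheel_index}')
    commands.append('pip install "torch-geometric==1.6.1"')
    return commands


if __name__ == "__main__":
    for command in conda_install_commands("10.1"):
        print(command)
```

The printed commands can be reviewed and then pasted into a shell; the sketch deliberately does not execute them.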
ELASPIC2 can be installed using pip. However, the torch and torch-geometric dependencies have to be installed from external channels.
Make sure that git lfs is installed on your system, then run the commands below, replacing cu101 with the desired CUDA version.
pip install "torch==1.8.0";
pip install -f https://pytorch-geometric.com/whl/torch-1.8.0+cu101.html --default-timeout=600 \
"transformers==3.3.1" \
"torch-scatter==2.0.6" \
"torch-sparse==0.6.9" \
"torch-cluster==1.5.9" \
"torch-spline-conv==1.2.1" \
"torch-geometric==1.6.1" \
"https://gitlab.com/kimlab/kmbio/-/archive/v2.1.0/kmbio-v2.1.0.zip" \
"https://gitlab.com/kimlab/kmtools/-/archive/v0.2.8/kmtools-v0.2.8.zip" \
"https://gitlab.com/ostrokach/proteinsolver/-/archive/v0.1.25/proteinsolver-v0.1.25.zip" \
"git+https://gitlab.com/elaspic/elaspic2.git"
Data used to train and validate the ELASPIC2 models are available at http://elaspic2.data.proteinsolver.org and http://protein-folding-energy.data.proteinsolver.org.
See the protein-folding-energy repository for details on how these data were generated.
- Alexey Strokach, Tian Yu Lu, Philip M. Kim. ELASPIC2 (EL2): Combining contextualized language models and graph neural networks to predict effects of mutations. Journal of Molecular Biology. https://doi.org/10.1016/j.jmb.2021.166810.