IntFold

Generating intermediate representations using AlphaFold2 and ColabFold

Overview

This repo uses the Evoformer module of AlphaFold2 to generate intermediate representations (MSA and pair) for proteins, especially enzymes with EC numbers. The code is based on ColabFold and LocalColabFold.

The enzyme dataset comes from UniProt and is split into four files: uniprot-filtered-reviewed_yes.tab.gz.partaa, uniprot-filtered-reviewed_yes.tab.gz.partab, uniprot-filtered-reviewed_yes.tab.gz.partac, and uniprot-filtered-reviewed_yes.tab.gz.partad.
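The four parts need to be joined before the table can be read. The sketch below assumes the parts are a byte-wise split of a single gzip-compressed, tab-separated file (the `.partaa` to `.partad` suffixes suggest this, but the repo does not state it). The helper names `find_parts`, `join_parts`, and `read_table` are hypothetical and not part of IntFold.

```python
import csv
import gzip
import shutil
from pathlib import Path


def find_parts(repo_dir, stem="uniprot-filtered-reviewed_yes.tab.gz"):
    """Return the split parts in suffix order (partaa, partab, ...)."""
    return sorted(Path(repo_dir).glob(stem + ".part*"))


def join_parts(part_paths, out_path):
    """Concatenate the parts, in the given order, into one file."""
    with open(out_path, "wb") as out:
        for part in part_paths:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return Path(out_path)


def read_table(gz_path):
    """Yield one dict per row, keyed by the column names in the header line."""
    with gzip.open(gz_path, "rt", encoding="utf-8", newline="") as handle:
        yield from csv.DictReader(handle, delimiter="\t")
```

For example, `join_parts(find_parts("."), "uniprot-filtered-reviewed_yes.tab.gz")` followed by `next(read_table("uniprot-filtered-reviewed_yes.tab.gz"))` would show the column names of the first record.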

Install

IntFold runs only on Linux. If you are using Windows 10 or later, install Windows Subsystem for Linux first.

Install Docker
- Install the NVIDIA Container Toolkit if you have NVIDIA GPUs
- Set up Docker as a non-root user
- Check that the NVIDIA Container Toolkit installation succeeded by running
```
docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
```
The output lists your available GPUs. If no GPU is listed, check that you followed the NVIDIA Container Toolkit installation instructions and take a look at the NVIDIA Docker issues.

If you don't need to modify the code, you can use the prebuilt Docker image directly by running
```
docker pull yuxin60/intfold
```
If you need to modify the code to run your own tasks, first clone this repo and cd into it
```
git clone https://github.com/yuxin212/intfold.git
cd intfold
```
Then modify the code as needed.

Build docker image

docker build -f docker/Dockerfile -t intfold .

Running IntFold

First run

docker run --gpus <number of gpus> yuxin60/intfold:latest

Get the container ID
```
docker ps
```
After the run finishes, copy the generated intermediate representations from the Docker container to the host
```
docker cp <container-id>:/app/intermediate/ <path to store results>
```
After copying the output, stop and remove the Docker container
```
docker stop <container-id>
docker rm <container-id>
```
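The copy and cleanup steps above can be scripted so that the container is only removed once the copy has succeeded. This is a minimal sketch that wraps the same three `docker` commands with `subprocess`; the function names and the `dry_run` option are hypothetical and not part of IntFold.

```python
import subprocess


def collect_commands(container_id, dest_dir):
    """Build the docker commands that copy the results out and remove the container."""
    return [
        ["docker", "cp", f"{container_id}:/app/intermediate/", dest_dir],
        ["docker", "stop", container_id],
        ["docker", "rm", container_id],
    ]


def collect_results(container_id, dest_dir, dry_run=False):
    """Run the commands in order; with dry_run=True only print them."""
    commands = collect_commands(container_id, dest_dir)
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError on failure, so a failed
            # copy prevents the later stop and rm steps from running.
            subprocess.run(cmd, check=True)
    return commands
```

Calling `collect_results("<container-id>", "<path to store results>", dry_run=True)` prints the commands without touching Docker, which is a cheap way to check the paths first.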

IntFold Output

The output is saved as NumPy arrays inside the Docker container under /app/intermediate/. This directory has the following structure:

/app/intermediate/<EC 1st number>/<EC 2nd number>/<EC 3rd number>/<EC 4th number>/
    <Entry>_msa_first_row.npy
    <Entry>_msa.npy
    <Entry>_pair.npy
    <Entry>_single.npy
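Once the results are copied to the host, the layout above can be indexed by EC number and entry. The sketch below assumes the copied directory keeps exactly this four-level structure and these file-name suffixes; `index_outputs` is a hypothetical helper, not part of IntFold.

```python
from pathlib import Path

# File-name suffixes listed in the directory structure above.
KINDS = ("msa_first_row", "msa", "pair", "single")


def index_outputs(root):
    """Map (EC number, entry) to {kind: path} for every .npy file under root."""
    root = Path(root)
    index = {}
    for path in sorted(root.glob("*/*/*/*/*.npy")):
        # The four directory levels are the four fields of the EC number.
        ec = ".".join(path.relative_to(root).parts[:4])
        # Match the file-name suffix to recover the entry and the kind.
        for kind in KINDS:
            suffix = "_" + kind
            if path.stem.endswith(suffix):
                entry = path.stem[: -len(suffix)]
                index.setdefault((ec, entry), {})[kind] = path
                break
    return index
```

Each value in the returned dictionary holds up to four paths, so a missing representation for an entry shows up as a missing key rather than an error.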

Content of each output file, where r is the number of amino acid residues:

<Entry>_msa_first_row.npy: First row of the MSA representation, shape: (r, 256)

<Entry>_msa.npy: Full MSA representation, shape: (512, r, 256)

<Entry>_pair.npy: Pair representation, shape: (r, r, 128)

<Entry>_single.npy: Single representation, shape: (r, 384)
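A quick consistency check on loaded arrays can catch truncated or mismatched files. The sketch below assumes the first-row file holds the (r, 256) array and the full MSA file holds the (512, r, 256) array, and it assumes 4-byte (float32) elements for the size estimate; the README does not state the dtype. All function names are hypothetical.

```python
from math import prod


def expected_shapes(r, n_msa=512, c_msa=256, c_pair=128, c_single=384):
    """Expected array shapes for a protein with r residues."""
    return {
        "msa_first_row": (r, c_msa),
        "msa": (n_msa, r, c_msa),
        "pair": (r, r, c_pair),
        "single": (r, c_single),
    }


def estimated_bytes(r, itemsize=4):
    """Size of the array data for one entry; itemsize=4 assumes float32."""
    return itemsize * sum(prod(shape) for shape in expected_shapes(r).values())


def check_shapes(arrays):
    """Check a {kind: array} dict against the expected shapes.

    Each value only needs a .shape attribute. Returns r on success and
    raises ValueError listing every mismatch otherwise.
    """
    r = arrays["single"].shape[0]
    problems = [
        f"{kind}: got {tuple(arrays[kind].shape)}, expected {shape}"
        for kind, shape in expected_shapes(r).items()
        if kind in arrays and tuple(arrays[kind].shape) != shape
    ]
    if problems:
        raise ValueError("; ".join(problems))
    return r
```

In practice the dictionary would be filled with arrays returned by `numpy.load`. Under the float32 assumption, `estimated_bytes(300)` comes to about 204 MB for a 300-residue protein, most of it from the full MSA representation.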

Citation

If you use this source code for your publication, please cite

- Mirdita M, Schütze K, Moriwaki Y, Heo L, Ovchinnikov S and Steinegger M. ColabFold: Making protein folding accessible to all. Nature Methods (2022) doi: 10.1038/s41592-022-01488-1
- Jumper et al. "Highly accurate protein structure prediction with AlphaFold." Nature (2021) doi: 10.1038/s41586-021-03819-2
- If you use AlphaFold-Multimer, please cite Evans et al. "Protein complex prediction with AlphaFold-Multimer." bioRxiv (2021) doi: 10.1101/2021.10.04.463034v1
