Data and Code for: Deploying synthetic coevolution and machine learning to engineer protein-protein interactions
Fine-tuning of protein-protein interactions occurs naturally through coevolution, but this process is difficult to recapitulate in the laboratory. We describe a synthetic platform for protein-protein coevolution that can isolate matched pairs of interacting muteins from complex libraries. This large dataset of coevolved complexes drove a systems-level analysis of molecular recognition between Z domain-affibody pairs spanning a wide range of structures, affinities, cross-reactivities, and orthogonalities, and captured a broad spectrum of coevolutionary networks. Furthermore, we harnessed pre-trained protein language models to expand, in silico, the amino acid diversity of our coevolution screen, predicting remodeled interfaces beyond the reach of the experimental library. The integration of these approaches provides a means of generating protein complexes with diverse molecular recognition properties as tools for biotechnology and synthetic biology.
Paper Link: https://www.science.org/doi/10.1126/science.adh1720
Python >= 3.8
PyTorch >= 1.11.0
CUDA >= 11.6
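The version requirements above can be checked before opening the notebooks. The following is a minimal sketch, not part of the repository: the helper names (`parse_version`, `meets_minimum`, `check_environment`) are hypothetical, and it assumes PyTorch is installed under the distribution name `torch`. If PyTorch is missing, the PyTorch and CUDA checks are reported as not detected rather than failing.

```python
import sys
from importlib import metadata

# Minimums copied from the requirements listed above.
MIN_PYTHON = (3, 8)
MIN_TORCH = (1, 11, 0)
MIN_CUDA = (11, 6)


def parse_version(text):
    """Turn '1.11.0+cu116' into (1, 11, 0); non-numeric suffixes are ignored."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(found, minimum):
    """Compare version tuples, padding the shorter one with zeros."""
    width = max(len(found), len(minimum))
    pad = lambda v: tuple(v) + (0,) * (width - len(v))
    return pad(found) >= pad(minimum)


def check_environment():
    """Return requirement name -> True, False, or None (not detected)."""
    report = {"python": meets_minimum(sys.version_info[:2], MIN_PYTHON)}
    try:
        torch_version = metadata.version("torch")
    except metadata.PackageNotFoundError:
        report["pytorch"] = None
        report["cuda"] = None
        return report
    report["pytorch"] = meets_minimum(parse_version(torch_version), MIN_TORCH)
    import torch  # imported lazily so the script still runs without PyTorch
    cuda = torch.version.cuda  # None for CPU-only builds
    report["cuda"] = meets_minimum(parse_version(cuda), MIN_CUDA) if cuda else None
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(name, "OK" if ok else "not detected" if ok is None else "too old")
```

Versions are compared as integer tuples rather than strings, since a string comparison would rank 1.9 above 1.11.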
A notebook is provided that runs inference with our pre-trained model on pre-processed data to reproduce the results shown in the manuscript.
To run inference from sequence pairs, follow this notebook. Please see ESM for detailed installation instructions for the ESM-1b model.
You can also find them here
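As a rough illustration of what preparing sequence pairs for ESM-1b can look like, here is a minimal sketch. It is not the authors' pipeline, and the notebook remains the reference. The helper names (`clean_sequence`, `build_esm_batch`, `embed_pairs`), the `pairN_A`/`pairN_B` labels, the restriction to the 20 standard residues, and the mean-pooling step are all assumptions made for illustration. The ESM calls come from the `fair-esm` package and are imported only inside `embed_pairs`, so the input-preparation helpers run without it.

```python
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")  # assumption: standard residues only
ESM1B_MAX_LEN = 1022  # assumed ESM-1b limit on residues per sequence


def clean_sequence(seq):
    """Uppercase, strip whitespace, and reject unsupported or overlong input."""
    seq = "".join(seq.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    bad = sorted(set(seq) - VALID_RESIDUES)
    if bad:
        raise ValueError(f"unsupported residues: {''.join(bad)}")
    if len(seq) > ESM1B_MAX_LEN:
        raise ValueError(f"sequence longer than {ESM1B_MAX_LEN} residues")
    return seq


def build_esm_batch(pairs):
    """Flatten [(seq_a, seq_b), ...] into the (label, sequence) list
    that the ESM batch converter takes as input."""
    batch = []
    for i, (seq_a, seq_b) in enumerate(pairs):
        batch.append((f"pair{i}_A", clean_sequence(seq_a)))
        batch.append((f"pair{i}_B", clean_sequence(seq_b)))
    return batch


def embed_pairs(pairs):
    """Return one (embedding_A, embedding_B) tuple per pair.

    Requires torch and fair-esm; downloads the ESM-1b weights on first use.
    Mean pooling over residues is an illustrative choice, not the paper's method.
    """
    import torch
    import esm

    model, alphabet = esm.pretrained.esm1b_t33_650M_UR50S()
    model.eval()
    batch = build_esm_batch(pairs)
    _, _, tokens = alphabet.get_batch_converter()(batch)
    with torch.no_grad():
        reps = model(tokens, repr_layers=[33])["representations"][33]
    # Position 0 is the start token, so residues occupy 1..len(seq).
    pooled = [reps[k, 1:len(seq) + 1].mean(0) for k, (_, seq) in enumerate(batch)]
    return [(pooled[2 * i], pooled[2 * i + 1]) for i in range(len(pairs))]
```

Validating sequences before tokenization surfaces malformed input early, before the model weights are loaded.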