An extension to the original PconsC4 model. This model is capable of predicting the distance between two residues using a modified UNet++ architecture. The input to the model is a multiple sequence alignment file, and the output is a matrix of predictions.
pip3 install gaussdca
pip3 install tensorflow
pip3 install keras
pip3 install h5py
The multiple sequence alignments were generated using JackHMMer, and the following features were extracted from each MSA:
- Gaussian Direct Coupling Analysis
- APC-Corrected Mutual Information
- Normalized APC-Corrected Mutual Information
- Cross Entropy
- Sequence Features
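As an illustration of one of the features listed above, the following is a minimal NumPy sketch of mutual information with the average product correction (APC). It is not the code used in this repository: the function names `mutual_information` and `apc_correct`, the integer encoding of the alignment, and the default of 21 states (20 amino acids plus gap) are assumptions made for the example, and no sequence reweighting or pseudocounts are applied.

```python
import numpy as np

def mutual_information(msa, n_states=21):
    """Mutual information between every pair of alignment columns.

    msa: integer array of shape (n_sequences, n_columns) with values
    in [0, n_states). The encoding is an assumption of this sketch.
    """
    n_seq, _ = msa.shape
    # One-hot encode: shape (n_sequences, n_columns, n_states)
    one_hot = np.eye(n_states)[msa]
    # Single-column frequencies f_i(a): shape (n_columns, n_states)
    f_i = one_hot.mean(axis=0)
    # Pair frequencies f_ij(a, b): shape (n_columns, n_columns, n_states, n_states)
    f_ij = np.einsum("sia,sjb->ijab", one_hot, one_hot) / n_seq
    # Product of the marginals f_i(a) * f_j(b)
    expected = f_i[:, None, :, None] * f_i[None, :, None, :]
    # Sum f_ij * log(f_ij / (f_i * f_j)), treating 0 * log(0) as 0
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(f_ij > 0, f_ij * np.log(f_ij / expected), 0.0)
    return terms.sum(axis=(2, 3))

def apc_correct(scores):
    """Subtract the average product correction from a symmetric score matrix.

    Assumes at least one off-diagonal score is non-zero; otherwise the
    overall mean is zero and the correction is undefined.
    """
    scores = scores.copy()
    np.fill_diagonal(scores, 0.0)
    n = scores.shape[0]
    col_mean = scores.sum(axis=0) / (n - 1)      # mean over off-diagonal entries
    total_mean = scores.sum() / (n * (n - 1))    # mean over all off-diagonal entries
    corrected = scores - np.outer(col_mean, col_mean) / total_mean
    np.fill_diagonal(corrected, 0.0)
    return corrected
```

Calling `apc_correct(mutual_information(msa))` on an encoded alignment returns one score per pair of columns, which is the same shape as the other pairwise input features.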
Using these 5 features, multiple fully convolutional neural network architectures were developed:
- FC-DenseNet 103
- U-Net
- Recreation of the trRosetta Model
- VGG 19
- ResNet 50
Several loss functions were tested on these models:
- Focal Loss
- Dice Loss
- Categorical Cross Entropy
- Mean Squared Error
- Weighted CCE
- Tversky Loss
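To make the relationship between some of these losses concrete, here is a small NumPy sketch of the focal loss and the Tversky loss. It is an illustration under assumptions, not the training code from this repository (which presumably uses Keras/TensorFlow): the function names, the default values of `gamma`, `alpha`, `beta`, and the smoothing constants are all chosen for the example. With `gamma=0` the focal loss reduces to categorical cross entropy, and with `alpha = beta = 0.5` the Tversky loss reduces to the Dice loss.

```python
import numpy as np

def focal_loss(y_true, y_pred, gamma=2.0, eps=1e-7):
    """Mean focal loss for one-hot targets and predicted class probabilities.

    Both arrays have shape (..., n_classes). gamma=0 gives categorical
    cross entropy; larger gamma down-weights well-classified examples.
    """
    p = np.clip(y_pred, eps, 1.0)
    # Probability the model assigns to the true class of each example
    p_true = (y_true * p).sum(axis=-1)
    return float(np.mean(-((1.0 - p_true) ** gamma) * np.log(p_true)))

def tversky_loss(y_true, y_pred, alpha=0.5, beta=0.5, smooth=1e-6):
    """Tversky loss; alpha weights false positives, beta false negatives.

    alpha = beta = 0.5 gives the Dice loss.
    """
    tp = np.sum(y_true * y_pred)            # soft true positives
    fp = np.sum((1.0 - y_true) * y_pred)    # soft false positives
    fn = np.sum(y_true * (1.0 - y_pred))    # soft false negatives
    index = (tp + smooth) / (tp + alpha * fp + beta * fn + smooth)
    return float(1.0 - index)
```

Raising `beta` above `alpha` penalizes missed positives more than false alarms, which is one common reason to prefer the Tversky loss when the positive class is rare.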
To make your own prediction using the deeper U-Net model, run the following command in the src folder:
python3 predict.py alignment.a3m output
alignment.a3m corresponds to the alignment file. output is the name of the output file, which will contain the predictions from the U-Net model.
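If you want to inspect an alignment before running the prediction, the sketch below reads A3M-formatted text into equal-length sequences. It is a hypothetical helper and not part of predict.py: the name `read_a3m` is an assumption, and it relies only on the A3M convention that lowercase letters mark insertions relative to the query sequence.

```python
def read_a3m(lines):
    """Parse A3M-formatted lines into (headers, aligned_sequences).

    Lowercase letters mark insertions relative to the query and are
    dropped, so every returned sequence has the length of the query.
    """
    headers, sequences = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            headers.append(line[1:])
            sequences.append("")
        elif sequences:
            # Keep match states and gaps, drop lowercase insertions
            sequences[-1] += "".join(c for c in line if not c.islower())
    return headers, sequences
```

The function accepts any iterable of lines, so an open file handle for alignment.a3m can be passed directly. Checking that all returned sequences have the same length is a quick sanity test of the alignment file.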