Adaptation of the protein-ligand binding affinity prediction method proposed by Li et al. [1] to protein function prediction.
- Python==3.7
- PaddlePaddle==2.2.1
- pgl==2.2.2
- scikit-learn==1.0.1
- tqdm==4.62.3
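The pinned versions above can be checked against the current environment before training. The following is a minimal sketch and not part of the repository: the helper names `installed_version` and `find_mismatches` and the `PINNED` dictionary are assumptions made for illustration, and the distribution names (for example `paddlepaddle` and `pgl`) are assumed to match the package names on PyPI. On Python 3.7 it relies on the `importlib_metadata` backport being installed.

```python
import sys

try:
    from importlib import metadata  # standard library from Python 3.8 onwards
except ImportError:
    import importlib_metadata as metadata  # backport, needed on Python 3.7

# Pinned versions copied from the list above; distribution names are assumed.
PINNED = {
    "paddlepaddle": "2.2.1",
    "pgl": "2.2.2",
    "scikit-learn": "1.0.1",
    "tqdm": "4.62.3",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose version differs from its pin to (expected, found)."""
    mismatches = {}
    for name, expected in pinned.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    if sys.version_info[:2] != (3, 7):
        print("Warning: Python 3.7 expected, found %d.%d" % sys.version_info[:2])
    for name, (expected, found) in find_mismatches(PINNED).items():
        print("%s: expected %s, found %s" % (name, expected, found))
```

A package that is not installed is reported with `None` as the found version, so missing and mismatched dependencies show up in the same report.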
The dataset is based on the Protein Data Bank (PDB). The code for preprocessing and transforming proteins into graphs can be found here. After preprocessing, the data should be copied to the ./data folder. Dataset splits (i.e., train, validation, and test) as proposed by [2] can be downloaded here or from their repository. After extraction, they should also be copied to the ./data folder.
python train.py [params]
Here, params are keyword arguments. See train.py for the list of arguments and their default values.
python test.py --model_name <path-to-saved-model> --label_data_path <path-to-protein-with-their-labels> [more params]
model_name and label_data_path are required arguments. More (optional) parameters can be added as well. See test.py for a full list of expected arguments.
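When evaluation is launched from another Python script, the command line above can be assembled programmatically. This is a sketch under assumptions: `build_test_command` and `run_test` are hypothetical helpers, not part of the repository, and optional parameters are assumed to follow the `--key value` form. Only `--model_name` and `--label_data_path` are taken from the text above; check test.py for the actual optional argument names.

```python
import subprocess
import sys


def build_test_command(model_name, label_data_path, **extra):
    """Build the argument list for test.py.

    Extra keyword arguments are rendered as "--key value" pairs, which is an
    assumption about how test.py parses its optional parameters.
    """
    if not model_name or not label_data_path:
        raise ValueError("model_name and label_data_path are required")
    cmd = [
        sys.executable, "test.py",
        "--model_name", str(model_name),
        "--label_data_path", str(label_data_path),
    ]
    for key, value in extra.items():
        cmd += ["--" + key, str(value)]
    return cmd


def run_test(model_name, label_data_path, **extra):
    """Run test.py in a subprocess; raises CalledProcessError on a non-zero exit."""
    cmd = build_test_command(model_name, label_data_path, **extra)
    return subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when a path contains spaces.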
[1] Shuangli Li, Jingbo Zhou, Tong Xu, et al. Structure-aware Interactive Graph Neural Networks for the Prediction of Protein-Ligand Binding Affinity. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (KDD '21). Association for Computing Machinery, New York, NY, USA, 975–985.
[2] Gligorijević, V., Renfrew, P.D., Kosciolek, T. et al. Structure-based protein function prediction using graph convolutional networks. Nat Commun 12, 3168 (2021).