Like Foldseek, this project implements protein structure database search. Unlike Foldseek, the method used here is based on GVP-GNN for protein structure representation learning.
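As a rough illustration of what searching over learned representations can look like, the sketch below ranks database entries by cosine similarity to a query vector. It assumes each structure has already been encoded into a fixed-length embedding (for example by a GVP-GNN encoder). The names `cosine_similarity`, `search`, and the toy `database` are hypothetical and not part of this project; the actual scoring function used here may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_embedding, database, top_k=3):
    """Return the top_k (name, score) pairs, highest similarity first.

    `database` maps a structure name to its embedding. This is an assumed
    layout for illustration only.
    """
    scored = [
        (name, cosine_similarity(query_embedding, embedding))
        for name, embedding in database.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional embeddings; real embeddings would come from the trained model.
database = {
    "protein_a": [1.0, 0.0, 0.0],
    "protein_b": [0.9, 0.1, 0.0],
    "protein_c": [0.0, 1.0, 0.0],
}
hits = search([1.0, 0.0, 0.0], database, top_k=2)
```

A brute-force scan like this is only practical for small databases; a large collection such as Swiss-Prot would normally need an approximate nearest-neighbor index instead.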
We use Foldseek to generate the ground-truth datasets.
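One way to turn Foldseek results into ground-truth labels is sketched below. It assumes the hits were written in Foldseek's default tab-separated format (for example by `foldseek easy-search`), whose columns are query, target, fident, alnlen, mismatch, gapopen, qstart, qend, tstart, tend, evalue, bits. The function name `load_ground_truth` and the `max_evalue` threshold are hypothetical choices for illustration, not the project's actual pipeline.

```python
import csv
import io
from collections import defaultdict

# Column order of Foldseek's default tabular output (assumed unchanged).
FOLDSEEK_COLUMNS = [
    "query", "target", "fident", "alnlen", "mismatch", "gapopen",
    "qstart", "qend", "tstart", "tend", "evalue", "bits",
]


def load_ground_truth(handle, max_evalue=1e-3):
    """Map each query to the targets Foldseek reports at or below max_evalue.

    Self-hits are skipped. The threshold is an illustrative assumption.
    """
    truth = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < len(FOLDSEEK_COLUMNS):
            continue  # skip blank or malformed lines
        record = dict(zip(FOLDSEEK_COLUMNS, row))
        if record["query"] == record["target"]:
            continue
        if float(record["evalue"]) <= max_evalue:
            truth[record["query"]].append(record["target"])
    return dict(truth)


# Made-up rows standing in for a real Foldseek result file.
example = io.StringIO(
    "q1\tq1\t1.000\t100\t0\t0\t1\t100\t1\t100\t1e-30\t500\n"
    "q1\tt1\t0.450\t95\t50\t2\t1\t95\t3\t97\t1e-10\t210\n"
    "q1\tt2\t0.120\t40\t35\t1\t5\t44\t8\t47\t2.5\t30\n"
    "q2\tt1\t0.300\t80\t55\t1\t1\t80\t1\t80\t1e-5\t120\n"
)
ground_truth = load_ground_truth(example)
```

With a real result file, the same function can be called on an open file handle, e.g. `with open(path) as handle: load_ground_truth(handle)`.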
We use the CATH/Gene3D dataset; see this page to download the dataset in .pdb format.
We use the AlphaFold Protein Structure Database; see this page to download the Swiss-Prot dataset (note that it is huge: about 26 GB compressed).
The app will be constructed later.