Source data and scripts for the paper "Deciphering the sequence variation and structural dynamics of envelope glycoprotein gp120 in HIV neutralization phenotype by molecular dynamics simulation and graph machine learning"
This paper aims to decipher the roles of rapid sequence variability and significant structural dynamics of the envelope glycoprotein gp120 in the HIV neutralization phenotype. Forty-five prefusion gp120 structures from different HIV strains, belonging to three tiers of neutralization phenotype (sensitive, moderate, and resistant), are modeled by homology modeling and investigated by molecular dynamics simulations and graph machine learning.
- All HIV sequences with neutralization-sensitive, moderate, and resistant phenotypes, as determined by experimental assessment and covering a broad range of genetic and geographic diversity, were obtained from the UniProtKB database (http://www.uniprot.org).
- All gp120 structural models of HIV strains with neutralization-sensitive, moderate, and resistant phenotypes were randomly selected and constructed by homology modeling.
- The 45 prefusion gp120 models from different HIV strains with neutralization-sensitive, moderate, and resistant phenotypes were investigated by molecular dynamics simulations (only Cα trajectories are uploaded).
- Structural deviations: root-mean-square deviation (RMSD) and radius of gyration (Rg) of the hydrophobic core.
- Population distribution: free energy landscapes (FEL).
- Conformational flexibility: root-mean-square fluctuation (RMSF).
- Model selection among the Gated Recurrent Unit (GRU), Graph Convolutional Network (GCN), and Graph Isomorphism Network (GIN).
- Graph Isomorphism Network (GIN) training on molecular dynamics simulations.
- Graph Isomorphism Network (GIN) with an attention mechanism (GIN_ATT).
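
For readers who want to reproduce the analysis quantities listed above from the uploaded Cα trajectories, the following is a minimal sketch of RMSD, Rg, RMSF, and a free energy profile in plain Python. It is not the authors' script. The function names, the frame layout (a list of `(x, y, z)` tuples per frame), the assumption that frames are already superimposed on the reference, the equal-mass treatment of Cα atoms, and the default `kT` value (about 2.494 kJ/mol, i.e. 300 K) are all illustrative assumptions, not details taken from the paper.

```python
import math


def rmsd(frame, ref):
    """RMSD between two frames; assumes both are already superimposed."""
    sq = sum((a - b) ** 2 for p, q in zip(frame, ref) for a, b in zip(p, q))
    return math.sqrt(sq / len(ref))


def radius_of_gyration(frame):
    """Rg of one frame, treating all atoms as equal mass (assumption)."""
    n = len(frame)
    center = [sum(p[i] for p in frame) / n for i in range(3)]
    sq = sum((p[i] - center[i]) ** 2 for p in frame for i in range(3))
    return math.sqrt(sq / n)


def rmsf(frames):
    """Per-atom RMSF around each atom's time-averaged position."""
    n_frames, n_atoms = len(frames), len(frames[0])
    out = []
    for j in range(n_atoms):
        mean = [sum(f[j][i] for f in frames) / n_frames for i in range(3)]
        sq = sum((f[j][i] - mean[i]) ** 2 for f in frames for i in range(3))
        out.append(math.sqrt(sq / n_frames))
    return out


def free_energy(counts, kT=2.494):
    """Free energy per histogram bin, F = -kT * ln(P / P_max).

    `counts` are bin populations along any reaction coordinate;
    kT defaults to ~2.494 kJ/mol (300 K), an assumed temperature.
    Empty bins are assigned infinite free energy.
    """
    peak = max(counts)
    return [-kT * math.log(c / peak) if c > 0 else math.inf for c in counts]


if __name__ == "__main__":
    # Toy two-atom, two-frame trajectory (hypothetical coordinates).
    frame0 = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    frame1 = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
    print(rmsd(frame1, frame0))
    print(radius_of_gyration(frame0))
    print(rmsf([frame0, frame1]))
    print(free_energy([10, 5, 0]))
```

In practice, a free energy landscape is usually built from a two-dimensional histogram over two reaction coordinates; the same formula applies to each bin.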