Code for "Empowering graph neural network-based computational drug repositioning with large language model-inferred knowledge representation"
We introduce four drug-disease association benchmark datasets in our study: B-dataset, C-dataset, F-dataset, and R-dataset. The datasets are summarized below:
Dataset | Drugs | Diseases | Drug-disease Associations | Pos-Neg Ratio |
---|---|---|---|---|
B-dataset | 269 | 598 | 18,416 | 11.45% |
C-dataset | 663 | 409 | 2,532 | 1.57% |
F-dataset | 593 | 313 | 1,933 | 1.05% |
R-dataset | 894 | 454 | 2,704 | 0.67% |
The zero-shot template used to prompt GPT-4 for drug and disease knowledge descriptions is in desc_generate.py. The generated drug descriptions (drug_desc.csv) and disease descriptions (disease_desc.csv) for B-dataset, C-dataset, F-dataset, and R-dataset are in each "feat/DATASET" folder.
To generate LLM-inferred knowledge representations, please refer to emb_generate.py. We have also stored the generated embedding files for B-dataset, C-dataset, F-dataset, and R-dataset in the "feat" folder. Specifically, LLM_drug_emb.pkl and LLM_disease_emb.pkl are embeddings generated from GPT-4; BERT_drug_emb.pkl and BERT_disease_emb.pkl are embeddings generated from BioBERT.
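The stored embedding files can be inspected before training. The snippet below is a minimal sketch, not part of the repository: the README does not state what kind of object each .pkl file holds, so `describe` makes no assumption about the container type, and the path in the `__main__` block is a hypothetical example that may need adjusting. Unpickling may also require the library that produced the object (for example torch or numpy) to be installed.

```python
import pickle
from pathlib import Path


def load_embeddings(path):
    """Load a pickled embedding object. Only unpickle files you trust."""
    with Path(path).open("rb") as handle:
        return pickle.load(handle)


def describe(obj):
    """Summarize the loaded object without assuming its container type."""
    shape = getattr(obj, "shape", None)
    if shape is not None:
        # Array-like objects (e.g. numpy arrays, torch tensors) expose .shape.
        return f"{type(obj).__name__} with shape {tuple(shape)}"
    if hasattr(obj, "__len__"):
        # Dicts, lists, and similar containers report their length.
        return f"{type(obj).__name__} with {len(obj)} entries"
    return type(obj).__name__


if __name__ == "__main__":
    # Hypothetical location; adjust to where the file sits in your checkout.
    emb_path = Path("feat") / "B-dataset" / "LLM_drug_emb.pkl"
    if emb_path.exists():
        print(describe(load_embeddings(emb_path)))
    else:
        print(f"{emb_path} not found; adjust the path")
```

Checking the container type and size first makes it easier to confirm that the number of entries matches the drug or disease count of the chosen dataset.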
- torch: 1.13.0+cu117
- scikit-learn: 1.2.2
- rdkit: 2023.3.3
- dgl: 1.1.2+cu117
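To compare an existing environment against the versions listed above, a small check along the following lines may help. This is a sketch that is not part of the repository; it uses only the standard-library `importlib.metadata` module and assumes the distributions are installed under the names `torch`, `scikit-learn`, `rdkit`, and `dgl` (some environments install RDKit under a different distribution name, in which case it will be reported as missing).

```python
from importlib.metadata import PackageNotFoundError, version

# Versions as listed in this README.
PINNED = {
    "torch": "1.13.0+cu117",
    "scikit-learn": "1.2.2",
    "rdkit": "2023.3.3",
    "dgl": "1.1.2+cu117",
}


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def report(pinned):
    """Return one human-readable status line per pinned package."""
    lines = []
    for name, have in installed_versions(pinned).items():
        want = pinned[name]
        if have is None:
            status = "missing"
        elif have == want:
            status = "ok"
        else:
            status = f"differs (installed {have})"
        lines.append(f"{name}: expected {want}, {status}")
    return lines


if __name__ == "__main__":
    print("\n".join(report(PINNED)))
```

A differing version is not necessarily a problem; the listed versions are simply the ones stated here.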
python main.py -sp {SAVE_PATH} -da {DATASET} -fo {NUM_FOLD} -se {SEED} -ft {LLM_EMB} -ct {MODEL_TYPE} -id {DEVICE} -ep {EPOCH} -dp {DROPOUT} -hf {HIDDEN_FEAT}
Suggested setting:
python main.py -sp {SAVE_PATH} -da {DATASET} -fo 5 -se 0 -ft LLM -ct graph_ae -id 0 -ep 5000 -dp 0.4 -hf 128
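To run the suggested setting on all four datasets, the command can be assembled programmatically. The helper below is a hypothetical sketch, not part of the repository: it only reuses the flags and values shown above, it assumes that the dataset names are passed to `-da` exactly as written in this README, and the `runs/...` save paths are made-up examples. It prints the commands instead of executing them.

```python
import shlex

# Assumption: these strings are the values main.py accepts for -da.
DATASETS = ["B-dataset", "C-dataset", "F-dataset", "R-dataset"]


def build_command(dataset, save_path):
    """Build the argument list for the suggested setting from the README."""
    return [
        "python", "main.py",
        "-sp", str(save_path),
        "-da", dataset,
        "-fo", "5",
        "-se", "0",
        "-ft", "LLM",
        "-ct", "graph_ae",
        "-id", "0",
        "-ep", "5000",
        "-dp", "0.4",
        "-hf", "128",
    ]


if __name__ == "__main__":
    for name in DATASETS:
        # Dry run: print each command; the save path is a hypothetical example.
        print(shlex.join(build_command(name, f"runs/{name}")))
```

To launch the runs, each argument list could be passed to `subprocess.run(cmd, check=True)` from the repository root.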
We have also stored the standard prediction results of LLM-DDAGNN-AE and the DirectPred baseline in the "result" folder.
@article{RN93,
author = {Gu, Yaowen and Xu, Zidu and Yang, Carl},
title = {Empowering Graph Neural Network-Based Computational Drug Repositioning with Large Language Model-Inferred Knowledge Representation},
journal = {Interdisciplinary Sciences: Computational Life Sciences},
ISSN = {1867-1462},
DOI = {10.1007/s12539-024-00654-7},
url = {https://doi.org/10.1007/s12539-024-00654-7},
year = {2024},
type = {Journal Article}
}