This repository provides the PyTorch code and datasets for our paper "DegUIL: Degree-aware Graph Neural Networks for Long-tailed User Identity Linkage", published at ECML-PKDD 2023.
- PyTorch >= 1.8.1+cu111
- Python 3.8.5
- Numpy 1.19.5
- nltk 3.6.1
- tqdm 4.60.0
To install the dependencies, run:
pip install -r requirements.txt
Our code runs on the GPU by default (CUDA 11.1+ here); you can switch the device to CPU if only a CPU is available.
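The CPU fallback mentioned above can be sketched as follows. This is a minimal illustration rather than the repository's own code: `pick_device` is a hypothetical helper, and where DegUIL actually sets its device is not stated here.

```python
def pick_device(prefer_gpu: bool = True) -> str:
    """Return "cuda" when a GPU is wanted and usable, otherwise "cpu".

    Hypothetical helper for illustration; not part of the DegUIL code base.
    """
    try:
        import torch  # imported lazily so the helper also works without PyTorch
    except ImportError:
        return "cpu"
    if prefer_gpu and torch.cuda.is_available():
        return "cuda"
    return "cpu"


if __name__ == "__main__":
    # The returned string can be passed to torch.device(...) or tensor.to(...).
    print(pick_device())
```

Calling `pick_device(prefer_gpu=False)` forces CPU execution even on a machine with a working CUDA setup.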
- datasets/: contains the datasets FT (Foursquare-Twitter) and DBLP (DBLP17-DBLP19), both of which come from open sources. The bestEmbs folder stores the mapped embeddings of the two networks at the point of best performance.
- layers/: model layers
- models/: main modules of our model DegUIL
- utils/: utility functions for data processing and logging
- config.py: hyperparameters
- DegUIL.py: main file
Some of the code is based on https://github.com/shuaiOKshuai/Tail-GNN
# FT dataset
python Node2Vec/run_node2vec.py --dataset FT # generates emb_n2v1.pkl
python DegUIL.py --dataset FT --mu 0.001 --lr 5e-4
# DBLP dataset
python Node2Vec/run_node2vec.py --dataset DBLP
python DegUIL.py --dataset DBLP --mu 0.01 --lr 1e-3
or
bash run.sh
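For readers who prefer to drive the two steps from Python, the commands above can be assembled as in the sketch below. It is an assumption-laden convenience wrapper, not the contents of run.sh: `SETTINGS`, `build_commands`, and `run_pipeline` are hypothetical names, and only the script paths, flags, and hyperparameter values are taken from the commands listed above.

```python
import subprocess
import sys
from typing import List

# Hyperparameters copied from the commands listed above.
SETTINGS = {
    "FT": {"mu": "0.001", "lr": "5e-4"},
    "DBLP": {"mu": "0.01", "lr": "1e-3"},
}


def build_commands(dataset: str) -> List[List[str]]:
    """Return the node2vec and DegUIL command lines for one dataset."""
    params = SETTINGS[dataset]
    return [
        [sys.executable, "Node2Vec/run_node2vec.py", "--dataset", dataset],
        [sys.executable, "DegUIL.py", "--dataset", dataset,
         "--mu", params["mu"], "--lr", params["lr"]],
    ]


def run_pipeline(dataset: str, dry_run: bool = True) -> None:
    """Print each command, and execute it as well unless dry_run is set."""
    for cmd in build_commands(dataset):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing step


if __name__ == "__main__":
    for name in SETTINGS:
        run_pipeline(name, dry_run=True)
```

With `dry_run=True` the script only prints the commands; set it to `False` from the repository root to execute them in order.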
Each dataset in datasets/ includes two original adjacency matrices, adj_s.pkl and adj_t.pkl. Their corresponding node embeddings, generated by node2vec, are saved as emb_n2v1.pkl.
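To check what one of these pickle files contains before training, a small inspection script along the following lines may help. It is only a sketch: `load_pickle` and `describe` are hypothetical helpers, the object type stored in each file is not documented here, and the demo uses a toy nested list in place of a real adjacency matrix.

```python
import pickle
import tempfile
from pathlib import Path


def load_pickle(path):
    """Load one pickled object (only use this on files you trust)."""
    with open(path, "rb") as f:
        return pickle.load(f)


def describe(obj) -> str:
    """Summarize an object by type, plus its shape or length when available."""
    shape = getattr(obj, "shape", None)
    if shape is not None:
        return f"{type(obj).__name__} with shape {tuple(shape)}"
    if hasattr(obj, "__len__"):
        return f"{type(obj).__name__} with {len(obj)} entries"
    return type(obj).__name__


if __name__ == "__main__":
    # Toy stand-in for adj_s.pkl: a 3-node adjacency matrix as nested lists.
    # The real files may hold a different type (an assumption either way).
    toy_adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
    with tempfile.TemporaryDirectory() as tmp:
        toy_path = Path(tmp) / "adj_s.pkl"
        with open(toy_path, "wb") as f:
            pickle.dump(toy_adj, f)
        print(describe(load_pickle(toy_path)))
```

For a real dataset, point `load_pickle` at adj_s.pkl, adj_t.pkl, or emb_n2v1.pkl inside the chosen dataset folder. Unpickling those files may require the libraries that created them to be installed.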
To run the code on your own datasets, please refer to utils/data_process.py to process the corresponding datasets into the input format.
@inproceedings{long2023deg,
title={DegUIL: Degree-aware Graph Neural Networks for Long-tailed User Identity Linkage},
author={Meixiu Long and Siyuan Chen and Xin Du and Jiahai Wang},
booktitle = {{ECML/PKDD}},
volume = {14174},
pages = {122--138},
publisher = {Springer},
year = {2023}
}