Repo for our paper "Learning Ordering in Crystalline Materials with Symmetry-Aware Graph Neural Networks" (preprint on arXiv).
To start, clone this repo and all its submodules to your local directory or a workstation:
git clone --recurse-submodules git@github.com:learningmatter-mit/PerovskiteOrderingGCNNs.git
or
git clone git@github.com:learningmatter-mit/PerovskiteOrderingGCNNs.git
cd PerovskiteOrderingGCNNs
git submodule update --init
Our code builds on previous implementations of CGCNN, e3nn, and PaiNN, which are included as submodules in this repo. If their upstream GitHub repos change, the following command will update the submodules in this repo:
git submodule update --remote --merge
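To confirm that the submodules were actually checked out, one option is to inspect the output of `git submodule status`, where a leading `-` marks a submodule that has not been initialized and a leading `+` marks one whose checked-out commit differs from the commit recorded in this repo. The following is a minimal sketch and is not part of this repo; the helper names `parse_submodule_status` and `check_submodules` are hypothetical, and the parser assumes submodule paths contain no spaces.

```python
import subprocess


def parse_submodule_status(text):
    """Map each submodule path to a state derived from `git submodule status` output."""
    states = {
        "-": "uninitialized",
        "+": "differs from recorded commit",
        "U": "merge conflict",
    }
    result = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        prefix = line[0]  # status character (a space means "in sync")
        fields = line[1:].split()  # commit hash, path, optional "(describe)"
        if len(fields) < 2:
            continue
        result[fields[1]] = states.get(prefix, "ok")
    return result


def check_submodules(repo_dir="."):
    """Run `git submodule status` in repo_dir and return the parsed states."""
    out = subprocess.run(
        ["git", "submodule", "status"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return parse_submodule_status(out)
```

Calling `check_submodules()` from the repository root returns a dictionary of submodule paths and states; any entry reported as "uninitialized" suggests that `git submodule update --init` still needs to be run.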
This repository requires the following packages to run correctly:
pandas 1.5.3
scipy 1.10.1
numpy 1.24.3
scikit-learn 1.2.2
matplotlib 3.7.1
seaborn 0.12.2
pymatgen 2023.5.10
ase 3.22.1
rdkit 2023.3.1
e3fp 1.2.5
pytorch 1.13.1
pytorch-cuda 11.7
pytorch-sparse 0.6.17
pytorch-scatter 2.1.1
pytorch-cluster 1.6.1
torchvision 0.14.1
torchaudio 0.13.1
pyg 2.3.0
e3nn 0.5.1
sigopt 8.8.2
sigoptlite 0.1.2
gdown 4.7.1
mscorefonts 0.0.1
bokeh 3.3.4
All these packages can be installed using the environment.yml file and conda:
conda env create -f environment.yml
conda activate Perovskite_ML_Environment
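Once the environment is active, a quick way to check that the installed versions match the pins listed above is to compare them with `importlib.metadata`. This is a minimal sketch and is not part of this repo: the helper names `installed_version` and `find_mismatches` are hypothetical, only a subset of the packages is shown, and the dictionary keys are assumed to be the Python distribution names, which can differ from the conda package names (for example, conda's `pytorch` is assumed to be registered as `torch`).

```python
from importlib import metadata

# Subset of the pinned versions listed above. Keys are assumed Python
# distribution names, which may differ from the conda package names.
PINNED = {
    "pandas": "1.5.3",
    "scipy": "1.10.1",
    "numpy": "1.24.3",
    "scikit-learn": "1.2.2",
    "pymatgen": "2023.5.10",
    "ase": "3.22.1",
    "torch": "1.13.1",
    "e3nn": "0.5.1",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Return {name: (expected, found)} for every package that deviates from its pin."""
    mismatches = {}
    for name, expected in pinned.items():
        found = lookup(name)
        # Ignore local build tags such as "+cu117" when comparing.
        base = found.split("+")[0] if found else None
        if base != expected:
            mismatches[name] = (expected, found)
    return mismatches
```

Calling `find_mismatches(PINNED)` inside the activated environment should return an empty dictionary when the listed packages match their pins; a non-empty result names each package along with the expected and installed versions.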
All our data and trained models are archived on Zenodo (DOI: 10.5281/zenodo.13820311) and Materials Data Facility (DOI: 10.18126/ncqt-rh18). Please place all data and model files in the corresponding directories and then refer to the following Jupyter notebooks to reproduce the results of our paper:
- 1_model_training.ipynb: This notebook provides examples of how to train GCNNs on the training dataset and conduct hyperparameter optimization based on the loss on the validation set.
- 2_model_inference.ipynb: This notebook provides examples of how to verify the performance of GCNNs on the validation set, select the top-performing models accordingly, compute predictions on the test and holdout sets, and extract the latent embeddings of CGCNN and e3nn after all message passing and graph convolution layers.
- 3_model_analysis.ipynb: This notebook provides examples of how to reproduce all major figures in the paper.
If you use our code, data, and/or models, please cite the following paper:
@article{peng2024learning,
title={Learning Ordering in Crystalline Materials with Symmetry-Aware Graph Neural Networks},
author={Jiayu Peng and James Damewood and Jessica Karaguesian and Jaclyn R. Lunger and Rafael Gómez-Bombarelli},
journal={arXiv:2409.13851},
url = {https://arxiv.org/abs/2409.13851},
year={2024}
}