This repository provides:
- Reproduction guides and training results for:
  - NAQANet
  - NABERT
  - NABERT+
- A retrained NABERT+ with BERT-Large, which gained 8.2% EM and 7.4% F1 on the dev set of DROP (Discrete Reasoning Over the content of Paragraphs).
- A detailed report

The code and training configs are based on @raylin1000's NABERT model and AI2.
```shell
# clone project
git clone https://github.com/BC-Li/nabert-large
cd nabert-large

# [OPTIONAL] create conda environment
conda create -n myenv python=3.7
conda activate myenv

# install PyTorch according to the instructions at
# https://pytorch.org/get-started/

# install requirements
pip install -r requirements.txt
```
Training requires ~8 GB of GPU memory and takes about 20 hours on an RTX 3090 for the model to reach convergence.

```shell
allennlp train /nabert-large/src/baseline/config/naqanet.jsonnet -s /nabert-large/src/baseline/storage --include-package baseline
```

```shell
allennlp train /nabert/src/nabert/config/nabert.json -s /nabert/src/nabert/storage --include-package nabert
```
Use the config from src/nabert-large/config. Make sure your GPU has more than 22 GB of memory; training took about 20 hours on a single RTX 3090, with early stopping to avoid overfitting.

```shell
allennlp train /nabert/src/nabert/config/nabert-large.json -s /nabert-large/src/nabert-large/storage --include-package nabert-large
```
AllenNLP also supports TensorBoard. To open it, point the --logdir parameter at your storage directory and run:

```shell
tensorboard --logdir="/nabert-large/src/nabert-large/storage"
```
| Model | EM | F1 |
|---|---|---|
| NAQANet | 46.20 | 49.24 |
| NABERT | 54.67 | 57.64 |
| NABERT+ | 62.67 | 66.29 |
| NumNet | 64.92 | 68.31 |
| NABERT-Large+ (dropout=0.11) | 67.82 | 71.25 |
| OPERA (current rank 1 on the DROP leaderboard) | 86.79 | 89.41 |
| Human | 94.90 | 96.42 |
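For reference, here is a simplified sketch of how EM and F1 scores like those above are computed. This is not the official DROP evaluation script (which also handles numbers and multi-span answers specially); it only illustrates the basic normalize-then-compare idea on single-string answers.

```python
import string
from collections import Counter

def normalize(text: str) -> list:
    """Lowercase, strip punctuation, drop articles, split on whitespace."""
    text = "".join(ch for ch in text.lower() if ch not in string.punctuation)
    return [t for t in text.split() if t not in {"a", "an", "the"}]

def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized token sequences are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))

def f1(prediction: str, gold: str) -> float:
    """Token-level F1: harmonic mean of precision and recall over bags of tokens."""
    pred_tokens, gold_tokens = normalize(prediction), normalize(gold)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, "three touchdowns" vs. a gold answer of "two touchdowns" gets EM 0 but F1 0.5, since one of two tokens overlaps on each side.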
(TensorBoard screenshots for each model are omitted here: Train Batch EM, Train Batch F1, Train Loss, Validation EM, Validation F1, and Validation/Train EM and F1 curves.)