Code for the paper *Learning Natural Language Generation with Truncated Reinforcement Learning*, accepted at NAACL-HLT 2022 (Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies).
This repo uses the CLEVR and VQAv2 datasets.
- Get the data:

```bash
gdown --id 1AVZXRzmKBxVH6Ul9ZviSWPVdj_kwU3yX --output data/data.zip
unzip data/data.zip -d data/
rm data/data.zip
rm data/vqa-v2/cache/vocab.json
```
- If you want the whole CLEVR dataset:

```bash
rm -r data/CLEVR_v1.0
wget https://dl.fbaipublicfiles.com/clevr/CLEVR_v1.0.zip -O data/CLEVR_v1.0.zip
unzip data/CLEVR_v1.0.zip -d data
rm data/CLEVR_v1.0.zip
```
- If you want the whole VQA-v2 dataset (a quick check of the downloaded LMDB is sketched after this setup list):

```bash
wget https://dl.fbaipublicfiles.com/vilbert-multi-task/datasets/coco/features_100/COCO_trainval_resnext152_faster_rcnn_genome.lmdb/data.mdb
mv data.mdb data/vqa-v2/coco_trainval.lmdb/
```
- To download the CLEVR dialog data on 20,000 images, used to train the external language model:

```bash
gdown --id 1BSqXY6KV4wOxo6tdjP7xej54gMvqk7k1 --output data/clevr_ext/clevr_dialog_train_raw.json
```
- To get the necessary models:

```bash
chmod +x sh/files/get_models.sh
sh/files/get_models.sh
```
- You can create a conda environment called rl-nlp:

```bash
conda create -n rl-nlp
```
- And activate it:

```bash
conda activate rl-nlp
```
- The required libraries can be installed via the file requirements.txt:

```bash
pip install -r requirements.txt
```
- The code relies on the CLOSURE GitHub repo; you need to install it with:

```bash
python -m pip install git+https://github.com/gqkc/CLOSURE.git --upgrade
```
- And on the ViLBERT multi-task GitHub repo:

```bash
python -m pip install git+https://github.com/gqkc/vilbert-multi-task.git --upgrade
```
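Once the VQA-v2 image-feature LMDB is in place, a quick way to check that it is readable is the short Python sketch below; only the lmdb package and the path data/vqa-v2/coco_trainval.lmdb come from the commands above, the rest is illustrative and not part of the repo:

```python
# Hypothetical sanity check (not part of the repo): count the keys stored in the
# downloaded image-feature LMDB, without assuming anything about the value format.
import lmdb  # pip install lmdb

env = lmdb.open("data/vqa-v2/coco_trainval.lmdb", readonly=True, lock=False)
with env.begin() as txn:
    print("entries in coco_trainval.lmdb:", txn.stat()["entries"])
env.close()
```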
The repository is organised as follows:

```
RL-NLP
├── config                               # configuration files to create/train models
├── output                               # output of the experiments (pre-trained models, logs...)
|   ├── lm_model/model.pt                # pre-trained language model (.pt) on the CLEVR dataset
|   ├── SL_LSTM_32_64/model.pt           # pre-trained policy (.pt) on the CLEVR dataset
|   ├── SL_LSTM_32_64_VQA/model.pt       # pre-trained policy conditioned on the answer (for the VQA reward) on the CLEVR dataset
|   ├── vqa_model_film/model.pt          # pre-trained oracle model for the "vqa" reward of the CLEVR dataset
|   ├── lm_model_vqa/model.pt            # pre-trained language model
|   ├── vqa_policy_512_1024_answer/model.pt
|   └── vilbert_vqav2
|       ├── model.bin                    # ViLBERT oracle fine-tuned on the VQA-v2 task
|       └── bert_base_6layer_6conect.json  # config file for the ViLBERT oracle
├── data
|   ├── CLEVR_v1.0                       # root folder for the CLEVR dataset
|   ├── vqa-v2                           # root folder for the VQA-v2 dataset
|   |   ├── coco_trainval.lmdb           # lmdb folder for the image features (reduced one on a local machine, complete one on a VM)
|   |   └── cache
|   |       └── vocab.json               # vocab for the VQA-v2 dataset
|   ├── closure_vocab.json               # vocab for the CLOSURE dataset (used for the "vqa" reward of CLEVR)
|   ├── vocab.json                       # vocab for the CLEVR dataset
|   ├── train_questions.h5               # h5 file for the training questions
|   ├── val_questions.h5                 # h5 file for the validation questions
|   ├── test_questions.h5                # h5 file for the test questions
|   ├── train_features.h5                # h5 file for the training image features
|   ├── val_features.h5                  # h5 file for the validation image features
|   └── test_features.h5                 # h5 file for the test image features
└── src                                  # source files
```
- To run all the scripts from the root of the repo (RL-NLP), first run the following command line:

```bash
export PYTHONPATH=src:${PYTHONPATH}
```
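As an optional, hedged sanity check that the path is set (the importable package names below are an assumption based on the src/preprocessing and src/train paths used in this README):

```python
# Assumed layout: src/preprocessing and src/train become importable once
# PYTHONPATH includes src/. Run from the repo root.
import preprocessing
import train

print("src/ is on the PYTHONPATH")
```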
To preprocess the questions of the three CLEVR splits, run the script src/sh/preprocess_questions or the 3 following command lines (in this order):
```bash
python src/preprocessing/preprocess_questions.py -data_path "data/CLEVR_v1.0/questions/CLEVR_train_questions.json" \
    -out_vocab_path "data/vocab.json" -out_h5_path "data/train_questions.h5" -min_token_count 1

python src/preprocessing/preprocess_questions.py -data_path "data/CLEVR_v1.0/questions/CLEVR_val_questions.json" \
    -out_vocab_path "data/vocab.json" -out_h5_path "data/val_questions.h5" -min_token_count 1

python src/preprocessing/preprocess_questions.py -data_path "data/CLEVR_v1.0/questions/CLEVR_test_questions.json" \
    -out_vocab_path "data/vocab.json" -out_h5_path "data/test_questions.h5" -min_token_count 1
```
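To take a quick look at what was produced, the sketch below (not part of the repo) only prints whatever the preprocessing wrote, since the exact dataset names inside the h5 files are defined by preprocess_questions.py:

```python
# Inspect the generated CLEVR vocab and training questions; nothing about the
# internal structure is assumed, the script just lists what it finds.
import json
import h5py

with open("data/vocab.json") as f:
    vocab = json.load(f)
print("vocab type:", type(vocab).__name__)
if isinstance(vocab, dict):
    print("vocab keys:", list(vocab.keys()))

with h5py.File("data/train_questions.h5", "r") as f:
    for name in f:
        obj = f[name]
        print(name, getattr(obj, "shape", None), getattr(obj, "dtype", None))
```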
To extract the image features, run the script src/sh/extract_features.py or the 3 following command lines (the batch_size arg must be tuned depending on memory availability):
```bash
python src/preprocessing/extract_features.py \
    --input_image_dir data/CLEVR_v1.0/images/train \
    --output_h5_file data/train_features.h5 --batch_size 128

python src/preprocessing/extract_features.py \
    --input_image_dir data/CLEVR_v1.0/images/val \
    --output_h5_file data/val_features.h5 --batch_size 128

python src/preprocessing/extract_features.py \
    --input_image_dir data/CLEVR_v1.0/images/test \
    --output_h5_file data/test_features.h5 --batch_size 128
```
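The same kind of quick, non-authoritative check works for the feature files (dataset names and shapes are whatever extract_features.py wrote):

```python
# List the datasets stored in the extracted feature file.
import h5py

with h5py.File("data/train_features.h5", "r") as f:
    for name in f:
        obj = f[name]
        print(name, getattr(obj, "shape", None), getattr(obj, "dtype", None))
```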
To preprocess the VQA-v2 dataset, first build the vocabularies:

```bash
python src/preprocessing/preprocess_vqa_dataset.py -data_path "data/vqa-v2" -features_path "data/vqa-v2/coco_trainval.lmdb" -vocab_path "none" -min_split 0
```

This creates a file "vocab.json" with the vocab.

```bash
python src/preprocessing/preprocess_vqa_dataset.py -data_path "data/vqa-v2" -features_path "data/vqa-v2/coco_trainval.lmdb" -vocab_path "none" -min_split 1
```

This creates a file "vocab_min.json" with the reduced vocab.

Then build the dataset entries:

```bash
python src/preprocessing/preprocess_vqa_dataset.py -data_path "data/vqa-v2" -features_path "data/vqa-v2/coco_trainval.lmdb" -vocab_path "data/vqa-v2/cache/vocab.json" -split "train" -min_split 0 -test 1
```

This will create a file "train_entries.pkl".

```bash
python src/preprocessing/preprocess_vqa_dataset.py -data_path "data/vqa-v2" -features_path "data/vqa-v2/coco_trainval.lmdb" -vocab_path "data/vqa-v2/cache/vocab.json" -split "val" -min_split 0 -test 1
```

This will create a file "val_entries.pkl".

```bash
python src/preprocessing/preprocess_vqa_dataset.py -data_path "data/vqa-v2" -features_path "data/vqa-v2/coco_trainval.lmdb" -vocab_path "data/vqa-v2/cache/vocab.json" -split "train" -min_split 1 -test 0
```

This will create a file "mintrain_entries.pkl".

```bash
python src/preprocessing/preprocess_vqa_dataset.py -data_path "data/vqa-v2" -features_path "data/vqa-v2/coco_trainval.lmdb" -vocab_path "data/vqa-v2/cache/vocab.json" -split "val" -min_split 1 -test 0
```

This will create a file "minval_entries.pkl".
Reduced datasets on the reduced vocab (train_dataset = 20,000 questions & val_dataset = 5,000 questions):

```bash
python src/preprocessing/preprocess_vqa_dataset.py -data_path "data/vqa-v2" -features_path "data/vqa-v2/coco_trainval.lmdb" -vocab_path "data/vqa-v2/cache/vocab_min.json" -split "train" -min_split 1 -test 0
```

This will create a file "mintrain_minvocab_entries.pkl".

```bash
python src/preprocessing/preprocess_vqa_dataset.py -data_path "data/vqa-v2" -features_path "data/vqa-v2/coco_trainval.lmdb" -vocab_path "data/vqa-v2/cache/vocab_min.json" -split "val" -min_split 1 -test 0
```

This will create a file "minval_minvocab_entries.pkl".
To install the ViLBERT multi-task repo (and its refer tool) from source:

```bash
cd ..
git clone https://github.com/gqkc/vilbert-multi-task.git
cd vilbert-multi-task/tools/
git clone -b python3 https://github.com/lichengunc/refer.git
cd refer
make
# if this fails, do instead:
python setup.py install
cd ../../
python -m pip install -e .
# if there is a problem with the Python build, upgrade cython first:
python -m pip install --upgrade cython
```
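A minimal, optional check that the editable install worked; the top-level package name vilbert is an assumption based on the repository name:

```python
# If the install succeeded, the vilbert package should be importable.
import vilbert

print("vilbert installed at:", vilbert.__file__)
```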
- Language Model .pt file here.
- Levenshtein Task:
- Pretrained Policy .pt file (word_emb_size = 32, hidden_size = 64) here.
- VQA task:
python src/train/launch_train.py -task "lm" -dataset "clevr" -model "lstm" -num_layers 1 -emb_size 512 -hidden_size 512 -p_drop 0.1 -lr 0.001 -data_path "data" -out_path "output" -bs 512 -ep 20 -num_workers 6
python src/train/launch_train.py -task "lm" -dataset "vqa" -model "lstm" -num_layers 1 -emb_size 512 -hidden_size 512 -p_drop 0.1 -lr 0.001 -data_path "data/vqa-v2" -features_path "data/vqa-v2/coco_trainval.lmdb" -out_path "output" -bs 512 -ep 50 -num_workers 6
python src/train/launch_train.py -task "policy" -dataset "clevr" -data_path "data" -out_path "output/policy_pre_training" -emb_size 32 -hidden_size 64 -bs 512 -ep 50 -num_workers 0 -max_samples 21 -fusion "cat"
N.B: When training only on a CPU, the max_samples args is required to train only on a subset of the dataset.
python src/train/launch_train.py -task "policy" -dataset "clevr" -data_path "data" -out_path "output/policy_pre_training" -emb_size 32 -hidden_size 64 -bs 512 -ep 50 -num_workers 0 -max_samples 21 -fusion "cat" -condition_answer "after_fusion"
python src/train/launch_train.py -task "policy" -dataset "vqa" -data_path "data" -out_path "output/policy_pre_training" -emb_size 32 -hidden_size 64 -bs 512 -ep 50 -num_workers 0 -fusion "average"
python src/train/launch_train.py -task "policy" -dataset "vqa" -data_path "data" -out_path "output/policy_pre_training" -emb_size 32 -hidden_size 64 -bs 512 -ep 50 -num_workers 0 -fusion "average" -condition_answer "after_fusion"
- See examples in src/scripts/sh.
- The folder "debug" allows to run small experiments on each of the algo for the 2 CLEVR tasks (Levenshtein & VQA rewards).
To visualize the training logs with TensorBoard (here for the output folder output/2000_img_len_20):

```bash
cd output/2000_img_len_20
tensorboard --logdir=experiments/train
```
With a GCP VM (instance name here: alice_martindonati@pytorch-3-vm), run on the local machine:

```bash
gcloud compute ssh alice_martindonati@pytorch-3-vm -- -NfL 6006:localhost:6006
```