This repository contains the source code for our IROS 2022 paper "Grounding Commands for Autonomous Vehicles via Region-specific Dynamic Layer Attention".
Our code is built on the excellent VOLTA repository.
We also provide a demo video of our model here.
If you use this code, please cite our paper:
@inproceedings{chan2022grounding,
  title={Grounding Commands for Autonomous Vehicles via Layer Fusion with Region-specific Dynamic Layer Attention},
  author={Hou Pong Chan and Mingxi Guo and Cheng-Zhong Xu},
  booktitle={Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year={2022}
}
1. Create a fresh conda environment, and install all dependencies.
conda create -n volta python=3.6
conda activate volta
pip install -r requirements.txt
2. Install PyTorch:
conda install pytorch=1.4.0 torchvision=0.5 cudatoolkit=10.1 -c pytorch
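As a quick sanity check (our addition, not part of the original setup), verify that PyTorch is installed and sees the GPU:

```bash
# Should print 1.4.0 and True on a correctly configured CUDA 10.1 machine
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```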
3. Install apex. If you use a cluster, you may want to first run commands like the following:
module load cuda/10.1.105
module load gcc/8.3.0-cuda
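The apex build itself is not spelled out here; the following is a minimal sketch based on NVIDIA's standard apex installation instructions of that era (the clone URL and build flags are the usual ones, not taken from this repo):

```bash
git clone https://github.com/NVIDIA/apex
cd apex
# Build with the C++ and CUDA extensions enabled (standard apex options)
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
cd ..
```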
4. Set up the refer submodule for Referring Expression Comprehension:
cd tools/refer; make
5. Install this codebase as a package in this environment.
python setup.py develop
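Optionally, confirm the package is importable (a sanity check we add here; the top-level package name volta is an assumption based on the VOLTA codebase this repo builds on):

```bash
python -c "import volta; print('volta installed')"
```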
We conduct experiments on the Talk2Car dataset. If you use this dataset, please cite their paper:
Thierry Deruyttere, Simon Vandenhende, Dusan Grujicic, Luc Van Gool, Marie-Francine Moens:
Talk2Car: Taking Control of Your Self-Driving Car. EMNLP 2019
The following are our preprocessed data files:
First, create the data directory: mkdir ./data/talk2car/
Download the image tar.gz file from here, extract it, and move the images directory to ./data/talk2car/images, i.e., mv images ./data/talk2car/images.
Download the mapping file from here and move it to ./data/talk2car/talk2car_w_rpn_no_duplicates.json.
Download the regions extracted by CenterNet (we keep only the top 36 regions) from here and move this file to ./data/talk2car/talk2car_centernet_dets_36.json.
Download the instances.json file from here and move it to ./data/talk2car/annotations/talk2car/instances.json.
Download the refs_spacy.p file from here and move it to ./data/talk2car/annotations/talk2car/refs_spacy.p.
Download the region features extracted by Faster R-CNN from here (the password is RSDL), unzip data.zip.001 and data.zip.002, then move the files to ./data/talk2car/resnet101_faster_rcnn_genome_imgfeats_centernet/volta/refcoco+_unc_dets36_feat.lmdb/lock.mdb and ./data/talk2car/resnet101_faster_rcnn_genome_imgfeats_centernet/volta/refcoco+_unc_dets36_feat.lmdb/data.mdb.
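For convenience, the whole data setup above can be scripted. A minimal sketch, assuming all downloaded files sit in the current directory, that the archive names below match your downloads, and that the split zip parts can be rejoined by simple byte concatenation (verify these against what you actually downloaded):

```bash
#!/usr/bin/env bash
set -e

mkdir -p ./data/talk2car/annotations/talk2car

# Images (archive name assumed)
tar -xzf images.tar.gz
mv images ./data/talk2car/images

# Mapping file and CenterNet regions
mv talk2car_w_rpn_no_duplicates.json ./data/talk2car/
mv talk2car_centernet_dets_36.json ./data/talk2car/

# Annotations
mv instances.json refs_spacy.p ./data/talk2car/annotations/talk2car/

# Faster R-CNN region features: rejoin the split archive, then extract with the password
cat data.zip.001 data.zip.002 > data.zip
unzip -P RSDL data.zip

# The archive is assumed to extract lock.mdb and data.mdb at the top level
LMDB_DIR=./data/talk2car/resnet101_faster_rcnn_genome_imgfeats_centernet/volta/refcoco+_unc_dets36_feat.lmdb
mkdir -p "$LMDB_DIR"
mv lock.mdb data.mdb "$LMDB_DIR"/
```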
Download the pre-trained UNITER and LXMERT checkpoints provided by VOLTA:
wget https://sid.erda.dk/share_redirect/FeYIWpMSFg
mv FeYIWpMSFg checkpoints/conceptual_captions/ctrl_uniter/ctrl_uniter_base/pytorch_model_9.bin
wget https://sid.erda.dk/share_redirect/Dp1g16DIA5
mv Dp1g16DIA5 checkpoints/conceptual_captions/ctrl_lxmert/ctrl_lxmert/pytorch_model_9.bin
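Note that the target directories may not exist in a fresh checkout; if the mv commands above fail, create them first (our addition, inferred from the checkpoint paths above):

```bash
mkdir -p checkpoints/conceptual_captions/ctrl_uniter/ctrl_uniter_base
mkdir -p checkpoints/conceptual_captions/ctrl_lxmert/ctrl_lxmert
```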
We provide sample scripts to train our RSD-UNITER and RSD-LXMERT models: examples/ctrl_uniter/talk2car/train_RSD_uniter.sh and examples/ctrl_lxmert/talk2car/train_RSD_lxmert.sh.
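For example, to launch RSD-UNITER training (both scripts are invoked the same way, from the repository root):

```bash
bash examples/ctrl_uniter/talk2car/train_RSD_uniter.sh
```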
Run the following script to construct a mapping between the ID of each sample and the corresponding token in the Talk2Car leaderboard:
python3 generate_token.py
Run inference on the validation and test sets (the computed AP50 score on the test set is always 0, since we do not have the ground truth): examples/ctrl_lxmert/talk2car/val_RSD_uniter.sh and examples/ctrl_lxmert/talk2car/test_RSD_uniter.sh.
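For example (script paths as listed above):

```bash
bash examples/ctrl_lxmert/talk2car/val_RSD_uniter.sh   # validation set
bash examples/ctrl_lxmert/talk2car/test_RSD_uniter.sh  # test set; AP50 is reported as 0 here
```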
Export the predictions to a JSON file:
python generate_prediction.py --result_path ./results/talk2car/ctrl_uniter/pytorch_model_best.bin-
Submit the JSON file ./results/talk2car/ctrl_uniter/pytorch_model_best.bin/predictions_for_leaderboard.json to the Talk2Car leaderboard here (via the "create submission" button).
To visualize the predictions, please take a look at plot_prediction.py.