This repository contains the source code for our IROS 2022 paper "Grounding Commands for Autonomous Vehicles via Region-specific Dynamic Layer Attention".
Our code is built on the excellent VOLTA repository.
We also provide a demo video of our model here.
If you use this code, please cite our paper:
@inproceedings{chan2022grounding,
  title={Grounding Commands for Autonomous Vehicles via Layer Fusion with Region-specific Dynamic Layer Attention},
  author={Hou Pong Chan and Mingxi Guo and Cheng-Zhong Xu},
  booktitle={Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year={2022}
}
1. Create a fresh conda environment, and install all dependencies.
conda create -n volta python=3.6
conda activate volta
pip install -r requirements.txt
2. Install PyTorch:
conda install pytorch=1.4.0 torchvision=0.5 cudatoolkit=10.1 -c pytorch
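As a quick sanity check (our addition, not part of the original setup), verify that PyTorch is installed and sees the GPU:

```bash
# Should print 1.4.0 and True on a correctly configured CUDA 10.1 machine
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```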
3. Install apex. If you use a cluster, you may want to first run commands like the following:
module load cuda/10.1.105
module load gcc/8.3.0-cuda
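The apex build itself is not spelled out here; the following is a minimal sketch based on NVIDIA's standard apex installation instructions of that era (the clone URL and build flags are the usual ones, not taken from this repo):

```bash
git clone https://github.com/NVIDIA/apex
cd apex
# Build with the C++ and CUDA extensions enabled (standard apex options)
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
cd ..
```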
4. Set up the refer submodule for Referring Expression Comprehension:
cd tools/refer; make
5. Install this codebase as a package in this environment.
python setup.py develop
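Optionally, confirm the package is importable (a sanity check we add here; the top-level package name volta is an assumption based on the VOLTA codebase this repo builds on):

```bash
python -c "import volta; print('volta installed')"
```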
We conduct experiments on the Talk2Car dataset. If you use this dataset, please cite their paper:
Thierry Deruyttere, Simon Vandenhende, Dusan Grujicic, Luc Van Gool, Marie-Francine Moens:
Talk2Car: Taking Control of Your Self-Driving Car. EMNLP 2019
The following are our preprocessed data files:
First, create the data directory: mkdir ./data/talk2car/
Download the image tar.gz file from here, extract it, and move the images directory to ./data/talk2car/images, i.e., mv images ./data/talk2car/images.
Download the mapping file from here and move it to ./data/talk2car/talk2car_w_rpn_no_duplicates.json.
Download the regions extracted by CenterNet (we keep only the top 36 regions) from here and move this file to ./data/talk2car/talk2car_centernet_dets_36.json.
Download the instances.json file from here and move it to ./data/talk2car/annotations/talk2car/instances.json.
Download the refs_spacy.p file from here and move it to ./data/talk2car/annotations/talk2car/refs_spacy.p.
Download the region features extracted by Faster R-CNN from here (the password is RSDL), unzip data.zip.001 and data.zip.002, then move the files to ./data/talk2car/resnet101_faster_rcnn_genome_imgfeats_centernet/volta/refcoco+_unc_dets36_feat.lmdb/lock.mdb and ./data/talk2car/resnet101_faster_rcnn_genome_imgfeats_centernet/volta/refcoco+_unc_dets36_feat.lmdb/data.mdb.
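For convenience, the whole data setup above can be scripted. A minimal sketch, assuming all downloaded files sit in the current directory, that the archive names below match your downloads, and that the split zip parts can be rejoined by simple byte concatenation (verify these against what you actually downloaded):

```bash
#!/usr/bin/env bash
set -e

mkdir -p ./data/talk2car/annotations/talk2car

# Images (archive name assumed)
tar -xzf images.tar.gz
mv images ./data/talk2car/images

# Mapping file and CenterNet regions
mv talk2car_w_rpn_no_duplicates.json ./data/talk2car/
mv talk2car_centernet_dets_36.json ./data/talk2car/

# Annotations
mv instances.json refs_spacy.p ./data/talk2car/annotations/talk2car/

# Faster R-CNN region features: rejoin the split archive, then extract with the password
cat data.zip.001 data.zip.002 > data.zip
unzip -P RSDL data.zip

# The archive is assumed to extract lock.mdb and data.mdb at the top level
LMDB_DIR=./data/talk2car/resnet101_faster_rcnn_genome_imgfeats_centernet/volta/refcoco+_unc_dets36_feat.lmdb
mkdir -p "$LMDB_DIR"
mv lock.mdb data.mdb "$LMDB_DIR"/
```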
Download the pre-trained UNITER and LXMERT checkpoints provided by VOLTA:
wget https://sid.erda.dk/share_redirect/FeYIWpMSFg
mv FeYIWpMSFg checkpoints/conceptual_captions/ctrl_uniter/ctrl_uniter_base/pytorch_model_9.bin
wget https://sid.erda.dk/share_redirect/Dp1g16DIA5
mv Dp1g16DIA5 checkpoints/conceptual_captions/ctrl_lxmert/ctrl_lxmert/pytorch_model_9.bin
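Note that the target directories may not exist in a fresh checkout; if the mv commands above fail, create them first (our addition, inferred from the checkpoint paths above):

```bash
mkdir -p checkpoints/conceptual_captions/ctrl_uniter/ctrl_uniter_base
mkdir -p checkpoints/conceptual_captions/ctrl_lxmert/ctrl_lxmert
```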
We provide sample scripts to train our RSD-UNITER and RSD-LXMERT models: examples/ctrl_uniter/talk2car/train_RSD_uniter.sh and examples/ctrl_lxmert/talk2car/train_RSD_lxmert.sh.
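For example, to launch RSD-UNITER training (both scripts are invoked the same way, from the repository root):

```bash
bash examples/ctrl_uniter/talk2car/train_RSD_uniter.sh
```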
Run the following script to construct a mapping between the ID of each sample and the corresponding token in the Talk2Car leaderboard:
python3 generate_token.py
Run inference on the validation and test sets (the computed AP50 score on the test set is always 0, since we do not have the ground truth): examples/ctrl_lxmert/talk2car/val_RSD_uniter.sh and examples/ctrl_lxmert/talk2car/test_RSD_uniter.sh.
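For example (script paths as listed above):

```bash
bash examples/ctrl_lxmert/talk2car/val_RSD_uniter.sh   # validation set
bash examples/ctrl_lxmert/talk2car/test_RSD_uniter.sh  # test set; AP50 is reported as 0 here
```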
Export the predictions to a JSON file:
python generate_prediction.py --result_path ./results/talk2car/ctrl_uniter/pytorch_model_best.bin-
Submit the JSON file ./results/talk2car/ctrl_uniter/pytorch_model_best.bin/predictions_for_leaderboard.json to the Talk2Car leaderboard here (via the "create submission" button).
To visualize the predictions, please take a look at plot_prediction.py.