In the BDD Driving Project, we formulate the self-driving task as future egomotion prediction. To tackle this task, we collected the Berkeley DeepDrive Video Dataset with our partner Nexar, proposed an FCN+LSTM model, and implemented it in TensorFlow.
The BDD-V dataset will be released here.
First, clone the codebase to your local file system at `$BDD_ROOT`:

```bash
git clone https://github.com/gy20073/BDD_Driving_Model.git && cd BDD_Driving_Model && export BDD_ROOT=$(pwd)
```
On Ubuntu 14.04 and 16.04, you can install all dependencies with:

```bash
cd $BDD_ROOT && bash setup.sh
```
Alternatively, if you don't want to install Anaconda or you're using another version of Linux, you can install the packages manually:
- TensorFlow 0.11
- ffmpeg and ffprobe
- Python packages: IPython, PIL, opencv-python, scipy, matplotlib, numpy, sklearn
We provide some pretrained models that are ready to use. Download the model zoo here into `$BDD_ROOT/data` and make sure the files end up at locations like `$BDD_ROOT/data/discrete_cnn_lstm/model.ckpt-146001.bestmodel`.
The `tf.caffenet.bin` file is a pretrained AlexNet model generated with the Caffe-TensorFlow conversion tool. It serves as the fine-tuning starting point for the driving model.
With a pretrained model in place, you can test it on your own dashcam video. `wrapper.py` is a simple wrapper that runs the model without requiring you to prepare a TFRecord dataset. It takes in one image every 1/3 second and outputs the predicted future egomotion. See `wrapper_test.ipynb` for an example usage.
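For orientation, below is a minimal sketch of how such a wrapper might be driven from Python. The class and method names (`DrivingModelWrapper`, `predict`) are hypothetical, not the actual `wrapper.py` API; consult `wrapper_test.ipynb` for the real interface.

```python
# Hypothetical usage sketch; DrivingModelWrapper and predict are
# illustrative names only. The actual API is shown in wrapper_test.ipynb.
import cv2
from wrapper import DrivingModelWrapper  # assumed class name

model = DrivingModelWrapper("data/discrete_cnn_lstm/model.ckpt-146001.bestmodel")

cap = cv2.VideoCapture("my_dashcam_video.mp4")
fps = cap.get(cv2.CAP_PROP_FPS) or 30.0   # fall back if FPS metadata is missing
step = max(1, int(round(fps / 3.0)))      # sample one frame every 1/3 second

frame_id = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    if frame_id % step == 0:
        egomotion = model.predict(frame)  # predicted future egomotion
        print(frame_id, egomotion)
    frame_id += 1
cap.release()
```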
TODO: improve visualization.
Download and unzip the dataset's training and validation sets into a directory `$DATA_ROOT`. You should then have directories such as `$DATA_ROOT/train/videos` and `$DATA_ROOT/val/info`.
Then run the following commands to generate video indexes and convert the raw videos to TFRecords. For the validation set:

```bash
cd $BDD_ROOT/data_prepare
python filter.py $DATA_ROOT/val
python prepare_tfrecords.py --video_index=$DATA_ROOT/val/video_filtered_38_60.txt --output_directory=$DATA_ROOT/tfrecords/validation
```
and for the training set:

```bash
cd $BDD_ROOT/data_prepare
python filter.py $DATA_ROOT/train
python prepare_tfrecords.py --video_index=$DATA_ROOT/train/video_filtered_38_60.txt --output_directory=$DATA_ROOT/tfrecords/train
```
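After conversion, a quick sanity check (not part of the repo's pipeline) is to count the serialized examples in each generated shard; the glob path below is a placeholder for your `$DATA_ROOT`:

```python
# Count records in the generated TFRecord shards as a sanity check.
# Uses the 0.x-era tf.python_io record iterator; replace the glob path
# with your actual $DATA_ROOT.
import glob
import tensorflow as tf

for shard in sorted(glob.glob("/path/to/DATA_ROOT/tfrecords/validation/*")):
    n = sum(1 for _ in tf.python_io.tf_record_iterator(shard))
    print("{}: {} records".format(shard, n))
```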
To train a driving model, first set a few path flags in `$BDD_ROOT/config.py`. In particular, set `FLAGS.pretrained_model_path = "$BDD_ROOT/data/tf.caffenet.bin"` and `FLAGS.data_dir = "$DATA_ROOT/tfrecords"`. These are the paths to the ImageNet-pretrained AlexNet model and to the TFRecords produced in the data preparation step.
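For concreteness, the two edited flags might read as follows; the paths are placeholders to be replaced with your actual locations, and the surrounding flag-definition code in `config.py` may look different:

```python
# Inside $BDD_ROOT/config.py; both paths below are placeholders.
FLAGS.pretrained_model_path = "/path/to/BDD_ROOT/data/tf.caffenet.bin"
FLAGS.data_dir = "/path/to/DATA_ROOT/tfrecords"
```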
The paper proposes several types of models, all implemented in this repo. Each model's configuration is a function in `config.py`, such as `discrete_tcnn1` and `continuous_datadriven_bin`. The `discrete_tcnn1` model uses a temporal convolution with window size 1 and predicts discrete driving actions: `Go`, `Stop`, `Left`, and `Right`. The `continuous_datadriven_bin` model is a CNN-LSTM style model that predicts continuous egomotion, namely future angular velocity and future speed, and discretizes those values with a data-driven binning method.
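To illustrate the idea behind data-driven binning (a sketch of the general technique, not the repo's exact code): instead of spacing bin edges uniformly, the edges are placed at quantiles of the observed egomotion values, so each bin receives roughly the same number of training samples.

```python
# Sketch of data-driven binning: place bin edges at equally spaced
# quantiles of the observed values so each bin holds roughly equal mass.
# This illustrates the general idea only; the repo's implementation may differ.
import numpy as np

def datadriven_bin_edges(values, num_bins):
    """Return num_bins + 1 edges located at equally spaced quantiles."""
    qs = np.linspace(0.0, 100.0, num_bins + 1)
    return np.percentile(values, qs)

def discretize(value, edges):
    """Map a continuous value to a bin index in [0, num_bins)."""
    idx = np.searchsorted(edges, value, side="right") - 1
    return int(np.clip(idx, 0, len(edges) - 2))
```

The `stat` step described below gathers exactly this kind of label distribution before training begins.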
We will use `discrete_tcnn1` as a running example; the training procedure for the other models is similar. To train the model, run:

```bash
cd $BDD_ROOT && python config.py train discrete_tcnn1
```
For the `continuous_datadriven_bin` model, we need to compute the label distribution before the actual training. To get the distribution, run:

```bash
cd $BDD_ROOT && python config.py stat continuous_datadriven_bin
```
During training, the program writes checkpoints and logs to `$BDD_ROOT/data/discrete_tcnn1`. To monitor validation performance, we can start a second process that evaluates the model periodically:

```bash
cd $BDD_ROOT && python config.py eval discrete_tcnn1
```
One can also use TensorBoard to monitor the training progress visually:

```bash
cd $BDD_ROOT/data && tensorboard --logdir=. --port=8888
```

and open it at http://localhost:8888.
If you want to cite our paper, please use the following BibTeX entry:

```
@article{xu2016end,
  title={End-to-end learning of driving models from large-scale video datasets},
  author={Xu, Huazhe and Gao, Yang and Yu, Fisher and Darrell, Trevor},
  journal={arXiv preprint arXiv:1612.01079},
  year={2016}
}
```