Note: this repo is in the process of being updated with the latest code. The paper title and citation have been updated.
This is the TensorFlow implementation of SilhoNet from the paper "SilhoNet: An RGB Method for 6D Object Pose Estimation", published in IROS/RAL 2019. The code supports training, validation, and testing for both the silhouette prediction and 3D orientation estimation stages of the network on the YCB-Video dataset.
SilhoNet: An RGB Method for 6D Object Pose Estimation
Gideon Billings, Matthew Johnson-Roberson
IEEE Robotics and Automation Letters 2019
[arxiv]
- Linux or OSX (Tested on Ubuntu 14.04 and 16.04)
- NVIDIA GPU + CUDA + CuDNN (CPU mode and CUDA without CuDNN may work with minimal modification, but untested)
We assume you have cloned this repo and that the root directory is $SilhoNet_ROOT.
The network requires the YCB-Video dataset, augmented with ground truth silhouette renderings and renderings of the object models.
The YCB-Video dataset can be downloaded from the official project site:
[YCB-Video dataset]
Create a symlink to the root directory of the YCB dataset:
cd $SilhoNet_ROOT
ln -s $YCB_DIR data/YCB
where '$YCB_DIR' is the root directory of the YCB dataset.
The dataset_toolbox folder provides MATLAB scripts for generating the silhouette annotation files.
WARNING: These scripts can take several days to run, depending on how much processing power is available. It is recommended to run them on a CPU server.
Run the script for generating the augmented annotations:
cd $SilhoNet_ROOT/dataset_toolbox
matlab -nodesktop -nosplash -r gen_full_silhouettes
A Blender script, based on the Stanford shapenet renderer, is provided for rendering the model viewpoints needed as part of the network input. This script is located under the folder 'dataset_toolbox/drop-shapenet-renderer'. The README in this folder provides more information about running the script, but if you have Blender installed on your system, the following command will generate the rendered viewpoints expected by the network:
cd $SilhoNet_ROOT/dataset_toolbox/drop-shapenet-renderer
find $YCB_DIR/models -name "*.obj" -print0 | xargs -0 -n1 -P3 -I {} blender --background --python render_blender.py -- --output_folder $YCB_DIR/models/rendered {}
For testing SilhoNet on predicted ROIs, we provide our Faster-RCNN detections file for the keyframe image set. This file should be downloaded to the $SilhoNet_ROOT/data folder.
[Faster-RCNN detections]
For training, the network also requires the COCO-2017 training image set, which can be downloaded from the COCO dataset site:
[COCO dataset]
Create a symlink to the COCO images directory:
cd $SilhoNet_ROOT
ln -s $COCO_DIR/images data/COCO
where '$COCO_DIR' is the root directory of the COCO dataset.
Generate annotations for the synthetic data:
cd $SilhoNet_ROOT/dataset_toolbox
matlab -nodesktop -nosplash -r gen_full_silhouettes
Generate the synthetic bounding box annotation files:
matlab -nodesktop -nosplash -r gen_bboxes_synthetic
Generate the trainsyn.txt image set file, which includes the supplementary synthetic images:
sh gen_synthetic_image_sets.sh
We recommend using the provided docker image to run experiments without modifying your local system setup. These instructions assume you have installed docker with the nvidia-docker wrapper. The SilhoNet code base is not baked into the docker image; it is mounted into the container at runtime for ease of development.
Build the docker image:
cd $SilhoNet_ROOT
sudo docker build -t tensorflow/tensorflow:silhonet .
We have provided a run_docker.sh script for launching the docker image. This script should be modified for your system:
- Replace /home/gidobot/mnt/workspace/neural_networks/tensorflow/SilhoNet with the path to your SilhoNet directory, $SilhoNet_ROOT.
- Replace /home/gidobot/mnt/storage with the storage folder containing the downloaded YCB and COCO datasets. The mounted path and original path to this folder must match for the symlinks to work.
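For reference, below is a minimal sketch of the kind of launch command run_docker.sh wraps. It is illustrative only; the actual script in the repo may differ, and the paths are placeholders to adapt to your system.

#!/bin/bash
# Illustrative launch script only -- the repo's run_docker.sh may differ in detail.
SILHONET_ROOT=/path/to/SilhoNet   # your $SilhoNet_ROOT
STORAGE_DIR=/path/to/storage      # folder containing the downloaded YCB and COCO datasets
# Depending on your docker setup, "nvidia-docker run" or "docker run --gpus all" may be
# needed instead of "--runtime=nvidia". The storage folder is mounted at its original
# path so the data/YCB and data/COCO symlinks resolve inside the container.
sudo docker run --runtime=nvidia -it --rm \
    -v "$SILHONET_ROOT":"$SILHONET_ROOT" \
    -v "$STORAGE_DIR":"$STORAGE_DIR" \
    -p 6006:6006 \
    -w "$SILHONET_ROOT" \
    tensorflow/tensorflow:silhonet bash

The -p 6006:6006 mapping is only needed if you want to view tensorboard from the host browser (see the tensorboard notes below).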
We have provided configuration files with the parameters for replicating the test results reported in the paper. These can be loaded at runtime with the --argsjs parameter. The configuration files load our trained model weights, which can be downloaded from the link below.
[SilhoNet pretrained weights]
Extract the weights file under the data folder by running:
tar -xzvf pretrained_weights.tar.gz -C $SilhoNet_ROOT/data/
The runtime parameters for SilhoNet can be listed by running
python -m scripts.run_silhonet --help
Use the following command to test the silhouette prediction network:
python -m scripts.run_silhonet --mode test-seg --argsjs args/args_silhouette_test.json
By default, the test runs with the YCB dataset ground truth ROIs. To test with the Faster-RCNN predicted ROIs, set use_pred_rois to true in the config file.
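For example, one way to flip this flag without hand-editing the file (a hedged sketch: it assumes jq is installed and that use_pred_rois is a top-level key in the JSON config; jq is not required by the repo, and editing the config file directly works just as well):

# Write a copy of the test config with predicted ROIs enabled, then run with it.
jq '.use_pred_rois = true' args/args_silhouette_test.json > /tmp/args_silhouette_test_pred.json
python -m scripts.run_silhonet --mode test-seg --argsjs /tmp/args_silhouette_test_pred.json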
The network saves the test results to $logdir/table.txt, where logdir is specified in the config file. The columns of the accuracy tables correspond to the threshold values specified by the eval_thresh parameter, where each threshold value is used to convert the probability masks into binary masks. The accuracy values are IoU percentage scores.
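For clarity, our reading of each table entry (this interpretation is ours, not spelled out in the repo): for a threshold t from eval_thresh, the predicted probability mask p is binarized as M_pred(t) = {pixels x : p(x) >= t}, and the entry is

IoU(t) = |M_pred(t) ∩ M_gt| / |M_pred(t) ∪ M_gt| × 100%

averaged over the test examples, where M_gt is the ground truth silhouette mask.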
Use the following command to test the full 3D pose prediction network:
python -m scripts.run_silhonet --mode test-quat --argsjs args/args_pose_test.json
By default, the test runs with the YCB dataset ground truth ROIs. To test with the Faster-RCNN predicted ROIs, set use_pred_rois to true in the config file.
The network saves the test results to $logdir/table.txt and $logdir/angle_errors.mat, where logdir is specified in the config file. The columns of the accuracy table correspond to the angle error threshold values, where the accuracy values are the percentage of predicted poses with an angle error less than the threshold. The angle_errors.mat file is used to plot accuracy against the published PoseCNN results.
To compare accuracy against PoseCNN, run the test with ground truth ROIs and copy the generated angle_errors.mat file to dataset_toolbox/results_SilhoNet/angle_errors_gt.mat. Then run the test with predicted ROIs and copy the generated angle_errors.mat file to dataset_toolbox/results_SilhoNet/angle_errors_pred.mat. There is a MATLAB script to run the evaluation:
matlab -nodesktop -nosplash -r plot_accuracy_keyframe
The plots of the results are saved under the plots subdirectory.
Some test results are summarized to a tensorboard event file under the specified logdir directory. These visualizations can be helpful for debugging and can be viewed in a web browser by running:
tensorboard --logdir $logdir --port $PORT
NOTE: If running tensorboard in docker, you will need to forward the port out of docker to view in your local browser. If logging the results to a location accessible outside of docker (recommended), you can run tensorboard on your local system.
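For example, assuming the container was started with a published port such as -p 6006:6006 (as in the illustrative launch script above; 6006 is simply tensorboard's default port), tensorboard can be bound to all interfaces inside the container and then viewed from the host:

# Inside the container: bind tensorboard to all interfaces on the published port.
tensorboard --logdir $logdir --port 6006 --host 0.0.0.0

Then browse to http://localhost:6006 on the host machine.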
The runtime parameters for SilhoNet can be listed by running
python -m scripts.run_silhonet --help
We provide ImageNet pretrained weights for the VGG16 backbone network, which can be downloaded from the link below.
[VGG16 ImageNet pretrained weights]
Extract the weights file under the data folder by running:
tar -xzvf imagenet_weights.tar.gz -C $SilhoNet_ROOT/data/
Use the following command to train the silhouette prediction network with the default parameters:
python -m scripts.run_silhonet --mode train-seg --argsjs args/args_silhouette_train.json
NOTE: In the release code, it is expected that the silhouette prediction stage is trained before the 3D pose prediction stage, as the network weights for the silhouette prediction stage are loaded for both training and testing the 3D pose prediction stage.
Training checkpoints are saved to the logdir directory specified in the config file.
Use the following command to train the 3D pose prediction network with the default parameters:
python -m scripts.run_silhonet --mode train-quat --argsjs args/args_pose_train.json
Training checkpoints are saved to the logdir directory specified in the config file.
Training is summarized to a tensorboard event file under the specified logdir directory. Visualize training in a web browser by running:
tensorboard --logdir $logdir --port $PORT
NOTE: If running tensorboard in docker, you will need to forward the port out of docker to view in your local browser. If logging the results to a location outside of docker (recommended), you can run tensorboard on your local system.
If you use our code, we request that you cite the following work:
@article{billings2019silhonet,
title={SilhoNet: An RGB Method for 6D Object Pose Estimation},
author={Billings, Gideon and Johnson-Roberson, Matthew},
journal={IEEE Robotics and Automation Letters},
volume={4},
number={4},
pages={3727--3734},
year={2019},
publisher={IEEE}
}