In CVPR 2017. [Project Website].
Deepak Pathak, Ross Girshick, Piotr Dollár, Trevor Darrell, Bharath Hariharan
University of California, Berkeley
Facebook AI Research (FAIR)
This is the code for our CVPR 2017 paper on unsupervised learning using unlabeled videos. This repository contains models trained by the unsupervised motion grouping algorithm, in both Caffe and Torch. If you find this work useful in your research, please cite:
@inproceedings{pathakCVPR17learning,
Author = {Pathak, Deepak and Girshick, Ross and Doll\'{a}r,
Piotr and Darrell, Trevor and Hariharan, Bharath},
Title = {Learning Features by Watching Objects Move},
Booktitle = {Computer Vision and Pattern Recognition ({CVPR})},
Year = {2017}
}
The models below contain only the layers that are used for unsupervised transfer learning. For the full model that includes motion segmentation, see the next section.
- Clone the repository
git clone https://github.com/pathak22/unsupervised-video.git
- Fetch caffe models
cd unsupervised-video/
bash ./models/download_caffe_models.sh
# This will populate the `./models/` folder with trained models.
The models were initially trained in Torch and then converted to Caffe. Hence, please include the pycaffe-based image_transform_layer.py in your folder. It converts the scale and mean of the input image as needed.
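To make the idea of a scale-and-mean conversion concrete, here is a minimal NumPy sketch. It is not the repository's image_transform_layer.py: the function name `convert_scale_and_mean` and both constants are hypothetical placeholders chosen for illustration, and the real values should be taken from the layer shipped with the code.

```python
import numpy as np

# Hypothetical constants for illustration only; the actual scale and mean
# are defined by the repository's image_transform_layer.py.
ASSUMED_SCALE = 1.0 / 255.0
ASSUMED_MEAN = np.array([0.5, 0.5, 0.5], dtype=np.float32)


def convert_scale_and_mean(image, scale=ASSUMED_SCALE, mean=ASSUMED_MEAN):
    """Rescale an H x W x 3 image, then subtract a per-channel mean."""
    image = np.asarray(image, dtype=np.float32)
    # Broadcasting applies the 3-element mean to every pixel.
    return image * scale - mean


# A 2 x 2 mid-gray image (value 127.5) ends up approximately at zero
# under the assumed constants.
demo = np.full((2, 2, 3), 127.5, dtype=np.float32)
converted = convert_scale_and_mean(demo)
```

Whether a given model also needs a channel reordering or a different value range depends on how it was converted, so treat this only as a picture of the arithmetic involved.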
- Fetch torch models
cd unsupervised-video/
bash ./models/download_torch_models.sh
# This will populate the `./models/` folder with trained models.
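After running either download script, it can help to confirm what actually landed in `./models/`. The helper below is a small sketch for that purpose; the function name `list_downloaded_models` and the extension-to-framework mapping are assumptions, since the exact file names fetched by the scripts are not listed here.

```python
from pathlib import Path

# Assumed mapping from file extension to framework; the download scripts
# may fetch files with other extensions as well.
ASSUMED_EXTENSIONS = {
    ".caffemodel": "caffe",
    ".prototxt": "caffe",
    ".t7": "torch",
}


def list_downloaded_models(models_dir="./models"):
    """Group files found under models_dir by their assumed framework."""
    found = {"caffe": [], "torch": []}
    root = Path(models_dir)
    if not root.is_dir():
        # Nothing downloaded yet, or the path is wrong.
        return found
    for path in sorted(root.rglob("*")):
        framework = ASSUMED_EXTENSIONS.get(path.suffix)
        if framework is not None and path.is_file():
            found[framework].append(str(path))
    return found


if __name__ == "__main__":
    for framework, files in list_downloaded_models().items():
        print(framework, len(files), "file(s)")
        for name in files:
            print("  ", name)
```

Run it from the `unsupervised-video/` directory so that the default `./models` path resolves correctly.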
Follow the instructions below to download the full motion segmentation model trained on the automatically selected 205K videos from YFCC100m. I trained it in Torch, but you can train your own model in any deep learning package from the full data available here, using the training details from the paper.
cd unsupervised-video/
bash ./models/download_torch_motion_model.sh
# This will populate the `./models/` folder with the trained model.
cd motionseg/
th load_motionmodel.lua -input ../models/motionSegmenter_fullModel.t7
We are releasing software packages that were developed during this project and could be generally useful for computer vision research. If you find them useful, please consider citing our work. These include:
(a) uNLC [github]: Implementation of an unsupervised bottom-up video segmentation algorithm, which is an unsupervised adaptation of the NLC algorithm by Faktor and Irani, BMVC 2014. For additional details, see Section 5.1 in the paper.
(b) PyFlow [github]: A Python wrapper around Ce Liu's C++ implementation of Coarse2Fine Optical Flow. It is used inside the uNLC implementation and is also generally useful as an independent package.