Collection of papers and other resources for object detection and tracking using deep learning
- Region Proposal
- RCNN
- Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks [tpami17] [pdf] [notes]
- RFCN - Object Detection via Region-based Fully Convolutional Networks [nips16] [Microsoft Research] [pdf] [notes]
- Mask R-CNN [iccv17] [Facebook AI Research] [pdf] [notes] [arxiv] [code (keras)] [code (tensorflow)]
- YOLO
- SSD
- RetinaNet
- Misc
- Tubelet
- FGFA
- RNN
- Deep Learning
- Reinforcement Learning
- Learning to Track: Online Multi-object Tracking by Decision Making [iccv15] [Stanford] [pdf] [code (matlab)] [project] [notes]
- Network Flow
- Graph Optimization
- Baseline
- Reinforcement Learning
- Deep Reinforcement Learning for Visual Object Tracking in Videos [ax1704] [UC Santa Barbara, Samsung Research] [pdf] [arxiv] [author] [notes]
- Visual Tracking by Reinforced Decision Making [ax1702] [Seoul National University, Chung-Ang University] [pdf] [arxiv] [author] [notes]
- Action-Decision Networks for Visual Tracking with Deep Reinforcement Learning [cvpr17] [Seoul National University] [pdf] [supplementary] [project] [notes]
- End-to-end Active Object Tracking via Reinforcement Learning [ax1705] [Peking University, Tencent AI Lab] [pdf] [arxiv]
- Siamese
- Video Frame Interpolation via Adaptive Convolution [cvpr17 / iccv17] [pdf (cvpr17)] [pdf (iccv17)] [ppt]
- Variational
- Multi Object Tracking
- IDOT
- UA-DETRAC Benchmark Suite
- GRAM Road-Traffic Monitoring
- Stanford Drone Dataset
- Ko-PER Intersection Dataset
- TRANCOS
- Urban Tracker
- DARPA VIVID / PETS 2005 (Non-stationary camera)
- KIT-AKS (No ground truth)
- CBCL StreetScenes Challenge Framework (No top-down viewpoint)
- MOT 2015 (mostly street-level camera viewpoint)
- MOT 2016 (mostly street-level camera viewpoint)
- MOT 2017 (mostly street-level camera viewpoint)
- PETS 2009 (No vehicles)
- PETS 2017 (Low density; mostly pedestrians)
- KITTI Tracking Dataset (No top-down viewpoint; non-stationary camera)
- The WILDTRACK Seven-Camera HD Dataset (pedestrian detection and tracking)
- 3D Traffic Scene Understanding from Movable Platforms (intersection traffic/stereo setup/moving camera)
- Single Object Tracking
- TrackingNet: A Large-Scale Dataset and Benchmark for Object Tracking in the Wild [eccv18]
- LaSOT: Large-scale Single Object Tracking [cvpr19]
- Need for Speed: A Benchmark for Higher Frame Rate Object Tracking [iccv17]
- Long-term Tracking in the Wild: A Benchmark [eccv18]
- UAV123: A Benchmark and Simulator for UAV Tracking [eccv16] [project]
- Sim4CV: A Photo-Realistic Simulator for Computer Vision Applications [ijcv18]
- Video Understanding / Activity Recognition
- Video Detection
- Static Detection
- Static Segmentation
- Video Segmentation
- Classification
- Optical Flow
- Datasets
- Single Object Tracking
- Multi Object Tracking
- Segmentation
- Deep Compressed Sensing
- Misc
- Static Detection
- Deep Learning for Object Detection: A Comprehensive Review
- Review of Deep Learning Algorithms for Object Detection
- A Simple Guide to the Versions of the Inception Network
- R-CNN, Fast R-CNN, Faster R-CNN, YOLO - Object Detection Algorithms
- A gentle guide to deep learning object detection
- The intuition behind RetinaNet
- YOLO—You only look once, real time object detection explained
- Understanding Feature Pyramid Networks for object detection (FPN)
- Fast object detection with SqueezeDet on Keras
- Region of interest pooling explained
- Video Detection
- Deep RL
- Autoencoders
- Multi Object Tracking
- Globally-optimal greedy algorithms for tracking a variable number of objects [cvpr11] [matlab] [author]
- Continuous Energy Minimization for Multitarget Tracking [cvpr11 / iccv11 / tpami 2014] [matlab]
- Discrete-Continuous Energy Minimization for Multi-Target Tracking [cvpr12] [matlab] [project]
- The way they move: Tracking multiple targets with similar appearance [iccv13] [matlab]
- 3D Traffic Scene Understanding from Movable Platforms [2d_tracking] [pami14/kit13/iccv13/nips11] [c++/matlab]
- Multiple target tracking based on undirected hierarchical relation hypergraph [cvpr14] [c++] [author]
- Robust online multi-object tracking based on tracklet confidence and online discriminative appearance learning [cvpr14] [matlab] [project]
- Learning to Track: Online Multi-Object Tracking by Decision Making [iccv15] [matlab]
- Joint Tracking and Segmentation of Multiple Targets [cvpr15] [matlab]
- Multiple Hypothesis Tracking Revisited [iccv15] [highest MT on MOT2015 among open source trackers] [matlab]
- Simple Online and Realtime Tracking [icip 2016] [python]
- Deep SORT: Simple Online and Realtime Tracking with a Deep Association Metric [icip 2017] [python]
- Combined Image- and World-Space Tracking in Traffic Scenes [icra 2017] [c++]
- High-Speed Tracking-by-Detection Without Using Image Information [avss 2017] [python]
- Single Object Tracking
- A collection of common tracking algorithms (2003-2012) [c++/matlab]
- SenseTime Research platform for single object tracking, implementing algorithms like SiamRPN and SiamMask [pytorch]
- In Defense of Color-based Model-free Tracking [cvpr15] [c++]
- Hierarchical Convolutional Features for Visual Tracking [iccv15] [matlab]
- Visual Tracking with Fully Convolutional Networks [iccv15] [matlab]
- DeepTracking: Seeing Beyond Seeing Using Recurrent Neural Networks [aaai16] [torch 7]
- Learning Multi-Domain Convolutional Neural Networks for Visual Tracking [cvpr16] [vot2015 winner] [matlab/matconvnet]
- Beyond Correlation Filters: Learning Continuous Convolution Operators for Visual Tracking [eccv 2016] [matlab]
- Fully-Convolutional Siamese Networks for Object Tracking [eccvw 2016] [matlab/matconvnet] [project] [pytorch] [pytorch (only training)]
- DCFNet: Discriminant Correlation Filters Network for Visual Tracking [ax1704] [matlab/matconvnet] [pytorch]
- End-to-end representation learning for Correlation Filter based tracking [cvpr17] [matlab/matconvnet] [tensorflow/inference_only] [project]
- A simplified PyTorch implementation of Siamese networks for tracking: SiamFC, SiamRPN, SiamRPN++, SiamVGG, SiamDW, SiamRPN-VGG [pytorch]
- RATM: Recurrent Attentive Tracking Model [cvprw17] [python]
- ROLO: Spatially Supervised Recurrent Convolutional Neural Networks for Visual Object Tracking [iscas 2017] [tensorflow]
- ECO: Efficient Convolution Operators for Tracking [cvpr17] [matlab]
- Detect to Track and Track to Detect [iccv17] [matlab]
- High Performance Visual Tracking with Siamese Region Proposal Network [cvpr18] [pytorch] [pytorch] [pytorch/no_train] [pytorch]
- Distractor-aware Siamese Networks for Visual Object Tracking [eccv18] [vot18 winner] [pytorch]
- Fast Online Object Tracking and Segmentation: A Unifying Approach [cvpr19] [pytorch] [project]
- PyTracking: A general python framework for training and running visual object trackers, based on PyTorch [DiMP / ATOM] [cvpr19/iccv19] [pytorch]
- Video Detection
- Flow-Guided Feature Aggregation for Video Object Detection [nips16 / iccv17] [mxnet]
- T-CNN: Tubelets with Convolutional Neural Networks [cvpr16] [python]
- TPN: Tubelet Proposal Network [cvpr17] [python]
- Deep Feature Flow for Video Recognition [cvpr17] [mxnet]
- Mobile Video Object Detection with Temporally-Aware Feature Maps [cvpr18] [Google] [tensorflow]
- Static Detection and Matching
- Frameworks
- Region Proposal
- MCG: Multiscale Combinatorial Grouping - Object Proposals and Segmentation (project) [tpami16/cvpr14] [python]
- COB: Convolutional Oriented Boundaries (project) [tpami18/eccv16] [matlab/caffe]
- FPN
- Feature Pyramid Networks for Object Detection [caffe/python]
- RCNN
- RFCN (author) [caffe/matlab]
- RFCN-tensorflow [tensorflow]
- PVANet: Lightweight Deep Neural Networks for Real-time Object Detection [intel] [emdnn16(nips16)]
- Mask R-CNN [tensorflow] [keras]
- Light-head R-CNN [cvpr18] [tensorflow]
- Evolving Boxes for Fast Vehicle Detection [icme18] [caffe/python]
- Cascade R-CNN [cvpr18] [detectron] [caffe]
- SSD
- SSD-Tensorflow [tensorflow]
- SSD-Tensorflow (tf.estimator) [tensorflow]
- SSD-Tensorflow (tf.slim) [tensorflow]
- SSD-Keras [keras]
- SSD-Pytorch [pytorch]
- Enhanced SSD with Feature Fusion and Visual Reasoning [nca18] [tensorflow]
- RefineDet - Single-Shot Refinement Neural Network for Object Detection [cvpr18] [caffe]
- YOLO
- Darknet: Convolutional Neural Networks [c/python]
- YOLO9000: Better, Faster, Stronger - Real-Time Object Detection. 9000 classes! [c/python]
- Darkflow [tensorflow]
- Pytorch Yolov2 [pytorch]
- Yolo-v3 and Yolo-v2 for Windows and Linux [c/python]
- YOLOv3 in PyTorch [pytorch]
- pytorch-yolo-v3 [pytorch] [no training] [tutorial]
- YOLOv3_TensorFlow [tensorflow]
- tensorflow-yolo-v3 [tensorflow slim]
- tensorflow-yolov3 [tensorflow slim]
- keras-yolov3 [keras]
- Relation Networks for Object Detection [cvpr18] [mxnet]
- DeNet: Scalable Real-time Object Detection with Directed Sparse Sampling [iccv17(poster)] [theano]
- SNIPER: Efficient Multi-Scale Training [cvpr18 / nips18] [mxnet]
- Multi-scale Location-aware Kernel Representation for Object Detection [cvpr18] [caffe/python]
- Matching
- Boundary Detection
- Optical Flow
- FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks [cvpr17] [caffe] [pytorch/nvidia]
- SPyNet: Spatial Pyramid Network for Optical Flow [cvpr17] [lua] [pytorch]
- Guided Optical Flow Learning [cvprw17] [caffe] [tensorflow]
- Fast Optical Flow using Dense Inverse Search (DIS) [eccv16] [c++]
- A Filter Formulation for Computing Real Time Optical Flow [ral16] [c++/cuda - matlab,python wrappers]
- PatchBatch - a Batch Augmented Loss for Optical Flow [cvpr16] [python/theano]
- Piecewise Rigid Scene Flow [iccv13/eccv14/ijcv15] [c++/matlab]
- DeepFlow v2 [iccv13] [c++/python/matlab] [project]
- An Evaluation of Data Costs for Optical Flow [gcpr13] [matlab]
- Instance Segmentation
- Fully Convolutional Instance-aware Semantic Segmentation [cvpr17] [coco16 winner] [mxnet]
- DeepMask/SharpMask [nips15/eccv16] [facebook] [torch] [tensorflow] [pytorch/deepmask]
- Simultaneous Detection and Segmentation [eccv14] [matlab] [project]
- Autoencoders
- β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework [iclr17] [deepmind] [tensorflow] [tensorflow] [pytorch]
- Disentangling by Factorising [ax1806] [pytorch]
- Classification
- Learning Efficient Convolutional Networks Through Network Slimming [iccv17] [pytorch]
- Deep RL
- Misc