Overview of 3D object detection methods

To make the literature easier to organize and read, I am compiling a list of papers on 3D object detection. It covers deep learning-based algorithms and multimodal fusion algorithms.

(Mainly because my PhD supervisor told me to organize it; otherwise I'd be too lazy to do it, haha.)

Paper list

Survey
object detection without fusion
multimodal object detection
Self-supervised Learning
Unsupervised Learning
Downsampling in point clouds
Point Cloud Local Feature Description
Cooperative Driving Automation
Dataset
Collaborative Dataset

survey

Method	Title	Author
object detection	Foreground-Background Imbalance Problem in Deep Object Detectors: A Review	Joya Chen, Tong Xu
object detection	A Review and Comparative Study on Probabilistic Object Detection in Autonomous Driving	Di Feng, Ali Harakeh, Steven Waslander
object detection	An Overview Of 3D Object Detection	Yilin Wang, Jiayi Ye
object detection	3D Object Detection for Autonomous Driving: A Survey	Rui Qian, Xin Lai
Multimodal	Multi-Modal 3D Object Detection in Autonomous Driving: a Survey	Yingjie Wang, Qiuyu Mao
Multimodal	Deep Multi-modal Object Detection and Semantic Segmentation for Autonomous Driving: Datasets, Methods, and Challenges	Di Feng, Christian Haase-Schutz
Multimodal	Deep Learning for Image and Point Cloud Fusion in Autonomous Driving: A Review	Yaodong Cui

object detection without fusion

Method	Title	Input	Pub.	Author
Monocular based	Deep3DBox: 3D Bounding Box Estimation Using Deep Learning and Geometry	Monocular Image	CVPR 2017	Mousavian et al.
Monocular based	MonoCon: Learning Auxiliary Monocular Contexts Helps Monocular 3D Object Detection	Monocular Image	arXiv 2021	Liu et al.
Monocular based	Mono3D-PLiDAR: Monocular 3D Object Detection with Pseudo-LiDAR Point Cloud	Monocular Image	ICCV 2019	Weng et al.
Monocular based	M3DSSD: Monocular 3D Single Stage Object Detector	Monocular Image	CVPR 2021	Luo et al.
Monocular based	MonoRUn: Monocular 3D Object Detection by Reconstruction and Uncertainty Propagation	Monocular Image	CVPR 2021	Chen et al.
Stereo based	3DOP: 3D Object Proposals using Stereo Imagery for Accurate Object Class Detection	Stereo Image	NIPS 2015	Chen et al.
Stereo based	LIGA-Stereo: Learning LiDAR Geometry Aware Representations for Stereo-based 3D Detector	Stereo Image	ICCV 2021	Guo et al.
Stereo based	CG-Stereo: Confidence Guided Stereo 3D Object Detection with Split Depth Estimation	Stereo Image	IROS 2020	Li et al.
Stereo based	Stereo R-CNN: Stereo R-CNN Based 3D Object Detection for Autonomous Driving	Stereo Image	CVPR 2019	Li et al.
MultiView based	VeloFCN: Vehicle Detection from 3D Lidar Using Fully Convolutional Network	Front View (FV)	RSS 2016	Li et al.
MultiView based	BirdNet: A 3D Object Detection Framework from LiDAR Information	Bird's Eye View (BEV)	ITSC 2018	Beltrán et al.
MultiView based	PIXOR: Real-time 3D Object Detection from Point Clouds	BEV	CVPR 2018	Yang et al.
MultiView based	HDNET: Exploiting HD Maps for 3D Object Detection	BEV	CoRL 2018	Yang et al.
MultiView based	LaserNet: An Efficient Probabilistic 3D Object Detector for Autonomous Driving	Range View (RV)	CVPR 2019	Meyer et al.
Voxel based	Voxelnet: End-to-end learning for point cloud based 3d object detection	voxel	CVPR 2018	Zhou et al.
Voxel based	Second: Sparsely embedded convolutional detection	voxel	Sensors 2018	Yan et al.
Voxel based	PointPillars: Fast Encoders for Object Detection From Point Clouds	voxel	CVPR 2019	Lang et al.
Voxel based	HVNet: Hybrid Voxel Network for LiDAR Based 3D Object Detection	voxel	CVPR 2020	Ye et al.
Voxel based	HVPR: Hybrid Voxel-Point Representation for Single-Stage 3D Object Detection	voxel	CVPR 2021	Noh et al.
Voxel based	SA-SSD: Structure Aware Single-Stage 3D Object Detection from Point Cloud	voxel	CVPR 2020	He et al.
Point based	PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud	point	CVPR 2019	Shi et al.
Point based	VoteNet: Deep Hough Voting for 3D Object Detection in Point Clouds	point	ICCV 2019	Qi et al.
Point based	Part-A^2: From Points to Parts: 3D Object Detection From Point Cloud With Part-Aware and Part-Aggregation Network	point	TPAMI 2020	Shi et al.
Point based	PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection	point	CVPR 2020	Shi et al.
Point based	3DSSD: Point-based 3D Single Stage Object Detector	point	CVPR 2020	Yang et al.
Point based	LiDAR R-CNN: An Efficient and Universal 3D Object Detector	point	CVPR 2021	Li et al.
Point based	3DIoUMatch: Leveraging IoU Prediction for Semi-Supervised 3D Object Detection	point	CVPR 2021	Wang et al.
Point based	ST3D: Self-Training for Unsupervised Domain Adaptation on 3D Object Detection	point	CVPR 2021	Yang et al.
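
The Input column above separates methods by how the raw LiDAR points are represented (BEV grid, voxels, raw points). As a rough illustration of the first two, here is a minimal pure-Python sketch. It is not taken from any of the listed papers; the function names `voxelize` and `bev_occupancy` and all parameters are hypothetical, chosen only for this example.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]


def voxelize(points: Sequence[Point],
             voxel_size: Tuple[float, float, float],
             origin: Point = (0.0, 0.0, 0.0)) -> Dict[Tuple[int, ...], List[Point]]:
    """Group points by the integer index (ix, iy, iz) of the voxel containing them."""
    voxels: Dict[Tuple[int, ...], List[Point]] = defaultdict(list)
    for p in points:
        idx = tuple(math.floor((c - o) / s)
                    for c, o, s in zip(p, origin, voxel_size))
        voxels[idx].append(p)
    return dict(voxels)


def bev_occupancy(points: Sequence[Point], cell: float,
                  x_range: Tuple[float, float],
                  y_range: Tuple[float, float]) -> List[List[int]]:
    """Drop the z coordinate and mark which bird's-eye-view cells contain a point."""
    nx = math.ceil((x_range[1] - x_range[0]) / cell)
    ny = math.ceil((y_range[1] - y_range[0]) / cell)
    grid = [[0] * ny for _ in range(nx)]
    for x, y, _z in points:
        if x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]:
            # Clamp to guard against floating-point rounding at the upper edge.
            ix = min(int((x - x_range[0]) / cell), nx - 1)
            iy = min(int((y - y_range[0]) / cell), ny - 1)
            grid[ix][iy] = 1
    return grid
```

For example, `voxelize(points, (0.2, 0.2, 0.2))` yields small cubic cells, while a `voxel_size` whose z extent covers the whole height range groups points into vertical columns. Real detectors additionally cap the number of points per voxel and learn a feature per cell, which this sketch leaves out.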

multimodal object detection

Title	Pub.	Author
MV3D: Multi-View 3D Object Detection Network for Autonomous Driving	CVPR 2017	Chen et al.
AVOD: Joint 3D Proposal Generation and Object Detection from View Aggregation	IROS 2018	Ku et al.
SCANet: Spatial-channel attention network for 3D object detection	ICASSP 2019	Lu et al.
MVX-Net: Multimodal VoxelNet for 3D Object Detection	ICRA 2019	Sindagi et al.
MMF: Multi-task multi-sensor fusion for 3d object detection	CVPR 2019	Liang et al.
CLOCs: Camera-LiDAR Object Candidates Fusion for 3D Object Detection	IROS 2020	Pang et al.
ContFusion: Deep continuous fusion for multi-sensor 3d object detection	ECCV 2018	Liang et al.
Pointfusion: Deep sensor fusion for 3d bounding box estimation	CVPR 2018	Xu et al.
PointPainting: Sequential fusion for 3d object detection	CVPR 2020	Vora et al.
Epnet: Enhancing point features with image semantics for 3d object detection	ECCV 2020	Huang et al.
PI-RCNN: An efficient multi-sensor 3D object detector with point-based attentive cont-conv fusion module	AAAI 2020	Xie et al.
MoCa: Multi-Modality Cut and Paste for 3D Object Detection	arXiv 2020	Zhang et al.
PointAugmenting: Cross-Modal Augmentation for 3D Object Detection	CVPR 2021	Wang et al.
ImVoteNet: Boosting 3d object detection in point clouds with image votes	CVPR 2020	Qi et al.
Pseudo-LiDAR From Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving	CVPR 2019	Wang et al.
RoarNet: A robust 3d object detection based on region approximation refinement	IEEE IV 2019	Shin et al.
Frustum PointNet: Frustum PointNets for 3D Object Detection from RGB-D Data	CVPR 2018	Qi et al.
Frustum ConvNet: Sliding Frustums to Aggregate Local Point-Wise Features for Amodal 3D Object Detection	IROS 2019	Wang et al.

Self-supervised Learning

Title	Pub.	Author
data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language	2022	Meta AI
CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding	CVPR 2022	Afham et al.

Unsupervised Learning

Title	Pub.	Author
Unsupervised Learning of Depth from Monocular Videos Using 3D-2D Corresponding Constraints	Remote Sensing 2021	Jin et al.
ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection	CVPR 2021	Yang et al.

Downsampling in point clouds

Method	Title
farthest point sampling (FPS)	PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space
farthest point sampling (FPS)	ShellNet: Efficient Point Cloud Convolutional Neural Networks Using Concentric Shells Statistics
grid sampling (GS)	KPConv: Flexible and Deformable Convolution for Point Clouds
random sampling (RS)	RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds
random sampling (RS)	Grid-GCN for Fast and Scalable Point Cloud Learning
Critical Points Layer (CPL)	Adaptive Hierarchical Down-Sampling for Point Cloud Classification
Weighted Critical Points Layer (WCPL)	Adaptive Hierarchical Down-Sampling for Point Cloud Classification
Adaptive Sampling	PointASNL: Robust Point Clouds Processing using Nonlocal Neural Networks with Adaptive Sampling
Feature-FPS (F-FPS)	3DSSD: Point-based 3D Single Stage Object Detector
Semantics-guided Farthest Point Sampling (S-FPS)	SASA: Semantics-Augmented Set Abstraction for Point-based 3D Object Detection
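
To make the first three strategies in the table concrete, here is a minimal pure-Python sketch of farthest point sampling, random sampling, and grid sampling. It is a toy written for this list, not the implementation from any of the papers above. The function names, the `start` and `seed` parameters, and the choice of one centroid per occupied cell for grid sampling are assumptions made for illustration.

```python
import math
import random
from collections import defaultdict
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def farthest_point_sampling(points: Sequence[Point], k: int, start: int = 0) -> List[int]:
    """Greedy FPS: repeatedly pick the point farthest from the already chosen set."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    chosen = [start]
    # min_dist[i] is the distance from point i to its nearest chosen point.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(chosen) < k:
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        chosen.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return chosen


def random_sampling(points: Sequence[Point], k: int, seed: int = 0) -> List[int]:
    """Uniformly pick k distinct indices; cheap but blind to geometry."""
    return random.Random(seed).sample(range(len(points)), k)


def grid_sampling(points: Sequence[Point], cell: float) -> List[Point]:
    """Keep one representative (here: the centroid) per occupied grid cell."""
    cells = defaultdict(list)
    for p in points:
        cells[tuple(math.floor(c / cell) for c in p)].append(p)
    return [tuple(sum(axis) / len(group) for axis in zip(*group))
            for group in cells.values()]
```

Note the different return types: the first two return indices into `points`, while `grid_sampling` returns new coordinates. FPS as written costs O(N·k) distance evaluations, which is the usual motivation for the cheaper random and grid variants on large scenes.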

Point Cloud Local Feature Description

Title	Pub.	Author
2D Shape Context: Shape Context: A new descriptor for shape matching and object recognition	NeurIPS 2000	Belongie et al.
3D Shape Context: Recognizing Objects in Range Data Using Regional Point Descriptors	ECCV 2004	Frome et al.
Shape Matching and Object Recognition Using Shape Contexts	TPAMI 2002	Belongie et al.
3D Shape Descriptor for Objects Recognition	LARS and SBR 2017	Sales et al.
ROI-cloud: A Key Region Extraction Method for LiDAR Odometry and Localization	ICRA 2020	Zhou et al.
PointSIFT: A SIFT-like Network Module for 3D Point Cloud Semantic Segmentation	arXiv 2018	Jiang et al.

Cooperative Driving Automation

Title	Pub.	Author
V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision Transformer	ECCV 2022	Xu et al.
Where2comm: Communication-Efficient Collaborative Perception via Spatial Confidence Maps	NeurIPS 2022	Hu et al.
CoBEVT: Cooperative Bird's Eye View Semantic Segmentation with Sparse Transformers	CoRL 2022	Xu et al.
V2VNet: Vehicle-to-Vehicle Communication for Joint Perception and Prediction	ECCV 2020	Wang et al.
OPV2V: An Open Benchmark Dataset and Fusion Pipeline for Perception with Vehicle-to-Vehicle Communication	ICRA 2022	Xu et al.
SyncNet: Latency-Aware Collaborative Perception	ECCV 2022	Lei et al.
CoAlign: Robust Collaborative 3D Object Detection in Presence of Pose Errors	ICRA 2023	Lu et al.
Double-M: Uncertainty Quantification of Collaborative Detection for Self-Driving	ICRA 2023	Su et al.
SCOPE: Spatio-Temporal Domain Awareness for Multi-Agent Collaborative Perception	ICCV 2023	Yang et al.
MPDA: Bridging the Domain Gap for Multi-Agent Perception	ICRA 2023	Xu et al.
AdaFusion: Adaptive Feature Fusion for Cooperative Perception using LiDAR Point Clouds	WACV 2023	Qiao et al.
CoBEVFlow: Robust Asynchronous Collaborative 3D Detection via Bird's Eye View Flow	NeurIPS 2023	Wei et al.
HEAL: An Extensible Framework for Open Heterogeneous Collaborative Perception	ICLR 2024	Lu et al.
CoHFF: Collaborative Semantic Occupancy Prediction with Hybrid Feature Fusion in Connected Automated Vehicles	CVPR 2024	Song et al.
CMiMC: What Makes Good Collaborative Views? Contrastive Mutual Information Maximization for Multi-Agent Perception	AAAI 2024	Su et al.
ChatSim: Editable Scene Simulation for Autonomous Driving via LLM-Agent Collaboration	CVPR 2024 Highlight	Wei et al.
RoCo: Robust Collaborative Perception By Iterative Object Matching and Pose Adjustment	ACM MM 2024	Huang et al.

Dataset

Dataset	Size	Categories / Remarks	Sensing Modalities
ScanNet	1513 scans, 2.5M frames	floor, wall, chair, cabinet, bed, sofa, table, door, window, bookshelf, picture, counter, desk, curtain, refrigerator, shower curtain, toilet, sink, bathtub, other furniture	3D camera, depth sensors
SUN RGB-D
SUN3D
KITTI	7481 frames (training), 80,256 objects	Car, Van, Truck, Pedestrian, Person (sitting), Cyclist, Tram, Misc	Visual (Stereo) camera, 3D LiDAR, GNSS, and inertial sensors
nuScenes	1000 scenes, 1.4M frames (camera, Radar), 390k frames (3D LiDAR)	25 object classes, such as Car / Van / SUV, different Trucks, Buses, Persons, Animal, Traffic Cone, Temporary Traffic Barrier, Debris, etc.	Visual cameras (6), 3D LiDAR, and Radars (5)
BLVD	120k frames, 249,129 objects	Vehicle, Pedestrian, Rider during day and night	Visual (Stereo) camera, 3D LiDAR
Waymo open dataset	200k frames, 12M objects (3D LiDAR), 1.2M objects (2D camera)	Vehicles, Pedestrians, Cyclists, Signs	3D LiDAR (5), Visual cameras (5)
H3D	27,721 frames, 1,071,302 objects	Car, Pedestrian, Cyclist, Truck, Misc, Animals, Motorcyclist, Bus	Visual cameras (3), 3D LiDAR
Lyft-L5 AV dataset	55k frames	Semantic HD map included	3D LiDAR (5), Visual cameras (6)
A2D2	40k frames (semantics), 12k frames (3D objects), 390k frames unlabeled	Car, Bicycle, Pedestrian, Truck, Small vehicles, Traffic signal, Utility vehicle, Sidebars, Speed bumper, Curbstone, Solid line, Irrelevant signs, Road blocks, Tractor, Non-drivable street, Zebra crossing, Obstacles / trash, Poles, RD restricted area, Animals, Grid structure, Signal corpus, Drivable cobblestone, Electronic traffic, Slow drive area, Nature object, Parking area, Sidewalk, Ego car, Painted driv. instr., Traffic guide obj., Dashed line, RD normal street, Sky, Buildings, Blurred area, Rain dirt	Visual cameras (6); 3D LiDAR (5); Bus data
ApolloScape	143,906 image frames, 89,430 objects	Rover, Sky, Car, Motorbicycle, Bicycle, Person, Rider, Truck, Bus, Tricycle, Road, Sidewalk, Traffic Cone, Road Pile, Fence, Traffic Light, Pole, Traffic Sign, Wall, Dustbin, Billboard, Building, Bridge, Tunnel, Overpass, Vegetation	Visual (Stereo) camera, 3D LiDAR, GNSS, and inertial sensors
A3D Dataset	39k frames, 230k objects	Car, Van, Bus, Truck, Pedestrians, Cyclists, and Motorcyclists; afternoon and night, wet and dry	Visual cameras (2); 3D LiDAR
DBNet Dataset	Over 10k frames	In total seven datasets with different test scenarios, such as seaside roads, school areas, mountain roads.	3D LiDAR, Dashboard visual camera, GNSS
KAIST multispectral dataset	7,512 frames, 308,913 objects	Person, Cyclist, Car during day and night, fine time slots (sunrise, afternoon, ...)
PandaSet

Collaborative Dataset

Dataset	Simulation
OPV2V	Yes
V2V4Real	No
V2XSet	Yes
V2X-Sim	Yes
DAIR-V2X	No

HuangZhe885/papper-3D-detection
