2403.12722.md

HUGS: Holistic Urban 3D Scene Understanding via Gaussian Splatting

Holistic understanding of urban scenes from RGB images is a challenging yet important problem. It requires understanding both geometry and appearance in order to enable novel view synthesis, semantic label parsing, and tracking of moving objects. Despite considerable progress, existing approaches often focus on specific aspects of this task and require additional inputs such as LiDAR scans or manually annotated 3D bounding boxes. In this paper, we introduce a novel pipeline that uses 3D Gaussian Splatting for holistic urban scene understanding. Our main idea is to jointly optimize geometry, appearance, semantics, and motion using a combination of static and dynamic 3D Gaussians, where the poses of moving objects are regularized via physical constraints. Our approach can render new viewpoints in real time, yield 2D and 3D semantic information with high accuracy, and reconstruct dynamic scenes, even in scenarios where 3D bounding box detections are highly noisy. Experimental results on KITTI, KITTI-360, and Virtual KITTI 2 demonstrate the effectiveness of our approach.
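To make the idea of combining static and dynamic Gaussians more concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not say how Gaussians are parameterized or which physical constraint is used. The sketch assumes that each moving object keeps its Gaussian centers in an object-local frame, that the per-frame pose is a planar `(x, y, yaw)` triple, and that the physical constraint is a simple unicycle motion model. The function names `compose_scene` and `unicycle_residual` are hypothetical.

```python
import numpy as np


def yaw_rotation(yaw):
    """Rotation matrix for a rotation of `yaw` radians about the z-axis."""
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])


def compose_scene(static_means, object_means, pose):
    """Merge static and dynamic Gaussian centers into one world-frame set.

    static_means: (N, 3) centers already in world coordinates.
    object_means: (M, 3) centers in the object's local frame (assumption).
    pose:         (x, y, yaw) of the object in this frame (assumption).
    """
    x, y, yaw = pose
    rotation = yaw_rotation(yaw)
    translation = np.array([x, y, 0.0])
    # Rigidly move the object's Gaussians into the world frame.
    world_means = object_means @ rotation.T + translation
    return np.concatenate([static_means, world_means], axis=0)


def unicycle_residual(poses, speeds, yaw_rates, dt):
    """Mean squared deviation of a pose track from a unicycle model.

    poses:     (T, 3) array of (x, y, yaw) per frame.
    speeds:    (T-1,) forward speed between consecutive frames.
    yaw_rates: (T-1,) yaw rate between consecutive frames.
    Uses forward Euler integration and ignores yaw wrap-around.
    """
    x, y, yaw = poses[:-1, 0], poses[:-1, 1], poses[:-1, 2]
    predicted = np.stack([x + speeds * np.cos(yaw) * dt,
                          y + speeds * np.sin(yaw) * dt,
                          yaw + yaw_rates * dt], axis=1)
    return float(np.mean((poses[1:] - predicted) ** 2))
```

In a joint optimization, a residual of this kind would be added to the rendering losses, so that noisy per-frame poses are pulled toward a physically plausible trajectory. How the terms are weighted is not stated in the abstract and is left open here.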
