Modeling, understanding, and reconstructing the real world are crucial in XR/VR. Recently, 3D Gaussian Splatting (3D-GS) methods have shown remarkable success in modeling and understanding 3D scenes. Similarly, various 4D representations have demonstrated the ability to capture the dynamics of the 4D world. However, little research has addressed segmentation within 4D representations. In this paper, we propose Segment Any 4D Gaussians (SA4D), one of the first frameworks to segment anything in the 4D digital world based on 4D Gaussians. SA4D introduces an efficient temporal identity feature field to handle Gaussian drifting, with the potential to learn precise identity features from noisy and sparse input. Additionally, we propose a 4D segmentation refinement process to remove artifacts. SA4D achieves precise, high-quality segmentation in 4D Gaussians within seconds and shows the ability to remove, recolor, and compose segmented objects and to render high-quality masks of anything.