Semantic Gaussians: Open-Vocabulary Scene Understanding with 3D Gaussian Splatting
arxiv(2024)
摘要
Open-vocabulary 3D scene understanding presents a significant challenge in
computer vision, withwide-ranging applications in embodied agents and augmented
reality systems. Previous approaches haveadopted Neural Radiance Fields (NeRFs)
to analyze 3D scenes. In this paper, we introduce SemanticGaussians, a novel
open-vocabulary scene understanding approach based on 3D Gaussian Splatting.
Our keyidea is distilling pre-trained 2D semantics into 3D Gaussians. We design
a versatile projection approachthat maps various 2Dsemantic features from
pre-trained image encoders into a novel semantic component of 3D Gaussians,
withoutthe additional training required by NeRFs. We further build a 3D
semantic network that directly predictsthe semantic component from raw 3D
Gaussians for fast inference. We explore several applications ofSemantic
Gaussians: semantic segmentation on ScanNet-20, where our approach attains a
4.2
understanding counterparts; object part segmentation,sceneediting, and
spatial-temporal segmentation with better qualitative results over 2D and 3D
baselines,highlighting its versatility and effectiveness on supporting diverse
downstream tasks.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要