Neural Groundplans: Persistent Neural Scene Representations from a Single Image

Prafull Sharma,Ayush Tewari, Yilun Du, Sergey Zakharov,Rares Ambrus,Adrien Gaidon,William T. Freeman大牛学者,Fredo Durand大牛学者,Joshua B. Tenenbaum大牛学者,Vincent Sitzmann

arxiv(2023)

引用 0|浏览164
摘要
We present a method to map 2D image observations of a scene to a persistent 3D scene representation, enabling novel view synthesis and disentangled representation of the movable and immovable components of the scene. Motivated by the bird's-eye-view (BEV) representation commonly used in vision and robotics, we propose conditional neural groundplans, ground-aligned 2D feature grids, as persistent and memory-efficient scene representations. Our method is trained self-supervised from unlabeled multi-view observations using differentiable rendering, and learns to complete geometry and appearance of occluded regions. In addition, we show that we can leverage multi-view videos at training time to learn to separately reconstruct static and movable components of the scene from a single image at test time. The ability to separately reconstruct movable objects enables a variety of downstream tasks using simple heuristics, such as extraction of object-centric 3D representations, novel view synthesis, instance-level segmentation, 3D bounding box prediction, and scene editing. This highlights the value of neural groundplans as a backbone for efficient 3D scene understanding models.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
0
您的评分 :

暂无评分

数据免责声明
页面数据均来自互联网公开来源、合作出版商和通过AI技术自动分析结果,我们不对页面数据的有效性、准确性、正确性、可靠性、完整性和及时性做出任何承诺和保证。若有疑问,可以通过电子邮件方式联系我们:report@aminer.cn