Denoising Diffusion via Image-Based Rendering
CoRR (2024)
Abstract
Generating 3D scenes is a challenging open problem, which requires
synthesizing plausible content that is fully consistent in 3D space. While
recent methods such as neural radiance fields excel at view synthesis and 3D
reconstruction, they cannot synthesize plausible details in unobserved regions
since they lack a generative capability. Conversely, existing generative
methods are typically not capable of reconstructing detailed, large-scale
scenes in the wild, as they use limited-capacity 3D scene representations,
require aligned camera poses, or rely on additional regularizers. In this work,
we introduce the first diffusion model able to perform fast, detailed
reconstruction and generation of real-world 3D scenes. To achieve this, we make
three contributions. First, we introduce a new neural scene representation,
IB-planes, that can efficiently and accurately represent large 3D scenes,
dynamically allocating more capacity as needed to capture details visible in
each image. Second, we propose a denoising-diffusion framework to learn a prior
over this novel 3D scene representation, using only 2D images without the need
for any additional supervision signal such as masks or depths. This supports 3D
reconstruction and generation in a unified architecture. Third, we develop a
principled approach to avoid trivial 3D solutions when integrating the
image-based rendering with the diffusion model, by dropping out representations
of some images. We evaluate the model on several challenging datasets of real
and synthetic images, and demonstrate superior results on generation, novel
view synthesis and 3D reconstruction.
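
The third contribution, dropping out the representations of some images, can be illustrated with a minimal sketch. The abstract does not specify the scheme, so everything here is an assumption made for illustration: the function name `pool_view_features`, the `drop_prob` parameter, the choice to always hide the view being rendered, and the use of plain mean pooling over the remaining views. The idea shown is only that a view cannot be explained by copying its own features if those features are hidden when that view is rendered.

```python
import random


def pool_view_features(features, target_view, drop_prob=0.5, rng=None):
    """Pool per-view features for one 3D point while hiding some views.

    features:    list of per-view feature vectors (one list of floats per view)
    target_view: index of the view being rendered; always hidden (assumption)
    drop_prob:   probability of hiding each other view (hypothetical parameter)
    Returns (pooled_features, keep_mask).
    """
    rng = rng or random.Random()
    num_views = len(features)
    if num_views < 2:
        raise ValueError("need at least two views to hide one of them")

    # Hide the target view, and randomly hide each of the other views.
    keep = [v != target_view and rng.random() >= drop_prob
            for v in range(num_views)]

    # Guarantee that at least one other view still contributes.
    if not any(keep):
        others = [v for v in range(num_views) if v != target_view]
        keep[rng.choice(others)] = True

    # Mean pooling over the surviving views (an illustrative stand-in only).
    kept = [f for f, k in zip(features, keep) if k]
    channels = len(features[0])
    pooled = [sum(f[c] for f in kept) / len(kept) for c in range(channels)]
    return pooled, keep
```

Calling `pool_view_features(features, target_view=0, drop_prob=0.0)` pools over every view except the first, while `drop_prob=1.0` leaves exactly one randomly chosen non-target view.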
Keywords
Neural Scene Representations, Generative Models, Denoising Diffusion, 3D Reconstruction