InstantSplat: Unbounded Sparse-view Pose-free Gaussian Splatting in 40 Seconds
arxiv(2024)
摘要
While novel view synthesis (NVS) has made substantial progress in 3D computer
vision, it typically requires an initial estimation of camera intrinsics and
extrinsics from dense viewpoints. This pre-processing is usually conducted via
a Structure-from-Motion (SfM) pipeline, a procedure that can be slow and
unreliable, particularly in sparse-view scenarios with insufficient matched
features for accurate reconstruction. In this work, we integrate the strengths
of point-based representations (e.g., 3D Gaussian Splatting, 3D-GS) with
end-to-end dense stereo models (DUSt3R) to tackle the complex yet unresolved
issues in NVS under unconstrained settings, which encompasses pose-free and
sparse view challenges. Our framework, InstantSplat, unifies dense stereo
priors with 3D-GS to build 3D Gaussians of large-scale scenes from sparseview
pose-free images in less than 1 minute. Specifically, InstantSplat comprises a
Coarse Geometric Initialization (CGI) module that swiftly establishes a
preliminary scene structure and camera parameters across all training views,
utilizing globally-aligned 3D point maps derived from a pre-trained dense
stereo pipeline. This is followed by the Fast 3D-Gaussian Optimization (F-3DGO)
module, which jointly optimizes the 3D Gaussian attributes and the initialized
poses with pose regularization. Experiments conducted on the large-scale
outdoor Tanks Temples datasets demonstrate that InstantSplat significantly
improves SSIM (by 32
(ATE) by 80
involving posefree and sparse-view conditions. Project page:
instantsplat.github.io.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要