LRM: Large Reconstruction Model for Single Image to 3D

ICLR 2024
Abstract
We propose the first Large Reconstruction Model (LRM) that predicts the 3D model of an object from a single input image within just 5 seconds. In contrast to many previous methods that are trained on small-scale datasets such as ShapeNet in a category-specific fashion, LRM adopts a highly scalable transformer-based architecture with 500 million learnable parameters to directly predict a neural radiance field (NeRF) from the input image. We train our model in an end-to-end manner on massive multi-view data containing around 1 million objects, including both synthetic renderings from Objaverse and real captures from MVImgNet. This combination of a high-capacity model and large-scale training data empowers our model to be highly generalizable and produce high-quality 3D reconstructions from various testing inputs, including real-world in-the-wild captures and images from generative models. Video demos and interactable 3D meshes can be found on this website: https://yiconghong.me/LRM/.
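The abstract describes a transformer that maps a single input image directly to a NeRF. The sketch below is a minimal, hypothetical PyTorch illustration of such an image-to-NeRF mapping using a triplane representation; the module names, dimensions, the triplane layout, and the assumption of a pretrained ViT-style image encoder are illustrative choices, not the authors' released implementation.

```python
# Hypothetical sketch: learnable triplane tokens cross-attend to image features,
# and a small MLP decodes sampled triplane features into density + RGB.
# All sizes and module choices are assumptions for illustration only.
import torch
import torch.nn as nn

class ImageToTriplane(nn.Module):
    def __init__(self, img_feat_dim=768, d_model=512, n_layers=6,
                 plane_res=32, plane_ch=32):
        super().__init__()
        self.plane_res, self.plane_ch = plane_res, plane_ch
        # One learnable query token per triplane cell (3 planes of plane_res^2 cells).
        self.queries = nn.Parameter(torch.randn(3 * plane_res ** 2, d_model) * 0.02)
        self.img_proj = nn.Linear(img_feat_dim, d_model)
        layer = nn.TransformerDecoderLayer(d_model=d_model, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=n_layers)
        self.to_plane = nn.Linear(d_model, plane_ch)
        # Tiny NeRF head: concatenated triplane features -> (density, r, g, b).
        self.nerf_mlp = nn.Sequential(
            nn.Linear(3 * plane_ch, 64), nn.ReLU(), nn.Linear(64, 4)
        )

    def forward(self, img_feats):
        # img_feats: (B, N_patches, img_feat_dim) from a pretrained image encoder.
        B = img_feats.shape[0]
        ctx = self.img_proj(img_feats)
        q = self.queries.unsqueeze(0).expand(B, -1, -1)
        tokens = self.decoder(q, ctx)  # cross-attention to image features
        planes = self.to_plane(tokens).view(
            B, 3, self.plane_res, self.plane_res, self.plane_ch)
        return planes

    def query_points(self, planes, pts):
        # pts: (B, P, 3) in [-1, 1]; bilinearly sample each plane, decode density/color.
        B, P, _ = pts.shape
        feats = []
        for i, dims in enumerate(([0, 1], [0, 2], [1, 2])):  # xy, xz, yz planes
            grid = pts[:, :, dims].view(B, P, 1, 2)
            plane = planes[:, i].permute(0, 3, 1, 2)          # (B, C, H, W)
            f = nn.functional.grid_sample(plane, grid, align_corners=True)
            feats.append(f.view(B, self.plane_ch, P).permute(0, 2, 1))
        return self.nerf_mlp(torch.cat(feats, dim=-1))        # (B, P, 4)

if __name__ == "__main__":
    model = ImageToTriplane()
    fake_feats = torch.randn(2, 196, 768)       # stand-in for ViT patch features
    planes = model(fake_feats)
    out = model.query_points(planes, torch.rand(2, 1024, 3) * 2 - 1)
    print(planes.shape, out.shape)              # (2, 3, 32, 32, 32) (2, 1024, 4)
```

In a full pipeline, the predicted densities and colors would be composited along camera rays with standard volume rendering, and the whole model trained end-to-end with image reconstruction losses on the multi-view data mentioned above.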
Key words
3D Reconstruction, Large-Scale, Transformers