CamCo: Camera-Controllable 3D-Consistent Image-to-Video Generation
CoRR (2024)
Abstract
Recently, video diffusion models have emerged as expressive generative tools for high-quality video content creation that are readily available to general users. However, these models often do not offer precise control over camera poses for video generation, limiting the expression of cinematic language and user control. To address this issue, we introduce CamCo, which allows fine-grained Camera pose Control for image-to-video generation. We equip a pre-trained image-to-video generator with accurately parameterized camera pose input using Plücker coordinates. To enhance 3D consistency in the videos produced, we integrate an epipolar attention module in each attention block that enforces epipolar constraints on the feature maps. Additionally, we fine-tune CamCo on real-world videos with camera poses estimated through structure-from-motion algorithms to better synthesize object motion. Our experiments show that CamCo significantly improves 3D consistency and camera control capabilities compared to previous models while effectively generating plausible object motion. Project page: https://ir1d.github.io/CamCo/
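The abstract refers to two standard multi-view geometry primitives: per-pixel Plücker coordinates as a camera-pose conditioning signal, and epipolar constraints between frames. A minimal NumPy sketch of both is given below; this is an illustration of the underlying geometry, not the paper's implementation, and all function names and conventions (world-to-camera extrinsics, pixel-center sampling) are assumptions.

```python
import numpy as np

def plucker_rays(K, R, t, H, W):
    """Per-pixel Plücker coordinates (d, o x d) for a pinhole camera.

    K: 3x3 intrinsics; R, t: world-to-camera rotation and translation.
    Returns an (H, W, 6) array: unit ray direction and its moment."""
    o = -R.T @ t                                   # camera center in world frame
    u, v = np.meshgrid(np.arange(W) + 0.5, np.arange(H) + 0.5)
    pix = np.stack([u, v, np.ones_like(u)], axis=-1)   # homogeneous pixels, (H, W, 3)
    # Back-project to world-space directions: d = R^T K^{-1} [u, v, 1]^T
    d = pix @ np.linalg.inv(K).T @ R
    d /= np.linalg.norm(d, axis=-1, keepdims=True)
    # Moment m = o x d is invariant to the point chosen on the ray
    m = np.cross(np.broadcast_to(o, d.shape), d)
    return np.concatenate([d, m], axis=-1)

def skew(v):
    """Cross-product matrix: skew(v) @ x == np.cross(v, x)."""
    return np.array([[0., -v[2], v[1]],
                     [v[2], 0., -v[0]],
                     [-v[1], v[0], 0.]])

def fundamental_matrix(K1, K2, R_rel, t_rel):
    """F such that a pixel x1 in view 1 maps to epipolar line l2 = F @ x1 in view 2,
    with (R_rel, t_rel) the relative pose taking view-1 coordinates to view-2."""
    E = skew(t_rel) @ R_rel                        # essential matrix
    return np.linalg.inv(K2).T @ E @ np.linalg.inv(K1)
```

Corresponding pixels in two frames satisfy `x2 @ F @ x1 == 0` (up to numerical error); an epipolar attention module restricts cross-frame attention to features near the line `F @ x1`.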