CamCo: Camera-Controllable 3D-Consistent Image-to-Video Generation
CoRR (2024)
Abstract
Recently, video diffusion models have emerged as expressive generative tools for high-quality video content creation that are readily available to general users. However, these models often do not offer precise control over camera poses for video generation, limiting the expression of cinematic language and user control. To address this issue, we introduce CamCo, which allows fine-grained Camera pose Control for image-to-video generation. We equip a pre-trained image-to-video generator with accurately parameterized camera pose input using Plücker coordinates. To enhance 3D consistency in the videos produced, we integrate an epipolar attention module in each attention block that enforces epipolar constraints on the feature maps. Additionally, we fine-tune CamCo on real-world videos with camera poses estimated through structure-from-motion algorithms to better synthesize object motion. Our experiments show that CamCo significantly improves 3D consistency and camera control capabilities compared to previous models while effectively generating plausible object motion. Project page: https://ir1d.github.io/CamCo/
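The abstract refers to two standard multi-view geometry primitives: per-pixel Plücker coordinates as a camera-pose conditioning signal, and epipolar constraints between frames. A minimal NumPy sketch of both is given below; this is an illustration of the underlying geometry, not the paper's implementation, and all function names and conventions (world-to-camera extrinsics, pixel-center sampling) are assumptions.

```python
import numpy as np

def plucker_rays(K, R, t, H, W):
    """Per-pixel Plücker coordinates (d, o x d) for a pinhole camera.

    K: 3x3 intrinsics; R, t: world-to-camera rotation and translation.
    Returns an (H, W, 6) array: unit ray direction and its moment."""
    o = -R.T @ t                                   # camera center in world frame
    u, v = np.meshgrid(np.arange(W) + 0.5, np.arange(H) + 0.5)
    pix = np.stack([u, v, np.ones_like(u)], axis=-1)   # homogeneous pixels, (H, W, 3)
    # Back-project to world-space directions: d = R^T K^{-1} [u, v, 1]^T
    d = pix @ np.linalg.inv(K).T @ R
    d /= np.linalg.norm(d, axis=-1, keepdims=True)
    # Moment m = o x d is invariant to the point chosen on the ray
    m = np.cross(np.broadcast_to(o, d.shape), d)
    return np.concatenate([d, m], axis=-1)

def skew(v):
    """Cross-product matrix: skew(v) @ x == np.cross(v, x)."""
    return np.array([[0., -v[2], v[1]],
                     [v[2], 0., -v[0]],
                     [-v[1], v[0], 0.]])

def fundamental_matrix(K1, K2, R_rel, t_rel):
    """F such that a pixel x1 in view 1 maps to epipolar line l2 = F @ x1 in view 2,
    with (R_rel, t_rel) the relative pose taking view-1 coordinates to view-2."""
    E = skew(t_rel) @ R_rel                        # essential matrix
    return np.linalg.inv(K2).T @ E @ np.linalg.inv(K1)
```

Corresponding pixels in two frames satisfy `x2 @ F @ x1 == 0` (up to numerical error); an epipolar attention module restricts cross-frame attention to features near the line `F @ x1`.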