VASE: Object-Centric Appearance and Shape Manipulation of Real Videos
CoRR (2024)
Abstract
Recently, several works have tackled the video editing task, fostered by the
success of large-scale text-to-image generative models. However, most of these
methods holistically edit the frame using the text, exploiting the prior given
by foundation diffusion models and focusing on improving the temporal
consistency across frames. In this work, we introduce an object-centric
framework designed both to control the object's appearance and, notably, to
execute precise and explicit structural modifications on the
object. We build our framework on a pre-trained image-conditioned diffusion
model, integrate layers to handle the temporal dimension, and propose training
strategies and architectural modifications to enable shape control. We evaluate
our method on the image-driven video editing task, showing performance
comparable to the state of the art and showcasing novel shape-editing capabilities.
Further details, code and examples are available on our project page:
https://helia95.github.io/vase-website/