Intelligent Director: An Automatic Framework for Dynamic Visual Composition using ChatGPT
CoRR(2024)
Abstract
With the rise of short video platforms represented by TikTok, the trend of
users expressing their creativity through photos and videos has increased
dramatically. However, ordinary users lack the professional skills to produce
high-quality videos using professional creation software. To meet the demand
for intelligent and user-friendly video creation tools, we propose the Dynamic
Visual Composition (DVC) task, an interesting and challenging task that aims to
automatically integrate various media elements based on user requirements and
create storytelling videos. We propose an Intelligent Director framework,
utilizing LENS to generate descriptions for images and video frames and
combining ChatGPT to generate coherent captions while recommending appropriate
music names. The best-matched music is then obtained through music retrieval,
and materials such as captions, images, videos, and music are integrated to
seamlessly synthesize the video. Finally, we apply AnimeGANv2 for style
transfer. We construct the UCF101-DVC and Personal Album datasets and verify the
effectiveness of our framework in solving DVC through qualitative and
quantitative comparisons, along with user studies, demonstrating its
substantial potential.
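
The first stages of the pipeline described above (describe media, write captions, recommend a music name, retrieve the best-matched track) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `Plan`, `retrieve_music`, and `direct` are hypothetical names, the LENS and ChatGPT calls are passed in as plain callables rather than real API calls, and simple string similarity from Python's standard library stands in for the music retrieval step, whose actual method the abstract does not specify. Video synthesis and AnimeGANv2 style transfer are left out.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, List


@dataclass
class Plan:
    """Hypothetical container for what the director decides before synthesis."""
    captions: List[str]  # one caption per media item
    music_name: str      # track chosen from the local music library


def retrieve_music(recommended: str, library: List[str]) -> str:
    """Return the library track whose name is most similar to the recommendation.

    Assumption: case-insensitive string similarity stands in for the
    paper's music retrieval step.
    """
    if not library:
        raise ValueError("music library is empty")

    def score(name: str) -> float:
        return SequenceMatcher(None, recommended.lower(), name.lower()).ratio()

    return max(library, key=score)


def direct(
    media: List[str],
    describe: Callable[[str], str],                # stand-in for LENS
    write_captions: Callable[[List[str]], List[str]],  # stand-in for ChatGPT
    recommend_music: Callable[[List[str]], str],   # stand-in for ChatGPT
    library: List[str],
) -> Plan:
    """Run the description, captioning, and music selection stages in order."""
    descriptions = [describe(item) for item in media]
    captions = write_captions(descriptions)
    music_name = retrieve_music(recommend_music(descriptions), library)
    return Plan(captions=captions, music_name=music_name)
```

Passing the models in as callables keeps the sketch runnable without any external service; in practice each callable would wrap the corresponding model or API.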