Motion-Guided Latent Diffusion for Temporally Consistent Real-world Video Super-resolution
CoRR (2023)
Abstract
Real-world low-resolution (LR) videos have diverse and complex degradations,
imposing great challenges on video super-resolution (VSR) algorithms to
reproduce their high-resolution (HR) counterparts with high quality. Recently,
the diffusion models have shown compelling performance in generating realistic
details for image restoration tasks. However, the diffusion process has
randomness, making it hard to control the contents of restored images. This
issue becomes more serious when applying diffusion models to VSR tasks because
temporal consistency is crucial to the perceptual quality of videos. In this
paper, we propose an effective real-world VSR algorithm by leveraging the
strength of pre-trained latent diffusion models. To ensure the content
consistency among adjacent frames, we exploit the temporal dynamics in LR
videos to guide the diffusion process by optimizing the latent sampling path
with a motion-guided loss, ensuring that the generated HR video maintains a
coherent and continuous visual flow. To further mitigate the discontinuity of
generated details, we insert a temporal module into the decoder and fine-tune it
with an innovative sequence-oriented loss. The proposed motion-guided latent
diffusion (MGLD) based VSR algorithm achieves significantly better perceptual
quality than state-of-the-art methods on real-world VSR benchmark datasets, validating
the effectiveness of the proposed model design and training strategies.
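The core idea of guiding the latent sampling path can be illustrated with a toy sketch: warp the previous frame's latent with optical flow, then nudge the current latent toward it by a gradient step on an L1 motion-guided loss. This is a minimal illustration, not the paper's implementation; the nearest-neighbor integer-flow `warp` and the function names are simplifying assumptions.

```python
import numpy as np

def warp(latent, flow):
    # Backward-warp a (H, W) latent by a per-pixel (dy, dx) flow field,
    # using nearest-neighbor sampling for simplicity (hypothetical
    # stand-in for the optical-flow warping used in the paper).
    H, W = latent.shape
    ys, xs = np.mgrid[0:H, 0:W]
    src_y = np.clip(ys - flow[..., 0], 0, H - 1).astype(int)
    src_x = np.clip(xs - flow[..., 1], 0, W - 1).astype(int)
    return latent[src_y, src_x]

def motion_guided_step(z_t, z_prev, flow, lr=0.1):
    # One gradient step on the current latent z_t toward the flow-warped
    # previous-frame latent, minimizing the L1 motion-guided loss
    # mean(|z_t - warp(z_prev)|); returns the updated latent and the loss.
    warped = warp(z_prev, flow)
    diff = z_t - warped
    loss = np.abs(diff).mean()
    grad = np.sign(diff) / diff.size  # d(mean |diff|) / d(z_t)
    return z_t - lr * grad, loss
```

In the actual algorithm this correction would be applied at each denoising step of the latent diffusion sampler, so consecutive frames share coherent latent trajectories rather than being sampled independently.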