CoDi: Conditional Diffusion Distillation for Higher-Fidelity and Faster Image Generation
arXiv (Cornell University)(2023)
摘要
Large generative diffusion models have revolutionized text-to-image
generation and offer immense potential for conditional generation tasks such as
image enhancement, restoration, editing, and compositing. However, their
widespread adoption is hindered by the high computational cost, which limits
their real-time application. To address this challenge, we introduce a novel
method dubbed CoDi, that adapts a pre-trained latent diffusion model to accept
additional image conditioning inputs while significantly reducing the sampling
steps required to achieve high-quality results. Our method can leverage
architectures such as ControlNet to incorporate conditioning inputs without
compromising the model's prior knowledge gained during large scale
pre-training. Additionally, a conditional consistency loss enforces consistent
predictions across diffusion steps, effectively compelling the model to
generate high-quality images with conditions in a few steps. Our
conditional-task learning and distillation approach outperforms previous
distillation methods, achieving a new state-of-the-art in producing
high-quality images with very few steps (e.g., 1-4) across multiple tasks,
including super-resolution, text-guided image editing, and depth-to-image
generation.
更多查看译文
关键词
diffusion
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要