Cache Me if You Can: Accelerating Diffusion Models through Block Caching
CVPR 2024(2023)
摘要
Diffusion models have recently revolutionized the field of image synthesis
due to their ability to generate photorealistic images. However, one of the
major drawbacks of diffusion models is that the image generation process is
costly. A large image-to-image network has to be applied many times to
iteratively refine an image from random noise. While many recent works propose
techniques to reduce the number of required steps, they generally treat the
underlying denoising network as a black box. In this work, we investigate the
behavior of the layers within the network and find that 1) the layers' output
changes smoothly over time, 2) the layers show distinct patterns of change, and
3) the change from step to step is often very small. We hypothesize that many
layer computations in the denoising network are redundant. Leveraging this, we
introduce block caching, in which we reuse outputs from layer blocks of
previous steps to speed up inference. Furthermore, we propose a technique to
automatically determine caching schedules based on each block's changes over
timesteps. In our experiments, we show through FID, human evaluation and
qualitative analysis that Block Caching allows to generate images with higher
visual quality at the same computational cost. We demonstrate this for
different state-of-the-art models (LDM and EMU) and solvers (DDIM and DPM).
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要