Transparent Image Layer Diffusion using Latent Transparency
CoRR(2024)
Abstract
We present LayerDiffusion, an approach enabling large-scale pretrained latent
diffusion models to generate transparent images. The method allows generation
of single transparent images or of multiple transparent layers. The method
learns a "latent transparency" that encodes alpha channel transparency into the
latent manifold of a pretrained latent diffusion model. It preserves the
production-ready quality of the large diffusion model by regulating the added
transparency as a latent offset with minimal changes to the original latent
distribution of the pretrained model. In this way, any latent diffusion model
can be converted into a transparent image generator by finetuning it with the
adjusted latent space. We train the model with 1M transparent image layer pairs
collected using a human-in-the-loop collection scheme. We show that latent
transparency can be applied to different open source image generators, or be
adapted to various conditional control systems to achieve applications like
foreground/background-conditioned layer generation, joint layer generation,
structural control of layer contents, etc. A user study finds that in most
cases (97%) users prefer our natively generated transparent content over
previous ad-hoc solutions such as generating and then matting. Users also
report the quality of our generated transparent images is comparable to real
commercial transparent assets like Adobe Stock.
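
The "latent offset" idea in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: the abstract does not describe the encoder, the loss, or the latent shape, so the function names, the 16-element latent, and the 0.01 offset scale below are all hypothetical. The sketch only shows transparency information being added to an existing latent as a small element-wise offset, with the size of the change measured against the original latent.

```python
import random


def adjust_latent(latent, offset):
    """Add a transparency offset to a latent, element by element.

    Both arguments are flat lists of floats (a stand-in for a real latent
    tensor; the shape is an assumption made for illustration).
    """
    if len(latent) != len(offset):
        raise ValueError("latent and offset must have the same length")
    return [z + d for z, d in zip(latent, offset)]


def mean_squared_change(latent, adjusted):
    """Mean squared difference between the original and adjusted latent."""
    return sum((a - z) ** 2 for z, a in zip(latent, adjusted)) / len(latent)


# Toy data: a random "pretrained" latent and a small hypothetical offset.
rng = random.Random(0)
latent = [rng.gauss(0.0, 1.0) for _ in range(16)]
offset = [0.01 * rng.gauss(0.0, 1.0) for _ in range(16)]

adjusted = adjust_latent(latent, offset)
change = mean_squared_change(latent, adjusted)
print(f"mean squared change to the latent: {change:.6f}")
```

Keeping this change small is one way to read the abstract's "minimal changes to the original latent distribution": the adjusted latent stays close to what the pretrained model already expects.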