UFOGen: You Forward Once Large Scale Text-to-Image Generation Via Diffusion GANs

Yanwu Xu,Yang Zhao,Zhisheng Xiao,Tingbo Hou

CVPR 2024（2024）

Boston University

Cited 118|Views440

Abstract

Text-to-image diffusion models have demonstrated remarkable capabilities in transforming text prompts into coherent images, yet the computational cost of the multi-step inference remains a persistent challenge. To address this issue, we present UFOGen, a novel generative model designed for ultra-fast, one-step text-to-image generation. In contrast to conventional approaches that focus on improving samplers or employing distillation techniques for diffusion models, UFOGen adopts a hybrid methodology, integrating diffusion models with a GAN objective. Leveraging a newly introduced diffusion-GAN objective and initialization with pre-trained diffusion models, UFOGen excels in efficiently generating high-quality images conditioned on textual descriptions in a single step. Beyond traditional text-to-image generation, UFOGen showcases versatility in applications. Notably, UFOGen stands among the pioneering models enabling one-step text-to-image generation and diverse downstream tasks, presenting a significant advancement in the landscape of efficient generative models.

Translated text

Key words

diffusion models,text-to-image generation,GANs

Bibtex

AI Read Science

Must-Reading Tree

Example

Generate MRT to find the research sequence of this paper

Data Disclaimer

The page data are from open Internet sources, cooperative publishers and automatic analysis results through AI technology. We do not make any commitments and guarantees for the validity, accuracy, correctness, reliability, completeness and timeliness of the page data. If you have any questions, please contact us by email: report@aminer.cn

Chat Paper

Summary is being generated by the instructions you defined