DreamWalk: Style Space Exploration using Diffusion Guidance
arXiv (2024)

Abstract
Text-conditioned diffusion models can generate impressive images, but fall
short when it comes to fine-grained control. Unlike direct-editing tools like
Photoshop, text-conditioned models require the artist to perform "prompt
engineering," constructing special text sentences to control the style or
amount of a particular subject present in the output image. Our goal is to
provide fine-grained control over the style and substance specified by the
prompt, for example to adjust the intensity of styles in different regions of
the image (Figure 1). Our approach is to decompose the text prompt into
conceptual elements, and apply a separate guidance term for each element in a
single diffusion process. We introduce guidance scale functions to control when
in the diffusion process and where in the image to intervene. Since the
method is based solely on adjusting diffusion guidance, it does not require
fine-tuning or manipulating the internal layers of the diffusion model's neural
network, and can be used in conjunction with LoRA- or DreamBooth-trained models
(Figure 2). Project page: https://mshu1.github.io/dreamwalk.github.io/