ZONE: Zero-Shot Instruction-Guided Local Editing
CoRR (2023)
Abstract
Recent advances in vision-language models like Stable Diffusion have shown
remarkable power in creative image synthesis and editing. However, most existing
text-to-image editing methods encounter two obstacles: First, the text prompt
needs to be carefully crafted to achieve good results, which is not intuitive
or user-friendly. Second, they are insensitive to local edits and can
irreversibly affect non-edited regions, leaving obvious editing traces. To
tackle these problems, we propose a Zero-shot instructiON-guided local image
Editing approach, termed ZONE. We first convert the editing intent from the
user-provided instruction (e.g., "make his tie blue") into specific image
editing regions through InstructPix2Pix. We then propose a Region-IoU scheme
for precise image layer extraction from an off-the-shelf segmentation model. We
further develop an edge smoother based on FFT for seamless blending between the
layer and the image. Our method allows for arbitrary manipulation of a specific
region with a single instruction while preserving the rest. Extensive
experiments demonstrate that ZONE achieves remarkable local editing results
and user-friendliness, outperforming state-of-the-art methods.
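
The abstract names two components without defining them: an IoU-based selection of a segmentation mask and an FFT-based edge smoother. The sketch below illustrates the general idea only and is not the authors' implementation. It assumes that the edit region and the candidate masks are boolean arrays of the same size, that the candidate with the highest plain IoU is kept, and that smoothing is a Gaussian low-pass filter applied to the mask in the frequency domain. The function names (`iou`, `select_mask`, `fft_soften`, `blend`) and the `cutoff` parameter are hypothetical.

```python
import numpy as np


def iou(a, b):
    """Intersection-over-union of two boolean masks of equal shape."""
    a = a.astype(bool)
    b = b.astype(bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(a, b).sum() / union)


def select_mask(edit_region, candidate_masks):
    """Return (index, score) of the candidate mask that best overlaps edit_region.

    Assumption: plain IoU stands in for the paper's Region-IoU scheme,
    whose exact definition is not given in the abstract.
    """
    scores = [iou(edit_region, m) for m in candidate_masks]
    best = int(np.argmax(scores))
    return best, scores[best]


def fft_soften(mask, cutoff=0.1):
    """Turn a hard binary mask into a soft alpha map with a Gaussian low-pass.

    The filter is applied in the frequency domain; cutoff is the Gaussian
    standard deviation in cycles per pixel (a hypothetical parameter).
    The FFT implies wrap-around at the image borders.
    """
    h, w = mask.shape
    fy = np.fft.fftfreq(h)[:, None]   # vertical frequencies, shape (h, 1)
    fx = np.fft.fftfreq(w)[None, :]   # horizontal frequencies, shape (1, w)
    gauss = np.exp(-(fx ** 2 + fy ** 2) / (2.0 * cutoff ** 2))
    spectrum = np.fft.fft2(mask.astype(float))
    alpha = np.fft.ifft2(spectrum * gauss).real
    return np.clip(alpha, 0.0, 1.0)


def blend(layer, image, alpha):
    """Alpha-composite an edited layer over the original image (H x W x C)."""
    a = alpha[..., None]
    return a * layer + (1.0 - a) * image
```

In this sketch, a smaller `cutoff` removes more high frequencies and therefore produces a wider, softer transition band around the layer boundary, while pixels far from the boundary are left essentially unchanged.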