Grounded-Instruct-Pix2Pix: Improving Instruction Based Image Editing with Automatic Target Grounding

Artur Shagidanov, Hayk Poghosyan, Xinyu Gong, Zhangyang Wang, Shant Navasardyan, Humphrey Shi

ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2024)

Abstract
Text-guided image editing has recently attracted significant attention owing to advances in denoising diffusion models. Current methods can execute complex image-editing operations from simple text prompts, but despite impressive results they often fail to restrict the edit to the object of interest specified in the prompt. To this end, we propose a novel framework, Grounded-Instruct-Pix2Pix, capable of localized instruction-guided image editing in a variety of scenarios, including multi-object cases and complex backgrounds. Experiments on a diverse set of images clearly showcase its advantage over recent state-of-the-art approaches, especially in restricting the editing effect to the region of interest. The Grounded-Instruct-Pix2Pix implementation will be available at https://github.com/arthur-71/Grounded-Instruct-Pix2Pix.
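The core idea described in the abstract, confining a diffusion-based edit to an automatically grounded region, can be illustrated with a minimal mask-blending sketch. This is an assumption-laden illustration, not the paper's actual pipeline: the function name `blend_with_mask` and the simple linear blend are ours, and in practice the mask would come from a grounding model and the edited image from an instruction-guided diffusion model such as InstructPix2Pix.

```python
import numpy as np

def blend_with_mask(original, edited, mask):
    """Restrict an edit to a grounded region: keep `edited` pixels where
    the mask is 1 and fall back to `original` elsewhere.

    `original` and `edited` are HxWx3 float arrays in [0, 1];
    `mask` is an HxW float array in [0, 1] (soft masks allowed).
    """
    mask = mask[..., None]  # add a channel axis so the mask broadcasts
    return mask * edited + (1.0 - mask) * original

# Toy example: a 2x2 "image" where only the top-left pixel is edited.
original = np.zeros((2, 2, 3))           # all-black source image
edited = np.ones((2, 2, 3))              # all-white edited output
mask = np.array([[1.0, 0.0],
                 [0.0, 0.0]])            # grounded region: top-left pixel
result = blend_with_mask(original, edited, mask)
```

With a soft (non-binary) mask, the same formula produces a feathered transition at the region boundary, which is a common way to avoid visible seams between edited and untouched areas.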
Keywords
text-guided image editing, deep generative models, machine learning