Entwined Inversion: Tune-Free Inversion For Real Image Faithful Reconstruction and Editing

ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2024)

Abstract
Text-conditional image editing is a practical AIGC task that has recently emerged, with considerable commercial and academic research value. For real image editing, most diffusion-model-based methods use DDIM Inversion as the first stage before editing. However, DDIM Inversion often fails to reconstruct the input image, which degrades all downstream edits. To address this problem, we first mathematically analyze why DDIM Inversion fails at reconstruction, and then propose a new inversion and sampling method, Entwined Inversion, that achieves satisfactory reconstruction and editing performance. It meets two major requirements: 1) the edited object retains the main content of the original image; 2) the edited object conforms to the semantics of the text prompt. In addition, our method requires neither training the diffusion model on a large dataset nor fine-tuning on particular images.
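For readers unfamiliar with the reconstruction issue mentioned above, the following is a minimal toy sketch of standard deterministic DDIM inversion followed by DDIM sampling. It is not the paper's analysis and not Entwined Inversion. The noise predictor `toy_eps`, the linear `alpha_bar` schedule, and the function names are all hypothetical stand-ins chosen for illustration. The point it tries to show is that the inversion step evaluates the noise estimate at the current state rather than at the next one, so the round trip is generally not exact.

```python
import numpy as np

def toy_eps(x, t):
    # Hypothetical stand-in for a trained noise predictor eps_theta(x_t, t).
    return np.tanh(x) * (0.5 + 0.5 * t)

def ddim_step(x, a_from, a_to, eps):
    # Deterministic DDIM update between cumulative-alpha levels a_from -> a_to.
    x0_pred = (x - np.sqrt(1.0 - a_from) * eps) / np.sqrt(a_from)
    return np.sqrt(a_to) * x0_pred + np.sqrt(1.0 - a_to) * eps

def invert_then_reconstruct(x0, num_steps):
    # Assumed schedule: alpha_bar falls linearly from 1.0 (clean) to 0.05 (noisy).
    alpha_bar = np.linspace(1.0, 0.05, num_steps + 1)
    ts = np.linspace(0.0, 1.0, num_steps + 1)

    # Inversion: move from clean to noisy. The noise estimate is taken at the
    # current state x_i, although an exact inverse would need it at x_{i+1}.
    x = np.array(x0, dtype=float)
    for i in range(num_steps):
        eps = toy_eps(x, ts[i])
        x = ddim_step(x, alpha_bar[i], alpha_bar[i + 1], eps)
    latent = x.copy()

    # Sampling: move from noisy back to clean with the usual DDIM update.
    for i in range(num_steps, 0, -1):
        eps = toy_eps(x, ts[i])
        x = ddim_step(x, alpha_bar[i], alpha_bar[i - 1], eps)
    return latent, x

if __name__ == "__main__":
    x0 = np.array([0.5, -1.0, 2.0])
    latent, recon = invert_then_reconstruct(x0, num_steps=50)
    print("max reconstruction error:", np.max(np.abs(recon - x0)))
```

In this toy setting the mismatch between the two noise estimates at each step accumulates into a nonzero reconstruction error. How large that error is in a real text-conditioned diffusion model depends on the model and guidance settings, which this sketch does not attempt to reproduce.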
Keywords
Real image editing,Diffusion model,Text-to-image generation,AIGC