Paired-D++ GAN for image manipulation with text

Machine Vision and Applications (2022)

Abstract

Image manipulation with text aims to semantically modify the appearance of an object in a source image according to a text description of its new visual attributes, while retaining text-irrelevant information in the image, such as the background. The task has a wide range of applications, such as intelligent image editing, and is helpful to people who are not skilled at painting. We propose Paired-D++ GAN, a generative adversarial network for image manipulation with text that has a pair of discriminators with different architectures. The two discriminators make different judgments: one for foreground synthesis and the other for background synthesis. The generator of Paired-D++ GAN has an encoder-decoder architecture with skip connections and synthesizes an object appearance that matches the given text description while preserving the other parts of the source image. The two discriminators judge the foreground and the background of the synthesized image separately, checking them against the input text description and the source image, respectively. Paired-D++ GAN is trained effectively using unconditional and conditional adversarial learning in a simultaneous three-player minimax game. Our comprehensive experimental results on the Caltech-200 bird dataset and the Oxford-102 flower dataset show that Paired-D++ GAN semantically synthesizes images that match an input text description while retaining the background of the source image, and that it compares favorably with state-of-the-art methods.
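
The three-player game described above can be illustrated with a minimal sketch. It is not the paper's objective: the abstract does not give the loss functions, so the standard binary cross-entropy GAN loss is assumed here, and the names `bce`, `discriminator_loss`, `generator_loss`, and the `bg_weight` balancing factor are hypothetical. The sketch only shows the structure: each discriminator is trained on its own judgment, and the generator is trained against both at once.

```python
import math


def bce(prob, target):
    """Binary cross-entropy for one probability and a 0/1 target."""
    eps = 1e-12  # keeps log() away from zero
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1 - target) * math.log(1.0 - prob))


def discriminator_loss(p_real, p_fake):
    """One discriminator: score real inputs as 1 and synthesized inputs as 0.

    The same form is used for the foreground and the background
    discriminator; only what they look at differs (an assumption).
    """
    return bce(p_real, 1) + bce(p_fake, 0)


def generator_loss(p_fake_fg, p_fake_bg, bg_weight=1.0):
    """Generator: make both discriminators score the synthesized image as real.

    p_fake_fg: foreground discriminator's score for the synthesized image.
    p_fake_bg: background discriminator's score for the synthesized image.
    bg_weight: hypothetical balancing factor, not taken from the paper.
    """
    return bce(p_fake_fg, 1) + bg_weight * bce(p_fake_bg, 1)


if __name__ == "__main__":
    # Illustrative scores only, not experimental values.
    print("D_fg loss:", discriminator_loss(p_real=0.9, p_fake=0.2))
    print("D_bg loss:", discriminator_loss(p_real=0.8, p_fake=0.3))
    print("G loss:", generator_loss(p_fake_fg=0.2, p_fake_bg=0.3))
```

In a training loop, the two discriminator losses and the generator loss would be minimized in alternation, which is what makes the game a three-player one: the generator cannot lower its loss by satisfying only the foreground or only the background judgment.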

Keywords

Image manipulation, Image manipulation with text, Generative adversarial network, Image synthesis, Paired-discriminator
