An embedded method

Image Communication (2022)

Abstract
To address the low quality and missing specific attributes in text-to-face synthesis, this paper proposes EFA, a general embedding method for strengthening face attributes in text-to-image synthesis models. First, we re-encode the irregular word-level descriptions scattered across sentences to form a word encoding. Then, we design an embedded local feature extraction layer for the discriminators of different models so that they learn more specific information related to face attributes. Next, we associate the word encoding with the extracted face image feature regions to obtain a face attribute domain classification loss for both the real image and the generated image. Finally, during training, we use this loss to constrain the generator and discriminator and improve their performance. The method improves the quality of text-to-face synthesis and strengthens the semantic correlation between the generated image and the text description. Extensive experiments on the recently released Multi-Modal CelebA-HQ dataset verify the validity of our method, and the results are competitive with the state of the art. In particular, our approach improves the FID by 47.75% over AttnGAN, by 33.68% over ControlGAN, by 10.05% over DM-GAN, and by 12.52% over DF-GAN. Code is available at https://github.com/cookie-ke/EFA.

• The text-to-face image synthesis field has strong application prospects.
• The proposed embedded method EFA strengthens the model's learning of salient features.
• EFA can be embedded into different models to improve their performance.
• The effectiveness and versatility of EFA are verified on a face dataset with 4 models.
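The pipeline in the abstract (re-encode attribute words, then score real and generated images against that encoding with a classification loss) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration only: the attribute vocabulary, the substring matching, the function names, and the use of binary cross-entropy are not taken from the paper, and the logits stand in for the output of a discriminator's attribute head.

```python
import math

# Hypothetical attribute vocabulary; the paper's actual attribute set is not stated here.
ATTRIBUTES = ["smiling", "eyeglasses", "beard", "wavy hair", "bangs"]


def encode_attributes(caption, attributes=ATTRIBUTES):
    """Build a multi-hot word encoding: 1.0 where the attribute phrase occurs.

    Assumption: plain substring matching. It ignores negation ("no beard"),
    which a real encoder would have to handle.
    """
    text = caption.lower()
    return [1.0 if attr in text else 0.0 for attr in attributes]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def attribute_classification_loss(logits, targets, eps=1e-7):
    """Mean binary cross-entropy between per-attribute logits and the encoding.

    Assumption: BCE is used as a stand-in for the attribute domain
    classification loss named in the abstract.
    """
    total = 0.0
    for z, t in zip(logits, targets):
        p = min(max(sigmoid(z), eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(targets)


caption = "The woman has wavy hair and is smiling."
target = encode_attributes(caption)

# Made-up logits, as if produced by a discriminator's attribute head.
real_logits = [3.0, -3.0, -3.0, 3.0, -3.0]   # agrees with the caption
fake_logits = [-1.0, 0.5, 0.0, -2.0, 1.0]    # mostly disagrees with the caption

loss_real = attribute_classification_loss(real_logits, target)
loss_fake = attribute_classification_loss(fake_logits, target)
print(target, round(loss_real, 4), round(loss_fake, 4))
```

In a setup like this, the loss on real images would train the attribute head, while the loss on generated images would push the generator toward faces whose predicted attributes match the caption. How EFA actually weights and combines these terms is described in the paper and the linked repository, not in this sketch.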
Keywords
Generative adversarial networks, Text-to-image face image generation, Face synthesis, Visual attributes