Hierarchically-Fused Generative Adversarial Network for Text to Realistic Image Synthesis

2019 16th Conference on Computer and Robot Vision (CRV)

Cited by 11 | Views 100
Abstract
In this paper, we present a novel Hierarchically-fused Generative Adversarial Network (HfGAN) for synthesizing realistic images from text descriptions. While existing approaches to this task have achieved impressive success in generating 256×256 images from captions, they commonly resort to a coarse-to-fine scheme and attach multiple discriminators to different stages of the network, a strategy that is both inefficient and prone to artifacts. Motivated by these findings, we propose an end-to-end network that can generate 256×256 photo-realistic images with only one discriminator. We fully exploit the hierarchical information from different layers and directly generate fine-scale images by adaptively fusing features from multiple hierarchical layers. We quantitatively evaluate the synthesized images with Inception Score, Visual-semantic Similarity, and average training time on the CUB birds, Oxford-102 flowers, and COCO datasets. The results show that our model is more efficient and noticeably outperforms previous state-of-the-art methods.
Keywords
Generative Adversarial Networks, text-to-image synthesis, hierarchical feature fusion
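
The abstract describes generating the 256×256 image directly by adaptively fusing features from several generator hierarchies, supervised by a single discriminator. The PyTorch snippet below is a minimal sketch of that fusion idea only; the stage layout, channel widths, conditioning dimension, and softmax-weighted fusion are illustrative assumptions, not the paper's exact HfGAN architecture.

```python
# Minimal sketch: adaptive fusion of multi-hierarchy generator features into one
# 256x256 output, so a single discriminator can score the final image.
# All sizes and the fusion scheme are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

def up_block(in_ch, out_ch):
    # Nearest-neighbour upsampling followed by a 3x3 conv, a common GAN up-stage.
    return nn.Sequential(
        nn.Upsample(scale_factor=2, mode="nearest"),
        nn.Conv2d(in_ch, out_ch, 3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class HierarchicalFusionGenerator(nn.Module):
    def __init__(self, cond_dim=228, base_ch=64):
        super().__init__()
        self.fc = nn.Linear(cond_dim, base_ch * 8 * 4 * 4)  # 4x4 seed feature map
        self.stages = nn.ModuleList([
            up_block(base_ch * 8, base_ch * 8),  # 4   -> 8
            up_block(base_ch * 8, base_ch * 4),  # 8   -> 16
            up_block(base_ch * 4, base_ch * 4),  # 16  -> 32
            up_block(base_ch * 4, base_ch * 2),  # 32  -> 64
            up_block(base_ch * 2, base_ch),      # 64  -> 128
            up_block(base_ch, base_ch),          # 128 -> 256
        ])
        # 1x1 convs project the fused hierarchies (64, 128, 256) to a common width.
        self.lateral = nn.ModuleList([
            nn.Conv2d(base_ch * 2, base_ch, 1),
            nn.Conv2d(base_ch, base_ch, 1),
            nn.Conv2d(base_ch, base_ch, 1),
        ])
        # Learnable scalars, softmax-normalised, act as adaptive fusion weights.
        self.fusion_logits = nn.Parameter(torch.zeros(3))
        self.to_rgb = nn.Conv2d(base_ch, 3, 3, padding=1)

    def forward(self, cond):
        h = self.fc(cond).view(cond.size(0), -1, 4, 4)
        feats = []
        for stage in self.stages:
            h = stage(h)
            feats.append(h)
        # Fuse the 64x64, 128x128 and 256x256 hierarchies into one fine-scale map.
        selected = [feats[3], feats[4], feats[5]]
        weights = torch.softmax(self.fusion_logits, dim=0)
        fused = 0
        for w, lat, f in zip(weights, self.lateral, selected):
            f = lat(f)
            if f.shape[-1] != 256:
                f = F.interpolate(f, size=256, mode="nearest")
            fused = fused + w * f
        return torch.tanh(self.to_rgb(fused))  # single 256x256 RGB output

# Usage: one forward pass; a single 256x256 discriminator would score the result.
if __name__ == "__main__":
    g = HierarchicalFusionGenerator()
    cond = torch.randn(2, 228)  # e.g. noise concatenated with a text embedding
    print(g(cond).shape)        # torch.Size([2, 3, 256, 256])
```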