Text-to-image synthesis: Starting composite from the foreground content

Information Sciences (2022)

Abstract
Text-to-image synthesis has recently become an active topic in computer vision and has attracted wide attention. Many methods have achieved encouraging results, but further improving the quality of the synthesized images remains a major challenge. In this paper, we propose a multi-stage synthesis method that starts compositing from the foreground content. The synthesis process is divided into three stages. The first stage generates the foreground, and the third stage synthesizes the final image. The second stage takes one of two forms: it either continues to synthesize the foreground, or it synthesizes an image with background information. Experiments show that continuing to generate the foreground in the second stage achieves better results on the Caltech-UCSD Birds (CUB) and Oxford-102 datasets, while synthesizing the foreground only in the first stage performs better on the Microsoft Common Objects in Context (MS COCO) dataset. In addition, our synthesized results on the three datasets are subjectively more realistic, with better handling of detail. The method also outperforms most existing methods in quantitative comparisons, which demonstrates its effectiveness and superiority.
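The stage schedule described in the abstract can be summarized in a small sketch. This is only an illustration of which output each stage targets under the two second-stage options; the names `Variant`, `stage_targets`, and `BETTER_VARIANT` are hypothetical and do not come from the paper, and no network architecture is implied.

```python
from enum import Enum


class Variant(Enum):
    # Hypothetical labels for the two second-stage options in the abstract.
    FOREGROUND_TWICE = "foreground in stages 1 and 2"
    FOREGROUND_ONCE = "foreground in stage 1 only"


def stage_targets(variant: Variant) -> list[str]:
    """Return what each of the three stages synthesizes under a variant."""
    if variant is Variant.FOREGROUND_TWICE:
        second = "foreground"
    else:
        second = "image with background"
    # Stage 1 always produces the foreground; stage 3 always the final image.
    return ["foreground", second, "final image"]


# Variant reported in the abstract as performing better on each dataset.
BETTER_VARIANT = {
    "CUB": Variant.FOREGROUND_TWICE,
    "Oxford-102": Variant.FOREGROUND_TWICE,
    "MS COCO": Variant.FOREGROUND_ONCE,
}

if __name__ == "__main__":
    for dataset, variant in BETTER_VARIANT.items():
        print(dataset, stage_targets(variant))
```

Only the second stage differs between the two variants, which is why the comparison across datasets reduces to a single choice per dataset.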
Keywords
Text-to-image synthesis, Generative adversarial networks, Computer vision, Deep learning