CRFAST: Clip-Based Reference-Guided Facial Image Semantic Transfer

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)(2023)

引用 0|浏览0
暂无评分
摘要
This paper presents a new task for CLIP-based reference-guided facial image semantic transfer: the source facial image is translated to the output image with the high-level semantic attributes from the reference image while maintaining identity preservation. To this end, we employ the powerful generative capability of StyleGAN generator and the rich semantic knowledge of CLIP encoder to accomplish such a task. Additionally, a novel contrastive loss is designed to comprehensively explore the rich semantic information of CLIP for facial semantic concepts. This loss guides the semantic transfer toward desired directions from different perspectives in the pre-defined CLIP space. Besides, a simple yet effective semantic-preserved modulation module is proposed to explicitly map CLIP embeddings of reference image to the latent space. Experiments demonstrate that our approach achieves realistic facial image semantic transfer driven by reference images with various facial semantics.
更多
查看译文
关键词
Vision-Language Pre-trained Model,Generative Adversarial Networks,Contrastive Learning
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要