Vocoder Detection of Spoofing Speech Based on GAN Fingerprints and Domain GeneralizationJust Accepted

Fan Li,Yanxiang Chen, Haiyang Liu,Zuxing Zhao, Yuanzhi Yao,Xin Liao

ACM Transactions on Multimedia Computing, Communications, and Applications(2023)

引用 0|浏览0
暂无评分
摘要
As an important part of the text-to-speech (TTS) system, vocoders convert acoustic features into speech waveforms. The difference in vocoders is key to producing different types of forged speech in the TTS system. With the rapid development of general adversarial networks (GANs), an increasing number of GAN vocoders have been proposed. Detectors often encounter vocoders of unknown types, which leads to a decline in the generalization performance of models. However, existing studies lack research on detection generalization based on GAN vocoders. To solve this problem, this study proposes vocoder detection of spoofed speech based on GAN fingerprints and domain generalization. The framework can widen the distance between real speech and forged speech in feature space, improving the detection model’s performance. Specifically, we utilize a fingerprint extractor based on an autoencoder to extract GAN fingerprints from vocoders. We then weight them to the forged speech for subsequent classification to learn the forged speech features with high differentiation. Subsequently, domain generalization is used to further improve the generalization ability of the model for unseen forgery types. We achieve domain generalization using domain-adversarial learning and asymmetric triplet loss to learn a better generalized feature space in which real speech is compact and forged speech synthesized by different vocoders is dispersed. Finally, to optimize the training process, curriculum learning is used to dynamically adjust the contributions of the samples with different difficulties in the training process. Experimental results show that the proposed method achieves the most advanced detection results among four GAN vocoders. The code is available at https://github.com/multimedia-infomation-security/GAN-Vocoder-detection.
更多
查看译文
关键词
Speech Forgery,Vocoder,GAN Fingerprint,Domain Generalization,Curriculum Learning
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要