Speaker Embedding Conversion for Backward and Cross-Channel Compatibility

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Abstract
The accuracy of automatic speaker verification (ASV) systems has improved tremendously thanks to recent breakthroughs in low-rank speaker representations and deep learning techniques, leading to the success of ASV in real-world applications ranging from call centers to mobile applications and smart devices. In particular, some ASV providers have been migrating their legacy systems from the traditional GMM-based i-vector paradigm to the deep-learning-based x-vector paradigm. Additionally, some of them need to run different systems simultaneously for different use cases, such as 8 kHz audio over the phone channel and 16 kHz audio on virtual assistants. In either case, the speaker embeddings extracted from one ASV system are often not compatible with another ASV system, which makes switching between systems cumbersome and costly. In this paper, we address this issue by proposing a highly efficient speaker embedding converter that transforms a speaker embedding extracted from system A into a speaker embedding that can be used by system B. We evaluate the performance of the embedding converter in an i-vector to x-vector upgrade scenario and in a cross-channel compatibility scenario. In both scenarios, we show that the proposed system achieves very low and compelling equal error rates.
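To make the conversion idea concrete, the following is a minimal sketch and not the paper's method: the abstract does not describe the converter's architecture, so a ridge-regularized linear map stands in for it here. The function names (`fit_linear_converter`, `convert`, `cosine_score`), the embedding dimensions, and the synthetic "system A" and "system B" embeddings are all assumptions made for illustration only.

```python
import numpy as np


def fit_linear_converter(emb_a, emb_b, ridge=1e-3):
    """Fit weights and bias so that emb_a @ weights + bias approximates emb_b.

    emb_a and emb_b hold paired embeddings (one row per utterance) from two
    hypothetical systems. This is plain ridge regression, used only as a
    stand-in for whatever converter the paper actually proposes.
    """
    mean_a = emb_a.mean(axis=0)
    mean_b = emb_b.mean(axis=0)
    centered_a = emb_a - mean_a
    centered_b = emb_b - mean_b
    gram = centered_a.T @ centered_a + ridge * np.eye(centered_a.shape[1])
    weights = np.linalg.solve(gram, centered_a.T @ centered_b)
    bias = mean_b - mean_a @ weights
    return weights, bias


def convert(emb_a, weights, bias):
    """Map system-A embeddings into system-B's embedding space."""
    return emb_a @ weights + bias


def cosine_score(x, y):
    """Cosine similarity, a common scoring function for speaker embeddings."""
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))


# Synthetic stand-in data: both "systems" observe the same latent speaker
# factors through different random projections (an assumption, not real data).
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 32))
emb_a = latent @ rng.normal(size=(32, 40)) + 0.01 * rng.normal(size=(500, 40))
emb_b = latent @ rng.normal(size=(32, 64)) + 0.01 * rng.normal(size=(500, 64))

# Fit on the first 400 pairs, then convert the 100 held-out system-A embeddings.
weights, bias = fit_linear_converter(emb_a[:400], emb_b[:400])
converted = convert(emb_a[400:], weights, bias)

# Compare each converted embedding with the true system-B embedding.
scores = [cosine_score(c, t) for c, t in zip(converted, emb_b[400:])]
mean_score = float(np.mean(scores))
print(f"mean cosine similarity on held-out pairs: {mean_score:.3f}")
```

In this toy setting a linear map suffices because both embedding sets are linear views of the same latent factors; real i-vector and x-vector spaces are not related this simply, so a learned nonlinear converter would likely be needed in practice.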
Keywords
speaker embedding conversion, cross-channel compatibility, automatic speaker verification systems, low-rank speaker representations, deep learning techniques, real-world applications, mobile applications, ASV providers, legacy systems, traditional GMM based i-vector paradigm, x-vector paradigm, simultaneously different systems, phone channel, ASV system, highly efficient speaker embedding converter, system B, x-vector upgrade scenario, cross channel compatibility scenario, frequency 8.0 kHz, frequency 16.0 kHz