Speaker Recognition using Multiple X-Vector Speaker Representations with Two-Stage Clustering and Outlier Detection Refinement

2022 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech)(2022)

Abstract
This paper presents a novel Variational Bayes x-vector Voice Print Extraction (VBxVPE) system that captures vocal variation using multiple x-vector representations with two-stage clustering and outlier detection, for robust speaker recognition and verification. The approach surpasses state-of-the-art results on the ‘core-core’ and ‘core-multi’ evaluation conditions of the Speakers in the Wild dataset. On the ‘core-core’ condition, it achieves an Equal Error Rate of 1.06%, a Cost of Detection score of 0.052, a minimum Cost of Detection score of 0.010, and a Speaker Identification Accuracy of 95.84%, with Precision, Recall and F1 score values of 0.964, 0.958 and 0.961, respectively. On the ‘core-multi’ condition, it achieves an Equal Error Rate of 1.07%, a Cost of Detection score of 0.066, and a minimum Cost of Detection score of 0.010, with Precision, Recall and F1 score values of 0.967, 0.963 and 0.965, respectively.
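For readers unfamiliar with the metrics quoted in the abstract, the following is a minimal sketch of how an Equal Error Rate and an F1 score are commonly computed. It is not the authors' evaluation code: the function names `equal_error_rate` and `f1_score`, the threshold sweep, and the toy scores are all assumptions made for illustration, and the paper's Precision, Recall and F1 values may be averaged across speakers in a way this sketch does not model.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Return (eer, threshold) at the point where the false-accept rate
    and false-reject rate are closest. A trial is accepted when its
    score is >= threshold. Illustrative helper, not from the paper."""
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best = None
    for t in thresholds:
        # False-accept rate: impostor trials wrongly accepted.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        # False-reject rate: genuine trials wrongly rejected.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Report the midpoint of the two rates as the EER estimate.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Toy verification scores (hypothetical, not taken from the paper).
target = [0.9, 0.8, 0.7, 0.4]       # same-speaker trials
nontarget = [0.6, 0.3, 0.2, 0.1]    # different-speaker trials
eer, threshold = equal_error_rate(target, nontarget)
```

As a sanity check, applying `f1_score` to the Precision and Recall pairs reported above (0.964 and 0.958; 0.967 and 0.963) gives values that round to the reported F1 scores of 0.961 and 0.965.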
Keywords
Voice Biometrics, Speaker Recognition, Voice Print Extraction, X-Vectors, Speakers in the Wild