Speaker Recognition Benchmark Using the CHiME-5 Corpus

INTERSPEECH (2019)

Abstract
In this paper, we introduce a speaker recognition benchmark derived from the publicly available CHiME-5 corpus. Our goal is to foster research that tackles the challenging artifacts introduced by far-field multi-speaker recordings of naturally occurring spoken interactions. The benchmark comprises four tasks that involve enrollment and test conditions with single-speaker and/or multi-speaker recordings. Additionally, it supports performance comparisons between close-talking and distant (far-field) microphone recordings, and between single-microphone and microphone-array approaches. We validate the evaluation design with a single-microphone state-of-the-art DNN speaker recognition and diarization system, which we are making publicly available. The results show that the proposed tasks are very challenging and can be used to quantify the performance gap due to the degradations present in far-field multi-speaker recordings.
Keywords
speaker recognition, multi-speaker, far-field speech, robustness