Where Did I Go Wrong?: Identifying Troublesome Segments For Speaker Diarization Systems

13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3(2012)

引用 29|浏览10
暂无评分
摘要
The focus of this work is to identify types of segments that are difficult for speaker diarization systems. The diarization outputs of five state-of-the-art systems are analyzed on short/long segments as well as segments surrounding speaker changepoints. We found that for all five systems as the duration of the segment decreased the diarization error rate (DER) increased. Also, segments immediately preceding and following speaker changepoints performed much worse than their respective counterparts. In fact, at least 40% of the DER for all five systems is attributed to time within 0.5 seconds of a speaker changepoint. We hope the results of this work motivate future improvements of speaker diarization systems.
更多
查看译文
关键词
speaker diarization,error analysis,rich transcription
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要