Human and machine speaker recognition based on short trivial events

2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018

Abstract
Human speech often contains what we call trivial events, e.g., coughs, laughs and sniffs. Compared to regular speech, these trivial events are usually short and variable; they are therefore generally regarded as not speaker discriminative and are largely ignored in current speaker recognition research. However, they are highly valuable in particular circumstances such as forensic examination: because they are less subject to intentional change, they can be used to identify the genuine speaker in disguised speech. In this paper, we collect a trivial event speech database covering 75 speakers and 6 types of events, and report preliminary speaker recognition results on this database for both human listeners and machines. In particular, the deep feature learning technique recently proposed by our group is used to analyze and recognize the trivial events, yielding acceptable equal error rates (EERs) ranging from 5% to 15% despite the extremely short durations (0.2-0.5 seconds) of these events. Among the event types compared, 'hmm' appears more speaker discriminative than the others.
Keywords
speaker recognition, speech perception, deep neural network, speaker feature learning