Towards Intoxicated Speech Recognition

2017 International Joint Conference on Neural Networks (IJCNN), 2017

Cited by 7
Abstract
In real-life scenarios, the acoustic characteristics of speech often suffer from variations induced by diverse environmental noises and different speakers. To overcome the speaker-related speech variation problem for Automatic Speech Recognition (ASR), many speaker adaptation techniques have been proposed and studied. Almost all of these studies, however, consider only the speakers' long-term traits, such as age, gender, and dialect. Speakers' short-term states, such as affect and intoxication, have been largely ignored. In this study, we address one particular speaker state, alcohol intoxication, which has rarely been studied in the context of ASR. To do this, empirical experiments are performed on the publicly available database used for the INTERSPEECH 2011 Speaker State Challenge, Intoxication Sub-Challenge. The experimental results show that the intoxicated state of the speaker indeed degrades the performance of ASR systems by a large margin for all three speech styles considered (spontaneous speech, tongue twisters, command & control). In addition, this paper further shows that multi-condition training can notably improve the acoustic model.
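The abstract's remedy, multi-condition training, amounts to pooling speech from both the matched (sober) and mismatched (intoxicated) conditions before training the acoustic model. The sketch below illustrates the idea only; it is not the paper's pipeline. Synthetic MFCC-like features stand in for the corpus, and a GaussianMixture stands in for a full acoustic model; all names and parameters here are illustrative assumptions.

```python
# Minimal sketch of multi-condition acoustic-model training (illustrative, not the
# paper's setup). Synthetic 13-dim features mimic MFCC frames; a GMM stands in
# for the acoustic model.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

def fake_features(n_frames, shift):
    # Hypothetical MFCC-like frames; 'shift' loosely mimics intoxication-induced variation.
    return rng.normal(loc=shift, scale=1.0, size=(n_frames, 13))

sober_train = fake_features(2000, shift=0.0)
intox_train = fake_features(2000, shift=0.8)
intox_test  = fake_features(500,  shift=0.8)

# Single-condition model: trained on sober speech only.
sober_model = GaussianMixture(n_components=8, random_state=0).fit(sober_train)

# Multi-condition model: pool sober and intoxicated speech before training.
multi_model = GaussianMixture(n_components=8, random_state=0).fit(
    np.vstack([sober_train, intox_train])
)

# A higher average log-likelihood on intoxicated test frames indicates a
# better-matched model for the intoxicated condition.
print("sober-only model :", sober_model.score(intox_test))
print("multi-condition  :", multi_model.score(intox_test))
```

In an actual ASR system the same pooling is applied to transcribed training utterances from both conditions, and the benefit is measured as a reduction in word error rate on intoxicated test speech.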
Keywords
acoustic characteristics, speaker-related speech variation problem, automatic speech recognition, ASR, speaker adaptation, alcohol intoxication, INTERSPEECH 2011 Speaker State Challenge, multi-condition training