Factorial Modeling For Effective Suppression Of Directional Noise

18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION (2017)

Abstract
The assumed scenario is the transcription of a face-to-face conversation, for example in the financial industry, where an agent and a customer talk across a desk with microphones placed between them. From the automatic speech recognition (ASR) perspective, one of the speakers is the target speaker, and the other speaker is a directional noise source. When the number of microphones is small, we often accept microphone intervals that are larger than the spatial aliasing limit because the performance of the beamformer is better. Unfortunately, such a configuration results in significant leakage of directional noise in certain frequency bands, because spatial aliasing makes the beamformer and post-filter inaccurate there. Thus, we introduce a factorial model that compensates only the degraded bands, using information from the reliable bands, in a probabilistic framework that integrates our proposed metrics and a speech model. In our experiments, the proposed method reduced the errors from 29.8% to 24.9%.
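To make the "spatial aliasing limit" mentioned above concrete, the sketch below applies the standard half-wavelength rule for a pair of microphones: spacing d is free of spatial aliasing only up to f = c / (2d). This is a textbook relation, not the paper's proposed metric. The function names, the assumed speed of sound (343 m/s), and the example spacing are illustrative assumptions and do not come from the paper.

```python
# Minimal sketch of the half-wavelength spatial aliasing rule.
# Assumptions (not from the paper): speed of sound of 343 m/s,
# hypothetical function names, and the example spacing used below.

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at roughly 20 degrees C


def spatial_aliasing_limit_hz(mic_spacing_m, speed_of_sound=SPEED_OF_SOUND_M_S):
    """Highest frequency that a given microphone spacing samples without
    spatial aliasing, using the rule spacing <= wavelength / 2."""
    if mic_spacing_m <= 0:
        raise ValueError("mic_spacing_m must be positive")
    return speed_of_sound / (2.0 * mic_spacing_m)


def max_spacing_m(max_freq_hz, speed_of_sound=SPEED_OF_SOUND_M_S):
    """Largest spacing that stays alias-free up to max_freq_hz."""
    if max_freq_hz <= 0:
        raise ValueError("max_freq_hz must be positive")
    return speed_of_sound / (2.0 * max_freq_hz)


if __name__ == "__main__":
    spacing = 0.10  # hypothetical 10 cm interval between two microphones
    limit = spatial_aliasing_limit_hz(spacing)
    print(f"Spacing {spacing:.2f} m is alias-free up to about {limit:.0f} Hz")
    print(f"Alias-free up to 8 kHz needs spacing <= {max_spacing_m(8000.0) * 100:.2f} cm")
```

Under these assumptions, a spacing wider than the second figure leaves part of a 0 to 8 kHz speech band above the aliasing limit, which is the kind of configuration the abstract describes as causing leakage in certain frequency bands.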
Keywords
microphone array, post-filtering, beamformer, speech recognition, factorial model