Teager Energy Subband Filtered Features for Near and Far-Field Automatic Speech Recognition

2021 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) (2021)

Abstract
Automatic Speech Recognition (ASR) usually performs better in close-talking microphone environments than in far-field conditions. A major challenge for far-field ASR systems is handling background noise, multipath reflections, and reverberation, all of which degrade the quality of the speech signal. To that end, we propose Teager energy-based Gabor filterbank (TGFB) features that preserve the amplitude and frequency modulation of a resonant signal and improve the time-frequency resolution. In addition, via TGFB features, we exploit the noise suppression capability of the Teager Energy Operator (TEO) to improve ASR performance under the signal degradation conditions caused by far-field speech. The ASR experiments are performed on the LibriSpeech (near-field) and CHiME-3 (far-field) corpora. Marginal improvements were observed for TGFB features over MFCC features in our experiments. We observed that the system combination of TGFB and MFCC features could provide significant improvements over the standalone MFCC features. For the LibriSpeech corpus, a relative improvement in Word Error Rate (WER) of close to 5% was observed. For the CHiME-3 corpus, on the other hand, an average relative improvement of 7.20% was obtained over the baseline features using system-level combination.
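The abstract does not give the operator's formula, so the following is only a minimal sketch of the standard discrete-time Teager Energy Operator, psi[n] = x[n]^2 - x[n-1] * x[n+1], applied to a single signal. It is not the paper's TGFB pipeline: the Gabor filterbank and all later feature-extraction steps are omitted, and the function name `teager_energy` and the test tone parameters are illustrative assumptions.

```python
import math


def teager_energy(x):
    """Standard discrete Teager Energy Operator.

    psi[n] = x[n]^2 - x[n-1] * x[n+1], computed for interior samples only,
    so the result has len(x) - 2 values.
    """
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Illustrative test tone (assumed values, not taken from the paper).
amplitude = 2.0
omega = 0.3  # digital frequency in radians per sample
tone = [amplitude * math.cos(omega * n + 0.5) for n in range(200)]

psi = teager_energy(tone)

# For a pure sinusoid the operator output is constant and depends on
# both amplitude and frequency: A^2 * sin^2(omega).
expected = (amplitude ** 2) * math.sin(omega) ** 2
```

For a pure tone every value in `psi` equals `A^2 * sin^2(omega)` up to floating-point error, which illustrates why the operator is described as tracking both amplitude and frequency modulation. In a subband setting, one would presumably apply it to each filterbank output separately.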
Keywords
Automatic Speech Recognition, Teager Energy Operator, Near and Far-Field