Comparison of modulation features for phoneme recognition

Acoustics Speech and Signal Processing (2010)

Abstract
In this paper, we compare several approaches for extracting modulation frequency features from the speech signal, using a phoneme recognition system. The general framework in these approaches is to decompose the speech signal into a set of sub-bands; amplitude modulations (AM) in the sub-band signals are then used to derive features for automatic speech recognition (ASR). We then propose a feature extraction technique that uses autoregressive (AR) models of sub-band Hilbert envelopes estimated over relatively long segments of the speech signal. The AR models of the Hilbert envelopes are derived using frequency domain linear prediction (FDLP). Features are formed by converting the FDLP envelopes into static and dynamic modulation frequency components. In phoneme recognition experiments on the TIMIT database, the FDLP-based modulation frequency features provide significant improvements over the other techniques (an average relative improvement of 7.5% over the baseline features). Furthermore, a detailed analysis is performed to determine the relative contribution of the various processing stages in the proposed technique.
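The FDLP step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give implementation details, so everything below is an assumption: the function names (`dct2`, `autocorr`, `levinson`, `fdlp_envelope`), the unnormalised type-II DCT, the autocorrelation method of linear prediction, the model order of 10, and the synthetic test signal are all hypothetical choices. The sketch also works on the full-band signal and skips the sub-band decomposition and the conversion to modulation frequency features. The idea it shows is that linear prediction applied to the DCT of a segment yields an all-pole model whose "spectrum" approximates the squared Hilbert envelope along the time axis.

```python
import cmath
import math


def dct2(x):
    """Unnormalised type-II DCT, O(N^2); adequate for a short illustrative segment."""
    n = len(x)
    return [sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
            for k in range(n)]


def autocorr(c, order):
    """Autocorrelation of the sequence c for lags 0..order."""
    n = len(c)
    return [sum(c[i] * c[i + lag] for i in range(n - lag))
            for lag in range(order + 1)]


def levinson(r, order):
    """Levinson-Durbin recursion; returns (a, err) with a[0] == 1."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k                  # updated prediction error
    return a, err


def fdlp_envelope(x, order=10):
    """Approximate squared Hilbert envelope of x via an AR model of its DCT."""
    n = len(x)
    c = dct2(x)                             # frequency-domain representation
    a, err = levinson(autocorr(c, order), order)
    env = []
    for m in range(n):
        # The "frequency" axis of the AR model maps onto time within the segment.
        w = math.pi * (m + 0.5) / n
        denom = sum(a[k] * cmath.exp(-1j * w * k) for k in range(order + 1))
        env.append(err / abs(denom) ** 2)
    return env


# Hypothetical test signal: a carrier whose amplitude bursts around sample 140.
N = 200
x = [(0.05 + math.exp(-0.5 * ((t - 140) / 10.0) ** 2)) * math.cos(0.4 * math.pi * t)
     for t in range(N)]
env = fdlp_envelope(x, order=10)
peak = max(range(N), key=lambda m: env[m])
print("envelope peak near sample", peak)
```

In this toy setting the estimated envelope should peak close to where the amplitude burst was placed, which is the property that makes the AR model usable as a smooth envelope estimate over a long segment.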
Keywords
acoustic signal processing, amplitude modulation, autoregressive processes, frequency modulation, speech recognition, automatic speech recognition, autoregressive models, frequency domain linear prediction (FDLP), modulation frequency, phoneme recognition, speech signal, sub-band Hilbert envelopes, sub-band signal, feature extraction, modulations