An Alternative to MFCCs for ASR.

INTERSPEECH (2020)

Abstract
The Mel scale is the most commonly used frequency warping function for extracting features for automatic speech recognition (ASR) and is known to be quite effective. However, it was not specifically designed for ASR acoustic models based on deep neural networks (DNNs). In this study, we introduce a frequency warping function that is a modified version of the Mel scale. This warping function has two parameters, and we use it to propose a new set of features called modified Mel-frequency cepstral coefficients (modified MFCCs), which use cosine-shaped filters whose bandwidths are computed using a new function. Evaluating the proposed features on a variety of ASR data sets, we see consistent improvements over regular MFCCs and (log) Mel filter bank energies.
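For readers unfamiliar with frequency warping, the following is a minimal Python sketch of the conventional Mel warp and of how filter center frequencies are placed uniformly on the warped axis. The abstract does not give the paper's two-parameter function, so the `scale` and `corner_hz` arguments here are purely illustrative assumptions: with their defaults they reproduce the widely used formula 1127 ln(1 + f/700), and they should not be read as the authors' parameterization. The function names are likewise hypothetical.

```python
import math

def hz_to_mel(f_hz, scale=1127.0, corner_hz=700.0):
    """Warp a frequency in Hz. The defaults give the common Mel formula.

    `scale` and `corner_hz` are illustrative knobs (an assumption),
    not the two parameters used in the paper.
    """
    return scale * math.log(1.0 + f_hz / corner_hz)

def mel_to_hz(m, scale=1127.0, corner_hz=700.0):
    """Inverse of hz_to_mel for the same parameter values."""
    return corner_hz * (math.exp(m / scale) - 1.0)

def center_frequencies(num_filters, low_hz, high_hz,
                       scale=1127.0, corner_hz=700.0):
    """Center frequencies (Hz) spaced uniformly on the warped axis."""
    lo = hz_to_mel(low_hz, scale, corner_hz)
    hi = hz_to_mel(high_hz, scale, corner_hz)
    step = (hi - lo) / (num_filters + 1)
    return [mel_to_hz(lo + step * (i + 1), scale, corner_hz)
            for i in range(num_filters)]

if __name__ == "__main__":
    for f in center_frequencies(5, 20.0, 8000.0):
        print(round(f, 1))
```

In this sketch only `corner_hz` changes where the filters land; `scale` multiplies the whole warped axis and cancels out when the uniformly spaced points are mapped back to Hz.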
Keywords
feature extraction, ASR, Modified Mel