A Generative Auditory Model Embedded Neural Network For Speech Processing
2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP)(2018)
摘要
Before the era of the neural network (NN), features extracted from auditory models have been applied to various speech applications and been demonstrated more robust against noise than conventional speech-processing features. What's the role of auditory models in the current NN era? Are they obsolete? To answer this question, we construct a NN with a generative auditory model embedded to process speech signals. The generative auditory model consists of two stages, the stage of spectrum estimation in the logarithmic-frequency axis by the cochlea and the stage of spectral-temporal analysis in the modulation domain by the auditory cortex. The NN is evaluated in a simple speaker identification task. Experiment results show that the auditory model embedded NN is still more robust against noise, especially in low SNR conditions, than the randomly-initialized NN in speaker identification.
更多查看译文
关键词
generative auditory model, convolutional neural network, multi-resolution, speaker identification
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络