Speaker Adaptation Using The I-Vector Technique For Bottleneck Features
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 (2015)
Abstract
Deep Neural Networks (DNNs) have been widely and successfully applied to speaker-independent Automatic Speech Recognition (ASR). However, these models are not easily adapted to the characteristics of a specific speaker. One recently proposed approach addresses this issue by using the I-vector representation as an additional input to the DNN. The I-vector provides information about the speaker as well as the environmental conditions of a given recording. This approach achieved a significant improvement in a hybrid system that combines a DNN with a Hidden Markov Model (HMM). In this paper, we study the effect of I-vector-based speaker adaptation in the context of stacked bottleneck features. These features, extracted from a second level of DNNs, are modelled by a classical Gaussian Mixture Model (GMM) ASR system. The proposed approach achieved an absolute improvement of 1.2% in word error rate (WER) on an Arabic broadcast news task.
Keywords
DNN, I-Vector, Bottleneck Features, Speech Recognition
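
The input augmentation described in the abstract, where the I-vector is supplied as input to the DNN, can be illustrated with a minimal sketch. It assumes that one fixed I-vector per recording is appended to every acoustic frame before the frames reach the network. The function name `append_ivector` and the toy dimensions are hypothetical and are not taken from the paper, which does not specify these details here.

```python
from typing import List, Sequence


def append_ivector(
    frames: Sequence[Sequence[float]],
    ivector: Sequence[float],
) -> List[List[float]]:
    """Append the same per-recording i-vector to every acoustic frame.

    Assumption: the i-vector is constant over the recording, so each
    frame receives an identical copy of it as extra input dimensions.
    """
    return [list(frame) + list(ivector) for frame in frames]


# Toy example with hypothetical sizes: 3 frames of 4 acoustic features
# each, and a 2-dimensional i-vector for the recording.
frames = [
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.6, 0.7, 0.8],
    [0.9, 1.0, 1.1, 1.2],
]
ivector = [-1.0, 2.0]

augmented = append_ivector(frames, ivector)
```

Each augmented frame has the acoustic dimensions followed by the I-vector dimensions, so the network sees the same speaker and environment descriptor alongside every frame of a recording.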