Flat Start Training of CD-CTC-sMBR LSTM RNN Acoustic Models
2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016
Abstract
We present a recipe for training acoustic models with context dependent (CD) phones from scratch using recurrent neural networks (RNNs). First, we use the connectionist temporal classification (CTC) technique to train a model with context independent (CI) phones directly from the written-domain word transcripts by aligning with all possible phonetic verbalizations. Then, we devise a mechanism to generate a set of CD phones using the CTC CI phone model alignments and train a CD phone model to improve the accuracy. This end-to-end training recipe does not require any previously trained GMM-HMM or DNN model for CD phone generation or alignment, and thus drastically reduces the overall model building time. We show that using this procedure does not degrade the performance of models and allows us to improve models more quickly by updates to pronunciations or training data.
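The flat-start recipe above rests on the CTC criterion, which sums the probability of every frame-level path that collapses to the target label sequence (repeats merged, blanks removed). As a minimal illustration, the sketch below implements the standard CTC forward algorithm in pure Python and cross-checks it against brute-force path enumeration; the toy posteriors, vocabulary, and label sequence are invented for this example and are not from the paper.

```python
import itertools

BLANK = 0  # index of the CTC blank symbol

def collapse(path):
    """CTC collapse: merge consecutive repeats, then drop blanks."""
    out, prev = [], None
    for sym in path:
        if sym != prev and sym != BLANK:
            out.append(sym)
        prev = sym
    return out

def ctc_prob(posteriors, labels):
    """P(labels | posteriors) via the CTC forward (alpha) recursion.

    posteriors: list of per-frame distributions over symbols (lists).
    labels: target sequence without blanks.
    """
    # Interleave blanks around the labels: [a, b] -> [∅, a, ∅, b, ∅]
    ext = [BLANK]
    for l in labels:
        ext += [l, BLANK]
    S, T = len(ext), len(posteriors)
    alpha = [[0.0] * S for _ in range(T)]
    alpha[0][0] = posteriors[0][BLANK]
    if S > 1:
        alpha[0][1] = posteriors[0][ext[1]]
    for t in range(1, T):
        for s in range(S):
            a = alpha[t - 1][s]
            if s >= 1:
                a += alpha[t - 1][s - 1]
            # Skip transition: allowed when the current symbol is not
            # blank and differs from the symbol two positions back.
            if s >= 2 and ext[s] != BLANK and ext[s] != ext[s - 2]:
                a += alpha[t - 1][s - 2]
            alpha[t][s] = a * posteriors[t][ext[s]]
    # Valid endings: final blank or final label.
    return alpha[T - 1][S - 1] + (alpha[T - 1][S - 2] if S > 1 else 0.0)

# Toy example: 3 symbols (blank, "a"=1, "b"=2), 3 frames, target "ab".
posteriors = [
    [0.1, 0.7, 0.2],
    [0.2, 0.3, 0.5],
    [0.3, 0.1, 0.6],
]
labels = [1, 2]
fwd = ctc_prob(posteriors, labels)

# Cross-check: enumerate every length-3 path and sum those that collapse
# to the target (feasible only for toy sizes).
brute = 0.0
for path in itertools.product(range(3), repeat=len(posteriors)):
    if collapse(path) == labels:
        p = 1.0
        for t, sym in enumerate(path):
            p *= posteriors[t][sym]
        brute += p
```

In practice this recursion is run in log space over a lattice of all phonetic verbalizations rather than a single label sequence, but the dynamic program is the same.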
Keywords
Flat start, CTC, LSTM RNN, acoustic modeling