Annealed F-Smoothing As A Mechanism To Speed Up Neural Network Training
18th Annual Conference of the International Speech Communication Association (INTERSPEECH 2017), Vols 1-6: Situated Interaction (2017)
Abstract
In this paper, we describe a method to reduce the overall number of neural network training steps across both the cross-entropy (CE) and sequence training stages. This is achieved by interpolating the frame-level CE and sequence-level sMBR criteria during the sequence training stage. This interpolation is known as f-smoothing and has previously been used only to prevent overfitting during sequence training. In this paper, however, we investigate its application to reducing training time. We explore different interpolation strategies and achieve a reduction of up to 25% in overall training steps with almost no degradation in word error rate (WER). Finally, we explore the generalization of f-smoothing to other tasks.
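The interpolation described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the weight `alpha`, the linear annealing schedule, and the function names are assumptions made for clarity.

```python
def f_smoothing_loss(ce_loss, smbr_loss, alpha):
    """F-smoothed criterion: interpolate frame-level CE with
    sequence-level sMBR; `alpha` weights the CE term."""
    return alpha * ce_loss + (1.0 - alpha) * smbr_loss


def annealed_alpha(step, total_steps, alpha_start=0.5, alpha_end=0.0):
    """Illustrative linear annealing of the CE weight, moving the
    criterion toward pure sequence training as training progresses."""
    frac = min(step / total_steps, 1.0)
    return alpha_start + frac * (alpha_end - alpha_start)
```

For example, with `alpha_start=0.5` the criterion begins as an even mix of CE and sMBR and decays toward pure sMBR; other schedules (step decay, fixed weight) fit the same interface.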