Investigation Of Sampling Techniques For Maximum Entropy Language Modeling Training

2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP)

Abstract
Maximum entropy language models (MaxEnt LMs) are log-linear models that can incorporate various hand-crafted features and non-linguistic information. Standard MaxEnt LMs are computationally expensive for tasks with a large vocabulary because of the normalization term in the denominator. To address this issue, most recent work on MaxEnt LMs has used class-based MaxEnt LMs. However, the performance of class-based MaxEnt LMs can be sensitive to the word clustering, and generating high-quality word classes is itself time-consuming. Motivated by the recent success of sampling techniques in accelerating the training of neural network language models, this paper investigates three widely used sampling techniques, importance sampling, noise contrastive estimation (NCE), and sampled softmax, for MaxEnt LM training. Experimental results on the Google One Billion corpus and an internal speech recognition system demonstrate the effectiveness of sampled softmax and NCE for MaxEnt LM training; importance sampling, however, is not effective despite its similarity to sampled softmax. To our knowledge, this is the first work to apply sampling techniques to MaxEnt LM training.
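To make the setting concrete, the sketch below illustrates one way a log-linear (MaxEnt-style) LM could be trained with sampled softmax, so that the normalization is computed over the target word plus a handful of sampled negatives instead of the full vocabulary. This is not the paper's implementation: the hashed feature function, the uniform proposal distribution q, the toy dimensions, and all function names are assumptions made for illustration.

```python
import numpy as np

# Illustrative sketch only: a log-linear (MaxEnt-style) LM trained with
# sampled softmax. Feature extraction, the proposal distribution q, and the
# toy sizes below are assumptions, not the paper's setup.

rng = np.random.default_rng(0)

VOCAB = 10_000          # vocabulary size
DIM = 50                # number of hashed feature buckets
K = 20                  # sampled negatives per training example
LR = 0.1

theta = np.zeros(DIM)               # MaxEnt weight vector
q = np.full(VOCAB, 1.0 / VOCAB)     # proposal distribution (uniform here)

def features(history, word):
    """Hashed indicator features for a (history, word) pair -- a stand-in
    for the hand-crafted n-gram features a real MaxEnt LM would use."""
    vec = np.zeros(DIM)
    for h in history:
        vec[hash((h, int(word))) % DIM] += 1.0
    return vec

def sampled_softmax_step(history, target):
    """One SGD step: softmax over {target} + K sampled negatives, with logits
    corrected by -log(K * q(w)) so the gradient approximates the full softmax."""
    negatives = rng.choice(VOCAB, size=K, p=q)
    candidates = np.concatenate(([target], negatives))
    feats = np.stack([features(history, w) for w in candidates])
    logits = feats @ theta - np.log(K * q[candidates])
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    # Gradient of the sampled cross-entropy: E_p[f] - f(history, target).
    grad = probs @ feats - feats[0]
    theta[:] -= LR * grad

# Toy usage: histories and targets are integer word ids.
sampled_softmax_step(history=(42, 7), target=123)
```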
Keywords
Maximum entropy language model, importance sampling, noise contrastive estimation, sampled softmax