A Fast Re-scoring Strategy to Capture Long-Distance Dependencies.

EMNLP '11: Proceedings of the Conference on Empirical Methods in Natural Language Processing (2011)

Abstract
A re-scoring strategy is proposed that makes it feasible to capture more long-distance dependencies in natural language. Two-pass strategies have become popular in a number of recognition tasks such as ASR (automatic speech recognition), MT (machine translation) and OCR (optical character recognition). The first pass typically applies a weak language model (n-grams) to a lattice, and the second pass applies a stronger language model to N-best lists. The stronger language model is intended to capture more long-distance dependencies. The proposed method uses an RNN-LM (recurrent neural network language model), which is a long-span LM, to re-score word lattices in the second pass. A hill-climbing method (iterative decoding) is proposed to search over islands of confusability in the word lattice. An evaluation based on Broadcast News shows a 20-fold speedup over basic N-best re-scoring and a word error rate reduction of 8% (relative) on a highly competitive setup.
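To make the hill-climbing idea concrete, here is a minimal Python sketch. It is not the paper's algorithm: it assumes the lattice has already been reduced to a sequence of slots, each holding a few competing words (a rough stand-in for the islands of confusability), and it uses a hypothetical toy scorer, `toy_score`, in place of an RNN-LM. The names `hill_climb`, `toy_score`, and `slots` are illustrative only.

```python
from typing import Callable, List, Sequence


def hill_climb(slots: Sequence[Sequence[str]],
               score: Callable[[List[str]], float],
               max_sweeps: int = 10) -> List[str]:
    """Slot-by-slot hill climbing with a whole-sentence scorer.

    Assumption: each slot lists competing words, and the first word in
    each slot is the first-pass (weak LM) choice.
    """
    current = [alts[0] for alts in slots]
    best = score(current)
    for _ in range(max_sweeps):
        improved = False
        for i, alts in enumerate(slots):
            for word in alts:
                if word == current[i]:
                    continue
                # Change one slot, keep the rest fixed, re-score the sentence.
                candidate = current[:i] + [word] + current[i + 1:]
                s = score(candidate)
                if s > best:
                    current, best, improved = candidate, s, True
        if not improved:
            break  # local optimum: no single-slot change helps
    return current


def toy_score(words: List[str]) -> float:
    """Hypothetical long-span scorer (stand-in for an RNN-LM).

    Rewards subject-verb agreement across intervening words, which a
    short n-gram window would not see.
    """
    s = 0.5 * words.count("the")
    if "dogs" in words and "bark" in words:
        s += 2.0
    if "dog" in words and "barks" in words:
        s += 2.0
    return s


slots = [["the", "a"], ["dog", "dogs"], ["that", "than"],
         ["I"], ["saw"], ["bark", "barks"]]
result = hill_climb(slots, toy_score)
```

Each sweep scores a number of candidates that grows with the sum of the slot sizes, whereas exhaustive enumeration grows with their product. That is the intuition behind searching locally rather than re-scoring a long N-best list. The search can stop at a local optimum, so the result is not guaranteed to be the best path overall.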
Keywords
stronger language model,word lattice,long-distance dependency,natural language,recurrent neural network language,weak language model,pass strategy,automatic speech recognition,optical character recognition,proposed method,fast re-scoring strategy