Variational approximation of long-span language models for LVCSR.

ICASSP (2011)

Citations: 69 | Views: 116
Abstract
Long-span language models that capture syntax and semantics are seldom used in the first pass of large vocabulary continuous speech recognition systems due to the prohibitive search space of sentence hypotheses. Instead, an N-best list of hypotheses is created using tractable n-gram models, and rescored using the long-span models. It is shown in this paper that computationally tractable variational approximations of the long-span models are a better choice than standard n-gram models for first-pass decoding. They not only result in a better first-pass output, but also produce a lattice with a lower oracle word error rate, and rescoring the N-best list from such lattices with the long-span models requires a smaller N to attain the same accuracy. Empirical results on the WSJ, MIT Lectures, NIST 2007 Meeting Recognition and NIST 2001 Conversational Telephone Recognition data sets are presented to support these claims.
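The abstract leaves the construction implicit, but a variational approximation here means choosing the n-gram model q that minimizes the KL divergence KL(p || q) from the long-span model p; over the n-gram family this reduces to maximum-likelihood n-gram estimation on text sampled from p. The sketch below illustrates that sample-and-estimate recipe; ToyLongSpanLM, its next_word_distribution interface, and the add-alpha smoothing are assumptions for illustration, not the paper's implementation.

import random
from collections import Counter, defaultdict

class ToyLongSpanLM:
    # Stand-in for a long-span model (e.g., an RNN LM). A real model conditions
    # on the entire sentence prefix; this toy ignores the history (assumption).
    VOCAB = ["the", "cat", "sat", "</s>"]
    def next_word_distribution(self, history):
        p = 1.0 / len(self.VOCAB)
        return {w: p for w in self.VOCAB}

def sample_sentences(lm, num_sentences, max_len=30):
    # Ancestral sampling: draw whole sentences from the long-span model.
    corpus = []
    for _ in range(num_sentences):
        sent = ["<s>"]
        while len(sent) < max_len:
            dist = lm.next_word_distribution(sent)
            word = random.choices(list(dist), weights=list(dist.values()), k=1)[0]
            sent.append(word)
            if word == "</s>":
                break
        corpus.append(sent)
    return corpus

def train_ngram(corpus, n=3, alpha=0.1):
    # Maximum-likelihood n-gram estimation on the sampled text, with add-alpha
    # smoothing as a simple stand-in for a production smoothing scheme.
    counts = defaultdict(Counter)
    vocab = set()
    for sent in corpus:
        padded = ["<s>"] * (n - 1) + sent[1:]
        vocab.update(padded)
        for i in range(n - 1, len(padded)):
            counts[tuple(padded[i - n + 1:i])][padded[i]] += 1
    V = len(vocab)
    def prob(history, word):
        c = counts[tuple(history[-(n - 1):])]
        return (c[word] + alpha) / (sum(c.values()) + alpha * V)
    return prob

corpus = sample_sentences(ToyLongSpanLM(), num_sentences=2000)
q = train_ngram(corpus, n=3)
print(q(["the", "cat"], "sat"))  # trigram probability under the approximation

In the paper's setting, the sampled corpus would come from the trained long-span model (e.g., a recurrent neural network language model), the estimated n-gram model would drive first-pass decoding, and the long-span model itself would rescore the resulting N-best lists.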
Keywords
approximation theory, decoding, languages, programming language semantics, speech coding, speech recognition, variational techniques, vocabulary, LVCSR system, MIT Lectures, N-best list, NIST 2001 Conversational Telephone Recognition data set, NIST 2007 Meeting Recognition, WSJ, computationally tractable variational approximation, large vocabulary continuous speech recognition systems, long-span language model, oracle word error rate, first-pass decoding, syntax and semantics capture, prohibitive search space of sentence hypotheses, tractable n-gram model, Language Model, Recurrent Neural Network, Variational Inference