An architecture for rapid decoding of large vocabulary conversational speech

INTERSPEECH (2003)

Abstract
This paper addresses the question of how to design a large vocabulary recognition system so that it can simultaneously handle a sophisticated language model, perform state-of-the-art speaker adaptation, and run in one times real time (1 RT). The architecture we propose is based on classical HMM Viterbi decoding, but uses an extremely fast initial speaker-independent decoding to estimate VTL warp factors, feature-space and model-space MLLR transformations that are used in a final speaker-adapted decoding. We present results on past Switchboard evaluation data that indicate that this strategy compares favorably to published unlimited-time systems (running in several hundred times real-time). Coincidentally, this is the system that IBM fielded in the 2003 EARS Rich Transcription evaluation.
Keywords
Viterbi decoder, language model, real time, feature space