An architecture for rapid decoding of large vocabulary conversational speech
INTERSPEECH (2003)
Abstract
This paper addresses the question of how to design a large vocabulary recognition system so that it can simultaneously handle a sophisticated language model, perform state-of-the-art speaker adaptation, and run in one times real time (1 RT). The architecture we propose is based on classical HMM Viterbi decoding, but uses an extremely fast initial speaker-independent decoding to estimate VTL warp factors and feature-space and model-space MLLR transformations that are used in a final speaker-adapted decoding. We present results on past Switchboard evaluation data that indicate that this strategy compares favorably to published unlimited-time systems (running in several hundred times real-time). Coincidentally, this is the system that IBM fielded in the 2003 EARS Rich Transcription evaluation.
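The 1 RT target constrains the whole pipeline: the initial speaker-independent decoding, the estimation of the adaptation parameters, and the final speaker-adapted decoding together must take no longer than the audio itself. The sketch below illustrates that bookkeeping only. The `PassTiming` type, the function names, and all timing figures are hypothetical and are not taken from the paper; the split into three timed stages is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PassTiming:
    """Wall-clock cost of one stage (hypothetical measurement)."""
    name: str
    seconds: float


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio duration; 1.0 corresponds to 1 RT."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds


def total_rtf(passes: list[PassTiming], audio_seconds: float) -> float:
    """Real-time factor of the whole pipeline, summed over all stages."""
    return real_time_factor(sum(p.seconds for p in passes), audio_seconds)


# Illustrative numbers only (assumed, not reported in the paper).
passes = [
    PassTiming("initial speaker-independent decoding", 30.0),
    PassTiming("estimate VTL warp, feature-space and model-space MLLR", 60.0),
    PassTiming("final speaker-adapted decoding", 210.0),
]
audio_seconds = 300.0

rtf = total_rtf(passes, audio_seconds)
print(f"overall real-time factor: {rtf:.2f}")
```

Under this accounting, any time spent in the first pass is taken directly from the budget of the final pass, which is one way to read the emphasis on an extremely fast initial decoding.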
Keywords
Viterbi decoder, language model, real time, feature space