Efficient On-The-Fly Hypothesis Rescoring in a Hybrid GPU/CPU-Based Large Vocabulary Continuous Speech Recognition Engine

13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3(2012)

Abstract
Effectively exploiting the resources available on modern multicore and manycore processors for tasks such as large vocabulary continuous speech recognition (LVCSR) is far from trivial. While prior works have demonstrated the effectiveness of manycore graphics processing units (GPUs) for high-throughput, limited vocabulary speech recognition, these approaches are unsuitable for recognition with large acoustic and language models because GPUs offer only 1-6GB of memory. To overcome this limitation, we introduce a novel architecture for WFST-based LVCSR that jointly leverages GPUs and multicore processors (CPUs) to perform recognition efficiently even when large acoustic and language models are applied. In the proposed approach, recognition is performed on the GPU using an H-level WFST composed with a unigram language model. During decoding, partial hypotheses generated over this network are rescored on-the-fly using a large language model, which resides on the CPU. By maintaining N-best hypotheses during decoding, our proposed architecture obtains accuracy comparable to a standard CPU-based WFST decoder while improving decoding speed by a factor of 11x.
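The rescoring idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the word probabilities, the bigram "large" model, the constant back-off weight, and the helper names (`large_lm_logprob`, `rescore`, `extend`) are all hypothetical, and everything runs on the CPU rather than being split across GPU and CPU. It only shows the score correction that on-the-fly rescoring typically applies when a word is emitted (subtract the unigram score built into the search network, add the score from the larger model) and why keeping N-best hypotheses leaves room for that correction to change the ranking.

```python
import math
from heapq import nlargest

# Hypothetical toy models, stored as log probabilities.
UNIGRAM = {"the": math.log(0.5), "cat": math.log(0.25), "cap": math.log(0.25)}
BIGRAM = {
    ("<s>", "the"): math.log(0.9),
    ("the", "cat"): math.log(0.6),
    ("the", "cap"): math.log(0.1),
}
BACKOFF = math.log(0.4)  # assumed constant back-off weight


def large_lm_logprob(prev, word):
    """Stand-in for the large language model: bigram with unigram back-off."""
    if (prev, word) in BIGRAM:
        return BIGRAM[(prev, word)]
    return BACKOFF + UNIGRAM[word]


def rescore(first_pass_score, prev, word):
    """Swap the unigram score baked into the first pass for the large-LM score."""
    return first_pass_score - UNIGRAM[word] + large_lm_logprob(prev, word)


def extend(hyps, candidates, n_best):
    """Extend each (score, words) hypothesis by each (word, acoustic_logprob)
    candidate, rescore it, and keep only the n_best highest-scoring results."""
    extended = []
    for score, words in hyps:
        prev = words[-1] if words else "<s>"
        for word, acoustic in candidates:
            first_pass = score + acoustic + UNIGRAM[word]
            extended.append((rescore(first_pass, prev, word), words + (word,)))
    return nlargest(n_best, extended)


hyps = [(0.0, ())]
hyps = extend(hyps, [("the", math.log(0.8))], n_best=2)
hyps = extend(hyps, [("cap", math.log(0.6)), ("cat", math.log(0.4))], n_best=2)
best_score, best_words = hyps[0]
print(best_words)
```

In this toy run, "cap" has the higher acoustic score and the same unigram score as "cat", so a unigram-only pass would prefer it. After rescoring with the bigram model, "the cat" ranks first, which is only possible because both hypotheses were kept alive.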
Keywords
Large Vocabulary Continuous Speech Recognition, WFST, On-The-Fly Rescoring, Graphics Processing Units