The JHU ASR System for VOiCES from a Distance Challenge 2019

INTERSPEECH (2019)

Abstract
This paper describes the system developed by the JHU team for automatic speech recognition (ASR) in the VOiCES from a Distance Challenge 2019, which focuses on single-channel distant/far-field audio under noisy conditions. We participated in the Fixed Condition track, where systems may only be trained on an 80-hour subset of the Librispeech corpus provided by the organizers. The training data was first augmented with both background noises and simulated reverberation. We then trained factorized TDNN acoustic models that differed only in their use of i-vectors for adaptation. Both systems used RNN language models, trained on original and reversed text, for rescoring. We submitted three systems: a system using i-vectors (19.4% WER on the development set), a system without i-vectors (19.0% WER), and their lattice-level fusion (17.8% WER). On the evaluation set, our best system achieves 23.9% WER.
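As a reminder of the metric behind all the numbers above, word error rate (WER) is the word-level edit distance between the reference transcript and the hypothesis, divided by the reference length. The sketch below is an illustrative helper (the function name `wer` is our own, not from the paper); real evaluations typically use a standard scoring tool such as NIST sclite.

```python
def wer(ref, hyp):
    """Word error rate between two word lists:
    (substitutions + deletions + insertions) / len(ref),
    computed via word-level Levenshtein distance."""
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i                  # i deletions
    for j in range(len(hyp) + 1):
        dp[0][j] = j                  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + sub)  # match/substitution
    return dp[len(ref)][len(hyp)] / len(ref)
```

For example, a single substitution in a three-word reference yields a WER of 1/3, i.e. 33.3%.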
Keywords
far-field speech recognition, VOiCES Challenge 2019