An Investigation of Multilingual ASR Using End-to-End LF-MMI
2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019
Abstract
The end-to-end lattice-free maximum mutual information (LF-MMI) approach has recently been shown to be beneficial for automatic speech recognition (ASR) in general. More specifically, its end-to-end nature and use of context-independent phone labels make it attractive for multilingual ASR. We show that end-to-end LF-MMI is indeed competitive on a low-resourced multilingual task, comfortably outperforming a connectionist temporal classification (CTC) baseline. We further investigate the feasibility of biphone contexts as a candidate compromise between the context-independent approach and the triphone contexts that usually perform well. We show that biphones do not initially perform well, but can do so after language adaptive training, and conclude that biphones carry language variability but are promising for multilingual ASR.
Keywords
end-to-end LF-MMI, multilingual ASR, CTC, language adaptive training