Massively Multilingual ASR: A Lifelong Learning Solution

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)(2022)

引用 25|浏览36
The development of end-to-end models has largely sped up the research in massively multilingual automatic speech recognition (MMASR). Previous research has demonstrated the feasibility to build high quality MMASR models. In this work, we study the impact of adding more languages and propose a lifelong learning approach to build high quality MMASR systems. Experiments on a 66-language Voice Search task show that we can take a model built on 15 languages and continue training to obtain a 32-language model and similarly to further build a 67-language model. More importantly, models developed in this way achieve better quality compared to those trained from scratch. It maintains similar performance on old languages and achieves competitive results on new ones. This would potentially speed up the development of universal ASR models that recognize speech from any language, any domain and any environment by reusing knowledge learned beforehand.
massive,multilingual,lifelong learning
AI 理解论文