Beyond English-Centric Multilingual Machine Translation

JOURNAL OF MACHINE LEARNING RESEARCH(2021)

Abstract
Existing work in translation has demonstrated the potential of massively multilingual machine translation by training a single model able to translate between any pair of languages. However, much of this work is English-Centric, training only on data that was translated from or to English. While this is supported by large sources of training data, it does not reflect translation needs worldwide. In this work, we create a true Many-to-Many multilingual translation model that can translate directly between any pair of 100 languages. We build and open-source a training data set that covers thousands of language directions with parallel data, created through large-scale mining. Then, we explore how to effectively increase model capacity through a combination of dense scaling and language-specific sparse parameters to create high-quality models. Our focus on non-English-Centric models brings gains of more than 10 BLEU when directly translating between non-English directions, while performing competitively with the best single systems from the Workshop on Machine Translation (WMT). We open-source our scripts so that others may reproduce the data, evaluation, and final M2M-100 model: https://github.com/pytorch/fairseq/tree/master/examples/m2m_100.
Keywords
many-to-many, multilingual machine translation, model scaling, bitext mining, neural networks
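
To make the abstract's contrast between English-Centric and Many-to-Many training concrete, the following minimal sketch counts translation directions for 100 languages. It is only an illustration of the arithmetic: the language codes are hypothetical placeholders, the helper names are not from the paper or from fairseq, and the Many-to-Many count is an upper bound on possible directions (the abstract states only that the mined data covers "thousands" of them).

```python
from itertools import permutations


def english_centric_directions(languages, pivot="en"):
    """Directed pairs (source, target) that have the pivot language on one side."""
    return [(s, t) for s, t in permutations(languages, 2) if pivot in (s, t)]


def many_to_many_directions(languages):
    """All directed pairs (source, target) with source != target."""
    return list(permutations(languages, 2))


# Hypothetical placeholder codes: "en" plus 99 made-up identifiers (100 total).
languages = ["en"] + [f"lang{i:02d}" for i in range(1, 100)]

ec = english_centric_directions(languages)
m2m = many_to_many_directions(languages)

# 99 directions out of English plus 99 into English, versus 100 * 99 in total.
print(len(ec), len(m2m))  # prints: 198 9900
```

Under these assumptions, English-Centric data touches 198 of the 9,900 possible directions, which is why direct non-English pairs depend on the mined parallel data the abstract describes.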