Inference Discrepancy Based Curriculum Learning for Neural Machine Translation

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS(2024)

Abstract
In practice, even a well-trained neural machine translation (NMT) model can still make biased inferences on the training set due to distribution shifts. In the human learning process, if we cannot reproduce something correctly after studying it multiple times, we consider it more difficult. Likewise, a training example that causes a large discrepancy between inference and reference implies higher learning difficulty for the MT model. We therefore propose to adopt the inference discrepancy of each training example as the difficulty criterion and to rank training examples from easy to hard accordingly. In this way, a trained model can guide the curriculum learning process of an initial model identical to itself; we view this training scheme as a pretrained vanilla model guiding the learning process of a curriculum NMT model. In this paper, we assess the effectiveness of the proposed training scheme and examine the influence of translation direction, evaluation metrics, and different curriculum schedules. Experimental results on the translation benchmarks WMT14 English⇒German, WMT17 Chinese⇒English, and the Multitarget TED Talks Task (MTTT) English⇔German, English⇔Chinese, and English⇔Russian demonstrate that our proposed method consistently improves translation performance over a strong Transformer baseline.
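The difficulty criterion described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the token-overlap discrepancy below is an assumed stand-in for the paper's metric-based inference discrepancy, and all function names are hypothetical.

```python
# Hypothetical sketch: score each training example by the discrepancy
# between a pretrained model's inference (hypothesis) and the reference,
# then order examples easy -> hard for curriculum training.
# The unigram-F1-based discrepancy is an illustrative stand-in for the
# evaluation metrics discussed in the paper.

def discrepancy(hypothesis: str, reference: str) -> float:
    """Return 1 - unigram-overlap F1 between hypothesis and reference."""
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp or not ref:
        return 1.0
    overlap = len(set(hyp) & set(ref))
    p, r = overlap / len(hyp), overlap / len(ref)
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return 1.0 - f1

def rank_easy_to_hard(examples):
    """examples: list of (source, reference, model_hypothesis) tuples.

    A low discrepancy means the pretrained model already reproduces
    the reference well, i.e. an "easy" example shown early.
    """
    return sorted(examples, key=lambda ex: discrepancy(ex[2], ex[1]))
```

A curriculum schedule would then expose the ranked examples to a freshly initialized model in easy-to-hard order, gradually widening the training pool.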
Keywords
curriculum learning, machine translation, inference discrepancy, self-paced learning