Impact Of Filtering Generated Pseudo Bilingual Texts In Low-Resource Neural Machine Translation Enhancement: The Case Of Persian-Spanish

AI in Computational Linguistics (2021)

Abstract
Although the Neural Machine Translation (NMT) framework has been shown to be effective when large amounts of training data are available, it is less effective in low-resource conditions. To improve NMT performance in a low-resource setting, we extend the high-quality training data by generating a pseudo-bilingual dataset and then filtering out low-quality alignments using a quality estimation method based on back-translation. We demonstrate that our approach yields significantly higher BLEU scores than a set of baselines. © 2021 The Authors. Published by Elsevier B.V.
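The filtering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the quality estimation metric, the threshold, or which side of each pair is synthetic. The sketch assumes that the source side is genuine and the target side is machine-generated, that `back_translate` is a hypothetical callable mapping a target sentence back into the source language, and that a simple unigram F1 score (`token_f1`) stands in for whatever similarity measure the paper uses. The sentences and the lookup table are toy placeholders, not data from the paper.

```python
from collections import Counter
from typing import Callable, Iterable, List, Tuple


def token_f1(reference: str, hypothesis: str) -> float:
    """Unigram F1 between two whitespace-tokenized sentences (0.0 to 1.0).

    Assumption: a stand-in for the paper's quality estimation score.
    """
    ref = Counter(reference.lower().split())
    hyp = Counter(hypothesis.lower().split())
    overlap = sum((ref & hyp).values())  # clipped count of shared tokens
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def filter_pseudo_pairs(
    pairs: Iterable[Tuple[str, str]],
    back_translate: Callable[[str], str],
    threshold: float = 0.5,  # hypothetical cutoff, not taken from the paper
) -> List[Tuple[str, str]]:
    """Keep (source, pseudo_target) pairs whose round trip resembles the source."""
    kept = []
    for source, pseudo_target in pairs:
        round_trip = back_translate(pseudo_target)
        if token_f1(source, round_trip) >= threshold:
            kept.append((source, pseudo_target))
    return kept


# Toy stand-in for a trained target-to-source NMT model (hypothetical data).
toy_back_table = {
    "el gato duerme": "the cat sleeps",
    "perro azul rápido": "table green",
}


def toy_back_translate(sentence: str) -> str:
    return toy_back_table.get(sentence, "")


pseudo_pairs = [
    ("the cat sleeps", "el gato duerme"),        # round trip matches the source
    ("the dog runs fast", "perro azul rápido"),  # round trip diverges
]
kept_pairs = filter_pseudo_pairs(pseudo_pairs, toy_back_translate)
```

In this toy run only the first pair survives, because its back-translation reproduces the source sentence while the second shares no tokens with it. In practice the threshold would need tuning on held-out data, and a sentence-level BLEU or a learned quality estimator could replace `token_f1`.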
Keywords
computational linguistics, natural language processing, neural machine translation, low-resource languages