Phrase Table Combination Based On Symmetrization Of Word Alignment For Low-Resource Languages

APPLIED SCIENCES-BASEL(2021)

引用 0|浏览1
暂无评分
摘要
Phrase table combination in pivot approaches can be an effective method to deal with low-resource language pairs. The common practice to generate phrase tables in pivot approaches is to use standard symmetrization, i.e., grow-diag-final-and. Although some researchers found that the use of non-standard symmetrization could improve bilingual evaluation understudy (BLEU) scores, the use of non-standard symmetrization has not been commonly employed in pivot approaches. In this study, we propose a strategy that uses the non-standard symmetrization of word alignment in phrase table combination. The appropriate symmetrization is selected based on the highest BLEU scores in each direct translation of source-target, source-pivot, and pivot-target of Kazakh-English (Kk-En) and Japanese-Indonesian (Ja-Id). Our experiments show that our proposed strategy outperforms the direct translation in Kk-En with absolute improvements of 0.35 (a 11.3% relative improvement) and 0.22 (a 6.4% relative improvement) BLEU points for 3-gram and 5-gram, respectively. The proposed strategy shows an absolute gain of up to 0.11 (a 0.9% relative improvement) BLEU points compared to direct translation for 3-gram in Ja-Id. Our proposed strategy using a small phrase table obtains better BLEU scores than a strategy using a large phrase table. The size of the target monolingual and feature function weight of the language model (LM) could reduce perplexity scores.
更多
查看译文
关键词
symmetrization of word alignment, phrase table combination, pivot approach, low-resource languages
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要