Exploring the Role of Monolingual Data in Cross-Attention Pre-training for Neural Machine Translation

Computational Collective Intelligence, ICCCI 2023 (2023)

Abstract
Recent advancements in large pre-trained language models have revolutionized the field of natural language processing (NLP). Despite the impressive results achieved across various NLP tasks, their effectiveness in neural machine translation (NMT) remains limited. The main challenge lies in the mismatch between the language model's pre-training objective and the translation task: the language modeling objective focuses on reconstructing text in a single language without modeling its semantic interaction with other languages. As a result, the cross-attention weights are randomly initialized and must be learned from scratch during NMT training. To overcome this issue, one approach is to use joint monolingual corpora to pre-train the cross-attention weights, strengthening the semantic interaction between the source and target languages. In this paper, we perform extensive experiments to analyze the impact of monolingual data on this pre-training approach and demonstrate its effectiveness in improving NMT performance.
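To illustrate the general idea of pre-training cross-attention weights on monolingual data and transferring them to an NMT decoder, the following is a minimal PyTorch sketch. The module names, the toy denoising objective, and the use of a noised copy of the same monolingual sentence as the "source" side are illustrative assumptions for this sketch, not the paper's exact procedure.

```python
# Hypothetical sketch: pre-train a cross-attention block on monolingual data,
# then reuse its weights to initialize the NMT decoder's cross-attention
# instead of starting from random initialization.
import torch
import torch.nn as nn


class CrossAttentionBlock(nn.Module):
    """Decoder-side cross-attention intended to be transferred to the NMT model."""

    def __init__(self, d_model=512, n_heads=8):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, decoder_states, encoder_states):
        # Queries come from the (target-side) decoder states,
        # keys/values from the encoder states.
        attn_out, _ = self.cross_attn(decoder_states, encoder_states, encoder_states)
        return self.norm(decoder_states + attn_out)


def pretrain_step(block, encoder, decoder_inputs, monolingual_batch, optimizer):
    """One illustrative pre-training step on monolingual embeddings.

    The "source" side is a noised copy of the same monolingual sentence, so the
    cross-attention learns to align and reconstruct without any parallel data.
    """
    noisy = monolingual_batch + 0.1 * torch.randn_like(monolingual_batch)  # toy noising
    encoded = encoder(noisy)
    reconstructed = block(decoder_inputs, encoded)
    loss = nn.functional.mse_loss(reconstructed, monolingual_batch)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    d_model = 512
    block = CrossAttentionBlock(d_model)
    encoder = nn.Sequential(
        nn.Linear(d_model, d_model), nn.ReLU(), nn.Linear(d_model, d_model)
    )
    optimizer = torch.optim.Adam(
        list(block.parameters()) + list(encoder.parameters()), lr=1e-4
    )

    # Dummy "monolingual" sentence embeddings: 8 sentences, 16 tokens, d_model dims.
    mono = torch.randn(8, 16, d_model)
    dec_in = torch.randn(8, 16, d_model)
    print("pre-training loss:", pretrain_step(block, encoder, dec_in, mono, optimizer))

    # After pre-training, block.cross_attn.state_dict() would be loaded into the
    # NMT decoder's cross-attention before training on parallel data.
```

In this sketch, only the cross-attention parameters (and the toy encoder) are updated during pre-training; at NMT training time the learned cross-attention weights replace the usual random initialization, which is the core idea the paper investigates.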
Keywords
Natural Language Processing, Neural Machine Translation, Pre-training, Cross-attention, Monolingual data