Empirical study of pretrained multilingual language models for zero-shot cross-lingual knowledge transfer in generation
arXiv (2023)
Abstract
Zero-shot cross-lingual knowledge transfer enables a multilingual
pretrained language model (mPLM), finetuned on a task in one language, to make
predictions for this task in other languages. While broadly studied for
natural language understanding tasks, this setting is understudied for
generation. Previous works note a frequent problem of generation in the wrong
language and propose approaches to address it, usually using mT5 as a backbone
model. In this work, we test alternative mPLMs, such as mBART and NLLB-200,
considering both full finetuning and parameter-efficient finetuning with adapters.
We find that mBART with adapters performs similarly to mT5 of the same size,
and that NLLB-200 can be competitive in some cases. We also underline the
importance of tuning the learning rate used for finetuning, which helps to
alleviate the problem of generation in the wrong language.
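The parameter-efficient finetuning with adapters that the abstract contrasts with full finetuning typically means inserting small bottleneck modules into each transformer layer and training only those, while the pretrained backbone stays frozen. A minimal PyTorch sketch of such a bottleneck adapter (an illustration of the general technique, not the paper's actual code; the dimensions and zero-initialization choice are assumptions):

```python
import torch
import torch.nn as nn


class BottleneckAdapter(nn.Module):
    """Small residual adapter inserted after a transformer sublayer.

    During parameter-efficient finetuning, only these weights are
    trained; the mPLM backbone's parameters stay frozen.
    """

    def __init__(self, hidden_dim: int, bottleneck_dim: int):
        super().__init__()
        # Project down to a small bottleneck, then back up.
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.ReLU()
        # Zero-init the up-projection so the adapter starts as an
        # identity mapping and does not perturb the pretrained model.
        nn.init.zeros_(self.up.weight)
        nn.init.zeros_(self.up.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection: output = input + adapter transformation.
        return x + self.up(self.act(self.down(x)))


adapter = BottleneckAdapter(hidden_dim=16, bottleneck_dim=4)
x = torch.randn(2, 5, 16)           # (batch, sequence, hidden)
y = adapter(x)
n_params = sum(p.numel() for p in adapter.parameters())
```

Because the up-projection starts at zero, the adapter initially acts as an identity, and its parameter count (two small linear layers per insertion point) is a tiny fraction of the backbone's, which is what makes this finetuning regime parameter-efficient.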