Language-Independent Representations Improve Zero-Shot Summarization
arXiv (2024)
Abstract
Finetuning pretrained models on downstream generation tasks often leads to
catastrophic forgetting in zero-shot conditions. In this work, we focus on
summarization and tackle the problem through the lens of language-independent
representations. After training on monolingual summarization, we perform
zero-shot transfer to new languages or language pairs. We first show that
naively finetuned models are highly language-specific in both output behavior
and internal representations, resulting in poor zero-shot performance. Next,
we propose query-key (QK) finetuning to decouple task-specific knowledge from
the pretrained language generation abilities. Then, after showing the downsides
of the standard adversarial language classifier, we propose a balanced variant
that more directly enforces language-agnostic representations. Moreover, our
qualitative analyses show that removing source-language identity correlates
with zero-shot summarization performance. Our code is openly available.
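
The abstract does not spell out how QK finetuning is implemented; below is a minimal sketch under the assumption that it means updating only the query and key projection matrices of each attention layer while freezing all other pretrained weights. The model checkpoint and the parameter names (`q_proj`, `k_proj`) follow Hugging Face's mBART implementation and are illustrative choices, not the paper's confirmed setup.

```python
# Hypothetical sketch of QK finetuning: train only the attention query/key
# projections, freeze everything else, so task-specific knowledge lands in
# a small subset of weights and pretrained generation abilities stay intact.
from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained("facebook/mbart-large-50")

for name, param in model.named_parameters():
    # Parameter-name matching assumes the BART-style attention module layout.
    param.requires_grad = ("q_proj" in name) or ("k_proj" in name)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable parameters: {trainable:,} / {total:,}")
```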
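Likewise, the "balanced" adversarial language classifier is only named here. One common way to enforce language-agnostic encoder states is to push the classifier's predicted language distribution toward uniform rather than merely reversing its gradient; the sketch below implements that reading. The class `LanguageClassifier` and the loss formulation are assumptions for illustration; the paper's exact balancing scheme may differ.

```python
# Hypothetical sketch: an auxiliary language classifier over encoder states,
# with an adversarial objective that drives its output toward the uniform
# distribution, i.e. the encoder should carry no source-language identity.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LanguageClassifier(nn.Module):
    def __init__(self, hidden_dim: int, num_languages: int):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, num_languages)

    def forward(self, encoder_states: torch.Tensor) -> torch.Tensor:
        # Mean-pool over the sequence dimension, then predict the language.
        return self.proj(encoder_states.mean(dim=1))

def balanced_adversarial_loss(logits: torch.Tensor) -> torch.Tensor:
    # KL divergence between the classifier's prediction and the uniform
    # distribution over languages; minimizing it w.r.t. the encoder removes
    # language-identity information from the representations.
    log_probs = F.log_softmax(logits, dim=-1)
    uniform = torch.full_like(log_probs, 1.0 / logits.size(-1))
    return F.kl_div(log_probs, uniform, reduction="batchmean")
```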