Cross-Task Linearity Emerges in the Pretraining-Finetuning Paradigm
CoRR (2024)
Abstract
The pretraining-finetuning paradigm has become the prevailing trend in modern
deep learning. In this work, we discover an intriguing linear phenomenon in
models that are initialized from a common pretrained checkpoint and finetuned
on different tasks, which we term Cross-Task Linearity (CTL). Specifically, if we
linearly interpolate the weights of two finetuned models, the features in the
weight-interpolated model are approximately equal to the linear interpolation
of features in the two finetuned models at each layer. To our knowledge, such
cross-task linearity has not been noted in prior literature. We provide
comprehensive empirical evidence that CTL consistently occurs for finetuned
models starting from the same pretrained checkpoint. We conjecture that in the
pretraining-finetuning paradigm, neural networks essentially function as linear
maps, mapping from the parameter space to the feature space. Based on this
viewpoint, our study unveils novel insights into explaining model
merging/editing, particularly by translating operations from the parameter
space to the feature space. Furthermore, we delve deeper into the underlying
factors for the emergence of CTL, emphasizing the impact of pretraining.
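The CTL property can be stated concretely: for an interpolation coefficient α, the layer-wise features of the model with weights (1 − α)θ_A + αθ_B approximately equal (1 − α)·f(θ_A) + α·f(θ_B). A minimal sketch of how one might measure this, using a toy two-layer ReLU network with hypothetical stand-ins for the pretrained and finetuned weights (the abstract does not specify an architecture or similarity metric; cosine similarity is an assumption here):

```python
import numpy as np

rng = np.random.default_rng(0)

def features(W, x):
    """Per-layer features of a toy 2-layer ReLU net with weights W = [W1, W2]."""
    h1 = np.maximum(W[0] @ x, 0.0)
    h2 = W[1] @ h1
    return [h1, h2]

# Hypothetical stand-ins: a shared "pretrained" init plus small task-specific updates.
d = 16
pre = [rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(2)]
theta_a = [w + 0.05 * rng.normal(size=w.shape) for w in pre]  # "finetuned" on task A
theta_b = [w + 0.05 * rng.normal(size=w.shape) for w in pre]  # "finetuned" on task B

alpha = 0.5
x = rng.normal(size=d)

# Features of the weight-interpolated model ...
interp_w = [(1 - alpha) * wa + alpha * wb for wa, wb in zip(theta_a, theta_b)]
f_interp = features(interp_w, x)

# ... versus the interpolation of the two models' features.
f_mix = [(1 - alpha) * fa + alpha * fb
         for fa, fb in zip(features(theta_a, x), features(theta_b, x))]

# CTL predicts the two quantities nearly coincide at every layer.
for layer, (u, v) in enumerate(zip(f_interp, f_mix), start=1):
    cos = float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))
    print(f"layer {layer}: cosine similarity = {cos:.4f}")
```

Because both "finetuned" models here are small perturbations of the same init, the measured similarity is high; the paper's claim is that this closeness also holds for genuinely finetuned networks sharing a pretrained checkpoint.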