Fuse to Forget: Bias Reduction and Selective Memorization through Model Fusion.
CoRR (2023)
Abstract
Model fusion research aims to aggregate the knowledge of multiple models to
enhance performance by combining their weights. In this work, we study the
inverse, investigating whether and how model fusion can interfere with and reduce
unwanted knowledge. We delve into the effects of model fusion on the evolution
of learned shortcuts, social biases, and memorization capabilities in
fine-tuned language models. Through several experiments covering text
classification and generation tasks, our analysis highlights that shared
knowledge among models is usually enhanced during model fusion, while unshared
knowledge is usually lost or forgotten. Based on this observation, we
demonstrate the potential of model fusion as a debiasing tool and showcase its
efficacy in addressing privacy concerns associated with language models.
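
As a rough illustration of what "combining their weights" can mean, the sketch below fuses models by taking a weighted average of their parameters. The abstract does not say which fusion method the paper uses, so uniform averaging is an assumption here, and the names `fuse_weights`, `StateDict`, and `coeffs` are hypothetical. Parameters are plain Python lists so the example runs without any deep learning library.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical stand-in for a model's parameters: name -> flat list of floats.
StateDict = Dict[str, List[float]]


def fuse_weights(
    models: Sequence[StateDict],
    coeffs: Optional[Sequence[float]] = None,
) -> StateDict:
    """Fuse models by a weighted average of their parameters.

    Assumption: all models share the same architecture, so they have
    identical parameter names and shapes. With coeffs=None the average
    is uniform.
    """
    if not models:
        raise ValueError("need at least one model to fuse")
    if coeffs is None:
        coeffs = [1.0 / len(models)] * len(models)
    if len(coeffs) != len(models):
        raise ValueError("one coefficient per model is required")

    names = set(models[0])
    for model in models[1:]:
        if set(model) != names:
            raise ValueError("models must have identical parameter names")

    fused: StateDict = {}
    for name in models[0]:
        length = len(models[0][name])
        if any(len(model[name]) != length for model in models):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise weighted sum across all models.
        fused[name] = [
            sum(c * model[name][i] for c, model in zip(coeffs, models))
            for i in range(length)
        ]
    return fused


# Toy example: the first coordinate is identical in both models,
# the second is non-zero in only one of them.
model_a = {"w": [1.0, 0.0]}
model_b = {"w": [1.0, 2.0]}
fused = fuse_weights([model_a, model_b])
print(fused)  # {'w': [1.0, 1.0]}
```

In this toy case the coordinate both models agree on is left unchanged, while the coordinate present in only one model is halved. This is only an arithmetic analogy for the shared versus unshared knowledge observation, not a reproduction of the paper's experiments.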