Cutting Off the Head Ends the Conflict: A Mechanism for Interpreting and Mitigating Knowledge Conflicts in Language Models
CoRR (2024)
Abstract
Recently, retrieval augmentation and tool augmentation have demonstrated a
remarkable capability to expand the internal memory boundaries of language
models (LMs) by providing external context. However, internal memory and
external context inevitably clash, leading to knowledge conflicts within LMs.
In this paper, we aim to interpret the mechanism of knowledge conflicts through
the lens of information flow, and then mitigate conflicts by precise
interventions at the pivotal point. We find that some attention heads in the
later layers have opposite effects: memory heads recall knowledge from
internal memory, while context heads retrieve knowledge from external
context. Moreover, we reveal that the pivotal point at which knowledge
conflicts emerge in LMs is the integration of inconsistent information flows by
memory heads and context heads. Inspired by these insights, we propose a novel
method called Pruning Head via PatH PatcHing (PH3), which can efficiently
mitigate knowledge conflicts by pruning conflicting attention heads without
updating model parameters. PH3 can flexibly control eight LMs to use internal
memory (↑ 44.0%) [...]
can also improve the performance of LMs on open-domain QA tasks. We also
conduct extensive experiments to demonstrate the cross-model, cross-relation,
and cross-format generalization of our method.
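
The idea of steering a model by pruning heads can be illustrated with a minimal toy sketch. It is not the PH3 implementation: the abstract does not give the scoring or pruning procedure, so the helper names (`combine_heads`, `predict`) and all numbers below are hypothetical. The sketch assumes each head adds a fixed contribution to two answer logits, one for the answer stored in internal memory and one for the answer given in the external context, and that pruning a head simply removes its contribution.

```python
# Toy illustration only: hypothetical per-head contributions to two logits,
# ordered as [memory answer, context answer]. Not the PH3 procedure itself.
HEAD_CONTRIBS = [
    [2.0, 0.0],  # head 0: behaves like a "memory head"
    [0.0, 2.5],  # head 1: behaves like a "context head"
    [0.3, 0.3],  # head 2: neutral head
]
ANSWERS = ["memory", "context"]


def combine_heads(head_contribs, pruned=()):
    """Sum the per-head logit contributions, skipping pruned head indices."""
    pruned = set(pruned)
    totals = [0.0] * len(head_contribs[0])
    for idx, contrib in enumerate(head_contribs):
        if idx in pruned:
            continue  # a pruned head contributes nothing to the output
        for j, value in enumerate(contrib):
            totals[j] += value
    return totals


def predict(pruned=()):
    """Return which answer wins after pruning the given heads."""
    logits = combine_heads(HEAD_CONTRIBS, pruned)
    return ANSWERS[logits.index(max(logits))]


if __name__ == "__main__":
    print(predict())            # no pruning: the two sources conflict
    print(predict(pruned=[1]))  # prune the context-like head
    print(predict(pruned=[0]))  # prune the memory-like head
```

In this toy setting, removing the context-like head shifts the prediction toward the memory answer, and removing the memory-like head leaves the context answer in place, without changing any of the remaining contributions. That mirrors, in a very simplified form, the claim that conflicts can be mitigated without updating model parameters.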