Model Editing at Scale leads to Gradual and Catastrophic Forgetting
CoRR (2024)
Abstract
Editing knowledge in large language models is an attractive capability: it allows us to correct facts that were learnt incorrectly during pre-training, as well as update the model with an ever-growing list of new facts. While existing model editing techniques have shown promise, they are usually evaluated using metrics for reliability, specificity, and generalization over one or a few edits. We argue that for model editing to have practical utility, we must be able to make multiple edits to the same model. With this in mind, we evaluate current model editing methods at scale, focusing on two state-of-the-art methods: ROME and MEMIT. We find that as the model is edited sequentially with multiple facts, it continually forgets previously edited facts as well as the ability to perform downstream tasks. This forgetting happens in two phases: an initial gradual but progressive forgetting phase, followed by an abrupt or catastrophic forgetting phase. Both gradual and catastrophic forgetting limit the usefulness of model editing methods at scale: the former makes model editing less effective as multiple edits are made to the model, while the latter caps the scalability of such methods. Our analysis also highlights other key limitations of ROME and MEMIT at scale. With our work, we push for model editing methods to be developed and evaluated with scalability in mind.
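The evaluation protocol the abstract outlines, applying edits one at a time and re-probing all earlier edits after each new one, can be summarised in a short sketch. Below is a minimal Python illustration of that sequential-editing loop; `apply_edit` and `check_fact` are hypothetical placeholders standing in for a concrete editor (such as ROME or MEMIT) and a fact probe, not the paper's actual code.

```python
# Minimal sketch of a sequential-editing evaluation loop.
# `apply_edit` and `check_fact` are hypothetical stand-ins for a
# concrete editing method and its probe; they are not a real API.

def sequential_edit_eval(model, facts, apply_edit, check_fact):
    """Apply facts one at a time and track how many earlier edits survive."""
    retention = []
    for i, fact in enumerate(facts):
        model = apply_edit(model, fact)  # one edit at a time
        # Re-probe every fact edited so far to measure forgetting.
        still_correct = sum(check_fact(model, f) for f in facts[: i + 1])
        retention.append(still_correct / (i + 1))
    # A slow decline in retention reveals gradual forgetting; a sudden
    # collapse marks the catastrophic phase described in the abstract.
    return retention
```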