Memory Injections: Correcting Multi-Hop Reasoning Failures during Inference in Transformer-Based Language Models
CoRR (2023)
Abstract
Answering multi-hop reasoning questions requires retrieving and synthesizing
information from diverse sources. Large Language Models (LLMs) struggle to
perform such reasoning consistently. Here we propose an approach to pinpoint
and rectify multi-hop reasoning failures through targeted memory injections on
LLM attention heads. First, we analyze the per-layer activations of GPT-2
models in response to single and multi-hop prompts. We then propose a mechanism
that allows users to inject pertinent prompt-specific information, which we
refer to as "memories," at critical LLM locations during inference. By thus
enabling the LLM to incorporate additional relevant information during
inference, we enhance the quality of multi-hop prompt completions. We show
empirically that a simple, efficient, and targeted memory injection into a key
attention layer can often increase the probability of the desired next token in
multi-hop tasks, by up to 424%.
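
The injection idea can be illustrated with a toy sketch. This is not the authors' implementation: it assumes the memory is added to an attention layer's output as a sum of token embeddings scaled by a hypothetical magnitude `tau`, and it replaces GPT-2 with a random, unit-normalized embedding matrix that doubles as the unembedding. The names `inject_memory`, `tau`, and `target_id` are illustrative only. For simplicity the injected memory is the target token itself, which makes the effect on the next-token probability easy to verify.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()

def inject_memory(attn_output, memory_token_ids, embedding, tau):
    """Add the tau-scaled sum of memory token embeddings to an attention output.

    Assumption: injection is a plain additive update. attn_output has shape
    (d_model,) and embedding has shape (vocab_size, d_model).
    """
    memory = embedding[memory_token_ids].sum(axis=0)
    return attn_output + tau * memory

rng = np.random.default_rng(0)
vocab_size, d_model = 50, 16

# Toy embedding matrix with unit-norm rows, reused as the unembedding matrix.
embedding = rng.normal(size=(vocab_size, d_model))
embedding /= np.linalg.norm(embedding, axis=1, keepdims=True)

# Stand-in for the attention layer's output at the final prompt position.
attn_output = rng.normal(size=d_model)
target_id = 7  # hypothetical id of the desired next token

p_before = softmax(embedding @ attn_output)[target_id]
injected = inject_memory(attn_output, [target_id], embedding, tau=4.0)
p_after = softmax(embedding @ injected)[target_id]

print(f"target probability before injection: {p_before:.4f}")
print(f"target probability after injection:  {p_after:.4f}")
```

Because the rows are unit-normalized, the injection raises the target logit by exactly `tau` and every other logit by strictly less, so the target probability goes up for any positive `tau`. In a real model the injection would happen inside a chosen layer during the forward pass, and both the layer and the magnitude would need to be selected empirically.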