∞-former: Infinite Memory Transformer

Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), Volume 1: Long Papers (2022)

Abstract
Transformers are unable to model long-term memories effectively, since the amount of computation they must perform grows with the context length. While variations of efficient transformers have been proposed, they all have a finite memory capacity and are forced to drop old information. In this paper, we propose the ∞-former, which extends the vanilla transformer with an unbounded long-term memory. By making use of a continuous-space attention mechanism to attend over the long-term memory, the ∞-former's attention complexity becomes independent of the context length, trading off memory length for precision. To control where precision matters most, the ∞-former maintains "sticky memories," which allow it to model arbitrarily long contexts while keeping the computation budget fixed. Experiments on a synthetic sorting task, language modeling, and document-grounded dialogue generation demonstrate the ∞-former's ability to retain information from long sequences.
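
The abstract describes a continuous-space attention mechanism whose cost does not grow with the context length. Below is a minimal sketch of that idea, assuming a Gaussian radial-basis-function representation of the memory, a ridge-regression fit, and a Gaussian attention density; the function names (rbf_basis, fit_continuous_memory, continuous_attention) and all parameter values are illustrative and not the authors' implementation.

# Minimal sketch (not the authors' code) of continuous-space attention over an
# unbounded memory. Assumptions: Gaussian RBF basis, ridge-regression fit, and
# a Gaussian attention density; names and hyperparameters are illustrative.
import numpy as np

def rbf_basis(t, n_basis=64, width=0.02):
    """Evaluate n_basis Gaussian radial basis functions on [0, 1] at positions t."""
    centers = np.linspace(0.0, 1.0, n_basis)                               # (N,)
    return np.exp(-((t[:, None] - centers[None, :]) ** 2) / (2 * width))   # (|t|, N)

def fit_continuous_memory(X, n_basis=64, ridge=1e-3):
    """Compress a length-L sequence X (L, d) into basis coefficients B (N, d).

    The memory cost is O(N * d), independent of L: arbitrarily long histories
    are re-fit into the same fixed-size basis, trading length for precision.
    """
    L = X.shape[0]
    t = np.linspace(0.0, 1.0, L)
    Psi = rbf_basis(t, n_basis)                           # (L, N)
    # Ridge regression: B = (Psi^T Psi + ridge * I)^-1 Psi^T X
    A = Psi.T @ Psi + ridge * np.eye(n_basis)
    return np.linalg.solve(A, Psi.T @ X)                  # (N, d)

def continuous_attention(B, mu, sigma, n_quad=512):
    """Read from the continuous memory with a Gaussian density N(mu, sigma^2).

    Returns E_{t ~ N(mu, sigma^2)}[ psi(t)^T B ], approximated here by simple
    numerical quadrature on a grid over [0, 1].
    """
    t = np.linspace(0.0, 1.0, n_quad)
    density = np.exp(-0.5 * ((t - mu) / sigma) ** 2)
    density /= density.sum()                              # normalise on the grid
    Psi = rbf_basis(t, B.shape[0])                        # (n_quad, N)
    expected_psi = density @ Psi                          # (N,)
    return expected_psi @ B                               # (d,)

# Usage: 10,000 past hidden states compressed into 64 coefficients, then read
# with an attention density centred near the end of the memory.
X = np.random.randn(10_000, 16)
B = fit_continuous_memory(X, n_basis=64)
context = continuous_attention(B, mu=0.9, sigma=0.05)
print(B.shape, context.shape)                             # (64, 16) (16,)

The point of the sketch is that B has a fixed size (n_basis x d), so reading from an arbitrarily long history costs the same as reading from a short one; the "sticky memories" mentioned in the abstract additionally bias which regions of the history keep more of that fixed precision.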
Keywords
memory