∞-former: Infinite Memory Transformer

Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), Volume 1: Long Papers (2022)

Abstract
Transformers are unable to model long-term memories effectively, since the amount of computation they must perform grows with the context length. While variations of efficient transformers have been proposed, they all have a finite memory capacity and are forced to drop old information. In this paper, we propose the ∞-former, which extends the vanilla transformer with an unbounded long-term memory. By making use of a continuous-space attention mechanism to attend over the long-term memory, the ∞-former's attention complexity becomes independent of the context length, trading off memory length for precision. To control where precision matters most, the ∞-former maintains "sticky memories," which allow it to model arbitrarily long contexts while keeping the computation budget fixed. Experiments on a synthetic sorting task, language modeling, and document-grounded dialogue generation demonstrate the ∞-former's ability to retain information from long sequences.
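
The abstract describes a continuous-space attention mechanism whose cost does not grow with the context length. Below is a minimal sketch of that idea, assuming a Gaussian radial-basis-function representation of the memory, a ridge-regression fit, and a Gaussian attention density; the function names (rbf_basis, fit_continuous_memory, continuous_attention) and all parameter values are illustrative and not the authors' implementation.

# Minimal sketch (not the authors' code) of continuous-space attention over an
# unbounded memory. Assumptions: Gaussian RBF basis, ridge-regression fit, and
# a Gaussian attention density; names and hyperparameters are illustrative.
import numpy as np

def rbf_basis(t, n_basis=64, width=0.02):
    """Evaluate n_basis Gaussian radial basis functions on [0, 1] at positions t."""
    centers = np.linspace(0.0, 1.0, n_basis)                               # (N,)
    return np.exp(-((t[:, None] - centers[None, :]) ** 2) / (2 * width))   # (|t|, N)

def fit_continuous_memory(X, n_basis=64, ridge=1e-3):
    """Compress a length-L sequence X (L, d) into basis coefficients B (N, d).

    The memory cost is O(N * d), independent of L: arbitrarily long histories
    are re-fit into the same fixed-size basis, trading length for precision.
    """
    L = X.shape[0]
    t = np.linspace(0.0, 1.0, L)
    Psi = rbf_basis(t, n_basis)                           # (L, N)
    # Ridge regression: B = (Psi^T Psi + ridge * I)^-1 Psi^T X
    A = Psi.T @ Psi + ridge * np.eye(n_basis)
    return np.linalg.solve(A, Psi.T @ X)                  # (N, d)

def continuous_attention(B, mu, sigma, n_quad=512):
    """Read from the continuous memory with a Gaussian density N(mu, sigma^2).

    Returns E_{t ~ N(mu, sigma^2)}[ psi(t)^T B ], approximated here by simple
    numerical quadrature on a grid over [0, 1].
    """
    t = np.linspace(0.0, 1.0, n_quad)
    density = np.exp(-0.5 * ((t - mu) / sigma) ** 2)
    density /= density.sum()                              # normalise on the grid
    Psi = rbf_basis(t, B.shape[0])                        # (n_quad, N)
    expected_psi = density @ Psi                          # (N,)
    return expected_psi @ B                               # (d,)

# Usage: 10,000 past hidden states compressed into 64 coefficients, then read
# with an attention density centred near the end of the memory.
X = np.random.randn(10_000, 16)
B = fit_continuous_memory(X, n_basis=64)
context = continuous_attention(B, mu=0.9, sigma=0.05)
print(B.shape, context.shape)                             # (64, 16) (16,)

The point of the sketch is that B has a fixed size (n_basis x d), so reading from an arbitrarily long history costs the same as reading from a short one; the "sticky memories" mentioned in the abstract additionally bias which regions of the history keep more of that fixed precision.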
Keywords
memory