Transformer-XL: Attentive Language Models beyond a Fixed-Length Context

Zihang Dai, Zhilin Yang, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov. In Proceedings of ACL 2019, pp. 2978-2988.


Abstract:

Transformer networks have the potential to learn longer-term dependency, but are limited by a fixed-length context in the setting of language modeling. As a solution, we propose a novel neural architecture, Transformer-XL, that enables the Transformer to learn dependency beyond a fixed length without disrupting temporal coherence. Concretely, it consists of a segment-level recurrence mechanism and a novel positional encoding scheme. Our method not only enables capturing longer-term dependency, but also resolves the context fragmentation problem. As a result, Transformer-XL learns dependency that is 80% longer than RNNs and 450% longer than vanilla Transformers, achieves better performance on both short and long sequences, and is up to 1,800+ times faster than vanilla Transformers during evaluation. Notably, it improves the state-of-the-art results of bpc/perplexity to 0.99 on enwik8, 1.08 on text8, 18.3 on WikiText-103, 21.8 on One Billion Word, and 54.5 on Penn Treebank (without finetuning). When trained only on WikiText-103, Transformer-XL manages to generate reasonably coherent, novel text articles with thousands of tokens. Code, pretrained models, and hyperparameters are available in both TensorFlow and PyTorch.
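The segment-level recurrence mechanism amounts to caching the hidden states computed for the previous segment and reusing them, without gradient flow, as extended context when processing the current segment. Below is a minimal PyTorch sketch of that idea, not the authors' implementation: the class name SegmentRecurrentAttention, the mem_len parameter, and the use of nn.MultiheadAttention are illustrative, and the paper's relative positional encoding, causal masking, and cross-layer memory wiring are omitted for brevity.

```python
# Minimal sketch of Transformer-XL-style segment-level recurrence
# (illustrative only; not the authors' implementation).
from typing import Optional
import torch
import torch.nn as nn

class SegmentRecurrentAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int, mem_len: int):
        super().__init__()
        self.mem_len = mem_len
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x: torch.Tensor, memory: Optional[torch.Tensor]):
        # x:      (batch, seg_len, d_model) -- current segment
        # memory: (batch, mem_len, d_model) -- cached states from the previous segment
        if memory is None:
            context = x
        else:
            # Prepend cached states as extra keys/values; detach() stops
            # gradients from flowing back into the previous segment.
            context = torch.cat([memory.detach(), x], dim=1)
        out, _ = self.attn(query=x, key=context, value=context, need_weights=False)
        # New memory: the most recent mem_len hidden states seen by this layer.
        new_memory = context[:, -self.mem_len:].detach()
        return out, new_memory

# Usage: roll the memory forward across consecutive segments of a long sequence.
layer = SegmentRecurrentAttention(d_model=64, n_heads=4, mem_len=32)
memory = None
segments = torch.randn(3, 2, 32, 64)  # 3 segments, batch 2, seg_len 32, d_model 64
for segment in segments:
    output, memory = layer(segment, memory)
```

Because the memory grows the effective attention span by mem_len tokens per layer, the receptive field extends well beyond a single segment while evaluation can proceed segment by segment instead of recomputing the full context at every step.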
