Transformer-XL: Language Modeling with Longer-Term Dependency

William W Cohen
Jaime Carbonell
Quoc V Le

2018.


Abstract:

We propose a novel neural architecture, Transformer-XL, for modeling longer-term dependency. To address the limitation of fixed-length contexts, we introduce a notion of recurrence by reusing the representations from the history. Empirically, we show state-of-the-art (SoTA) results on both word-level and character-level language modeling ...
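The recurrence mechanism mentioned in the abstract, reusing cached hidden states from the previous segment as extra attention context for the current one, can be sketched roughly as follows. This is a hypothetical minimal illustration built on standard PyTorch multi-head attention, not the paper's implementation: the class name RecurrentSegmentLayer and all hyperparameters are invented for the example, and the paper's relative positional encoding scheme is omitted.

```python
# Sketch of segment-level recurrence: hidden states from the previous segment
# are cached and reused (with gradients stopped) as additional context when
# processing the next segment, extending the effective context beyond a
# fixed segment length.
import torch
import torch.nn as nn


class RecurrentSegmentLayer(nn.Module):
    def __init__(self, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, segment, memory=None):
        # Queries come only from the current segment; keys and values also
        # see the cached memory from the previous segment.
        if memory is None:
            context = segment
        else:
            context = torch.cat([memory, segment], dim=1)
        out, _ = self.attn(segment, context, context, need_weights=False)
        out = self.norm(segment + out)
        # Cache the new hidden states for the next segment, detached so that
        # gradients do not propagate across segment boundaries.
        new_memory = out.detach()
        return out, new_memory


# Usage: process a long sequence segment by segment, carrying memory forward.
layer = RecurrentSegmentLayer()
memory = None
long_sequence = torch.randn(2, 512, 64)        # (batch, length, d_model)
for seg in long_sequence.split(128, dim=1):    # fixed-length segments
    hidden, memory = layer(seg, memory)
```

Detaching the cached states keeps training cost bounded to one segment while still letting information flow forward across segments, which is the basic idea behind the recurrence described above.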
