Residual Memory Networks In Language Modeling: Improving The Reputation Of Feed-Forward Networks

18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION (2017)

Abstract
We introduce the Residual Memory Network (RMN) architecture to language modeling. The RMN is a feed-forward neural network architecture that incorporates residual connections and time-delay connections, allowing it to naturally take a substantial time context into account. As this is the first time RMNs are applied to language modeling, we thoroughly investigate their behaviour on the well-studied Penn Treebank corpus. We slightly modify the model for the needs of language modeling, reducing both its time and memory consumption. Our results show that the RMN is a suitable choice for small-sized neural language models: with a test perplexity of 112.7 and as few as 2.3M parameters, it outperforms both a much larger vanilla RNN (PPL 124, 8M parameters) and a similarly sized LSTM (PPL 115, 2.08M parameters), while being less than 3 perplexity points worse than an LSTM twice its size.
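The abstract describes the RMN as a feed-forward network with residual connections and time-delay connections that widen the accessible time context. The sketch below illustrates one plausible reading of that description in Python/PyTorch: each layer combines the current hidden state with a time-delayed copy of it and wraps the transform in a residual connection. The class names, the growing per-layer delays, and the exact wiring are illustrative assumptions, not the paper's precise equations or hyperparameters.

```python
# Minimal sketch of a Residual Memory Network (RMN) style layer for
# language modeling, under the assumptions stated above.
import torch
import torch.nn as nn


class RMNLayer(nn.Module):
    def __init__(self, hidden_size: int, delay: int):
        super().__init__()
        self.delay = delay
        # Projects the concatenation of the current and time-delayed states.
        self.proj = nn.Linear(2 * hidden_size, hidden_size)
        self.act = nn.ReLU()

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, time, hidden_size) hidden states from the previous layer.
        # Time-delay connection: the state from `delay` steps back (zero-padded).
        delayed = torch.zeros_like(h)
        if self.delay < h.size(1):
            delayed[:, self.delay:, :] = h[:, :-self.delay, :]
        combined = torch.cat([h, delayed], dim=-1)
        # Residual connection around the feed-forward transform.
        return h + self.act(self.proj(combined))


class RMNLanguageModel(nn.Module):
    """Stack of RMN layers over word embeddings (illustrative only)."""

    def __init__(self, vocab_size: int, hidden_size: int = 128,
                 num_layers: int = 4, base_delay: int = 1):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        # Growing delays let deeper layers see a progressively wider context.
        self.layers = nn.ModuleList(
            RMNLayer(hidden_size, delay=base_delay * (i + 1))
            for i in range(num_layers)
        )
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        h = self.embed(tokens)   # (batch, time, hidden_size)
        for layer in self.layers:
            h = layer(h)
        return self.out(h)       # next-word logits at each position
```

Because every connection looks only at past time steps, the whole stack remains a feed-forward model over a fixed window, which is what keeps the parameter count and memory footprint small compared with recurrent alternatives.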
Keywords
residual memory networks, feed-forward networks, language modeling