Neural Machine Translation With GRU-Gated Attention Model

IEEE Transactions on Neural Networks and Learning Systems (2020)

Cited by 73 | Views 129
Abstract
Neural machine translation (NMT) heavily relies on context vectors generated by an attention network to predict target words. In practice, we observe that the context vectors for different target words are quite similar to one another and translations with such nondiscriminatory context vectors tend to be degenerative. We ascribe this similarity to the invariant source representations that lack dynamics across decoding steps. In this article, we propose a novel gated recurrent unit (GRU)-gated attention model (GAtt) for NMT. By updating the source representations with the previous decoder state via a GRU, GAtt enables translation-sensitive source representations that then contribute to discriminative context vectors. We further propose a variant of GAtt by swapping the input order of the source representations and the previous decoder state to the GRU. Experiments on the NIST Chinese-English, WMT14 English-German, and WMT17 English-German translation tasks show that the two GAtt models achieve significant improvements over the vanilla attention-based NMT. Further analyses on the attention weights and context vectors demonstrate the effectiveness of GAtt in enhancing the discriminating capacity of representations and handling the challenging issue of overtranslation.
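The sketch below illustrates the mechanism the abstract describes: each source annotation is refreshed with the previous decoder state through a GRU cell before attention is computed, so the resulting context vectors differ across decoding steps. This is a minimal, hypothetical PyTorch rendering for illustration only; the class name, dimensions, and the additive score function are assumptions and not taken from the paper.

```python
# A minimal sketch of GRU-gated attention, assuming an additive score function.
# All names (GattSketch, source_dim, state_dim) are illustrative, not from the paper.
import torch
import torch.nn as nn

class GattSketch(nn.Module):
    def __init__(self, source_dim: int, state_dim: int):
        super().__init__()
        # GRU cell that updates each source annotation with the previous
        # decoder state, making the representations decoding-step dependent.
        self.gate = nn.GRUCell(input_size=state_dim, hidden_size=source_dim)
        # Additive-attention scoring layers (an assumption; the paper may
        # use a different score function).
        self.w_src = nn.Linear(source_dim, source_dim, bias=False)
        self.w_state = nn.Linear(state_dim, source_dim, bias=False)
        self.v = nn.Linear(source_dim, 1, bias=False)

    def forward(self, annotations: torch.Tensor, prev_state: torch.Tensor):
        # annotations: (src_len, source_dim); prev_state: (state_dim,)
        src_len = annotations.size(0)
        state_rep = prev_state.unsqueeze(0).expand(src_len, -1)
        # Update every source annotation with the previous decoder state.
        # Swapping the two arguments here corresponds to the abstract's
        # variant that reverses the GRU's input order.
        updated = self.gate(state_rep, annotations)          # (src_len, source_dim)
        # Attention weights over the updated, translation-sensitive annotations.
        scores = self.v(torch.tanh(self.w_src(updated) + self.w_state(state_rep)))
        weights = torch.softmax(scores.squeeze(-1), dim=0)   # (src_len,)
        # Context vector used to predict the next target word.
        context = (weights.unsqueeze(-1) * updated).sum(dim=0)
        return context, weights

# Usage: one decoding step with random tensors.
atten = GattSketch(source_dim=512, state_dim=512)
H = torch.randn(20, 512)    # encoder annotations for a 20-token source sentence
s_prev = torch.randn(512)   # previous decoder hidden state
ctx, w = atten(H, s_prev)
```

Because the annotations fed to the score function change at every step, the context vectors are less likely to collapse onto one another, which is the discriminative behavior the abstract attributes to GAtt.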
Keywords
Gated recurrent unit (GRU), gated attention model (GAtt), neural machine translation (NMT)