Domain adaptation using neural network joint model.
Computer Speech & Language (2017)
Abstract
Two sets of novel extensions of the NNJM are proposed. The NDAM models, which regularize the loss function with respect to the in-domain model, give an improvement of up to +0.4 BLEU points. The NFM models, which fuse in- and out-domain NNJM models, give an improvement of up to +0.9 BLEU points. The NFM models also beat state-of-the-art phrase-table adaptation methods. The gains obtained from NNJM and phrase-table adaptation were found to be additive.

We explore neural joint models for the task of domain adaptation in machine translation in two ways: (i) we apply state-of-the-art domain adaptation techniques, such as mixture modelling and data selection, using the recently proposed Neural Network Joint Model (NNJM) (Devlin et al., 2014); (ii) we propose two novel approaches to perform adaptation through instance weighting and weight readjustment in the NNJM framework. In our first approach, we propose a pair of models called Neural Domain Adaptation Models (NDAM) that minimize the cross entropy by regularizing the loss function with respect to the in-domain (and optionally the out-domain) model. In the second approach, we present a set of Neural Fusion Models (NFM) that combine the in- and out-domain models by readjusting their parameters based on the in-domain data.

We evaluated our models on the standard task of translating English-to-German and Arabic-to-English TED talks. The NDAM models achieved better perplexities and modest BLEU improvements compared to the baseline NNJM, trained either on in-domain data or on a concatenation of in- and out-domain data. The NFM models, on the other hand, obtained significant improvements of up to +0.9 and +0.7 BLEU points, respectively. We also demonstrate improvements over existing adaptation methods such as instance weighting, phrase-table fill-up, and linear and log-linear interpolation.
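The abstract names linear and log-linear interpolation among the existing adaptation methods used for comparison. As a rough illustration only, the sketch below shows the generic textbook form of both, applied to two toy next-word distributions. It is not the paper's NDAM or NFM, and the function names, the weight `lam`, and the probabilities are assumptions made up for this example.

```python
import math


def linear_interpolation(p_in, p_out, lam):
    """Linear mixture: lam * p_in + (1 - lam) * p_out.

    Assumes both dicts share the same keys (hypothetical toy vocabulary).
    """
    return {w: lam * p_in[w] + (1.0 - lam) * p_out[w] for w in p_in}


def log_linear_interpolation(p_in, p_out, lam):
    """Weighted geometric mean of the two models, renormalised to sum to 1.

    Assumes every probability is strictly positive so the logs are defined.
    """
    unnorm = {
        w: math.exp(lam * math.log(p_in[w]) + (1.0 - lam) * math.log(p_out[w]))
        for w in p_in
    }
    z = sum(unnorm.values())
    return {w: v / z for w, v in unnorm.items()}


# Toy in-domain and out-domain distributions (made-up numbers, not from the paper).
p_in = {"talk": 0.6, "speech": 0.3, "lecture": 0.1}
p_out = {"talk": 0.2, "speech": 0.5, "lecture": 0.3}

# lam = 0.7 is an arbitrary illustrative weight favouring the in-domain model.
linear = linear_interpolation(p_in, p_out, 0.7)
log_linear = log_linear_interpolation(p_in, p_out, 0.7)
```

In practice the interpolation weight would typically be tuned on held-out in-domain data rather than fixed by hand.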
Keywords
Machine translation, Domain adaptation, Neural network joint model, Distributed representation of texts, Noise contrastive estimation