Reward Augmented Maximum Likelihood for Neural Structured Prediction.

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016)

Abstract
A key problem in structured output prediction is direct optimization of the task reward function that matters for test evaluation. This paper presents a simple and computationally efficient approach to incorporate task reward into a maximum likelihood framework. By establishing a link between the log-likelihood and expected reward objectives, we show that an optimal regularized expected reward is achieved when the conditional distribution of the outputs given the inputs is proportional to their exponentiated scaled rewards. Accordingly, we present a framework to smooth the predictive probability of the outputs using their corresponding rewards. We optimize the conditional log-probability of augmented outputs that are sampled proportionally to their exponentiated scaled rewards. Experiments on neural sequence to sequence models for speech recognition and machine translation show notable improvements over a maximum likelihood baseline by using reward augmented maximum likelihood (RML), where the rewards are defined as the negative edit distance between the outputs and the ground truth labels.
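Below is a minimal sketch of the sampling step the abstract describes: augmented targets are drawn proportionally to their exponentiated scaled rewards, with the reward defined as negative edit distance to the ground truth. The candidate set, temperature value, and helper names are illustrative assumptions, not the authors' implementation; in practice the sampled target simply replaces the ground truth in the usual log-likelihood loss.

```python
# Sketch of RML target sampling (assumption: reward = -edit distance, as in the paper's experiments).
import math
import random

def edit_distance(a, b):
    # Standard Levenshtein distance via dynamic programming.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        dp[i][0] = i
    for j in range(len(b) + 1):
        dp[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,      # deletion
                           dp[i][j - 1] + 1,      # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(a)][len(b)]

def sample_augmented_target(y_star, candidates, tau=0.8):
    """Sample one candidate output with probability proportional to
    exp(reward / tau), where reward(y) = -edit_distance(y, y_star)."""
    rewards = [-edit_distance(y, y_star) for y in candidates]
    m = max(rewards)
    weights = [math.exp((r - m) / tau) for r in rewards]  # shift by max for numerical stability
    return random.choices(candidates, weights=weights, k=1)[0]

# Hypothetical usage: candidates are perturbations of the ground truth within a
# small edit radius; the sampled target is then used in the cross-entropy loss.
y_star = list("hello")
candidates = [list("hello"), list("hallo"), list("hell"), list("helloo"), list("jello")]
y_tilde = sample_augmented_target(y_star, candidates, tau=0.8)
print("train on:", "".join(y_tilde))
```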
Keywords
neural structured prediction, augmented maximum likelihood, reward