Beyond Word for Word: Fact Guided Training for Neural Data-to-Document Generation

Lecture Notes in Artificial Intelligence (2019)

Abstract
Recent end-to-end encoder-decoder neural models for data-to-text generation can produce fluent and seemingly informative texts, even though they disregard the traditional content selection and surface realization pipeline. However, texts generated by such neural models often omit important facts and contradict the input data, particularly when generating long texts. To address these issues, we propose a Fact Guided Training (FGT) model that improves both content selection and surface realization by leveraging an information extraction (IE) system. The IE system extracts the facts mentioned in the reference texts and in the generated texts, and these facts provide fact-guided training signals. First, a content selection loss penalizes deviation between the content of generated texts and that of their references. Moreover, given properly selected content, a consistency verification mechanism inspects fact discrepancies between generated texts and their corresponding input data. This consistency signal is non-differentiable and is therefore optimized via reinforcement learning. Experimental results on the recent and challenging ROTOWIRE dataset show that our proposed model outperforms neural encoder-decoder models in both automatic and human evaluations.
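The abstract does not give the exact formulation, but the two signals it describes can be illustrated with a minimal sketch in Python/PyTorch: a count-based content selection penalty over extracted facts, and a REINFORCE-style policy-gradient loss that propagates the non-differentiable consistency reward. All names here (`extract_facts`, `consistency_reward`, the toy fact strings) are hypothetical stand-ins for the paper's IE system and record format, not the authors' implementation.

```python
# Minimal sketch of the two fact-guided signals, assuming PyTorch.
# This is an illustration under assumed interfaces, not the paper's code.

import torch

def extract_facts(text_tokens):
    """Hypothetical stand-in for the IE system: returns the set of
    facts mentioned in a text. In the paper this is a learned IE model."""
    return set(text_tokens)

def content_selection_loss(generated_facts, reference_facts):
    """Penalize deviation between facts in the generated text and the
    reference (a differentiable surrogate would be used in practice;
    here a simple count-based penalty for illustration)."""
    missing = len(reference_facts - generated_facts)  # facts the model dropped
    extra = len(generated_facts - reference_facts)    # facts not in the reference
    return float(missing + extra)

def consistency_reward(generated_facts, input_records):
    """Fraction of generated facts supported by the input data.
    Non-differentiable, so it enters training via REINFORCE below."""
    if not generated_facts:
        return 0.0
    supported = sum(1 for f in generated_facts if f in input_records)
    return supported / len(generated_facts)

def reinforce_loss(log_probs, reward, baseline=0.0):
    """REINFORCE-style policy-gradient loss: scale the negative
    log-likelihood of the sampled tokens by (reward - baseline)."""
    return -(reward - baseline) * log_probs.sum()

# Toy usage: pretend the decoder sampled 4 tokens with these probabilities.
log_probs = torch.log(torch.tensor([0.7, 0.6, 0.9, 0.5]))
gen = extract_facts(["LeBron_pts_30", "Lakers_wins_1"])   # hypothetical facts
ref = extract_facts(["LeBron_pts_30"])
records = {"LeBron_pts_30"}                               # input data records

cs = content_selection_loss(gen, ref)       # content selection penalty
r = consistency_reward(gen, records)        # non-differentiable reward
rl = reinforce_loss(log_probs, r, baseline=0.25)
total = cs + rl                             # combined training signal
print(f"selection={cs:.2f} reward={r:.2f} total={total.item():.2f}")
```

In a real system the selection term would be a differentiable loss over the decoder's distributions, the reward would come from a trained IE model applied to sampled outputs, and the baseline would reduce the variance of the policy gradient.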
Keywords
Generation, Information extraction, Reinforcement learning