Neural CRF Sentence Alignment Model for Text Simplification

Chao Jiang, Mounica Maddela, Wuwei Lan, Yang Zhong, Wei Xu

semanticscholar(2020)

Abstract
The success of a text simplification system heavily depends on the quality and quantity of complex-simple sentence pairs in the training corpus, which are extracted by aligning sentences between parallel articles. To evaluate and improve sentence alignment quality, we create two manually annotated sentence-aligned datasets from two commonly used text simplification corpora. We also propose a novel neural CRF alignment model that not only leverages the sequential nature of sentences in parallel documents but also uses a neural sentence pair model to capture semantic similarity. Experiments demonstrate that our proposed method outperforms all previous approaches on the monolingual sentence alignment task by more than 5 points in F1. We apply our aligner to construct the NEWSELA-AUTO and WIKI-AUTO text simplification datasets, which are larger and of better quality than existing datasets. A Transformer model trained on our datasets establishes a new state of the art for sentence simplification in both automatic and human evaluation.
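The idea of combining a pairwise similarity score with the sequential order of sentences can be illustrated with a toy decoder. The sketch below is not the authors' model: it assumes a token-overlap (Jaccard) score as a stand-in for the neural sentence pair model, a hand-set distance penalty as a stand-in for learned CRF transition features, and a fixed score for the "no alignment" label. The names `jaccard`, `align`, `jump_penalty`, and `null_score` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

NULL = -1  # label meaning "this simple sentence has no aligned complex sentence"


def jaccard(a: str, b: str) -> float:
    """Stand-in similarity: token overlap. (Assumption: the paper uses a neural model here.)"""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def align(simple: List[str], complex_: List[str],
          jump_penalty: float = 0.1, null_score: float = 0.2) -> List[int]:
    """Viterbi decoding over alignment labels.

    Each simple sentence gets one label: the index of a complex sentence, or NULL.
    The path score sums per-sentence similarity and a transition term that
    penalizes consecutive labels pointing to distant complex sentences.
    """
    if not simple:
        return []
    labels = [NULL] + list(range(len(complex_)))

    def emit(i: int, lab: int) -> float:
        return null_score if lab == NULL else jaccard(simple[i], complex_[lab])

    def trans(prev: int, cur: int) -> float:
        # Assumption: transitions into or out of NULL are free.
        if prev == NULL or cur == NULL:
            return 0.0
        return -jump_penalty * abs(cur - prev)

    # score[i][lab] = best score of any label path ending in `lab` at position i
    score: List[Dict[int, float]] = [{lab: emit(0, lab) for lab in labels}]
    back: List[Dict[int, int]] = []
    for i in range(1, len(simple)):
        cur_score: Dict[int, float] = {}
        cur_back: Dict[int, int] = {}
        for lab in labels:
            best_prev = max(labels, key=lambda p: score[-1][p] + trans(p, lab))
            cur_score[lab] = score[-1][best_prev] + trans(best_prev, lab) + emit(i, lab)
            cur_back[lab] = best_prev
        score.append(cur_score)
        back.append(cur_back)

    # Follow back-pointers from the best final label.
    path = [max(labels, key=lambda lab: score[-1][lab])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


complex_doc = [
    "The cat sat on the mat near the door .",
    "It was raining heavily outside all day .",
    "The dog barked loudly at the mailman .",
]
simple_doc = [
    "The cat sat on the mat .",
    "It was raining outside .",
    "The dog barked at the mailman .",
]
alignment = align(simple_doc, complex_doc)
```

Here `alignment[i]` holds the index of the complex sentence chosen for `simple_doc[i]`, or `-1` when the fixed null score beats every candidate. In a trained CRF the emission and transition scores would be learned jointly rather than set by hand as they are in this sketch.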