Grammatical Error Correction via Mixed-Grained Weighted Training.
CoRR (2023)
Abstract
The task of Grammatical Error Correction (GEC) aims to automatically correct
grammatical errors in natural texts. Almost all previous works treat annotated
training data equally and neglect the inherent discrepancies in the data. In
this paper, we consider two aspects of these discrepancies, namely, the
accuracy of data annotation and the diversity of potential annotations. To this
end, we propose MainGEC, which designs token-level and sentence-level training
weights based on inherent discrepancies in accuracy and potential diversity of
data annotation, respectively, and then conducts mixed-grained weighted
training to improve the training effect for GEC. Empirical evaluation shows
that, in both the Seq2Seq and Seq2Edit settings, MainGEC achieves consistent and
significant performance improvements on two benchmark datasets, demonstrating
the effectiveness and superiority of the mixed-grained weighted training.
Further ablation experiments verify the effectiveness of the designed weights
at both granularities in MainGEC.
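
To make the idea of mixed-grained weighting concrete, the following is a minimal sketch of how token-level and sentence-level weights could enter a training loss. It is an illustration under stated assumptions, not the MainGEC implementation: the abstract does not say how the weights are computed or combined, so the weights are taken as given inputs here, the multiplicative combination is an assumption, and the names `weighted_nll` and `batch_loss` are hypothetical.

```python
import math


def weighted_nll(token_log_probs, token_weights, sentence_weight):
    """Weighted negative log-likelihood of one target sentence.

    token_log_probs: model log-probabilities of the reference tokens.
    token_weights:   one weight per target token (assumed to be given).
    sentence_weight: one weight for the whole sentence (assumed to be given).
    Combining the two by multiplication is an assumption of this sketch.
    """
    if len(token_log_probs) != len(token_weights):
        raise ValueError("need exactly one weight per target token")
    token_loss = -sum(w * lp for w, lp in zip(token_weights, token_log_probs))
    return sentence_weight * token_loss


def batch_loss(batch):
    """Mean weighted loss over (log_probs, token_weights, sentence_weight) triples."""
    losses = [weighted_nll(lp, tw, sw) for lp, tw, sw in batch]
    return sum(losses) / len(losses)


# Toy data: three target tokens with made-up model probabilities.
log_probs = [math.log(0.9), math.log(0.5), math.log(0.8)]

# Uniform weights reduce to the ordinary (unweighted) NLL.
uniform = weighted_nll(log_probs, [1.0, 1.0, 1.0], 1.0)

# Down-weighting the second token shrinks its contribution to the loss.
down_weighted = weighted_nll(log_probs, [1.0, 0.2, 1.0], 1.0)

# A sentence weight rescales the loss of the whole sentence.
half_sentence = weighted_nll(log_probs, [1.0, 1.0, 1.0], 0.5)
```

With all weights set to 1 the loss is the standard negative log-likelihood, so equal treatment of training data is the special case that weighted training generalizes.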