Data enhancement and selection strategies for the word-level Quality Estimation
WMT@EMNLP (2015)
Abstract
This paper describes the DCU-SHEFF word-level Quality Estimation (QE) system submitted to the QE shared task at WMT15. Starting from a baseline set of features and a CRF algorithm to learn a sequence tagging model, we propose improvements in two ways: (i) by filtering out the training sentences containing too few errors, and (ii) by adding incomplete sequences to the training data to enrich the model with new information. We also experiment with considering the task as a classification problem, and report results using a subset of the features with Random Forest classifiers.
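As a rough illustration of strategy (i), the sketch below drops training sentences whose share of erroneous words falls under a threshold. It is not the authors' implementation: the `OK`/`BAD` tag names, the `(token, tag)` sentence representation, the helper names `bad_ratio` and `filter_by_error_rate`, and the threshold value are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical representation: a sentence is a list of (token, tag) pairs,
# where the tag is assumed to be "OK" or "BAD".
Sentence = List[Tuple[str, str]]


def bad_ratio(sentence: Sentence) -> float:
    """Return the fraction of tokens tagged BAD (0.0 for an empty sentence)."""
    if not sentence:
        return 0.0
    bad = sum(1 for _, tag in sentence if tag == "BAD")
    return bad / len(sentence)


def filter_by_error_rate(corpus: List[Sentence], min_bad_ratio: float) -> List[Sentence]:
    """Keep only sentences whose BAD ratio reaches the (assumed) threshold."""
    return [s for s in corpus if bad_ratio(s) >= min_bad_ratio]


# Toy data, invented for illustration only.
corpus = [
    [("the", "OK"), ("cat", "OK"), ("sat", "OK"), ("down", "OK")],
    [("a", "BAD"), ("dog", "OK"), ("runned", "BAD"), ("fast", "OK")],
]

# The threshold 0.25 is an arbitrary example value, not one taken from the paper.
kept = filter_by_error_rate(corpus, min_bad_ratio=0.25)
print(len(kept))
```

The filtered corpus would then be passed to whatever sequence tagger is being trained; how the threshold is chosen is left open here.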
Keywords
enhancement, quality, estimation, word-level