Arabic named entity recognition: using features extracted from noisy data

ACL (Short Papers) (2010)

Abstract
Building an accurate Named Entity Recognition (NER) system for languages with complex morphology is a challenging task. In this paper, we explore a feature space that combines gold features with bootstrapped noisy features in order to build an improved, highly accurate Arabic NER system. We bootstrap the noisy features by projection from an Arabic-English parallel corpus that is automatically tagged with a baseline NER system. The feature space covers lexical, morphological, and syntactic features. The proposed approach yields an absolute improvement of up to 1.64 points in F-measure.
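The projection step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that one side of each sentence pair has already been tagged by a baseline NER system and that token-level word alignments are available; the abstract does not say which side is tagged or how alignments are obtained, so the function name `project_tags`, the tag set, and the toy data below are all hypothetical and not taken from the paper.

```python
from typing import List, Tuple


def project_tags(
    src_tags: List[str],
    alignment: List[Tuple[int, int]],
    tgt_len: int,
    default: str = "O",
) -> List[str]:
    """Copy each tagged source token's NER tag to the target tokens aligned to it.

    Hypothetical helper: `alignment` holds (source_index, target_index) pairs.
    Unaligned target tokens keep the `default` (outside) tag.
    """
    projected = [default] * tgt_len
    for src_idx, tgt_idx in alignment:
        tag = src_tags[src_idx]
        # Only entity tags are projected, and the first entity tag to reach
        # a target token wins if several source tokens align to it.
        if tag != default and projected[tgt_idx] == default:
            projected[tgt_idx] = tag
    return projected


# Toy example (invented data): four tagged source tokens, three target tokens.
src_tags = ["PER", "O", "O", "LOC"]
alignment = [(0, 1), (1, 0), (3, 2)]
noisy_feature = project_tags(src_tags, alignment, tgt_len=3)
print(noisy_feature)
```

The projected tags are noisy because both the baseline tagger and the alignments make errors, which is why a sketch like this would feed them to the final model as one feature among the lexical, morphological, and syntactic ones rather than treat them as labels.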
Keywords
accurate arabic ner system, challenging task, entity recognition, noisy feature, complex morphology, noisy data, syntactic feature, baseline ner system, bootstrapped noisy feature, feature space, arabic-english parallel corpus, feature extraction