Named entity recognition based on semi-supervised ensemble learning with the improved tri-training algorithm.

Tengfei Ma, Quansheng Dou, Ping Jiang, Huan Liu

ICIT (2020)

Abstract
Named entity recognition (NER) is an active research topic in natural language processing. Its goal is to identify named entities in text and classify them into the corresponding entity types. Deep learning models have achieved good results on this task, but because annotated corpora are scarce in specific domains, a single deep learning model struggles to perform well there. To address this, we propose a named entity recognition method based on semi-supervised ensemble learning (NER-SSEL), which ensembles a Conditional Random Field (CRF) model, a Bidirectional Gated Recurrent Unit (BiGRU) network, and a Bidirectional Long Short-Term Memory (BiLSTM) network using the tri-training co-training algorithm. The method aims to improve model performance through iterative training on a small amount of labeled data and a large amount of unlabeled data. First, the three base learners are pre-trained on the small labeled set; the tri-training algorithm is then used to provide reliable labels for the unlabeled raw data. To avoid introducing noisy data, we propose a repeated labeling strategy that selects high-confidence samples and iteratively expands the training set through a consistency evaluation function. Finally, the models are combined by weighted voting. Experiments on the ATIS dataset show that the proposed model makes effective use of a large amount of unlabeled data and significantly improves the F1 measure on the named entity recognition task.
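The two generic ingredients named in the abstract, agreement-based pseudo-labeling in tri-training and weighted voting over base learners, can be illustrated with a minimal sketch. This is not the paper's improved algorithm: the repeated labeling strategy and the consistency evaluation function are not specified in the abstract and are not reproduced here. The function names `select_pseudo_labels` and `weighted_vote`, the exact-agreement criterion, and the idea of passing each base learner as a plain callable are all assumptions made for illustration.

```python
from collections import defaultdict


def select_pseudo_labels(unlabeled, predict_j, predict_k):
    """Standard tri-training selection for one learner (illustrative only).

    A sentence is pseudo-labeled for the third learner only when the other
    two learners produce identical label sequences for it. `predict_j` and
    `predict_k` are hypothetical callables: tokens -> list of labels.
    """
    selected = []
    for tokens in unlabeled:
        labels_j = predict_j(tokens)
        labels_k = predict_k(tokens)
        if labels_j == labels_k:  # assumed criterion: exact agreement
            selected.append((tokens, labels_j))
    return selected


def weighted_vote(predictions, weights):
    """Combine label sequences from several learners, token by token.

    Each learner's label for a token receives that learner's weight;
    the label with the highest total weight wins.
    """
    combined = []
    for position in range(len(predictions[0])):
        scores = defaultdict(float)
        for labels, weight in zip(predictions, weights):
            scores[labels[position]] += weight
        combined.append(max(scores, key=scores.get))
    return combined


# Toy stand-ins for two trained base learners (hypothetical rules).
def tagger_a(tokens):
    return ["B-city" if t == "boston" else "O" for t in tokens]


def tagger_b(tokens):
    return ["B-city" if t in ("boston", "denver") else "O" for t in tokens]


unlabeled = [["to", "boston"], ["to", "denver"]]
pseudo = select_pseudo_labels(unlabeled, tagger_a, tagger_b)
voted = weighted_vote(
    [["O", "B-city"], ["O", "O"], ["O", "B-city"]],
    [0.5, 0.9, 0.6],
)
```

In this toy run only the first sentence is kept, because the two taggers disagree on "denver". In a full loop, each of the three learners would be retrained on the labeled set plus the samples selected by the other two, and the weights passed to `weighted_vote` might come from each learner's validation F1, which is again an assumption rather than something the abstract states.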