Corpus and Baseline Model for Domain-Specific Entity Recognition in German

2020 6th IEEE Congress on Information Science and Technology (CiSt) (2021)

Abstract
Transfer learning approaches are a promising means of analyzing low-resource, domain-specific texts. The German SmartData corpus is the first German corpus annotated with entities from different domains, and thus allows transfer learning approaches for Named Entity Recognition (NER) to be investigated across domains. To prepare such investigations, this work includes a thorough analysis of the SmartData corpus and a revision of its annotations and of its split into training and test data, taking the distribution of document and entity types into account. Based on this, a baseline NER model using BiLSTM-CRF neural networks, including hyperparameter optimization, is presented.
Keywords
Named Entity Recognition,German,domain-specific,BiLSTM-CRF,Hyperparameter Optimization