Automating Benchmark Generation for Named Entity Recognition and Entity Linking.

ESWC (Satellite Events)(2023)

引用 0|浏览0
暂无评分
摘要
Named Entity Recognition (NER) and Linking (NEL) have seen great advances lately, especially with the development of language models pre-trained on large document corpora, typically written in the most popular languages (e.g., English). This makes NER and NEL tools for other languages, with fewer resources available, fall behind the latest advances in AI. In this work, we propose an automated benchmark data generation process for the tasks of NER and NEL, based on Wikipedia events. Although our process is applied and evaluated on Greek texts, the only requirement for its applicability to other languages is the availability of Wikipedia events pages in that language. The generated Greek datasets, comprising around 19k events and 41k entity mentions, as well as the code to generate such datasets, are publicly available.
更多
查看译文
关键词
named entity recognition,benchmark
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要