Large Language Models for Multilingual Slavic Named Entity Linking

Proceedings of the 9th Workshop on Slavic Natural Language Processing 2023 (SlavicNLP 2023), 2023

Abstract
This paper describes our submission for the 4th Shared Task on SlavNER, covering three Slavic languages: Czech, Polish, and Russian. We use the pre-trained multilingual XLM-R language model (Conneau et al., 2020) and fine-tune it on the three Slavic languages using datasets provided by the organizers. Our multilingual NER model achieves an F-score of 0.896 across all corpora, with the best result for Czech (0.914) and the worst for Russian (0.880). Our cross-language entity linking module achieves an F-score of 0.669 in the official SlavNER 2023 evaluation.
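The abstract does not include implementation details. The following is a minimal, hypothetical sketch of what fine-tuning XLM-R for token-level NER can look like with the Hugging Face transformers library; the label set, data file names, and hyperparameters are illustrative assumptions, not the authors' actual configuration.

```python
# Hypothetical sketch: fine-tuning XLM-R for Slavic NER with Hugging Face
# transformers. Label set, file names, and hyperparameters are assumed,
# not taken from the paper.
from datasets import load_dataset
from transformers import (
    AutoModelForTokenClassification,
    AutoTokenizer,
    DataCollatorForTokenClassification,
    Trainer,
    TrainingArguments,
)

LABELS = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC", "B-ORG", "I-ORG"]
label2id = {l: i for i, l in enumerate(LABELS)}
id2label = {i: l for l, i in label2id.items()}

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForTokenClassification.from_pretrained(
    "xlm-roberta-base", num_labels=len(LABELS),
    id2label=id2label, label2id=label2id,
)

def tokenize_and_align(batch):
    # XLM-R splits words into subwords; re-align the word-level BIO tags
    # so special tokens and continuation subwords get the ignore index -100.
    enc = tokenizer(batch["tokens"], truncation=True, is_split_into_words=True)
    enc["labels"] = []
    for i, word_labels in enumerate(batch["ner_tags"]):
        prev, aligned = None, []
        for wid in enc.word_ids(batch_index=i):
            aligned.append(word_labels[wid]
                           if wid is not None and wid != prev else -100)
            prev = wid
        enc["labels"].append(aligned)
    return enc

# Assumed: JSON files with "tokens" and "ner_tags" fields, pooled over
# the Czech, Polish, and Russian training data.
raw = load_dataset("json", data_files={"train": "slavner_train.json",
                                       "validation": "slavner_dev.json"})
data = raw.map(tokenize_and_align, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="xlmr-slavner", learning_rate=2e-5,
                           num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=data["train"],
    eval_dataset=data["validation"],
    data_collator=DataCollatorForTokenClassification(tokenizer),
)
trainer.train()
```

Because XLM-R shares one subword vocabulary across languages, a single fine-tuned model can serve all three languages at once, which matches the multilingual setup described in the abstract.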
Key words
Language Modeling, Multilingual Neural Machine Translation, Named Entity Recognition

Key points: This paper fine-tunes the pre-trained multilingual XLM-R language model for three Slavic languages (Czech, Polish, and Russian), enabling effective multilingual named entity linking and achieving strong results in the SlavNER 2023 evaluation.

Method: The study uses the XLM-R language model proposed by Conneau et al. (2020) and fine-tunes it on the Slavic-language datasets provided by the organizers to improve performance on named entity recognition and cross-language entity linking.

Experiments: Experiments were conducted on the SlavNER task datasets. The model reaches an F-score of 0.896 across all corpora, with the best result for Czech (0.914) and the lowest for Russian (0.880); the cross-language entity linking module achieves an F-score of 0.669 in the official SlavNER 2023 evaluation.
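The linking module itself is not detailed here. One common approach to cross-language entity linking, shown below as a hypothetical sketch and not necessarily the authors' method, is to compare mention embeddings from the same multilingual encoder; the model choice and similarity threshold are assumptions.

```python
# Hypothetical sketch of cross-language entity linking: decide whether two
# surface forms from different languages refer to the same entity by
# comparing XLM-R mention embeddings. Not the paper's documented method.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
encoder = AutoModel.from_pretrained("xlm-roberta-base")

def embed(mention: str) -> torch.Tensor:
    # Mean-pool the last hidden states into a fixed-size mention vector.
    inputs = tokenizer(mention, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state
    return hidden.mean(dim=1).squeeze(0)

def same_entity(m1: str, m2: str, threshold: float = 0.9) -> bool:
    # Link two mentions if their embeddings are sufficiently similar;
    # the 0.9 threshold is an arbitrary illustrative value.
    v1, v2 = embed(m1), embed(m2)
    return torch.cosine_similarity(v1, v2, dim=0).item() >= threshold

# e.g. Polish and Czech surface forms of the same city:
print(same_entity("Warszawa", "Varšava"))
```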