Sequence-driven Neural Network models for NER Tagging in Roman Urdu

Maaz Ali Nadeem, Khadija Irfan, Khaula Atiq, Mirza Omer Beg, Muhammad Umair Arshad

2022 International Conference on Frontiers of Information Technology (FIT) (2022)

Abstract
Modern Natural Language Processing research has gained momentum as it moves to address contextual sequence labeling for low-resource languages. Named-Entity Recognition (NER) is one such labeling task, in which text is read in context and labeled with named entities. NER for Roman Urdu supports tasks such as Information Extraction, Machine Translation, and even big data operations on live digital content. Research on such NLP applications in Roman Urdu has been limited; however, work on Urdu and other languages of the same family encourages active research. This paper compares a few deep learning-based models that learn to classify each word by mapping it to a specific context based on its placement. The models are trained on a hand-annotated corpus covering several domains. After a detailed comparison and evaluation, Bi-LSTM yields a strong F1-score of 82.7%. Our work demonstrates the possibility of long-range contextual understanding for processing morphologically rich low-resource languages.
Keywords
NER Tagging, RNN, CNN, GRU, Bi-LSTM
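
The abstract reports an F1-score for NER tagging but does not say how it is computed. As a rough illustration only, the sketch below computes an entity-level F1 for a single tagged sentence. The BIO tag scheme, the exact-span matching rule, the helper names `extract_spans` and `entity_f1`, and the Roman Urdu example sentence are all assumptions made for this sketch, not details taken from the paper.

```python
from typing import List, Optional, Set, Tuple

# (start index, end index exclusive, entity type); an assumed representation.
Span = Tuple[int, int, str]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence (assumed tag scheme)."""
    spans: Set[Span] = set()
    start: Optional[int] = None
    label: Optional[str] = None
    # The trailing "O" is a sentinel that closes a span ending at the last token.
    for i, tag in enumerate(tags + ["O"]):
        # Close the open span when the current tag does not continue it.
        if label is not None and tag != f"I-{label}":
            spans.add((start, i, label))
            start, label = None, None
        # A "B-" tag always opens a new span.
        if tag.startswith("B-"):
            start, label = i, tag[2:]
        # A stray "I-" tag with no open span is ignored in this sketch.
    return spans


def entity_f1(gold: List[str], pred: List[str]) -> float:
    """Entity-level F1: a predicted span counts only on an exact match."""
    gold_spans = extract_spans(gold)
    pred_spans = extract_spans(pred)
    true_positives = len(gold_spans & pred_spans)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_spans)
    recall = true_positives / len(gold_spans)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example sentence: "Ali Lahore mein rehta hai".
tokens = ["Ali", "Lahore", "mein", "rehta", "hai"]
gold = ["B-PER", "B-LOC", "O", "O", "O"]
pred = ["B-PER", "O", "O", "O", "O"]  # the tagger misses the location

print(round(entity_f1(gold, pred), 3))  # prints 0.667
```

Here precision is 1.0 (the one predicted entity is correct) and recall is 0.5 (one of two gold entities is found), which gives an F1 of about 0.667. A corpus-level score would accumulate the span counts over all sentences before computing precision and recall.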