Part-of-Speech (POS) Tagging for the Nyishi Language

Advances in Information Communication Technology and Computing, Lecture Notes in Networks and Systems (2022)

Abstract
Parts of speech are the building blocks of any language, and working effectively with a language requires knowing its parts of speech. The main purpose of this paper is to create resources for part-of-speech (POS) tagging of the Nyishi language, which in turn will yield properly structured data for the language. Nyishi POS tagging is more difficult than its English equivalent because it must be solved together with the problem of word identification. For Nyishi POS tagging, we have built a 36-item tag set, and from a Nyishi-to-English dictionary we have collected more than 25,000 entries, both manually and automatically. In this paper, we explain how the dictionary was created and how parts of speech were assigned for Nyishi. We first designed the tag set based on the training data; it will then be used to construct an automatic POS tagger for the Nyishi language. Many challenges remain to be overcome, such as ambiguity, foreign words, and orthography.
Keywords
Parts of Speech, Nyishi language, POS tagger, Dictionary, Ambiguity, Orthography