Development Of Rules Of Generation Of Nominal Word Forms For New-Written Variants Of The Karelian Language

I. P. Novak, N. B. Krizhanovskaya, T. P. Boyko, N. A. Pellinen

VESTNIK UGROVEDENIYA-BULLETIN OF UGRIC STUDIES (2020)

Abstract
Introduction: Linking the words of texts (tokens) to the meanings of lemmas in the dictionary of the VepKar corpus significantly facilitates further work on the semantic markup of texts. In 2019, inflectional rules were developed for the Vepsian subcorpora of VepKar. On the basis of these rules, a function for generating a complete paradigm from basic word forms was added to the corpus. VepKar editors need to enter a large number of word forms when they create dictionary entries in the three Karelian subcorpora (about 30 for nominals and 150 for verbs). The development of an algorithm and a computer program for generating word forms of the Karelian language is therefore timely.

Objective: to show how a list of stems of the nominal parts of speech in two new-written dialects of the Karelian language can be used to create rules for the automatic generation of word forms.

Research materials: lemmas and word forms from the Open Corpus of the Vepsian and Karelian Languages, the Corpus of Border Karelia, and the electronic version of the Dictionary of the Karelian Language.

Results and novelty of the research: Grammatical patterns were studied over many years from theoretical sources and were also identified through experiments. On this basis, a list of stems and pseudo-stems of word forms was compiled for the nominal parts of speech, a system of rules for generating word forms was developed, and the corresponding computer program was written and tested. The scientific novelty of the study lies in the first attempt to develop uniform rules for the automatic generation of word forms for two dialects of the Karelian language.
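The general idea of building a paradigm from a short list of stems can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' program: the function name `generate_paradigm`, the slot names, the stem indices, and the placeholder stems and endings (`stem0`, `-x`, and so on) are all hypothetical and do not represent actual Karelian forms or the rule system described in the paper.

```python
from typing import Dict, List, Tuple

# A rule maps a grammatical slot to (index into the stem list, ending to append).
# All slot names, indices, and endings below are placeholders for illustration.
Rule = Tuple[int, str]

TOY_RULES: Dict[str, Rule] = {
    "slot_a": (0, ""),    # form built from the first stem with no ending
    "slot_b": (1, "-x"),  # form built from the second stem plus ending "-x"
    "slot_c": (1, "-y"),  # form built from the second stem plus ending "-y"
}


def generate_paradigm(stems: List[str], rules: Dict[str, Rule]) -> Dict[str, str]:
    """Build one word form per slot by joining the selected stem and its ending."""
    paradigm: Dict[str, str] = {}
    for slot, (stem_index, ending) in rules.items():
        if stem_index >= len(stems):
            raise ValueError(f"rule for {slot!r} needs stem #{stem_index}")
        paradigm[slot] = stems[stem_index] + ending
    return paradigm


# Placeholder stems; an editor would supply the real stems of a lemma here.
forms = generate_paradigm(["stem0", "stem1"], TOY_RULES)
```

In a sketch like this, an editor enters only the stems, and every remaining form follows from the rule table, which is the kind of saving the abstract describes for entries with about 30 nominal forms.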
Keywords
Karelian language, new-written language, corpus linguistics, morphology, nominal inflection, generation of word forms