Author Identification in Albanian Language

Network-Based Information Systems (2011)

Abstract
Authorship identification has long been a focus of research, but most of the work has addressed the English language. Albanian remains largely unstudied in this field, in part because it differs considerably from other languages and has a difficult syntactic structure. This motivated us to adapt an authorship identification algorithm to Albanian books. Our previous work adopted Dmitri Khmelev's algorithm to identify the authorship of Albanian texts. In this paper we improve the algorithm by taking into account the syntactic structure of Albanian sentences and adding language-specific elements to the problem. On the same set of books, the improved algorithm gave better results than the basic models of Khmelev's algorithm.
Keywords
albanian language, basic model, albanian text, syntactic structure, difficult syntactic structure, albanian sentence, author identification, previous work, english language, dmitri khmelev algorithm, albanian book, natural language processing, databases, markov process, pragmatics, markov processes, dictionaries, text analysis, dna
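
The abstract does not spell out the basic model it builds on. As a rough illustration only, the sketch below assumes the letter-level, first-order Markov chain formulation usually associated with Khmelev's method: each candidate author gets a table of letter-to-letter transition counts, and an anonymous text is attributed to the author whose table gives it the lowest average negative log-probability. The function names, the add-one smoothing, and the toy strings are assumptions made for this example and are not taken from the paper. The paper's syntactic and language-specific extensions are not modelled here.

```python
import math
from collections import Counter, defaultdict


def keep(ch):
    # Assumption: only letters and spaces are used as symbols.
    return ch.isalpha() or ch == " "


def transition_counts(text):
    """Count letter-to-letter transitions (a first-order Markov chain)."""
    counts = defaultdict(Counter)
    symbols = [ch for ch in text.lower() if keep(ch)]
    for prev, cur in zip(symbols, symbols[1:]):
        counts[prev][cur] += 1
    return counts


def cross_entropy(model, text, alphabet_size):
    """Average negative log-probability of the text's transitions under
    an author's model. Add-one smoothing is an assumption of this sketch."""
    total = 0
    score = 0.0
    for prev, row in transition_counts(text).items():
        known_row = model.get(prev, Counter())
        row_total = sum(known_row.values())
        for cur, n in row.items():
            p = (known_row[cur] + 1) / (row_total + alphabet_size)
            score -= n * math.log(p)
            total += n
    return score / total if total else float("inf")


def attribute(known_texts, unknown_text):
    """Return the best-matching author and the per-author scores."""
    models = {a: transition_counts(t) for a, t in known_texts.items()}
    # Alphabet size is taken from the symbols actually observed.
    symbols = {ch for t in known_texts.values() for ch in t.lower() if keep(ch)}
    symbols |= {ch for ch in unknown_text.lower() if keep(ch)}
    size = len(symbols)
    scores = {a: cross_entropy(m, unknown_text, size) for a, m in models.items()}
    return min(scores, key=scores.get), scores


# Toy usage with placeholder strings (not real Albanian text).
samples = {"A": "aaaa bbbb aaaa bbbb", "B": "abab abab abab"}
best, scores = attribute(samples, "aaaa bbbb")
print(best, scores)
```

In practice each author's table would be built from whole books rather than short strings, and a lower score means the anonymous text is better predicted by that author's transition statistics.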