Role of Machine Learning in Authorship Attribution with Select Stylometric Features

Sumit Gupta, Tapas Kumar Patra, Chitrita Chaudhuri

INTELLIGENT SYSTEMS DESIGN AND APPLICATIONS, ISDA 2021(2022)

Abstract
The author of a manuscript leaves behind stylistic footprints that can be traced back to identify the person. This paper treats authorship attribution as a text categorization problem: identifying the actual author of a document from a pool of claimants using various Machine Learning tools. Such techniques have been shown to succeed in resolving authorship controversies as they arise in the domain of Authorship Attribution. The corpus comprises assorted documents by three short-story writers. The process first applies the unsupervised K-Means Clustering technique to extract select stylometric features, or attributes, from the text corpus. Several Supervised Learning techniques are then applied to classify the author's identity. Among these, the ANN classifier emerges as the best technique, with an accuracy of 93.33%.
Keywords
Machine learning, Authorship attribution, Stylometric attributes, Text classification, Clustering, WEKA tool, tf-idf score
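
The keywords mention the tf-idf score, which the abstract does not define. As a rough illustration only, the sketch below computes one common tf-idf variant (term count divided by document length, multiplied by the log of the inverse document frequency) over a toy corpus. The function name `tf_idf`, the whitespace tokenization, and the three sample texts are assumptions made for this example; they are not taken from the paper, which reports using the WEKA tool.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document.

    tf  = term count / document length
    idf = log(N / number of documents containing the term)
    This is one common variant; the paper's exact weighting is not stated.
    """
    # Naive tokenization (an assumption): lowercase, split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical toy corpus: one short text per author.
corpus = [
    "the night was dark and the sea was calm",
    "the market was loud and the street was bright",
    "the forest was quiet and the path was narrow",
]
scores = tf_idf(corpus)
```

Under this weighting, a term that occurs in every document (such as "the" here) scores zero, while a term unique to one document scores positively, which is why tf-idf can serve as a signal of author-specific vocabulary.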