Automatic Recognition of Gender and Genre in a Corpus of Microtexts

Theory and Applications of Dependable Computer Systems, Advances in Intelligent Systems and Computing (2020)

Abstract
In this paper, we focus on recognizing an author's gender and writing genre solely from book titles. We analyse data extracted from the bibliographic resources of the National Library of Poland. We compare different methods of text (title) representation and classification, including word embedding models such as word2vec and ELMo and classification algorithms such as linear models, the multilayer perceptron and the bidirectional LSTM. We show that the writing genre (for 28 defined classes) can be automatically recognized from the book title alone with an accuracy of 0.74. The best results were achieved by fastText with word n-grams.
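To make the notion of word n-grams concrete, the sketch below extracts all unigrams and bigrams from a single title. It is a minimal illustration only, not the authors' pipeline or fastText's implementation: the function name `word_ngrams`, the whitespace tokenization, the lowercasing, and the example title are all assumptions introduced here.

```python
from typing import List


def word_ngrams(title: str, max_n: int = 2) -> List[str]:
    """Return all word n-grams of length 1..max_n from a lowercased title.

    Tokenization is plain whitespace splitting (an assumption made for
    this sketch; the paper does not specify its preprocessing here).
    """
    tokens = title.lower().split()
    grams: List[str] = []
    for n in range(1, max_n + 1):
        # Slide a window of n tokens across the title.
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


# Hypothetical example title, not taken from the corpus.
features = word_ngrams("A Short History of Poland", max_n=2)
print(features)
```

Because titles are very short, bigrams such as "history of" add word-order information that single words alone cannot carry, which may be part of why n-gram features help on microtexts.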
Keywords
microtexts, corpus, gender, genre, automatic recognition