Emphasizing personal information for Author Profiling: New approaches for term selection and weighting.

Knowledge-Based Systems (2018)

Abstract
The Author Profiling (AP) task aims to predict specific profile characteristics of authors by analyzing their written documents. Its relevance has grown thanks to several applications in computer forensics, security, and marketing. Most previous contributions in AP have been devoted to determining a suitable set of features to model the writing profile of authors. In social media, however, this task is challenging because communication is informal. We present a novel approach that considers terms located in phrases exposing personal information to have special value for discriminating the author’s profile. The aim of this work is to emphasize the value of such personal phrases by means of two new proposals: a feature selection method and a term weighting scheme, both based on a novel measure called Personal Expression Intensity (PEI), which scores the quantity of personal information revealed by a term. To evaluate these ideas, we report experimental results on age and gender prediction of media users on six different collections. Average improvements of 7.34% and 5.76% for age and gender classification, respectively, were obtained compared with the best state-of-the-art result, indicating that selecting and weighting terms according to personal phrases plays a key role in the AP task.
Keywords
Author profiling,Feature selection,Term weighting,Personal information,PEI
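The abstract describes PEI only at a high level and does not give its formula. As a rough illustration of the general idea, the sketch below scores each term by the fraction of its occurrences that fall inside "personal" sentences, then uses that score both to select terms and to weight term counts. Everything here is an assumption made for illustration: the first-person pronoun list used to detect personal phrases, the ratio used as the score, and the function names `personal_scores`, `select_terms`, and `weight_document`. It should not be read as the paper's definition of PEI.

```python
import re
from collections import Counter

# Assumption: a sentence counts as a "personal phrase" if it contains
# a first-person singular pronoun. The paper may define this differently.
FIRST_PERSON = {"i", "i'm", "me", "my", "mine", "myself"}


def split_sentences(text):
    """Split text into sentences on ., ! and ? (a deliberately naive splitter)."""
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and extract alphabetic tokens (apostrophes kept)."""
    return re.findall(r"[a-z']+", sentence.lower())


def personal_scores(documents):
    """Map each term to the fraction of its occurrences found in personal sentences.

    This ratio is a stand-in for PEI, not the measure defined in the paper.
    """
    total = Counter()
    personal = Counter()
    for doc in documents:
        for sentence in split_sentences(doc):
            tokens = tokenize(sentence)
            total.update(tokens)
            if any(t in FIRST_PERSON for t in tokens):
                personal.update(tokens)
    return {term: personal[term] / total[term] for term in total}


def select_terms(scores, k):
    """Feature selection: keep the k highest-scoring terms (ties broken alphabetically)."""
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]


def weight_document(doc, scores, vocabulary):
    """Term weighting: raw term frequency multiplied by the term's personal score."""
    counts = Counter(tokenize(doc))
    return [counts[t] * scores.get(t, 0.0) for t in vocabulary]


if __name__ == "__main__":
    docs = ["I love my dog. The weather is cold.", "My dog is cold."]
    scores = personal_scores(docs)
    vocab = select_terms(scores, 3)
    print(vocab)
    print(weight_document("my dog is cold", scores, vocab))
```

The same score drives both steps, mirroring the abstract's statement that the selection method and the weighting scheme are both based on one measure; a real system would feed the weighted vectors to a classifier for age and gender prediction.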