Empirical Evaluation of Profile Characteristics for Gender Classification on Twitter

2013 12th International Conference on Machine Learning and Applications (ICMLA), 2013

Abstract
Online Social Networks (OSNs) provide reliable communication among users from different countries. The volume of text generated on OSNs is huge and highly informative. Gender classification can serve commercial organizations for advertising, law enforcement for legal investigation, and others for social reasons. Here we explore profile characteristics for gender classification on Twitter. Unlike existing approaches to gender classification, which depend heavily on posted text such as tweets, we study the relative strengths of different characteristics extracted from Twitter profiles (e.g., first name and background color in a user's profile page). Our goal is to evaluate profile characteristics with respect to their predictive accuracy and computational complexity. In addition, we provide a novel technique to reduce the number of features of text-based profile characteristics from the order of millions to a few thousand and, in some cases, to only 40 features. We demonstrate the validity of our approach by examining different classifiers over a large dataset of Twitter profiles.
Keywords
computational complexity, computer mediated communication, gender issues, pattern classification, social networking (online), OSN, Twitter profiles, background color, commercial organizations, gender classification, informative texts, legal investigation, online social networks, profile characteristics, text-based profile characteristics, user profile page, color-based features, color quantization, language independence, phonemes as features, social networks