Language independent gender classification on Twitter

ASONAM '13: Advances in Social Networks Analysis and Mining 2013, Niagara, Ontario, Canada, August 2013

Abstract
Online Social Networks (OSNs) generate a huge volume of user-originated text. Gender classification can serve multiple purposes. For example, commercial organizations can use gender classification for advertising. Law enforcement may use gender classification as part of legal investigations. Others may use gender information for social reasons. Here we explore language-independent gender classification. Our approach predicts gender using five color-based features extracted from Twitter profiles (e.g., the background color in a user's profile page). Most other methods for gender prediction are language dependent: they use high-dimensional feature spaces consisting of unique words extracted from text fields such as postings, user names, and profile descriptions. Our approach is independent of the user's language, efficient, and scalable, while attaining a good level of accuracy. We validate our approach by examining different classifiers over a large dataset of Twitter profiles.
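To make the idea of color-based features concrete, the following is a minimal sketch and not the authors' implementation. It assumes each profile exposes five colors as 6-digit hex strings; the field names in `COLOR_FIELDS` are hypothetical (the abstract names only the background color), the nearest-centroid classifier is a stand-in for the unnamed classifiers the paper evaluates, and the two training profiles and their labels are synthetic placeholders rather than real data.

```python
# Hypothetical names for the five profile colors; only the background
# color is named explicitly in the abstract.
COLOR_FIELDS = ("background", "text", "link", "sidebar_fill", "sidebar_border")


def hex_to_rgb(hex_color):
    """Convert a 6-digit hex color such as '#FF00AA' to (r, g, b) in [0, 1]."""
    h = hex_color.lstrip("#")
    return tuple(int(h[i:i + 2], 16) / 255.0 for i in (0, 2, 4))


def profile_features(profile):
    """Flatten the five hex colors into a 15-dimensional numeric vector."""
    vec = []
    for field in COLOR_FIELDS:
        vec.extend(hex_to_rgb(profile[field]))
    return vec


def fit_centroids(profiles, labels):
    """Compute the mean feature vector per label (nearest-centroid training)."""
    sums, counts = {}, {}
    for profile, label in zip(profiles, labels):
        vec = profile_features(profile)
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [a + b for a, b in zip(sums[label], vec)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}


def predict(centroids, profile):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    vec = profile_features(profile)

    def sqdist(center):
        return sum((a - b) ** 2 for a, b in zip(vec, center))

    return min(centroids, key=lambda lab: sqdist(centroids[lab]))


# Synthetic placeholder data: real training data would pair profile colors
# with gender annotations.
dark = {field: "000000" for field in COLOR_FIELDS}
light = {field: "FFFFFF" for field in COLOR_FIELDS}
centroids = fit_centroids([dark, light], ["label_a", "label_b"])

query = {field: "202020" for field in COLOR_FIELDS}
predicted = predict(centroids, query)
```

Because the feature vector has a fixed, small dimension regardless of what language the user writes in, a sketch like this avoids the vocabulary-sized feature spaces that text-based methods require.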
Keywords
profile page, Twitter profile, user name, language independent gender classification, gender information, profile description, gender classification, law enforcement, gender prediction, online social networks, text analysis, social network, feature extraction