A content-driven framework for geolocating microblog users

Zhiyuan Cheng, James Caverlee, Kyumin Lee

ACM TIST (2013)

Citations: 39 | Views: 36
Abstract
Highly dynamic real-time microblog systems have already published petabytes of real-time human sensor data in the form of status updates. However, the lack of user adoption of geo-based features per user or per post signals that the promise of microblog services as location-based sensing systems may have only limited reach and impact. Thus, in this article, we propose and evaluate a probabilistic framework for estimating a microblog user's location based purely on the content of the user's posts. Our framework can overcome the sparsity of geo-enabled features in these services and bring augmented scope and breadth to emerging location-based personalized information services. Three of the key features of the proposed approach are: (i) its reliance purely on publicly available content; (ii) a classification component for automatically identifying words in posts with a strong local geo-scope; and (iii) a lattice-based neighborhood smoothing model for refining a user's location estimate. On average we find that the location estimates converge quickly, placing 51% of users within 100 miles of their actual location.
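As a rough illustration of content-based location estimation, the sketch below scores candidate cities by summing per-word city probabilities over a user's posts, then measures the error in miles between two coordinates. This is a minimal sketch under stated assumptions, not the authors' implementation: the word table `WORD_CITY_PROBS`, the city list `CITY_COORDS`, and the function names are hypothetical, and the classification of local words and the lattice-based smoothing described in the abstract are not modeled here.

```python
import math
from collections import Counter

# Hypothetical toy model: p(city | word) for a few words with local geo-scope.
WORD_CITY_PROBS = {
    "rockets": {"Houston": 0.8, "New York": 0.2},
    "howdy": {"Houston": 0.7, "New York": 0.3},
    "subway": {"Houston": 0.1, "New York": 0.9},
}

# Approximate (latitude, longitude) in degrees for the toy cities.
CITY_COORDS = {"Houston": (29.7604, -95.3698), "New York": (40.7128, -74.0060)}


def estimate_city(posts):
    """Return the highest-scoring city for a list of posts, or None."""
    counts = Counter(w for post in posts for w in post.lower().split())
    scores = Counter()
    for word, n in counts.items():
        # Words missing from the table carry no location signal in this sketch.
        for city, p in WORD_CITY_PROBS.get(word, {}).items():
            scores[city] += n * p
    if not scores:
        return None
    return max(scores, key=scores.get)


def miles_between(a, b):
    """Great-circle (haversine) distance in miles between two (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 3958.8 * math.asin(math.sqrt(h))  # 3958.8 = mean Earth radius in miles
```

A call such as `estimate_city(["howdy rockets fans", "took the subway"])` picks the city with the largest summed score, and comparing `miles_between(CITY_COORDS[guess], actual_coords)` against 100 gives the kind of within-100-miles check the abstract reports.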
Keywords
user adoption, location estimate, content-driven framework, probabilistic framework, dynamic real-time microblog system, microblog user, microblog service, real-time human sensor data, location-based personalized information service, available content, actual location, microblog, text mining, algorithms