A unified generative model for characterizing microblogs' topics

WAIM(2013)

Abstract
In this paper, we focus on characterizing microblogs' topics with topic models. Unlike traditional textual media (such as news documents), microblogs pose three modeling challenges: 1) too much noise; 2) short text; and 3) content incompleteness. Previously, these limitations have been investigated separately. Some work filters the noise through a prior classification; some enriches the text with the user's blog history; and some utilizes the social network. However, none of these approaches addresses all three limitations simultaneously. To solve this problem, we combine the previous work and propose a unified generative model for characterizing microblogs' topics. The proposed unified approach addresses all three limitations. A collapsed Gibbs-sampling method is derived for estimating the parameters. Through both qualitative and quantitative analysis on Twitter data, we demonstrate that our approach consistently outperforms previous methods by a significant margin.
Keywords
collapsed Gibbs-sampling optimization method, short text, blog history, prior classification, previous method, content incompleteness, proposed unified approach, news document, previous work, unified generative model, latent Dirichlet allocation
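
The abstract names collapsed Gibbs sampling as the estimation method but does not give the sampler itself. As a rough illustration only, the sketch below shows collapsed Gibbs sampling for plain latent Dirichlet allocation, not the paper's unified model, which additionally handles noise, user history, and the social network. The function name `lda_collapsed_gibbs`, the hyperparameters `alpha` and `beta`, and the integer word-id input format are all assumptions made for this example.

```python
import random


def lda_collapsed_gibbs(docs, num_topics, vocab_size,
                        alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Collapsed Gibbs sampling for plain LDA (an assumed stand-in model).

    docs: list of documents, each a list of integer word ids in [0, vocab_size).
    Returns (doc_topic_counts, topic_word_counts, topic_assignments).
    """
    rng = random.Random(seed)
    n_dk = [[0] * num_topics for _ in docs]               # tokens in doc d with topic k
    n_kw = [[0] * vocab_size for _ in range(num_topics)]  # word w assigned to topic k
    n_k = [0] * num_topics                                # all tokens with topic k
    z = []                                                # topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token from all counts.
                k = z[d][i]
                n_dk[d][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1

                # Full conditional p(z = t | all other assignments),
                # known only up to a normalising constant.
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][w] + beta)
                    / (n_k[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]

                # Add the token back under the newly sampled topic.
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1

    return n_dk, n_kw, z
```

The topic and word distributions are integrated out, so only the count tables are updated; smoothed estimates can be read off afterwards, for example `(n_kw[k][w] + beta) / (n_k + vocab_size * beta)` for the probability of word `w` under topic `k`.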