Ontology-Assisted Discovery Of Hierarchical Topic Clusters On The Social Web

Journal of Web Engineering (2016)

Abstract
Discovery and clustering of users by their topic of interest on the Social Web can help enhance various applications, such as user recommendation and expert finding. Traditional approaches, such as latent semantic analysis-based topic modeling or k-means document clustering, run into issues when content is sparse, when the number of existing topics is unknown, and/or when we seek topics that are hierarchical in nature. In this paper, we propose a method for ontology-assisted topic clustering, in which we map Social Web user content to ontological classes to overcome sparsity. Using a novel ranking technique for calculating the topical similarity between individuals at different topic scopes, we construct graphs on which we apply a quasi-clique algorithm in order to find topic clusters at that scope, without having to pre-define a target number of topics. Our approach allows (1) the topic scope to be controlled in order to discover general or specific topics; (2) the automatic labeling of clusters with tags that are both human- and machine-understandable; and (3) graphs to be clustered recursively in order to generate a hierarchy of topics. The approach is evaluated against ground truths of Twitter users and the 20-newsgroups dataset, commonly used in document clustering research. We compare our approach to standard and Twitter-specific latent Dirichlet allocation (LDA), hierarchical LDA, and standard and hierarchical k-means clustering. Results show that our method outperforms regular LDA by up to 24.7%, Twitter-LDA by up to 11.9%, and k-means by up to 26.7% on Social Web content. It performs equivalently, depending on several factors, to these approaches on a dataset of traditional documents. Additionally, our method can discover the appropriate number and composition of topics at a given topic scope, whereas k-means clustering cannot account for differences in scope.
Keywords
Ontology, hierarchical clustering, topic modeling, community detection, Social Web
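
The pipeline outlined in the abstract (map users to ontology classes, build a similarity graph at a chosen scope, find quasi-cliques, then recurse) can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not the paper's implementation: Jaccard overlap stands in for the paper's ranking-based topical similarity, a simple greedy heuristic stands in for its quasi-clique algorithm, the similarity threshold stands in for "topic scope", and the users, class labels, and function names are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap between two sets of ontology class labels."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_graph(user_classes, threshold):
    """Connect two users when their class overlap reaches the threshold."""
    graph = {user: set() for user in user_classes}
    for u, v in combinations(user_classes, 2):
        if jaccard(user_classes[u], user_classes[v]) >= threshold:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def greedy_quasi_cliques(graph, gamma=0.6):
    """Greedy stand-in for a quasi-clique search (assumption): grow a group
    while every member stays adjacent to at least gamma * (size - 1) others."""
    unassigned = set(graph)
    clusters = []
    while unassigned:
        # Seed with the best-connected remaining node (ties broken by name).
        seed = max(sorted(unassigned), key=lambda n: len(graph[n] & unassigned))
        cluster = {seed}
        unassigned.discard(seed)
        grown = True
        while grown:
            grown = False
            for cand in sorted(unassigned):
                trial = cluster | {cand}
                need = gamma * (len(trial) - 1)
                if all(len(graph[m] & trial) >= need for m in trial):
                    cluster = trial
                    unassigned.discard(cand)
                    grown = True
                    break
        clusters.append(cluster)
    return clusters


def label(cluster, user_classes):
    """Label a cluster with the ontology classes shared by all its members."""
    return set.intersection(*(user_classes[u] for u in cluster))


def hierarchy(user_classes, thresholds, gamma=0.6):
    """Recursively re-cluster each group at the next, stricter threshold."""
    if not thresholds:
        return []
    graph = build_graph(user_classes, thresholds[0])
    nodes = []
    for cluster in greedy_quasi_cliques(graph, gamma):
        members = {u: user_classes[u] for u in cluster}
        children = hierarchy(members, thresholds[1:], gamma) if len(cluster) > 1 else []
        nodes.append({
            "label": sorted(label(cluster, user_classes)),
            "members": sorted(cluster),
            "children": children,
        })
    return nodes


# Hypothetical users, each already mapped to a set of ontology classes.
users = {
    "alice": {"Sport", "Football", "Athlete"},
    "bob": {"Sport", "Football", "Club"},
    "carol": {"Sport", "Tennis", "Athlete"},
    "dave": {"Music", "Band", "Album"},
    "erin": {"Music", "Band", "Concert"},
}

# A loose threshold gives general topics; a stricter one gives specific topics.
tree = hierarchy(users, thresholds=[0.2, 0.5])
```

In this toy setup the number of clusters is never specified in advance; it falls out of the graph structure at each threshold. The shared ontology classes double as cluster labels, which loosely mirrors the human- and machine-understandable tags the abstract mentions.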