A universal topic framework (UniZ) and its application in online search

SAC 2015: Symposium on Applied Computing, Salamanca, Spain, April 2015

Abstract
Probabilistic topic models, such as PLSA and LDA, are gaining popularity in many fields due to their high-quality results. Unfortunately, existing topic models suffer from two drawbacks: (1) model complexity and (2) disjoint topic groups. That is, when a topic model involves multiple entities (such as authors, papers, conferences, and institutions) connected through multiple relationships, the model becomes too difficult to analyze and often leads to intractable solutions. Also, different entity types are classified into disjoint topic groups that are not directly comparable, so it is difficult to see whether heterogeneous entities (such as authors and conferences) are on the same topic or not (e.g., are Rakesh Agrawal and KDD related to the same topic?). In this paper, we propose a novel universal topic framework (UniZ) that addresses these two drawbacks using "prior topic incorporation." Since our framework enables the representation of heterogeneous entities in a single universal topic space, all entities can be directly compared within the same topic space. In addition, UniZ breaks complex models into much smaller units, learns the topic group of each entity from the smaller units, and then propagates the learned topics to others. This way, it leverages all the available signals without introducing significant computational complexity, enabling a richer representation of entities and highly accurate results. On a prediction problem over the widely used DBLP dataset, our approach achieves the best prediction performance among many state-of-the-art methods. We also demonstrate the practical potential of our approach with search logs from a commercial search engine.
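The abstract's two key ideas, a single universal topic space shared by heterogeneous entities and "prior topic incorporation" from smaller sub-models, can be illustrated with a minimal sketch. The Python toy example below is an assumption-laden illustration, not the paper's actual UniZ model: the function names, the Dirichlet-style smoothing, and the strength parameter are all hypothetical. It shows topics learned for one entity being carried over as a prior for a sparsely observed entity of a different type, after which both entities can be compared directly in the same K-dimensional topic space.

```python
# Hypothetical sketch (not the paper's UniZ equations) of two ideas from the
# abstract: (1) heterogeneous entities share ONE K-dimensional topic space and
# are therefore directly comparable, and (2) "prior topic incorporation":
# topics learned in a small sub-model act as a prior for a related entity.
import numpy as np

K = 4  # number of universal topics shared by every entity type

def cosine(p, q):
    """Cosine similarity between two topic distributions."""
    return float(np.dot(p, q) / (np.linalg.norm(p) * np.linalg.norm(q)))

def incorporate_prior(counts, prior_topics, strength=10.0):
    """Smooth an entity's sparse topic counts with a prior distribution
    learned in a smaller sub-model (Dirichlet-style assumption):
    posterior ~ counts + strength * prior."""
    posterior = counts + strength * prior_topics
    return posterior / posterior.sum()

# Topic distribution for an author, e.g., learned from an author-paper sub-model.
author_topics = np.array([0.70, 0.20, 0.05, 0.05])
assert len(author_topics) == K  # every entity lives in the same K-topic space

# A conference observed only a few times: its raw topic counts are noisy.
conference_counts = np.array([3.0, 1.0, 0.0, 0.0])

# Propagate the learned topics as a prior to the sparsely observed entity.
conference_topics = incorporate_prior(conference_counts, author_topics)

# Both entities now live in the same topic space and are directly comparable,
# answering questions like "are this author and this venue on the same topic?"
print("conference topics:", np.round(conference_topics, 3))
print("author-conference similarity:", round(cosine(author_topics, conference_topics), 3))
```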
Keywords
topic models, universal topic framework, online search, users and context modeling