A New Geometric Approach To Latent Topic Modeling And Discovery

2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2013

Abstract
A new geometrically-motivated algorithm for topic modeling is developed and applied to the discovery of latent "topics" in text and image "document" corpora. The algorithm is based on robustly finding and clustering extreme-points of empirical cross-document word-frequencies that correspond to novel words unique to each topic. In contrast to related approaches that are based on solving non-convex optimization problems using suboptimal approximations, locally-optimal methods, or heuristics, the new algorithm is convex, has polynomial complexity, and has competitive qualitative and quantitative performance compared to the current state-of-the-art approaches on synthetic and real-world datasets.
Keywords
Topic modeling, nonnegative matrix factorization (NMF), extreme points, subspace clustering
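The extreme-point idea in the abstract can be illustrated with a small noiseless sketch. It is not the authors' algorithm, whose details the abstract does not give. It assumes that each topic has at least one novel word, that extreme points can be detected as maximizers along random directions (a technique chosen here for illustration), and that the corpus is a hypothetical toy built from exact expected frequencies. The names `find_extreme_words`, `beta`, `theta`, and `X` are illustrative only.

```python
import numpy as np

def find_extreme_words(X, n_projections=200, seed=0):
    """Return indices of words whose row-normalized frequency vectors are
    extreme points, detected as maximizers along random directions."""
    rng = np.random.default_rng(seed)
    # Row-normalize so each word's cross-document frequencies sum to 1.
    X_bar = X / X.sum(axis=1, keepdims=True)
    # Each column of `directions` is one random direction in document space.
    directions = rng.standard_normal((X.shape[1], n_projections))
    # A linear function over a convex hull attains its maximum at an extreme point.
    winners = np.argmax(X_bar @ directions, axis=0)
    return sorted(set(winners.tolist()))

# Hypothetical noiseless toy corpus (an assumption, not data from the paper).
rng = np.random.default_rng(1)
K, W, D = 3, 30, 500                              # topics, words, documents
beta = rng.uniform(0.5, 1.5, size=(W, K))         # word-by-topic weights
beta[:K] = np.eye(K)                              # word k (k < K) occurs only in topic k
beta /= beta.sum(axis=0, keepdims=True)           # each topic is a distribution over words
theta = rng.dirichlet(np.full(K, 0.5), size=D).T  # topic-by-document weights, shape (K, D)
X = beta @ theta                                  # expected word-by-document frequencies

novel = find_extreme_words(X)
print(novel)
```

In this toy setting every non-novel word's normalized row is a strict convex combination of the novel words' rows, so only novel words can win a projection. With real, sampled word counts the rows are noisy, which is why the abstract stresses finding extreme points robustly and then clustering them; this sketch does neither.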