Conditional topical coding: an efficient topic model conditioned on rich features.

KDD '11: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, California, USA, August 2011.

Abstract
Probabilistic topic models have shown remarkable success in many application domains. However, a probabilistic conditional topic model can be extremely inefficient when considering a rich set of features, because it needs to define a normalized distribution, which usually involves a hard-to-compute partition function. This paper presents conditional topical coding (CTC), a novel, non-probabilistic formulation of conditional topic models. CTC relaxes the normalization constraints imposed in probabilistic models and learns non-negative document codes and word codes. CTC does not need to define a normalized distribution and can efficiently incorporate a rich set of features for improved topic discovery and prediction tasks. Moreover, CTC can directly control the sparsity of inferred representations by using appropriate regularization. We develop an efficient and easy-to-implement coordinate descent learning algorithm in which each coding substep has a closed-form solution. Finally, we demonstrate the advantages of CTC on online review analysis datasets. Our results show that conditional topical coding can achieve state-of-the-art prediction performance and is much more efficient in training (one order of magnitude faster) and testing (two orders of magnitude faster) than probabilistic conditional topic models.
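To make the "coordinate descent with closed-form coding substeps" idea concrete, here is a minimal sketch of that style of update for non-negative sparse coding. It is an illustration only: the function name `coordinate_descent_code`, the variable names, and the squared reconstruction loss with an l1 penalty are assumptions made for readability, not the paper's actual CTC objective; what it shares with the abstract's description is that each coordinate update is closed-form (a soft-threshold followed by projection onto the non-negative orthant).

```python
import numpy as np

def coordinate_descent_code(x, D, lam, n_iters=50):
    """Sketch: encode one observation x against dictionary D (atoms in
    rows) by minimizing 0.5*||x - s @ D||^2 + lam * sum(s) over s >= 0,
    one coordinate at a time.  Each substep is closed-form."""
    K, _ = D.shape
    s = np.zeros(K)                    # non-negative code to learn
    atom_sq = (D ** 2).sum(axis=1)     # per-atom squared norms
    for _ in range(n_iters):
        for k in range(K):
            if atom_sq[k] == 0.0:
                continue
            # residual with coordinate k's contribution removed
            r = x - s @ D + s[k] * D[k]
            # closed-form update: soft-threshold, then clip at zero
            s[k] = max((r @ D[k] - lam) / atom_sq[k], 0.0)
    return s

# toy usage: 3 "topics" over a 5-word vocabulary
rng = np.random.default_rng(0)
D = np.abs(rng.normal(size=(3, 5)))
x = 2.0 * D[0] + 0.5 * D[2]            # mixture of two topics
print(coordinate_descent_code(x, D, lam=0.1))
```

Because no coordinate update requires a partition function or other global normalization, each substep costs only a few vector operations; this is the structural property the abstract credits for CTC's order-of-magnitude speedups over probabilistic conditional topic models.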