A Reduction for Efficient LDA Topic Reconstruction

Advances in Neural Information Processing Systems 31 (NIPS 2018)

Abstract
We present a novel approach to LDA (Latent Dirichlet Allocation) topic reconstruction. The main technical idea is to show that the distribution over documents generated by LDA can be transformed into a distribution for a much simpler generative model, in which documents are generated from the same set of topics but have a much simpler structure: each document is drawn from a single topic, and topics are chosen uniformly at random. Furthermore, this reduction is approximation preserving, in the sense that approximate distributions - the only ones we can hope to compute in practice - are mapped into approximate distributions in the simplified world. This opens up the possibility of efficiently reconstructing LDA topics in a roundabout way: compute an approximate document distribution from the given corpus, transform it into an approximate distribution for the single-topic world, and run a reconstruction algorithm in the uniform, single-topic world - a much simpler task than direct LDA reconstruction. We show the viability of the approach by giving very simple algorithms for a generalization of two notable cases that have been studied in the literature: p-separability and matrix-like topics.
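To make the contrast concrete, the following is a minimal sketch (not the paper's code) of the two generative models the reduction relates: standard LDA, where each document mixes topics via a Dirichlet-distributed weight vector, and the simplified world described above, where each document is produced by a single topic chosen uniformly at random. The parameter names (k, V, alpha, beta, doc_len) are illustrative assumptions, not values from the paper.

import numpy as np

rng = np.random.default_rng(0)

k, V, doc_len = 3, 50, 20                    # topics, vocabulary size, words per document
alpha = np.full(k, 0.1)                      # Dirichlet prior over per-document topic mixtures
beta = rng.dirichlet(np.ones(V), size=k)     # k topic-word distributions (rows of the topic matrix)

def sample_lda_document():
    """Standard LDA: draw a topic mixture, then a topic per word, then the word."""
    theta = rng.dirichlet(alpha)
    topics = rng.choice(k, size=doc_len, p=theta)
    return [int(rng.choice(V, p=beta[z])) for z in topics]

def sample_single_topic_document():
    """Simplified model from the abstract: one topic, chosen uniformly at random,
    generates every word in the document."""
    t = rng.integers(k)
    return list(rng.choice(V, size=doc_len, p=beta[t]))

print(sample_lda_document())
print(sample_single_topic_document())

The reduction claimed in the abstract maps the (approximate) document distribution induced by the first process into an (approximate) document distribution for the second, so that topic reconstruction only has to be solved in the simpler single-topic setting.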
Keywords
generative model, Gibbs sampling, latent Dirichlet allocation, reconstruction algorithm