A soft frequent pattern mining approach for textual topic detection
WIMS (2014)
Abstract
Textual topic detection methods that cluster terms according to their cooccurrence patterns are called feature-pivot methods. Typically, the similarity measure used for such clustering takes into account the cooccurrence patterns of pairs of terms only. In this work, we argue that examining the simultaneous cooccurrence patterns of a larger number of terms is a better option when the corpus contains a set of closely related, fine-grained topics. To this end, we treat topic detection as a Frequent Pattern Mining (FPM) problem and propose a novel algorithm for "soft" Frequent Pattern Mining (SFPM). We test the proposed approach on three annotated datasets collected from Twitter and compare it to a set of algorithms that includes a graph-based feature-pivot approach that takes into account only pairwise cooccurrence patterns, a standard Frequent Pattern Mining algorithm, and Latent Dirichlet Allocation. The results indicate that SFPM performs better than the other tested methods and show a clear improvement over the standard FPM approach.
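The contrast between pairwise and simultaneous cooccurrence can be illustrated with a small sketch. This is not the SFPM algorithm from the paper, whose details the abstract does not give. It is a naive frequent term set enumeration over an invented toy corpus, and the names `support` and `frequent_term_sets` are hypothetical. It shows a case where three terms all cooccur in pairs but never appear together in one document, which is exactly the situation a purely pairwise similarity measure cannot distinguish from a genuine three-term topic.

```python
from itertools import combinations


def support(docs, terms):
    """Count the documents that contain every term in `terms`."""
    terms = set(terms)
    return sum(1 for doc in docs if terms <= doc)


def frequent_term_sets(docs, min_support, max_size=3):
    """Naively enumerate term sets (size 2..max_size) with support >= min_support.

    Illustrative only: real FPM algorithms prune the search space instead of
    testing every combination.
    """
    vocab = sorted(set().union(*docs))
    found = {}
    for size in range(2, max_size + 1):
        for combo in combinations(vocab, size):
            count = support(docs, combo)
            if count >= min_support:
                found[combo] = count
    return found


# Invented toy corpus: each document is the set of terms in one short message.
docs = [
    {"chelsea", "goal"}, {"chelsea", "goal"},
    {"goal", "penalty"}, {"goal", "penalty"},
    {"chelsea", "penalty"}, {"chelsea", "penalty"},
    {"obama", "ohio", "wins"}, {"obama", "ohio", "wins"},
]

patterns = frequent_term_sets(docs, min_support=2)
for terms, count in sorted(patterns.items()):
    print(terms, count)
```

In this toy corpus every pair among "chelsea", "goal", and "penalty" has support 2, yet the three terms never occur in the same document, so only the "obama", "ohio", "wins" triple qualifies as a frequent pattern of size three.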
Keywords
algorithms, applications, experimentation, miscellaneous, frequent pattern mining, theory, topic detection, feature-pivot, soft frequent pattern mining