A new efficient approach for mining uncertain frequent patterns using minimum data structure without false positives.

Future Generation Computer Systems(2017)

引用 73|浏览33
暂无评分
摘要
The concept of uncertain pattern mining was recently proposed to fulfill the demand for processing databases with uncertain data, and various relevant methods have been devised. However, previous approaches have the following limitations. State-of-the-art methods based on tree structure can cause fatal problems in terms of runtime and memory usage according to the characteristics of uncertain databases and threshold settings because their own tree data structures can become excessively large and complicated in their mining processes. Various approximation approaches have been suggested in order to overcome such problems; however, they are methods that increase their own mining performance at the cost of accuracy of the mining results. In order to solve the problems, we propose an exact, efficient algorithm for mining uncertain frequent patterns based on novel data structures and mining techniques, which can also guarantee the correctness of the mining results without any false positives. The newly proposed list-based data structures and pruning techniques allow a complete set of uncertain frequent patterns to be mined more efficiently without pattern losses. We also demonstrate that the proposed algorithm outperforms previous state-of-the art approaches in both theoretical and empirical aspects. Especially, we provide analytical results of performance evaluation for various types of datasets to show efficiency of runtime, memory usage, and scalability in our method.
更多
查看译文
关键词
Correctness,Data mining,Existential probability,Frequent pattern mining,Uncertain pattern
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要