Mining recent high average utility patterns based on sliding window from stream data.
JOURNAL OF INTELLIGENT & FUZZY SYSTEMS (2016)
Abstract
Utility pattern mining is a technique that finds valuable patterns in large databases in which each item carries importance and quantity information. The representative technique, high utility pattern mining (HUPM), calculates the utility of a pattern by summing the utilities of all the items it contains. However, this measure has a drawback: long patterns tend to accumulate enough utility to qualify as high utility patterns simply because they contain more items. For this reason, high average utility pattern mining (HAUPM), which employs a different utility measure, has been studied in order to take pattern length into account. Techniques for handling stream data have also become necessary, because many data sources, e.g. sensors and POS devices, produce data in real time. However, all the existing HAUPM algorithms are unable to find up-to-date, meaningful patterns over data streams. We thus propose the first sliding window based HAUPM algorithm for discovering recent high average utility patterns over data streams. Based on the sliding window model, our algorithm divides stream data into batches and keeps only the most recent batches in its window. The algorithm can thereby mine recent, important patterns over data streams. We also introduce a new strategy that enhances the performance of our algorithm by minimizing the overestimated average utilities stored in the proposed data structure. The experimental results show that our algorithm outperforms the competitors.
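The difference between summed and average utility, and the effect of keeping only recent batches, can be illustrated with a small sketch. It is not the algorithm proposed in the paper: the abstract does not describe the data structure or the pruning strategy, so the example enumerates every candidate pattern by brute force. The profit table, the function names, and the SlidingWindow class are all hypothetical, and average utility is assumed here to mean a pattern's summed utility divided by its number of items.

```python
from collections import deque
from itertools import combinations

# Hypothetical unit profits per item (an assumption for illustration).
PROFIT = {"a": 5, "b": 2, "c": 1}


def pattern_utility(pattern, transaction):
    """Summed utility (quantity * profit) of the pattern in one transaction.

    Returns 0 when the transaction does not contain every item of the pattern.
    """
    if not all(item in transaction for item in pattern):
        return 0
    return sum(transaction[item] * PROFIT[item] for item in pattern)


def total_utility(pattern, transactions):
    """HUPM-style utility: the sum over all transactions."""
    return sum(pattern_utility(pattern, t) for t in transactions)


def average_utility(pattern, transactions):
    """Assumed HAUPM-style measure: summed utility divided by pattern length."""
    return total_utility(pattern, transactions) / len(pattern)


class SlidingWindow:
    """Keeps only the most recent `window_size` batches of transactions."""

    def __init__(self, window_size):
        # deque with maxlen drops the oldest batch automatically.
        self.batches = deque(maxlen=window_size)

    def add_batch(self, batch):
        self.batches.append(batch)

    def transactions(self):
        return [t for batch in self.batches for t in batch]


def mine_recent_patterns(window, min_avg_utility):
    """Brute-force enumeration; exponential in the number of items."""
    txs = window.transactions()
    items = sorted({item for t in txs for item in t})
    result = {}
    for k in range(1, len(items) + 1):
        for pattern in combinations(items, k):
            au = average_utility(pattern, txs)
            if au >= min_avg_utility:
                result[pattern] = au
    return result


# Each transaction maps item -> purchased quantity.
window = SlidingWindow(window_size=2)
window.add_batch([{"a": 1, "b": 2}, {"b": 1, "c": 3}])
window.add_batch([{"a": 2, "c": 1}])
before = mine_recent_patterns(window, min_avg_utility=5)

window.add_batch([{"b": 4}])  # the oldest batch leaves the window
after = mine_recent_patterns(window, min_avg_utility=5)
```

In this toy data the pattern (a, c) has a summed utility of 11 but an average utility of 5.5, which shows how dividing by length tempers the advantage of longer patterns. Once the third batch arrives, the first batch no longer contributes to any result.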
Keywords
Association rule mining, utility pattern mining, stream pattern mining, sliding window model