GPU-Based Efficient Parallel Heuristic Algorithm for High-Utility Itemset Mining in Large Transaction Datasets

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING (2024)

Abstract
Heuristic algorithms have been developed to find approximate solutions to high-utility itemset mining (HUIM) problems and thereby compensate for the performance bottlenecks of exact algorithms. However, heuristic algorithms still suffer from long runtimes and insufficient mining quality, especially on large transaction datasets with thousands to tens of thousands of items and up to millions of transactions. To solve these problems, this paper proposes a novel GPU-based efficient parallel heuristic algorithm for HUIM (PHA-HUIM). Each iteration of PHA-HUIM consists of three main steps: the search strategy, fitness evaluation, and ring topology communication. The search strategy and ring topology communication are designed to run in constant time on the GPU, and the parallel fitness evaluation substantially accelerates the algorithm. A new data structure with a sort-mapping strategy is proposed to enhance the search ability and reduce memory usage. To improve the mining quality, a multi-start strategy with an unbalanced allocation strategy is employed in the search process. Ring topology communication is adopted to maintain population diversity. A load balancing strategy is introduced to reduce thread divergence and thereby improve parallel efficiency. Experimental results on nine large datasets show that PHA-HUIM outperforms state-of-the-art HUIM algorithms in terms of speedup, runtime, and mining quality.
Keywords
High-utility itemset mining, heuristic algorithm, GPU parallel, sort-mapping strategy, load balancing
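
The abstract names fitness evaluation and ring topology communication as steps of each iteration but gives no implementation details. The following is a minimal sequential Python sketch of what those two steps could look like, not the paper's GPU implementation. It uses the standard HUIM definition of itemset utility (the summed utility of the itemset's items over every transaction that contains all of them). The transaction layout, the function names `itemset_utility` and `ring_migrate`, and the migration rule (each subpopulation's best itemset replaces its right neighbour's worst) are all assumptions made for illustration.

```python
from typing import Dict, List, Set

# Assumed layout: a transaction maps item id -> utility of that item
# in this transaction (e.g. quantity * unit profit).
Transaction = Dict[int, int]


def itemset_utility(itemset: Set[int], transactions: List[Transaction]) -> int:
    """Fitness of a candidate: total utility of `itemset` over all
    transactions that contain every one of its items."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] for item in itemset)
    return total


def ring_migrate(populations: List[List[Set[int]]],
                 transactions: List[Transaction]) -> None:
    """Hypothetical ring exchange: subpopulation i sends a copy of its best
    itemset to subpopulation (i + 1) % n, overwriting the worst one there.
    Bests are picked before any overwrite, so the exchange is synchronous."""
    n = len(populations)
    bests = [max(pop, key=lambda s: itemset_utility(s, transactions))
             for pop in populations]
    for i in range(n):
        dst = populations[(i + 1) % n]
        worst = min(range(len(dst)),
                    key=lambda j: itemset_utility(dst[j], transactions))
        dst[worst] = set(bests[i])
```

In a GPU setting each candidate's fitness is independent of the others, so a loop like the one in `itemset_utility` is the natural unit to distribute across threads. Because each subpopulation exchanges with only one neighbour, good itemsets spread gradually around the ring rather than being broadcast everywhere at once, which is consistent with the abstract's stated goal of maintaining population diversity.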