A guided FP-Growth algorithm for mining multitude-targeted item-sets and class association rules in imbalanced data

Information Sciences (2021)

Abstract
Identifying frequent item-sets is a popular data-mining task: it consists of finding sets of items that appear frequently in data. However, finding all frequent item-sets in large or dense datasets can be time-consuming, and a user may be interested in only some specific item-sets rather than all of them. Methods have recently been proposed for targeted item-set mining, that is, calculating the support of only certain item-sets of interest. Although this approach is often better suited to real applications than traditional item-set mining, performance remains an issue. To address it, this paper presents a novel algorithm for multitude-targeted mining, named Guided Frequent Pattern-Growth (GFP-Growth). GFP-Growth is designed to quickly mine a given set of item-sets using a small amount of memory. This paper proves that GFP-Growth yields the exact frequency counts for each item-set of interest. It further shows that GFP-Growth can improve performance on several problems that require item-set mining. We specifically study the problem of generating minority-class rules from imbalanced data and develop the Minority-Report Algorithm (MRA), which uses GFP-Growth to solve this problem efficiently. We prove several theoretical properties of MRA and present experimental results showing a substantial performance gain.
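To make the notion of targeted item-set mining concrete, the following minimal sketch computes the support count of a given list of target item-sets by scanning the transactions directly. It is a naive baseline for illustration only and is not the GFP-Growth algorithm, whose details the abstract does not give. The function name `targeted_support` and the toy transactions are assumptions introduced here.

```python
def targeted_support(transactions, targets):
    """Count, for each target item-set, how many transactions contain it.

    Naive linear scan (hypothetical helper, not GFP-Growth): every target
    is checked against every transaction.
    """
    counts = {frozenset(target): 0 for target in targets}
    for transaction in transactions:
        items = set(transaction)
        for target in counts:
            # A transaction supports a target if it contains all its items.
            if target <= items:
                counts[target] += 1
    return counts


# Toy data, invented for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
targets = [{"bread", "milk"}, {"butter"}, {"bread", "butter", "milk"}]

support = targeted_support(transactions, targets)
for target, count in support.items():
    print(sorted(target), count)
```

The scan costs one subset test per target per transaction, which is the kind of overhead that a tree-based method such as the one described in the abstract aims to reduce.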
Keywords
Data mining, Item-set discovery, Multi-targeted mining, Imbalanced data, Minority-class rule, Guided FP-Growth