Frequent Itemset Mining for Big Data

Silicon Valley, CA (2013)

Abstract
Frequent Itemset Mining (FIM) is one of the best-known techniques for extracting knowledge from data. The combinatorial explosion of FIM methods becomes even more problematic when they are applied to Big Data. Fortunately, recent improvements in the field of parallel programming already provide good tools to tackle this problem. However, these tools come with their own technical challenges, e.g., balanced data distribution and inter-communication costs. In this paper, we investigate the applicability of FIM techniques on the MapReduce platform. We introduce two new methods for mining large datasets: Dist-Eclat focuses on speed, while BigFIM is optimized to run on very large datasets. In our experiments, we show the scalability of our methods.
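
For readers unfamiliar with the Eclat family that Dist-Eclat is named after, the following is a minimal single-machine sketch of Eclat-style mining: items are stored as sets of transaction ids, and the support of a larger itemset is found by intersecting those sets. It is an illustration only, not the paper's Dist-Eclat or BigFIM, and it does no MapReduce distribution. The function name `eclat`, the parameter `min_support` (an absolute count), and the toy transactions are assumptions made for this example.

```python
from typing import Dict, Iterable, List, Set, Tuple


def eclat(transactions: Iterable[Set[str]], min_support: int) -> Dict[Tuple[str, ...], int]:
    """Return every itemset contained in at least `min_support` transactions."""
    # Vertical layout: map each item to the ids of the transactions containing it.
    tidlists: Dict[str, Set[int]] = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidlists.setdefault(item, set()).add(tid)

    frequent: Dict[Tuple[str, ...], int] = {}

    def extend(prefix: Tuple[str, ...], candidates: List[Tuple[str, Set[int]]]) -> None:
        # Depth-first search: each candidate extends the current prefix by one item.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix + (item,)
            frequent[itemset] = len(tids)
            # Intersect with the tid sets of later items to build longer itemsets.
            suffix = []
            for other, other_tids in candidates[i + 1:]:
                common = tids & other_tids
                if len(common) >= min_support:
                    suffix.append((other, common))
            if suffix:
                extend(itemset, suffix)

    # Start from the frequent single items, in a fixed order to avoid duplicates.
    initial = sorted(
        ((item, tids) for item, tids in tidlists.items() if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    extend((), initial)
    return frequent


if __name__ == "__main__":
    toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
    for itemset, support in sorted(eclat(toy, min_support=2).items()):
        print(itemset, support)
```

Because each prefix is explored independently of the others, the search tree splits naturally into subtrees, which suggests why this style of algorithm is a candidate for parallel execution. The number of candidate itemsets can still grow exponentially with the number of frequent items, which is the combinatorial explosion the abstract refers to.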
Keywords
Big Data, data mining, parallel programming, BigFIM, Dist-Eclat, MapReduce platform, frequent itemset mining, knowledge extraction, distributed data mining, Eclat, Hadoop