Performing in-situ analytics: Mining frequent patterns from big IoT data at network edge with D-HARPP

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2022)

Abstract
Big IoT data is inherently distributed, high-dimensional, irregular, and sparse. The fog computing model in its original form is by no means the optimal solution for mining such data. However, using the network edge for mining tasks, for example by enabling edge and IoT devices to mine locally frequent patterns, can significantly improve mining performance. Edge devices capable of distributed job processing could exploit the model to the fullest, but the limited resources of edge and IoT devices call for lightweight pattern mining algorithms. This paper presents Distributed HARnessing the Power of Powersets for Mining Frequent Itemsets (D-HARPP), a Spark-based distributed algorithm for mining frequent co-occurring itemsets in big IoT data. Unlike state-of-the-art distributed algorithms, D-HARPP makes a single pass over the data and does not create candidate itemsets; it therefore achieves significantly better runtime and consumes the least memory. Moreover, the performance of D-HARPP does not deteriorate at lower minimum support thresholds. These distinguishing characteristics make D-HARPP an optimal choice for Spark-enabled edge and IoT devices. D-HARPP outperformed Spark-Apriori, another distributed algorithm, by significant margins in both runtime and memory consumption, particularly on sparse datasets.
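The abstract does not spell out how D-HARPP works internally, so the following is only a minimal single-machine sketch of the general idea the name suggests: count the non-empty subsets (the powerset) of each transaction in one pass, then filter by minimum support, without generating candidates level by level as Apriori does. It is not the authors' algorithm and does not use Spark. The function names `powerset` and `frequent_itemsets`, the absolute-count `min_support`, and the sample sensor data are all assumptions made for illustration.

```python
from collections import Counter
from itertools import chain, combinations


def powerset(items):
    """Return every non-empty subset of `items` as a sorted tuple."""
    s = sorted(set(items))
    return chain.from_iterable(combinations(s, r) for r in range(1, len(s) + 1))


def frequent_itemsets(transactions, min_support):
    """Count itemsets in a single pass over the data, then filter.

    `min_support` is an absolute count here (an assumption of this sketch).
    """
    counts = Counter()
    for t in transactions:          # single pass: each transaction is read once
        counts.update(powerset(t))  # no level-wise candidate generation
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Hypothetical sensor readings, one transaction per observation window.
transactions = [
    ["temp", "humidity"],
    ["temp", "humidity", "motion"],
    ["temp", "motion"],
    ["humidity"],
]
result = frequent_itemsets(transactions, min_support=2)
```

Enumerating the powerset costs time exponential in the transaction length, so a naive version like this one is only practical when transactions are short, which is consistent with the abstract's emphasis on sparse data.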
Keywords
IoT, Mining frequent patterns, Network edge with D-HARPP