SPOAHA: Spark Program Optimizer Based on Artificial Hummingbird Algorithm.

Miao Wang, Jiteng Zhen, Yupeng Ma, Xu Huang, Hong Zhang

KSEM (3)(2023)

Abstract
In the era of the Internet of Things (IoT), large numbers of sensor devices continuously collect and generate sensing data. Analyzing this data to extract fresh information, predict future trends, and make correct decisions is essential. As a result, a growing number of data-intensive computing frameworks have been proposed, such as Hadoop, Spark, and Flink. Rather than reading and writing files to disk, Spark processes data in memory to improve performance, which has attracted growing attention from researchers. However, because Spark provides a wealth of operators, the same application can be implemented in various ways that differ greatly in performance. Tuning a Spark application is therefore an error-prone and time-consuming process that requires developers to have a deep understanding of Spark's operating principles and characteristics. In this paper, we summarize a series of rules, such as operator reordering and operator replacement, and use them to design and implement a Spark program optimizer, called SPOAHA, based on the artificial hummingbird algorithm. Experimental results show that, without changing the semantics of the original program, the optimized program dramatically reduces the amount of data involved in shuffling and speeds up execution by up to 2.7×.
Keywords
spark program optimizer,artificial hummingbird algorithm
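
The operator replacement rule mentioned in the abstract can be illustrated with a small sketch. It is not SPOAHA's implementation and does not use the Spark API: partitions are modeled as plain Python lists, and the hypothetical helpers `group_then_sum` and `reduce_by_key` stand in for a groupByKey-style aggregation (no map-side combining) and a reduceByKey-style aggregation (per-partition combining before the shuffle). Both return the same result, but the second sends fewer records through the simulated shuffle.

```python
from collections import defaultdict


def group_then_sum(partitions):
    """groupByKey-style (assumed model): every record crosses the shuffle."""
    shuffled = [record for part in partitions for record in part]
    grouped = defaultdict(list)
    for key, value in shuffled:
        grouped[key].append(value)
    totals = {key: sum(values) for key, values in grouped.items()}
    return totals, len(shuffled)


def reduce_by_key(partitions):
    """reduceByKey-style (assumed model): combine inside each partition first."""
    shuffled = []
    for part in partitions:
        local = defaultdict(int)
        for key, value in part:
            local[key] += value
        # Only one record per distinct key per partition is shuffled.
        shuffled.extend(local.items())
    totals = defaultdict(int)
    for key, value in shuffled:
        totals[key] += value
    return dict(totals), len(shuffled)


# Illustrative data only: two partitions of (key, value) pairs.
partitions = [
    [("a", 1), ("b", 2), ("a", 3)],
    [("a", 4), ("b", 5), ("b", 6), ("c", 7)],
]

result_group, shuffled_group = group_then_sum(partitions)
result_reduce, shuffled_reduce = reduce_by_key(partitions)
print(result_group == result_reduce)    # same semantics
print(shuffled_group, shuffled_reduce)  # records sent through the shuffle
```

On this toy input the rewritten aggregation keeps the result unchanged while shuffling 5 records instead of 7; the gap widens as keys repeat more often within a partition.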