Clustering-based data placement in cloud computing: a predictive approach

CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS(2021)

引用 7|浏览8
暂无评分
摘要
Nowadays, cloud computing environments have become a natural choice to host and process a huge volume of data. The combination of cloud computing and big data frameworks is an effective way to run data-intensive applications and tasks. Also, an optimal arrangement of data partitions can improve the tasks executions, which is not the case in most big data frameworks. For example, the default distribution of data partitions in Hadoop-based clouds causes several problems, which are mainly related to the load balancing and the resource usage. In addition, most existing data placement solutions are static and lack precision in the placement of data partitions. To overcome these issues, we propose a data placement approach based on the prediction of the future resources usage. We exploit Kernel Density Estimation (KDE) and Fuzzy FCA techniques to, first, forecast the workers’ and tasks’ future resource consumption and, second, cluster data partitions and intensive jobs according to the estimated resource usage. Fuzzy FCA is also used to exclude partitions and jobs that require less resources, which will reduce the needless migrations. To allow monitoring and predicting the workers’ states and the data partitions’ consumption, we modeled the big data cluster as an autonomic service-based system. The obtained results have shown that our solution outperformed existing approaches in terms of migrations rate and resource consumption.
更多
查看译文
关键词
Data placement, Resource usage, Intensive jobs, Prediction, Kernel Density Estimation, Fuzzy FCA, SOA, Autonomic computing
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要