Key-based reducer placement for data analytics across data centers considering bi-level resource provision in cloud computing

IoTBD 2016 - Proceedings of the International Conference on Internet of Things and Big Data (2016)

Abstract
Due to the distributed nature of data sources, such as astronomy and sales data, or due to legal restrictions, it is not always practical to store worldwide data in a single data center (DC). Hadoop is a widely adopted framework for big data analytics, but it can only process data within one DC. The distribution of data therefore necessitates the study of Hadoop across DCs. In this setting, mappers can be placed in the local DCs, but where to place the reducers is a major challenge, since each reducer typically needs to process map output from all involved DCs. Aiming to reduce cost, a key-based scheme is proposed that respects the locality principle of traditional Hadoop as much as possible while deploying reducers at lower cost. Considering resource provisioning at both the data center level and the server level, the problem is formalized as a bi-level program and solved by a tailored two-level group genetic algorithm (TLGGA). Extensive simulations demonstrate the effectiveness of TLGGA: it outperforms the baseline and the state-of-the-art mechanisms by 49% and 40%, respectively. Copyright © 2016 by SCITEPRESS - Science and Technology Publications, Lda. All rights reserved.
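The abstract's key-based placement idea can be illustrated with a minimal greedy sketch: each map output key is assigned to the data center that would incur the lowest cross-DC shuffle cost for that key, which approximates the locality principle mentioned above. The data structures, cost model, and function names below are illustrative assumptions, not the paper's formulation; the actual method uses a bi-level program solved by TLGGA rather than this per-key greedy rule.

```python
# Hypothetical sketch of key-based reducer placement across data centers.
# Assumed inputs (not from the paper):
#   key_sizes_per_dc: {key: {dc: bytes of map output for `key` stored in `dc`}}
#   transfer_cost:    {(src_dc, dst_dc): cost per byte moved between DCs}

def place_reducers_by_key(key_sizes_per_dc, transfer_cost):
    """Return {key: chosen_dc}, greedily minimizing per-key cross-DC transfer cost."""
    placement = {}
    for key, sizes in key_sizes_per_dc.items():
        best_dc, best_cost = None, float("inf")
        for candidate in sizes:
            # Cost of shuffling this key's map output from every other DC to `candidate`.
            cost = sum(size * transfer_cost.get((dc, candidate), 0.0)
                       for dc, size in sizes.items() if dc != candidate)
            if cost < best_cost:
                best_dc, best_cost = candidate, cost
        placement[key] = best_dc
    return placement

if __name__ == "__main__":
    sizes = {"k1": {"DC1": 800, "DC2": 100}, "k2": {"DC1": 50, "DC2": 900}}
    costs = {("DC1", "DC2"): 1.0, ("DC2", "DC1"): 1.0}
    # Each key's reducer lands in the DC holding most of its data:
    print(place_reducers_by_key(sizes, costs))  # {'k1': 'DC1', 'k2': 'DC2'}
```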
Keywords
Reducer Placement, Resource Provision, Hadoop Across Data Centers, Distributed Cloud