On the use of shared storage in shared-nothing environments

BigData Conference(2013)

Abstract
Shared-nothing environments, exemplified by systems such as MapReduce and Hadoop, employ node-local storage to achieve high scalability. The exponential growth in application datasets, however, demands ever higher I/O throughput and disk capacity. Simply equipping individual nodes in a Hadoop cluster with more disks does not scale, because it increases the per-node cost, raises the probability of storage failure at the node, and worsens node failure recovery times. To this end, we propose dividing a Hadoop rack into several (small) sub-racks, and consolidating the disks of a sub-rack's compute nodes into a separate shared Localized Storage Node (LSN) within the sub-rack. Such a shared LSN is easier to manage and provision, and can offer an economically better solution by employing fewer disks at the LSN than the sub-rack's individual nodes would need in total, while still achieving high I/O performance. In this paper, we provide a quantitative study of the impact of shared storage in Hadoop clusters. We evaluate several typical Hadoop applications on a medium-sized cluster and via simulations. Our evaluation shows that: (i) the staggered workload allows our design to support the same number of compute nodes at a comparable or better throughput using fewer total disks than in the node-local case, thus providing more efficient resource utilization; (ii) the impact of lost locality can be mitigated by better provisioning the LSN-node network interconnect and the number of disks in an LSN; and (iii) the consolidation of disks into an LSN is a viable and efficient alternative to the extant node-local storage design. Finally, we show that an LSN-based design can deliver up to 39% performance improvement over standard Hadoop.
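The intuition behind finding (i) can be illustrated with a toy provisioning calculation. The sketch below is not the paper's evaluation method: the function names, the per-disk bandwidth, and the demand figures are all hypothetical, and it ignores network limits, seek contention, and capacity. It only compares provisioning each node for its own peak I/O demand against provisioning one shared LSN for the peak of the combined demand.

```python
import math

def disks_needed(demand_mbps, disk_mbps):
    """Smallest number of disks whose combined bandwidth covers the demand."""
    return max(1, math.ceil(demand_mbps / disk_mbps))

def node_local_disks(demand, disk_mbps):
    """Node-local case: every node is provisioned for its own peak demand."""
    return sum(disks_needed(max(series), disk_mbps) for series in demand)

def shared_lsn_disks(demand, disk_mbps):
    """Shared case: the LSN is provisioned for the peak of the summed demand."""
    aggregate = [sum(step) for step in zip(*demand)]
    return disks_needed(max(aggregate), disk_mbps)

# Hypothetical staggered workload: 4 nodes over 4 time steps, in MB/s.
# Each node bursts to 180 MB/s in a different step and idles at 20 MB/s.
DEMAND = [
    [180, 20, 20, 20],
    [20, 180, 20, 20],
    [20, 20, 180, 20],
    [20, 20, 20, 180],
]
DISK_MBPS = 100  # assumed sustained bandwidth of a single disk

if __name__ == "__main__":
    print("node-local disks:", node_local_disks(DEMAND, DISK_MBPS))
    print("shared LSN disks:", shared_lsn_disks(DEMAND, DISK_MBPS))
```

With these made-up numbers the node-local layout needs 8 disks and the shared layout needs 3, because the bursts never coincide. If every node burst in the same time step, both layouts would need 8 disks, which is why the benefit depends on the workload being staggered.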
Keywords
medium-sized cluster,shared storage,disks consolidation,mapreduce,i/o throughput,hadoop rack,resource utilization,i/o performance,staggered workload,storage failure,resource allocation,lost locality,subrack nodes,lsn-based design,shared memory systems,lsn-node network interconnect,node-local storage design,memory architecture,localized storage node,hadoop cluster,node failure recovery,shared lsn,disc storage,shared-nothing environments,disk capacity