On the performance of MapReduce: A stochastic approach

BigData Conference (2014)

Cited by 8 | Views 13
Abstract
MapReduce is a highly acclaimed programming paradigm for large-scale information processing. However, there is no accurate model in the literature that can precisely forecast its run-time and resource usage for a given workload. In this paper, we derive analytic models for shared-memory MapReduce computations, in which the run-time and disk I/O are expressed as functions of the workload properties, hardware configuration, and algorithms used. We then compare these models against trace-driven simulations using our high-performance MapReduce implementation.
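The abstract does not reproduce the analytic models themselves. As a rough illustration of the kind of model it describes, the sketch below estimates run-time from workload properties (input size, record size) and hardware configuration (cores, memory, disk bandwidth), with an external-sort spill term driving the disk I/O. All parameter names and constants are illustrative assumptions, not the paper's formulas.

```python
def estimate_runtime(input_bytes, record_size, n_cores,
                     map_cost_per_record, sort_cost_per_record,
                     memory_bytes, disk_bw_bytes_per_s):
    """Hypothetical run-time estimate for a shared-memory MapReduce job.

    This is a sketch of an analytic cost model, not the model derived in
    the paper: it combines a CPU-bound map/sort phase with a disk-bound
    external-sort spill phase.
    """
    n_records = input_bytes / record_size

    # CPU phase: map and in-memory sort, parallelized across cores.
    cpu_time = n_records * (map_cost_per_record + sort_cost_per_record) / n_cores

    # Disk I/O phase: data exceeding memory is spilled and re-read by the
    # external sort, adding roughly 2x extra traffic for the overflow.
    spill_bytes = max(0.0, input_bytes - memory_bytes)
    io_bytes = input_bytes + 2 * spill_bytes
    io_time = io_bytes / disk_bw_bytes_per_s

    # Assume CPU and disk work overlap imperfectly: the slower phase
    # dominates, with a small residual from the faster one.
    return max(cpu_time, io_time) + 0.1 * min(cpu_time, io_time)


# Example: 100 GB input, 100-byte records, 16 cores, 32 GB of memory,
# 500 MB/s aggregate disk bandwidth (all values are made up).
print(estimate_runtime(100e9, 100, 16, 1e-7, 2e-7, 32e9, 500e6))
```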
Keywords
Big Data, MapReduce, MapReduce performance, parallel processing, shared memory systems, shared-memory MapReduce computations, sorting, external sort, stochastic processes, stochastic approach, analytic models, disk I/O, hardware configuration, high-performance MapReduce implementation, large-scale information processing, programming paradigm, resource usage, run-time, trace-driven simulations, workload properties