On the Power of Combiner Optimizations in MapReduce Over MPI Workflows

2018 IEEE 24th International Conference on Parallel and Distributed Systems (ICPADS) (2018)

Abstract
Analyzing large volumes of data is increasingly important across scientific computing domains. MapReduce over MPI frameworks are an appealing way to enable scalable big data analytics on supercomputing systems. These frameworks can further exploit the structure of MapReduce applications through combiner optimizations, which merge key/value pairs before the reduce function is applied. In this paper, we propose a pipeline combiner workflow and integrate it into Mimir, a cutting-edge implementation of MapReduce over MPI. Our results with real datasets on the Tianhe-2 supercomputer show that the pipeline combiner workflow can reduce memory usage by up to 51% and improve overall performance by up to 61%.
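To make the combiner idea concrete, here is a minimal word-count sketch in plain Python. It only illustrates the general technique of merging key/value pairs on the map side before the reduce step. It does not use MPI, it is not Mimir's API, and it does not reproduce the pipeline combiner workflow proposed in the paper. The function names and the two input splits are hypothetical.

```python
from collections import defaultdict


def map_words(text):
    """Map phase: emit one (word, 1) pair per word."""
    return [(word, 1) for word in text.split()]


def combine(pairs):
    """Combiner: merge pairs that share a key before they leave the map task."""
    merged = defaultdict(int)
    for key, value in pairs:
        merged[key] += value
    return list(merged.items())


def reduce_counts(pairs):
    """Reduce phase: sum all values received for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


# Two hypothetical map tasks, each with its own input split.
splits = ["a b a a", "b b a"]

# Without a combiner, every mapped pair is handed to the reducer.
plain = [pair for split in splits for pair in map_words(split)]

# With a combiner, each map task merges its own pairs first,
# so fewer pairs need to be buffered and exchanged.
combined = [pair for split in splits for pair in combine(map_words(split))]

result_plain = reduce_counts(plain)
result_combined = reduce_counts(combined)

print(len(plain), len(combined))
print(result_plain == result_combined)
```

In this toy input the combiner lowers the number of intermediate pairs while leaving the final counts unchanged. This works because addition is associative and commutative, so partial sums can be merged in any order.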
Keywords
Optimization, Pipelines, Supercomputers, Merging, Sparks, Memory management, Buffer storage