Power Tuning HPC Jobs on Power-Constrained Systems

PACT(2016)

Abstract
As we approach the exascale era, power has become a primary bottleneck. The US Department of Energy has set a power constraint of 20 MW on each exascale machine. To achieve one exaflop under this constraint, power must be used intelligently to maximize performance.

Most production-level parallel applications that run on a supercomputer are tightly coupled. A naive approach to enforcing a power constraint on a parallel job is to divide the job's power budget uniformly across all of its processors. However, previous work has shown that a power-capped job suffers from performance variation across otherwise identical processors, leading to suboptimal overall performance. We propose a two-level, hierarchical, variation-aware approach to managing power at the machine level. At the macro level, PPartition partitions the machine's power budget across the jobs running on the system, assigning each job a power budget such that the machine never exceeds its own budget. At the micro level, PTune makes job-centric decisions that take the performance variation into account. For every moldable job, PTune determines the optimal number of processors, the selection of processors, and the distribution of the job's power budget across them, with the goal of maximizing the job's performance under its power budget.

Experiments show that, at the micro level, PTune achieves a performance improvement of up to 29% compared to the naive approach. PTune does not lead to any performance degradation, yet it frees up almost 40% of the processors while matching the performance of the naive approach under a hard power bound. At the macro level, PPartition achieves a throughput improvement of 5-35% compared to uniform power distribution.
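The contrast between a uniform split and a variation-aware split of a job's power budget can be illustrated with a small sketch. This is not PTune itself: the linear model (speed = efficiency × power cap), the per-processor `efficiency` values, and the function names `uniform_split`, `equalizing_split`, and `job_speed` are all assumptions introduced here for illustration. The sketch only shows why, for a tightly coupled job whose progress is set by its slowest processor, giving more power to less efficient processors can help under the same total budget.

```python
# Toy model (an assumption, not the paper's model): a processor's speed is
# efficiency * power_cap, and a tightly coupled job runs at the speed of
# its slowest processor.

def uniform_split(budget_w, efficiency):
    """Naive approach: every processor gets the same share of the budget."""
    n = len(efficiency)
    return [budget_w / n] * n


def equalizing_split(budget_w, efficiency):
    """Hypothetical variation-aware split: caps are proportional to
    1 / efficiency, so all processors reach the same speed while the caps
    still sum to the job's budget."""
    inverse = [1.0 / e for e in efficiency]
    scale = budget_w / sum(inverse)
    return [scale * x for x in inverse]


def job_speed(caps_w, efficiency):
    """Speed of a tightly coupled job: limited by the slowest processor."""
    return min(e * p for e, p in zip(efficiency, caps_w))


if __name__ == "__main__":
    efficiency = [1.0, 0.8, 0.5]   # made-up variation across "identical" parts
    budget_w = 300.0               # made-up job power budget in watts

    for name, split in (("uniform", uniform_split),
                        ("equalizing", equalizing_split)):
        caps = split(budget_w, efficiency)
        print(name, [round(c, 1) for c in caps],
              round(job_speed(caps, efficiency), 1))
```

Under this toy model the equalizing split never does worse than the uniform one, because the uniform split leaves the least efficient processor as the bottleneck. Real processors have nonlinear power-performance curves and minimum and maximum cap limits, which this sketch ignores.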
Keywords
power tuning HPC jobs, power-constrained systems, exascale machine, power constraint, production-level parallel applications, supercomputer, parallel job, power capped job, performance variation, identical processors, machine-level, power budget, PTune, job-centric decisions, moldable job, macro level, PPartition, uniform power distribution