Optimizing performance and energy across problem sizes through a search space exploration and machine learning.

J. Parallel Distributed Comput. (2023)

Abstract
HPC systems expose configuration options that assist application optimization. Configurations such as the degree of parallelism, thread and data mapping, or prefetching have been explored, but with a limited optimization objective (e.g., performance only) and a fixed problem size. Unfortunately, strategies that are efficient in one scenario may generalize poorly when applied in new contexts. We investigate the impact of configuration options and of different problem sizes on both performance and energy. Well-adapted NUMA-related options and cache prefetchers provide significantly greater gains for energy (5.9x) than for performance (1.85x). Moreover, reusing optimization strategies from performance to energy provides only 40% of the gains found when natively optimizing for energy, while transferring strategies across problem sizes is limited to 70% of the original gains. We bridge this gap with machine learning: simple decision trees predict the best configuration for a target problem size using only information collected on another size. Our models achieve 88% of the native gains when cross-predicting between performance and energy, and 85% across problem sizes.

© 2023 Elsevier Inc. All rights reserved.
Keywords
NUMA, Prefetch, Page and thread mapping, Machine learning, Energy
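The abstract's core idea, predicting the best configuration for one problem size from measurements taken at another, can be illustrated with a minimal sketch. This is not the paper's code: the single feature (memory intensity), the binary prefetch label, and the hand-rolled one-level decision tree ("stump") are all hypothetical stand-ins for the paper's feature set and decision-tree models.

```python
def train_stump(xs, ys):
    """Fit a one-level decision tree: pick the threshold on a single
    feature that minimizes misclassifications on the training data."""
    best = None
    for t in sorted(set(xs)):
        for left, right in [(0, 1), (1, 0)]:
            pred = [left if x <= t else right for x in xs]
            err = sum(p != y for p, y in zip(pred, ys))
            if best is None or err < best[0]:
                best = (err, t, left, right)
    return best[1:]  # (threshold, label_left, label_right)

def predict(stump, x):
    t, left, right = stump
    return left if x <= t else right

# Hypothetical training data collected at a *small* problem size:
# feature = memory intensity, label = best config (0 = prefetch off, 1 = on).
xs = [0.1, 0.2, 0.8, 0.9]
ys = [0, 0, 1, 1]
stump = train_stump(xs, ys)

# Cross-size prediction: apply the model to kernels profiled at a *large* size.
print(predict(stump, 0.15))  # → 0 (prefetch off)
print(predict(stump, 0.85))  # → 1 (prefetch on)
```

The paper reports that such simple trees recover 85% of the gains of native tuning across problem sizes; the point of the sketch is only that the model is trained entirely on one size and queried on another.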