A Reliability-aware Environment for Design Exploration for GPU Devices

2023 26th International Symposium on Design and Diagnostics of Electronic Circuits and Systems (DDECS), 2023

Abstract
GPU platforms have become widely important in applications that require high processing power. Unfortunately, the advanced semiconductor technologies used to manufacture them are prone to different types of faults. Hence, solutions are required to support exploring how resilient different architectures are to faults. Motivated by this need, this work presents an environment dedicated to analyzing the impact of permanent faults on GPU platforms. The environment is based on GPGPU-Sim and exploits that tool's configuration features to analyze the effects of faults as the target architecture changes. To validate the environment and show its usability, a fault campaign was carried out on three GPU architectures (Kepler, Volta, and Turing). In addition, each GPU was modified to use an arbitrary number of parallel processing cores (SMs). Three representative applications (Vector Add, Scalar Product, and Matrix Multiply) were executed on each GPU, and the behavior of each architecture was analyzed in the presence of permanent faults in the functional units (i.e., the integer and floating-point units). This fault campaign shows the usability of the environment and demonstrates its potential to support decisions on the best architectural parameters for a given application.
Keywords
Architectural models, Design exploration, Graphics Processing Units (GPUs), Permanent faults, Reliability
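
As an illustration of the kind of experiment the abstract describes, the following is a minimal sketch of a permanent fault in an integer unit and its effect on a Vector Add workload. It is not the authors' environment and does not use any GPGPU-Sim interface. The stuck-at fault model, the lane-based mapping of work to units, the "masked"/"SDC" (silent data corruption) outcome labels, and all function names (`stuck_at`, `faulty_int_add`, `vector_add`, `classify`) are assumptions made for this example.

```python
WORD_MASK = 0xFFFFFFFF  # assumed 32-bit integer datapath


def stuck_at(value, bit, stuck_value):
    """Force one output bit to 0 or 1, modelling a permanent stuck-at fault."""
    mask = 1 << bit
    value &= WORD_MASK
    return (value | mask) if stuck_value else (value & ~mask)


def faulty_int_add(a, b, bit, stuck_value):
    """Integer adder whose result has one bit permanently stuck."""
    return stuck_at(a + b, bit, stuck_value)


def vector_add(xs, ys, faulty_lanes=(), bit=0, stuck_value=1, lanes=4):
    """Element-wise add; element i is assumed to run on lane i % lanes.

    Lanes listed in faulty_lanes use the faulty adder on every operation,
    since a permanent fault affects every use of the damaged unit.
    """
    out = []
    for i, (x, y) in enumerate(zip(xs, ys)):
        if i % lanes in faulty_lanes:
            out.append(faulty_int_add(x, y, bit, stuck_value))
        else:
            out.append((x + y) & WORD_MASK)
    return out


def classify(golden, observed):
    """Compare a faulty run against the fault-free (golden) run."""
    return "masked" if golden == observed else "SDC"


xs, ys = [1, 2, 3, 4], [1, 1, 1, 1]
golden = vector_add(xs, ys)                      # fault-free reference
run_a = vector_add(xs, ys, faulty_lanes={0})     # bit 0 stuck at 1 on lane 0
run_b = vector_add(xs, ys, faulty_lanes={1})     # same fault on lane 1
print(classify(golden, run_a), classify(golden, run_b))
```

In this sketch the same fault corrupts the output when it sits on lane 0 but is masked on lane 1, because the correct result on lane 1 already has bit 0 set. A fault campaign repeats this comparison across many fault locations, applications, and architectural configurations.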