Taking GPU Programming Models to Task for Performance Portability

Joshua H. Davis, Pranav Sivaraman, Joy Kitson,Konstantinos Parasyris,Harshitha Menon, Isaac Minn,Giorgis Georgakoudis,Abhinav Bhatele

arXiv · Distributed, Parallel, and Cluster Computing（2024）

Cited 0|Views19

Abstract

Ensuring high productivity in scientific software development necessitates developing and maintaining a single codebase that can run efficiently on a range of accelerator-based supercomputing platforms. While prior work has investigated the performance portability of a few selected proxy applications or programming models, this paper provides a comprehensive study of a range of proxy applications implemented in the major programming models suitable for GPU-based platforms. We present and analyze performance results across NVIDIA and AMD GPU hardware currently deployed in leadership-class computing facilities using a representative range of scientific codes and several programming models – CUDA, HIP, Kokkos, RAJA, OpenMP, OpenACC, and SYCL. Based on the specific characteristics of applications tested, we include recommendations to developers on how to choose the right programming model for their code. We find that Kokkos, RAJA, and SYCL in particular offer the most promise empirically as performance portable programming models. These results provide a comprehensive evaluation of the extent to which each programming model for heterogeneous systems provides true performance portability in real-world usage.

Translated text

Bibtex

AI Read Science

Must-Reading Tree

Example

Generate MRT to find the research sequence of this paper

Data Disclaimer

The page data are from open Internet sources, cooperative publishers and automatic analysis results through AI technology. We do not make any commitments and guarantees for the validity, accuracy, correctness, reliability, completeness and timeliness of the page data. If you have any questions, please contact us by email: report@aminer.cn

Chat Paper

Summary is being generated by the instructions you defined