Exploring a Multi-Resolution GPU Programming Model for Chapel

Akihiro Hayashi, Sri Raj Paul, Vivek Sarkar

2020 IEEE 34th International Parallel and Distributed Processing Symposium Workshops (IPDPSW 2020)

Abstract

There is a growing need to support accelerators, especially GPUs, since they are a common source of performance improvement in HPC clusters. When programming GPUs with Chapel, programmers typically start by writing forall loops and running them on CPUs as a proof of concept. If the resulting CPU performance is insufficient, the next step could be to try automatic compiler-based GPU code generation techniques [1], [2]. For portions that remain performance bottlenecks even after automatic compilation, the next step is to consider writing GPU kernels in CUDA/HIP/OpenCL and invoking them from the Chapel program using the GPUIterator [3], [4] and Chapel's C interoperability feature.
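
As a rough illustration of the last step, the C side of such an interoperability boundary might look like the sketch below. The function name `vc_range`, its parameters, and the inclusive `lo..hi` convention are assumptions made for this example and are not taken from the paper. A plain C loop stands in for the CUDA/HIP/OpenCL kernel launch so that the sketch compiles without a GPU toolchain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C entry point that a Chapel program could declare with
 * `extern proc` and call for the GPU portion of an iteration space.
 * In a real build this body would copy data to the device and launch a
 * CUDA/HIP/OpenCL kernel; here a CPU loop stands in for that launch. */
void vc_range(float *dst, const float *src, int64_t lo, int64_t hi)
{
    assert(dst != NULL && src != NULL);

    /* lo and hi are treated as inclusive 0-based bounds (an assumption). */
    for (int64_t i = lo; i <= hi; i++) {
        dst[i] = src[i];
    }
}
```

Keeping the entry point a plain C function with scalar bounds and raw pointers is one way to keep the Chapel-facing declaration simple, whichever GPU API ends up behind it.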

Keywords

automatic compilation, GPU kernels, Chapel program, multiresolution GPU programming model, GPU accelerators, HPC clusters, forall loops, Chapel C interoperability feature, automatic compiler-based GPU code generation, CUDA, OpenCL, HIP
