Exploring the possibility of a hipSYCL-based implementation of oneAPI.

International Workshop on OpenCL (IWOCL), 2022

Abstract
oneAPI is an open standard for a software platform built around SYCL 2020, accelerated libraries such as oneMKL, and low-level building blocks such as oneAPI Level Zero. Currently, all oneAPI implementations are based on the DPC++ SYCL implementation. However, being able to use multiple independent SYCL implementations with oneAPI code can benefit both users and implementors, for example when testing code or when noticing ambiguities in the specification. In this work, we explore the possibility of implementing oneAPI using hipSYCL as an independent SYCL implementation instead. We review hipSYCL’s design and demonstrate it running on oneAPI Level Zero with competitive performance. We also discuss hipSYCL’s support for SYCL 2020, using unified shared memory (USM), group algorithms, and optional kernel lambda naming as examples. To this end, we contribute microbenchmarks for the SYCL 2020 group algorithms and demonstrate their performance. When testing hipSYCL with HeCBench, a large benchmark suite containing SYCL benchmarks initially developed for DPC++, we point out specification ambiguities and practices that negatively impact code portability when transitioning from DPC++ to hipSYCL. We find that we can compile 122 benchmarks with hipSYCL with little effort, and demonstrate performance for a selection of benchmarks within 20% of native models on NVIDIA and AMD GPUs. Lastly, we demonstrate oneMKL’s BLAS domain running with hipSYCL on AMD and NVIDIA GPUs, and find that it can match native cuBLAS and rocBLAS performance for BLAS level 1, level 2, and level 3 operations, while significantly outperforming oneMKL with DPC++ on NVIDIA GPUs for all but the largest problem sizes. Overall, we find that hipSYCL can efficiently support low-level building blocks like Level Zero, oneAPI libraries like oneMKL, and the SYCL 2020 programming model, and hence conclude that it is indeed possible to implement oneAPI independently of DPC++.