Performance Portability Evaluation of OpenCL Benchmarks across Intel and NVIDIA Platforms

2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)(2020)

引用 10|浏览14
暂无评分
摘要
We evaluate the capabilities of vendor-provided OpenCL implementations for performance portability across multiple computing platforms. The Rodinia benchmark suite is used for this evaluation. We apply the metric defined by Pennycook et al., and we use roofline efficiency from the Roofline performance model as the “performance efficiency” in the metric's definition. We found that the delivered performance portability is similar for several benchmarks, even if the roofline-based performance efficiencies across platforms are very different among the benchmarks. To help distinguish between these instances, we extend the metric by adding the standard deviation of the performance efficiencies for each benchmark. We argue that the standard deviation gives additional insight into performance portability assessment since it adds the performance variability across platforms. Additionally, we discuss the challenges to measure performance portability associated with algorithms and system software. In terms of algorithms, we need to carefully construct the benchmarks and appropriately use the concurrency available on a platform. In terms of system software, we depend on the vendor performance tools to support the desired programming model and runtime to be able to measure the metrics of interest.
更多
查看译文
关键词
high performance computing,performance efficiency,performance portability,roofline performance analysis,OpenCL,GPU
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要