Estimating the WCET of GPU-Accelerated Applications Using Hybrid Analysis

Real-Time Systems (2013)

Abstract
The massive parallelism offered by Graphics Processing Units (GPUs) is now routinely exploited to accelerate computationally intensive tasks in a wide variety of application domains. Efficient GPU programming in languages such as CUDA and OpenCL requires careful application of hand optimisations to exploit parallelism and locality while minimising synchronisation. The effectiveness of such optimisations can be highly dependent on workload and the structure of input data, making it difficult to assess performance in general by testing alone. To address this, we study the problem of estimating the Worst-Case Execution Time (WCET) of GPU-accelerated applications. We propose the use of hybrid WCET analysis whereby execution times of small program segments are deduced from traces of execution and a calculation backend derived from the Control Flow Graph (CFG) produces a WCET estimate. Standard techniques which construct a CFG from a binary cannot be applied directly to GPU code because they miss implicit execution paths that arise due to the way branches are implemented in hardware; we present a solution using standard compiler analysis. We further describe how to extend the basic hybrid WCET analysis of sequential code so that concurrent timing effects in the GPU execution model are incorporated. We have implemented our analysis as a tool built on top of the GPGPU-sim open source simulator. We evaluate our tool using a set of benchmarks drawn from the CUDA SDK: results show that effective modelling of concurrency is key to reducing pessimism in the WCET calculation.
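The basic hybrid idea for sequential code can be illustrated with a minimal Python sketch. It is not the authors' tool: the CFG, the block names, the cycle counts, and the helper functions `max_observed_times` and `wcet_estimate` are all hypothetical, and the sketch assumes an acyclic CFG (a real analysis also needs loop bounds, the implicit GPU execution paths, and a model of concurrent timing effects). Per-block times are taken as the maximum seen in any trace, and the estimate is the longest path through the CFG.

```python
from collections import defaultdict


def max_observed_times(traces):
    """Return the largest observed cycle count per basic block over all traces."""
    worst = defaultdict(int)
    for trace in traces:
        for block, cycles in trace:
            worst[block] = max(worst[block], cycles)
    return dict(worst)


def wcet_estimate(cfg, entry, block_times):
    """Longest path from `entry` through an acyclic CFG, weighted by block time."""
    memo = {}

    def longest_from(block):
        if block not in memo:
            # Worst continuation over all successors; 0 for an exit block.
            tail = max((longest_from(s) for s in cfg.get(block, [])), default=0)
            memo[block] = block_times[block] + tail
        return memo[block]

    return longest_from(entry)


# Hypothetical CFG for a single if/else (assumption, for illustration only).
cfg = {"entry": ["then", "else"], "then": ["exit"], "else": ["exit"], "exit": []}

# Hypothetical traces: (basic block, observed cycles) in execution order.
traces = [
    [("entry", 4), ("then", 10), ("exit", 2)],  # 16 cycles end to end
    [("entry", 5), ("else", 7), ("exit", 3)],   # 15 cycles end to end
]

times = max_observed_times(traces)
estimate = wcet_estimate(cfg, "entry", times)
print(estimate)  # 18
```

The estimate of 18 cycles exceeds both observed end-to-end times (16 and 15) because it combines the worst time of each block along the worst path, which is how a hybrid analysis can cover combinations that testing alone did not exercise.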
Keywords
efficient gpu programming,execution time,basic hybrid wcet analysis,gpu-accelerated applications,hybrid analysis,gpu code,standard compiler analysis,gpu execution model,hybrid wcet analysis,wcet estimate,wcet calculation,implicit execution path,gpu programming,instruction sets,minimisation,concurrency control,benchmark,programming,control flow graph,hardware