Massively Parallel Computation of Linear Recurrence Equations with Graphics Processing Units

2018 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS XVIII)

Abstract
Graphics processing units (GPUs) achieve very high performance when executing many parallel programs; however, their use in solving linear recurrence equations is considered difficult because of the sequential nature of the problem. Previously developed parallel algorithms, such as recursive doubling and multi-block processing, do not show high efficiency on GPUs because they scale poorly with the number of threads. In this work, we have developed a highly efficient GPU-based algorithm for recurrences using a thread-level parallel (TLP) approach instead of conventional thread-block-level parallel (TBLP) methods. The proposed TLP method executes all of the threads as independently as possible to improve computational efficiency and employs a hierarchical structure for inter-thread communication. Both constant-coefficient and time-varying-coefficient recurrence equations are implemented on NVIDIA GTX285, GTX580, and GTX TITAN X GPUs, and the performance is compared with results on single-core and multi-core SIMD CPU-based PCs.
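To illustrate why a seemingly sequential first-order recurrence x[n] = a[n]·x[n-1] + b[n] admits scan-based parallelization at all, the sketch below expresses it as an inclusive prefix scan over affine maps f_n(x) = a[n]·x + b[n] under function composition, which is associative. This is a minimal single-block Hillis-Steele scan written in CUDA for exposition only; it is not the paper's TLP method, and the kernel name, problem size N, and test coefficients are illustrative assumptions.

```cuda
// Sketch: solve x[n] = a[n]*x[n-1] + b[n] via an inclusive scan over
// affine maps, composed as (a2,b2) o (a1,b1) = (a2*a1, a2*b1 + b2).
// Single block only; a real solver would scan across blocks as well.
#include <cstdio>
#include <cuda_runtime.h>

#define N 256  // assumed problem size, one element per thread

__global__ void affine_scan(const float* a, const float* b, float x0, float* x)
{
    __shared__ float sa[N], sb[N];
    int i = threadIdx.x;
    sa[i] = a[i];
    sb[i] = b[i];
    __syncthreads();

    // Hillis-Steele inclusive scan under affine composition.
    // Reads complete before the barrier, writes after, so no race.
    for (int d = 1; d < N; d <<= 1) {
        float pa = 0.f, pb = 0.f;
        if (i >= d) { pa = sa[i - d]; pb = sb[i - d]; }
        __syncthreads();
        if (i >= d) {
            sb[i] = sa[i] * pb + sb[i];  // update offset before gain
            sa[i] = sa[i] * pa;
        }
        __syncthreads();
    }
    // Element i now holds the composition f_i o ... o f_0;
    // apply it to the initial value x[-1] = x0.
    x[i] = sa[i] * x0 + sb[i];
}

int main()
{
    float ha[N], hb[N], hx[N];
    for (int i = 0; i < N; ++i) { ha[i] = 0.5f; hb[i] = 1.0f; }  // test data
    float *da, *db, *dx;
    cudaMalloc(&da, N * sizeof(float));
    cudaMalloc(&db, N * sizeof(float));
    cudaMalloc(&dx, N * sizeof(float));
    cudaMemcpy(da, ha, N * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, N * sizeof(float), cudaMemcpyHostToDevice);
    affine_scan<<<1, N>>>(da, db, 0.0f, dx);
    cudaMemcpy(hx, dx, N * sizeof(float), cudaMemcpyDeviceToHost);
    // With a = 0.5 and b = 1, x[n] converges toward b/(1-a) = 2.
    printf("x[0]=%f x[%d]=%f\n", hx[0], N - 1, hx[N - 1]);
    cudaFree(da); cudaFree(db); cudaFree(dx);
    return 0;
}
```

Note that this composition operator also handles the time-varying-coefficient case mentioned in the abstract, since each a[n], b[n] pair may differ per step.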
Keywords
Graphics processing unit (GPU), massively parallel processing, linear recurrence equation, prefix-sum, scan