Massively Parallel Computation of Linear Recurrence Equations with Graphics Processing Units

2018 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS XVIII)

Abstract
Graphics processing units (GPUs) achieve very high performance when executing many parallel programs; however, their use in solving linear recurrence equations is considered difficult because of the sequential nature of the problem. Previously developed parallel algorithms, such as recursive doubling and multi-block processing, do not show high efficiency on GPUs because they scale poorly with the number of threads. In this work, we have developed a highly efficient GPU-based algorithm for recurrences using a thread-level parallel (TLP) approach instead of conventional thread-block-level parallel (TBLP) methods. The proposed TLP method executes all of the threads as independently as possible to improve computational efficiency and employs a hierarchical structure for inter-thread communication. Both constant-coefficient and time-varying-coefficient recurrence equations are implemented on NVIDIA GTX285, GTX580, and GTX TITAN X GPUs, and the performance is compared with results on single-core and multi-core SIMD CPU-based PCs.
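To illustrate why a seemingly sequential first-order recurrence x[n] = a[n]·x[n-1] + b[n] admits scan-based parallelization at all, the sketch below expresses it as an inclusive prefix scan over affine maps f_n(x) = a[n]·x + b[n] under function composition, which is associative. This is a minimal single-block Hillis-Steele scan written in CUDA for exposition only; it is not the paper's TLP method, and the kernel name, problem size N, and test coefficients are illustrative assumptions.

```cuda
// Sketch: solve x[n] = a[n]*x[n-1] + b[n] via an inclusive scan over
// affine maps, composed as (a2,b2) o (a1,b1) = (a2*a1, a2*b1 + b2).
// Single block only; a real solver would scan across blocks as well.
#include <cstdio>
#include <cuda_runtime.h>

#define N 256  // assumed problem size, one element per thread

__global__ void affine_scan(const float* a, const float* b, float x0, float* x)
{
    __shared__ float sa[N], sb[N];
    int i = threadIdx.x;
    sa[i] = a[i];
    sb[i] = b[i];
    __syncthreads();

    // Hillis-Steele inclusive scan under affine composition.
    // Reads complete before the barrier, writes after, so no race.
    for (int d = 1; d < N; d <<= 1) {
        float pa = 0.f, pb = 0.f;
        if (i >= d) { pa = sa[i - d]; pb = sb[i - d]; }
        __syncthreads();
        if (i >= d) {
            sb[i] = sa[i] * pb + sb[i];  // update offset before gain
            sa[i] = sa[i] * pa;
        }
        __syncthreads();
    }
    // Element i now holds the composition f_i o ... o f_0;
    // apply it to the initial value x[-1] = x0.
    x[i] = sa[i] * x0 + sb[i];
}

int main()
{
    float ha[N], hb[N], hx[N];
    for (int i = 0; i < N; ++i) { ha[i] = 0.5f; hb[i] = 1.0f; }  // test data
    float *da, *db, *dx;
    cudaMalloc(&da, N * sizeof(float));
    cudaMalloc(&db, N * sizeof(float));
    cudaMalloc(&dx, N * sizeof(float));
    cudaMemcpy(da, ha, N * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, N * sizeof(float), cudaMemcpyHostToDevice);
    affine_scan<<<1, N>>>(da, db, 0.0f, dx);
    cudaMemcpy(hx, dx, N * sizeof(float), cudaMemcpyDeviceToHost);
    // With a = 0.5 and b = 1, x[n] converges toward b/(1-a) = 2.
    printf("x[0]=%f x[%d]=%f\n", hx[0], N - 1, hx[N - 1]);
    cudaFree(da); cudaFree(db); cudaFree(dx);
    return 0;
}
```

Note that this composition operator also handles the time-varying-coefficient case mentioned in the abstract, since each a[n], b[n] pair may differ per step.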
Keywords
Graphics processing unit (GPU), massively parallel processing, linear recurrence equation, prefix-sum, scan