Adaptive Task Aggregation for High-Performance Sparse Solvers on GPUs

2019 28th International Conference on Parallel Architectures and Compilation Techniques (PACT), 2019

Abstract
Sparse solvers are heavily used in computational fluid dynamics (CFD), computer-aided design (CAD), and other important application domains. These solvers remain challenging to execute on massively parallel architectures, due to the sequential dependencies between the fine-grained application tasks. In particular, parallel sparse solvers typically suffer from substantial scheduling and dependency-management overheads relative to the compute operations. We propose adaptive task aggregation (ATA) to efficiently execute such irregular computations on GPU architectures via hierarchical dependency management and low-latency task scheduling. On a gamut of representative problems with different data-dependency structures, ATA significantly outperforms existing GPU task-execution approaches, achieving a geometric mean speedup of 2.2X to 3.7X across different sparse kernels (with speedups of up to two orders of magnitude).
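To make the "sequential dependencies between fine-grained tasks" concrete, the following is a minimal sketch of a sparse lower-triangular solve in which each row is one task. It uses plain level-set grouping, a standard textbook technique, and is not the paper's ATA scheme; the function names and the 4x4 example matrix are hypothetical and chosen only for illustration.

```python
# Illustrative only: level-set analysis of a sparse lower-triangular solve.
# This is NOT the paper's ATA algorithm; all names and data are hypothetical.

def dependency_levels(indptr, indices):
    """Return level[i] for each row i of a lower-triangular CSR matrix.

    Row i depends on every row j < i for which entry (i, j) is stored.
    """
    n = len(indptr) - 1
    level = [0] * n
    for i in range(n):
        for k in range(indptr[i], indptr[i + 1]):
            j = indices[k]
            if j < i:  # off-diagonal entry: x[i] needs x[j] first
                level[i] = max(level[i], level[j] + 1)
    return level


def group_by_level(level):
    """Group rows with equal level; rows in one group are independent."""
    groups = {}
    for row, lv in enumerate(level):
        groups.setdefault(lv, []).append(row)
    return [groups[lv] for lv in sorted(groups)]


def solve_lower(indptr, indices, data, b):
    """Forward substitution, visiting rows one level group at a time."""
    x = [0.0] * len(b)
    for rows in group_by_level(dependency_levels(indptr, indices)):
        for i in rows:  # on a GPU, these rows could run concurrently
            acc, diag = b[i], None
            for k in range(indptr[i], indptr[i + 1]):
                j = indices[k]
                if j == i:
                    diag = data[k]
                else:
                    acc -= data[k] * x[j]
            x[i] = acc / diag
    return x


# Hypothetical 4x4 lower-triangular matrix in CSR form:
# [2 0 0 0]
# [0 3 0 0]
# [1 0 4 0]
# [0 1 2 5]
indptr = [0, 1, 2, 4, 7]
indices = [0, 1, 0, 2, 1, 2, 3]
data = [2.0, 3.0, 1.0, 4.0, 1.0, 2.0, 5.0]
b = [2.0, 3.0, 5.0, 8.0]

levels = dependency_levels(indptr, indices)
groups = group_by_level(levels)
x = solve_lower(indptr, indices, data, b)
```

In this small case, rows 0 and 1 are independent, row 2 waits on row 0, and row 3 waits on rows 1 and 2, so only the first group offers any parallelism. Each task performs just a handful of multiply-adds, which is why scheduling and dependency tracking can dominate the useful work.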
Keywords
data dependency, fine-grained parallelism, GPUs, runtime adaptation, scheduling, sparse linear algebra, task parallel execution