The Case for Polymorphic Registers in Dataflow Computing

International Journal of Parallel Programming (2017)

Abstract
Heterogeneous systems are becoming increasingly popular, delivering high performance through hardware specialization. However, sequential data accesses may have a negative impact on performance. Data-parallel solutions such as Polymorphic Register Files (PRFs) can potentially accelerate applications by facilitating high-speed, parallel access to performance-critical data. This article shows how PRFs can be integrated into dataflow computational platforms. Our semi-automatic, compiler-based methodology generates customized PRFs and modifies the computational kernels to exploit them efficiently. We use a separable 2D convolution case study to evaluate the impact of memory latency and bandwidth on performance compared to a state-of-the-art NVIDIA Tesla C2050 GPU. We improve the throughput by up to 56.17× and show that the PRF-augmented system outperforms the GPU for mask sizes of 9×9 or larger, even in bandwidth-constrained systems.
Keywords
Dataflow computing, Parallel memory accesses, Polymorphic register file, Bandwidth, Vector lanes, Convolution, High performance computing, High-level synthesis