A Case for a Flexible Scalar Unit in SIMT Architecture

IPDPS (2014)

Abstract
Wide availability and the Single-Instruction Multiple-Thread (SIMT)-style programming model have made graphics processing units (GPUs) a promising choice for high-performance computing. However, because of SIMT-style processing, an instruction is executed in every thread even if its operands are identical across all threads. To overcome this inefficiency, AMD's latest Graphics Core Next (GCN) architecture integrates a scalar unit into a SIMT unit. In GCN, the SIMT unit and the scalar unit share a single SIMT-style instruction stream, and each instruction is issued to either the scalar unit or the SIMT unit depending on its type. In this paper, we propose to extend the scalar unit so that it can either share the instruction stream with the SIMT unit or execute a separate instruction stream. The program executed by the scalar unit is referred to as a scalar program, and its purpose is to assist SIMT-unit execution. Scalar programs are either generated automatically from SIMT programs by the compiler or developed manually by expert developers. We make a case for the proposed flexible scalar unit through three collaborative execution paradigms: data prefetching, control divergence elimination, and scalar-workload extraction. Our experimental results show that the proposed approaches can achieve significant performance gains over state-of-the-art SIMT-style processing.
Keywords
vector unit, data prefetching, flexible scalar unit, single simt style instruction stream, compiler, scalar unit, storage management, amd, graphics processing units, simt, multi-threading, graphics core next architecture, multiprocessing systems, high performance computing, gpgpu, collaborative execution paradigms, gcn architecture, gpus, scalar-workload extraction, scalar program, simt architecture, control divergence elimination, single-instruction multiple-thread-style programming model, instruction sets, simt style processing, program compilers, simt-unit execution, vectors, distributed processing