Spatial computation

SIGPLAN Notices (2004)

Abstract
This thesis presents a compilation framework for translating ANSI C programs into hardware dataflow machines. The framework is embodied in the CASH compiler, a Compiler for Application-Specific Hardware. CASH generates asynchronous hardware circuits that directly implement the functionality of the source program, without using any interpretative structures. This style of computation is dubbed “Spatial Computation.” CASH relies extensively on predication and speculation for building efficient hardware circuits. The first part of this document describes Pegasus, the internal representation of CASH, and a series of novel program transformations performed by CASH. The most notable of these are a new optimal register-promotion algorithm and partial redundancy elimination for memory accesses based on predicate manipulation. The second part of this document evaluates the performance of the generated circuits using simulation. Using media processing benchmarks, we show that for the domain of embedded computation, the circuits generated by CASH can sustain high levels of instruction-level parallelism, due to the effective use of dataflow software pipelining. A comparison of Spatial Computation and superscalar processors highlights some of the weaknesses of our model of computation, such as the lack of branch prediction and register renaming. Low-level simulation, however, suggests that the energy efficiency of Application-Specific Hardware is three orders of magnitude better than that of superscalar processors, one order of magnitude better than that of low-power digital signal processors and asynchronous processors, and approaches that of custom hardware chips.
The results presented in this document can be applied in several domains: (1) most of the compiler optimizations are applicable to traditional compilers for high-level languages; (2) CASH itself can be used as a hardware synthesis tool for very fast system-on-a-chip prototyping directly from C sources; (3) the compilation framework we describe can be applied to the translation of imperative languages to dataflow machines; (4) we have extended the dataflow machine model to encompass predication, data-speculation and control-speculation; and (5) the tool-chain described and some specific optimizations, such as lenient execution and pipeline balancing, can be used for synthesis and optimization of asynchronous hardware.
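As a rough illustration of the predication and speculation mentioned in the abstract, the C sketch below contrasts a branching function with a hand-written predicated equivalent. It assumes the general if-conversion idea only: both arms are computed unconditionally and a predicate selects the result, much as a multiplexer would in a circuit. The function names are hypothetical, and this is not the output of CASH or a rendering of Pegasus.

```c
#include <stdint.h>

/* Control-flow form: only one arm of the branch executes. */
uint32_t abs_diff_branch(uint32_t a, uint32_t b)
{
    if (a > b)
        return a - b;
    return b - a;
}

/* Predicated, speculative form (hypothetical hand translation).
 * Both subtractions are evaluated regardless of the comparison
 * (speculation), and the predicate p picks one of them (predication).
 * Unsigned arithmetic wraps, so the unselected arm is harmless. */
uint32_t abs_diff_predicated(uint32_t a, uint32_t b)
{
    uint32_t p = (a > b);   /* predicate */
    uint32_t t = a - b;     /* "then" arm, computed speculatively */
    uint32_t f = b - a;     /* "else" arm, computed speculatively */
    return p ? t : f;       /* selection, analogous to a multiplexer */
}
```

Both functions return the same value for every input; the second form simply exposes the three operations as independent work that could proceed in parallel.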
Keywords
hardware structure, measurement, design, dedicated communication channel, application-specific hardware, CASH compiler, superscalar processor, performance, simple hardware primitive, computation unit, custom hardware chip, asynchronous hardware, spatial computation, low-power, SC program implementation, efficient hardware circuit, ASH use, hardware dataflow machine, compilation framework, monolithic superscalar processor, asynchronous hardware circuit, high-end superscalar processor, SC circuit, ASH hardware, dataflow machine