Extending Coarse-Grained Reconfigurable Arrays with Multi-Kernel Dataflow

Semantic Scholar (2012)

Citations: 5 | Views: 4
Abstract
Coarse-Grained Reconfigurable Arrays (CGRAs) are a promising class of architectures for accelerating applications, using a large number of parallel execution units for high throughput. While the model allows tools to automatically parallelize a single task across many processing elements, all processing elements are required to operate in lockstep. This makes applications that involve multiple data streams, multiple tasks, or unpredictable schedules harder to program and inefficient in their use of resources. Such applications can often be decomposed into a set of communicating kernels that operate independently to achieve the overall computation. Although competing accelerator architectures such as Massively Parallel Processor Arrays (MPPAs) can use this communicating-processes model, it generally requires the designer to decompose the design into as many kernels as there are processors to be used. While this is excellent for executing unrelated tasks simultaneously, it limits the amount of resources that a single task can easily utilize. We are developing a new CGRA architecture that enables simultaneous execution of multiple kernels of computation, greatly extending the domain of applications that CGRAs can accelerate. This new architecture poses two problems that we describe in this paper. First, the tools must handle the decomposition, scheduling, placement, and routing of multiple kernels. Second, the CGRA must include new resources for coordinating and synchronizing the operation of multiple kernels. This paper is a digest of the project’s previously published results [17,18,19,20,21].