Optimizing two-dimensional DMA transfers for scratchpad-based MPSoCs platforms

Microprocessors and Microsystems (2013)

Abstract
Reducing the effects of off-chip memory access latency is a key factor in efficiently exploiting embedded multi-core platforms. We consider architectures that feature a multi-core computation fabric with its own small, fast memory, into which the data blocks to be processed are fetched from external memory using a DMA (direct memory access) engine, employing a double- or multiple-buffering scheme to avoid processor idling. In this paper we focus on application programs that process two-dimensional data arrays, and we automatically determine the size and shape of the portions of the data array that are transferred in a single DMA call, based on hardware and application parameters. When the computations on different array elements are completely independent, the asymmetry of the memory structure always favors one-dimensional horizontal pieces of memory, whereas when the computation of a data element shares some data with its neighbors, there is pressure toward more "square" shapes to reduce the amount of redundant data transfers. We provide an analytic model for this optimization problem and validate our results by running a mean filter application on the Cell simulator.
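The shape trade-off described in the abstract can be illustrated with a small sketch. This is not the paper's analytic model: the cost function, the parameter names (`row_overhead`, `elem_cost`, `radius`), and the exhaustive search are all assumptions chosen only to show why independent computations favor flat horizontal blocks while neighborhood computations (such as a mean filter) push toward squarer ones.

```python
def transferred_elements(rows, cols, radius):
    """Elements fetched for a rows x cols block when each output element
    needs neighbors up to `radius` away (a halo on every side)."""
    return (rows + 2 * radius) * (cols + 2 * radius)


def redundancy(rows, cols, radius):
    """Fetched elements divided by useful elements (1.0 means no redundancy)."""
    return transferred_elements(rows, cols, radius) / (rows * cols)


def best_shape(block_elems, radius, row_overhead, elem_cost):
    """Pick the (rows, cols) factorization of block_elems with the lowest
    cost under an ASSUMED model: each fetched row pays a fixed overhead
    (rows are contiguous in memory, columns are not), and each fetched
    element pays a fixed per-element cost."""
    best = None
    for rows in range(1, block_elems + 1):
        if block_elems % rows:
            continue  # only consider exact rectangular factorizations
        cols = block_elems // rows
        fetched_rows = rows + 2 * radius
        cost = (fetched_rows * row_overhead
                + transferred_elements(rows, cols, radius) * elem_cost)
        if best is None or cost < best[0]:
            best = (cost, rows, cols)
    return best[1], best[2]


if __name__ == "__main__":
    # Independent computations (no halo): one long horizontal strip wins.
    print(best_shape(64, radius=0, row_overhead=10, elem_cost=1))
    # 3x3 neighborhood (radius 1): the halo makes flatter shapes wasteful.
    print(best_shape(64, radius=1, row_overhead=10, elem_cost=1))
```

Under these assumed costs, raising `row_overhead` pulls the optimum toward flatter blocks, while raising `radius` pulls it toward squarer ones; the actual balance on a given platform depends on its measured DMA characteristics.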
Keywords
data element share, data array, redundant data transfer, two-dimensional DMA transfer, external memory, direct memory access, MPSoCs platform, memory structure, off-chip memory access latency, data block, two-dimensional data array, small memory, double buffering