Handling large data sets for high-performance embedded applications in heterogeneous systems-on-chip

ESWEEK'16: Twelfth Embedded Systems Week, Pittsburgh, Pennsylvania, October 2016

Abstract
Local memory is a key factor for the performance of accelerators in SoCs. Despite technology scaling, the gap between on-chip storage and memory footprint of embedded applications keeps widening. We present a solution to preserve the speedup of accelerators when scaling from small to large data sets. Combining specialized DMA and address translation with a software layer in Linux, our design is transparent to user applications and broadly applicable to any class of SoCs hosting high-throughput accelerators. We demonstrate the robustness of our design across many heterogeneous workload scenarios and memory allocation policies with FPGA-based SoC prototypes featuring twelve concurrent accelerators accessing up to 768MB out of 1GB-addressable DRAM.
Keywords
high-performance embedded applications,large data set handling,heterogeneous systems-on-chip,local memory,accelerator performance,technology scaling,on-chip storage,memory footprint,DMA,software layer,Linux,high-throughput accelerators,heterogeneous workload scenarios,memory allocation policies,FPGA-based SoC prototypes,concurrent accelerators,DRAM