Automatic multidimensional memory partitioning for FPGA-based accelerators (abstract only).

FPGA '13: The 2013 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Monterey, California, USA, February 2013

Abstract
As data processing throughput in reconfigurable computing increases, data parallelism has become crucial to the performance of FPGA-based accelerators. However, most data parallelism optimizations are still performed manually by experienced hardware designers. Memory partitioning is widely adopted to increase memory bandwidth efficiently by using multiple memory banks and reducing data access conflicts. Previous memory partitioning methods focused mainly on one-dimensional arrays. As a consequence, designers must flatten a multidimensional array to fit those methodologies, which makes the partition depend on the dimensional width of the array. In this work we propose an automatic memory partitioning scheme for multidimensional arrays that provides high data throughput from on-chip memories for loop pipelining in high-level synthesis. A linear transformation is applied to optimize the layout of the data elements in the memory banks, so that the partition is independent of the dimensional width. Two transformation vectors map each original data element to a bank and to an offset within that bank. The vector for the optimal bank mapping is determined by a conflict-free access constraint. In addition, a memory padding technique is proposed to find a vector for the inner-bank offset with a trade-off between practicality and optimality. We evaluate our approach on six benchmarks with different access patterns. Compared to previous one-dimensional partitioning work, the experimental results show that our approach saves up to 21% of block RAMs, 19% of slices, and 46% of DSPs.
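The bank-mapping idea can be illustrated with a small sketch. The abstract does not give the actual algorithm, so everything below is an assumption made for illustration: the mapping form `bank = (alpha . index) mod num_banks`, the brute-force search, the function names, and the 5-point stencil used as the access pattern. It is not the authors' method, and it does not cover the inner-bank offset vector or the padding technique.

```python
from itertools import product


def bank_of(index, alpha, num_banks):
    """Bank id of a multidimensional index under an assumed linear mapping."""
    return sum(a * x for a, x in zip(alpha, index)) % num_banks


def is_conflict_free(offsets, alpha, num_banks):
    """True if all accesses of one loop iteration land in distinct banks.

    The mapping is linear, so bank differences between accesses depend only
    on the access offsets, not on the base index of the iteration.
    """
    banks = [bank_of(off, alpha, num_banks) for off in offsets]
    return len(set(banks)) == len(banks)


def find_bank_vector(offsets, max_banks=16):
    """Brute-force the smallest bank count and a vector alpha that avoids conflicts.

    Returns (num_banks, alpha), or None if nothing is found up to max_banks.
    """
    dims = len(offsets[0])
    # At least one bank per simultaneous access is needed.
    for num_banks in range(len(offsets), max_banks + 1):
        for alpha in product(range(num_banks), repeat=dims):
            if is_conflict_free(offsets, alpha, num_banks):
                return num_banks, alpha
    return None


# Hypothetical access pattern: a 5-point stencil on a 2-D array,
# given as offsets relative to the current loop index (i, j).
STENCIL = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]

if __name__ == "__main__":
    print(find_bank_vector(STENCIL))  # (5, (1, 2))
```

For this stencil the search returns five banks with `alpha = (1, 2)`: the five offsets map to banks 0, 4, 1, 3, and 2. Note that the array's row width never appears in the mapping, which is the sense in which such a partition is independent of the dimensional width, unlike a mapping computed on a flattened address.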