Exploring the Architecture of Multiple GEMM Accelerators in Heterogeneous Systems

2023 9th International Conference on Control Science and Systems Engineering (ICCSSE), 2023

Abstract
General matrix-matrix multiplication (GEMM) is a widely used kernel in machine learning, scientific computing, and many other applications. A customized GEMM accelerator can deliver clear performance and power consumption benefits. In this paper, we first perform a detailed workload characterization of GEMM kernels of different sizes. We then carry out a comprehensive design space exploration to find the Pareto-optimal architecture configurations. Lastly, we compare two versions of the multiple-GEMM-accelerator system with mainstream BLAS libraries (e.g., OpenBLAS, MKL, and cuBLAS). The proposed GEMM acceleration hardware system shows higher energy efficiency than the existing software implementations.
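To illustrate what selecting Pareto-optimal configurations means in a design space exploration, here is a minimal Python sketch. It is not the paper's method. The `Config` type, the two objectives (latency and power, both minimized), and all numeric values are hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple


class Config(NamedTuple):
    """Hypothetical accelerator design point (illustrative values only)."""
    name: str
    latency_ms: float  # lower is better
    power_w: float     # lower is better


def dominates(a: Config, b: Config) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.latency_ms <= b.latency_ms and a.power_w <= b.power_w
    strictly_better = a.latency_ms < b.latency_ms or a.power_w < b.power_w
    return no_worse and strictly_better


def pareto_front(configs: List[Config]) -> List[Config]:
    """Keep every design point that no other point dominates."""
    return [c for c in configs if not any(dominates(o, c) for o in configs)]


# Made-up design points, not measurements from the paper.
candidates = [
    Config("A", 10.0, 5.0),
    Config("B", 8.0, 7.0),
    Config("C", 12.0, 6.0),  # dominated by A (slower and higher power)
    Config("D", 6.0, 9.0),
    Config("E", 8.0, 8.0),   # dominated by B (same latency, higher power)
]

front = pareto_front(candidates)
print([c.name for c in front])
```

The pairwise check is quadratic in the number of design points, which is usually acceptable for a small swept design space. Larger spaces would call for a sort-based or incremental filter.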
Keywords
GEMM,accelerator,SoC,heterogeneous system