Automatic Interior I/O Elimination in Systolic Array Architecture
2018 IEEE 26th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), 2018
Abstract
Automatic systolic array generation has long been a topic of interest because manual designs usually require substantial implementation effort. Conventional automatic generation builds a dependency graph from the algorithm and iteratively maps the computation nodes of the graph onto processing elements (PEs), assigning each node a time stamp that specifies the order in which nodes execute within a PE. This mapping sometimes produces an array in which I/O operations are distributed across all PEs, which complicates the design and adds area overhead. The process of removing such I/O is called interior I/O elimination. This paper presents our approach to interior I/O elimination. We propose an automated compiler that maps algorithms to systolic arrays. For any mapped array with interior I/O, the compiler analyzes the communication patterns and determines whether the interior I/O can be eliminated. If so, the compiler eliminates the I/O and adds the modules needed for data transfer to the design. We demonstrate our approach on two important applications: matrix multiplication and convolutional neural networks. With the help of interior I/O elimination, the designs generated by the compiler achieve 591.4 GFLOPs and 405.2 GFLOPs, respectively.
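To make the idea of mapping computation nodes to PEs with time stamps more concrete, the following is a minimal sketch for matrix multiplication. It is an illustration only, not the compiler described in the paper: the output-stationary mapping (PE = (i, j), time = i + j + k), the function names `space_time_map`, `can_forward`, and `io_pes`, and the neighbour-forwarding check are all assumptions chosen for this example. Draining the results out of the array is ignored.

```python
from itertools import product


def space_time_map(n):
    """Map each iteration (i, j, k) of an n x n x n matmul to PE (i, j).

    Assumed mapping: the time stamp is i + j + k, so each PE handles
    one k value per cycle. Returns {pe: {k: time}}.
    """
    schedule = {}
    for i, j, k in product(range(n), repeat=3):
        schedule.setdefault((i, j), {})[k] = i + j + k
    return schedule


def can_forward(n):
    """Check that operands can be passed between neighbouring PEs.

    A[i][k] is needed by every PE in row i and B[k][j] by every PE in
    column j. Forwarding works if each PE uses an operand exactly one
    cycle after its left (for A) or upper (for B) neighbour.
    """
    sched = space_time_map(n)
    for (i, j), times in sched.items():
        for k, t in times.items():
            if j > 0 and sched[(i, j - 1)][k] != t - 1:
                return False
            if i > 0 and sched[(i - 1, j)][k] != t - 1:
                return False
    return True


def io_pes(n, forward_operands):
    """Return the PEs that must read operands from outside the array."""
    if forward_operands and can_forward(n):
        # Only the left column (A enters) and top row (B enters) do I/O.
        return {(i, j) for (i, j) in space_time_map(n) if i == 0 or j == 0}
    # Without forwarding, every PE fetches its own operands (interior I/O).
    return set(space_time_map(n))


if __name__ == "__main__":
    n = 4
    print("PEs with I/O, no forwarding:  ", len(io_pes(n, False)))
    print("PEs with I/O, with forwarding:", len(io_pes(n, True)))
```

In this toy model, forwarding operands between neighbours confines external reads to the boundary PEs, which is the effect that interior I/O elimination aims for; how the paper's compiler decides when this is possible and which data-transfer modules it inserts is not covered by the sketch.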
Keywords
systolic array, interior I/O elimination, automation