In-Package Domain-Specific ASICs for Intel® Stratix® 10 FPGAs: A Case Study of Accelerating Deep Learning Using TensorTile ASIC (Abstract Only)

FPGA (2018)

Abstract
FPGAs or ASICs? This is a long-running debate. FPGAs are extremely flexible, while ASICs offer top efficiency but are inflexible. We believe that FPGAs and ASICs are better together, offering solutions that are both flexible and efficient. We propose single-package heterogeneous 2.5D integration of FPGAs and ASICs, using Intel's Embedded Multi-Die Interconnect Bridge (EMIB). Since the ASICs are separate chips from the FPGA, this approach (1) does not change the FPGA fabric, allowing re-use of existing ecosystems (FPGA chips, packaging, boards, software, etc.), and (2) allows freedom in the ASIC design (area, frequency, process, etc. are unconstrained by the FPGA fabric). This approach is more effective than developing traditional stand-alone ASICs. Intel® Stratix® 10 FPGAs already have EMIBs, enabling single-package integration with other chips, or "tiles". We propose leveraging them to mix and match any domain-specific ASICs with Stratix10 FPGAs. In particular, this work focuses on the deep learning (DL) domain, which demands efficient tensor (matrix/vector) operations. We propose TensorTile, a family of ASICs that complement Stratix10 FPGAs by executing tensor operations with ASIC efficiency, while utilizing the FPGA's flexibility for application-specific DL operations (e.g., Winograd). Our evaluation shows that: (1) a small TensorTile (tens of mm², 14 nm process) offers much better tensor throughput than a large Stratix10-2800 FPGA; (2) mixing and matching FPGAs and TensorTiles provides scalable solutions (e.g., from ~69 peak INT8 TOPS with one TensorTile plus a small Stratix10-400 FPGA, to ~194 FP16 TOPS with six TensorTiles plus a large Stratix10-2800); (3) the AlexNet performance and performance/Watt of Intel's DL OpenCL Stratix10 FPGA solution improve by 4x and 3.3x, respectively, when enhanced with two TensorTiles. Overall, this approach is an effective, versatile, and scalable solution.
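The two AlexNet figures quoted above can be combined with a line of arithmetic. The sketch below is not from the paper: it assumes that the 4x performance gain and the 3.3x performance/Watt gain refer to the same workload and the same FPGA-only baseline, in which case their quotient gives the implied change in power draw. The function name `implied_power_ratio` is a hypothetical helper introduced only for illustration.

```python
def implied_power_ratio(perf_gain: float, perf_per_watt_gain: float) -> float:
    """Return the power ratio implied by a speedup and a perf/Watt gain.

    Since perf/Watt gain = perf gain / power ratio, it follows that
    power ratio = perf gain / perf/Watt gain.
    Assumption (not stated in the abstract): both gains are measured
    against the same baseline on the same workload.
    """
    if perf_gain <= 0 or perf_per_watt_gain <= 0:
        raise ValueError("gains must be positive")
    return perf_gain / perf_per_watt_gain


# Figures quoted in the abstract for AlexNet with two TensorTiles.
ratio = implied_power_ratio(4.0, 3.3)
print(f"implied power ratio: {ratio:.2f}x")
```

Under that assumption, the enhanced configuration would draw roughly 1.2x the power of the FPGA-only baseline while delivering 4x the performance. The abstract itself reports no absolute power numbers, so this is only an inferred ratio.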