BFLOAT MLP Training Accelerator for FPGAs

2019 International Conference on ReConFigurable Computing and FPGAs (ReConFig)

Abstract
This paper describes the architecture of an FPGA-based high-performance training accelerator for neural networks. The accelerator uses a hybrid approach, combining embedded floating-point blocks with soft logic, to implement truncated floating-point datapaths, including bfloat16 and bfloat14. The proposed multi-layer perceptron (MLP) training architecture showcases a general methodology for developing high-performance accelerators written in OpenCL, incorporating a systolic-array GEMM engine with off-chip memory interfaces. The accelerator delivers 5 Tflops on a mid-range FPGA device and sustains over 90% of peak efficiency during training, demonstrating the versatility of FPGAs as neural network training accelerators.
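For context on the truncated formats the abstract names, the sketch below shows a software model of bfloat16 (1 sign bit, 8 exponent bits, 7 mantissa bits), which is simply the upper 16 bits of an IEEE 754 float32 with round-to-nearest-even applied to the dropped bits. This is only an illustration of the number format, not the paper's hardware datapath; the function names and the NaN-handling simplification are assumptions for the example.

```c
/*
 * Minimal software sketch of bfloat16 conversion, assuming the standard
 * layout: 1 sign bit, 8 exponent bits, 7 mantissa bits (the top 16 bits
 * of an IEEE 754 float32). Illustrative only; it does not model the
 * paper's FPGA datapath, and NaN payloads are not handled specially.
 */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* float32 -> bfloat16 with round-to-nearest-even on the discarded 16 bits */
static uint16_t f32_to_bf16(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof(bits));
    uint32_t lsb = (bits >> 16) & 1u;          /* lsb of the kept mantissa */
    uint32_t rounding_bias = 0x7FFFu + lsb;    /* ties round to even */
    return (uint16_t)((bits + rounding_bias) >> 16);
}

/* bfloat16 -> float32 by zero-extending the mantissa */
static float bf16_to_f32(uint16_t h) {
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof(f));
    return f;
}

int main(void) {
    float x = 3.14159265f;
    uint16_t bf = f32_to_bf16(x);
    printf("%.8f -> 0x%04x -> %.8f\n", x, bf, bf16_to_f32(bf));
    return 0;
}
```

The same truncation idea extends to narrower variants such as bfloat14 by keeping fewer mantissa bits while retaining the 8-bit exponent, which is what makes these formats cheap to map onto embedded floating-point blocks.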
Keywords
BFLOAT MLP training accelerator, FPGA, soft logic approach, truncated floating-point datapaths, multilayer perceptron training architecture, systolic-array GEMM engine, neural network training accelerators, high-performance training accelerator, OpenCL, off-chip memory interfaces, bfloat16, bfloat14