BFLOAT MLP Training Accelerator for FPGAs

2019 International Conference on ReConFigurable Computing and FPGAs (ReConFig)

Abstract
This paper describes the architecture of an FPGA-based high-performance training accelerator for neural networks. The accelerator uses a hybrid approach, combining embedded floating-point blocks with soft logic, to implement truncated floating-point datapaths, including bfloat16 and bfloat14. The proposed multi-layer perceptron (MLP) training architecture showcases a general methodology for developing high-performance accelerators written in OpenCL, incorporating a systolic-array GEMM engine with off-chip memory interfaces. The accelerator delivers 5 Tflops on a mid-range FPGA device and sustains over 90% of peak efficiency during training, demonstrating the versatility of FPGAs as neural network training accelerators.
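For context on the truncated formats the abstract names, the sketch below shows a software model of bfloat16 (1 sign bit, 8 exponent bits, 7 mantissa bits), which is simply the upper 16 bits of an IEEE 754 float32 with round-to-nearest-even applied to the dropped bits. This is only an illustration of the number format, not the paper's hardware datapath; the function names and the NaN-handling simplification are assumptions for the example.

```c
/*
 * Minimal software sketch of bfloat16 conversion, assuming the standard
 * layout: 1 sign bit, 8 exponent bits, 7 mantissa bits (the top 16 bits
 * of an IEEE 754 float32). Illustrative only; it does not model the
 * paper's FPGA datapath, and NaN payloads are not handled specially.
 */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* float32 -> bfloat16 with round-to-nearest-even on the discarded 16 bits */
static uint16_t f32_to_bf16(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof(bits));
    uint32_t lsb = (bits >> 16) & 1u;          /* lsb of the kept mantissa */
    uint32_t rounding_bias = 0x7FFFu + lsb;    /* ties round to even */
    return (uint16_t)((bits + rounding_bias) >> 16);
}

/* bfloat16 -> float32 by zero-extending the mantissa */
static float bf16_to_f32(uint16_t h) {
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof(f));
    return f;
}

int main(void) {
    float x = 3.14159265f;
    uint16_t bf = f32_to_bf16(x);
    printf("%.8f -> 0x%04x -> %.8f\n", x, bf, bf16_to_f32(bf));
    return 0;
}
```

The same truncation idea extends to narrower variants such as bfloat14 by keeping fewer mantissa bits while retaining the 8-bit exponent, which is what makes these formats cheap to map onto embedded floating-point blocks.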
Keywords
BFLOAT MLP training accelerator, FPGA, soft logic approach, truncated floating-point datapaths, multilayer perceptron training architecture, systolic-array GEMM engine, neural network training accelerators, high-performance training accelerator, OpenCL, off-chip memory interfaces, bfloat16, bfloat14