Dnnvm: End-To-End Compiler Leveraging Heterogeneous Optimizations On Fpga-Based Cnn Accelerators

CoRR(2020)

引用 71|浏览6
暂无评分
摘要
The convolutional neural network (CNN) has become a state-of-the-art method for several artificial intelligence domains in recent years. The increasingly complex CNN models are both computation-bound and I/O-bound. Field-programmable gate array-based accelerators driven by custom instruction set architecture (ISA) achieve a balance between generality and efficiency, but there is much on them left to be optimized. We propose the full-stack compiler deep neural network virtual machine (DNNVM), which is an integration of optimizers for graphs, loops and data layouts, an assembler, a runtime supporter, and a validation environment. The DNNVM works in the context of deep learning frameworks and transforms CNN models into the directed acyclic graph: XGraph. Based on XGraph, we transform the optimization challenges for both data layout and pipeline into graph-level problems. DNNVM enumerates all potentially profitable fusion opportunities by a heuristic subgraph isomorphism algorithm to leverage pipeline and data layout optimizations, and searches for the best choice of execution strategies of the whole computing graph. On the Xilinx ZU2@330 MHz and ZU9@330 MHz, we achieve equivalently state-of-the-art performance on our benchmarks by naive implementations without optimizations, and the throughput is further improved up to 1.26x by leveraging heterogeneous optimizations in DNNVM. Finally, with ZU9@330 MHz, we achieve state-of-the-art performance for VGG and ResNet50. We achieve a throughput of 2.82 TOPs/s and an energy efficiency of 123.7 GOPs/s/W for VGG. Additionally, we achieve 1.38 TOPs/s for ResNet50 and 1.41 TOPs/s for GoogleNet.
更多
查看译文
关键词
Hardware,Optimization,Layout,Field programmable gate arrays,Throughput,Computer architecture,Deep learning,Compiler,convolutional neural network (CNN),field-programmable gate array (FPGA),fusion,optimizations
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要