Design-Space Exploration of Quantized Transposed Convolutional Neural Networks for FPGA-based Systems-on-Chip

2022 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech), 2022

Abstract
With the shift of deep learning applications to Edge Computing devices, compression techniques have been introduced to minimize hardware use, power consumption, and latency. For example, quantization uses low numeric precision to represent inputs, parameters, and activations. Transposed Convolutions (TCONVs) provide neural networks with image up-sampling capabilities. However, the accuracy and performance trade-off of TCONV layers is under-explored, with existing works evaluating precision down to 8 bits but not below. This research systematically evaluates the impact of very low precision when a two-layer quantized decoder, built from TCONVs, is implemented within an FPGA-based System-on-Chip (SoC) architecture. We evaluate the impact of quantization on throughput and hardware cost, as well as the impact of parallelizing the computations of the TCONV layers using the same metrics. Results show that, when 4-bit data are processed, the circuit implemented on a Xilinx Zynq-7020 SoC uses only ~15% of the logic and ~7.5% of the on-chip memories, at the expense of a negligible ~2.5% accuracy loss with respect to the 8-bit counterpart. Furthermore, a 3.5× speed-up is observed when inputs are processed with 4× parallelism.
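To make the idea concrete, the sketch below shows a two-layer transposed-convolution decoder with uniform fake quantization of weights and activations. This is a minimal illustration, not the authors' implementation: the channel counts, kernel size, stride, and the max-based scale choice are hypothetical assumptions; only the 4-bit width and the two-layer TCONV structure follow the abstract.

```python
# Minimal sketch of a two-layer quantized TCONV decoder (assumed shapes).
import torch
import torch.nn as nn

def fake_quant(x: torch.Tensor, bits: int) -> torch.Tensor:
    """Uniformly quantize x to `bits`-bit signed levels, then dequantize."""
    qmax = 2 ** (bits - 1) - 1
    scale = x.abs().max().clamp(min=1e-8) / qmax
    return (x / scale).round().clamp(-qmax - 1, qmax) * scale

class QuantTConvDecoder(nn.Module):
    """Two TCONV layers; weights and activations fake-quantized to `bits`."""
    def __init__(self, bits: int = 4):
        super().__init__()
        self.bits = bits
        # Hypothetical shapes: 16 -> 8 -> 1 channels, 2x upsampling per layer.
        self.tconv1 = nn.ConvTranspose2d(16, 8, kernel_size=4, stride=2, padding=1)
        self.tconv2 = nn.ConvTranspose2d(8, 1, kernel_size=4, stride=2, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for tconv in (self.tconv1, self.tconv2):
            w = fake_quant(tconv.weight, self.bits)  # quantize weights
            x = nn.functional.conv_transpose2d(
                x, w, tconv.bias, stride=2, padding=1)
            x = fake_quant(torch.relu(x), self.bits)  # quantize activations
        return x

# Each TCONV layer doubles the spatial size: 7x7 -> 14x14 -> 28x28.
decoder = QuantTConvDecoder(bits=4)
out = decoder(torch.randn(1, 16, 7, 7))
print(out.shape)  # torch.Size([1, 1, 28, 28])
```

On an FPGA the quantized values would be stored and processed as narrow integers; the float-in/float-out fake quantization above only emulates that numeric effect for accuracy evaluation.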
Keywords
transposed convolution layers,quantization,field programmable gate arrays (FPGAs),reconfigurable systems-on-chip