Hardware-Software Co-Design Implementation of Fixed-Point GoogleNet on SoC Using Xilinx Vitis

Mohamed A. Elhewehy, Karim O. Abbass, Omar A. Nasr

2023 5th Novel Intelligent and Leading Emerging Sciences Conference (NILES) (2023)

Abstract
Convolutional neural networks (CNNs) have gained significant popularity in recent years due to their effectiveness in applications such as image recognition and classification. Field-programmable gate arrays (FPGAs) have become increasingly attractive compared with GPUs and CPUs because of their energy efficiency, high throughput, and scalability. However, CNNs are computationally intensive and require large amounts of memory, which makes implementing them on resource-limited devices such as FPGAs challenging. High-Level Synthesis (HLS) shortens design time, reduces the programming workload, and improves FPGA design efficiency. In addition, the HLS co-design flow enables faster design exploration and optimization, facilitating rapid prototyping and convenient modifications. In this paper, a Hardware/Software (HW/SW) co-design approach is introduced for implementing GoogleNet, a popular CNN architecture, on the Xilinx Zynq UltraScale+ MPSoC ZCU102 Evaluation Kit using the Xilinx Vitis tool. The approach offloads the most computationally intensive components to the FPGA, while the remaining parts of the network run on an embedded Central Processing Unit (CPU). The model is then converted to fixed-point arithmetic using post-training quantization, and different HLS optimizations are applied, reducing hardware resource usage while achieving low power. Experimental results show that the model maintains high accuracy while significantly reducing the hardware resources required for FPGA implementation. With 20-bit fixed-point data precision, the design has a total on-chip power consumption of 2.49 W and uses fewer hardware resources than the corresponding RTL accelerator.
Keywords
Convolutional Neural Networks (CNNs), Field programmable gate arrays (FPGA), High Level Synthesis (HLS), Hardware Accelerators, Loop Tiling, Fixed-point, Post Training Quantization (PTQ), Vitis, Vitis HLS, Vivado, GoogleNet, Inception, Image Classification
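
The post-training quantization step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `quantize_fixed`, the round-to-nearest and saturation behavior, and the split of the 20-bit word into 12 fractional bits are all assumptions made for illustration, since the abstract states only the 20-bit total width.

```python
def quantize_fixed(values, total_bits=20, frac_bits=12):
    """Map floats to signed fixed-point codes and back.

    total_bits=20 follows the abstract; frac_bits=12 is an assumed split
    (the source does not state the integer/fraction allocation).
    Returns (integer codes, dequantized floats).
    """
    scale = 1 << frac_bits                 # value of one unit in the integer code
    lo = -(1 << (total_bits - 1))          # most negative representable code
    hi = (1 << (total_bits - 1)) - 1       # most positive representable code
    # Round to the nearest code, then saturate instead of wrapping on overflow.
    codes = [max(lo, min(hi, round(v * scale))) for v in values]
    dequantized = [c / scale for c in codes]
    return codes, dequantized


# Hypothetical weights: two in range, one large enough to saturate.
codes, approx = quantize_fixed([0.5, -0.25, 1000.0])
```

In-range values are reproduced to within half of one least-significant bit (here 2^-13), while out-of-range values clip to the largest representable magnitude. Choosing the fractional width per layer so that clipping is rare is the usual trade-off such a scheme has to balance.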