NASA-F: FPGA-Oriented Search and Acceleration for Multiplication-Reduced Hybrid Networks

IEEE Transactions on Circuits and Systems I: Regular Papers (2024)

Abstract
Costly multiplications challenge the deployment of modern deep neural networks (DNNs) on resource-constrained devices. To promote hardware efficiency, prior works have built multiplication-free models. However, these are generally inferior in accuracy to their multiplication-based counterparts, calling for multiplication-reduced hybrid models that marry the benefits of both approaches. To achieve this goal, recent works, i.e., NASA and NASA+, have developed Neural Architecture Search (NAS) and Acceleration frameworks to search for and accelerate such hybrid models via a tailored differentiable NAS (DNAS) engine and dedicated ASIC-based accelerators. In this paper, we delve deeper into the inherent advantages of FPGAs and present an enhanced approach called NASA-F, which focuses on FPGA-oriented search and acceleration for hybrid models. Specifically, on the algorithm level, we develop a tailored one-shot supernet-based NAS engine to streamline the search for hybrid models, eliminating the need to execute NAS for each deployment as well as additional training/finetuning steps. On the hardware level, we develop a chunk-based accelerator to fully leverage the diverse hardware resources available on FPGAs for the acceleration of heterogeneous layers in hybrid models, aiming to enhance both hardware utilization and throughput. Extensive experimental results consistently validate the superiority of our NASA-F framework, e.g., we achieve ↑0.67% top-1 accuracy over the prior work NASA on CIFAR100, even without additional training steps for searched models. Additionally, we achieve up to ↑1.86× throughput and ↑2.16× FPS with ↑0.39% top-1 accuracy over the state-of-the-art multiplication-based system on Tiny-ImageNet. Code is available at https://github.com/shihuihong214/NASA-F.
Keywords
Multiplication-reduced hybrid networks, neural architecture search, chunk-based accelerator, FPGA accelerator, algorithm-hardware co-design