Energy Efficient DNN Compaction for Edge Deployment

Applied Reconfigurable Computing. Architectures, Tools, and Applications (ARC 2023)

Abstract
Deep Neural Networks (DNNs) are popular deep learning models with numerous learnable parameters that must be stored and processed during both the training and inference phases. Deploying these models on mobile and edge devices with limited hardware resources and power budgets is therefore a significant challenge, and compacting DNN models is essential to meet real-time requirements and achieve energy efficiency. This paper proposes a fixed-partition compaction technique that exploits runs of consecutive zero and non-zero weights/parameters in sparse DNN models. The approach reduces memory storage requirements, memory transactions, and computation for DNNs. We implemented convolution and fully connected layers with the compacted weights on a Virtex-7 FPGA (VC707). Our experiments demonstrate that compact layers achieve better performance and energy efficiency than layers without compaction. Across several convolution configurations, the compact convolution layers achieved average performance improvements of 32.51% and 29.43% over state-of-the-art SMM and direct convolution, respectively, along with energy consumption reductions of 34.14% over SMM and 29.58% over direct convolution. The compact fully connected layers achieved an average performance improvement of 26.61% and an energy consumption reduction of 30.85% over layers without compaction.
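The abstract describes the compaction scheme only at a high level. As an illustration, the sketch below shows one plausible reading of fixed-partition compaction: a pruned weight array is split into fixed-size partitions, and each partition is encoded as alternating (zero-run length, non-zero values) pairs, so storage and memory traffic scale with the non-zero content rather than the dense size. The function name, partition size, and encoding format are assumptions for illustration, not the authors' actual implementation.

```python
import numpy as np

def compact_fixed_partition(weights, partition_size=8):
    """Hypothetical sketch of fixed-partition compaction for a sparse
    1-D weight array: each fixed-size partition is encoded as a list of
    (zero-run length, non-zero values) pairs. The paper's real encoding
    and hardware format may differ."""
    # Pad so the array divides evenly into fixed-size partitions.
    pad = (-len(weights)) % partition_size
    w = np.concatenate([weights, np.zeros(pad, dtype=weights.dtype)])

    compacted = []
    for p in w.reshape(-1, partition_size):
        runs = []
        i = 0
        while i < partition_size:
            # Count the run of consecutive zeros.
            z = 0
            while i < partition_size and p[i] == 0:
                z += 1
                i += 1
            # Collect the run of consecutive non-zeros that follows.
            nz = []
            while i < partition_size and p[i] != 0:
                nz.append(float(p[i]))
                i += 1
            runs.append((z, nz))
        compacted.append(runs)
    return compacted

# Example: a pruned (sparse) weight vector.
w = np.array([0, 0, 1.5, 0, 0, 0, -0.7, 0.2, 0, 0, 0, 0, 3.1, 0, 0, 0])
for k, part in enumerate(compact_fixed_partition(w, partition_size=8)):
    print(f"partition {k}: {part}")
```

Fixing the partition size bounds the decoder's state per partition, which is one reason such an encoding maps naturally onto FPGA compute units, though the paper's specific hardware design is not reproduced here.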
Keywords
Compact convolution, Pruning, DNN compaction, Energy efficiency