A Fast, Accurate, and Comprehensive PPA Estimation of Convolutional Hardware Accelerators

IEEE Transactions on Circuits and Systems I: Regular Papers(2022)

引用 1|浏览13
暂无评分
摘要
Convolutional Neural Networks (CNN) are widely adopted for Machine Learning (ML) tasks, such as classification and computer vision. GPUs became the reference platforms for both training and inference phases of CNNs due to their tailored architecture to the CNN operators. However, GPUs are power-hungry architectures. A path to enable the deployment of CNNs in energy-constrained devices is adopting hardware accelerators for the inference phase. However, the literature presents gaps regarding analyses and comparisons of these accelerators to evaluate Power-Performance-Area (PPA) trade-offs. Typically, the literature estimates PPA from the number of executed operations during the inference phase, such as the number of MACs, which may not be a good proxy for PPA. Thus, it is necessary to deliver accurate hardware estimations, enabling design space exploration (DSE) to deploy CNNs according to the design constraints. This work proposes a fast and accurate DSE approach for CNNs using an analytical model fitted from the physical synthesis of hardware accelerators. The model is integrated with CNN frameworks, like TensorFlow, to generate accurate results. The analytic model estimates area, performance, power, energy, and memory accesses. The observed average error comparing the analytical model to the data obtained from the physical synthesis is smaller than 7%.
更多
查看译文
关键词
CNN,convolutional hardware accelerator,power-performance-area (PPA) estimation,design space exploration (DSE)
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要