Mixed Precision Neural Architecture Search for Energy Efficient Deep Learning

2019 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)

Abstract
Large-scale deep neural networks (DNNs) have achieved remarkable success in various artificial intelligence applications. However, the high computational complexity and energy cost of DNNs impede their deployment on edge devices with a limited energy budget. Two major approaches have been investigated for learning compact and energy-efficient DNNs. Neural architecture search (NAS) enables the design automation of neural network structures to achieve both high accuracy and energy efficiency. The other, model quantization, leverages low-precision representation and arithmetic to trade off efficiency against accuracy. Although NAS and quantization are both critical components of DNN design closure, limited research has considered them jointly. In this paper, we propose a new methodology to perform end-to-end joint optimization over the neural architecture and quantization space. Our approach searches for the optimal combinations of architectures and precisions (bit-widths) to directly optimize both prediction accuracy and hardware energy consumption. Our framework improves and automates the flow from neural architecture design to hardware deployment. Experimental results demonstrate that our proposed approach achieves better energy efficiency than advanced quantization approaches and efficiency-aware NAS methods on CIFAR-100 and ImageNet. We study different search and quantization policies, and offer insights for both neural architecture and hardware design.
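The precision/efficiency trade-off the abstract describes can be illustrated with a minimal sketch of symmetric uniform quantization at a configurable bit-width. This is a common quantizer choice used here only for illustration; the paper's exact quantization scheme and search procedure are not specified by this abstract.

```python
def quantize_uniform(w, bits):
    """Quantize a list of weights to a symmetric signed `bits`-bit grid,
    then dequantize back to floats. Illustrative only: the paper searches
    over per-layer bit-widths; its quantizer details may differ."""
    qmax = 2 ** (bits - 1) - 1                # e.g. 127 for 8-bit
    max_abs = max(abs(x) for x in w) or 1.0   # avoid divide-by-zero
    scale = max_abs / qmax                    # grid step size
    # Round to the nearest grid point, clip to range, map back to floats.
    return [max(-qmax, min(qmax, round(x / scale))) * scale for x in w]

# Fewer bits mean a coarser grid (cheaper hardware) but larger error,
# which is the accuracy/energy trade-off the joint search navigates:
w = [0.9, -0.35, 0.12, 0.6, -0.8]
for b in (8, 4, 2):
    err = sum((a - q) ** 2 for a, q in zip(w, quantize_uniform(w, b)))
```

A mixed-precision NAS framework in this spirit would treat `bits` (per layer) as part of the search space alongside architectural choices, scoring each candidate on both accuracy and an energy model.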
Keywords
mixed precision neural architecture search, energy efficient deep learning, artificial intelligence applications, high computational complexity, energy costs, edge devices, energy budget, energy-efficient DNNs, design automation, neural network structures, model quantization, low-precision representation, DNN design closure, hardware energy consumption, neural architecture design, hardware deployment, efficiency-aware NAS methods, hardware designs, large scale deep neural networks, end-to-end joint optimization