Training Quantized Nets with Adaptive Shared Exponents Based on Statistical Distributions

Katsuhiro Yoda, Wataru Kanemori, Mitsuru Tomono, Makiko Ito, Teruo Ishihara

Proceedings of the International Conference on Pattern Recognition and Artificial Intelligence (ICPRAI 2018), 2018

Abstract
Recently, Deep Neural Networks (DNNs) have dramatically improved performance in computer vision, speech recognition, and natural language processing. However, training DNNs requires a large amount of computational resources, and the power consumption of training servers continues to grow in proportion to the expansion of DNN size. A method of reducing the power consumption required for DNN training is therefore needed. In this paper, we propose a method of training DNNs with a reduced bit-width representation to improve the energy efficiency of arithmetic operations and to reduce the required memory size. Our quantization method extracts the characteristics of the DNN parameters as statistical distributions, in parallel with the arithmetic operations of DNN training, and uses the collected distributions to determine the shared exponents. We applied our method to the training of LeNet, VGG8, AlexNet, and long short-term memory (LSTM) networks and found that its accuracy is equivalent to that of the 32-bit floating-point method.
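The abstract does not spell out how the collected statistics map to a shared exponent, so the following is only a minimal NumPy sketch of the general idea: one exponent is chosen per tensor so that a representative magnitude drawn from the observed distribution still fits in a narrow integer mantissa. The function names, the 99.9th-percentile statistic, and the 8-bit mantissa width are illustrative assumptions, not the paper's actual procedure.

```python
import numpy as np

def shared_exponent_from_stats(abs_values, mantissa_bits=8, percentile=99.9):
    """Pick one exponent for a whole tensor from the distribution of magnitudes.

    abs_values    -- observed |x| values collected alongside training (assumed statistic)
    mantissa_bits -- width of the signed integer mantissa (illustrative: 8 bits)
    percentile    -- magnitude percentile that must remain representable (assumed)
    """
    # Use a high percentile of the distribution instead of the absolute maximum,
    # so that rare outliers do not dominate the choice of exponent.
    ref = np.percentile(abs_values, percentile)
    ref = max(ref, np.finfo(np.float32).tiny)  # guard against an all-zero tensor
    max_mant = 2 ** (mantissa_bits - 1) - 1
    # Choose exponent e such that ref / 2**e fits into the mantissa range.
    return int(np.ceil(np.log2(ref / max_mant)))

def quantize_with_shared_exponent(x, exponent, mantissa_bits=8):
    """Quantize a float tensor to integer mantissas sharing one exponent."""
    scale = 2.0 ** exponent
    max_mant = 2 ** (mantissa_bits - 1) - 1
    mant = np.clip(np.round(x / scale), -max_mant - 1, max_mant).astype(np.int8)
    return mant

def dequantize(mant, exponent):
    """Recover approximate float values: x ~ mantissa * 2**exponent."""
    return mant.astype(np.float32) * (2.0 ** exponent)

# Example: quantize one layer's weights to 8-bit mantissas plus a shared exponent.
w = np.random.randn(256, 128).astype(np.float32) * 0.05
e = shared_exponent_from_stats(np.abs(w).ravel())
w_q = quantize_with_shared_exponent(w, e)
w_hat = dequantize(w_q, e)
print("shared exponent:", e, "max abs error:", np.abs(w - w_hat).max())
```

Because the statistics are gathered in parallel with training, a scheme like this can adapt the shared exponent per tensor and per iteration without an extra pass over the data; the exact statistic the paper uses may differ from the percentile shown here.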
Keywords
shared exponents, quantization, floating-point, fixed-point, deep learning, CNN, LSTM