Training Deep Neural Networks in 8-bit Fixed Point with Dynamic Shared Exponent Management

Hisakatsu Yamaguchi, Makiko Ito, Katsuhiro Yoda, Atsushi Ike

DATE 2021

Abstract
The increase in complexity and depth of deep neural networks (DNNs) has created a strong need to improve computing performance. Quantization methods for training DNNs can effectively improve the computation throughput and energy efficiency of hardware platforms. We have developed an 8-bit quantization training method that represents the weight, activation, and gradient tensors in an 8-bit fixed-point data format. The shared exponent for each tensor is managed dynamically on the basis of the distribution of tensor elements computed in the previous training phase rather than the current one, which improves computation throughput. This method provides up to 3.7 times the computation throughput of FP32 computation without accuracy degradation.
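The abstract only sketches the idea of a per-tensor shared exponent chosen from statistics gathered in the previous training phase. Below is a minimal illustrative sketch, not the authors' implementation: the function names (shared_exponent_from_stats, quantize_fixed_point) and the use of the previous iteration's maximum absolute value as the statistic are assumptions for illustration; the paper's actual exponent-management rule may differ.

```python
import numpy as np

def shared_exponent_from_stats(prev_max_abs, n_bits=8):
    """Pick a shared exponent so values up to prev_max_abs fit the signed n-bit range.

    prev_max_abs is assumed to be recorded during the *previous* training phase,
    so no extra pass over the current tensor is needed before quantizing it.
    """
    q_max = 2 ** (n_bits - 1) - 1  # 127 for signed 8-bit
    if prev_max_abs == 0:
        return 0
    # smallest exponent e such that prev_max_abs / 2**e <= q_max
    return int(np.ceil(np.log2(prev_max_abs / q_max)))

def quantize_fixed_point(tensor, shared_exp, n_bits=8):
    """Quantize a float tensor to n-bit fixed point with one shared exponent."""
    q_max = 2 ** (n_bits - 1) - 1
    scale = 2.0 ** shared_exp
    # Values exceeding the stale statistic saturate rather than overflow.
    q = np.clip(np.round(tensor / scale), -q_max - 1, q_max).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

# Usage: the exponent for step t comes from step t-1's observed statistics.
prev_max_abs = 0.73                                   # hypothetical recorded value
w = np.random.randn(256, 256).astype(np.float32) * 0.25
exp_t = shared_exponent_from_stats(prev_max_abs)
w_q, scale = quantize_fixed_point(w, exp_t)
w_hat = dequantize(w_q, scale)
print("shared exponent:", exp_t, "max abs error:", np.abs(w - w_hat).max())
```

Deriving the exponent from already-available statistics is what removes the extra reduction pass over the current tensor; elements that happen to exceed the stale range simply saturate at the 8-bit limits.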
Keywords
deep neural networks, dynamic shared exponent management, computing performance, quantization methods, training DNNs, computation throughput, energy efficiency, hardware platforms, 8-bit quantization training method, gradient tensors, 8-bit fixed point data format, tensor elements, previous training phase, current training phase, FP32 computation