O-2A: Low Overhead DNN Compression with Outlier-Aware Approximation

Proceedings of the 2020 57th ACM/EDAC/IEEE Design Automation Conference (DAC), 2020

Abstract
We present Outlier-Aware Approximation (O-2A) coding, a low-latency DNN compression technique that reduces DRAM energy, a significant component of DNN inference. The technique compresses 8-bit integers, the de facto standard for DNN inference, to 6 bits without degrading DNN accuracy. The hardware for O-2A coding can be easily embedded in DRAM controllers owing to its small overhead. On an Eyeriss platform, O-2A coding improves both DRAM energy and system performance by 18~20%. O-2A coding also enables an error-correction scheme without additional parity overhead, opening the possibility of an approximate DRAM that simultaneously reduces DRAM access and refresh energy.
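The abstract does not detail how O-2A coding represents the compressed values. Below is a minimal conceptual sketch of a generic outlier-aware scheme, assuming the common approach of storing most values in the reduced 6-bit width and keeping the rare large-magnitude "outliers" at full 8-bit precision in a small side stream. The function names (o2a_like_compress, o2a_like_decompress) and the exact outlier handling are hypothetical illustrations, not the paper's actual coding.

```python
import numpy as np

def o2a_like_compress(x_int8, keep_bits=6):
    """Illustrative outlier-aware compression of a 1-D int8 tensor.

    Values whose magnitude fits in (keep_bits - 1) signed bits are kept in
    the reduced-width stream; the rest ("outliers") are stored at full
    8-bit precision together with their positions. This is a sketch of the
    general idea only, not the paper's O-2A coding.
    """
    limit = 2 ** (keep_bits - 1)                      # e.g. 32 for 6-bit signed
    is_outlier = (x_int8 >= limit) | (x_int8 < -limit)
    inliers = np.where(is_outlier, 0, x_int8).astype(np.int8)  # would be bit-packed in hardware
    outlier_idx = np.flatnonzero(is_outlier)
    outlier_val = x_int8[is_outlier]
    return inliers, outlier_idx, outlier_val

def o2a_like_decompress(inliers, outlier_idx, outlier_val):
    """Restore the original 8-bit tensor from the two streams."""
    x = inliers.copy()
    x[outlier_idx] = outlier_val
    return x

# Round-trip check on a toy activation tensor.
rng = np.random.default_rng(0)
acts = rng.normal(0, 12, size=1024).clip(-128, 127).astype(np.int8)
packed = o2a_like_compress(acts)
assert np.array_equal(o2a_like_decompress(*packed), acts)
```

In such schemes the compression ratio depends on how rare the outliers are; the paper's contribution is a coding that keeps this overhead small enough to embed in a DRAM controller.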
Keywords
Low-latency DNN compression, Approximation, DRAM