Using Deep Compression on PyTorch Models for Autonomous Systems

Signal Processing and Communications Applications Conference (SIU), 2022

Abstract
Applications of artificial neural networks on low-cost embedded systems and microcontrollers (MCUs) have recently been attracting more attention than ever. Since MCUs have limited memory capacity and limited compute speed compared to workstations, deploying current deep learning algorithms on MCUs becomes practical with the help of model compression, which makes MCUs a common and practical alternative for autonomous systems. In this paper, we add model compression, specifically Deep Compression, to an existing work that efficiently deploys PyTorch models on MCUs, in order to increase neural network speed and save electrical power. First, we prune the weight values close to zero in the convolutional and fully connected layers. Second, the remaining weights and activations are quantized from 32-bit floating point to 8-bit integers. Finally, the forward-pass functions are compressed using special data structures for sparse matrices, which store only the nonzero weights. For the LeNet-5 model, the memory footprint was reduced by 12.5x and the inference speed was boosted by 2.6x.
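The three steps the abstract describes (magnitude pruning, 8-bit linear quantization, and a sparse forward pass over only the nonzero weights) can be sketched as follows. This is a minimal illustration with numpy, not the paper's implementation: the threshold value, the per-tensor quantization scale, and the CSR layout are assumptions chosen for clarity.

```python
import numpy as np

def prune_weights(w, threshold=0.05):
    """Step 1: zero out weights whose magnitude is below a threshold (assumed value)."""
    w = w.copy()
    w[np.abs(w) < threshold] = 0.0
    return w

def quantize_int8(w):
    """Step 2: linearly quantize float32 weights to int8 with one per-tensor scale."""
    max_abs = np.abs(w).max()
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def to_csr(q):
    """Step 3a: store only nonzero int8 weights in CSR form
    (values, column indices, row pointers)."""
    values, cols, indptr = [], [], [0]
    for row in q:
        nz = np.nonzero(row)[0]
        values.extend(row[nz])
        cols.extend(nz)
        indptr.append(len(values))
    return (np.array(values, dtype=np.int8),
            np.array(cols, dtype=np.int64),
            np.array(indptr, dtype=np.int64))

def csr_matvec(values, cols, indptr, scale, x):
    """Step 3b: sparse forward pass y = W @ x touching only nonzero weights."""
    y = np.zeros(len(indptr) - 1, dtype=np.float32)
    for i in range(len(y)):
        start, end = indptr[i], indptr[i + 1]
        y[i] = scale * np.dot(values[start:end].astype(np.float32),
                              x[cols[start:end]])
    return y
```

In a real MCU deployment the CSR arrays would be emitted as constant C tables, and the dot product would run in int8/int32 arithmetic with the scale applied once per output; the float accumulation here is only for readability.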
Keywords
PyTorch models,autonomous systems,artificial neural networks,low-cost embedded systems,microcontrollers,MCUs,memory capacity,deep learning algorithms,model compression,deep compression,LeNet-5 model,fully connected layers,convolutional layers,sparse matrices,data structures