How to Obtain and Run Light and Efficient Deep Learning Networks
2019 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)(2019)
Abstract
As the model size of deep neural networks (DNNs) grows for better performance, the increase in computational cost associated with training and testing makes it extremely difficult to deploy DNNs on end/edge devices with limited resources while also satisfying response-time requirements. To address this challenge, model compression, which shrinks model size and thus reduces computation cost, is widely adopted in the deep learning community. However, these algorithm-level solutions often ignore the practical impacts of hardware design, such as the increase in random accesses to the memory hierarchy and the constraints of memory capacity. On the other hand, limited understanding of the computational needs at the algorithm level may lead to unrealistic assumptions during hardware design. In this work, we discuss this mismatch and show how our approach addresses it through an interactive design practice spanning both the software and hardware levels.
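The paper does not specify its compression method in the abstract; as an illustrative sketch of one common compression technique it alludes to, the snippet below implements simple magnitude-based weight pruning (zeroing the smallest-magnitude weights), which reduces multiply–accumulate work but, as the abstract notes, can also introduce irregular memory-access patterns that hardware designs must account for. All names here are hypothetical and not taken from the paper.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of a flat weight list.

    Illustrative only: real compression pipelines operate on full tensors,
    combine pruning with quantization or retraining, and must consider the
    hardware-level effects (e.g., sparse access patterns) the paper discusses.
    """
    k = int(sparsity * len(weights))
    if k == 0:
        return list(weights)
    # Threshold = k-th smallest absolute value; weights at or below it are zeroed.
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

# Hypothetical weight values for demonstration.
weights = [0.9, -0.05, 0.4, -0.7, 0.02, 0.31, -0.12, 0.65]
pruned = magnitude_prune(weights, 0.5)
# Half the weights become zero, cutting dense compute proportionally,
# but the surviving nonzeros are scattered, illustrating the irregular
# memory accesses that a hardware accelerator must handle.
```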
Keywords
deep neural networks,model compression,hardware accelerator