BlinkNet: Software-Defined Deep Learning Analytics with Bounded Resources.

Brian Koga, Theresa VanderWeide, Xinghui Zhao, Xuechen Zhang

ICMLA (2022)

Abstract
Deep neural networks (DNNs) have recently gained unprecedented success in various domains. In resource-constrained edge systems (e.g., mobile devices and IoT devices), QoS-aware DNNs are required to meet the latency and memory/storage requirements of mission-critical deep learning applications. However, no existing DNN has been designed to satisfy both the latency and memory bounds that end-users specify in resource-constrained systems. This paper proposes a runtime system, BlinkNet, which can guarantee both latency and memory/storage bounds for one or multiple DNNs via efficient QoS-aware per-layer approximation. We implement BlinkNet in Apache TVM and evaluate it using the CaffeNet, CIFAR-10-quick, and VGG16 network models on both CPU and GPU platforms. Our experimental results on real-world datasets show that BlinkNet can enforce various latency and memory bounds set by end-users.
Keywords
DNN Model Approximation, Quality of Services