Condense: A Framework for Device and Frequency Adaptive Neural Network Models on the Edge


With the growing popularity of battery-powered edge computing, an important yet under-explored problem is supporting deep neural networks (DNNs) across diverse edge devices. On the one hand, edge platforms differ in their runtime requirements and in their computation and memory capabilities. Deploying the same DNN model on all of them is unsatisfactory, while designing a specialized DNN for each platform is prohibitively expensive. On the other hand, a single edge device uses dynamic voltage and frequency scaling (DVFS) to prolong battery life, which causes significant inference speed variation for the same DNN and, consequently, a poor user experience. To tackle both problems, we propose Condense, a framework that provides a single adaptive model that can be reconfigured instantly (switched among sub-networks with different computation and parameter counts) for diverse devices and execution frequencies without any retraining. Experiments demonstrate that Condense simultaneously provides a large number of high-accuracy sub-networks, whose computation and parameter counts correspond to various sparsity ratios, to support diverse edge devices with different runtime requirements, and reduces the speed variation under varying frequencies on each device, all at the memory cost of a single set of weights.
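The idea of serving many sparsity ratios from one set of weights, and switching among them as the clock frequency changes, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the abstract does not say how Condense derives its sub-networks, so magnitude-based masking stands in for it, and the function names (`magnitude_mask`, `sub_network`, `pick_sparsity`) and the linear latency model are hypothetical.

```python
from typing import List, Sequence


def magnitude_mask(weights: Sequence[float], sparsity: float) -> List[int]:
    """Return a 0/1 mask that keeps the largest-magnitude weights.

    Assumption: magnitude pruning is a stand-in; the paper's actual
    sub-network selection rule is not described in the abstract.
    """
    n_keep = len(weights) - int(round(sparsity * len(weights)))
    # Rank indices by |w|, largest first; ties are broken by index.
    order = sorted(range(len(weights)), key=lambda i: (-abs(weights[i]), i))
    keep = set(order[:n_keep])
    return [1 if i in keep else 0 for i in range(len(weights))]


def sub_network(weights: Sequence[float], sparsity: float) -> List[float]:
    """Derive a sparse sub-network view from the single shared weight set."""
    mask = magnitude_mask(weights, sparsity)
    return [w if m else 0.0 for w, m in zip(weights, mask)]


def pick_sparsity(freq_ghz: float, latency_budget_ms: float,
                  dense_cost_ms_at_1ghz: float,
                  ratios: Sequence[float]) -> float:
    """Pick the lowest sparsity whose estimated latency fits the budget.

    Hypothetical latency model: time scales with the fraction of weights
    kept and inversely with clock frequency. Real devices need profiling.
    """
    for s in sorted(ratios):
        estimated_ms = dense_cost_ms_at_1ghz * (1.0 - s) / freq_ghz
        if estimated_ms <= latency_budget_ms:
            return s
    return max(ratios)  # nothing fits: fall back to the sparsest option


# One shared set of weights; sub-networks are masks over it, not copies.
shared = [0.9, -0.1, 0.4, -0.7, 0.05, 0.3, -0.2, 0.6]
ratios = (0.0, 0.25, 0.5, 0.75)
for freq in (2.0, 1.0, 0.5):
    s = pick_sparsity(freq, 25.0, 40.0, ratios)
    print(freq, s, sub_network(shared, s))
```

In this toy setup, lowering the frequency pushes the selection toward a sparser sub-network so that the estimated latency stays within the same budget, which is the kind of speed-variation reduction the abstract describes. Only the masks differ between configurations, so memory holds a single set of weights.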
Keywords: battery-powered edge computing, computation-memory capabilities, consequently poor user experience, diverse edge devices, DNN model, DVFS, frequency adaptive neural network models, high-accuracy sub-networks, significant inference speed variation, single adaptive model, single edge device, specialized DNN, under-explored problem