Kalmia: A Heterogeneous QoS-aware Scheduling Framework for DNN Tasks on Edge Servers

IEEE INFOCOM 2022 - IEEE Conference on Computer Communications (2022)

Abstract
Motivated by the popularity of edge intelligence, DNN services have been widely deployed at the edge, posing significant performance pressure on edge servers. How to improve the QoS of edge DNN services becomes a crucial and challenging problem. Previous works, however, did not fully consider the heterogeneous QoS requirements on urgent and non-urgent tasks, causing frequent QoS violations. Meanwhile, our empirical study shows that severe task interference exists in concurrent DNN tasks, further degrading the timeliness of urgent tasks and throughput of non-urgent tasks. To address these issues, we propose Kalmia, a heterogeneous QoS-aware framework for DNN inference task scheduling on edge servers. Specifically, Kalmia includes an offline profiling stage and an online scheduling policy. In offline profiling, we build a regression model to predict the execution time of tasks. During online scheduling, we classify the tasks into urgent and non-urgent tasks and distribute them into two CUDA contexts. By a tailored scheduling strategy, non-urgent tasks can fully utilize the computing resources for throughput improvement, while the timeliness of urgent tasks can be guaranteed via preemption. Experimental results demonstrate that Kalmia can achieve up to 2.8× improvement in throughput and significantly reduce the deadline violation rate compared with state-of-the-art methods.
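The two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not give these details, so everything specific here is an assumption: the single profiling feature (`batch_size`), the one-variable linear regression, the slack-based urgency rule and its `slack_threshold_ms` parameter, and the two in-memory queues that stand in for the two CUDA contexts. Serving the urgent queue first is only a rough stand-in for preemption.

```python
from collections import deque
from dataclasses import dataclass


def fit_linear(xs, ys):
    """Offline profiling (assumed form): least-squares fit of y = a * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


@dataclass
class Task:
    name: str
    batch_size: int     # hypothetical profiling feature
    deadline_ms: float  # hypothetical relative deadline, measured from now


def classify(tasks, model, slack_threshold_ms):
    """Online step (assumed rule): a task is urgent if its slack is small.

    Slack is the deadline minus the predicted execution time. The two
    deques stand in for the urgent and non-urgent CUDA contexts.
    """
    a, b = model
    urgent, non_urgent = deque(), deque()
    for task in tasks:
        predicted_ms = a * task.batch_size + b
        slack_ms = task.deadline_ms - predicted_ms
        if slack_ms < slack_threshold_ms:
            urgent.append(task)
        else:
            non_urgent.append(task)
    return urgent, non_urgent


def dispatch_order(urgent, non_urgent):
    """Serve every urgent task before any non-urgent one."""
    return [t.name for t in urgent] + [t.name for t in non_urgent]
```

With made-up profiling samples such as `fit_linear([1, 2, 4, 8], [12, 14, 18, 26])`, the fitted model can be passed to `classify` along with a list of `Task` objects, and `dispatch_order` then returns the task names with the urgent ones first. A real system would need measured profiles, a richer feature set, and actual GPU context management.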
Keywords
Task offloading, QoS-aware scheduling, DNN services, edge computing, edge intelligence