AuRORA: Virtualized Accelerator Orchestration for Multi-Tenant Workloads.

Annual IEEE/ACM International Symposium on Microarchitecture (2023)

Abstract
With the widespread adoption of deep neural networks (DNNs) across applications, there is a growing demand for DNN deployment solutions that can seamlessly support multi-tenant execution. This involves simultaneously running multiple DNN workloads on heterogeneous architectures with domain-specific accelerators. However, existing accelerator interfaces directly bind the accelerator’s physical resources to user threads, without an efficient mechanism to adaptively re-partition available resources. This leads to high programming complexity and performance overheads due to sub-optimal resource allocation, making scalable many-accelerator deployment impractical. To address this challenge, we propose AuRORA, a novel accelerator integration methodology that enables scalable accelerator deployment for multi-tenant workloads. In particular, AuRORA supports virtualized accelerator orchestration by co-designing the hardware-software stack of accelerators so that current workloads can be adaptively bound onto available accelerators. We demonstrate that AuRORA achieves 2.02× higher overall SLA satisfaction, 1.33× overall system throughput, and 1.34× overall fairness compared to existing accelerator integration solutions, with less than 2.7% area overhead.
Keywords
Multi-tenant system, Multi-core, Accelerators, Resource Management, SoC Integration, Microarchitecture, Machine Learning