Efficient Job Offloading in Heterogeneous Systems through Hardware-assisted Packet-based Dispatching and User-level Runtime Infrastructure

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (2020)

Abstract
Emerging heterogeneous system architectures increasingly integrate general-purpose processors, GPUs, and other specialized computational units to provide both power and performance benefits. While the motivations for developing systems with accelerators are clear, it is important to design dispatching mechanisms that are efficient in both performance and energy while preserving programmability and orchestration of the diverse computational components. In this paper, we present an infrastructure composed of a general, packet-based hardware processing-dispatching unit, named the generic packet processing unit (GPPU), and an associated runtime that facilitates user-level access to GPPU objects such as packets, queues, and contexts. We thereby remove the drawbacks of traditional, costly user-to-kernel-level operations and the low-level accelerator subtleties that hinder programming productivity, along with architectural obstacles such as handling the accelerators' unified virtual address space. We present the design and evaluation of our framework by integrating the GPPU infrastructure with data-streaming-type accelerators, image filtering, and matrix multiplication, tightly coupled to the ARMv8 architecture via unified virtual memory. Under scaling workloads, our proposed dispatching methods can deliver a $3.7\times$ performance improvement over baseline offloading and up to $4.7\times$ better energy efficiency.
Keywords
Dispatching, Hardware, Runtime, Central Processing Unit, Data transfer, Program processors, Task analysis
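
The abstract describes user code submitting work as packets placed in queues, without a user-to-kernel transition on each dispatch. The general idea of such a queue can be pictured with the minimal Python sketch below. It is an illustration only and does not reflect the actual GPPU interface: the names `Packet`, `PacketQueue`, `drain`, and all fields are hypothetical, and the hardware dispatcher is simulated by an ordinary function.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Packet:
    """Hypothetical dispatch packet: which job to run and on which buffers."""
    job_id: int
    accelerator: str   # illustrative label, e.g. "matmul" or "image_filter"
    src_addr: int      # virtual address of the input buffer (assumed field)
    dst_addr: int      # virtual address of the output buffer (assumed field)


class PacketQueue:
    """Fixed-size ring buffer shared by a producer (user code) and a
    consumer (standing in for a hardware dispatcher)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.slots: List[Optional[Packet]] = [None] * capacity
        self.write_index = 0  # advanced only by the producer
        self.read_index = 0   # advanced only by the consumer

    def enqueue(self, packet: Packet) -> bool:
        """Store a packet in the next free slot; return False if full."""
        if self.write_index - self.read_index == self.capacity:
            return False
        self.slots[self.write_index % self.capacity] = packet
        self.write_index += 1
        return True

    def dequeue(self) -> Optional[Packet]:
        """Remove and return the oldest packet, or None if empty."""
        if self.read_index == self.write_index:
            return None
        slot = self.read_index % self.capacity
        packet = self.slots[slot]
        self.slots[slot] = None
        self.read_index += 1
        return packet


def drain(queue: PacketQueue) -> List[int]:
    """Simulated dispatcher: consume packets in FIFO order and
    return the ids of the jobs it would have launched."""
    launched = []
    while (packet := queue.dequeue()) is not None:
        launched.append(packet.job_id)
    return launched
```

In this sketch the producer only writes a slot and advances an index, which is the kind of operation that can stay in user space; a real design would also need memory ordering, a notification mechanism, and completion signaling, none of which are modeled here.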