HASP: Hierarchical Asynchronous Parallelism for Multi-NN Tasks


The rapid development of deep learning has propelled many real-world artificial intelligence applications. Many of these applications integrate multiple neural networks (multi-NN) to provide various functionalities. Multi-NN acceleration faces two challenges: (1) competition for shared resources becomes a bottleneck, and (2) heterogeneous workloads exhibit markedly different compute-memory characteristics and varying synchronization requirements. Therefore, resource isolation and fine-grained resource allocation for each task are two fundamental requirements for multi-NN computing systems. Although a number of multi-NN acceleration technologies have been explored, few fully satisfy both requirements, especially in mobile scenarios. This paper presents a Hierarchical Asynchronous Parallel Model (HASP) that enhances multi-NN performance while meeting both requirements. HASP can be implemented on a multicore processor that adopts a Multiple Instruction Multiple Data (MIMD) or Single Instruction Multiple Thread (SIMT) architecture, with only minor adaptation. Further, a prototype chip is developed to validate the hardware effectiveness of this design. A corresponding mapping strategy is also developed, allowing the proposed architecture to improve resource utilization and throughput simultaneously. With the same workload, the prototype chip demonstrates 3.62x and 3.51x higher throughput than Planaria, and 8.68x and 2.61x higher throughput than Jetson AGX Orin, for MobileNet-V1 and ResNet-50, respectively.
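To make the core idea concrete, below is a minimal illustrative sketch, not the paper's actual implementation or mapping strategy: cores are partitioned into isolated groups, one per NN task (resource isolation), group sizes are chosen in proportion to each task's load (fine-grained allocation), and tasks run asynchronously with synchronization confined to each group, so no global barrier lets one task stall another. All task names, layer loads, and function names here are hypothetical.

```python
# Illustrative sketch of hierarchical asynchronous parallelism for
# multi-NN workloads (an assumption-based model, not the HASP design).
from concurrent.futures import ThreadPoolExecutor

# Hypothetical multi-NN workload: task name -> per-layer work units.
TASKS = {
    "MobileNet-V1": [4, 2, 2, 1],   # lighter layers
    "ResNet-50":    [8, 8, 6, 4],   # heavier layers
}

def run_task(name, layers, n_cores):
    """Run one NN's layers on its private core group (simulated).

    Synchronization (the implicit join after each map) is local to
    this task's group; other tasks are unaffected.
    """
    done = 0
    with ThreadPoolExecutor(max_workers=n_cores) as group:
        for work in layers:
            # Work units within a layer run in parallel on the group.
            done += sum(group.map(lambda w: w, [1] * work))
    return name, done

def schedule(tasks, total_cores=8):
    """Top level: allocate core groups proportional to each task's
    total load, then launch all tasks asynchronously (no global barrier)."""
    loads = {n: sum(ls) for n, ls in tasks.items()}
    total = sum(loads.values())
    alloc = {n: max(1, round(total_cores * l / total))
             for n, l in loads.items()}
    with ThreadPoolExecutor(max_workers=len(tasks)) as top:
        futures = [top.submit(run_task, n, ls, alloc[n])
                   for n, ls in tasks.items()]
        return dict(f.result() for f in futures)

results = schedule(TASKS)
```

The proportional split gives the heavier ResNet-50 task the larger core group, while the two-level executor structure mirrors the hierarchy: an asynchronous top level dispatching tasks, and per-task groups handling intra-task parallelism.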
Keywords
Multi-NN, multicore architecture, AI accelerator