Academic Article

HASP: Hierarchical Asynchronous Parallelism for Multi-NN Tasks
Document Type
Periodical
Source
IEEE Transactions on Computers, 73(2):366-379, Feb. 2024
Subject
Computing and Processing
Task analysis
Artificial neural networks
Computational modeling
Multicore processing
Hardware
Computer architecture
Synchronization
Multi-NN
multicore architecture
AI accelerator
Language
English
ISSN
0018-9340
1557-9956
2326-3814
Abstract
The rapid development of deep learning has propelled many real-world artificial intelligence applications. Many of these applications integrate multiple neural networks (multi-NN) to provide various functionalities. Multi-NN acceleration faces two challenges: (1) competition for shared resources becomes a bottleneck, and (2) heterogeneous workloads exhibit remarkably different computing-memory characteristics and varied synchronization requirements. Therefore, resource isolation and fine-grained resource allocation for each task are two fundamental requirements for multi-NN computing systems. Although a number of multi-NN acceleration technologies have been explored, few can completely fulfill both of these requirements, especially in mobile scenarios. This paper reports a Hierarchical Asynchronous Parallel Model (HASP) that enhances multi-NN performance while meeting both requirements. HASP can be implemented on a multicore processor that adopts a Multiple Instruction Multiple Data (MIMD) or Single Instruction Multiple Thread (SIMT) architecture, with only minor adaptive modification needed. Further, a prototype chip is developed to validate the hardware effectiveness of this design. A corresponding mapping strategy is also developed, allowing the proposed architecture to simultaneously improve resource utilization and throughput. With the same workload, the prototype chip demonstrates 3.62× and 3.51× higher throughput over Planaria, and 8.68× and 2.61× over Jetson AGX Orin, for MobileNet-V1 and ResNet50, respectively.
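To make the abstract's core idea concrete, the sketch below illustrates hierarchical asynchronous parallelism in spirit only: each NN task is pinned to its own isolated partition of cores and runs without global synchronization until a final join. All names here (`CoreGroup`, `run_multi_nn`, the per-task layer counts) are hypothetical illustrations, not the paper's actual implementation or API.

```python
import concurrent.futures

class CoreGroup:
    """A partition of cores dedicated to one NN task (resource isolation)."""
    def __init__(self, name, num_cores):
        self.name = name
        self.num_cores = num_cores

    def run(self, layers):
        # Each group processes its own layer sequence asynchronously;
        # there is no barrier between groups until the final join below.
        done = 0
        for _ in range(layers):
            done += 1  # stand-in for executing one layer on this group
        return (self.name, done)

def run_multi_nn(tasks):
    """tasks: list of (name, num_cores, layers) tuples.
    Returns a dict of completed layer counts per task."""
    groups = [CoreGroup(name, cores) for name, cores, _ in tasks]
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(groups)) as pool:
        futures = [pool.submit(g.run, layers)
                   for g, (_, _, layers) in zip(groups, tasks)]
        return dict(f.result() for f in futures)

# Two heterogeneous workloads with different core allocations
# (layer counts here are illustrative, not the networks' true depths).
results = run_multi_nn([("MobileNet-V1", 4, 28), ("ResNet50", 12, 50)])
print(results)
```

The key property being illustrated is that each task's progress depends only on its own core group, so a slow task cannot stall the others, which is the asynchronous half of the model; the fixed per-task partitions are the isolation half.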