Academic Article

Hardware Accelerator Design for Sparse DNN Inference and Training: A Tutorial
Document Type
Periodical
Source
IEEE Transactions on Circuits and Systems II: Express Briefs, 71(3):1708-1714, Mar. 2024
Subject
Components, Circuits, Devices and Systems
Computational modeling
Tensors
Computational efficiency
Transformers
Encoding
Data models
Convolution
Hardware acceleration
sparsity
CNN
transformer
tutorial
deep learning
Language
English
ISSN
1549-7747 (Print)
1558-3791 (Electronic)
Abstract
Deep neural networks (DNNs) are widely used in many fields, such as artificial intelligence generated content (AIGC) and robotics. To support these tasks efficiently, model pruning techniques have been developed to compress computation- and memory-intensive DNNs. However, directly executing these sparse models on a common hardware accelerator can cause significant under-utilization, since invalid data resulting from the sparse patterns leads to unnecessary computations and irregular memory accesses. This brief analyzes the critical issues in accelerating sparse models and provides an overview of typical hardware designs for various sparse DNNs, such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), generative adversarial networks (GANs), and Transformers. Following the overview, we give practical guidelines for designing efficient accelerators for sparse DNNs, with qualitative metrics to evaluate hardware overhead in different cases. In addition, we highlight potential opportunities for hardware/software/algorithm co-optimization from the perspective of sparse DNN implementation, and provide insights into recent design trends for the efficient implementation of Transformers with sparse attention, which facilitates large language model (LLM) deployment with high throughput and energy efficiency.
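To make the under-utilization issue in the abstract concrete, the following is a minimal sketch (not taken from the brief; all names, shapes, and the CSR encoding choice are illustrative assumptions) of a compressed-sparse-row matrix-vector product. The data-dependent gathers through the column-index array and the uneven nonzero counts per row show why pruned weights map poorly onto a dense, regular accelerator datapath.

```python
# Illustrative sketch: CSR (compressed sparse row) weight matrix times a
# dense activation vector. Not the paper's design; encoding and names are
# assumptions chosen for clarity.
import numpy as np

def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = W @ x where W is stored in CSR form.

    values  : nonzero weights, concatenated row by row
    col_idx : column index of each nonzero (drives irregular gathers)
    row_ptr : start offset of each row within values/col_idx
    """
    n_rows = len(row_ptr) - 1
    y = np.zeros(n_rows, dtype=x.dtype)
    for r in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            # x[col_idx[k]] is a data-dependent (irregular) memory access;
            # on a dense accelerator such gathers break regular dataflow.
            acc += values[k] * x[col_idx[k]]
        y[r] = acc
    return y

# Example: a 3x4 weight matrix pruned to 4 nonzeros (~67% sparsity).
values  = np.array([2.0, -1.0, 0.5, 3.0])
col_idx = np.array([1, 3, 0, 2])
row_ptr = np.array([0, 2, 3, 4])  # rows hold 2, 1, 1 nonzeros: load imbalance
x = np.array([1.0, 2.0, 3.0, 4.0])
print(csr_spmv(values, col_idx, row_ptr, x))  # [0.0, 0.5, 9.0]
```

The per-row nonzero counts (2, 1, 1 here) also illustrate the load-imbalance problem the brief's accelerator overview addresses: parallel processing elements assigned one row each finish at different times unless the hardware or the pruning pattern rebalances the work.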