Academic Journal

SASCHA—Sparsity-Aware Stochastic Computing Hardware Architecture for Neural Network Acceleration.
Document Type
Article
Source
IEEE Transactions on Computer-Aided Design of Integrated Circuits & Systems. Nov2022, Vol. 41 Issue 11, p4169-4180. 12p.
Subject
*LINEAR algebra
*MACHINE learning
Language
English
ISSN
0278-0070
Abstract
Stochastic computing (SC) has recently emerged as a promising method for efficient machine learning acceleration. Its high compute density, affinity with dense linear algebra primitives, and approximation properties align closely with the computational requirements of deep neural networks. However, there is a conspicuous lack of work integrating SC hardware with sparsity awareness, which has brought significant performance improvements to conventional architectures. In this work, we identify why common sparsity-exploiting techniques are not easily applicable to SC accelerators and propose SASCHA, a sparsity-aware SC hardware architecture for neural network acceleration, which addresses those issues. SASCHA encompasses a set of techniques that make exploiting sparsity in inference practical for different types of SC computation. At 90% weight sparsity, SASCHA can be up to $6.5\times$ faster and $5.5\times$ more energy-efficient than comparable dense SC accelerators of similar area, without sacrificing dense-network throughput. SASCHA also outperforms sparse fixed-point accelerators by up to $4\times$ in terms of latency. To the best of our knowledge, SASCHA is the first SC accelerator architecture oriented around sparsity. [ABSTRACT FROM AUTHOR]
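For readers unfamiliar with the stochastic computing the abstract refers to, the sketch below illustrates the basic principle (it is not code from the paper): in unipolar SC coding, a value in [0, 1] is represented by a random bitstream whose fraction of 1s equals the value, and a product is approximated by a bitwise AND of two such streams. The stream length and the zero-weight example are illustrative assumptions, chosen only to show why pruned (zero) weights waste SC cycles unless the hardware skips them, which is the inefficiency a sparsity-aware design targets.

```python
# Minimal sketch of unipolar stochastic computing (SC), not from the paper.
import random

def encode(value, n, rng):
    """Unipolar SC encoding: each bit is 1 with probability `value`."""
    return [1 if rng.random() < value else 0 for _ in range(n)]

def decode(stream):
    """Estimate the encoded value as the fraction of 1s in the stream."""
    return sum(stream) / len(stream)

def sc_multiply(a_stream, b_stream):
    """Approximate multiplication: bitwise AND of two unipolar bitstreams."""
    return [a & b for a, b in zip(a_stream, b_stream)]

if __name__ == "__main__":
    rng = random.Random(0)
    n = 4096                      # longer streams -> lower approximation error
    a, w = 0.75, 0.4
    prod = decode(sc_multiply(encode(a, n, rng), encode(w, n, rng)))
    print(f"SC estimate of {a} * {w} = {prod:.3f} (exact {a * w})")

    # A pruned (zero) weight encodes to an all-zero stream, so the AND does no
    # useful work over the full stream length unless the accelerator can skip it.
    zero_prod = decode(sc_multiply(encode(a, n, rng), encode(0.0, n, rng)))
    print(f"SC estimate of {a} * 0.0 = {zero_prod:.3f}")
```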