Academic Paper

STONNE: Enabling Cycle-Level Microarchitectural Simulation for DNN Inference Accelerators

Document Type

Conference

Author

Muñoz-Martinez, Francisco; Abellan, Jose L.; Acacio, Manuel E.; Krishna, Tushar

Source

2021 IEEE International Symposium on Workload Characterization (IISWC), pp. 201-213, Nov. 2021

Subject

Components, Circuits, Devices and Systems
Computing and Processing
Performance evaluation
Analytical models
Microarchitecture
Neural networks
Accelerator architectures
Complexity theory
Proposals
DNN Accelerators
Simulation tool
Specialized architectures for DNNs

Language

Abstract

The design of specialized architectures for accelerating the inference procedure of Deep Neural Networks (DNNs) is a booming area of research nowadays. While first-generation rigid accelerator proposals used simple fixed dataflows tailored for dense DNNs, more recent architectures have argued for flexibility to efficiently support a wide variety of layer types, dimensions, and sparsity. As the complexity of these accelerators grows, the analytical models currently being used for design-space exploration are unable to capture execution-time subtleties, leading to inexact results in many cases, as we demonstrate. This opens up a need for cycle-level simulation tools to allow for fast and accurate design-space exploration of DNN accelerators, and rapid quantification of the efficacy of architectural enhancements during the early stages of a design. To this end, we present STONNE (Simulation TOol of Neural Network Engines), a cycle-level microarchitectural simulation framework that can plug into any high-level DNN framework as an accelerator device and perform full-model evaluation (i.e., we are able to simulate real, complete, unmodified DNN models) of state-of-the-art rigid and flexible DNN accelerators, both with and without sparsity support. As a proof of concept, we use STONNE in three use cases: i) a direct comparison of three dominant inference accelerators using real DNN models; ii) back-end extensions; and iii) front-end extensions of the simulator to showcase the capability of STONNE to rapidly and precisely evaluate data-dependent optimizations.
