Academic Paper

SM6: A 16nm System-on-Chip for Accurate and Noise-Robust Attention-Based NLP Applications : The 33rd Hot Chips Symposium – August 22-24, 2021
Document Type
Conference
Source
2021 IEEE Hot Chips 33 Symposium (HCS), pp. 1-13, Aug. 2021
Subject
Components, Circuits, Devices and Systems
Computing and Processing
Source separation
Computational modeling
Operating systems
Pipelines
Brain modeling
Probabilistic logic
Real-time systems
Language
ISSN
2573-2048
Abstract
In this work, we present SM6, an SoC architecture for real-time denoised speech and NLP pipelines, featuring (1) MSSE, an unsupervised probabilistic sound source separation accelerator; (2) FlexNLP, a programmable inference accelerator for attention-based seq2seq DNNs that uses adaptive floating-point datatypes for wide-dynamic-range computations; and (3) a dual-core Arm Cortex-A53 CPU cluster, which provides on-demand SIMD FFT processing and operating system support. In adverse acoustic conditions, MSSE allows FlexNLP to store up to 6x smaller ASR models, obviating the very inefficient strategy of scaling up the DNN model to achieve noise robustness. MSSE and FlexNLP achieve efficiency ranges of 4.33-17.6 Gsamples/s/W and 2.6-7.8 TFLOPS/W, respectively, with per-frame end-to-end latencies of 15-45 ms.