Academic Paper

SM6: A 16nm System-on-Chip for Accurate and Noise-Robust Attention-Based NLP Applications : The 33rd Hot Chips Symposium – August 22-24, 2021

Document Type

Conference

Author

Tambe, Thierry; Yang, En-Yu; Ko, Glenn G.; Chai, Yuji; Hooper, Coleman; Donato, Marco; Whatmough, Paul N.; Rush, Alexander M.; Brooks, David; Wei, Gu-Yeon

Source

2021 IEEE Hot Chips 33 Symposium (HCS), pp. 1-13, Aug. 2021

Subject

Components, Circuits, Devices and Systems
Computing and Processing
Source separation
Computational modeling
Operating systems
Pipelines
Brain modeling
Probabilistic logic
Real-time systems

ISSN

2573-2048

Abstract

In this work, we present SM6, an SoC architecture for real-time denoised speech and NLP pipelines, featuring (1) MSSE: an unsupervised probabilistic sound source separation accelerator, (2) FlexNLP: a programmable inference accelerator for attention-based seq2seq DNNs using adaptive floating-point datatypes for wide dynamic range computations, and (3) a dual-core Arm Cortex-A53 CPU cluster, which provides on-demand SIMD FFT processing and operating system support. In adverse acoustic conditions, MSSE allows FlexNLP to store up to 6x smaller ASR models, obviating the inefficient strategy of scaling up the DNN model to achieve noise robustness. MSSE and FlexNLP produce efficiency ranges of 4.33-17.6 Gsamples/s/W and 2.6-7.8 TFLOPs/W, respectively, with per-frame end-to-end latencies of 15-45 ms.
