Academic Paper

A 1.3mW Speech-to-Text Accelerator with Bidirectional Light Gated Recurrent Units for Edge AI
Document Type
Conference
Source
2022 IEEE Asian Solid-State Circuits Conference (A-SSCC), Nov. 2022, pp. 1–3
Subject
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Fields, Waves and Electromagnetics
Robotics and Control Systems
Signal Processing and Analysis
Wearable computers
Virtual assistants
Neural networks
Energy efficiency
Silicon
Natural language processing
Solid state circuits
Abstract
As the first step in natural language processing and human-machine interfacing, speech-to-text conversion can be applied to various edge AI devices, such as wearable devices, virtual assistants, and intelligent robots, as shown in Fig. 1(a). Energy-efficient realization with high accuracy is critical for such devices, driving the need for dedicated accelerators [1–3]. The design in [1] is the first silicon demonstration, implementing a DSP-based algorithm. A deep neural network is used to improve accuracy in [2]. In [3], a sequence-to-sequence model is introduced to further improve accuracy, at the cost of greatly increased energy consumption. This work presents an energy-efficient speech-to-text accelerator with high accuracy. Compared to state-of-the-art designs, the chip achieves 6.5-to-177× lower normalized energy with the lowest phone error rate (PER) of 15.2% on the TIMIT dataset.