Academic Paper
A 1.3mW Speech-to-Text Accelerator with Bidirectional Light Gated Recurrent Units for Edge AI
Document Type
Conference
Source
2022 IEEE Asian Solid-State Circuits Conference (A-SSCC), pp. 1-3, Nov. 2022
Abstract
As the first step in natural language processing and human-machine interfacing, speech-to-text conversion can be applied to various edge AI devices, such as wearable devices, virtual assistants, and intelligent robots, as shown in Fig. 1(a). Energy-efficient realization with high accuracy is critical for such devices, driving the need for dedicated accelerators [1–3]. The design in [1] is the first silicon proof of concept, implementing a DSP-based algorithm. In [2], a deep neural network is used to improve accuracy. In [3], a sequence-to-sequence model further improves accuracy, at the cost of greatly increased energy consumption. This work presents an energy-efficient speech-to-text accelerator with high accuracy. Compared with state-of-the-art designs, the chip achieves 6.5-to-177× lower normalized energy while attaining the lowest phone error rate (PER) of 15.2% on the TIMIT dataset.
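The title names a bidirectional Light Gated Recurrent Unit (Li-GRU) network as the recognition model. As a rough orientation only, the sketch below shows the standard Li-GRU recurrence (a GRU variant with the reset gate removed and a ReLU candidate activation) run in both time directions with concatenated outputs. This is NOT the paper's hardware implementation: the batch normalization, quantization, and any accelerator-specific optimizations of the actual chip are omitted, and all weight names and dimensions here are illustrative assumptions.

```python
import numpy as np

def light_gru_step(x, h_prev, Wz, Uz, Wh, Uh):
    """One step of a Light GRU (Li-GRU) cell: single update gate,
    no reset gate, ReLU candidate activation. Batch norm from the
    original Li-GRU formulation is omitted in this sketch."""
    z = 1.0 / (1.0 + np.exp(-(Wz @ x + Uz @ h_prev)))  # update gate (sigmoid)
    h_cand = np.maximum(0.0, Wh @ x + Uh @ h_prev)     # candidate state (ReLU)
    return z * h_prev + (1.0 - z) * h_cand             # interpolate old/new state

def bidirectional_ligru(xs, params_fwd, params_bwd, hidden):
    """Run forward and backward Li-GRU passes over a feature sequence
    and concatenate the two hidden states at each frame."""
    h_f = np.zeros(hidden)
    h_b = np.zeros(hidden)
    fwd, bwd = [], []
    for x in xs:                       # left-to-right pass
        h_f = light_gru_step(x, h_f, *params_fwd)
        fwd.append(h_f)
    for x in reversed(xs):             # right-to-left pass
        h_b = light_gru_step(x, h_b, *params_bwd)
        bwd.append(h_b)
    bwd.reverse()                      # realign backward states with time order
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]

# Toy dimensions: 4-dim acoustic features, 3 hidden units, 5 frames.
rng = np.random.default_rng(0)
D, H, T = 4, 3, 5
def params():
    return (rng.standard_normal((H, D)), rng.standard_normal((H, H)),
            rng.standard_normal((H, D)), rng.standard_normal((H, H)))
xs = [rng.standard_normal(D) for _ in range(T)]
out = bidirectional_ligru(xs, params(), params(), H)
print(len(out), out[0].shape)  # → 5 (6,)
```

The single-gate recurrence halves the gate computation relative to a standard GRU, which is one reason Li-GRU variants are attractive for low-power accelerators; the specific energy savings reported in the abstract come from the chip design itself, not from this sketch.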