Academic Paper

A Low-Power and Low-Latency Speech Feature Extractor Based on Time-Domain Filter Bank
Document Type
Periodical
Source
IEEE Transactions on Circuits and Systems II: Express Briefs (IEEE Trans. Circuits Syst. II), 71(3):1421-1425, Mar. 2024
Subject
Components, Circuits, Devices and Systems
Filter banks
Iron
Mel frequency cepstral coefficient
IIR filters
Feature extraction
Computer architecture
Power demand
Feature extractor (FE)
low-power
low-latency
keyword spotting (KWS)
speech recognition
Language
ISSN
1549-7747
1558-3791
Abstract
Keyword spotting (KWS) systems are considered an essential human-machine interface in a wide range of edge devices. To extend the standby time of battery-driven devices, a KWS system is required to work with ultra-low power consumption. However, conventional KWS systems require a time-frequency transform in the feature extractor (FE), leading to high power consumption and high latency. This brief proposes a lightweight FE architecture without a time-frequency transform, named Time-domain Filter Bank (TFBank), to greatly reduce power consumption and latency. A post-framing method is also proposed to eliminate redundant computation between contiguous frames, and approximate computing is used to mitigate the power overhead caused by multiplication. Simulation results show that TFBank consumes 387 nW at 0.8 V in 55 nm technology. The system latency is 0.5 ms and the memory usage is 1.175 Kb, achieving up to 16-96× speedup and 4-10× memory saving compared with state-of-the-art designs.
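
Note: the abstract does not give the filter design or the exact post-framing scheme, so the following is only a minimal illustrative sketch of the general idea, not the authors' architecture. It assumes second-order IIR band-pass filters (coefficients from the widely used RBJ "Audio EQ Cookbook" formulas), squared-output energy as the feature, and a hypothetical function name `filter_bank_features`. Each sample is filtered once in the time domain, energy is accumulated per hop, and overlapping frames are then formed by summing adjacent hop energies, so no sample is processed twice for contiguous frames. Approximate multiplication is not modeled.

```python
import math


def bandpass_biquad(center_hz, sample_rate, q=4.0):
    """Band-pass biquad coefficients (RBJ cookbook, 0 dB peak gain), normalized by a0."""
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (-2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a


def filter_bank_features(samples, sample_rate, centers_hz, hop, hops_per_frame=2):
    """Log band energies per frame, computed with framing applied after filtering.

    Assumption: a frame spans `hops_per_frame` consecutive hops, so adjacent
    frames overlap and share their per-hop energy sums.
    """
    filters = [bandpass_biquad(fc, sample_rate) for fc in centers_hz]
    # Direct form I state per channel: x[n-1], x[n-2], y[n-1], y[n-2]
    state = [[0.0, 0.0, 0.0, 0.0] for _ in filters]
    hop_energy = [0.0] * len(filters)
    recent = []   # energies of the most recent hops, oldest first
    frames = []
    for n, x in enumerate(samples, start=1):
        for ch, ((b0, b1, b2), (a1, a2)) in enumerate(filters):
            x1, x2, y1, y2 = state[ch]
            y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
            state[ch] = [x, x1, y, y1]
            hop_energy[ch] += y * y
        if n % hop == 0:
            # Hop boundary: store the partial sums and start a new hop.
            recent.append(hop_energy)
            hop_energy = [0.0] * len(filters)
            if len(recent) > hops_per_frame:
                recent.pop(0)
            if len(recent) == hops_per_frame:
                # Reuse the stored hop sums instead of refiltering the overlap.
                frames.append([
                    math.log(sum(h[ch] for h in recent) + 1e-12)
                    for ch in range(len(filters))
                ])
    return frames


if __name__ == "__main__":
    sr = 16000
    tone = [math.sin(2.0 * math.pi * 1000.0 * n / sr) for n in range(800)]
    for frame in filter_bank_features(tone, sr, [250.0, 1000.0, 4000.0], hop=160):
        print([round(v, 2) for v in frame])
```

In this sketch the per-sample work is one biquad update per channel regardless of frame overlap, which is the kind of redundancy removal the abstract attributes to post-framing; the channel count, Q factor, and hop size used here are arbitrary choices for illustration.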