Academic Paper

Improving Contextual Biasing with Text Injection
Document Type
Conference
Source
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1-5, Jun. 2023
Subject
Bioengineering
Communication, Networking and Broadcast Technologies
Computing and Processing
Signal Processing and Analysis
Training
Computational modeling
Signal processing
Data models
Acoustics
Speech processing
Context modeling
end-to-end ASR
contextual biasing
Language
ISSN
2379-190X
Abstract
In this work, we present a model-based approach to contextual biasing that improves recognition quality without drastically increasing model computation during inference. Specifically, we inject text data during training that is representative of the contextually relevant text that will be seen at inference, using a modality-matching text injection method known as JOIST. Because JOIST injects text data directly into the E2E model, there is no additional model computation during inference, in contrast to most model-based biasing techniques. We find that our proposed approach, when combined with an FST-based context model, improves recognition of contacts by 5–15% relative.