Academic Paper

Self-supervised Language Identification ASR models for Low Resource Indic Languages
Document Type
Conference
Source
2023 International Conference on Modeling, Simulation & Intelligent Computing (MoSICom), pp. 30-34, Dec. 2023
Subject
Aerospace
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Fields, Waves and Electromagnetics
Robotics and Control Systems
Signal Processing and Analysis
Training
Speech coding
Computational modeling
Self-supervised learning
Predictive models
Predictive coding
Benchmark testing
Automatic Speech Recognition
Language Identification
Language
Abstract
Self-supervised automatic speech recognition (ASR) has gained popularity in recent years because it can improve ASR accuracy using large amounts of unlabeled data. Language identification (LID), which determines the language of a speech segment, is another important technique for building ASR models in multilingual environments. This paper presents a novel approach to self-supervised ASR using LID: the ASR model is pre-trained on large amounts of unlabeled data, and the language of each speech segment is identified during training. The identified language then guides the training of the ASR model, improving performance on both in-domain and out-of-domain data. The proposed approach is evaluated on several benchmark datasets, demonstrating its effectiveness in improving ASR accuracy and reducing language mismatch. The results suggest that self-supervised ASR using LID has great potential for developing more robust and accurate ASR systems in multilingual environments.
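The abstract describes LID guiding self-supervised pre-training but does not specify the mechanism. One plausible reading, sketched below purely as an illustration (the prototype embeddings, language codes, and weighting scheme are assumptions, not the paper's method), is to assign each unlabeled utterance a pseudo language label via nearest-prototype LID and then reweight the self-supervised loss per language, so low-resource Indic languages contribute more to each update:

```python
import numpy as np

# Hypothetical sketch, NOT the paper's implementation: pseudo-LID by nearest
# language prototype, used to weight the self-supervised loss per language.

LANG_PROTOTYPES = {                # assumed 2-D prototype embeddings per language
    "hi": np.array([1.0, 0.0]),   # Hindi
    "ta": np.array([0.0, 1.0]),   # Tamil
}

def identify_language(embedding: np.ndarray) -> str:
    """Pseudo-LID: return the language whose prototype has the highest
    cosine similarity to the utterance embedding."""
    best_lang, best_sim = None, -1.0
    for lang, proto in LANG_PROTOTYPES.items():
        sim = float(embedding @ proto /
                    (np.linalg.norm(embedding) * np.linalg.norm(proto)))
        if sim > best_sim:
            best_lang, best_sim = lang, sim
    return best_lang

def lid_guided_loss(ssl_loss: float, embedding: np.ndarray,
                    lang_weights: dict) -> tuple:
    """Scale a self-supervised loss value by a per-language weight so that
    under-represented languages are emphasized during pre-training."""
    lang = identify_language(embedding)
    return lang_weights.get(lang, 1.0) * ssl_loss, lang

# Usage: upweight the lower-resource language (weights here are illustrative).
weights = {"hi": 1.0, "ta": 2.0}
loss, lang = lid_guided_loss(0.5, np.array([0.1, 0.9]), weights)
```

In this toy run the embedding is closest to the Tamil prototype, so the loss is doubled; in a real system the embeddings would come from a pre-trained encoder (e.g. a wav2vec 2.0-style model) rather than hand-set vectors.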