Academic Article

Enhancing ASR Systems for Under-Resourced Languages through a Novel Unsupervised Acoustic Model Training Technique
Document Type
article
Source
Advances in Electrical and Computer Engineering, Vol 15, Iss 1, Pp 63-68 (2015)
Subject
speech recognition
under-resourced languages
unsupervised acoustic modeling
unsupervised training
Electrical engineering. Electronics. Nuclear engineering
TK1-9971
Computer engineering. Computer hardware
TK7885-7895
Language
English
ISSN
1582-7445
1844-7600
Abstract
Statistical speech and language processing techniques, which require large amounts of training data, are currently state-of-the-art in automatic speech recognition. For high-resourced, international languages this data is widely available, while for under-resourced languages the lack of data poses serious problems. Unsupervised acoustic modeling can offer a cost- and time-effective way of creating a solid acoustic model for any under-resourced language. This study describes a novel unsupervised acoustic model training method and evaluates it on speech data in an under-resourced language: Romanian. The key novelty of the method is the use of two complementary seed ASR systems to produce high-quality transcriptions, with a Character Error Rate (ChER) below 5%, for initially untranscribed speech data. The methodology leads to a relative Word Error Rate (WER) improvement of more than 10% when 100 hours of untranscribed speech are used.
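
The abstract does not detail how the two complementary seed systems are combined, but a common way to realize such a scheme is to decode the untranscribed audio with both systems and keep only the utterances on which their hypotheses agree closely, using the agreement level as a proxy for transcription quality. The following is a minimal sketch of that idea, not the authors' implementation: the decoders `decode_a` and `decode_b`, the 5% threshold applied per utterance, and the difflib-based character-agreement measure are all illustrative assumptions.

```python
# Minimal sketch (not the paper's implementation): select pseudo-transcripts for
# unsupervised acoustic model training by cross-system agreement between two
# hypothetical seed ASR decoders.
import difflib


def char_disagreement(hyp_a: str, hyp_b: str) -> float:
    """Character-level disagreement between two hypotheses, used here as a
    stand-in for the ChER-based quality criterion mentioned in the abstract."""
    matcher = difflib.SequenceMatcher(None, hyp_a, hyp_b)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return 1.0 - matched / max(len(hyp_a), len(hyp_b), 1)


def select_pseudo_transcripts(utterances, decode_a, decode_b, threshold=0.05):
    """Decode each untranscribed utterance with both seed systems and keep those
    where the hypotheses agree closely; the agreeing hypothesis then serves as
    the pseudo-transcript for retraining the acoustic model.

    `decode_a` / `decode_b` are hypothetical callables wrapping the two seed
    ASR systems; `threshold` mirrors the <5% ChER target, per utterance."""
    selected = []
    for wav in utterances:
        hyp_a, hyp_b = decode_a(wav), decode_b(wav)
        if char_disagreement(hyp_a, hyp_b) < threshold:
            selected.append((wav, hyp_a))
    return selected
```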