Academic Paper

Sonority rise: Aiding backoff in syllable-based speech synthesis
Document Type
Conference
Source
2016 Twenty Second National Conference on Communication (NCC), pp. 1-5, Mar. 2016
Subject
Communication, Networking and Broadcast Technologies
Signal Processing and Analysis
Speech
Speech synthesis
Databases
Maintenance engineering
Mel frequency cepstral coefficient
Correlation
Unit-selection synthesis
backoff
sonority
vowel-epenthesis
Language
Abstract
Backoff techniques are employed in syllable-based unit-selection speech synthesis systems to maintain the naturalness of the speech when syllables are missing from the database. To synthesize missing Telugu syllables that contain complex consonant clusters, we introduced reduced vowel epenthesis as a rule-based backoff strategy [1]. In this paper, we refine the scope of that approach by applying vowel epenthesis selectively, only where sonority rises between adjacent consonants. When the sonority does not rise (stop-stop and liquid-stop clusters), we increase the duration of the consonant instead. Because languages show specific patterns of vowel epenthesis, we conduct a subjective evaluation to determine the identity of the epenthetic vowel in Hindi. Based on the results of this listening test, we devise a class-based rule to perform epenthesis. To evaluate the performance of the resulting system, we perform both a subjective evaluation and an objective evaluation based on confidence measures from an ASR system. Specifically, we run a phone-level automatic speech recognition task to assess the intelligibility of words synthesized using epenthesis as a cluster-repair strategy. The results show that the proposed backoff method helps produce more natural-sounding speech than conventional backoff methods.
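The sonority-rise rule described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the sonority ranking (stop < fricative < nasal < liquid < glide), the phone-to-class map, the placeholder vowel label `ax`, the lengthening factor, and the choice to lengthen the first consonant of the cluster.

```python
# Hypothetical sonority ranks (higher = more sonorous); assumed, not from the paper.
SONORITY = {"stop": 1, "fricative": 2, "nasal": 3, "liquid": 4, "glide": 5}

# Hypothetical phone-to-class map covering a few consonants only.
PHONE_CLASS = {
    "p": "stop", "t": "stop", "k": "stop",
    "b": "stop", "d": "stop", "g": "stop",
    "s": "fricative", "h": "fricative",
    "m": "nasal", "n": "nasal",
    "r": "liquid", "l": "liquid",
    "y": "glide", "w": "glide",
}

EPENTHETIC_VOWEL = "ax"  # placeholder label for a reduced vowel (assumption)
LENGTHEN_FACTOR = 1.5    # placeholder duration scale (assumption)


def sonority_rises(c1, c2):
    """Return True if sonority strictly increases from c1 to c2."""
    return SONORITY[PHONE_CLASS[c1]] < SONORITY[PHONE_CLASS[c2]]


def repair_cluster(c1, c2):
    """Return a list of (phone, duration_scale) pairs for a two-consonant cluster.

    Rising sonority: insert the placeholder reduced vowel between the consonants.
    Otherwise: lengthen the first consonant (which consonant is lengthened is
    an assumption of this sketch).
    """
    if sonority_rises(c1, c2):
        return [(c1, 1.0), (EPENTHETIC_VOWEL, 1.0), (c2, 1.0)]
    return [(c1, LENGTHEN_FACTOR), (c2, 1.0)]


if __name__ == "__main__":
    for cluster in [("p", "r"), ("k", "t"), ("l", "p")]:
        print(cluster, "->", repair_cluster(*cluster))
```

Under these assumptions, a stop-liquid cluster such as `p r` receives an epenthetic vowel, while stop-stop (`k t`) and liquid-stop (`l p`) clusters are handled by lengthening, which matches the two non-rising cases named in the abstract.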