Academic Paper
Statistical methods in data-driven modeling of Spanish prosody for text to speech
Document Type
Conference
Source
Proceedings of the Fourth International Conference on Spoken Language Processing (ICSLP '96), vol. 3, pp. 1377-1380, 1996
Subject
Language
Abstract
In (Lopez-Gonzalo et al., 1995), we proposed an automatic data-driven methodology for modeling both fundamental frequency and segmental duration in text-to-speech (TTS) converters from a recorded single-speaker corpus. Its advantage was that it could be adapted to a specific corpus or a particular speaker; its main disadvantage was the size of the resulting prosodic database. In this paper, we propose statistical methods for reducing the prosodic database that this methodology requires. A 50% reduction can be obtained without compromising the naturalness of the synthetic speech produced by our previous methodology with the same prosodic corpus. The trade-off between variability and reduction in prosodic contours is also discussed.