Academic Paper

Statistical methods in data-driven modeling of Spanish prosody for text to speech

Document Type

Conference

Author

Lopez-Gonzalo, E.; Rodriguez-Garcia, J.M.

Source

Proceedings of the Fourth International Conference on Spoken Language Processing (ICSLP '96), vol. 3, pp. 1377-1380, 1996

Subject

Signal Processing and Analysis
Communication, Networking and Broadcast Technologies
Computing and Processing
Statistical analysis
Speech synthesis
Frequency conversion
Spatial databases
Feature extraction
Telecommunications
Natural languages
Electronic mail
Speech recognition
Contracts

Language

Abstract

In (Lopez-Gonzalo et al., 1995), we proposed an automatic data-driven methodology to model both fundamental frequency and segmental duration in TTS converters from a monospeaker recorded corpus. Because the models are derived from the recordings, the methodology can be adapted to a specific corpus or a particular speaker. Its main disadvantage was the size of the resulting prosodic database. In this paper, we propose using statistical methods to reduce the prosodic database that the methodology requires. A 50% reduction can be obtained without compromising the naturalness of the synthetic speech produced by our previous methodology with the same prosodic corpus. The trade-off between variability and reduction in prosodic contours is also discussed.
