Academic Paper

A novel quasi-diphone inventory approach to Text-To-Speech synthesis

Document Type

Conference

Author

Source

MELECON 2008 - The 14th IEEE Mediterranean Electrotechnical Conference, pp. 799-804, May 2008

Subject

Bioengineering
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Engineered Materials, Dielectrics and Plasmas
Engineering Profession
Fields, Waves and Electromagnetics
General Topics for Engineers
Speech
Speech synthesis
Text analysis
Digital signal processing
Speech processing
Optimization
Fading
TTS
concatenative synthesis
mixed-rank inventory
quasi-diphone
Macedonian

Language

ISSN

2158-8473
2158-8481

Abstract

The paper presents a novel approach to concatenative Text-To-Speech (TTS) synthesis. The system uses an optimized mixed-rank inventory based on a modification of the classical diphone concept. A new unit type, dubbed the quasi-diphone, is introduced. A set of these units is designed to cover all the critical transitions between phones while remaining compatible with phone-length units for concatenation. This allows the inventory to be optimized with respect to both its size and the quality of the generated speech. The system includes elementary pitch, duration and amplitude modeling implemented with the standard PSOLA algorithm. The presented results show that full intelligibility and reasonable naturalness can be achieved while maintaining a rather small inventory. The system was developed specifically for the synthesis of Macedonian and is the first high-quality TTS system for this language. Because the modules communicate through a standardized interface, the system is also applicable to any of the world’s languages.

