Academic Paper

Shape-invariant pitch and time-scale modification of speech by variable order phase interpolation
Document Type
Conference
Source
1997 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-97), vol. 2, pp. 919-922, 1997
Subject
Signal Processing and Analysis
Components, Circuits, Devices and Systems
Interpolation
Speech synthesis
Phase estimation
Shape
Coherence
Polynomials
Signal synthesis
Laboratories
Pulse shaping methods
Periodic structures
Language
ISSN
1520-6149
Abstract
To preserve the waveform shape and perceived quality of pitch and time-scale modified sinusoidally modelled voiced speech, the phases of the sinusoids used to model the glottal excitation are made to add coherently at estimated pitch pulse locations. The glottal excitation is therefore made to resemble a pseudoperiodic impulse train, a quality essential for shape-invariance. Conventional methods attempt to maintain the coherence once per synthesis frame by interpolating the phase through a single modified pitch pulse location, a time where all excitation phases are assumed to be integer multiples of 2π. Whilst this is adequate for small degrees of modification, the coherence is lost when the required amount of modification is increased. This paper presents a technique which is capable of better preserving the impulse-like nature of the glottal excitation whilst allowing its phases to evolve slowly through time.
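
The coherence condition described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's variable order phase interpolation method; it only shows the underlying premise that harmonics whose phases are all integer multiples of 2π at a pitch pulse location sum to an impulse-like waveform, while the same harmonics with scattered phases do not. The function names, the sampling rate, the pitch, the number of harmonics, and the pulse location are all assumptions chosen for illustration.

```python
import math
import random


def synth_excitation(num_harmonics, f0_hz, fs_hz, pulse_time_s, num_samples,
                     phase_offsets=None):
    """Sum unit-amplitude harmonics of f0_hz (hypothetical helper).

    Each harmonic's phase is measured relative to pulse_time_s, so with zero
    offsets every phase is a multiple of 2*pi at the pulse location.
    """
    if phase_offsets is None:
        phase_offsets = [0.0] * num_harmonics
    samples = []
    for n in range(num_samples):
        t = n / fs_hz
        value = 0.0
        for k in range(1, num_harmonics + 1):
            phase = 2.0 * math.pi * k * f0_hz * (t - pulse_time_s)
            value += math.cos(phase + phase_offsets[k - 1])
        samples.append(value)
    return samples


def crest_factor(signal):
    """Peak-to-RMS ratio; larger values mean a more impulse-like waveform."""
    peak = max(abs(v) for v in signal)
    rms = math.sqrt(sum(v * v for v in signal) / len(signal))
    return peak / rms


# Assumed illustration values: 8 kHz sampling, 100 Hz pitch, 20 harmonics,
# one pitch pulse at 5 ms (sample 40), two pitch periods of signal.
FS_HZ = 8000
F0_HZ = 100
NUM_HARMONICS = 20
PULSE_TIME_S = 0.005
NUM_SAMPLES = 160

# Coherent case: all phases are multiples of 2*pi at the pulse location.
coherent = synth_excitation(NUM_HARMONICS, F0_HZ, FS_HZ, PULSE_TIME_S,
                            NUM_SAMPLES)

# Incoherent case: same harmonics, but each gets a random phase offset.
random.seed(0)
offsets = [random.uniform(0.0, 2.0 * math.pi) for _ in range(NUM_HARMONICS)]
scattered = synth_excitation(NUM_HARMONICS, F0_HZ, FS_HZ, PULSE_TIME_S,
                             NUM_SAMPLES, phase_offsets=offsets)

print("coherent crest factor: ", crest_factor(coherent))
print("scattered crest factor:", crest_factor(scattered))
```

In the coherent case the waveform peaks at the pulse location with an amplitude equal to the number of harmonics, which is the impulse-train-like behaviour the abstract calls essential for shape-invariance. Both signals contain the same harmonics at the same amplitudes, so any difference in crest factor comes from the phases alone.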