Academic Paper

Robust pitch tracking for prosodic modeling in telephone speech
Document Type
Conference
Source
2000 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '00). Proceedings (Cat. No.00CH37100), vol. 3, pp. 1343-1346, 2000
Subject
Signal Processing and Analysis
Components, Circuits, Devices and Systems
Robustness
Telephony
Speech analysis
Signal processing algorithms
Detection algorithms
Frequency
Personal digital assistants
Chaotic communication
Natural languages
Laboratories
Language
ISSN
1520-6149
Abstract
In this paper, we introduce a pitch detection algorithm that is particularly robust for telephone speech and prosodic modeling. The algorithm uses a logarithmically sampled spectral representation of speech, similar to that in the subharmonic summation approach. Constraints for log F0 and Δlog F0 are combined in a dynamic programming search to find an optimum pitch track. The search algorithm is able to find a continuous pitch contour regardless of the voicing status, while a separate voicing decision module computes the probability of voicing per frame. We evaluated the algorithm using the Keele pitch extraction reference database under both studio and telephone conditions. Our algorithm is very robust to channel degradation, and compares favorably to XWAVES under telephone conditions. It also significantly outperforms XWAVES when used for tone classification on a telephone quality Mandarin digit corpus.
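
The dynamic programming search mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's cost function, which the abstract does not specify. It assumes a hypothetical per-frame score for each log F0 candidate (for example, from a harmonic template match on the log-sampled spectrum) and a hypothetical linear penalty, jump_weight, on |Δlog F0| between consecutive frames. The function name track_pitch and all parameters are illustrative.

```python
import math


def track_pitch(scores, log_f0, jump_weight=1.0):
    """Viterbi-style search for the best path through log-F0 candidates.

    scores[t][k]  -- assumed evidence for candidate k at frame t (higher is better)
    log_f0[k]     -- natural log of the frequency of candidate k
    jump_weight   -- assumed linear penalty on |delta log F0| between frames
    Returns one candidate index per frame.
    """
    n_cand = len(log_f0)
    best = list(scores[0])   # best cumulative score ending at each candidate
    back = []                # back-pointers, one list per frame after the first

    for frame in scores[1:]:
        new_best, pointers = [], []
        for k in range(n_cand):
            # Cumulative score of arriving at k from predecessor j,
            # penalised by the size of the log-F0 jump.
            def arrive(j):
                return best[j] - jump_weight * abs(log_f0[k] - log_f0[j])

            j_star = max(range(n_cand), key=arrive)
            new_best.append(arrive(j_star) + frame[k])
            pointers.append(j_star)
        best = new_best
        back.append(pointers)

    # Trace back from the best final candidate.
    k = max(range(n_cand), key=lambda i: best[i])
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return path


# Toy input: candidates at 100, 200, and 400 Hz. The middle frame has a
# slightly stronger peak one octave up, as in a typical octave error.
log_f0 = [math.log(f) for f in (100.0, 200.0, 400.0)]
scores = [
    [1.0, 0.2, 0.1],
    [0.9, 1.0, 0.1],
    [1.0, 0.3, 0.1],
]
smooth_path = track_pitch(scores, log_f0, jump_weight=1.0)
greedy_path = track_pitch(scores, log_f0, jump_weight=0.0)
```

With the jump penalty active, the path stays on the 100 Hz candidate through the middle frame. With jump_weight set to 0, the search reduces to a per-frame maximum and follows the octave-up peak. The search always returns a candidate for every frame, which mirrors the abstract's point that the contour is continuous regardless of voicing status and that the voicing decision is made separately.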