Academic Paper

Lip reading for robust speech recognition on embedded devices
Document Type
Conference
Source
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05), 2005, Vol. 1, pp. I/473–I/476
Subject
Signal Processing and Analysis
Components, Circuits, Devices and Systems
Robustness
Speech recognition
Noise reduction
Automatic speech recognition
Acoustic noise
Acoustic devices
Mouth
Feature extraction
Signal processing algorithms
Working environment noise
Language
English
ISSN
Print ISSN: 1520-6149
Electronic ISSN: 2379-190X
Abstract
In this article, a complete audio-visual speech recognition system suitable for embedded devices is presented. Active shape models (ASM) and the discrete cosine transform (DCT) were investigated and discussed as visual feature extraction algorithms for an embedded implementation. The audio-visual information integration was also designed with device limitations in mind. It is well known that the use of visual cues improves recognition results, especially in scenarios with high levels of acoustic noise. We wanted to compare the performance of lip reading with that of conventional noise reduction systems in these degraded scenarios, as well as the combination of both kinds of solutions. Important improvements are obtained, especially for nonstationary background noise such as voice interference, car acceleration, or indicator clicks. For this kind of noise, lip reading outperforms conventional noise reduction technologies.
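As a rough illustration of the DCT-based visual feature extraction the abstract mentions, the Python sketch below applies an orthonormal 2-D DCT to a grayscale mouth-region image and keeps the low-frequency coefficients in zig-zag order as a compact feature vector. This is a minimal sketch, not the authors' implementation: the 16x16 crop size and the 15 retained coefficients are assumptions made for the example.

# Minimal sketch of DCT-based visual feature extraction (illustrative only;
# crop size and coefficient count are assumptions, not the paper's settings).
import numpy as np
from scipy.fftpack import dct

def dct_features(mouth_roi: np.ndarray, n_coeffs: int = 15) -> np.ndarray:
    """Return low-frequency 2-D DCT coefficients of a grayscale mouth image."""
    # Orthonormal 2-D DCT: apply a 1-D DCT along rows, then along columns.
    coeffs = dct(dct(mouth_roi, norm="ortho", axis=0), norm="ortho", axis=1)
    # Keep the low-frequency coefficients in zig-zag order; these carry
    # most of the coarse mouth-shape information.
    h, w = coeffs.shape
    order = sorted(((i, j) for i in range(h) for j in range(w)),
                   key=lambda ij: (ij[0] + ij[1], ij[0]))
    return np.array([coeffs[i, j] for i, j in order[:n_coeffs]])

# Example: features from a dummy 16x16 mouth-region crop.
roi = np.random.rand(16, 16).astype(np.float32)
print(dct_features(roi).shape)  # (15,)

Truncating the zig-zag-ordered coefficients is a common way to obtain a low-dimensional visual descriptor, which fits the embedded-device constraints the paper emphasizes.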