Academic Paper

An Optimal Envelope-Based Noise Reduction Method for Cochlear Implants: An Upper Bound Performance Investigation
Document Type
Periodical
Source
IEEE/ACM Transactions on Audio, Speech, and Language Processing (IEEE/ACM Trans. Audio Speech Lang. Process.), vol. 29, pp. 1729-1739, 2021
Subject
Signal Processing and Analysis
Computing and Processing
Communication, Networking and Broadcast Technologies
General Topics for Engineers
Time-frequency analysis
Signal to noise ratio
Auditory system
Noise measurement
Wiener filters
Cochlear implants
Time-domain analysis
Cochlear implant
Noise reduction
Temporal envelope
Time-frequency mask
Language
ISSN
2329-9290
2329-9304
Abstract
Cochlear implants (CI) are surgically implanted auditory prostheses that partially compensate for severe to profound sensorineural hearing loss. Although they provide significant intelligibility levels in quiet conditions, their performance is strongly degraded by additive noise. Time-frequency masks, such as the binary mask, the Wiener filter (WF) and their variations, have been widely employed for noise reduction. However, they were not originally designed for CI applications, which have very particular characteristics. In this work, we propose a new noise reduction method designed specifically for CI. It is based on the minimization of the mean square error between the squared envelopes of the estimated and target speech. The theoretical derivation of the proposed time-frequency mask is presented and a closed-form expression for the one-coefficient optimal solution is obtained. Numerical simulations with objective criteria indicate that the proposed method yields higher intelligibility scores than the time-domain implementation of the WF time-frequency mask. Psychoacoustic experiments with normal hearing volunteers listening to vocoded signals, as well as with CI users, corroborate the simulation results, showing significant increases in intelligibility, especially for SNR < −5 dB. The proposed method may increase intelligibility by up to 70% at SNR = −25 dB relative to the WF. These findings were obtained assuming an ideal signal-to-noise ratio estimator.
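The idea of fitting a single coefficient to squared envelopes, rather than to the waveform as the WF does, can be illustrated with a small numerical sketch. This is not the paper's derivation, and its closed-form solution is not reproduced in this record. The code assumes one time-frequency bin with independent complex Gaussian speech and unit-variance noise, a known (ideal) SNR, and a plain least-squares fit of a power gain; all function names are hypothetical.

```python
import random


def wiener_gain(snr_linear):
    """Classic Wiener amplitude gain for a linear (not dB) SNR."""
    return snr_linear / (1.0 + snr_linear)


def envelope_ls_gain(noisy_power, target_power):
    """Power gain a that minimizes sum((a * Py - Ps) ** 2).

    Ordinary one-coefficient least squares on squared envelopes;
    an illustrative stand-in, not the mask derived in the paper.
    """
    num = sum(py * ps for py, ps in zip(noisy_power, target_power))
    den = sum(py * py for py in noisy_power)
    return num / den if den > 0 else 0.0


def squared_envelope_mse(power_gain, noisy_power, target_power):
    """Mean square error between scaled noisy power and target power."""
    n = len(noisy_power)
    return sum((power_gain * py - ps) ** 2
               for py, ps in zip(noisy_power, target_power)) / n


def simulate(snr_db, n=20000, seed=0):
    """Compare the two gains in one simulated time-frequency bin."""
    rng = random.Random(seed)
    snr = 10.0 ** (snr_db / 10.0)  # ideal SNR, assumed known
    noisy_power, target_power = [], []
    for _ in range(n):
        # Assumed model: complex Gaussian speech (variance = snr)
        # plus complex Gaussian noise (variance = 1).
        s = complex(rng.gauss(0, 1), rng.gauss(0, 1)) * (snr / 2.0) ** 0.5
        v = complex(rng.gauss(0, 1), rng.gauss(0, 1)) * 0.5 ** 0.5
        y = s + v
        noisy_power.append(abs(y) ** 2)
        target_power.append(abs(s) ** 2)
    # An amplitude gain G scales the squared envelope by G ** 2.
    a_wf = wiener_gain(snr) ** 2
    a_ls = envelope_ls_gain(noisy_power, target_power)
    return {
        "a_wf": a_wf,
        "a_ls": a_ls,
        "mse_wf": squared_envelope_mse(a_wf, noisy_power, target_power),
        "mse_ls": squared_envelope_mse(a_ls, noisy_power, target_power),
    }


if __name__ == "__main__":
    for snr_db in (-25.0, -5.0, 5.0):
        print(snr_db, simulate(snr_db))
```

Because `envelope_ls_gain` is fitted on the same samples used for evaluation, its squared-envelope error cannot exceed that of the WF gain in this toy setup. That says nothing by itself about intelligibility, which the paper assesses with objective criteria and listening experiments.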