Academic Paper

Joint Online Estimation of Early and Late Residual Echo PSD for Residual Echo Suppression
Document Type
Periodical
Source
IEEE/ACM Transactions on Audio, Speech, and Language Processing (IEEE/ACM Trans. Audio Speech Lang. Process.), 31:333-344, 2023
Subject
Signal Processing and Analysis
Computing and Processing
Communication, Networking and Broadcast Technologies
General Topics for Engineers
Reverberation
Microphones
Loudspeakers
Estimation
Echo cancellers
Frequency estimation
Computational modeling
Acoustic echo cancellation
adaptive filters
PSD estimation
residual echo suppression
Language
ISSN
2329-9290
2329-9304
Abstract
In hands-free telephony and other distant-talking applications, an acoustic echo cancellation system is typically required, where a short adaptive filter is often used in practice to achieve fast convergence at low computational cost. This may leave late residual echo (LRE) due to under-modeling of the echo path and early residual echo (ERE) due to filter misalignment. Both residual echo components can be suppressed using a postfilter in the subband domain, which requires accurate estimates of the power spectral density (PSD) of the ERE and LRE components. State-of-the-art methods estimate the ERE and LRE PSDs independently of each other: the ERE PSD is estimated by simply multiplying the loudspeaker PSD by a frequency-dependent scalar, and the LRE PSD is estimated using a recursive estimator based on frequency-dependent reverberation scaling and decay parameters. In this paper, we propose to extend the ERE PSD estimator from a scalar to a moving average filter on the loudspeaker PSD. In addition, we propose a signal-based method to jointly estimate all model parameters for the ERE and LRE PSD estimators online, and derive two gradient-descent-based algorithms to simultaneously update the model parameters by minimizing the mean squared log error. The proposed method is compared with state-of-the-art methods in terms of the estimation accuracy of both the model parameters and the residual echo PSDs. Simulation results using both artificially generated and measured impulse responses show that the proposed method outperforms state-of-the-art methods for all considered scenarios.
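The two estimator structures named in the abstract can be illustrated with a minimal Python sketch for a single subband. It is not the paper's formulation: the abstract gives no equations, so the exact recursion for the LRE PSD, the delay between the loudspeaker PSD and the onset of the late echo, and all names (`ere_psd`, `lre_psd`, `mean_squared_log_error`, `b`, `scale`, `decay`, `delay`, `eps`) are assumptions made for illustration. The gradient-descent parameter updates are not shown; only the cost they would minimize is.

```python
import numpy as np


def ere_psd(phi_x, b):
    """ERE PSD as a causal moving average of the loudspeaker PSD (assumed form).

    phi_x : loudspeaker PSD per frame in one subband
    b     : moving-average coefficients; len(b) == 1 gives the scalar estimator
    """
    phi_x = np.asarray(phi_x, dtype=float)
    b = np.asarray(b, dtype=float)
    # Full convolution truncated to the input length gives sum_k b[k] * phi_x[n - k].
    return np.convolve(phi_x, b)[: len(phi_x)]


def lre_psd(phi_x, scale, decay, delay):
    """Recursive LRE PSD estimate (assumed first-order recursion).

    out[n] = decay * out[n - 1] + scale * phi_x[n - delay]
    """
    phi_x = np.asarray(phi_x, dtype=float)
    out = np.zeros_like(phi_x)
    for n in range(len(phi_x)):
        prev = out[n - 1] if n > 0 else 0.0
        drive = phi_x[n - delay] if n >= delay else 0.0
        out[n] = decay * prev + scale * drive
    return out


def mean_squared_log_error(est, ref, eps=1e-12):
    """Mean squared difference of log PSDs; eps (assumed) guards against log(0)."""
    est = np.asarray(est, dtype=float)
    ref = np.asarray(ref, dtype=float)
    return float(np.mean((np.log(est + eps) - np.log(ref + eps)) ** 2))
```

With an impulse-like loudspeaker PSD such as `[1, 0, 0, 0]`, `ere_psd` returns the moving-average coefficients themselves, while `lre_psd` starts after `delay` frames and then decays geometrically, which is the qualitative split between early and late residual echo described above.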