Academic Paper

DOA Estimation for Multiple Speech Sources Based on Flexible Single-Source Zones and Concentration Weighting
Document Type
Periodical
Source
IEEE Sensors Journal (IEEE Sensors J.), 23(10):10683-10693, May 2023
Subject
Signal Processing and Analysis
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Robotics and Control Systems
Direction-of-arrival estimation
Estimation
Microphone arrays
Histograms
Sensors
Reverberation
Time-frequency analysis
Concentration weighting
direction-of-arrival (DOA) estimation
flexible single-source zone (SSZ)
microphone array
Language
ISSN
1530-437X
1558-1748
2379-9153
Abstract
Direction-of-arrival (DOA) estimation is key to many audio applications. Recently, sparse component analysis (SCA)-based methods have attracted much attention. These methods usually detect single-source points (SSPs) and single-source zones (SSZs), where one source is dominant over the others in the time–frequency (T–F) domain, to construct a pooled histogram containing multisource DOA information. Nonetheless, the SSZ size in existing methods is fixed and empirically predetermined, so it cannot accommodate the varying spectrotemporal properties of speech sources. Furthermore, a higher SSP concentration in an SSZ implies a locally stronger dominant source and more reliable DOA information extracted therein, which has also not yet been considered. To address these problems, a DOA estimation algorithm for multiple speech sources based on flexible SSZs and concentration weighting is presented in this article. First, in each frame, correlation coefficients of time delay vectors across adjacent frequency bins are calculated to identify SSPs, followed by flexible SSZ construction using a varying number of SSPs located at consecutive frequency bins. Next, the number of SSPs in each flexible SSZ is taken as a proxy for the corresponding concentration degree and employed as a weighting factor to form the pooled histogram. Finally, a matching pursuit (MP)-based approach is used to obtain multisource DOA estimates. Simulation results reveal that the proposed method significantly outperforms existing approaches in terms of noise floor in the pooled histogram, angular resolution, and performance under various signal-to-noise ratio (SNR) and reverberant conditions. Real-world experiments also verify its effectiveness and demonstrate considerably reduced computational complexity.
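The flexible-SSZ and concentration-weighting steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that SSPs have already been detected for one frame (given as a boolean mask over frequency bins), that a per-bin DOA estimate in degrees is available, and that every SSP votes into the histogram with a weight equal to the number of SSPs in its zone. The function names `flexible_zones` and `weighted_histogram`, the 1-degree bin width, and the per-SSP voting rule are all hypothetical choices made for illustration; the SSP detection and matching pursuit stages are omitted.

```python
import numpy as np


def flexible_zones(ssp_mask):
    """Return (start, stop) pairs for each run of consecutive SSP bins.

    Each run is treated as one flexible zone; stop is exclusive.
    """
    mask = np.asarray(ssp_mask, dtype=bool)
    # Pad with False so runs touching either end still produce two edges.
    padded = np.concatenate(([False], mask, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return list(zip(edges[0::2].tolist(), edges[1::2].tolist()))


def weighted_histogram(ssp_mask, doa_deg, n_bins=360):
    """Accumulate a DOA histogram for one frame.

    Assumption (not from the source): every SSP in a zone votes for its
    own DOA bin, with a weight equal to the zone's SSP count.
    """
    doa = np.asarray(doa_deg, dtype=float)
    hist = np.zeros(n_bins)
    for start, stop in flexible_zones(ssp_mask):
        weight = stop - start  # number of SSPs in this zone
        idx = np.floor((doa[start:stop] % 360.0) * n_bins / 360.0).astype(int)
        # np.add.at handles repeated indices correctly.
        np.add.at(hist, idx % n_bins, weight)
    return hist


# Toy frame: two zones, with 2 and 3 consecutive SSPs respectively.
mask = [False, True, True, False, True, True, True, False]
doa = [0.0, 40.0, 42.0, 0.0, 118.0, 120.0, 122.0, 0.0]
hist = weighted_histogram(mask, doa)
```

In this toy frame, the three-SSP zone contributes more weight per vote than the two-SSP zone, which reflects the idea that denser zones carry more reliable DOA information. A pooled histogram would sum such per-frame histograms over all frames before peak extraction.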