Academic Paper

Harmonic-Temporal Factor Decomposition for Unsupervised Monaural Separation of Harmonic Sounds

Document Type

Periodical

Author

Source

IEEE/ACM Transactions on Audio, Speech, and Language Processing (IEEE/ACM Trans. Audio Speech Lang. Process.), 29:68-82, 2021

Subject

Signal Processing and Analysis
Computing and Processing
Communication, Networking and Broadcast Technologies
General Topics for Engineers
Time-frequency analysis
Source separation
Parameter estimation
Computational modeling
Music
Harmonic analysis
Spectrogram
Computational auditory scene analysis
harmonic-temporal clustering
monaural audio source separation
non-negative matrix factorization

Language

ISSN

2329-9290
2329-9304

Abstract

We address the problem of separating a monaural mixture of harmonic sounds into the audio signals of individual semitones in an unsupervised manner. Unsupervised monaural audio source separation has thus far been mainly addressed by two approaches: one rooted in computational auditory scene analysis (CASA) and the other based on non-negative matrix factorization (NMF). These approaches focus on different clues for making source separation possible. A CASA-based method called harmonic-temporal clustering (HTC) focuses on a local time-frequency structure of individual sources, whereas NMF focuses on a global time-frequency structure of music spectrograms. These clues do not conflict with each other and can be used to achieve a more reliable audio source separation algorithm. Hence, we propose a monaural audio source separation framework, harmonic-temporal factor decomposition (HTFD), by developing a spectrogram model that encompasses the features of the models used in the NMF and HTC approaches. We further incorporate a source-filter model to build an extension of HTFD, source-filter HTFD (SF-HTFD). We derive efficient parameter estimation algorithms of HTFD and SF-HTFD based on the auxiliary function principle. We show, through music source separation experiments, the efficacy of HTFD and SF-HTFD compared with conventional methods. Furthermore, we demonstrate the effectiveness of HTFD and SF-HTFD for automatic musical key transposition.
