Academic Paper

Semantic rank reduction of music audio
Document Type
Conference
Author
Source
2003 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (IEEE Cat. No.03TH8684), pp. 135-138, 2003
Subject
Signal Processing and Analysis
Computing and Processing
Multiple signal classification
Principal component analysis
Matrix decomposition
Cultural differences
Music information retrieval
Internet
Joining processes
Supervised learning
Acoustic noise
Noise figure
Language
Abstract
Audio understanding and classification tasks are often aided by a reduced dimensionality representation of the source observations. For example, a supervised learning system trained to detect the genre or artist of a piece of music performs better if the input nodes are statistically decorrelated, either to prevent overfitting in the learning process or to 'anchor' similar observations to cluster centroids in the observation space. We provide an alternative approach that decomposes audio observations of music into semantically significant dimensions where each resultant dimension corresponds to the perceived meaning of the audio, and only the most significant meanings (those which are most effective in describing music audio) are kept. We show a fundamentally unsupervised method to obtain this decomposition automatically and compare its performance in a music understanding task against statistical decorrelation approaches such as PCA and non-negative matrix factorization (NMF).
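
For orientation, the statistical decorrelation baseline named in the abstract (PCA) can be sketched as a rank reduction of a feature matrix. This is a minimal illustration only, not the authors' semantic method: the function name `pca_reduce`, the choice of `k`, and the random matrix standing in for audio feature frames are all assumptions made for this example.

```python
import numpy as np

def pca_reduce(X, k):
    """Project rows of X (observations x features) onto the top-k principal components."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean  # center each feature
    # SVD of the centered data; the rows of Vt are the principal directions,
    # ordered by decreasing singular value.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]
    return Xc @ components.T, components, mean

# Synthetic stand-in for audio feature frames (hypothetical data, 200 frames x 40 features).
rng = np.random.default_rng(0)
frames = rng.normal(size=(200, 40))
reduced, components, mean = pca_reduce(frames, k=10)
```

The retained dimensions are mutually uncorrelated and ordered by explained variance, which is the property the abstract contrasts with dimensions chosen for their semantic significance.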