Academic Paper

Analysis and automatic recognition of Human BeatBox sounds: A comparative study
Document Type
Conference
Source
2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4255-4259, Apr. 2015
Subject
Fields, Waves and Electromagnetics
Instruments
Hidden Markov models
Feature extraction
Error analysis
Speech
Databases
Speech processing
Human beatbox
pitch tracking
onset detection
Hidden Markov Model
automatic speech recognition
Language
ISSN
1520-6149
2379-190X
Abstract
“Human BeatBox” (HBB) is a newly expanding contemporary singing style in which the vocalist imitates percussive drum sounds as well as pitched musical instrument sounds. Drum sounds typically use a notation based on plosives and fricatives, and instrument sounds cover vocalisations that go beyond spoken-language vowels. HBB hence constitutes an interesting use case for extending techniques initially developed for speech processing, with the goal of automatically annotating performances as well as developing new sound effects dedicated to HBB performers. In this paper, we investigate three complementary aspects of HBB analysis: pitch tracking, onset detection, and automatic recognition of sounds and instruments. As a first step, a new high-quality HBB audio database has been recorded, carefully segmented, and manually annotated to obtain a ground-truth reference. Various pitch tracking and onset detection methods are then compared and assessed against this reference. Finally, Hidden Markov Models are evaluated, together with an exploration of their parameter space, for the automatic recognition of different types of sounds. The experimental results are very encouraging.