Academic Paper

DNN-based Approach to Detect and Classify Pathological Voice
Document Type
Conference
Source
2018 IEEE International Conference on Big Data (Big Data), pp. 5238-5241, Dec. 2018
Subject
Communication, Networking and Broadcast Technologies
Computing and Processing
Signal Processing and Analysis
Transportation
Pathology
Mel frequency cepstral coefficient
Training
Feature extraction
Diseases
Neurons
Sensitivity
deep learning
Phonotrauma
neoplasm
vocal paralysis
Mel Frequency Cepstral Coefficients
pathological voice classification
Language
Abstract
We participated in the FEMH 2018 Challenge, a Big Data sub-project of the IEEE. The goal of the challenge is to detect pathological voices and to classify them by disorder: phonotrauma, neoplasm, and vocal paralysis. Results are evaluated by sensitivity, specificity, and unweighted average recall (UAR). The database consists of 50 normal voice samples and 150 samples of common voice disorders, recorded at a tertiary teaching hospital (Far Eastern Memorial Hospital, FEMH). This paper proposes a deep neural network (DNN) based approach to the challenge. For data preprocessing we use Mel-frequency cepstral coefficients (MFCCs), which also carry emotion-specific information; gradual spectral variations are captured using 13 MFCCs extracted from the speech signal. For disease detection, we compare the performance of different DNN structures (i.e., the number of hidden layers and the number of neurons). For disease classification, we compare different batch sizes, with and without normalization. Among the DNN structures tested, the best results were obtained with 5 hidden layers and 200 neurons.
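The three evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It assumes the standard definitions (sensitivity is recall on the pathological class, specificity is recall on the normal class, and UAR is the unweighted mean of per-class recalls); the function names, label strings, and toy predictions are hypothetical, and the challenge's official scoring procedure may differ.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every label that occurs in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def unweighted_average_recall(y_true, y_pred):
    """UAR: the mean of per-class recalls, ignoring class sizes."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


def sensitivity_specificity(y_true, y_pred, positive="pathological"):
    """Binary detection scores; assumes both classes occur in y_true."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)


# Toy detection example (hypothetical labels, not data from the paper).
det_true = ["normal"] * 2 + ["pathological"] * 6
det_pred = ["normal", "pathological"] + ["pathological"] * 5 + ["normal"]
sens, spec = sensitivity_specificity(det_true, det_pred)

# Toy three-class example using the disorder names from the abstract.
cls_true = ["phonotrauma", "phonotrauma", "neoplasm", "neoplasm", "vocal paralysis"]
cls_pred = ["phonotrauma", "neoplasm", "neoplasm", "neoplasm", "vocal paralysis"]
uar = unweighted_average_recall(cls_true, cls_pred)
```

Because UAR averages recalls without weighting by class size, a classifier cannot score well simply by favouring the majority class, which matters for a database with three times as many pathological samples as normal ones.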