Academic Paper
DNN-based Approach to Detect and Classify Pathological Voice
Document Type
Conference
Author
Source
2018 IEEE International Conference on Big Data (Big Data), pp. 5238-5241, Dec. 2018
Subject
Language
Abstract
We participated in the FEMH 2018 Challenge, part of a big-data subproject of the IEEE. The goal of the challenge is to detect pathological voices and to classify the underlying disorders, including phonotrauma, neoplasm, and vocal paralysis. Results are evaluated using sensitivity, specificity, and unweighted average recall (UAR). The database consists of 50 normal voice samples and 150 samples of common voice disorders recorded at a tertiary teaching hospital (Far Eastern Memorial Hospital, FEMH). This paper proposes a deep neural network based (DNN-based) approach to the challenge. For data preprocessing we use Mel-frequency cepstral coefficients (MFCCs), which also carry emotion-specific information; gradual spectral variations are captured with 13 MFCCs extracted from the speech signal. For disease detection, we compare the performance of different DNN structures (i.e., the number of hidden layers and the number of neurons). For disease classification, we compare the performance across different batch sizes, with and without normalization. Among the tested DNN structures, the best results were obtained with 5 hidden layers and 200 neurons.
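The three evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), UAR = mean of the per-class recalls); the exact scoring rules of the challenge are not given in this record. The function names and the label strings "normal" and "pathological" are hypothetical, chosen for illustration only.

```python
from collections import Counter


def sensitivity_specificity(y_true, y_pred, positive="pathological"):
    """Binary detection metrics under the standard definitions.

    Assumes both the positive and the negative class occur in y_true;
    otherwise one of the divisions below would divide by zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn)  # recall on the pathological class
    specificity = tn / (tn + fp)  # recall on the normal class
    return sensitivity, specificity


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recall; each class counts equally,
    regardless of how many samples it has."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(correct[label] / total[label] for label in total) / len(total)


# Made-up toy labels, not data from the challenge.
toy_true = ["normal", "normal"] + ["pathological"] * 4
toy_pred = ["normal", "pathological",
            "pathological", "pathological", "pathological", "normal"]

sens, spec = sensitivity_specificity(toy_true, toy_pred)
uar = unweighted_average_recall(toy_true, toy_pred)
print(sens, spec, uar)
```

Because UAR averages recall per class rather than per sample, it is not inflated by the majority class, which matters for an unbalanced set such as 50 normal versus 150 disordered samples. In the two-class detection case it reduces to the mean of sensitivity and specificity; for the multi-class disorder classification task the same function applies unchanged.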