Academic Paper

CNN-n-GRU: end-to-end speech emotion recognition from raw waveform signal using CNNs and gated recurrent unit networks
Document Type
Conference
Source
2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 699-702, Dec. 2022
Subject
Computing and Processing
Engineering Profession
Robotics and Control Systems
Signal Processing and Analysis
Deep learning
Emotion recognition
Neural networks
Speech recognition
Logic gates
Feature extraction
Acoustics
Speech emotion recognition
CNN
RNN
GRU
Signal processing
Waveform signal
Language
Abstract
We present CNN-n-GRU, a new end-to-end (E2E) architecture for speech emotion recognition, composed of an n-layer convolutional neural network (CNN) followed sequentially by an n-layer gated recurrent unit (GRU) network. CNNs and RNNs have both shown promising results when fed raw waveform speech inputs, which motivated us to combine them in a single model to exploit the strengths of each. Instead of using handcrafted features or spectrograms, we train CNNs to learn low-level speech representations directly from the raw waveform, which allows the network to capture relevant narrow-band emotion characteristics. RNNs (GRUs in our case), in turn, learn temporal characteristics, allowing the network to better capture the signal’s time-distributed features. Because a CNN can generate multiple levels of representation abstraction, we use its early layers to extract high-level features and then supply the appropriate input to the subsequent RNN layers, which aggregate long-term dependencies. By combining the strengths of CNNs and GRUs in a single model, the proposed architecture has important advantages over other models from the literature. The proposed model was evaluated on the TESS dataset and compared with state-of-the-art methods. Our experimental results show that the proposed model is more accurate than traditional classification approaches for speech emotion recognition.
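The data flow described in the abstract (raw waveform, then n convolutional layers, then n GRU layers, then an emotion prediction) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation. The kernel size, stride, channel count, hidden size, random untrained weights, omission of biases, and classification from the last hidden state are all assumptions made for illustration. The default of seven output classes is also an assumption, based on TESS commonly being described as covering seven emotion categories. The names `conv1d_relu`, `gru_step`, and `cnn_n_gru` are hypothetical.

```python
import math
import random


def conv1d_relu(x, kernels, stride):
    """Valid 1-D convolution (cross-correlation) followed by ReLU.

    x: list of input channels, each a list of floats.
    kernels: nested list indexed [out_channel][in_channel][tap].
    """
    k = len(kernels[0][0])
    out_len = (len(x[0]) - k) // stride + 1
    out = []
    for filt in kernels:
        row = []
        for t in range(out_len):
            start = t * stride
            s = 0.0
            for c, taps in enumerate(filt):
                window = x[c][start:start + k]
                s += sum(w * v for w, v in zip(taps, window))
            row.append(max(0.0, s))
        out.append(row)
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def gru_step(x, h, p):
    """One GRU update with update gate z and reset gate r (biases omitted)."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
    rh = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b)
            for a, b in zip(matvec(p["Wh"], x), matvec(p["Uh"], rh))]
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def cnn_n_gru(wave, n=2, channels=4, hidden=8, num_classes=7, seed=0):
    """Forward pass: n conv layers, then n GRU layers, then softmax.

    All hyperparameters and the random weights are illustrative assumptions.
    """
    rng = random.Random(seed)
    x = [list(wave)]  # raw waveform as a single input channel
    for _ in range(n):  # n-layer CNN front end (kernel 5, stride 2 assumed)
        kernels = [rand_matrix(len(x), 5, rng) for _ in range(channels)]
        x = conv1d_relu(x, kernels, stride=2)
    seq = [list(frame) for frame in zip(*x)]  # channels-major to time-major
    for _ in range(n):  # n-layer GRU stack over the CNN feature frames
        in_dim = len(seq[0])
        p = {}
        for gate in ("z", "r", "h"):
            p["W" + gate] = rand_matrix(hidden, in_dim, rng)
            p["U" + gate] = rand_matrix(hidden, hidden, rng)
        h = [0.0] * hidden
        outputs = []
        for frame in seq:
            h = gru_step(frame, h, p)
            outputs.append(h)
        seq = outputs
    # Classify from the final hidden state (an assumed readout choice).
    logits = matvec(rand_matrix(num_classes, hidden, rng), seq[-1])
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Usage: a synthetic 200-sample tone stands in for a speech waveform.
wave = [math.sin(0.05 * t) for t in range(200)]
probs = cnn_n_gru(wave)
```

With these assumed settings, the two strided convolutions shorten the 200-sample input to 98 and then 47 frames, so the GRU stack runs over 47 time steps instead of 200. Because the weights are random and untrained, the returned probabilities carry no meaning; the sketch only shows how the pieces connect.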