Academic Paper

Cubic Knowledge Distillation for Speech Emotion Recognition
Document Type
Conference
Source
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5705-5709, Apr. 2024
Subject
Bioengineering
Communication, Networking and Broadcast Technologies
Computing and Processing
Signal Processing and Analysis
Knowledge engineering
Human computer interaction
Emotion recognition
Codes
Speech coding
Speech recognition
Signal processing
Knowledge distillation
Light-weight models
Speech emotion recognition
Language
English
ISSN
2379-190X
Abstract
Speech Emotion Recognition (SER) can play an important role in human-computer interaction. In this paper, we propose a logit knowledge distillation method for SER, called Cubic KD, that distills the knowledge of fine-tuned self-supervised models into small models to improve their performance. By creating cubic structures from the output features of the teacher and student networks and using a loss function that distills the cube structure through self-correlation between its elements, Cubic KD efficiently captures knowledge both within and among instances. We apply this distillation method to four student models and conduct experiments on the Emo-DB and IEMOCAP datasets. The results show that Cubic KD outperforms existing predictive logit knowledge distillation methods and is comparable to intermediate feature knowledge distillation methods. Our implementation code is available at https://github.com/Fly1toMoon/Cubic-Knowledge-Distillation.
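
The abstract describes distilling a correlation structure built from teacher and student output features rather than matching the logits directly. As a rough illustration only, the sketch below shows how a correlation-based logit distillation loss could be written in PyTorch; the function name, the temperature parameter, and the use of pairwise (rather than cubic) correlation terms are assumptions for illustration and do not reproduce the paper's actual Cubic KD loss (see the linked repository for the real implementation).

# Hypothetical sketch of a correlation-based logit distillation loss.
# This is NOT the paper's Cubic KD; it only illustrates the general idea
# of matching self-correlation structures of teacher and student outputs.
import torch
import torch.nn.functional as F

def correlation_kd_loss(student_logits: torch.Tensor,
                        teacher_logits: torch.Tensor,
                        temperature: float = 2.0) -> torch.Tensor:
    """Match self-correlation structures of teacher and student logits.

    student_logits, teacher_logits: tensors of shape (batch, num_classes).
    The correlation matrices capture relations among classes within an
    instance and among instances within the batch.
    """
    # Soften the predictions with a temperature, as in standard logit KD.
    s = F.softmax(student_logits / temperature, dim=-1)
    t = F.softmax(teacher_logits / temperature, dim=-1)

    # Inter-instance correlation: (batch, batch)
    s_inter = s @ s.t()
    t_inter = t @ t.t()

    # Intra-instance (class-wise) correlation: (num_classes, num_classes)
    s_intra = s.t() @ s
    t_intra = t.t() @ t

    # Distill both correlation structures with an MSE objective.
    return F.mse_loss(s_inter, t_inter) + F.mse_loss(s_intra, t_intra)

# Example usage with random logits (e.g. 4 emotion classes, batch of 8):
# loss = correlation_kd_loss(torch.randn(8, 4), torch.randn(8, 4))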