Academic Paper
DynamicMBFN: Dynamic Multimodal Bottleneck Fusion Network for Multimodal Emotion Recognition
Document Type
Conference
Author
Source
2023 3rd International Symposium on Computer Technology and Information Science (ISCTIS), pp. 639-644, Jul. 2023
Subject
Language
Abstract
In multimodal emotion recognition, diverse data modalities such as audio, text, and video must be processed together. Yet existing machine perception models predominantly optimize the handling of individual modalities, fusing the per-modality representations or predictions only in later stages. These multimodal classification algorithms rely chiefly on the complementarity among modalities to improve classification performance, but they often face insufficient data and excessive computation when exploiting this complementary information. To address these issues, we introduce a multimodal fusion network, DynamicMBFN. The network applies dynamic evaluation strategies and sparse gating mechanisms to capture how the information in each modality's features varies. We further propose a bottleneck mechanism that forces the model to organize and condense information within each modality while sharing only the necessary information across modalities. Experimental results on the IEMOCAP dataset confirm that our algorithm not only improves the performance of multimodal information fusion but also effectively reduces computational cost. Our model thus offers an effective solution for multimodal data processing and has substantial practical value for reliable multimodal fusion.
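The abstract names two mechanisms without giving implementation details: sparse gating that keeps only the most informative modalities, and a bottleneck that condenses each modality before sharing. The sketch below is a minimal, hypothetical NumPy illustration of those two ideas only; all function names, weight shapes, and the top-k gating rule are assumptions, not the authors' actual architecture.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def sparse_gate(features, w_gate, k=2):
    """Score each modality with a linear gate; keep only the top-k (sparse gating).

    features, w_gate: dicts mapping modality name -> (d,) vector.
    Returns a dict of gate weights that are zero outside the top-k
    and renormalized to sum to 1 over the kept modalities.
    """
    names = list(features)
    scores = np.array([features[n] @ w_gate[n] for n in names])
    weights = softmax(scores)
    keep = np.argsort(weights)[-k:]              # indices of the k largest weights
    mask = np.zeros_like(weights)
    mask[keep] = weights[keep]
    mask /= mask.sum()                           # renormalize over kept modalities
    return {n: mask[i] for i, n in enumerate(names)}

def bottleneck_fuse(features, gates, b_proj):
    """Compress each kept modality through a (b, d) bottleneck projection, then
    sum the gated, condensed representations into one shared (b,) vector."""
    b = next(iter(b_proj.values())).shape[0]
    fused = np.zeros(b)
    for n, w in gates.items():
        if w > 0:                                # skip modalities the gate zeroed out
            fused += w * (b_proj[n] @ features[n])
    return fused

# Toy usage with random features for three modalities (hypothetical shapes).
rng = np.random.default_rng(0)
d, b = 8, 4
mods = ["audio", "text", "video"]
feats = {m: rng.standard_normal(d) for m in mods}
w_gate = {m: rng.standard_normal(d) for m in mods}
b_proj = {m: rng.standard_normal((b, d)) for m in mods}

gates = sparse_gate(feats, w_gate, k=2)          # only 2 of 3 modalities survive
fused = bottleneck_fuse(feats, gates, b_proj)    # shared (b,)-dim representation
```

The compute saving the abstract claims would come from the gate: modalities with zero weight never pass through their bottleneck projection.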