Academic Paper

Inflation-Deflation Networks for Recognizing Head-Movement Functions in Face-to-Face Conversations
Document Type
Conference
Source
Proceedings of the 2021 International Conference on Multimodal Interaction. :361-369
Subject
deep neural networks
head movement
multimodal meeting analysis
nonverbal behavior
Language
English
Abstract
Head movements have various functions in face-to-face conversations. Recently, convolutional neural networks (CNNs) have been proposed to recognize the communicative functions performed by head movements from the time series of interlocutors’ head pose angles during multiparty conversations. However, their recognition performance leaves room for improvement. To boost the CNNs’ performance, this paper proposes a feature Inflation-Deflation module (I/DeF module), an additional module attached ahead of the CNNs’ input layer to facilitate feature learning of head-movement dynamics. The I/DeF module consists of repeated inflation and deflation processes. The inflation process upscales and extrapolates the windowed input time series by a transposed convolution. The deflation process compresses the inflated data and restores its original length. Targeting the ten most frequent head-movement functions, the experiments showed that CNNs with the I/DeF module (I/DeF-CNNs) outperformed the previous CNNs in all function categories, by up to 4.5 points in F-measure. We also integrated the I/DeF module into VGG and ResNet. A comparison with these models showed that I/DeF-CNNs surpassed the other models for 8 out of 10 functions. These results confirm the effectiveness of the I/DeF module and its potential for advancing nonverbal behavior recognition.
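The inflation and deflation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give kernel sizes, strides, channel counts, or activation functions, so the stride of 2, the fixed 3-tap kernels, the strided convolution used for deflation, and the names `transposed_conv1d`, `strided_conv1d`, and `idef_block` are all assumptions chosen for illustration. In an actual network the kernels would be learned.

```python
def transposed_conv1d(x, kernel, stride):
    """Inflation (assumed form): each input sample scatters a scaled copy
    of the kernel, giving an output of length (len(x) - 1) * stride + len(kernel)."""
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, xi in enumerate(x):
        for j, kj in enumerate(kernel):
            out[i * stride + j] += xi * kj
    return out


def strided_conv1d(x, kernel, stride):
    """Deflation (assumed form): a strided convolution without padding
    that compresses the inflated series."""
    n_out = (len(x) - len(kernel)) // stride + 1
    return [
        sum(x[i * stride + j] * kernel[j] for j in range(len(kernel)))
        for i in range(n_out)
    ]


def idef_block(x, up_kernel, down_kernel, stride):
    """One inflation step followed by one deflation step."""
    inflated = transposed_conv1d(x, up_kernel, stride)
    return strided_conv1d(inflated, down_kernel, stride)


# Toy window of one head pose angle (hypothetical values, not real data).
window = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
up_kernel = [0.5, 1.0, 0.5]       # fixed here; learned in a real model
down_kernel = [0.25, 0.5, 0.25]   # fixed here; learned in a real model

inflated = transposed_conv1d(window, up_kernel, stride=2)
deflated = strided_conv1d(inflated, down_kernel, stride=2)
print(len(window), len(inflated), len(deflated))  # 8 17 8

# Repeating the block keeps the window length unchanged.
x = window
for _ in range(3):
    x = idef_block(x, up_kernel, down_kernel, stride=2)
```

With equal kernel sizes and strides in both steps, the deflated length is ((L - 1)s + k - k) / s + 1 = L, which is why the block can be repeated and then attached ahead of a CNN that expects the original window length.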