Academic Paper

ERR-Net: Facial Expression Removal and Recognition Network With Residual Image
Document Type
Periodical
Source
IEEE Transactions on Biometrics, Behavior, and Identity Science, 5(4):425-434, Oct. 2023
Subject
Bioengineering
Computing and Processing
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Face recognition
Feature extraction
Image recognition
Generative adversarial networks
Training
Computer vision
Emotion recognition
Expression removal
expression recognition
residual features
expression disentanglement
Language
English
ISSN
2637-6407
Abstract
Facial expression recognition is an important part of computer vision and has attracted great attention. Although deep learning has pushed forward the development of facial expression recognition, it still faces major challenges due to expression-unrelated factors such as identity, gender, and race. Inspired by decomposing an expression into two parts, a neutral component and an expression component, we define residual features and propose an end-to-end network framework named Expression Removal and Recognition Network (ERR-Net), which can simultaneously perform expression removal and recognition tasks. The residual features are represented at two levels: the pixel level and the facial landmark level. Our network focuses on interpreting the encoder's output and associating its segments with expressions to maximize the inter-class distances. We use an improved generative adversarial network to convert different expressions into neutral expressions (i.e., expression removal), take the residual images as the output, learn the expression components in the process, and realize the classification of expressions. Extensive ablation experiments show that each improvement added to the network has a clear effect. Experimental comparisons on two benchmarks, CK+ and MMI, demonstrate that the proposed ERR-Net surpasses state-of-the-art methods in terms of accuracy.
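The decomposition described in the abstract, where a pixel-level residual image is obtained by subtracting a GAN-generated neutral face from the expressive input and the residual then drives classification, can be illustrated with a minimal sketch. The module names and architectures below (NeutralizingGenerator, ResidualClassifier) are illustrative placeholders under assumed shapes and class counts, not the ERR-Net implementation.

# Minimal sketch (not the authors' code): pixel-level residual features as
# described in the abstract -- an expressive face minus its generated neutral
# counterpart -- fed to an expression classifier. Architectures are placeholders.
import torch
import torch.nn as nn

class NeutralizingGenerator(nn.Module):
    """Toy stand-in for the expression-removal generator (hypothetical)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x):
        return self.net(x)  # predicted neutral face

class ResidualClassifier(nn.Module):
    """Toy classifier over pixel-level residual images (hypothetical)."""
    def __init__(self, num_classes=7):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(16, num_classes)

    def forward(self, residual):
        h = self.features(residual).flatten(1)
        return self.fc(h)

# Usage: expressive face in, residual image and expression logits out.
generator = NeutralizingGenerator()
classifier = ResidualClassifier()
expr_face = torch.rand(1, 3, 128, 128)      # expressive input face
neutral_face = generator(expr_face)         # expression removal
residual_image = expr_face - neutral_face   # pixel-level residual features
logits = classifier(residual_image)         # expression recognition

In the paper's framing, the residual image isolates the expression component while the neutral component carries identity, gender, and race; classifying on the residual is what reduces the influence of those expression-unrelated factors.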