Academic Paper

UFI: A Unified Feature Interaction Framework for Multi-Label Image Classification
Document Type
Conference
Source
2022 IEEE International Conference on Multimedia and Expo (ICME), pp. 1-6, Jul. 2022
Subject
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Signal Processing and Analysis
Visualization
Benchmark testing
Feature extraction
Transformers
Convolutional neural networks
Task analysis
Image classification
Feature interaction
multi-label classification
class-related feature selection
feature interaction attention
Language
ISSN
1945-788X
Abstract
Multi-label image classification (MLIC) is more challenging than single-label image classification because an image contains multiple target concepts, and the complex visual relationships among them must be modeled. Convolutional Neural Networks (CNNs) and Visual Transformers (ViTs) have shown superior performance in local and global feature representation, respectively. However, current works neglect the interactions between local and global features. To model these critical interactions, this paper designs a Unified Feature Interaction (UFI) framework that integrates selected local features with global features, building on CNN and ViT simultaneously. UFI includes two key modules: Class-Related Feature Selection (CRFS) and Feature Interaction Attention (FIA). Specifically, CRFS selects target regions from the activation map using a preliminary calculation of predicted scores. FIA then enables local-global feature interaction based on the selected target regions and the whole image. This work is an initial attempt to model the interaction between local and global features for multi-label image classification. UFI provides a stable improvement over the baseline and achieves new state-of-the-art results on MS-COCO and VOC2007.
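The abstract names the two modules but does not give their formulas. The sketch below is only one plausible reading, not the authors' implementation. It assumes that region selection picks the most activated location for each of the top-scoring classes, and that the interaction step is ordinary scaled dot-product attention in which selected local features query the global tokens. All function names (`select_regions`, `interact`) and the toy data are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def select_regions(region_feats, class_weights, top_k):
    """Assumed CRFS-like step: choose one region per top-scoring class.

    region_feats: one feature vector per spatial location (local, CNN-style).
    class_weights: one classifier weight vector per class.
    """
    # activation[k][r] is the response of class k at region r.
    activation = [[dot(w, f) for f in region_feats] for w in class_weights]
    # Preliminary class score: mean activation over all regions.
    scores = [sum(row) / len(row) for row in activation]
    top_classes = sorted(range(len(scores)), key=lambda k: scores[k],
                         reverse=True)[:top_k]
    # For each kept class, take its most strongly activated region.
    selected = [max(range(len(region_feats)), key=lambda r: activation[k][r])
                for k in top_classes]
    return top_classes, selected


def interact(local_feats, global_tokens):
    """Assumed FIA-like step: local features attend over global tokens."""
    dim = len(global_tokens[0])
    fused = []
    for query in local_feats:
        weights = softmax([dot(query, t) / math.sqrt(dim)
                           for t in global_tokens])
        # Attention output: weighted average of the global tokens.
        fused.append([sum(w * t[i] for w, t in zip(weights, global_tokens))
                      for i in range(dim)])
    return fused


# Toy data: three regions, three classes, two global tokens (all made up).
region_feats = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
class_weights = [[2.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
global_tokens = [[1.0, 0.0], [0.0, 1.0]]

top_classes, selected = select_regions(region_feats, class_weights, top_k=2)
fused = interact([region_feats[r] for r in selected], global_tokens)
```

In this reading, each fused vector is a convex combination of the global tokens, weighted by similarity to a selected local feature. A real model would add learned projections and multiple attention heads.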