Academic Paper
RGB-D Salient Object Detection Based on Cross-Modal and Cross-Level Feature Fusion
Document Type
Periodical
Source
IEEE Access, vol. 12, pp. 45134-45146, 2024
Subject
Language
English
ISSN
2169-3536
Abstract
Existing RGB-D saliency detection models do not fully account for the differences between features at different levels and lack an effective mechanism for cross-level feature fusion. This article proposes a novel cross-modality, cross-level fusion learning framework built around three modules: an Attention Enhancement Module (AEM), a Modality Feature Fusion Module (MFM), and a Graph Reasoning Module (GRM). The AEM enhances the features of each modality, and the MFM integrates the two modalities' features to achieve cross-modality fusion. The fused features are then divided into high-level features, which carry the semantic localization of salient objects, and low-level features, which carry their fine detail. The GRM propagates the semantic localization information in the high-level features from individual pixels to the entire salient-object region, thereby achieving cross-level feature fusion. The framework effectively suppresses background noise and strengthens the model's representational power. Extensive experiments on seven widely used datasets show that the proposed method outperforms nine state-of-the-art RGB-D SOD methods.
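To make the cross-modal fusion idea described in the abstract concrete, the following is a minimal NumPy sketch of attention-enhanced two-modality feature fusion. The module names (AEM, MFM) come from the abstract, but every operation here is an illustrative assumption in the spirit of common attention/fusion designs, not the authors' actual implementation.

```python
import numpy as np

def attention_enhance(feat):
    """AEM-style channel attention (illustrative): reweight each channel
    by a sigmoid gate on its global-average-pooled response."""
    # feat has shape (C, H, W)
    weights = feat.mean(axis=(1, 2))             # (C,) per-channel statistic
    weights = 1.0 / (1.0 + np.exp(-weights))     # sigmoid gate in (0, 1)
    return feat * weights[:, None, None]         # broadcast over H, W

def modality_fuse(rgb_feat, depth_feat):
    """MFM-style cross-modal fusion (illustrative): enhance each modality,
    then combine multiplicatively (structure shared by both modalities)
    and additively (complementary cues from either modality)."""
    r = attention_enhance(rgb_feat)
    d = attention_enhance(depth_feat)
    return r * d + r + d

# Toy example: 8-channel feature maps at 16x16 resolution
rng = np.random.default_rng(0)
rgb = rng.standard_normal((8, 16, 16))
depth = rng.standard_normal((8, 16, 16))
fused = modality_fuse(rgb, depth)
print(fused.shape)  # (8, 16, 16)
```

In a real network these operations would be learned convolutional layers rather than fixed arithmetic; the sketch only shows how an attention gate and a multiplicative-plus-additive combination yield a single fused feature map per level, ready to be split into the high-level and low-level branches the abstract describes.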