Journal Article

Multimodal Colearning Meets Remote Sensing: Taxonomy, State of the Art, and Future Works
Document Type
Periodical
Source
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 17:7386-7409, 2024
Subject
Geoscience
Signal Processing and Analysis
Power, Energy and Industry Applications
Remote sensing
Laser radar
Task analysis
Data models
Earth
Training
Soft sensors
Multimodal colearning
multimodal learning
remote sensing (RS)
satellite imagery
Language
English
ISSN
1939-1404 (Print)
2151-1535 (Electronic)
Abstract
In remote sensing (RS), multiple data modalities are usually available, e.g., RGB, multispectral, hyperspectral, light detection and ranging (LiDAR), and synthetic aperture radar (SAR). Multimodal machine learning systems, which fuse these rich data sources, have shown better performance than unimodal systems. Most multimodal research assumes that all modalities are present, aligned, and noiseless during both training and testing. In real-world scenarios, however, it is common for one or more modalities to be missing, noisy, or nonaligned, during training, testing, or both. In addition, acquiring large-scale, noise-free annotations is expensive; as a result, insufficient annotated data and inconsistent labels remain open challenges. These challenges can be addressed under a learning paradigm called multimodal colearning. This article focuses on multimodal colearning techniques for RS data. We first review the data modalities available in the RS domain and the key benefits and challenges of combining multimodal data in the RS context. We then review the RS tasks that benefit from multimodal processing, including classification, segmentation, target detection, anomaly detection, and temporal change detection. We dive deeper into technical details by reviewing more than 200 recent efforts in this area and provide a comprehensive taxonomy that systematically organizes state-of-the-art approaches across four key colearning challenges: missing modalities, noisy modalities, limited modality annotations, and weakly paired modalities. Based on these insights, we propose emerging research directions to inform future work in multimodal colearning for RS.