Academic paper

MCCG: A ConvNeXt-Based Multiple-Classifier Method for Cross-View Geo-Localization
Document Type
Periodical
Source
IEEE Transactions on Circuits and Systems for Video Technology (IEEE Trans. Circuits Syst. Video Technol.), 34(3):1456-1468, Mar. 2024
Subject
Components, Circuits, Devices and Systems
Communication, Networking and Broadcast Technologies
Computing and Processing
Signal Processing and Analysis
Feature extraction
Drones
Task analysis
Image segmentation
Semantics
Satellites
Data mining
Cross-view
ConvNeXt
image retrieval
multiple feature representation
Language
ISSN
1051-8215
1558-2205
Abstract
The key to cross-view geo-localization is to match images of the same target taken from different viewpoints, e.g., images from drones and satellites. It is a challenging problem because the appearance of objects changes with the viewpoint. Most existing methods focus mainly on extracting global features or on segmenting feature maps, which loses information contained in the images. To address these issues, we propose a new ConvNeXt-based method called MCCG, which stands for Multiple Classifier for Cross-view Geo-localization. The proposed method captures rich discriminative information through cross-dimension interaction and acquires multiple feature representations, yielding a comprehensive feature representation. Additionally, the model is more robust because the multiple feature representations exploit more contextual information despite position shifts or scale variations. Extensive experiments on the widely used public benchmarks University-1652 and SUES-200 demonstrate that the proposed method achieves state-of-the-art performance in both drone-view target localization and drone navigation, outperforming existing methods by over 3%. Our code and model are available at https://github.com/mode-str/crossview.
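The matching task described in the abstract is commonly framed as image retrieval: a drone-view query feature is compared against a gallery of satellite-view features, and the gallery is ranked by similarity. The following is a minimal sketch of that retrieval step only, not the MCCG model itself. The feature vectors, the target names, and the helper functions `cosine_similarity`, `rank_gallery`, and `recall_at_1` are all hypothetical and chosen for illustration; a real system would obtain the features from a trained network.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_gallery(query, gallery):
    """Return gallery ids ordered from most to least similar to the query."""
    return sorted(
        gallery,
        key=lambda gid: cosine_similarity(query, gallery[gid]),
        reverse=True,
    )


def recall_at_1(queries, gallery):
    """Fraction of queries whose top-ranked gallery id equals the query id."""
    hits = sum(
        1 for qid, feat in queries.items()
        if rank_gallery(feat, gallery)[0] == qid
    )
    return hits / len(queries)


# Hypothetical toy features (assumption: hand-written, not network outputs).
satellite_gallery = {
    "building_a": [0.9, 0.1, 0.0],
    "building_b": [0.1, 0.8, 0.2],
    "building_c": [0.0, 0.2, 0.9],
}
drone_queries = {
    "building_a": [0.8, 0.2, 0.1],
    "building_b": [0.2, 0.9, 0.1],
}

if __name__ == "__main__":
    for target, feature in drone_queries.items():
        print(target, "->", rank_gallery(feature, satellite_gallery))
    print("Recall@1:", recall_at_1(drone_queries, satellite_gallery))
```

Swapping the roles of the two dictionaries (satellite features as queries, drone features as the gallery) gives the reverse retrieval direction, which corresponds to the drone navigation setting mentioned in the abstract.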