Journal Article

Terrain-Informed Self-Supervised Learning: Enhancing Building Footprint Extraction From LiDAR Data With Limited Annotations
Document Type
Periodical
Source
IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1-10, 2024
Subject
Geoscience
Signal processing and analysis
Task analysis
Buildings
Laser radar
Light detection and ranging (LiDAR)
Remote sensing
Computational modeling
Self-supervised learning (SSL)
Surface topography
Segmentation
Language
English
ISSN
0196-2892 (print)
1558-0644 (electronic)
Abstract
Estimating building footprint maps from geospatial data is vital in urban planning, development, disaster management, and various other applications. Deep learning methodologies have gained prominence for generating building segmentation maps, offering the promise of precise footprint extraction without extensive postprocessing. However, these methods face challenges in generalization and label efficiency, particularly in remote sensing, where obtaining accurate labels can be both expensive and time-consuming. To address these challenges, we propose terrain-aware self-supervised learning (SSL), tailored to remote sensing, using digital elevation models (DEMs) derived from light detection and ranging (LiDAR) data. We propose to learn a model that differentiates between bare earth and superimposed structures, enabling the network to implicitly learn domain-relevant features without the need for extensive pixel-level annotations. We test the effectiveness of our approach by evaluating building segmentation performance on test datasets with varying label fractions. Remarkably, with only 1% of the labels (equivalent to 25 labeled examples), our method improves over ImageNet pretraining, showing the advantage of leveraging unlabeled data for feature extraction in the remote sensing domain. The performance improvement is most pronounced in few-shot scenarios, and the gap with ImageNet pretraining gradually narrows as the label fraction increases. We also test on a dataset with substantial distribution shifts, including resolution variation and labeling errors, to demonstrate the generalizability of our approach. Compared with other baselines, including ImageNet pretraining and more complex architectures, our approach consistently performs better, demonstrating the efficiency and effectiveness of self-supervised terrain-aware feature learning.
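The pretext task described in the abstract (learning to separate bare earth from superimposed structures using LiDAR-derived DEMs) can be illustrated with a minimal sketch. The snippet below is an assumption-laden illustration, not the authors' published code: it derives free per-pixel targets from the normalized DSM (DSM minus DTM, with a hypothetical 2 m height threshold) and pretrains a toy encoder on them; the paper's actual backbone, loss, and threshold are not specified in this record.

```python
# Hedged sketch of a terrain-aware SSL pretext task. All names, the toy
# encoder, and the 2 m threshold are illustrative assumptions.
import torch
import torch.nn as nn

class SmallEncoder(nn.Module):
    """Toy convolutional encoder standing in for the paper's backbone."""
    def __init__(self, in_ch: int = 1, feat: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, feat, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)

class TerrainHead(nn.Module):
    """Per-pixel head: structure above bare earth (1) vs. terrain (0)."""
    def __init__(self, feat: int = 32):
        super().__init__()
        self.proj = nn.Conv2d(feat, 1, 1)

    def forward(self, f):
        return self.proj(f)

def pretext_targets(dsm: torch.Tensor, dtm: torch.Tensor,
                    thresh: float = 2.0) -> torch.Tensor:
    # The normalized DSM (DSM - DTM) is free supervision from LiDAR
    # products: pixels rising more than `thresh` metres above the terrain
    # are treated as superimposed structures. No manual labels needed.
    ndsm = dsm - dtm
    return (ndsm > thresh).float()

def pretrain_step(encoder, head, dsm, dtm, optimizer) -> float:
    # One self-supervised step: predict the structure mask from the DSM alone.
    targets = pretext_targets(dsm, dtm)
    logits = head(encoder(dsm))
    loss = nn.functional.binary_cross_entropy_with_logits(logits, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

if __name__ == "__main__":
    encoder, head = SmallEncoder(), TerrainHead()
    opt = torch.optim.Adam(
        list(encoder.parameters()) + list(head.parameters()), lr=1e-3)
    dsm = torch.rand(4, 1, 64, 64) * 30.0        # fake surface-model tiles (m)
    dtm = dsm - torch.rand(4, 1, 64, 64) * 5.0   # fake terrain model below it
    print("pretext loss:", pretrain_step(encoder, head, dsm, dtm, opt))
```

After pretraining on unlabeled DEM tiles in this fashion, the encoder weights would be fine-tuned on the small labeled building-segmentation set (e.g., the 1%, 25-example regime the abstract reports), replacing the ImageNet-pretrained initialization used by the baselines.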