Academic Article

Radar-Based Human Activity Recognition Using Multidomain Multilevel Fused Patch-Based Learning
Document Type
Periodical
Source
IEEE Transactions on Instrumentation and Measurement, vol. 73, pp. 1-14, 2024
Subject
Power, Energy and Industry Applications
Components, Circuits, Devices and Systems
Human activity recognition
Spectrogram
Radar imaging
Image color analysis
Doppler radar
Doppler effect
Computational modeling
ConvMixer
deep learning (DL)
human activity recognition (HAR)
multidomain fusion
multilevel fusion
radar
Language
English
ISSN
0018-9456
1557-9662
Abstract
Recently, several deep-learning (DL) techniques using different types of 2-D representations of radar returns have been developed for radar-based human activity recognition (HAR). Most of these DL techniques involve a fusion approach, either at the feature level (intermediate) or at the decision level (late), because the information obtained from one 2-D radar representation supplements that obtained from another for enhanced HAR. The inputs to these fusion-based DL techniques are red, green, and blue (RGB) images of 2-D representations of radar returns, in which the information contained in the 2-D representations is completely mapped to the color (RGB) domains. However, none of the existing DL techniques exploit this color information explicitly for HAR. This work proposes a novel lightweight multidomain multilevel fused patch-based learning model that exploits the individual color-domain information of RGB images of 2-D representations, namely, range-time maps, range-Doppler maps, and spectrograms, for enhanced radar-based HAR. Specifically, this work proposes a novel domain-level (early) fusion of 2-D representations of radar returns based on their color-domain information: the individual color planes (R, G, B) of the 2-D representations are fused to form consolidated three-channel images that serve as input to an isotropic patch-based learning model called the convolutional mixer (ConvMixer). These early (domain-) fused three-channel images are then used as inputs to attentional feature-level (intermediate) fusion-based ConvMixer models. The performance of the proposed model is evaluated on a publicly available radar-signatures dataset of human activities. The proposed model significantly outperforms the state of the art under a location-wise testing strategy, which eliminates the possibility of data leakage.
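The domain-level (early) fusion described in the abstract can be illustrated with a minimal sketch: one color plane is taken from each RGB-rendered 2-D radar representation and the planes are stacked into a single three-channel image. The function name and the particular channel assignment (R from range-time, G from range-Doppler, B from spectrogram) are assumptions for illustration; the paper's actual plane-selection scheme may differ.

```python
import numpy as np

def fuse_color_planes(rt_rgb, rd_rgb, spec_rgb):
    """Hypothetical domain-level (early) fusion sketch.

    Takes one color plane from each RGB radar representation
    (range-time, range-Doppler, spectrogram) and stacks them into
    a single consolidated three-channel image. The channel choice
    here is illustrative, not necessarily the paper's scheme.
    """
    fused = np.stack(
        [rt_rgb[..., 0],    # R plane of the range-time map
         rd_rgb[..., 1],    # G plane of the range-Doppler map
         spec_rgb[..., 2]], # B plane of the spectrogram
        axis=-1,
    )
    return fused

# Toy 64x64 RGB renderings standing in for the three radar representations.
rt = np.random.rand(64, 64, 3)
rd = np.random.rand(64, 64, 3)
sp = np.random.rand(64, 64, 3)

img = fuse_color_planes(rt, rd, sp)
print(img.shape)  # (64, 64, 3)
```

The fused image retains the same spatial size and channel count as a standard RGB input, so it can be fed directly to a patch-based backbone such as ConvMixer without architectural changes.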