Academic Paper

STFNet: Enhanced and Lightweight Spatiotemporal Fusion Network for Wearable Human Activity Recognition
Document Type
Periodical
Source
IEEE Sensors Journal (IEEE Sensors J.), 24(8):13686-13698, Apr. 2024
Subject
Signal Processing and Analysis
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Robotics and Control Systems
Feature extraction
Sensors
Data mining
Spatiotemporal phenomena
Robot sensing systems
Data models
Computational modeling
Deep learning
human activity recognition (HAR)
inertial measurement unit (IMU)
lightweight design
neural networks
Language
ISSN
1530-437X
1558-1748
2379-9153
Abstract
Human activity recognition using sensor data has become a research hotspot in ubiquitous computing and has a wide range of real-world applications. Effectively extracting and fusing the rich spatiotemporal information contained in data from various sensors and multiple wearable devices is a challenging task. Moreover, in practical deployments, the limited computational and storage resources of edge devices constrain model size and inference speed. Hence, we propose STFNet, an enhanced and lightweight spatiotemporal fusion network for wearable human activity recognition (WHAR). STFNet divides the extraction and fusion of spatiotemporal features into three stages, namely sensing feature extraction (SFE), temporal feature extraction (TFE), and channel fusion (CF), to learn incrementally and reduce information loss. We first extract shallow intrachannel features and learn local interchannel information using a lightweight local attention mechanism. Then, the learned features are fed into a hierarchical convolution structure with residual connections to extract multiscale and comprehensive temporal features. Finally, we use graph convolutions to further fuse the deep interchannel information. STFNet achieves a lightweight design by replacing the conventional global attention mechanism with a local one and by using hierarchical convolution in place of dilated or multikernel convolution. In the experiments, STFNet outperforms the selected WHAR models published in recent years on public datasets and demonstrates its lightweight characteristics when deployed on an edge device.
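The abstract's claim that hierarchical convolution is lighter than multikernel convolution can be illustrated with a rough parameter count. The sketch below is not the authors' implementation. It assumes a Res2Net-style hierarchy (channels split into groups, with each group after the first passing through one small convolution that also receives the previous group's output), and the channel width, group count, and kernel sizes are hypothetical values chosen only for the arithmetic.

```python
def conv1d_params(c_in, c_out, kernel):
    """Weight count of a 1-D convolution, ignoring bias terms."""
    return c_in * c_out * kernel


def multikernel_params(channels, kernels):
    """Parallel branches, one full-width convolution per kernel size."""
    return sum(conv1d_params(channels, channels, k) for k in kernels)


def hierarchical_params(channels, groups, kernel):
    """Assumed Res2Net-style hierarchy: the first group is passed through
    unchanged, and every later group applies one narrow convolution."""
    if channels % groups != 0:
        raise ValueError("channels must be divisible by groups")
    width = channels // groups
    return (groups - 1) * conv1d_params(width, width, kernel)


def hierarchical_receptive_fields(groups, kernel):
    """Receptive field per group: each later group stacks one more
    convolution on top of the previous group's output."""
    return [1 + i * (kernel - 1) for i in range(groups)]


# Hypothetical configuration, not taken from the paper.
CHANNELS, GROUPS = 64, 4
multi = multikernel_params(CHANNELS, kernels=[3, 5, 7])
hier = hierarchical_params(CHANNELS, GROUPS, kernel=3)
fields = hierarchical_receptive_fields(GROUPS, kernel=3)

print(multi)   # 61440 weights for three parallel full-width branches
print(hier)    # 2304 weights for three narrow stacked convolutions
print(fields)  # [1, 3, 5, 7]
```

Under these assumptions the hierarchy reaches the same temporal scales (3, 5, and 7 samples) as the three parallel kernels while using far fewer weights, which is the kind of saving the abstract attributes to the hierarchical design. Actual figures for STFNet depend on details the abstract does not give.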