Academic Paper

Abnormal Behavior Detection Method based on Spatio-temporal Dual-flow Network for Surveillance Videos
Document Type
Conference
Source
2023 IEEE 35th International Conference on Tools with Artificial Intelligence (ICTAI), pp. 849-856, Nov. 2023
Subject
Bioengineering
Components, Circuits, Devices and Systems
General Topics for Engineers
Robotics and Control Systems
Signal Processing and Analysis
Transportation
Convolution
Surveillance
Computational modeling
Video sequences
Feature extraction
Real-time systems
Behavioral sciences
Abnormal behavior detection
Surveillance videos
Partial convolution
Dual attention fusion mechanism
Language
ISSN
2375-0197
Abstract
Current abnormal behavior detection methods for surveillance videos are affected by complex conditions such as video blurring and long distances, resulting in low detection accuracy and slow speed. To address these issues, we propose an abnormal behavior detection method based on a spatio-temporal dual-flow network for surveillance videos. First, the input surveillance videos are sampled at different frame rates, and a spatio-temporal dual-flow partial convolution network is constructed to extract spatial features from the spatial-flow video sequence and motion features from the temporal-flow video sequence. Then, a cross-modal dual-attention fusion mechanism is introduced after each feature extraction stage of the dual-flow partial convolution network to enhance the exchange of feature information between the spatial and temporal flows. Finally, the extracted motion and spatial features are fused to output the detection results. Experiments on the Kinetics-600 dataset show that our method reduces the number of floating-point operations and model parameters compared to the MViT-L algorithm while maintaining accuracy. Compared to lightweight networks such as X3D-XL, it improves Top-1 and Top-5 accuracy by 4.3% and 3.6%, respectively. The experiments demonstrate that the method can reduce the model's computational complexity while obtaining more accurate abnormal behavior detection results when the surveillance video is blurred or the abnormal behavior occurs far away.
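The first step of the abstract, sampling one video at two different frame rates, can be illustrated with a minimal sketch. The abstract does not give the sampling rates or say which flow receives the sparser sequence. The strides below (8 for the spatial flow, 2 for the temporal flow) and the function names `sample_indices` and `dual_rate_sample` are therefore assumptions made purely for illustration, not the authors' implementation.

```python
from typing import List, Tuple


def sample_indices(num_frames: int, stride: int) -> List[int]:
    """Return every `stride`-th frame index in [0, num_frames)."""
    if num_frames < 0:
        raise ValueError("num_frames must be non-negative")
    if stride < 1:
        raise ValueError("stride must be at least 1")
    return list(range(0, num_frames, stride))


def dual_rate_sample(
    num_frames: int,
    spatial_stride: int = 8,   # assumed value: sparse sampling for the spatial flow
    temporal_stride: int = 2,  # assumed value: dense sampling for the temporal flow
) -> Tuple[List[int], List[int]]:
    """Pick frame indices for the two flows from the same clip.

    The spatial flow sees few frames (appearance changes slowly), while the
    temporal flow sees many frames (motion needs finer time resolution).
    This split is an assumption; the abstract only states that the video
    is sampled at different frame rates.
    """
    spatial = sample_indices(num_frames, spatial_stride)
    temporal = sample_indices(num_frames, temporal_stride)
    return spatial, temporal


# Example: a 32-frame clip yields 4 spatial-flow frames and 16 temporal-flow frames.
spatial_idx, temporal_idx = dual_rate_sample(32)
```

In a full pipeline each index list would select the frames fed to the corresponding branch of the dual-flow network; the partial convolution and attention fusion stages are not sketched here because the abstract does not specify them in enough detail.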