Journal Article

Weakly Supervised Video Anomaly Detection via Self-Guided Temporal Discriminative Transformer
Document Type
Periodical
Source
IEEE Transactions on Cybernetics, 54(5):3197-3210, May 2024
Subject
Signal Processing and Analysis
Communication, Networking and Broadcast Technologies
Robotics and Control Systems
General Topics for Engineers
Components, Circuits, Devices and Systems
Computing and Processing
Power, Energy and Industry Applications
Keywords
Feature extraction
Task analysis
Training
Anomaly detection
Detectors
Transformers
Annotations
Multiple instance learning (MIL)
transformer
video anomaly detection (VAD)
video surveillance
weak supervision
Language
English
ISSN
2168-2267
2168-2275
Abstract
Weakly supervised video anomaly detection is generally formulated as a multiple instance learning (MIL) problem, in which an anomaly detector learns to generate frame-level anomaly scores under the supervision of MIL-based video-level classification. However, most previous works suffer from two drawbacks: 1) they lack the ability to model temporal relationships between video segments and 2) they cannot extract sufficiently discriminative features to separate normal and anomalous snippets. In this article, we develop a weakly supervised temporal discriminative (WSTD) paradigm that leverages both temporal relations and feature discrimination to mitigate these drawbacks. To this end, we propose a transformer-styled temporal feature aggregator (TTFA) and a self-guided discriminative feature encoder (SDFE). Specifically, TTFA captures multiple types of temporal relationships between video snippets from different feature subspaces, while SDFE enhances the discriminative power of features by clustering normal snippets and maximizing the separability between anomalous snippets and normal centers in the embedding space. Experimental results on three public benchmarks show that WSTD outperforms state-of-the-art unsupervised and weakly supervised methods, verifying the superiority of the proposed approach.
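To make the abstract's setup concrete, the sketch below illustrates (in PyTorch) the general recipe it describes: a self-attention encoder over snippet features standing in for a TTFA-style temporal aggregator, a top-k MIL loss on video-level labels, and a center-based term in the spirit of SDFE that pulls normal snippets toward a normal center and pushes snippets from anomalous videos away from it. This is not the authors' implementation; all module and function names (SnippetTransformer, mil_topk_loss, center_margin_loss) and hyperparameters are illustrative assumptions.

```python
# Hedged sketch of the MIL-style pipeline described in the abstract.
# Names and hyperparameters are hypothetical, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SnippetTransformer(nn.Module):
    """Self-attention over a bag of snippet features (TTFA-like aggregation)."""
    def __init__(self, feat_dim=2048, d_model=512, n_heads=8, n_layers=2):
        super().__init__()
        self.proj = nn.Linear(feat_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=1024, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.score_head = nn.Linear(d_model, 1)

    def forward(self, x):                  # x: (B, T, feat_dim) snippet features
        h = self.encoder(self.proj(x))     # model temporal relations across snippets
        s = torch.sigmoid(self.score_head(h)).squeeze(-1)  # (B, T) anomaly scores
        return h, s


def mil_topk_loss(scores, video_labels, k=3):
    """Video-level MIL loss: mean of the top-k snippet scores vs. the weak label."""
    topk = scores.topk(k, dim=1).values.mean(dim=1)        # (B,)
    return F.binary_cross_entropy(topk, video_labels.float())


def center_margin_loss(feats, video_labels, margin=1.0):
    """SDFE-flavoured term: cluster normal snippets around a normal center and
    keep snippets from anomalous videos at least `margin` away from it."""
    d = feats.size(-1)
    normal_feats = feats[video_labels == 0].reshape(-1, d)
    if normal_feats.numel() == 0:
        return feats.new_zeros(())
    center = normal_feats.mean(dim=0, keepdim=True)        # normal center
    pull = ((normal_feats - center) ** 2).sum(dim=1).mean()
    abn_feats = feats[video_labels == 1].reshape(-1, d)
    if abn_feats.numel() == 0:
        return pull
    abn_dist = ((abn_feats - center) ** 2).sum(dim=1).sqrt()
    push = F.relu(margin - abn_dist).mean()
    return pull + push


# Usage with random stand-in features (e.g., precomputed snippet embeddings):
model = SnippetTransformer()
x = torch.randn(4, 32, 2048)            # 4 videos, 32 snippets each
y = torch.tensor([0, 1, 0, 1])          # video-level weak labels
feats, scores = model(x)
loss = mil_topk_loss(scores, y) + 0.1 * center_margin_loss(feats, y)
loss.backward()
```

The design choice to supervise only the mean of the top-k snippet scores is one common way to realize MIL with video-level labels; the weighting of the center-based term (0.1 here) is likewise an arbitrary illustrative value.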