Attention-driven pseudo-label self-training for weakly supervised video anomaly detection
Publication Type
Journal Article
Publication Date
9-2026
Abstract
Recently, two-stage self-training methods based on generating pseudo-labels for weakly supervised video anomaly detection (WSVAD) have achieved notable progress. However, the generated pseudo-labels often suffer from incompleteness and noise, which hampers further performance gains. To achieve better pseudo-label generation and self-training performance, inspired by the human attention mechanism, we introduce a novel dual-branch framework for WSVAD that synchronizes pseudo-label generation and self-training. The first branch introduces a video snippet separation and fusion (VSSF) module based on self-attention and cross-attention mechanisms. A video classification module then follows the VSSF module to classify the fused video feature representations, thereby further enhancing the distinction between anomalous and normal snippets. Building on this, we design an attention-driven pseudo-label generation (PLG) module equipped with a denoising strategy. This module infers accurate and comprehensive snippet-level pseudo-labels from the separation process, guided by a compactness-separation loss and a distributional dissimilarity loss. In the second branch, we design a multi-scale temporal feature interaction learning module, which captures rich temporal dependencies among video snippets to enhance their discriminability. The second branch then synchronously receives the latest pseudo-labels from the first branch for snippet classifier learning, which minimizes the impact of noisy snippets and thereby improves self-training performance. Extensive experiments on three benchmark datasets demonstrate that our method consistently surpasses existing two- and multi-stage self-training frameworks and achieves results competitive with or superior to recent one-stage approaches, highlighting the effectiveness of our proposed framework. Our code is available at https://github.com/Beyond-Zw/ADPLG-VAD.
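To illustrate the kind of attention operations the abstract refers to, the sketch below combines self-attention over video snippets with cross-attention against a second feature stream and fuses the two outputs. This is not the authors' VSSF module; the function names, the reference stream, and the additive fusion are all hypothetical simplifications of a generic self/cross-attention pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V
    d = q.shape[-1]
    return softmax(q @ k.T / np.sqrt(d)) @ v

def separate_and_fuse(snippets, reference):
    """Hypothetical separation-and-fusion step (not the paper's VSSF):
    self-attention relates snippets to one another, cross-attention
    relates them to a reference stream, and the results are fused."""
    self_out = attention(snippets, snippets, snippets)      # self-attention
    cross_out = attention(snippets, reference, reference)   # cross-attention
    return 0.5 * (self_out + cross_out)                     # simple additive fusion

# Toy example: 8 video snippets with 16-dim features, 4 reference features
snippets = rng.standard_normal((8, 16))
reference = rng.standard_normal((4, 16))
fused = separate_and_fuse(snippets, reference)
print(fused.shape)  # (8, 16)
```

The fused snippet representations would then feed a classifier head that scores each snippet, from which snippet-level pseudo-labels could be thresholded.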
Keywords
Attention mechanism, Pseudo-label generation, Self-training, Video anomaly detection, Weak supervision
Discipline
Graphics and Human Computer Interfaces
Research Areas
Intelligent Systems and Optimization
Publication
Pattern Recognition
Volume
177
ISSN
0031-3203
Identifier
10.1016/j.patcog.2026.113349
Publisher
Elsevier
Citation
YANG, Zhiwei; LIU, Jing; PANG, Guansong; WU, Peng; and WU, Zhaoyang.
Attention-driven pseudo-label self-training for weakly supervised video anomaly detection. (2026). Pattern Recognition, 177.
Available at: https://ink.library.smu.edu.sg/sis_research/11057
Additional URL
https://doi.org/10.1016/j.patcog.2026.113349