Publication Type
Conference Proceeding Article
Version
acceptedVersion
Publication Date
10-2021
Abstract
Anomaly detection with weakly supervised video-level labels is typically formulated as a multiple instance learning (MIL) problem, in which we aim to identify snippets containing abnormal events, with each video represented as a bag of video snippets. Although current methods show effective detection performance, their recognition of the positive instances, i.e., rare abnormal snippets in the abnormal videos, is largely biased by the dominant negative instances, especially when the abnormal events are subtle anomalies that exhibit only small differences compared with normal events. This issue is exacerbated in many methods that ignore important video temporal dependencies. To address this issue, we introduce a novel and theoretically sound method, named Robust Temporal Feature Magnitude learning (RTFM), which trains a feature magnitude learning function to effectively recognise the positive instances, substantially improving the robustness of the MIL approach to the negative instances from abnormal videos. RTFM also adapts dilated convolutions and self-attention mechanisms to capture long- and short-range temporal dependencies to learn the feature magnitude more faithfully. Extensive experiments show that the RTFM-enabled MIL model (i) outperforms several state-of-the-art methods by a large margin on four benchmark data sets (ShanghaiTech, UCF-Crime, XD-Violence and UCSD-Peds) and (ii) achieves significantly improved subtle anomaly discriminability and sample efficiency.
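The abstract does not state the exact form of the feature magnitude learning function, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes that a snippet's "feature magnitude" is the L2 norm of its feature vector, that each video (bag) is summarised by the mean of its k largest snippet magnitudes, and that training pushes abnormal videos above normal ones by a hinge margin. The function names `topk_mean_magnitude` and `magnitude_margin_loss`, and the parameters `k` and `margin`, are hypothetical and chosen for illustration; they are not taken from this record.

```python
import numpy as np


def topk_mean_magnitude(features, k):
    """Mean of the k largest snippet-level L2 norms in one video (bag).

    features: array of shape (num_snippets, feature_dim).
    Assumption: 'feature magnitude' means the L2 norm of a snippet feature.
    """
    magnitudes = np.linalg.norm(features, axis=1)  # one norm per snippet
    topk = np.sort(magnitudes)[-k:]                # k largest magnitudes
    return float(topk.mean())


def magnitude_margin_loss(abnormal_feats, normal_feats, k=3, margin=10.0):
    """Hinge penalty (assumed form) that is zero once the abnormal video's
    top-k mean magnitude exceeds the normal video's by at least `margin`."""
    gap = topk_mean_magnitude(abnormal_feats, k) - topk_mean_magnitude(normal_feats, k)
    return max(0.0, margin - gap)


# Toy bags: 32 snippets with 4-dimensional features each.
normal_video = np.ones((32, 4))        # every snippet has norm 2.0
abnormal_video = np.ones((32, 4))
abnormal_video[10:13] *= 5.0           # three "abnormal" snippets with norm 10.0

loss = magnitude_margin_loss(abnormal_video, normal_video, k=3, margin=10.0)
print(loss)  # → 2.0 (gap of 8.0 falls short of the margin by 2.0)
```

Scoring a bag by its top-k snippets rather than by all of them is one way to keep a few rare abnormal snippets from being averaged away by the many normal snippets in the same video, which is the bias the abstract describes.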
Keywords
Vision applications and systems, Action and behavior recognition, Transfer/Low-shot/Semi/Unsupervised Learning
Discipline
Artificial Intelligence and Robotics | Databases and Information Systems
Research Areas
Intelligent Systems and Optimization
Publication
2021 18th IEEE/CVF International Conference on Computer Vision: Proceedings, Virtual, October 11-17
First Page
1
Last Page
13
ISBN
9781665428125
Identifier
10.1109/ICCV48922.2021.00493
Publisher
IEEE
City or Country
Piscataway, NJ
Citation
TIAN, Yu; PANG, Guansong; CHEN, Yuanhong; SINGH, Rajvinder; VERJANS, Johan W.; and CARNEIRO, Gustavo.
Weakly-supervised video anomaly detection with contrastive learning of long and short-range temporal features. (2021). 2021 18th IEEE/CVF International Conference on Computer Vision: Proceedings, Virtual, October 11-17. 1-13.
Available at: https://ink.library.smu.edu.sg/sis_research/7021
Copyright Owner and License
Authors
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1109/ICCV48922.2021.00493