Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

10-2023

Abstract

Anomaly detection in video is an important research area and a challenging task in real-world applications. Because large-scale annotated anomaly events are unavailable, most existing video anomaly detection (VAD) methods learn the distribution of normal samples and flag samples that deviate substantially from it as anomalies. To learn the distribution of normal motion and appearance well, many methods employ auxiliary networks to extract foreground object or action information. These high-level semantic features effectively filter out background noise and reduce its influence on detection models. However, the capability of these extra semantic models heavily affects the performance of the VAD methods. Motivated by the impressive generative and anti-noise capacity of diffusion models (DMs), in this work we introduce a novel DM-based method that predicts the features of video frames for anomaly detection. We aim to learn the distribution of normal samples without involving any extra high-level semantic feature extraction models. To this end, we build two denoising diffusion implicit modules to predict and refine the features. The first module concentrates on feature motion learning, while the second focuses on feature appearance learning. To the best of our knowledge, this is the first DM-based method to predict frame features for VAD. The strong capacity of DMs also enables our method to predict normal features more accurately than non-DM-based feature-prediction VAD methods. Extensive experiments show that the proposed approach substantially outperforms state-of-the-art competing methods.
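
The abstract does not give implementation details, so the following is only a minimal sketch of the two generic ingredients it names: a deterministic denoising diffusion implicit (DDIM) update applied to a feature vector, and an anomaly score based on feature prediction error. The function names `ddim_step` and `anomaly_score`, the use of mean squared error as the score, and the toy values are all assumptions made for illustration; they are not taken from the paper, and the noise predictor that a trained network would supply is passed in directly here.

```python
import math


def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM update (eta = 0) on a feature vector.

    x_t            : noisy feature at the current step (list of floats)
    eps_pred       : predicted noise; a trained network would supply this
    alpha_bar_t    : cumulative noise-schedule product at the current step, in (0, 1]
    alpha_bar_prev : cumulative product at the step being moved to, in (0, 1]
    """
    sqrt_a_t = math.sqrt(alpha_bar_t)
    sqrt_one_minus_a_t = math.sqrt(1.0 - alpha_bar_t)
    # Estimate the clean feature implied by the predicted noise.
    x0_pred = [(x - sqrt_one_minus_a_t * e) / sqrt_a_t
               for x, e in zip(x_t, eps_pred)]
    sqrt_a_prev = math.sqrt(alpha_bar_prev)
    sqrt_one_minus_a_prev = math.sqrt(1.0 - alpha_bar_prev)
    # Re-noise the estimate to the previous (less noisy) level.
    return [sqrt_a_prev * x0 + sqrt_one_minus_a_prev * e
            for x0, e in zip(x0_pred, eps_pred)]


def anomaly_score(predicted, observed):
    """Mean squared error between predicted and observed features.

    Hypothetical scoring rule: a larger error means the observed frame
    feature is further from what a model of normal data would predict.
    """
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


# Toy check: with the true noise as the prediction, stepping to
# alpha_bar_prev = 1.0 recovers the clean feature up to rounding.
clean = [0.5, -1.0, 2.0]
noise = [0.1, 0.2, -0.3]
a_t = 0.5
noisy = [math.sqrt(a_t) * c + math.sqrt(1.0 - a_t) * n
         for c, n in zip(clean, noise)]
recovered = ddim_step(noisy, noise, a_t, 1.0)
print(anomaly_score(recovered, clean))
```

In a prediction-based detector of the kind the abstract describes, the features of normal frames would be expected to yield a low score and the features of anomalous frames a higher one; how the paper's two modules are actually combined and scored is not stated on this page.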

Discipline

Artificial Intelligence and Robotics | Graphics and Human Computer Interfaces

Research Areas

Cybersecurity; Intelligent Systems and Optimization

Publication

Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, October 2-6

First Page

5527

Last Page

5537

Publisher

IEEE

City or Country

New York, NY, USA

Citation

YAN, Cheng; ZHANG, Shiyu; LIU, Yang; PANG, Guansong; and WANG, Wenjun. Feature prediction diffusion model for video anomaly detection. (2023). Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, October 2-6. 5527-5537.
Available at: https://ink.library.smu.edu.sg/sis_research/8414

Copyright Owner and License

Authors

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Included in

Artificial Intelligence and Robotics Commons, Graphics and Human Computer Interfaces Commons

Research Collection School Of Computing and Information Systems

Feature prediction diffusion model for video anomaly detection
