This repository is the official open-source implementation of AnyAnomaly: Zero-Shot Customizable Video Anomaly Detection with LVLM by Sunghyun Ahn*, Youngwan Jo*, Kijung Lee, Sein Kwon, Inpyo Hong, and Sanghyun Park. (*equal contribution)
Video anomaly detection (VAD) is crucial for video analysis and surveillance in computer vision. However, existing VAD models rely on learned normal patterns, which makes them difficult to apply to diverse environments. Consequently, users must retrain models or develop separate AI models for each new environment, which requires machine-learning expertise, high-performance hardware, and extensive data collection, limiting the practical usability of VAD. To address these challenges, this study proposes a customizable video anomaly detection (C-VAD) technique and the AnyAnomaly model. C-VAD treats user-defined text as an abnormal event and detects frames containing the specified event in a video. We implemented AnyAnomaly effectively using context-aware visual question answering (VQA) without fine-tuning the large vision language model (LVLM). To validate the effectiveness of the proposed model, we constructed C-VAD datasets and demonstrated the superiority of AnyAnomaly. Furthermore, our approach showed competitive performance on VAD benchmark datasets, achieving state-of-the-art results on the UBnormal dataset and outperforming other methods in generalization across all datasets.

Comparison of the proposed model with the baseline. Both models perform C-VAD, but the baseline operates with frame-level VQA, whereas the proposed model employs segment-level context-aware VQA. Context-aware VQA performs VQA with additional contexts that describe an image. To enhance the object-analysis and action-understanding capabilities of the LVLM, we propose Position Context and Temporal Context (a minimal sketch of how the two contexts can be combined follows the tutorial links below).
- Position Context Tutorial: [Google Colab]
- Temporal Context Tutorial: [Google Colab]
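To make the idea concrete, here is a minimal sketch of how the two contexts could be folded into a single VQA prompt. The function names and prompt wording are illustrative assumptions, not the repository's actual implementation; see the Colab tutorials above for the real pipeline.

```python
# Illustrative sketch of segment-level context-aware VQA prompting.
# All names here (position_context, temporal_context, build_vqa_prompt) are
# hypothetical and do not correspond to the repository's actual API.

def position_context(detections):
    """Describe where key objects appear in the key frame (object analysis)."""
    return "; ".join(f"a {label} near the {region} of the frame"
                     for label, region in detections)

def temporal_context(frame_summaries):
    """Summarize how the scene evolves across the segment (action understanding)."""
    return " -> ".join(frame_summaries)

def build_vqa_prompt(user_text, detections, frame_summaries):
    """Combine the user-defined anomaly text with both contexts into one question."""
    return (
        f"Position context: {position_context(detections)}.\n"
        f"Temporal context: {temporal_context(frame_summaries)}.\n"
        f"Question: How likely is it (0 to 1) that this segment contains "
        f"the event '{user_text}'?"
    )

if __name__ == "__main__":
    prompt = build_vqa_prompt(
        user_text="bicycle",
        detections=[("person", "center"), ("bicycle", "bottom-left")],
        frame_summaries=[
            "a person walks into the scene",
            "the person rides a bicycle",
            "the bicycle crosses the walkway",
        ],
    )
    # The resulting prompt would be passed to the LVLM together with the segment's frames.
    print(prompt)
```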
Table 1 and Table 2 present the evaluation results on the C-VAD datasets (C-ShT, C-Ave). The proposed model achieved performance improvements of 9.88% and 13.65% over the baseline on the C-ShT and C-Ave datasets, respectively. Specifically, it showed improvements of 14.34% and 8.2% in the action class, and 3.25% and 21.98% in the appearance class.


- Anomaly Detection in Diverse Scenarios

| Text | Demo |
|---|---|
| Jumping-Falling-Pickup | ![]() |
| Bicycle-Running | ![]() |
| Bicycle-Stroller | ![]() |

- Anomaly Detection in Complex Scenarios

| Text | Demo |
|---|---|
| Driving outside lane | ![]() |
| People and car accident | ![]() |
| Jaywalking | ![]() |
| Walking drunk | ![]() |
- We processed the ShanghaiTech Campus (ShT) and CUHK Avenue (Ave) datasets to create the labels for the C-ShT and C-Ave datasets. These labels can be found in the `ground_truth` folder. To test on the C-ShT and C-Ave datasets, first download the ShT and Ave datasets and store them in the directory specified by `data_root`.
- You can set the dataset path by editing `data_root` in `config.py` (a sketch follows the table below).
| CUHK Avenue | Shanghai Tech. | Quick Download |
|---|---|---|
| Official Site | Official Site | GitHub Page |
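For reference, a minimal `config.py` could look like the sketch below; only the `data_root` field is mentioned in this README, and the example path is an assumption.

```python
# config.py — illustrative sketch; only `data_root` is referenced in this README.
# Point it at the directory where the downloaded ShT and Ave datasets are stored.
data_root = "/path/to/datasets"  # assumed example path; adjust to your environment
```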
- Once the datasets and the Chat-UniVi model are ready, you can move the provided tutorial files to the main directory and run them directly!
- Chat-UniVi: [GitHub]
- Weights: Chat-UniVi 7B [Huggingface], Chat-UniVi 13B [Huggingface]
- Install required packages:
```bash
git clone https://github.com/PKU-YuanGroup/Chat-UniVi
cd Chat-UniVi
conda create -n chatunivi python=3.10 -y
conda activate chatunivi
pip install --upgrade pip
pip install -e .
pip install numpy==1.24.3

# Download the Model (Chat-UniVi 7B)
mkdir weights
cd weights
sudo apt-get install git-lfs
git lfs install
git lfs clone https://huggingface.co/Chat-UniVi/Chat-UniVi

# Download extra packages
cd ../../
pip install -r requirements.txt
```

- C-Ave type: `[too_close, bicycle, throwing, running, dancing]`
- C-ShT type: `[car, bicycle, fighting, throwing, hand_truck, running, skateboarding, falling, jumping, loitering, motorcycle]`
- C-Ave type (multiple): `[throwing-too_close, running-throwing]`
- C-ShT type (multiple): `[stroller-running, stroller-loitering, stroller-bicycle, skateboarding-bicycle, running-skateboarding, running-jumping, running-bicycle, jumping-falling-pickup, car-bicycle]`
```bash
# Baseline model (Chat-UniVi) → C-ShT
python -u vad_chatunivi.py --dataset=shtech --type=falling
# Proposed model (AnyAnomaly) → C-ShT
python -u vad_proposed_chatunivi.py --dataset=shtech --type=falling
# Proposed model (AnyAnomaly) → C-ShT, diverse anomaly scenarios
python -u vad_proposed_chatunivi.py --dataset=shtech --multiple=True --type=jumping-falling-pickup
```

- MiniCPM-V: [GitHub]
- Install required packages:
```bash
git clone https://github.com/OpenBMB/MiniCPM-V.git
cd MiniCPM-V
conda create -n MiniCPM-V python=3.10 -y
conda activate MiniCPM-V
pip install -r requirements.txt

# Download extra packages
cd ../
pip install -r requirements.txt
```

```bash
# Baseline model (MiniCPM-V) → C-ShT
python -u vad_MiniCPM.py --dataset=shtech --type=falling
# Proposed model (AnyAnomaly) → C-ShT
python -u vad_proposed_MiniCPM.py --dataset=shtech --type=falling
# Proposed model (AnyAnomaly) → C-ShT, diverse anomaly scenarios
python -u vad_proposed_MiniCPM.py --dataset=shtech --multiple=True --type=jumping-falling-pickup
```

If you use our work, please consider citing:
```bibtex
@article{ahn2025anyanomaly,
  title={AnyAnomaly: Zero-Shot Customizable Video Anomaly Detection with LVLM},
  author={Ahn, Sunghyun and Jo, Youngwan and Lee, Kijung and Kwon, Sein and Hong, Inpyo and Park, Sanghyun},
  journal={arXiv preprint arXiv:2503.04504},
  year={2025}
}
```

Should you have any questions, please create an issue on this repository or contact me at [email protected].






