- This repository contains our implementation of Shot Boundary Detection.
- The code is modified from here.
- This code is based on two papers.
Environment

Install the dependencies from `requirement.txt`, or set them up manually with the commands below (tested on Windows 10):

```
conda create --name pytorch100 python=3.6
conda install pytorch=1.0.1 torchvision=0.2.2 cudatoolkit=10.0 -c pytorch
conda install -c menpo ffmpeg=2.7.0
conda install -c conda-forge opencv=3.4.2
conda install tensorboardx=1.6
conda install scikit-learn=0.20.3
pip install matplotlib==3.0.3
pip install tensorflow==1.15.2
pip install thop==0.0.23
```
Directory structure:

```
data/[dataset_name]/
    annotations/    : annotations of videos
    video_lists/    : video file name lists
    videos/         : video files
```
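The layout above can be created up front; a minimal shell sketch for one dataset (the `ClipShots` name is just an example):

```shell
# Create the expected folder layout for one dataset (ClipShots shown here)
mkdir -p data/ClipShots/annotations
mkdir -p data/ClipShots/video_lists
mkdir -p data/ClipShots/videos
```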
- ClipShots Dataset (from here)
  - directory: `data/ClipShots/`
  - clone the above repository into `data/`
  - download the videos from the above repository
- RAI Dataset (from here)
  - directory: `data/RAI/`
  - save the videos to `data/RAI/videos/test`
  - prepare the annotations and video lists (not currently included in this repository)
- TRECVID Dataset
  - directory: `data/TRECVID/[Year]/`
  - We use the TRECVID 2007 dataset; the description below applies to TRECVID 2007.
  - request the TRECVID videos through the following form
  - download the videos in `shot.test` provided by NIST
  - save the videos to `data/TRECVID/07/videos/test`
  - if you do not have annotations:
    - download the annotations from here
    - unzip `sbref07.1.tar.gz` and save the files to `data/TRECVID/07/`
    - then execute `make_trecvid_dataset.py`, which
      - copies TRECVID videos from another directory to `data/TRECVID/07/videos/test` when the `test` folder contains no videos
      - builds the annotations JSON file from the XML files in `/ref/`
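The XML-to-JSON conversion step can be pictured with a minimal sketch. The element and attribute names below (`trans`, `preFNum`, `postFNum`) are assumptions for illustration; the actual TRECVID reference format and the logic in `make_trecvid_dataset.py` may differ:

```python
import json
import xml.etree.ElementTree as ET

def xml_to_annotations(xml_text):
    """Collect shot transitions from a reference XML string.

    Assumes (hypothetically) that each transition is a <trans> element
    carrying 'type', 'preFNum', and 'postFNum' attributes.
    """
    root = ET.fromstring(xml_text)
    transitions = []
    for trans in root.iter("trans"):
        transitions.append({
            "type": trans.get("type"),
            "pre_frame": int(trans.get("preFNum")),
            "post_frame": int(trans.get("postFNum")),
        })
    return transitions

if __name__ == "__main__":
    sample = """<refSeg>
        <trans type="CUT" preFNum="120" postFNum="121"/>
        <trans type="GRA" preFNum="450" postFNum="480"/>
    </refSeg>"""
    # Serialize the collected transitions as a JSON annotations string
    print(json.dumps(xml_to_annotations(sample)))
```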
- Pre-trained Shot Boundary Detection models
  - models used in this repository (GitHub):
    - the trained model for the AlexNet-like backbone: BaiduYun, Google Drive
    - the trained model for the ResNet-18 backbone: BaiduYun, Google Drive
  - save them to `models/`
- Pre-trained Kinetics models
  - You can download the pre-trained models (Google Drive)
  - models used in this repository:
    - `resnet-18-kinetics.pth`
    - `resnet-50-kinetics.pth`
    - `resnext-101-kinetics.pth`
  - save them to `kinetics_pretrained_model/`
The ClipShots and TRECVID 2007 datasets are used (RAI is not). Trained models (Google Drive):

- resnext-101
  - without the pre-trained Kinetics model (Google Drive)
    - command (example):

      ```
      python main_baseline.py --phase test --dataset ClipShots --test_weight results/model_final/model_final_resnext_noPre_epoch5.pth --train_data_type normal --model resnext --model_depth 101 --pretrained_model False --loss_type normal --sample_size 128
      ```
  - with the pre-trained Kinetics model (Google Drive)
    - command (example):

      ```
      python main_baseline.py --phase test --dataset ClipShots --test_weight results/model_final/model_final_resnext_epoch5.pth --train_data_type normal --model resnext --model_depth 101 --pretrained_model True --loss_type normal --sample_size 128
      ```
- Knowledge distillation (teacher: AlexNet, student: ResNeXt-101)
  - without the pre-trained Kinetics model (Google Drive)
    - command (example):

      ```
      python main_baseline.py --phase test --dataset ClipShots --test_weight results/model_final/model_final_teacher_noPre_epoch5.pth --train_data_type normal --model resnext --model_depth 101 --pretrained_model False --loss_type KDloss --sample_size 128
      ```
  - with the pre-trained Kinetics model (Google Drive)
    - command (example):

      ```
      python main_baseline.py --phase test --dataset ClipShots --test_weight results/model_final/model_final_teacher_epoch5.pth --train_data_type normal --model resnext --model_depth 101 --pretrained_model True --loss_type KDloss --sample_size 128
      ```
- detector (uses the pre-trained Kinetics model) (Google Drive)
  - command (example):

    ```
    python main_baseline.py --phase test --dataset ClipShots --test_weight results/model_final/model_final_detector_epoch5.pth --train_data_type cut --layer_policy second --model detector --baseline_model resnet --model_depth 50 --pretrained_model True --loss_type multiloss --sample_size 64
    ```
Training

- options
  - phase: `--phase train`
  - dataset (only ClipShots): `--dataset ClipShots`
  - model
    - alexnet / resnet / resnext:
      - `--model alexnet --train_data_type normal --loss_type normal`
      - `--model resnet --model_depth 18 --train_data_type normal --loss_type normal`
      - `--model resnext --model_depth 101 --train_data_type normal --loss_type normal`
    - detector:
      - `--model detector --baseline_model resnet --model_depth 50 --train_data_type cut --layer_policy second --loss_type multiloss --sample_size 64`
  - with/without a pre-trained model: `--pretrained_model True/False`
  - loss function
    - normal (classification only): `--loss_type normal`
    - KDloss (classification only): `--loss_type KDloss`
    - multiloss (classification + regression): `--loss_type multiloss`
- command (example):

  ```
  python main_baseline.py --phase train --dataset ClipShots --model detector --baseline_model resnet --model_depth 50 --pretrained_model True --loss_type multiloss --sample_size 64
  ```
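The multiloss option combines a classification term with a regression term. The exact formulation lives in the training code; the sketch below only illustrates the idea, and the cross-entropy/L1 pairing and the weighting factor are assumptions, not the repository's definition:

```python
import math

def multiloss(p_transition, y_transition, pred_length, true_length, reg_weight=1.0):
    """Illustrative combined loss: binary cross-entropy on the transition
    class plus an L1 penalty on the predicted transition length.
    The choice of terms and the weighting are assumptions for illustration."""
    eps = 1e-7  # avoid log(0)
    cls_loss = -(y_transition * math.log(p_transition + eps)
                 + (1 - y_transition) * math.log(1 - p_transition + eps))
    reg_loss = abs(pred_length - true_length)
    return cls_loss + reg_weight * reg_loss
```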
Testing

- set `phase='test'` in `opts.py`
- options
  - phase: `--phase test`
  - dataset: `--dataset ClipShots`, `--dataset RAI`, or `--dataset TRECVID[year]` (e.g. `TRECVID07`; only 2007 is supported)
  - model
    - alexnet / resnet / resnext:
      - `--model alexnet --train_data_type normal --loss_type normal`
      - `--model resnet --model_depth 18 --train_data_type normal --loss_type normal`
      - `--model resnext --model_depth 101 --train_data_type normal --loss_type normal`
    - detector:
      - `--model detector --baseline_model resnet --model_depth 50 --train_data_type cut --layer_policy second --loss_type multiloss --sample_size 64`
  - with/without a pre-trained model: `--pretrained_model True/False`
- command (example):

  ```
  python main_baseline.py --phase test --dataset ClipShots --test_weight results/model_final/model_final_detector_epoch5.pth --train_data_type cut --layer_policy second --model detector --baseline_model resnet --model_depth 50 --pretrained_model True --loss_type multiloss --sample_size 64
  ```
Training and Testing

- set `phase='full'` in `opts.py`
Running code

- if you changed any files, run `run_[OS_type].sh`
- otherwise, run `main_baseline.py` with the options above