
Shot Boundary Detection using 3D CNN

  • This repository contains our implementation of Shot Boundary Detection.
  • The code is modified from the following works:
    • Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet? (paper, code)
    • Fast Video Shot Transition Localization with Deep Structured Models (paper, code)
  • This code is based on two papers (in Korean):
    • Shot Boundary Detection Model using Knowledge Distillation (Link)
    • The Cut Transition Detection Model Using the SSD Method (Link)

Requirements

requirement.txt

or

Windows 10
conda create --name pytorch100 python=3.6
conda install pytorch=1.0.1 torchvision=0.2.2 cudatoolkit=10.0 -c pytorch
conda install -c menpo ffmpeg=2.7.0
conda install -c conda-forge opencv=3.4.2
conda install tensorboardx=1.6
conda install scikit-learn=0.20.3
pip install matplotlib==3.0.3
pip install tensorflow==1.15.2
pip install thop==0.0.23
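
To quickly check that the environment is set up correctly, a minimal sketch (the expected versions follow the list above):

    # sanity check for the environment installed above
    import torch
    import torchvision
    import cv2

    print(torch.__version__)            # expected: 1.0.1
    print(torchvision.__version__)      # expected: 0.2.2
    print(cv2.__version__)              # expected: 3.4.2
    print(torch.cuda.is_available())    # True if cudatoolkit 10.0 is configured correctly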

Dataset

  • directory structure

      data/[dataset_name]/
          annotations/    : annotations of videos
          video_lists/    : video file name list
          videos/         : video files
    
  1. ClipShots Dataset (From here)
    • directory

        data/ClipShots/
      
    • clone the above repository into the data/ directory

    • download the videos from the above repository

  2. RAI Dataset (From here)
    • directory

        data/RAI/
      
    • save videos to data/RAI/videos/test

    • prepare the annotations and video_lists (not currently included in this repository)

  3. TRECVID Dataset
    • directory

        data/TRECVID/[Year]/
      
    • We use the TRECVID 2007 dataset; the following description applies to TRECVID 2007.

    1. request the TRECVID videos through the following form
      • download the videos in shot.test provided by NIST
      • save the videos to data/TRECVID/07/videos/test
    2. if you don't have the annotations:
      • download the annotations from here
      • unzip sbref07.1.tar.gz and save the files to data/TRECVID/07/
      • then execute make_trecvid_dataset.py, which:
        • copies the TRECVID videos from another directory to data/TRECVID/07/videos/test when the test folder is empty
        • builds the annotation JSON file from the XML files in /ref/ (an illustrative sketch of this conversion follows this list)
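
The following is a minimal, illustrative sketch of the XML-to-JSON conversion performed by make_trecvid_dataset.py. The element/attribute names (trans, preFNum, postFNum) and the output path are assumptions about the TRECVID reference format; the actual script may differ.

    # illustrative sketch only: convert TRECVID reference XML files into one annotation JSON
    # (element/attribute names and the output path are assumptions, not the repository's exact code)
    import glob
    import json
    import os
    import xml.etree.ElementTree as ET

    annotations = {}
    for xml_path in glob.glob('data/TRECVID/07/ref/*.xml'):
        root = ET.parse(xml_path).getroot()
        video_name = os.path.splitext(os.path.basename(xml_path))[0]
        transitions = []
        for trans in root.iter('trans'):               # one element per shot transition
            transitions.append({
                'type': trans.get('type'),             # e.g. 'CUT' or a gradual type
                'begin': int(trans.get('preFNum')),    # last frame before the transition
                'end': int(trans.get('postFNum')),     # first frame after the transition
            })
        annotations[video_name] = transitions

    with open('data/TRECVID/07/annotations/test.json', 'w') as f:
        json.dump(annotations, f, indent=2)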

Resources

  1. pre-trained Shot Boundary Detection models
  2. pre-trained kinetics models
    • You can download pre-trained models (Google Drive)
    • models used in this repository:
      1. resnet-18-kinetics.pth
      2. resnet-50-kinetics.pth
      3. resnext-101-kinetics.pth
    • save them to the kinetics_pretrained_model/ directory
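
A pre-trained Kinetics checkpoint can be inspected or loaded as in the minimal sketch below, assuming the released files store the weights under a 'state_dict' key with DataParallel 'module.' prefixes (an assumption about the checkpoint format, not code from this repository):

    # minimal sketch: inspect/load a Kinetics-pretrained checkpoint
    # ('state_dict' key and 'module.' prefix are assumptions about the released files)
    import torch

    checkpoint = torch.load('kinetics_pretrained_model/resnext-101-kinetics.pth',
                            map_location='cpu')
    state_dict = checkpoint.get('state_dict', checkpoint)
    # strip the DataParallel prefix so the weights match a single-GPU model definition
    state_dict = {k.replace('module.', '', 1): v for k, v in state_dict.items()}
    print(len(state_dict), 'parameter tensors loaded')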

Experiment results

Dataset

We use the ClipShots and TRECVID 2007 datasets for evaluation; RAI is not used.

  1. resnext-101

    1. no pre-trained Kinetics model (Google Drive)

      • command (example)

        python main_baseline.py --phase test --dataset ClipShots --test_weight results/model_final/model_final_resnext_noPre_epoch5.pth --train_data_type normal --model resnext --model_depth 101 --pretrained_model False --loss_type normal --sample_size 128
        
    2. use pre-trained Kinetics model (Google Drive)

      • command (example)

        python main_baseline.py --phase test --dataset ClipShots --test_weight results/model_final/model_final_resnext_epoch5.pth --train_data_type normal --model resnext --model_depth 101 --pretrained_model True --loss_type normal --sample_size 128
        
  2. Knowledge distillation (teacher: AlexNet, student: ResNeXt-101); a generic KD-loss sketch follows this list

    1. no pre-trained Kinetics model (Google Drive)

      • command (example)

        python main_baseline.py --phase test --dataset ClipShots --test_weight results/model_final/model_final_teacher_noPre_epoch5.pth --train_data_type normal --model resnext --model_depth 101 --pretrained_model False --loss_type KDloss --sample_size 128
        
    2. use pre-trained Kinetics model (Google Drive)

      • command (example)

        python main_baseline.py --phase test --dataset ClipShots --test_weight results/model_final/model_final_teacher_epoch5.pth --train_data_type normal --model resnext --model_depth 101 --pretrained_model True --loss_type KDloss --sample_size 128
        
  3. detector (use pre-trained Kinetics model) (Google Drive)

    • command (example)

      python main_baseline.py --phase test --dataset ClipShots --test_weight results/model_final/model_final_detector_epoch5.pth --train_data_type cut --layer_policy second --model detector --baseline_model resnet --model_depth 50 --pretrained_model True --loss_type multiloss --sample_size 64
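
For reference, knowledge distillation typically combines a hard-label classification loss with a temperature-softened term against the teacher's outputs. The sketch below is a generic Hinton-style KD loss, not the repository's exact KDloss; the temperature and weighting here are illustrative:

    # generic knowledge-distillation loss sketch (not the repository's exact KDloss)
    import torch.nn.functional as F

    def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
        # hard-label classification term
        ce = F.cross_entropy(student_logits, labels)
        # soft-label term: KL divergence between temperature-softened distributions
        kl = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                      F.softmax(teacher_logits / T, dim=1),
                      reduction='batchmean') * (T * T)
        return alpha * kl + (1.0 - alpha) * ce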
      

Training and Testing

- Change the options in the files opts.py and utils/config.py, or

- pass the options directly on the python command line (examples below)

  1. Training

    • options

      • phase

        --phase train
        
      • Dataset (only ClipShots)

        --dataset ClipShots
        
      • Model

        - alexnet / resnet / resnext
        
            --model alexnet --train_data_type normal --loss_type normal
            --model resnet --model_depth 18 --train_data_type normal --loss_type normal
            --model resnext --model_depth 101 --train_data_type normal --loss_type normal
        
        - detector
          
            --model detector --baseline_model resnet --model_depth 50 --train_data_type cut --layer_policy second --loss_type multiloss --sample_size 64
        
      • whether to use a pre-trained Kinetics model

        --pretrained_model True/False
        
      • loss function

        - normal (classification only)
            --loss_type normal

        - KDloss (classification only; knowledge distillation)
            --loss_type KDloss

        - multiloss (classification + regression; an illustrative loss sketch follows this section)
            --loss_type multiloss
        
    • command (example)

      python main_baseline.py --phase train --dataset ClipShots  --model detector --baseline_model resnet --model_depth 50 --pretrained_model True --loss_type multiloss --sample_size 64
      
  2. Testing: set phase='test' in opts.py (or pass --phase test)

    • options

      • phase

        --phase test
        
      • Dataset

        --dataset ClipShots
        --dataset RAI
        --dataset TRECVID[year]
          e.g. TRECVID07 (only 2007 is supported)
        
      • Model

        - alexnet / resnet / resnext
        
            --model alexnet --train_data_type normal --loss_type normal
            --model resnet --model_depth 18 --train_data_type normal --loss_type normal
            --model resnext --model_depth 101 --train_data_type normal --loss_type normal
        
        - detector
          
            --model detector --baseline_model resnet --model_depth 50 --train_data_type cut --layer_policy second --loss_type multiloss --sample_size 64
        
      • whether to use a pre-trained Kinetics model

        --pretrained_model True/False
        
    • command (example)

      python main_baseline.py --phase test --dataset ClipShots --test_weight results/model_final/model_final_detector_epoch5.pth --train_data_type cut --layer_policy second --model detector --baseline_model resnet --model_depth 50 --pretrained_model True --loss_type multiloss --sample_size 64
      
  3. Training and Testing

    • set --phase full (or phase='full' in opts.py) to run training followed by testing
  4. Running code

    • if you changed the option files, run run_[OS_type].sh
    • otherwise, run python main_baseline.py with command-line options
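
The detector's multiloss option listed above combines classification and regression terms. The following is an illustrative sketch of such a multi-task loss, assuming a cross-entropy term for the transition class and a smooth-L1 term for boundary localization; the repository's actual multiloss may differ:

    # illustrative multi-task loss sketch (classification + regression);
    # not the repository's exact multiloss implementation
    import torch.nn.functional as F

    def multi_loss(cls_logits, loc_pred, cls_target, loc_target, lam=1.0):
        # classification term: transition class for each candidate window
        cls_loss = F.cross_entropy(cls_logits, cls_target)
        # regression term: localize the transition boundary within the window
        loc_loss = F.smooth_l1_loss(loc_pred, loc_target)
        return cls_loss + lam * loc_loss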
