
Commit 560c499: first commit (0 parents)

File tree: 100 files changed (+245156, -0 lines)

.gitignore

Lines changed: 98 additions & 0 deletions
@@ -0,0 +1,98 @@
# IntelliJ project files
.idea
*.iml
out
gen

### Vim template
[._]*.s[a-w][a-z]
[._]s[a-w][a-z]
*.un~
Session.vim
.netrwhist
*~

### IPythonNotebook template
# Temporary data
.ipynb_checkpoints/

### Python template
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*/*/*.pyc
*/*.pyc
*.pyc
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
#lib/
#lib64/
parts/
sdist/
var/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*,cover

# Translations
*.mo
*.pot

# Django stuff:
*.log

# Sphinx documentation
docs/_build/

# PyBuilder
target/

*.ipynb
*.params
*.json
.vscode/

lib/dataset/pycocotools/*.c
lib/dataset/pycocotools/*.cpp
lib/nms/*.c
lib/nms/*.cpp

external
output
model
data
demo

.db

README.md

Lines changed: 149 additions & 0 deletions
@@ -0,0 +1,149 @@
# Fully Motion-Aware Network for Video Object Detection

This implementation is a fork of [FGFA](https://github.com/msracver/Flow-Guided-Feature-Aggregation), extended by [Shiyao Wang](https://github.com/wangshy31) with instance-level aggregation and motion pattern reasoning.

## Introduction

**Fully Motion-Aware Network for Video Object Detection (MANet)** is initially described in an [ECCV 2018 paper](https://wangshy31.github.io/papers/2-MANet.pdf). It proposes an end-to-end model, the fully motion-aware network (MANet), which jointly calibrates object features at both the pixel level and the instance level in a unified framework.

The contributions of this paper include:

* Propose an instance-level feature calibration method that learns instance movements through time. Instance-level calibration is more robust to occlusion and outperforms pixel-level feature calibration.
* Develop a motion pattern reasoning module that dynamically combines pixel-level and instance-level calibration according to the motion (a conceptual sketch follows this list).
* Demonstrate MANet on the large-scale [ImageNet VID dataset](http://image-net.org/challenges/LSVRC/) with state-of-the-art performance.

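To make the framework concrete, here is a minimal conceptual sketch in NumPy. Everything below is illustrative: the function names, shapes, and the scalar `instance_weight` are assumptions for exposition, not the repo's API. The actual model uses FlowNet-based bilinear warping, R-FCN position-sensitive features, and learned combination weights (see the three-phase training below).

```python
import numpy as np

def pixel_level_calibrate(support_feat, flow):
    """Pixel-level calibration: warp a support frame's (C, H, W) feature map
    toward the reference frame along a (2, H, W) flow field. Nearest-neighbor
    backward warp for brevity; the paper uses bilinear warping."""
    _, H, W = support_feat.shape
    ys, xs = np.mgrid[0:H, 0:W]
    src_y = np.clip(np.rint(ys + flow[1]).astype(int), 0, H - 1)
    src_x = np.clip(np.rint(xs + flow[0]).astype(int), 0, W - 1)
    return support_feat[:, src_y, src_x]

def instance_level_calibrate(roi_feat, dy, dx):
    """Instance-level calibration: shift a whole instance's (C, h, w) ROI
    feature by its predicted movement, treating the object as one unit."""
    return np.roll(roi_feat, shift=(dy, dx), axis=(1, 2))

def combine(pixel_feat, instance_feat, instance_weight):
    """Motion pattern reasoning: blend the two calibrations (assumed pooled
    to the same shape). A larger instance_weight favors the instance-level
    branch, which the paper finds better for occluded or regularly moving
    objects; pixel-level calibration suits non-rigid motion."""
    return instance_weight * instance_feat + (1.0 - instance_weight) * pixel_feat
```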
## Installation

1. Clone the repo; we refer to the cloned directory as `${MANet_ROOT}`.
```
git clone https://github.com/wangshy31/MANet_for_Video_Object_Detection.git
```
2. Some Python packages may be missing: Cython, opencv-python >= 3.2.0, easydict. If `pip` is set up on your system, they can be fetched and installed by running
```
pip install Cython
pip install opencv-python==3.2.0.6
pip install easydict==1.6
```
3. Run `sh ./init.sh` to build the Cython modules automatically and create some folders.

4. Install MXNet as in [FGFA](https://github.com/msracver/Flow-Guided-Feature-Aggregation):

4.1 Clone MXNet and check out [MXNet v0.10.0](https://github.com/apache/incubator-mxnet/tree/v0.10.0) by

```
git clone --recursive https://github.com/apache/incubator-mxnet.git
cd incubator-mxnet
git checkout v0.10.0
git submodule update
```

We also provide a [repo](https://github.com/wangshy31/mxnet.git) that contains MXNet configured as required.

4.2 Copy the operators in `$(MANet_ROOT)/manet_rfcn/operator_cxx` to `$(YOUR_MXNET_FOLDER)/src/operator/contrib` by

```
cp -r $(MANet_ROOT)/manet_rfcn/operator_cxx/* $(MXNET_ROOT)/src/operator/contrib/
```

4.3 Compile MXNet:

```
cd ${MXNET_ROOT}
make -j4
```

4.4 Install the MXNet Python binding by

```
cd python
sudo python setup.py install
```
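As a quick sanity check (our suggestion, not part of the repo's scripts), verify that Python picks up the custom-built binding and that it matches the required version:

```python
# Hypothetical post-install check: confirm the locally built MXNet imports
# and reports the expected v0.10.0 release.
import mxnet as mx

print("MXNet version:", mx.__version__)
assert mx.__version__.startswith("0.10"), "unexpected MXNet version"
```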
## Preparation for Training & Testing

**For data processing**:

1. Please download the ILSVRC2015 DET and ILSVRC2015 VID datasets, and make sure the layout looks like this:

```
./data/ILSVRC2015/
./data/ILSVRC2015/Annotations/DET
./data/ILSVRC2015/Annotations/VID
./data/ILSVRC2015/Data/DET
./data/ILSVRC2015/Data/VID
./data/ILSVRC2015/ImageSets
```

2. Please download the ImageNet pre-trained ResNet-v1-101 model and the Flying-Chairs pre-trained FlowNet model manually from [OneDrive](https://1drv.ms/u/s!Am-5JzdW2XHzhqMOBdCBiNaKbcjPrA), and put them under the `./model` folder, so that it looks like this:

```
./model/pretrained_model/resnet_v1_101-0000.params
./model/pretrained_model/flownet-0000.params
```

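Before launching training, it may help to confirm this layout programmatically. The snippet below is a small illustrative pre-flight check, not shipped with the repo; it uses only the paths listed above:

```python
# Illustrative pre-flight check: verify datasets and pretrained weights
# are where the training scripts expect them.
import os

required = [
    "./data/ILSVRC2015/Annotations/DET",
    "./data/ILSVRC2015/Annotations/VID",
    "./data/ILSVRC2015/Data/DET",
    "./data/ILSVRC2015/Data/VID",
    "./data/ILSVRC2015/ImageSets",
    "./model/pretrained_model/resnet_v1_101-0000.params",
    "./model/pretrained_model/flownet-0000.params",
]
missing = [p for p in required if not os.path.exists(p)]
print("all paths present" if not missing else "missing: %s" % missing)
```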
**For training & testing**:

1. Three-phase training is performed on a mixture of ImageNet DET+VID, which helps the final performance.

**Phase 1**: Fix the weights of ResNet and combine the pixel-level and instance-level aggregated features by averaging. See script/train/phase-1.

**Phase 2**: Similar to phase 1, but jointly train ResNet. See script/train/phase-2.

**Phase 3**: Fix the weights of ResNet, replace the average with learnable weights, and sample more VID data. See script/train/phase-3. (A rough sketch of this change follows.)

We use 4 GPUs to train models on ImageNet VID. Any NVIDIA GPU with at least 8 GB of memory should be fine.

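As a rough sketch of what changes in phase 3 (names are hypothetical; the actual code is in script/train/phase-3 and the model definition), the fixed average of phases 1-2 becomes a weight trained end to end:

```python
import numpy as np

def combine_phase12(pixel_feat, instance_feat):
    # Phases 1-2: fixed 50/50 average of the two calibrated features.
    return 0.5 * (pixel_feat + instance_feat)

def combine_phase3(pixel_feat, instance_feat, weight_logit):
    # Phase 3: the averaging weight becomes a trainable parameter
    # (a single logit here for brevity), squashed to (0, 1).
    w = 1.0 / (1.0 + np.exp(-weight_logit))
    return w * instance_feat + (1.0 - w) * pixel_feat
```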
2. To perform experiments, run the Python script with the corresponding config file as input. For example, to train and test MANet with R-FCN, use the following command:

```
./run.sh
```

A cache folder is created automatically to save the model and the log under `imagenet_vid/`.

3. Please find more details in the config files and in our code.

## Main Results

1. We conduct an ablation study to validate the effectiveness of the proposed network.

![ablation study](images/table2.png)

**Table 1**. Accuracy of different methods on ImageNet VID validation, using a ResNet-101 feature extraction network. Detection accuracy is reported separately for slow (motion IoU > 0.9), medium (0.7 ≤ motion IoU ≤ 0.9), and fast (motion IoU < 0.7) moving object instances.

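For reference, the slow/medium/fast split follows the motion IoU protocol from FGFA: the IoU of the same object's ground-truth boxes in nearby frames. A small illustrative helper (our reading of the protocol, not the repo's evaluation code):

```python
def iou(a, b):
    # Boxes as (x1, y1, x2, y2); returns intersection-over-union.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def motion_category(motion_iou):
    # Thresholds from the Table 1 caption above.
    if motion_iou > 0.9:
        return "slow"
    if motion_iou >= 0.7:
        return "medium"
    return "fast"
```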
2. We take a deeper look at the detection results and show that the two calibrated features have complementary strengths.

![visualization](images/table3.png)

**Figure 1**. Visualization of two typical cases: occluded and non-rigid objects. They show the respective strengths of the two calibration methods.

![statistical analysis](images/table4.png)

**Table 2**. Statistical analysis on different validation sets. Instance-level calibration is better when objects are occluded or move regularly, while pixel-level calibration performs well on non-rigid motion. Combining the two modules achieves the best performance.

## Download Trained Models

You can download the trained MANet from [Google Drive](https://drive.google.com/file/d/1tKFfOKaFUeZanKTCCwVw-xaKu0wAw71t/view?usp=sharing). It achieves 78.03% mAP without sequence-level post-processing (e.g., SeqNMS).

## Citing MANet

If you find Fully Motion-Aware Network for Video Object Detection useful in your research, please consider citing:

```
@inproceedings{wang2018fully,
  author    = {Wang, Shiyao and Zhou, Yucong and Yan, Junjie and Deng, Zhidong},
  title     = {Fully Motion-Aware Network for Video Object Detection},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  pages     = {542--557},
  year      = {2018}
}
```
