Commit be35b35

Xinlei Chen committed: Update new models.

1 parent 7fa363b · commit be35b35

File tree: 6 files changed (+20 −23 lines)


README.md

Lines changed: 17 additions & 17 deletions
@@ -7,37 +7,37 @@ A Tensorflow implementation of faster RCNN detection framework by Xinlei Chen (x
 The current code supports **VGG16**, **Resnet V1** and **Mobilenet V1** models. We mainly tested it on plain VGG16 and Resnet101 (thank you @philokey!) architectures. As the baseline, we report numbers using a single model on a single convolution layer, so no multi-scale, no multi-stage bounding box regression, no skip-connection, no extra input is used. The only data augmentation technique is left-right flipping during training, following the original Faster RCNN. All models are released.

 With VGG16 (``conv5_3``):
-- Train on VOC 2007 trainval and test on VOC 2007 test, **71.2**.
-- Train on VOC 2007+2012 trainval and test on VOC 2007 test ([R-FCN](https://github.com/daijifeng001/R-FCN) schedule), **75.3**.
-- Train on COCO 2014 [trainval35k](https://github.com/rbgirshick/py-faster-rcnn/tree/master/models) and test on [minival](https://github.com/rbgirshick/py-faster-rcnn/tree/master/models) (900k/1190k), **29.5**.
+- Train on VOC 2007 trainval and test on VOC 2007 test, **70.8**.
+- Train on VOC 2007+2012 trainval and test on VOC 2007 test ([R-FCN](https://github.com/daijifeng001/R-FCN) schedule), **75.7**.
+- Train on COCO 2014 [trainval35k](https://github.com/rbgirshick/py-faster-rcnn/tree/master/models) and test on [minival](https://github.com/rbgirshick/py-faster-rcnn/tree/master/models) (900k/1190k), **30.2**.

 With Resnet101 (last ``conv4``):
-- Train on VOC 2007 trainval and test on VOC 2007 test, **75.2**.
-- Train on VOC 2007+2012 trainval and test on VOC 2007 test (R-FCN schedule), **79.3**.
-- Train on COCO 2014 trainval35k and test on minival (900k/1190k), **34.1**.
+- Train on VOC 2007 trainval and test on VOC 2007 test, **75.7**.
+- Train on VOC 2007+2012 trainval and test on VOC 2007 test (R-FCN schedule), **79.8**.
+- Train on COCO 2014 trainval35k and test on minival (900k/1190k), **35.2**.

 More Results:
-- Train Mobilenet (1.0, 224) on COCO 2014 trainval35k and test on minival (900k/1190k), **21.9**.
-- Train Resnet50 on COCO 2014 trainval35k and test on minival (900k/1190k), **31.6**.
-- Train Resnet152 on COCO 2014 trainval35k and test on minival (900k/1190k), **35.2**.
+- Train Mobilenet (1.0, 224) on COCO 2014 trainval35k and test on minival (900k/1190k), **21.8**.
+- Train Resnet50 on COCO 2014 trainval35k and test on minival (900k/1190k), **32.3**.
+- Train Resnet152 on COCO 2014 trainval35k and test on minival (900k/1190k), **36.1**.

 Approximate *baseline* [setup](https://github.com/endernewton/tf-faster-rcnn/blob/master/experiments/cfgs/res101-lg.yml) from [FPN](https://arxiv.org/abs/1612.03144) (this repo does not contain training code for FPN yet):
-- Train Resnet50 on COCO 2014 trainval35k and test on minival (900k/1190k), **33.4**.
-- Train Resnet101 on COCO 2014 trainval35k and test on minival (900k/1190k), **36.3**.
-- Train Resnet152 on COCO 2014 trainval35k and test on minival (1000k/1390k), **37.2**.
+- Train Resnet50 on COCO 2014 trainval35k and test on minival (900k/1190k), **34.2**.
+- Train Resnet101 on COCO 2014 trainval35k and test on minival (900k/1190k), **37.4**.
+- Train Resnet152 on COCO 2014 trainval35k and test on minival (900k/1190k), **38.2**.

 **Note**:
-- The numbers should be further improved now, please stay tuned.
 - Due to the randomness in GPU training with Tensorflow, especially for VOC, the best numbers (over 2-3 attempts) are reported here. In my experience, for COCO you can almost always get a very close number (within ~0.2%) despite the randomness.
-- **All** the numbers are obtained with a different testing scheme without selecting region proposals using non-maximal suppression (TEST.MODE top), the default and original testing scheme (TEST.MODE nms) will likely result in slightly worse performance (see [report](https://arxiv.org/pdf/1702.02138.pdf), for COCO it drops 0.X AP).
+- The numbers are obtained with the **default** testing scheme, which selects region proposals using non-maximal suppression (TEST.MODE nms); the alternative testing scheme (TEST.MODE top) will likely result in slightly better performance (see [report](https://arxiv.org/pdf/1702.02138.pdf); for COCO it boosts 0.X AP).
 - Since we keep the small proposals (\< 16 pixels width/height), our performance is especially good for small objects.
 - We do not set a threshold (the original code uses 0.05) for a detection to be included in the final result, which increases recall.
+- Weight decay is set to 1e-4.
 - For other minor modifications, please check the [report](https://arxiv.org/pdf/1702.02138.pdf). Notable ones include using ``crop_and_resize`` and excluding ground truth boxes in RoIs during training.
-- For COCO, we find the performance improving with more iterations (VGG16 350k/490k: 26.9, 600k/790k: 28.3, 900k/1190k: 29.5), and potentially better performance can be achieved with even more iterations.
-- For Resnets, we fix the first block (total 4) when fine-tuning the network, and only use ``crop_and_resize`` to resize the RoIs (7x7) without max-pool (which I find useless, especially for COCO). The final feature maps are average-pooled for classification and regression. All batch normalization parameters are fixed. Weight decay is set to the Resnet101 default 1e-4. Learning rate for biases is not doubled.
+- For COCO, we find performance improves with more iterations, and potentially better performance can be achieved with even more iterations.
+- For Resnets, we fix the first block (total 4) when fine-tuning the network, and only use ``crop_and_resize`` to resize the RoIs (7x7) without max-pool (which I find useless, especially for COCO). The final feature maps are average-pooled for classification and regression. All batch normalization parameters are fixed. Learning rate for biases is not doubled.
 - For Mobilenets, we fix the first five layers when fine-tuning the network. All batch normalization parameters are fixed. Weight decay for Mobilenet layers is set to 4e-5.
 - For the approximate [FPN](https://arxiv.org/abs/1612.03144) baseline setup we simply resize the image to 800 pixels, add 32^2 anchors, and take 1000 proposals during testing.
-- Check out [here](http://ladoga.graphics.cs.cmu.edu/xinleic/tf-faster-rcnn/)/[here](http://xinlei.sp.cs.cmu.edu/xinleic/tf-faster-rcnn/)/[here](https://drive.google.com/open?id=0B1_fAEgxdnvJSmF3YUlZcHFqWTQ) for the latest models, including longer COCO VGG16 models and Resnet ones.
+- Check out [here](http://ladoga.graphics.cs.cmu.edu/xinleic/tf-faster-rcnn/)/[here](http://xinlei.sp.cs.cmu.edu/xinleic/tf-faster-rcnn/)/[here](https://drive.google.com/open?id=0B1_fAEgxdnvJSmF3YUlZcHFqWTQ) for the latest models (**needs to be updated**), including longer COCO VGG16 models and Resnet ones.

 ### Additional features
 Additional features not mentioned in the [report](https://arxiv.org/pdf/1702.02138.pdf) are added to make research life easier:
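
For readers skimming this diff, a quick illustration of the ``crop_and_resize`` trick referenced in the notes above: RoIs are bilinearly cropped from the feature map to a fixed grid (7x7) instead of max-pooled, and the per-RoI maps are later average-pooled for the heads. The sketch below is a minimal version of that idea, assuming TF 1.x and a single image per batch; ``roi_features`` and its arguments are illustrative names, and average-pooling directly after the crop is a simplification, not the repository's exact code (which runs the remaining Resnet block in between).

```python
# Minimal sketch of crop_and_resize-based RoI feature extraction (TF 1.x).
# Illustrative only: names and the direct average pooling are simplifications.
import tensorflow as tf

def roi_features(features, rois, im_height, im_width, pool_size=7):
  """features: [1, H, W, C] conv map; rois: [N, 4] boxes in image coords."""
  # crop_and_resize expects normalized [y1, x1, y2, x2] boxes.
  x1, y1, x2, y2 = tf.split(rois, 4, axis=1)
  boxes = tf.concat([y1 / im_height, x1 / im_width,
                     y2 / im_height, x2 / im_width], axis=1)
  # All RoIs come from the single image in the batch (index 0).
  batch_ids = tf.zeros([tf.shape(boxes)[0]], dtype=tf.int32)
  # Bilinearly sample a pool_size x pool_size crop per RoI -- no max-pool.
  crops = tf.image.crop_and_resize(features, boxes, batch_ids,
                                   [pool_size, pool_size])
  # Average-pool each RoI map into a feature vector for the heads.
  return tf.reduce_mean(crops, axis=[1, 2])
```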

experiments/cfgs/res101-lg.yml

Lines changed: 0 additions & 1 deletion
@@ -9,7 +9,6 @@ TRAIN:
   BG_THRESH_LO: 0.0
   DISPLAY: 20
   BATCH_SIZE: 256
-  WEIGHT_DECAY: 0.0001
   DOUBLE_BIAS: False
   SNAPSHOT_PREFIX: res101_faster_rcnn
   SCALES: [800]

experiments/cfgs/res101.yml

Lines changed: 0 additions & 1 deletion
@@ -9,7 +9,6 @@ TRAIN:
   BG_THRESH_LO: 0.0
   DISPLAY: 20
   BATCH_SIZE: 256
-  WEIGHT_DECAY: 0.0001
   DOUBLE_BIAS: False
   SNAPSHOT_PREFIX: res101_faster_rcnn
 TEST:

experiments/cfgs/res50.yml

Lines changed: 0 additions & 1 deletion
@@ -9,7 +9,6 @@ TRAIN:
   BG_THRESH_LO: 0.0
   DISPLAY: 20
   BATCH_SIZE: 256
-  WEIGHT_DECAY: 0.0001
   DOUBLE_BIAS: False
   SNAPSHOT_PREFIX: res50_faster_rcnn
 TEST:
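
The same one-line deletion appears in all three experiment cfgs for a reason: keys in an experiments/cfgs yml are overlaid onto the defaults in lib/model/config.py, so once the default weight decay drops to 1e-4 (the config.py hunk below), the per-experiment WEIGHT_DECAY entries become redundant. A rough sketch of that overlay behavior, assuming the cfg_from_file-style merge this code base inherits from py-faster-rcnn (``merge_into`` is an illustrative stand-in, not the repo's exact function):

```python
# Sketch of yml-over-defaults config merging: keys present in the experiment
# yml override the defaults; absent keys fall through to config.py.
import yaml

def merge_into(overrides, defaults):
  """Recursively overlay an experiment yml dict onto the default cfg dict."""
  for key, value in overrides.items():
    if isinstance(value, dict):
      merge_into(value, defaults.setdefault(key, {}))
    else:
      defaults[key] = value

defaults = {'TRAIN': {'WEIGHT_DECAY': 0.0001, 'DOUBLE_BIAS': True}}
overrides = yaml.safe_load("TRAIN:\n  DOUBLE_BIAS: False\n")
merge_into(overrides, defaults)
# WEIGHT_DECAY is gone from the yml, so the new 1e-4 default applies.
assert defaults['TRAIN'] == {'WEIGHT_DECAY': 0.0001, 'DOUBLE_BIAS': False}
```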

lib/model/config.py

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@
 __C.TRAIN.MOMENTUM = 0.9

 # Weight decay, for regularization
-__C.TRAIN.WEIGHT_DECAY = 0.0005
+__C.TRAIN.WEIGHT_DECAY = 0.0001

 # Factor for reducing the learning rate
 __C.TRAIN.GAMMA = 0.1
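
For context on what this default controls: a TRAIN.WEIGHT_DECAY value like this is typically wired in as the coefficient of an L2 regularizer on conv/fc weights via a slim arg_scope, with the collected regularization losses added to the task loss. A minimal TF 1.x sketch of that pattern (an assumption about the wiring, not a copy of this repo's code):

```python
# Minimal TF 1.x / slim sketch: weight decay as an L2 regularizer coefficient.
import tensorflow as tf
slim = tf.contrib.slim

weight_decay = 0.0001  # the new TRAIN.WEIGHT_DECAY default

with slim.arg_scope([slim.conv2d, slim.fully_connected],
                    weights_regularizer=slim.l2_regularizer(weight_decay)):
  net = slim.conv2d(tf.zeros([1, 224, 224, 3]), 64, [3, 3], scope='conv1')

# Each regularized layer contributes to REGULARIZATION_LOSSES; summing the
# collection and adding it to the task loss applies the decay during training.
reg_loss = tf.add_n(tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES))
```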

lib/nets/network.py

Lines changed: 2 additions & 2 deletions
@@ -345,10 +345,10 @@ def _region_classification(self, fc7, is_training, initializer, initializer_bbox

     return cls_prob, bbox_pred

-  def _image_to_head(self, is_training, reuse=False):
+  def _image_to_head(self, is_training, reuse=None):
     raise NotImplementedError

-  def _head_to_tail(self, pool5, is_training, reuse=False):
+  def _head_to_tail(self, pool5, is_training, reuse=None):
     raise NotImplementedError

   def create_architecture(self, mode, num_classes, tag=None,
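
On the reuse=False → reuse=None default: in TF 1.x, tf.variable_scope interprets reuse=None as "inherit the enclosing scope's reuse flag", while an explicit boolean pins the behavior (and some TF 1.x versions reject reuse=False outright). With None, a caller can open a reusing scope around the head builders and have variable sharing kick in automatically. A small illustration of those semantics; this is my reading of the motivation, not a rationale stated in the commit:

```python
# TF 1.x variable_scope reuse semantics: None inherits the enclosing flag.
import tensorflow as tf

def head(reuse=None):
  with tf.variable_scope('head', reuse=reuse):
    return tf.get_variable('w', shape=[1])

w_train = head()  # no enclosing reuse flag: creates head/w
with tf.variable_scope(tf.get_variable_scope(), reuse=True):
  w_test = head()  # inherits reuse=True, so head/w is shared, not recreated
assert w_train is w_test
```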
