[Feature] Support MobileNetV2 backbone (open-mmlab#86)

xvjiarui · web-flow · commit 3c6dd9e6a4cb · 2020-09-04T15:35:52.000+08:00
* [Feature] Support MobileNetV2 backbone

* Fixed import

* Fixed test

* Fixed test

* Fixed dilate

* upload model

* update table

* update table

* update bibtex

* update MMCV requirement
diff --git a/configs/mobilenet_v2/README.md b/configs/mobilenet_v2/README.md
@@ -0,0 +1,32 @@
+# MobileNetV2: Inverted Residuals and Linear Bottlenecks
+
+## Introduction
+
+```
+@inproceedings{sandler2018mobilenetv2,
+  title={Mobilenetv2: Inverted residuals and linear bottlenecks},
+  author={Sandler, Mark and Howard, Andrew and Zhu, Menglong and Zhmoginov, Andrey and Chen, Liang-Chieh},
+  booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
+  pages={4510--4520},
+  year={2018}
+}
+```
+
+
+## Results and models
+
+### Cityscapes
+|   Method   | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU  | mIoU(ms+flip) |                                                                                                                                                                                                              download                                                                                                                                                                                                              |
+|------------|----------|-----------|--------:|---------:|----------------|------:|---------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| FCN        | M-V2-D8  | 512x1024  |   80000 |      3.4 | 14.2           | 61.54 | -             | [model](https://openmmlab.oss-accelerate.aliyuncs.com/mmsegmentation/v0.5/mobilenet_v2/fcn_m-v2-d8_512x1024_80k_cityscapes/fcn_m-v2-d8_512x1024_80k_cityscapes_20200825_124817-d24c28c1.pth) &#124; [log](https://openmmlab.oss-accelerate.aliyuncs.com/mmsegmentation/v0.5/mobilenet_v2/fcn_m-v2-d8_512x1024_80k_cityscapes/fcn_m-v2-d8_512x1024_80k_cityscapes-20200825_124817.log.json)                                         |
+| PSPNet     | M-V2-D8  | 512x1024  |   80000 |      3.6 | 11.2           | 70.23 | -             | [model](https://openmmlab.oss-accelerate.aliyuncs.com/mmsegmentation/v0.5/mobilenet_v2/pspnet_m-v2-d8_512x1024_80k_cityscapes/pspnet_m-v2-d8_512x1024_80k_cityscapes_20200825_124817-19e81d51.pth) &#124; [log](https://openmmlab.oss-accelerate.aliyuncs.com/mmsegmentation/v0.5/mobilenet_v2/pspnet_m-v2-d8_512x1024_80k_cityscapes/pspnet_m-v2-d8_512x1024_80k_cityscapes-20200825_124817.log.json)                             |
+| DeepLabV3  | M-V2-D8  | 512x1024  |   80000 |      3.9 | 8.4            | 73.84 | -             | [model](https://openmmlab.oss-accelerate.aliyuncs.com/mmsegmentation/v0.5/mobilenet_v2/deeplabv3_m-v2-d8_512x1024_80k_cityscapes/deeplabv3_m-v2-d8_512x1024_80k_cityscapes_20200825_124836-bef03590.pth) &#124; [log](https://openmmlab.oss-accelerate.aliyuncs.com/mmsegmentation/v0.5/mobilenet_v2/deeplabv3_m-v2-d8_512x1024_80k_cityscapes/deeplabv3_m-v2-d8_512x1024_80k_cityscapes-20200825_124836.log.json)                 |
+| DeepLabV3+ | M-V2-D8  | 512x1024  |   80000 |      5.1 | 8.4            | 75.20 | -             | [model](https://openmmlab.oss-accelerate.aliyuncs.com/mmsegmentation/v0.5/mobilenet_v2/deeplabv3plus_m-v2-d8_512x1024_80k_cityscapes/deeplabv3plus_m-v2-d8_512x1024_80k_cityscapes_20200825_124836-d256dd4b.pth) &#124; [log](https://openmmlab.oss-accelerate.aliyuncs.com/mmsegmentation/v0.5/mobilenet_v2/deeplabv3plus_m-v2-d8_512x1024_80k_cityscapes/deeplabv3plus_m-v2-d8_512x1024_80k_cityscapes-20200825_124836.log.json) |
+
+### ADE20K
+|   Method   | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU  | mIoU(ms+flip) |                                                                                                                                                                                                      download                                                                                                                                                                                                      |
+|------------|----------|-----------|--------:|---------:|----------------|------:|---------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| FCN        | M-V2-D8  | 512x512   |  160000 |      6.5 | 64.4           | 19.71 | -             | [model](https://openmmlab.oss-accelerate.aliyuncs.com/mmsegmentation/v0.5/mobilenet_v2/fcn_m-v2-d8_512x512_160k_ade20k/fcn_m-v2-d8_512x512_160k_ade20k_20200825_214953-c40e1095.pth) &#124; [log](https://openmmlab.oss-accelerate.aliyuncs.com/mmsegmentation/v0.5/mobilenet_v2/fcn_m-v2-d8_512x512_160k_ade20k/fcn_m-v2-d8_512x512_160k_ade20k-20200825_214953.log.json)                                         |
+| PSPNet     | M-V2-D8  | 512x512   |  160000 |      6.5 | 57.7           | 29.68 | -             | [model](https://openmmlab.oss-accelerate.aliyuncs.com/mmsegmentation/v0.5/mobilenet_v2/pspnet_m-v2-d8_512x512_160k_ade20k/pspnet_m-v2-d8_512x512_160k_ade20k_20200825_214953-f5942f7a.pth) &#124; [log](https://openmmlab.oss-accelerate.aliyuncs.com/mmsegmentation/v0.5/mobilenet_v2/pspnet_m-v2-d8_512x512_160k_ade20k/pspnet_m-v2-d8_512x512_160k_ade20k-20200825_214953.log.json)                             |
+| DeepLabV3  | M-V2-D8  | 512x512   |  160000 |      6.8 | 39.9           | 34.08 | -             | [model](https://openmmlab.oss-accelerate.aliyuncs.com/mmsegmentation/v0.5/mobilenet_v2/deeplabv3_m-v2-d8_512x512_160k_ade20k/deeplabv3_m-v2-d8_512x512_160k_ade20k_20200825_223255-63986343.pth) &#124; [log](https://openmmlab.oss-accelerate.aliyuncs.com/mmsegmentation/v0.5/mobilenet_v2/deeplabv3_m-v2-d8_512x512_160k_ade20k/deeplabv3_m-v2-d8_512x512_160k_ade20k-20200825_223255.log.json)                 |
+| DeepLabV3+ | M-V2-D8  | 512x512   |  160000 |      8.2 | 43.1           | 34.02 | -             | [model](https://openmmlab.oss-accelerate.aliyuncs.com/mmsegmentation/v0.5/mobilenet_v2/deeplabv3plus_m-v2-d8_512x512_160k_ade20k/deeplabv3plus_m-v2-d8_512x512_160k_ade20k_20200825_223255-465a01d4.pth) &#124; [log](https://openmmlab.oss-accelerate.aliyuncs.com/mmsegmentation/v0.5/mobilenet_v2/deeplabv3plus_m-v2-d8_512x512_160k_ade20k/deeplabv3plus_m-v2-d8_512x512_160k_ade20k-20200825_223255.log.json) |
diff --git a/configs/mobilenet_v2/deeplabv3_m-v2-d8_512x1024_80k_cityscapes.py b/configs/mobilenet_v2/deeplabv3_m-v2-d8_512x1024_80k_cityscapes.py
@@ -0,0 +1,12 @@
+_base_ = '../deeplabv3/deeplabv3_r101-d8_512x1024_80k_cityscapes.py'
+model = dict(
+    pretrained='mmcls://mobilenet_v2',
+    backbone=dict(
+        _delete_=True,
+        type='MobileNetV2',
+        widen_factor=1.,
+        strides=(1, 2, 2, 1, 1, 1, 1),
+        dilations=(1, 1, 1, 2, 2, 4, 4),
+        out_indices=(1, 2, 4, 6)),
+    decode_head=dict(in_channels=320),
+    auxiliary_head=dict(in_channels=96))
diff --git a/configs/mobilenet_v2/deeplabv3_m-v2-d8_512x512_160k_ade20k.py b/configs/mobilenet_v2/deeplabv3_m-v2-d8_512x512_160k_ade20k.py
@@ -0,0 +1,12 @@
+_base_ = '../deeplabv3/deeplabv3_r101-d8_512x512_160k_ade20k.py'
+model = dict(
+    pretrained='mmcls://mobilenet_v2',
+    backbone=dict(
+        _delete_=True,
+        type='MobileNetV2',
+        widen_factor=1.,
+        strides=(1, 2, 2, 1, 1, 1, 1),
+        dilations=(1, 1, 1, 2, 2, 4, 4),
+        out_indices=(1, 2, 4, 6)),
+    decode_head=dict(in_channels=320),
+    auxiliary_head=dict(in_channels=96))
diff --git a/configs/mobilenet_v2/deeplabv3plus_m-v2-d8_512x1024_80k_cityscapes.py b/configs/mobilenet_v2/deeplabv3plus_m-v2-d8_512x1024_80k_cityscapes.py
@@ -0,0 +1,12 @@
+_base_ = '../deeplabv3plus/deeplabv3plus_r101-d8_512x1024_80k_cityscapes.py'
+model = dict(
+    pretrained='mmcls://mobilenet_v2',
+    backbone=dict(
+        _delete_=True,
+        type='MobileNetV2',
+        widen_factor=1.,
+        strides=(1, 2, 2, 1, 1, 1, 1),
+        dilations=(1, 1, 1, 2, 2, 4, 4),
+        out_indices=(1, 2, 4, 6)),
+    decode_head=dict(in_channels=320, c1_in_channels=24),
+    auxiliary_head=dict(in_channels=96))
diff --git a/configs/mobilenet_v2/deeplabv3plus_m-v2-d8_512x512_160k_ade20k.py b/configs/mobilenet_v2/deeplabv3plus_m-v2-d8_512x512_160k_ade20k.py
@@ -0,0 +1,12 @@
+_base_ = '../deeplabv3plus/deeplabv3plus_r101-d8_512x512_160k_ade20k.py'
+model = dict(
+    pretrained='mmcls://mobilenet_v2',
+    backbone=dict(
+        _delete_=True,
+        type='MobileNetV2',
+        widen_factor=1.,
+        strides=(1, 2, 2, 1, 1, 1, 1),
+        dilations=(1, 1, 1, 2, 2, 4, 4),
+        out_indices=(1, 2, 4, 6)),
+    decode_head=dict(in_channels=320, c1_in_channels=24),
+    auxiliary_head=dict(in_channels=96))
diff --git a/configs/mobilenet_v2/fcn_m-v2-d8_512x1024_80k_cityscapes.py b/configs/mobilenet_v2/fcn_m-v2-d8_512x1024_80k_cityscapes.py
@@ -0,0 +1,12 @@
+_base_ = '../fcn/fcn_r101-d8_512x1024_80k_cityscapes.py'
+model = dict(
+    pretrained='mmcls://mobilenet_v2',
+    backbone=dict(
+        _delete_=True,
+        type='MobileNetV2',
+        widen_factor=1.,
+        strides=(1, 2, 2, 1, 1, 1, 1),
+        dilations=(1, 1, 1, 2, 2, 4, 4),
+        out_indices=(1, 2, 4, 6)),
+    decode_head=dict(in_channels=320),
+    auxiliary_head=dict(in_channels=96))
diff --git a/configs/mobilenet_v2/fcn_m-v2-d8_512x512_160k_ade20k.py b/configs/mobilenet_v2/fcn_m-v2-d8_512x512_160k_ade20k.py
@@ -0,0 +1,12 @@
+_base_ = '../fcn/fcn_r101-d8_512x512_160k_ade20k.py'
+model = dict(
+    pretrained='mmcls://mobilenet_v2',
+    backbone=dict(
+        _delete_=True,
+        type='MobileNetV2',
+        widen_factor=1.,
+        strides=(1, 2, 2, 1, 1, 1, 1),
+        dilations=(1, 1, 1, 2, 2, 4, 4),
+        out_indices=(1, 2, 4, 6)),
+    decode_head=dict(in_channels=320),
+    auxiliary_head=dict(in_channels=96))
diff --git a/configs/mobilenet_v2/pspnet_m-v2-d8_512x1024_80k_cityscapes.py b/configs/mobilenet_v2/pspnet_m-v2-d8_512x1024_80k_cityscapes.py
@@ -0,0 +1,12 @@
+_base_ = '../pspnet/pspnet_r101-d8_512x1024_80k_cityscapes.py'
+model = dict(
+    pretrained='mmcls://mobilenet_v2',
+    backbone=dict(
+        _delete_=True,
+        type='MobileNetV2',
+        widen_factor=1.,
+        strides=(1, 2, 2, 1, 1, 1, 1),
+        dilations=(1, 1, 1, 2, 2, 4, 4),
+        out_indices=(1, 2, 4, 6)),
+    decode_head=dict(in_channels=320),
+    auxiliary_head=dict(in_channels=96))
diff --git a/configs/mobilenet_v2/pspnet_m-v2-d8_512x512_160k_ade20k.py b/configs/mobilenet_v2/pspnet_m-v2-d8_512x512_160k_ade20k.py
@@ -0,0 +1,12 @@
+_base_ = '../pspnet/pspnet_r101-d8_512x512_160k_ade20k.py'
+model = dict(
+    pretrained='mmcls://mobilenet_v2',
+    backbone=dict(
+        _delete_=True,
+        type='MobileNetV2',
+        widen_factor=1.,
+        strides=(1, 2, 2, 1, 1, 1, 1),
+        dilations=(1, 1, 1, 2, 2, 4, 4),
+        out_indices=(1, 2, 4, 6)),
+    decode_head=dict(in_channels=320),
+    auxiliary_head=dict(in_channels=96))
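Note on the eight configs above: they all share the same backbone override, and the `in_channels` values (320, 96, and 24 for `c1_in_channels`) appear to follow from `strides` and `out_indices`. The sketch below is only a sanity check, not the backbone code from this commit. It assumes the per-stage channel widths and the stride-2 stem from the MobileNetV2 paper at `widen_factor=1.`; `STAGE_CHANNELS`, `STEM_STRIDE`, and `describe_outputs` are hypothetical names introduced for illustration.

```python
# Assumption: per-stage output channels from the MobileNetV2 paper at
# widen_factor=1.0; the backbone in this commit is presumed to match.
STAGE_CHANNELS = (16, 24, 32, 64, 96, 160, 320)
# Assumption: the stem convolution downsamples by 2, as in the paper.
STEM_STRIDE = 2


def describe_outputs(strides, out_indices, stem_stride=STEM_STRIDE):
    """Return (channels, cumulative stride) for each selected stage."""
    cumulative = []
    total = stem_stride
    for stride in strides:
        total *= stride
        cumulative.append(total)
    return [(STAGE_CHANNELS[i], cumulative[i]) for i in out_indices]


outputs = describe_outputs(
    strides=(1, 2, 2, 1, 1, 1, 1),
    out_indices=(1, 2, 4, 6))
print(outputs)
```

Under these assumptions the last selected stage has 320 channels at a cumulative stride of 8, which would explain both `decode_head=dict(in_channels=320)` and the `D8` suffix in the config names. The 96-channel and 24-channel outputs line up with `auxiliary_head` and `c1_in_channels` respectively. The later stages keep stride 1 and use dilation 2 or 4 instead, so resolution is not reduced further.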
diff --git a/mmseg/__init__.py b/mmseg/__init__.py
@@ -2,8 +2,8 @@
 
 from .version import __version__, version_info
 
-MMCV_MIN = '1.0.5'
-MMCV_MAX = '1.1.1'
+MMCV_MIN = '1.1.2'
+MMCV_MAX = '1.2.0'
 
 
 def digit_version(version_str):
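Note on the MMCV bump above: the body of `digit_version` and the actual compatibility check are not shown in this hunk, so the following is only a rough sketch of how the new bounds could be applied. `parse_version` and `is_supported` are hypothetical helpers, and treating both bounds as inclusive is an assumption.

```python
MMCV_MIN = '1.1.2'
MMCV_MAX = '1.2.0'


def parse_version(version_str):
    """Hypothetical helper: '1.1.2' -> (1, 1, 2). Non-numeric parts are skipped."""
    return tuple(int(part) for part in version_str.split('.') if part.isdigit())


def is_supported(installed):
    """Assumption: both bounds are inclusive."""
    low = parse_version(MMCV_MIN)
    high = parse_version(MMCV_MAX)
    return low <= parse_version(installed) <= high


for candidate in ('1.1.1', '1.1.2', '1.2.0', '1.2.1'):
    print(candidate, is_supported(candidate))
```

With these bounds, an environment still on the previous maximum (`1.1.1`) would fall outside the supported range, so upgrading MMCV is likely required before using the new backbone.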
diff --git a/mmseg/models/backbones/__init__.py b/mmseg/models/backbones/__init__.py
@@ -1,10 +1,11 @@
 from .fast_scnn import FastSCNN
 from .hrnet import HRNet
+from .mobilenet_v2 import MobileNetV2
 from .resnest import ResNeSt
 from .resnet import ResNet, ResNetV1c, ResNetV1d
 from .resnext import ResNeXt
 
 __all__ = [
     'ResNet', 'ResNetV1c', 'ResNetV1d', 'ResNeXt', 'HRNet', 'FastSCNN',
-    'ResNeSt'
+    'ResNeSt', 'MobileNetV2'
 ]
diff --git a/mmseg/models/backbones/mobilenet_v2.py b/mmseg/models/backbones/mobilenet_v2.py
diff --git a/mmseg/models/utils/__init__.py b/mmseg/models/utils/__init__.py
diff --git a/mmseg/models/utils/make_divisible.py b/mmseg/models/utils/make_divisible.py
diff --git a/tests/test_models/test_forward.py b/tests/test_models/test_forward.py