Commit 4f4e772

[Feature] Support iSAID aerial dataset. (open-mmlab#1115)
* support iSAID aerial dataset
* Update and rename docs/dataset_prepare.md to 博士/dataset_prepare.md
* Update dataset_prepare.md
* fix typo
* fix typo
* fix typo
* remove imgviz
* fix wrong order in annotation name
* upload models&logs
* upload models&logs
* add load_annotations
* fix unittest coverage
* fix unittest coverage
* fix correct crop size in config
* fix iSAID unit test
* fix iSAID unit test
* fix typos
* fix wrong crop size in readme
* use smaller figure as test data
* add smaller dataset in test data
* add blank in docs
* use 0 bytes pseudo data
* add footnote and comments for crop size
* change iSAID to isaid and add default value in it
* change iSAID to isaid in _base_

Co-authored-by: MengzhangLI <[email protected]>
1 parent 9522b4f commit 4f4e772

30 files changed, +783 -6 lines changed

README.md

Lines changed: 1 addition & 0 deletions
@@ -138,6 +138,7 @@ Supported datasets:
 - [x] [LoveDA](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/en/dataset_prepare.md#loveda)
 - [x] [Potsdam](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/en/dataset_prepare.md#isprs-potsdam)
 - [x] [Vaihingen](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/en/dataset_prepare.md#isprs-vaihingen)
+- [x] [iSAID](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/en/dataset_prepare.md#isaid)

 ## Installation

README_zh-CN.md

Lines changed: 1 addition & 0 deletions
@@ -137,6 +137,7 @@ MMSegmentation 是一个基于 PyTorch 的语义分割开源工具箱。它是 O
 - [x] [LoveDA](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/zh_cn/dataset_prepare.md#loveda)
 - [x] [Potsdam](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/zh_cn/dataset_prepare.md#isprs-potsdam)
 - [x] [Vaihingen](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/zh_cn/dataset_prepare.md#isprs-vaihingen)
+- [x] [iSAID](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/zh_cn/dataset_prepare.md#isaid)

 ## 安装

configs/_base_/datasets/isaid.py

Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
+# dataset settings
+dataset_type = 'iSAIDDataset'
+data_root = 'data/iSAID'
+
+img_norm_cfg = dict(
+    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+"""
+This crop_size setting follows the implementation of
+`PointFlow: Flowing Semantics Through Points for Aerial Image
+Segmentation <https://arxiv.org/pdf/2103.06564.pdf>`_.
+"""
+
+crop_size = (896, 896)
+
+train_pipeline = [
+    dict(type='LoadImageFromFile'),
+    dict(type='LoadAnnotations'),
+    dict(type='Resize', img_scale=(896, 896), ratio_range=(0.5, 2.0)),
+    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
+    dict(type='RandomFlip', prob=0.5),
+    dict(type='PhotoMetricDistortion'),
+    dict(type='Normalize', **img_norm_cfg),
+    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
+    dict(type='DefaultFormatBundle'),
+    dict(type='Collect', keys=['img', 'gt_semantic_seg']),
+]
+test_pipeline = [
+    dict(type='LoadImageFromFile'),
+    dict(
+        type='MultiScaleFlipAug',
+        img_scale=(896, 896),
+        # img_ratios=[0.5, 0.75, 1.0, 1.25, 1.5, 1.75],
+        flip=False,
+        transforms=[
+            dict(type='Resize', keep_ratio=True),
+            dict(type='RandomFlip'),
+            dict(type='Normalize', **img_norm_cfg),
+            dict(type='ImageToTensor', keys=['img']),
+            dict(type='Collect', keys=['img']),
+        ])
+]
+data = dict(
+    samples_per_gpu=4,
+    workers_per_gpu=4,
+    train=dict(
+        type=dataset_type,
+        data_root=data_root,
+        img_dir='img_dir/train',
+        ann_dir='ann_dir/train',
+        pipeline=train_pipeline),
+    val=dict(
+        type=dataset_type,
+        data_root=data_root,
+        img_dir='img_dir/val',
+        ann_dir='ann_dir/val',
+        pipeline=test_pipeline),
+    test=dict(
+        type=dataset_type,
+        data_root=data_root,
+        img_dir='img_dir/val',
+        ann_dir='ann_dir/val',
+        pipeline=test_pipeline))
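
As a rough illustration of how the `Resize`, `RandomCrop`, and `Pad` settings in the training pipeline above could interact, the sketch below assumes that one ratio is drawn uniformly from `ratio_range` and applied to `img_scale`, that the crop never exceeds the resized image, and that padding fills whatever is missing up to `crop_size`. The helper names (`sample_scale`, `crop_extent`, `pad_amount`) are hypothetical and are not part of MMSegmentation; the library's own transforms may differ in detail (for example, in how aspect ratio is preserved).

```python
import random


def sample_scale(img_scale, ratio_range, rng):
    """Draw one ratio uniformly from ratio_range and scale img_scale by it (assumed behaviour)."""
    min_ratio, max_ratio = ratio_range
    ratio = rng.uniform(min_ratio, max_ratio)
    return int(img_scale[0] * ratio), int(img_scale[1] * ratio)


def crop_extent(size, crop_size):
    """A crop cannot be larger than the image it is taken from."""
    return tuple(min(s, c) for s, c in zip(size, crop_size))


def pad_amount(size, crop_size):
    """Pixels to add per axis so the result reaches crop_size."""
    return tuple(max(c - s, 0) for s, c in zip(size, crop_size))


rng = random.Random(0)          # fixed seed so the sketch is repeatable
crop_size = (896, 896)          # value taken from the config above
scale = sample_scale((896, 896), (0.5, 2.0), rng)
cropped = crop_extent(scale, crop_size)
padding = pad_amount(cropped, crop_size)
final = tuple(c + p for c, p in zip(cropped, padding))
print(scale, cropped, padding, final)
```

Under these assumptions every training sample ends up at exactly `crop_size`: scaled-up images are cropped down, and scaled-down images are padded up.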

configs/deeplabv3plus/README.md

Lines changed: 8 additions & 0 deletions
@@ -114,8 +114,16 @@ Spatial pyramid pooling module or encode-decoder structure are used in deep neur
 | DeepLabV3+ | R-50-D8 | 512x512 | 80000 | 7.36 | 26.91 | 73.97 | 75.05 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/deeplabv3plus/deeplabv3plus_r50-d8_4x4_512x512_80k_vaihingen.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/deeplabv3plus/deeplabv3plus_r50-d8_4x4_512x512_80k_vaihingen/deeplabv3plus_r50-d8_4x4_512x512_80k_vaihingen_20211231_230816-5040938d.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/deeplabv3plus/deeplabv3plus_r50-d8_4x4_512x512_80k_vaihingen/deeplabv3plus_r50-d8_4x4_512x512_80k_vaihingen_20211231_230816.log.json) |
 | DeepLabV3+ | R-101-D8 | 512x512 | 80000 | 10.83 | 18.59 | 73.06 | 74.14 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/deeplabv3plus/deeplabv3plus_r101-d8_4x4_512x512_80k_vaihingen.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/deeplabv3plus/deeplabv3plus_r101-d8_4x4_512x512_80k_vaihingen/deeplabv3plus_r101-d8_4x4_512x512_80k_vaihingen_20211231_230816-8a095afa.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/deeplabv3plus/deeplabv3plus_r101-d8_4x4_512x512_80k_vaihingen/deeplabv3plus_r101-d8_4x4_512x512_80k_vaihingen_20211231_230816.log.json) |

+### iSAID
+
+| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU | mIoU(ms+flip) | config | download |
+| ---------- | -------- | --------- | ------: | -------- | -------------- | ----: | ------------: | ------ | -------- |
+| DeepLabV3+ | R-18-D8 | 896x896 | 80000 | 6.19 | 24.81 | 61.35 | 62.61 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/deeplabv3plus/deeplabv3plus_r18-d8_4x4_896x896_80k_isaid.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/deeplabv3plus/deeplabv3plus_r18-d8_4x4_896x896_80k_isaid/deeplabv3plus_r18-d8_4x4_896x896_80k_isaid_20220110_180526-7059991d.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/deeplabv3plus/deeplabv3plus_r18-d8_4x4_896x896_80k_isaid/deeplabv3plus_r18-d8_4x4_896x896_80k_isaid_20220110_180526.log.json) |
+| DeepLabV3+ | R-50-D8 | 896x896 | 80000 | 21.45 | 8.42 | 67.06 | 68.02 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/deeplabv3plus/deeplabv3plus_r50-d8_4x4_896x896_80k_isaid.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/deeplabv3plus/deeplabv3plus_r50-d8_4x4_896x896_80k_isaid/deeplabv3plus_r50-d8_4x4_896x896_80k_isaid_20220110_180526-598be439.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/deeplabv3plus/deeplabv3plus_r50-d8_4x4_896x896_80k_isaid/deeplabv3plus_r50-d8_4x4_896x896_80k_isaid_20220110_180526.log.json) |
+
 Note:

 - `D-8`/`D-16` here correspond to the output stride 8/16 setting for the DeepLab series.
 - `MG-124` stands for multi-grid dilation in the last stage of ResNet.
 - `FP16` means Mixed Precision (FP16) is adopted in training.
+- `896x896` is the crop size used for the iSAID dataset, following the implementation of [PointFlow: Flowing Semantics Through Points for Aerial Image Segmentation](https://arxiv.org/pdf/2103.06564.pdf).

configs/deeplabv3plus/deeplabv3plus.yml

Lines changed: 45 additions & 0 deletions
@@ -10,6 +10,7 @@ Collections:
     - LoveDA
     - Potsdam
     - Vaihingen
+    - iSAID
   Paper:
     URL: https://arxiv.org/abs/1802.02611
     Title: Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation
@@ -803,3 +804,47 @@ Models:
       mIoU(ms+flip): 74.14
   Config: configs/deeplabv3plus/deeplabv3plus_r101-d8_4x4_512x512_80k_vaihingen.py
   Weights: https://download.openmmlab.com/mmsegmentation/v0.5/deeplabv3plus/deeplabv3plus_r101-d8_4x4_512x512_80k_vaihingen/deeplabv3plus_r101-d8_4x4_512x512_80k_vaihingen_20211231_230816-8a095afa.pth
+- Name: deeplabv3plus_r18-d8_4x4_896x896_80k_isaid
+  In Collection: deeplabv3plus
+  Metadata:
+    backbone: R-18-D8
+    crop size: (896,896)
+    lr schd: 80000
+    inference time (ms/im):
+    - value: 40.31
+      hardware: V100
+      backend: PyTorch
+      batch size: 1
+      mode: FP32
+      resolution: (896,896)
+    Training Memory (GB): 6.19
+  Results:
+  - Task: Semantic Segmentation
+    Dataset: iSAID
+    Metrics:
+      mIoU: 61.35
+      mIoU(ms+flip): 62.61
+  Config: configs/deeplabv3plus/deeplabv3plus_r18-d8_4x4_896x896_80k_isaid.py
+  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/deeplabv3plus/deeplabv3plus_r18-d8_4x4_896x896_80k_isaid/deeplabv3plus_r18-d8_4x4_896x896_80k_isaid_20220110_180526-7059991d.pth
+- Name: deeplabv3plus_r50-d8_4x4_896x896_80k_isaid
+  In Collection: deeplabv3plus
+  Metadata:
+    backbone: R-50-D8
+    crop size: (896,896)
+    lr schd: 80000
+    inference time (ms/im):
+    - value: 118.76
+      hardware: V100
+      backend: PyTorch
+      batch size: 1
+      mode: FP32
+      resolution: (896,896)
+    Training Memory (GB): 21.45
+  Results:
+  - Task: Semantic Segmentation
+    Dataset: iSAID
+    Metrics:
+      mIoU: 67.06
+      mIoU(ms+flip): 68.02
+  Config: configs/deeplabv3plus/deeplabv3plus_r50-d8_4x4_896x896_80k_isaid.py
+  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/deeplabv3plus/deeplabv3plus_r50-d8_4x4_896x896_80k_isaid/deeplabv3plus_r50-d8_4x4_896x896_80k_isaid_20220110_180526-598be439.pth
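
The README table reports inference speed in fps, while the metafile entries report `inference time (ms/im)`. The two sets of figures in this commit appear to be related by `1000 / fps`; the small sketch below checks that arithmetic for the two DeepLabV3+ iSAID entries. The function name `fps_to_ms_per_image` is hypothetical, and the rounding to two decimals is an assumption based on the values shown.

```python
def fps_to_ms_per_image(fps):
    """Convert frames per second to milliseconds per image, rounded to 2 decimals."""
    return round(1000.0 / fps, 2)


# fps values from the README table, ms/im values from the metafile
r18_ms = fps_to_ms_per_image(24.81)   # metafile lists 40.31
r50_ms = fps_to_ms_per_image(8.42)    # metafile lists 118.76
print(r18_ms, r50_ms)
```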
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+_base_ = './deeplabv3plus_r50-d8_4x4_896x896_80k_isaid.py'
+model = dict(
+    pretrained='open-mmlab://resnet18_v1c',
+    backbone=dict(depth=18),
+    decode_head=dict(
+        c1_in_channels=64,
+        c1_channels=12,
+        in_channels=512,
+        channels=128,
+    ),
+    auxiliary_head=dict(in_channels=256, channels=64))
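
The config above sets only the keys that differ from its `_base_` file. A minimal sketch of how such an override might be combined with a base config is given below, assuming a simple recursive dictionary merge. `merge_config` is a hypothetical helper written for illustration, not the loader the library uses, and the `depth=50` value in the base dictionary is a placeholder assumed from the R-50 name; only `num_classes=16` and the override values come from this commit.

```python
import copy


def merge_config(base, override):
    """Recursively merge `override` into a copy of `base`; override values win."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Placeholder base: depth=50 is an assumption, num_classes=16 is from the R-50 iSAID config.
base = dict(
    model=dict(
        backbone=dict(depth=50),
        decode_head=dict(num_classes=16),
        auxiliary_head=dict(num_classes=16)))

# Override values copied from the R-18 config above.
override = dict(
    model=dict(
        pretrained='open-mmlab://resnet18_v1c',
        backbone=dict(depth=18),
        decode_head=dict(
            c1_in_channels=64, c1_channels=12, in_channels=512, channels=128),
        auxiliary_head=dict(in_channels=256, channels=64)))

merged = merge_config(base, override)
print(merged['model']['backbone'], merged['model']['decode_head'])
```

In this sketch the merged `decode_head` keeps `num_classes=16` from the base while picking up the narrower channel settings from the override, which is the effect the short R-18 file relies on.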
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+_base_ = [
+    '../_base_/models/deeplabv3plus_r50-d8.py', '../_base_/datasets/isaid.py',
+    '../_base_/default_runtime.py', '../_base_/schedules/schedule_80k.py'
+]
+model = dict(
+    decode_head=dict(num_classes=16), auxiliary_head=dict(num_classes=16))

configs/hrnet/README.md

Lines changed: 12 additions & 0 deletions
@@ -107,3 +107,15 @@ High-resolution representations are essential for position-sensitive vision prob
 | FCN | HRNetV2p-W18-Small | 512x512 | 80000 | 1.58 | 38.11 | 71.81 | 73.1 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/hrnet/fcn_hr18s_4x4_512x512_80k_vaihingen.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/hrnet/fcn_hr18s_4x4_512x512_80k_vaihingen/fcn_hr18s_4x4_512x512_80k_vaihingen_20211231_230909-b23aae02.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/hrnet/fcn_hr18s_4x4_512x512_80k_vaihingen/fcn_hr18s_4x4_512x512_80k_vaihingen_20211231_230909.log.json) |
 | FCN | HRNetV2p-W18 | 512x512 | 80000 | 2.76 | 19.55 | 72.57 | 74.09 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/hrnet/fcn_hr18_4x4_512x512_80k_vaihingen.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/hrnet/fcn_hr18_4x4_512x512_80k_vaihingen/fcn_hr18_4x4_512x512_80k_vaihingen_20211231_231216-2ec3ae8a.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/hrnet/fcn_hr18_4x4_512x512_80k_vaihingen/fcn_hr18_4x4_512x512_80k_vaihingen_20211231_231216.log.json) |
 | FCN | HRNetV2p-W48 | 512x512 | 80000 | 6.20 | 17.25 | 72.50 | 73.52 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/hrnet/fcn_hr48_4x4_512x512_80k_vaihingen.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/hrnet/fcn_hr48_4x4_512x512_80k_vaihingen/fcn_hr48_4x4_512x512_80k_vaihingen_20211231_231244-7133cb22.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/hrnet/fcn_hr48_4x4_512x512_80k_vaihingen/fcn_hr48_4x4_512x512_80k_vaihingen_20211231_231244.log.json) |
+
+### iSAID
+
+| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU | mIoU(ms+flip) | config | download |
+| ---------- | -------- | --------- | ------: | -------- | -------------- | ----: | ------------: | ------ | -------- |
+| FCN | HRNetV2p-W18-Small | 896x896 | 80000 | 4.95 | 13.84 | 62.30 | 62.97 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/hrnet/fcn_hr18s_4x4_896x896_80k_isaid.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/hrnet/fcn_hr18s_4x4_896x896_80k_isaid/fcn_hr18s_4x4_896x896_80k_isaid_20220118_001603-3cc0769b.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/hrnet/fcn_hr18s_4x4_896x896_80k_isaid/fcn_hr18s_4x4_896x896_80k_isaid_20220118_001603.log.json) |
+| FCN | HRNetV2p-W18 | 896x896 | 80000 | 8.30 | 7.71 | 65.06 | 65.60 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/hrnet/fcn_hr18_4x4_896x896_80k_isaid.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/hrnet/fcn_hr18_4x4_896x896_80k_isaid/fcn_hr18_4x4_896x896_80k_isaid_20220110_182230-49bf752e.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/hrnet/fcn_hr18_4x4_896x896_80k_isaid/fcn_hr18_4x4_896x896_80k_isaid_20220110_182230.log.json) |
+| FCN | HRNetV2p-W48 | 896x896 | 80000 | 16.89 | 7.34 | 67.80 | 68.53 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/hrnet/fcn_hr48_4x4_896x896_80k_isaid.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/hrnet/fcn_hr48_4x4_896x896_80k_isaid/fcn_hr48_4x4_896x896_80k_isaid_20220114_174643-547fc420.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/hrnet/fcn_hr48_4x4_896x896_80k_isaid/fcn_hr48_4x4_896x896_80k_isaid_20220114_174643.log.json) |
+
+Note:
+
+- `896x896` is the crop size used for the iSAID dataset, following the implementation of [PointFlow: Flowing Semantics Through Points for Aerial Image Segmentation](https://arxiv.org/pdf/2103.06564.pdf).
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+_base_ = [
+    '../_base_/models/fcn_hr18.py', '../_base_/datasets/isaid.py',
+    '../_base_/default_runtime.py', '../_base_/schedules/schedule_80k.py'
+]
+model = dict(decode_head=dict(num_classes=16))
Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+_base_ = './fcn_hr18_4x4_896x896_80k_isaid.py'
+model = dict(
+    pretrained='open-mmlab://msra/hrnetv2_w18_small',
+    backbone=dict(
+        extra=dict(
+            stage1=dict(num_blocks=(2, )),
+            stage2=dict(num_blocks=(2, 2)),
+            stage3=dict(num_modules=3, num_blocks=(2, 2, 2)),
+            stage4=dict(num_modules=2, num_blocks=(2, 2, 2, 2)))))
