
Commit f72727c

Author: 谢昕辰

[Tools] Add vit/swin/mit convert weight scripts (open-mmlab#783)

* init scripts
* update markdown
* update markdown
* add docs
* delete mit converter and use torch load function
* rename segformer readme
* update doc
* modify doc
* update Chinese docs
* Update useful_tools.md
* Update useful_tools.md
* modify doc
* update segformer.yml
1 parent 441be4e commit f72727c

File tree

8 files changed: 381 additions, 11 deletions


configs/segformer/readme.md renamed to configs/segformer/README.md

Lines changed: 9 additions & 9 deletions
@@ -29,15 +29,15 @@
 
 Evaluation with AlignedResize:
 
-| Method | Backbone | Crop Size | Lr schd | mIoU | mIoU(ms+flip) |
-| ------ | -------- | --------- | ------: | ---: | ------------- |
-|Segformer | MIT-B0 | 512x512 | 160000 | 38.1 | 38.57 |
-|Segformer | MIT-B1 | 512x512 | 160000 | 41.64 | 42.76 |
-|Segformer | MIT-B2 | 512x512 | 160000 | 46.53 | 47.49 |
-|Segformer | MIT-B3 | 512x512 | 160000 | 48.46 | 49.14 |
-|Segformer | MIT-B4 | 512x512 | 160000 | 49.34 | 50.29 |
-|Segformer | MIT-B5 | 512x512 | 160000 | 50.08 | 50.72 |
-|Segformer | MIT-B5 | 640x640 | 160000 | 50.58 | 50.8 |
+| Method | Backbone | Crop Size | Lr schd | mIoU | mIoU(ms+flip) |
+| ------ | -------- | --------- | ------: | ---: | ------------- |
+|Segformer | MIT-B0 | 512x512 | 160000 | 38.1 | 38.57 |
+|Segformer | MIT-B1 | 512x512 | 160000 | 41.64 | 42.76 |
+|Segformer | MIT-B2 | 512x512 | 160000 | 46.53 | 47.49 |
+|Segformer | MIT-B3 | 512x512 | 160000 | 48.46 | 49.14 |
+|Segformer | MIT-B4 | 512x512 | 160000 | 49.34 | 50.29 |
+|Segformer | MIT-B5 | 512x512 | 160000 | 50.08 | 50.72 |
+|Segformer | MIT-B5 | 640x640 | 160000 | 50.58 | 50.8 |
 
 We replace `AlignedResize` in the original implementation with `Resize + ResizeToMultiple`. If you want to test by
 using `AlignedResize`, you can change the dataset pipeline like this:

configs/segformer/segformer.yml

Lines changed: 160 additions & 0 deletions
@@ -0,0 +1,160 @@
Collections:
- Metadata:
    Training Data:
    - ADE20k
  Name: segformer
Models:
- Config: configs/segformer/segformer_mit-b0_512x512_160k_ade20k.py
  In Collection: segformer
  Metadata:
    backbone: MIT-B0
    crop size: (512,512)
    inference time (ms/im):
    - backend: PyTorch
      batch size: 1
      hardware: V100
      mode: FP32
      resolution: (512,512)
      value: 19.49
    lr schd: 160000
    memory (GB): 2.1
  Name: segformer_mit-b0_512x512_160k_ade20k
  Results:
    Dataset: ADE20k
    Metrics:
      mIoU: 37.41
      mIoU(ms+flip): 38.34
    Task: Semantic Segmentation
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/segformer/segformer_mit-b0_512x512_160k_ade20k/segformer_mit-b0_512x512_160k_ade20k_20210726_101530-8ffa8fda.pth
- Config: configs/segformer/segformer_mit-b1_512x512_160k_ade20k.py
  In Collection: segformer
  Metadata:
    backbone: MIT-B1
    crop size: (512,512)
    inference time (ms/im):
    - backend: PyTorch
      batch size: 1
      hardware: V100
      mode: FP32
      resolution: (512,512)
      value: 20.98
    lr schd: 160000
    memory (GB): 2.6
  Name: segformer_mit-b1_512x512_160k_ade20k
  Results:
    Dataset: ADE20k
    Metrics:
      mIoU: 40.97
      mIoU(ms+flip): 42.54
    Task: Semantic Segmentation
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/segformer/segformer_mit-b1_512x512_160k_ade20k/segformer_mit-b1_512x512_160k_ade20k_20210726_112106-d70e859d.pth
- Config: configs/segformer/segformer_mit-b2_512x512_160k_ade20k.py
  In Collection: segformer
  Metadata:
    backbone: MIT-B2
    crop size: (512,512)
    inference time (ms/im):
    - backend: PyTorch
      batch size: 1
      hardware: V100
      mode: FP32
      resolution: (512,512)
      value: 32.38
    lr schd: 160000
    memory (GB): 3.6
  Name: segformer_mit-b2_512x512_160k_ade20k
  Results:
    Dataset: ADE20k
    Metrics:
      mIoU: 45.58
      mIoU(ms+flip): 47.03
    Task: Semantic Segmentation
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/segformer/segformer_mit-b2_512x512_160k_ade20k/segformer_mit-b2_512x512_160k_ade20k_20210726_112103-cbd414ac.pth
- Config: configs/segformer/segformer_mit-b3_512x512_160k_ade20k.py
  In Collection: segformer
  Metadata:
    backbone: MIT-B3
    crop size: (512,512)
    inference time (ms/im):
    - backend: PyTorch
      batch size: 1
      hardware: V100
      mode: FP32
      resolution: (512,512)
      value: 45.23
    lr schd: 160000
    memory (GB): 4.8
  Name: segformer_mit-b3_512x512_160k_ade20k
  Results:
    Dataset: ADE20k
    Metrics:
      mIoU: 47.82
      mIoU(ms+flip): 48.81
    Task: Semantic Segmentation
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/segformer/segformer_mit-b3_512x512_160k_ade20k/segformer_mit-b3_512x512_160k_ade20k_20210726_081410-962b98d2.pth
- Config: configs/segformer/segformer_mit-b4_512x512_160k_ade20k.py
  In Collection: segformer
  Metadata:
    backbone: MIT-B4
    crop size: (512,512)
    inference time (ms/im):
    - backend: PyTorch
      batch size: 1
      hardware: V100
      mode: FP32
      resolution: (512,512)
      value: 64.72
    lr schd: 160000
    memory (GB): 6.1
  Name: segformer_mit-b4_512x512_160k_ade20k
  Results:
    Dataset: ADE20k
    Metrics:
      mIoU: 48.46
      mIoU(ms+flip): 49.76
    Task: Semantic Segmentation
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/segformer/segformer_mit-b4_512x512_160k_ade20k/segformer_mit-b4_512x512_160k_ade20k_20210728_183055-7f509d7d.pth
- Config: configs/segformer/segformer_mit-b5_512x512_160k_ade20k.py
  In Collection: segformer
  Metadata:
    backbone: MIT-B5
    crop size: (512,512)
    inference time (ms/im):
    - backend: PyTorch
      batch size: 1
      hardware: V100
      mode: FP32
      resolution: (512,512)
      value: 84.1
    lr schd: 160000
    memory (GB): 7.2
  Name: segformer_mit-b5_512x512_160k_ade20k
  Results:
    Dataset: ADE20k
    Metrics:
      mIoU: 49.13
      mIoU(ms+flip): 50.22
    Task: Semantic Segmentation
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/segformer/segformer_mit-b5_512x512_160k_ade20k/segformer_mit-b5_512x512_160k_ade20k_20210726_145235-94cedf59.pth
- Config: configs/segformer/segformer_mit-b5_640x640_160k_ade20k.py
  In Collection: segformer
  Metadata:
    backbone: MIT-B5
    crop size: (640,640)
    inference time (ms/im):
    - backend: PyTorch
      batch size: 1
      hardware: V100
      mode: FP32
      resolution: (640,640)
      value: 88.5
    lr schd: 160000
    memory (GB): 11.5
  Name: segformer_mit-b5_640x640_160k_ade20k
  Results:
    Dataset: ADE20k
    Metrics:
      mIoU: 49.62
      mIoU(ms+flip): 50.36
    Task: Semantic Segmentation
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/segformer/segformer_mit-b5_640x640_160k_ade20k/segformer_mit-b5_640x640_160k_ade20k_20210801_121243-41d2845b.pth
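Since the metafile is plain YAML, it can also be inspected programmatically. A minimal sketch (assuming PyYAML is installed and the nesting shown above) that lists each SegFormer entry with its reported mIoU:

```python
import yaml

# Load the metafile added in this commit and print a compact summary.
with open('configs/segformer/segformer.yml') as f:
    metafile = yaml.safe_load(f)

for model in metafile['Models']:
    metrics = model['Results']['Metrics']
    print(f"{model['Name']}: mIoU {metrics['mIoU']} "
          f"(ms+flip {metrics['mIoU(ms+flip)']})")
```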

docs/useful_tools.md

Lines changed: 30 additions & 0 deletions
@@ -255,6 +255,36 @@ Examples:
 python tools/analyze_logs.py log.json --keys loss --legend loss
 ```
 
+### Model conversion
+
+`tools/model_converters/` provides several scripts to convert pretrained models released by other repositories to MMSegmentation style.
+
+#### ViT, Swin and MiT Transformer models
+
+- ViT
+
+`tools/model_converters/vit2mmseg.py` converts keys in timm pretrained ViT models to MMSegmentation style.
+
+```shell
+python tools/model_converters/vit2mmseg.py ${SRC} ${DST}
+```
+
+- Swin
+
+`tools/model_converters/swin2mmseg.py` converts keys in official pretrained Swin models to MMSegmentation style.
+
+```shell
+python tools/model_converters/swin2mmseg.py ${SRC} ${DST}
+```
+
+- SegFormer
+
+`tools/model_converters/mit2mmseg.py` converts keys in official pretrained MiT models to MMSegmentation style.
+
+```shell
+python tools/model_converters/mit2mmseg.py ${SRC} ${DST}
+```
+
 ## Model Serving
 
 In order to serve an `MMSegmentation` model with [`TorchServe`](https://pytorch.org/serve/), you can follow the steps:
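As a concrete illustration of the conversion commands added above (the checkpoint paths are hypothetical placeholders rather than files shipped with this commit):

```shell
# Convert a locally downloaded timm ViT checkpoint and an official Swin checkpoint.
python tools/model_converters/vit2mmseg.py pretrain/vit_base_p16.pth pretrain/vit_base_p16_mmseg.pth
python tools/model_converters/swin2mmseg.py pretrain/swin_tiny_official.pth pretrain/swin_tiny_mmseg.pth
```

The converted checkpoint can then be referenced from a config, typically via the backbone's `pretrained` field.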

docs_zh-CN/useful_tools.md

Lines changed: 30 additions & 0 deletions
@@ -259,6 +259,36 @@ python tools/analyze_logs.py xxx.log.json [--keys ${KEYS}] [--legend ${LEGEND}]
 python tools/analyze_logs.py log.json --keys loss --legend loss
 ```
 
+### Converting weights from other repositories
+
+`tools/model_converters/` provides several pretrained-weight conversion scripts, which convert the keys of pretrained weights from other repositories into keys that match MMSegmentation.
+
+#### ViT, Swin and MiT Transformer models
+
+- ViT
+
+`tools/model_converters/vit2mmseg.py` converts timm pretrained models to MMSegmentation.
+
+```shell
+python tools/model_converters/vit2mmseg.py ${SRC} ${DST}
+```
+
+- Swin
+
+`tools/model_converters/swin2mmseg.py` converts official pretrained models to MMSegmentation.
+
+```shell
+python tools/model_converters/swin2mmseg.py ${SRC} ${DST}
+```
+
+- SegFormer
+
+`tools/model_converters/mit2mmseg.py` converts official pretrained models to MMSegmentation.
+
+```shell
+python tools/model_converters/mit2mmseg.py ${SRC} ${DST}
+```
+
 ## Model serving
 
 In order to serve an `MMSegmentation` model with [`TorchServe`](https://pytorch.org/serve/), you can follow the steps:

model-index.yml

Lines changed: 1 addition & 0 deletions
@@ -23,6 +23,7 @@ Import:
 - configs/psanet/psanet.yml
 - configs/pspnet/pspnet.yml
 - configs/resnest/resnest.yml
+- configs/segformer/segformer.yml
 - configs/sem_fpn/sem_fpn.yml
 - configs/setr/setr.yml
 - configs/swin/swin.yml

tools/model_converters/mit_convert.py

Lines changed: 2 additions & 2 deletions
@@ -5,7 +5,7 @@
 import torch
 
 
-def mit_convert(ckpt):
+def convert_mit(ckpt):
     new_ckpt = OrderedDict()
     # Process the concat between q linear weights and kv linear weights
     for k, v in ckpt.items():
@@ -73,5 +73,5 @@ def parse_args():
 
     ckpt = torch.load(src_path, map_location='cpu')
 
-    ckpt = mit_convert(ckpt)
+    ckpt = convert_mit(ckpt)
     torch.save(ckpt, dst_path)
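For reference, a minimal sketch of calling the renamed function directly instead of through the command line (the checkpoint paths are placeholders, and the `tools/model_converters/` directory is assumed to be on the Python path):

```python
import torch

from mit_convert import convert_mit

# Remap an official MiT checkpoint to MMSegmentation-style keys (placeholder paths).
ckpt = torch.load('mit_b0.pth', map_location='cpu')
torch.save(convert_mit(ckpt), 'mit_b0_mmseg.pth')
```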
tools/model_converters/swin2mmseg.py

Lines changed: 83 additions & 0 deletions
@@ -0,0 +1,83 @@
import argparse
from collections import OrderedDict

import torch


def convert_swin(ckpt):
    new_ckpt = OrderedDict()

    def correct_unfold_reduction_order(x):
        # Reorder the four merged-patch channel groups of the patch-merging
        # reduction weight to match MMSegmentation's Unfold-based PatchMerging.
        out_channel, in_channel = x.shape
        x = x.reshape(out_channel, 4, in_channel // 4)
        x = x[:, [0, 2, 1, 3], :].transpose(1,
                                            2).reshape(out_channel, in_channel)
        return x

    def correct_unfold_norm_order(x):
        # Same channel-group reordering for the patch-merging norm parameters.
        in_channel = x.shape[0]
        x = x.reshape(4, in_channel // 4)
        x = x[[0, 2, 1, 3], :].transpose(0, 1).reshape(in_channel)
        return x

    for k, v in ckpt.items():
        if k.startswith('head'):
            # The classification head is not needed for segmentation.
            continue
        elif k.startswith('layers'):
            new_v = v
            if 'attn.' in k:
                # MMSegmentation wraps window attention in a `w_msa` submodule.
                new_k = k.replace('attn.', 'attn.w_msa.')
            elif 'mlp.' in k:
                # The official MLP (fc1/fc2) maps onto MMCV's FFN layer layout.
                if 'mlp.fc1.' in k:
                    new_k = k.replace('mlp.fc1.', 'ffn.layers.0.0.')
                elif 'mlp.fc2.' in k:
                    new_k = k.replace('mlp.fc2.', 'ffn.layers.1.')
                else:
                    new_k = k.replace('mlp.', 'ffn.')
            elif 'downsample' in k:
                new_k = k
                if 'reduction.' in k:
                    new_v = correct_unfold_reduction_order(v)
                elif 'norm.' in k:
                    new_v = correct_unfold_norm_order(v)
            else:
                new_k = k
            new_k = new_k.replace('layers', 'stages', 1)
        elif k.startswith('patch_embed'):
            new_v = v
            if 'proj' in k:
                new_k = k.replace('proj', 'projection')
            else:
                new_k = k
        else:
            new_v = v
            new_k = k

        new_ckpt[new_k] = new_v

    return new_ckpt


def main():
    parser = argparse.ArgumentParser(
        description='Convert keys in official pretrained swin models to '
        'MMSegmentation style.')
    parser.add_argument('src', help='src segmentation model path')
    # The dst path must be a full path of the new checkpoint.
    parser.add_argument('dst', help='save path')
    args = parser.parse_args()

    checkpoint = torch.load(args.src, map_location='cpu')
    if 'state_dict' in checkpoint:
        state_dict = checkpoint['state_dict']
    elif 'model' in checkpoint:
        state_dict = checkpoint['model']
    else:
        state_dict = checkpoint
    weight = convert_swin(state_dict)
    with open(args.dst, 'wb') as f:
        torch.save(weight, f)


if __name__ == '__main__':
    main()
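The `[0, 2, 1, 3]` permutation in the two helper functions reorders the four merged-patch channel groups to match the channel layout produced by MMSegmentation's `nn.Unfold`-based `PatchMerging`, which differs from the concatenation order used in the official implementation. A small sanity-check sketch (assuming `swin2mmseg.py` is importable from its directory; the tensor is a toy value, not a real weight) showing that `convert_swin` only renames the key and permutes channel groups while preserving the shape:

```python
import torch

from swin2mmseg import convert_swin

# Toy 2x4 'reduction' weight: each row holds four channel groups of size one.
dummy = {'layers.0.downsample.reduction.weight': torch.arange(8.).reshape(2, 4)}
converted = convert_swin(dummy)

# The key is renamed from 'layers' to 'stages' and the groups are reordered 0, 2, 1, 3.
print(converted['stages.0.downsample.reduction.weight'])
# tensor([[0., 2., 1., 3.],
#         [4., 6., 5., 7.]])
```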
