
Commit ae3f4d1

Tete Xiao authored: Update README.md
1 parent 1b993f6 commit ae3f4d1


README.md

Lines changed: 25 additions & 10 deletions
@@ -34,20 +34,20 @@ So we re-implement the `DataParallel` module, and make it support distributing d
 We split our models into encoder and decoder, where encoders are usually modified directly from classification networks, and decoders consist of final convolutions and upsampling.

 Encoder: (resnetXX_dilatedYY: customized resnetXX with dilated convolutions; the output feature map is 1/YY of the input size.)
-- resnet34_dilated16, resnet34_dilated8
-- resnet50_dilated16, resnet50_dilated8
+- ResNet50, resnet50_dilated16, resnet50_dilated8
+- ResNet101, resnet101_dilated16, resnet101_dilated8

 ***Coming soon***:
-- resnet101_dilated16, resnet101_dilated8
+- ResNeXt101, resnext101_dilated16, resnext101_dilated8

 Decoder:
 - c1_bilinear (1 conv + bilinear upsample)
 - c1_bilinear_deepsup (c1_bilinear + deep supervision trick)
 - ppm_bilinear (pyramid pooling + bilinear upsample, see the [PSPNet](https://hszhao.github.io/projects/pspnet) paper for details)
 - ppm_bilinear_deepsup (ppm_bilinear + deep supervision trick)

-***Coming soon***:
-- UPerNet based on Feature Pyramid Network (FPN) and Pyramid Pooling Module (PPM), with down-sampling rate of 4, 8 and 16. It doesn't need dilated convolution, a operator that is time-and-memory consuming. *Without bells and whistles*, it is comparable or even better compared with PSPNet, while requires much shorter training time and less GPU memory.
+***New***:
+- UPerNet, based on the Feature Pyramid Network (FPN) and the Pyramid Pooling Module (PPM), with down-sampling rates of 4, 8 and 16. It does not need dilated convolution, an operator that is both time- and memory-consuming. *Without bells and whistles*, it is comparable to or even better than PSPNet, while requiring much shorter training time and less GPU memory. For example, you cannot train a PSPNet-101 on TITAN Xp GPUs with only 12GB of memory, but you can train a UPerNet-101 on such GPUs. (See the sketch following this hunk.)


 ## Performance:
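The UPerNet entry above combines, roughly, a Pyramid Pooling Module applied to the deepest feature map with FPN-style top-down fusion of the stride-4/8/16 features, which is why no dilated convolutions are needed. Below is a minimal, illustrative PyTorch sketch of that idea only; it is not the implementation in this repository, and the module names, channel sizes and class count (`SimplePPM`, `SimpleFPNHead`, 2048-channel ResNet features, 150 ADE20K classes) are assumptions made for the example.

```python
# Illustrative sketch only -- NOT this repository's code. It shows the idea the
# README describes: a Pyramid Pooling Module (PPM) on the deepest feature map,
# fused FPN-style with the stride-4/8/16 features, predicting at stride 4.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimplePPM(nn.Module):
    """Pool the deepest feature map at several scales, upsample and concat."""
    def __init__(self, in_ch, out_ch, scales=(1, 2, 3, 6)):
        super().__init__()
        self.stages = nn.ModuleList([
            nn.Sequential(nn.AdaptiveAvgPool2d(s),
                          nn.Conv2d(in_ch, out_ch, 1, bias=False),
                          nn.ReLU(inplace=True))
            for s in scales])
        self.fuse = nn.Conv2d(in_ch + len(scales) * out_ch, out_ch, 3, padding=1)

    def forward(self, x):
        size = x.shape[2:]
        pooled = [F.interpolate(stage(x), size=size, mode='bilinear',
                                align_corners=False) for stage in self.stages]
        return self.fuse(torch.cat([x] + pooled, dim=1))


class SimpleFPNHead(nn.Module):
    """Top-down fusion of multi-scale features; no dilated convolutions needed."""
    def __init__(self, in_chs=(256, 512, 1024, 2048), mid_ch=256, num_class=150):
        super().__init__()
        self.ppm = SimplePPM(in_chs[-1], mid_ch)
        self.lateral = nn.ModuleList([nn.Conv2d(c, mid_ch, 1) for c in in_chs[:-1]])
        self.classifier = nn.Conv2d(mid_ch, num_class, 1)

    def forward(self, feats):
        # feats: backbone features at strides 4, 8, 16, 32 (shallow to deep)
        x = self.ppm(feats[-1])
        for lat, f in zip(reversed(self.lateral), reversed(feats[:-1])):
            x = lat(f) + F.interpolate(x, size=f.shape[2:], mode='bilinear',
                                       align_corners=False)
        return self.classifier(x)  # logits at 1/4 of the input resolution


if __name__ == '__main__':
    # Fake ResNet-50/101-style feature maps for a 512x512 input.
    feats = [torch.randn(1, c, 512 // s, 512 // s)
             for c, s in zip((256, 512, 1024, 2048), (4, 8, 16, 32))]
    print(SimpleFPNHead()(feats).shape)  # torch.Size([1, 150, 128, 128])
```

Because the backbone keeps its normal strides (no dilation), the deepest feature maps stay small, which is the memory saving the README attributes to UPerNet.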
@@ -87,15 +87,15 @@ IMPORTANT: We use our self-trained base model on ImageNet. The model takes the i
 <td>Yes</td><td>42.53</td><td>80.91</td><td>61.72</td>
 </tr>
 <tr>
-<td rowspan="2">UperNet-50</td>
+<td rowspan="2"><b>UperNet-50</b></td>
 <td>No</td><td>40.44</td><td>79.80</td><td>60.12</td>
 <td rowspan="2">1.75 * 20 = 35.0 hours</td>
 </tr>
 <tr>
 <td>Yes</td><td>41.55</td><td>80.23</td><td>60.89</td>
 </tr>
 <tr>
-<td rowspan="2">UperNet-101</td>
+<td rowspan="2"><b>UperNet-101</b></td>
 <td>No</td><td>41.98</td><td>80.63</td><td>61.34</td>
 <td rowspan="2">2.5 * 25 = 50.0 hours</td>
 </tr>
@@ -109,7 +109,7 @@ IMPORTANT: We use our self-trained base model on ImageNet. The model takes the i
 </tr>
 </tbody></table>

-The speed is benchmarked on a server with 8 NVIDIA Pascal Titan Xp GPUs (12GB GPU memory), except for ResNet-101_dilated8, which is benchmarked on a server with 8 NVIDIA Tesla P40 GPUS (22GB GPU memory), because of the insufficient memory issue when using dilated conv on a very deep network.
+The speed is benchmarked on a server with 8 NVIDIA Pascal Titan Xp GPUs (12GB GPU memory), ***except for*** ResNet-101_dilated8, which is benchmarked on a server with 8 NVIDIA Tesla P40 GPUs (22GB GPU memory), because of insufficient memory when using dilated convolutions on a very deep network.

 ## Environment
 The code is developed under the following configurations.
@@ -153,6 +153,17 @@ chmod +x download_ADE20K.sh
 python3 train.py --num_gpus NUM_GPUS
 ```

+Train a UPerNet, e.g., with a ResNet-50 or ResNet-101 encoder (the two extra flags are explained after this hunk):
+```bash
+python3 train.py --num_gpus NUM_GPUS --arch_encoder resnet50 --arch_decoder upernet \
+--segm_downsampling_rate 4 --padding_constant 32
+```
+or
+```bash
+python3 train.py --num_gpus NUM_GPUS --arch_encoder resnet101 --arch_decoder upernet \
+--segm_downsampling_rate 4 --padding_constant 32
+```
+
 3. Input arguments: (see full input arguments via ```python3 train.py -h ```)
 ```bash
 usage: train.py [-h] [--id ID] [--arch_encoder ARCH_ENCODER]
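A note on the two extra flags in the UPerNet training commands above. Based on their names and on the UPerNet description (down-sampling rates of 4, 8 and 16), `--padding_constant 32` most likely pads each input so its height and width are divisible by 32, and `--segm_downsampling_rate 4` most likely stores training labels at 1/4 resolution to match UPerNet's stride-4 output. This is an interpretation, not a description of the repository's actual data pipeline; the sketch below only illustrates that kind of preprocessing.

```python
# Rough sketch of the preprocessing the two flags above likely control
# (an assumption for illustration, not this repository's data pipeline).
import math
import numpy as np

def pad_to_multiple(img, padding_constant=32):
    """Zero-pad an HxW(xC) image so both spatial dims divide padding_constant."""
    h, w = img.shape[:2]
    ph = math.ceil(h / padding_constant) * padding_constant
    pw = math.ceil(w / padding_constant) * padding_constant
    out = np.zeros((ph, pw) + img.shape[2:], dtype=img.dtype)
    out[:h, :w] = img
    return out

def downsample_labels(segm, segm_downsampling_rate=4):
    """Nearest-neighbour downsample of an HxW label map to the output stride."""
    return segm[::segm_downsampling_rate, ::segm_downsampling_rate]

img = pad_to_multiple(np.zeros((300, 500, 3), dtype=np.uint8))  # -> (320, 512, 3)
lab = downsample_labels(np.zeros((320, 512), dtype=np.int64))   # -> (80, 128)
```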
@@ -181,6 +192,11 @@ usage: train.py [-h] [--id ID] [--arch_encoder ARCH_ENCODER]
 ```bash
 python3 eval.py --id MODEL_ID --suffix SUFFIX
 ```
+Evaluate a UPerNet (e.g., UPerNet-50):
+```bash
+python3 eval.py --id MODEL_ID --suffix SUFFIX \
+--arch_encoder resnet50 --arch_decoder upernet --padding_constant 32
+```

 2. Input arguments: (see full input arguments via ```python3 eval.py -h ```)
 ```bash
@@ -190,8 +206,7 @@ usage: eval.py [-h] --id ID [--suffix SUFFIX] [--arch_encoder ARCH_ENCODER]
 [--num_val NUM_VAL] [--num_class NUM_CLASS]
 [--batch_size BATCH_SIZE] [--imgSize IMGSIZE]
 [--imgMaxSize IMGMAXSIZE] [--padding_constant PADDING_CONSTANT]
-[--segm_downsampling_rate SEGM_DOWNSAMPLING_RATE] [--ckpt CKPT]
-[--visualize] [--result RESULT] [--gpu_id GPU_ID]
+[--ckpt CKPT] [--visualize] [--result RESULT] [--gpu_id GPU_ID]
 ```

