@@ -6,7 +6,7 @@ ADE20K is the largest open source dataset for semantic segmentation and scene pa
 https://github.com/CSAILVision/sceneparsing
 
 Pretrained models can be found at:
-http://sceneparsing.csail.mit.edu/model/
+http://sceneparsing.csail.mit.edu/model/pytorch
 
 <img src="./teaser/ADE_val_00000278.png" width="900" />
 <img src="./teaser/ADE_val_00001519.png" width="900" />
@@ -64,19 +64,28 @@ IMPORTANT: We use our self-trained base model on ImageNet. The model takes the i
 <th valign="bottom">Overall Score</th>
 <th valign="bottom">Training Time</th>
 <tr>
-<td>ResNet18_dilated8 + c1_bilinear_deepsup</td>
+<td rowspan="2" >ResNet18_dilated8 + c1_bilinear_deepsup</td>
 <td>No</td><td>33.82</td><td>76.05</td><td>54.94</td>
-<td>0.42 * 20 = 8.4 hours</td>
+<td rowspan="2" >0.42 * 20 = 8.4 hours</td>
 </tr>
 <tr>
-<td>ResNet18_dilated8 + ppm_bilinear_deepsup</td>
+<td>Yes</td><td>35.34</td><td>77.41</td><td>56.38</td>
+</tr>
+<tr>
+<td rowspan="2">ResNet18_dilated8 + ppm_bilinear_deepsup</td>
 <td>No</td><td>38.00</td><td>78.64</td><td>58.32</td>
-<td>1.1 * 20 = 22.0 hours</td>
+<td rowspan="2">1.1 * 20 = 22.0 hours</td>
+</tr>
+<tr>
+<td>Yes</td><td>38.81</td><td>79.29</td><td>59.05</td>
 </tr>
 <tr>
-<td>ResNet50_dilated8 + c1_bilinear_deepsup</td>
+<td rowspan="2" >ResNet50_dilated8 + c1_bilinear_deepsup</td>
 <td>No</td><td>34.88</td><td>76.54</td><td>55.71</td>
-<td>1.38 * 20 = 27.6 hours</td>
+<td rowspan="2">1.38 * 20 = 27.6 hours</td>
+</tr>
+<tr>
+<td>Yes</td><td>35.49</td><td>77.53</td><td>56.66</td>
 </tr>
 <tr>
 <td rowspan="2">ResNet50_dilated8 + ppm_bilinear_deepsup</td>