
Commit 16013dd: add readme
1 parent 28a92ae

3 files changed: +109 -191 lines

README.md

Lines changed: 107 additions & 188 deletions
@@ -1,196 +1,115 @@
-<div align="center">
-<img src="resources/mmseg-logo.png" width="600"/>
-<div>&nbsp;</div>
-<div align="center">
-<b><font size="5">OpenMMLab website</font></b>
-<sup>
-<a href="https://openmmlab.com">
-<i><font size="4">HOT</font></i>
-</a>
-</sup>
-&nbsp;&nbsp;&nbsp;&nbsp;
-<b><font size="5">OpenMMLab platform</font></b>
-<sup>
-<a href="https://platform.openmmlab.com">
-<i><font size="4">TRY IT OUT</font></i>
-</a>
-</sup>
-</div>
-<div>&nbsp;</div>
-</div>
-<br />
-
-[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/mmsegmentation)](https://pypi.org/project/mmsegmentation/)
-[![PyPI](https://img.shields.io/pypi/v/mmsegmentation)](https://pypi.org/project/mmsegmentation)
-[![docs](https://img.shields.io/badge/docs-latest-blue)](https://mmsegmentation.readthedocs.io/en/latest/)
-[![badge](https://github.com/open-mmlab/mmsegmentation/workflows/build/badge.svg)](https://github.com/open-mmlab/mmsegmentation/actions)
-[![codecov](https://codecov.io/gh/open-mmlab/mmsegmentation/branch/master/graph/badge.svg)](https://codecov.io/gh/open-mmlab/mmsegmentation)
-[![license](https://img.shields.io/github/license/open-mmlab/mmsegmentation.svg)](https://github.com/open-mmlab/mmsegmentation/blob/master/LICENSE)
-[![issue resolution](https://isitmaintained.com/badge/resolution/open-mmlab/mmsegmentation.svg)](https://github.com/open-mmlab/mmsegmentation/issues)
-[![open issues](https://isitmaintained.com/badge/open/open-mmlab/mmsegmentation.svg)](https://github.com/open-mmlab/mmsegmentation/issues)
-
-Documentation: https://mmsegmentation.readthedocs.io/
-
-English | [简体中文](README_zh-CN.md)
-
-## Introduction
-
-MMSegmentation is an open source semantic segmentation toolbox based on PyTorch.
-It is a part of the OpenMMLab project.
-
-The master branch works with **PyTorch 1.5+**.
-
-![demo image](resources/seg_demo.gif)
-
-### Major features
-
-- **Unified Benchmark**
-
-We provide a unified benchmark toolbox for various semantic segmentation methods.
-
-- **Modular Design**
-
-We decompose the semantic segmentation framework into different components and one can easily construct a customized semantic segmentation framework by combining different modules.
-
-- **Support of multiple methods out of box**
-
-The toolbox directly supports popular and contemporary semantic segmentation frameworks, *e.g.* PSPNet, DeepLabV3, PSANet, DeepLabV3+, etc.
-
-- **High efficiency**
-
-The training speed is faster than or comparable to other codebases.
-
-## License
-
-This project is released under the [Apache 2.0 license](LICENSE).
-
-## Changelog
-
-v0.20.2 was released in 12/15/2021.
-Please refer to [changelog.md](docs/en/changelog.md) for details and release history.
-
-## Benchmark and model zoo
-
-Results and models are available in the [model zoo](docs/en/model_zoo.md).
-
-Supported backbones:
-
-- [x] ResNet (CVPR'2016)
-- [x] ResNeXt (CVPR'2017)
-- [x] [HRNet (CVPR'2019)](configs/hrnet)
-- [x] [ResNeSt (ArXiv'2020)](configs/resnest)
-- [x] [MobileNetV2 (CVPR'2018)](configs/mobilenet_v2)
-- [x] [MobileNetV3 (ICCV'2019)](configs/mobilenet_v3)
-- [x] [Vision Transformer (ICLR'2021)](configs/vit)
-- [x] [Swin Transformer (ICCV'2021)](configs/swin)
-- [x] [Twins (NeurIPS'2021)](configs/twins)
-
-Supported methods:
-
-- [x] [FCN (CVPR'2015/TPAMI'2017)](configs/fcn)
-- [x] [ERFNet (T-ITS'2017)](configs/erfnet)
-- [x] [UNet (MICCAI'2016/Nat. Methods'2019)](configs/unet)
-- [x] [PSPNet (CVPR'2017)](configs/pspnet)
-- [x] [DeepLabV3 (ArXiv'2017)](configs/deeplabv3)
-- [x] [BiSeNetV1 (ECCV'2018)](configs/bisenetv1)
-- [x] [PSANet (ECCV'2018)](configs/psanet)
-- [x] [DeepLabV3+ (CVPR'2018)](configs/deeplabv3plus)
-- [x] [UPerNet (ECCV'2018)](configs/upernet)
-- [x] [ICNet (ECCV'2018)](configs/icnet)
-- [x] [NonLocal Net (CVPR'2018)](configs/nonlocal_net)
-- [x] [EncNet (CVPR'2018)](configs/encnet)
-- [x] [Semantic FPN (CVPR'2019)](configs/sem_fpn)
-- [x] [DANet (CVPR'2019)](configs/danet)
-- [x] [APCNet (CVPR'2019)](configs/apcnet)
-- [x] [EMANet (ICCV'2019)](configs/emanet)
-- [x] [CCNet (ICCV'2019)](configs/ccnet)
-- [x] [DMNet (ICCV'2019)](configs/dmnet)
-- [x] [ANN (ICCV'2019)](configs/ann)
-- [x] [GCNet (ICCVW'2019/TPAMI'2020)](configs/gcnet)
-- [x] [FastFCN (ArXiv'2019)](configs/fastfcn)
-- [x] [Fast-SCNN (ArXiv'2019)](configs/fastscnn)
-- [x] [ISANet (ArXiv'2019/IJCV'2021)](configs/isanet)
-- [x] [OCRNet (ECCV'2020)](configs/ocrnet)
-- [x] [DNLNet (ECCV'2020)](configs/dnlnet)
-- [x] [PointRend (CVPR'2020)](configs/point_rend)
-- [x] [CGNet (TIP'2020)](configs/cgnet)
-- [x] [BiSeNetV2 (IJCV'2021)](configs/bisenetv2)
-- [x] [STDC (CVPR'2021)](configs/stdc)
-- [x] [SETR (CVPR'2021)](configs/setr)
-- [x] [DPT (ArXiv'2021)](configs/dpt)
-- [x] [SegFormer (NeurIPS'2021)](configs/segformer)
-- [x] [MaskCLIP](configs/maskclip)
-- [x] [MaskCLIP+](configs/maskclip_plus)
+# Extract Free Dense Labels from CLIP
+```
+███╗ ███╗ █████╗ ███████╗██╗ ██╗ ██████╗██╗ ██╗██████╗
+████╗ ████║██╔══██╗██╔════╝██║ ██╔╝██╔════╝██║ ██║██╔══██╗
+██╔████╔██║███████║███████╗█████╔╝ ██║ ██║ ██║██████╔╝
+██║╚██╔╝██║██╔══██║╚════██║██╔═██╗ ██║ ██║ ██║██╔═══╝
+██║ ╚═╝ ██║██║ ██║███████║██║ ██╗╚██████╗███████╗██║██║
+╚═╝ ╚═╝╚═╝ ╚═╝╚══════╝╚═╝ ╚═╝ ╚═════╝╚══════╝╚═╝╚═╝
+```
+This is the code for our paper: [Extract Free Dense Labels from CLIP](https://arxiv.org/abs/2112.01071).
+
+This repo is a fork of [mmsegmentation](https://github.com/open-mmlab/mmsegmentation), so the installation and data preparation are pretty similar.
+
+# Installation
+**Step 0.** Install PyTorch and Torchvision following the [official instructions](https://pytorch.org/get-started/locally/), e.g.,
+
+```shell
+pip install torch torchvision
+# FYI, we're using torch==1.9.1 and torchvision==0.10.1
+```
+
+**Step 1.** Install [MMCV](https://github.com/open-mmlab/mmcv) using [MIM](https://github.com/open-mmlab/mim).
+```shell
+pip install -U openmim
+mim install mmcv-full
+```
+
+**Step 2.** Install [CLIP](https://github.com/openai/CLIP).
+```shell
+pip install ftfy regex tqdm
+pip install git+https://github.com/openai/CLIP.git
+```
+
+**Step 3.** Install MaskCLIP.
+```shell
+git clone https://github.com/chongzhou96/MaskCLIP.git
+cd MaskCLIP
+pip install -v -e .
+# "-v" means verbose, or more output
+# "-e" means installing a project in editable mode,
+# thus any local modifications made to the code will take effect without reinstallation.
+```
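As a quick sanity check that the installation above succeeded, the core packages should all import cleanly. A minimal sketch (illustrative only; the printed versions depend on your environment):

```python
# Verify that PyTorch, MMCV, MaskCLIP (mmseg) and CLIP are importable.
import torch
import mmcv
import mmseg   # installed in editable mode by "pip install -v -e ."
import clip    # installed from the OpenAI CLIP repository

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("mmcv:", mmcv.__version__)
print("mmseg:", mmseg.__version__)
print("CLIP models:", clip.available_models())
```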
+
+# Dataset Preparation
+Please refer to [dataset_prepare.md](docs/en/dataset_prepare.md#prepare-datasets). In our paper, we experiment with [Pascal VOC](docs/en/dataset_prepare.md#pascal-voc), [Pascal Context](docs/en/dataset_prepare.md#pascal-context), and [COCO Stuff 164k](docs/en/dataset_prepare.md#coco-stuff-164k).
+
+# MaskCLIP
+MaskCLIP doesn't require any training. We only need to (1) download and convert the CLIP model and (2) prepare the text embeddings of the objects of interest.

-Supported datasets:
+**Step 0.** Download and convert the CLIP models, e.g.,
+```shell
+mkdir -p pretrain
+python tools/maskclip_utils/convert_clip_weights.py --model ViT16 --backbone
+# Other options for model: RN50, RN101, RN50x4, RN50x16, RN50x64, ViT32, ViT16, ViT14
+```
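For intuition, the conversion essentially pulls the image-encoder ("backbone") weights out of the released CLIP checkpoint and re-saves them for the mmseg-style backbone. A rough sketch of that idea, assuming the ViT-B/16 CLIP model (the real `tools/maskclip_utils/convert_clip_weights.py` additionally renames keys to match this repo's modules; the output filename here is a placeholder):

```python
# Illustrative sketch: extract CLIP's visual encoder weights with the official CLIP API.
import clip
import torch

model, _ = clip.load("ViT-B/16", device="cpu")   # downloads the released CLIP weights on first use
visual_state = model.visual.state_dict()         # image-encoder parameters only

# The real converter remaps these keys to the repo's backbone layout; here we just save them as-is.
torch.save({"state_dict": visual_state}, "pretrain/ViT16_visual_raw.pth")
print(f"Saved {len(visual_state)} tensors")
```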
+
+**Step 1.** Prepare the text embeddings of the objects of interest, e.g.,
+```shell
+python tools/maskclip_utils/prompt_engineering.py --model ViT16 --class-set context
+# Other options for model: RN50, RN101, RN50x4, RN50x16, ViT32, ViT16
+# Other options for class-set: voc, context, stuff
+# Actually, we've played around with many more interesting target classes. (See prompt_engineering.py)
+```
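Under the hood, prompt engineering amounts to filling class names into prompt templates, encoding them with CLIP's text encoder, and averaging the normalized embeddings per class. A simplified sketch, assuming ViT-B/16 and hand-picked placeholder templates and class names (the real script ships a much larger template set, the full voc/context/stuff class lists, and its own output path):

```python
# Simplified sketch of building per-class text embeddings with CLIP.
import clip
import torch

model, _ = clip.load("ViT-B/16", device="cpu")
templates = ["a photo of a {}.", "a photo of the {}."]   # placeholder subset of templates
class_names = ["aeroplane", "bicycle", "bird"]           # placeholder subset of a class set

per_class = []
with torch.no_grad():
    for name in class_names:
        tokens = clip.tokenize([t.format(name) for t in templates])
        feats = model.encode_text(tokens)                 # (num_templates, embed_dim)
        feats = feats / feats.norm(dim=-1, keepdim=True)  # L2-normalize before averaging
        per_class.append(feats.mean(dim=0))

torch.save(torch.stack(per_class), "pretrain/demo_text_embeddings.pth")
```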
+
+**Step 2.** Get quantitative results (mIoU):
+```shell
+python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --eval mIoU
+# e.g., python tools/test.py configs/maskclip/maskclip_vit16_520x520_pascal_context_59.py pretrain/ViT16_clip_backbone.pth --eval mIoU
+```
+
+**Step 3. (optional)** Get qualitative results:
+```shell
+python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --show-dir ${OUTPUT_DIR}
+# e.g., python tools/test.py configs/maskclip/maskclip_vit16_520x520_pascal_context_59.py pretrain/ViT16_clip_backbone.pth --show-dir output/
+```
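Single images can also be segmented and visualized through the standard mmsegmentation Python API instead of `tools/test.py`. A minimal sketch, reusing the config and converted backbone from the example above and assuming the text embeddings from Step 1 are already where the config expects them (`demo.jpg` is a placeholder image path):

```python
# Minimal single-image inference sketch via the mmsegmentation API.
from mmseg.apis import inference_segmentor, init_segmentor, show_result_pyplot

config = 'configs/maskclip/maskclip_vit16_520x520_pascal_context_59.py'
checkpoint = 'pretrain/ViT16_clip_backbone.pth'

model = init_segmentor(config, checkpoint, device='cuda:0')
result = inference_segmentor(model, 'demo.jpg')        # list with one (H, W) label map
show_result_pyplot(model, 'demo.jpg', result, opacity=0.5)
```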

-- [x] [Cityscapes](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/en/dataset_prepare.md#cityscapes)
-- [x] [PASCAL VOC](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/en/dataset_prepare.md#pascal-voc)
-- [x] [ADE20K](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/en/dataset_prepare.md#ade20k)
-- [x] [Pascal Context](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/en/dataset_prepare.md#pascal-context)
-- [x] [COCO-Stuff 10k](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/en/dataset_prepare.md#coco-stuff-10k)
-- [x] [COCO-Stuff 164k](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/en/dataset_prepare.md#coco-stuff-164k)
-- [x] [CHASE_DB1](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/en/dataset_prepare.md#chase-db1)
-- [x] [DRIVE](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/en/dataset_prepare.md#drive)
-- [x] [HRF](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/en/dataset_prepare.md#hrf)
-- [x] [STARE](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/en/dataset_prepare.md#stare)
-- [x] [Dark Zurich](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/en/dataset_prepare.md#dark-zurich)
-- [x] [Nighttime Driving](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/en/dataset_prepare.md#nighttime-driving)
-- [x] [LoveDA](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/en/dataset_prepare.md#loveda)
+# MaskCLIP+
+MaskCLIP+ trains another segmentation model with pseudo labels extracted from MaskCLIP.

-## Installation
+**Step 0.** Download and convert the CLIP models, e.g.,
+```shell
+mkdir -p pretrain
+python tools/maskclip_utils/convert_clip_weights.py --model ViT16
+# Other options for model: RN50, RN101, RN50x4, RN50x16, RN50x64, ViT32, ViT16, ViT14
+```
+
+**Step 1.** Prepare the text embeddings of the target dataset, e.g.,
+```shell
+python tools/maskclip_utils/prompt_engineering.py --model ViT16 --class-set context
+# Other options for model: RN50, RN101, RN50x4, RN50x16, ViT32, ViT16
+# Other options for class-set: voc, context, stuff
+```

-Please refer to [get_started.md](docs/en/get_started.md#installation) for installation and [dataset_prepare.md](docs/en/dataset_prepare.md#prepare-datasets) for dataset preparation.
+**Train.** Depending on your setup (single/multiple GPU(s), multiple machines), the training script can be different. Here, we give an example of multiple GPUs on a single machine. For more information, please refer to [train.md](docs/en/train.md).
+```shell
+sh tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM}
+# e.g., sh tools/dist_train.sh configs/maskclip_plus/zero_shot/maskclip_plus_r50_deeplabv3plus_r101-d8_480x480_40k_pascal_context.py 4
+```

-## Get Started
-
-Please see [train.md](docs/en/train.md) and [inference.md](docs/en/inference.md) for the basic usage of MMSegmentation.
-There are also tutorials for [customizing dataset](docs/en/tutorials/customize_datasets.md), [designing data pipeline](docs/en/tutorials/data_pipeline.md), [customizing modules](docs/en/tutorials/customize_models.md), and [customizing runtime](docs/en/tutorials/customize_runtime.md).
-We also provide many [training tricks](docs/en/tutorials/training_tricks.md) for better training and [useful tools](docs/en/useful_tools.md) for deployment.
-
-A Colab tutorial is also provided. You may preview the notebook [here](demo/MMSegmentation_Tutorial.ipynb) or directly [run](https://colab.research.google.com/github/open-mmlab/mmsegmentation/blob/master/demo/MMSegmentation_Tutorial.ipynb) on Colab.
-
-## Citation
-
-If you find this project useful in your research, please consider cite:
-
-```latex
-@misc{mmseg2020,
-title={{MMSegmentation}: OpenMMLab Semantic Segmentation Toolbox and Benchmark},
-author={MMSegmentation Contributors},
-howpublished = {\url{https://github.com/open-mmlab/mmsegmentation}},
-year={2020}
+**Inference.** See step 2 and step 3 under the MaskCLIP section. (We will release the trained models soon.)
+
+
+# Citation
+If you use MaskCLIP or this code base in your work, please cite:
+```
+@InProceedings{zhou2022maskclip,
+author = {Zhou, Chong and Loy, Chen Change and Dai, Bo},
+title = {Extract Free Dense Labels from CLIP},
+booktitle = {European Conference on Computer Vision (ECCV)},
+year = {2022}
}
```

-## Contributing
-
-We appreciate all contributions to improve MMSegmentation. Please refer to [CONTRIBUTING.md](.github/CONTRIBUTING.md) for the contributing guideline.
-
-## Acknowledgement
-
-MMSegmentation is an open source project that welcome any contribution and feedback.
-We wish that the toolbox and benchmark could serve the growing research
-community by providing a flexible as well as standardized toolkit to reimplement existing methods
-and develop their own new semantic segmentation methods.
-
-## Projects in OpenMMLab
-
-- [MMCV](https://github.com/open-mmlab/mmcv): OpenMMLab foundational library for computer vision.
-- [MMClassification](https://github.com/open-mmlab/mmclassification): OpenMMLab image classification toolbox and benchmark.
-- [MMDetection](https://github.com/open-mmlab/mmdetection): OpenMMLab detection toolbox and benchmark.
-- [MMDetection3D](https://github.com/open-mmlab/mmdetection3d): OpenMMLab's next-generation platform for general 3D object detection.
-- [MMSegmentation](https://github.com/open-mmlab/mmsegmentation): OpenMMLab semantic segmentation toolbox and benchmark.
-- [MMAction2](https://github.com/open-mmlab/mmaction2): OpenMMLab's next-generation action understanding toolbox and benchmark.
-- [MMTracking](https://github.com/open-mmlab/mmtracking): OpenMMLab video perception toolbox and benchmark.
-- [MMPose](https://github.com/open-mmlab/mmpose): OpenMMLab pose estimation toolbox and benchmark.
-- [MMEditing](https://github.com/open-mmlab/mmediting): OpenMMLab image and video editing toolbox.
-- [MMOCR](https://github.com/open-mmlab/mmocr): A Comprehensive Toolbox for Text Detection, Recognition and Understanding.
-- [MMGeneration](https://github.com/open-mmlab/mmgeneration): A powerful toolkit for generative models.
-- [MIM](https://github.com/open-mmlab/mim): MIM Installs OpenMMLab Packages.
-- [MMFlow](https://github.com/open-mmlab/mmflow): OpenMMLab optical flow toolbox and benchmark.
-- [MMFewShot](https://github.com/open-mmlab/mmfewshot): OpenMMLab few shot learning toolbox and benchmark.
-- [MMHuman3D](https://github.com/open-mmlab/mmhuman3d): OpenMMLab 3D human parametric model toolbox and benchmark.
-- [MMSelfSup](https://github.com/open-mmlab/mmselfsup): OpenMMLab self-supervised learning toolbox and benchmark.
-- [MMRazor](https://github.com/open-mmlab/mmrazor): OpenMMLab Model Compression Toolbox and Benchmark.
-- [MMDeploy](https://github.com/open-mmlab/mmdeploy): OpenMMLab Model Deployment Framework.
+# Contact
+For questions about our paper or code, please contact [Chong Zhou](mailto:[email protected]).

docs/en/dataset_prepare.md

Lines changed: 2 additions & 2 deletions
@@ -1,10 +1,10 @@
## Prepare datasets

-It is recommended to symlink the dataset root to `$MMSEGMENTATION/data`.
+It is recommended to symlink the dataset root to `$MASKCLIP/data`.
If your folder structure is different, you may need to change the corresponding paths in config files.

```none
-mmsegmentation
+maskclip
├── mmseg
├── tools
├── configs

mmseg/models/backbones/vit.py

Lines changed: 0 additions & 1 deletion
@@ -465,7 +465,6 @@ def forward(self, inputs):

    def train(self, mode=True):
        super(VisionTransformer, self).train(mode)
-        self._freeze()
        if mode and self.norm_eval:
            for m in self.modules():
                if isinstance(m, nn.LayerNorm):
