Commit 89f6647

[Doc] Update transforms Doc
1 parent eef3888 commit 89f6647

1 file changed: docs/en/advanced_guides/transforms.md (+104, -21 lines)
# Data Transforms

In this tutorial, we introduce the design of the data transformation pipeline in MMSegmentation.

The structure of this guide is as follows:

- [Data Transforms](#data-transforms)
  - [Design of Data pipelines](#design-of-data-pipelines)
  - [Customization data transformation](#customization-data-transformation)
## Design of Data pipelines

Following typical conventions, we use `Dataset` and `DataLoader` for data loading.
Because samples in semantic segmentation may not share the same size,
we introduce a new `DataContainer` type in MMCV to help collect and distribute
data of different size.
See [here](https://github.com/open-mmlab/mmcv/blob/master/mmcv/parallel/data_container.py) for more details.

In the 1.x version of MMSegmentation, all data transformations inherit from `BaseTransform`.
The input and output types of transformations are both dict. A simple example is as follows:

```python
>>> from mmseg.datasets.transforms import LoadAnnotations
>>> transforms = LoadAnnotations()
>>> img_path = './data/cityscapes/leftImg8bit/train/aachen/aachen_000000_000019_leftImg8bit.png'
>>> gt_path = './data/cityscapes/gtFine/train/aachen/aachen_000015_000019_gtFine_instanceTrainIds.png'
>>> results = dict(
...     img_path=img_path,
...     seg_map_path=gt_path,
...     reduce_zero_label=False,
...     seg_fields=[])
>>> data_dict = transforms(results)
>>> print(data_dict.keys())
dict_keys(['img_path', 'seg_map_path', 'reduce_zero_label', 'seg_fields', 'gt_seg_map'])
```
The data preparation pipeline and the dataset are decoupled. Usually a dataset
defines how to process the annotations and a data pipeline defines all the steps to prepare a data dict.
A pipeline consists of a sequence of operations. Each operation takes a dict as input and also outputs a dict for the next transform.
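This commit's diff elides the concrete pipeline definition. As an illustration only, a typical MMSegmentation 1.x training pipeline is sketched below; the exact transforms and parameter values in the actual file may differ:

```python
# Illustrative sketch of a training pipeline config; the parameters shown
# (scale, ratio_range, crop_size, etc.) are common defaults, not necessarily
# the ones used in this repository's docs.
crop_size = (512, 1024)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='RandomResize',
         scale=(2048, 1024), ratio_range=(0.5, 2.0), keep_ratio=True),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='PackSegInputs'),
]
```

Each entry is a plain dict; the registry builds the corresponding transform object from the `type` key when the pipeline is constructed.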
For each operation, we list the related dict fields that are `added`/`updated`/`removed`.
Before pipelines, the information we can directly obtain from the datasets are `img_path` and `seg_map_path`.
4874

4975
### Data loading
5076

51-
`LoadImageFromFile`
77+
`LoadImageFromFile`: Load an image from file.
5278

53-
- add: img, img_shape, ori_shape
79+
- add: `img`, `img_shape`, `ori_shape`
5480

55-
`LoadAnnotations`
81+
`LoadAnnotations`: Load semantic segmentation maps provided by dataset.
5682

57-
- add: seg_fields, gt_seg_map
83+
- add: `seg_fields`, `gt_seg_map`
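Because every transform maps a dict to a dict, composing them is just function chaining. A minimal pure-Python sketch of this convention (a hypothetical `SimpleCompose` with toy loaders, not mmcv's actual `Compose` implementation):

```python
from typing import Callable, Dict, List, Optional

class SimpleCompose:
    """Hypothetical stand-in for mmcv's Compose: chains dict-in/dict-out transforms."""

    def __init__(self, transforms: List[Callable[[dict], Optional[dict]]]):
        self.transforms = transforms

    def __call__(self, results: dict) -> Optional[dict]:
        for t in self.transforms:
            results = t(results)
            if results is None:  # a transform may drop a sample by returning None
                return None
        return results

# Toy transforms that accumulate keys, mimicking the loading steps above.
def fake_load_image(results: dict) -> dict:
    results['img'] = 'loaded from ' + results['img_path']
    return results

def fake_load_annotations(results: dict) -> dict:
    results['gt_seg_map'] = 'loaded from ' + results['seg_map_path']
    return results

pipeline = SimpleCompose([fake_load_image, fake_load_annotations])
out = pipeline({'img_path': 'a.png', 'seg_map_path': 'a_gt.png'})
```

After the chain runs, `out` holds the original path keys plus the keys each toy loader added.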
### Pre-processing

`RandomResize`: Random resize image & segmentation map.

- add: `scale`, `scale_factor`, `keep_ratio`
- update: `img`, `img_shape`, `gt_seg_map`

`Resize`: Resize image & segmentation map.

- add: `scale`, `scale_factor`, `keep_ratio`
- update: `img`, `gt_seg_map`, `img_shape`

`RandomCrop`: Random crop image & segmentation map.

- update: `img`, `gt_seg_map`, `img_shape`

`RandomFlip`: Flip the image & segmentation map.

- add: `flip`, `flip_direction`
- update: `img`, `gt_seg_map`

`PhotoMetricDistortion`: Apply photometric distortion to the image sequentially;
every transformation is applied with a probability of 0.5.
The position of random contrast is second or second to last (mode 0 or 1 below, respectively).

```
1. random brightness
2. random contrast (mode 0)
3. convert color from BGR to HSV
4. random saturation
5. random hue
6. convert color from HSV to BGR
7. random contrast (mode 1)
```

- update: `img`
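The two orderings can be sketched in plain Python. This hypothetical `photometric_order` helper only reproduces the step sequence listed above; the per-step 0.5 probability and the actual pixel operations of the real transform are omitted:

```python
def photometric_order(mode: int) -> list:
    """Return the step order described above for mode 0 or mode 1 (sketch only)."""
    steps = ['brightness']
    if mode == 0:
        steps.append('contrast')  # contrast comes second in mode 0
    steps += ['bgr2hsv', 'saturation', 'hue', 'hsv2bgr']
    if mode == 1:
        steps.append('contrast')  # contrast comes at the end in mode 1
    return steps
```

In the real transform the mode itself is chosen at random per call, so contrast may land before or after the HSV-space distortions.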
### Formatting

`PackSegInputs`: Pack the inputs data for semantic segmentation.

- add: `inputs`, `data_sample`
- remove: keys specified by `meta_keys` (merged into the metainfo of `data_sample`), all other keys
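The key movement above can be sketched in plain Python. This hypothetical `pack_seg_inputs` only illustrates which keys survive; the real `PackSegInputs` produces tensors and a `SegDataSample`, and its default `meta_keys` differ:

```python
def pack_seg_inputs(results: dict,
                    meta_keys=('img_path', 'ori_shape', 'img_shape')) -> dict:
    # Keys named in meta_keys are merged into the metainfo of the data sample...
    metainfo = {k: results[k] for k in meta_keys if k in results}
    data_sample = {'gt_seg_map': results.get('gt_seg_map'), 'metainfo': metainfo}
    # ...and only `inputs` and `data_sample` survive; all other keys are dropped.
    return {'inputs': results['img'], 'data_sample': data_sample}

packed = pack_seg_inputs({'img': 'tensor', 'img_path': 'a.png',
                          'img_shape': (512, 1024), 'ori_shape': (1024, 2048),
                          'gt_seg_map': 'mask', 'scale_factor': 1.0})
```

Note how `scale_factor`, which is not in `meta_keys`, is gone from the packed result entirely.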

## Customization data transformation

The customized data transformation must inherit from `BaseTransform` and implement the `transform` function.
Here we use a simple flipping transformation as an example:

```python
import mmcv
from mmcv.transforms import BaseTransform, TRANSFORMS


@TRANSFORMS.register_module()
class MyFlip(BaseTransform):
    def __init__(self, direction: str):
        super().__init__()
        self.direction = direction

    def transform(self, results: dict) -> dict:
        img = results['img']
        results['img'] = mmcv.imflip(img, direction=self.direction)
        return results
```
Then, we can instantiate a `MyFlip` object and use it to process the data dict:

```python
import numpy as np

transform = MyFlip(direction='horizontal')
data_dict = {'img': np.random.rand(224, 224, 3)}
data_dict = transform(data_dict)
processed_img = data_dict['img']
```
Alternatively, we can use the `MyFlip` transformation in the data pipeline of our config file:

```python
pipeline = [
    ...
    dict(type='MyFlip', direction='horizontal'),
    ...
]
```

Note that if you want to use `MyFlip` in a config file, you must ensure the file containing `MyFlip` is imported while the program runs.
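One common way to guarantee that import in OpenMMLab 2.x configs is the `custom_imports` field; the module path below is hypothetical and should point at wherever you actually placed `MyFlip`:

```python
# Assumed module path -- replace with the real location of your MyFlip file.
custom_imports = dict(
    imports=['mmseg.datasets.transforms.my_flip'],
    allow_failed_imports=False)
```

With this in the config, the module is imported when the config is loaded, so the `@TRANSFORMS.register_module()` decorator runs before the pipeline is built.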
