# Data Transforms

In this tutorial, we introduce the design of the transforms pipeline in MMSegmentation.

The structure of this guide is as follows:

- [Data Transforms](#data-transforms)
- [Design of Data pipelines](#design-of-data-pipelines)
- [Customization data transformation](#customization-data-transformation)

## Design of Data pipelines

Following typical conventions, we use `Dataset` and `DataLoader` for data loading
with multiple workers. Since the data in semantic segmentation may not be the same size,
we introduce a new `DataContainer` type in MMCV to help collect and distribute
data of different sizes.
See [here](https://github.com/open-mmlab/mmcv/blob/master/mmcv/parallel/data_container.py) for more details.

In the 1.x version of MMSegmentation, all data transformations inherit from `BaseTransform`.
The input and output types of transformations are both dict. A simple example is as follows:

```python
>>> from mmseg.datasets.transforms import LoadAnnotations
>>> transforms = LoadAnnotations()
>>> img_path = './data/cityscapes/leftImg8bit/train/aachen/aachen_000000_000019_leftImg8bit.png'
>>> gt_path = './data/cityscapes/gtFine/train/aachen/aachen_000015_000019_gtFine_instanceTrainIds.png'
>>> results = dict(
>>>     img_path=img_path,
>>>     seg_map_path=gt_path,
>>>     reduce_zero_label=False,
>>>     seg_fields=[])
>>> data_dict = transforms(results)
>>> print(data_dict.keys())
dict_keys(['img_path', 'seg_map_path', 'reduce_zero_label', 'seg_fields', 'gt_seg_map'])
```

The data preparation pipeline and the dataset are decomposed. Usually a dataset
defines how to process the annotations and a data pipeline defines all the steps to prepare a data dict.
A pipeline consists of a sequence of operations. Each operation takes a dict as input and also outputs a dict for the next transform.
Here is a typical training and testing pipeline:

```python
crop_size = (512, 1024)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(
        type='RandomResize',
        scale=(2048, 1024),
        ratio_range=(0.5, 2.0),
        keep_ratio=True),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='PackSegInputs')
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='Resize', scale=(2048, 1024), keep_ratio=True),
    dict(type='LoadAnnotations'),
    dict(type='PackSegInputs')
]
```
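
The dict-in/dict-out contract of a pipeline can be sketched with plain functions standing in for real transforms (all names below are illustrative, not MMSegmentation APIs):

```python
# Minimal sketch of how a pipeline chains transforms: each step receives a
# dict and returns a (possibly modified) dict for the next step.
def fake_load(results):
    # pretend we loaded an image from results['img_path']
    results['img'] = [[0, 1], [2, 3]]
    return results

def fake_flip(results):
    # horizontal flip: reverse each row of the "image"
    results['img'] = [row[::-1] for row in results['img']]
    return results

pipeline = [fake_load, fake_flip]

results = dict(img_path='demo.png')
for transform in pipeline:
    results = transform(results)

print(results['img'])  # [[1, 0], [3, 2]]
```

Because every transform shares this signature, steps can be reordered, inserted, or removed in the config without touching the others.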

For each operation, we list the related dict fields that are `added`/`updated`/`removed`.
Before pipelines, the information we can directly obtain from the datasets are `img_path` and `seg_map_path`.

### Data loading

`LoadImageFromFile`: Load an image from file.

- add: `img`, `img_shape`, `ori_shape`

`LoadAnnotations`: Load semantic segmentation maps provided by the dataset.

- add: `seg_fields`, `gt_seg_map`

### Pre-processing

`RandomResize`: Randomly resize the image and segmentation map.

- add: `scale`, `scale_factor`, `keep_ratio`
- update: `img`, `img_shape`, `gt_seg_map`

`Resize`: Resize the image and segmentation map.

- add: `scale`, `scale_factor`, `keep_ratio`
- update: `img`, `gt_seg_map`, `img_shape`

`RandomCrop`: Randomly crop the image and segmentation map.

- update: `img`, `gt_seg_map`, `img_shape`

`RandomFlip`: Flip the image and segmentation map.

- add: `flip`, `flip_direction`
- update: `img`, `gt_seg_map`

`PhotoMetricDistortion`: Apply photometric distortion to the image sequentially;
each transformation is applied with a probability of 0.5.
The position of random contrast is second or second to last (mode 0 or 1 below, respectively).

```
1. random brightness
2. random contrast (mode 0)
3. convert color from BGR to HSV
4. random saturation
5. random hue
6. convert color from HSV to BGR
7. random contrast (mode 1)
```

- update: `img`

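As a sketch (not the actual `PhotoMetricDistortion` code), the mode-dependent ordering amounts to:

```python
# Illustrative sketch of the ordering only: `mode` decides whether random
# contrast runs second (mode 0) or last (mode 1). Each real step also fires
# with probability 0.5, which is omitted here.
def distortion_order(mode):
    ops = ['brightness']
    if mode == 0:
        ops.append('contrast')
    ops += ['BGR->HSV', 'saturation', 'hue', 'HSV->BGR']
    if mode == 1:
        ops.append('contrast')
    return ops

print(distortion_order(0))  # contrast in second place
print(distortion_order(1))  # contrast in last place
```
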
### Formatting

`PackSegInputs`: Pack the inputs data for semantic segmentation.

- add: `inputs`, `data_sample`
- remove: keys specified by `meta_keys` (merged into the metainfo of data_sample), all other keys

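The role of `meta_keys` can be sketched in plain Python (illustrative only, not the real `PackSegInputs` implementation, which produces tensors and a data-sample object):

```python
# Illustrative sketch: keys listed in META_KEYS are merged into the metainfo
# of the data sample, 'img' and 'gt_seg_map' are packed, everything else is
# dropped. META_KEYS here is a made-up subset for demonstration.
META_KEYS = ('img_path', 'ori_shape', 'img_shape')

def pack(results):
    return {
        'inputs': results.pop('img'),
        'data_sample': {
            'gt_seg_map': results.pop('gt_seg_map'),
            'metainfo': {k: results[k] for k in META_KEYS if k in results},
        },
    }

packed = pack({'img': 'IMG', 'gt_seg_map': 'GT', 'img_path': 'a.png',
               'img_shape': (512, 512), 'scale_factor': 1.0})
print(sorted(packed))  # ['data_sample', 'inputs']
```

Note how `scale_factor`, absent from `META_KEYS`, does not survive packing.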

## Customization data transformation

The customized data transformation must inherit from `BaseTransform` and implement the `transform` function.
Here we use a simple flipping transformation as an example:

```python
import mmcv
from mmcv.transforms import BaseTransform, TRANSFORMS


@TRANSFORMS.register_module()
class MyFlip(BaseTransform):

    def __init__(self, direction: str):
        super().__init__()
        self.direction = direction

    def transform(self, results: dict) -> dict:
        img = results['img']
        results['img'] = mmcv.imflip(img, direction=self.direction)
        return results
```

Thus, we can instantiate a `MyFlip` object and use it to process the data dict.

```python
import numpy as np

transform = MyFlip(direction='horizontal')
data_dict = {'img': np.random.rand(224, 224, 3)}
data_dict = transform(data_dict)
processed_img = data_dict['img']
```

Or, we can use the `MyFlip` transformation in the data pipeline of our config file:

```python
pipeline = [
    ...
    dict(type='MyFlip', direction='horizontal'),
    ...
]
```

Note that if you want to use `MyFlip` in a config file, you must ensure that the file containing `MyFlip` is imported while the program is running.
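
One way to guarantee that import, assuming an MMEngine-style config and a hypothetical module `my_pipelines.py` available on the `PYTHONPATH`, is the `custom_imports` field:

```python
# In the config file: ask the runner to import the module that registers
# MyFlip before the pipeline is built. 'my_pipelines' is a hypothetical
# module name; replace it with the real import path of your transform.
custom_imports = dict(imports=['my_pipelines'], allow_failed_imports=False)
```

With `allow_failed_imports=False`, a typo in the module path fails loudly at startup instead of surfacing later as an unregistered-transform error.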