Commit 6333dc1

[Project] Medical semantic seg dataset: dr_hagis (open-mmlab#2729)
1 parent 81edd98 commit 6333dc1

File tree

7 files changed: +316 −0 lines changed

Lines changed: 155 additions & 0 deletions
# DR HAGIS: Diabetic Retinopathy, Hypertension, Age-related macular degeneration and Glaucoma ImageS

## Description

This project supports **`DR HAGIS: Diabetic Retinopathy, Hypertension, Age-related macular degeneration and Glaucoma ImageS`**, which can be downloaded from [here](https://paperswithcode.com/dataset/dr-hagis).

### Dataset Overview

The DR HAGIS database was created to aid the development of vessel extraction algorithms suitable for retinal screening programmes. Researchers are encouraged to test their segmentation algorithms using this database. All thirty-nine fundus images were obtained from a diabetic retinopathy screening programme in the UK; hence, all images were taken from diabetic patients.

Besides the fundus images, manual segmentations of the retinal surface vessels are provided by an expert grader. These manually segmented images can be used as the ground truth to compare and assess automatic vessel extraction algorithms. Masks of the field of view (FOV) are provided as well, to quantify the accuracy of vessel extraction within the FOV only. The images were acquired in different screening centres and therefore reflect the range of image resolutions, digital cameras and fundus cameras used in the clinic. The fundus images were captured using a Topcon TRC-NW6s, Topcon TRC-NW8 or Canon CR DGi fundus camera with a horizontal 45-degree FOV. The images are 4752x3168, 3456x2304, 3126x2136, 2896x1944 or 2816x1880 pixels in size.
### Original Statistic Information

| Dataset name                                            | Anatomical region | Task type    | Modality           | Num. Classes | Train/Val/Test Images | Train/Val/Test Labeled | Release Date | License |
| ------------------------------------------------------- | ----------------- | ------------ | ------------------ | ------------ | --------------------- | ---------------------- | ------------ | ------- |
| [DR HAGIS](https://paperswithcode.com/dataset/dr-hagis) | head and neck     | segmentation | fundus photography | 2            | 40/-/-                | yes/-/-                | 2017         | -       |

| Class Name | Num. Train | Pct. Train | Num. Val | Pct. Val | Num. Test | Pct. Test |
| :--------: | :--------: | :--------: | :------: | :------: | :-------: | :-------: |
| background |     40     |   96.38    |    -     |    -     |     -     |     -     |
|   vessel   |     40     |    3.62    |    -     |    -     |     -     |     -     |

Note:

- `Pct` means the percentage of pixels belonging to this category among all pixels.
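For reference, the `Pct` values can be recomputed directly from the prepared masks. A minimal sketch (assuming binary `{0, 1}` masks as produced by `tools/prepare_dataset.py`; the helper name is ours, not part of this project):

```python
import numpy as np

def class_pixel_percentages(masks, num_classes=2):
    """Return the percentage of pixels per class over a list of mask arrays."""
    counts = np.zeros(num_classes, dtype=np.int64)
    for mask in masks:
        # bincount tallies pixels per class index; extra values are dropped
        counts += np.bincount(mask.ravel(), minlength=num_classes)[:num_classes]
    return 100.0 * counts / counts.sum()

# toy example: a 4x4 mask with 4 vessel pixels -> 75% background, 25% vessel
toy = np.zeros((4, 4), dtype=np.uint8)
toy[:2, :2] = 1
print(class_pixel_percentages([toy]))  # [75. 25.]
```

Running this over all masks in `data/masks/train/` would reproduce the table's `Pct. Train` column.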
27+
28+
### Visualization
29+
30+
![bac](https://raw.githubusercontent.com/uni-medical/medical-datasets-visualization/main/2d/semantic_seg/fundus_photography/dr_hagis/dr_hagis_dataset.png)
## Usage

### Prerequisites

- Python v3.8
- PyTorch v1.10.0
- [MIM](https://github.com/open-mmlab/mim) v0.3.4
- [MMCV](https://github.com/open-mmlab/mmcv) v2.0.0rc4
- [MMEngine](https://github.com/open-mmlab/mmengine) v0.2.0 or higher
- [MMSegmentation](https://github.com/open-mmlab/mmsegmentation) v1.0.0rc5

All the commands below rely on the correct configuration of `PYTHONPATH`, which should point to the project's directory so that Python can locate the module files. From the `dr_hagis/` root directory, run the following line to add the current directory to `PYTHONPATH`:

```shell
export PYTHONPATH=`pwd`:$PYTHONPATH
```
### Dataset preparing

- Download the dataset from [here](https://paperswithcode.com/dataset/dr-hagis) and decompress the data to the path `'data/'`.
- Run the script `python tools/prepare_dataset.py` to format the data and arrange the folder structure as shown below.
- Run the script `python ../../tools/split_seg_dataset.py` to split the dataset and generate `train.txt`, `val.txt` and `test.txt`. Since labels for an official validation set and test set cannot be obtained, `train.txt` and `val.txt` are generated by randomly splitting the training set.
```none
mmsegmentation
├── mmseg
├── projects
│ ├── medical
│ │ ├── 2d_image
│ │ │ ├── fundus_photography
│ │ │ │ ├── dr_hagis
│ │ │ │ │ ├── configs
│ │ │ │ │ ├── datasets
│ │ │ │ │ ├── tools
│ │ │ │ │ ├── data
│ │ │ │ │ │ ├── train.txt
│ │ │ │ │ │ ├── val.txt
│ │ │ │ │ │ ├── images
│ │ │ │ │ │ │ ├── train
│ │ │ │ │ │ │ │ ├── xxx.png
│ │ │ │ │ │ │ │ ├── ...
│ │ │ │ │ │ │ │ └── xxx.png
│ │ │ │ │ │ ├── masks
│ │ │ │ │ │ │ ├── train
│ │ │ │ │ │ │ │ ├── xxx.png
│ │ │ │ │ │ │ │ ├── ...
│ │ │ │ │ │ │ │ └── xxx.png
```
### Divided Dataset Information

***Note: The table below is based on the train/val split we generated ourselves.***

| Class Name | Num. Train | Pct. Train | Num. Val | Pct. Val | Num. Test | Pct. Test |
| :--------: | :--------: | :--------: | :------: | :------: | :-------: | :-------: |
| background |     32     |   96.21    |    8     |  97.12   |     -     |     -     |
|   vessel   |     32     |    3.79    |    8     |   2.88   |     -     |     -     |
### Training commands

Train models on a single server with one GPU.

```shell
mim train mmseg ./configs/${CONFIG_FILE}
```
### Testing commands

Test models on a single server with one GPU.

```shell
mim test mmseg ./configs/${CONFIG_FILE} --checkpoint ${CHECKPOINT_PATH}
```
105+
106+
<!-- List the results as usually done in other model's README. [Example](https://github.com/open-mmlab/mmsegmentation/tree/dev-1.x/configs/fcn#results-and-models)
107+
108+
You should claim whether this is based on the pre-trained weights, which are converted from the official release; or it's a reproduced result obtained from retraining the model in this project. -->
109+
110+
## Dataset Citation
111+
112+
If this work is helpful for your research, please consider citing the below paper.
113+
114+
```
115+
@article{holm2017dr,
116+
title={DR HAGIS—a fundus image database for the automatic extraction of retinal surface vessels from diabetic patients},
117+
author={Holm, Sven and Russell, Greg and Nourrit, Vincent and McLoughlin, Niall},
118+
journal={Journal of Medical Imaging},
119+
volume={4},
120+
number={1},
121+
pages={014503--014503},
122+
year={2017},
123+
publisher={Society of Photo-Optical Instrumentation Engineers}
124+
}
125+
```
## Checklist

- [x] Milestone 1: PR-ready, and acceptable to be one of the `projects/`.

  - [x] Finish the code
  - [x] Basic docstrings & proper citation
  - [ ] Test-time correctness
  - [x] A full README

- [ ] Milestone 2: Indicates a successful model implementation.

  - [ ] Training-time correctness

- [ ] Milestone 3: Good to be a part of our core package!

  - [ ] Type hints and docstrings
  - [ ] Unit tests
  - [ ] Code polishing
  - [ ] Metafile.yml

- [ ] Move your modules into the core package following the codebase's file hierarchy structure.

- [ ] Refactor your modules into the core package following the codebase's file hierarchy structure.
Lines changed: 42 additions & 0 deletions
dataset_type = 'DRHAGISDataset'
data_root = 'data/'
img_scale = (512, 512)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', scale=img_scale, keep_ratio=False),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='PackSegInputs')
]
# at test time, annotations are loaded after Resize so the ground truth
# keeps its original size for evaluation
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='Resize', scale=img_scale, keep_ratio=False),
    dict(type='LoadAnnotations'),
    dict(type='PackSegInputs')
]
train_dataloader = dict(
    batch_size=16,
    num_workers=4,
    persistent_workers=True,
    sampler=dict(type='InfiniteSampler', shuffle=True),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file='train.txt',
        data_prefix=dict(img_path='images/', seg_map_path='masks/'),
        pipeline=train_pipeline))
val_dataloader = dict(
    batch_size=1,
    num_workers=4,
    persistent_workers=True,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file='val.txt',
        data_prefix=dict(img_path='images/', seg_map_path='masks/'),
        pipeline=test_pipeline))
test_dataloader = val_dataloader
val_evaluator = dict(type='IoUMetric', iou_metrics=['mIoU', 'mDice'])
test_evaluator = dict(type='IoUMetric', iou_metrics=['mIoU', 'mDice'])
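Each dict in `train_pipeline`/`test_pipeline` names a transform that MMEngine looks up in a registry, builds, and applies in order. The mechanism can be sketched in plain Python (an illustrative toy only, not mmengine's actual implementation; the transform names and config keys here are invented):

```python
# toy registry mapping the 'type' field of a config dict to a factory
# that turns the dict into a callable transform
TRANSFORMS = {
    'AddOne': lambda cfg: (lambda data: {**data, 'x': data['x'] + 1}),
    'Scale': lambda cfg: (lambda data: {**data, 'x': data['x'] * cfg['factor']}),
}

def build_pipeline(cfgs):
    """Resolve each config dict to a callable via its 'type' key."""
    return [TRANSFORMS[cfg['type']](cfg) for cfg in cfgs]

def run_pipeline(transforms, data):
    """Thread a data dict through the transforms in order."""
    for transform in transforms:
        data = transform(data)
    return data

pipeline = build_pipeline([dict(type='AddOne'), dict(type='Scale', factor=3)])
print(run_pipeline(pipeline, {'x': 1}))  # {'x': 6}
```

In the real config, `LoadImageFromFile`, `Resize`, etc. are registered transform classes resolved the same way, which is why pipelines are expressed as plain lists of dicts.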
Lines changed: 17 additions & 0 deletions
_base_ = [
    './dr-hagis_512x512.py', 'mmseg::_base_/models/fcn_unet_s5-d16.py',
    'mmseg::_base_/default_runtime.py',
    'mmseg::_base_/schedules/schedule_20k.py'
]
custom_imports = dict(imports='datasets.dr-hagis_dataset')
img_scale = (512, 512)
data_preprocessor = dict(size=img_scale)
optimizer = dict(lr=0.0001)
optim_wrapper = dict(optimizer=optimizer)
model = dict(
    data_preprocessor=data_preprocessor,
    decode_head=dict(num_classes=2),
    auxiliary_head=None,
    test_cfg=dict(mode='whole', _delete_=True))
vis_backends = None
visualizer = dict(vis_backends=vis_backends)
Lines changed: 17 additions & 0 deletions
_base_ = [
    './dr-hagis_512x512.py', 'mmseg::_base_/models/fcn_unet_s5-d16.py',
    'mmseg::_base_/default_runtime.py',
    'mmseg::_base_/schedules/schedule_20k.py'
]
custom_imports = dict(imports='datasets.dr-hagis_dataset')
img_scale = (512, 512)
data_preprocessor = dict(size=img_scale)
optimizer = dict(lr=0.001)
optim_wrapper = dict(optimizer=optimizer)
model = dict(
    data_preprocessor=data_preprocessor,
    decode_head=dict(num_classes=2),
    auxiliary_head=None,
    test_cfg=dict(mode='whole', _delete_=True))
vis_backends = None
visualizer = dict(vis_backends=vis_backends)
Lines changed: 17 additions & 0 deletions
_base_ = [
    './dr-hagis_512x512.py', 'mmseg::_base_/models/fcn_unet_s5-d16.py',
    'mmseg::_base_/default_runtime.py',
    'mmseg::_base_/schedules/schedule_20k.py'
]
custom_imports = dict(imports='datasets.dr-hagis_dataset')
img_scale = (512, 512)
data_preprocessor = dict(size=img_scale)
optimizer = dict(lr=0.01)
optim_wrapper = dict(optimizer=optimizer)
model = dict(
    data_preprocessor=data_preprocessor,
    decode_head=dict(num_classes=2),
    auxiliary_head=None,
    test_cfg=dict(mode='whole', _delete_=True))
vis_backends = None
visualizer = dict(vis_backends=vis_backends)
Lines changed: 27 additions & 0 deletions
from mmseg.datasets import BaseSegDataset
from mmseg.registry import DATASETS


@DATASETS.register_module()
class DRHAGISDataset(BaseSegDataset):
    """DRHAGISDataset dataset.

    In segmentation map annotation for DRHAGISDataset,
    ``reduce_zero_label`` is fixed to False. The ``img_suffix``
    is fixed to '.png' and ``seg_map_suffix`` is fixed to '.png'.

    Args:
        img_suffix (str): Suffix of images. Default: '.png'
        seg_map_suffix (str): Suffix of segmentation maps. Default: '.png'
    """
    METAINFO = dict(classes=('background', 'vessel'))

    def __init__(self,
                 img_suffix='.png',
                 seg_map_suffix='.png',
                 **kwargs) -> None:
        super().__init__(
            img_suffix=img_suffix,
            seg_map_suffix=seg_map_suffix,
            reduce_zero_label=False,
            **kwargs)
Lines changed: 41 additions & 0 deletions
import glob
import os
import shutil

import mmengine
import numpy as np
from PIL import Image

root_path = 'data/'
img_suffix = '.jpg'
seg_map_suffix = '_manual_orig.png'
save_img_suffix = '.png'
save_seg_map_suffix = '.png'

x_train = glob.glob(os.path.join('data/DRHAGIS/**/*' + img_suffix))

mmengine.mkdir_or_exist(root_path + 'images/train/')
mmengine.mkdir_or_exist(root_path + 'masks/train/')

part_dir_dict = {0: 'train/', 1: 'val/'}
for ith, part in enumerate([x_train]):
    part_dir = part_dir_dict[ith]
    for img in part:
        basename = os.path.basename(img)
        # copy the fundus image into images/<split>/, saving it under a
        # '.png' filename
        shutil.copy(
            img, root_path + 'images/' + part_dir + basename.split('.')[0] +
            save_img_suffix)
        # remap the expert annotation from {0, 255} to class indices {0, 1}
        mask_path = root_path + 'DRHAGIS/Manual_Segmentations/' + basename.split(  # noqa
            '.')[0] + seg_map_suffix
        save_mask_path = root_path + 'masks/' + part_dir + basename.split(
            '.')[0] + save_seg_map_suffix  # noqa
        mask = np.array(Image.open(mask_path)).astype(np.uint8)
        mask[mask == 255] = 1
        Image.fromarray(mask).save(save_mask_path)
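The essential step in the script above is remapping foreground pixels from 255 to 1 before saving, so the masks hold class indices rather than grayscale intensities. A quick self-check of that remapping on a synthetic array (the helper name is ours):

```python
import numpy as np

def binarize_mask(mask):
    """Map a {0, 255} uint8 mask to {0, 1} class indices, as prepare_dataset.py does."""
    out = np.asarray(mask, dtype=np.uint8).copy()
    out[out == 255] = 1  # foreground -> class index 1; background 0 unchanged
    return out

demo = np.array([[0, 255], [255, 0]], dtype=np.uint8)
print(binarize_mask(demo).tolist())  # [[0, 1], [1, 0]]
```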
