Commit d3f2922

[Project] Medical semantic seg dataset: Chest x ray images with pneumothorax masks (open-mmlab#2687)
1 parent c923f4d commit d3f2922

9 files changed, 305 additions and 6 deletions

projects/medical/2d_image/histopathology/pannuke/README.md

Lines changed: 0 additions & 6 deletions
@@ -121,12 +121,6 @@ To test models on a single server with one GPU. (default)
 mim test mmseg ./configs/${CONFIG_FILE} --checkpoint ${CHECKPOINT_PATH}
 ```
 
-<!-- List the results as usually done in other model's README. [Example](https://github.com/open-mmlab/mmsegmentation/tree/dev-1.x/configs/fcn#results-and-models)
-
-You should claim whether this is based on the pre-trained weights, which are converted from the official release; or it's a reproduced result obtained from retraining the model in this project. -->
-
-12x512 | 0.0001 | 58.87 | 62.42 | [config](https://github.com/open-mmlab/mmsegmentation/tree/dev-1.x/projects/medical/2d_image/histopathology/pannuke/configs/fcn-unet-s5-d16_unet_1xb16-0.0001-20k_pannuke-512x512.py) |
-
 ## Checklist
 
 - [x] Milestone 1: PR-ready, and acceptable to be one of the `projects/`.
Lines changed: 119 additions & 0 deletions
@@ -0,0 +1,119 @@
+# Chest X-ray Images with Pneumothorax Masks
+
+## Description
+
+This project supports **`Chest X-ray Images with Pneumothorax Masks`**. The dataset used in this project can be downloaded from [here](https://www.kaggle.com/datasets/vbookshelf/pneumothorax-chest-xray-images-and-masks).
+
+### Dataset Overview
+
+A pneumothorax (noo-moe-THOR-aks) is a collapsed lung. A pneumothorax occurs when air leaks into the space between your lung and chest wall. This air pushes on the outside of your lung and makes it collapse. Pneumothorax can be a complete lung collapse or a collapse of only a portion of the lung.
+
+A pneumothorax can be caused by a blunt or penetrating chest injury, certain medical procedures, or damage from underlying lung disease. Or it may occur for no obvious reason. Symptoms usually include sudden chest pain and shortness of breath. On some occasions, a collapsed lung can be a life-threatening event.
+
+Treatment for a pneumothorax usually involves inserting a needle or chest tube between the ribs to remove the excess air. However, a small pneumothorax may heal on its own.
+
+### Statistical Information
+
+| Dataset Name | Anatomical Region | Task type | Modality | Num. Classes | Train/Val/Test Images | Train/Val/Test Labeled | Release date | License |
+| --- | --- | --- | --- | --- | --- | --- | --- | --- |
+| [Chest-x-ray-images-with-pneumothorax-masks](https://www.kaggle.com/datasets/vbookshelf/pneumothorax-chest-xray-images-and-masks) | thorax | segmentation | x_ray | 2 | 10675/-/1372 | yes/-/yes | 2020 | [CC-BY-NC 4.0](https://creativecommons.org/licenses/by-sa/4.0/) |
+
+| Class Name | Num. Train | Pct. Train | Num. Val | Pct. Val | Num. Test | Pct. Test |
+| :----------: | :--------: | :--------: | :------: | :------: | :-------: | :-------: |
+| background | 10675 | 99.7 | - | - | 1372 | 99.71 |
+| pneumothorax | 2379 | 0.3 | - | - | 290 | 0.29 |
+
+### Visualization
+
+![chest_x_ray_images_with_pneumothorax_masks](https://raw.githubusercontent.com/uni-medical/medical-datasets-visualization/main/2d/semantic_seg/x_ray/chest_x_ray_images_with_pneumothorax_masks/chest_x_ray_images_with_pneumothorax_masks_dataset.png?raw=true)
+
+### Prerequisites
+
+- Python 3.8
+- PyTorch 1.10.0
+- pillow (PIL) 9.3.0
+- scikit-learn (sklearn) 1.2.0
+- [MIM](https://github.com/open-mmlab/mim) v0.3.4
+- [MMCV](https://github.com/open-mmlab/mmcv) v2.0.0rc4
+- [MMEngine](https://github.com/open-mmlab/mmengine) v0.2.0 or higher
+- [MMSegmentation](https://github.com/open-mmlab/mmsegmentation) v1.0.0rc5
+
+All the commands below rely on `PYTHONPATH` being set correctly: it should point to the project's directory so that Python can locate the module files. From the `chest_x_ray_images_with_pneumothorax_masks/` root directory, run the following line to add the current directory to `PYTHONPATH`:
+
+```shell
+export PYTHONPATH=`pwd`:$PYTHONPATH
+```
+
+### Dataset preparation
+
+- Download the dataset from [here](https://www.kaggle.com/datasets/vbookshelf/pneumothorax-chest-xray-images-and-masks) and decompress it into `data/`.
+- Run `python tools/prepare_dataset.py` to format the data and arrange the folder structure as shown below.
+- Run `python ../../tools/split_seg_dataset.py` to split the dataset and generate `train.txt`, `val.txt` and `test.txt`. If labels for the official validation and test sets are not available, `train.txt` and `val.txt` are generated by randomly splitting the training set.
+
+```none
+mmsegmentation
+├── mmseg
+├── projects
+│   ├── medical
+│   │   ├── 2d_image
+│   │   │   ├── x_ray
+│   │   │   │   ├── chest_x_ray_images_with_pneumothorax_masks
+│   │   │   │   │   ├── configs
+│   │   │   │   │   ├── datasets
+│   │   │   │   │   ├── tools
+│   │   │   │   │   ├── data
+│   │   │   │   │   │   ├── train.txt
+│   │   │   │   │   │   ├── val.txt
+│   │   │   │   │   │   ├── images
+│   │   │   │   │   │   │   ├── train
+│   │   │   │   │   │   │   │   ├── xxx.png
+│   │   │   │   │   │   │   │   ├── ...
+│   │   │   │   │   │   │   │   └── xxx.png
+│   │   │   │   │   │   ├── masks
+│   │   │   │   │   │   │   ├── train
+│   │   │   │   │   │   │   │   ├── xxx.png
+│   │   │   │   │   │   │   │   ├── ...
+│   │   │   │   │   │   │   │   └── xxx.png
+```
+
+### Training commands
+
+```shell
+mim train mmseg ./configs/${CONFIG_PATH}
+```
+
+To train on multiple GPUs, e.g. 8 GPUs, run the following command:
+
+```shell
+mim train mmseg ./configs/${CONFIG_PATH} --launcher pytorch --gpus 8
+```
+
+### Testing commands
+
+```shell
+mim test mmseg ./configs/${CONFIG_PATH} --checkpoint ${CHECKPOINT_PATH}
+```
+
+## Checklist
+
+- [x] Milestone 1: PR-ready, and acceptable to be one of the `projects/`.
+
+  - [x] Finish the code
+  - [x] Basic docstrings & proper citation
+  - [x] Test-time correctness
+  - [x] A full README
+
+- [x] Milestone 2: Indicates a successful model implementation.
+
+  - [x] Training-time correctness
+
+- [ ] Milestone 3: Good to be a part of our core package!
+
+  - [ ] Type hints and docstrings
+  - [ ] Unit tests
+  - [ ] Code polishing
+  - [ ] Metafile.yml
+
+- [ ] Move your modules into the core package following the codebase's file hierarchy structure.
+
+- [ ] Refactor your modules into the core package following the codebase's file hierarchy structure.
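The dataset-preparation steps in this README rely on `../../tools/split_seg_dataset.py`, which is not part of this commit. Purely as an illustration, the sketch below writes `train.txt` and `val.txt` from the `images/train` and `images/val` folders that `tools/prepare_dataset.py` creates. The function name `write_split_files` and the choice to store each entry as `<split>/<stem>` (relative to `images/`, without a suffix) are assumptions, picked so that entries line up with `data_prefix=dict(img_path='images/', ...)` in the dataset config; the real script may behave differently.

```python
import os


def write_split_files(data_root, splits=('train', 'val'), img_suffix='.png'):
    """Write one `<split>.txt` per split (hypothetical helper, not the repo script)."""
    written = {}
    for split in splits:
        img_dir = os.path.join(data_root, 'images', split)
        # Keep only files with the expected suffix; sort for reproducibility.
        stems = sorted(
            os.path.splitext(name)[0] for name in os.listdir(img_dir)
            if name.endswith(img_suffix))
        # Entries are relative to `images/` and carry no suffix.
        entries = [f'{split}/{stem}' for stem in stems]
        with open(os.path.join(data_root, f'{split}.txt'), 'w') as f:
            f.write('\n'.join(entries))
        written[split] = entries
    return written
```

Calling `write_split_files('data/')` after the preparation script has run would then produce the two split files next to the `images/` and `masks/` folders.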
@@ -0,0 +1,42 @@
+dataset_type = 'ChestPenumoMaskDataset'
+data_root = 'data/'
+img_scale = (512, 512)
+train_pipeline = [
+    dict(type='LoadImageFromFile'),
+    dict(type='LoadAnnotations'),
+    dict(type='Resize', scale=img_scale, keep_ratio=False),
+    dict(type='RandomFlip', prob=0.5),
+    dict(type='PhotoMetricDistortion'),
+    dict(type='PackSegInputs')
+]
+test_pipeline = [
+    dict(type='LoadImageFromFile'),
+    dict(type='Resize', scale=img_scale, keep_ratio=False),
+    dict(type='LoadAnnotations'),
+    dict(type='PackSegInputs')
+]
+train_dataloader = dict(
+    batch_size=16,
+    num_workers=4,
+    persistent_workers=True,
+    sampler=dict(type='InfiniteSampler', shuffle=True),
+    dataset=dict(
+        type=dataset_type,
+        data_root=data_root,
+        ann_file='train.txt',
+        data_prefix=dict(img_path='images/', seg_map_path='masks/'),
+        pipeline=train_pipeline))
+val_dataloader = dict(
+    batch_size=1,
+    num_workers=4,
+    persistent_workers=True,
+    sampler=dict(type='DefaultSampler', shuffle=False),
+    dataset=dict(
+        type=dataset_type,
+        data_root=data_root,
+        ann_file='val.txt',
+        data_prefix=dict(img_path='images/', seg_map_path='masks/'),
+        pipeline=test_pipeline))
+test_dataloader = val_dataloader
+val_evaluator = dict(type='IoUMetric', iou_metrics=['mIoU', 'mDice'])
+test_evaluator = dict(type='IoUMetric', iou_metrics=['mIoU', 'mDice'])
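Both evaluators request `mIoU` and `mDice`. For readers less familiar with these metrics, here is a minimal plain-Python sketch of per-class IoU and Dice on flat label lists, averaged over the classes that occur. It is not MMSegmentation's `IoUMetric` implementation, which accumulates statistics over the whole dataset and handles ignored labels; the function name `iou_and_dice` is hypothetical.

```python
def iou_and_dice(pred, target, num_classes=2):
    """Return (mIoU, mDice) for two equal-length sequences of class ids."""
    ious, dices = [], []
    for cls in range(num_classes):
        # Count pixels by agreement with the current class.
        inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
        n_pred = sum(1 for p in pred if p == cls)
        n_target = sum(1 for t in target if t == cls)
        union = n_pred + n_target - inter
        if union == 0:
            # Class absent from both prediction and target: skip it.
            continue
        ious.append(inter / union)
        dices.append(2 * inter / (n_pred + n_target))
    return sum(ious) / len(ious), sum(dices) / len(dices)


# Six pixels: 0 = background, 1 = pneumothorax.
pred = [0, 0, 1, 1, 0, 0]
target = [0, 1, 1, 0, 0, 0]
miou, mdice = iou_and_dice(pred, target)
print(f'mIoU={miou:.3f}, mDice={mdice:.3f}')
```

With a foreground class that covers well under 1% of pixels, as the statistics table in the README suggests, the background score tends to sit near 1, so the per-class foreground values are usually the more informative numbers to watch.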
@@ -0,0 +1,20 @@
+_base_ = [
+    'mmseg::_base_/models/fcn_unet_s5-d16.py',
+    './chest-x-ray-images-with-pneumothorax-masks_512x512.py',
+    'mmseg::_base_/default_runtime.py',
+    'mmseg::_base_/schedules/schedule_20k.py'
+]
+custom_imports = dict(
+    imports='datasets.chest-x-ray-images-with-pneumothorax-masks_dataset')
+img_scale = (512, 512)
+data_preprocessor = dict(size=img_scale)
+optimizer = dict(lr=0.01)
+optim_wrapper = dict(optimizer=optimizer)
+model = dict(
+    data_preprocessor=data_preprocessor,
+    decode_head=dict(
+        num_classes=2, loss_decode=dict(use_sigmoid=True), out_channels=1),
+    auxiliary_head=None,
+    test_cfg=dict(mode='whole', _delete_=True))
+vis_backends = None
+visualizer = dict(vis_backends=vis_backends)
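This config is the only one of the four that sets `out_channels=1` together with `use_sigmoid=True`, that is, it predicts a single foreground logit per pixel rather than two class scores. The sketch below contrasts the two decoding styles in plain Python. The 0.5 threshold and the function names are illustrative assumptions; they are not taken from this commit or from MMSegmentation's defaults.

```python
import math


def decode_sigmoid(logit, threshold=0.5):
    """Single-channel head: foreground (1) if sigmoid(logit) exceeds the threshold."""
    prob = 1.0 / (1.0 + math.exp(-logit))
    return 1 if prob > threshold else 0


def decode_softmax(logits):
    """Multi-channel head: pick the class with the largest logit (argmax)."""
    return max(range(len(logits)), key=lambda i: logits[i])


# One pixel, as seen by each style of head.
print(decode_sigmoid(1.2))         # foreground logit only
print(decode_softmax([0.3, 1.5]))  # (background, pneumothorax) logits
```

The practical difference is that the sigmoid variant exposes a tunable threshold, which can matter for a heavily imbalanced foreground class, while the argmax variant has no such knob.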
@@ -0,0 +1,19 @@
+_base_ = [
+    'mmseg::_base_/models/fcn_unet_s5-d16.py',
+    './chest-x-ray-images-with-pneumothorax-masks_512x512.py',
+    'mmseg::_base_/default_runtime.py',
+    'mmseg::_base_/schedules/schedule_20k.py'
+]
+custom_imports = dict(
+    imports='datasets.chest-x-ray-images-with-pneumothorax-masks_dataset')
+img_scale = (512, 512)
+data_preprocessor = dict(size=img_scale)
+optimizer = dict(lr=0.0001)
+optim_wrapper = dict(optimizer=optimizer)
+model = dict(
+    data_preprocessor=data_preprocessor,
+    decode_head=dict(num_classes=2),
+    auxiliary_head=None,
+    test_cfg=dict(mode='whole', _delete_=True))
+vis_backends = None
+visualizer = dict(vis_backends=vis_backends)
@@ -0,0 +1,19 @@
+_base_ = [
+    'mmseg::_base_/models/fcn_unet_s5-d16.py',
+    './chest-x-ray-images-with-pneumothorax-masks_512x512.py',
+    'mmseg::_base_/default_runtime.py',
+    'mmseg::_base_/schedules/schedule_20k.py'
+]
+custom_imports = dict(
+    imports='datasets.chest-x-ray-images-with-pneumothorax-masks_dataset')
+img_scale = (512, 512)
+data_preprocessor = dict(size=img_scale)
+optimizer = dict(lr=0.001)
+optim_wrapper = dict(optimizer=optimizer)
+model = dict(
+    data_preprocessor=data_preprocessor,
+    decode_head=dict(num_classes=2),
+    auxiliary_head=None,
+    test_cfg=dict(mode='whole', _delete_=True))
+vis_backends = None
+visualizer = dict(vis_backends=vis_backends)
@@ -0,0 +1,19 @@
+_base_ = [
+    'mmseg::_base_/models/fcn_unet_s5-d16.py',
+    './chest-x-ray-images-with-pneumothorax-masks_512x512.py',
+    'mmseg::_base_/default_runtime.py',
+    'mmseg::_base_/schedules/schedule_20k.py'
+]
+custom_imports = dict(
+    imports='datasets.chest-x-ray-images-with-pneumothorax-masks_dataset')
+img_scale = (512, 512)
+data_preprocessor = dict(size=img_scale)
+optimizer = dict(lr=0.01)
+optim_wrapper = dict(optimizer=optimizer)
+model = dict(
+    data_preprocessor=data_preprocessor,
+    decode_head=dict(num_classes=2),
+    auxiliary_head=None,
+    test_cfg=dict(mode='whole', _delete_=True))
+vis_backends = None
+visualizer = dict(vis_backends=vis_backends)
@@ -0,0 +1,31 @@
+from mmseg.datasets import BaseSegDataset
+from mmseg.registry import DATASETS
+
+
+@DATASETS.register_module()
+class ChestPenumoMaskDataset(BaseSegDataset):
+    """ChestPenumoMaskDataset dataset.
+
+    In segmentation map annotation for ChestPenumoMaskDataset,
+    0 stands for background, which is included in 2 categories.
+    ``reduce_zero_label`` is fixed to False. The ``img_suffix``
+    is fixed to '.png' and ``seg_map_suffix`` is fixed to '.png'.
+
+    Args:
+        img_suffix (str): Suffix of images. Default: '.png'
+        seg_map_suffix (str): Suffix of segmentation maps. Default: '.png'
+        reduce_zero_label (bool): Whether to mark label zero as ignored.
+            Default to False.
+    """
+    METAINFO = dict(classes=('background', 'pneumothorax'))
+
+    def __init__(self,
+                 img_suffix='.png',
+                 seg_map_suffix='.png',
+                 reduce_zero_label=False,
+                 **kwargs) -> None:
+        super().__init__(
+            img_suffix=img_suffix,
+            seg_map_suffix=seg_map_suffix,
+            reduce_zero_label=reduce_zero_label,
+            **kwargs)
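To make the link between `ann_file`, `data_prefix` and the two suffixes concrete, here is a simplified, stand-alone sketch of how a split-file entry could be turned into an image/mask path pair. It mirrors the general idea of suffix-based lookup in `BaseSegDataset` but is not its actual code; `resolve_pairs` and its arguments are hypothetical names, and entries are assumed to be written relative to `images/` without a suffix (for example `train/xxx`).

```python
import os


def resolve_pairs(entries, data_root='data/', img_prefix='images/',
                  seg_prefix='masks/', img_suffix='.png',
                  seg_map_suffix='.png'):
    """Map split-file entries to (image path, mask path) tuples."""
    pairs = []
    for entry in entries:
        name = entry.strip()
        if not name:
            continue  # ignore blank lines in the split file
        img_path = os.path.join(data_root, img_prefix, name + img_suffix)
        seg_path = os.path.join(data_root, seg_prefix, name + seg_map_suffix)
        pairs.append((img_path, seg_path))
    return pairs


for img_path, seg_path in resolve_pairs(['train/xxx\n', '\n']):
    print(img_path, '->', seg_path)
```

Under this assumption an image and its mask must share the same relative name, which is what `tools/prepare_dataset.py` arranges by saving both under the same base name.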
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
+import glob
+import os
+import shutil
+
+from PIL import Image
+from sklearn.model_selection import train_test_split
+
+root_path = 'data/'
+img_suffix = '.png'
+seg_map_suffix = '.png'
+save_img_suffix = '.png'
+save_seg_map_suffix = '.png'
+
+all_imgs = glob.glob('data/siim-acr-pneumothorax/png_images/*' + img_suffix)
+x_train, x_test = train_test_split(all_imgs, test_size=0.2, random_state=0)
+
+print(len(x_train), len(x_test))
+os.system('mkdir -p ' + root_path + 'images/train/')
+os.system('mkdir -p ' + root_path + 'images/val/')
+os.system('mkdir -p ' + root_path + 'masks/train/')
+os.system('mkdir -p ' + root_path + 'masks/val/')
+
+part_dir_dict = {0: 'train/', 1: 'val/'}
+for ith, part in enumerate([x_train, x_test]):
+    part_dir = part_dir_dict[ith]
+    for img in part:
+        basename = os.path.basename(img)
+        img_save_path = os.path.join(root_path, 'images', part_dir,
+                                     basename.split('.')[0] + save_img_suffix)
+        shutil.copy(img, img_save_path)
+        mask_path = 'data/siim-acr-pneumothorax/png_masks/' + basename
+        mask = Image.open(mask_path).convert('L')
+        mask_save_path = os.path.join(
+            root_path, 'masks', part_dir,
+            basename.split('.')[0] + save_seg_map_suffix)
+        mask.save(mask_save_path)
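Two details in the script may be worth a second look: `basename.split('.')[0]` keeps only the text before the first dot, so a file name containing extra dots would be shortened, and `os.system('mkdir -p ...')` depends on a POSIX shell. A possible portable variant, offered as a sketch rather than a required change, uses `os.path.splitext` and `os.makedirs`. The helper names `target_name` and `make_split_dirs` and the sample file name are hypothetical.

```python
import os


def target_name(path, new_suffix='.png'):
    """Swap only the final extension, keeping any earlier dots in the stem."""
    stem, _ = os.path.splitext(os.path.basename(path))
    return stem + new_suffix


def make_split_dirs(root_path, kinds=('images', 'masks'),
                    parts=('train', 'val')):
    """Create <root>/<kind>/<part> folders without shelling out."""
    created = []
    for kind in kinds:
        for part in parts:
            path = os.path.join(root_path, kind, part)
            os.makedirs(path, exist_ok=True)  # no error if it already exists
            created.append(path)
    return created


name = 'png_images/scan.v2.png'  # made-up name with an extra dot
print(os.path.basename(name).split('.')[0])  # first-dot split
print(target_name(name))                     # splitext-based
```

If every source file name contains exactly one dot, both approaches give the same result, so this only matters for data with less regular naming.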
