Commit e64c2fd

[CodeCamp2023-565] Fine tune ONNX Models (MMSegemetation) Inference for NVIDIA Jetson (open-mmlab#3372)
Fine tune ONNX Models (MMSegemetation) Inference for NVIDIA Jetson
1 parent cf8fc7b commit e64c2fd

4 files changed

+746
-0
lines changed

README.md

Lines changed: 1 addition & 0 deletions
@@ -129,6 +129,7 @@ To migrate from MMSegmentation 0.x, please refer to [migration](docs/en/migratio
129129
- [Tutorial 3: Inference with existing models](docs/en/user_guides/3_inference.md)
130130
- [Tutorial 4: Train and test with existing models](docs/en/user_guides/4_train_test.md)
131131
- [Tutorial 5: Model deployment](docs/en/user_guides/5_deployment.md)
132+
- [Deploy mmsegmentation on Jetson platform](docs/zh_cn/user_guides/deploy_jetson.md)
132133
- [Useful Tools](docs/en/user_guides/useful_tools.md)
133134
- [Feature Map Visualization](docs/en/user_guides/visualization_feature_map.md)
134135
- [Visualization](docs/en/user_guides/visualization.md)

README_zh-CN.md

Lines changed: 1 addition & 0 deletions
@@ -124,6 +124,7 @@ MMSegmentation v1.x brings significant improvements over 0.x, providing
124124
- [Tutorial 3: Inference with existing models](docs/zh_cn/user_guides/3_inference.md)
125125
- [Tutorial 4: Train and test with existing models](docs/zh_cn/user_guides/4_train_test.md)
126126
- [Tutorial 5: Model deployment](docs/zh_cn/user_guides/5_deployment.md)
127+
- [Deploy mmsegmentation on Jetson platform](docs/zh_cn/user_guides/deploy_jetson.md)
127128
- [Useful Tools](docs/zh_cn/user_guides/useful_tools.md)
128129
- [Feature Map Visualization](docs/zh_cn/user_guides/visualization_feature_map.md)
129130
- [Visualization](docs/zh_cn/user_guides/visualization.md)
Lines changed: 372 additions & 0 deletions
@@ -0,0 +1,372 @@
# Tutorial: Tuning and Deploying MMSeg Models on the NVIDIA Jetson Platform

- Please read the [MMSegmentation model deployment](https://github.com/open-mmlab/mmsegmentation/blob/main/docs/zh_cn/user_guides/5_deployment.md) document first.
- **mmsegmentation version used in this tutorial: v1.1.2**
- **NVIDIA Jetson device used in this tutorial: NVIDIA Jetson AGX Orin 64G**

<div align="center">
<img src="https://github.com/AI-Tianlong/Useful-Tools/assets/50650583/b5466cfd-71a9-4e06-9823-c253a97d57b5" alt="Smiley face" width="50%">
</div>

## 1 Set up [mmsegmentation](https://github.com/open-mmlab/mmsegmentation)

- Follow the [installation and verification](https://github.com/open-mmlab/mmsegmentation/blob/main/docs/zh_cn/get_started.md) document to install the dependencies needed to develop with [mmsegmentation](https://github.com/open-mmlab/mmsegmentation), such as [`pytorch`](https://pytorch.org/get-started/locally/), [`mmcv`](https://github.com/open-mmlab/mmcv), and [`mmengine`](https://github.com/open-mmlab/mmengine).
- Download [mmsegmentation](https://github.com/open-mmlab/mmsegmentation) from GitHub with `git clone`. If your network connection is poor, you can download a zip archive from the [MMSeg GitHub](https://github.com/open-mmlab/mmsegmentation) page instead.
  ```bash
  git clone https://github.com/open-mmlab/mmsegmentation.git
  ```
- Install mmsegmentation in editable mode with `pip install -v -e .`.
  ```bash
  cd mmsegmentation
  pip install -v -e .
  ```
  Once the installation reports success, `pip list` shows that mmsegmentation has been installed into your environment from the local source tree.
  ![mmseg-install](https://github.com/AI-Tianlong/Useful-Tools/assets/50650583/a9c7bcc9-cdcc-40a4-bd7b-8153195549c8)

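The `pip list` check can also be done from Python. The following is a minimal sketch using only the standard library; the helper name `installed_version` is made up for illustration, and it assumes the packages are registered under the distribution names `mmsegmentation`, `mmcv`, and `mmengine`.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them to what `pip list` shows.
    for name in ("mmsegmentation", "mmcv", "mmengine"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```

Running it inside the target virtual environment prints one line per package, which makes it easy to confirm that all three are visible to the same interpreter.
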
## 2 Prepare your dataset

- This tutorial uses the remote sensing semantic segmentation dataset [potsdam](https://github.com/open-mmlab/mmsegmentation/blob/main/docs/zh_cn/user_guides/2_dataset_prepare.md#isprs-potsdam) as an example.
- Follow the [potsdam data preparation](https://github.com/open-mmlab/mmsegmentation/blob/main/docs/zh_cn/user_guides/2_dataset_prepare.md#isprs-potsdam) document to download the dataset and convert it to the MMSeg format.
- About the dataset: the potsdam dataset is named after Potsdam, a typical historic city in Germany with large building blocks, narrow streets, and a dense building structure. It contains 38 images of 6000x6000 pixels with a spatial resolution of 5 cm. An example from the dataset is shown below:
  ![potsdam-img](https://github.com/AI-Tianlong/Useful-Tools/assets/50650583/3bc0a75b-1693-4ae6-aeea-ad502e955068)

## 3 Download the model's pth weight file from the config page

This tutorial uses the [`deeplabv3plus_r101-d8_4xb4-80k_potsdam-512x512.py`](../../configs/deeplabv3plus/deeplabv3plus_r101-d8_4xb4-80k_potsdam-512x512.py) config file as an example. Download the weight file from the [configs](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/deeplabv3plus#potsdam) page.
![pth](https://github.com/AI-Tianlong/Useful-Tools/assets/50650583/8f747362-caf4-406c-808d-4ca72babb209)

## 4 Convert and profile the model interactively with [OpenMMLab deploee](https://platform.openmmlab.com/deploee)

### 4.1 Model conversion

The [OpenMMLab website](https://platform.openmmlab.com/deploee) provides an interactive interface for model conversion and speed profiling. Without writing any code, you can convert a model to ONNX (`xxxx.onnx`) or TensorRT (`.engine`) format by selecting the corresponding options.
If your custom config file contains relative references, for example:

```python
# xxxx.py
_base_ = [
    '../_base_/models/deeplabv3plus_r50-d8.py',
    '../_base_/datasets/potsdam.py',
    '../_base_/default_runtime.py',
    '../_base_/schedules/schedule_80k.py'
]
```

you can resolve them with the following code, which generates a complete config file.

```python
import mmengine

# Load the config (this merges every file listed in `_base_`),
# then write the merged result to a single standalone file.
mmengine.Config.fromfile("configs/deeplabv3plus/deeplabv3plus_r101-d8_4xb4-80k_potsdam-512x512.py").dump("My_config.py")
```

After running this code, `My_config.py` contains the complete configuration with no relative references. Upload this config to the corresponding field on the web page.

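Before uploading, it can be worth confirming that the dumped file really is standalone. The sketch below is one hedged way to do that with the standard library only: it parses a config's source text and reports whether a top-level `_base_` assignment is still present. The function name `has_base_reference` is hypothetical and not part of mmengine.

```python
import ast


def has_base_reference(config_source: str) -> bool:
    """Return True if the config source assigns a top-level `_base_` variable."""
    tree = ast.parse(config_source)
    for node in tree.body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == "_base_":
                    return True
    return False


if __name__ == "__main__":
    # Two toy config snippets: one that inherits, one that is already flat.
    inherited = "_base_ = ['../_base_/datasets/potsdam.py']\n"
    flattened = "crop_size = (512, 512)\nmodel = dict(type='EncoderDecoder')\n"
    print(has_base_reference(inherited))
    print(has_base_reference(flattened))
```

To check a real file, read it with `open("My_config.py", encoding="utf-8").read()` and pass the text to the function.
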
#### Create a conversion task

Follow the prompts in the figure below, together with your own requirements, to create and submit a conversion task.

<div align="center">
<img src="https://github.com/AI-Tianlong/Useful-Tools/assets/50650583/4918d2f9-d63c-480f-97f1-054529770cfd" alt="NVIDIA-Jetson" width="80%">
</div>

### 4.2 Speed profiling

After the conversion finishes, use the **speed profiling** page to profile the model on a real device.

#### Create a profiling task

<div align="center">
<img src="https://github.com/AI-Tianlong/Useful-Tools/assets/50650583/27340556-c81a-4ce3-8560-2c4727d3355e" alt="NVIDIA-Jetson" width="100%">
</div>

<div align="center">
<img src="https://github.com/AI-Tianlong/Useful-Tools/assets/50650583/6f4fc3a9-ba9d-4829-8407-ed1470ba7bf3" alt="NVIDIA-Jetson" width="100%">
</div>

When profiling finishes, the page generates a complete profiling report. [View a sample profiling report](https://openmmlab-deploee.oss-cn-shanghai.aliyuncs.com/tmp/profile_speed/4352f5.txt)

## 5 Convert the model to ONNX format on the command line with OpenMMLab mmdeploy

In this part, the mmdeploy library converts a model trained with mmseg into an inference format. An example is given here; for details, see the [mmdeploy model conversion documentation](../../docs/zh_cn/user_guides/5_deployment.md).

### 5.1 Build the mmdeploy library from source

In the virtual environment where mmsegmentation is installed, clone [mmdeploy](https://github.com/open-mmlab/mmdeploy) from GitHub with `git clone`.

### 5.2 Model conversion

If your config contains relative references, resolve them first as described in [4.1 Model conversion](#4.1-模型转换).
Then enter the mmdeploy folder and run the following command to convert the model.

```bash
python tools/deploy.py \
    configs/mmseg/segmentation_onnxruntime_static-512x512.py \
    ../atl_config.py \
    ../deeplabv3plus_r18-d8_512x512_80k_potsdam_20211219_020601-75fd5bc3.pth \
    ../2_13_1024_5488_1536_6000.png \
    --work-dir ../atl_models \
    --device cpu \
    --show \
    --dump-info
```

```bash
# Usage
#   ${DEPLOY_CFG_PATH}        path to the deploy config file
#   ${MODEL_CFG_PATH}         path to the model config file
#   ${MODEL_CHECKPOINT_PATH}  path to the model weights
#   ${INPUT_IMG}              path to the input image
#   ${WORK_DIR}               directory for logs and model files
#   --show                    display the prediction result
#   --dump-info               dump SDK information
python ./tools/deploy.py \
    ${DEPLOY_CFG_PATH} \
    ${MODEL_CFG_PATH} \
    ${MODEL_CHECKPOINT_PATH} \
    ${INPUT_IMG} \
    --work-dir ${WORK_DIR} \
    --device ${cpu/cuda:0} \
    --show \
    --dump-info
```

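When the same conversion has to be repeated for several models, it can help to assemble the command in Python rather than editing the shell line by hand. The sketch below is illustrative only: `build_deploy_command` is a hypothetical helper, and it uses only the positional arguments and flags that appear in the usage block above.

```python
import shlex
from typing import List


def build_deploy_command(deploy_cfg: str, model_cfg: str, checkpoint: str,
                         image: str, work_dir: str, device: str = "cpu",
                         show: bool = False, dump_info: bool = False) -> List[str]:
    """Assemble the argument list for `tools/deploy.py` as used in this tutorial."""
    cmd = ["python", "tools/deploy.py", deploy_cfg, model_cfg, checkpoint, image,
           "--work-dir", work_dir, "--device", device]
    if show:
        cmd.append("--show")
    if dump_info:
        cmd.append("--dump-info")
    return cmd


if __name__ == "__main__":
    cmd = build_deploy_command(
        "configs/mmseg/segmentation_onnxruntime_static-512x512.py",
        "../atl_config.py",
        "../deeplabv3plus_r18-d8_512x512_80k_potsdam_20211219_020601-75fd5bc3.pth",
        "../2_13_1024_5488_1536_6000.png",
        work_dir="../atl_models",
        show=True,
        dump_info=True,
    )
    # Print a copy-pasteable command line without launching anything.
    print(shlex.join(cmd))
```

To actually run the conversion, the list could be passed to `subprocess.run(cmd, check=True)` from inside the mmdeploy folder.
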
After the command succeeds, you will see the following messages, which indicate that the conversion was successful.

```bash
10/08 17:40:44 - mmengine - INFO - visualize pytorch model success.
10/08 17:40:44 - mmengine - INFO - All process success.
```

<div align="center">
<img src="https://github.com/AI-Tianlong/Useful-Tools/assets/50650583/b752ccf8-903f-4ad3-ad7c-74fc25cb89a5" alt="NVIDIA-Jetson" width="400">
</div>

# 6 Convert and deploy on the Jetson platform

## 6.1 Environment setup

Follow the [How to install MMDeploy on Jetson modules](https://github.com/open-mmlab/mmdeploy/blob/main/docs/zh_cn/01-how-to-build/jetsons.md) document to prepare the environment on the Jetson.
**Note**: to install PyTorch, see the [NVIDIA Jetson PyTorch installation documentation](https://github.com/open-mmlab/mmdeploy/blob/main/docs/zh_cn/01-how-to-build/jetsons.md). Section 6.1.2 explains which PyTorch version to choose.

### 6.1.1 Create a virtual environment

```bash
conda create -n {your_env_name} python={python_version}
```

### 6.1.2 Install PyTorch in the virtual environment

<font color="red">Note:</font> do not install the latest PyTorch 2.0 here. PyTorch 1.11 is the last wheel built with USE_DISTRIBUTED; with a newer wheel, model conversion with mmdeploy fails with `AttributeError: module 'torch.distributed' has no attribute 'ReduceOp'`. See: https://forums.developer.nvidia.com/t/module-torch-distributed-has-no-attribute-reduceop/256581/6
Download `torch-1.11.0-cp38-cp38-linux_aarch64.whl` and install it:

```bash
pip install torch-1.11.0-cp38-cp38-linux_aarch64.whl
```

After running the command, you will see the following output, which indicates a successful installation.

```bash
Processing ./torch-1.11.0-cp38-cp38-linux_aarch64.whl
Requirement already satisfied: typing-extensions in /home/sirs/miniconda3/envs/openmmlab/lib/python3.8/site-packages (from torch==1.11.0) (4.7.1)
Installing collected packages: torch
Successfully installed torch-1.11.0
```

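The wheel filename encodes which interpreter and platform it targets (`cp38` and `linux_aarch64` here), so a mismatch with the virtual environment's Python version is a common reason for `pip` to reject it. The sketch below splits a wheel filename into its tags using the standard five-field layout; the helper names are hypothetical, and wheels that carry an optional build tag are deliberately not handled.

```python
import sys
from typing import NamedTuple


class WheelTags(NamedTuple):
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split `name-version-python-abi-platform.whl` into its five fields."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[:-len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel filename layout: {filename}")
    return WheelTags(*parts)


def matches_running_python(tags: WheelTags) -> bool:
    """Check whether the wheel's CPython tag matches the running interpreter."""
    current = f"cp{sys.version_info.major}{sys.version_info.minor}"
    return tags.python_tag == current


if __name__ == "__main__":
    tags = parse_wheel_filename("torch-1.11.0-cp38-cp38-linux_aarch64.whl")
    print(tags)
    print("matches this interpreter:", matches_running_python(tags))
```

Run it with the Python of the environment created in 6.1.1; if the last line prints `False`, the environment's Python version differs from the one the wheel was built for.
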
### 6.1.3 Copy the TensorRT bundled with JetPack into the virtual environment

See [Configure TensorRT](https://github.com/open-mmlab/mmdeploy/blob/main/docs/zh_cn/01-how-to-build/jetsons.md#%E9%85%8D%E7%BD%AE-tensorrt).
The JetPack SDK ships with TensorRT. To import it successfully inside a Conda environment, however, TensorRT must be copied into the Conda environment created earlier.

```bash
export PYTHON_VERSION=`python3 --version | cut -d' ' -f 2 | cut -d'.' -f1,2`
cp -r /usr/lib/python${PYTHON_VERSION}/dist-packages/tensorrt* ~/miniconda/envs/{your_env_name}/lib/python${PYTHON_VERSION}/site-packages/
```

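To see which directories the copy command touches before running it, the same paths can be computed in Python. This is a sketch under the assumptions of the shell snippet above: the system packages live under `/usr/lib/python<version>/dist-packages` and Conda is installed at `~/miniconda` (adjust `conda_root` if yours is, for example, `~/miniconda3`). The function name `tensorrt_copy_paths` is hypothetical.

```python
import sys
from pathlib import Path
from typing import Optional, Tuple


def tensorrt_copy_paths(env_name: str, python_version: Optional[str] = None,
                        conda_root: str = "~/miniconda") -> Tuple[Path, Path]:
    """Return (source_dir, destination_dir) matching the `cp -r` command above."""
    if python_version is None:
        # Mirrors the `python3 --version | cut ...` step: keep "major.minor" only.
        python_version = f"{sys.version_info.major}.{sys.version_info.minor}"
    source = Path(f"/usr/lib/python{python_version}/dist-packages")
    destination = (Path(conda_root).expanduser() / "envs" / env_name / "lib"
                   / f"python{python_version}" / "site-packages")
    return source, destination


if __name__ == "__main__":
    src, dst = tensorrt_copy_paths("openmmlab", python_version="3.8")
    print("copy from:", src)
    print("copy to:  ", dst)
    # List what would be copied, if the source directory exists on this machine.
    if src.is_dir():
        for entry in sorted(src.glob("tensorrt*")):
            print("  ", entry.name)
```

After copying, `python -c "import tensorrt"` inside the environment is a quick way to confirm that the import works.
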
### 6.1.4 Install MMCV

Install MMCV with `mim install mmcv`, or build it from source.

```bash
pip install openmim
mim install mmcv
```

Alternatively, build it from source.

```bash
sudo apt-get install -y libssl-dev
git clone https://github.com/open-mmlab/mmcv.git
cd mmcv
pip install -e .
```

<font color="red">Note: MMCV must be rebuilt whenever the PyTorch version changes.</font>

### 6.1.5 Install ONNX

<font color="red">Note: choose one of the following two methods.</font>

- conda
  ```bash
  conda install -c conda-forge onnx
  ```
- pip
  ```bash
  python3 -m pip install onnx
  ```

### 6.1.6 Install ONNX Runtime

Choose a suitable ONNX Runtime version from the [ONNX Runtime](https://elinux.org/Jetson_Zoo#ONNX_Runtime) page, then download and install it.
Example:

```bash
# Install pip wheel
pip3 install onnxruntime_gpu-1.10.0-cp38-cp38-linux_aarch64.whl
```

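After installing the wheel, it is useful to confirm which execution providers ONNX Runtime actually exposes on the device. The sketch below wraps `onnxruntime.get_available_providers()` in a guarded import so that it also runs where ONNX Runtime is missing; the wrapper name is hypothetical, and which providers appear depends on the wheel you installed.

```python
from typing import List


def available_onnxruntime_providers() -> List[str]:
    """Return ONNX Runtime's execution providers, or [] if it is not installed."""
    try:
        import onnxruntime
    except ImportError:
        return []
    return list(onnxruntime.get_available_providers())


if __name__ == "__main__":
    providers = available_onnxruntime_providers()
    if not providers:
        print("onnxruntime is not installed in this environment")
    for name in providers:
        print(name)
    # A GPU build is expected to list a CUDA provider in addition to the CPU one.
    print("CUDA available:", "CUDAExecutionProvider" in providers)
```
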
## 6.2 Model conversion and inference on Jetson AGX Orin

### 6.2.1 ONNX model conversion

As in [4.1 Model conversion](#4.1-模型转换), activate the prepared virtual environment on the Jetson, enter the mmdeploy directory, and convert the model to ONNX.

```bash
python tools/deploy.py \
    configs/mmseg/segmentation_onnxruntime_static-512x512.py \
    ../atl_config.py \
    ../deeplabv3plus_r18-d8_512x512_80k_potsdam_20211219_020601-75fd5bc3.pth \
    ../2_13_3584_2560_4096_3072.png \
    --work-dir ../atl_models \
    --device cpu \
    --show \
    --dump-info
```

<font color="red">Note:</font> if you get the following error:

```none
AttributeError: module 'torch.distributed' has no attribute 'ReduceOp'
```

see https://forums.developer.nvidia.com/t/module-torch-distributed-has-no-attribute-reduceop/256581/6 for a fix, which is to install PyTorch 1.11.0.

After a successful conversion, you will see the following messages and a folder containing the ONNX model:

```bash
10/09 19:58:22 - mmengine - INFO - visualize pytorch model success.
10/09 19:58:22 - mmengine - INFO - All process success.
```

<div align="center">
<img src="https://github.com/AI-Tianlong/Useful-Tools/assets/50650583/d68f1cf6-0e80-4261-91a3-6046b17de146" alt="NVIDIA-Jetson" width="400">
<img src="https://github.com/AI-Tianlong/Useful-Tools/assets/50650583/70470a39-6a4f-4fd5-a06d-9b9d59a768ef" alt="NVIDIA-Jetson" width="160">
</div>

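The deploy config used here is a static 512x512 one, and the sample image name appears to encode the crop window of the tile. As a quick sanity check, the sketch below compares the two. It rests on two assumptions that are not stated in this tutorial: that the last four numbers of the tile name are `x_min, y_min, x_max, y_max`, and that the deploy config filename ends in `static-<N>x<M>.py`. Both helper names are hypothetical.

```python
import re
from pathlib import PurePosixPath
from typing import Tuple

TILE_PATTERN = re.compile(r"^(\d+)_(\d+)_(\d+)_(\d+)_(\d+)_(\d+)\.png$")
STATIC_SHAPE_PATTERN = re.compile(r"static-(\d+)x(\d+)\.py$")


def tile_size(path: str) -> Tuple[int, int]:
    """Return (width, height), assuming fields 3-6 are x_min, y_min, x_max, y_max."""
    match = TILE_PATTERN.match(PurePosixPath(path).name)
    if match is None:
        raise ValueError(f"unexpected tile name: {path}")
    _, _, x_min, y_min, x_max, y_max = (int(g) for g in match.groups())
    return x_max - x_min, y_max - y_min


def static_shape(deploy_cfg: str) -> Tuple[int, int]:
    """Return the two numbers of `static-<N>x<M>.py` in the order they appear."""
    match = STATIC_SHAPE_PATTERN.search(deploy_cfg)
    if match is None:
        raise ValueError(f"no static shape in config name: {deploy_cfg}")
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    cfg = "configs/mmseg/segmentation_onnxruntime_static-512x512.py"
    img = "../2_13_3584_2560_4096_3072.png"
    print("tile size:   ", tile_size(img))
    print("static shape:", static_shape(cfg))
    print("match:", tile_size(img) == static_shape(cfg))
```
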
### 6.2.2 TensorRT model conversion

Switch to the TensorRT deploy config to convert the model to TensorRT.

```bash
python tools/deploy.py \
    configs/mmseg/segmentation_tensorrt_static-512x512.py \
    ../atl_config.py \
    ../deeplabv3plus_r18-d8_512x512_80k_potsdam_20211219_020601-75fd5bc3.pth \
    ../2_13_3584_2560_4096_3072.png \
    --work-dir ../atl_trt_models \
    --device cuda:0 \
    --show \
    --dump-info
```

After a successful conversion, you will see the following messages and the TensorRT model folder:

```bash
10/09 20:15:50 - mmengine - INFO - visualize pytorch model success.
10/09 20:15:50 - mmengine - INFO - All process success.
```

<div align="center">
<img src="https://github.com/AI-Tianlong/Useful-Tools/assets/50650583/2ac1428f-b787-4fdd-beaf-6397e5b21e33" alt="NVIDIA-Jetson" width="340">
<img src="https://github.com/AI-Tianlong/Useful-Tools/assets/50650583/70470a39-6a4f-4fd5-a06d-9b9d59a768ef" alt="NVIDIA-Jetson" width="200">
</div>

## 6.3 Speed profiling

Run the following command to profile the model. For details, see [profiler](https://github.com/open-mmlab/mmdeploy/blob/main/docs/zh_cn/02-how-to-run/useful_tools.md#profiler).

```bash
python tools/profiler.py \
    ${DEPLOY_CFG} \
    ${MODEL_CFG} \
    ${IMAGE_DIR} \
    --model ${MODEL} \
    --device ${DEVICE} \
    --shape ${SHAPE} \
    --num-iter ${NUM_ITER} \
    --warmup ${WARMUP} \
    --cfg-options ${CFG_OPTIONS} \
    --batch-size ${BATCH_SIZE} \
    --img-ext ${IMG_EXT}
```

Example:

```bash
python tools/profiler.py \
    configs/mmseg/segmentation_tensorrt_static-512x512.py \
    ../atl_config.py \
    ../atl_demo_img \
    --model /home/sirs/AI-Tianlong/OpenMMLab/atl_trt_models/end2end.engine \
    --device cuda:0 \
    --shape 512x512 \
    --num-iter 100
```

Profiling result:

![image](https://github.com/AI-Tianlong/Useful-Tools/assets/50650583/874e9742-ee10-490c-9e69-17da0096c49b)

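To make the roles of `--num-iter` and `--warmup` concrete, here is a generic timing loop written with the standard library. It is not mmdeploy's implementation, only a sketch of the usual pattern: run a few untimed warm-up calls, then time a fixed number of calls and report mean latency and throughput. The function `benchmark` and its return keys are hypothetical; timing real GPU work would additionally require synchronizing the device before reading the clock.

```python
import time
from typing import Callable, Dict


def benchmark(fn: Callable[[], object], num_iter: int = 100,
              warmup: int = 10) -> Dict[str, float]:
    """Time `fn`, discarding `warmup` calls before measuring `num_iter` calls."""
    if num_iter <= 0:
        raise ValueError("num_iter must be positive")
    for _ in range(warmup):
        fn()  # warm-up calls are executed but not timed
    start = time.perf_counter()
    for _ in range(num_iter):
        fn()
    elapsed = time.perf_counter() - start
    fps = num_iter / elapsed if elapsed > 0 else float("inf")
    return {"mean_latency_ms": elapsed / num_iter * 1000.0, "fps": fps}


if __name__ == "__main__":
    # Stand-in workload; replace with a call into your inference function.
    stats = benchmark(lambda: sum(range(10_000)), num_iter=100, warmup=10)
    print(f"mean latency: {stats['mean_latency_ms']:.3f} ms, FPS: {stats['fps']:.1f}")
```
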
## 6.4 Model inference

Use the TensorRT model folder generated in [6.2.2](#6.2.2-TensorRT-模型转换) to run inference.

```python
from mmdeploy.apis.utils import build_task_processor
from mmdeploy.utils import get_input_shape, load_config
import torch

deploy_cfg = './mmdeploy/configs/mmseg/segmentation_tensorrt_static-512x512.py'
model_cfg = './atl_config.py'
device = 'cuda:0'
backend_model = ['./atl_trt_models/end2end.engine']
image = './atl_demo_img/2_13_2048_1024_2560_1536.png'

# read deploy_cfg and model_cfg
deploy_cfg, model_cfg = load_config(deploy_cfg, model_cfg)

# build task and backend model
task_processor = build_task_processor(model_cfg, deploy_cfg, device)
model = task_processor.build_backend_model(backend_model)

# process input image
input_shape = get_input_shape(deploy_cfg)
model_inputs, _ = task_processor.create_input(image, input_shape)

# do model inference
with torch.no_grad():
    result = model.test_step(model_inputs)

# visualize results
task_processor.visualize(
    image=image,
    model=model,
    result=result[0],
    window_name='visualize',
    output_file='./output_segmentation.png')
```

This produces the inference result:

<div align="center">
<img src="https://github.com/AI-Tianlong/Useful-Tools/assets/50650583/d0ae1fa8-e223-4b3f-b699-6bfa8db38133" alt="NVIDIA-Jetson" width="40%">
<img src="https://github.com/AI-Tianlong/Useful-Tools/assets/50650583/6d999cbe-2101-4e1b-b4a9-13115c9d1928" alt="NVIDIA-Jetson" width="40%">
</div>
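
Beyond the visualization, a predicted mask can be summarized numerically. The sketch below computes the share of pixels per class from a mask given as nested lists of class indices. Several things here are assumptions rather than facts from this tutorial: the class order in `POTSDAM_CLASSES` should be checked against your dataset config, the helper `class_fractions` is hypothetical, and converting the prediction into nested lists (for example from a `pred_sem_seg` field on `result[0]`) depends on your MMSegmentation version.

```python
from collections import Counter
from typing import Dict, Sequence

# Assumed class order for the Potsdam dataset; verify it against your config.
POTSDAM_CLASSES = ("impervious_surface", "building", "low_vegetation",
                   "tree", "car", "clutter")


def class_fractions(mask: Sequence[Sequence[int]],
                    class_names: Sequence[str] = POTSDAM_CLASSES) -> Dict[str, float]:
    """Return the fraction of pixels assigned to each class index in `mask`."""
    counts = Counter(label for row in mask for label in row)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("mask is empty")
    return {name: counts.get(idx, 0) / total for idx, name in enumerate(class_names)}


if __name__ == "__main__":
    # Tiny 2x2 toy mask: two impervious-surface pixels, one building, one car.
    toy_mask = [[0, 0], [1, 4]]
    for name, share in class_fractions(toy_mask).items():
        print(f"{name:>20}: {share:.2%}")
```
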
