|
| 1 | +# Tutorial: Fine-tuning MMSeg Models and Deploying Them to the NVIDIA Jetson Platform |
| 2 | + |
| 3 | +- Please read the [MMSegmentation model deployment](https://github.com/open-mmlab/mmsegmentation/blob/main/docs/zh_cn/user_guides/5_deployment.md) documentation first. |
| 4 | +- **mmsegmentation version used in this tutorial: v1.1.2** |
| 5 | +- **NVIDIA Jetson device used in this tutorial: NVIDIA Jetson AGX Orin 64G** |
| 6 | + |
| 7 | +<div align="center"> |
| 8 | + <img src="https://github.com/AI-Tianlong/Useful-Tools/assets/50650583/b5466cfd-71a9-4e06-9823-c253a97d57b5" alt="Smiley face" width="50%"> |
| 9 | +</div> |
| 10 | + |
| 11 | +## 1 Set up [mmsegmentation](https://github.com/open-mmlab/mmsegmentation) |
| 12 | + |
| 13 | +- Follow the [Installation and Verification](https://github.com/open-mmlab/mmsegmentation/blob/main/docs/zh_cn/get_started.md) documentation to install the dependencies required for developing with [mmsegmentation](https://github.com/open-mmlab/mmsegmentation), such as [`pytorch`](https://pytorch.org/get-started/locally/), [`mmcv`](https://github.com/open-mmlab/mmcv), and [`mmengine`](https://github.com/open-mmlab/mmengine). |
| 14 | +- Download [mmsegmentation](https://github.com/open-mmlab/mmsegmentation) from GitHub with the `git clone` command. If your network connection is unreliable, you can instead download the zip archive from the [MMSeg GitHub](https://github.com/open-mmlab/mmsegmentation) page. |
| 15 | + ```bash |
| 16 | + git clone https://github.com/open-mmlab/mmsegmentation.git |
| 17 | + ``` |
| 18 | +- Install mmsegmentation in editable mode with the `pip install -v -e .` command. |
| 19 | + ```bash |
| 20 | + cd mmsegmentation |
| 21 | + pip install -v -e . |
| 22 | + ``` |
| 23 | + Once the installation succeeds, `pip list` shows that mmsegmentation has been installed into your environment from the local source tree. |
| 24 | +  |
| 25 | + |
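| 26 | +## 2 Prepare your dataset |
| 27 | + |
| 28 | +- This tutorial uses the remote sensing semantic segmentation dataset [potsdam](https://github.com/open-mmlab/mmsegmentation/blob/main/docs/zh_cn/user_guides/2_dataset_prepare.md#isprs-potsdam) as an example. |
| 29 | +- Follow the [potsdam data preparation](https://github.com/open-mmlab/mmsegmentation/blob/main/docs/zh_cn/user_guides/2_dataset_prepare.md#isprs-potsdam) documentation to download the dataset and convert it to the MMSeg format. |
| 30 | +- Dataset overview: the potsdam dataset is named after Potsdam, a typical historic German city with large building blocks, narrow streets, and a dense settlement structure. The dataset contains 38 images of 6000x6000 pixels at a spatial resolution of 5 cm. A sample from the dataset is shown below: |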
| 31 | +  |
| 32 | + |
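| 33 | +## 3 Download the model's pth weight file from the config page |
| 34 | + |
| 35 | +This tutorial takes the [`deeplabv3plus_r101-d8_4xb4-80k_potsdam-512x512.py`](../../configs/deeplabv3plus/deeplabv3plus_r101-d8_4xb4-80k_potsdam-512x512.py) config file as an example. Download the weight file from the [configs](https://github.com/open-mmlab/mmsegmentation/tree/main/configs/deeplabv3plus#potsdam) page. |
| 36 | + |
| 37 | + ![pth download location](https://user-images.githubusercontent.com/50650583/260875290-5caf7acd-e6df-4e7c-8258-5eaed0f9ac8a.png) |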
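| 38 | +## 4 Convert and benchmark the model interactively with [OpenMMLab deploee](https://platform.openmmlab.com/deploee) |
| 39 | + |
| 40 | +### 4.1 Model conversion |
| 41 | + |
| 42 | +In this part, the [OpenMMLab website](https://platform.openmmlab.com/deploee) provides an interactive interface for model conversion and speed benchmarking. Without writing any code, you can convert a model to the ONNX format (`xxxx.onnx`) and the TensorRT format (`.engine`) by selecting the corresponding options. |
| 43 | +If your custom config file contains relative references, such as: |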
| 44 | + |
| 45 | +```python |
| 46 | +# xxxx.py |
| 47 | +_base_ = [ |
| 48 | + '../_base_/models/deeplabv3plus_r50-d8.py', |
| 49 | + '../_base_/datasets/potsdam.py', |
| 50 | + '../_base_/default_runtime.py', |
| 51 | + '../_base_/schedules/schedule_80k.py' |
| 52 | +] |
| 53 | +``` |
| 54 | + |
| 55 | +you can use the following code to resolve the relative references and generate a complete config file. |
| 56 | + |
| 57 | +```python |
| 58 | +import mmengine |
| 59 | + |
| 60 | +mmengine.Config.fromfile("configs/deeplabv3plus/deeplabv3plus_r101-d8_4xb4-80k_potsdam-512x512.py").dump("My_config.py") |
| 61 | +``` |
| 62 | + |
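| 63 | +After running the code above, `My_config.py` contains the complete configuration with no relative references. You can now upload this model config to the corresponding field on the web page. |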
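Before uploading, it can be worth confirming that the dumped file really is self-contained. The sketch below is one hedged way to do so using only the standard library: it scans config text for a top-level `_base_` assignment. The helper name `has_base_reference` and the sample strings are hypothetical and are not part of mmengine.

```python
import re

# Matches a top-level `_base_ = ...` assignment (no leading indentation).
_BASE_PATTERN = re.compile(r"^_base_\s*=", re.MULTILINE)


def has_base_reference(config_text: str) -> bool:
    """Return True if the config text still inherits from other files via `_base_`."""
    return _BASE_PATTERN.search(config_text) is not None


if __name__ == "__main__":
    # Hypothetical config snippets used only for illustration.
    inherited = "_base_ = ['../_base_/default_runtime.py']\ncrop_size = (512, 512)\n"
    flattened = "crop_size = (512, 512)\nmodel = dict(type='EncoderDecoder')\n"
    print(has_base_reference(inherited))
    print(has_base_reference(flattened))
```

To check a real file, pass its contents, for example `has_base_reference(open("My_config.py").read())`; a result of `False` suggests the relative references have been resolved.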
| 64 | + |
| 65 | +#### Create a conversion task |
| 66 | + |
| 67 | +Following the figure below and your own requirements, create a conversion task and submit it. |
| 68 | + |
| 69 | +<div align="center"> |
| 70 | + <img src="https://github.com/AI-Tianlong/Useful-Tools/assets/50650583/4918d2f9-d63c-480f-97f1-054529770cfd" alt="NVIDIA-Jetson" width="80%"> |
| 71 | +</div> |
| 72 | + |
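| 73 | +### 4.2 Model speed benchmarking |
| 74 | + |
| 75 | +Once the model has been converted, you can benchmark it on a real device through the **model speed benchmarking** interface. |
| 76 | + |
| 77 | +#### Create a benchmarking task |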
| 78 | + |
| 79 | +<div align="center"> |
| 80 | + <img src="https://github.com/AI-Tianlong/Useful-Tools/assets/50650583/27340556-c81a-4ce3-8560-2c4727d3355e" alt="NVIDIA-Jetson" width="100%"> |
| 81 | +</div> |
| 82 | + |
| 83 | +<div align="center"> |
| 84 | + <img src="https://github.com/AI-Tianlong/Useful-Tools/assets/50650583/6f4fc3a9-ba9d-4829-8407-ed1470ba7bf3" alt="NVIDIA-Jetson" width="100%"> |
| 85 | +</div> |
| 86 | + |
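| 87 | +When the benchmark finishes, a complete report is generated on the page. [View a sample benchmarking report](https://openmmlab-deploee.oss-cn-shanghai.aliyuncs.com/tmp/profile_speed/4352f5.txt) |
| 88 | + |
| 89 | +## 5 Convert the model to ONNX format from the command line with OpenMMLab mmdeploy |
| 90 | + |
| 91 | +In this part, the mmdeploy library converts a model trained with mmseg into an inference format. An example is given here; for details, see the [mmdeploy model conversion documentation](../../docs/zh_cn/user_guides/5_deployment.md). |
| 92 | + |
| 93 | +### 5.1 Build the mmdeploy library from source |
| 94 | + |
| 95 | +In the virtual environment where you installed mmsegmentation, clone [mmdeploy](https://github.com/open-mmlab/mmdeploy) from GitHub with the `git clone` command. |
| 96 | + |
| 97 | +### 5.2 Model conversion |
| 98 | + |
| 99 | +If your config contains relative references, you still need to resolve them as described in [4.1 Model conversion](#4.1-模型转换). |
| 100 | +Then enter the mmdeploy folder and run the following command to convert the model. |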
| 101 | + |
| 102 | +```bash |
| 103 | +python tools/deploy.py \ |
| 104 | + configs/mmseg/segmentation_onnxruntime_static-512x512.py \ |
| 105 | + ../atl_config.py \ |
| 106 | + ../deeplabv3plus_r18-d8_512x512_80k_potsdam_20211219_020601-75fd5bc3.pth \ |
| 107 | + ../2_13_1024_5488_1536_6000.png \ |
| 108 | + --work-dir ../atl_models \ |
| 109 | + --device cpu \ |
| 110 | + --show \ |
| 111 | + --dump-info |
| 112 | +``` |
| 113 | + |
| 114 | +```bash |
| 115 | +# Usage: DEVICE is cpu or cuda:0; --show displays the visualized result; --dump-info dumps SDK info |
| 116 | +python ./tools/deploy.py \ |
| 117 | + ${DEPLOY_CFG_PATH} \ |
| 118 | + ${MODEL_CFG_PATH} \ |
| 119 | + ${MODEL_CHECKPOINT_PATH} \ |
| 120 | + ${INPUT_IMG} \ |
| 121 | + --work-dir ${WORK_DIR} \ |
| 122 | + --device ${DEVICE} \ |
| 123 | + --show \ |
| 124 | + --dump-info |
| 125 | + |
| 126 | +``` |
| 127 | + |
| 128 | +If the command succeeds, you will see the following messages, which indicate that the conversion succeeded. |
| 129 | + |
| 130 | +```bash |
| 131 | +10/08 17:40:44 - mmengine - INFO - visualize pytorch model success. |
| 132 | +10/08 17:40:44 - mmengine - INFO - All process success. |
| 133 | +``` |
| 134 | + |
| 135 | +<div align="center"> |
| 136 | + <img src="https://github.com/AI-Tianlong/Useful-Tools/assets/50650583/b752ccf8-903f-4ad3-ad7c-74fc25cb89a5" alt="NVIDIA-Jetson" width="400"> |
| 137 | +</div> |
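A quick file check can complement the log messages. The sketch below assumes that the work directory contains `end2end.onnx` after an ONNX conversion and, when `--dump-info` is passed, `deploy.json`, `pipeline.json`, and `detail.json`. These file names are assumptions, so adjust them if your mmdeploy version writes different files. `missing_outputs` is a hypothetical helper, not an mmdeploy API.

```python
from pathlib import Path

# Assumed output names; adjust if your mmdeploy version writes different files.
EXPECTED_FILES = ("end2end.onnx", "deploy.json", "pipeline.json", "detail.json")


def missing_outputs(work_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from work_dir."""
    root = Path(work_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    # "../atl_models" is the --work-dir used in the example command above.
    missing = missing_outputs("../atl_models")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected files are present.")
```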
| 138 | + |
| 139 | +## 6 Convert and deploy on the Jetson platform |
| 140 | + |
| 141 | +### 6.1 Environment preparation |
| 142 | + |
| 143 | +Follow the [How to install MMDeploy on Jetson modules](https://github.com/open-mmlab/mmdeploy/blob/main/docs/zh_cn/01-how-to-build/jetsons.md) documentation to prepare the environment on the Jetson. |
| 144 | +**Note**: to install PyTorch, consult the [NVIDIA Jetson PyTorch installation documentation](https://github.com/open-mmlab/mmdeploy/blob/main/docs/zh_cn/01-how-to-build/jetsons.md), and observe the version restriction described in 6.1.2. |
| 145 | + |
| 146 | +#### 6.1.1 Create a virtual environment |
| 147 | + |
| 148 | +```bash |
| 149 | +conda create -n {your_env_name} python={python_version} |
| 150 | +``` |
| 151 | + |
| 152 | +#### 6.1.2 Install PyTorch in the virtual environment |
| 153 | + |
| 154 | +<font color="red">Note:</font> do not install the latest PyTorch 2.0 here. PyTorch 1.11 is the last wheel built with USE_DISTRIBUTED; with a later wheel, model conversion with mmdeploy fails with `AttributeError: module 'torch.distributed' has no attribute 'ReduceOp'`. See https://forums.developer.nvidia.com/t/module-torch-distributed-has-no-attribute-reduceop/256581/6 for details. |
| 155 | +Download `torch-1.11.0-cp38-cp38-linux_aarch64.whl` and install it: |
| 156 | + |
| 157 | +```bash |
| 158 | +pip install torch-1.11.0-cp38-cp38-linux_aarch64.whl |
| 159 | +``` |
| 160 | + |
| 161 | +After running the command above, you will see the following output, which indicates that the installation succeeded. |
| 162 | + |
| 163 | +```bash |
| 164 | +Processing ./torch-1.11.0-cp38-cp38-linux_aarch64.whl |
| 165 | +Requirement already satisfied: typing-extensions in /home/sirs/miniconda3/envs/openmmlab/lib/python3.8/site-packages (from torch==1.11.0) (4.7.1) |
| 166 | +Installing collected packages: torch |
| 167 | +Successfully installed torch-1.11.0 |
| 168 | +``` |
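Because the error discussed above only appears later, during model conversion, it may help to test for the attribute right after installing the wheel. The following is a minimal sketch under that assumption; `module_has_attribute` is a hypothetical helper that imports a module by name and reports whether an attribute exists on it.

```python
import importlib


def module_has_attribute(module_name, attribute):
    """Return True/False if the module imports, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return hasattr(module, attribute)


if __name__ == "__main__":
    status = module_has_attribute("torch.distributed", "ReduceOp")
    if status is None:
        print("torch is not importable in this environment")
    elif status:
        print("torch.distributed.ReduceOp is available")
    else:
        print("torch.distributed.ReduceOp is missing; model conversion is likely to fail")
```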
| 169 | + |
| 170 | +#### 6.1.3 Copy the TensorRT bundled with JetPack into the virtual environment |
| 171 | + |
| 172 | +See [Configure TensorRT](https://github.com/open-mmlab/mmdeploy/blob/main/docs/zh_cn/01-how-to-build/jetsons.md#%E9%85%8D%E7%BD%AE-tensorrt). |
| 173 | +The JetPack SDK ships with TensorRT. To import it successfully inside a Conda environment, however, TensorRT must be copied into the Conda environment created earlier. |
| 174 | + |
| 175 | +```bash |
| 176 | +export PYTHON_VERSION=`python3 --version | cut -d' ' -f 2 | cut -d'.' -f1,2` |
| 177 | +cp -r /usr/lib/python${PYTHON_VERSION}/dist-packages/tensorrt* ~/miniconda/envs/{your_env_name}/lib/python${PYTHON_VERSION}/site-packages/ |
| 178 | +``` |
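The destination path depends on where Conda is installed (for example `miniconda` versus `miniconda3`). One hedged way to avoid guessing is to ask the interpreter of the activated environment for its own version and site-packages directory. The sketch below uses only the standard library; the two function names are hypothetical.

```python
import sys
import sysconfig


def python_major_minor():
    """Return the interpreter version as 'MAJOR.MINOR', for example '3.8'."""
    return f"{sys.version_info.major}.{sys.version_info.minor}"


def site_packages_dir():
    """Return the site-packages directory of the interpreter running this script."""
    return sysconfig.get_paths()["purelib"]


if __name__ == "__main__":
    # Run this inside the activated Conda environment.
    print("Python version:", python_major_minor())
    print("Copy TensorRT into:", site_packages_dir())
```

The printed directory can then be used as the destination of the `cp -r` command above.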
| 179 | + |
| 180 | +#### 6.1.4 Install MMCV |
| 181 | + |
| 182 | +Install it with `mim install mmcv`, or build it from source. |
| 183 | + |
| 184 | +```bash |
| 185 | +pip install openmim |
| 186 | +mim install mmcv |
| 187 | +``` |
| 188 | + |
| 189 | +Alternatively, build it from source. |
| 190 | + |
| 191 | +```bash |
| 192 | +sudo apt-get install -y libssl-dev |
| 193 | +git clone https://github.com/open-mmlab/mmcv.git |
| 194 | +cd mmcv |
| 195 | +pip install -e . |
| 196 | +``` |
| 197 | + |
| 198 | +<font color="red">Note: after the PyTorch version changes, mmcv must be recompiled.</font> |
| 199 | + |
| 200 | +#### 6.1.5 Install ONNX |
| 201 | + |
| 202 | +<font color="red">Note: choose one of the two methods below.</font> |
| 203 | + |
| 204 | +- conda |
| 205 | + ```bash |
| 206 | + conda install -c conda-forge onnx |
| 207 | + ``` |
| 208 | +- pip |
| 209 | + ```bash |
| 210 | + python3 -m pip install onnx |
| 211 | + ``` |
| 212 | + |
| 213 | +#### 6.1.6 Install ONNX Runtime |
| 214 | + |
| 215 | +Choose a suitable ONNX Runtime version from the [ONNX Runtime](https://elinux.org/Jetson_Zoo#ONNX_Runtime) page, then download and install it. |
| 216 | +Example: |
| 217 | + |
| 218 | +```bash |
| 219 | +# Install pip wheel |
| 220 | +pip3 install onnxruntime_gpu-1.10.0-cp38-cp38-linux_aarch64.whl |
| 221 | + |
| 222 | +``` |
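With all of the packages from section 6.1 in place, a single check can confirm that each one is visible to the environment's interpreter before you start converting models. This is a minimal sketch: `find_missing_modules` is a hypothetical helper, and the module list simply mirrors the packages installed above.

```python
from importlib.util import find_spec

# Top-level modules installed in section 6.1.
REQUIRED_MODULES = ("torch", "tensorrt", "mmcv", "onnx", "onnxruntime")


def find_missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be located in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing_modules()
    print("Missing modules:", missing if missing else "none")
```

Note that `find_spec` only locates a module without importing it, so a module that is found can still fail at import time, for example because of a missing shared library.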
| 223 | + |
| 224 | +### 6.2 Convert the model and run inference on Jetson AGX Orin |
| 225 | + |
| 226 | +#### 6.2.1 ONNX model conversion |
| 227 | + |
| 228 | +As in [5.2 Model conversion](#5.2-模型转换), on the Jetson platform activate the virtual environment you set up, enter the mmdeploy directory, and convert the model to ONNX. |
| 229 | + |
| 230 | +```bash |
| 231 | +python tools/deploy.py \ |
| 232 | + configs/mmseg/segmentation_onnxruntime_static-512x512.py \ |
| 233 | + ../atl_config.py \ |
| 234 | + ../deeplabv3plus_r18-d8_512x512_80k_potsdam_20211219_020601-75fd5bc3.pth \ |
| 235 | + ../2_13_3584_2560_4096_3072.png \ |
| 236 | + --work-dir ../atl_models \ |
| 237 | + --device cpu \ |
| 238 | + --show \ |
| 239 | + --dump-info |
| 240 | + |
| 241 | +``` |
| 242 | + |
| 243 | +<font color="red">Note:</font> if the following error is reported: |
| 244 | + |
| 245 | +```none |
| 246 | +AttributeError: module 'torch.distributed' has no attribute 'ReduceOp' |
| 247 | +``` |
| 248 | + |
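| 249 | +To resolve it, install PyTorch 1.11.0 as described in this thread: https://forums.developer.nvidia.com/t/module-torch-distributed-has-no-attribute-reduceop/256581/6 |
| 250 | + |
| 251 | +After a successful conversion, you will see the following messages and a folder containing the ONNX model: |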
| 252 | + |
| 253 | +```bash |
| 254 | +10/09 19:58:22 - mmengine - INFO - visualize pytorch model success. |
| 255 | +10/09 19:58:22 - mmengine - INFO - All process success. |
| 256 | +``` |
| 257 | + |
| 258 | +<div align="center"> |
| 259 | + <img src="https://github.com/AI-Tianlong/Useful-Tools/assets/50650583/d68f1cf6-0e80-4261-91a3-6046b17de146" alt="NVIDIA-Jetson" width="400"> |
| 260 | + <img src="https://github.com/AI-Tianlong/Useful-Tools/assets/50650583/70470a39-6a4f-4fd5-a06d-9b9d59a768ef" alt="NVIDIA-Jetson" width="160"> |
| 261 | +</div> |
| 262 | + |
| 263 | +#### 6.2.2 TensorRT model conversion |
| 264 | + |
| 265 | +Switch to the TensorRT deployment config file to convert the model to TensorRT. |
| 266 | + |
| 267 | +```bash |
| 268 | +python tools/deploy.py \ |
| 269 | + configs/mmseg/segmentation_tensorrt_static-512x512.py \ |
| 270 | + ../atl_config.py \ |
| 271 | + ../deeplabv3plus_r18-d8_512x512_80k_potsdam_20211219_020601-75fd5bc3.pth \ |
| 272 | + ../2_13_3584_2560_4096_3072.png \ |
| 273 | + --work-dir ../atl_trt_models \ |
| 274 | + --device cuda:0 \ |
| 275 | + --show \ |
| 276 | + --dump-info |
| 277 | + |
| 278 | +``` |
| 279 | + |
| 280 | +After a successful conversion, you will see the following messages and the TensorRT model folder: |
| 281 | + |
| 282 | +```bash |
| 283 | +10/09 20:15:50 - mmengine - INFO - visualize pytorch model success. |
| 284 | +10/09 20:15:50 - mmengine - INFO - All process success. |
| 285 | +``` |
| 286 | + |
| 287 | +<div align="center"> |
| 288 | + <img src="https://github.com/AI-Tianlong/Useful-Tools/assets/50650583/2ac1428f-b787-4fdd-beaf-6397e5b21e33" alt="NVIDIA-Jetson" width="340"> |
| 289 | + <img src="https://github.com/AI-Tianlong/Useful-Tools/assets/50650583/70470a39-6a4f-4fd5-a06d-9b9d59a768ef" alt="NVIDIA-Jetson" width="200"> |
| 290 | +</div> |
| 291 | + |
| 292 | +### 6.3 Model speed benchmarking |
| 293 | + |
| 294 | +Run the following command to benchmark the model. For details, see [profiler](https://github.com/open-mmlab/mmdeploy/blob/main/docs/zh_cn/02-how-to-run/useful_tools.md#profiler). |
| 295 | + |
| 296 | +```bash |
| 297 | +python tools/profiler.py \ |
| 298 | + ${DEPLOY_CFG} \ |
| 299 | + ${MODEL_CFG} \ |
| 300 | + ${IMAGE_DIR} \ |
| 301 | + --model ${MODEL} \ |
| 302 | + --device ${DEVICE} \ |
| 303 | + --shape ${SHAPE} \ |
| 304 | + --num-iter ${NUM_ITER} \ |
| 305 | + --warmup ${WARMUP} \ |
| 306 | + --cfg-options ${CFG_OPTIONS} \ |
| 307 | + --batch-size ${BATCH_SIZE} \ |
| 308 | + --img-ext ${IMG_EXT} |
| 309 | +``` |
| 310 | + |
| 311 | +Example: |
| 312 | + |
| 313 | +```bash |
| 314 | +python tools/profiler.py \ |
| 315 | + configs/mmseg/segmentation_tensorrt_static-512x512.py \ |
| 316 | + ../atl_config.py \ |
| 317 | + ../atl_demo_img \ |
| 318 | + --model /home/sirs/AI-Tianlong/OpenMMLab/atl_trt_models/end2end.engine \ |
| 319 | + --device cuda:0 \ |
| 320 | + --shape 512x512 \ |
| 321 | + --num-iter 100 |
| 322 | +``` |
| 323 | + |
| 324 | +Benchmark results: |
| 325 | + |
| 326 | + |
| 327 | + |
| 328 | +### 6.4 Model inference |
| 329 | + |
| 330 | +Run inference with the TensorRT model folder generated in [6.2.2](#6.2.2-TensorRT-模型转换). |
| 331 | + |
| 332 | +```python |
| 333 | +from mmdeploy.apis.utils import build_task_processor |
| 334 | +from mmdeploy.utils import get_input_shape, load_config |
| 335 | +import torch |
| 336 | + |
| 337 | +deploy_cfg='./mmdeploy/configs/mmseg/segmentation_tensorrt_static-512x512.py' |
| 338 | +model_cfg='./atl_config.py' |
| 339 | +device='cuda:0' |
| 340 | +backend_model = ['./atl_trt_models/end2end.engine'] |
| 341 | +image = './atl_demo_img/2_13_2048_1024_2560_1536.png' |
| 342 | + |
| 343 | +# read deploy_cfg and model_cfg |
| 344 | +deploy_cfg, model_cfg = load_config(deploy_cfg, model_cfg) |
| 345 | + |
| 346 | +# build task and backend model |
| 347 | +task_processor = build_task_processor(model_cfg, deploy_cfg, device) |
| 348 | +model = task_processor.build_backend_model(backend_model) |
| 349 | + |
| 350 | +# process input image |
| 351 | +input_shape = get_input_shape(deploy_cfg) |
| 352 | +model_inputs, _ = task_processor.create_input(image, input_shape) |
| 353 | + |
| 354 | +# do model inference |
| 355 | +with torch.no_grad(): |
| 356 | + result = model.test_step(model_inputs) |
| 357 | + |
| 358 | +# visualize results |
| 359 | +task_processor.visualize( |
| 360 | + image=image, |
| 361 | + model=model, |
| 362 | + result=result[0], |
| 363 | + window_name='visualize', |
| 364 | + output_file='./output_segmentation.png') |
| 365 | +``` |
| 366 | + |
| 367 | +This produces the inference result: |
| 368 | + |
| 369 | +<div align="center"> |
| 370 | + <img src="https://github.com/AI-Tianlong/Useful-Tools/assets/50650583/d0ae1fa8-e223-4b3f-b699-6bfa8db38133" alt="NVIDIA-Jetson" width="40%"> |
| 371 | + <img src="https://github.com/AI-Tianlong/Useful-Tools/assets/50650583/6d999cbe-2101-4e1b-b4a9-13115c9d1928" alt="NVIDIA-Jetson" width="40%"> |
| 372 | +</div> |