[Other General Issues] Error when running yolov3_mobilenet_v1_qat inference with the TensorRT option enabled in PaddleDetection #5726

@Disciple7

Description

The PaddleDetection team appreciates any suggestions or problems you report~

Checklist:

  1. I have searched related issues but cannot get the expected help.
  2. I have read the FAQ documentation but cannot get the expected help.

Describe the bug

Following the official tutorial, I trained a yolov3_mobilenet_v1_270e_voc model on the VOC dataset, using the default yolov3_mobilenet_v1_qat.yml as the quantization config. After exporting it as a Paddle Serving model, the server runs and detects correctly without the --use_trt option, but shows no speedup or compression benefit. With --use_trt enabled, it fails with the error below.

After consulting the admins in the PaddleServing user group, I was told this is a model-side issue and advised to contact PaddleDetection.

In addition, the benchmark script reports the same kind of TensorRT-related error: trt_fp32, trt_fp16, and trt_int8 cannot be tested, while use_cpu and use_gpu work fine.

Reproduction

  1. What command or script did you run?
    Paddle Serving command (a minimal client-call sketch follows the benchmark commands below):
cd /home/ubuntu/lxd-storage/xzy/PaddleCV/PaddleDetection/inference_model/yolov3_mobilenet_v1_270e_qat_pdserving/yolov3_mobilenet_v1_qat
python -m paddle_serving_server.serve --model serving_server --port 9393 --gpu_ids 0 --precision int8 --use_trt

Benchmark test commands:

bash deploy/benchmark/benchmark.sh ./inference_model/yolov3_mobilenet_v1_270e_voc_origin model
bash deploy/benchmark/benchmark_quant.sh ./inference_model/yolov3_mobilenet_v1_270e_qat/yolov3_mobilenet_v1_qat model
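
For reference, a minimal Paddle Serving client call against the endpoint started above (this works when --use_trt is off). This is only a sketch: the feed/fetch variable names and the 608x608 input size are assumptions for this export and should be read from serving_client/serving_client_conf.prototxt.

import numpy as np
from paddle_serving_client import Client

client = Client()
client.load_client_config("serving_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9393"])

# Dummy input; a real client resizes/normalizes an image the same way the
# deploy pipeline does (Resize -> NormalizeImage -> Permute).
img = np.random.rand(3, 608, 608).astype("float32")
fetch_map = client.predict(
    feed={
        "image": img,
        "im_shape": np.array([608.0, 608.0], dtype="float32"),
        "scale_factor": np.array([1.0, 1.0], dtype="float32"),
    },
    fetch=["multiclass_nms3_0.tmp_0"],  # assumed fetch var; check the prototxt
    batch=False,
)
print(fetch_map)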
  2. Did you make any modifications to the code or config? Do you understand what you modified? Please provide the code you changed.

  3. What dataset did you use?

The VOC dataset.

  4. Please provide the error messages or relevant log information.
    Paddle Serving error message and log (a possible workaround sketch follows the log):
/home/ubuntu/miniconda3/envs/paddle_env/lib/python3.7/runpy.py:125: RuntimeWarning: 'paddle_serving_server.serve' found in sys.modules after import of package 'paddle_serving_server', but prior to execution of 'paddle_serving_server.serve'; this may result in unpredictable behaviour
  warn(RuntimeWarning(msg))
Going to Run Comand
/home/ubuntu/miniconda3/envs/paddle_env/lib/python3.7/site-packages/paddle_serving_server/serving-gpu-101-0.8.3/serving -enable_model_toolkit -inferservice_path workdir_9393 -inferservice_file infer_service.prototxt -max_concurrency 0 -num_threads 4 -port 9393 -precision int8 -use_calib=False -reload_interval_s 10 -resource_path workdir_9393 -resource_file resource.prototxt -workflow_path workdir_9393 -workflow_file workflow.prototxt -bthread_concurrency 4 -max_body_size 536870912
I0100 00:00:00.000000 11926 op_repository.h:68] RAW: Succ regist op: GeneralDistKVInferOp
I0100 00:00:00.000000 11926 op_repository.h:68] RAW: Succ regist op: GeneralDistKVQuantInferOp
I0100 00:00:00.000000 11926 op_repository.h:68] RAW: Succ regist op: GeneralInferOp
I0100 00:00:00.000000 11926 op_repository.h:68] RAW: Succ regist op: GeneralReaderOp
I0100 00:00:00.000000 11926 op_repository.h:68] RAW: Succ regist op: GeneralRecOp
I0100 00:00:00.000000 11926 op_repository.h:68] RAW: Succ regist op: GeneralResponseOp
I0100 00:00:00.000000 11926 service_manager.h:79] RAW: Service[LoadGeneralModelService] insert successfully!
I0100 00:00:00.000000 11926 load_general_model_service.pb.h:333] RAW: Success regist service[LoadGeneralModelService][PN5baidu14paddle_serving9predictor26load_general_model_service27LoadGeneralModelServiceImplE]
I0100 00:00:00.000000 11926 service_manager.h:79] RAW: Service[GeneralModelService] insert successfully!
I0100 00:00:00.000000 11926 general_model_service.pb.h:1608] RAW: Success regist service[GeneralModelService][PN5baidu14paddle_serving9predictor13general_model23GeneralModelServiceImplE]
I0100 00:00:00.000000 11926 factory.h:155] RAW: Succ insert one factory, tag: PADDLE_INFER, base type N5baidu14paddle_serving9predictor11InferEngineE
W0100 00:00:00.000000 11926 paddle_engine.cpp:34] RAW: Succ regist factory: ::baidu::paddle_serving::predictor::FluidInferEngine<PaddleInferenceEngine>->::baidu::paddle_serving::predictor::InferEngine, tag: PADDLE_INFER in macro!
I0415 14:52:29.661229 11929 analysis_predictor.cc:576] TensorRT subgraph engine is enabled
--- Running analysis [ir_graph_build_pass]
--- Running analysis [ir_graph_clean_pass]
--- Running analysis [ir_analysis_pass]
--- Running IR pass [conv_affine_channel_fuse_pass]
--- Running IR pass [adaptive_pool2d_convert_global_pass]
--- Running IR pass [conv_eltwiseadd_affine_channel_fuse_pass]
--- Running IR pass [shuffle_channel_detect_pass]
--- Running IR pass [quant_conv2d_dequant_fuse_pass]
--- Running IR pass [delete_quant_dequant_op_pass]
I0415 14:52:29.881975 11929 fuse_pass_base.cc:57] ---  detected 47 subgraphs
--- Running IR pass [delete_quant_dequant_filter_op_pass]
I0415 14:52:29.944126 11929 fuse_pass_base.cc:57] ---  detected 47 subgraphs
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [embedding_eltwise_layernorm_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v2]
--- Running IR pass [multihead_matmul_fuse_pass_v3]
--- Running IR pass [skip_layernorm_fuse_pass]
--- Running IR pass [unsqueeze2_eltwise_fuse_pass]
--- Running IR pass [squeeze2_matmul_fuse_pass]
--- Running IR pass [reshape2_matmul_fuse_pass]
--- Running IR pass [flatten2_matmul_fuse_pass]
--- Running IR pass [map_matmul_v2_to_mul_pass]
--- Running IR pass [map_matmul_v2_to_matmul_pass]
--- Running IR pass [map_matmul_to_mul_pass]
--- Running IR pass [fc_fuse_pass]
--- Running IR pass [conv_elementwise_add_fuse_pass]
--- Running IR pass [tensorrt_subgraph_pass]
I0415 14:52:30.001792 11929 tensorrt_subgraph_pass.cc:138] ---  detect a sub-graph with 145 nodes
I0415 14:52:30.034184 11929 tensorrt_subgraph_pass.cc:395] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time.
terminate called after throwing an instance of 'paddle::platform::EnforceNotMet'
  what():

--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------

----------------------
Error Message Summary:
----------------------
UnimplementedError: no OpConverter for optype [nearest_interp_v2]
  [Hint: it should not be null.] (at /paddle/paddle/fluid/inference/tensorrt/convert/op_converter.h:142)

Aborted (core dumped)
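
From the log, the TensorRT subgraph build aborts because Paddle Inference has no TensorRT OpConverter for nearest_interp_v2 (the nearest-neighbor upsample in the YOLOv3 neck). When building a predictor directly through the Paddle Inference API, one possible workaround is to keep that op out of the TensorRT subgraph so it falls back to the native GPU kernel. A sketch, not a verified fix: exp_disable_tensorrt_ops is an experimental paddle 2.x API, and the model paths below are assumptions.

from paddle.inference import Config, PrecisionType, create_predictor

# Assumed paths to the exported inference model inside serving_server/.
config = Config("serving_server/model.pdmodel", "serving_server/model.pdiparams")
config.enable_use_gpu(200, 0)  # 200 MB initial GPU memory pool, device 0
config.enable_tensorrt_engine(
    workspace_size=1 << 30,
    max_batch_size=1,
    min_subgraph_size=3,
    precision_mode=PrecisionType.Int8,
    use_static=False,
    use_calib_mode=False,
)
# Exclude the unsupported op from the TensorRT subgraph; it then runs on the
# regular Paddle GPU kernel instead of requiring a TRT OpConverter.
config.exp_disable_tensorrt_ops(["nearest_interp_v2"])
predictor = create_predictor(config)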

Benchmark test error (a sketch for checking the wheel's TensorRT support follows the log):

model_dir : ./inference_model/yolov3_mobilenet_v1_270e_qat/yolov3_mobilenet_v1_qat
img_dir: demo/fire_smoke_demo
model  ./inference_model/yolov3_mobilenet_v1_270e_qat/yolov3_mobilenet_v1_qat, run_mode: trt_int8
-----------  Running Arguments -----------
batch_size: 1
camera_id: -1
cpu_threads: 1
device: GPU
enable_mkldnn: False
image_dir: demo/fire_smoke_demo
image_file: None
model_dir: ./inference_model/yolov3_mobilenet_v1_270e_qat/yolov3_mobilenet_v1_qat
output_dir: output
reid_batch_size: 50
reid_model_dir: None
run_benchmark: True
run_mode: trt_int8
save_images: False
save_mot_txt_per_img: False
save_mot_txts: False
scaled: False
threshold: 0.5
trt_calib_mode: False
trt_max_shape: 1280
trt_min_shape: 1
trt_opt_shape: 640
use_dark: True
use_gpu: False
video_file: None
------------------------------------------
-----------  Model Configuration -----------
Model Arch: YOLO
Transform Order:
--transform op: Resize
--transform op: NormalizeImage
--transform op: Permute
--------------------------------------------
Traceback (most recent call last):
  File "deploy/python/infer.py", line 773, in <module>
    main()
  File "deploy/python/infer.py", line 726, in main
    enable_mkldnn=FLAGS.enable_mkldnn)
  File "deploy/python/infer.py", line 94, in __init__
    enable_mkldnn=enable_mkldnn)
  File "deploy/python/infer.py", line 563, in load_predictor
    predictor = create_predictor(config)
ValueError: (InvalidArgument) Pass tensorrt_subgraph_pass has not been registered. Please use the paddle inference library compiled with tensorrt or disable the tensorrt engine in inference configuration!
  [Hint: Expected Has(pass_type) == true, but received Has(pass_type):0 != true:1.] (at /paddle/paddle/fluid/framework/ir/pass.h:236)
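
This second error indicates that the paddlepaddle wheel used for the benchmark run was built without TensorRT support, since tensorrt_subgraph_pass is only registered in TensorRT-enabled builds. A quick sanity-check sketch (get_trt_compile_version exists in recent paddle 2.x wheels; its availability and zero-return behavior here are assumptions worth verifying):

import paddle
import paddle.inference as paddle_infer

print(paddle.version.full_version)  # e.g. 2.2.2
try:
    # TensorRT version the wheel was compiled against; all zeros (or a
    # missing attribute) suggests a build without TensorRT support.
    print(paddle_infer.get_trt_compile_version())
except AttributeError:
    print("this paddle wheel exposes no TensorRT build info")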

Environment

  1. Please provide the versions of Paddle and PaddleDetection you used:
    paddlepaddle-gpu=2.2.2.post101
    paddledet=2.3.0

  2. Please provide the versions of any other related tools/products used, such as PaddleServing:
    paddleslim=2.2.2
    paddle-serving-server-gpu=0.8.3.post101
    tensorrt=6.0.1.5

  3. Please provide the OS information, e.g., Linux/Windows/MacOS:
    Ubuntu 16.04

  4. Please provide the version of Python you used.
    Python 3.7

  5. Please provide the version of CUDA/cuDNN you used.
    CUDA 10.1
