Please note that this project is not the best implementation of YOLOv4; we forked it into our private repository only to use its ONNX-conversion scripts and to obtain mAP data.
The latest and best implementation of YOLOv4 is https://github.com/WongKinYiu/ScaledYOLOv4. I have already tested and confirmed that on a Jetson Nano 4G only the Tiny and CSP versions of ScaledYOLOv4 can be used, because of memory and performance limitations.
ScaledYOLOv4 Tiny is really AlexeyAB darknet YOLOv4 Tiny; its training and testing can be done with https://github.com/AlexeyAB/darknet, and its mAP testing can be done with evaluate_on_coco.py (there are some bugs in evaluate_on_coco.py and other files in this project; I have fixed them and will commit the fixes here).
To get mAP with the trained .weights and .cfg from AlexeyAB darknet: after you have run 'git clone https://github.com/AlexeyAB/darknet' to download the source code, built darknet, configured it, and finished training on your own COCO2017-formatted dataset, just execute commands like:
```sh
cd /workspace/pytorch-YOLOv4
python evaluate_on_coco.py -c cfg/yolov4-tiny.cfg -w /workspace/AB_darknet/backup/yolov4-tiny_last.weights -dir /workspace/AB_darknet/data/coco/val2017 -gta /workspace/AB_darknet/data/coco/annotations/instances_val2017.json -r yolov4-tiny_results.json
```
To generate an ONNX model from .weights trained with darknet, just execute a command like:
```sh
python demo_darknet2onnx.py /workspace/AB_darknet/cfg/yolov4-tiny.cfg /workspace/AB_darknet/backup/yolov4-tiny_final.weights data/dog.jpg 1
```
To generate an ONNX model from a .pth file trained with this pytorch-YOLOv4, execute a command in the following format:

```sh
python demo_pytorch2onnx.py <weight_file> <image_path> <batch_size> <n_classes> <IN_IMAGE_H> <IN_IMAGE_W>
```

For example:

```sh
python demo_pytorch2onnx.py yolov4.pth dog.jpg 8 80 416 416
```
A minimal PyTorch implementation of YOLOv4.
- Paper YOLOv4: https://arxiv.org/abs/2004.10934
- Source code: https://github.com/AlexeyAB/darknet
- More details: http://pjreddie.com/darknet/yolo/
- Inference
- Train
- Mosaic
```
├── README.md
├── dataset.py            dataset
├── demo.py               demo to run pytorch --> tool/darknet2pytorch
├── demo_darknet2onnx.py  tool to convert into onnx --> tool/darknet2pytorch
├── demo_pytorch2onnx.py  tool to convert into onnx
├── models.py             model for pytorch
├── train.py              train models.py
├── cfg.py                cfg.py for train
├── cfg                   cfg --> darknet2pytorch
├── data
├── weight                --> darknet2pytorch
├── tool
│   ├── camera.py           a demo camera
│   ├── coco_annotation.py  coco dataset generator
│   ├── config.py
│   ├── darknet2pytorch.py
│   ├── region_loss.py
│   ├── utils.py
│   └── yolo_layer.py
```
- baidu: https://pan.baidu.com/s/1dAGEW8cm-dqK14TbhhVetA (extraction code: dm5b)
- google: https://drive.google.com/open?id=1cewMfusmPjYWbrnuJRuKhPMwRe_b9PaT

You can use darknet2pytorch to convert it yourself, or download my converted model:

- baidu
    - yolov4.pth: https://pan.baidu.com/s/1ZroDvoGScDgtE1ja_QqJVw (extraction code: xrq9)
    - yolov4.conv.137.pth: https://pan.baidu.com/s/1ovBie4YyVQQoUrC3AY0joA (extraction code: kcel)
- google
    - yolov4.pth: https://drive.google.com/open?id=1wv_LiFeCRYwtpkqREPeI13-gPELBDwuJ
    - yolov4.conv.137.pth: https://drive.google.com/open?id=1fcbR0bWzYfIEdLJPzOsn4R5mlvR6IQyA
Use YOLOv4 to train your own data
- Download weights
- Transform data

For the COCO dataset, you can use tool/coco_annotation.py. The expected annotation format is:
```
# train.txt
image_path1 x1,y1,x2,y2,id x1,y1,x2,y2,id x1,y1,x2,y2,id ...
image_path2 x1,y1,x2,y2,id x1,y1,x2,y2,id x1,y1,x2,y2,id ...
...
```
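If you are not using tool/coco_annotation.py, the conversion is simple to sketch yourself. Below is a minimal illustration (not the actual tool) that writes this format from a COCO instances JSON; file names are placeholders, and the real tool also remaps COCO category ids to contiguous class ids, which is omitted here:

```python
import json
from collections import defaultdict

# Placeholder path; point this at your COCO-format annotation file.
with open("instances_train2017.json") as f:
    coco = json.load(f)

# Group boxes by image; a COCO bbox is [x, y, width, height],
# while train.txt wants x1,y1,x2,y2,id per box.
boxes = defaultdict(list)
for ann in coco["annotations"]:
    x, y, w, h = ann["bbox"]
    boxes[ann["image_id"]].append(
        f"{int(x)},{int(y)},{int(x + w)},{int(y + h)},{ann['category_id']}")

file_names = {img["id"]: img["file_name"] for img in coco["images"]}
with open("train.txt", "w") as f:
    for image_id, bs in boxes.items():
        f.write(file_names[image_id] + " " + " ".join(bs) + "\n")
```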
- Train

You can set parameters in cfg.py.

```sh
python train.py -g [GPU_ID] -dir [Dataset directory] ...
```
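For example, a plausible invocation (the flag values are illustrative, assuming GPU 0 and a dataset directory prepared as above):

```sh
python train.py -g 0 -dir dataset/train
```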
2.1 Performance on MS COCO dataset (using pretrained DarknetWeights from https://github.com/AlexeyAB/darknet)
ONNX and TensorRT models are converted from Pytorch (TianXiaomo): Pytorch -> ONNX -> TensorRT. See the following sections for more details on the conversions.
- val2017 dataset (input size: 416x416)
Model type | AP | AP50 | AP75 | APS | APM | APL |
---|---|---|---|---|---|---|
DarkNet (YOLOv4 paper) | 0.471 | 0.710 | 0.510 | 0.278 | 0.525 | 0.636 |
Pytorch (TianXiaomo) | 0.466 | 0.704 | 0.505 | 0.267 | 0.524 | 0.629 |
TensorRT FP32 + BatchedNMSPlugin | 0.472 | 0.708 | 0.511 | 0.273 | 0.530 | 0.637 |
TensorRT FP16 + BatchedNMSPlugin | 0.472 | 0.708 | 0.511 | 0.273 | 0.530 | 0.636 |
- testdev2017 dataset (input size: 416x416)
Model type | AP | AP50 | AP75 | APS | APM | APL |
---|---|---|---|---|---|---|
DarkNet (YOLOv4 paper) | 0.412 | 0.628 | 0.443 | 0.204 | 0.444 | 0.560 |
Pytorch (TianXiaomo) | 0.404 | 0.615 | 0.436 | 0.196 | 0.438 | 0.552 |
TensorRT FP32 + BatchedNMSPlugin | 0.412 | 0.625 | 0.445 | 0.200 | 0.446 | 0.564 |
TensorRT FP16 + BatchedNMSPlugin | 0.412 | 0.625 | 0.445 | 0.200 | 0.446 | 0.563 |
Image input size is NOT restricted to 320 * 320, 416 * 416, 512 * 512 and 608 * 608. You can adjust your input sizes for a different input ratio, for example: 320 * 608. A larger input size could help detect smaller targets, but may be slower and exhaust GPU memory.
height = 320 + 96 * n, n in {0, 1, 2, 3, ...}
width = 320 + 96 * m, m in {0, 1, 2, 3, ...}
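As a quick sanity check, here is a tiny sketch (illustrative only, not part of the repo) that tests a candidate input size against these two constraints:

```python
def is_valid_input_size(height: int, width: int) -> bool:
    # Both dimensions must equal 320 + 96 * k for a non-negative integer k.
    return all(d >= 320 and (d - 320) % 96 == 0 for d in (height, width))

print(is_valid_input_size(416, 416))  # True: 416 = 320 + 96 * 1
print(is_valid_input_size(320, 608))  # True: 608 = 320 + 96 * 3
print(is_valid_input_size(400, 416))  # False: 400 - 320 = 80 is not a multiple of 96
```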
- Load the pretrained darknet model and darknet weights to do the inference (the image size is already configured in the cfg file):

```sh
python demo.py -cfgfile <cfgFile> -weightfile <weightFile> -imgfile <imgFile>
```

- Load pytorch weights (pth file) to do the inference:

```sh
python models.py <num_classes> <weightfile> <imgfile> <IN_IMAGE_H> <IN_IMAGE_W> <namefile(optional)>
```

- Load the converted ONNX file to do inference (see sections 3 and 4).
- Load the converted TensorRT engine file to do inference (see section 5).
There are two inference outputs:
- One is the locations of the bounding boxes, of shape [batch, num_boxes, 1, 4], which represents x1, y1, x2, y2 of each bounding box.
- The other is the scores of the bounding boxes, of shape [batch, num_boxes, num_classes], indicating the scores of all classes for each bounding box.

Until now, a small piece of post-processing, including NMS, is still required. We are trying to minimize the time and complexity of post-processing.
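For reference, a minimal post-processing sketch over these two outputs (an illustration only, not the project's actual code in tool/utils.py, which may differ, e.g. by running NMS per class): threshold on the best class score, then apply a plain IoU-based NMS.

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.6):
    # boxes: [N, 4] as x1, y1, x2, y2; returns indices of kept boxes.
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # Intersection of the current best box with the remaining boxes.
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = ((boxes[order[1:], 2] - boxes[order[1:], 0])
                 * (boxes[order[1:], 3] - boxes[order[1:], 1]))
        iou = inter / (area_i + areas - inter)
        order = order[1:][iou <= iou_thresh]
    return keep

def postprocess(boxes, confs, conf_thresh=0.4, nms_thresh=0.6):
    # boxes: [batch, num_boxes, 1, 4]; confs: [batch, num_boxes, num_classes].
    detections = []
    for b in range(boxes.shape[0]):
        box = boxes[b, :, 0, :]
        scores = confs[b].max(axis=1)      # best class score per box
        cls_ids = confs[b].argmax(axis=1)  # best class id per box
        mask = scores > conf_thresh
        box, scores, cls_ids = box[mask], scores[mask], cls_ids[mask]
        keep = nms(box, scores, nms_thresh)
        detections.append([(box[k], float(scores[k]), int(cls_ids[k])) for k in keep])
    return detections
```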
- This script converts the official pretrained darknet model into ONNX.
- Recommended Pytorch versions:
    - Pytorch 1.4.0 for TensorRT 7.0 and higher
    - Pytorch 1.5.0 and 1.6.0 for TensorRT 7.1.2 and higher
- Install onnxruntime:

```sh
pip install onnxruntime
```
- Run the python script to generate the ONNX model and run the demo:

```sh
python demo_darknet2onnx.py <cfgFile> <weightFile> <imageFile> <batchSize>
```

- A positive batch size will generate an ONNX model with a static batch size; otherwise, the batch size will be dynamic.
- A dynamic batch size will generate only one ONNX model.
- A static batch size will generate 2 ONNX models; one is for running the demo (batch_size=1).
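With onnxruntime installed, you can also consume the exported model directly. A sketch (the file name is a placeholder, and the two outputs are assumed to match the shapes described in section 2; verify with a tool like Netron):

```python
import numpy as np
import onnxruntime as ort

# Placeholder file name; use the ONNX file the script actually produced.
session = ort.InferenceSession("yolov4_1_3_416_416_static.onnx")
input_name = session.get_inputs()[0].name

# Dummy NCHW float input; in practice, a resized and normalized image batch.
img = np.random.rand(1, 3, 416, 416).astype(np.float32)
boxes, confs = session.run(None, {input_name: img})
print(boxes.shape, confs.shape)  # expected: (1, num_boxes, 1, 4) and (1, num_boxes, num_classes)
```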
- You can convert your trained pytorch model into ONNX using this script.
- Recommended Pytorch versions:
    - Pytorch 1.4.0 for TensorRT 7.0 and higher
    - Pytorch 1.5.0 and 1.6.0 for TensorRT 7.1.2 and higher
- Install onnxruntime:

```sh
pip install onnxruntime
```
- Run the python script to generate the ONNX model and run the demo:

```sh
python demo_pytorch2onnx.py <weight_file> <image_path> <batch_size> <n_classes> <IN_IMAGE_H> <IN_IMAGE_W>
```

For example:

```sh
python demo_pytorch2onnx.py yolov4.pth dog.jpg 8 80 416 416
```

- A positive batch size will generate an ONNX model with a static batch size; otherwise, the batch size will be dynamic (see the sketch after this list).
- A dynamic batch size will generate only one ONNX model.
- A static batch size will generate 2 ONNX models; one is for running the demo (batch_size=1).
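A dynamic batch dimension in an ONNX export is typically expressed via torch.onnx.export's dynamic_axes argument. The sketch below illustrates the idea; it is not this script's exact code, and the tensor names are assumptions:

```python
import torch

def export_with_dynamic_batch(model: torch.nn.Module, onnx_path: str,
                              h: int = 416, w: int = 416) -> None:
    # Trace with a batch of 1; dynamic_axes then marks dim 0 as variable.
    dummy = torch.randn(1, 3, h, w)
    torch.onnx.export(
        model, dummy, onnx_path,
        input_names=["input"],
        output_names=["boxes", "confs"],  # assumed output names
        dynamic_axes={"input": {0: "batch_size"},
                      "boxes": {0: "batch_size"},
                      "confs": {0: "batch_size"}},
        opset_version=11,
    )
```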
- Recommended TensorRT versions: 7.0, 7.1
- Run the following command to convert the YOLOv4 ONNX model into a TensorRT engine:

```sh
trtexec --onnx=<onnx_file> --explicitBatch --saveEngine=<tensorRT_engine_file> --workspace=<size_in_megabytes> --fp16
```

- Note: if you want to use int8 mode in the conversion, extra int8 calibration is needed.
- Run the following command to convert the YOLOv4 ONNX model into a TensorRT engine with dynamic batch sizes:

```sh
trtexec --onnx=<onnx_file> \
    --minShapes=input:<shape_of_min_batch> --optShapes=input:<shape_of_opt_batch> --maxShapes=input:<shape_of_max_batch> \
    --workspace=<size_in_megabytes> --saveEngine=<engine_file> --fp16
```

- For example:

```sh
trtexec --onnx=yolov4_-1_3_320_512_dynamic.onnx \
    --minShapes=input:1x3x320x512 --optShapes=input:4x3x320x512 --maxShapes=input:8x3x320x512 \
    --workspace=2048 --saveEngine=yolov4_-1_3_320_512_dynamic.engine --fp16
```
Run the demo:

```sh
python demo_trt.py <tensorRT_engine_file> <input_image> <input_H> <input_W>
```

- This demo works only when the batch size is dynamic (1 should be within the dynamic range) or when batchSize=1, but you can update it slightly for other dynamic or static batch sizes (a minimal engine-loading sketch follows these notes).
- Note 1: input_H and input_W should agree with the input size in the original ONNX file.
- Note 2: extra NMS operations are needed on the TensorRT output. This demo uses the Python NMS code from tool/utils.py.
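demo_trt.py contains the full pipeline; for orientation, deserializing an engine with the TensorRT 7 Python API looks roughly like this sketch (buffer allocation and the actual CUDA inference are omitted, and the engine file name is a placeholder):

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def load_engine(engine_path: str):
    # Deserialize a previously built engine from disk.
    with open(engine_path, "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
        return runtime.deserialize_cuda_engine(f.read())

engine = load_engine("yolov4_-1_3_320_512_dynamic.engine")
context = engine.create_execution_context()
# For an engine built with dynamic shapes, fix the input shape before inference.
context.set_binding_shape(0, (1, 3, 320, 512))
```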
- First: conversion to ONNX (see the sections above); tensorflow >= 2.0 is required.
1. Thanks: https://github.com/onnx/onnx-tensorflow
2. Run:

```sh
git clone https://github.com/onnx/onnx-tensorflow.git && cd onnx-tensorflow
pip install -e .
```

Note: errors will occur when using "pip install onnx-tf" (at least for me), so installation from source is recommended.
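After installing onnx-tensorflow from source, the conversion itself is typically a few lines; a sketch with placeholder file names:

```python
import onnx
from onnx_tf.backend import prepare

# Load the exported ONNX model (placeholder file name).
onnx_model = onnx.load("yolov4_1_3_416_416_static.onnx")

# Convert and write out a TensorFlow graph/SavedModel.
tf_rep = prepare(onnx_model)
tf_rep.export_graph("yolov4_tf")
```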
- Compile the DeepStream Nvinfer plugin:

```sh
cd DeepStream
make
```

- Build a TRT engine.

For a single batch:

```sh
trtexec --onnx=<onnx_file> --explicitBatch --saveEngine=<tensorRT_engine_file> --workspace=<size_in_megabytes> --fp16
```

For multi-batch:

```sh
trtexec --onnx=<onnx_file> --explicitBatch --shapes=input:Xx3xHxW --optShapes=input:Xx3xHxW --maxShapes=input:Xx3xHxW --minShapes=input:1x3xHxW --saveEngine=<tensorRT_engine_file> --fp16
```

Note: maxShapes must not be larger than the model's original shape.

- Write the DeepStream config file for the TRT engine.
Reference:
- https://github.com/eriklindernoren/PyTorch-YOLOv3
- https://github.com/marvis/pytorch-caffe-darknet-convert
- https://github.com/marvis/pytorch-yolo3
```
@article{yolov4,
  title={YOLOv4: Optimal Speed and Accuracy of Object Detection},
  author={Alexey Bochkovskiy and Chien-Yao Wang and Hong-Yuan Mark Liao},
  journal={arXiv},
  year={2020}
}
```