Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/BobLd/YOLOv3MLNet
Use the YOLO v3 (ONNX) model for object detection in C# using ML.Net
- Host: GitHub
- URL: https://github.com/BobLd/YOLOv3MLNet
- Owner: BobLd
- License: mit
- Created: 2020-10-21T14:59:42.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2021-07-17T14:05:23.000Z (over 3 years ago)
- Last Synced: 2024-11-08T05:02:51.228Z (4 months ago)
- Topics: computer-vision, csharp, dotnet, machine-learning, ml, ml-net, neural-network, object-detection, onnx, onnx-torch, python, yolo, yolov3
- Language: C#
- Homepage:
- Size: 8.09 MB
- Stars: 20
- Watchers: 3
- Forks: 8
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-yolo-object-detection - BobLd/YOLOv3MLNet
README
**Another case study, based on [this](https://github.com/onnx/models/tree/master/vision/object_detection_segmentation/yolov3) YOLO v3 model, is available [here](https://github.com/BobLd/YOLOv3MLNet/tree/master/YOLOV3MLNetSO).**
**See [here](https://github.com/BobLd/YOLOv4MLNet) for YOLO v4 use.**
# YOLO v3 in ML.Net
Use the YOLO v3 algorithm for object detection in C# using ML.Net. We start with a Torch model, convert it to the ONNX format, and then use it in ML.Net. This is a case study on a YOLO model trained for document layout analysis. The model can be found in the following Medium article: [Object Detection — Document Layout Analysis Using Monk AI](https://medium.com/towards-artificial-intelligence/object-detection-document-layout-analysis-using-monk-object-detection-toolkit-6c57200bde5).
## Main differences
- The ONNX conversion removes one feature, the *objectness score* p<sub>c</sub>. The original model has (5 + classes) features for each bounding box; the ONNX model has (4 + classes) features per bounding box. We will use the class probability as a proxy for the *objectness score* when performing the Non-maximum Suppression (NMS) step. This is a known issue; more information is available [here](https://github.com/ultralytics/yolov3/issues/750).
- Image resizing is not optimised and will always yield a 416x416 image. This is not the case in the original model (see this issue: [RECTANGULAR INFERENCE](https://github.com/ultralytics/yolov3/issues/232)).
# Export to ONNX in Python
This step is based on the article [Object Detection — Document Layout Analysis Using Monk AI](https://medium.com/towards-artificial-intelligence/object-detection-document-layout-analysis-using-monk-object-detection-toolkit-6c57200bde5).
## Load the model
```python
import os
import sys
from IPython.display import Image
sys.path.append("../Monk_Object_Detection/7_yolov3/lib")
from infer_detector import Infer

gtf = Infer()
f = open("dla_yolov3/classes.txt")
class_list = f.readlines()
f.close()

model_name = "yolov3"
weights = "dla_yolov3/dla_yolov3.pt"
gtf.Model(model_name, class_list, weights, use_gpu=False, input_size=(416, 416))
```
## Test the model
```python
img_path = "test_square.jpg"
gtf.Predict(img_path, conf_thres=0.2, iou_thres=0.5)
Image(filename='output/test_square.jpg')
```
## Export the model
You need to set `ONNX_EXPORT = True` in `...\Monk_Object_Detection\7_yolov3\lib\models.py` before loading the model. We name the input layer `image` and the two output layers `classes` and `bboxes`. This is not required, but it improves clarity.
```python
import torch
import torchvision.models as models

dummy_input = torch.randn(1, 3, 416, 416) # Create the right input shape (e.g. for an image)
dummy_input = torch.nn.Sigmoid()(dummy_input) # limit between 0 and 1 (superfluous?)
torch.onnx.export(gtf.system_dict["local"]["model"],
dummy_input,
"dla_yolov3.onnx",
input_names=["image"],
output_names=["classes", "bboxes"],
opset_version=9)
```
# Check exported model with Netron
The ONNX model can be viewed in [Netron](https://www.electronjs.org/apps/netron). Our model looks like this:
- The input layer size is [1 x 3 x 416 x 416]. This corresponds to 1 (batch size) x 3 (colour channels) x 416 (pixels height) x 416 (pixels width) (more info about fixed batch size [here](https://github.com/ultralytics/yolov3/issues/1030)).
As per this [article](https://medium.com/analytics-vidhya/yolo-v3-theory-explained-33100f6d193):
> For an image of size 416 x 416, YOLO predicts ((52 x 52) + (26 x 26) + (13 x 13)) x 3 = 10,647 bounding boxes.
- The `bboxes` output layer is of size [10,647 x 4]. This corresponds to 10,647 bounding boxes x 4 bounding box coordinates (x, y, h, w).
- The `classes` output layer is of size [10,647 x 18]. This corresponds to 10,647 bounding boxes x 18 classes (this model has only 18 classes).

Hence, each bounding box has (4 + classes) = 22 features. The total number of predictions in this model is 22 x 10,647.
**NB**: The ONNX conversion removes one feature, the *objectness score* p<sub>c</sub>. The original model has (5 + classes) features for each bounding box. We will use the class probability as a proxy for the *objectness score*.

More information can be found in this article: [YOLO v3 theory explained](https://medium.com/analytics-vidhya/yolo-v3-theory-explained-33100f6d193)
# Load model in C#
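The original C# listing is not reproduced here, so below is a minimal sketch of how the exported `dla_yolov3.onnx` model could be loaded with ML.Net's `ApplyOnnxModel` transform. The column names `image`, `classes` and `bboxes` are the layer names chosen at export time, and the [10,647 x 4] and [10,647 x 18] shapes are the ones reported by Netron; the `ImageInput` and `YoloPrediction` classes, the file paths and the 1/255 pixel scaling are illustrative assumptions rather than the repository's actual code.

```csharp
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical input/output types (not the repository's actual classes).
public class ImageInput
{
    public string ImagePath { get; set; }
}

public class YoloPrediction
{
    // 10,647 bounding boxes x 4 coordinates, as reported by Netron.
    [ColumnName("bboxes"), VectorType(10647, 4)]
    public float[] Bboxes { get; set; }

    // 10,647 bounding boxes x 18 class probabilities.
    [ColumnName("classes"), VectorType(10647, 18)]
    public float[] Classes { get; set; }
}

public static class ModelLoader
{
    public static ITransformer Load(MLContext mlContext)
    {
        // Load the image from its path, resize it to the fixed 416x416 input,
        // extract the pixels (assumed scaled to [0, 1]) and run the ONNX model.
        var pipeline = mlContext.Transforms.LoadImages(
                outputColumnName: "image", imageFolder: "", inputColumnName: nameof(ImageInput.ImagePath))
            .Append(mlContext.Transforms.ResizeImages(
                outputColumnName: "image", imageWidth: 416, imageHeight: 416))
            .Append(mlContext.Transforms.ExtractPixels(
                outputColumnName: "image", scaleImage: 1f / 255f))
            .Append(mlContext.Transforms.ApplyOnnxModel(
                outputColumnNames: new[] { "classes", "bboxes" },
                inputColumnNames: new[] { "image" },
                modelFile: "dla_yolov3.onnx"));

        // The ONNX transform needs no training; fitting on an empty list just builds the transformer.
        return pipeline.Fit(mlContext.Data.LoadFromEnumerable(new List<ImageInput>()));
    }
}
```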
# Predict in C#
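Again only a sketch: a `PredictionEngine` built from the transformer above scores one image at a time, and the post-processing (maximum class probability used as a proxy for the objectness score, confidence threshold of 0.2 and an IoU threshold of 0.5 as in the Python example) is only outlined in comments. `ModelLoader`, `ImageInput` and `YoloPrediction` are the hypothetical types defined in the previous section.

```csharp
using System;
using System.Linq;
using Microsoft.ML;

public static class Program
{
    public static void Main()
    {
        var mlContext = new MLContext();
        var model = ModelLoader.Load(mlContext);

        // CreatePredictionEngine is convenient for single images; use Transform for batches.
        var engine = mlContext.Model.CreatePredictionEngine<ImageInput, YoloPrediction>(model);
        var prediction = engine.Predict(new ImageInput { ImagePath = "test_square.jpg" });

        const int boxCount = 10647;
        const int classCount = 18;

        for (int i = 0; i < boxCount; i++)
        {
            // bboxes is flattened [10,647 x 4]: 4 coordinates per bounding box.
            var box = prediction.Bboxes.Skip(i * 4).Take(4).ToArray();

            // classes is flattened [10,647 x 18]: the best class probability acts as
            // the objectness proxy described above.
            var scores = prediction.Classes.Skip(i * classCount).Take(classCount).ToArray();
            float best = scores.Max();
            int bestClass = Array.IndexOf(scores, best);

            if (best < 0.2f) // confidence threshold, same value as in the Python example
                continue;

            Console.WriteLine($"class {bestClass}, score {best:0.00}, box [{string.Join(", ", box)}]");

            // A full implementation would now apply Non-maximum Suppression
            // (IoU threshold ~0.5) to discard overlapping detections.
        }
    }
}
```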
# Resources
- https://medium.com/towards-artificial-intelligence/object-detection-document-layout-analysis-using-monk-object-detection-toolkit-6c57200bde5
- https://medium.com/analytics-vidhya/yolo-v3-theory-explained-33100f6d193
- https://towardsdatascience.com/non-maximum-suppression-nms-93ce178e177c
- https://michhar.github.io/convert-pytorch-onnx/