https://github.com/veritone/yolo-sod
Standalone Object Detector
- Host: GitHub
- URL: https://github.com/veritone/yolo-sod
- Owner: veritone
- License: agpl-3.0
- Created: 2024-11-21T16:21:52.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-02-19T15:16:31.000Z (over 1 year ago)
- Last Synced: 2025-02-19T16:27:55.378Z (over 1 year ago)
- Language: Python
- Size: 127 KB
- Stars: 0
- Watchers: 3
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# yolo_sod
A wrapper around Ultralytics' YOLO library, used for object detection. Results
are output in a custom JSON format.
# Installation
Installation requires `poetry`.
To install `poetry`, use `pipx`:
``` shell
sudo apt install pipx
pipx install poetry
```
To install `yolo-sod`, from this directory run:
``` shell
poetry install
```
This will install all the requirements in their own virtual environment. To
enter the environment, from this directory run `poetry shell`.
# Exporting a model
To export a model, run
``` shell
yolo-sod-export
```
from within the `poetry` environment. Without any arguments, this builds an
INT8-precision TensorRT engine for the large YOLOv11 model. It downloads the
necessary files and uses the VOC2007 dataset from `/import/datasets` for
calibration.
Model files will be saved in the current working directory.
Various configuration options can also be set. These are listed with:
``` shell
yolo-sod-export --help
```
Note that by default a 2 GiB workspace is used for TensorRT. The calibration and
build process uses a large amount of GPU memory, so setting a larger workspace
may lead to memory allocation errors. For the large YOLOv11 model, a 2 GiB
workspace requires just under 8 GiB of GPU memory for calibration.
# Model inference
To perform inference on a video run
``` shell
yolo-sod --input <input> --output <output> --model-od <model>
```
from within the `poetry` shell. This will decode the video using NVIDIA's NVDEC
GPU decoder and perform object detection using the model. The detections are
written as JSON of the form:
``` json
{
  "label": {
    "timestamp1": [
      [[x0, y0, x1, y1], "category", confidence],
      [[x0, y0, x1, y1], "category", confidence],
      ...
    ],
    "timestamp2": [
      [[x0, y0, x1, y1], "category", confidence],
      [[x0, y0, x1, y1], "category", confidence],
      ...
    ],
    ...
  }
}
```
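As a minimal sketch of how this output could be consumed, the Python below
keeps only detections at or above a confidence threshold and counts the
remaining detections per category. It assumes the result is valid JSON with the
layout shown above. The helper names and the sample values are hypothetical and
not part of `yolo-sod`, and the units of the timestamps and box coordinates are
not specified here.

``` python
import json
from collections import Counter


def filter_detections(results, min_confidence):
    """Return a new result dict keeping detections with confidence >= min_confidence."""
    filtered = {}
    for timestamp, detections in results["label"].items():
        # Each detection is [[x0, y0, x1, y1], "category", confidence].
        kept = [d for d in detections if d[2] >= min_confidence]
        if kept:  # drop timestamps that have no remaining detections
            filtered[timestamp] = kept
    return {"label": filtered}


def count_categories(results):
    """Count detections per category across all timestamps."""
    counts = Counter()
    for detections in results["label"].values():
        for _box, category, _confidence in detections:
            counts[category] += 1
    return counts


# Hypothetical sample in the documented layout (values are made up).
sample = json.loads("""
{"label": {"0.0": [[[10, 20, 110, 220], "person", 0.91],
                   [[300, 40, 380, 90], "car", 0.35]],
           "0.5": [[[12, 22, 112, 222], "person", 0.88]]}}
""")

confident = filter_detections(sample, 0.5)
print(count_categories(confident))
```

For a real result file, `json.load` on the opened output file would replace the
inline sample.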
Supported video codecs:
- `av1`
- `h264`
- `hevc`
- `mjpeg`
- `mpeg1video`
- `mpeg2video`
- `mpeg4video`
- `vc1`
- `vp8`
- `vp9`
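A pre-flight check against this list could look like the sketch below. The
helper name is hypothetical, the case-insensitive comparison is a convenience
assumption, and other tools may report a codec under a different name than the
one listed here, so the name may need mapping before the check.

``` python
# Codec names copied from the list of supported video codecs above.
SUPPORTED_CODECS = frozenset({
    "av1", "h264", "hevc", "mjpeg", "mpeg1video",
    "mpeg2video", "mpeg4video", "vc1", "vp8", "vp9",
})


def is_supported_codec(codec_name):
    """Return True if codec_name matches one of the listed codecs (case-insensitive)."""
    return codec_name.strip().lower() in SUPPORTED_CODECS


print(is_supported_codec("hevc"))
print(is_supported_codec("prores"))
```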
Additional inference options can be found by running
``` shell
yolo-sod --help
```
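To process several videos, the command can be driven from Python. The sketch
below is a hypothetical wrapper, not part of `yolo-sod`: it assumes that each
of `--input`, `--output` and `--model-od` takes a single path, that `yolo-sod`
is on the `PATH` (for example inside the `poetry` environment), and it writes
one JSON file per video, named after the video.

``` python
import subprocess
from pathlib import Path


def build_command(video, output, model):
    """Build the yolo-sod argument list (assumes each option takes one path)."""
    return [
        "yolo-sod",
        "--input", str(video),
        "--output", str(output),
        "--model-od", str(model),
    ]


def run_batch(videos, output_dir, model):
    """Run yolo-sod once per video, writing <video stem>.json into output_dir."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    for video in videos:
        output = output_dir / (Path(video).stem + ".json")
        # check=True raises CalledProcessError if yolo-sod exits with an error.
        subprocess.run(build_command(video, output, model), check=True)


# Hypothetical file names, shown only to illustrate the resulting command.
print(build_command("clip.mp4", "clip.json", "model.engine"))
```

Calling `run_batch(["a.mp4", "b.mp4"], "results", "model.engine")` would then
run the detector on each file in turn and stop at the first failure.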
# Building Packages
To build packages (source and wheel) run
``` shell
poetry build
```
from within this directory. The packages will be produced in the `./dist`
directory.