https://github.com/veritone/yolo-sod

Standalone Object Detector

Host: GitHub
URL: https://github.com/veritone/yolo-sod
Owner: veritone
License: agpl-3.0
Created: 2024-11-21T16:21:52.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2025-02-19T15:16:31.000Z (over 1 year ago)
Last Synced: 2025-02-19T16:27:55.378Z (over 1 year ago)
Language: Python
Size: 127 KB
Stars: 0
Watchers: 3
Forks: 0
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE

# yolo_sod

A wrapper around Ultralytics' YOLO library, used for object detection. Results
are output in a custom JSON format.

# Installation

Installation requires `poetry`.

To install `poetry`, use `pipx`:

``` shell
sudo apt install pipx
pipx install poetry
```

To install `yolo-sod`, from this directory run:

``` shell
poetry install
```

This will install all the requirements in their own virtual environment. To
enter the environment, from this directory run `poetry shell`.

# Exporting a model

To export a model run

``` shell
yolo-sod-export
```

from within the `poetry` environment. Without any arguments, this builds an
INT8-precision TensorRT engine for the large YOLOv11 model. It downloads the
necessary files and uses the VOC2007 dataset from `/import/datasets` for
calibration.

Model files will be saved in the current working directory.

Various configuration options can also be set. These are listed with:

``` shell
yolo-sod-export --help
```

Note that by default a 2 GiB workspace is used for TensorRT. Calibration and
engine building take a large amount of GPU memory, so setting a larger
workspace may lead to memory allocation errors. For the large YOLOv11 model, a
2 GiB workspace requires just under 8 GiB of GPU memory for calibration.
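
For orientation, a comparable export can be sketched directly with Ultralytics' Python API. This is an approximation under stated assumptions, not necessarily what `yolo-sod-export` does internally: the weight name `yolo11l.pt`, the dataset config `VOC.yaml`, and the helper names `build_export_kwargs` and `export_engine` are illustrative and do not come from this repository.

```python
def build_export_kwargs(calibration_data: str, workspace_gib: float = 2.0) -> dict:
    """Collect keyword arguments for Ultralytics' `YOLO.export` (hypothetical helper)."""
    if workspace_gib <= 0:
        raise ValueError("workspace_gib must be positive")
    return {
        "format": "engine",          # build a TensorRT engine
        "int8": True,                # INT8 quantization, which needs calibration data
        "data": calibration_data,    # dataset YAML used for calibration (assumed path)
        "workspace": workspace_gib,  # TensorRT workspace size in GiB
    }


def export_engine(
    weights: str = "yolo11l.pt",          # assumed name for the large model weights
    calibration_data: str = "VOC.yaml",   # assumed dataset config, adjust to your setup
    workspace_gib: float = 2.0,
):
    """Export the model; requires an NVIDIA GPU with TensorRT installed."""
    # Imported lazily so the argument helper can be used without a GPU stack.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.export(**build_export_kwargs(calibration_data, workspace_gib))
```

Keeping the workspace at 2 GiB in this sketch mirrors the default described above; raising it trades build-time flexibility for a higher risk of running out of GPU memory.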

# Model inference

To perform inference on a video run

``` shell
yolo-sod --input --output --model-od
```

from within the `poetry` shell. This decodes the video using NVIDIA's NVDEC
GPU decoder and performs object detection using the model. The detections are
written as JSON of the form:

``` json
{
  "label": {
    "timestamp1": [
      [[x0, y0, x1, y1], "category", confidence],
      [[x0, y0, x1, y1], "category", confidence],
      ...
    ],
    "timestamp2": [
      [[x0, y0, x1, y1], "category", confidence],
      [[x0, y0, x1, y1], "category", confidence],
      ...
    ],
    ...
  }
}
```
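
The structure above can be consumed with a few lines of Python. The following is a minimal sketch that flattens the nested mapping into records and filters by confidence. The `Detection` type, the `iter_detections` helper, and the sample timestamps, boxes, and categories are invented for illustration; the actual timestamp format and coordinate units are whatever `yolo-sod` emits.

```python
import json
from typing import Iterator, NamedTuple


class Detection(NamedTuple):
    timestamp: str
    box: tuple  # (x0, y0, x1, y1)
    category: str
    confidence: float


def iter_detections(results: dict, min_confidence: float = 0.0) -> Iterator[Detection]:
    """Yield one Detection per entry under "label", skipping low-confidence ones."""
    for timestamp, entries in results["label"].items():
        for box, category, confidence in entries:
            if confidence >= min_confidence:
                yield Detection(timestamp, tuple(box), category, confidence)


# Made-up sample data in the documented shape.
sample = json.loads("""
{"label": {"0.0":  [[[10, 20, 110, 220], "person", 0.91],
                    [[300, 40, 380, 90], "dog", 0.35]],
           "0.04": [[[12, 22, 112, 222], "person", 0.88]]}}
""")

confident = list(iter_detections(sample, min_confidence=0.5))
for det in confident:
    print(det.timestamp, det.category, det.confidence)
```

For a real run, replace `sample` with `json.load(open(path))` on the file passed to `--output`.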

Supported video codecs:

- `av1`
- `h264`
- `hevc`
- `mjpeg`
- `mpeg1video`
- `mpeg2video`
- `mpeg4video`
- `vc1`
- `vp8`
- `vp9`
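
Before starting a long inference run, it can help to check a file's codec against this list. The sketch below assumes `ffprobe` is installed, which this README does not require, and assumes the names above match the codec names `ffprobe` reports, which may not hold for every entry. `is_supported` and `probe_codec` are hypothetical helpers, not part of `yolo-sod`.

```python
import subprocess

# Codec names copied from the list above.
SUPPORTED_CODECS = frozenset({
    "av1", "h264", "hevc", "mjpeg", "mpeg1video",
    "mpeg2video", "mpeg4video", "vc1", "vp8", "vp9",
})


def is_supported(codec_name: str) -> bool:
    """Return True if the codec name appears in the supported list."""
    return codec_name.strip().lower() in SUPPORTED_CODECS


def probe_codec(path: str) -> str:
    """Return the codec name of the first video stream, as reported by ffprobe."""
    result = subprocess.run(
        [
            "ffprobe", "-v", "error",
            "-select_streams", "v:0",
            "-show_entries", "stream=codec_name",
            "-of", "default=noprint_wrappers=1:nokey=1",
            path,
        ],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

A typical use would be `is_supported(probe_codec("video.mp4"))`, where `video.mp4` stands in for your input file.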

Additional inference options can be found by running

``` shell
yolo-sod --help
```

# Building Packages

To build packages (source and wheel) run

``` shell
poetry build
```

from within this directory. The packages will be produced in the `./dist`
directory.
