https://github.com/onepunchmonk/oxidized-vision

Compile PyTorch vision models into ultra-fast Rust binaries for edge, server, and browser deployment. (Work in progress.)

# 🚀 OxidizedVision

[![CI](https://github.com/OnePunchMonk/Oxidized-Vision/actions/workflows/ci.yml/badge.svg)](https://github.com/OnePunchMonk/Oxidized-Vision/actions/workflows/ci.yml)
[![PyPI](https://img.shields.io/pypi/v/oxidizedvision)](https://pypi.org/project/oxidizedvision/)
[![Python](https://img.shields.io/pypi/pyversions/oxidizedvision)](https://pypi.org/project/oxidizedvision/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

**Compile your PyTorch models to Rust for ultra-fast, memory-safe inference.**

OxidizedVision is a production-grade toolkit that bridges Python-based model training and Rust-based deployment. It provides a pipeline to **convert**, **optimize**, **validate**, **benchmark**, **profile**, and **package** your models, taking a trained PyTorch `nn.Module` to a deployable Rust binary, REST API, or WebAssembly module.

---

## ✨ Key Features

| Feature | Description |
|---|---|
| 🔄 **Model Conversion** | PyTorch → TorchScript → ONNX with a single command |
| ⚡ **Optimization** | ONNX graph simplification, constant folding, INT8/FP16 quantization |
| ✅ **Validation** | Numerical consistency checks (MAE, RMSE, Cosine Similarity) across formats |
| 📊 **Benchmarking** | Latency (avg, p50, p95, p99), throughput, and memory profiling |
| 🔬 **Profiling** | Parameter count, model size, per-layer breakdown |
| 📦 **Packaging** | Auto-generate a deployable Rust crate (server or CLI) |
| 🌐 **Multi-Backend** | `tract` (pure Rust), `tch` (LibTorch), `tensorrt` (NVIDIA GPU) |
| 🧩 **WASM Support** | Run models in the browser via WebAssembly |
| 📋 **Model Registry** | Track all converted models and their metadata locally |
| 🎨 **Rich CLI** | Beautiful terminal output with progress indicators and tables |
| 🔀 **Multi-Model Server** | Serve multiple models from a single Rust server instance |
| ⏱️ **Dynamic Batching** | Configurable request batching for efficient inference |
| 📝 **Structured Logging** | `tracing` (Rust) + Rich/JSON (Python) for full observability |
| 📈 **Metrics Endpoint** | `/metrics` for monitoring request counts and server health |
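
The benchmarking row lists average, p50, p95, and p99 latency plus throughput. As a minimal sketch of how such figures can be derived from raw per-call timings, the snippet below uses nearest-rank percentiles. The function names and dictionary keys are illustrative assumptions, not OxidizedVision's API, and the tool may compute its statistics differently.

```python
import math


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending list (0 < pct <= 100)."""
    if not sorted_values:
        raise ValueError("no samples")
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize_latencies(latencies_ms):
    """Summarize per-call latencies given in milliseconds."""
    ordered = sorted(latencies_ms)
    avg = sum(ordered) / len(ordered)
    return {
        "avg_ms": avg,
        "p50_ms": percentile(ordered, 50),
        "p95_ms": percentile(ordered, 95),
        "p99_ms": percentile(ordered, 99),
        # Assumes calls run back to back with one input per call.
        "throughput_per_s": 1000.0 / avg,
    }


# Synthetic timings (1 ms .. 100 ms), used only to exercise the functions.
stats = summarize_latencies([float(ms) for ms in range(1, 101)])
```

Tail percentiles such as p95 and p99 are reported next to the average because a few slow calls can hide behind a good mean.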

---

## 🏗️ Architecture

```
┌──────────────────────────────────────────────────────────┐
│                   Python Client (CLI)                    │
│  convert │ validate │ benchmark │ optimize │ profile     │
│  package │ serve │ list │ info                           │
│                                                          │
│  Global: --verbose │ --json-log                          │
└────────────────────────────┬─────────────────────────────┘
                             │ Generates
┌────────────────────────────▼─────────────────────────────┐
│                      Rust Runtimes                       │
│  ┌─────────────┐  ┌──────────────┐  ┌─────────────────┐  │
│  │ runner_tch  │  │ runner_tract │  │ runner_tensorrt │  │
│  │ (LibTorch)  │  │ (Pure Rust)  │  │ (GPU / TensorRT)│  │
│  └──────┬──────┘  └───────┬──────┘  └────────┬────────┘  │
│         │     All implement Runner trait     │           │
│  ┌──────▼─────────────────▼──────────────────▼────────┐  │
│  │             runner_core (Shared Trait)             │  │
│  │            + tracing structured logging            │  │
│  └────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────┘
                             │ Deploys to
            ┌────────────────┼────────────────┐
            ▼                ▼                ▼
      Native Binary   REST API Server    WASM Module
                      (multi-model,
                       batching,
                       /metrics)
```

---

## ⚡ Quickstart

### 1. Install

```bash
# From PyPI
pip install oxidizedvision

# From source (development)
pip install -e "./python_client[dev]"
```

### 2. Create a Config

```yaml
# config.yml
model:
  path: examples/example_unet/model.py
  class_name: UNet
  input_shape: [1, 3, 256, 256]

export:
  output_dir: out
  model_name: unet

validate:
  tolerance_mae: 1e-4
  tolerance_cos_sim: 0.999

benchmark:
  iters: 100
  device: cpu
```
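
The `validate` section sets tolerances for mean absolute error and cosine similarity. The sketch below shows how those metrics, together with RMSE from the feature list, can be computed over two flattened output vectors and compared against the thresholds. It assumes a pass means MAE at or below `tolerance_mae` and cosine similarity at or above `tolerance_cos_sim`; the function names are hypothetical and the actual checks in OxidizedVision may differ.

```python
import math


def mae(a, b):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def rmse(a, b):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def outputs_match(reference, candidate, tolerance_mae=1e-4, tolerance_cos_sim=0.999):
    """Assumed pass rule: small MAE and high cosine similarity."""
    return (
        mae(reference, candidate) <= tolerance_mae
        and cosine_similarity(reference, candidate) >= tolerance_cos_sim
    )


# Illustrative values standing in for flattened model outputs.
reference = [0.10, 0.20, 0.30, 0.40]
close = [0.10001, 0.19999, 0.30002, 0.39998]
far = [0.40, 0.30, 0.20, 0.10]
```

Using both metrics is a reasonable design: MAE catches uniform drift in magnitude, while cosine similarity catches changes in the direction of the output vector.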

### 3. Run the Pipeline

```bash
# Convert PyTorch → TorchScript + ONNX
oxidizedvision convert config.yml

# Validate numerical consistency
oxidizedvision validate config.yml

# Optimize the ONNX model
oxidizedvision optimize out/unet.onnx --quantize int8

# Benchmark performance
oxidizedvision benchmark out/unet.pt --runners torchscript,tract

# Profile the model
oxidizedvision profile config.yml

# Package into a Rust crate
oxidizedvision package out/unet.onnx --runner tract --template server

# List registered models
oxidizedvision list
```
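
For CI or repeated runs, the same steps can be chained from a script. The sketch below builds the Quickstart commands as argument lists and runs them in order with `subprocess`. The helper names are hypothetical, and it assumes the CLI exits with a non-zero status when a step fails, which the source does not state.

```python
import subprocess


def pipeline_commands(config="config.yml", out_dir="out", name="unet"):
    """Return the Quickstart commands as argument lists, in order."""
    onnx = f"{out_dir}/{name}.onnx"
    torchscript = f"{out_dir}/{name}.pt"
    return [
        ["oxidizedvision", "convert", config],
        ["oxidizedvision", "validate", config],
        ["oxidizedvision", "optimize", onnx, "--quantize", "int8"],
        ["oxidizedvision", "benchmark", torchscript, "--runners", "torchscript,tract"],
        ["oxidizedvision", "profile", config],
        ["oxidizedvision", "package", onnx, "--runner", "tract", "--template", "server"],
    ]


def run_pipeline(commands, dry_run=True):
    """Print each step; when dry_run is False, also execute it."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(cmd, check=True)


commands = pipeline_commands()
run_pipeline(commands)  # dry run: prints the commands without executing them
```

Call `run_pipeline(commands, dry_run=False)` to execute the steps for real.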

### 4. Debug with Structured Logging

```bash
# Verbose mode (DEBUG level)
oxidizedvision --verbose convert config.yml

# JSON log output (for CI / log aggregation)
oxidizedvision --json-log convert config.yml
```
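
"JSON lines" means one JSON object per log record, one record per line, which log aggregators can parse without custom rules. As a generic illustration of the technique using Python's standard `logging` module, the sketch below emits such lines. The field names (`level`, `logger`, `message`) and the logger name are assumptions; the schema that `--json-log` actually produces is not documented here.

```python
import io
import json
import logging


class JsonLineFormatter(logging.Formatter):
    """Format each record as a single-line JSON object."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


def make_logger(stream, verbose=False):
    """Build a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("json_log_sketch")  # hypothetical logger name
    logger.handlers.clear()
    logger.propagate = False
    # Mirrors the idea behind --verbose: DEBUG when verbose, INFO otherwise.
    logger.setLevel(logging.DEBUG if verbose else logging.INFO)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonLineFormatter())
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = make_logger(buffer, verbose=False)
log.debug("hidden unless verbose")
log.info("conversion started")
lines = buffer.getvalue().splitlines()
```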

---

## 📖 CLI Reference

| Command | Description | Example |
|---|---|---|
| `convert` | Convert PyTorch → TorchScript + ONNX | `oxidizedvision convert config.yml` |
| `validate` | Check numerical consistency | `oxidizedvision validate config.yml --num-tests 5` |
| `benchmark` | Measure inference performance | `oxidizedvision benchmark out/model.pt --runners torchscript,tract` |
| `optimize` | Optimize an ONNX model | `oxidizedvision optimize out/model.onnx --quantize fp16` |
| `profile` | Analyze model parameters and layers | `oxidizedvision profile config.yml` |
| `package` | Generate deployable Rust crate | `oxidizedvision package out/model.onnx --template server` |
| `serve` | Start inference server | `oxidizedvision serve ./binary --port 8080` |
| `list` | List registered models | `oxidizedvision list` |
| `info` | Detailed model information | `oxidizedvision info unet` |
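
To make the `profile` output (parameter count, model size, per-layer breakdown) more concrete, the arithmetic behind those numbers can be sketched by hand. The layer names and shapes below are made up for illustration, and the helpers are not part of OxidizedVision; the size estimate counts weights only, at 4 bytes per FP32 parameter.

```python
def conv2d_params(in_channels, out_channels, kernel_size, bias=True):
    """Parameters of a 2D convolution with a square kernel."""
    weights = out_channels * in_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)


def linear_params(in_features, out_features, bias=True):
    """Parameters of a fully connected layer."""
    return in_features * out_features + (out_features if bias else 0)


def model_size_mib(param_count, bytes_per_param=4):
    """Weight storage in MiB; 4 bytes per parameter corresponds to FP32."""
    return param_count * bytes_per_param / (1024 ** 2)


# Hypothetical per-layer breakdown for a tiny network.
layers = {
    "conv1": conv2d_params(3, 64, 3),
    "conv2": conv2d_params(64, 64, 3),
    "head": linear_params(64, 10),
}
total_params = sum(layers.values())
size_fp32 = model_size_mib(total_params)
size_int8 = model_size_mib(total_params, bytes_per_param=1)
```

The same arithmetic shows why quantization shrinks a model: FP16 halves and INT8 quarters the weight storage relative to FP32.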

### Global Options

| Flag | Description |
|---|---|
| `--verbose` / `-v` | Enable DEBUG-level logging |
| `--json-log` | Emit logs as JSON lines (for CI / production) |

---

## 🦀 Rust Runtimes

### Shared Runner Trait

All backends implement a common `Runner` trait:

```rust
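pub trait Runner: Send + Sync {
    /// Build a runner from its configuration.
    fn from_config(config: &RunnerConfig) -> Result<Self> where Self: Sized;
    /// Run inference on one dynamic-dimension input tensor.
    fn run(&self, input: &ArrayD<f32>) -> Result<ArrayD<f32>>;
    /// Report metadata about the loaded model.
    fn info(&self) -> ModelInfo;
}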
```

### Available Backends

| Backend | Model Format | GPU | WASM | Dependencies |
|---|---|---|---|---|
| `runner_tract` | ONNX | ❌ | ✅ | None (pure Rust) |
| `runner_tch` | TorchScript | ✅ | ❌ | LibTorch |
| `runner_tensorrt` | ONNX → Engine | ✅ | ❌ | TensorRT SDK |
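
The table above can be read as a small decision rule: pick the backends whose model format, GPU support, and WASM support match the deployment target. The sketch below encodes the table's rows as data. The dictionary layout and the `candidate_backends` helper are illustrative assumptions, not something the project ships.

```python
# Capabilities transcribed from the backend table.
BACKENDS = {
    "runner_tract": {"format": "onnx", "gpu": False, "wasm": True},
    "runner_tch": {"format": "torchscript", "gpu": True, "wasm": False},
    "runner_tensorrt": {"format": "onnx", "gpu": True, "wasm": False},
}


def candidate_backends(model_format, need_gpu=False, need_wasm=False):
    """Return backend names that accept the format and meet the requirements."""
    return [
        name
        for name, caps in BACKENDS.items()
        if caps["format"] == model_format
        and (caps["gpu"] or not need_gpu)
        and (caps["wasm"] or not need_wasm)
    ]


onnx_any = candidate_backends("onnx")
onnx_gpu = candidate_backends("onnx", need_gpu=True)
onnx_browser = candidate_backends("onnx", need_wasm=True)
torchscript_browser = candidate_backends("torchscript", need_wasm=True)
```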

---

## 🖥️ Inference Server

The built-in `image_server` example provides a production-ready REST API:

```bash
# Single model
cargo run -p image_server -- --model model.onnx --port 8080

# Multi-model (serve multiple models simultaneously)
cargo run -p image_server -- \
  --model segmenter=models/seg.onnx \
  --model classifier=models/cls.onnx \
  --port 8080

# With dynamic batching
cargo run -p image_server -- \
  --model model.onnx \
  --max-batch-size 8 \
  --max-wait-ms 50

# JSON structured logs
cargo run -p image_server -- --model model.onnx --log-format json
```
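
Dynamic batching usually works by holding incoming requests briefly so that several can share one forward pass. The sketch below shows one common policy in Python: wait for the first request, then keep collecting until the batch is full or a time budget runs out. It assumes `--max-batch-size` and `--max-wait-ms` have these meanings; the Rust server's actual batching logic is not shown in this README and may differ.

```python
import queue
import time


def collect_batch(requests, max_batch_size=8, max_wait_ms=50):
    """Block for the first request, then gather more until the batch is
    full or max_wait_ms has passed since the first one arrived."""
    batch = [requests.get()]
    deadline = time.monotonic() + max_wait_ms / 1000.0
    while len(batch) < max_batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            batch.append(requests.get(timeout=remaining))
        except queue.Empty:
            break  # time budget used up with no further requests
    return batch


# Ten queued request ids stand in for pending inference requests.
pending = queue.Queue()
for request_id in range(10):
    pending.put(request_id)

first = collect_batch(pending)   # fills up to the size limit
second = collect_batch(pending)  # returns the rest after the wait expires
```

A larger `max_wait_ms` tends to improve throughput under load at the cost of added latency for the first request in each batch.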

### Endpoints

| Method | Path | Description |
|---|---|---|
| `POST` | `/predict` | Inference on the default model |
| `POST` | `/predict/{model_name}` | Inference on a named model |
| `GET` | `/health` | Health check with per-model status |
| `GET` | `/metrics` | Request counts, error counts, batch status |
| `GET` | `/models` | List all loaded models |
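
A client sketch for the two `POST` endpoints, using only Python's standard library. The request body format is an assumption: the README does not say whether the server expects raw image bytes, multipart data, or JSON, so the `application/octet-stream` payload and the helper names here are placeholders to adapt.

```python
import urllib.parse
import urllib.request


def predict_url(base_url, model_name=None):
    """Build /predict or /predict/{model_name} under the given base URL."""
    base = base_url.rstrip("/")
    if model_name is None:
        return f"{base}/predict"
    return f"{base}/predict/{urllib.parse.quote(model_name, safe='')}"


def build_predict_request(base_url, payload, model_name=None):
    """Prepare (but do not send) a POST request carrying the payload."""
    return urllib.request.Request(
        predict_url(base_url, model_name),
        data=payload,
        method="POST",
        # Assumed content type; check the server's expected input format.
        headers={"Content-Type": "application/octet-stream"},
    )


request = build_predict_request("http://localhost:8080", b"\x00\x01", "segmenter")
# To send it to a running server: urllib.request.urlopen(request)
```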

---

## 🗂️ Project Structure

```
Oxidized-Vision/
├── python_client/              # Python CLI & pipeline
│   ├── oxidizedvision/
│   │   ├── cli.py              # Typer CLI entry point
│   │   ├── config.py           # Pydantic config models
│   │   ├── convert.py          # Model conversion
│   │   ├── validate.py         # Numerical validation
│   │   ├── benchmark.py        # Performance measurement
│   │   ├── optimize.py         # ONNX optimization
│   │   ├── profile.py          # Model profiling
│   │   ├── registry.py         # Model registry
│   │   └── logging.py          # Structured logging (Rich / JSON)
│   └── tests/                  # pytest test suite
├── rust_runtime/               # Rust inference runtimes
│   ├── crates/
│   │   ├── runner_core/        # Shared Runner trait + tracing
│   │   ├── runner_tch/         # LibTorch backend
│   │   ├── runner_tract/       # tract (ONNX) backend
│   │   └── runner_tensorrt/    # TensorRT backend
│   └── examples/
│       ├── image_server/       # Multi-model REST API with batching
│       ├── denoiser_cli/       # Image denoising CLI
│       └── wasm_frontend/      # Browser inference demo
├── tools/                      # Standalone scripts
├── benchmarks/                 # Benchmark infrastructure
├── examples/                   # User-facing examples
│   └── example_unet/           # Complete UNet example
├── docs/                       # Architecture docs
└── .github/workflows/          # CI/CD + PyPI auto-deploy
```

---

## 🧪 Testing

```bash
# Python tests
pytest python_client/tests/ -v --cov=oxidizedvision

# Rust tests
cargo test --workspace
```

### Pre-commit Hooks

```bash
pip install pre-commit
pre-commit install
pre-commit run --all-files
```

---

## 📦 Publishing to PyPI

Releases are automatically published to PyPI when a GitHub Release is created with a `v*` tag (e.g., `v1.0.2`). See [`.github/workflows/publish.yml`](.github/workflows/publish.yml) for details.

To publish manually:

```bash
pip install build twine
python -m build
twine upload dist/*
```

---

## 🤝 Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup, testing instructions, and PR guidelines.

---

## 📄 License

MIT License. See [LICENSE](LICENSE) for details.