https://github.com/onepunchmonk/oxidized-vision
Compile PyTorch vision models into ultra-fast Rust binaries for edge, server, and browser deployment. (Work in progress.)
- Host: GitHub
- URL: https://github.com/onepunchmonk/oxidized-vision
- Owner: OnePunchMonk
- License: mit
- Created: 2025-10-15T15:23:51.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2025-10-15T18:23:38.000Z (8 months ago)
- Last Synced: 2026-01-14T09:33:03.250Z (5 months ago)
- Language: Python
- Homepage:
- Size: 85.9 KB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Codeowners: CODEOWNERS
# OxidizedVision
[CI](https://github.com/OnePunchMonk/Oxidized-Vision/actions/workflows/ci.yml) | [PyPI](https://pypi.org/project/oxidizedvision/) | [License: MIT](https://opensource.org/licenses/MIT)
**Compile your PyTorch models to Rust for ultra-fast, memory-safe inference.**
OxidizedVision is a toolkit that bridges Python-based model training and Rust-based deployment. It provides a pipeline to **convert**, **optimize**, **validate**, **benchmark**, **profile**, and **package** your models, taking a trained PyTorch `nn.Module` to a deployable Rust binary, REST API, or WebAssembly module.
---
## Key Features
| Feature | Description |
|---|---|
| **Model Conversion** | PyTorch → TorchScript → ONNX with a single command |
| **Optimization** | ONNX graph simplification, constant folding, INT8/FP16 quantization |
| **Validation** | Numerical consistency checks (MAE, RMSE, Cosine Similarity) across formats |
| **Benchmarking** | Latency (avg, p50, p95, p99), throughput, and memory profiling |
| **Profiling** | Parameter count, model size, per-layer breakdown |
| **Packaging** | Auto-generate a deployable Rust crate (server or CLI) |
| **Multi-Backend** | `tract` (pure Rust), `tch` (LibTorch), `tensorrt` (NVIDIA GPU) |
| **WASM Support** | Run models in the browser via WebAssembly |
| **Model Registry** | Track all converted models and their metadata locally |
| **Rich CLI** | Beautiful terminal output with progress indicators and tables |
| **Multi-Model Server** | Serve multiple models from a single Rust server instance |
| **Dynamic Batching** | Configurable request batching for efficient inference |
| **Structured Logging** | `tracing` (Rust) + Rich/JSON (Python) for full observability |
| **Metrics Endpoint** | `/metrics` for monitoring request counts and server health |
---
## Architecture
```
┌─────────────────────────────────────────────────────────┐
│                   Python Client (CLI)                   │
│  convert │ validate │ benchmark │ optimize │ profile    │
│  package │ serve │ list │ info                          │
│                                                         │
│  Global: --verbose │ --json-log                         │
└────────────────────────┬────────────────────────────────┘
                         │ Generates
┌────────────────────────▼────────────────────────────────┐
│                      Rust Runtimes                      │
│ ┌─────────────┐ ┌──────────────┐ ┌──────────────────┐   │
│ │ runner_tch  │ │ runner_tract │ │ runner_tensorrt  │   │
│ │ (LibTorch)  │ │ (Pure Rust)  │ │ (GPU / TensorRT) │   │
│ └──────┬──────┘ └──────┬───────┘ └────────┬─────────┘   │
│        │    All implement Runner trait    │             │
│ ┌──────▼───────────────▼──────────────────▼─────────┐   │
│ │            runner_core (Shared Trait)             │   │
│ │            + tracing structured logging           │   │
│ └───────────────────────────────────────────────────┘   │
└────────────────────────┬────────────────────────────────┘
                         │ Deploys to
        ┌────────────────┼────────────────┐
        ▼                ▼                ▼
  Native Binary   REST API Server    WASM Module
                   (multi-model,
                    batching,
                    /metrics)
```
---
## Quickstart
### 1. Install
```bash
# From PyPI
pip install oxidizedvision
# From source (development)
pip install -e "./python_client[dev]"
```
### 2. Create a Config
```yaml
# config.yml
model:
  path: examples/example_unet/model.py
  class_name: UNet
  input_shape: [1, 3, 256, 256]
export:
  output_dir: out
  model_name: unet
validate:
  tolerance_mae: 1e-4
  tolerance_cos_sim: 0.999
benchmark:
  iters: 100
  device: cpu
```
### 3. Run the Pipeline
```bash
# Convert PyTorch → TorchScript + ONNX
oxidizedvision convert config.yml
# Validate numerical consistency
oxidizedvision validate config.yml
# Optimize the ONNX model
oxidizedvision optimize out/unet.onnx --quantize int8
# Benchmark performance
oxidizedvision benchmark out/unet.pt --runners torchscript,tract
# Profile the model
oxidizedvision profile config.yml
# Package into a Rust crate
oxidizedvision package out/unet.onnx --runner tract --template server
# List registered models
oxidizedvision list
```
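The `validate` step compares outputs across formats using MAE, RMSE, and cosine similarity, checked against the `tolerance_mae` and `tolerance_cos_sim` values from the config. The following is a minimal pure-Python sketch of those three metrics on flattened outputs. The function names `compare_outputs` and `within_tolerance` and the sample values are illustrative assumptions, not the toolkit's actual API.

```python
import math


def compare_outputs(reference, candidate):
    """Return MAE, RMSE, and cosine similarity for two equal-length flat sequences."""
    if len(reference) != len(candidate) or not reference:
        raise ValueError("outputs must be non-empty and the same length")
    diffs = [r - c for r, c in zip(reference, candidate)]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    dot = sum(r * c for r, c in zip(reference, candidate))
    norm = math.sqrt(sum(r * r for r in reference)) * math.sqrt(sum(c * c for c in candidate))
    # Cosine similarity is undefined when either vector is all zeros.
    cos_sim = dot / norm if norm else float("nan")
    return {"mae": mae, "rmse": rmse, "cos_sim": cos_sim}


def within_tolerance(metrics, tolerance_mae=1e-4, tolerance_cos_sim=0.999):
    """Defaults mirror the example config; how the real CLI combines them is an assumption."""
    return metrics["mae"] <= tolerance_mae and metrics["cos_sim"] >= tolerance_cos_sim


# Hypothetical flattened outputs from the PyTorch model and its ONNX export.
pytorch_out = [0.12, -0.40, 0.88, 0.05]
onnx_out = [0.12001, -0.40002, 0.87999, 0.05001]

metrics = compare_outputs(pytorch_out, onnx_out)
print(metrics, within_tolerance(metrics))
```

Checking both an absolute-error bound and a direction-based bound catches different failure modes: MAE flags uniform drift, while cosine similarity flags outputs whose overall shape has changed.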
### 4. Debug with Structured Logging
```bash
# Verbose mode (DEBUG level)
oxidizedvision --verbose convert config.yml
# JSON log output (for CI / log aggregation)
oxidizedvision --json-log convert config.yml
```
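To illustrate what the two flags change, here is a small sketch using only Python's standard `logging` module: `verbose` lowers the level to DEBUG, and `json_log` switches the formatter to one JSON object per line. The project itself uses Rich/JSON logging, so `JsonLineFormatter`, `make_logger`, and the JSON field names here are assumptions for illustration only.

```python
import io
import json
import logging


class JsonLineFormatter(logging.Formatter):
    """Render each record as a single JSON object (field names are assumed)."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)


def make_logger(name, stream, verbose=False, json_log=False):
    """Build a logger whose level and format follow the two global flags."""
    logger = logging.getLogger(name)
    logger.handlers.clear()
    logger.propagate = False
    logger.setLevel(logging.DEBUG if verbose else logging.INFO)
    handler = logging.StreamHandler(stream)
    if json_log:
        handler.setFormatter(JsonLineFormatter())
    else:
        handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = make_logger("oxidizedvision.demo", buffer, verbose=True, json_log=True)
log.debug("tracing model")
log.info("export finished")
print(buffer.getvalue(), end="")
```

One JSON object per line keeps the output easy to ingest with log aggregators, which is the use case the `--json-log` flag targets.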
---
## CLI Reference
| Command | Description | Example |
|---|---|---|
| `convert` | Convert PyTorch → TorchScript + ONNX | `oxidizedvision convert config.yml` |
| `validate` | Check numerical consistency | `oxidizedvision validate config.yml --num-tests 5` |
| `benchmark` | Measure inference performance | `oxidizedvision benchmark out/model.pt --runners torchscript,tract` |
| `optimize` | Optimize an ONNX model | `oxidizedvision optimize out/model.onnx --quantize fp16` |
| `profile` | Analyze model parameters and layers | `oxidizedvision profile config.yml` |
| `package` | Generate deployable Rust crate | `oxidizedvision package out/model.onnx --template server` |
| `serve` | Start inference server | `oxidizedvision serve ./binary --port 8080` |
| `list` | List registered models | `oxidizedvision list` |
| `info` | Detailed model information | `oxidizedvision info unet` |
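The `benchmark` command reports average latency, the p50/p95/p99 percentiles, and throughput. As a rough sketch of how such numbers can be derived from per-call timings, the snippet below times a stand-in workload with `time.perf_counter` and summarizes it with nearest-rank percentiles. The helper names, the warmup count, and the percentile method are assumptions; the toolkit's own implementation may differ.

```python
import math
import statistics
import time


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending, non-empty list."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(latencies_ms):
    """Aggregate per-call latencies (milliseconds) into summary statistics."""
    ordered = sorted(latencies_ms)
    avg = statistics.fmean(ordered)
    return {
        "avg_ms": avg,
        "p50_ms": percentile(ordered, 50),
        "p95_ms": percentile(ordered, 95),
        "p99_ms": percentile(ordered, 99),
        # Single-stream throughput: calls per second at the average latency.
        "throughput_per_s": 1000.0 / avg if avg > 0 else float("inf"),
    }


def benchmark(fn, iters=100, warmup=5):
    """Time `fn` for `iters` calls after a few untimed warmup calls."""
    for _ in range(warmup):
        fn()
    latencies = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return summarize(latencies)


# Stand-in workload; a real run would call the model's inference function.
stats = benchmark(lambda: sum(i * i for i in range(10_000)), iters=100)
print(stats)
```

Tail percentiles matter for serving because a low average can hide occasional slow calls that p95 and p99 expose.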
### Global Options
| Flag | Description |
|---|---|
| `--verbose` / `-v` | Enable DEBUG-level logging |
| `--json-log` | Emit logs as JSON lines (for CI / production) |
---
## Rust Runtimes
### Shared Runner Trait
All backends implement a common `Runner` trait:
```rust
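pub trait Runner: Send + Sync {
    /// Build a runner from its configuration.
    fn from_config(config: &RunnerConfig) -> Result<Self> where Self: Sized;
    /// Run inference on a single input tensor.
    // The `f32` element type is an assumption; see `runner_core` for the exact signature.
    fn run(&self, input: &ArrayD<f32>) -> Result<ArrayD<f32>>;
    /// Describe the loaded model.
    fn info(&self) -> ModelInfo;
}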
```
### Available Backends
| Backend | Model Format | GPU | WASM | Dependencies |
|---|---|---|---|---|
| `runner_tract` | ONNX | ❌ | ✅ | None (pure Rust) |
| `runner_tch` | TorchScript | ✅ | ❌ | LibTorch |
| `runner_tensorrt` | ONNX → Engine | ✅ | ❌ | TensorRT SDK |
---
## Inference Server
The built-in `image_server` example provides a production-ready REST API:
```bash
# Single model
cargo run -p image_server -- --model model.onnx --port 8080
# Multi-model (serve multiple models simultaneously)
cargo run -p image_server -- \
--model segmenter=models/seg.onnx \
--model classifier=models/cls.onnx \
--port 8080
# With dynamic batching
cargo run -p image_server -- \
--model model.onnx \
--max-batch-size 8 \
--max-wait-ms 50
# JSON structured logs
cargo run -p image_server -- --model model.onnx --log-format json
```
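The `--max-batch-size` and `--max-wait-ms` flags describe a common dynamic batching policy: wait for a first request, then keep collecting until the batch is full or the wait budget runs out. The server itself is written in Rust; the Python sketch below only illustrates that policy with a standard `queue.Queue`, and `collect_batch` is a hypothetical helper rather than part of the project.

```python
import queue
import time


def collect_batch(requests, max_batch_size=8, max_wait_ms=50):
    """Block for the first request, then gather more until full or the wait expires."""
    batch = [requests.get()]  # waits indefinitely for the first item
    deadline = time.monotonic() + max_wait_ms / 1000.0
    while len(batch) < max_batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            batch.append(requests.get(timeout=remaining))
        except queue.Empty:
            break  # wait budget exhausted with a partial batch
    return batch


pending = queue.Queue()
for request_id in range(10):
    pending.put(request_id)

first = collect_batch(pending)   # fills up to max_batch_size
second = collect_batch(pending)  # returns the remainder after the wait expires
print(len(first), len(second))   # prints: 8 2
```

A larger batch size improves throughput on backends that parallelize well, while the wait limit bounds the extra latency added to the first request in each batch.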
### Endpoints
| Method | Path | Description |
|---|---|---|
| `POST` | `/predict` | Inference on the default model |
| `POST` | `/predict/{model_name}` | Inference on a named model |
| `GET` | `/health` | Health check with per-model status |
| `GET` | `/metrics` | Request counts, error counts, batch status |
| `GET` | `/models` | List all loaded models |
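As a sketch of how a client might address these routes, the snippet below builds `urllib.request.Request` objects for the prediction and status endpoints without sending them. The request body format is not documented here, so the `application/octet-stream` content type, the placeholder payload, and the helper names are all assumptions.

```python
from urllib.parse import quote
from urllib.request import Request

STATUS_ENDPOINTS = ("health", "metrics", "models")


def predict_request(base_url, payload, model_name=None):
    """Build a POST for /predict, or /predict/{model_name} when a name is given."""
    path = "/predict" if model_name is None else "/predict/" + quote(model_name, safe="")
    return Request(
        base_url.rstrip("/") + path,
        data=payload,
        method="POST",
        # Assumed content type; check the server for the expected body format.
        headers={"Content-Type": "application/octet-stream"},
    )


def status_request(base_url, endpoint):
    """Build a GET for one of the read-only status endpoints."""
    if endpoint not in STATUS_ENDPOINTS:
        raise ValueError(f"unknown endpoint: {endpoint}")
    return Request(base_url.rstrip("/") + "/" + endpoint, method="GET")


req = predict_request("http://localhost:8080", b"<image bytes>", model_name="segmenter")
print(req.get_method(), req.full_url)
```

Passing a request to `urllib.request.urlopen` would send it to a running server; building it separately keeps the routing logic easy to check offline.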
---
## Project Structure
```
Oxidized-Vision/
├── python_client/                # Python CLI & pipeline
│   ├── oxidizedvision/
│   │   ├── cli.py                # Typer CLI entry point
│   │   ├── config.py             # Pydantic config models
│   │   ├── convert.py            # Model conversion
│   │   ├── validate.py           # Numerical validation
│   │   ├── benchmark.py          # Performance measurement
│   │   ├── optimize.py           # ONNX optimization
│   │   ├── profile.py            # Model profiling
│   │   ├── registry.py           # Model registry
│   │   └── logging.py            # Structured logging (Rich / JSON)
│   └── tests/                    # pytest test suite
├── rust_runtime/                 # Rust inference runtimes
│   ├── crates/
│   │   ├── runner_core/          # Shared Runner trait + tracing
│   │   ├── runner_tch/           # LibTorch backend
│   │   ├── runner_tract/         # tract (ONNX) backend
│   │   └── runner_tensorrt/      # TensorRT backend
│   └── examples/
│       ├── image_server/         # Multi-model REST API with batching
│       ├── denoiser_cli/         # Image denoising CLI
│       └── wasm_frontend/        # Browser inference demo
├── tools/                        # Standalone scripts
├── benchmarks/                   # Benchmark infrastructure
├── examples/                     # User-facing examples
│   └── example_unet/             # Complete UNet example
├── docs/                         # Architecture docs
└── .github/workflows/            # CI/CD + PyPI auto-deploy
```
---
## Testing
```bash
# Python tests
pytest python_client/tests/ -v --cov=oxidizedvision
# Rust tests
cargo test --workspace
```
### Pre-commit Hooks
```bash
pip install pre-commit
pre-commit install
pre-commit run --all-files
```
---
## Publishing to PyPI
Releases are automatically published to PyPI when a GitHub Release is created with a `v*` tag (e.g., `v1.0.2`). See [`.github/workflows/publish.yml`](.github/workflows/publish.yml) for details.
To publish manually:
```bash
pip install build twine
python -m build
twine upload dist/*
```
---
## Contributing
See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup, testing instructions, and PR guidelines.
---
## License
MIT License; see [LICENSE](LICENSE) for details.