# usls
**usls** is a Rust library integrated with **ONNXRuntime**, offering a suite of advanced models for **Computer Vision** and **Vision-Language** tasks, including:

- **YOLO Models**: [YOLOv5](https://github.com/ultralytics/yolov5), [YOLOv6](https://github.com/meituan/YOLOv6), [YOLOv7](https://github.com/WongKinYiu/yolov7), [YOLOv8](https://github.com/ultralytics/ultralytics), [YOLOv9](https://github.com/WongKinYiu/yolov9), [YOLOv10](https://github.com/THU-MIG/yolov10), [YOLO11](https://github.com/ultralytics/ultralytics), [YOLOv12](https://github.com/sunsmarterjie/yolov12)
- **SAM Models**: [SAM](https://github.com/facebookresearch/segment-anything), [SAM2](https://github.com/facebookresearch/segment-anything-2), [MobileSAM](https://github.com/ChaoningZhang/MobileSAM), [EdgeSAM](https://github.com/chongzhou96/EdgeSAM), [SAM-HQ](https://github.com/SysCV/sam-hq), [FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM)
- **Vision Models**: [RT-DETR](https://arxiv.org/abs/2304.08069), [RTMO](https://github.com/open-mmlab/mmpose/tree/main/projects/rtmo), [Depth-Anything](https://github.com/LiheYoung/Depth-Anything), [DINOv2](https://github.com/facebookresearch/dinov2), [MODNet](https://github.com/ZHKKKe/MODNet), [Sapiens](https://arxiv.org/abs/2408.12569), [DepthPro](https://github.com/apple/ml-depth-pro), [FastViT](https://github.com/apple/ml-fastvit), [BEiT](https://github.com/microsoft/unilm/tree/master/beit), [MobileOne](https://github.com/apple/ml-mobileone)
- **Vision-Language Models**: [CLIP](https://github.com/openai/CLIP), [jina-clip-v1](https://huggingface.co/jinaai/jina-clip-v1), [BLIP](https://arxiv.org/abs/2201.12086), [GroundingDINO](https://github.com/IDEA-Research/GroundingDINO), [YOLO-World](https://github.com/AILab-CVC/YOLO-World), [Florence2](https://arxiv.org/abs/2311.06242), [Moondream2](https://github.com/vikhyat/moondream/tree/main)
- **OCR Models**: [FAST](https://github.com/czczup/FAST), [DB(PaddleOCR-Det)](https://arxiv.org/abs/1911.08947), [SVTR(PaddleOCR-Rec)](https://arxiv.org/abs/2205.00159), [SLANet](https://paddlepaddle.github.io/PaddleOCR/latest/algorithm/table_recognition/algorithm_table_slanet.html), [TrOCR](https://huggingface.co/microsoft/trocr-base-printed), [DocLayout-YOLO](https://github.com/opendatalab/DocLayout-YOLO)

πŸ‘‰ More Supported Models

| Model | Task / Description | Example | CoreML | CUDA FP32 | CUDA FP16 | TensorRT FP32 | TensorRT FP16 |
| ----- | ------------------ | ------- | ------ | --------- | --------- | ------------- | ------------- |
| [BEiT](https://github.com/microsoft/unilm/tree/master/beit) | Image Classification | [demo](examples/beit) | βœ… | βœ… | βœ… | | |
| [ConvNeXt](https://github.com/facebookresearch/ConvNeXt) | Image Classification | [demo](examples/convnext) | βœ… | βœ… | βœ… | | |
| [FastViT](https://github.com/apple/ml-fastvit) | Image Classification | [demo](examples/fastvit) | βœ… | βœ… | βœ… | | |
| [MobileOne](https://github.com/apple/ml-mobileone) | Image Classification | [demo](examples/mobileone) | βœ… | βœ… | βœ… | | |
| [DeiT](https://github.com/facebookresearch/deit) | Image Classification | [demo](examples/deit) | βœ… | βœ… | βœ… | | |
| [DINOv2](https://github.com/facebookresearch/dinov2) | Vision Embedding | [demo](examples/dinov2) | βœ… | βœ… | βœ… | βœ… | βœ… |
| [YOLOv5](https://github.com/ultralytics/yolov5) | Image Classification<br />Object Detection<br />Instance Segmentation | [demo](examples/yolo) | βœ… | βœ… | βœ… | βœ… | βœ… |
| [YOLOv6](https://github.com/meituan/YOLOv6) | Object Detection | [demo](examples/yolo) | βœ… | βœ… | βœ… | βœ… | βœ… |
| [YOLOv7](https://github.com/WongKinYiu/yolov7) | Object Detection | [demo](examples/yolo) | βœ… | βœ… | βœ… | βœ… | βœ… |
| [YOLOv8<br />YOLO11](https://github.com/ultralytics/ultralytics) | Object Detection<br />Instance Segmentation<br />Image Classification<br />Oriented Object Detection<br />Keypoint Detection | [demo](examples/yolo) | βœ… | βœ… | βœ… | βœ… | βœ… |
| [YOLOv9](https://github.com/WongKinYiu/yolov9) | Object Detection | [demo](examples/yolo) | βœ… | βœ… | βœ… | βœ… | βœ… |
| [YOLOv10](https://github.com/THU-MIG/yolov10) | Object Detection | [demo](examples/yolo) | βœ… | βœ… | βœ… | βœ… | βœ… |
| [YOLOv12](https://github.com/sunsmarterjie/yolov12) | Object Detection | [demo](examples/yolo) | βœ… | βœ… | βœ… | βœ… | βœ… |
| [RT-DETR](https://github.com/lyuwenyu/RT-DETR) | Object Detection | [demo](examples/rtdetr) | βœ… | βœ… | βœ… | | |
| [RF-DETR](https://github.com/roboflow/rf-detr) | Object Detection | [demo](examples/rfdetr) | βœ… | βœ… | βœ… | | |
| [PP-PicoDet](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.8/configs/picodet) | Object Detection | [demo](examples/picodet-layout) | βœ… | βœ… | βœ… | | |
| [DocLayout-YOLO](https://github.com/opendatalab/DocLayout-YOLO) | Object Detection | [demo](examples/picodet-layout) | βœ… | βœ… | βœ… | | |
| [D-FINE](https://github.com/manhbd-22022602/D-FINE) | Object Detection | [demo](examples/d-fine) | βœ… | βœ… | βœ… | | |
| [DEIM](https://github.com/ShihuaHuang95/DEIM) | Object Detection | [demo](examples/deim) | βœ… | βœ… | βœ… | | |
| [RTMO](https://github.com/open-mmlab/mmpose/tree/main/projects/rtmo) | Keypoint Detection | [demo](examples/rtmo) | βœ… | βœ… | βœ… | ❌ | ❌ |
| [SAM](https://github.com/facebookresearch/segment-anything) | Segment Anything | [demo](examples/sam) | βœ… | βœ… | βœ… | | |
| [SAM2](https://github.com/facebookresearch/segment-anything-2) | Segment Anything | [demo](examples/sam) | βœ… | βœ… | βœ… | | |
| [MobileSAM](https://github.com/ChaoningZhang/MobileSAM) | Segment Anything | [demo](examples/sam) | βœ… | βœ… | βœ… | | |
| [EdgeSAM](https://github.com/chongzhou96/EdgeSAM) | Segment Anything | [demo](examples/sam) | βœ… | βœ… | βœ… | | |
| [SAM-HQ](https://github.com/SysCV/sam-hq) | Segment Anything | [demo](examples/sam) | βœ… | βœ… | βœ… | | |
| [FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM) | Instance Segmentation | [demo](examples/yolo) | βœ… | βœ… | βœ… | βœ… | βœ… |
| [YOLO-World](https://github.com/AILab-CVC/YOLO-World) | Open-Set Detection With Language | [demo](examples/yolo) | βœ… | βœ… | βœ… | βœ… | βœ… |
| [GroundingDINO](https://github.com/IDEA-Research/GroundingDINO) | Open-Set Detection With Language | [demo](examples/grounding-dino) | βœ… | βœ… | βœ… | | |
| [CLIP](https://github.com/openai/CLIP) | Vision-Language Embedding | [demo](examples/clip) | βœ… | βœ… | βœ… | ❌ | ❌ |
| [jina-clip-v1](https://huggingface.co/jinaai/jina-clip-v1) | Vision-Language Embedding | [demo](examples/clip) | βœ… | βœ… | βœ… | ❌ | ❌ |
| [BLIP](https://github.com/salesforce/BLIP) | Image Captioning | [demo](examples/blip) | βœ… | βœ… | βœ… | ❌ | ❌ |
| [DB(PaddleOCR-Det)](https://arxiv.org/abs/1911.08947) | Text Detection | [demo](examples/db) | βœ… | βœ… | βœ… | βœ… | βœ… |
| [FAST](https://github.com/czczup/FAST) | Text Detection | [demo](examples/fast) | βœ… | βœ… | βœ… | βœ… | βœ… |
| [LinkNet](https://arxiv.org/abs/1707.03718) | Text Detection | [demo](examples/linknet) | βœ… | βœ… | βœ… | βœ… | βœ… |
| [SVTR(PaddleOCR-Rec)](https://arxiv.org/abs/2205.00159) | Text Recognition | [demo](examples/svtr) | βœ… | βœ… | βœ… | βœ… | βœ… |
| [SLANet](https://paddlepaddle.github.io/PaddleOCR/latest/algorithm/table_recognition/algorithm_table_slanet.html) | Table Recognition | [demo](examples/slanet) | βœ… | βœ… | βœ… | | |
| [TrOCR](https://huggingface.co/microsoft/trocr-base-printed) | Text Recognition | [demo](examples/trocr) | βœ… | βœ… | βœ… | | |
| [YOLOPv2](https://arxiv.org/abs/2208.11434) | Panoptic Driving Perception | [demo](examples/yolop) | βœ… | βœ… | βœ… | βœ… | βœ… |
| [DepthAnything v1<br />DepthAnything v2](https://github.com/LiheYoung/Depth-Anything) | Monocular Depth Estimation | [demo](examples/depth-anything) | βœ… | βœ… | βœ… | ❌ | ❌ |
| [DepthPro](https://github.com/apple/ml-depth-pro) | Monocular Depth Estimation | [demo](examples/depth-pro) | βœ… | βœ… | βœ… | | |
| [MODNet](https://github.com/ZHKKKe/MODNet) | Image Matting | [demo](examples/modnet) | βœ… | βœ… | βœ… | βœ… | βœ… |
| [Sapiens](https://github.com/facebookresearch/sapiens/tree/main) | Foundation for Human Vision Models | [demo](examples/sapiens) | βœ… | βœ… | βœ… | | |
| [Florence2](https://arxiv.org/abs/2311.06242) | a Variety of Vision Tasks | [demo](examples/florence2) | βœ… | βœ… | βœ… | | |
| [Moondream2](https://github.com/vikhyat/moondream/tree/main) | Open-Set Object Detection<br />Open-Set Keypoint Detection<br />Image Captioning<br />Visual Question Answering | [demo](examples/moondream2) | βœ… | βœ… | βœ… | | |
| [OWLv2](https://huggingface.co/google/owlv2-base-patch16-ensemble) | Open-Set Object Detection | [demo](examples/owlv2) | βœ… | βœ… | βœ… | | |
| [SmolVLM(256M, 500M)](https://huggingface.co/HuggingFaceTB/SmolVLM-256M-Instruct) | Visual Question Answering | [demo](examples/smolvlm) | βœ… | βœ… | βœ… | | |

## ⛳️ Cargo Features

By default, **none of the following features are enabled**. You can enable them as needed:

- **`auto`**: Automatically downloads prebuilt ONNXRuntime binaries from Pyke’s CDN for supported platforms.

- If disabled, you'll need to [compile `ONNXRuntime` from source](https://github.com/microsoft/onnxruntime) or [download a precompiled package](https://github.com/microsoft/onnxruntime/releases), and then [link it manually](https://ort.pyke.io/setup/linking).


πŸ‘‰ For Linux or macOS Users

- Download from the [Releases page](https://github.com/microsoft/onnxruntime/releases).
- Set up the library path by exporting the `ORT_DYLIB_PATH` environment variable:
```shell
export ORT_DYLIB_PATH=/path/to/onnxruntime/lib/libonnxruntime.so.1.20.1
```


- **`ffmpeg`**: Adds support for video streams, real-time frame visualization, and video export.

  - Powered by [video-rs](https://github.com/oddity-ai/video-rs) and [minifb](https://github.com/emoon/rust_minifb). For any issues related to the `ffmpeg` feature, please check the issue trackers of those two crates.
- **`cuda`**: Enables the NVIDIA CUDA provider.
- **`trt`**: Enables the NVIDIA TensorRT provider.
- **`mps`**: Enables the Apple CoreML provider.
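
These features compose like ordinary Cargo features. As an illustration, a project that wants prebuilt ONNXRuntime binaries, CUDA inference, and video support might declare something like the following (the version number is a placeholder):

```toml
[dependencies]
usls = { version = "x.y", features = ["auto", "cuda", "ffmpeg"] }
```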

## 🎈 Example

* **Using `CUDA`**

```shell
cargo run -r -F cuda --example yolo -- --device cuda:0
```
* **Using Apple `CoreML`**

```shell
cargo run -r -F mps --example yolo -- --device mps
```
* **Using `TensorRT`**

```shell
cargo run -r -F trt --example yolo -- --device trt
```
* **Using `CPU`**

```shell
cargo run -r --example yolo
```

All examples are located in the [examples](./examples/) directory.
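
Each model's demo link in the table above maps to an example target of the same name, so trying another model only changes the `--example` argument; for instance, the CLIP demo runs on CPU with:

```shell
cargo run -r --example clip
```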

## πŸ₯‚ Integrate Into Your Own Project

Add `usls` as a dependency in your project's `Cargo.toml`:

```shell
cargo add usls -F cuda
```

Or use a specific commit:

```toml
[dependencies]
usls = { git = "https://github.com/jamjamjon/usls", rev = "commit-sha" }
```
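
With the dependency in place, a typical pipeline builds a model from an `Options` config, runs it on loaded images, and annotates the output. The following is a minimal sketch in the spirit of the repository's [examples](./examples/), assuming the `YOLO`/`Options`/`DataLoader`/`Annotator` types they use and `anyhow` for error handling; method names and signatures have changed across releases, so treat the examples for your pinned version as authoritative.

```rust
use usls::{models::YOLO, Annotator, DataLoader, Options};

fn main() -> anyhow::Result<()> {
    // Build the model from an Options config (the model file name is illustrative).
    let options = Options::default().with_model("yolo11n.onnx")?;
    let mut model = YOLO::new(options)?;

    // Load an input image; inputs are passed as a batch.
    let xs = vec![DataLoader::try_read("./assets/bus.jpg")?];

    // Run inference on the batch.
    let ys = model.run(&xs)?;

    // Draw the results onto the inputs and save the annotated images.
    let annotator = Annotator::default().with_saveout("yolo-demo");
    annotator.annotate(&xs, &ys);

    Ok(())
}
```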

## πŸ₯³ If you find this helpful, please give it a star ⭐

## πŸ“Œ License

This project is licensed under the terms in [LICENSE](LICENSE).