https://github.com/jamjamjon/usls

A Rust library integrated with ONNXRuntime, providing a collection of Computer Vision and Vision-Language models.
clip cuda florence2 grounding-dino imshow moondream ocr onnx onnxruntime rust-yolo sam sapiens smolvlm tensorrt yolo yolo-rs yolo-rust yolov10 yolov11 yolov8
Last synced: 5 months ago
Host: GitHub
URL: https://github.com/jamjamjon/usls
Owner: jamjamjon
License: mit
Created: 2024-03-29T07:36:09.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2025-05-13T09:28:25.000Z (5 months ago)
Last Synced: 2025-05-13T10:38:45.437Z (5 months ago)
Topics: clip, cuda, florence2, grounding-dino, imshow, moondream, ocr, onnx, onnxruntime, rust-yolo, sam, sapiens, smolvlm, tensorrt, yolo, yolo-rs, yolo-rust, yolov10, yolov11, yolov8
Language: Rust
Homepage:
Size: 22.4 MB
Stars: 141
Watchers: 3
Forks: 20
Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project

awesome-yolo-object-detection - usls - Language models. (Other Versions of YOLO)
awesome-rust-list - usls - Language models. (Machine Learning)
README

usls

**usls** is a Rust library integrated with **ONNXRuntime**, offering a suite of advanced models for **Computer Vision** and **Vision-Language** tasks, including:

- **YOLO Models**: [YOLOv5](https://github.com/ultralytics/yolov5), [YOLOv6](https://github.com/meituan/YOLOv6), [YOLOv7](https://github.com/WongKinYiu/yolov7), [YOLOv8](https://github.com/ultralytics/ultralytics), [YOLOv9](https://github.com/WongKinYiu/yolov9), [YOLOv10](https://github.com/THU-MIG/yolov10), [YOLO11](https://github.com/ultralytics/ultralytics), [YOLOv12](https://github.com/sunsmarterjie/yolov12)

- **SAM Models**: [SAM](https://github.com/facebookresearch/segment-anything), [SAM2](https://github.com/facebookresearch/segment-anything-2), [MobileSAM](https://github.com/ChaoningZhang/MobileSAM), [EdgeSAM](https://github.com/chongzhou96/EdgeSAM), [SAM-HQ](https://github.com/SysCV/sam-hq), [FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM)

- **Vision Models**: [RT-DETR](https://arxiv.org/abs/2304.08069), [RTMO](https://github.com/open-mmlab/mmpose/tree/main/projects/rtmo), [Depth-Anything](https://github.com/LiheYoung/Depth-Anything), [DINOv2](https://github.com/facebookresearch/dinov2), [MODNet](https://github.com/ZHKKKe/MODNet), [Sapiens](https://arxiv.org/abs/2408.12569), [DepthPro](https://github.com/apple/ml-depth-pro), [FastViT](https://github.com/apple/ml-fastvit), [BEiT](https://github.com/microsoft/unilm/tree/master/beit), [MobileOne](https://github.com/apple/ml-mobileone)

- **Vision-Language Models**: [CLIP](https://github.com/openai/CLIP), [jina-clip-v1](https://huggingface.co/jinaai/jina-clip-v1), [BLIP](https://arxiv.org/abs/2201.12086), [GroundingDINO](https://github.com/IDEA-Research/GroundingDINO), [YOLO-World](https://github.com/AILab-CVC/YOLO-World), [Florence2](https://arxiv.org/abs/2311.06242), [Moondream2](https://github.com/vikhyat/moondream/tree/main)

- **OCR Models**: [FAST](https://github.com/czczup/FAST), [DB(PaddleOCR-Det)](https://arxiv.org/abs/1911.08947), [SVTR(PaddleOCR-Rec)](https://arxiv.org/abs/2205.00159), [SLANet](https://paddlepaddle.github.io/PaddleOCR/latest/algorithm/table_recognition/algorithm_table_slanet.html), [TrOCR](https://huggingface.co/microsoft/trocr-base-printed), [DocLayout-YOLO](https://github.com/opendatalab/DocLayout-YOLO)

👉 More Supported Models

| Model | Task / Description | Example | CoreML | CUDA FP32 | CUDA FP16 | TensorRT FP32 | TensorRT FP16 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| [BEiT](https://github.com/microsoft/unilm/tree/master/beit) | Image Classification | [demo](examples/beit) | ✅ | ✅ | ✅ |  |  |
| [ConvNeXt](https://github.com/facebookresearch/ConvNeXt) | Image Classification | [demo](examples/convnext) | ✅ | ✅ | ✅ |  |  |
| [FastViT](https://github.com/apple/ml-fastvit) | Image Classification | [demo](examples/fastvit) | ✅ | ✅ | ✅ |  |  |
| [MobileOne](https://github.com/apple/ml-mobileone) | Image Classification | [demo](examples/mobileone) | ✅ | ✅ | ✅ |  |  |
| [DeiT](https://github.com/facebookresearch/deit) | Image Classification | [demo](examples/deit) | ✅ | ✅ | ✅ |  |  |
| [DINOv2](https://github.com/facebookresearch/dinov2) | Vision Embedding | [demo](examples/dinov2) | ✅ | ✅ | ✅ | ✅ | ✅ |
| [YOLOv5](https://github.com/ultralytics/yolov5) | Image Classification, Object Detection, Instance Segmentation | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ | ✅ |
| [YOLOv6](https://github.com/meituan/YOLOv6) | Object Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ | ✅ |
| [YOLOv7](https://github.com/WongKinYiu/yolov7) | Object Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ | ✅ |
| [YOLOv8, YOLO11](https://github.com/ultralytics/ultralytics) | Object Detection, Instance Segmentation, Image Classification, Oriented Object Detection, Keypoint Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ | ✅ |
| [YOLOv9](https://github.com/WongKinYiu/yolov9) | Object Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ | ✅ |
| [YOLOv10](https://github.com/THU-MIG/yolov10) | Object Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ | ✅ |
| [YOLOv12](https://github.com/sunsmarterjie/yolov12) | Object Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ | ✅ |
| [RT-DETR](https://github.com/lyuwenyu/RT-DETR) | Object Detection | [demo](examples/rtdetr) | ✅ | ✅ | ✅ |  |  |
| [RF-DETR](https://github.com/roboflow/rf-detr) | Object Detection | [demo](examples/rfdetr) | ✅ | ✅ | ✅ |  |  |
| [PP-PicoDet](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.8/configs/picodet) | Object Detection | [demo](examples/picodet-layout) | ✅ | ✅ | ✅ |  |  |
| [DocLayout-YOLO](https://github.com/opendatalab/DocLayout-YOLO) | Object Detection | [demo](examples/picodet-layout) | ✅ | ✅ | ✅ |  |  |
| [D-FINE](https://github.com/manhbd-22022602/D-FINE) | Object Detection | [demo](examples/d-fine) | ✅ | ✅ | ✅ |  |  |
| [DEIM](https://github.com/ShihuaHuang95/DEIM) | Object Detection | [demo](examples/deim) | ✅ | ✅ | ✅ |  |  |
| [RTMO](https://github.com/open-mmlab/mmpose/tree/main/projects/rtmo) | Keypoint Detection | [demo](examples/rtmo) | ✅ | ✅ | ✅ | ❌ | ❌ |
| [SAM](https://github.com/facebookresearch/segment-anything) | Segment Anything | [demo](examples/sam) | ✅ | ✅ | ✅ |  |  |
| [SAM2](https://github.com/facebookresearch/segment-anything-2) | Segment Anything | [demo](examples/sam) | ✅ | ✅ | ✅ |  |  |
| [MobileSAM](https://github.com/ChaoningZhang/MobileSAM) | Segment Anything | [demo](examples/sam) | ✅ | ✅ | ✅ |  |  |
| [EdgeSAM](https://github.com/chongzhou96/EdgeSAM) | Segment Anything | [demo](examples/sam) | ✅ | ✅ | ✅ |  |  |
| [SAM-HQ](https://github.com/SysCV/sam-hq) | Segment Anything | [demo](examples/sam) | ✅ | ✅ | ✅ |  |  |
| [FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM) | Instance Segmentation | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ | ✅ |
| [YOLO-World](https://github.com/AILab-CVC/YOLO-World) | Open-Set Detection With Language | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ | ✅ |
| [GroundingDINO](https://github.com/IDEA-Research/GroundingDINO) | Open-Set Detection With Language | [demo](examples/grounding-dino) | ✅ | ✅ | ✅ |  |  |
| [CLIP](https://github.com/openai/CLIP) | Vision-Language Embedding | [demo](examples/clip) | ✅ | ✅ | ✅ | ❌ | ❌ |
| [jina-clip-v1](https://huggingface.co/jinaai/jina-clip-v1) | Vision-Language Embedding | [demo](examples/clip) | ✅ | ✅ | ✅ | ❌ | ❌ |
| [BLIP](https://github.com/salesforce/BLIP) | Image Captioning | [demo](examples/blip) | ✅ | ✅ | ✅ | ❌ | ❌ |
| [DB(PaddleOCR-Det)](https://arxiv.org/abs/1911.08947) | Text Detection | [demo](examples/db) | ✅ | ✅ | ✅ | ✅ | ✅ |
| [FAST](https://github.com/czczup/FAST) | Text Detection | [demo](examples/fast) | ✅ | ✅ | ✅ | ✅ | ✅ |
| [LinkNet](https://arxiv.org/abs/1707.03718) | Text Detection | [demo](examples/linknet) | ✅ | ✅ | ✅ | ✅ | ✅ |
| [SVTR(PaddleOCR-Rec)](https://arxiv.org/abs/2205.00159) | Text Recognition | [demo](examples/svtr) | ✅ | ✅ | ✅ | ✅ | ✅ |
| [SLANet](https://paddlepaddle.github.io/PaddleOCR/latest/algorithm/table_recognition/algorithm_table_slanet.html) | Table Recognition | [demo](examples/slanet) | ✅ | ✅ | ✅ |  |  |
| [TrOCR](https://huggingface.co/microsoft/trocr-base-printed) | Text Recognition | [demo](examples/trocr) | ✅ | ✅ | ✅ |  |  |
| [YOLOPv2](https://arxiv.org/abs/2208.11434) | Panoptic Driving Perception | [demo](examples/yolop) | ✅ | ✅ | ✅ | ✅ | ✅ |
| [DepthAnything v1, DepthAnything v2](https://github.com/LiheYoung/Depth-Anything) | Monocular Depth Estimation | [demo](examples/depth-anything) | ✅ | ✅ | ✅ | ❌ | ❌ |
| [DepthPro](https://github.com/apple/ml-depth-pro) | Monocular Depth Estimation | [demo](examples/depth-pro) | ✅ | ✅ | ✅ |  |  |
| [MODNet](https://github.com/ZHKKKe/MODNet) | Image Matting | [demo](examples/modnet) | ✅ | ✅ | ✅ | ✅ | ✅ |
| [Sapiens](https://github.com/facebookresearch/sapiens/tree/main) | Foundation for Human Vision Models | [demo](examples/sapiens) | ✅ | ✅ | ✅ |  |  |
| [Florence2](https://arxiv.org/abs/2311.06242) | A Variety of Vision Tasks | [demo](examples/florence2) | ✅ | ✅ | ✅ |  |  |
| [Moondream2](https://github.com/vikhyat/moondream/tree/main) | Open-Set Object Detection, Open-Set Keypoints Detection, Image Caption, Visual Question Answering | [demo](examples/moondream2) | ✅ | ✅ | ✅ |  |  |
| [OWLv2](https://huggingface.co/google/owlv2-base-patch16-ensemble) | Open-Set Object Detection | [demo](examples/owlv2) | ✅ | ✅ | ✅ |  |  |
| [SmolVLM(256M, 500M)](https://huggingface.co/HuggingFaceTB/SmolVLM-256M-Instruct) | Visual Question Answering | [demo](examples/smolvlm) | ✅ | ✅ | ✅ |  |  |

## ⛳️ Cargo Features

By default, **none of the following features are enabled**. You can enable them as needed:

- **`auto`**: Automatically downloads prebuilt ONNXRuntime binaries from Pyke’s CDN for supported platforms.

  - If disabled, you'll need to [compile `ONNXRuntime` from source](https://github.com/microsoft/onnxruntime) or [download a precompiled package](https://github.com/microsoft/onnxruntime/releases), and then [link it manually](https://ort.pyke.io/setup/linking).

    

    👉 For Linux or macOS Users

    - Download from the [Releases page](https://github.com/microsoft/onnxruntime/releases).

    - Set up the library path by exporting the `ORT_DYLIB_PATH` environment variable (the path below is a Linux example; on macOS, point it at the corresponding `.dylib` file):

      ```shell
      export ORT_DYLIB_PATH=/path/to/onnxruntime/lib/libonnxruntime.so.1.20.1
      ```

    

- **`ffmpeg`**: Adds support for video streams, real-time frame visualization, and video export.

  - Powered by [video-rs](https://github.com/oddity-ai/video-rs) and [minifb](https://github.com/emoon/rust_minifb). For any issues related to `ffmpeg` features, please refer to the issues of these two crates.

- **`cuda`**: Enables the NVIDIA CUDA provider.

- **`trt`**: Enables the NVIDIA TensorRT provider.

- **`mps`**: Enables the Apple CoreML provider.
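
When the `auto` feature is disabled, a missing or mistyped `ORT_DYLIB_PATH` is an easy mistake to make. The following is a minimal pre-flight sketch that uses only the Rust standard library; `OrtDylib` and `classify_ort_dylib` are hypothetical names introduced for illustration and are not part of `usls` or `ort`. It only checks that the variable points at an existing file; it does not verify that the file is a compatible ONNXRuntime build.

```rust
use std::env;
use std::path::PathBuf;

/// Outcome of inspecting the `ORT_DYLIB_PATH` setting.
#[derive(Debug, PartialEq)]
enum OrtDylib {
    /// The variable is unset or blank.
    Unset,
    /// The variable is set, but no file exists at that path.
    Missing(PathBuf),
    /// The variable points at an existing file.
    Found(PathBuf),
}

/// Classify a raw `ORT_DYLIB_PATH` value.
/// Hypothetical helper for illustration; not a usls or ort API.
fn classify_ort_dylib(value: Option<&str>) -> OrtDylib {
    match value {
        None => OrtDylib::Unset,
        Some(v) if v.trim().is_empty() => OrtDylib::Unset,
        Some(v) => {
            let path = PathBuf::from(v);
            if path.is_file() {
                OrtDylib::Found(path)
            } else {
                OrtDylib::Missing(path)
            }
        }
    }
}

fn main() {
    // Read the variable from the environment; a non-UTF-8 value is treated as unset.
    let raw = env::var("ORT_DYLIB_PATH").ok();
    match classify_ort_dylib(raw.as_deref()) {
        OrtDylib::Unset => eprintln!("ORT_DYLIB_PATH is not set"),
        OrtDylib::Missing(p) => eprintln!("ORT_DYLIB_PATH points to a missing file: {}", p.display()),
        OrtDylib::Found(p) => println!("ONNXRuntime library found at {}", p.display()),
    }
}
```

Running a check like this before building a session can turn an opaque load failure into a readable message.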

## 🎈 Example

* **Using `CUDA`**

  ```shell
  cargo run -r -F cuda --example yolo -- --device cuda:0
  ```

* **Using Apple `CoreML`**

  ```shell
  cargo run -r -F mps --example yolo -- --device mps
  ```

* **Using `TensorRT`**

  ```shell
  cargo run -r -F trt --example yolo -- --device trt
  ```

* **Using `CPU`**

  ```shell
  cargo run -r --example yolo
  ```

All examples are located in the [examples](./examples/) directory.
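
Each command above pairs a Cargo feature with a matching `--device` value, while the CPU run needs neither. As a rough sketch of that pairing, the snippet below rebuilds the four command lines from a small enum. `Backend` and `example_command` are hypothetical names used only for illustration; the feature names and device strings are taken from the commands in this section, and nothing here calls into `usls` itself.

```rust
/// Execution back ends shown in the example commands.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Backend {
    Cpu,
    Cuda,
    TensorRt,
    CoreMl,
}

impl Backend {
    /// Cargo feature passed via `-F`; the CPU run enables no extra feature.
    fn feature(self) -> Option<&'static str> {
        match self {
            Backend::Cpu => None,
            Backend::Cuda => Some("cuda"),
            Backend::TensorRt => Some("trt"),
            Backend::CoreMl => Some("mps"),
        }
    }

    /// Value passed to `--device`; the CPU run omits the flag.
    fn device(self) -> Option<&'static str> {
        match self {
            Backend::Cpu => None,
            Backend::Cuda => Some("cuda:0"),
            Backend::TensorRt => Some("trt"),
            Backend::CoreMl => Some("mps"),
        }
    }
}

/// Build the `cargo run` command line for one example and back end.
fn example_command(backend: Backend, example: &str) -> String {
    let mut cmd = String::from("cargo run -r");
    if let Some(feature) = backend.feature() {
        cmd.push_str(" -F ");
        cmd.push_str(feature);
    }
    cmd.push_str(" --example ");
    cmd.push_str(example);
    if let Some(device) = backend.device() {
        cmd.push_str(" -- --device ");
        cmd.push_str(device);
    }
    cmd
}

fn main() {
    // Print one command per back end for the `yolo` example.
    for backend in [Backend::Cpu, Backend::Cuda, Backend::TensorRt, Backend::CoreMl] {
        println!("{}", example_command(backend, "yolo"));
    }
}
```

Keeping the feature and the device string in one place makes it harder to request a device whose provider was not compiled in.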

## 🥂 Integrate Into Your Own Project

Add `usls` as a dependency to your project's `Cargo.toml`:

```shell
cargo add usls -F cuda
```

Or use a specific commit:

```toml
[dependencies]
usls = { git = "https://github.com/jamjamjon/usls", rev = "commit-sha" }
```

## 🥳 If you find this helpful, please give it a star ⭐

## 📌 License

This project is licensed under the MIT License; see [LICENSE](LICENSE).