https://github.com/offbit-ai/litert

Rust bindings for LiteRT, the successor to TensorFlow Lite. LiteRT is Google's on-device framework for high-performance ML and GenAI deployment on edge platforms, via efficient conversion, runtime, and optimization.
Host: GitHub
URL: https://github.com/offbit-ai/litert
Owner: offbit-ai
License: apache-2.0
Created: 2026-04-16T13:26:44.000Z (3 months ago)
Default Branch: main
Last Pushed: 2026-05-03T11:27:50.000Z (2 months ago)
Last Synced: 2026-05-20T19:31:21.303Z (about 1 month ago)
Topics: genai, litert, litert-lm, ml, tensorflow
Language: Rust
Homepage:
Size: 299 KB
Stars: 3
Watchers: 0
Forks: 1
Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
- Notice: NOTICE

# LiteRT-rs

[![CI](https://github.com/offbit-ai/LiteRT/actions/workflows/ci.yml/badge.svg)](https://github.com/offbit-ai/LiteRT/actions/workflows/ci.yml)

[![crates.io](https://img.shields.io/crates/v/litert.svg?label=litert)](https://crates.io/crates/litert)

[![crates.io](https://img.shields.io/crates/v/litertlm.svg?label=litertlm)](https://crates.io/crates/litertlm)

[![crates.io](https://img.shields.io/crates/v/litert-sys.svg?label=litert-sys)](https://crates.io/crates/litert-sys)

[![crates.io](https://img.shields.io/crates/v/litert-lm-sys.svg?label=litert-lm-sys)](https://crates.io/crates/litert-lm-sys)

[![docs.rs](https://img.shields.io/docsrs/litert?label=docs.rs%2Flitert)](https://docs.rs/litert)

[![docs.rs](https://img.shields.io/docsrs/litertlm?label=docs.rs%2Flitertlm)](https://docs.rs/litertlm)

[![MSRV](https://img.shields.io/badge/rustc-1.75%2B-blue.svg)](https://releases.rs/docs/1.75.0/)

[![License](https://img.shields.io/badge/license-Apache--2.0-informational.svg)](LICENSE)

[![LiteRT](https://img.shields.io/badge/LiteRT-2.1.4-informational.svg)](https://github.com/google-ai-edge/LiteRT)

Safe, zero-friction Rust bindings for [Google LiteRT] 2.x: on-device ML inference and LLM text generation. Add a crate to `Cargo.toml` and run `cargo build`. No Bazel, no CMake, no `libclang` on user machines.

[Google LiteRT]: https://ai.google.dev/edge/litert

### ML inference (`litert`)

```toml
[dependencies]
litert = "0.3"
```

```rust
use litert::{CompilationOptions, CompiledModel, Environment, Model, TensorBuffer};

let env = Environment::new()?;
let model = Model::from_file("mobilenet.tflite")?;
let compiled = CompiledModel::new(env, model, &CompilationOptions::new()?)?;
// ... fill input buffers, compiled.run(...), read outputs ...
# Ok::<(), litert::Error>(())
```

### LLM text generation (`litertlm`)

```toml
[dependencies]
litertlm = "0.3"
```

```rust
use litertlm::{Backend, Engine, EngineSettings, SamplerParams};

let engine = Engine::new(
    EngineSettings::new("Qwen3-0.6B.litertlm")
        .backend(Backend::Gpu)
        .max_num_tokens(512),
)?;

// Streaming (token-by-token)
let mut conv = engine.create_conversation(SamplerParams::default().top_p(0.95))?;
conv.send_message_stream("Explain Rust lifetimes", |chunk| {
    print!("{chunk}");
})?;

// Or blocking
let mut session = engine.create_session(SamplerParams::default().top_p(0.95))?;
let response = session.generate("Explain Rust lifetimes")?;
# Ok::<(), litertlm::Error>(())
```

## Why

| Other options                        | Friction                                                           |
|--------------------------------------|--------------------------------------------------------------------|
| Build LiteRT from source via CMake   | Bazel or CMake + protoc + flatc + abseil + Android NDK on your box |
| Invoke via Python (`ai-edge-litert`) | Python interpreter + wheel dependency graph                        |
| Hand-roll FFI against TFLite C API   | Maintain a sysroot per target + track header drift manually        |

`litert-rs` takes the same upstream runtime binaries Google publishes, pins each by SHA-256, and downloads them into a user-level cache the first time `cargo build` runs. Your app links against that cached `libLiteRt.{so,dylib,dll}`.

## Crates

| Crate | What it is | crates.io |
|-------|------------|-----------|
| [`litert`](https://crates.io/crates/litert) | Safe ML inference wrappers (CompiledModel, TensorBuffer, GPU) | 0.3.x |
| [`litertlm`](https://crates.io/crates/litertlm) | Safe LLM text generation (Engine, Session, Conversation streaming) | 0.3.x |
| [`litert-sys`](https://crates.io/crates/litert-sys) | Raw FFI: LiteRT 2.x C API | 0.3.x |
| [`litert-lm-sys`](https://crates.io/crates/litert-lm-sys) | Raw FFI: LiteRT-LM C engine API | 0.3.x |

## Platform support

| Rust target                 | CPU | GPU accelerator(s) shipped        | Source                          |
|-----------------------------|-----|-----------------------------------|---------------------------------|
| `aarch64-apple-darwin`      | ✅  | Metal, WebGPU                     | litert-lm prebuilt (Git LFS)    |
| `x86_64-unknown-linux-gnu`  | ✅  | WebGPU                            | litert-lm prebuilt (Git LFS)    |
| `aarch64-unknown-linux-gnu` | ✅  | WebGPU                            | litert-lm prebuilt (Git LFS)    |
| `x86_64-pc-windows-msvc`    | ✅  | WebGPU                            | litert-lm prebuilt (Git LFS)    |
| `aarch64-linux-android`     | ✅  | OpenCL/GL (via `ClGlAccelerator`) | LiteRT Maven AAR                |
| `x86_64-linux-android`      | ✅  | OpenCL/GL                         | LiteRT Maven AAR                |
| `wasm32-unknown-emscripten` | ✅  | None (XNNPACK only; GPU deferred) | LiteRT-rs CMake+emcc build      |
| `aarch64-apple-ios`         | ⏳  | None                              | deferred (no upstream prebuilt) |

`litertlm` / `litert-lm-sys` (LLM inference) are desktop/Android only in this release. WASM support for the LLM stack is on the 0.4.0 roadmap; see `wasm-patches/litert-lm-v0.10.2/`.

## Environment variables (escape hatches)

| Variable             | Effect                                                                    |
|----------------------|---------------------------------------------------------------------------|
| `LITERT_LIB_DIR`     | Directory containing `libLiteRt.{so,dylib,dll}`. Bypasses the downloader. |
| `LITERT_NO_DOWNLOAD` | Fail the build if any prebuilt is missing from cache (air-gapped CI).     |
| `LITERT_CACHE_DIR`   | Override the cache root. Default: `$XDG_CACHE_HOME/litert-sys`.           |
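
As an illustration of how these three variables could interact, here is a minimal sketch of a resolver. It is not the actual `litert-sys` build script: `LibSource` and `resolve_lib_source` are hypothetical names, and the precedence (`LITERT_LIB_DIR` first, then the cache settings) and the rule that any value of `LITERT_NO_DOWNLOAD` counts as "set" are assumptions drawn from the table above.

```rust
use std::path::PathBuf;

/// Where a build would look for `libLiteRt.*` (hypothetical type).
#[derive(Debug, PartialEq)]
enum LibSource {
    /// `LITERT_LIB_DIR` was set: use it and skip the downloader entirely.
    UserProvided(PathBuf),
    /// Use the cache; `allow_download` is false when `LITERT_NO_DOWNLOAD` is set.
    Cache { root: PathBuf, allow_download: bool },
}

/// Hypothetical helper. Variables are read through `get` so the logic can be
/// exercised without touching the real process environment.
fn resolve_lib_source(
    get: impl Fn(&str) -> Option<String>,
    default_cache_root: PathBuf,
) -> LibSource {
    // Assumption: an explicit library directory wins over everything else.
    if let Some(dir) = get("LITERT_LIB_DIR") {
        return LibSource::UserProvided(PathBuf::from(dir));
    }
    // Otherwise fall back to the cache, honoring an overridden root.
    let root = get("LITERT_CACHE_DIR")
        .map(PathBuf::from)
        .unwrap_or(default_cache_root);
    LibSource::Cache {
        root,
        // Assumption: any value (even an empty one) disables downloads.
        allow_download: get("LITERT_NO_DOWNLOAD").is_none(),
    }
}
```

In a real build script the lookup would presumably be `|k| std::env::var(k).ok()`, with the default root derived from the platform cache directory.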

## macOS downstream binaries (one-time setup)

The prebuilt `libLiteRt.dylib` Google ships has `install_name=@rpath/libLiteRt.dylib` and wasn't linked with `-headerpad_max_install_names`, so we can't rewrite that identifier to an absolute path post-download. `litert-sys`' build script emits an `-rpath` flag for its own tests and examples, but Cargo's `rustc-link-arg` does **not** propagate to downstream consumer binaries. Without action, the binaries your crate produces on macOS will fail at launch with:

    dyld: Library not loaded: @rpath/libLiteRt.dylib

Fix it once per downstream crate: add this tiny [build.rs](https://doc.rust-lang.org/cargo/reference/build-scripts.html) next to your `Cargo.toml`:

```rust
// build.rs
fn main() {
    // `litert-sys` declares `links = "LiteRt"` and publishes its cache
    // directory as `DEP_LITERT_LIB_DIR`. Embedding it as an rpath makes
    // dyld find libLiteRt.dylib without DYLD_LIBRARY_PATH.
    if let Ok(dir) = std::env::var("DEP_LITERT_LIB_DIR") {
        println!("cargo:rustc-link-arg=-Wl,-rpath,{dir}");
    }
}
```

Alternatively, prefix individual invocations with `DYLD_LIBRARY_PATH=$(cargo xtask cache-dir)`, or link with `RUSTFLAGS="-C link-arg=-Wl,-rpath,/path/to/cache"`.

Linux, Windows, and Android are unaffected.
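
Since only macOS needs the rpath, a downstream `build.rs` could gate the flag on the target OS. The sketch below is one possible shape, not part of `litert-rs`: `rpath_directive` is a hypothetical helper, and it assumes the build script reads Cargo's `CARGO_CFG_TARGET_OS` variable to learn the target.

```rust
/// Hypothetical helper for a downstream `build.rs`: returns the Cargo
/// directive to print, or `None` when no rpath should be emitted.
fn rpath_directive(target_os: &str, lib_dir: Option<&str>) -> Option<String> {
    match (target_os, lib_dir) {
        // Only macOS needs the rpath, and only when `litert-sys`
        // has published its library directory.
        ("macos", Some(dir)) => Some(format!("cargo:rustc-link-arg=-Wl,-rpath,{dir}")),
        _ => None,
    }
}
```

Inside `main`, the two inputs would come from `std::env::var("CARGO_CFG_TARGET_OS")` and `std::env::var("DEP_LITERT_LIB_DIR")`, and the returned string would be passed to `println!`. Gating this way keeps the `-Wl,-rpath` argument away from linkers on the unaffected platforms.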

## WebAssembly (browser + Node.js + wasmtime)

`litert` and `litert-sys` cross-compile to `wasm32-unknown-emscripten` for in-browser ML inference, server-side WASM (Cloudflare Workers, wasmtime), or Node.js. The runtime is TFLite + XNNPACK CPU kernels, statically linked into the produced `.wasm`.

### Prerequisites

* [emsdk](https://github.com/emscripten-core/emsdk) ≥ 5.0.7. Install via `git clone https://github.com/emscripten-core/emsdk && cd emsdk && ./emsdk install latest && ./emsdk activate latest`.
* `rustup target add wasm32-unknown-emscripten`.
* `source $EMSDK/emsdk_env.sh` before each build session.

### Build

```bash
source $EMSDK/emsdk_env.sh

# NODERAWFS=1 lets the WASM module read host files (model.tflite) under
# Node/wasmtime. ALLOW_MEMORY_GROWTH=1 lets the heap grow past 16 MB so
# larger models load. Drop both for a browser bundle (use --preload-file
# or fetch+MEMFS instead).
RUSTFLAGS="-C link-arg=-sNODERAWFS=1 -C link-arg=-sALLOW_MEMORY_GROWTH=1" \
  cargo build -p litert --example add_wasm \
              --target wasm32-unknown-emscripten --release
```

Output (`target/wasm32-unknown-emscripten/release/examples/`):

- `add_wasm.wasm`: ~5–12 MB WebAssembly module (12 MB debug, smaller in release with `-Oz`).
- `add_wasm.js`: emscripten JS shim that knows how to instantiate the `.wasm`.

### Run (Node.js)

```bash
node target/wasm32-unknown-emscripten/release/examples/add_wasm.js
# add_10x10.tflite — WASM CPU inference
# first 5 outputs: [100.0, 102.0, 104.0, 106.0, 108.0]
# last 5 outputs:  [290.0, 292.0, 294.0, 296.0, 298.0]
```

### Browser

Drop `NODERAWFS=1` (no host filesystem in browsers). Embed the model with emcc's `--preload-file model.tflite` (bundles into a `.data` sidecar), or `fetch()` it from JS and write to MEMFS before calling `Model::from_file`, or use `Model::from_bytes` with a `Vec` you `fetch()`'d. Then drop the `.wasm` + `.js` into a static page:

```html

```

### Building from source (no published artifact yet)

While the v0.3.0 prebuilt tarball is being staged, build the static archives locally and point `litert-sys` at them via `LITERT_LIB_DIR`:

```bash
# 1. Clone + patch upstream LiteRT
git clone --depth=1 --branch=v2.1.4 \
    https://github.com/google-ai-edge/LiteRT.git /tmp/litert
cd /tmp/litert
git apply $LITERT_RS_DIR/wasm-patches/litert-v2.1.4/01-cmake-emscripten-support.patch

# 2. Cross-compile via CMake + emcc
emcmake cmake -S litert -B litert/build-wasm \
    -DCMAKE_BUILD_TYPE=Release \
    -DLITERT_ENABLE_GPU=OFF -DLITERT_ENABLE_NPU=OFF \
    -DLITERT_DISABLE_KLEIDIAI=ON -DLITERT_BUILD_TESTS=OFF \
    -DTFLITE_ENABLE_GPU=OFF
emmake cmake --build litert/build-wasm \
    --target litert_runtime_c_api_shared_lib -j

# 3. Flatten archives into a single dir for litert-sys
mkdir -p /tmp/litert-wasm-libs
find litert/build-wasm \( -name "lib*.a" ! -path "*/testdata/*" ! -name "input.a" \) \
    -exec cp {} /tmp/litert-wasm-libs/ \;

# 4. Build
cd $LITERT_RS_DIR
LITERT_LIB_DIR=/tmp/litert-wasm-libs \
RUSTFLAGS="-C link-arg=-sNODERAWFS=1 -C link-arg=-sALLOW_MEMORY_GROWTH=1" \
  cargo build -p litert --example add_wasm --target wasm32-unknown-emscripten
```

Once the [`build-litert-wasm.yml`](.github/workflows/build-litert-wasm.yml) GitHub Actions workflow has uploaded a SHA-pinned tarball, `cargo build` will download and verify it automatically, with no `LITERT_LIB_DIR` and no emsdk setup needed for end users on the WASM target.

### Limitations (v0.3.0)

- **CPU only.** WebGPU acceleration is on the 0.4.0 roadmap.
- **No `set_global_log_severity`.** The WASM build doesn't export the logger-control symbols, so the call returns `Error::Unsupported` and LiteRT logs at default verbosity.
- **No LLM stack.** `litertlm` / `litert-lm-sys` need separate fork patches to LiteRT-LM (orchestrator + transitive C++ deps); this is a 0.4.0 milestone.

## Cross-platform development

End users only need `cargo`. The sections below are for contributors who want to regenerate bindings or build for foreign targets locally.

### Tooling prerequisites (contributors only)

* `rustup` with stable + any target triples you want to exercise.
* A container engine for foreign-target builds: **Docker** or **Podman**.
  * macOS: `brew install podman && podman machine init && podman machine start`.
  * Linux: your distro's Docker/Podman packages.
* `cross` ≥ 0.2.5: `cargo install cross --locked`.
* If you're using Podman: `export CROSS_CONTAINER_ENGINE=podman`.

macOS and Windows target toolchains run **natively**, not through `cross`. `cross` is only invoked for Linux + Android targets.

### Workspace automation

All cross-target chores flow through a single `xtask` binary.

```bash
cargo xtask targets            # list every supported Rust target triple
cargo xtask regen-bindings     # rebuild litert-sys bindings for every target
cargo xtask regen-bindings --target aarch64-apple-darwin   # single target
cargo xtask build-all          # cross-build the workspace for every target
```

`regen-bindings` dispatches automatically:

* **Host target** → native `cargo build -p litert-sys --features generate-bindings`.
* **Foreign target** → `cross build …`, which runs `bindgen` inside a container image that already has `libclang` + the target sysroot installed (see `Cross.toml`).
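
The host/foreign split above can be sketched as a small decision function. This is an illustration only, not the real `xtask` source: `Runner` and `runner_for` are hypothetical names, and the sketch assumes "foreign" simply means the target triple differs from the host triple.

```rust
/// Which tool a regen run would be handed to (hypothetical type).
#[derive(Debug, PartialEq)]
enum Runner {
    /// Native `cargo build` on the host.
    Cargo,
    /// Containerized `cross build` for a foreign target.
    Cross,
}

impl Runner {
    /// Name of the executable to spawn.
    fn program(&self) -> &'static str {
        match self {
            Runner::Cargo => "cargo",
            Runner::Cross => "cross",
        }
    }
}

/// Assumption: a target is "foreign" whenever its triple differs from the host's.
fn runner_for(host: &str, target: &str) -> Runner {
    if host == target {
        Runner::Cargo
    } else {
        Runner::Cross
    }
}
```

On an Apple Silicon host, `runner_for("aarch64-apple-darwin", "aarch64-linux-android")` selects `Runner::Cross`, while the host triple itself selects `Runner::Cargo`.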

### CI

[`.github/workflows/ci.yml`](.github/workflows/ci.yml) runs three matrices on every push and PR:

1. **`native`**: macOS arm64 (tests + build), Linux x86_64 (tests + `fmt --check` + `clippy -D warnings`), Windows x86_64 (build).
2. **`cross`**: Linux arm64, Android arm64, Android x86_64 (build only).
3. **`bindings-drift`**: regenerates all 6 target binding files via `cargo xtask regen-bindings` and fails on any `git diff`. If drift is detected, the regenerated files are uploaded as a build artifact so they can be inspected or accepted in a PR.

Drift is the authoritative check: if the CI-generated bindings for a target differ from what's committed, that's the signal to update the committed file.

### First-build download

On the first `cargo build` for a given target, `litert-sys/build.rs` emits a one-time warning while it fetches the pinned prebuilt libraries:

```
warning: litert-sys: downloading 4 file(s) of LiteRT prebuilt v0.10.2
         for target `macos_arm64` into
         /Users/you/Library/Caches/litert-sys/v0.10.2/aarch64-apple-darwin
         (first build only)
```

Subsequent builds hit the cache (with a size check; a SHA-256 verified marker file short-circuits rehashing). Deleting the cache directory or bumping the pinned upstream version triggers a fresh download + re-verification.
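
The cache-hit path described above (size check first, then a marker file that avoids rehashing) might look roughly like the following. It is a sketch under assumptions, not the code in `litert-sys/build.rs`: `CacheCheck`, `check_cached`, and the marker-file layout are hypothetical, and the actual SHA-256 computation is left out.

```rust
use std::fs;
use std::path::Path;

/// Outcome of inspecting one cached file (hypothetical type).
#[derive(Debug, PartialEq)]
enum CacheCheck {
    /// Size matches and a verified marker exists: skip rehashing.
    Hit,
    /// Size matches but no marker yet: hash the file, then write the marker.
    NeedsHash,
    /// File is absent or has the wrong size: fetch it again.
    Redownload,
}

/// Cheap checks only; the caller would run SHA-256 when `NeedsHash` is returned.
fn check_cached(file: &Path, expected_len: u64, marker: &Path) -> CacheCheck {
    match fs::metadata(file) {
        Ok(meta) if meta.len() == expected_len => {
            if marker.exists() {
                CacheCheck::Hit
            } else {
                CacheCheck::NeedsHash
            }
        }
        // Missing file, unreadable metadata, or a size mismatch.
        _ => CacheCheck::Redownload,
    }
}
```

Ordering the checks this way keeps the common case (everything already verified) down to two filesystem lookups per file.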

## Regenerating bindings for a new target

```bash
# 1. Add the triple to:
#      Cargo.toml workspace members (if needed)
#      Cross.toml (pre-build apt-get for libclang)
#      litert-sys/build.rs target_spec()  (with pinned checksums)
#      xtask/src/main.rs TARGETS
#
# 2. Regenerate:
cargo xtask regen-bindings --target 
#
# 3. Commit litert-sys/src/bindings/.rs and push.
```

## Credits

This project is a binding, not a fork. The runtime that does the work (model loading, graph compilation, kernel execution, GPU/NPU delegation) is Google's [LiteRT] and [LiteRT-LM]:

* **LiteRT** (Apache-2.0, © 2024–2026 Google LLC): C API headers vendored under `third_party/litert-v2.1.4/`, source: [LiteRT].
* **LiteRT-LM** (Apache-2.0, © 2024–2026 Google LLC): source of the prebuilt `libLiteRt.*` + accelerator plugins we download: [LiteRT-LM].
* **TensorFlow Lite**, **XNNPACK**, **abseil-cpp**, **flatbuffers**, **protobuf** and the rest of the transitive open-source stack that LiteRT itself is built on.

See [NOTICE](NOTICE) for the full attribution and [third_party/litert-v2.1.4/LICENSE](third_party/litert-v2.1.4/LICENSE) for the upstream LiteRT license text.

[LiteRT]: https://github.com/google-ai-edge/LiteRT

[LiteRT-LM]: https://github.com/google-ai-edge/litert-lm

## License

Licensed under the [Apache License, Version 2.0](LICENSE). By contributing you agree that your contribution is licensed under the same terms.