{"id":21923206,"url":"https://github.com/deislabs/wasi-nn-onnx","last_synced_at":"2025-07-21T21:31:37.852Z","repository":{"id":46086259,"uuid":"378040942","full_name":"deislabs/wasi-nn-onnx","owner":"deislabs","description":"Experimental ONNX implementation for WASI NN.","archived":true,"fork":false,"pushed_at":"2021-11-15T15:28:31.000Z","size":65086,"stargazers_count":48,"open_issues_count":9,"forks_count":4,"subscribers_count":17,"default_branch":"main","last_synced_at":"2025-07-18T03:46:29.703Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/deislabs.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-06-18T05:12:13.000Z","updated_at":"2025-02-11T21:59:35.000Z","dependencies_parsed_at":"2022-09-18T07:07:08.279Z","dependency_job_id":null,"html_url":"https://github.com/deislabs/wasi-nn-onnx","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/deislabs/wasi-nn-onnx","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deislabs%2Fwasi-nn-onnx","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deislabs%2Fwasi-nn-onnx/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deislabs%2Fwasi-nn-onnx/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deislabs%2Fwasi-nn-onnx/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/deislabs","download_url":"https://codeload.github.com/deislabs/wasi-nn-onnx/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste
.ms/api/v1/hosts/GitHub/repositories/deislabs%2Fwasi-nn-onnx/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266382142,"owners_count":23920644,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-21T11:47:31.412Z","response_time":64,"last_error":null,"robots_txt_status":null,"robots_txt_updated_at":null,"robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-28T21:09:41.527Z","updated_at":"2025-07-21T21:31:35.655Z","avatar_url":"https://github.com/deislabs.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ONNX implementation for WASI NN\n\nThis project is an experimental [ONNX][onnx] implementation for [the WASI NN\nspecification][wasi-nn], and it enables performing neural network inferences in\nWASI runtimes at near-native performance for ONNX models by leveraging CPU\nmulti-threading or GPU usage on the runtime, and exporting this host\nfunctionality to guest modules running in WebAssembly.\n\nIt follows the [WASI NN implementation from Wasmtime][wasmtime-impl], and adds\ntwo new runtimes for performing inferences on ONNX models:\n\n- one based on the [native ONNX runtime][msft], which uses [community-built Rust\n  bindings][bindings] to the runtime's C API.\n- one based on the [Tract crate][tract], which is a native inference engine for\n  running ONNX models, written in Rust.\n\n### How does this work?\n\nWASI NN is a \"graph loader\" API. 
This means the guest WebAssembly module passes\nthe ONNX model as opaque bytes to the runtime, together with input tensors; the\nruntime performs the inference, and the guest module can then retrieve the\noutput tensors. The WASI NN API is as follows:\n\n- `load` a model using one or more opaque byte arrays\n- `init_execution_context` and bind some tensors to it using `set_input`\n- `compute` the ML inference using the bound context\n- retrieve the inference result tensors using `get_output`\n\nThe two back-ends from this repository implement the API defined above using\neach of the two runtimes mentioned. So why two implementations? The main reason\nhas to do with the performance vs. ease of configuration trade-off. More\nspecifically:\n\n- the native ONNX runtime will provide the best performance, with multi-threaded\n  CPU and access to the GPU. Additionally, any ONNX model should be fully\n  compatible with this runtime (keeping in mind our current implementation\n  limitations described below). However, setting it up requires that the ONNX\n  shared libraries be downloaded and configured on the host.\n- the Tract runtime is implemented purely in Rust, and does not need any shared\n  libraries. However, it successfully passes only _about 85% of ONNX backend\n  tests_, and it does not implement internal multi-threading or GPU access.\n\nThe following represents a _very simple_ benchmark of running two computer\nvision models, [SqueezeNetV1][sq] and [MobileNetV2][mb], compiled natively, run\nwith WASI NN with both back-ends, and run purely on WebAssembly using Tract. 
All\ninferences are performed on the CPU only for now:\n\n![SqueezeNet performance](docs/squeezenet.png)\n\n![MobileNetV2 performance](docs/mobilenetv2.png)\n\nA few notes on the performance:\n\n- this represents _very early_ data, on a limited number of runs and models, and\n  should only be interpreted in terms of the relative performance difference we\n  can expect between native, WASI NN and pure WebAssembly\n- the ONNX runtime is running multi-threaded on the CPU _only_, as the GPU is\n  not yet enabled\n- in each case, all tests are executing the same ONNX model on the same images\n- all WebAssembly modules (both those built with WASI NN and the ones running\n  pure Wasm) are run with Wasmtime v0.28, with caching enabled\n- no other special optimizations have been performed on either module, and we\n  suspect that optimizations such as using\n  [`wasm-opt`](https://github.com/WebAssembly/binaryen),\n  [Wizer](https://github.com/bytecodealliance/wizer), or AOT compilation could\n  significantly improve module startup time\n- there are known limitations in both runtimes that, when fixed, should also\n  significantly improve the performance\n- as we test with more ONNX models, the data should be updated\n\n### Current limitations\n\n- only FP32 tensor types are currently supported\n  ([#20](https://github.com/deislabs/wasi-nn-onnx/issues/20)) - this is the main\n  limitation right now, and it has to do with the way we track state internally.\n  This should not affect popular models (such as computer vision scenarios).\n- GPU execution is not yet enabled in the native ONNX runtime\n  ([#9](https://github.com/deislabs/wasi-nn-onnx/issues/9))\n\n### Building, running, and writing WebAssembly modules that use WASI NN\n\nThe following are the build instructions for Linux. First, download\n[the ONNX runtime 1.6 shared library](https://github.com/microsoft/onnxruntime/releases/tag/v1.6.0)\nand unarchive it. 
Then, build the helper binary:\n\n```\n➜ cargo build --release --bin wasmtime-onnx --features tract,c_onnxruntime\n```\n\nAt this point, follow [the Rust example and test](./tests/rust/src/main.rs) to\nbuild a WebAssembly module that uses this API, which uses the\n[Rust client bindings for the API](https://github.com/bytecodealliance/wasi-nn).\n\nThen, to run the example and test from this repository, using the native ONNX\nruntime:\n\n```\n➜ LD_LIBRARY_PATH=\u003cPATH-TO-ONNX\u003e/onnx/onnxruntime-linux-x64-1.6.0/lib RUST_LOG=wasi_nn_onnx_wasmtime=info,wasmtime_onnx=info \\\n        ./target/release/wasmtime-onnx \\\n        tests/rust/target/wasm32-wasi/release/wasi-nn-rust.wasm \\\n        --cache cache.toml \\\n        --dir tests/testdata \\\n        --invoke batch_squeezenet \\\n        --c-runtime\n```\n\nOr to run the same function using the Tract runtime:\n\n```\n➜ LD_LIBRARY_PATH=\u003cPATH-TO-ONNX\u003e/onnx/onnxruntime-linux-x64-1.6.0/lib RUST_LOG=wasi_nn_onnx_wasmtime=info,wasmtime_onnx=info \\\n      ./target/release/wasmtime-onnx \\\n      tests/rust/target/wasm32-wasi/release/wasi-nn-rust.wasm \\\n      --cache cache.toml \\\n      --dir tests/testdata \\\n      --invoke batch_squeezenet\n```\n\nThe project exposes two Cargo features: `tract`, which is the default feature,\nand `c_onnxruntime`, which, when enabled, compiles support for using the C\nAPI for the ONNX runtime. After building with this feature enabled, running the\nbinary requires passing the path to the ONNX shared libraries, either as part of\nthe PATH, or by setting `LD_LIBRARY_PATH`.\n\n### Contributing\n\nWe welcome any contribution that adheres to our code of conduct. This project is\nexperimental, and we are delighted you are interested in using or contributing\nto it! Please have a look at\n[the issue queue](https://github.com/deislabs/wasi-nn-onnx/issues) and either\ncomment on existing issues, or open new ones for bugs or questions. 
We are\nparticularly looking for help in fixing the current known limitations, so please\nhave a look at\n[issues labeled with `help wanted`](https://github.com/deislabs/wasi-nn-onnx/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated-desc+label%3A%22help+wanted%22).\n\n### Code of Conduct\n\nThis project has adopted the\n[Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).\n\nFor more information see the\n[Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or\ncontact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any\nadditional questions or comments.\n\n[onnx]: https://onnx.ai/\n[wasi-nn]: https://github.com/Webassembly/wasi-nn\n[wasmtime-impl]:\n  https://github.com/bytecodealliance/wasmtime/tree/main/crates/wasi-nn\n[msft]: https://github.com/microsoft/onnxruntime\n[bindings]: https://github.com/nbigaouette/onnxruntime-rs\n[tract]: https://github.com/sonos/tract\n[nn]: https://bytecodealliance.org/articles/using-wasi-nn-in-wasmtime\n[intel-talk]: https://youtu.be/lz2I_4vvCuc\n[sq]:\n  https://github.com/onnx/models/tree/master/vision/classification/squeezenet\n[mb]: https://github.com/onnx/models/tree/master/vision/classification/mobilenet\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeislabs%2Fwasi-nn-onnx","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdeislabs%2Fwasi-nn-onnx","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeislabs%2Fwasi-nn-onnx/lists"}