# `wasi-nn`

A proposed [WebAssembly System Interface](https://github.com/WebAssembly/WASI) API for machine
learning (ML).

### Current Phase

`wasi-nn` is currently in [Phase 2].

[Phase 2]: https://github.com/WebAssembly/WASI/blob/42fe2a3ca159011b23099c3d10b5b1d9aff2140e/docs/Proposals.md#phase-2---proposed-spec-text-available-cg--wg

### Champions

- [Andrew Brown](https://github.com/abrown)
- [Mingqiu Sun](https://github.com/mingqiusun)

### Phase 4 Advancement Criteria

`wasi-nn` must have at least two complete independent implementations.

## Table of Contents

- [Introduction](#introduction)
- [Goals](#goals)
- [Non-goals](#non-goals)
- [API walk-through](#api-walk-through)
- [Detailed design discussion](#detailed-design-discussion)
- [Considered alternatives](#considered-alternatives)
- [Stakeholder Interest & Feedback](#stakeholder-interest--feedback)
- [References & acknowledgements](#references--acknowledgements)

### Introduction

`wasi-nn` is a WASI API for performing ML inference. Its name derives from the fact that ML models
are also known as neural networks (`nn`). ML models are typically trained on a large data set,
resulting in one or more files that describe the model's weights. The model is then used to compute
an "inference," e.g., the probabilities of classifying an image as a set of tags. This API is
initially concerned with inference, not training.

Why expose ML inference as a WASI API? Though inference can be encoded directly into WebAssembly,
there are two primary motivations for `wasi-nn`:
1. __ease of use__: an entire ecosystem already exists to train and use models (e.g., TensorFlow,
   ONNX, OpenVINO, etc.); `wasi-nn` is designed to make it easy to use existing model formats as-is
2. __performance__: the nature of ML inference makes it amenable to hardware acceleration of various
   kinds; without this hardware acceleration, inference can suffer slowdowns of several hundred
   times. Hardware acceleration for ML is very diverse &mdash; SIMD (e.g., AVX-512), GPUs, TPUs,
   FPGAs &mdash; and it is unlikely (impossible?) that all of these would be supported natively in
   WebAssembly

WebAssembly programs that want to use a host's ML capabilities can access them through `wasi-nn`'s
core abstractions: _backends_, _graphs_, and _tensors_. A user selects a _backend_ for inference and
loads a model, instantiated as a _graph_, to use in the _backend_. Then, the user passes _tensor_
inputs to the _graph_, computes the inference, and retrieves the _tensor_ outputs.

`wasi-nn` _backends_ correspond to existing ML frameworks, e.g., TensorFlow, ONNX, OpenVINO, etc.
`wasi-nn` places no requirements on hosts to support specific _backends_; the API is purposefully
designed to allow the largest number of ML frameworks to implement it. `wasi-nn` _graphs_ can be
passed as opaque byte sequences to support any number of model formats. This makes the API
framework- and format-agnostic, since we expect device vendors to provide the ML _backend_ and
support for their particular _graph_ format.

Users can find language bindings for `wasi-nn` at the [wasi-nn bindings] repository; request
additional language support there. More information about `wasi-nn` can be found at:

[wasi-nn bindings]: https://github.com/bytecodealliance/wasi-nn

- Blog post: [Machine Learning in WebAssembly: Using wasi-nn in
  Wasmtime](https://bytecodealliance.org/articles/using-wasi-nn-in-wasmtime)
- Blog post: [Implementing a WASI Proposal in Wasmtime:
  wasi-nn](https://bytecodealliance.org/articles/implementing-wasi-nn-in-wasmtime)
- Blog post: [Neural network inferencing for PyTorch and TensorFlow with ONNX, WebAssembly System
  Interface, and wasi-nn](https://deislabs.io/posts/wasi-nn-onnx/)
- Recorded talk: [Machine Learning with Wasm
  (wasi-nn)](https://www.youtube.com/watch?v=lz2I_4vvCuc)
- Recorded talk: [Lightning Talk: High Performance Neural Network Inferencing Using
  wasi-nn](https://www.youtube.com/watch?v=jnM0tsRVM_8)

### Goals

The primary goal of `wasi-nn` is to allow users to perform ML inference from WebAssembly using
existing models (i.e., ease of use) and with maximum performance. Though the primary focus is
inference, we plan to leave open the possibility of performing ML training in the future (request
training in an [issue](https://github.com/WebAssembly/wasi-nn/issues)!).

Another design goal is to make the API framework- and model-agnostic; this allows the API to be
implemented with multiple ML frameworks and model formats. The `load` method returns an error when
an unsupported model encoding scheme is passed in. This approach is similar to how a browser deals
with image or video encodings.

### Non-goals

`wasi-nn` is not designed to provide support for individual ML operations (a "model builder" API).
The ML field is still evolving rapidly, with new operations and network topologies emerging
continuously. It would be a challenge to define an evolving set of operations to support in the API.
Instead, our approach is to start with a "model loader" API, inspired by WebNN's model loader
proposal.

### API walk-through

The following example describes how a user would use `wasi-nn`:

```rust
// Load the model.
let encoding = wasi_nn::GRAPH_ENCODING_...;
let target = wasi_nn::EXECUTION_TARGET_CPU;
let graph = wasi_nn::load(&[bytes, more_bytes], encoding, target);

// Configure the execution context.
let context = wasi_nn::init_execution_context(graph);
let tensor = wasi_nn::Tensor { ... };
wasi_nn::set_input(context, 0, tensor);

// Compute the inference.
wasi_nn::compute(context);
wasi_nn::get_output(context, 0, &mut output_buffer, output_buffer.len());
```

Note that the details above will depend on the model and backend used; the pseudo-Rust simply
illustrates the general idea, minus any error-checking. Consult the
[AssemblyScript](https://github.com/bytecodealliance/wasi-nn/tree/main/assemblyscript/examples) and
[Rust](https://github.com/bytecodealliance/wasi-nn/tree/main/rust/examples) bindings for more
detailed examples.

### Detailed design discussion

For the details of the API, see [wasi-nn.wit](wit/wasi-nn.wit).

<!--
This section should mostly refer to the .wit.md file that specifies the API. This section is for
any discussion of the choices made in the API which don't make sense to document in the spec file
itself.
-->

#### Should `wasi-nn` support training models?

Ideally, yes. In the near term, however, exposing (and implementing) the inference-focused API is
sufficiently complex to postpone a training-capable API until later. Also, models are typically
trained offline, prior to deployment, and it is unclear why training models using WASI would be an
advantage over training them natively. (Conversely, the inference API does make sense: performing
ML inference in a Wasm deployment is a known use case.) See the associated discussion
[here](https://github.com/WebAssembly/wasi-nn/issues/6) and feel free to open pull requests or
issues related to this that fit within the goals above.

#### Should `wasi-nn` support inspecting models?

Ideally, yes. The ability to inspect models would allow users to determine, at runtime, the tensor
shapes of a model's inputs and outputs. As with ML training (above), this can be added in the
future.

<!--
More "tricky" design choices fit here.
-->

### Considered alternatives

There are other ways to perform ML inference from a WebAssembly program:

1. a user could specify a __custom host API__ for ML tasks; this is similar to the approach taken
   [here](https://github.com/second-state/wasmedge_tensorflow_interface). The advantages and
   disadvantages are in line with other "spec vs. custom" trade-offs: the user can precisely tailor
   the API to their use case, etc., but will not be able to switch easily between implementations.
2. a user could __compile a framework and/or model to WebAssembly__; this is similar to
   [here](https://github.com/sonos/tract) and
   [here](https://blog.tensorflow.org/2020/03/introducing-webassembly-backend-for-tensorflow-js.html).
   The primary disadvantage of this approach is performance: WebAssembly, even with the recent
   addition of 128-bit SIMD, does not have optimized primitives for performing ML inference or
   accessing ML-optimized hardware. The performance loss can be several orders of magnitude.

### Stakeholder Interest & Feedback

TODO before entering Phase 3.

<!--
This should include a list of implementers who have expressed interest in implementing the proposal
-->

### References & acknowledgements

Many thanks for valuable feedback and advice from:

- [Brian Jones](https://github.com/brianjjones)
- [Radu Matei](https://github.com/radu-matei)
- [Steve Schoettler](https://github.com/stevelr)
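One practical detail behind the walk-through's `get_output` call: the caller must supply a buffer large enough to hold the result tensor, and that size is the product of the tensor's dimensions times the byte width of its element type. The following is a minimal, self-contained sketch of that calculation; the `TensorType` enum and `tensor_byte_size` helper are illustrative stand-ins, not part of the `wasi-nn` API itself:

```rust
// Illustrative element types resembling those a wasi-nn backend might
// expose; the names and byte widths here are hypothetical examples.
#[derive(Clone, Copy)]
enum TensorType {
    F16,
    F32,
    U8,
    I32,
}

impl TensorType {
    /// Byte width of a single tensor element.
    fn byte_width(self) -> usize {
        match self {
            TensorType::F16 => 2,
            TensorType::F32 => 4,
            TensorType::U8 => 1,
            TensorType::I32 => 4,
        }
    }
}

/// Number of bytes needed to hold a tensor with the given dimensions:
/// the product of all dimensions times the element's byte width.
fn tensor_byte_size(dims: &[usize], ty: TensorType) -> usize {
    dims.iter().product::<usize>() * ty.byte_width()
}

fn main() {
    // A typical image-classification input: NCHW 1x3x224x224, f32.
    let size = tensor_byte_size(&[1, 3, 224, 224], TensorType::F32);
    assert_eq!(size, 602_112); // 1 * 3 * 224 * 224 * 4 bytes
    println!("output buffer needs {size} bytes");
}
```

Sizing the buffer this way before calling `get_output` avoids truncated results when the output tensor is larger than expected.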
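The goal that `load` reject unsupported model encodings (much as a browser rejects an unknown image format) can be sketched as a simple membership check against the backends a host actually ships. The `GraphEncoding` and `NnError` types and the `load` signature below are hypothetical simplifications, not the spec's actual interface:

```rust
// Hypothetical encoding and error types for illustration only.
#[derive(Debug, PartialEq)]
enum GraphEncoding {
    OpenVino,
    Onnx,
    TensorFlow,
}

#[derive(Debug, PartialEq)]
enum NnError {
    InvalidEncoding,
}

/// A host implementing `load` first checks the requested encoding
/// against the backends it supports, before touching the model bytes.
fn load(
    _bytes: &[u8],
    encoding: GraphEncoding,
    supported: &[GraphEncoding],
) -> Result<(), NnError> {
    if supported.contains(&encoding) {
        Ok(())
    } else {
        Err(NnError::InvalidEncoding)
    }
}

fn main() {
    // This host happens to ship ONNX and OpenVINO backends only.
    let supported = [GraphEncoding::Onnx, GraphEncoding::OpenVino];
    assert!(load(b"...", GraphEncoding::Onnx, &supported).is_ok());
    assert_eq!(
        load(b"...", GraphEncoding::TensorFlow, &supported),
        Err(NnError::InvalidEncoding)
    );
}
```

Failing fast on the encoding, rather than on a backend crash deep in graph parsing, is what lets guest code fall back to another format or report a clear error.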