https://github.com/dimforge/wgml

Cross-platform GPU LLM inference with WebGPU and wgmath.

# wgml − GPU local inference on every platform

-----

**wgml** is a set of [Rust](https://www.rust-lang.org/) libraries exposing [WebGPU](https://www.w3.org/TR/WGSL/) shaders
and kernels for local Large Language Model (LLM) inference on the GPU. It is cross-platform and runs on the web.
**wgml** can be used as a Rust library to assemble your own transformer from the provided operators (and to write your
own on top of them).
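To give a sense of what such an operator computes, here is a plain-CPU reference for matrix multiplication, the workhorse of transformer inference. This is an illustrative sketch only, not wgml's API: in wgml the equivalent operator runs as a WebGPU compute kernel, and all names below are hypothetical.

```rust
// CPU reference for the kind of operator a GPU kernel implements:
// row-major matrix multiply C = A (m×k) · B (k×n).
// This is NOT wgml's API; it only illustrates the operation.
fn matmul(a: &[f32], b: &[f32], m: usize, k: usize, n: usize) -> Vec<f32> {
    assert_eq!(a.len(), m * k);
    assert_eq!(b.len(), k * n);
    let mut c = vec![0.0f32; m * n];
    for i in 0..m {
        for j in 0..n {
            let mut acc = 0.0f32;
            for p in 0..k {
                acc += a[i * k + p] * b[p * n + j];
            }
            c[i * n + j] = acc;
        }
    }
    c
}

fn main() {
    // Multiplying by the 2×2 identity returns the other matrix unchanged.
    let a = [1.0, 0.0, 0.0, 1.0];
    let b = [3.0, 4.0, 5.0, 6.0];
    assert_eq!(matmul(&a, &b, 2, 2, 2), vec![3.0, 4.0, 5.0, 6.0]);
}
```

A GPU version of this loop nest is what the provided kernels parallelize across workgroups; the library's role is to let you chain such operators into a full transformer forward pass.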

Aside from the library, two binary crates are provided:
- **wgml-bench** is a basic benchmarking utility for measuring matrix-multiplication times across various
quantization formats.
- **wgml-chat** is a basic chat GUI application for loading GGUF files and chatting with the model. It can be run
natively or in the browser. Check out its [README](./crates/wgml-chat/README.md) for details on how to run it, or
try the [online demo](https://wgmath.rs/demos/wgml/index.html) directly in your browser.
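For context on the quantization formats being benchmarked: GGUF-family formats such as Q8_0 store weights in small blocks, each holding one scale plus a set of low-precision integers. The sketch below is a simplified CPU illustration of that idea, not wgml's implementation (the real Q8_0 format uses 32-element blocks with an f16 scale; here the scale is kept as f32 for simplicity).

```rust
// Q8_0-style block quantization, simplified: each block of 32 f32 values
// is stored as one scale plus 32 signed 8-bit quants.
// Illustrative only — not wgml's actual data layout.
const BLOCK: usize = 32;

fn quantize_q8_0(xs: &[f32; BLOCK]) -> (f32, [i8; BLOCK]) {
    // Scale so the largest magnitude maps to ±127.
    let amax = xs.iter().fold(0.0f32, |m, &x| m.max(x.abs()));
    let scale = amax / 127.0;
    let inv = if scale == 0.0 { 0.0 } else { 1.0 / scale };
    let mut qs = [0i8; BLOCK];
    for (q, &x) in qs.iter_mut().zip(xs) {
        *q = (x * inv).round() as i8;
    }
    (scale, qs)
}

fn dequantize_q8_0(scale: f32, qs: &[i8; BLOCK]) -> [f32; BLOCK] {
    let mut xs = [0.0f32; BLOCK];
    for (x, &q) in xs.iter_mut().zip(qs) {
        *x = q as f32 * scale;
    }
    xs
}

fn main() {
    let mut xs = [0.0f32; BLOCK];
    for (i, x) in xs.iter_mut().enumerate() {
        *x = i as f32 - 16.0; // values in [-16, 15]
    }
    let (scale, qs) = quantize_q8_0(&xs);
    let ys = dequantize_q8_0(scale, &qs);
    // Round-trip error is bounded by half a quantization step.
    for (x, y) in xs.iter().zip(ys.iter()) {
        assert!((x - y).abs() <= scale * 0.5 + 1e-6);
    }
}
```

Formats like this trade a small reconstruction error for a roughly 4× reduction in memory traffic versus f32, which is why matmul throughput is benchmarked per format.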

⚠️ **wgml** is still under heavy development and may lack some important features. Contributions are welcome!

----