Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/pacoccino/ai-mask
Serve local ML inference engines to web apps
ai artificial-intelligence chrome-extension machine-learning wasm webgpu
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/pacoccino/ai-mask
- Owner: pacoccino
- Created: 2024-02-22T21:34:48.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2024-04-09T08:42:49.000Z (10 months ago)
- Last Synced: 2024-12-29T06:51:33.317Z (about 1 month ago)
- Topics: ai, artificial-intelligence, chrome-extension, machine-learning, wasm, webgpu
- Language: TypeScript
- Homepage: https://chromewebstore.google.com/detail/lkfaajachdpegnlpikpdajccldcgfdde
- Size: 510 KB
- Stars: 23
- Watchers: 3
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-local-ai - AI-Mask - Browser extension to provide model inference to web apps. Backed by web-llm and transformers.js (Inference UI)
README
# AI-Mask
![AI-Mask Logo](/packages/extension/icons/icon-128.png)

**Bring local inference into web apps!**
[Download extension](https://chromewebstore.google.com/detail/lkfaajachdpegnlpikpdajccldcgfdde) | [Supported apps](#supported-apps) | [Integration guide](#for-integrators) | [SDK Documentation](/packages/sdk)
[![npm version](https://badge.fury.io/js/@ai-mask%2Fsdk.svg)](https://badge.fury.io/js/@ai-mask%2Fsdk)
> This is an experimental project at an MVP stage.
> **[Feedback](https://github.com/pacoccino/ai-mask/discussions) will be greatly appreciated for the future of this project.**
## What?

AI-Mask is a Chrome **web extension** that serves as a local provider for **AI** model execution. It runs models **on-device** for web apps that need them, for **free**, and with full **privacy**.
See it as the [Metamask](https://metamask.io/) of AI.
**Try it!**

[Install the extension](https://pacoccino.github.io/ai-mask/), then open the [Chat app](https://chatbot.opac.me).

[AI-Mask Demo.mp4](https://github.com/pacoccino/ai-mask/assets/1371207/f75e8b27-c91a-4bc6-bd14-8eae0d68050f)
## Why?

On-device AI inference has been gaining real traction recently. Most of our devices are already capable of executing machine learning models, and the software support is ready.
Thanks to some [amazing](https://github.com/mlc-ai/web-llm) [libraries](https://github.com/xenova/transformers.js), running **machine learning** models **in the browser** has become remarkably easy, accelerated with **WASM** and **WebGPU**. This means they run at nearly **full performance** on virtually **any device**, hardware, and operating system.
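For example, here is a minimal sketch of what in-browser inference looks like with transformers.js (assuming the `@xenova/transformers` package and its default hosted model for the task):

```typescript
import { pipeline } from '@xenova/transformers';

// The first call downloads the model into the browser cache; later calls
// run fully locally, accelerated with WASM (and WebGPU where available).
const classifier = await pipeline('sentiment-analysis');
const result = await classifier('Running models in the browser is easy!');
console.log(result); // e.g. [{ label: 'POSITIVE', score: 0.99 }]
```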
**But** state-of-the-art web inference libraries store models in the browser cache, which has been, for security reasons, [domain-partitioned](https://developer.chrome.com/blog/http-cache-partitioning). This means that if multiple web apps use the same model, it must be downloaded once per domain, which can consume a **lot of disk space**.
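To make that concrete, here is a minimal sketch assuming the library stores weights through the browser Cache API (the URL and cache name are hypothetical):

```typescript
// Each origin gets its own partition of cache storage, so the same model
// file is re-downloaded and stored once per domain that uses it.
const MODEL_URL = 'https://example.com/models/gemma-2b.bin'; // hypothetical

const cache = await caches.open('model-cache');
let weights = await cache.match(MODEL_URL);
if (!weights) {
  weights = await fetch(MODEL_URL); // repeated on every new origin
  await cache.put(MODEL_URL, weights.clone());
}
```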
With this extension, models are **cached only once** and served to websites conveniently through a **unified SDK**.
## Future
This project is an experiment to see whether the idea is interesting and gains traction among users and app developers.
Another major planned feature is to proxy requests to OpenAI-like APIs. Users would store their API keys in the extension, and apps would ask the extension to run models (a rough sketch follows the list below).
This would mean:
- Users would no longer have to share API keys with untrusted apps
- Users would no longer share private data with apps
- App developers would no longer need a backend server that proxies API requests to work around CORS issues and manipulate responses
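As a purely hypothetical sketch of that flow (none of this is implemented; the message type, its shape, and the `externally_connectable` setup are all assumptions):

```typescript
// Hypothetical proxy flow: the page hands the request to the extension,
// which attaches the user's stored API key in its own context, so the key
// never enters the web app's JavaScript. Assumes the extension lists the
// site under "externally_connectable" in its manifest.
const AIMASK_EXTENSION_ID = 'lkfaajachdpegnlpikpdajccldcgfdde'

const completion = await chrome.runtime.sendMessage(AIMASK_EXTENSION_ID, {
  type: 'openai-proxy', // hypothetical message shape, not a real API
  endpoint: '/v1/chat/completions',
  body: { model: 'gpt-4', messages: [{ role: 'user', content: 'Hello!' }] },
})
```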
## Supported Apps

Web apps that are compatible with this extension for local inference:
- [Demo App](https://pacoccino.github.io/ai-mask/) ([code](/examples/demo-app/))
- [chatbot-ui](https://chatbot.opac.me) ([code](https://github.com/mckaywrigley/chatbot-ui/pull/1590))
- [fully-local-pdf-chatbot](https://fully-local-pdf-chatbot-topaz.vercel.app/) ([code](https://github.com/jacoblee93/fully-local-pdf-chatbot/pull/19))

## Usage
### For Users
Enjoy **free** and **private** execution of AI models!
No more paying to use models, no more leaking private data, and no more handing your API keys to third-party apps.
**How To:**
1. [Install the extension](https://chromewebstore.google.com/detail/lkfaajachdpegnlpikpdajccldcgfdde)
2. Use a [Supported app](#supported-apps)

### For Integrators
Easily support AI-Mask in your AI apps and bring free, private local inference to your users! Stop storing API keys, and get rid of your backend and server costs.
**Quick Start**
Install package:
```shell
npm install -S @ai-mask/sdk
```
Run inference:
```typescript
import { AIMaskClient } from '@ai-mask/sdk'

// Ask the extension to run a chat completion on-device.
const messages = [{ role: 'user', content: 'What is the capital of France?' }]

const aiMaskClient = new AIMaskClient()
const response = await aiMaskClient.chat(
  { messages },
  { modelId: 'gemma-2b-it-q4f32_1' },
)
```

For the full reference, see the [AI-Mask SDK Documentation](/packages/sdk)
You can look at the [demo app code](/examples/demo-app/) and an [example pull request](https://github.com/pacoccino/chatbot-ui/pull/1/files) to see how easy it is to integrate into existing apps.
**Note**: App users must have the extension installed for this to work.
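Since that is not guaranteed, a defensive pattern like the following may help (a minimal sketch; it assumes SDK calls reject when the extension cannot be reached, which should be verified against the SDK documentation):

```typescript
import { AIMaskClient } from '@ai-mask/sdk'

// Hedged fallback: try AI-Mask first, surface a helpful message otherwise.
// Assumption: AIMaskClient operations fail when the extension is absent.
async function localChat(messages: { role: string; content: string }[]) {
  try {
    const client = new AIMaskClient()
    return await client.chat({ messages }, { modelId: 'gemma-2b-it-q4f32_1' })
  } catch (err) {
    // Fall back to a hosted API, or prompt the user to install the extension.
    throw new Error('AI-Mask extension not available; please install it first.')
  }
}
```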
## Technology
AI-Mask is a Manifest V3 extension that relies heavily on third-party libraries to execute model inference:
- [web-llm](https://github.com/mlc-ai/web-llm): inference with WASM/WebGPU via Apache TVM
- [transformers.js](https://github.com/xenova/transformers.js): inference with WASM via ONNX Runtime

Issues with service workers:
- [WebGPU is not exposed to service workers](https://github.com/gpuweb/gpuweb/issues/4197)
- For [various reasons](https://github.com/xenova/transformers.js/pull/462), transformers.js can only run single-threaded in service workers

To work around these issues, the engines run in an offscreen document.
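As a rough illustration of that workaround, here is a minimal sketch using Chrome's offscreen API (the `offscreen.html` page and the justification string are assumptions, not the extension's actual code):

```typescript
// Service workers lack WebGPU and multithreaded WASM, so inference is hosted
// in an offscreen document that the MV3 service worker spins up on demand.
async function ensureInferenceDocument() {
  if (await chrome.offscreen.hasDocument()) return
  await chrome.offscreen.createDocument({
    url: 'offscreen.html', // hypothetical page that bundles the engines
    reasons: [chrome.offscreen.Reason.WORKERS],
    justification: 'Run WASM/WebGPU model inference outside the service worker',
  })
}
```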
## Contribute
### Development
Requirements:
- Node 18+
- pnpm 8+ (for monorepo workspace management)

#### Start the development server for all packages (sdk/extension/demo-app)
```shell
pnpm dev
```
#### Typecheck and build for production
```shell
pnpm build
```

## Roadmap
- [x] Documentation
- [x] Deploy demo app
- [x] Deploy extension
- [x] SDK Working in web workers
- [x] ReadableStream option
- [ ] Bring computation back into the service worker starting with Chrome 124, thanks to WebGPU support
- [ ] Proxy OpenAI-like API requests and store user keys
- [ ] Create Langchain community libs
- [ ] Interrupts
- [ ] Include React hooks/utilities in SDK
- [ ] Pull request in one popular AI app
- [ ] Implement more tasks
- [ ] Add more models
- [ ] Unload model from memory after being inactive for a while