https://github.com/jgw96/web-ai-toolkit
The Web AI Toolkit simplifies the integration of AI features, such as OCR, speech-to-text, text summarization and more into your application. It ensures data privacy and offline capability by running all AI workloads locally, leveraging WebNN when available, with a fallback to WebGPU.
https://github.com/jgw96/web-ai-toolkit
ai image-classification rag retrieval-augmented-generation speech-to-text transformersjs webai webgpu webnn
Last synced: 3 months ago
JSON representation
The Web AI Toolkit simplifies the integration of AI features, such as OCR, speech-to-text, text summarization and more into your application. It ensures data privacy and offline capability by running all AI workloads locally, leveraging WebNN when available, with a fallback to WebGPU.
- Host: GitHub
- URL: https://github.com/jgw96/web-ai-toolkit
- Owner: jgw96
- License: mit
- Created: 2024-07-03T07:02:08.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2025-01-13T06:13:51.000Z (5 months ago)
- Last Synced: 2025-03-09T20:46:29.254Z (3 months ago)
- Topics: ai, image-classification, rag, retrieval-augmented-generation, speech-to-text, transformersjs, webai, webgpu, webnn
- Language: TypeScript
- Homepage:
- Size: 126 KB
- Stars: 39
- Watchers: 2
- Forks: 3
- Open Issues: 2
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-webnn - Web AI Toolkit
- awesome-webnn - Web AI Toolkit
README
# Web AI Toolkit
The Web AI Toolkit simplifies the integration of AI features, such as OCR, speech-to-text, text summarization and more into your application. It ensures data privacy and offline capability by running all AI workloads locally, leveraging WebNN when available, with a fallback to WebGPU.
## Installation
To install the Web AI Toolkit, run:
```sh
npm install web-ai-toolkit
```## Available Functions
*Note: Supported hardware is listed in priority of device selection. For example, for transcribing an audio file,
the code will attempt to choose the GPU first and then the CPU otherwise.*| Function Name | Parameter | Type | Default Value | Supported Hardware |
|-----------------------|----------------|------------------------|---------------|--------------------|
| transcribeAudioFile | audioFile | Blob | - | GPU / CPU |
| | model | string | "Xenova/whisper-tiny"| |
| | timestamps | boolean | false | |
| | language | string | "en-US" | |
| textToSpeech | text | string | - | GPU / CPU |
| | model | string | "Xenova/mms-tts-eng"| |
| summarize | text | string | - | GPU / CPU |
| | model | string | "Xenova/distilbart-cnn-6-6"| |
| ocr | image | Blob | - | GPU / CPU |
| | model | string | "Xenova/trocr-small-printed"| |
| classifyImage | image | Blob | - | NPU / GPU / CPU |
| | model | string | "Xenova/resnet-50"| |
| doRAGSearch | texts | Array | [] | GPU
| | query | string | "" | |## Usage
Here are examples of how to use each function:
### RAG (Retrieval-Augmented Generation)
```javascript
import { doRAGSearch } from 'web-ai-toolkit';window.showOpenFilePicker().then(async (file) => {
const fileBlob = await file[0].getFile();
const text = await fileBlob.text();// text can be derived from anything
// this sample is just meant to be extremely simple
// for example, your text could be an array of text that you have OCR'ed
// from some photosconst query = "My Search Query";
const ragQuery = await doRAGSearch([text], query);
console.log(ragQuery);
});
```### Transcribe Audio File
```javascript
import { transcribeAudioFile } from 'web-ai-toolkit';const audioFile = ...; // Your audio file Blob
const transcription = await transcribeAudioFile(audioFile, "Xenova/whisper-tiny", true, "en-US");
console.log(transcription);
```### Text to Speech
```javascript
import { textToSpeech } from 'web-ai-toolkit';const text = "Hello, world!";
const audio = await textToSpeech(text);
console.log(audio);
```### Summarize Text
```javascript
import { summarize } from 'web-ai-toolkit';const text = "Long text to be summarized...";
const summary = await summarize(text);
console.log(summary);
```### Optical Character Recognition (OCR)
```javascript
import { ocr } from 'web-ai-toolkit';const image = ...; // Your image Blob
const text = await ocr(image);
console.log(text);
```### Image Classification
```javascript
import { classifyImage } from 'web-ai-toolkit';const image = ...; // Your image Blob
const text = await classifyImage(image);
console.log(text);
```## Technical Details
The Web AI Toolkit utilizes the [transformers.js project](https://huggingface.co/docs/transformers.js/index) to run AI workloads. All AI processing is performed locally on the device, ensuring data privacy and reducing latency. AI workloads are run using the [WebNN API](https://learn.microsoft.com/en-us/windows/ai/directml/webnn-overview) when available, otherwise falling back to the WebGPU API, or even to the CPU with WebAssembly. Choosing the correct hardware to target is handled by the library.
## Contribution
We welcome contributions to the Web AI Toolkit. Please fork the repository and submit a pull request with your changes. For major changes, please open an issue first to discuss what you would like to change.
## License
The Web AI Toolkit is licensed under the MIT License. See the [LICENSE](LICENSE) file for more details.
## Contact
For questions or support, please open an issue here on GitHub
---
Thank you for using the Web AI Toolkit! We hope it makes integrating AI into your applications easier and more efficient.