# web-transformers

[![Build and Test](https://github.com/praeclarum/web-transformers/actions/workflows/build.yml/badge.svg?branch=main)](https://github.com/praeclarum/web-transformers/actions/workflows/build.yml)

This library enables you to run Hugging Face transformer models directly in the browser.
It accomplishes this by running the models using the
[ONNX Runtime JavaScript API](https://github.com/microsoft/onnxruntime/tree/main/js)
and by implementing its own JavaScript-only tokenization library.
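
Under the hood, the exported models are plain ONNX graphs. As an illustration only (the file name and input names below are placeholders, not this library's internals), running such a graph with `onnxruntime-web` looks roughly like this:

```typescript
import * as ort from 'onnxruntime-web';

// Illustrative sketch: load an exported T5 encoder graph and run it once.
// The model path, input names, and token ids are placeholder values.
const session = await ort.InferenceSession.create('/models/t5-small-encoder.onnx');

const inputIds = new ort.Tensor('int64', BigInt64Array.from([13959n, 1566n, 12n]), [1, 3]);
const attentionMask = new ort.Tensor('int64', BigInt64Array.from([1n, 1n, 1n]), [1, 3]);

const outputs = await session.run({
  input_ids: inputIds,
  attention_mask: attentionMask,
});
console.log(Object.keys(outputs)); // the graph's named outputs, e.g. hidden states
```

The library wraps this kind of session management, plus tokenization and the generation loop, behind the API shown below.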

At the moment, it is compatible with Google's T5 models, but it was designed to be extended.
I hope to support GPT2, Roberta, and InCoder in the future.

## Usage

The following example code shows how to load a tokenizer and a transformer model.
It then uses those objects to create a `generate` function that generates
output text from input text.

```typescript
import { AutoTokenizer, T5ForConditionalGeneration } from 'web-transformers';

// Load the tokenizer and model.
// NOTE: modelId and modelsPath are placeholders; point them at the model name
// and the directory that hosts its ONNX and tokenizer files.
const modelId = "t5-small";
const modelsPath = "/models";
const tokenizer = AutoTokenizer.fromPretrained(modelId, modelsPath);
const model = new T5ForConditionalGeneration(modelId, modelsPath);

// Generate text using Seq2Seq
async function generate(inputText: string) {
    // Tokenize the text
    const inputTokenIds = await tokenizer.encode(inputText);
    // Generate output
    const generationOptions = {
        "maxLength": 50,
        "topK": 10,
    };
    const outputTokenIds = await model.generate(inputTokenIds, generationOptions);
    // Convert the output tokens back into text
    const outputText = await tokenizer.decode(outputTokenIds, true);
    return outputText;
}

// Translate
console.log(await generate("translate English to French: Hello World!"));
```
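
In `generationOptions`, `maxLength` caps the number of tokens generated and `topK` controls the top-k sampling strategy described in the Sampling section below.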

## Sample Web App

This repo contains a [sample Next.js app](sample) that translates text using a transformer neural network.

First, clone the repository and build the library:

```bash
git clone [email protected]:praeclarum/web-transformers.git
cd web-transformers
npm i
```

Now, build the sample:

```bash
cd sample
npm i
```

Before running the sample, you need to download the ONNX model files. You can do this by running the following command:

```bash
npm run download_models
```

Then you can run the sample with:

```bash
npm run dev
```

## Models

Currently only the *T5* network is supported.

### Sampling

The neural network outputs the logarithm of the probability of each token.
To produce the next token, a sample has to be drawn from that distribution.
The following algorithms are implemented (a sketch follows the list):

* *Greedy*: Take the single token with the highest probability.
* *Top-k*: Sample from the k tokens with the highest probability.
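
Here is a minimal sketch of both strategies (the function names are illustrative, not the library's exports), assuming `logProbs` holds the model's log-probability for each token id at the current step:

```typescript
// Greedy: pick the index of the highest log-probability.
function greedySample(logProbs: number[]): number {
  let best = 0;
  for (let i = 1; i < logProbs.length; i++) {
    if (logProbs[i] > logProbs[best]) best = i;
  }
  return best;
}

// Top-k: keep the k most probable tokens, renormalize, then sample one.
function topKSample(logProbs: number[], k: number): number {
  const indexed = logProbs.map((lp, id) => ({ id, lp }));
  indexed.sort((a, b) => b.lp - a.lp);
  const topK = indexed.slice(0, k);
  const max = topK[0].lp;
  const weights = topK.map(t => Math.exp(t.lp - max)); // numerically stable softmax numerators
  const total = weights.reduce((sum, w) => sum + w, 0);
  let r = Math.random() * total;
  for (let i = 0; i < topK.length; i++) {
    r -= weights[i];
    if (r <= 0) return topK[i].id;
  }
  return topK[topK.length - 1].id;
}
```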

## Sponsorship

This library was made possible thanks to the financial support of [Reflect](https://reflect.app). Thanks Reflect!

## License

Copyright © 2022 [Frank A. Krueger](https://github.com/praeclarum). This project is [MIT](LICENSE.md) licensed.