https://github.com/jgw96/speech-to-text-web-toolkit

Making Speech-To-Text on the web easy, both local and in the cloud

Host: GitHub
URL: https://github.com/jgw96/speech-to-text-web-toolkit
Owner: jgw96
Created: 2024-03-04T21:55:02.000Z (12 months ago)
Default Branch: main
Last Pushed: 2024-04-20T07:06:05.000Z (10 months ago)
Last Synced: 2025-02-01T18:11:22.201Z (17 days ago)
Topics: ai, lit, transformersjs, webcomponents, whisper
Language: TypeScript
Homepage:
Size: 1.84 MB
Stars: 1
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md

# Speech-To-Text Web Toolkit

The Speech-To-Text Web Toolkit is a web component that a developer can use to enable accurate speech-to-text, either locally or in the cloud, in their web-based application, including a WebView. Think of it as an upgrade to the Web Speech API.

For cloud transcription, the toolkit uses the Azure Speech Recognition API. For local transcription, it uses Transformers.js to run the OpenAI Whisper model on the device.

```bash
npm i speech-to-text-toolkit
```

## API

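```html
<!-- Load the speech-to-text-toolkit module before using the element -->

<speech-to-text>
  <button id="start-button">Start</button>
  <button id="stop-button">Stop</button>
</speech-to-text>

<script>
  // Wire the buttons to the component's start and stop methods
  document.querySelector('#start-button').addEventListener('click', startRecording);
  document.querySelector('#stop-button').addEventListener('click', stopRecording);

  const speechToText = document.querySelector('speech-to-text');

  // Fired when the component has recognized speech; the result is in e.detail
  speechToText.addEventListener('recognized', (e) => {
    console.log('recognized', e.detail);
  });

  function startRecording() {
    console.log('startRecording');
    speechToText.startSpeechToText();
  }

  function stopRecording() {
    console.log('stopRecording');
    speechToText.stopSpeechToText();
  }
</script>
```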
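The `recognized` listener above can be factored into a small helper that collects every result, for instance to build a running transcript. This is a minimal TypeScript sketch, not part of the toolkit: `SpeechToTextLike` and `collectRecognized` are hypothetical names, and the payload in `e.detail` is typed as `unknown` because the example above only logs it.

```typescript
// Hypothetical structural type: only the members the example above uses.
interface SpeechToTextLike {
  addEventListener(
    type: string,
    listener: (e: { detail?: unknown }) => void,
  ): void;
  startSpeechToText(): void;
  stopSpeechToText(): void;
}

// Subscribes to 'recognized' and returns an array that fills up as
// results arrive. The payload shape is not documented, so it stays unknown.
function collectRecognized(el: SpeechToTextLike): unknown[] {
  const results: unknown[] = [];
  el.addEventListener('recognized', (e) => {
    results.push(e.detail);
  });
  return results;
}

// In a page (assumes the component is already registered):
//   const el = document.querySelector('speech-to-text') as unknown as SpeechToTextLike;
//   const results = collectRecognized(el);
//   el.startSpeechToText();
```

Returning the array keeps the helper independent of the DOM, so it can be exercised with a stand-in object before it is wired to the real element.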

## More usage examples

### Do transcription on the device

The Speech-To-Text Web Toolkit can do speech-to-text on the device, using the user's GPU with a fallback to the CPU. This uses more of the user's device resources, and depending on the device, execution may be slower. However, it also means that speech never leaves the user's device. To use local transcription, set up `<speech-to-text>` like the following:

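```html
<!-- The attribute that selects local transcription is missing from this snippet -->
<speech-to-text>
  <button id="start-button">Start</button>
  <button id="stop-button">Stop</button>
</speech-to-text>
```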
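The GPU-with-CPU-fallback behaviour described above can be pictured as a small feature check. This is a sketch under assumptions, not the toolkit's actual code: `pickDevice` is a hypothetical helper, and the `'webgpu'` and `'wasm'` labels are assumed here to mirror the device names that Transformers.js accepts. It only tests for the presence of `navigator.gpu`, the WebGPU entry point. A stricter check would also await `navigator.gpu.requestAdapter()`, which can resolve to `null`.

```typescript
type Device = 'webgpu' | 'wasm';

// Hypothetical helper: prefer the GPU when the browser exposes WebGPU,
// otherwise fall back to running on the CPU via WebAssembly.
function pickDevice(nav: { gpu?: unknown }): Device {
  return nav.gpu !== undefined && nav.gpu !== null ? 'webgpu' : 'wasm';
}

// In a page, pass the global navigator object:
//   const device = pickDevice(navigator);
```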

### Do transcription in the cloud with the Azure Speech SDK

1. Get started by grabbing an API key for the Azure Speech SDK as described [here](https://learn.microsoft.com/en-us/azure/ai-services/speech-service/get-started-speech-to-text?tabs=windows%2Cterminal&pivots=programming-language-javascript#prerequisites).

2. Now, using your new API key and the region you created it for, set up `<speech-to-text>` like the following:

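```html
<!-- The attributes that pass the API key and region are missing from this snippet -->
<speech-to-text>
  <button id="start-button">Start</button>
  <button id="stop-button">Stop</button>
</speech-to-text>
```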

The component will now use the Azure Speech SDK for speech-to-text transcription. Note that this incurs a cost, as discussed in the docs linked above.

### Do transcription in the cloud or locally based on the device

The Speech-To-Text Web Toolkit can also decide automatically whether to transcribe in the cloud or locally. To enable this, set up `<speech-to-text>` like the following:

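```html
<!-- The attributes that enable automatic selection are missing from this snippet -->
<speech-to-text>
  <button id="start-button">Start</button>
  <button id="stop-button">Stop</button>
</speech-to-text>
```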

`<speech-to-text>` will run locally if the user's device has more than 4GB of RAM and a battery level over 50%. Otherwise, transcription will happen in the cloud.
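
The rule above can be written as a small pure function. This is an illustrative sketch, not the component's source: `DeviceStatus` and `chooseMode` are hypothetical names, and how the component actually reads memory and battery is an assumption (in a browser, `navigator.deviceMemory` and the `level` of the object returned by `navigator.getBattery()` are plausible sources where they are supported).

```typescript
type Mode = 'local' | 'cloud';

// Inputs for the decision. Where these values come from is an assumption,
// e.g. navigator.deviceMemory (GB) and BatteryManager.level (0.0 to 1.0).
interface DeviceStatus {
  memoryGB: number;
  batteryLevel: number; // 0.0 to 1.0
}

// Local only when RAM is above 4 GB AND battery is above 50%;
// every other combination goes to the cloud.
function chooseMode(status: DeviceStatus): Mode {
  const enoughMemory = status.memoryGB > 4;
  const enoughBattery = status.batteryLevel > 0.5;
  return enoughMemory && enoughBattery ? 'local' : 'cloud';
}

chooseMode({ memoryGB: 8, batteryLevel: 0.8 }); // 'local'
chooseMode({ memoryGB: 8, batteryLevel: 0.3 }); // 'cloud'
```

Reading "more than" and "over" strictly, a device with exactly 4GB of RAM or exactly 50% battery falls through to the cloud in this sketch.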