https://github.com/jgw96/speech-to-text-web-toolkit
Making Speech-To-Text on the web easy, both local and in the cloud
- Host: GitHub
- URL: https://github.com/jgw96/speech-to-text-web-toolkit
- Owner: jgw96
- Created: 2024-03-04T21:55:02.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2024-04-20T07:06:05.000Z (9 months ago)
- Last Synced: 2024-12-06T08:06:40.036Z (about 2 months ago)
- Topics: ai, lit, transformersjs, webcomponents, whisper
- Language: TypeScript
- Homepage:
- Size: 1.84 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
# Speech-To-Text Web Toolkit
The Speech-To-Text Web Toolkit is a web component that a developer can use to enable accurate speech-to-text, either locally or in the cloud, in their web-based application, including a WebView. Think of it as an upgrade to the Web Speech API.
Speech-To-Text uses the Azure Speech Recognition API to do speech-to-text in the cloud. For local speech-to-text, Speech-To-Text uses Transformers.js to run the OpenAI Whisper model locally.
```bash
npm i speech-to-text-toolkit
```

## API
```html
<button id="start-button">Start</button>
<button id="stop-button">Stop</button>

<speech-to-text></speech-to-text>

<script>
  const speechToText = document.querySelector('speech-to-text');

  // Log each result the component reports
  speechToText.addEventListener('recognized', (e) => {
    console.log('recognized', e.detail);
  });

  function startRecording() {
    console.log('startRecording');
    speechToText.startSpeechToText();
  }

  function stopRecording() {
    console.log('stopRecording');
    speechToText.stopSpeechToText();
  }

  document.querySelector('#start-button').addEventListener('click', startRecording);
  document.querySelector('#stop-button').addEventListener('click', stopRecording);
</script>
```

## More Usage examples
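The `recognized` listener from the API example can also be wrapped in a small typed helper that accumulates results. The following is only a sketch: `collectTranscripts` is a hypothetical helper, not part of the toolkit, and it assumes that `e.detail` carries the recognized text as a string, which the README does not confirm.

```typescript
// Hypothetical helper, not part of the toolkit.
// Assumption: the `recognized` event's `detail` is the recognized text (a string).
type RecognizedEvent = Event & { detail?: unknown };

interface TranscriptCollector {
  transcripts: string[];
  stop: () => void;
}

function collectTranscripts(
  target: EventTarget,
  onUpdate?: (all: string[]) => void
): TranscriptCollector {
  const transcripts: string[] = [];

  const listener = (e: Event): void => {
    const detail = (e as RecognizedEvent).detail;
    // Ignore anything that is not a non-empty string.
    if (typeof detail === "string" && detail.trim() !== "") {
      transcripts.push(detail.trim());
      onUpdate?.([...transcripts]);
    }
  };

  target.addEventListener("recognized", listener);

  return {
    transcripts,
    // Detach the listener when the caller is done.
    stop: () => target.removeEventListener("recognized", listener),
  };
}
```

In a page, this could be called as `collectTranscripts(document.querySelector('speech-to-text')!)`, with `stop()` invoked once recording has finished.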
### Do transcription on the device
The Speech-To-Text Toolkit can do speech-to-text on the device, using the user's GPU with a fallback to the CPU. This uses more of the user's device resources, and depending on the device, execution may be slower. However, it also means that the user's speech never leaves their device. To use local transcription, set up like the following:
```html
Start
Stop
```

### Do transcription in the cloud with the Azure Speech SDK
1. Get started by grabbing an API key for the Azure Speech SDK as described [here](https://learn.microsoft.com/en-us/azure/ai-services/speech-service/get-started-speech-to-text?tabs=windows%2Cterminal&pivots=programming-language-javascript#prerequisites).
2. Using your new API key and the region you created it for, set up the component like the following:

```html
Start
Stop
```
The component will now use the Azure Speech SDK for speech-to-text transcription. Note that this incurs a cost, as the docs linked above discuss.

### Do transcription in the cloud or locally based on the device
The Speech-To-Text Web Toolkit can also decide automatically whether to do the transcription in the cloud or locally. To enable this, set up like the following:
```html
Start
Stop
```

The component will run locally if the user's device has more than 4GB of RAM and a battery level over 50%. Otherwise, transcription will happen in the cloud.
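
The rule above (more than 4GB of RAM and a battery level over 50%) can be sketched as a small decision function. This is not the toolkit's actual implementation: `chooseMode` and `readDeviceSnapshot` are hypothetical names, and reading the values from `navigator.deviceMemory` and `navigator.getBattery()` is an assumption about how such a check might be done. Both browser APIs are unavailable in some browsers, so the sketch also assumes that a missing value should fall back to the cloud.

```typescript
type TranscriptionMode = "local" | "cloud";

interface DeviceSnapshot {
  memoryGb?: number;     // approximate RAM in GB, e.g. navigator.deviceMemory
  batteryLevel?: number; // 0..1, e.g. (await navigator.getBattery()).level
}

// Hypothetical decision rule mirroring the README's description.
// Assumption: unknown memory or battery values are treated as "not enough".
function chooseMode(device: DeviceSnapshot): TranscriptionMode {
  const enoughMemory = (device.memoryGb ?? 0) > 4;
  const enoughBattery = (device.batteryLevel ?? 0) > 0.5;
  return enoughMemory && enoughBattery ? "local" : "cloud";
}

// Reads the values where the browser exposes them; both APIs are optional.
async function readDeviceSnapshot(): Promise<DeviceSnapshot> {
  const nav = (globalThis as any).navigator;
  const battery =
    typeof nav?.getBattery === "function" ? await nav.getBattery() : undefined;
  return { memoryGb: nav?.deviceMemory, batteryLevel: battery?.level };
}
```

Keeping the rule in a pure function separate from the browser lookups makes it easy to check without a real device, for example `chooseMode({ memoryGb: 8, batteryLevel: 0.8 })`.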