https://github.com/seapagan/vosk-test
Some experiments in using 'Vosk' speech-to-text under Python including real-time from a microphone over the web.
python speech-to-text vosk vosk-api websocket
- Host: GitHub
- URL: https://github.com/seapagan/vosk-test
- Owner: seapagan
- Created: 2024-12-18T13:57:52.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2024-12-30T14:16:12.000Z (6 months ago)
- Last Synced: 2025-02-19T06:04:07.593Z (4 months ago)
- Topics: python, speech-to-text, vosk, vosk-api, websocket
- Language: Python
- Homepage:
- Size: 23.4 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Speech To Text Experiments
Just some experiments with speech to text using the `vosk` package and models.
Vosk is not as accurate as `Whisper`, but transcription is a lot faster. I'll be
looking at the `fast-whisper` package later too.

## Installation

```bash
uv sync
source .venv/bin/activate
```

You also need to download a suitable `vosk` model. The model should be extracted
into its own named folder in the root of the repository. You can get the models
from the [vosk website](https://alphacephei.com/vosk/models).

> [!NOTE]
>
> It should be possible to pass the resulting transcribed result through another
> model like [recasepunc](https://github.com/benob/recasepunc) to end up with a
> properly punctuated and cased string. This is something to look into adding
> later.

## Files

- `backend.py`: A FastAPI backend and Jinja template that use the `vosk`
  package to transcribe audio. This version only returns a result once the
  recording is done.
- `test_microphone.py`: A script that uses the `sounddevice` package to record
  from the microphone and transcribe the audio in real time using the `vosk`
  package (modified from the original code in the `vosk` package).
- `transcribe_file.py`: A script that uses the `vosk` package to transcribe an
  audio file.
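
For orientation, here is a minimal sketch of how file transcription with `vosk` usually looks. It is not the code in `transcribe_file.py`: the names `transcribe_wav`, `make_recognizer` and `main` are made up for illustration, and it assumes a mono 16-bit PCM WAV file. Only `Model`, `KaldiRecognizer`, `AcceptWaveform`, `Result` and `FinalResult` come from the `vosk` package itself.

```python
import json
import wave


def transcribe_wav(path, make_recognizer, chunk_frames=4000):
    """Feed a mono 16-bit PCM WAV file to a recognizer and return the text.

    `make_recognizer` (hypothetical helper argument) is a callable that takes
    the sample rate and returns an object with Vosk's `AcceptWaveform`,
    `Result` and `FinalResult` methods.
    """
    pieces = []
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected mono 16-bit PCM audio")
        recognizer = make_recognizer(wav.getframerate())
        while True:
            data = wav.readframes(chunk_frames)
            if not data:
                break
            # AcceptWaveform returns a truthy value once an utterance has ended;
            # Result() then holds that utterance as a JSON string.
            if recognizer.AcceptWaveform(data):
                pieces.append(json.loads(recognizer.Result()).get("text", ""))
        # FinalResult() flushes whatever audio is still buffered.
        pieces.append(json.loads(recognizer.FinalResult()).get("text", ""))
    return " ".join(p for p in pieces if p)


def main(wav_path, model_dir):
    # Imported here so the helper above can be tried without vosk installed.
    from vosk import KaldiRecognizer, Model

    model = Model(model_dir)  # folder of the extracted model
    print(transcribe_wav(wav_path, lambda rate: KaldiRecognizer(model, rate)))
```

Calling `main("recording.wav", "path/to/model-folder")` (both paths are placeholders) would print the transcript as a single lower-case, unpunctuated string, which is what the note about `recasepunc` refers to. Passing the recognizer in as a factory is just a convenience for this sketch; it keeps the WAV-reading loop separate from model loading.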