Transcribe every audio file in a folder to a text file using speech-to-text (STT) with local inference of OpenAI's open-source Whisper model.
- Host: GitHub
- URL: https://github.com/gabrieldiem/batch-stt-whisper
- Owner: gabrieldiem
- Created: 2024-09-16T12:23:20.000Z (3 months ago)
- Default Branch: master
- Last Pushed: 2024-09-16T19:03:27.000Z (3 months ago)
- Last Synced: 2024-11-05T14:04:25.665Z (about 2 months ago)
- Language: Python
- Homepage:
- Size: 2.93 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Batch STT using Whisper from OpenAI
Transcribe every audio file in a folder to a text file using speech-to-text (STT) with local inference of OpenAI's open-source Whisper model.
## Requirements
An Nvidia graphics card with CUDA support is required. Information about the different Whisper model sizes is available [here](https://github.com/openai/whisper/blob/main/model-card.md).
The program is set up to use the `medium` model size, which has 769M parameters and can be loaded with ~5 GB of GPU VRAM. To change the model size, edit the `WHISPER_MODEL_ID` constant in the script.
> Tested with Nvidia RTX 3060 with 6GB of VRAM.
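Picking a model size comes down to matching its approximate VRAM requirement against the card's memory. As a sketch, the figures below are the approximate values from the Whisper model card, and `largest_model_for` is a hypothetical helper, not part of this script:

```python
# Approximate sizes from the Whisper model card; treat as estimates,
# actual usage varies with batch size and precision.
WHISPER_MODELS = {
    # name: (parameters in millions, approx. required VRAM in GB)
    "tiny":   (39,   1),
    "base":   (74,   1),
    "small":  (244,  2),
    "medium": (769,  5),
    "large":  (1550, 10),
}

def largest_model_for(vram_gb: float) -> str:
    """Return the largest Whisper model whose approximate VRAM need fits."""
    fitting = [(params, name)
               for name, (params, vram) in WHISPER_MODELS.items()
               if vram <= vram_gb]
    if not fitting:
        raise ValueError(f"No Whisper model fits in {vram_gb} GB of VRAM")
    return max(fitting)[1]  # most parameters among the ones that fit
```

On a 6 GB card like the RTX 3060 above, this picks `medium`, matching the script's default.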
## Run the script
Install the requirements:
```shell
pip install -r ./requirements.txt
```

Run the script:
```shell
python ./whisper_process_folder.py path/to/folder/with/audios
```

## Development
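The per-file loop such a script performs could be sketched as follows. This is a hypothetical outline, not the actual `whisper_process_folder.py`; the `transcribe` callable stands in for the real Whisper call (e.g. `lambda p: model.transcribe(str(p))["text"]` with the `openai-whisper` package), and the extension list is an assumption:

```python
from pathlib import Path
from typing import Callable

# Assumed set of audio extensions; adjust to taste.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg"}

def transcribe_folder(folder: str, transcribe: Callable[[Path], str]) -> list[Path]:
    """Transcribe every audio file in `folder`, writing a .txt next to each.

    `transcribe` receives an audio file path and returns its transcript text.
    Returns the list of text files written.
    """
    written = []
    for audio in sorted(Path(folder).iterdir()):
        if audio.suffix.lower() not in AUDIO_EXTENSIONS:
            continue  # skip non-audio files
        text = transcribe(audio)
        out = audio.with_suffix(".txt")
        out.write_text(text, encoding="utf-8")
        written.append(out)
    return written
```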
### Compile requirements
```shell
pip-compile --output-file=requirements.txt "./requirements.in"
```