https://github.com/bbc-esq/whispers2t-transcriber
Uses the powerful WhisperS2T and CTranslate2 libraries to batch transcribe multiple files
- Host: GitHub
- URL: https://github.com/bbc-esq/whispers2t-transcriber
- Owner: BBC-Esq
- Created: 2024-02-29T18:36:51.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-03-13T17:16:22.000Z (8 months ago)
- Last Synced: 2025-04-06T11:24:58.192Z (8 months ago)
- Topics: audio-recorder, audio-recording, audio-transcribing, audio-transcription, ctranslate2, flash-attention-2, transcr, transcriber, transcription, whispers2t
- Language: Python
- Homepage:
- Size: 94.7 KB
- Stars: 38
- Watchers: 3
- Forks: 1
- Open Issues: 4
Metadata Files:
- Readme: README.md
README
# 🚀WhisperS2T-transcriber🚀
* Uses the powerful WhisperS2T and CTranslate2 libraries to batch transcribe multiple audio and video files
## Requirements
1) 🐍Python 3.11 or Python 3.12
2) 📁[Git](https://git-scm.com/downloads)
3) 📁[Git Large File Storage](https://git-lfs.com/)
4) 🪟 Windows (Linux is not yet supported)
## Installation
Download the latest release and extract the files to your computer. Navigate to the repository folder, open a command prompt there, and run the following commands:
```
python -m venv .
```
```
.\Scripts\activate
```
> Run this activation command again each time you open a new command prompt, before starting the program.
Run the installation script:
```
python setup.py
```
## Usage
```
python whispers2t_batch_gui.py
```
The program processes files of the following types:
* `.aac`, `.amr`, `.asf`, `.avi`, `.flac`, `.m4a`, `.mkv`, `.mp3`, `.mp4`, `.wav`, `.webm`, `.wma`
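To check which files in a folder would be picked up before starting a batch, the extension list above can be sketched as a simple filter. This is a minimal illustration rather than the program's actual code: `find_media_files`, its `recursive` flag, and the case-insensitive matching are assumptions made for this example.

```python
from pathlib import Path

# Extensions copied from the list above; the program's real filter may differ.
SUPPORTED_EXTENSIONS = {
    ".aac", ".amr", ".asf", ".avi", ".flac", ".m4a",
    ".mkv", ".mp3", ".mp4", ".wav", ".webm", ".wma",
}


def find_media_files(folder, recursive=False):
    """Return supported files in `folder`, sorted (hypothetical helper)."""
    root = Path(folder)
    # rglob("*") walks subfolders; iterdir() looks at the top level only.
    candidates = root.rglob("*") if recursive else root.iterdir()
    return sorted(
        p for p in candidates
        # Lowercasing the suffix is an assumption so that ".MP3" also matches.
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

Calling `find_media_files("recordings", recursive=True)` returns a sorted list of `Path` objects, which makes it easy to preview what a batch would contain.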
### Important
All transcriptions are saved in the same folder as the file that was transcribed. If you'd like to change this behavior, open an issue on GitHub requesting it.
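The "same folder" rule can be sketched in a few lines. The README only states where the output goes, so the `.txt` extension, the reuse of the source file's name, and the `transcript_path` helper itself are assumptions for illustration, not the program's confirmed behavior.

```python
from pathlib import Path


def transcript_path(media_file, extension=".txt"):
    """Place the transcript beside the source file (hypothetical helper)."""
    # with_suffix replaces only the final extension and keeps the parent
    # folder, so the result always sits next to the input file.
    return Path(media_file).with_suffix(extension)
```

For an input such as `recordings/interview.mp3`, this yields a path in the same `recordings` folder with the extension swapped for the assumed one.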