https://github.com/jjaruna/autotranscriptgui
Powerful GUI tool to transcribe and translate audio/video files using Whisper and OpenAI: fast, simple, and GPU-optimized.
- Host: GitHub
- URL: https://github.com/jjaruna/autotranscriptgui
- Owner: jjaruna
- Created: 2025-07-05T07:35:46.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2025-07-05T08:15:30.000Z (12 months ago)
- Last Synced: 2025-07-05T08:34:16.549Z (12 months ago)
- Topics: fast-whisper, gui, openai-api, subtitulos, transcripcion, transcription, translate, whisper
- Language: Python
- Homepage:
- Size: 0 Bytes
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
# AutoTranscript GUI
**AutoTranscript** is a powerful, GPU-accelerated subtitle generator built on top of OpenAI's Whisper model. It features both a **command-line interface (CLI)** and a beautiful **CustomTkinter-based GUI** for users who prefer a graphical workflow.
Supports:
- Audio in languages such as English, Chinese, Japanese, and Korean.
- Local audio/video files.
- YouTube videos, transcribed or translated from just the link.
- Subtitle translation to English.
- OpenAI API for higher-quality translations (not currently available).
---
## Features
- Full-featured **GUI with progress tracking** and real-time logs
- Generates `.srt` subtitle files from media files
- Supports multilingual transcription and optional **translation to English**
- Uses [Faster-Whisper](https://github.com/guillaumekln/faster-whisper) for fast GPU-accelerated transcription
---
## YouTube Tutorial (in Spanish)
[Watch the tutorial on YouTube](https://www.youtube.com/watch?v=dB6D1i1BjXc)
---
## GUI Preview
> [Screenshot of the AutoTranscript GUI]
---
## Requirements
- Python
- NVIDIA GPU with CUDA (recommended)
- Visual C++ Redistributable 14
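
Because the NVIDIA GPU is recommended rather than required, a script may want to fall back to the CPU when no GPU is present. The following is a minimal sketch, not code from this project: `pick_device` and `pick_compute_type` are hypothetical helpers, and finding `nvidia-smi` on the `PATH` is only a rough heuristic that does not prove the CUDA libraries are installed. The returned strings match values that Faster-Whisper's `WhisperModel` accepts for its `device` and `compute_type` arguments.

```python
import shutil


def pick_device(which=shutil.which) -> str:
    """Return "cuda" if the nvidia-smi tool is on PATH, otherwise "cpu".

    Assumption: the presence of nvidia-smi is used as a stand-in for
    "an NVIDIA driver is installed". It does not verify CUDA itself.
    """
    return "cuda" if which("nvidia-smi") else "cpu"


def pick_compute_type(device: str) -> str:
    """Choose half precision on the GPU and 8-bit integers on the CPU."""
    return "float16" if device == "cuda" else "int8"


if __name__ == "__main__":
    device = pick_device()
    print(device, pick_compute_type(device))
```

The `which` parameter exists only so the lookup can be swapped out when testing on a machine without a GPU.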
---
## Installation for Releases
1. Extract the `.rar` file.
2. Open the app folder.
3. Click the path bar at the top of the folder window, type `cmd`, and press Enter.
4. In the console, run `pip install -r requirements.txt`.
5. Go back to the folder and run the `.bat` file.
---
## Installation
```bash
git clone https://github.com/jjaruna/autoTranscriptGUI.git
cd autoTranscriptGUI
pip install -r requirements.txt
```
---
## Launch the GUI
```bash
python AutoTranscriptGUI.py
```
### Whisper Model Comparison Summary
| Model | VRAM (Min) | Performance | Use Case | Translate into English |
|--------------------|---------------|------------------------|-----------------------------------------------------------|--------------------------|
| `tiny` | ≥ 1 GB | Very Fast | Quick tests, low-resource devices | ✅ |
| `base` | ≥ 2 GB | Fast | Simple transcriptions, short audio | ✅ |
| `small` | ≥ 4 GB | Balanced | Decent accuracy and speed for general use | ✅ |
| `medium` | ≥ 8 GB | Slower | High-quality results for longer files | ✅ |
| `large-v1` | ≥ 10 GB | Slower | Older but still strong performer | ✅ |
| `large-v2` | ≥ 10 GB | Slower | More robust, especially with noisy inputs | ✅ |
| `large-v3` | ≥ 12 GB | Slowest | Highest accuracy offline, latest version | ✅ |
| `large-v3-turbo` | ≥ 8–10 GB | Fastest | High-speed, high-accuracy, great multilingual support | ❌ |
## Recommendation
After testing the `large-v3-turbo` model more than 10 times, I can confidently say it is the **fastest** and **most accurate** of the Whisper models included in this app.
My system has **4 GB of VRAM**, and although that is below the recommended VRAM for large models, `large-v3-turbo` still performed exceptionally well.
**Note:** Your experience may vary depending on your GPU and available VRAM. Use this recommendation as a reference, **not a guarantee**. If you encounter performance issues, try smaller models like `medium` or `small`.
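
The VRAM figures in the comparison table can be turned into a simple lookup. This is an illustrative sketch only: `suggest_model` is a hypothetical helper, the preference order (largest model that fits) is an assumption, and `large-v3-turbo` is assumed not to be used when translation to English is requested.

```python
# (model name, minimum VRAM in GB, usable for translation to English).
# VRAM values follow the comparison table; the ordering is an assumption.
CANDIDATES = [
    ("large-v3", 12, True),
    ("large-v2", 10, True),
    ("large-v1", 10, True),
    ("large-v3-turbo", 8, False),  # assumption: skipped when translating
    ("medium", 8, True),
    ("small", 4, True),
    ("base", 2, True),
    ("tiny", 1, True),
]


def suggest_model(vram_gb: float, translate: bool = False) -> str:
    """Return the first candidate that fits in vram_gb and meets the translate need."""
    for name, min_vram, can_translate in CANDIDATES:
        if vram_gb >= min_vram and (can_translate or not translate):
            return name
    return "tiny"  # fall back to the smallest model


if __name__ == "__main__":
    print(suggest_model(8))                  # transcription only
    print(suggest_model(8, translate=True))  # translation requested
```

The thresholds are conservative: as noted above, `large-v3-turbo` ran well on a 4 GB card, so trying a larger model than this lookup suggests can be worthwhile.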
---
## CLI Mode (Optional)
You can still use the command-line version via `autosub.py`:
```bash
python autosub.py myvideo.mp4 -l ja --translate --model base
```
### CLI Options
| Option | Description |
|---------------------|-------------|
| `filename` | Path to the input media file |
| `-l`, `--language` | Force language (e.g. `en`, `es`, `zh`) |
| `-t`, `--translate` | Translate to English |
| `-o`, `--openai` | Use the OpenAI API (not currently available) |
| `--model` | Whisper model to use |
| `--debug` | Enable debug mode |
| `--keep` | Keep intermediate WAV file |
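
For readers who want to see how such options fit together, here is a sketch of an `argparse` parser that mirrors the table. It is not the actual source of `autosub.py`: the help strings, the `base` default for `--model`, and the `build_parser` name are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the options listed in the CLI table."""
    parser = argparse.ArgumentParser(
        prog="autosub.py",
        description="Generate .srt subtitles from an audio/video file.",
    )
    parser.add_argument("filename", help="Path to the input media file")
    parser.add_argument("-l", "--language", default=None,
                        help="Force language (e.g. en, es, zh)")
    parser.add_argument("-t", "--translate", action="store_true",
                        help="Translate to English")
    parser.add_argument("-o", "--openai", action="store_true",
                        help="Use the OpenAI API")
    # Assumption: "base" as the default model is a placeholder choice.
    parser.add_argument("--model", default="base",
                        help="Whisper model to use")
    parser.add_argument("--debug", action="store_true",
                        help="Enable debug mode")
    parser.add_argument("--keep", action="store_true",
                        help="Keep intermediate WAV file")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["myvideo.mp4", "-l", "ja", "--translate", "--model", "base"]
    )
    print(args)
```

Boolean switches use `action="store_true"` so that each flag is off unless it is passed explicitly.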
---
## Output
- Subtitles are saved as `.srt` files in the same folder as your media.
- If translation is enabled, both the original and the translated text are preserved.
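
To make the output format concrete, the sketch below renders timed segments as SRT cues and derives an `.srt` path next to the media file. It is an illustration under assumptions, not this project's code: the helper names are hypothetical, and the input is taken to be plain `(start, end, text)` tuples in seconds, which is the same information Faster-Whisper segments carry in their `start`, `end`, and `text` attributes.

```python
from pathlib import Path


def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Render (start, end, text) tuples as numbered SRT cues."""
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        cues.append(f"{index}\n{timing}\n{text.strip()}\n")
    # A blank line separates consecutive cues.
    return "\n".join(cues)


def srt_path_for(media_path: str) -> Path:
    """Place the .srt next to the media file, reusing its base name."""
    return Path(media_path).with_suffix(".srt")


if __name__ == "__main__":
    demo = [(0.0, 2.5, "Hello."), (2.5, 5.0, "Welcome back.")]
    out_path = srt_path_for("myvideo.mp4")
    out_path.write_text(segments_to_srt(demo), encoding="utf-8")
    print(f"Wrote {out_path}")
```

Working in whole milliseconds avoids rounding artifacts such as a seconds field of `60` when a timestamp falls just below a minute boundary.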
---
## Example GUI Workflow
1. Open the GUI
2. Select a video/audio file
3. Choose the language and Whisper model
4. (Optional) Enable "Translate to English"
5. Click **Start Transcription**
---
## Credits
- Built with [OpenAI Whisper](https://github.com/openai/whisper)
- Powered by [Faster-Whisper](https://github.com/guillaumekln/faster-whisper)
- GUI built with [CustomTkinter](https://github.com/TomSchimansky/CustomTkinter)
- Thanks to General Koi for the great help in testing and reviewing the Japanese transcripts.
---
## License
MIT License: free for personal and commercial use.