# AutoTranscript GUI 🎙️

**AutoTranscript** is a powerful, GPU-accelerated subtitle generator built on top of OpenAI's Whisper model. It features both a **command-line interface (CLI)** and a beautiful **CustomTkinter-based GUI** for users who prefer a graphical workflow.

Supports:
- Languages such as English, Chinese, Japanese, and Korean.
- Local audio and video files.
- Transcription or translation of YouTube videos from just the link.
- Subtitle translation to English.
- OpenAI API for higher-quality translations (not currently available).

---

## ✨ Features

- 🖥️ Full-featured **GUI with progress tracking** and real-time logs
- 📜 Generate `.srt` subtitle files from media files
- 🌍 Supports multilingual transcription and optional **translation to English**
- 🧠 Uses [Faster-Whisper](https://github.com/guillaumekln/faster-whisper) for fast GPU-accelerated transcription

---
## YouTube Tutorial (Spanish)

[Watch the tutorial on YouTube](https://www.youtube.com/watch?v=dB6D1i1BjXc)

---
## 📸 GUI Preview

> ![image](https://github.com/user-attachments/assets/d328dff2-4d82-485c-95b8-162405a3e856)

---

## 🧩 Requirements

- Python
- NVIDIA GPU with CUDA (recommended)
- Visual C++ Redistributable 14
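
To check the first two requirements before installing, a short script along these lines can help. This is a sketch, not part of the project: `check_environment` is a hypothetical helper, and finding `nvidia-smi` on the `PATH` is only a rough hint that an NVIDIA driver is installed, not proof that CUDA works.

```python
import platform
import shutil


def check_environment() -> dict:
    """Report the Python version and whether the NVIDIA CLI tool is on PATH."""
    return {
        # Version string such as "3.11.4"
        "python_version": platform.python_version(),
        # Assumption: nvidia-smi on PATH suggests an NVIDIA driver is present
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
    }


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `nvidia_smi_found` is `False`, transcription may still run on the CPU, only more slowly.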

---
## Installation for Releases

1. Extract the `.rar` file.
2. Open the app folder.
3. Click the File Explorer address bar, type `cmd`, and press Enter to open a console in that folder.
4. In the console, run `pip install -r requirements.txt`.
5. Go back to the app folder and run the `.bat` file.

---

## 📦 Installation

```bash
git clone https://github.com/jjaruna/autoTranscriptGUI.git
cd autoTranscriptGUI
pip install -r requirements.txt
```
---

## 🚀 Launch the GUI

```bash
python AutoTranscriptGUI.py
```
### 🔍 Whisper Model Comparison Summary

| Model              | VRAM (Min)  | ⚙️ Performance | 🎯 Use Case                                            | 🌍 Translate into English |
|--------------------|-------------|----------------|--------------------------------------------------------|---------------------------|
| `tiny`             | ≥ 1 GB      | ⚡ Very Fast    | Quick tests, low-resource devices                      | ✅ |
| `base`             | ≥ 2 GB      | ⚡ Fast         | Simple transcriptions, short audio                     | ✅ |
| `small`            | ≥ 4 GB      | ⚖️ Balanced    | Decent accuracy and speed for general use              | ✅ |
| `medium`           | ≥ 8 GB      | 🕒 Slower      | High-quality results for longer files                  | ✅ |
| `large-v1`         | ≥ 10 GB     | 🐢 Slower      | Older but still strong performer                       | ✅ |
| `large-v2`         | ≥ 10 GB     | 🐢 Slower      | More robust, especially with noisy inputs              | ✅ |
| `large-v3`         | ≥ 12 GB     | 🐌 Slowest     | Highest accuracy offline, latest version               | ✅ |
| `large-v3-turbo`   | ≥ 8–10 GB   | ⚡ Fastest      | High-speed, high-accuracy, great multilingual support  | ❌ |

### 🧠 Recommendation

After testing the `large-v3-turbo` model more than 10 times, I found it to be the **fastest** and **most accurate** of the Whisper models included in this app. Keep in mind that it cannot translate into English (see the table above).

🖥️ My system has **4 GB of VRAM**, which is below the minimum listed for the large models, yet `large-v3-turbo` still performed exceptionally well.

⚠️ **Note:** Your experience may vary depending on your GPU and available VRAM. Treat this recommendation as a reference, **not a guarantee**. If you run into performance issues, try a smaller model such as `medium` or `small`.
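
The VRAM minimums from the table and the fallback advice above can be turned into a small helper for choosing a starting model. This is a sketch only: `pick_model` and `MODELS` are hypothetical names and not part of the app, the `8–10 GB` entry for `large-v3-turbo` is treated as 8 GB, and the thresholds are conservative (as noted above, `large-v3-turbo` ran well on 4 GB in my tests).

```python
from typing import Optional

# (model name, minimum VRAM in GB from the table, can translate into English)
MODELS = [
    ("tiny", 1, True),
    ("base", 2, True),
    ("small", 4, True),
    ("medium", 8, True),
    ("large-v1", 10, True),
    ("large-v2", 10, True),
    ("large-v3", 12, True),
    ("large-v3-turbo", 8, False),  # assumption: lower bound of the 8-10 GB range
]


def pick_model(vram_gb: float, need_translation: bool = False) -> Optional[str]:
    """Return the model with the highest VRAM minimum that still fits.

    Ties are broken in favour of the model listed later in the table.
    Returns None when even the smallest model does not fit.
    """
    candidates = [
        (min_vram, position, name)
        for position, (name, min_vram, translates) in enumerate(MODELS)
        if min_vram <= vram_gb and (translates or not need_translation)
    ]
    if not candidates:
        return None
    return max(candidates)[2]


if __name__ == "__main__":
    print(pick_model(4))                         # small
    print(pick_model(8))                         # large-v3-turbo
    print(pick_model(8, need_translation=True))  # medium
```

Setting `need_translation=True` skips `large-v3-turbo`, since the table marks it as unable to translate into English.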

---

## 🖥️ CLI Mode (Optional)

You can still use the command-line version via `autosub.py`:

```bash
python autosub.py myvideo.mp4 -l ja --translate --model base
```

### CLI Options

| Option | Description |
|---------------------|-------------|
| `filename`          | Path to the input audio/video file |
| `-l`, `--language`  | Force language (e.g. `en`, `es`, `zh`) |
| `-t`, `--translate` | Translate to English |
| `-o`, `--openai`    | Use OpenAI API (not currently available) |
| `--model` | Whisper model to use |
| `--debug` | Enable debug mode |
| `--keep` | Keep intermediate WAV file |
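
For readers who want to see how these options fit together, here is a minimal `argparse` sketch that mirrors the table. It is an illustration, not the actual source of `autosub.py`: the `build_parser` name, the help strings, and the `base` default for `--model` are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the options listed in the table above."""
    parser = argparse.ArgumentParser(
        prog="autosub.py",
        description="Generate .srt subtitles from a media file.",
    )
    parser.add_argument("filename", help="Path to the input audio/video file")
    parser.add_argument("-l", "--language", default=None,
                        help="Force language (e.g. en, es, zh)")
    parser.add_argument("-t", "--translate", action="store_true",
                        help="Translate to English")
    parser.add_argument("-o", "--openai", action="store_true",
                        help="Use OpenAI API")
    # Assumption: "base" as the default model is chosen only for this sketch
    parser.add_argument("--model", default="base",
                        help="Whisper model to use")
    parser.add_argument("--debug", action="store_true",
                        help="Enable debug mode")
    parser.add_argument("--keep", action="store_true",
                        help="Keep intermediate WAV file")
    return parser


if __name__ == "__main__":
    # Same arguments as the example command above
    args = build_parser().parse_args(
        ["myvideo.mp4", "-l", "ja", "--translate", "--model", "base"]
    )
    print(args)
```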

---

## 📁 Output

- Subtitles are saved as `.srt` files in the same folder as your media.
- If translation is enabled, both the original and the translated text are preserved.
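
As a rough illustration of what producing an `.srt` file involves, the sketch below formats timed segments into SubRip blocks and writes them next to the media file. It is not the app's actual code: `format_timestamp`, `segments_to_srt`, and `transcribe_to_srt` are hypothetical names, and the `cuda` device and `float16` compute type are assumptions for a GPU setup. The Faster-Whisper calls used (`WhisperModel(...)` and `model.transcribe(...)` with `language` and `task`) are part of that library's public API.

```python
from pathlib import Path


def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Turn an iterable of (start, end, text) tuples into SRT text."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # A blank line separates consecutive subtitle blocks
    return "\n".join(blocks)


def transcribe_to_srt(media_path, model_name="base", language=None, translate=False):
    """Transcribe a file and save the .srt in the same folder as the media."""
    # Imported lazily so the formatting helpers work without the library
    from faster_whisper import WhisperModel

    # Assumption: GPU with float16 support; adjust device/compute_type as needed
    model = WhisperModel(model_name, device="cuda", compute_type="float16")
    segments, _info = model.transcribe(
        media_path,
        language=language,
        task="translate" if translate else "transcribe",
    )
    srt_text = segments_to_srt((s.start, s.end, s.text) for s in segments)
    out_path = Path(media_path).with_suffix(".srt")
    out_path.write_text(srt_text, encoding="utf-8")
    return out_path


if __name__ == "__main__":
    print(segments_to_srt([(0.0, 1.5, " Hello"), (1.5, 3.0, "World")]))
```

Calling `transcribe_to_srt("myvideo.mp4", language="ja", translate=True)` would write `myvideo.srt` beside the video, provided Faster-Whisper and a CUDA-capable GPU are available.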

---

## 🧪 Example GUI Workflow

1. Open the GUI.
2. Select a video or audio file.
3. Choose the language and Whisper model.
4. (Optional) Enable "Translate to English".
5. Click **Start Transcription**.

---

## 🙏 Credits

- Built with [OpenAI Whisper](https://github.com/openai/whisper)
- Powered by [Faster-Whisper](https://github.com/guillaumekln/faster-whisper)
- GUI built with [CustomTkinter](https://github.com/TomSchimansky/CustomTkinter)
- Thanks to General Koi for the great help in testing and reviewing the Japanese transcripts.

---

## 📄 License

MIT License: free for personal and commercial use.