https://github.com/protik09/subgen-whisperx
Generate subtitles for any Video or Audio file using the amazing capabilities of WhisperX.
- Host: GitHub
- URL: https://github.com/protik09/subgen-whisperx
- Owner: protik09
- License: mit
- Created: 2025-02-11T04:02:33.000Z (over 1 year ago)
- Default Branch: master
- Last Pushed: 2025-02-28T04:02:46.000Z (over 1 year ago)
- Last Synced: 2025-02-28T11:47:41.973Z (over 1 year ago)
- Topics: subtitle, subtitles, subtitles-generator, whisper, whisper-ai, whisperx
- Language: Python
- Homepage: https://github.com/protik09/subgen-whisperx
- Size: 9.4 MB
- Stars: 4
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: Readme.md
- License: LICENSE
# SubGen-WhisperX
A powerful subtitle generation tool using WhisperX for accurate speech-to-text transcription with precise timestamp alignment.
## Features
- 🎯 Accurate speech recognition using WhisperX
- ⚡ GPU acceleration support (CUDA)
- 🎵 Handles both video and audio files
- 📁 Batch processing support
- ⏱️ Performance timing and logging
- 🔧 Multiple language model options
- 🎛️ Configurable compute device selection
## Prerequisites
- Python 3.10 or later
- NVIDIA GPU with CUDA 12 support (optional, for GPU acceleration)
- Latest NVIDIA driver
- FFmpeg
- Git
## Installation
### The easy way
1. Clone the repository:
```bash
git clone https://github.com/protik09/subgen-whisperx.git
cd subgen-whisperx
```
2. Create and activate the environment by running the helper script for your shell. On Windows (PowerShell):
```powershell
.\activate_venv.ps1
```
or, on Linux/macOS:
```bash
./activate_venv.sh
```
### The manual way
1. Clone the repository:
```bash
git clone https://github.com/protik09/subgen-whisperx.git
cd subgen-whisperx
```
2. Create and activate a conda environment:
```bash
conda create -n whisperx python=3.10 -y
conda activate whisperx
```
3. Install PyTorch with CUDA support:
```bash
conda install -y pytorch==2.0.0 torchaudio==2.0.0 pytorch-cuda=11.8 -c pytorch -c nvidia
```
4. Install the required packages, then force-reinstall the CUDA 12.1 builds of PyTorch and torchaudio:
```bash
pip install whisperx ffmpeg coloredlogs halo
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu121 --force-reinstall --no-cache-dir
```
## Usage
### Basic Usage
Generate subtitles for a single video file:
```bash
python subgen_whisperx.py -f path/to/video.mp4
```
### Advanced Options
Process all media files in a directory:
```bash
python subgen_whisperx.py -d path/to/directory
```
Specify compute device and model size:
```bash
python subgen_whisperx.py -f video.mp4 -c cuda -m medium
```
Set logging level:
```bash
python subgen_whisperx.py -f video.mp4 -l DEBUG
```
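For the directory mode shown above, the script has to decide which files count as media. The following is a minimal sketch of one way to do that, not the project's actual implementation: the function name `find_media_files`, the extension list, and the choice to scan only the top level of the directory are all assumptions.

```python
from pathlib import Path

# Assumed extension list; the formats the script really accepts may differ.
MEDIA_EXTENSIONS = {".mp4", ".mkv", ".avi", ".mov", ".mp3", ".wav", ".flac", ".m4a"}


def find_media_files(directory: str) -> list[Path]:
    """Return media files directly inside `directory`, sorted by path."""
    root = Path(directory)
    return sorted(
        p for p in root.iterdir()
        # Compare lowercased suffixes so "CLIP.MP4" is matched as well.
        if p.is_file() and p.suffix.lower() in MEDIA_EXTENSIONS
    )


if __name__ == "__main__":
    for media in find_media_files("."):
        print(media)
```

Each returned path could then be passed to the same routine that handles a single `-f` input.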
### Command Line Arguments
| Argument | Description | Default |
|----------|-------------|---------|
| `-f`, `--file` | Path to input media file | None |
| `-d`, `--directory` | Path to directory containing media files | None |
| `-c`, `--compute_device` | Device for computation (`cuda` or `cpu`) | Auto-detect |
| `-m`, `--model_size` | WhisperX model size | `base.en` |
| `-l`, `--log-level` | Logging level | `ERROR` |
| `-t`, `--txt` | Text file listing file/folder paths | None |
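
The table above maps naturally onto an `argparse` parser. The sketch below is illustrative only and is not taken from `subgen_whisperx.py`: the function name `build_parser` is hypothetical, and using `None` to stand for "auto-detect" is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the argument table above."""
    parser = argparse.ArgumentParser(description="Generate subtitles with WhisperX")
    parser.add_argument("-f", "--file", default=None, help="Path to input media file")
    parser.add_argument("-d", "--directory", default=None,
                        help="Path to directory containing media files")
    # None stands in for "auto-detect" here; the real script may differ.
    parser.add_argument("-c", "--compute_device", choices=["cuda", "cpu"], default=None,
                        help="Device for computation")
    parser.add_argument("-m", "--model_size", default="base.en", help="WhisperX model size")
    # argparse stores "--log-level" as the attribute "log_level".
    parser.add_argument("-l", "--log-level", default="ERROR", help="Logging level")
    parser.add_argument("-t", "--txt", default=None,
                        help="Text file listing file/folder paths")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["-f", "video.mp4", "-c", "cuda", "-m", "medium"])
    print(args.file, args.compute_device, args.model_size, args.log_level)
```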
## Output
The script generates SRT subtitle files in the same directory as the input media:
- Format: `filename.ai-{language}.srt`
- Example: `Meetings-0822.ai-en.srt` for a video called `Meetings-0822.mp4`
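
The naming rule can be expressed in a few lines with `pathlib`. This is a sketch of the convention described above rather than the script's own code, and the helper name `subtitle_path` is hypothetical.

```python
from pathlib import Path


def subtitle_path(media_path: str, language: str) -> Path:
    """Return the SRT path that sits next to the given media file."""
    media = Path(media_path)
    # with_name keeps the parent directory and replaces only the file name.
    return media.with_name(f"{media.stem}.ai-{language}.srt")


if __name__ == "__main__":
    print(subtitle_path("Meetings-0822.mp4", "en"))  # Meetings-0822.ai-en.srt
```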
## Troubleshooting
If the automatically selected model causes a CUDA out-of-memory error, manually select
the next smaller model with the `-m` flag.
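
"Next smaller model" can be made concrete with an ordered list of sizes. The sketch below assumes the usual Whisper size names (`tiny`, `base`, `small`, `medium`, `large`). The set the script actually accepts may differ, and `next_smaller_model` is a hypothetical helper, not part of the project.

```python
from typing import Optional

# Assumed size ladder, smallest to largest.
MODEL_ORDER = ["tiny", "base", "small", "medium", "large"]


def next_smaller_model(model: str) -> Optional[str]:
    """Return the next smaller model name, or None if `model` is the smallest."""
    # Keep a suffix such as ".en" (English-only variants like "base.en").
    size, dot, suffix = model.partition(".")
    index = MODEL_ORDER.index(size)  # raises ValueError for unknown sizes
    if index == 0:
        return None
    return MODEL_ORDER[index - 1] + dot + suffix


if __name__ == "__main__":
    print(next_smaller_model("medium"))   # small
    print(next_smaller_model("base.en"))  # tiny.en
```

The returned name is what would be passed to `-m` on the retry.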
## Performance
- GPU acceleration makes processing significantly faster than running on the CPU
- Falls back to the CPU if the GPU cannot be used
- Progress indicators show real-time status
- Timing information is displayed after completion
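
The CPU fallback and timing behaviour listed above can be sketched as follows. This is an assumption about how such a check might look, not the script's actual logic. `pick_device` is a hypothetical name, and the sketch treats a missing PyTorch install the same as a missing GPU.

```python
import time


def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the sketch also runs without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    start = time.perf_counter()
    device = pick_device()
    elapsed = time.perf_counter() - start
    print(f"Using {device} (device check took {elapsed:.3f}s)")
```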
## License
This project is released under the MIT License.
## Contributing
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add some amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
## Acknowledgments
- [WhisperX](https://github.com/m-bain/whisperX) for the core transcription technology
- [FFmpeg](https://ffmpeg.org/) for media processing capabilities