https://github.com/arslanex/whisper-transcriber
A scalable Python module for robust audio transcription using OpenAI's Whisper model. Supports multiple languages, batch processing, and output formats like JSON and SRT.
https://github.com/arslanex/whisper-transcriber
audio-processing openai openai-whisper python whisper
Last synced: 3 months ago
JSON representation
A scalable Python module for robust audio transcription using OpenAI's Whisper model. Supports multiple languages, batch processing, and output formats like JSON and SRT.
- Host: GitHub
- URL: https://github.com/arslanex/whisper-transcriber
- Owner: Arslanex
- Created: 2024-11-21T16:18:04.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2024-11-23T21:04:31.000Z (6 months ago)
- Last Synced: 2025-01-24T00:31:45.891Z (4 months ago)
- Topics: audio-processing, openai, openai-whisper, python, whisper
- Language: Jupyter Notebook
- Homepage:
- Size: 68.4 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# 🎙️ Whisper Transcription Module
## 🌟 Overview
A powerful, flexible Python module for audio transcription leveraging OpenAI's Whisper model, designed to transform audio content into accurate, multilingual text.
## ✨ Key Features
- 🔊 **Advanced Audio Transcription**
- Utilizes state-of-the-art Whisper AI technology
- Supports multiple languages and dialects- 🌐 **Multilingual Support**
- Transcribe and translate audio across 99 languages
- Automatic language detection- 📄 **Flexible Output Formats**
- TXT, JSON, SRT, VTT
- Customizable transcription settings- 📂 **Versatile Processing**
- Single file and batch processing
- Configurable model sizes
- GPU and CPU support
## 📚 Documentation
| 🇺🇸 English| 🇹🇷 Türkçe |
|-------------------------------------------------|-------------|
| [Installation Guide](docs/en/README.md)| [Installation Guide](docs/en/README.md)|
| [CLI Usage Guide](docs/en/README.md)|[Komut Satırı Kullanım Kılavuzu](docs/tr/README.md) |
| [Module Usage Guide](docs/en/MODULE_USAGE_EN.md)| [Modül Kullanım Kılavuzu](docs/tr/MODULE_USAGE_TR.md) |
| [Feature Specifications](docs/en/FEATURES_EN.md)| [Özellik Spesifikasyonları](docs/tr/FEATURES_TR.md)|## 🚀 Demo Scripts
The `demo_scripts` directory offers comprehensive scenarios demonstrating the module's capabilities:
| Scenario | Description | Key Features |
|----------|-------------|--------------|
| 1: Basic Transcription | Simple audio transcription | Default 'base' model, quick processing |
| 2: Multilingual Translation | Translate audio to English | Multi-language support, configurable logging |
| 3: Batch Processing | Process multiple audio files | Directory-wide transcription, format flexibility |
| 4: Advanced Configuration | Detailed transcription control | Quality filtering, segment management |
| 5: Error Handling | Robust error management | Fallback strategies, comprehensive logging |
| 6: Advanced Batch Processing | Large-scale transcription | Parallel processing, detailed reporting |## 📋 System Requirements
### 💻 Computational Resources
- **Python**: 3.8+
- **CPU**: All models supported
- **GPU**: Optional acceleration
- Use `--device cuda` for GPU transcription
- Automatic CPU fallback### 📦 Dependencies
- openai-whisper
- torch
- numpy
- soundfile
- ffmpeg-python## 🤝 Contributing
1. Fork the repository
2. Create a virtual environment
3. Install development dependencies: `pip install -e .[dev]`
4. Run tests: `pytest`
5. Submit a pull request## 🐛 Support
- [Open an Issue](https://github.com/yourusername/WhisperDemo/issues)
- Consult [Troubleshooting Guide](docs/en/TROUBLESHOOTING.md)## 📄 License
MIT License - see the LICENSE file for details.
## 🙏 Acknowledgements
- OpenAI for the Whisper model
- Python open-source community