https://github.com/vseprr/srt-smart-translator
AI-powered SRT subtitle translator with context preservation
deepl flask nlp python spacy srt subtitle translation
Last synced: 2 months ago
- Host: GitHub
- URL: https://github.com/vseprr/srt-smart-translator
- Owner: vseprr
- License: mit
- Created: 2025-12-18T20:06:15.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2025-12-24T22:23:46.000Z (6 months ago)
- Last Synced: 2026-04-12T11:39:32.588Z (2 months ago)
- Topics: deepl, flask, nlp, python, spacy, srt, subtitle, translation
- Language: HTML
- Homepage:
- Size: 1.77 MB
- Stars: 18
- Watchers: 0
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Smart SRT Translator
AI-powered subtitle translation with context preservation
Features • The Problem • How It Works • Installation • Usage • API • Contributing
---
## Features
- **Smart Sentence Merging** - Uses SpaCy NLP to detect sentence boundaries across subtitle blocks
- **Context-Aware Translation** - Translates complete sentences, not fragmented blocks
- **Proportional Splitting** - Redistributes translations back to original timestamps using character ratios
- **29+ Languages** - Powered by DeepL API with support for major world languages
- **Modern Web UI** - Dark glassmorphism theme with drag-and-drop file upload
- **Real-time Progress** - Server-Sent Events (SSE) for live translation status updates
- **Secure** - API keys stored locally, never transmitted to third parties
- **Multi-Model Support** - Install multiple SpaCy language models for different source languages
- **Auto Language Detection** - Automatically detects source file language
- **Smart Warnings** - Alerts for language mismatches and same-language selections
---
## Demo
---
## The Problem: Why Context Matters
### Turkish → English Example
Turkish places the verb at the end of the sentence. When a sentence is split across several subtitle blocks, a line-by-line translator has to translate the earlier blocks before it has seen the verb, so it cannot capture their meaning.

**Original (split across 3 lines):**
> 1. Bütün bu olanlardan
> 2. sonra, beni affetmeni
> 3. beklemiyorum.

| Method | Output (Subtitle) | Why it fails or succeeds |
| :--- | :--- | :--- |
| **Standard (Line-by-Line)** | 1. From all these things<br>2. after, to forgive me<br>3. **I am not waiting.** | **FAIL:** "Beklemiyorum" is translated as physically "waiting" instead of "expecting". The sentence is broken and meaningless. |
| **SRT Smart Translator** | 1. After all that has happened,<br>2. I do not expect<br>3. you to forgive me. | **SUCCESS:** It merges the lines, understands that "affetmeni beklemiyorum" expresses expectation, translates correctly, and re-splits by timing. |
---
## How It Works
Smart SRT Translator uses a 4-step pipeline:
```
┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐
│  Parse   │───▶│  Merge   │───▶│Translate │───▶│  Split   │
│   SRT    │    │Sentences │    │  (API)   │    │   Back   │
└──────────┘    └──────────┘    └──────────┘    └──────────┘
```
### 1. Parse
Reads the SRT file with UTF-8 BOM support using `pysrt`
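The project delegates this step to `pysrt`. As a rough illustration of what BOM-tolerant parsing involves, here is a stdlib-only stand-in; `Block` and `parse_srt` are hypothetical names, not part of the project or of `pysrt`.

```python
import re
from dataclasses import dataclass
from typing import List

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:02,500".
TIMING = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2},\d{3})")


@dataclass
class Block:
    """One subtitle cue (hypothetical structure, for illustration only)."""
    index: int
    start: str
    end: str
    text: str


def parse_srt(raw: bytes) -> List[Block]:
    # "utf-8-sig" strips a leading BOM if present and is harmless otherwise.
    content = raw.decode("utf-8-sig").replace("\r\n", "\n")
    blocks = []
    # Cues are separated by one or more blank lines.
    for chunk in re.split(r"\n{2,}", content.strip()):
        lines = chunk.split("\n")
        if len(lines) < 3 or not lines[0].strip().isdigit():
            continue  # skip malformed or empty cues
        match = TIMING.search(lines[1])
        if match is None:
            continue
        blocks.append(
            Block(int(lines[0]), match.group(1), match.group(2), "\n".join(lines[2:]))
        )
    return blocks
```

Decoding with `utf-8-sig` rather than `utf-8` is the key detail: without it, the BOM ends up glued to the first cue index and the first block fails to parse.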
### 2. Merge
SpaCy NLP detects sentence boundaries and merges split sentences
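The merge step can be sketched as follows. Note the substitution: the project uses SpaCy to find sentence boundaries, whereas this sketch uses a plain terminal-punctuation heuristic so that it runs without any model installed. `merge_blocks` is a hypothetical name.

```python
from typing import List, Tuple

# Stand-in for SpaCy's sentence segmentation: a block closes a sentence
# when its text ends with one of these characters.
TERMINATORS = (".", "!", "?")


def merge_blocks(texts: List[str]) -> List[Tuple[str, List[int]]]:
    """Group consecutive subtitle blocks into sentences.

    Returns (sentence, block_indices) pairs so that each translated
    sentence can later be mapped back onto its original blocks.
    """
    merged = []
    buffer: List[str] = []
    indices: List[int] = []
    for i, text in enumerate(texts):
        buffer.append(text.strip())
        indices.append(i)
        if text.rstrip().endswith(TERMINATORS):
            merged.append((" ".join(buffer), indices))
            buffer, indices = [], []
    if buffer:  # trailing fragment with no closing punctuation
        merged.append((" ".join(buffer), indices))
    return merged
```

Keeping the block indices alongside each merged sentence is what makes the later split step possible: the translation of one sentence only ever has to be redistributed over the blocks it came from.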
### 3. Translate
Complete sentences are sent to DeepL API for contextual translation
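How `translator.py` talks to DeepL is not shown in this README. As a minimal sketch, assuming the DeepL Free HTTP endpoint is called directly with the standard library, a request could be built like this; `build_request` and `translate` are hypothetical helpers.

```python
import json
import urllib.request
from typing import List

# DeepL Free API endpoint (Pro accounts use api.deepl.com instead).
DEEPL_FREE_URL = "https://api-free.deepl.com/v2/translate"


def build_request(sentences: List[str], target_lang: str, api_key: str) -> urllib.request.Request:
    """Build a POST request that sends whole sentences, not subtitle fragments."""
    payload = json.dumps({"text": sentences, "target_lang": target_lang}).encode("utf-8")
    return urllib.request.Request(
        DEEPL_FREE_URL,
        data=payload,
        headers={
            "Authorization": f"DeepL-Auth-Key {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def translate(sentences: List[str], target_lang: str, api_key: str) -> List[str]:
    """Send the request and return the translated strings in input order."""
    request = build_request(sentences, target_lang, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        body = json.load(response)
    return [item["text"] for item in body["translations"]]
```

A call such as `translate(["Bütün bu olanlardan sonra, beni affetmeni beklemiyorum."], "EN-US", api_key)` needs a valid key and network access.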
### 4. Smart Split
Translation is proportionally split back to original block structure using character ratios
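One way to split by character ratios is sketched below. It is an assumption about the approach, not the code in `engine.py`: each original block receives a share of the translated words proportional to its share of the original characters, and words are never broken. `split_proportionally` is a hypothetical name.

```python
from typing import List


def split_proportionally(translation: str, original_blocks: List[str]) -> List[str]:
    """Distribute translated words over blocks by original character ratio."""
    if not original_blocks:
        return []
    words = translation.split()
    total_chars = sum(len(block) for block in original_blocks)
    if total_chars == 0 or not words:
        # Nothing to weight by: put everything in the first block.
        return [translation] + [""] * (len(original_blocks) - 1)

    translated_chars = sum(len(word) for word in words)
    pieces = []
    cursor = 0       # index of the next unassigned word
    taken_chars = 0  # characters already assigned to earlier blocks
    seen_chars = 0   # original characters covered so far
    for block in original_blocks[:-1]:
        seen_chars += len(block)
        # Cumulative character budget up to the end of this block.
        target = translated_chars * seen_chars / total_chars
        start = cursor
        # A word belongs to this block if its midpoint falls inside the budget.
        while cursor < len(words) and taken_chars + len(words[cursor]) / 2 <= target:
            taken_chars += len(words[cursor])
            cursor += 1
        pieces.append(" ".join(words[start:cursor]))
    # The last block takes whatever remains, so no word is ever lost.
    pieces.append(" ".join(words[cursor:]))
    return pieces
```

Working with cumulative budgets rather than per-block budgets keeps rounding errors from accumulating. A block can still come out empty when the translation has fewer words than there are blocks, so a real implementation would need a policy for that case.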
---
## Installation
### Prerequisites
- **Python 3.8+** - [Download from python.org](https://www.python.org/downloads/)
  - Check "Add Python to PATH" during installation!
- **DeepL API Key** - [Get free API key](https://www.deepl.com/pro-api)
### Quick Start
```bash
# 1. Clone repository
git clone https://github.com/vseprr/srt-smart-translator.git
cd srt-smart-translator
# 2. Create virtual environment
python -m venv venv
# 3. Activate virtual environment
# Windows (PowerShell):
.\venv\Scripts\Activate.ps1
# Windows (CMD):
venv\Scripts\activate
# macOS/Linux:
source venv/bin/activate
# 4. Install dependencies
pip install -r requirements.txt
# 5. Start the application
python app.py
```
### First Run
1. Browser opens automatically to **http://localhost:5000**
2. You'll see the **Setup Wizard**
3. Select one or more language models to install:
   - English (en_core_web_sm)
   - Turkish (tr_core_news_lg)
   - Spanish (es_core_news_sm)
   - French (fr_core_news_sm)
   - German (de_core_news_sm)
   - Multilingual (xx_sent_ud_sm) - works with any language
   - Custom (install from URL)
4. Wait for installation to complete
5. Enter your DeepL API key in Settings
6. Start translating!
> **Tip:** For multilingual models (xx_*), just type the install command - language is auto-selected as "Multilingual / Universal".
### Windows Quick Launch
Double-click `UI-Start.bat` to launch the app. On first run it performs the setup automatically.
---
## Usage
1. **Start the server:** `python app.py` (browser opens automatically)
2. **Upload SRT file** via drag-and-drop
3. **Select target language** and click "Start Translation"
4. **Download** the translated file when complete
### Warnings System
- **Language Mismatch** - No SpaCy model for detected language, using fallback
- **Universal Model** - Using multilingual model (works for all languages)
- **Same Language** - Source and target languages are the same
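The three warnings above could be derived from the detected language, the chosen target, and the installed models. The sketch below is one plausible reading, not the project's logic: it assumes the multilingual model is the fallback when no language-specific model matches. `collect_warnings` and the warning identifiers are hypothetical.

```python
from typing import List


def collect_warnings(detected_lang: str, target_lang: str, installed_models: List[str]) -> List[str]:
    """Return warning identifiers for one translation job (hypothetical names)."""
    # SpaCy model names begin with a language code, e.g. "tr_core_news_lg" -> "tr".
    model_langs = {name.split("_", 1)[0].lower() for name in installed_models}
    source = detected_lang.lower()
    # DeepL target codes may carry a region ("EN-US"); compare the primary part only.
    target = target_lang.split("-", 1)[0].lower()

    warnings = []
    if source not in model_langs:
        if "xx" in model_langs:
            # Assumption: the multilingual model covers languages without their own model.
            warnings.append("universal_model")
        else:
            warnings.append("language_mismatch")
    if source == target:
        warnings.append("same_language")
    return warnings
```

For example, a Turkish file with only `en_core_web_sm` and `xx_sent_ud_sm` installed would yield the universal-model warning under these assumptions.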
---
## Project Structure
```
srt-smart-translator/
├── app.py                 # Flask server + API endpoints
├── parser.py              # SRT file reading/writing
├── engine.py              # Sentence merging algorithm
├── translator.py          # DeepL API integration
├── requirements.txt       # Python dependencies
├── UI-Start.bat           # Windows quick launcher
├── backend/
│   ├── model_manager.py   # SpaCy model management
│   └── language_data.py   # Language configurations
├── templates/
│   ├── index.html         # Main translation page
│   ├── setup.html         # First-run setup wizard
│   └── settings.html      # Settings & model management
├── static/
│   └── style.css          # Dark glassmorphism theme
├── uploads/               # Temporary upload storage
└── outputs/               # Translated files
```
---
## Settings Page Features
- **API Key Management** - Save/remove DeepL API key
- **Installed Models** - View all installed SpaCy models
- **Remove Model** - Uninstalls model with `pip uninstall`
- **Add Model** - Install via:
  - `python -m spacy download xx_model`
  - `pip install https://...whl`
  - Direct wheel URL
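Since the three install forms above are typed by the user, it is worth normalizing them into a fixed argument list instead of passing raw text to a shell. The following is a sketch of that idea only; `build_install_command` is a hypothetical helper and the project's `model_manager.py` may handle this differently.

```python
import shlex
import sys
from typing import List


def build_install_command(user_input: str) -> List[str]:
    """Turn one of the three accepted input forms into an argv list."""
    parts = shlex.split(user_input.strip())

    # Form 1: "python -m spacy download <model_name>"
    if len(parts) == 5 and parts[:4] == ["python", "-m", "spacy", "download"]:
        return [sys.executable, "-m", "spacy", "download", parts[4]]

    # Form 2: "pip install <wheel URL>"; Form 3: "<wheel URL>" on its own.
    if len(parts) == 3 and parts[:2] == ["pip", "install"]:
        target = parts[2]
    elif len(parts) == 1:
        target = parts[0]
    else:
        raise ValueError("Unrecognized install command")

    if not (target.startswith("https://") and target.endswith(".whl")):
        raise ValueError("Expected an https:// URL pointing to a .whl file")
    return [sys.executable, "-m", "pip", "install", target]
```

The returned list can be handed to `subprocess.run(...)` without `shell=True`. Using `sys.executable` keeps the install inside the active virtual environment.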
---
## API Reference
| Method | Endpoint | Description |
|--------|----------|-------------|
| `GET` | `/` | Main page (HTML) |
| `GET` | `/setup` | Setup wizard (if no models) |
| `GET` | `/settings` | Settings page |
| `GET` | `/api/config` | Check API key status |
| `POST` | `/api/config` | Save API key |
| `DELETE` | `/api/config` | Remove API key |
| `POST` | `/api/install-model` | Install SpaCy model |
| `POST` | `/api/remove-model` | Uninstall SpaCy model |
| `POST` | `/upload` | Upload SRT file |
| `POST` | `/translate` | Start translation job |
| `GET` | `/status/{job_id}` | Translation status (JSON) |
| `GET` | `/progress/{job_id}` | Real-time progress (SSE) |
| `GET` | `/download/{job_id}` | Download translated file |
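A client can follow `/progress/{job_id}` by parsing the Server-Sent Events stream. The README does not document the event payload, so the sketch below assumes each event carries a JSON object in its `data:` field; `iter_sse_events` and field names such as `progress` are hypothetical.

```python
import json
from typing import Dict, Iterable, Iterator, List


def iter_sse_events(lines: Iterable[str]) -> Iterator[Dict]:
    """Yield one decoded JSON object per SSE event.

    Assumes every event's data field holds JSON (not confirmed by the README).
    """
    data_lines: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                yield json.loads("\n".join(data_lines))
                data_lines = []
            continue
        field, _, value = line.partition(":")
        if field == "data":
            # The spec strips a single leading space after the colon.
            data_lines.append(value[1:] if value.startswith(" ") else value)
```

To use it against a running server, open `http://localhost:5000/progress/<job_id>` with `urllib.request.urlopen`, decode each line as UTF-8, and pass the resulting lines to `iter_sse_events`.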
---
## Tech Stack
| Component | Technology |
|-----------|------------|
| Backend | Flask 3.x |
| NLP | SpaCy (multiple models) |
| Language Detection | langdetect |
| Translation | DeepL Free API |
| SRT Parsing | pysrt |
| Frontend | Vanilla HTML/CSS/JS |
| Design | Dark Glassmorphism |
---
## Known Limitations
- **Single file only** - No batch translation yet
- **SRT format only** - VTT, ASS not supported
- **Internet required** - DeepL API needs connectivity
---
## Roadmap
- [x] ~~Multi-language SpaCy model support~~
- [x] ~~Automatic source language detection~~
- [x] ~~First-run setup wizard~~
- [x] ~~Real pip uninstall for models~~
- [ ] Batch file translation
- [ ] VTT/ASS format support
- [ ] Formality selection (formal/informal)
- [ ] Translation history
- [ ] PWA support for offline UI
---
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
---
## License
This project is open source and available under the [MIT License](LICENSE).
---
## Acknowledgements
- [DeepL](https://www.deepl.com/) for their excellent translation API
- [SpaCy](https://spacy.io/) for natural language processing
- [pysrt](https://github.com/byroot/pysrt) for SRT file handling
- [Turkish NLP Suite](https://huggingface.co/turkish-nlp-suite) for Turkish SpaCy model
---
Made with ❤️ for the subtitle community