# 🎬 Smart SRT Translator



AI-powered subtitle translation with context preservation


Features • The Problem • How It Works • Installation • Usage • API • Contributing

---

## ✨ Features

- 🧠 **Smart Sentence Merging** – Uses SpaCy NLP to detect sentence boundaries across subtitle blocks
- 🔄 **Context-Aware Translation** – Translates complete sentences, not fragmented blocks
- ⚡ **Proportional Splitting** – Redistributes translations back to original timestamps using character ratios
- 🌍 **29+ Languages** – Powered by DeepL API with support for major world languages
- 🎨 **Modern Web UI** – Dark glassmorphism theme with drag-and-drop file upload
- 📊 **Real-time Progress** – Server-Sent Events (SSE) for live translation status updates
- 🔐 **Secure** – API keys stored locally, never transmitted to third parties
- 🛠️ **Multi-Model Support** – Install multiple SpaCy language models for different source languages
- 🔍 **Auto Language Detection** – Automatically detects source file language
- ⚠️ **Smart Warnings** – Alerts for language mismatches and same language selections

---


## 🎯 The Problem: Why Context Matters?

### Turkish → English Example

Turkish sentence structure places the verb at the end. When subtitles split a sentence into multiple lines, standard translators fail to capture the meaning of the first line because the action (verb) is missing until the end.

**Original (Split in 3 lines):**
> 1. Bütün bu olanlardan
> 2. sonra, beni affetmeni
> 3. beklemiyorum.

| Method | Output (Subtitle) | Why it fails/succeeds |
| :--- | :--- | :--- |
| **Standard (Line-by-Line)** | 1. From all these things<br>2. after, to forgive me<br>3. **I am not waiting.** | ❌ **FAIL:** "Beklemiyorum" is translated as physically "waiting" instead of "expecting". The sentence is broken and meaningless. |
| **SRT Smart Translator** | 1. After all that has happened,<br>2. I do not expect<br>3. you to forgive me. | ✅ **SUCCESS:** It merges the lines, recognizes that "affetmeni beklemiyorum" expresses expectation, translates the full sentence correctly, and re-splits it by timing. |

---

## 💡 How It Works

Smart SRT Translator uses a 4-step pipeline:

```
┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐
│  Parse   │───▶│  Merge   │───▶│Translate │───▶│  Split   │
│   SRT    │    │Sentences │    │  (API)   │    │   Back   │
└──────────┘    └──────────┘    └──────────┘    └──────────┘
```

### 1. Parse
Reads the SRT file with UTF-8 BOM support using `pysrt`
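
To illustrate what BOM-aware parsing involves, here is a minimal sketch that uses only the standard library. The project itself reads files with `pysrt`; the `SubtitleBlock` type and `parse_srt` function are hypothetical names, and the parser assumes well-formed blocks.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class SubtitleBlock:
    index: int
    start: str
    end: str
    text: str


TIMING = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2},\d{3})")


def parse_srt(raw: bytes) -> List[SubtitleBlock]:
    """Parse raw SRT bytes into blocks (illustrative, not the pysrt API)."""
    # "utf-8-sig" strips a leading BOM if present, otherwise it acts like UTF-8.
    content = raw.decode("utf-8-sig").replace("\r\n", "\n")
    blocks = []
    for chunk in re.split(r"\n{2,}", content.strip()):
        lines = chunk.split("\n")
        if len(lines) < 3:
            continue  # skip blocks without index, timing, and text
        match = TIMING.match(lines[1].strip())
        if not match:
            continue  # skip blocks with an unreadable timing line
        blocks.append(
            SubtitleBlock(
                index=int(lines[0].strip()),
                start=match.group(1),
                end=match.group(2),
                text="\n".join(lines[2:]),
            )
        )
    return blocks
```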

### 2. Merge
SpaCy NLP detects sentence boundaries and merges split sentences
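
The grouping idea can be sketched without SpaCy. The example below swaps SpaCy's sentence boundary detection for a simple end-of-sentence punctuation check, so it is only an approximation of the merge step; `merge_blocks` is a hypothetical name. It keeps the indices of the source blocks so a translation can later be mapped back to them.

```python
from typing import List, Tuple

# Punctuation treated as a sentence end (a simplification of what SpaCy detects).
SENTENCE_END = (".", "!", "?", "…")


def merge_blocks(texts: List[str]) -> List[Tuple[str, List[int]]]:
    """Group consecutive subtitle texts until one of them ends a sentence.

    Returns (merged_sentence, block_indices) pairs.
    """
    groups = []
    current_text: List[str] = []
    current_ids: List[int] = []
    for i, text in enumerate(texts):
        current_text.append(text.strip())
        current_ids.append(i)
        if text.rstrip().endswith(SENTENCE_END):
            groups.append((" ".join(current_text), current_ids))
            current_text, current_ids = [], []
    if current_text:  # trailing fragment without closing punctuation
        groups.append((" ".join(current_text), current_ids))
    return groups
```

Applied to the three Turkish lines from the example above, this returns a single merged sentence tagged with block indices `[0, 1, 2]`. A punctuation check misses cases such as abbreviations or closing quotes, which is where an NLP model helps.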

### 3. Translate
Complete sentences are sent to DeepL API for contextual translation
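
As a rough sketch of this step, the snippet below builds and sends a request to the DeepL Free REST endpoint with the standard library. Whether `translator.py` calls the REST API directly or uses the official `deepl` package is not stated here, so treat this as an assumption; `build_request` and `translate` are hypothetical helper names.

```python
import json
import urllib.request
from typing import List

# DeepL Free API endpoint (Pro keys use api.deepl.com instead).
DEEPL_FREE_URL = "https://api-free.deepl.com/v2/translate"


def build_request(sentences: List[str], target_lang: str, api_key: str) -> urllib.request.Request:
    """Build a POST request carrying whole sentences, not subtitle fragments."""
    body = json.dumps({"text": sentences, "target_lang": target_lang}).encode("utf-8")
    return urllib.request.Request(
        DEEPL_FREE_URL,
        data=body,
        headers={
            "Authorization": f"DeepL-Auth-Key {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def translate(sentences: List[str], target_lang: str, api_key: str) -> List[str]:
    """Send the request and return translations in input order (needs network access)."""
    with urllib.request.urlopen(build_request(sentences, target_lang, api_key), timeout=30) as response:
        payload = json.load(response)
    return [item["text"] for item in payload["translations"]]
```

Sending several merged sentences in one `text` array keeps the number of requests low.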

### 4. Smart Split
Translation is proportionally split back to original block structure using character ratios
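
One way to split by character ratio is sketched below. It is an illustrative approximation rather than the code in `engine.py`: each original block claims a share of the translated sentence proportional to its character count, and words are never cut in half. `split_proportionally` is a hypothetical name.

```python
from typing import List


def split_proportionally(translated: str, original_blocks: List[str]) -> List[str]:
    """Distribute a translated sentence over blocks by original character share."""
    if not original_blocks:
        raise ValueError("original_blocks must not be empty")
    words = translated.split()
    total_orig = sum(len(block) for block in original_blocks)
    if total_orig == 0 or not words:
        return [translated] + [""] * (len(original_blocks) - 1)

    # Length of the translation once whitespace is normalized to single spaces.
    total_new = sum(len(word) for word in words) + len(words) - 1
    parts: List[str] = []
    position = 0      # characters of the translation already assigned
    next_word = 0     # index of the first unassigned word
    cumulative = 0    # original characters covered so far
    for block in original_blocks[:-1]:
        cumulative += len(block)
        boundary = total_new * cumulative / total_orig
        chunk = []
        # A word joins the current block if its midpoint lies before the boundary.
        while next_word < len(words) and position + len(words[next_word]) / 2 <= boundary:
            chunk.append(words[next_word])
            position += len(words[next_word]) + 1
            next_word += 1
        parts.append(" ".join(chunk))
    parts.append(" ".join(words[next_word:]))  # the last block takes the remainder
    return parts
```

A pure character ratio can place a break in the middle of a phrase, so a production implementation may also weigh punctuation or clause boundaries.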

---

## 🛠 Installation

### Prerequisites

- **Python 3.8+** – [Download from python.org](https://www.python.org/downloads/)
  - ⚠️ Check "Add Python to PATH" during installation!
- **DeepL API Key** – [Get free API key](https://www.deepl.com/pro-api)

### Quick Start

```bash
# 1. Clone repository
git clone https://github.com/vseprr/srt-smart-translator.git
cd srt-smart-translator

# 2. Create virtual environment
python -m venv venv

# 3. Activate virtual environment
# Windows (PowerShell):
.\venv\Scripts\Activate.ps1
# Windows (CMD):
venv\Scripts\activate
# macOS/Linux:
source venv/bin/activate

# 4. Install dependencies
pip install -r requirements.txt

# 5. Start the application
python app.py
```

### First Run

1. Browser opens automatically to **http://localhost:5000**
2. You'll see the **Setup Wizard** 🧙‍♂️
3. Select one or more language models to install:
   - 🇬🇧 English (en_core_web_sm)
   - 🇹🇷 Turkish (tr_core_news_lg)
   - 🇪🇸 Spanish (es_core_news_sm)
   - 🇫🇷 French (fr_core_news_sm)
   - 🇩🇪 German (de_core_news_sm)
   - 🌐 Multilingual (xx_sent_ud_sm) - works with any language
   - ➕ Custom (install from URL)
4. Wait for installation to complete
5. Enter your DeepL API key in Settings
6. Start translating! 🎉

> 💡 **Tip:** For multilingual models (`xx_*`), just enter the install command; the language is set automatically to "Multilingual / Universal".

### Windows Quick Launch

Double-click `UI-Start.bat` to launch the app. On the first run it performs the setup automatically.

---

## 🚀 Usage

1. **Start the server:** `python app.py` (browser opens automatically)
2. **Upload SRT file** via drag-and-drop
3. **Select target language** and click "Start Translation"
4. **Download** the translated file when complete

### Warnings System

- 🔴 **Language Mismatch** – No SpaCy model for detected language, using fallback
- 🟣 **Universal Model** – Using multilingual model (works for all languages)
- 🟠 **Same Language** – Source and target languages are the same

---

## 📁 Project Structure

```
srt-smart-translator/
├── app.py                  # Flask server + API endpoints
├── parser.py               # SRT file reading/writing
├── engine.py               # Sentence merging algorithm
├── translator.py           # DeepL API integration
├── requirements.txt        # Python dependencies
├── UI-Start.bat            # Windows quick launcher
├── backend/
│   ├── model_manager.py    # SpaCy model management
│   └── language_data.py    # Language configurations
├── templates/
│   ├── index.html          # Main translation page
│   ├── setup.html          # First-run setup wizard
│   └── settings.html       # Settings & model management
├── static/
│   └── style.css           # Dark glassmorphism theme
├── uploads/                # Temporary upload storage
└── outputs/                # Translated files
```

---

## ⚙️ Settings Page Features

- **API Key Management** – Save/remove DeepL API key
- **Installed Models** – View all installed SpaCy models
- **Remove Model** – Uninstalls model with `pip uninstall`
- **Add Model** – Install via:
  - `python -m spacy download xx_model`
  - `pip install https://...whl`
  - Direct wheel URL
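
A server that accepts these three input forms has to turn them into a safe command before running it. The sketch below shows one possible normalization; `build_install_command` and the `MODEL_NAME` pattern are hypothetical and not taken from `model_manager.py`. It returns an argument list for `subprocess.run` instead of executing user text through a shell.

```python
import re
import shlex
import sys
from typing import List

# Assumed shape of a SpaCy model name: language code, underscore, descriptor.
MODEL_NAME = re.compile(r"^[a-z]{2,3}_[a-z0-9_]+$")


def build_install_command(user_input: str) -> List[str]:
    """Map one of the three accepted input forms to an argv list."""
    tokens = shlex.split(user_input.strip())

    # Form 1: python -m spacy download <model>
    if tokens[:4] == ["python", "-m", "spacy", "download"] and len(tokens) == 5:
        model = tokens[4]
        if not MODEL_NAME.match(model):
            raise ValueError(f"Invalid model name: {model!r}")
        return [sys.executable, "-m", "spacy", "download", model]

    # Form 2: pip install <wheel URL> (reduced to the bare URL)
    if tokens[:2] == ["pip", "install"] and len(tokens) == 3:
        tokens = tokens[2:]

    # Form 3: a direct wheel URL
    if len(tokens) == 1 and tokens[0].startswith("https://") and tokens[0].endswith(".whl"):
        return [sys.executable, "-m", "pip", "install", tokens[0]]

    raise ValueError(f"Unsupported install input: {user_input!r}")
```

Using `sys.executable` ensures the model is installed into the same virtual environment the app runs in.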

---

## 🔌 API Reference

| Method | Endpoint | Description |
|--------|----------|-------------|
| `GET` | `/` | Main page (HTML) |
| `GET` | `/setup` | Setup wizard (if no models) |
| `GET` | `/settings` | Settings page |
| `GET` | `/api/config` | Check API key status |
| `POST` | `/api/config` | Save API key |
| `DELETE` | `/api/config` | Remove API key |
| `POST` | `/api/install-model` | Install SpaCy model |
| `POST` | `/api/remove-model` | Uninstall SpaCy model |
| `POST` | `/upload` | Upload SRT file |
| `POST` | `/translate` | Start translation job |
| `GET` | `/status/{job_id}` | Translation status (JSON) |
| `GET` | `/progress/{job_id}` | Real-time progress (SSE) |
| `GET` | `/download/{job_id}` | Download translated file |
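
A client can follow `/progress/{job_id}` by reading the Server-Sent Events stream line by line. The payload format of each event is not documented here, so this sketch treats the `data` field as an opaque string; `iter_sse_data` is a hypothetical helper that follows the standard SSE framing (events separated by blank lines, comment lines starting with `:`).

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each event in a Server-Sent Events stream."""
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        if line == "":
            if buffer:  # a blank line ends the current event
                yield "\n".join(buffer)
                buffer = []
            continue
        field, _, value = line.partition(":")
        if field == "data":
            # A single leading space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)
```

To use it against a running server, decode each line of the response returned by `urllib.request.urlopen` for the progress URL and pass the resulting iterator to `iter_sse_data`.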

---

## 🎨 Tech Stack

| Component | Technology |
|-----------|------------|
| Backend | Flask 3.x |
| NLP | SpaCy (multiple models) |
| Language Detection | langdetect |
| Translation | DeepL Free API |
| SRT Parsing | pysrt |
| Frontend | Vanilla HTML/CSS/JS |
| Design | Dark Glassmorphism |

---

## ⚠️ Known Limitations

- **Single file only** – No batch translation yet
- **SRT format only** – VTT, ASS not supported
- **Internet required** – DeepL API needs connectivity

---

## 🗺️ Roadmap

- [x] ~~Multi-language SpaCy model support~~
- [x] ~~Automatic source language detection~~
- [x] ~~First-run setup wizard~~
- [x] ~~Real pip uninstall for models~~
- [ ] Batch file translation
- [ ] VTT/ASS format support
- [ ] Formality selection (formal/informal)
- [ ] Translation history
- [ ] PWA support for offline UI

---

## 🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

---

## 📄 License

This project is open source and available under the [MIT License](LICENSE).

---

## 🙏 Acknowledgements

- [DeepL](https://www.deepl.com/) for their excellent translation API
- [SpaCy](https://spacy.io/) for natural language processing
- [pysrt](https://github.com/byroot/pysrt) for SRT file handling
- [Turkish NLP Suite](https://huggingface.co/turkish-nlp-suite) for the Turkish SpaCy model

---


Made with ❤️ for the subtitle community