https://github.com/mikebridge/translocutor
Translate VTT Subtitle Files via ChatGPT
- Host: GitHub
- URL: https://github.com/mikebridge/translocutor
- Owner: mikebridge
- License: other
- Created: 2024-08-24T23:58:07.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2025-02-02T19:29:11.000Z (13 days ago)
- Last Synced: 2025-02-02T20:19:21.585Z (13 days ago)
- Topics: ai, captions, chatgpt-api, python, subtitles, translation
- Language: Python
- Homepage:
- Size: 25.4 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Translocutor
Translocutor is a Python CLI that translates VTT subtitle files into another language
using the ChatGPT `gpt-4o` API. Currently, it creates a single VTT file that contains the original subtitles along with
the translated text in a different colour.

This project is in an alpha state.
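
As a rough illustration of that combined output, a cue carrying both languages could be assembled as in the sketch below. This is an assumption for illustration only: the exact markup Translocutor writes is not described here, and the `bilingual_cue` and `bilingual_vtt` helpers and the `yellow` class name are hypothetical. The `<c.classname>` span itself is standard WebVTT syntax.

```python
def bilingual_cue(timing, original, translated, css_class="yellow"):
    """Build one WebVTT cue: the original line, then a class-tagged translation."""
    # The class name is a hypothetical choice; players decide how to style it.
    return f"{timing}\n{original}\n<c.{css_class}>{translated}</c>"


def bilingual_vtt(cues):
    """Join (timing, original, translated) triples into a WebVTT document."""
    blocks = [bilingual_cue(*cue) for cue in cues]
    return "WEBVTT\n\n" + "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(bilingual_vtt([("00:00:01.000 --> 00:00:04.000", "Bonjour", "Hello")]))
```

How the translated line is actually rendered depends on the player's support for WebVTT class spans.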
## Initial Setup
### OpenAI Key
First, ensure that you have an [API key for ChatGPT](https://platform.openai.com/docs/quickstart/create-and-export-an-api-key), and check whether you have
an [organization id](https://platform.openai.com/settings/organization/general). Then set them in your environment:

```sh
export OPENAI_API_KEY=my_key_value
export ORGANIZATION_ID=my_organization_id
```

Alternatively, you can set these values in the `.env` file.
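
For readers curious how such settings are typically picked up, here is a minimal standard-library sketch. It is not Translocutor's own code: the `load_env_file` and `read_openai_settings` helpers are hypothetical, and the tool may well use a library such as python-dotenv instead. Only the two variable names come from the instructions above.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Copy KEY=VALUE pairs from a .env file into os.environ.

    Values already present in the environment are left untouched.
    """
    env_path = Path(path)
    if not env_path.is_file():
        return
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        os.environ.setdefault(key.strip(), value.strip().strip("'\""))


def read_openai_settings():
    """Return (api_key, organization_id); the organization id may be None."""
    api_key = os.environ.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return api_key, os.environ.get("ORGANIZATION_ID")
```

Because `setdefault` is used, a value exported in the shell takes precedence over the same key in the `.env` file in this sketch.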
### Python
It's recommended that you create a Python virtual environment rather than use the
default installation:

```sh
# venv is part of the Python 3 standard library; some Linux distributions
# package it separately (e.g. python3-venv on Debian/Ubuntu)
mkdir -p ~/venv # or wherever you want to store your virtual environments
cd ~/venv
python3 -m venv translocutor
source ~/venv/translocutor/bin/activate
```

Once you have a virtual environment active, you can install the required packages
by running the following command from the root of the repository:

```sh
pip install -r requirements.txt
```

Then set up the translocutor package in the venv:
```sh
pip install -e .
```

## Usage
Once you've installed the required packages, you can run the script with the following command:
```
translocutor -f my_subtitle_file.fr.vtt
```

This will pass your subtitle file to the ChatGPT API and write a new .vtt file.

You can translate to something other than English (the default). ChatGPT seems to be
able to figure out what language you want. Feel free to experiment.

```
translocutor -t german -f my_subtitle_file.fr.vtt
```

There is some minimal help available:
```
translocutor --help
```

## Getting VTT Files
[yt-dlp](https://github.com/yt-dlp/yt-dlp/wiki/Installation) can be a good way
of getting VTT files, e.g.:

```sh
yt-dlp --all-subs https://example.com/whatever-video
```

## Technical Notes
This uses the ChatGPT API's [structured outputs](https://openai.com/index/introducing-structured-outputs-in-the-api/).
Translocutor partitions the caption file and submits one partition at a time containing several lines.
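
One way to picture this partitioning is the sketch below. It is an illustration under assumptions, not Translocutor's own code: the `Cue` class, `estimate_tokens`, `partition_cues`, and the default budget are hypothetical, and the four-characters-per-token estimate is a crude stand-in for a real tokenizer such as tiktoken.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    timing: str  # e.g. "00:00:01.000 --> 00:00:04.000"
    text: str


def estimate_tokens(text):
    """Crude stand-in for a tokenizer: roughly four characters per token."""
    return max(1, len(text) // 4)


def partition_cues(cues, max_tokens=1000):
    """Group consecutive cues so each group's estimated total fits max_tokens.

    A single cue that exceeds the budget on its own still gets its own group.
    """
    partitions, current, used = [], [], 0
    for cue in cues:
        cost = estimate_tokens(cue.text)
        # Start a new partition when adding this cue would exceed the budget.
        if current and used + cost > max_tokens:
            partitions.append(current)
            current, used = [], 0
        current.append(cue)
        used += cost
    if current:
        partitions.append(current)
    return partitions
```

Keeping cues in their original order within each group is what preserves the surrounding context for the translation.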
This should increase the accuracy of the translation by providing more context and continuity,
instead of just submitting one line or one caption at a time. Partition sizes are chosen by estimating the
tokens with [tiktoken](https://github.com/openai/tiktoken) and making a guess at the size of the
result, to try to keep it under 4096 tokens.

## Legal Disclaimer
This project has nothing to do with ChatGPT or OpenAI. It's just a project that uses their APIs.