https://github.com/extremq/gptsubtitler
Automatically subtitle any video spoken in any language to a language of your choice using AI.
- Host: GitHub
- URL: https://github.com/extremq/gptsubtitler
- Owner: extremq
- License: mit
- Created: 2023-05-16T16:04:13.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-05-18T12:56:48.000Z (over 1 year ago)
- Last Synced: 2024-11-13T01:44:00.346Z (2 months ago)
- Topics: ai, ffmpeg, gpt, huggingface, openai, subtitles, transcriber, translation, whisper
- Language: Python
- Homepage:
- Size: 40 KB
- Stars: 43
- Watchers: 3
- Forks: 4
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
# Subtitler
Automatically subtitle any video spoken in any language to a language of your choice using AI.

Models used:
- [OpenAI whisper C++ port](https://github.com/ggerganov/whisper.cpp) - for audio-to-text
- [Facebook M2M100](https://huggingface.co/facebook/m2m100_418M) - for translation

Tools used:
- `ffmpeg`

**Please don't forget to star the repository if you find it useful or educational!**
Before:
https://github.com/extremq/subtitler/assets/45830561/49f6ecce-cfdc-4f1c-97eb-07a36ac841c9
After (in Romanian - `model_type=medium, language_model_type=base`):
https://github.com/extremq/subtitler/assets/45830561/20bc5169-0ce3-47cd-adb7-15d75daf27f4
# Setup
Install using `pip`:

```
pip install gptsubtitler
```

Install [`ffmpeg`](https://ffmpeg.org/):
```bash
# Ubuntu or Debian
sudo apt update && sudo apt install ffmpeg

# macOS
brew install ffmpeg

# Windows using Chocolatey https://chocolatey.org/
choco install ffmpeg
```

# Quick guide
Example usage for adding subtitles and translating them into Romanian:

Command line:
```bash
gptsubtitler soldier.mp4 --source_language en --target_language ro --captioning_model_type medium --language_model_type base
```

Or in Python:
```py
from gptsubtitler import Transcriber

# I strongly recommend using the "medium" model_type.
Transcriber.transcribe("soldier.mp4", source_language="en", target_language="ro", captioning_model_type="medium", language_model_type="base")
```

You can also use the `Translator` class from `translator.py` if you just want to translate some text.
Example usage for translating from English to Romanian:
```py
from gptsubtitler import Translator

print(Translator.translate("Hi!", target_language="ro", source_language="en"))
```

If you have generated a `.srt` file and just want to add subtitles:
```py
from gptsubtitler import create_video_with_subtitles
create_video_with_subtitles("video.mp4", "output.srt", "video_subtitled.mp4")
```

# Options
```
Args:
video_file (str): Path to video file.
output_video_file (str, optional): Path to output video file. Defaults to video_file_subtitled.
output_subtitle_file (str, optional): Path to output SRT file. Defaults to "output.srt".
source_language (str, optional): Source language for translation. Defaults to en.
target_language (str, optional): Target language for translation. Defaults to None.
captioning_model_type (str, optional): Model type. Defaults to "base".
language_model_type (str, optional): Language model type. Defaults to "base".
model_dir (str, optional): Path to model directory. Defaults to None.
```

Available options for `captioning_model_type` (the audio-to-text model):
- tiny
- base - default
- small
- medium
- large

Available options for `language_model_type` (the language translator model):
- base - default
- large
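
Since both model options only accept the values listed above, it can help to check them before starting a long transcription. The following is a minimal sketch, not part of `gptsubtitler`: the names `build_transcribe_options` and `subtitle` are hypothetical helpers, and the only library call assumed is `Transcriber.transcribe` with the keyword arguments shown in the quick guide.

```py
# Allowed values, copied from the option lists in this README.
CAPTIONING_MODEL_TYPES = ("tiny", "base", "small", "medium", "large")
LANGUAGE_MODEL_TYPES = ("base", "large")


def build_transcribe_options(
    source_language="en",
    target_language=None,
    captioning_model_type="base",
    language_model_type="base",
):
    """Validate the model options and return them as keyword arguments.

    Hypothetical helper for illustration; the defaults mirror the
    documented defaults in the Options section.
    """
    if captioning_model_type not in CAPTIONING_MODEL_TYPES:
        raise ValueError(
            f"captioning_model_type must be one of {CAPTIONING_MODEL_TYPES}, "
            f"got {captioning_model_type!r}"
        )
    if language_model_type not in LANGUAGE_MODEL_TYPES:
        raise ValueError(
            f"language_model_type must be one of {LANGUAGE_MODEL_TYPES}, "
            f"got {language_model_type!r}"
        )
    return {
        "source_language": source_language,
        "target_language": target_language,
        "captioning_model_type": captioning_model_type,
        "language_model_type": language_model_type,
    }


def subtitle(video_file, **options):
    """Check the options first, then hand them to gptsubtitler."""
    kwargs = build_transcribe_options(**options)
    # Imported lazily so the validation above works without the package installed.
    from gptsubtitler import Transcriber

    Transcriber.transcribe(video_file, **kwargs)
```

With this in place, `subtitle("soldier.mp4", target_language="ro", captioning_model_type="medium")` would behave like the quick guide example, while a typo such as `captioning_model_type="huge"` fails immediately with a `ValueError` rather than partway through a run.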