https://github.com/ogkalu2/comic-translate

Desktop app for automatically translating comics - BDs, Manga, Manhwa, Fumetti and more in a variety of formats (Image, Pdf, Epub, cbr, cbz, etc) and in multiple languages.

anime comics computer-vision deep-learning gui inpainting machine-translation manga manhua manhwa neural-network ocr pyside6 python pytorch segmentation text-detection text-segmentation translation webtoons


Host: GitHub
URL: https://github.com/ogkalu2/comic-translate
Owner: ogkalu2
License: apache-2.0
Created: 2024-01-25T08:45:17.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2026-02-04T02:56:39.000Z (about 1 month ago)
Last Synced: 2026-02-04T14:24:50.853Z (about 1 month ago)
Topics: anime, comics, computer-vision, deep-learning, gui, inpainting, machine-translation, manga, manhua, manhwa, neural-network, ocr, pyside6, python, pytorch, segmentation, text-detection, text-segmentation, translation, webtoons
Language: Python
Homepage: https://comic-translate.com
Size: 27.4 MB
Stars: 2,372
Watchers: 28
Forks: 254
Open Issues: 99
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

awesome-comics-understanding - Comic Translate - Comic translation (🔧 Tools & Repositories / Overview of Comic/Manga Datasets and Tasks)


# Comic Translate

English | [한국어](docs/README_ko.md) | [Français](docs/README_fr.md) | [简体中文](docs/README_zh-CN.md) | [日本語](docs/README_ja.md) | [Português Brasileiro](docs/README_pt-BR.md)



## Intro

Many Automatic Manga Translators exist. Very few properly support comics of other kinds in other languages. 

This project was created to use state-of-the-art (SOTA) Large Language Models (LLMs) like GPT-4 to translate comics from all over the world. Currently, it supports translating to and from English, Korean, Japanese, French, Simplified Chinese, Traditional Chinese, Russian, German, Dutch, Spanish and Italian. It can translate to (but not from) Turkish, Polish, Portuguese and Brazilian Portuguese.

- [The State of Machine Translation](#the-state-of-machine-translation)
- [Preview](#comic-samples)
- [Getting Started](#installation)
    - [Installation](#installation)
        - [Python](#python)
    - [Usage](#usage)
        - [Tips](#tips)
    - [API Keys](#api-keys)
        - [Getting API Keys](#getting-api-keys)
            - [Open AI](#open-ai-gpt)
            - [Google Cloud Vision](#google-cloud-vision)
- [How it works](#how-it-works)
    - [Speech Bubble Detection and Text Segmentation](#speech-bubble-detection-and-text-segmentation)
    - [OCR](#ocr)
    - [Inpainting](#inpainting)
    - [Translation](#translation)
    - [Text Rendering](#text-rendering)
- [Acknowledgements](#acknowledgements)

## The State of Machine Translation

For a couple dozen languages, the best machine translator is, by a wide margin, not Google Translate, Papago or even DeepL, but a SOTA LLM like GPT-4o.

This is very apparent for distant language pairs (Korean<->English, Japanese<->English etc) where other translators still often devolve into gibberish.

Excerpt from "The Walking Practice"(보행 연습) by Dolki Min(돌기민)

![Model](https://i.imgur.com/72jvLBa.png)

## Comic Samples

GPT-4 as Translator.

Note: Some of these also have Official English Translations

[The Wretched of the High Seas](https://www.drakoo.fr/bd/drakoo/les_damnes_du_grand_large/les_damnes_du_grand_large_-_histoire_complete/9782382330128)

 

[Journey to the West](https://ac.qq.com/Comic/comicInfo/id/541812)

 

[The Wormworld Saga](https://wormworldsaga.com/index.php)

 

[Frieren: Beyond Journey's End](https://renta.papy.co.jp/renta/sc/frm/item/220775/title/742932/)

 

[Days of Sand](https://9ekunst.nl/2021/05/20/nieuw-album-van-aimee-de-jongh-is-benauwend-als-een-zandstorm/)

 

[Player (OH Hyeon-Jun)](https://comic.naver.com/webtoon/list?titleId=745876&page=1&sort=ASC&tab=fri)

 

[Carbon & Silicon](https://www.amazon.com/Carbone-Silicium-French-Mathieu-Bablet-ebook/dp/B0C1LTGZ85/)

 

## Installation

### Python

Install Python 3.12. Tick "Add python.exe to PATH" during the setup.

```bash

https://www.python.org/downloads/

```

Install git

```bash

https://git-scm.com/

```

Install uv

```

https://docs.astral.sh/uv/getting-started/installation/

```

Then, in the command line

```bash

git clone https://github.com/ogkalu2/comic-translate

cd comic-translate

uv init --python 3.12

```

and install the requirements

```bash

uv add -r requirements.txt --compile-bytecode

```

To update, run the following in the comic-translate folder

```bash

git pull

# Only run the next line if you did not use uv for the first-time installation
uv init --python 3.12

uv add -r requirements.txt --compile-bytecode

```

If you have an NVIDIA GPU, then it is recommended to run

```bash

uv pip install onnxruntime-gpu

```

## Usage

In the comic-translate directory, run

```bash

uv run comic.py

```

This will launch the GUI

### Tips

* If you have a CBR file, you'll need to install WinRAR or 7-Zip, then add the folder it is installed in (e.g. "C:\Program Files\WinRAR" on Windows) to PATH. If it is installed but not on PATH, you may get the error:

```bash

raise RarCannotExec("Cannot find working tool")

```

In that case, follow the instructions for [Windows](https://www.windowsdigitals.com/add-folder-to-path-environment-variable-in-windows-11-10/), [Linux](https://linuxize.com/post/how-to-add-directory-to-path-in-linux/) or [Mac](https://techpp.com/2021/09/08/set-path-variable-in-macos-guide/) to add the folder to PATH.

* Make sure the selected Font supports characters of the target language

* v2.0 introduces a Manual Mode. When you run into issues with Automatic Mode (No text detected, Incorrect OCR, Insufficient Cleaning etc), you are now able to make corrections. Simply Undo the Image and toggle Manual Mode.

* In Automatic Mode, once an Image has been processed, it is either loaded in the Viewer or stored so it can be loaded when you switch to it. This lets you keep reading in the app while the other Images are being translated.

* Ctrl + Mouse Wheel zooms; the Mouse Wheel alone scrolls vertically

* The Usual Trackpad Gestures work for viewing the Image

* Right, Left Keys to Navigate Between Images
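For the CBR tip above, a quick way to see whether an extraction tool is actually reachable from PATH is to ask Python directly. This is a minimal sketch, not part of the app: the function name `find_rar_tools` and the list of candidate executable names are assumptions chosen for illustration, and the exact names the app looks for may differ.

```python
import shutil
from typing import Callable, Dict, Optional

# Candidate executable names. This list is an assumption for illustration,
# not one taken from comic-translate itself.
CANDIDATE_TOOLS = ("unrar", "unar", "7z", "7zz", "bsdtar")


def find_rar_tools(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, str]:
    """Return {tool_name: resolved_path} for every candidate found on PATH."""
    found = {}
    for name in CANDIDATE_TOOLS:
        path = which(name)  # None when the executable is not on PATH
        if path:
            found[name] = path
    return found


if __name__ == "__main__":
    tools = find_rar_tools()
    if tools:
        for name, path in tools.items():
            print(f"{name}: {path}")
    else:
        print("No extraction tool found on PATH; add its install folder to PATH.")
```

If the script reports nothing, the install folder is most likely missing from PATH, which matches the "Cannot find working tool" error.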

## API Keys

The following selections require access to closed resources and, consequently, API Keys:

* GPT-4o or 4o-mini for Translation (Paid, about $0.01 USD/Page for 4o)

* DeepL Translator (Free for 500,000 characters/month)

* GPT-4o for OCR (Default Option for French, Russian, German, Dutch, Spanish, Italian) (Paid, about $0.02 USD/Page)

* Microsoft Azure Vision for OCR (Free for 5000 images/month)

* Google Cloud Vision for OCR (Free for 1000 images/month)

You can set your API Keys by going to Settings > Credentials
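To get a feel for the paid options, the approximate per-page prices listed above can be turned into a rough budget. The sketch below is only arithmetic on those quoted figures (about $0.01 per page for GPT-4o translation and about $0.02 per page for GPT-4o OCR); the function name `estimate_cost_usd` is hypothetical, and real costs depend on page content and current provider pricing.

```python
# Approximate prices quoted in this README, stored in cents to avoid
# floating-point drift. Actual provider pricing may differ.
GPT4O_TRANSLATION_CENTS_PER_PAGE = 1
GPT4O_OCR_CENTS_PER_PAGE = 2


def estimate_cost_usd(pages: int, use_gpt_ocr: bool = False) -> float:
    """Rough cost in USD of translating `pages` pages with GPT-4o."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    cents_per_page = GPT4O_TRANSLATION_CENTS_PER_PAGE
    if use_gpt_ocr:
        cents_per_page += GPT4O_OCR_CENTS_PER_PAGE
    return pages * cents_per_page / 100


if __name__ == "__main__":
    print(estimate_cost_usd(200))                    # translation only
    print(estimate_cost_usd(200, use_gpt_ocr=True))  # translation + OCR
```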

### Getting API Keys

#### Open AI (GPT)

* Go to OpenAI's Platform website at [platform.openai.com](https://platform.openai.com/) and sign in with (or create) an OpenAI account.

* Hover your Mouse over the right taskbar of the page and select "API Keys."

* Click "Create New Secret Key" to generate a new API key. Copy and store it.

#### Google Cloud Vision 

* Sign in/Create a [Google Cloud](https://cloud.google.com/) account. Go to [Cloud Resource Manager](https://console.cloud.google.com/cloud-resource-manager) and click "Create Project". Set your project name. 

* [Select your project here](https://console.cloud.google.com/welcome) then select "Billing" then "Create Account". In the pop-up, "Enable billing account", and accept the offer of a free trial account. Your "Account type" should be individual. Fill in a valid credit card.

* Enable Google Cloud Vision for your project [here](https://console.cloud.google.com/apis/library/vision.googleapis.com)

* In the [Google Cloud Credentials](https://console.cloud.google.com/apis/credentials) page, click "Create Credentials" then API Key. Copy and store it.

## How it works

### Speech Bubble Detection and Text Segmentation

[bubble-and-text-detector](https://huggingface.co/ogkalu/comic-text-and-bubble-detector). RT-DETR-v2 model trained on 11k images of comics (Manga, Webtoons, Western).

Algorithmic segmentation based on the boxes provided from the detection model.
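The README does not spell out which algorithm is used for the segmentation step. As a rough illustration of the general idea (boxes in, pixel mask out), the sketch below marks dark pixels inside each detected box as text. The intensity threshold is a stand-in technique chosen for simplicity, and the function name, box format and threshold value are all assumptions, not the app's actual method.

```python
from typing import List, Tuple

# (x1, y1, x2, y2) in pixels, with x2 and y2 exclusive. Hypothetical format.
Box = Tuple[int, int, int, int]


def text_mask_from_boxes(
    gray: List[List[int]], boxes: List[Box], threshold: int = 128
) -> List[List[int]]:
    """Return a 0/1 mask marking dark pixels that fall inside any box.

    `gray` is a grayscale image as rows of 0-255 intensities.
    """
    height = len(gray)
    width = len(gray[0]) if height else 0
    mask = [[0] * width for _ in range(height)]
    for x1, y1, x2, y2 in boxes:
        # Clamp the box to the image bounds.
        x1, y1 = max(0, x1), max(0, y1)
        x2, y2 = min(width, x2), min(height, y2)
        for y in range(y1, y2):
            for x in range(x1, x2):
                if gray[y][x] < threshold:  # dark pixel: treat as text
                    mask[y][x] = 1
    return mask
```

Restricting the threshold to the detected boxes keeps dark artwork elsewhere on the page out of the mask, which is the reason for running detection first.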

  

### OCR

By Default:

* [manga-ocr](https://github.com/kha-white/manga-ocr) for Japanese

* [Pororo](https://github.com/yunwoong7/korean_ocr_using_pororo) for Korean 

* [PPOCRv5](https://www.paddleocr.ai/main/en/version3.x/algorithm/PP-OCRv5/PP-OCRv5.html) for Everything Else

Optional:

These can be used for any of the supported languages. An API Key is required.

* [Google Cloud Vision](https://cloud.google.com/vision/docs/ocr)

* [Microsoft Azure Vision](https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/overview-ocr)

### Inpainting

To remove the segmented text

* A [Manga/Anime finetuned](https://huggingface.co/dreMaz/AnimeMangaInpainting) [lama](https://github.com/advimman/lama) checkpoint. Implementation courtesy of [lama-cleaner](https://github.com/Sanster/lama-cleaner)

* [AOT-GAN](https://arxiv.org/abs/2104.01431) based model by [zyddnys](https://github.com/zyddnys)

 

### Translation

Currently, this supports using GPT-4.1, DeepL, Claude-3, Gemini-2.5, Yandex, Google Translate and Microsoft Azure Translator.

All LLMs are fed the entire page text to aid translations. There is also the option to provide the image itself for further context.
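One way to hand an LLM the whole page at once is to pack every text block into a single keyed JSON payload, so the reply can be mapped back to the right bubbles. The sketch below shows that idea only; the prompt wording, the `block_N` key scheme and both function names are assumptions for illustration and are not taken from the app's code. No API call is made.

```python
import json
from typing import Dict, List


def build_page_prompt(blocks: List[str], source_lang: str, target_lang: str) -> str:
    """Pack all text blocks of one page into a single prompt string."""
    payload = {f"block_{i}": text for i, text in enumerate(blocks)}
    return (
        f"Translate the following comic page from {source_lang} to {target_lang}. "
        "Keep the JSON keys unchanged and translate only the values.\n"
        + json.dumps(payload, ensure_ascii=False, indent=2)
    )


def parse_page_reply(reply: str, block_count: int) -> List[str]:
    """Map a JSON reply back to block order; missing keys become ''."""
    data: Dict[str, str] = json.loads(reply)
    return [data.get(f"block_{i}", "") for i in range(block_count)]


if __name__ == "__main__":
    print(build_page_prompt(["こんにちは", "さようなら"], "Japanese", "English"))
```

Sending the blocks together lets the model resolve pronouns and running jokes across bubbles, which per-bubble requests cannot do.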

### Text Rendering

The translated text is wrapped to fit the bounding boxes obtained from the detected bubbles and text.
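Fitting wrapped text into a box usually means trying a font size, wrapping, and shrinking until the lines fit. The sketch below illustrates that loop with the standard library only. It assumes a crude fixed-width font model (each character is `char_ratio` times the font size wide, each line `line_ratio` times the font size tall); those ratios, the size range and the function name `fit_text` are hypothetical, and a real renderer would measure text with the actual font.

```python
import textwrap
from typing import List, Tuple


def fit_text(
    text: str,
    box_w: float,
    box_h: float,
    max_size: int = 40,
    min_size: int = 8,
    char_ratio: float = 0.6,  # assumed character width / font size
    line_ratio: float = 1.2,  # assumed line height / font size
) -> Tuple[int, List[str]]:
    """Return the largest font size whose wrapped lines fit the box."""
    for size in range(max_size, min_size - 1, -1):
        chars_per_line = int(box_w / (size * char_ratio))
        if chars_per_line < 1:
            continue
        # Do not split words; an overlong word means this size is too big.
        lines = textwrap.wrap(text, width=chars_per_line, break_long_words=False)
        too_wide = any(len(line) > chars_per_line for line in lines)
        too_tall = len(lines) * size * line_ratio > box_h
        if not too_wide and not too_tall:
            return size, lines
    # Nothing fits: fall back to the smallest size and allow word breaks.
    chars_per_line = max(1, int(box_w / (min_size * char_ratio)))
    return min_size, textwrap.wrap(text, width=chars_per_line)


if __name__ == "__main__":
    size, lines = fit_text("Hello world", box_w=60, box_h=40, max_size=20, min_size=10)
    print(size, lines)
```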

## Acknowledgements

* [https://github.com/Sanster/lama-cleaner](https://github.com/Sanster/lama-cleaner)

* [https://huggingface.co/dreMaz](https://huggingface.co/dreMaz)

* [https://github.com/yunwoong7/korean_ocr_using_pororo](https://github.com/yunwoong7/korean_ocr_using_pororo)

* [https://github.com/kha-white/manga-ocr](https://github.com/kha-white/manga-ocr)

* [https://github.com/JaidedAI/EasyOCR](https://github.com/JaidedAI/EasyOCR)

* [https://github.com/PaddlePaddle/PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)

* [https://github.com/RapidAI/RapidOCR](https://github.com/RapidAI/RapidOCR)

* [https://github.com/phenom-films/dayu_widgets](https://github.com/phenom-films/dayu_widgets)
