https://github.com/gasparl/openai_translation
Python script for translating .docx documents using the OpenAI API. It processes text in chunks, maintains context, supports sample translations for style, and saves results in .docx format.
https://github.com/gasparl/openai_translation
book-translation books chatgpt-api docx openai openai-api translation translation-api
Last synced: 10 months ago
JSON representation
Python script for translating .docx documents using the OpenAI API. It processes text in chunks, maintains context, supports sample translations for style, and saves results in .docx format.
- Host: GitHub
- URL: https://github.com/gasparl/openai_translation
- Owner: gasparl
- License: gpl-3.0
- Created: 2024-12-05T05:52:57.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-03-02T07:23:47.000Z (over 1 year ago)
- Last Synced: 2025-04-14T18:57:59.839Z (about 1 year ago)
- Topics: book-translation, books, chatgpt-api, docx, openai, openai-api, translation, translation-api
- Language: Python
- Homepage: https://gasparl.github.io/openai_translation/
- Size: 32.2 KB
- Stars: 3
- Watchers: 1
- Forks: 1
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Docx Translator with OpenAI
This repository provides a Python script for translating `.docx` files using OpenAI's GPT(-4) API.
---
## Features
- Supports translation between any pair of languages (configurable in the script).
- Handles large texts by splitting them into manageable chunks that fit OpenAI's token limits.
- Maintains contextual continuity by incorporating previous translations and source text into the translation process.
- Allows the use of a sample translation file to guide translation tone and style.
- The specific instructions can be modified. (The current instructions focuse on formal equivalence, aiming to preserve the original meaning, style, and structure of the text.)
- Reads input from `.docx` files and writes translated output to `.docx` files while preserving paragraph structure.
- Includes error handling and retry mechanisms for API calls.
---
## Requirements
1. Python 3.7 or higher.
2. Required libraries:
- `openai`
- `python-docx`
- `tiktoken`
3. A valid OpenAI API key.
To install the required libraries, run:
```bash
pip install openai python-docx tiktoken
## Setup
### Configuration File
Create a `config.json` file in the same directory as the script and add your OpenAI API key:
```json
{
"OPENAI_API_KEY": "your-api-key-here"
}
```
### Input
- Prepare the `.docx` file you want to translate.
- Optionally, create a `.docx` file with sample translations to guide the translation style.
---
## Usage
1. Open the script and modify the `main()` function call:
```python
main('input.docx', 'output.docx', sample_translation_file='sample_translation.docx')
```
- Replace `input.docx` with the path to your input file.
- Replace `output.docx` with the desired output file path.
- Optionally, specify a sample translation file.
2. Run the script; e.g. in bash:
```bash
python openai_translator.py
```
3. The translated document will be saved as the specified output file.