https://github.com/milahu/hocr-editor-qt
graphical HOCR editor to produce minimal diffs for proofreading of tesseract OCR output
- Host: GitHub
- URL: https://github.com/milahu/hocr-editor-qt
- Owner: milahu
- License: mit
- Created: 2025-08-25T15:24:39.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2025-10-25T13:51:56.000Z (7 months ago)
- Last Synced: 2025-10-25T15:20:43.496Z (7 months ago)
- Topics: cst-editor, hocr, hocr-editor, minimal-diff, ocr-post-processing, ocr-postprocessing, ocr-proofreading, proofreading, tesseract, tesseract-ocr
- Language: Python
- Homepage:
- Size: 534 KB
- Stars: 3
- Watchers: 0
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: readme.md
- License: license.txt
# hocr-editor-qt
graphical HOCR editor to produce minimal diffs for proofreading of tesseract OCR output
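to illustrate what a "minimal diff" means here, the following is a rough sketch, not this project's implementation. it assumes tesseract-style hOCR markup (`ocrx_word` spans with an `id`), and the helper `replace_word` and the sample document are hypothetical. only the text of one word is replaced, every other byte is left untouched, so the resulting diff has a single changed line

```python
import difflib
import html
import re

# hypothetical sample in the style of tesseract hOCR output
HOCR = """<div class='ocr_page' id='page_1' title='bbox 0 0 600 100'>
 <span class='ocr_line' id='line_1_1' title='bbox 10 10 590 40'>
  <span class='ocrx_word' id='word_1_1' title='bbox 10 10 80 40; x_wconf 91'>MIT</span>
  <span class='ocrx_word' id='word_1_2' title='bbox 90 10 200 40; x_wconf 62'>Licen5e</span>
 </span>
</div>
"""


def replace_word(hocr: str, word_id: str, new_text: str) -> str:
    """Replace the text of one span, leaving the rest of the file untouched."""
    pattern = re.compile(
        r"(<span[^>]*\bid=['\"]" + re.escape(word_id) + r"['\"][^>]*>)"
        r"([^<]*)(</span>)"
    )
    # a function replacement avoids backslash escapes being interpreted
    fixed, count = pattern.subn(
        lambda m: m.group(1) + html.escape(new_text) + m.group(3), hocr, count=1
    )
    if count != 1:
        raise KeyError(word_id)
    return fixed


fixed = replace_word(HOCR, "word_1_2", "License")

diff = list(
    difflib.unified_diff(
        HOCR.splitlines(), fixed.splitlines(), "ocr.hocr", "proofread.hocr", lineterm=""
    )
)
# changed lines, without the "---" / "+++" file headers
changed = [
    line
    for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]
print("\n".join(diff))
```

re-serializing the whole document with an XML library would usually also change quoting, whitespace or attribute order, which is why this sketch edits the text in place.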
## usage
```
python hocr-editor.py test/data/mit-license-template/mit-license-template.hocr
```
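the input is an hOCR file, as produced by tesseract. as a rough sketch of what such a file contains (assuming the usual `ocrx_word` spans with `bbox` and `x_wconf` in the `title` attribute; `WordCollector`, `parse_title` and the sample are hypothetical and not part of this project), the words can be listed with the python standard library

```python
from html.parser import HTMLParser

# hypothetical sample in the style of tesseract hOCR output
SAMPLE = """<div class='ocr_page' id='page_1' title='bbox 0 0 600 100'>
 <span class='ocr_line' id='line_1_1' title='bbox 10 10 590 40'>
  <span class='ocrx_word' id='word_1_1' title='bbox 10 10 80 40; x_wconf 91'>MIT</span>
  <span class='ocrx_word' id='word_1_2' title='bbox 90 10 200 40; x_wconf 62'>Licen5e</span>
 </span>
</div>
"""


def parse_title(title):
    """Extract bbox and word confidence from an hOCR title attribute."""
    props = {}
    for part in title.split(";"):
        fields = part.split()
        if not fields:
            continue
        if fields[0] == "bbox" and len(fields) == 5:
            props["bbox"] = tuple(int(v) for v in fields[1:])
        elif fields[0] == "x_wconf" and len(fields) == 2:
            props["conf"] = float(fields[1])
    return props


class WordCollector(HTMLParser):
    """Collect id, text, bbox and confidence of every ocrx_word span."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._current = None  # the word span currently open, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "span" and "ocrx_word" in classes:
            self._current = {"id": attrs.get("id"), "text": ""}
            self._current.update(parse_title(attrs.get("title") or ""))

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "span" and self._current is not None:
            self.words.append(self._current)
            self._current = None


collector = WordCollector()
collector.feed(SAMPLE)
words = collector.words
for word in words:
    print(word["id"], word["text"], word["bbox"], word["conf"])
```

words with a low confidence value are the likely candidates for proofreading.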
## screenshot

## install
### Linux
#### NixOS Linux
```
nix-shell -p git
git clone https://github.com/milahu/hocr-editor-qt
cd hocr-editor-qt
nix-shell
```
#### Debian Linux
```
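# python3 alone does not provide pip or venv on Debian
sudo apt install git python3 python3-pip python3-venv
git clone https://github.com/milahu/hocr-editor-qt
cd hocr-editor-qt
# recent Debian releases refuse system-wide pip installs (PEP 668), so use a virtual environment
python3 -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt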
```
### Windows
install `git` and `python3` with the [chocolatey package manager](https://chocolatey.org/install)
open powershell as administrator (right-click → run as administrator) and run
```
choco install git python3
```
then, in a non-admin powershell, run
```
git clone https://github.com/milahu/hocr-editor-qt
cd hocr-editor-qt
pip install -r requirements.txt
```