https://github.com/carlosacchi/captiocr
CaptiOCR - A real-time screen text extraction tool using Tesseract OCR. Capture, recognize, and log on-screen text dynamically. Future updates will include on-demand language installation, resizable selection areas, and live text overlays.
https://github.com/carlosacchi/captiocr
captions live logging ocr ocr-python ocr-recognition
Last synced: 10 months ago
JSON representation
CaptiOCR - A real-time screen text extraction tool using Tesseract OCR. Capture, recognize, and log on-screen text dynamically. Future updates will include on-demand language installation, resizable selection areas, and live text overlays.
- Host: GitHub
- URL: https://github.com/carlosacchi/captiocr
- Owner: CarloSacchi
- License: mit
- Created: 2025-02-07T14:10:44.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2025-02-20T21:42:38.000Z (10 months ago)
- Last Synced: 2025-02-20T22:30:27.230Z (10 months ago)
- Topics: captions, live, logging, ocr, ocr-python, ocr-recognition
- Language: Python
- Homepage:
- Size: 189 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# 🖥️ CaptiOCR - Real-Time Screen Text Extraction
**CaptiOCR** is an open-source **real-time screen text extraction tool** that allows users to capture on-screen text dynamically using **Tesseract OCR**. With an easy-to-use interface, you can **select an area of the screen** and extract text **continuously**, making it ideal for live captions, subtitles, or instant text recognition.
🚀 **Planned future enhancements** include:
- ✅ **On-demand language installation** for Tesseract OCR
- ✅ **Dynamically resizable selection area** instead of a fixed rectangle
- ✅ **Improved multi-monitor support**
- ✅ **Live text and Trasnlation overlay feature**
---
## 📌 Features
✔ **Real-time OCR processing** using [Tesseract OCR](https://github.com/tesseract-ocr/tesseract)
✔ **Multi-language support** (English, Italian, more in the future)
✔ **Logs and saves extracted text** to a file
✔ **Hotkey Support:** `Ctrl+Q` to stop capture
---
## 🛠️ Prerequisites
Before installing, ensure you have the following:
- ✅ **Python 3.x** installed
- ✅ **Tesseract OCR** installed ([Download it here](https://github.com/tesseract-ocr/tesseract))
- ✅ **Required Python libraries**
---
## 👥 Installation
### **1️⃣ Install Python & Dependencies**
```bash
pip install pytesseract pillow keyboard pyautogui tkinter
```
### **2️⃣ Install Tesseract OCR**
📉 **Windows users**:
Download & install Tesseract from [here](https://github.com/tesseract-ocr/tesseract).
Then, update the script with the correct Tesseract path:
```python
TESSDATA_PREFIX = r'C:\Program Files\Tesseract-OCR\tessdata'
TESSERACT_CMD = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
```
---
## ▶️ Running the Application
After installing the dependencies, clone the repository and run the script:
```bash
git clone https://github.com/CarloSacchi/CaptiOCR.git
cd CaptiOCR
python CaptiOCR.py
```
---
## 🖥️ How to Use
1️⃣ **Select a language** from the dropdown menu (Ita or Eng).
2️⃣ **Click "Start (Select Area)"** to open the selection window.
3️⃣ **Drag the yellow rectangle** over the area you want to capture.
4️⃣ **Press OK** to start real-time OCR.
5️⃣ **Press `Ctrl+Q`or `STOP`** to stop the capture.


📂 **Captured text is saved in a file named** in the same folder as the Python file, `capture_YYYY-MM-DD-HH-MM-SS.txt`.


---
## 📊 Future Development
This project is **open-source and under active development**. Planned features include:
- 📉 **Dynamic area selection (resizable & movable)**
- 📉 **On-demand installation of additional OCR languages**
- 📉 **Better multi-monitor support**
- 📉 **Live translator with DeepL or Google Translate API**
- 📉 **I improvement**
- 📉 **Cloud integration for storage**
If you want to contribute, feel free to submit **pull requests or feature requests**! 🚀
---
## 📚 License
This project is licensed under the **MIT License**, allowing free use and modification.
👤 **Author:** Carlo Sacchi