{"id":25819133,"url":"https://github.com/carlosacchi/captiocr","last_synced_at":"2026-03-16T10:01:23.847Z","repository":{"id":276324968,"uuid":"928937000","full_name":"carlosacchi/captiocr","owner":"carlosacchi","description":"CaptiOCR - A real-time screen text extraction tool using Tesseract OCR. Capture, recognize, and log on-screen text dynamically. Future updates will include on-demand language installation, resizable selection areas, and live text overlays.","archived":false,"fork":false,"pushed_at":"2026-02-17T20:39:59.000Z","size":828,"stargazers_count":6,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-02-18T01:35:06.079Z","etag":null,"topics":["captions","live","live-caption","live-captioning","live-captions","live-transcript","logging","ocr","ocr-python","ocr-recognition","saver","transcription"],"latest_commit_sha":null,"homepage":"https://www.captiocr.com","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/carlosacchi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-02-07T14:10:44.000Z","updated_at":"2026-02-17T20:40:00.000Z","dependencies_parsed_at":"2025-02-28T08:15:21.740Z","dependency_job_id":"ce5e320d-d282-42b8-a978-037c6b17813b","html_url":"https://github.com/carlosacchi/captiocr","commit_stats":null,"previous_names":["carlosacchi/live-caption-ocr-reader","carlosacchi/captiocr"],"tags_count":26,"template":false,"template_full_name":null,"
purl":"pkg:github/carlosacchi/captiocr","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/carlosacchi%2Fcaptiocr","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/carlosacchi%2Fcaptiocr/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/carlosacchi%2Fcaptiocr/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/carlosacchi%2Fcaptiocr/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/carlosacchi","download_url":"https://codeload.github.com/carlosacchi/captiocr/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/carlosacchi%2Fcaptiocr/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29658170,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-20T16:33:43.953Z","status":"ssl_error","status_checked_at":"2026-02-20T16:33:43.598Z","response_time":59,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["captions","live","live-caption","live-captioning","live-captions","live-transcript","logging","ocr","ocr-python","ocr-recognition","saver","transcription"],"created_at":"2025-02-28T08:15:10.441Z","updated_at":"2026-02-20T17:07:35.932Z","avatar_url":"https://github.com/carlosacchi.png","language":"Python","funding_links":[],"categories":["Screen Capture and Recording"],"sub_categories":[],"readme":"# 🖥️ CaptiOCR - Real-Time Screen Text Extraction\n\n[![GitHub Release](https://img.shields.io/github/v/release/CarloSacchi/CaptiOCR)](https://github.com/CarloSacchi/CaptiOCR/releases/latest)\n[![CodeQL](https://github.com/carlosacchi/captiocr/actions/workflows/github-code-scanning/codeql/badge.svg)](https://github.com/carlosacchi/captiocr/actions/workflows/github-code-scanning/codeql)\n\n**CaptiOCR** is an open-source **real-time screen text extraction tool** designed to capture and transcribe captions (subtitles) from video conferencing applications like **Microsoft Teams**, **Zoom**, and **Google Meet**. 
With an intuitive interface and powerful OCR capabilities, you can select any screen area and extract text continuously in real-time.\n\n---\n\n## ✨ Key Features\n\n✅ **Real-time OCR processing** using [Tesseract OCR](https://github.com/tesseract-ocr/tesseract)  \n✅ **Multi-language support** (English, Italian, French, German, Portuguese)  \n✅ **Multi-monitor support** with DPI awareness  \n✅ **Dynamic area selection** - drag, resize, and move capture areas during operation  \n✅ **Text processing** - automatic duplicate removal and text cleaning  \n✅ **Profile management** - save and load different configurations  \n✅ **Hotkey support** - `Ctrl+Q` to stop capture  \n✅ **Export options** - save captured text with custom naming  \n✅ **Debug logging** for troubleshooting  \n✅ **Modular architecture** - clean, maintainable codebase  \n\n---\n\n## 🛠️ Prerequisites\n\nBefore installation, ensure you have:\n- ✅ **Python 3.9+** installed  \n- ✅ **Tesseract OCR** installed ([Download here](https://github.com/tesseract-ocr/tesseract))  \n- ✅ **Windows OS** (primary support)  \n\n---\n\n## 📦 Installation\n\n### **1️⃣ Clone the Repository**\n```bash\ngit clone https://github.com/CarloSacchi/CaptiOCR.git\ncd CaptiOCR\n```\n\n### **2️⃣ Install Python Dependencies**\n```bash\npip install -r requirements.txt\n```\n\n### **3️⃣ Install Tesseract OCR**\n\n**Windows users:**  \nDownload and install Tesseract from the [official releases](https://github.com/tesseract-ocr/tesseract/releases).  
\nThe application will automatically detect standard installation paths.\n\n---\n\n## 🚀 Quick Start\n\nRun the application:\n```bash\npython CaptiOCR.py\n```\n\n### **Basic Usage:**\n\n1️⃣ **Select Language** - Choose your OCR language from the dropdown  \n2️⃣ **Click \"Start (Select Area)\"** - Open the area selection tool  \n3️⃣ **Drag to Select** - Draw a rectangle around the text area you want to capture  \n4️⃣ **Press ENTER** - Begin real-time text extraction  \n5️⃣ **Press Ctrl+Q or STOP** - End the capture session  \n6️⃣ **Name Your Capture** - Save with a custom filename  \n\n📁 **Output:** Captured text is saved in the `captures/` folder as timestamped `.txt` files.\n\n---\n\n## 🎯 Advanced Features\n\n### **Multi-Monitor Support**\n- **Automatic detection** of all connected monitors\n- **DPI awareness** for high-resolution displays\n- **Cross-monitor selection** - capture areas spanning multiple screens\n- **Monitor-specific positioning** for consistent setups\n\n### **Dynamic Capture Areas**\n- **Resizable borders** - adjust capture area during operation\n- **Movable windows** - reposition without stopping capture\n- **Multiple profiles** - save configurations for different applications\n\n### **Text Processing**\n- **Duplicate detection** - automatic removal of repeated text\n- **Text cleaning** - remove artifacts and formatting issues\n- **Processed output** - clean, readable transcriptions\n\n### **Profile Management**\n- **Save Settings** - store optimized configurations\n- **Quick Load** - switch between saved profiles\n- **Application-specific** - different settings for Teams, Zoom, Meet\n\n---\n\n## 💡 Tips \u0026 Best Practices\n\n### **Optimizing OCR Accuracy**\n- **Language Selection**: Choose the correct language model for best results with accents and special characters\n- **Capture Area**: Select short, wide rectangles that tightly frame the subtitle region\n- **Minimum Size**: Ensure capture areas are at least 50×50 pixels\n- **Stable Areas**: Target 
regions where text appears consistently\n\n### **Performance Optimization**\n- **Close unnecessary applications** to reduce system load\n- **Use specific language models** rather than auto-detection\n- **Regular cleanup** of old capture files and logs\n- **Monitor system resources** during extended capture sessions\n\n---\n\n## 📁 Project Structure\n\n```\nCaptiOCR/\n├── CaptiOCR.py              # Main application entry point\n├── captiocr/                # Core application modules\n│   ├── config/              # Settings and constants\n│   ├── core/                # OCR and capture logic\n│   ├── models/              # Data models\n│   ├── ui/                  # User interface components\n│   └── utils/               # Utilities and helpers\n├── captures/                # Saved text outputs\n├── config/                  # User preferences\n├── tessdata/                # OCR language files\n├── logs/                    # Application logs\n└── resources/               # Icons and assets\n```\n\n---\n\n## 🔧 Configuration\n\nThe application uses JSON configuration files stored in `config/`:\n- **User preferences** - UI settings, language choices\n- **Language data** - Available OCR models\n- **Capture profiles** - Saved area configurations\n\n---\n\n## 📋 System Requirements\n\n- **OS**: Windows 10/11 (primary), Linux/macOS (experimental)\n- **RAM**: 4GB minimum, 8GB recommended\n- **CPU**: Multi-core processor recommended for real-time processing\n- **Display**: Support for multiple monitors with varying DPI\n- **Storage**: 100MB+ for application and language files\n\n---\n\n## 🐛 Troubleshooting\n\n### **Common Issues:**\n- **OCR not working**: Verify Tesseract installation and PATH\n- **Text not detected**: Check language selection and capture area size\n- **Performance issues**: Close other applications, check system resources\n- **Multi-monitor problems**: Update display drivers, check DPI settings\n\n### **Debug Logging:**\nEnable debug logging in the application 
settings to capture detailed operation information for troubleshooting.\n\n---\n\n## 🗺️ Roadmap\n\n### **Upcoming Features**\n- 🔄 **Live translation** integration\n- 🔄 **Cloud storage** synchronization  \n- 🔄 **Export formats** (PDF, HTML, Word)\n- 🔄 **API integration** for external applications\n- 🔄 **Dark mode** and theme customization\n- 🔄 **Batch processing** capabilities\n\n---\n\n## 🤝 Contributing\n\nWe welcome contributions! Here's how to get started:\n\n1. **Fork** the repository\n2. **Create** a feature branch (`git checkout -b feature/amazing-feature`)\n3. **Follow** the coding guidelines in `CLAUDE.md`\n4. **Commit** your changes (`git commit -m 'Add amazing feature'`)\n5. **Push** to the branch (`git push origin feature/amazing-feature`)\n6. **Open** a Pull Request\n\n### **Development Guidelines**\n- Follow **PEP 8** Python style guide\n- Use **type hints** and **docstrings**\n- Maintain **modular architecture**\n- Add **comprehensive logging**\n- Update **version numbers** for functional changes\n\n---\n\n## 📄 License\n\nThis project is licensed under the **MIT License** - see the [LICENSE](LICENSE) file for details.\n\n---\n\n## 👤 Author \u0026 Support\n\n**Author:** Carlo Sacchi\n**Website:** [https://www.captiocr.com](https://www.captiocr.com)\n\nFor support, feature requests, or bug reports, please open an issue on GitHub.\n\n---\n\n**⭐ If CaptiOCR helps you, please consider giving it a star on GitHub!**","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcarlosacchi%2Fcaptiocr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcarlosacchi%2Fcaptiocr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcarlosacchi%2Fcaptiocr/lists"}