{"id":29573003,"url":"https://github.com/oshekharo/image2text-pro","last_synced_at":"2025-07-19T05:12:46.904Z","repository":{"id":117668510,"uuid":"602663173","full_name":"OshekharO/Image2Text-Pro","owner":"OshekharO","description":"An advanced, configurable OCR tool for extracting text from images with preprocessing and parallel processing capabilities. Optimized for Chinese text but supports multiple languages.","archived":false,"fork":false,"pushed_at":"2025-06-24T05:50:25.000Z","size":22,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-06-24T06:33:58.080Z","etag":null,"topics":["mtl","ocr","opencv","python","tesseract-ocr"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/OshekharO.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-02-16T17:24:37.000Z","updated_at":"2025-06-24T05:52:12.000Z","dependencies_parsed_at":"2024-11-09T18:22:39.759Z","dependency_job_id":"df3530c0-9fa1-4ac7-b5ef-f386ad1ba2d8","html_url":"https://github.com/OshekharO/Image2Text-Pro","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/OshekharO/Image2Text-Pro","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OshekharO%2FImage2Text-Pro","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OshekharO%2FImage2Text-Pro/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OshekharO%2FImage2Text
-Pro/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OshekharO%2FImage2Text-Pro/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/OshekharO","download_url":"https://codeload.github.com/OshekharO/Image2Text-Pro/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OshekharO%2FImage2Text-Pro/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265892712,"owners_count":23845062,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["mtl","ocr","opencv","python","tesseract-ocr"],"created_at":"2025-07-19T05:12:46.289Z","updated_at":"2025-07-19T05:12:46.884Z","avatar_url":"https://github.com/OshekharO.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Image2Text-Pro\n\nAdvanced OCR tool for extracting text from images with preprocessing and parallel processing.\n\n![Python](https://img.shields.io/badge/Python-3.8+-blue?logo=python)\n![OpenCV](https://img.shields.io/badge/OpenCV-4.5+-green?logo=opencv)\n![Tesseract](https://img.shields.io/badge/Tesseract-OCR-orange)\n\n## Features ✨\n\n- 📷 Supports multiple image formats (JPG, PNG, TIFF, BMP)\n- 🔍 Advanced image preprocessing for better OCR accuracy\n- ⚡ Parallel processing for fast batch operations\n- 🌍 Multi-language support (Chinese by default)\n- 📊 Progress tracking and performance metrics\n- 🛠️ Configurable preprocessing and OCR parameters\n\n## Installation 🛠️\n\n1. 
Install Tesseract OCR:\n   ```bash\n   # On Ubuntu/Debian\n   sudo apt install tesseract-ocr\n   sudo apt install libtesseract-dev\n\n   # On macOS\n   brew install tesseract\n   ```\n\n2. Python Dependencies:\n   ```bash\n   pip install -r requirements.txt\n   ```\n\n## Usage 🚀\n\n1. Basic Command:\n\n   ```bash\n   python text_extractor.py -i input_images -o output_texts\n   ```\n\n2. Advanced Usage:\n\n   ```bash\n   python text_extractor.py \\\n     -i ./photos \\\n     -o ./extracted_texts \\\n     --lang eng+chi_sim \\\n     --psm 11 \\\n     --workers 8\n   ```\n\n## Contributing 🤝\n\nWe welcome contributions! Please:\n\n1. Fork the repository\n2. Create a feature branch (`git checkout -b feature/your-feature`)\n3. Commit your changes (`git commit -m 'Add some feature'`)\n4. Push to the branch (`git push origin feature/your-feature`)\n5. Open a Pull Request\n\n\u003cdiv align=\"center\"\u003e \u003cp\u003eMade with ❤️ and Python\u003c/p\u003e \u003csub\u003eOCR accuracy may vary depending on image quality and language complexity\u003c/sub\u003e \u003c/div\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foshekharo%2Fimage2text-pro","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Foshekharo%2Fimage2text-pro","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foshekharo%2Fimage2text-pro/lists"}