{"id":13627358,"url":"https://github.com/t0mer/ocr-docker","last_synced_at":"2025-07-05T23:06:43.175Z","repository":{"id":86362108,"uuid":"354230607","full_name":"t0mer/ocr-docker","owner":"t0mer","description":"ocr-docker is a small, Flask-powered web app that extracts text from images and PDF documents using OCR","archived":false,"fork":false,"pushed_at":"2025-01-27T00:14:37.000Z","size":100808,"stargazers_count":51,"open_issues_count":3,"forks_count":14,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-03-02T13:11:18.392Z","etag":null,"topics":["docker","flask","ocr","python","tesseract"],"latest_commit_sha":null,"homepage":"","language":"CSS","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/t0mer.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-04-03T07:40:55.000Z","updated_at":"2025-02-25T23:20:24.000Z","dependencies_parsed_at":"2025-03-02T13:20:40.868Z","dependency_job_id":null,"html_url":"https://github.com/t0mer/ocr-docker","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/t0mer%2Focr-docker","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/t0mer%2Focr-docker/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/t0mer%2Focr-docker/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/t0mer%2Focr-docker/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/t0mer","down
load_url":"https://codeload.github.com/t0mer/ocr-docker/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244066179,"owners_count":20392406,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["docker","flask","ocr","python","tesseract"],"created_at":"2024-08-01T22:00:33.250Z","updated_at":"2025-03-17T16:09:51.626Z","avatar_url":"https://github.com/t0mer.png","language":"CSS","funding_links":[],"categories":["Projects by main language"],"sub_categories":["css"],"readme":"# OCR-Docker\n## Extract text from images \u0026 PDF files\n\nOCR-Docker is a Python \u0026 [Flask](https://flask.palletsprojects.com/en/1.1.x/) powered, easy-to-use system that extracts text from images and PDF files in multiple languages.\n\n## Features\n\n- Extract text from images (png, jpg, tiff).\n- Extract text from PDF files (single or multiple pages).\n\n## Components and Frameworks used in OCR-Docker\n* [tesseract-ocr](https://github.com/tesseract-ocr/) - open source OCR\n* [tessdata](https://github.com/tesseract-ocr/tessdata) - tesseract-ocr data models\n* [ghostscript](https://www.ghostscript.com/)\n* [imagemagick](https://imagemagick.org/index.php)\n* [pytesseract](https://pypi.org/project/pytesseract/)\n* [Pillow](https://pypi.org/project/Pillow/)\n* [Image](https://pypi.org/project/image/)\n* [Flask](https://flask.palletsprojects.com/en/1.1.x/)\n* [Loguru](https://pypi.org/project/loguru/)\n* [PyYAML](https://pypi.org/project/PyYAML/)\n\nThe OCR (Optical Character Recognition) feature is free thanks to [tesseract-ocr](https://github.com/tesseract-ocr/), an open source OCR project.\n## Installation\n#### docker-compose from hub\n```yaml\nversion: \"3.7\"\nservices:\n  ocr:\n    image: techblog/ocr-docker:latest\n    ports:\n      - \"8080:8080\"\n    container_name: tts-stt\n    labels:\n      - \"com.ouroboros.enable=true\"\n    networks:\n      - default\n    restart: unless-stopped\n```\nNow, run ```docker-compose up -d``` to pull and run your container.\nOpen your browser and navigate to your container's IP address on port 8080; you should see the following screen.\n\n[![OCR](https://github.com/t0mer/ocr-docker/blob/main/screenshot/ocr.png?raw=true \"OCR\")](https://github.com/t0mer/ocr-docker/blob/main/screenshot/ocr.png?raw=true \"OCR\")\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ft0mer%2Focr-docker","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ft0mer%2Focr-docker","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ft0mer%2Focr-docker/lists"}