{"id":22160271,"url":"https://github.com/zabir-nabil/autoocr","last_synced_at":"2025-07-26T09:31:23.013Z","repository":{"id":57412877,"uuid":"186209201","full_name":"zabir-nabil/autoocr","owner":"zabir-nabil","description":"Python wrapper for cross platform tesseract OCR engine with multiple languages (e.g. Bangla)","archived":false,"fork":false,"pushed_at":"2023-01-30T08:59:32.000Z","size":1189,"stargazers_count":17,"open_issues_count":0,"forks_count":4,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-06-12T11:04:23.506Z","etag":null,"topics":["bangla-ocr","image-to-text","multi-language-ocr","ocr","python-ocr","tesseract"],"latest_commit_sha":null,"homepage":"https://pypi.org/project/autoocr/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/zabir-nabil.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-05-12T03:57:54.000Z","updated_at":"2024-05-31T15:32:27.000Z","dependencies_parsed_at":"2023-02-16T05:45:56.036Z","dependency_job_id":null,"html_url":"https://github.com/zabir-nabil/autoocr","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/zabir-nabil/autoocr","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zabir-nabil%2Fautoocr","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zabir-nabil%2Fautoocr/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zabir-nabil%2Fautoocr/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zabir-nabil%2Fautoocr/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/h
osts/GitHub/owners/zabir-nabil","download_url":"https://codeload.github.com/zabir-nabil/autoocr/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zabir-nabil%2Fautoocr/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265393726,"owners_count":23757650,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bangla-ocr","image-to-text","multi-language-ocr","ocr","python-ocr","tesseract"],"created_at":"2024-12-02T04:07:15.153Z","updated_at":"2025-07-26T09:31:22.631Z","avatar_url":"https://github.com/zabir-nabil.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# autoocr\n\u003e A Python wrapper for the cross-platform Tesseract OCR engine with support for multiple languages (e.g. Bangla)\n\n## Installation\n\n```\npip3 install autoocr\n```\n\n## Usage\n\n### macOS\n\n* Import the library\n\n```\nfrom autoocr import AutoOCR # import the AutoOCR class\n```\n\n* Specify the language\n\n```\noa = AutoOCR(lang='bangla') # specify the language code\n```\n* Set the tessdata folder. On macOS, run `brew list tesseract` to find the path. 
This is only needed once.\n\n```\noa.set_datapath('/usr/local/Cellar/tesseract/4.0.0_1/share/tessdata')\n```\n* Get the text from an image by passing the image path\n\n```\nout_text = oa.get_text('image_ocr.jpg')\n```\n\n[![demo of autoocr on mac](demo.gif)](https://www.youtube.com/channel/UCVaObCskAlvvctDP9vZvW6w)\n\n\n### Windows\n\n* Install the Tesseract engine\n\n* Import the library\n\n```\nfrom autoocr import AutoOCR # import the AutoOCR class\n```\n\n* Specify the language\n\n```\noa = AutoOCR(lang='bangla') # specify the language code\n```\n* Set the tessdata folder. This is only needed once.\n\n```\noa.set_datapath('/path/to/tessdata')\n```\n* Get the text from an image by passing the image path\n\n```\nout_text = oa.get_text('image_ocr.jpg')\n```\n\n\n### Linux\n\n* Install the Tesseract engine by following the [tesseract-ocr](https://tesseract-ocr.github.io/) documentation.\n\n* Import the library\n\n```\nfrom autoocr import AutoOCR # import the AutoOCR class\n```\n\n* Specify the language\n\n```\noa = AutoOCR(lang='bangla') # specify the language code\n```\n* Set the tessdata folder. This is only needed once. On yum-based distributions, run `rpm -ql tesseract` to find the location.\n\n```\noa.set_datapath('/path/to/tessdata')\n```\n* Get the text from an image by passing the image path\n\n```\nout_text = oa.get_text('image_ocr.jpg')\n```\n\n\n## License\n\nThis project is licensed under the MIT License; see the [LICENSE](LICENSE) file for details.\n\n[![MIT License](https://opensource.org/files/CDPost.png)](https://opensource.org/)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzabir-nabil%2Fautoocr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzabir-nabil%2Fautoocr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzabir-nabil%2Fautoocr/lists"}