{"id":22144734,"url":"https://github.com/mohamedsaidsallam/simple-tesseract-python-ocr","last_synced_at":"2025-03-24T12:19:11.844Z","repository":{"id":135771250,"uuid":"326376373","full_name":"MohamedSaidSallam/Simple-Tesseract-Python-OCR","owner":"MohamedSaidSallam","description":"A simple tesseract python OCR done as a project for ASU 2020 for computer vision course.","archived":false,"fork":false,"pushed_at":"2021-01-03T15:52:19.000Z","size":15,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-01-29T17:23:42.827Z","etag":null,"topics":["computer-vision","ocr","python","python-argparse","tesseract","tesseract-ocr"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/MohamedSaidSallam.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.MD","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-01-03T10:04:19.000Z","updated_at":"2021-01-03T22:34:31.000Z","dependencies_parsed_at":null,"dependency_job_id":"3e5fd6fe-1cc1-4c8b-86c1-ba6e59113cec","html_url":"https://github.com/MohamedSaidSallam/Simple-Tesseract-Python-OCR","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MohamedSaidSallam%2FSimple-Tesseract-Python-OCR","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MohamedSaidSallam%2FSimple-Tesseract-Python-OCR/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories
/MohamedSaidSallam%2FSimple-Tesseract-Python-OCR/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MohamedSaidSallam%2FSimple-Tesseract-Python-OCR/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/MohamedSaidSallam","download_url":"https://codeload.github.com/MohamedSaidSallam/Simple-Tesseract-Python-OCR/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245267567,"owners_count":20587459,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-vision","ocr","python","python-argparse","tesseract","tesseract-ocr"],"created_at":"2024-12-01T22:31:05.625Z","updated_at":"2025-03-24T12:19:11.812Z","avatar_url":"https://github.com/MohamedSaidSallam.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Simple Tesseract Python OCR\n\n[![GitHub Release][github_release_badge]][github_release_link]\n[![License][license-image]][license-url]\n\nA simple tesseract python OCR done as a project for ASU 2020 for computer vision course.\n\n## Getting Started\n\nThese instructions will get you a copy of the project up and running on your local machine for development and testing purposes. 
See deployment for notes on how to deploy the project on a live system.\n\n### Prerequisites\n\nInstall the requirements using the following:\n\n```sh\npip install -r requirements.txt\n```\n\nOr, if you are using a Python venv:\n\n```sh\npython -m venv venv\nvenv/Scripts/activate\npip install -r requirements.txt\n```\n\nDuplicate ```.env.example```, rename the copy to ```.env```, and fill in the ```tesseract_path```.\n\n### Running the code\n\nYou can list the available parameters with:\n\n```sh\npython -m ocr --help\n```\n\n```sh\nusage: __main__.py [-h] -i IMAGE [-c] [-t TEXT_OUTPUT_FILENAME]\n                   [-f IMAGE_OUTPUT_FILENAME] [-v] [--getGrayScaleImage]\n                   [--removeNoise] [--applyThresholding]\n                   [--applyThresholdingInv] [--getDilatedImage]\n                   [--getErodedImage] [--applyOpening] [--applyClosing]\n                   [--getCannyResult]\n\nA simple tesseract python script to get text from input image. by default this\nlist of preprocessing functions is used [getGrayScaleImage, removeNoise,\napplyThresholdingInv, getDilatedImage]\n\noptional arguments:\n  -h, --help            show this help message and exit\n  -i IMAGE, --image IMAGE\n                        path to input image\n  -c, --show-final-image\n                        show the final image with an overlay of the text\n                        recognised. (default: False)\n  -t TEXT_OUTPUT_FILENAME, --text-output-filename TEXT_OUTPUT_FILENAME\n                        file name to put the text output in. (default:\n                        output.txt)\n  -f IMAGE_OUTPUT_FILENAME, --image-output-filename IMAGE_OUTPUT_FILENAME\n                        filename to output the final image in. (default:\n                        output.png)\n  -v, --verbose         Show intermediate images. (default: False)\n  --getGrayScaleImage   (PreProcessing) adds getGrayScaleImage to\n                        preprocessing. 
order is important.\n  --removeNoise         (PreProcessing) adds removeNoise to preprocessing.\n                        order is important.\n  --applyThresholding   (PreProcessing) adds applyThresholding to\n                        preprocessing. order is important.\n  --applyThresholdingInv\n                        (PreProcessing) adds applyThresholdingInv to\n                        preprocessing. order is important.\n  --getDilatedImage     (PreProcessing) adds getDilatedImage to preprocessing.\n                        order is important.\n  --getErodedImage      (PreProcessing) adds getErodedImage to preprocessing.\n                        order is important.\n  --applyOpening        (PreProcessing) adds applyOpening to preprocessing.\n                        order is important.\n  --applyClosing        (PreProcessing) adds applyClosing to preprocessing.\n                        order is important.\n  --getCannyResult      (PreProcessing) adds getCannyResult to preprocessing.\n                        order is important.\n\nSource: https://github.com/TheDigitalPhoenixX/Simple-Tesseract-Python-OCR\n```\n\n#### Example\n\n```sh\npy -m ocr -i \"example input\\input.jpg\" -v\n```\n\ninput.jpg\n![input.png](example%20input/input.jpg)\n\noutput.txt\n\n```\nThis is SAMPLE TEXT\nText is at different regions\n```\n\noutput.png\n![output.png](docs/output.png)\n\nverbose:\n![getGrayScaleImage.png](docs/getGrayScaleImage.png)\n![removeNoise.png](docs/removeNoise.png)\n![applyThresholdingInv.png](docs/applyThresholdingInv.png)\n![getDilatedImage.png](docs/getDilatedImage.png)\n\n## Built With\n\n* [Visual Studio Code](https://code.visualstudio.com/) - Code Editor\n\n## Contributing\n\nPlease read [CONTRIBUTING.md](CONTRIBUTING.md) for details on our code of conduct, and the process for submitting pull requests to us.\n\n## Versioning\n\nWe use [SemVer](http://semver.org/) for versioning. 
For the versions available, see the [tags on this repository][github-tags].\n\n## Authors\n\n* **Mohamed Said Sallam** - Main Dev - [TheDigitalPhoenixX](https://github.com/TheDigitalPhoenixX)\n\nSee also the list of [contributors][github-contributors] who participated in this project and their work in [CONTRIBUTORS.md](CONTRIBUTORS.md).\n\n## License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n## Acknowledgments\n\n* [README.md Template](https://gist.github.com/PurpleBooth/109311bb0361f32d87a2)\n\n[license-image]: https://img.shields.io/badge/License-MIT-brightgreen.svg\n[license-url]: https://opensource.org/licenses/MIT\n\n[github_release_badge]: https://img.shields.io/github/v/release/TheDigitalPhoenixX/Simple-Tesseract-Python-OCR.svg?style=flat\u0026include_prereleases\n[github_release_link]: https://github.com/TheDigitalPhoenixX/Simple-Tesseract-Python-OCR/releases\n\n[github-contributors]: https://github.com/TheDigitalPhoenixX/Simple-Tesseract-Python-OCR/contributors\n[github-tags]: https://github.com/TheDigitalPhoenixX/Simple-Tesseract-Python-OCR/tags\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmohamedsaidsallam%2Fsimple-tesseract-python-ocr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmohamedsaidsallam%2Fsimple-tesseract-python-ocr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmohamedsaidsallam%2Fsimple-tesseract-python-ocr/lists"}