{"id":20376616,"url":"https://github.com/jakartaresearch/receipt-ocr","last_synced_at":"2026-03-06T02:38:54.298Z","repository":{"id":46314059,"uuid":"385632353","full_name":"jakartaresearch/receipt-ocr","owner":"jakartaresearch","description":"Receipt OCR","archived":false,"fork":false,"pushed_at":"2023-10-09T14:44:05.000Z","size":565,"stargazers_count":48,"open_issues_count":1,"forks_count":13,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-03-26T02:51:08.856Z","etag":null,"topics":["computer-vision","deep-learning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jakartaresearch.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-07-13T14:28:03.000Z","updated_at":"2025-03-09T18:08:38.000Z","dependencies_parsed_at":"2022-08-25T06:21:56.669Z","dependency_job_id":null,"html_url":"https://github.com/jakartaresearch/receipt-ocr","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jakartaresearch%2Freceipt-ocr","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jakartaresearch%2Freceipt-ocr/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jakartaresearch%2Freceipt-ocr/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jakartaresearch%2Freceipt-ocr/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jakartaresearch","download_url":"https://codeload.github.com/jakartaresearch/receipt-ocr/tar.gz/refs/heads/
main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248536113,"owners_count":21120681,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-vision","deep-learning"],"created_at":"2024-11-15T01:38:38.318Z","updated_at":"2026-03-06T02:38:54.236Z","avatar_url":"https://github.com/jakartaresearch.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Optical Character Recognition for Receipt\n\n## Sample Results\nInput Image             |  Output\n:----------------------:|:----------------------:\n\u003cimg src=\"./data/tes.jpg\" width=\"300\" title=\"sample-input\"\u003e  |  \u003cimg src=\"./data/sample_output.jpg\" width=\"300\" title=\"sample-output\"\u003e\n\n## References\n\n| Title                                                                                   | Author           | Year | Github | Paper | Download Model|\n| ----------------------------------------------------------------------------------------| ---------------- | ---- | --------- | ----- |  -------- | \n| Character Region Awareness for Text Detection                                           | Clova AI Research, NAVER Corp.| 2019 | https://github.com/clovaai/CRAFT-pytorch | https://arxiv.org/abs/1904.01941 | [craft_mlt_25k.pth](https://drive.google.com/file/d/1Jk4eGD7crsqCCg9C9VjCLkMN3ze8kutZ/view)|\n| What Is Wrong With Scene Text Recognition Model Comparisons? 
Dataset and Model Analysis | Clova AI Research, NAVER Corp.| 2019 | https://github.com/clovaai/deep-text-recognition-benchmark | https://arxiv.org/abs/1904.01906 | [TPS-ResNet-BiLSTM-Attn-case-sensitive.pth](https://www.dropbox.com/sh/j3xmli4di1zuv3s/AAArdcPgz7UFxIHUuKNOeKv_a?dl=0) |\n\n## Folder structure\n```\n.\n├─ configs               \n|  ├─ craft_config.yaml  \n|  └─ star_config.yaml   \n├─ data\n|  ├─ sample_output.jpg  \n|  └─ tes.jpg\n├─ notebooks                          \n|  ├─ export_onnx_model.ipynb         \n|  ├─ inference_default_engine.ipynb  \n|  ├─ inference_onnx_engine.ipynb     \n|  └─ test_api.ipynb                  \n├─ src                                                               \n|  ├─ text_detector                                         \n|  │  ├─ basenet                                           \n|  │  │  ├─ __init__.py                           \n|  │  │  └─ vgg16_bn.py                           \n|  │  ├─ modules                                              \n|  │  │  ├─ __init__.py                           \n|  │  │  ├─ craft.py                              \n|  │  │  ├─ craft_utils.py                        \n|  │  │  ├─ imgproc.py                            \n|  │  │  ├─ refinenet.py                          \n|  │  │  └─ utils.py                              \n|  │  ├─ __init__.py                              \n|  │  ├─ infer.py                                 \n|  │  └─ load_model.py                            \n|  ├─ text_recognizer                                           \n|  │  ├─ modules                                              \n|  │  │  ├─ dataset.py                            \n|  │  │  ├─ feature_extraction.py                 \n|  │  │  ├─ model.py                              \n|  │  │  ├─ model_utils.py                        \n|  │  │  ├─ prediction.py                         \n|  │  │  ├─ sequence_modeling.py                  \n|  │  │  ├─ transformation.py                     \n|  │  │  └─ utils.py       
                       \n|  │  ├─ __init__.py                              \n|  │  ├─ infer.py                                 \n|  │  └─ load_model.py                            \n|  ├─ __init__.py                                 \n|  ├─ engine.py                                   \n|  └─ model.py                                    \n├─ .gitignore\n├─ CONTRIBUTING.md\n├─ Dockerfile\n├─ environment.yaml\n├─ LICENSE\n├─ main.py\n├─ pyproject.toml\n├─ README.md\n├─ requirements.txt\n└─ setup.cfg\n```\n\n## Model Preparation\nCreate a \"models\" folder and store the pretrained weights at these paths:\n- detector_model = \"models/text_detector/craft_mlt_25k.pth\"\n- recognizer_model = \"models/text_recognizer/TPS-ResNet-BiLSTM-Attn-case-sensitive.pth\"\n\nDownload both pretrained models from the links in the \"References\" section.\n\n## Requirements\nYou can set up the environment using pip or conda:\n```\npip install -r requirements.txt\n```\nor\n```\nconda env create -f environment.yaml\n```\n\n## Container\n```\ndocker build -t receipt-ocr .\ndocker run -d --name receipt-ocr-service -p 80:80 receipt-ocr\ndocker start receipt-ocr-service\ndocker stop receipt-ocr-service\n```\n\n## How to contribute?\nCheck the docs [here](CONTRIBUTING.md)\n\n## Creator\n[![](https://github.com/andreaschandra/git-assets/blob/master/pictures/ruben.png)](https://github.com/rubentea16)","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjakartaresearch%2Freceipt-ocr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjakartaresearch%2Freceipt-ocr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjakartaresearch%2Freceipt-ocr/lists"}