{"id":13485375,"url":"https://github.com/SimformSolutionsPvtLtd/tesseract-OCR-iOS-demo","last_synced_at":"2025-03-27T19:31:11.074Z","repository":{"id":45368355,"uuid":"225019014","full_name":"SimformSolutionsPvtLtd/tesseract-OCR-iOS-demo","owner":"SimformSolutionsPvtLtd","description":"This prototype is to recognize text inside the image and for that it uses Tesseract OCR. The underlying Tesseract engine will process the picture and return anything that it believes is text.","archived":false,"fork":false,"pushed_at":"2020-12-01T08:47:12.000Z","size":12930,"stargazers_count":31,"open_issues_count":1,"forks_count":5,"subscribers_count":5,"default_branch":"master","last_synced_at":"2024-10-30T20:44:20.635Z","etag":null,"topics":["demo","example-project","ios","ocr","ocr-library","optical-character-recognition","sample","swift","tesseract"],"latest_commit_sha":null,"homepage":"","language":"Swift","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/SimformSolutionsPvtLtd.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-11-30T13:47:32.000Z","updated_at":"2024-09-06T11:33:54.000Z","dependencies_parsed_at":"2022-09-06T07:31:36.416Z","dependency_job_id":null,"html_url":"https://github.com/SimformSolutionsPvtLtd/tesseract-OCR-iOS-demo","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SimformSolutionsPvtLtd%2Ftesseract-OCR-iOS-demo","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SimformSolutionsPvtLtd%2Ftesseract-OCR-iOS-demo/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts
/GitHub/repositories/SimformSolutionsPvtLtd%2Ftesseract-OCR-iOS-demo/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SimformSolutionsPvtLtd%2Ftesseract-OCR-iOS-demo/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/SimformSolutionsPvtLtd","download_url":"https://codeload.github.com/SimformSolutionsPvtLtd/tesseract-OCR-iOS-demo/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245910755,"owners_count":20692497,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["demo","example-project","ios","ocr","ocr-library","optical-character-recognition","sample","swift","tesseract"],"created_at":"2024-07-31T18:00:21.013Z","updated_at":"2025-03-27T19:31:08.354Z","avatar_url":"https://github.com/SimformSolutionsPvtLtd.png","language":"Swift","funding_links":[],"categories":["iOS","iOS Guides"],"sub_categories":["Swift"],"readme":"# Tesseract OCR iOS Prototype\n\nThis prototype is to recognize text inside the image and for that it uses Tesseract OCR. 
The underlying Tesseract engine will process the picture and return anything that it believes is text.\n\n\u003cimg src=\"https://user-images.githubusercontent.com/8736329/70211504-4678b500-175b-11ea-9479-8362a0b8cde0.gif\"\nwidth=\"350\" height=\"600\"\u003e     \u003cimg src=\"https://user-images.githubusercontent.com/8736329/70211414-031e4680-175b-11ea-8575-c371f08b0720.gif\"\nwidth=\"350\" height=\"600\"\u003e \n\n\n## Getting Started\n\nFollow the points below to use [Tesseract OCR iOS](https://github.com/gali8/Tesseract-OCR-iOS) and get better output from it.\n\n## Add Trained Data [Tessdata Folder]\n\nTraining data is used to train an algorithm. Generally, training data is a certain percentage of an overall dataset, with the rest kept as a testing set. As a rule, the better the training data, the better the algorithm or classifier performs. Tesseract requires language-specific training data to perform predictions; language-specific means that it predicts within the boundaries of a given language.\n\n\nTo add training data, drag the tessdata folder into your project and set the Added folders option to Create folder references. This creates a referenced folder. Do not forget to select a target before clicking Finish.\nFor this project we have only included the English training files in the tessdata folder. You can download and add [tessdata](https://github.com/tesseract-ocr/tessdata) files as per your project requirements.\n\n\n![Monosnap 2019-12-05 11-41-18](https://user-images.githubusercontent.com/8736329/70208814-88eac380-1754-11ea-81ea-c66b2a789dc0.png)\n\n## Scaling and Removing noise from an image\n\n### Scaling Image\n\nImage scaling is performed to enhance resolution without loss of image quality. 
We can implement this by preserving the aspect ratio of the image, that is, the proportional relationship between its width and height.\n\n\n### Removing Noise from Image\n\n\nImage noise is a random variation of brightness or color information in images and is usually an aspect of electronic noise. Removing noise from an image improves its quality.\n\n## Use Cases of Tesseract OCR\n\n\nTesseract OCR can be used to recognize documents, receipts, street signs, and more. Let's go through each of them with examples.\n\n\n### Documents\n\n- Let’s consider an example of a picture of a book page.\n\n![doc1](https://user-images.githubusercontent.com/8736329/70234375-886b2080-1786-11ea-9f66-b68dfb759dcb.png)\n\n```\noutput:\n\nMild Splendour of the various-vested Night!\nMother of sirildly-working visions! haill\nI watch thy gliding, while with watery li ht\nThy weak eye glimmers through a fleecy veil;\nAnd when thou lovest thy pale orb to shroud\nBehind the ather'd blackness lost on high;\nAnd when thou dartest from the wind-rent cloud\nThy placid lightning o'er the awaken'd sky.\n```\n\n\n### Receipts\n\n\n- A slightly more difficult example is a receipt, which has a non-uniform text layout and multiple fonts. Book pages and documents have a well-defined structure, little variation in font size, and evenly spaced text, which is not the case for receipts. 
This example shows how Tesseract performs on a scanned receipt.\n\n\n![receipt](https://user-images.githubusercontent.com/8736329/70234293-5b1e7280-1786-11ea-8b18-27728a210bc0.png)\n\n```\noutput:\n\nStore #05666\n3515 DEL MAR HTS, RD\nSAN DIEGO, CA 92130\n(858) 792-7040\n\nRegister #4 Transaction #571140\nCashier #56661020 8/20/17 5:45PM\n\nwellness+ with Plenti\nPlenti Card#: 31)000000000(4553\n1 G2 RETRACT BOLD BLK 2PK\n1.99 T\n\nSALE 1/1.99, Reg 1/4.69\nDiscount 2 70-\n1 Items\n\nSubtotal\n1.99\nTax\n.15\nTotal\n2.14\n\n*MASTER*\n2.14\nMASTER card * #)0()000000000(5485\nApp #AA APPROVAL AUTO\nRef # 05639E\nEntry Method: Chip\n```\n\n\n### Street Signs\n\n- Tesseract can be used to recognize street signs as well. This example shows how it behaves when we pass an image with symbols and dark boundaries.\n- Tesseract does not do a very good job with dark boundaries and often assumes they are text. However, if we help Tesseract a bit by cropping the image to the text region, it gives perfect output.\n\n![ss1](https://user-images.githubusercontent.com/8736329/70234485-bfd9cd00-1786-11ea-8f7e-1e328fc63733.jpeg)\n\n```\noutput:\n\n2:fi)::s\n\nCaution\nSite traffic\n```\n\n- The output contains a mistake caused by a symbol in the image.\n\n### License\n\nTesseract OCR iOS and TesseractOCR.framework are distributed under the MIT license (see LICENSE.md).\n\nTesseract, maintained by Google (http://code.google.com/p/tesseract-ocr/), is distributed under the Apache 2.0 license (see http://www.apache.org/licenses/LICENSE-2.0).\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FSimformSolutionsPvtLtd%2Ftesseract-OCR-iOS-demo","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FSimformSolutionsPvtLtd%2Ftesseract-OCR-iOS-demo","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FSimformSolutionsPvtLtd%2Ftesseract-OCR-iOS-demo/lists"}