{"id":22656519,"url":"https://github.com/pranav-0309/ocr_model_dc","last_synced_at":"2026-04-20T09:32:41.252Z","repository":{"id":267122350,"uuid":"900317357","full_name":"pranav-0309/OCR_model_dc","owner":"pranav-0309","description":"OCR model to extract a primary and a secondary ID, for each image-insurance type pair.","archived":false,"fork":false,"pushed_at":"2024-12-08T13:45:03.000Z","size":2672,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-29T07:44:57.318Z","etag":null,"topics":["jupyter-notebook","ocr","ocr-python","ocr-recognition","python3","pytorch"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pranav-0309.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-12-08T13:24:53.000Z","updated_at":"2024-12-08T13:45:06.000Z","dependencies_parsed_at":null,"dependency_job_id":"e5395cf2-3bd1-45ae-8979-05aa9ba80bc0","html_url":"https://github.com/pranav-0309/OCR_model_dc","commit_stats":null,"previous_names":["pranav-0309/ocr_model_dc"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pranav-0309%2FOCR_model_dc","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pranav-0309%2FOCR_model_dc/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pranav-0309%2FOCR_model_dc/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/p
ranav-0309%2FOCR_model_dc/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pranav-0309","download_url":"https://codeload.github.com/pranav-0309/OCR_model_dc/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246156029,"owners_count":20732359,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["jupyter-notebook","ocr","ocr-python","ocr-recognition","python3","pytorch"],"created_at":"2024-12-09T10:14:44.594Z","updated_at":"2026-04-20T09:32:41.204Z","avatar_url":"https://github.com/pranav-0309.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"![digitizing_team](digitizing_team.png)\r\n\r\nThis project takes a hypothetical scenario: an insurance company has launched an initiative to digitize all of its historical insurance claim documents. Part of that work is improving the labeling of IDs scanned from paper documents by identifying each one as a primary or a secondary ID.\r\n\r\nTo help with this effort, I've used multi-modal learning to train an Optical Character Recognition (OCR) model. To improve the classification, the model takes two inputs: **images** of the scanned documents and their **insurance type** (home, life, auto, health, or other).\r\n\r\nIntegrating different data modalities (such as image and text) lets the model capture more nuanced information and perform better in complex scenarios.
\r\n\r\nThe model is trained to predict two **labels** for each image-insurance type pair: a primary ID and a secondary ID.\r\n\r\nTo have a look at my code, open the `notebook.ipynb` file!","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpranav-0309%2Focr_model_dc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpranav-0309%2Focr_model_dc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpranav-0309%2Focr_model_dc/lists"}