{"id":13476301,"url":"https://github.com/gnonio/korporize","last_synced_at":"2025-03-27T02:32:26.449Z","repository":{"id":101944794,"uuid":"261885668","full_name":"gnonio/korporize","owner":"gnonio","description":"OCR - Object Character Recognition for any image you browse upon","archived":false,"fork":false,"pushed_at":"2020-05-20T22:28:25.000Z","size":37214,"stargazers_count":11,"open_issues_count":2,"forks_count":2,"subscribers_count":1,"default_branch":"master","last_synced_at":"2024-08-01T16:44:46.372Z","etag":null,"topics":["javascript","ocr-recognition","webextensions"],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gnonio.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2020-05-06T21:46:42.000Z","updated_at":"2024-02-21T11:38:34.000Z","dependencies_parsed_at":null,"dependency_job_id":"7d81a64b-0d48-44fe-8d06-2f8186c4f7ff","html_url":"https://github.com/gnonio/korporize","commit_stats":null,"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gnonio%2Fkorporize","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gnonio%2Fkorporize/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gnonio%2Fkorporize/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gnonio%2Fkorporize/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gnonio","download_url":"https://codeload.github.com/gnonio/korporize/ta
r.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":222184234,"owners_count":16945023,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["javascript","ocr-recognition","webextensions"],"created_at":"2024-07-31T16:01:28.746Z","updated_at":"2024-10-30T08:31:22.708Z","avatar_url":"https://github.com/gnonio.png","language":"JavaScript","funding_links":[],"categories":["JavaScript"],"sub_categories":[],"readme":"[![korporize](./img/korporize.svg)](http://tesseract.projectnaptha.com)\n\n## korpora - OCR - Optical Character Recognition\n\nOffline text recognition from any image. This web extension will enable context menu access to extract text from any image while browsing. 
Builds upon [Tesseract.js](https://github.com/naptha/tesseract.js)\n\n****\n\n### Install\n\n- [Mozilla Add-ons page](https://addons.mozilla.org/en-GB/firefox/addon/korporize/)\n\n#### Alternate install for advanced users\n\n- [Download this repository](https://github.com/gnonio/korporize/archive/master.zip)\n- Follow the instructions for [Temporary installation in Firefox](./user-install.md)\n\n****\n\n### Usage\n\n- Right-click an image in a web page\n- Select \"Extract Text from Image\"\n- A popup will open with the korporize interface\n- Wait for Tesseract to finish working in the background\n- Read the results in the korporize panel\n- (Optional) Copy the results to the clipboard\n\n****\n\nTo obtain good results:\n- make sure the automatically detected language suits the characters in the loaded image\n- if it does not, force another language via the Options page\n- increase the quality in the Options page\n(try Normal or Best - both will take longer)\n- make sure the page segmentation mode is suitable for the image\n(this choice will be made more accessible in future releases)\n- choose a high-resolution version of the image\n\n****\n\n### Features\n\n- Extracts text from any image while browsing\n- Works offline (network access is required only the first time a language is used, to download and cache its dictionaries)\n- Automatic language detection (based on the visited web page)\n- Avoids downloading already loaded images a second time\n\n****\n\n### Notes\n\n- Be mindful of the size of the language dictionaries\n- Expect around 8 MB for Normal and 12 MB for Best quality per language\n- Aside from the dictionaries above, no other data is ever stored by korporize\n\n****\n\n### Todo\n\n- Many other options for accessing Tesseract functionality (image from link, PDF load and save, etc.)\n- Preloading of language dictionaries (via the Options page)\n- Provide some cache management options\n- Provide access as an API for other 
webextensions","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgnonio%2Fkorporize","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgnonio%2Fkorporize","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgnonio%2Fkorporize/lists"}