{"id":18834679,"url":"https://github.com/gkovacs/pdfocr","last_synced_at":"2025-04-06T18:14:28.858Z","repository":{"id":872436,"uuid":"613143","full_name":"gkovacs/pdfocr","owner":"gkovacs","description":"Adds text to PDF files using the cuneiform OCR software","archived":false,"fork":false,"pushed_at":"2021-02-17T23:51:27.000Z","size":38,"stargazers_count":326,"open_issues_count":24,"forks_count":50,"subscribers_count":18,"default_branch":"master","last_synced_at":"2025-03-30T15:12:31.301Z","etag":null,"topics":["ocr","pdf","ruby"],"latest_commit_sha":null,"homepage":"","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gkovacs.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2010-04-16T06:20:50.000Z","updated_at":"2025-02-05T22:18:30.000Z","dependencies_parsed_at":"2022-08-16T11:15:23.457Z","dependency_job_id":null,"html_url":"https://github.com/gkovacs/pdfocr","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gkovacs%2Fpdfocr","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gkovacs%2Fpdfocr/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gkovacs%2Fpdfocr/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gkovacs%2Fpdfocr/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gkovacs","download_url":"https://codeload.github.com/gkovacs/pdfocr/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247526763,"owners_count"
:20953143,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ocr","pdf","ruby"],"created_at":"2024-11-08T02:13:33.944Z","updated_at":"2025-04-06T18:14:28.820Z","avatar_url":"https://github.com/gkovacs.png","language":"Ruby","readme":"# pdfocr\n\npdfocr adds an OCR text layer to scanned PDF files so that they can be searched. It requires Ruby 1.8.7 or later and uses ocropus, cuneiform, or tesseract to perform OCR.\n\n## Using\n\nTo use, run:\n\npdfocr -i input.pdf -o output.pdf\n\nFor more details, see the manpage.\n\n## Dependencies\n\npdfocr requires tesseract and hocr2pdf. To get them, install the packages tesseract-ocr, tesseract-ocr-eng (or the packages for other languages you need), and exactimage (which provides hocr2pdf) from your distribution.\n\n## Credits\n\npdfocr was written by [Geza Kovacs](http://github.com/gkovacs).\n\npdfocr is hosted at http://github.com/gkovacs/pdfocr\n\nChristian Pietsch added tesseract support.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgkovacs%2Fpdfocr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgkovacs%2Fpdfocr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgkovacs%2Fpdfocr/lists"}