{"id":15442856,"url":"https://github.com/tddschn/apple-vision-utils","last_synced_at":"2025-08-12T16:11:20.076Z","repository":{"id":240961653,"uuid":"803946935","full_name":"tddschn/apple-vision-utils","owner":"tddschn","description":"Fast and accurate OCR on images and PDFs using Apple Vision framework directly from command line.","archived":false,"fork":false,"pushed_at":"2025-01-31T01:53:39.000Z","size":344,"stargazers_count":6,"open_issues_count":1,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-08-08T22:55:02.803Z","etag":null,"topics":["apple-vision-framework","command-line-tool","ocr","pdf","pyobjc","python3"],"latest_commit_sha":null,"homepage":"https://pypi.org/project/apple-vision-utils","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tddschn.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-05-21T17:00:06.000Z","updated_at":"2025-03-29T12:12:07.000Z","dependencies_parsed_at":null,"dependency_job_id":"eb0a964a-8115-45b0-aa92-8c9f2da5e170","html_url":"https://github.com/tddschn/apple-vision-utils","commit_stats":null,"previous_names":["tddschn/apple-vision-utils"],"tags_count":9,"template":false,"template_full_name":null,"purl":"pkg:github/tddschn/apple-vision-utils","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tddschn%2Fapple-vision-utils","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tddschn%2Fapple-vision-utils/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tddschn%2Fap
ple-vision-utils/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tddschn%2Fapple-vision-utils/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tddschn","download_url":"https://codeload.github.com/tddschn/apple-vision-utils/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tddschn%2Fapple-vision-utils/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":270092981,"owners_count":24525538,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-12T02:00:09.011Z","response_time":80,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["apple-vision-framework","command-line-tool","ocr","pdf","pyobjc","python3"],"created_at":"2024-10-01T19:30:49.633Z","updated_at":"2025-08-12T16:11:20.015Z","avatar_url":"https://github.com/tddschn.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Apple Vision Framework Python Utilities\n\nFast and accurate OCR on images and PDFs using Apple Vision framework (`pyobjc-framework-Vision`) directly from command line.\n\n- [Apple Vision Framework Python Utilities](#apple-vision-framework-python-utilities)\n  - [Features](#features)\n  - [Demo](#demo)\n  - [Installation](#installation)\n    - [pipx](#pipx)\n    - [pip](#pip)\n    - [`uv tool` installation doesn't 
work](#uv-tool-installation-doesnt-work)\n  - [Usage](#usage)\n    - [Command Line](#command-line)\n    - [As a Library](#as-a-library)\n  - [Develop](#develop)\n  - [Test](#test)\n\n## Features\n\n- Fast and accurate, multi-language support (`-l`, `--lang`), powered by Apple's industry-strength Vision framework (`pyobjc-framework-Vision`).\n- Supports all common input image formats: PNG, JPEG, TIFF and WebP.\n- Supports PDF input (the file gets converted to images first). This tool does NOT assume a file is a PDF just because it has a `.pdf` extension; you need to pass the `-p`/`--pdf` flag.\n- Outputs only the extracted text by default, but can output JSON containing the recognition confidence for each line with the `-j`/`--json` flag.\n- Supports text clipping based on start and end markers (`-s`, `-S`, `-e`, `-E`).\n\n## Demo\n\nBelow is the output of running the [tests](#test):\n\nhttps://g.teddysc.me/96d5b1217b90035c163b3c97ce99112f\n\n## Installation\n\nRequires Python \u003e= 3.11, \u003c4.0.\n\nSince this package uses Apple's Vision framework, it only works on macOS.\n\nTo OCR PDFs with `-p`, you need to install the required dependency `poppler` with `brew install poppler` ([detailed guide](https://github.com/Belval/pdf2image)).\n\n### pipx\n\nThis is the recommended installation method.\n\n```\n$ pipx install apple-vision-utils\n```\n\n### [pip](https://pypi.org/project/apple-vision-utils/)\n\n```\n$ pip install apple-vision-utils\n```\n\n### `uv tool` installation doesn't work\n\nI tried to install this with `uv tool install` using different Python versions on an Apple Silicon Mac, and it didn't work. This may be caused by some peculiarities of the Objective-C interfacing libraries. 
Just use `pipx` for now.\n\n## Usage\n\n### Command Line\n\n```\n$ apple-ocr --help\n\nusage: apple-ocr [-h] [-j] [-p] [-l LANG] [--pdf2image-only] [--pdf2image-dir PDF2IMAGE_DIR] [-s START_MARKER_INCLUSIVE] [-S START_MARKER_EXCLUSIVE] [-e END_MARKER_INCLUSIVE] [-E END_MARKER] [-V] file_path\n\nExtract text from an image or PDF using Apple's Vision framework.\n\npositional arguments:\n  file_path             Path to the image or PDF file.\n\noptions:\n  -h, --help            show this help message and exit\n  -j, --json            Output results in JSON format.\n  -p, --pdf             Specify if the input file is a PDF.\n  -l LANG, --lang LANG  Specify the language for text recognition (e.g., eng,\n                        fra, deu, zh-Hans for Simplified Chinese, zh-Hant for\n                        Traditional Chinese). Default is 'zh-Hant', which\n                        works with images containing both Chinese characters\n                        and latin letters.\n  --pdf2image-only      Only convert PDF to images without performing OCR.\n  --pdf2image-dir PDF2IMAGE_DIR\n                        Specify the directory to store output images. 
By\n                        default, a secure temporary directory is created.\n  -s START_MARKER_INCLUSIVE, --start-marker-inclusive START_MARKER_INCLUSIVE\n                        Specify the start marker (included, as the first line of the extracted text) for text extraction in PDF.\n  -S START_MARKER_EXCLUSIVE, --start-marker-exclusive START_MARKER_EXCLUSIVE\n                        Specify the start marker (excluded, as the first line of the extracted text) for text extraction in PDF.\n  -e END_MARKER_INCLUSIVE, --end-marker-inclusive END_MARKER_INCLUSIVE\n                        Specify the end marker (included, as the last line of the extracted text) for text extraction in PDF.\n  -E END_MARKER, --end-marker END_MARKER\n                        Specify the end marker (excluded, as the last line of the extracted text) for text extraction in PDF.\n  -V, --version         show program's version number and exit\n```\n\n### As a Library\n\nYou can also use the utility functions in your own Python code:\n\n```python\nfrom apple_vision_utils.utils import image_to_text, pdf_to_images, process_pdf, clip_results\n\n# Extract text from an image\nresults = image_to_text(\"path/to/image.png\", lang=\"eng\")\n\n# Convert PDF to images\nimages = pdf_to_images(\"path/to/document.pdf\")\n\n# Process PDF for text recognition\npdf_results = process_pdf(\"path/to/document.pdf\", lang=\"eng\")\n\n# Clip text results based on markers\nclipped_results = clip_results(results, start_marker_inclusive=\"Start\", end_marker_exclusive=\"End\")\n```\n\n## Develop\n\n```\n$ git clone https://github.com/tddschn/apple-vision-utils.git\n$ cd apple-vision-utils\n$ poetry install\n```\n\n## Test\n\n```\n# in the root of the project\npoetry install\npoetry shell\ncd tests \u0026\u0026 
./test.sh\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftddschn%2Fapple-vision-utils","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftddschn%2Fapple-vision-utils","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftddschn%2Fapple-vision-utils/lists"}