{"id":13543301,"url":"https://github.com/hrishikeshrt/google_drive_ocr","last_synced_at":"2026-01-04T16:58:04.101Z","repository":{"id":42225638,"uuid":"376928691","full_name":"hrishikeshrt/google_drive_ocr","owner":"hrishikeshrt","description":"Perform OCR using Google's Drive API v3","archived":false,"fork":false,"pushed_at":"2022-05-17T19:41:19.000Z","size":98,"stargazers_count":38,"open_issues_count":2,"forks_count":11,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-03-22T09:31:45.198Z","etag":null,"topics":["command-line-tool","drive-api","google-ocr","multiprocessing","ocr","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hrishikeshrt.png","metadata":{"files":{"readme":"README.rst","changelog":"HISTORY.rst","contributing":"CONTRIBUTING.rst","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-06-14T19:04:30.000Z","updated_at":"2024-09-27T13:50:03.000Z","dependencies_parsed_at":"2022-08-20T15:00:53.052Z","dependency_job_id":null,"html_url":"https://github.com/hrishikeshrt/google_drive_ocr","commit_stats":null,"previous_names":[],"tags_count":10,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hrishikeshrt%2Fgoogle_drive_ocr","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hrishikeshrt%2Fgoogle_drive_ocr/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hrishikeshrt%2Fgoogle_drive_ocr/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hrishikeshrt%2Fgoogle_drive_ocr/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hrishikeshrt",
"download_url":"https://codeload.github.com/hrishikeshrt/google_drive_ocr/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246815805,"owners_count":20838516,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["command-line-tool","drive-api","google-ocr","multiprocessing","ocr","python"],"created_at":"2024-08-01T11:00:29.096Z","updated_at":"2026-01-04T16:58:04.034Z","avatar_url":"https://github.com/hrishikeshrt.png","language":"Python","funding_links":[],"categories":["OCR","Python"],"sub_categories":[],"readme":"=========================\nGoogle OCR (Drive API v3)\n=========================\n\n\n.. image:: https://img.shields.io/pypi/v/google_drive_ocr?color=success\n        :target: https://pypi.python.org/pypi/google_drive_ocr\n\n.. image:: https://readthedocs.org/projects/google-drive-ocr/badge/?version=latest\n        :target: https://google-drive-ocr.readthedocs.io/en/latest/?version=latest\n        :alt: Documentation Status\n\n.. image:: https://img.shields.io/pypi/pyversions/google_drive_ocr\n        :target: https://pypi.python.org/pypi/google_drive_ocr\n        :alt: Python Version Support\n\n.. image:: https://img.shields.io/github/issues/hrishikeshrt/google_drive_ocr\n        :target: https://github.com/hrishikeshrt/google_drive_ocr/issues\n        :alt: GitHub Issues\n\n.. image:: https://img.shields.io/github/followers/hrishikeshrt?style=social\n        :target: https://github.com/hrishikeshrt\n        :alt: GitHub Followers\n\n.. 
image:: https://img.shields.io/twitter/follow/hrishikeshrt?style=social\n        :target: https://twitter.com/hrishikeshrt\n        :alt: Twitter Followers\n\n\nPerform OCR using Google's Drive API v3\n\n\n* Free software: GNU General Public License v3\n* Documentation: https://google-drive-ocr.readthedocs.io.\n\nFeatures\n========\n\n* Perform OCR using Google's Drive API v3\n* Class :code:`GoogleOCRApplication()` for use in projects\n* Highly configurable CLI\n* Run OCR on a single image file\n* Run OCR on multiple image files\n* Run OCR on all images in a directory\n* Use multiple workers (:code:`multiprocessing`)\n* Work on a PDF document directly\n\nUsage\n=====\n\nUsing in a Project\n------------------\n\nCreate a :code:`GoogleOCRApplication` application instance:\n\n.. code-block:: python\n\n    from google_drive_ocr import GoogleOCRApplication\n\n    app = GoogleOCRApplication('client_secret.json')\n\nPerform OCR on a single image:\n\n.. code-block:: python\n\n    app.perform_ocr('image.png')\n\n\nPerform OCR on multiple images:\n\n.. code-block:: python\n\n    app.perform_ocr_batch(['image_1.png', 'image_2.png', 'image_3.png'])\n\nPerform OCR on multiple images using multiple workers (:code:`multiprocessing`):\n\n.. code-block:: python\n\n    app.perform_ocr_batch(['image_1.png', 'image_3.png', 'image_2.png'], workers=2)\n\n\nUsing Command Line Interface\n----------------------------\n\nTypical usage with several options:\n\n.. code-block:: console\n\n    google-ocr --client-secret client_secret.json \\\n    --upload-folder-id \u003cgoogle-drive-folder-id\u003e  \\\n    --image-dir images/ --extension .jpg \\\n    --workers 4 --no-keep\n\nShow help message with the full set of options:\n\n.. 
code-block:: console\n\n    google-ocr --help\n\nConfiguration\n^^^^^^^^^^^^^\n\nThe default location for configuration is :code:`~/.gdo.cfg`.\nIf configuration is written to this location with a set of options,\nwe don't have to specify those options again on the subsequent runs.\n\nSave configuration and exit:\n\n.. code-block:: console\n\n    google-ocr --client-secret client_secret.json --write-config ~/.gdo.cfg\n\n\nRead configuration from a custom location (if it was written to a custom location):\n\n.. code-block:: console\n\n    google-ocr --config ~/.my_config_file ..\n\nPerforming OCR\n^^^^^^^^^^^^^^\n\n**Note**: It is assumed that the :code:`client-secret` option is saved in configuration file.\n\nSingle image file:\n\n.. code-block:: console\n\n    google-ocr -i image.png\n\nMultiple image files:\n\n.. code-block:: console\n\n    google-ocr -b image_1.png image_2.png image_3.png\n\nAll image files from a directory with a specific extension:\n\n.. code-block:: console\n\n    google-ocr --image-dir images/ --extension .png\n\nMultiple workers (:code:`multiprocessing`):\n\n.. code-block:: console\n\n    google-ocr -b image_1.png image_2.png image_3.png --workers 2\n\nPDF files:\n\n.. 
code-block:: console\n\n    google-ocr --pdf document.pdf --pages 1-3 5 7-10 13\n\n\n**Note**:\nYou must set up a Google application and download the :code:`client_secret.json` file before using :code:`google_drive_ocr`.\n\nSetup Instructions\n==================\n\nCreate a project on Google Cloud Platform\n\n**Wizard**: https://console.developers.google.com/start/api?id=drive\n\n**Instructions**:\n\n    * https://cloud.google.com/genomics/downloading-credentials-for-api-access\n    * Select application type as \"Installed Application\"\n    * Create credentials OAuth consent screen --\u003e OAuth client ID\n    * Save :code:`client_secret.json`\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhrishikeshrt%2Fgoogle_drive_ocr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhrishikeshrt%2Fgoogle_drive_ocr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhrishikeshrt%2Fgoogle_drive_ocr/lists"}