{"id":13501390,"url":"https://github.com/betodealmeida/gsheets-db-api","last_synced_at":"2025-04-04T22:08:17.854Z","repository":{"id":50161657,"uuid":"148225023","full_name":"betodealmeida/gsheets-db-api","owner":"betodealmeida","description":"A Python DB-API and SQLAlchemy dialect to Google Spreasheets","archived":false,"fork":false,"pushed_at":"2022-12-08T11:32:58.000Z","size":124,"stargazers_count":216,"open_issues_count":16,"forks_count":16,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-03-28T21:06:06.882Z","etag":null,"topics":["api","db","google","python","spreadsheet","spreadsheets","sql"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/betodealmeida.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-09-10T22:07:02.000Z","updated_at":"2025-02-25T11:16:49.000Z","dependencies_parsed_at":"2023-01-25T12:45:07.992Z","dependency_job_id":null,"html_url":"https://github.com/betodealmeida/gsheets-db-api","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/betodealmeida%2Fgsheets-db-api","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/betodealmeida%2Fgsheets-db-api/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/betodealmeida%2Fgsheets-db-api/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/betodealmeida%2Fgsheets-db-api/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/betodealmeida","download
_url":"https://codeload.github.com/betodealmeida/gsheets-db-api/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247256112,"owners_count":20909240,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api","db","google","python","spreadsheet","spreadsheets","sql"],"created_at":"2024-07-31T22:01:35.599Z","updated_at":"2025-04-04T22:08:17.838Z","avatar_url":"https://github.com/betodealmeida.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"[![Build Status](https://travis-ci.org/betodealmeida/gsheets-db-api.svg?branch=master)](https://travis-ci.org/betodealmeida/gsheets-db-api) [![codecov](https://codecov.io/gh/betodealmeida/gsheets-db-api/branch/master/graph/badge.svg)](https://codecov.io/gh/betodealmeida/gsheets-db-api)\n\n**Note:** [shillelagh](https://github.com/betodealmeida/shillelagh/) is a drop-in replacement for `gsheets-db-api`, with many additional features. You should use it instead. 
If you're using SQLAlchemy all you need to do is:\n\n```bash\n$ pip uninstall gsheetsdb\n$ pip install shillelagh\n```\n\nIf you're using the DB API:\n\n```python\n# from gsheetsdb import connect\nfrom shillelagh.backends.apsw.db import connect\n```\n\n# A Python DB API 2.0 for Google Spreadsheets #\n\nThis module allows you to query Google Spreadsheets using SQL.\n\nUsing [this spreadsheet](https://docs.google.com/spreadsheets/d/1_rN3lm0R_bU3NemO0s9pbFkY5LQPcuy1pscv8ZXPtg8/) as an example:\n\n| | A | B |\n|-|--------|-----|\n| 1 | country | cnt |\n| 2 | BR | 1 |\n| 3 | BR | 3 |\n| 4 | IN | 5 |\n\nHere's a simple query using the Python API:\n\n```python\nfrom gsheetsdb import connect\n\nconn = connect()\nresult = conn.execute(\"\"\"\n    SELECT\n        country\n      , SUM(cnt)\n    FROM\n        \"https://docs.google.com/spreadsheets/d/1_rN3lm0R_bU3NemO0s9pbFkY5LQPcuy1pscv8ZXPtg8/\"\n    GROUP BY\n        country\n\"\"\", headers=1)\nfor row in result:\n    print(row)\n```\n\nThis will print:\n\n```\nRow(country='BR', sum_cnt=4.0)\nRow(country='IN', sum_cnt=5.0)\n```\n\n## How it works ##\n\n### Transpiling ###\n\nGoogle spreadsheets can actually be queried with a [very limited SQL API](https://developers.google.com/chart/interactive/docs/querylanguage). This module will transpile the SQL query into a simpler query that the API understands. E.g., the query above would be translated to:\n\n```sql\nSELECT A, SUM(B) GROUP BY A\n```\n\n### Processors ###\n\nIn addition to transpiling, this module also provides pre- and post-processors. The pre-processors add more columns to the query, and the post-processors build the actual result from those extra columns. E.g., `COUNT(*)` is not supported, so the following query:\n\n```sql\nSELECT COUNT(*) FROM \"https://docs.google.com/spreadsheets/d/1_rN3lm0R_bU3NemO0s9pbFkY5LQPcuy1pscv8ZXPtg8/\"\n```\n\ngets translated to:\n\n```sql\nSELECT COUNT(A), COUNT(B)\n```\n\nAnd then the maximum count is returned. 
This assumes that at least one column has no `NULL`s.\n\n\n### SQLite ###\nWhen a query can't be expressed, the module will issue a `SELECT *`, load the data into an in-memory SQLite table, and execute the query in SQLite. This is obviously inefficient, since all data has to be downloaded, but it ensures that all queries succeed.\n\n## Installation ##\n\n```bash\n$ pip install gsheetsdb\n$ pip install gsheetsdb[cli]         # if you want to use the CLI\n$ pip install gsheetsdb[sqlalchemy]  # if you want to use it with SQLAlchemy\n```\n\n## CLI ##\n\nThe module will install an executable called `gsheetsdb`:\n\n```bash\n$ gsheetsdb --headers=1\n\u003e SELECT * FROM \"https://docs.google.com/spreadsheets/d/1_rN3lm0R_bU3NemO0s9pbFkY5LQPcuy1pscv8ZXPtg8/\"\ncountry      cnt\n---------  -----\nBR             1\nBR             3\nIN             5\n\u003e SELECT country, SUM(cnt) FROM \"https://docs.google.com/spreadsheets/d/1_rN3lm0R_bU3NemO0s9pbFkY5LQPcuy1pscv8ZXPtg8/\" GROUP BY country\ncountry      sum cnt\n---------  ---------\nBR                 4\nIN                 5\n\u003e\n```\n\n## SQLAlchemy support ##\n\nThis module provides a SQLAlchemy dialect. You don't need to specify a URL, since the spreadsheet is extracted from the `FROM` clause:\n\n```python\nfrom sqlalchemy import *\nfrom sqlalchemy.engine import create_engine\nfrom sqlalchemy.schema import *\n\nengine = create_engine('gsheets://')\ninspector = inspect(engine)\n\ntable = Table(\n    'https://docs.google.com/spreadsheets/d/1_rN3lm0R_bU3NemO0s9pbFkY5LQPcuy1pscv8ZXPtg8/edit#gid=0',\n    MetaData(bind=engine),\n    autoload=True)\nquery = select([func.count(table.columns.country)], from_obj=table)\nprint(query.scalar())  # prints 3.0\n```\n\nAlternatively, you can initialize the engine with a \"catalog\". The catalog is a Google spreadsheet where each row points to another Google spreadsheet, with URL, number of headers and schema as the columns. 
You can see an example [here](https://docs.google.com/spreadsheets/d/1AAqVVSpGeyRZyrr4n--fb_IxhLwwKtLbjfu4h6MyyYA/edit#gid=0):\n\n|| A | B | C |\n|-|-|-|-|\n| 1 | https://docs.google.com/spreadsheets/d/1_rN3lm0R_bU3NemO0s9pbFkY5LQPcuy1pscv8ZXPtg8/edit#gid=0 | 1 | default |\n| 2 | https://docs.google.com/spreadsheets/d/1_rN3lm0R_bU3NemO0s9pbFkY5LQPcuy1pscv8ZXPtg8/edit#gid=1077884006 | 2 | default |\n\nThis will make the two spreadsheets above available as \"tables\" in the `default` schema.\n\n\n## Authentication ##\n\nYou can access spreadsheets that are shared only within an organization. In order to do this, first [create a service account](https://developers.google.com/api-client-library/python/auth/service-accounts#creatinganaccount). Make sure you select \"Enable G Suite Domain-wide Delegation\". Download the key as a JSON file.\n\nNext, you need to manage API client access at https://admin.google.com/${DOMAIN}/AdminHome?chromeless=1#OGX:ManageOauthClients. Add the \"Unique ID\" from the previous step as the \"Client Name\", and add `https://spreadsheets.google.com/feeds` as the scope.\n\nNow, when creating the connection from the DB API or from SQLAlchemy you can point to the JSON file and the user you want to impersonate:\n\n```python\n\u003e\u003e\u003e auth = {'service_account_file': '/path/to/certificate.json', 'subject': 'user@domain.com'}\n\u003e\u003e\u003e conn = connect(auth)\n```\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbetodealmeida%2Fgsheets-db-api","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbetodealmeida%2Fgsheets-db-api","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbetodealmeida%2Fgsheets-db-api/lists"}