{"id":15287726,"url":"https://github.com/potix2/spark-google-spreadsheets","last_synced_at":"2025-10-07T06:47:59.108Z","repository":{"id":2744066,"uuid":"42240184","full_name":"potix2/spark-google-spreadsheets","owner":"potix2","description":"Google Spreadsheets datasource for SparkSQL and DataFrames","archived":false,"fork":false,"pushed_at":"2023-07-24T21:17:52.000Z","size":74,"stargazers_count":57,"open_issues_count":10,"forks_count":47,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-06-30T03:41:32.253Z","etag":null,"topics":["data-frame","scala","spark","sparksql","spreadsheet"],"latest_commit_sha":null,"homepage":null,"language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/potix2.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2015-09-10T11:20:38.000Z","updated_at":"2024-07-18T13:53:21.000Z","dependencies_parsed_at":"2025-04-13T06:04:00.912Z","dependency_job_id":"895ae6c1-aab9-4dd7-962b-3bc32f5b1454","html_url":"https://github.com/potix2/spark-google-spreadsheets","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/potix2/spark-google-spreadsheets","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/potix2%2Fspark-google-spreadsheets","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/potix2%2Fspark-google-spreadsheets/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/potix2%2Fspark-google-spreadsheets/releases","manifests_url":"https
://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/potix2%2Fspark-google-spreadsheets/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/potix2","download_url":"https://codeload.github.com/potix2/spark-google-spreadsheets/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/potix2%2Fspark-google-spreadsheets/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278734424,"owners_count":26036404,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-07T02:00:06.786Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-frame","scala","spark","sparksql","spreadsheet"],"created_at":"2024-09-30T15:36:08.800Z","updated_at":"2025-10-07T06:47:59.092Z","avatar_url":"https://github.com/potix2.png","language":"Scala","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Spark Google Spreadsheets\n\nGoogle Spreadsheets datasource for [SparkSQL and DataFrames](http://spark.apache.org/docs/latest/sql-programming-guide.html)\n\n[![Build Status](https://travis-ci.org/potix2/spark-google-spreadsheets.svg?branch=master)](https://travis-ci.org/potix2/spark-google-spreadsheets)\n\n## Notice\n\nThe version 0.4.0 breaks compatibility with previous versions. 
You must\nuse a **spreadsheetId** to identify which spreadsheet is to be accessed or altered.\nIn older versions, the spreadsheet name was used.\n\nIf you don't know your spreadsheetId, see the [Introduction to the Google Sheets API v4](https://developers.google.com/sheets/guides/concepts).\n\n## Requirements\n\nThis library supports different versions of Spark:\n\n### Latest compatible versions\n\n| This library | Spark Version |\n| ------------ | ------------- |\n| 0.6.x        | 2.3.x, 2.4.x  |\n| 0.5.x        | 2.0.x         |\n| 0.4.x        | 1.6.x         |\n\n## Linking\n\nUsing SBT:\n\n```\nlibraryDependencies += \"com.github.potix2\" %% \"spark-google-spreadsheets\" % \"0.6.3\"\n```\n\nUsing Maven:\n\n```xml\n\u003cdependency\u003e\n  \u003cgroupId\u003ecom.github.potix2\u003c/groupId\u003e\n  \u003cartifactId\u003espark-google-spreadsheets_2.11\u003c/artifactId\u003e\n  \u003cversion\u003e0.6.3\u003c/version\u003e\n\u003c/dependency\u003e\n```\n\n## SQL API\n\n```sql\nCREATE TABLE cars\nUSING com.github.potix2.spark.google.spreadsheets\nOPTIONS (\n    path \"\u003cspreadsheetId\u003e/worksheet1\",\n    serviceAccountId \"xxxxxx@developer.gserviceaccount.com\",\n    credentialPath \"/path/to/credential.p12\"\n)\n```\n\n## Scala API\n\n```scala\nimport org.apache.spark.sql.SQLContext\n\nval sqlContext = new SQLContext(sc)\n\n// Creates a DataFrame from a specified worksheet\nval df = sqlContext.read.\n    format(\"com.github.potix2.spark.google.spreadsheets\").\n    option(\"serviceAccountId\", \"xxxxxx@developer.gserviceaccount.com\").\n    option(\"credentialPath\", \"/path/to/credential.p12\").\n    
save(\"\u003cspreadsheetId\u003e/newWorksheet\")\n\n```\n\n### Using Google default application credentials\n\nProvide authentication credentials to your application code by setting the environment variable \n`GOOGLE_APPLICATION_CREDENTIALS`. The variable should be set to the path of the service account json file.\n\n\n```scala\nimport org.apache.spark.sql.SQLContext\n\nval sqlContext = new SQLContext(sc)\n\n// Creates a DataFrame from a specified worksheet\nval df = sqlContext.read.\n    format(\"com.github.potix2.spark.google.spreadsheets\").\n    load(\"\u003cspreadsheetId\u003e/worksheet1\")\n```\n\nMore details: https://cloud.google.com/docs/authentication/production\n\n## License\n\nCopyright 2016-2018, Katsunori Kanda\n\nLicensed under the Apache License, Version 2.0 (the \"License\"); you may not use this file except in compliance with the License. You may obtain a copy of the License at\n\nhttp://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software distributed under the License is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpotix2%2Fspark-google-spreadsheets","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpotix2%2Fspark-google-spreadsheets","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpotix2%2Fspark-google-spreadsheets/lists"}