{"id":15334230,"url":"https://github.com/yaooqinn/spark-postgres","last_synced_at":"2025-04-15T03:02:19.497Z","repository":{"id":44502359,"uuid":"175567263","full_name":"yaooqinn/spark-postgres","owner":"yaooqinn","description":"PostgreSQL and GreenPlum Data Source for Apache Spark","archived":false,"fork":false,"pushed_at":"2024-02-21T00:23:04.000Z","size":80,"stargazers_count":35,"open_issues_count":3,"forks_count":13,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-03-28T14:43:10.847Z","etag":null,"topics":["greenplum","postgres","postgresql","spark","sparksql","transactional"],"latest_commit_sha":null,"homepage":"https://yaooqinn.github.io/spark-postgres/","language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/yaooqinn.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-03-14T07:08:34.000Z","updated_at":"2024-06-20T07:21:44.000Z","dependencies_parsed_at":"2023-01-18T16:04:04.320Z","dependency_job_id":null,"html_url":"https://github.com/yaooqinn/spark-postgres","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yaooqinn%2Fspark-postgres","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yaooqinn%2Fspark-postgres/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yaooqinn%2Fspark-postgres/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yaooqinn%2Fspark-postgres/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/yaooqinn","download_url":"https://
codeload.github.com/yaooqinn/spark-postgres/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248997095,"owners_count":21195798,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["greenplum","postgres","postgresql","spark","sparksql","transactional"],"created_at":"2024-10-01T10:06:22.446Z","updated_at":"2025-04-15T03:02:19.466Z","avatar_url":"https://github.com/yaooqinn.png","language":"Scala","funding_links":[],"categories":[],"sub_categories":[],"readme":"# PostgreSQL \u0026 GreenPlum Data Source for Apache Spark [![License](https://img.shields.io/badge/license-Apache%202-4EB1BA.svg)](https://www.apache.org/licenses/LICENSE-2.0.html) [![](https://tokei.rs/b1/github/yaooqinn/spark-greenplum)](https://github.com/yaooqinn/spark-greenplum) [![GitHub release](https://img.shields.io/github/release/yaooqinn/spark-greenplum.svg)](https://github.com/yaooqinn/spark-greenplum/releases) [![codecov](https://codecov.io/gh/yaooqinn/spark-greenplum/branch/master/graph/badge.svg)](https://codecov.io/gh/yaooqinn/spark-greenplum) [![Build Status](https://travis-ci.com/yaooqinn/spark-greenplum.svg?branch=master)](https://travis-ci.com/yaooqinn/spark-greenplum)[![HitCount](http://hits.dwyl.io/yaooqinn/spark-greenplum.svg)](http://hits.dwyl.io/yaooqinn/spark-greenplum)\n\nA library for reading data from and transferring data to Greenplum databases with Apache Spark, for Spark SQL and DataFrames.\n\nThis library is **100x faster** than Apache Spark's JDBC DataSource while transferring data from Spark to Greenplum databases.\n\nAlso, 
this library is fully **transactional**.\n\n## Try it now!\n\n### CTAS\n```genericsql\nCREATE TABLE tbl\nUSING greenplum\noptions (\n  url \"jdbc:postgresql://greenplum:5432/\",\n  delimiter \"\\t\",\n  dbschema \"gptest\",\n  dbtable \"store_sales\",\n  user 'gptest',\n  password 'test')\nAS\n SELECT * FROM tpcds_100g.store_sales WHERE ss_sold_date_sk\u003c=2451537 AND ss_sold_date_sk\u003e 2451520;\n```\n\n### View \u0026 Insert\n\n```genericsql\nCREATE TEMPORARY TABLE tbl\nUSING greenplum\noptions (\n  url \"jdbc:postgresql://greenplum:5432/\",\n  delimiter \"\\t\",\n  dbschema \"gptest\",\n  dbtable \"store_sales\",\n  user 'gptest',\n  password 'test');\n\nINSERT INTO TABLE tbl SELECT * FROM tpcds_100g.store_sales WHERE ss_sold_date_sk\u003c=2451537 AND ss_sold_date_sk\u003e 2451520;\n```\n\nPlease refer to [Spark SQL Guide - JDBC To Other Databases](http://spark.apache.org/docs/latest/sql-data-sources-jdbc.html) for similar usage.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyaooqinn%2Fspark-postgres","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyaooqinn%2Fspark-postgres","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyaooqinn%2Fspark-postgres/lists"}