{"id":21238768,"url":"https://github.com/splitgraph/sgr","last_synced_at":"2025-04-13T02:19:41.592Z","repository":{"id":37837108,"uuid":"153455875","full_name":"splitgraph/sgr","owner":"splitgraph","description":"sgr (command line client for Splitgraph) and the splitgraph Python library","archived":false,"fork":false,"pushed_at":"2024-04-30T22:14:48.000Z","size":9838,"stargazers_count":322,"open_issues_count":53,"forks_count":17,"subscribers_count":8,"default_branch":"master","last_synced_at":"2025-04-04T05:03:11.241Z","etag":null,"topics":["data","data-version-control","developer-tools","postgres","postgresql","python","sql"],"latest_commit_sha":null,"homepage":"https://www.splitgraph.com","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/splitgraph.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-10-17T12:48:16.000Z","updated_at":"2024-12-22T14:37:03.000Z","dependencies_parsed_at":"2024-06-19T05:14:38.791Z","dependency_job_id":"20d50bcd-db8e-4329-8290-b1dc9587122d","html_url":"https://github.com/splitgraph/sgr","commit_stats":{"total_commits":2081,"total_committers":16,"mean_commits":130.0625,"dds":"0.14271984622777512","last_synced_commit":"e89d0630743d1757b88779bc233239425a3d18bc"},"previous_names":["splitgraph/splitgraph"],"tags_count":37,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/splitgraph%2Fsgr","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/splitgraph%2Fsgr/tags","releases_url":"https://
repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/splitgraph%2Fsgr/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/splitgraph%2Fsgr/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/splitgraph","download_url":"https://codeload.github.com/splitgraph/sgr/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248654424,"owners_count":21140295,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data","data-version-control","developer-tools","postgres","postgresql","python","sql"],"created_at":"2024-11-21T00:38:23.109Z","updated_at":"2025-04-13T02:19:41.563Z","avatar_url":"https://github.com/splitgraph.png","language":"Python","readme":"# `sgr`\n\n![Build status](https://github.com/splitgraph/sgr/workflows/build_all/badge.svg)\n[![Coverage Status](https://coveralls.io/repos/github/splitgraph/splitgraph/badge.svg?branch=master)](https://coveralls.io/github/splitgraph/splitgraph?branch=master)\n[![PyPI version](https://badge.fury.io/py/splitgraph.svg)](https://badge.fury.io/py/splitgraph)\n[![Discord chat room](https://img.shields.io/discord/718534846472912936.svg)](https://discord.gg/4Qe2fYA)\n[![Follow](https://img.shields.io/badge/twitter-@Splitgraph-blue.svg)](https://twitter.com/Splitgraph)\n\n## Overview\n\n**`sgr`** is the CLI for [**Splitgraph**](https://www.splitgraph.com), a\nserverless API for data-driven Web applications.\n\nWith addition of the optional [`sgr` Engine](engine/README.md) component, `sgr`\ncan become a stand-alone tool for building, 
versioning and querying reproducible\ndatasets. We use it as the storage engine for Splitgraph. It's inspired by\nDocker and Git, so it feels familiar. And it's powered by\n[PostgreSQL](https://postgresql.org), so it works seamlessly with existing tools\nin the Postgres ecosystem. Use `sgr` to package your data into self-contained\n**Splitgraph data images** that you can\n[share with other `sgr` instances](https://www.splitgraph.com/docs/getting-started/decentralized-demo).\n\nTo install the `sgr` CLI or a local `sgr` Engine, see the\n[Installation](#installation) section of this readme.\n\n### Build and Query Versioned, Reproducible Datasets\n\n[**Splitfiles**](https://www.splitgraph.com/docs/concepts/splitfiles) give you a\ndeclarative language, inspired by Dockerfiles, for expressing data\ntransformations in ordinary SQL familiar to any researcher or business analyst.\nYou can reference other images, or even other databases, with a simple JOIN.\n\n![](pics/splitfile.png)\n\nWhen you build data images with Splitfiles, you get provenance tracking of the\nresulting data: it's possible to find out what sources went into every dataset\nand know when to rebuild it if the sources ever change. You can easily integrate\n`sgr` into your existing CI pipelines to keep your data up-to-date and stay on top\nof changes to upstream sources.\n\nSplitgraph images are also version-controlled, and you can manipulate them with\nGit-like operations through a CLI. You can check out any image into a PostgreSQL\nschema and interact with it using any PostgreSQL client. `sgr` will capture your\nchanges to the data, and then you can commit them as delta-compressed changesets\nthat you can package into new images.\n\n`sgr` supports PostgreSQL\n[foreign data wrappers](https://wiki.postgresql.org/wiki/Foreign_data_wrappers).\nWe call this feature\n[mounting](https://www.splitgraph.com/docs/concepts/mounting). 
With mounting,\nyou can query other databases (like PostgreSQL/MongoDB/MySQL) or open data\nproviders (like\n[Socrata](https://www.splitgraph.com/docs/ingesting-data/socrata)) from your\n`sgr` instance with plain SQL. You can even snapshot the results or use them in\nSplitfiles.\n\n![](pics/splitfiles.gif)\n\n## Components\n\nThe code in this repository contains:\n\n- **[`sgr` CLI](https://www.splitgraph.com/docs/architecture/sgr-client)**:\n  `sgr` is the main command line tool used to work with Splitgraph \"images\"\n  (data snapshots). Use it to ingest data, work with Splitfiles, and push data\n  to Splitgraph.\n- **[`sgr` Engine](https://github.com/splitgraph/sgr/blob/master/engine/README.md)**: a\n  [Docker image](https://hub.docker.com/r/splitgraph/engine) of the latest\n  Postgres with `sgr` and other required extensions pre-installed.\n- **[Splitgraph Python library](https://www.splitgraph.com/docs/python-api/splitgraph.core)**:\n  All `sgr` functionality is available in the Python API, offering first-class\n  support for data science workflows including Jupyter notebooks and Pandas\n  dataframes.\n\n## Docs\n\n- [`sgr` documentation](https://www.splitgraph.com/docs/sgr-cli/introduction)\n- [Advanced `sgr` documentation](https://www.splitgraph.com/docs/sgr-advanced/getting-started/introduction)\n- [`sgr` command reference](https://www.splitgraph.com/docs/sgr/image-management-creation/checkout_)\n- [`splitgraph` package reference](https://www.splitgraph.com/docs/python-api/modules)\n\nWe also recommend reading our Blog, including some of our favorite posts:\n\n- [Supercharging `dbt` with `sgr`: versioning, sharing, cross-DB joins](https://www.splitgraph.com/blog/dbt)\n- [Querying 40,000+ datasets with SQL](https://www.splitgraph.com/blog/40k-sql-datasets)\n- [Foreign data wrappers: PostgreSQL's secret weapon?](https://www.splitgraph.com/blog/foreign-data-wrappers)\n\n## Installation\n\nPre-requisites:\n\n- Docker is required to run the `sgr` Engine. 
`sgr` must have access to Docker.\n  You either need to [install Docker locally](https://docs.docker.com/install/)\n  or have access to a remote Docker socket.\n\nYou can get the `sgr` single binary from\n[the releases page](https://github.com/splitgraph/sgr/releases).\nOptionally, you can run\n[`sgr engine add`](https://www.splitgraph.com/docs/sgr/engine-management/engine-add)\nto create an engine.\n\nFor Linux and OSX, once Docker is running, install `sgr` with a single script:\n\n```bash\n$ bash -c \"$(curl -sL https://github.com/splitgraph/sgr/releases/latest/download/install.sh)\"\n```\n\nThis will download the `sgr` binary and set up the `sgr` Engine Docker\ncontainer.\n\nSee the\n[installation guide](https://www.splitgraph.com/docs/sgr-cli/installation) for\nmore installation methods.\n\n## Quick start guide\n\nYou can follow the\n[quick start guide](https://www.splitgraph.com/docs/sgr-advanced/getting-started/five-minute-demo)\nthat will guide you through the basics of using `sgr` with Splitgraph or\nstandalone.\n\nAlternatively, `sgr` comes with plenty of [examples](https://github.com/splitgraph/sgr/tree/master/examples) to get you\nstarted.\n\nIf you're stuck or have any questions, check out the\n[documentation](https://www.splitgraph.com/docs/sgr-advanced/getting-started/introduction)\nor join our [Discord channel](https://discord.gg/4Qe2fYA)!\n\n## Contributing\n\n### Setting up a development environment\n\n- `sgr` requires Python 3.7 or later.\n- Install [Poetry](https://github.com/python-poetry/poetry):\n  `curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python`\n  to manage dependencies\n- Install pre-commit hooks (we use [Black](https://github.com/psf/black) to\n  format code)\n- `git clone --recurse-submodules https://github.com/splitgraph/sgr.git`\n- `poetry install`\n- To build the\n  [engine](https://www.splitgraph.com/docs/architecture/splitgraph-engine)\n  Docker image: `cd engine \u0026\u0026 
make`\n\n### Running tests\n\nThe test suite requires [docker-compose](https://github.com/docker/compose). You\nwill also need to add these lines to your `/etc/hosts` or equivalent:\n\n```\n127.0.0.1       local_engine\n127.0.0.1       remote_engine\n127.0.0.1       objectstorage\n```\n\nTo run the core test suite, do\n\n```\ndocker-compose -f test/architecture/docker-compose.core.yml up -d\npoetry run pytest -m \"not mounting and not example\"\n```\n\nTo run the test suite related to \"mounting\" and importing data from other\ndatabases (PostgreSQL, MySQL, Mongo), do\n\n```\ndocker-compose -f test/architecture/docker-compose.core.yml -f test/architecture/docker-compose.mounting.yml up -d\npoetry run pytest -m mounting\n```\n\nFinally, to test the\n[example projects](https://github.com/splitgraph/sgr/tree/master/examples),\ndo\n\n```\n# Example projects spin up their own engines, so tear down the test stacks first\ndocker-compose -f test/architecture/docker-compose.core.yml -f test/architecture/docker-compose.mounting.yml down -v\npoetry run pytest -m example\n```\n\nAll of these tests run in\n[CI](https://github.com/splitgraph/sgr/actions).\n","funding_links":[],"categories":["Python"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsplitgraph%2Fsgr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsplitgraph%2Fsgr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsplitgraph%2Fsgr/lists"}