{"id":48845933,"url":"https://github.com/uname-n/deltabase","last_synced_at":"2026-04-15T05:05:05.252Z","repository":{"id":252716979,"uuid":"840831770","full_name":"uname-n/deltabase","owner":"uname-n","description":"a lightweight, comprehensive solution for managing delta tables built on polars and deltalake","archived":false,"fork":false,"pushed_at":"2025-01-01T23:14:44.000Z","size":745,"stargazers_count":121,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-11-27T15:19:42.269Z","etag":null,"topics":["database","delta-tables","deltalake","polars","sql"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/uname-n.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-10T20:24:36.000Z","updated_at":"2025-10-19T13:33:33.000Z","dependencies_parsed_at":"2024-08-18T08:03:12.691Z","dependency_job_id":null,"html_url":"https://github.com/uname-n/deltabase","commit_stats":null,"previous_names":["uname-n/deltadb","uname-n/deltabase"],"tags_count":10,"template":false,"template_full_name":null,"purl":"pkg:github/uname-n/deltabase","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/uname-n%2Fdeltabase","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/uname-n%2Fdeltabase/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/uname-n%2Fdeltabase/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/uname-n%2Fdeltabase/manife
sts","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/uname-n","download_url":"https://codeload.github.com/uname-n/deltabase/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/uname-n%2Fdeltabase/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31826919,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-14T18:05:02.291Z","status":"online","status_checked_at":"2026-04-15T02:00:06.175Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["database","delta-tables","deltalake","polars","sql"],"created_at":"2026-04-15T05:05:04.603Z","updated_at":"2026-04-15T05:05:05.237Z","avatar_url":"https://github.com/uname-n.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003ch4 align=\"center\"\u003e\n  \u003cimg src=\"./docs/assets/banner.svg\" alt=\"banner\"\u003e\n\u003c/h4\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://uname-n.github.io/deltabase\"\u003edocumentation (wip)\u003c/a\u003e\n\u003c/p\u003e\n\n**DeltaBase** is a lightweight, comprehensive solution for managing Delta Tables in both local and cloud environments. 
Built on the high-performance frameworks [**polars**](https://github.com/pola-rs/polars) and [**deltalake**](https://github.com/delta-io/delta-rs), DeltaBase streamlines data operations with features like upsert, delete, commit, and version control. Designed for data engineers, analysts, and developers, it ensures data consistency, efficient versioning, and seamless integration into your workflows.\n\n## Installation\nTo install **DeltaBase**, run the following command:\n```bash\npip install deltabase\n```\n\n## Quick Start\n```python\nfrom deltabase import delta\n\n# connect to a delta source\ndb:delta = delta.connect(path=\"mydelta\")\n\n# upsert records into a table \ndb.upsert(table=\"mytable\", primary_key=\"id\", data=[\n    {\"id\": 1, \"name\": \"alice\"}\n])\n\n# commit table to delta source\ndb.commit(table=\"mytable\")\n\n# read records from sql context\nresult = db.sql(\"select * from mytable\")\nprint(result) # output: [{\"id\": 1, \"name\": \"alice\"}]\n```\n\nSee a full example of **DeltaBase** in action [here](https://github.com/uname-n/deltabase/blob/master/examples/magic.ipynb).\n\n## Usage\n\n### Connecting to a Delta Source\nEstablish a connection to your Delta source, whether it's a local directory or remote cloud storage.\n\n```python\nfrom deltabase import delta\n\ndb = delta.connect(path=\"local_path/mydelta\")\ndb = delta.connect(path=\"s3://your-bucket/path\")\ndb = delta.connect(path=\"az://your-container/path\")\ndb = delta.connect(path=\"abfs[s]://your-container/path\")\n```\n\n### Register Tables\nLoad tables into the SQL context from the Delta source using the `register` method. 
You can also register data directly from a DataFrame or specify options like version and alias.\n\n```python\n# assumption: DataFrame is the polars DataFrame\nfrom polars import DataFrame\n\n# load existing table from delta\ndb.register(table=\"mytable\")\n\n# load under an alias\ndb.register(table=\"mytable\", alias=\"table_alias\")\n\n# load a specific version\ndb.register(table=\"mytable\", version=1)\n\n# load data directly\ndata = DataFrame([{\"id\": 1, \"name\": \"Alice\"}])\ndb.register(table=\"mytable\", data=data)\n\n# load with pyarrow options\ndb.register(\n    table=\"mytable\",\n    pyarrow_options={\"partitions\": [(\"year\", \"=\", \"2021\")]}\n)\n```\n\n### Running SQL Queries\nExecute SQL queries against your registered tables using the `sql` method.\n\n```python\n# run a query and get the result in json format\nresult = db.sql(\"select * from mytable\")\n\n# get the result as a polars dataframe\nresult = db.sql(\"select * from mytable\", dtype=\"polars\")\n\n# return a LazyFrame for deferred execution\nresult = db.sql(\"select * from mytable\", lazy=True)\n```\n\n### Upserting Data\nInsert new records or update existing ones using the `upsert` method. It automatically handles schema changes and efficiently synchronizes data.\n\n```python\n# assumption: DataFrame and LazyFrame are the polars types\nfrom polars import DataFrame, LazyFrame\n\n# upsert a single record\ndb.upsert(\n    table=\"mytable\",\n    primary_key=\"id\",\n    data={\"id\": 1, \"name\": \"Alice\"}\n)\n\n# upsert multiple records\ndb.upsert(\n    table=\"mytable\",\n    primary_key=\"id\",\n    data=[\n        {\"id\": 2, \"name\": \"Bob\", \"job\": \"Chef\"},\n        {\"id\": 3, \"name\": \"Sam\"},\n    ]\n)\n\n# upsert dataframes\ndata = DataFrame([{\"id\": 4, \"name\": \"Dave\"}])\ndb.upsert(table=\"mytable\", primary_key=\"id\", data=data)\n\n# upsert lazyframes\ndata = LazyFrame([{\"id\": 5, \"name\": \"Eve\"}])\ndb.upsert(table=\"mytable\", primary_key=\"id\", data=data)\n```\n\n### Committing Changes\nPersist changes made in the SQL context back to the Delta source using the `commit` method. 
You can enforce schema changes or partition your data during this process.\n\n```python\ndb.commit(table=\"mytable\")\ndb.commit(table=\"mytable\", force=True)\ndb.commit(table=\"mytable\", partition_by=[\"job\"])\n```\n\n### Deleting Data\nRemove records from a table or delete the table from the SQL context using the `delete` method.\n\n```python\n# delete records using a sql condition\ndb.delete(table=\"mytable\", filter=\"name='Bob'\")\n\n# delete records using a lambda function\ndb.delete(table=\"mytable\", filter=lambda row: row[\"name\"] == \"Sam\")\n\n# delete table from sql context\ndb.delete(table=\"mytable\")\n```\n\n### Checking Out Previous Versions\nRevert to a previous version of a table using the `checkout` method. This is useful for loading historical data or restoring a previous state.\n\n```python\nfrom datetime import datetime\n\n# check out a specific version by number\ndb.checkout(table=\"mytable\", version=1)\n\n# check out a version by date string\ndb.checkout(table=\"mytable\", version=\"2024-01-01\")\n\n# check out a version by datetime object\ndb.checkout(table=\"mytable\", version=datetime(2024, 1, 1))\n```\n\n### Configuring Output Data Types\nSet the output data format by adjusting the `dtype` attribute in the configuration object. The default format is `json`.\n\n```python\n# set output data type to polars dataframe\ndb.config.dtype = \"polars\"\n\n# run a sql query and get results as polars dataframe\nresult = db.sql(\"SELECT * FROM mytable\")\n```\n\n### Jupyter Notebook Magic\n**DeltaBase** provides magic commands for use in Jupyter notebooks, enhancing your interactive data exploration experience. 
Magic commands are automatically enabled when you connect to a Delta source within a notebook.\n\n#### Using SQL Magic\n```sql\n%%sql\nselect * from mytable\n```\n\n#### Using AI Magic\n```sql\n%%ai\nwhat data is available to me?\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Funame-n%2Fdeltabase","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Funame-n%2Fdeltabase","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Funame-n%2Fdeltabase/lists"}