{"id":13398290,"url":"https://github.com/ibis-project/ibis","last_synced_at":"2025-05-13T15:03:54.542Z","repository":{"id":30584152,"uuid":"34139230","full_name":"ibis-project/ibis","owner":"ibis-project","description":"the portable Python dataframe library","archived":false,"fork":false,"pushed_at":"2025-05-05T03:24:05.000Z","size":181658,"stargazers_count":5725,"open_issues_count":303,"forks_count":633,"subscribers_count":81,"default_branch":"main","last_synced_at":"2025-05-05T22:41:32.338Z","etag":null,"topics":["bigquery","clickhouse","database","datafusion","duckdb","impala","mssql","mysql","pandas","polars","postgresql","pyarrow","pyspark","python","snowflake","sql","sqlite","trino"],"latest_commit_sha":null,"homepage":"https://ibis-project.org","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ibis-project.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"docs/CONTRIBUTING.md","funding":null,"license":"LICENSE.txt","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2015-04-17T20:43:46.000Z","updated_at":"2025-05-05T13:39:59.000Z","dependencies_parsed_at":"2023-10-04T02:55:53.541Z","dependency_job_id":"6a8a8a53-cc5c-4a24-adb2-3c0e9beecdd8","html_url":"https://github.com/ibis-project/ibis","commit_stats":{"total_commits":8721,"total_committers":215,"mean_commits":"40.562790697674416","dds":0.585827313381493,"last_synced_commit":"34c465cde2429147beb018fb5fb5b9a0308c6c2b"},"previous_names":["cloudera/ibis"],"tags_count":59,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ibis-project%2Fibis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ibis-project%2Fibis/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ibis-project%2Fibis/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ibis-project%2Fibis/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ibis-project","download_url":"https://codeload.github.com/ibis-project/ibis/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253968447,"owners_count":21992255,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bigquery","clickhouse","database","datafusion","duckdb","impala","mssql","mysql","pandas","polars","postgresql","pyarrow","pyspark","python","snowflake","sql","sqlite","trino"],"created_at":"2024-07-30T19:00:21.965Z","updated_at":"2025-05-13T15:03:49.469Z","avatar_url":"https://github.com/ibis-project.png","language":"Python","funding_links":[],"categories":["Python","Language bindings","Data Management \u0026 Processing","1. Core Frameworks \u0026 Libraries","Curated List","sqlite","Libraries","Database Clients","Data Analysis"],"sub_categories":["Python","Database \u0026 Cloud Management","Data Tools"],"readme":"# Ibis\n\n[![Documentation status](https://img.shields.io/badge/docs-docs.ibis--project.org-blue.svg)](http://ibis-project.org)\n[![Project chat](https://img.shields.io/badge/zulip-join_chat-purple.svg?logo=zulip)](https://ibis-project.zulipchat.com)\n[![Anaconda badge](https://anaconda.org/conda-forge/ibis-framework/badges/version.svg)](https://anaconda.org/conda-forge/ibis-framework)\n[![PyPI](https://img.shields.io/pypi/v/ibis-framework.svg)](https://pypi.org/project/ibis-framework)\n[![Build status](https://github.com/ibis-project/ibis/actions/workflows/ibis-main.yml/badge.svg)](https://github.com/ibis-project/ibis/actions/workflows/ibis-main.yml?query=branch%3Amain)\n[![Build status](https://github.com/ibis-project/ibis/actions/workflows/ibis-backends.yml/badge.svg)](https://github.com/ibis-project/ibis/actions/workflows/ibis-backends.yml?query=branch%3Amain)\n[![Codecov branch](https://img.shields.io/codecov/c/github/ibis-project/ibis/main.svg)](https://codecov.io/gh/ibis-project/ibis)\n\n## What is Ibis?\n\nIbis is the portable Python dataframe library:\n\n- Fast local dataframes (via DuckDB by default)\n- Lazy dataframe expressions\n- Interactive mode for iterative data exploration\n- [Compose Python dataframe and SQL code](#python--sql-better-together)\n- Use the same dataframe API for [nearly 20 backends](#backends)\n- Iterate locally and deploy remotely by [changing a single line of code](#portability)\n\nSee the documentation on [\"Why Ibis?\"](https://ibis-project.org/why) to learn more.\n\n## Getting started\n\nYou can `pip install` Ibis with a backend and example data:\n\n```bash\npip install 'ibis-framework[duckdb,examples]'\n```\n\n\u003e 💡 **Tip**\n\u003e\n\u003e See the [installation guide](https://ibis-project.org/install) for more installation options.\n\nThen use Ibis:\n\n```python\n\u003e\u003e\u003e import ibis\n\u003e\u003e\u003e ibis.options.interactive = True\n\u003e\u003e\u003e t = ibis.examples.penguins.fetch()\n\u003e\u003e\u003e t\n┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓\n┃ species ┃ island    ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex    ┃ year  ┃\n┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩\n│ string  │ string    │ float64        │ float64       │ int64             │ int64       │ string │ int64 │\n├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤\n│ Adelie  │ Torgersen │           39.1 │          18.7 │               181 │        3750 │ male   │  2007 │\n│ Adelie  │ Torgersen │           39.5 │          17.4 │               186 │        3800 │ female │  2007 │\n│ Adelie  │ Torgersen │           40.3 │          18.0 │               195 │        3250 │ female │  2007 │\n│ Adelie  │ Torgersen │           NULL │          NULL │              NULL │        NULL │ NULL   │  2007 │\n│ Adelie  │ Torgersen │           36.7 │          19.3 │               193 │        3450 │ female │  2007 │\n│ Adelie  │ Torgersen │           39.3 │          20.6 │               190 │        3650 │ male   │  2007 │\n│ Adelie  │ Torgersen │           38.9 │          17.8 │               181 │        3625 │ female │  2007 │\n│ Adelie  │ Torgersen │           39.2 │          19.6 │               195 │        4675 │ male   │  2007 │\n│ Adelie  │ Torgersen │           34.1 │          18.1 │               193 │        3475 │ NULL   │  2007 │\n│ Adelie  │ Torgersen │           42.0 │          20.2 │               190 │        4250 │ NULL   │  2007 │\n│ …       │ …         │              … │             … │                 … │           … │ …      │     … │\n└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘\n\u003e\u003e\u003e g = t.group_by(\"species\", \"island\").agg(count=t.count()).order_by(\"count\")\n\u003e\u003e\u003e g\n┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓\n┃ species   ┃ island    ┃ count ┃\n┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩\n│ string    │ string    │ int64 │\n├───────────┼───────────┼───────┤\n│ Adelie    │ Biscoe    │    44 │\n│ Adelie    │ Torgersen │    52 │\n│ Adelie    │ Dream     │    56 │\n│ Chinstrap │ Dream     │    68 │\n│ Gentoo    │ Biscoe    │   124 │\n└───────────┴───────────┴───────┘\n```\n\n\u003e 💡 **Tip**\n\u003e\n\u003e See the [getting started tutorial](https://ibis-project.org/tutorials/basics) for a full introduction to Ibis.\n\n## Python + SQL: better together\n\nFor most backends, Ibis works by compiling its dataframe expressions into SQL:\n\n```python\n\u003e\u003e\u003e ibis.to_sql(g)\nSELECT\n  \"t1\".\"species\",\n  \"t1\".\"island\",\n  \"t1\".\"count\"\nFROM (\n  SELECT\n    \"t0\".\"species\",\n    \"t0\".\"island\",\n    COUNT(*) AS \"count\"\n  FROM \"penguins\" AS \"t0\"\n  GROUP BY\n    1,\n    2\n) AS \"t1\"\nORDER BY\n  \"t1\".\"count\" ASC\n```\n\nYou can mix SQL and Python code:\n\n```python\n\u003e\u003e\u003e a = t.sql(\"SELECT species, island, count(*) AS count FROM penguins GROUP BY 1, 2\")\n\u003e\u003e\u003e a\n┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓\n┃ species   ┃ island    ┃ count ┃\n┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩\n│ string    │ string    │ int64 │\n├───────────┼───────────┼───────┤\n│ Adelie    │ Torgersen │    52 │\n│ Adelie    │ Biscoe    │    44 │\n│ Adelie    │ Dream     │    56 │\n│ Gentoo    │ Biscoe    │   124 │\n│ Chinstrap │ Dream     │    68 │\n└───────────┴───────────┴───────┘\n\u003e\u003e\u003e b = a.order_by(\"count\")\n\u003e\u003e\u003e b\n┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓\n┃ species   ┃ island    ┃ count ┃\n┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩\n│ string    │ string    │ int64 │\n├───────────┼───────────┼───────┤\n│ Adelie    │ Biscoe    │    44 │\n│ Adelie    │ Torgersen │    52 │\n│ Adelie    │ Dream     │    56 │\n│ Chinstrap │ Dream     │    68 │\n│ Gentoo    │ Biscoe    │   124 │\n└───────────┴───────────┴───────┘\n```\n\nThis allows you to combine the flexibility of Python with the scale and performance of modern SQL.\n\n## Backends\n\nIbis supports nearly 20 backends:\n\n- [Apache DataFusion](https://ibis-project.org/backends/datafusion/)\n- [Apache Druid](https://ibis-project.org/backends/druid/)\n- [Apache Flink](https://ibis-project.org/backends/flink)\n- [Apache Impala](https://ibis-project.org/backends/impala/)\n- [Apache PySpark](https://ibis-project.org/backends/pyspark/)\n- [BigQuery](https://ibis-project.org/backends/bigquery/)\n- [ClickHouse](https://ibis-project.org/backends/clickhouse/)\n- [DuckDB](https://ibis-project.org/backends/duckdb/)\n- [Exasol](https://ibis-project.org/backends/exasol)\n- [MySQL](https://ibis-project.org/backends/mysql/)\n- [Oracle](https://ibis-project.org/backends/oracle/)\n- [Polars](https://ibis-project.org/backends/polars/)\n- [PostgreSQL](https://ibis-project.org/backends/postgresql/)\n- [RisingWave](https://ibis-project.org/backends/risingwave/)\n- [SQL Server](https://ibis-project.org/backends/mssql/)\n- [SQLite](https://ibis-project.org/backends/sqlite/)\n- [Snowflake](https://ibis-project.org/backends/snowflake)\n- [Theseus](https://voltrondata.com/start)\n- [Trino](https://ibis-project.org/backends/trino/)\n\n## How it works\n\nMost Python dataframes are tightly coupled to their execution engine. And many databases only support SQL, with no Python API. Ibis solves this problem by providing a common API for data manipulation in Python, and compiling that API into the backend’s native language. This means you can learn a single API and use it across any supported backend (execution engine).\n\nIbis broadly supports two types of backend:\n\n1. SQL-generating backends\n2. DataFrame-generating backends\n\n![Ibis backend types](./docs/images/backends.png)\n\n## Portability\n\nTo use different backends, you can set the backend Ibis uses:\n\n```python\n\u003e\u003e\u003e ibis.set_backend(\"duckdb\")\n\u003e\u003e\u003e ibis.set_backend(\"polars\")\n\u003e\u003e\u003e ibis.set_backend(\"datafusion\")\n```\n\nTypically, you'll create a connection object:\n\n```python\n\u003e\u003e\u003e con = ibis.duckdb.connect()\n\u003e\u003e\u003e con = ibis.polars.connect()\n\u003e\u003e\u003e con = ibis.datafusion.connect()\n```\n\nAnd work with tables in that backend:\n\n```python\n\u003e\u003e\u003e con.list_tables()\n['penguins']\n\u003e\u003e\u003e t = con.table(\"penguins\")\n```\n\nYou can also read from common file formats like CSV or Apache Parquet:\n\n```python\n\u003e\u003e\u003e t = con.read_csv(\"penguins.csv\")\n\u003e\u003e\u003e t = con.read_parquet(\"penguins.parquet\")\n```\n\nThis allows you to iterate locally and deploy remotely by changing a single line of code.\n\n\u003e 💡 **Tip**\n\u003e\n\u003e Check out [the blog on backend agnostic arrays](https://ibis-project.org/posts/backend-agnostic-arrays/) for one example using the same code across DuckDB and BigQuery.\n\n## Community and contributing\n\nIbis is an open source project and welcomes contributions from anyone in the community.\n\n- Read [the contributing guide](https://github.com/ibis-project/ibis/blob/main/docs/CONTRIBUTING.md).\n- We care about keeping the community welcoming for all. Check out [the code of conduct](https://github.com/ibis-project/ibis/blob/main/CODE_OF_CONDUCT.md).\n- The Ibis project is open sourced under the [Apache License](https://github.com/ibis-project/ibis/blob/main/LICENSE.txt).\n\nJoin our community by interacting on GitHub or chatting with us on [Zulip](https://ibis-project.zulipchat.com/).\n\nFor more information visit https://ibis-project.org/.\n\n## Governance\n\nThe Ibis project is an [independently governed](https://github.com/ibis-project/governance/blob/main/governance.md) open source community project to build and maintain the portable Python dataframe library. Ibis has contributors across a range of data companies and institutions.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fibis-project%2Fibis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fibis-project%2Fibis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fibis-project%2Fibis/lists"}