{"id":13671774,"url":"https://github.com/splitgraph/seafowl","last_synced_at":"2025-05-15T20:06:15.849Z","repository":{"id":56773523,"uuid":"510375115","full_name":"splitgraph/seafowl","owner":"splitgraph","description":"Analytical database for data-driven Web applications 🪶","archived":false,"fork":false,"pushed_at":"2025-02-25T09:01:09.000Z","size":4686,"stargazers_count":483,"open_issues_count":45,"forks_count":13,"subscribers_count":9,"default_branch":"main","last_synced_at":"2025-05-11T15:46:40.309Z","etag":null,"topics":["api","database","datafusion","delta-lake","delta-rs","edge","http","rust","serverless","sql","visualization"],"latest_commit_sha":null,"homepage":"https://seafowl.io","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/splitgraph.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2022-07-04T13:41:52.000Z","updated_at":"2025-05-10T14:55:49.000Z","dependencies_parsed_at":"2024-02-12T09:35:35.380Z","dependency_job_id":null,"html_url":"https://github.com/splitgraph/seafowl","commit_stats":null,"previous_names":[],"tags_count":37,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/splitgraph%2Fseafowl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/splitgraph%2Fseafowl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/splitgraph%2Fseafowl/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/splitgraph%2Fseafowl/manifests","owner_url":"https://repos.ecosyste.ms/ap
i/v1/hosts/GitHub/owners/splitgraph","download_url":"https://codeload.github.com/splitgraph/seafowl/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254414499,"owners_count":22067272,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api","database","datafusion","delta-lake","delta-rs","edge","http","rust","serverless","sql","visualization"],"created_at":"2024-08-02T09:01:18.324Z","updated_at":"2025-05-15T20:06:10.747Z","avatar_url":"https://github.com/splitgraph.png","language":"Rust","readme":"![Seafowl](./docs/static/logotype.svg)\n\n![CI](https://github.com/splitgraph/seafowl/workflows/CI/badge.svg)\n[![Docker Pulls](https://img.shields.io/docker/pulls/splitgraph/seafowl)](https://hub.docker.com/r/splitgraph/seafowl)\n[![Docker Image Size (latest by date)](https://img.shields.io/docker/image-size/splitgraph/seafowl)](https://hub.docker.com/r/splitgraph/seafowl)\n[![GitHub all releases](https://img.shields.io/github/downloads/splitgraph/seafowl/total)](https://github.com/splitgraph/seafowl/releases)\n[![GitHub release (latest by date including pre-releases)](https://img.shields.io/github/v/release/splitgraph/seafowl?include_prereleases\u0026sort=semver)](https://github.com/splitgraph/seafowl/releases)\n\n**[Home page](https://seafowl.io) |\n[Docs](https://www.splitgraph.com/docs/seafowl/getting-started/introduction) |\n[Benchmarks](https://observablehq.com/@seafowl/benchmarks) |\n[Demo](https://observablehq.com/@seafowl/interactive-visualization-demo) |\n[Nightly 
builds](https://nightly.link/splitgraph/seafowl/workflows/nightly/main) |\n[Download](https://github.com/splitgraph/seafowl/releases)**\n\n[**✨✨✨ Seafowl now proudly powers the EDB Postgres Lakehouse ✨✨✨**](https://www.enterprisedb.com/workload/rapid-analytics-for-postgres)\n\nSeafowl is an analytical database for modern data-driven Web applications.\n\nIts CDN and HTTP cache-friendly query execution API lets you deliver data to your visualizations,\ndashboards and notebooks by running SQL straight from the user's browser.\n\n## Features\n\n### Fast analytics...\n\nSeafowl is built around\n[Apache DataFusion](https://arrow.apache.org/datafusion/user-guide/introduction.html), a fast and\nextensible query execution framework. It uses [Apache Parquet](https://parquet.apache.org/) columnar\nstorage, adhering to the [Delta Lake](https://delta.io/) protocol, making it perfect for analytical\nworkloads.\n\nFor `SELECT` queries, Seafowl supports a large subset of the PostgreSQL dialect. If there's\nsomething missing, you can\n[write a user-defined function](https://splitgraph.com/docs/seafowl/guides/custom-udf-wasm) for\nSeafowl in anything that compiles to WebAssembly.\n\nIn addition, you can write data to Seafowl by:\n\n- [uploading a CSV or a Parquet file](https://splitgraph.com/docs/seafowl/guides/uploading-csv-parquet)...\n- pointing a table to a local or an\n  [externally hosted CSV or a Parquet file](https://seafowl.io/docs/guides/csv-parquet-http-external)...\n- pointing a table to a [remote database](https://seafowl.io/docs/guides/remote-tables)...\n- or using\n  [standard SQL DML statements](https://splitgraph.com/docs/seafowl/guides/writing-sql-queries).\n\n### ...at the edge\n\nSeafowl is designed to be deployed to modern serverless environments. It ships as a single binary,\nmaking it simple to run anywhere.\n\nSeafowl's architecture is inspired by modern cloud data warehouses like Snowflake or BigQuery,\nletting you separate storage and compute. 
You can store Seafowl data in an object storage like S3 or\nMinio and scale to zero. Or, you can\n[build a self-contained Docker image](https://splitgraph.com/docs/seafowl/guides/baking-dataset-docker-image)\nwith Seafowl and your data, letting you deploy your data to any platform that supports Docker.\n\nSeafowl's query execution API follows HTTP cache semantics. This means you can\n[put Seafowl behind a CDN](https://splitgraph.com/docs/seafowl/guides/querying-cache-cdn) like\nCloudflare or a cache like Varnish and have query results cached and delivered to your users in\nmilliseconds. Even without a cache, you can get the benefits of caching query results in your user's\nbrowser.\n\n## Quickstart\n\nStart Seafowl:\n\n```bash\ndocker run --rm -p 8080:8080 \\\n    -e SEAFOWL__FRONTEND__HTTP__WRITE_ACCESS=any \\\n    splitgraph/seafowl:nightly\n```\n\nOr download it from [the releases page](https://github.com/splitgraph/seafowl/releases) and run it\nwithout Docker:\n\n```bash\nSEAFOWL__FRONTEND__HTTP__WRITE_ACCESS=any ./seafowl\n```\n\nAdd a Parquet dataset from HTTP:\n\n```bash\ncurl -i -H \"Content-Type: application/json\" localhost:8080/q -d@- \u003c\u003cEOF\n{\"query\": \"CREATE EXTERNAL TABLE tripdata \\\nSTORED AS PARQUET \\\nLOCATION 'https://d37ci6vzurychx.cloudfront.net/trip-data/yellow_tripdata_2022-01.parquet';\nCREATE TABLE tripdata AS SELECT * FROM staging.tripdata;\n\"}\nEOF\n```\n\nRun a query:\n\n```bash\ncurl -i -H \"Content-Type: application/json\" localhost:8080/q \\\n  -d@-\u003c\u003cEOF\n{\"query\": \"SELECT\n    EXTRACT(hour FROM tpep_dropoff_datetime) AS hour,\n    COUNT(*) AS trips,\n    SUM(total_amount) AS total_amount,\n    AVG(tip_amount / total_amount) AS tip_fraction\n  FROM tripdata\n  WHERE total_amount != 0\n  GROUP BY 1\n  ORDER BY 4 
DESC\"}\nEOF\n\n{\"hour\":21,\"trips\":109685,\"total_amount\":2163599.240000029,\"tip_fraction\":0.12642660660636984}\n{\"hour\":22,\"trips\":107252,\"total_amount\":2154126.55000003,\"tip_fraction\":0.12631676747865359}\n{\"hour\":19,\"trips\":159241,\"total_amount\":3054993.040000063,\"tip_fraction\":0.1252992155287979}\n{\"hour\":18,\"trips\":183020,\"total_amount\":3551738.5100000845,\"tip_fraction\":0.1248666037263193}\n{\"hour\":20,\"trips\":122613,\"total_amount\":2402858.8600000343,\"tip_fraction\":0.12414978866883832}\n{\"hour\":1,\"trips\":45485,\"total_amount\":940333.4000000034,\"tip_fraction\":0.12336981088023881}\n...\n```\n\n## CLI\n\nSeafowl also provides a CLI to accommodate frictionless prototyping, troubleshooting and testing of\nthe core features:\n\n```bash\n$ ./seafowl --cli -c /path/to/seafowl.toml\ndefault\u003e CREATE TABLE t\nAS VALUES\n(1, 'one'),\n(2, 'two');\nTime: 0.021s\ndefault\u003e SELECT * FROM t;\n+---------+---------+\n| column1 | column2 |\n+---------+---------+\n| 1       | one     |\n| 2       | two     |\n+---------+---------+\nTime: 0.009s\ndefault\u003e \\d t\n+---------------+--------------+------------+-------------+-----------+-------------+\n| table_catalog | table_schema | table_name | column_name | data_type | is_nullable |\n+---------------+--------------+------------+-------------+-----------+-------------+\n| default       | public       | t          | column1     | Int64     | YES         |\n| default       | public       | t          | column2     | Utf8      | YES         |\n+---------------+--------------+------------+-------------+-----------+-------------+\nTime: 0.005s\ndefault\u003e \\q\n$\n```\n\nIt does so by circumventing Seafowl's primary HTTP interface, which involves properly formatted HTTP\nrequests with queries, authentication, as well as dealing with potentially faulty networking setups,\nand can sometimes be too tedious for a quick manual interactive session.\n\n## Documentation\n\nSee the 
[documentation](https://www.splitgraph.com/docs/seafowl/getting-started/introduction) for\nmore guides and examples. This includes a longer\n[tutorial](https://www.splitgraph.com/docs/seafowl/getting-started/tutorial-fly-io/introduction),\nfollowing which you will:\n\n- Deploy Seafowl to [Fly.io](https://fly.io)\n- Put it behind Cloudflare CDN or Varnish\n- Build an interactive [Observable](https://observablehq.com) notebook querying data on it, just\n  like [this one](https://observablehq.com/@seafowl/interactive-visualization-demo)\n\n## Tests\n\nPlease consult the dedicated [README](./tests/README.md) for more info on how to run the Seafowl\ntest suite locally.\n\n## Pre-built binaries and Docker images\n\nWe do not yet provide full build instructions, but we do produce binaries and Docker images as\nprebuilt artifacts.\n\n### Release builds\n\nYou can find release binaries on our [releases page](https://github.com/splitgraph/seafowl/releases)\n\n### Nightly builds\n\nWe produce nightly binaries after every merge to `main`. You can find them in GitHub Actions\nartifacts (only if you're logged in, see\n[this issue](https://github.com/actions/upload-artifact/issues/51)) or **via\n[nightly.link](https://nightly.link/splitgraph/seafowl/workflows/nightly/main)**:\n\n- [Linux (x86_64-unknown-linux-gnu)](https://nightly.link/splitgraph/seafowl/workflows/nightly/main/seafowl-nightly-x86_64-unknown-linux-gnu.zip)\n- [OSX (x86_64-apple-darwin)](https://nightly.link/splitgraph/seafowl/workflows/nightly/main/seafowl-nightly-x86_64-apple-darwin.zip)\n- [Windows (x86_64-pc-windows-msvc)](https://nightly.link/splitgraph/seafowl/workflows/nightly/main/seafowl-nightly-x86_64-pc-windows-msvc.zip)\n\n### Docker images\n\nWe produce [Docker images](https://hub.docker.com/r/splitgraph/seafowl/tags) on every merge to\n`main`.\n\n- Release builds are tagged according to their version, e.g. 
`v0.1.0` results in\n  `splitgraph/seafowl:0.1.0` and `0.1`.\n- Nightly builds are tagged as `splitgraph/seafowl:nightly`.\n\n## Long-term feature roadmap\n\nThere are many features we're planning for Seafowl. Where appropriate, we'll also aim to upstream\nthese changes into DataFusion itself.\n\n### Support for JSON functions and storage\n\nWe're planning on adding the JSON datatype to Seafowl, as well as a suite of functions to\nmanipulate/access JSON data, similar to the\n[functions supported by PostgreSQL](https://www.postgresql.org/docs/current/functions-json.html).\n\n### PostgreSQL-compatible endpoint\n\nThis will make Seafowl queryable by existing BI tools like Metabase/Superset/Looker.\n","funding_links":[],"categories":["Rust","api"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsplitgraph%2Fseafowl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsplitgraph%2Fseafowl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsplitgraph%2Fseafowl/lists"}