Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-duckdb
🦆 A curated list of awesome DuckDB resources
https://github.com/szarnyasg/awesome-duckdb
Resources
- DuckDB Clients - Client APIs for DuckDB.
- DuckDB Documentation PDF - The DuckDB documentation as a single PDF file.
- DuckDB setup - GitHub Action to install DuckDB in CI.
- DuckDB snippets - Collection of snippets curated by MotherDuck.
- DuckDB tldr page - DuckDB's entry in [tldr pages](https://tldr.sh/), available in CLI via the `tldr duckdb` command.
- Compatible DuckDB Extensions for AWS Lambda - Extensions specifically compiled for the AWS Lambda runtime (GLIBC 2.26).
- docker-duckdb - Docker image for DuckDB CLI.
- DuckDB AWS Lambda layer - Run DuckDB in AWS Lambda functions.
- Serverless DuckDB as API - Use DuckDB as API with Amazon API Gateway and AWS Lambda.
- Serverless Parquet Repartitioner - Use DuckDB to repartition data in S3-based Data Lakes.
- duckdb-nf - Example uses of DuckDB with Nextflow.
- DuckDB version manager (`duckman`) - Platform installer and version manager for DuckDB.
- Serverless DuckDB over S3 - Running DuckDB over a data lake on S3 using AWS Lambda.
- Official Documentation - Official DuckDB documentation.
- Official Blog - Official DuckDB blog.
- DuckERD CLI
Client APIs
Tools Powered by DuckDB
- Ibis Project - A DataFrame API for interacting with DuckDB (and other compute engines).
- MotherDuck - Serverless data warehouse powered by DuckDB.
- Hex Dataframe SQL - Hex's Dataframe SQL cells are powered by DuckDB.
- Mode - Mode uses DuckDB for their in-memory data engine.
- VulcanSQL - DuckDB can be used as a caching layer or a data connector in VulcanSQL, a Data API framework for data folks to create REST APIs by writing SQL templates.
- Honeycomb Maps - A browser-based geospatial analysis tool leveraging DuckDB Wasm.
- Bauplan - A serverless data transformation platform for data lakes.
- Census - Census's dataset diffing for incremental syncs is powered by DuckDB.
- Parquet Explorer - Visual Studio Code extension for exploring Parquet files with SQL, powered by DuckDB.
- Huey - Blazing-fast and intuitive pivot tables on .parquet, .csv, and .json files and .duckdb tables, running in the browser on DuckDB Wasm. Open source (MIT). Zero install!
- DatalakeStudio - Load, explore, transform your datasets and expose them via API. Integration with external APIs, S3, PostgreSQL and ChatGPT.
- Spice.ai - A unified SQL query interface and portable runtime to locally materialize (using an embedded DuckDB), accelerate, and query datasets from any database, data warehouse, or data lake.
- Malloy - Malloy is an experimental language for describing data relationships and transformations. Malloy connects to BigQuery, Snowflake, Trino, and Postgres, and natively supports DuckDB.
- Quackpipe - Serverless OLAP API/UI built on top of DuckDB with basic ClickHouse API compatibility and MotherDuck support.
- ParadeDB - Postgres for Search and Analytics, powered by DuckDB-embedded-in-Postgres.
- Crunchy Bridge for Analytics - Fully managed DBaaS based on Postgres and integrated with DuckDB.
- Whereabouts - Fast, accurate, open-source geocoding in Python, using DuckDB.
- Rill Developer - Tool for effortlessly transforming data sets into powerful, opinionated dashboards using SQL.
- UniverSQL - An implementation of the Snowflake API that enables running queries on Snowflake tables locally with DuckDB, without a running warehouse.
- Phoenix Analytics - Plug and play analytics for Phoenix applications, powered by DuckDB.
- sqlglot - Python transpiler that translates between 23 different SQL dialects including DuckDB.
- Definite - Definite pulls all your data into a single place for analytics and dashboards. No engineering or SQL required. Get a managed data warehouse (DuckDB), ELT, data modeling / transformations and BI in a single platform.
- Amphi ETL - Low-code data pipelines for structured and unstructured data. SQL transformations are powered by DuckDB.
- Excalichart.com - A fast, free dashboard for exploring your data.
- Latitude - Latitude uses DuckDB to power data snapshots. Drop a CSV file and query it with SQL at the speed of light.
- Iceburst - The real-time data lake for monitoring & security.
Web Clients
- Online DuckDB Shell - Online DuckDB shell powered by WebAssembly.
- Sekuel Playground - Query your local Parquet, CSV, and JSON files. Your data is not sent off the device you are using.
- CSVFiddle - Free tool to explore and share insights from CSV files using SQL. Import data, write SQL, then instantly share it with anyone.
- Codapi - Embed executable code snippets directly into your product documentation, online course or blog post.
- QuackDB - Open-source online DuckDB SQL playground and editor.
- WhatTheDuck - WhatTheDuck is an open-source web application built on DuckDB. It allows users to upload CSV files, store them in tables, and perform SQL queries on the data.
SQL Clients and IDEs that Support DuckDB
- qStudio - A free SQL tool specialized for data analysts. It runs on every operating system and allows easy browsing of tables and charting of results.
- DuckDB SQL Tools - Free DuckDB SQL Tools for VS Code IDE. [Premium version available](https://github.com/RandomFractals/pro-data-tools/blob/main/duckdb-tools.md#duckdb-pro-tools) with advanced features.
- VSCode SQLTools (Free) - Free open-source VSCode extension to query and explore your DuckDB databases with latest DuckDB support.
- DataGrip - Paid SQL IDE by JetBrains that supports many different database technologies, including DuckDB.
Projects Powered by DuckDB
- `endoflife.date` database - Daily dumps of endoflife.date data.
- nodbi - NoSQL Database Connector for R, providing a common API across Elasticsearch, CouchDB, MongoDB, SQLite, PostgreSQL, and DuckDB.
Integrations
- data load tool - DuckDB destination - Extract and load data from APIs to DuckDB using dlt.
- target-duckdb - Load data to DuckDB based on the Singer spec.
- Kestra DuckDB plugin - Run queries with DuckDB to schedule data transformations and process automations, and run event-driven anomaly detection pipelines.
Extensions
- Official Extensions - Official DuckDB extensions.
- `mysql` - To read from and write to MySQL databases.
- `postgres` - To read from and write to PostgreSQL databases.
- `spatial` - Enables geospatial processing.
- `sqlite` - To read from and write to SQLite databases.
- `vss` - Add support for vector similarity search.
- Kùzu - Scan DuckDB tables in Kùzu, an embeddable property graph database management system.
Media
Talks
- State of the Duck @ DuckCon #4 - Hannes Mühleisen and Mark Raasveldt.
- In-Process Analytical Data Management with DuckDB @ PyData Amsterdam - Hannes Mühleisen.
- DuckDB: The Power of a Data Warehouse in your Python Process @ PyData Yerevan - Gábor Szárnyas.
- DuckDB: Bringing analytical SQL directly to your Python shell @ EuroPython - Pedro Holanda.
- DuckDB keynote @ Data + AI Summit 2023 - Hannes Mühleisen.
- DuckDB: Bringing Analytical SQL Directly To Your Python Shell @ FOSDEM - Pedro Holanda.
- State of the Duck @ DuckCon #2 - Hannes Mühleisen and Mark Raasveldt.
- DuckDB Extensions @ DuckCon - Pedro Holanda and Sam Ansmink.
- Developing Systems in Academia: The Good, the Bad, and the not-so-Ugly Duckling @ CIDR - Hannes Mühleisen.
- DuckDB An Embeddable Analytical Database @ FOSDEM - Hannes Mühleisen.
- Why should you care about DuckDB? @ Dublin DuckDB meetup - Mihai Bojin.
- Exploring Monte Carlo Simulations With DuckDB @ Dublin DuckDB meetup - James McNeill.
- DuckDB and recommenders: a lightning fast synergy @ Dublin DuckDB meetup - Khalil Muhammad.
- State of the Duck @ DuckCon #3 - Hannes Mühleisen and Mark Raasveldt.
- Nextflow and database uses: powering data engineering, exploring DuckDB, and beyond - Edmund Miller.
Podcasts
- Developer Voices: Implementing Hardware-Friendly Databases - Hannes Mühleisen.
- The Geek Narrator: DuckDB Internals - Mark Raasveldt.
- Software Engineering Daily: DuckDB - Hannes Mühleisen.
- The Analytics Engineering Podcast: The Personal Data Warehouse - Jordan Tigani.
Blog Posts
- Modern Data Stack in a Box - Fast, free, and open-source Modern Data Stack deployed on a laptop using the combination of DuckDB, Meltano, dbt, and Apache Superset.
- How to use DuckDB, Motherduck and Kestra for ETL - How DuckDB can transform data, mask sensitive PII, detect anomalies in event-driven workflows, and streamline reporting use cases.
- DuckDB vs. MotherDuck — how do they compare - The key differences between them, and when to choose each option.
- Exploring StarCraft 2 data with Airflow, DuckDB and Streamlit - Example project that uses DuckDB to persist API data and explains how to use DuckDB as a versatile data manipulation tool in data wrangling scripts.
- DuckDB: The Rising Star in the Big Data Landscape
- How to Make a DuckDB Extension for a Table Function? - How to make a DuckDB extension to fetch data from external sources.
Libraries Powered by DuckDB
- Mosaic - An extensible framework for linking databases and interactive views.
- Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rewrites.
- PyGWalker - A Python library that turns your dataframe into an interactive UI for data visualization.