https://github.com/datonic/datadex

📦 Serverless and local-first Open Data Platform

dbt duckdb open-data quarto sql

Host: GitHub
URL: https://github.com/datonic/datadex
Owner: datonic
License: mit
Created: 2022-03-16T10:08:34.000Z (over 3 years ago)
Default Branch: main
Last Pushed: 2025-03-11T10:31:11.000Z (4 months ago)
Last Synced: 2025-03-19T20:57:32.115Z (4 months ago)
Topics: dbt, duckdb, open-data, quarto, sql
Language: Jupyter Notebook
Homepage: http://datadex.datonic.io
Size: 15 MB
Stars: 285
Watchers: 3
Forks: 16
Open Issues: 12
Metadata Files:
- Readme: README.md
- License: LICENSE

README

D A T A D E X

The Open Data Platform for your community's Open Data

Datadex is a fully open-source, serverless, and local-first Data Platform that improves how [communities collaborate on Open Data](https://davidgasquez.com/community-level-open-data-infrastructure/). Datadex is not a new tool; it is a pattern showing an opinionated bridge between existing ones.

The goal is to increase your community's coordination and shared understanding of the world. Datadex makes it easy to produce data products built by your community, for your community.

### 🚀 Implementations

The following repositories are [real-world production Open Data Portals](https://davidgasquez.com/modern-open-data-portals/) that implement the Datadex pattern:

- [LUNG-SARG](https://github.com/open-radiogenomics/lung-sarg). The Open Data Platform for Sustainable, Accessible Lung Radiogenomics.
- [Datania](https://github.com/davidgasquez/datania/). An Open Data Platform at national level that unifies and harmonizes information from different sources.
- [Gitcoin Grants Data Portal](https://github.com/davidgasquez/gitcoin-grants-data-portal). A Data hub for Gitcoin Grants data and related models.
- [Filecoin Data Portal](https://github.com/davidgasquez/filecoin-data-portal/). A data portal for data related to the Filecoin network and ecosystem.

### 💡 Principles

> [Make Open Data compatible with the Modern Data Ecosystem](https://handbook.davidgasquez.com/Open+Data).

- **Open**: Code, standards, infrastructure, and data, all public and open source. Rely on open source tools, standards, public infrastructure, and [accessible data formats](https://voltrondata.com/codex/a-new-frontier).
- **Modular and Interoperable**: Easy to replace, extend, or remove components of the pattern. Flexible environments both for running (your laptop, a cluster, or the browser) and for deploying (S3 + GitHub Pages, IPFS, Hugging Face).
- **Permissionless**: Any improvement is one Pull Request away. Update pipelines, add datasets, or improve documentation. When consuming, there are no API limits, just plain files.
- **Data as Code**: Reproducible datasets with declarative stateless transformations tracked in `git`. Data is versioned alongside the code. Models are reusable, packaged, and versioned.
- **Glue**: Be a bridge between tools and approaches. For example, use good software engineering practices like types, tests, materialized views, and more.
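
As an illustration of the "Data as Code" and "just plain files" principles above, here is a minimal sketch in Python using only the standard library. The sample data, the column names, and the `clean_rows` and `to_csv` helpers are hypothetical and not part of Datadex; the project itself relies on its own tooling for transformations.

```python
import csv
import io

# Hypothetical raw dataset, standing in for a plain CSV file published by a portal.
RAW_CSV = """country,year,population
ES,2020,47350000
ES,2021,47420000
PT,2020,
"""

FIELDS = ["country", "year", "population"]


def clean_rows(raw_text: str) -> list[dict]:
    """Stateless transformation: the same input text always yields the same rows."""
    reader = csv.DictReader(io.StringIO(raw_text))
    cleaned = []
    for row in reader:
        # Drop rows with a missing population instead of guessing a value.
        if not row["population"]:
            continue
        cleaned.append(
            {
                "country": row["country"],
                "year": int(row["year"]),
                "population": int(row["population"]),
            }
        )
    return cleaned


def to_csv(rows: list[dict]) -> str:
    """Serialize the cleaned rows back into a plain CSV file body."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(clean_rows(RAW_CSV)), end="")
```

Because the transformation is a pure function of its input, it can be tracked in `git`, reviewed in a Pull Request, and re-run by anyone to reproduce the same output file.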

## ⚙️ Setup

Datadex is mainly a Python project, so you'll need to have Python installed. If you hit any issues, please [open an issue](https://github.com/datonic/datadex/issues/new)! The easiest way to get started is using a Python virtual environment, but a development container is also provided.

### 🐍 Python Virtual Environment

The recommended way is to install [`uv`](https://github.com/astral-sh/uv) and let it manage the Python environment. The following command will install the dependencies and create a virtual environment in the project's folder.

```bash
make setup
```

Alternatively, you can rely on your system's Python installation to create a virtual environment and install the dependencies.

```bash
# Create a virtual environment
python3 -m venv .venv
source .venv/bin/activate

# Install the package and dependencies
pip install -e ".[dev]"
```

Now, you should be able to spin up Dagster UI (`make dev` or `dagster dev`) and [access it locally](http://127.0.0.1:3000).
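
If the page does not load, a quick check of whether anything is listening on that address can help narrow things down. The snippet below is a minimal sketch using only the Python standard library; the `is_port_open` helper is hypothetical and not part of Datadex or Dagster, and the default port of 3000 is taken from the link above.

```python
import socket


def is_port_open(host: str = "127.0.0.1", port: int = 3000, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    if is_port_open():
        print("Something is listening on http://127.0.0.1:3000")
    else:
        print("Nothing is listening on port 3000 yet; is `make dev` running?")
```

This only confirms that a process accepts TCP connections on the port, not that the process is the Dagster UI.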

### 🐳 Docker / Dev Containers

You can use [VSCode Remote Containers](https://code.visualstudio.com/docs/remote/containers) to get started with Datadex too. If you have Docker running, open the project in VSCode and click on the bottom right corner to open the project in a container.

Once inside the development environment, you'll only need to run `make dev` to spin up the [Dagster UI locally](http://127.0.0.1:3000). You'll also have a few extra extensions installed and configured to work with the project.

The development environment can also run in your browser thanks to GitHub Codespaces!

[![badge](https://github.com/codespaces/badge.svg)](https://codespaces.new/davidgasquez/datadex)

## 📜 License

Datadex is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
