https://github.com/esadek/mini-mds
Lightweight, open source, locally-hosted Modern Data Stack
- Host: GitHub
- URL: https://github.com/esadek/mini-mds
- Owner: esadek
- License: mit
- Created: 2024-09-28T21:27:16.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-04-07T06:31:16.000Z (10 months ago)
- Last Synced: 2025-04-15T14:14:31.237Z (10 months ago)
- Topics: dash, dbt, dlt, duckdb, modern-data-stack, pandera, prefect
- Language: Python
- Size: 370 KB
- Stars: 14
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Mini MDS
Lightweight, open source, locally-hosted Modern Data Stack
- Extract and Load: [Polars](https://pola.rs/) and [dlt](https://dlthub.com/)
- Data Quality: [Pandera](https://www.union.ai/pandera/)
- Storage: [DuckDB](https://duckdb.org/)
- Transformation: [dbt](https://www.getdbt.com/)
- Orchestration: [Prefect](https://www.prefect.io/)
- Visualization: [Dash](https://dash.plotly.com/)
## Installation
Prerequisites: Install [git](https://git-scm.com/) and [uv](https://docs.astral.sh/uv/).
Clone the repository and change into its directory:
```bash
git clone https://github.com/esadek/mini-mds.git
cd mini-mds
```
## Usage
Extract, validate, load and transform data:
```bash
uv run prefect/elt.py
```
Visualize data:
```bash
uv run dash/app.py
```
## Architecture
```mermaid
flowchart LR
A(CSV) --> B[Polars]
subgraph Prefect
B --> C[Pandera]
C --> D[dlt]
E[dbt Core]
end
D --> F[(DuckDB)]
E <--> F
F --> G[Dash]
```
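To make the flow in the diagram concrete, here is a minimal, dependency-free sketch of the same stages. It is not the project's code: it uses only the Python standard library as a stand-in, with `csv` in place of Polars, a hand-written check in place of Pandera, `sqlite3` in place of dlt and DuckDB, a SQL view in place of a dbt model, and a plain function in place of a Prefect flow. The sample data, table names and function names are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical sample data; the real project reads its own CSV source.
RAW_CSV = "id,amount\n1,10.5\n2,4.0\n3,7.25\n"


def extract(text: str) -> list[dict]:
    """Extract: parse CSV text into typed rows (the role Polars plays)."""
    return [
        {"id": int(row["id"]), "amount": float(row["amount"])}
        for row in csv.DictReader(io.StringIO(text))
    ]


def validate(rows: list[dict]) -> list[dict]:
    """Validate: reject bad rows before loading (the role Pandera plays)."""
    for row in rows:
        if row["amount"] < 0:
            raise ValueError(f"negative amount in row {row['id']}")
    return rows


def load(con: sqlite3.Connection, rows: list[dict]) -> None:
    """Load: write raw rows to the warehouse (the role of dlt and DuckDB)."""
    con.execute("CREATE TABLE raw_orders (id INTEGER PRIMARY KEY, amount REAL)")
    con.executemany("INSERT INTO raw_orders VALUES (:id, :amount)", rows)


def transform(con: sqlite3.Connection) -> None:
    """Transform: derive a model with SQL in the warehouse (the role of dbt)."""
    con.execute(
        "CREATE VIEW order_totals AS "
        "SELECT COUNT(*) AS n_orders, SUM(amount) AS total FROM raw_orders"
    )


def run_elt(text: str = RAW_CSV) -> sqlite3.Connection:
    """Orchestrate the steps in order (the role Prefect plays)."""
    con = sqlite3.connect(":memory:")
    load(con, validate(extract(text)))
    transform(con)
    return con


if __name__ == "__main__":
    warehouse = run_elt()
    # A dashboard such as Dash would query the transformed model like this.
    print(warehouse.execute("SELECT n_orders, total FROM order_totals").fetchone())
```

Validation runs before the load, so a bad row stops the run before anything reaches the warehouse. That matches the Pandera-before-dlt ordering in the diagram.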
## Project Structure
```
mini-mds
├── .github/ # GitHub workflows
├── dash/ # Dash application
├── dbt/ # dbt project
├── duckdb/ # DuckDB warehouse
├── prefect/ # Prefect workflows
├── .editorconfig # Editor configuration
├── .gitignore # Untracked files to ignore
├── .python-version # Default Python version
├── LICENSE # MIT license
├── pyproject.toml # Project metadata
├── README.md # Documentation
└── uv.lock # Dependency lockfile
```