Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/mharrisb1/daglib
Lightweight DAG composition framework
abandoned dag dask orchestration python workflow
Last synced: 15 days ago
- Host: GitHub
- URL: https://github.com/mharrisb1/daglib
- Owner: mharrisb1
- License: mit
- Created: 2022-06-29T14:48:01.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2022-08-05T16:35:34.000Z (over 2 years ago)
- Last Synced: 2025-01-10T07:51:55.183Z (30 days ago)
- Topics: abandoned, dag, dask, orchestration, python, workflow
- Language: Python
- Homepage:
- Size: 3.58 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# ⚗️ Daglib - Lightweight DAG composition framework
[![PyPI version](https://badge.fury.io/py/daglib.svg)](https://badge.fury.io/py/daglib)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/daglib)](https://pypi.org/project/daglib/)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/daglib.svg)](https://pypi.org/project/daglib/)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/ambv/black)
[![Checked with mypy](https://img.shields.io/badge/mypy-checked-blue.svg)](https://mypy.readthedocs.io/en/stable/)
[![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit&logoColor=white)](https://github.com/pre-commit/pre-commit)

Daglib is a lightweight, embeddable parallel task execution library that turns pure Python functions into executable task graphs.

# Installation
Core
```shell
pip install daglib
```

With visualizations enabled

```shell
pip install 'daglib[graphviz]' # static visualizations
# or
pip install 'daglib[ipycytoscape]' # interactive visualizations
```

# Create your first DAG

```python
import daglib

dag = daglib.Dag()


@dag.task()
def task_1a():
    return "Hello"


@dag.task()
def task_1b():
    return "world!"


@dag.task()
def task_2(task_1a, task_1b):
    return f"{task_1a}, {task_1b}"


dag.run()
```

    'Hello, world!'

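In the example above, `task_2` appears to receive the results of `task_1a` and `task_1b` because its parameter names match those task names. The following is a minimal standard-library sketch of how such name-based wiring could work. It is an illustration only, not Daglib's actual implementation, and `run_tasks` is a hypothetical helper that does not exist in Daglib.

```python
import inspect


def run_tasks(funcs):
    """Run functions whose parameter names refer to other functions by name.

    Hypothetical helper for illustration; not part of daglib.
    """
    by_name = {f.__name__: f for f in funcs}
    results = {}

    def resolve(name):
        # Compute each task at most once and cache its result.
        if name not in results:
            params = inspect.signature(by_name[name]).parameters
            results[name] = by_name[name](**{p: resolve(p) for p in params})
        return results[name]

    for name in by_name:
        resolve(name)
    return results


def task_1a():
    return "Hello"


def task_1b():
    return "world!"


def task_2(task_1a, task_1b):
    return f"{task_1a}, {task_1b}"


results = run_tasks([task_1a, task_1b, task_2])
print(results["task_2"])  # prints: Hello, world!
```

The sketch runs tasks sequentially and has no cycle detection; a parameter that does not name a known task raises `KeyError`.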
# Beyond the "Hello, world!" example
For a more involved example, we will create a small pipeline that takes data from four source tables and creates a single reporting table. The data is driver-level information from the current 2022 Formula 1 season. The output will be a pivot table for team-level metrics.
## Source Tables
1. Team - The team each driver races for
2. Points - Current total Drivers' World Championship points for each driver for the season
3. Wins - Current number of wins for each driver for the season
4. Podiums - Current number of times the driver finished in the top 3 for the season

```python
import pandas as pd
import daglib

# Ignore. Used to render the DataFrame correctly in the README
pd.set_option("display.notebook_repr_html", False)

dag = daglib.Dag()


@dag.task()
def team():
    return pd.DataFrame(dict(
        driver=["Max", "Charles", "Lewis", "Sergio", "Carlos", "George"],
        team=["Red Bull", "Ferrari", "Mercedes", "Red Bull", "Ferrari", "Mercedes"],
    )).set_index("driver")


@dag.task()
def points():
    return pd.DataFrame(dict(
        driver=["Max", "Charles", "Lewis", "Sergio", "Carlos", "George"],
        points=[258, 178, 146, 173, 156, 158]
    )).set_index("driver")


@dag.task()
def wins():
    return pd.DataFrame(dict(
        driver=["Max", "Charles", "Lewis", "Sergio", "Carlos", "George"],
        wins=[8, 3, 0, 1, 1, 0]
    )).set_index("driver")


@dag.task()
def podiums():
    return pd.DataFrame(dict(
        driver=["Max", "Charles", "Lewis", "Sergio", "Carlos", "George"],
        podiums=[10, 5, 6, 6, 6, 5]
    )).set_index("driver")


@dag.task()
def driver_metrics(team, points, wins, podiums):
    return team.join(points).join(wins).join(podiums)


@dag.task()
def team_metrics(driver_metrics):
    return driver_metrics.groupby("team").sum().sort_values("points", ascending=False)


dag.run()
```

              points  wins  podiums
    team
    Red Bull     431     9       16
    Ferrari      334     4       11
    Mercedes     304     0       11

## Task Graph Visualization

The DAG defined above produces the following task graph:
![task graph](https://storage.googleapis.com/daglib-image-assets/example-dag.png)
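
For readers without the image, the shape of this graph can be approximated in text. The sketch below assumes that edges follow from parameter names, as the example suggests, and uses Python's standard `graphlib` module (Python 3.9+) rather than any Daglib API. The task functions are empty stand-ins with the same names and signatures as in the pipeline, and `levels` is a name introduced here for illustration.

```python
import inspect
from graphlib import TopologicalSorter


# Stand-in tasks: only the names and parameter names matter here.
def team(): ...
def points(): ...
def wins(): ...
def podiums(): ...
def driver_metrics(team, points, wins, podiums): ...
def team_metrics(driver_metrics): ...


tasks = [team, points, wins, podiums, driver_metrics, team_metrics]

# Map each task name to the set of task names it depends on.
graph = {f.__name__: set(inspect.signature(f).parameters) for f in tasks}

# Group tasks into levels; tasks in the same level have no dependencies
# on each other, so they could in principle run in parallel.
sorter = TopologicalSorter(graph)
sorter.prepare()
levels = []
while sorter.is_active():
    ready = sorted(sorter.get_ready())
    levels.append(ready)
    sorter.done(*ready)

for depth, names in enumerate(levels):
    print(depth, names)
```

Under these assumptions, the four source tables form the first level, `driver_metrics` the second, and `team_metrics` the last.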