https://github.com/korawica/dukdig
Friendly Airflow DAG generator for DE
- Host: GitHub
- URL: https://github.com/korawica/dukdig
- Owner: korawica
- License: mit
- Created: 2025-08-06T12:22:29.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2025-08-07T11:05:41.000Z (2 months ago)
- Last Synced: 2025-08-07T13:05:47.632Z (2 months ago)
- Topics: airflow-dags, yaml-template
- Language: Python
- Homepage:
- Size: 7.81 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# DukDig
DukDig is a friendly Airflow DAG generator for Data Engineers.
This template generates routing tasks from a declarative YAML config.
**File Structure**:
```text
dags/
├── { domain }/
│   ├── { module-dags }/
│   │   ├── __init__.py
│   │   ├── dag.yml
│   │   ├── variables.yml
│   │   └── assets/
│   │       ├── schema-mapping.json
│   │       └── transform-query.sql
│   │
│   └── { module-dags }/
│       ├── __init__.py
```

## Installation
| Airflow Version | Supported | Note |
|:---------------:|:---------:|------|
| `2.7.1` | :x: | |
| `3.x.x` | :x: | |

## Examples
This DAG generator engine needs you to define the `dag.yml` file and to set an
engine object that picks up the current path in the `__init__.py` file.

> [!NOTE]
> If you want dynamic environment config in the `dag.yml` file, you can use a
> `variables.yml` file to hold the dynamic values that are marked in the config
> template.
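For illustration, a minimal `variables.yml` could look like the sketch below. The README does not document its exact schema, so the flat key/value layout and the sample value are assumptions; `PROJECT_ID` is the variable referenced by the `dag.yml` example that follows:

```yaml
# Hypothetical variables.yml: the flat key/value layout is an assumption,
# not a documented schema. The engine would substitute these values wherever
# the config template marks them, e.g. {{ var("PROJECT_ID") }} in dag.yml.
PROJECT_ID: my-gcp-project
```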
```yaml
name: sales_dag
schedule: "@daily"
authors: ["de-team"]
tasks:
  - name: start
  - name: etl_sales_master
    upstream: start
    type: group
    tasks:
      - name: extract
        type: task
        op: python
        uses: libs.gcs.csv@1.1.0
        assets:
          - name: schema-mapping.json
            alias: schema
            convertor: basic
        params:
          path: gcs://{{ var("PROJECT_ID") }}/sales/master/date/{ exec_date:%y }
      - name: transform
        upstream: extract
        type: task
        op: docker
        uses: docker.rgt.co.th/image.transform:0.0.1
        assets:
          - name: transform-query.sql
            alias: transform
        params:
          path: gcs://{{ var("PROJECT_ID") }}/landing/master/date/{ exec_date:%y }
      - name: sink
        type: task
        op: python
        run: |
          import time
          time.sleep(5)
  - name: end
    upstream: etl_sales_master
```
The matching `__init__.py` file then creates the engine object with the current
path and generates the DAG:

```python
from dukdig import DokDag

doc = """
# SALES DAG

This DAG will extract data from Google Cloud Storage to Google BigQuery LakeHouse
via the DuckDB engine.
"""

dag = DokDag(
    "sales_dag", path=__file__, gb=globals()
)
dag.gen()
```
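To make the mapping concrete, the sketch below is hand-written Airflow 2.x code for roughly the routing that the `sales_dag` config above describes. This is illustrative only, not DukDig's actual output: the `EmptyOperator` tasks are placeholders for the real `uses:` targets, and `start_date` is an assumption that does not appear in the config.

```python
# Illustrative sketch only: a hand-written Airflow 2.x equivalent of the
# routing described by the dag.yml above. DukDig's generated DAG may differ.
from datetime import datetime

from airflow import DAG
from airflow.operators.empty import EmptyOperator
from airflow.operators.python import PythonOperator
from airflow.utils.task_group import TaskGroup


def sink_callable():
    # Matches the inline `run:` block of the `sink` task.
    import time
    time.sleep(5)


with DAG(
    dag_id="sales_dag",
    schedule="@daily",                # `schedule: "@daily"` in dag.yml
    start_date=datetime(2025, 1, 1),  # assumed; not part of the config above
    catchup=False,
) as dag:
    start = EmptyOperator(task_id="start")
    end = EmptyOperator(task_id="end")

    # `type: group` maps naturally onto an Airflow TaskGroup.
    with TaskGroup(group_id="etl_sales_master") as etl_sales_master:
        extract = EmptyOperator(task_id="extract")      # placeholder for libs.gcs.csv@1.1.0
        transform = EmptyOperator(task_id="transform")  # placeholder for the docker image
        sink = PythonOperator(task_id="sink", python_callable=sink_callable)
        extract >> transform  # from `upstream: extract` on the transform task

    # From `upstream: start` on the group and `upstream: etl_sales_master` on end.
    start >> etl_sales_master >> end
```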