https://github.com/stateful-y/kedro-dagster
Kedro plugin to support running pipelines on Dagster
dagster data-engineering data-science dataops kedro kedro-plugin machine-leaning-engineering machine-learning mlops orchestration
- Host: GitHub
- URL: https://github.com/stateful-y/kedro-dagster
- Owner: stateful-y
- License: apache-2.0
- Created: 2024-11-11T16:33:02.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2026-04-04T20:07:47.000Z (2 months ago)
- Last Synced: 2026-04-11T22:58:24.787Z (about 2 months ago)
- Topics: dagster, data-engineering, data-science, dataops, kedro, kedro-plugin, machine-leaning-engineering, machine-learning, mlops, orchestration
- Language: Python
- Homepage: https://kedro-dagster.readthedocs.io/
- Size: 4.37 MB
- Stars: 22
- Watchers: 2
- Forks: 1
- Open Issues: 7
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE.md
- Code of conduct: CODE_OF_CONDUCT.md
README
## What is Kedro-Dagster?
The Kedro-Dagster plugin integrates [Kedro](https://kedro.readthedocs.io/), a framework for creating reproducible and maintainable data science code, with [Dagster](https://dagster.io/), a data orchestrator for machine learning and data pipelines. The plugin uses Dagster's orchestration capabilities to automate and monitor Kedro pipelines.
Currently, Kedro-Dagster supports Kedro versions 0.19.x and 1.x, and Dagster versions 1.10.x, 1.11.x, and 1.12.x.
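To check an environment against the version ranges above, a small script can compare version strings with the listed release series. This is a minimal sketch, not part of the plugin: `in_supported_series` and `installed_version` are hypothetical helper names, and the prefixes are copied from the sentence above.

```python
from __future__ import annotations

from importlib.metadata import PackageNotFoundError, version

# Supported release series, copied from the compatibility statement above.
SUPPORTED_KEDRO = ("0.19.", "1.")
SUPPORTED_DAGSTER = ("1.10.", "1.11.", "1.12.")


def in_supported_series(ver: str, prefixes: tuple[str, ...]) -> bool:
    """Return True if `ver` belongs to one of the given release series.

    A trailing dot is appended so that a short version such as "1.12"
    still matches the "1.12." prefix, while "1.1.0" does not match "1.10.".
    """
    return (ver + ".").startswith(prefixes)


def installed_version(dist_name: str) -> str | None:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    for dist, prefixes in (("kedro", SUPPORTED_KEDRO), ("dagster", SUPPORTED_DAGSTER)):
        found = installed_version(dist)
        if found is None:
            print(f"{dist}: not installed")
        else:
            status = "supported" if in_supported_series(found, prefixes) else "outside the listed range"
            print(f"{dist} {found}: {status}")
```

The check is a plain prefix match on purpose, so it follows the wording of the compatibility statement and does not try to interpret pre-release or local version tags.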
## What are the features of Kedro-Dagster?
- **Configuration‑driven workflows:** Centralize orchestration settings in a `dagster.yml` file for each Kedro environment. Define jobs from filtered Kedro pipelines and assign executors and schedules to them.
- **Customization:** The core integration lives in the auto‑generated Dagster `definitions.py`. For advanced use cases, you can extend or override these definitions.
- **Kedro hooks preservation:** Kedro hooks are preserved and called at the appropriate time during pipeline execution, so custom logic (e.g., data validation, logging) continues to work seamlessly.
- **MLflow compatibility:** Use [Kedro-MLflow](https://github.com/Galileo-Galilei/kedro-mlflow) with Dagster’s [MLflow integration](https://dagster.io/integrations/dagster-mlflow) to track experiments, log models, and register artifacts.
- **Logger integration:** Unifies Kedro and Dagster logging so logs from Kedro nodes appear in the Dagster UI and are easy to trace and debug. Additionally, provides configuration to customize Dagster run loggers.
- **(Experimental) Dagster partition support:** Use Dagster's partitions to fan out Kedro nodes that act on partitioned data.
## How to install Kedro-Dagster?
Install the Kedro-Dagster plugin using `pip`:
```bash
pip install kedro-dagster
```
or using `uv`:
```bash
uv pip install kedro-dagster
```
or using `conda`:
```bash
conda install -c conda-forge kedro-dagster
```
or using `mamba`:
```bash
mamba install -c conda-forge kedro-dagster
```
Alternatively, add `kedro-dagster` to your `requirements.txt` or `pyproject.toml` file.
## How to get started with Kedro-Dagster?
1. **Initialize the plugin in your Kedro project**
Use the following command to generate a `definitions.py` file, where all translated Kedro objects are available as Dagster objects, and a `dagster.yml` configuration file:
```bash
# Replace <ENV_NAME> with the target Kedro environment, e.g. local
kedro dagster init --env <ENV_NAME>
```
2. **Configure jobs, executors, and schedules**
Define your job executors and schedules in the `dagster.yml` configuration file, located in the environment folder under your Kedro project's `conf/` directory (`conf/local/dagster.yml` in the example below). This file lets you filter Kedro pipelines and assign specific executors and schedules to them.
```yaml
# conf/local/dagster.yml
schedules:
  daily: # Schedule name
    cron_schedule: "0 0 * * *" # Schedule parameters
executors:
  sequential: # Executor name
    in_process: # Executor parameters
  multiprocess:
    multiprocess:
      max_concurrent: 2
jobs:
  default: # Job name
    pipeline: # Pipeline filter parameters
      pipeline_name: __default__
    executor: sequential
  parallel_data_processing:
    pipeline:
      pipeline_name: data_processing
      node_names:
        - preprocess_companies_node
        - preprocess_shuttles_node
    schedule: daily
    executor: multiprocess
  data_science:
    pipeline:
      pipeline_name: data_science
    schedule: daily
    executor: sequential
```
3. **Launch the Dagster UI**
Start the Dagster UI to monitor and manage your pipelines using the following command:
```bash
# Replace <ENV_NAME> with the target Kedro environment, e.g. local
kedro dagster dev --env <ENV_NAME>
```
The Dagster UI will be available at http://127.0.0.1:3000.
For a concrete use case, see the [Kedro-Dagster example repository](https://github.com/stateful-y/kedro-dagster-example).
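In the `dagster.yml` example from step 2, jobs refer to executors and schedules by name, so a typo in a name is easy to miss. The sketch below mirrors that example as a Python dict (a YAML loader would produce an equivalent structure) and lists any job whose `executor` or `schedule` is not defined. It is an illustration only, not the plugin's own validation: `find_dangling_references` and `split_cron` are hypothetical helpers, and the check assumes jobs only refer to executors and schedules by name, as in the example.

```python
from __future__ import annotations

# Python mirror of the conf/local/dagster.yml example from step 2.
CONFIG = {
    "schedules": {"daily": {"cron_schedule": "0 0 * * *"}},
    "executors": {
        "sequential": {"in_process": None},
        "multiprocess": {"multiprocess": {"max_concurrent": 2}},
    },
    "jobs": {
        "default": {
            "pipeline": {"pipeline_name": "__default__"},
            "executor": "sequential",
        },
        "parallel_data_processing": {
            "pipeline": {
                "pipeline_name": "data_processing",
                "node_names": ["preprocess_companies_node", "preprocess_shuttles_node"],
            },
            "schedule": "daily",
            "executor": "multiprocess",
        },
        "data_science": {
            "pipeline": {"pipeline_name": "data_science"},
            "schedule": "daily",
            "executor": "sequential",
        },
    },
}

CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def find_dangling_references(config: dict) -> list[str]:
    """List jobs whose `executor` or `schedule` name is not defined in the config."""
    problems = []
    for job_name, job in (config.get("jobs") or {}).items():
        for key, section in (("executor", "executors"), ("schedule", "schedules")):
            ref = job.get(key)
            if ref is not None and ref not in (config.get(section) or {}):
                problems.append(f"job {job_name!r}: unknown {key} {ref!r}")
    return problems


def split_cron(expression: str) -> dict[str, str]:
    """Label the five fields of a standard cron expression."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 cron fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))


print(find_dangling_references(CONFIG))
print(split_cron(CONFIG["schedules"]["daily"]["cron_schedule"]))
```

For the example configuration the first call returns an empty list, and the second shows that `"0 0 * * *"` sets minute 0 and hour 0 with wildcards elsewhere, meaning once a day at midnight.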
## How do I use Kedro-Dagster?
Full documentation is available at [https://kedro-dagster.readthedocs.io/](https://kedro-dagster.readthedocs.io/).
## Can I contribute?
We welcome contributions, feedback, and questions:
- **Report issues or request features:** [GitHub Issues](https://github.com/stateful-y/kedro-dagster/issues)
- **Join the discussion:** [GitHub Discussions](https://github.com/stateful-y/kedro-dagster/discussions)
- **Contributing Guide:** [CONTRIBUTING.md](https://github.com/stateful-y/kedro-dagster/blob/main/CONTRIBUTING.md)
If you are interested in becoming a maintainer or taking a more active role, please reach out to Guillaume Tauzin on [GitHub Discussions](https://github.com/stateful-y/kedro-dagster/discussions).
## Where can I learn more?
- Full documentation: [https://kedro-dagster.readthedocs.io/](https://kedro-dagster.readthedocs.io/)
- GitHub Discussions: [https://github.com/stateful-y/kedro-dagster/discussions](https://github.com/stateful-y/kedro-dagster/discussions)
For questions and discussions, you can also open a [discussion](https://github.com/stateful-y/kedro-dagster/discussions).
## License
This project is licensed under the terms of the [Apache-2.0 License](https://github.com/stateful-y/kedro-dagster/blob/main/LICENSE).
## Acknowledgements
This project is maintained by [stateful-y](https://stateful-y.io), an ML consultancy specializing in MLOps and data science & engineering. If you're interested in collaborating or learning more about our services, please visit our website.