https://github.com/factorpricingmodel/prefect-yaml
Schedule your tasks with YAML configuration files in Prefect perfectly
- Host: GitHub
- URL: https://github.com/factorpricingmodel/prefect-yaml
- Owner: factorpricingmodel
- License: mit
- Created: 2022-12-09T16:08:14.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2023-02-24T12:47:30.000Z (over 3 years ago)
- Last Synced: 2025-09-23T16:27:14.354Z (9 months ago)
- Topics: orchestration, prefect, scheduler, yaml-configuration
- Language: Python
- Homepage: https://prefect-yaml.readthedocs.io/en/latest/
- Size: 209 KB
- Stars: 3
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- Funding: .github/FUNDING.yml
- License: LICENSE
# Prefect YAML
A package to run Prefect with YAML configuration files. For further details, please refer
to the [documentation](https://prefect-yaml.readthedocs.io/en/latest/).
## Installation
Install this via pip (or your favourite package manager):
`pip install prefect-yaml`
## Usage
Run the `prefect-yaml` command with the path to a configuration
file.
For example, the following YAML configuration is located in [examples/simple_config.yaml](examples/simple_config.yaml).
```yaml
metadata:
  output:
    directory: .output
task:
  task_a:
    caller: math:fabs
    parameters:
      - -9.0
    output:
      format: json
  task_b:
    caller: math:sqrt
    parameters:
      - !data task_a
    output:
      directory: null
  task_c:
    caller: math:fsum
    parameters:
      - [!data task_b, 1]
```
Run the following command to write all the task outputs to the
directory `.output` in the current working directory.
```shell
prefect-yaml -c examples/simple_config.yaml
```
The output directory contains all the task outputs in the specified
format.
```shell
% tree .output
.output
├── task_a.json
└── task_c.pickle
0 directories, 2 files
```
The expected behavior is to:
1. run `task_a` to dump the value `fabs(-9.0)` to the output directory in JSON format,
2. run `task_b` to compute the value `sqrt(9.0)` from the output of `task_a`, and
3. run `task_c` to dump the value `fsum([3.0, 1.0])` to the output directory in pickle format.
Because the output directory of `task_b` is overridden to `null`, its output is not written to disk and is passed to `task_c` in memory. The output format of `task_c`
is not specified, so it is dumped in the default format (pickle).
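To make the data flow concrete, the three tasks can be reproduced by hand in plain Python. This is only a sketch of the equivalent computation, not how `prefect-yaml` is implemented: the function name `run_simple_pipeline` is hypothetical, and the exact way the package serialises its JSON and pickle files is an assumption.

```python
import json
import math
import pickle
import tempfile
from pathlib import Path


def run_simple_pipeline(output_dir: Path) -> float:
    """Mimic examples/simple_config.yaml by hand (not prefect-yaml's code)."""
    output_dir.mkdir(parents=True, exist_ok=True)

    # task_a: math.fabs(-9.0), written as JSON because `format: json` is set.
    task_a = math.fabs(-9.0)
    (output_dir / "task_a.json").write_text(json.dumps(task_a))

    # task_b: math.sqrt(task_a); `directory: null` means nothing is written.
    task_b = math.sqrt(task_a)

    # task_c: math.fsum([task_b, 1]), written in the default pickle format.
    task_c = math.fsum([task_b, 1])
    with open(output_dir / "task_c.pickle", "wb") as handle:
        pickle.dump(task_c, handle)
    return task_c


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / ".output"
    result = run_simple_pipeline(out)
    written = sorted(p.name for p in out.iterdir())
    print(result, written)
```

Only `task_a.json` and `task_c.pickle` appear on disk, which matches the `tree` listing above; the value of `task_b` exists only as an in-memory variable.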
For further details, please see the section [configuration](https://prefect-yaml.readthedocs.io/en/latest/configuration.html) in the documentation.
## Configuration
The `output` section defines how a task's return value is written and loaded. The `output` section under `metadata` applies to all tasks globally, while the `output` section of an individual `task`
overrides the global parameters.
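The override rule can be pictured as a layered dictionary merge. The sketch below is an illustration under assumptions, not the package's code: `resolve_output` and `DEFAULTS` are hypothetical names, and the only default assumed is the pickle format mentioned in this README.

```python
from typing import Any, Dict, Optional

# Assumed built-in default; this README states that pickle is the default format.
DEFAULTS: Dict[str, Any] = {"format": "pickle"}


def resolve_output(
    metadata_output: Optional[Dict[str, Any]],
    task_output: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Hypothetical helper: task-level keys win over metadata-level keys."""
    merged = dict(DEFAULTS)
    merged.update(metadata_output or {})  # global settings from `metadata`
    merged.update(task_output or {})      # per-task overrides
    return merged


# Settings taken from examples/simple_config.yaml.
metadata_output = {"directory": ".output"}
print(resolve_output(metadata_output, {"format": "json"}))   # task_a
print(resolve_output(metadata_output, {"directory": None}))  # task_b
print(resolve_output(metadata_output, None))                 # task_c
```

With this reading, `task_a` keeps the global directory but switches to JSON, `task_b` ends up with no directory at all, and `task_c` inherits everything from `metadata` plus the default format.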
For further details, please see the [documentation](https://prefect-yaml.readthedocs.io/en/latest/configuration.html#output) for parameter definitions
in each section.
## Output
The built-in output formats are pickle (the default) and JSON. Users
can also define their own output format.
For example, if you would like to use `pandas` to load and dump parquet files
with the pyarrow engine by default, you can define the configuration as below.
```yaml
metadata:
  format: parquet
  dump-caller: object.to_parquet
  dump-parameters:
    engine: pyarrow
  load-caller: pandas:read_parquet
  load-parameters:
    engine: pyarrow
```
All the output parameters, such as the directory, dumper and loader, can be overridden
at the task level. You can also specify which tasks export to the output
directory, while the others are only passed to downstream tasks in memory.
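One way to read caller strings such as `math:fabs`, `pandas:read_parquet` and `object.to_parquet` is sketched below. The conventions are inferred from the examples in this README and are assumptions: `module:function` is taken to name a function in an importable module, and `object.<method>` is taken to name a method on the task's return value. The helper `resolve_caller` is hypothetical and is not part of the `prefect-yaml` API.

```python
import importlib
from typing import Any, Callable


def resolve_caller(spec: str, obj: Any = None) -> Callable[..., Any]:
    """Turn a caller string into a callable (hypothetical helper).

    Assumed conventions, inferred from the examples in this README:
      * "module:function" -> import `module` and look up `function`
      * "object.method"   -> bound method `method` of the given object
    """
    if spec.startswith("object."):
        if obj is None:
            raise ValueError("an object is required for 'object.<method>' callers")
        return getattr(obj, spec[len("object."):])
    module_name, _, attr = spec.partition(":")
    if not attr:
        raise ValueError(f"expected 'module:function', got {spec!r}")
    return getattr(importlib.import_module(module_name), attr)


# A module-level function, as used by `caller: math:fabs`.
print(resolve_caller("math:fabs")(-9.0))

# A method on an object; a plain string stands in for a task's return value.
print(resolve_caller("object.upper", "abc")())
```

Under these assumptions, the parquet configuration above would roughly correspond to calling `DataFrame.to_parquet(path, engine="pyarrow")` when dumping and `pandas.read_parquet(path, engine="pyarrow")` when loading. How the package passes the file path to these callers is not stated here.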
For further details, please see the [output](https://prefect-yaml.readthedocs.io/en/latest/output.html) section in the documentation.
## Roadmap
The project is still under development. The basic features are
mostly available, while the following features are coming soon:
- Multi-cloud storage support
- Subtask support in each task
## Contributing
Contributions of all levels are welcome. Please refer to the [contributing](https://prefect-yaml.readthedocs.io/en/latest/contributing.html)
section for development and release guidelines.