{"id":15685646,"url":"https://github.com/cdeil/prefect-tutorial","last_synced_at":"2026-02-28T00:51:36.557Z","repository":{"id":142038890,"uuid":"443525915","full_name":"cdeil/prefect-tutorial","owner":"cdeil","description":"Python Prefect tutorial","archived":false,"fork":false,"pushed_at":"2022-01-11T21:06:58.000Z","size":21,"stargazers_count":9,"open_issues_count":0,"forks_count":1,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-01-04T00:51:34.045Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cdeil.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-01-01T11:34:19.000Z","updated_at":"2023-09-17T03:22:30.000Z","dependencies_parsed_at":null,"dependency_job_id":"eac37fce-2b5c-462f-a90c-be07120f1271","html_url":"https://github.com/cdeil/prefect-tutorial","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cdeil%2Fprefect-tutorial","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cdeil%2Fprefect-tutorial/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cdeil%2Fprefect-tutorial/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cdeil%2Fprefect-tutorial/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cdeil","download_url":"https://codeload.github.com/cdeil/prefect-tutorial/tar.gz/refs/heads
/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240252799,"owners_count":19772173,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-03T17:28:13.437Z","updated_at":"2026-02-28T00:51:31.527Z","avatar_url":"https://github.com/cdeil.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Python Prefect tutorial\n\nA Python [Prefect](https://www.prefect.io/) workflow orchestration tutorial.\n\nIt is at a beginner to intermediate level. There are many things in Prefect that we only cover briefly (e.g.\nschedules, logging, runners, server, ui, cloud, dask) or not at all (e.g. distributed computation). It should take\nroughly an hour to read through the content, or two hours if you want to follow along and hack a bit with the examples.\n\nWritten by Christoph Deil in January 2022.\n\nFeedback and contributions are welcome any time via Github issues or pull request.\n\n## Agenda\n\nThis tutorial will use Prefect 0.15 and we will cover the following topics:\n\n1. [Tutorial setup](#tutorial-setup)\n2. [What is Prefect?](#what-is-prefect)\n3. [Prefect examples](#prefect-examples)\n4. [Prefect from scratch](#prefect-from-scratch)\n5. [Prefect orchestration](#prefect-orchestration)\n6. [Prefect Orion](#prefect-orion) \n7. [Further resources](#further-resources)\n\n## Tutorial setup\n\nThis tutorial uses Python 3.9 and Prefect 0.15. Other versions might or might not work the same. 
\n\nIf you'd like to follow along with the tutorial, one way to set up the environment is this:\n\n```\nconda create -n prefect-tutorial python=3.9 anaconda\nconda activate prefect-tutorial\npip install \"prefect[viz]==0.15\"\n```\n\nTo check that the installation succeeded, and to see which `python` and `prefect` you're using and their versions:\n```\n% which python\n% python --version\n% which prefect\n% prefect version\n```\n\nTo execute the examples from this tutorial:\n```\ngit clone https://github.com/cdeil/prefect-tutorial.git\ncd prefect-tutorial\n```\n\nTo read the code or tests or examples in the Prefect repository:\n```\ngit clone https://github.com/PrefectHQ/prefect.git\ncd prefect\n```\n\n## What is Prefect?\n\n* Prefect is a Python workflow orchestration system. That's very vague. This tutorial should give you a better understanding of what Prefect actually does.\n* Tag-line from https://www.prefect.io/: \"Orchestrate the\nmodern data stack. The easiest way to build, run, and monitor data pipelines at scale.\"\n* [Prefect](https://en.wikipedia.org/wiki/Prefect) is a title similar to manager. There's also [Ford Prefect](https://en.wikipedia.org/wiki/Ford_Prefect_(character)) from The Hitchhiker's Guide to the Galaxy, which seems to be a favourite book of the Prefect team. For example, Marvin the Paranoid Android appears (see [blog post](https://medium.com/the-prefect-blog/prefect-runs-on-prefect-3e6df553c3a4)).\n* Prefect core - open source Python package, `prefect` CLI, UI, server\n  * [prefect](https://github.com/PrefectHQ/prefect) - Python package. Currently version 0.15, release 1.0 coming soon\n  * [server](https://github.com/PrefectHQ/server) - Prefect API and backend\n  * [ui](https://github.com/PrefectHQ/ui) - Web user interface\n* Prefect cloud - cloud SaaS with hybrid execution model\n  * Hybrid execution model: no code or data sent to Prefect cloud, only task and flow metadata (see e.g. 
[blog post](https://medium.com/the-prefect-blog/the-prefect-hybrid-model-1b70c7fd296)) \n  * SaaS for server, UI, API, scheduler\n  * Does NOT offer compute and agents - have to buy (e.g. Coiled or AKS) or self-host separately \n  * Various enterprise features, security, access control, ...\n  * Not sure, but I think self-hosting the server is difficult because the open-source version doesn't offer any auth?\n  * See https://www.prefect.io/pricing/ - pay per successful task run (20k/month free runs, then \u003c $0.005, misc discounts)\n* Prefect Orion - Prefect 2.0 in tech preview now, with a planned release in early 2022 (see section below)\n* Prefect can use [Dask](https://dask.org/) to scale compute up or out when needed\n* Development started in 2017 by Jeremiah Lowin, open-sourced in 2019 (see [blog post](https://medium.com/the-prefect-blog/open-sourcing-the-prefect-platform-d19a6d6f6dad))\n* Developed by start-up Prefect Technologies Inc (mostly in Washington DC, now fully remote) that got $11M series A and $32M series B funding in 2021 (see [blog post](https://www.prefect.io/blog/escape-velocity)), with community contributions on GitHub.\n* Prefect [license](https://github.com/PrefectHQ/prefect/blob/master/LICENSE) is Apache 2 (fully open source) or in parts (server, UI, Orion) [Prefect community license](https://www.prefect.io/legal/prefect-community-license/) which is not an [OSI](https://opensource.org/) approved open source license, but for most users places no significant restrictions - it basically forbids anyone other than Prefect Technologies Inc (so, for example, AWS or Azure) from offering a competing cloud service built on Prefect (see e.g. [here](https://medium.com/the-prefect-blog/open-sourcing-the-prefect-platform-d19a6d6f6dad))    \n* Very active Slack (10k members, response typically within the hour)\n* Docs are pretty good. Some parts (e.g. 
[About Prefect - Why not airflow?](https://docs.prefect.io/core/about_prefect/why-not-airflow.html)) very wordy and long read but still not explaining fully technically how Prefect works.\n* Code is pure Python and pretty good (see [Github](https://github.com/PrefectHQ/prefect)). Sometimes time is better spent reading the code than the docs, e.g. to figure out how schedulers trigger flow runs. But also easy to get lost in levels of indirection, e.g. stepping through Prefect code execution in debugger even for very simple scripts it's very difficult to follow what's going on.\n* Tests are pretty good (see e.g. [tests/core/test_flow.py](https://github.com/PrefectHQ/prefect/blob/master/tests/core/test_flow.py)). Reading tests is also a good way to learn Prefect in-depth.\n\n## Prefect examples\n\nThe Prefect core API is documented here: https://docs.prefect.io/core/\n\nLet's learn by running a few [examples](examples).\n\nMost important parts of the API that we'll learn: `prefect.Flow`, `prefect.task` decorator, `prefect.Task`,\n`prefect.Parameter`, `prefect.schedules.IntervalSchedule`.\n\nSome more rarely used parts of the API (in our codebase) that we'll also look at: `Task.map`, `prefect.case`,\n`prefect.unmapped`, `prefect.tasks.control_flow.conditional.ifelse`, `prefect.tasks.core.operators.GetItem`,\n`prefect.tasks.control_flow.merge`, `prefect.context`.\n\n## Prefect from scratch\n\nNotImplementedError: please skip this section for now, since it's very much WIP and not in a useful state yet.\n\nWould you like to understand fully how Prefect works?\n\nLet's re-implement a minimal simplified version of Prefect from scratch to see how e.g. `task` and `Flow` work.\nFor sure it is very far from complete, only a very minimal core subset of Prefect is re-implemented. This is a learning\nexercise, in any real project you should use Prefect directly, not `briefect`. Also it might not be fully correct,\ni.e. 
it could be that Prefect actually does things differently internally.\n\nSee the [briefect](briefect) Python package and try it out via [examples/example_briefect.py](examples/example_briefect.py):\n```\npython examples/example_briefect.py\n```\n\nNotes:\n* TBD: explain implementation a bit\n\n## Prefect orchestration\n\nSo far we have been running Prefect via `flow.run()` on a single machine.\nCompared to only using Python directly, this makes a few things easier, mostly scheduling and error handling. \n\nHowever, Prefect was designed from the start as a workflow automation system consisting of multiple\ncomponents (API server, database, user interface, agents) that together allow managing tasks and flows\nin a well-thought-out way, possibly at large scale with Dask or other executors.\n\nSee https://docs.prefect.io/orchestration/ and the [architecture\noverview](https://docs.prefect.io/orchestration/#architecture-overview)\n\nWe won't go in depth on this part of Prefect because (a) we don't use it yet and (b) it will significantly\nchange soon with Prefect Orion (see below) where the server and UI are packaged with the Python API.\n\nBut let's try out a simple example at https://cloud.prefect.io/ to see what it's about.\nThis is basically following https://cloud.prefect.io/tutorial.\n\n* Go to https://cloud.prefect.io/ and make an API key\n* Change `example_01_etl.py` to call `flow.register(project_name=\"spam\")` instead of `flow.run()`\n\n```\nprefect auth login --key \u003ctoken\u003e\nprefect create project spam\npython examples/example_01_etl.py\nprefect agent local start\n```\n* Go to https://cloud.prefect.io/ again and \"Quick Run\" the flow\n\nThe idea is that the Prefect server and UI provide an orchestration, scheduling and monitoring solution.\nData engineers could use it directly, but it could also be set up for business people, who can then run pre-defined\nworkflows, with some customisation e.g. 
changing parameters or schedules.\n\n## Prefect Orion\n\nPrefect Orion (see [Why \"Orion\"?](https://orion-docs.prefect.io/faq/#why-orion)) is the name for the new Prefect 2.0 that is in tech preview now, with a planned release in early 2022.\n\nIf you'd like to learn more, read the [intro blog post](https://www.prefect.io/blog/announcing-prefect-orion/)\nand check out the [website](https://www.prefect.io/orion/) and [docs](https://orion-docs.prefect.io/).\n\nPrefect Orion will have a few significant changes, most prominently:\n\n1. Ding, dong the DAG is dead, the wicked DAG is dead (see [song](https://youtu.be/kPIdRJlzERo))\n2. Prefect embraces Python type hints as part of the API (see [here](https://orion-docs.prefect.io/concepts/flows/))\n3. The Prefect server and UI are now included in the Python package (see [docs](https://orion-docs.prefect.io/tutorials/orion/))\n\nWell, the DAG isn't really dead. The `with Flow() as flow` block is dead; there's now a `@flow`\ndecorator. Flows are functions that call tasks at runtime. So no more in-advance flow definition that creates a DAG.\nTasks are executed at runtime, and the DAG is formed dynamically and is only available afterwards, for post-run\nvisualisation and inspection. This allows a free mix of normal Python code (if, for, while, whatever) and Prefect tasks.\n\nType hints are used via [Pydantic](https://pydantic-docs.helpmanual.io/) under the hood and partly surface in the\nAPI. 
The Prefect CLI is now built with [Typer](https://typer.tiangolo.com/), Prefect Core CLI was built with\n[click](https://click.palletsprojects.com/).\n\nThe source code is available in the branch [orion on github](https://github.com/PrefectHQ/prefect/tree/orion)\nand (at the time of writing) pre-release `2.0a6` is up on [pypi](https://pypi.org/project/prefect/#history).\n\nLet's try out one quick example ([examples/example_orion.py](examples/example_orion.py)):\n\n```\npip install -U \"prefect\u003e=2.0.0a\"\nprefect version\npython examples/example_orion.py\n```\n\nThe Prefect server and UI are now included, so you can run this locally:\n```\nprefect orion start\nopen http://localhost:4200\n```\n\nBy default Prefect will now use a local SQLite database (see [here](https://orion-docs.prefect.io/tutorials/orion/#the-database))\nwhich you can inspect to learn about the underlying data model and see what information it collects:\n\n```\n% sqlite3 ~/.prefect/orion.db \nSQLite version 3.37.0 2021-11-27 14:13:22\nEnter \".help\" for usage hints.\nsqlite\u003e .tables\ndeployment            flow_run_state        task_run_state      \nflow                  saved_search          task_run_state_cache\nflow_run              task_run            \nsqlite\u003e select * from flow;\n75775846-911d-40ee-9909-d0767d543d00|2022-01-04 20:19:07.789835|2022-01-04 20:19:07.789852|Github Stars|[]\nsqlite\u003e select * from flow_run;\n044c3280-827b-4fe9-ab91-15474a8d8526|2022-01-04 20:19:07.827114|2022-01-04 20:19:08.629000|chubby-earthworm|COMPLETED|1|2022-01-04 20:19:07.817840||2022-01-04 20:19:07.843549|2022-01-04 20:19:08.564030|1970-01-01 00:00:00.720481|891142a92d8eada2da7a118a36036e30|{\"repos\": [\"PrefectHQ/Prefect\", \"PrefectHQ/miter-design\"]}||{}|{}|{}|[]|0|75775846-911d-40ee-9909-d0767d543d00|||4ce19c39-7177-40fb-b2c0-ff7967f9bb8b\nsqlite\u003e select * from task_run;\na62b6e6c-86cb-4805-9614-d33b8fc9dd73|2022-01-04 20:19:07.872005|2022-01-04 
20:19:08.230000|get_stars-e40861f0-0|COMPLETED|1|2022-01-04 20:19:07.866437||2022-01-04 20:19:07.898846|2022-01-04 20:19:08.210336|1970-01-01 00:00:00.311490|e40861f01e841d03bf533259cd352bf6|0||||{\"max_retries\": 3, \"retry_delay_seconds\": 0.0}|{\"repo\": []}|[]|044c3280-827b-4fe9-ab91-15474a8d8526|f18fe717-179c-49eb-b9c8-f4f1378b0d43\n2bf057e6-667a-470c-a678-addc47cf4de3|2022-01-04 20:19:08.243963|2022-01-04 20:19:08.553000|get_stars-e40861f0-1|COMPLETED|1|2022-01-04 20:19:08.237511||2022-01-04 20:19:08.266165|2022-01-04 20:19:08.533337|1970-01-01 00:00:00.267172|e40861f01e841d03bf533259cd352bf6|1||||{\"max_retries\": 3, \"retry_delay_seconds\": 0.0}|{\"repo\": []}|[]|044c3280-827b-4fe9-ab91-15474a8d8526|ffb31e39-c9dc-400b-bbf5-7ea71d39d547\nsqlite\u003e .q\n```\n\nThat's it, we're out of time.\n\nIf you like to learn more:\n* [Prefect Orion docs](https://orion-docs.prefect.io/)\n* [Prefect Core docs](https://docs.prefect.io/)\n\nIf you'd like to go back to Prefect core with your installation:\n```\npip install \"prefect[viz]==0.15\"\nprefect version\n```\n\n## Further resources\n\nLink collection mostly of blog posts on Prefect. 
Probably not really useful for you, it's rather a random link\ncollection, mostly things I've read or still plan to read and wanted to note down.\n\n* https://www.prefect.io/blog/you-no-longer-need-two-separate-systems-for-batch-processing-and-streaming/\n* https://www.prefect.io/blog/prefect-zero-to-hero/\n* https://www.datarevenue.com/en-blog/what-we-are-loving-about-prefect\n* https://medium.com/the-prefect-blog/orchestrating-elt-with-prefect-and-dbt-a-flow-of-flows-part-1-aac77126473\n* https://towardsdatascience.com/orchestrate-a-data-science-project-in-python-with-prefect-e69c61a49074#67c1-8f85fb1cfe73\n* https://makeitnew.io/prefect-a-modern-python-native-data-workflow-engine-7ece02ceb396\n* https://rdrn.me/scaling-out-prefect/\n* https://airbyte.io/recipes/elt-pipeline-prefect-airbyte-dbt\n* https://airbyte.com/blog/announcing-prefect-integration-with-airbyte-to-automate-elt-pipelines\n* https://www.theoaklandgroup.co.uk/prefect-should-you-utilise-the-next-generation-of-data-pipelining-software/\n* https://docs.microsoft.com/en-us/events/build-may-2021/startups/breakouts/od549/\n* https://youtu.be/gbzL5TIFZZY Data Science DC Aug 2021 Meetup: Machine Learning Workflow Orchestration with Prefect","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcdeil%2Fprefect-tutorial","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcdeil%2Fprefect-tutorial","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcdeil%2Fprefect-tutorial/lists"}