awesome-apache-airflow
Curated list of resources about Apache Airflow
https://github.com/jghoman/awesome-apache-airflow
Best practices, lessons learned and cool use cases
- Apache Airflow at Pandora - [Ace Haidrey](https://www.linkedin.com/in/acehaidrey/) discusses why Pandora chose Airflow and provides a detailed breakdown of their deployment and the infrastructure behind it.
- Running Apache Airflow At Lyft - An overview of how Lyft operates Apache Airflow in production (monitoring, customization, etc.).
- Securing Apache Airflow UI WITH DAG Level Access - Blog post about Airflow DAG level access and how Lyft uses it.
- Building a Production-Level ETL Pipeline Platform Using Apache Airflow - This post describes how the system management team at Cerner uses Airflow.
- Airflow Dag Python Package Management - Managing Python package dependencies across 100+ DAGs can become painful. It's hard to keep track of which packages are used by which DAG, and hard to clean up during DAG removal or upgrade. Learn how KubernetesPodOperator and DockerOperator can fix this.
- What we learned migrating off Cron to Airflow - [Katie Macias](https://medium.com/@katiemacias) describes the [VideoAmp](https://www.videoamp.com/) Data Engineering team's journey from cron to Airflow.
- Testing in Airflow Part 2 - [Chandu Kavar](https://twitter.com/chandukavar) and [Sarang Shinde](https://www.linkedin.com/in/sarang-shinde-219a4873/) explain integration tests and end-to-end pipeline tests.
- We're all using Airflow wrong and how to fix it - [Jessica Laughlin](https://www.jldlaughlin.com/) of [Bluecore](https://www.bluecore.com/) shares three engineering problems associated with the Airflow design and how to solve them by using the [KubernetesPodOperator](https://github.com/apache/airflow/blob/v1-10-stable/airflow/contrib/operators/kubernetes_pod_operator.py) in two design patterns.
- Lessons learnt while Airflow-ing - [Nehil Jain](https://twitter.com/nehiljain) has written a two-part series that covers the value of workflow schedulers, some best practices and pitfalls he found while working with Airflow. The [second article](https://medium.com/snaptravel/airflow-part-2-lessons-learned-793fa3c0841e) in particular includes many production tips.
- Why Robinhood uses Airflow - [Vineet Goel](https://twitter.com/vineetik) walks through why financial trading platform [Robinhood](https://robinhood.com/) picked Airflow over alternative work schedulers.
- Under the Hood: Building AIR at Qubole - [Sreenath Kamath](https://www.linkedin.com/in/sreenath-kamath-66a1b970/) and [Rajat Venkatesh](https://twitter.com/vrajat) write about building [Qubole](https://www.qubole.com/)'s [data discovery, insights and recommendations platform](https://www.qubole.com/blog/building-qdsair-infrastructure/) atop Airflow.
- Airflow: Why is nothing working? - TL;DR Airflow’s SubDagOperator causes deadlocks - Deep dive into troubleshooting a troublesome Airflow DAG with good tips on how to diagnose problems.
- Apache Airflow as an External scheduler for distributed systems - [Arunkumar](https://medium.com/@rako) suggests using Airflow as a simple external scheduler for a distributed system.
- How Sift Trains Thousands of Models using Apache Airflow - Summary of [Sift Science](https://siftscience.com/)'s deployment strategy for its machine learning model pipelines.
- Airflow Dag Management & Versioning - Efficiently manage the DAG release process by using Git submodules.
- Upgrading & Scaling Airflow at Robinhood - [Abishek Ray](https://www.linkedin.com/in/abhishek-ray-29210145/) describes how Robinhood tackled upgrading its production Airflow while minimizing downtime.
- Getting started with Data Lineage - [Germain Tanguy](https://www.linkedin.com/in/germain-tanguy/) of [Dailymotion](https://www.dailymotion.com/) shares a data lineage prototype integrated with Apache Airflow.
- Collaboration between data engineers, data analysts and data scientists - [Germain Tanguy](https://www.linkedin.com/in/germain-tanguy/) of [Dailymotion](https://www.dailymotion.com/) shares how collaborating through Apache Airflow helps teams release to production efficiently.
- Airflow: Lesser Known Tips, Tricks, and Best Practises - [Kaxil Naik](https://www.linkedin.com/in/kaxil/) explains lesser-known yet very useful tips and best practices for using Airflow.
- Testing in Airflow Part 1 - [Chandu Kavar](https://twitter.com/chandukavar) explains different categories of tests in Airflow, including DAG validation tests, DAG definition tests, and unit tests.
- Improving Airflow UI Security - WePay's [Joy Gao](https://twitter.com/joygao) breaks down the need for Role Based Access Controls (RBAC) and how she introduced it to Airflow.
- How to Create a Workflow in Apache Airflow to Track Disease Outbreaks in India - [Vinayak Mehta](https://twitter.com/vortex_ape) details how [SocialCops](https://socialcops.com/) uses Airflow to scrape India's Ministry of Health and Family Welfare to generate derived data on possible disease outbreaks.
- Airflow Lessons from the Data Engineering Front in Chicago - [Alison Stanton](https://twitter.com/alison985) provides a list of tips to avoid gotchas in Airflow jobs.
- Data quality checkers - [Antoine Augusti](https://twitter.com/AntoineAugusti) describes the framework [drivy](https://www.drivy.co.uk/) has built atop Airflow to test their datasets for completeness, consistency, timeliness, uniqueness, validity and accuracy.
- Building WePay's data warehouse using BigQuery and Airflow - The inestimable [Chris Riccomini](https://twitter.com/criccomini) describes how [WePay](https://go.wepay.com/), one of the first adopters of Airflow, integrated it into their [Google Cloud Compute](https://cloud.google.com/compute/) environment.
- Using Apache Airflow to Create Data Infrastructure in the Public Sector - Despite an unfortunately heavy sales-pitch tone, this blog post describes how [ARGO Labs](http://www.argolabs.org/), a non-profit data organization, uses Airflow for ETL of public sector data.
- How to aggregate data for BigQuery using Apache Airflow - Example of how to use Airflow with Google BigQuery to power a Data Studio dashboard.
- Deploying Apache Airflow in Azure to build and run data pipelines - Covers running Airflow on Azure.
- Automating data export from CrateDB to S3 with Apache Airflow
- Implementation of Data Retention Policy with CrateDB and Apache Airflow
- Ingesting NYC Taxi Data From S3 Into CrateDB - Describes how to build a database ingestion pipeline in Airflow by loading CSV files from S3 into [CrateDB](https://crate.io/).
- Data’s Inferno: 7 Circles of Data Testing Hell with Airflow - The Wholesale Banking Advanced Analytics team at ING details how they torture test their Airflow DAGs before deployment.
- The Zen of Python and Apache Airflow - Blog post about how the Zen of Python can be applied to Airflow code.
- Upgrading Airflow with Zero Downtime - A detailed article on how to deploy Airflow with zero downtime.
- Breaking up the Airflow DAG monorepo - This post describes how to support managing Airflow DAGs from multiple git repos through S3.
- Improving Performance of Apache Airflow Scheduler - A story of an adventure that allowed [Databand](https://databand.ai/) to speed up DAG parsing time 10 times.
- How SSENSE is using Apache Airflow to do Data Lineage on AWS - Exploring the fundamental themes of architecting and governing a data lake on AWS using Apache Airflow.
- Complex tasks orchestration at Hurb with Apache Airflow - This post shows how [Hurb](https://hurb.com) uses Apache Airflow to orchestrate complex tasks and how it leverages dynamic DAG creation to improve development speed.
- boundary-layer: Declarative Airflow Workflows - [Kevin McHale](https://www.linkedin.com/in/mchalek) introduces the open source project boundary-layer, which generates Airflow DAGs from declarative workflows.
- Data Testing with Airflow repository
- ETL with airflow - ETL core principles and several end-to-end docker-based examples including Kimball, Data Vault on Hive and some simpler examples.
- Bare minimal Airflow on Kubernetes (Local, EKS, AKS) - An article on deploying Airflow on local Kubernetes, AWS EKS and Azure AKS with bare minimal setup.
- Using Apache Airflow’s Docker Operator with Amazon’s Container Repository - [Brian Campbell](https://www.linkedin.com/in/bvcampbell3) of [Lucid](https://www.lucidchart.com/) has tips for integrating AWS's [ECR](https://aws.amazon.com/ecr/) service with Airflow's DockerOperator.
- Airflow, Meta Data Engineering, and a Data Platform for the World’s Largest Democracy - [Vinayak Mehta](https://twitter.com/vortex_ape) talks about identifying data engineering patterns (meta data engineering) to automate DAG generation and how that helped [SocialCops](https://socialcops.com/) to power DISHA, a national data platform where Indian MPs and MLAs monitor the progress of 42 national level schemes.
- Monitoring Airflow with Prometheus, StatsD and Grafana - A guide by [Databand](http://databand.ai) on how to set up operational dashboards for a production cluster and get high-level visibility into Airflow.
- Securing Apache Airflow UI WITH DAG Level Access - Blog post about Airflow DAG level access and how Lyft uses it.
- Building a Production-Level ETL Pipeline Platform Using Apache Airflow - This post describes how the system management team at Cerner uses Airflow.
- Apache Airflow at Pandora - [Ace Haidrey](https://www.linkedin.com/in/acehaidrey/) discusses why Pandora chose Airflow and provides a detailed breakdown of their deployment and the infrastructure behind it.
- Running Apache Airflow At Lyft - This provides an overview on how Lyft operates Apache Airflow in production(monitoring, customization, etc).
- Securing Apache Airflow UI WITH DAG Level Access - Blog post about Airflow DAG level access and how Lyft uses it.
- Building a Production-Level ETL Pipeline Platform Using Apache Airflow - This post describes how the system management team at Cerner uses Airflow.
- Apache Airflow at Pandora - [Ace Haidrey](https://www.linkedin.com/in/acehaidrey/) discusses why Pandora chose Airflow and provides a detailed breakdown of their deployment and the infrastructure behind it.
- Running Apache Airflow At Lyft - This provides an overview on how Lyft operates Apache Airflow in production(monitoring, customization, etc).
- Securing Apache Airflow UI WITH DAG Level Access - Blog post about Airflow DAG level access and how Lyft uses it.
- Building a Production-Level ETL Pipeline Platform Using Apache Airflow - This post describes how the system management team at Cerner uses Airflow.
- Apache Airflow at Pandora - [Ace Haidrey](https://www.linkedin.com/in/acehaidrey/) discusses why Pandora chose Airflow and provides a detailed breakdown of their deployment and the infrastructure behind it.
- Running Apache Airflow At Lyft - This provides an overview on how Lyft operates Apache Airflow in production(monitoring, customization, etc).
- Securing Apache Airflow UI WITH DAG Level Access - Blog post about Airflow DAG level access and how Lyft uses it.
- Building a Production-Level ETL Pipeline Platform Using Apache Airflow - This post describes how the system management team at Cerner uses Airflow.
- Apache Airflow at Pandora - [Ace Haidrey](https://www.linkedin.com/in/acehaidrey/) discusses why Pandora chose Airflow and provides a detailed breakdown of their deployment and the infrastructure behind it.
- Running Apache Airflow At Lyft - This provides an overview on how Lyft operates Apache Airflow in production(monitoring, customization, etc).
- Securing Apache Airflow UI WITH DAG Level Access - Blog post about Airflow DAG level access and how Lyft uses it.
- Building a Production-Level ETL Pipeline Platform Using Apache Airflow - This post describes how the system management team at Cerner uses Airflow.
- Apache Airflow at Pandora - [Ace Haidrey](https://www.linkedin.com/in/acehaidrey/) discusses why Pandora chose Airflow and provides a detailed breakdown of their deployment and the infrastructure behind it.
- Running Apache Airflow At Lyft - This provides an overview on how Lyft operates Apache Airflow in production(monitoring, customization, etc).
- Securing Apache Airflow UI WITH DAG Level Access - Blog post about Airflow DAG level access and how Lyft uses it.
- Building a Production-Level ETL Pipeline Platform Using Apache Airflow - This post describes how the system management team at Cerner uses Airflow.
- Under the Hood: Building AIR at Qubole - [Sreenath Kamath](https://www.linkedin.com/in/sreenath-kamath-66a1b970/) and [Rajat Venkatesh](https://twitter.com/vrajat) write about building [Qubole](https://www.qubole.com/)'s [data discovery, insights and recommendations platform](https://www.qubole.com/blog/building-qdsair-infrastructure/) atop Airflow.
- Productionizing ML with workflows at Twitter - In-depth post on why and how Twitter uses Airflow for ML workflows, including custom operators and a custom UI embedded in the Airflow web interface.
- How to Best Use DuckDB with Apache Airflow - Tips on integrating [DuckDB](https://duckdb.org/) into Airflow jobs.
-
Introductions and tutorials
- How to develop data pipeline in Airflow through TDD (test-driven development) - Learn how to build a sales data pipeline step by step using TDD and, at the end, how to configure a simple CI workflow using GitHub Actions.
- Apache Airflow Monitoring Metrics - A two-part series by [maxcotec](https://maxcotec.com) on using the existing Airflow StatsD metrics to monitor your Airflow deployment on a Grafana dashboard via Prometheus. Also covers how to create custom metrics.
- Introduction to Airflow - A web tutorial series by [maxcotec](https://maxcotec.com) for beginners and intermediate users of Apache Airflow.
- Start Building Better Data Pipelines With apache Airflow - Oct - Naman Gupta covers the basics of Airflow and its concepts.
- Remote spark-submit to YARN running on EMR - [Azhaguselvan](https://github.com/tamizhgeek) walks through submitting Spark jobs to existing EMR clusters with Airflow.
- Dustin Stansbury - A four-part series that covers what workflow managers do in general, how Quizlet picked Airflow, a tour of Airflow's key concepts, and how Quizlet is now using Airflow in practice:
- Beyond CRON: an introduction to Workflow Management Systems
- Why Quizlet chose Apache Airflow for executing data workflows
- How Quizlet uses Apache Airflow in practice
- Integrating Apache Airflow with Databricks - While this tutorial is focused specifically on Databricks' Spark solutions, it does have a reasonable overview of Airflow basics and demonstrates how a third party solution can quickly integrate into Airflow.
- Apache Airflow 2.0 Tutorial - This article covers the basic concepts behind Airflow and the problems it solves.
- Testing and debugging Apache Airflow - Article explaining how to apply unit testing, mocking and debugging to Airflow code.
- Get started developing workflows with Apache Airflow - This brief introductory tutorial covers how to create a data pipeline and processing workflow using DAGs, operators, and sensors, and how to use XComs to communicate between operators.
- Get started with Airflow + Google Cloud Platform + Docker - Step-by-step introduction by [Jayce Jiang](https://medium.com/@junjiejiang94).
- ETL with Apache Airflow for Data Analysis on Transaction Data - Covers a practical case of running an ETL process with Apache Airflow on a dummy ecommerce store's transactional, user and product data. The data is served via a Flask API.
- Airflow Repository Template - A boilerplate repository for developing locally with Airflow, with linting & tests for valid DAGs and plugins. Just clone and run `make start-airflow` to get started! Add some CI jobs to deploy your code and you're done.
- Understanding Apache Airflow’s key concepts
- Running Airflow on top of Apache Mesos - [Mesos, Airflow & Docker](http://agrajmangal.in/blog/big-data/mesos-airflow-docker/) by [Agraj Mangal](https://twitter.com/agrajm) is a quick overview of running Airflow atop Apache Mesos.
-
Vital links
-
Airflow deployment solutions
- airflow-pipeline - Airflow Docker container that comes preconfigured for Spark and Hadoop. It can be docker pulled at ``datagovsg/airflow-pipeline``.
- Three ways to run Airflow on Kubernetes - [Tim van de Keer](https://www.linkedin.com/in/tim-van-de-keer-bb5a1966) walks through several methods for deploying Airflow on Kubernetes.
- Apache Airflow Multi-Tier Free Deployment on Azure - A free Azure Resource Manager (ARM) template by Bitnami providing a one-click solution for Airflow deployment on Azure for production use-cases.
- Stable Celery Helm Chart - Curated Helm Chart in the official stable chart repository.
- Integrating Apache Airflow with Apache Ambari - [Mykola Mykhalov](https://www.linkedin.com/in/mykola-mykhalov-9079a8107/) walks through using [Apache Ambari](https://ambari.apache.org/) to configure and deploy an Airflow instance.
- Bitnami Airflow Docker image - A secure and up-to-date docker image for Airflow maintained by Bitnami.
- Bitnami Airflow Scheduler Docker image - A secure and up-to-date docker image for Airflow Scheduler maintained by Bitnami.
- Bitnami Airflow Worker Docker image - A secure and up-to-date docker image for Airflow Worker maintained by Bitnami. A CeleryExecutor docker-compose deployment is available [here](https://github.com/bitnami/bitnami-docker-airflow-worker/blob/master/docker-compose.yml).
- Introducing KEDA for Airflow - How to use KEDA scaler system to enable autoscaling of celery workers based on data stored in the Airflow metadata database.
- Installing Airflow on IBM Cloud - Quick and easy deployment on IBM Cloud with IBM [Bitnami Charts](https://github.com/bitnami/charts)
- KubernetesExecutor Helm Chart - A lean Helm Chart using the KubernetesExecutor for a more k8s native experience and complementary [KubernetesExecutor Docker Image](https://github.com/tekn0ir/airflow-docker).
- Puckel's Docker Image - [@Puckel_](https://twitter.com/Puckel_)'s well-crafted Docker image has become the base for many Airflow installations. It is regularly updated and closely tracks the official Apache releases.
- Kubernetes Custom Operator for Deploying Airflow - Kubernetes Custom controller (also called operator pattern) for deploying Airflow on Kubernetes.
- aws-airflow-stack - An AWS based Airflow cluster deployment with CeleryExecutor. Deploys after a few clicks with CloudFormation.
- kube-airflow - This repository contains both an Airflow Docker image (that appears to have been based on Puckel's work) and Kubernetes service definition. [mumoshu](https://github.com/mumoshu)'s repository has not been recently updated, but there are numerous forks that may be based on more recent releases.
- airflow-on-kubernetes - A guide on all relevant resources, scripts and projects that relate to running Airflow on Kubernetes.
- airflow-k8s-executor-on-GKE - A detailed tutorial to get a scalable, low maintenance airflow kubernetes executor environment deployed on [Google Kubernetes Engine](https://cloud.google.com/kubernetes-engine/) with [helm](https://helm.sh/).
- airflow-cookbook - Chef cookbook for deploying Airflow.
- Astronomer Platform - Apache Airflow as a Service on Kubernetes. For more information visit https://www.astronomer.io.
- Distribute & deploy Apache Airflow via Python PEX files - Example repo with steps to bundle, distribute, & deploy Apache Airflow as PEX files.
- Airflow-Component - Lightweight installer of federated Airflow-Airflow (RabbitMQ) reference architecture on Compute node(s).
-
Airflow Summit 2020 videos
- Scheduler as a service - Apache Airflow at EA Digital Platform
- Keynote: How large companies use Airflow for ML and ETL pipelines
- Data flow with Airflow @ PayPal
- Democratised data workflows at scale
- Migrating Airflow-based Spark jobs to Kubernetes - the native way
- Keynote: Future of Airflow
- Run Airflow DAGs in a secure way
- Keynote: Making Airflow a sustainable project through D&I
- Airflow CI/CD: Github to Cloud Composer (safely)
- Advanced Apache Superset for Data Engineers
- Demo: Reducing the lines, a visual DAG editor
- AIP-31: Airflow functional DAG definition
- Autonomous driving with Airflow
- From cron to Airflow on Kubernetes: A startup story
- What open source taught us about business
- Data engineering hierarchy of needs
- Building reuseable and trustworthy ELT pipelines (A templated approach)
- Testing Airflow workflows - ensuring your DAGs work before going into production
- Airflow on Kubernetes: Containerizing your workflows
- Data DAGs with lineage for fun and for profit
- Achieving Airflow Observability
- Airflow Summit 2020
- Warsaw
- Amsterdam
- BayArea
- Airflow Summit 2020 Playlist
- Keynote: Airflow then and now
- Machine Learning with Apache Airflow
- Airflow: A beast character in the gaming world
- Effective Cross-DAG dependency
- Adding an executor to Airflow: A contributor overflow exception
- Migration to Airflow backport providers
- From Zero to Airflow: bootstrapping a ML platform
- Airflow the perfect match in our analytics pipeline
- Airflow at Société Générale : An open source orchestration solution in a banking environment
- Airflow as the next gen of workflow system at Pinterest
- Improving Airflow's user experience
- Teaching an old DAG new tricks
- Ask me anything with Airflow members
- Using Airflow to speed up development of data intensive tools
- Pipelines on pipelines: Agile CI/CD workflows for Airflow DAGs
- Production Docker image for Apache Airflow
- Airflow as an elastic ETL tool
- How do we reason about the reliability of our data pipeline in Wrike
- Achieving Airflow observability with Databand
- From S3 to BigQuery - How a first-time Airflow user successfully implemented a data pipeline
- Airflow Summit 2020 Playlist
- Keynote: Airflow then and now
- Scheduler as a service - Apache Airflow at EA Digital Platform
- Keynote: How large companies use Airflow for ML and ETL pipelines
- Data DAGs with lineage for fun and for profit
- Airflow on Kubernetes: Containerizing your workflows
- Data flow with Airflow @ PayPal - w)
- Democratised data workflows at scale - YtHYT9M)
- Migrating Airflow-based Spark jobs to Kubernetes - the native way
- Keynote: Future of Airflow
- Run Airflow DAGs in a secure way
- Keynote: Making Airflow a sustainable project through D&I
- Airflow CI/CD: Github to Cloud Composer (safely)
- Advanced Apache Superset for Data Engineers
- Demo: Reducing the lines, a visual DAG editor
- AIP-31: Airflow functional DAG definition
- Autonomous driving with Airflow
- From cron to Airflow on Kubernetes: A startup story
- Achieving Airflow Observability
- Machine Learning with Apache Airflow
- Airflow: A beast character in the gaming world
- Effective Cross-DAG dependency
- What open source taught us about business
- Data engineering hierarchy of needs
- Building reuseable and trustworthy ELT pipelines (A templated approach)
- Testing Airflow workflows - ensuring your DAGs work before going into production
- Adding an executor to Airflow: A contributor overflow exception
- Migration to Airflow backport providers
- From Zero to Airflow: bootstrapping a ML platform
- Airflow the perfect match in our analytics pipeline
- Airflow at Société Générale : An open source orchestration solution in a banking environment
- Airflow as the next gen of workflow system at Pinterest
- Improving Airflow's user experience
- Teaching an old DAG new tricks - bMM3c)
- Ask me anything with Airflow members
- Using Airflow to speed up development of data intensive tools
- Pipelines on pipelines: Agile CI/CD workflows for Airflow DAGs
- Production Docker image for Apache Airflow
- Airflow as an elastic ETL tool - JFCsp3I)
- How do we reason about the reliability of our data pipeline in Wrike
- Achieving Airflow observability with Databand
- From S3 to BigQuery - How a first-time Airflow user successfully implemented a data pipeline
- Keynote: Airflow then and now
- Scheduler as a service - Apache Airflow at EA Digital Platform
- Keynote: How large companies use Airflow for ML and ETL pipelines
- Data DAGs with lineage for fun and for profit
- Airflow on Kubernetes: Containerizing your workflows
- Data flow with Airflow @ PayPal - w)
- Democratised data workflows at scale - YtHYT9M)
- Migrating Airflow-based Spark jobs to Kubernetes - the native way
- Keynote: Future of Airflow
- Achieving Airflow Observability
- Run Airflow DAGs in a secure way
- Keynote: Making Airflow a sustainable project through D&I
- Airflow CI/CD: Github to Cloud Composer (safely)
-
Books, blogs, podcasts, and such
- Data Pipelines with Apache Airflow - A Manning book (Early Access September 2019) on Airflow.
- The Airflow Podcast - A semiregular podcast discussing all things Airflow.
- Maxime Beauchemin - Maxime's blog on Medium, which gives insight into the philosophy behind Apache Airflow.
- Robert Chang - Blog posts about data engineering with Apache Airflow that explain the reasoning behind it and include code examples.
- Handling Airflow logs with Kubernetes Executor - A blog post that outlines how you can set up remote S3 logging when using KubernetesExecutor, without creating complex infrastructure.
- Airflow 2.0: DAG Authoring Redesigned - Blog post about new ways of writing DAGs in Airflow 2.0.
- Airflow 2.0 Providers - Blog post about providers packages in Airflow 2.0.
-
Slide deck presentations and online videos
- Apache Airflow YouTube tutorials - [Marc Lamberti](https://twitter.com/marclambertiml) has created a series of YouTube tutorials covering many aspects of Airflow concepts, configuration and deployment.
- Modern Data Pipelines with Apache Airflow - A talk given by [Taylor Edmiston](https://twitter.com/kicksopenminds) and [Andy Cooper](https://twitter.com/andscoop) from Astronomer.io at Momentum Dev Con 2018 on getting started with Airflow, custom components, example DAGs, and the Astronomer Airflow CLI.
- Building Better Data Pipelines using Apache Airflow - Slides from [Sid Anand](https://twitter.com/r39132)'s talk at QCon 18 with a thorough overview of Airflow and its architecture.
- Airflow and Spark Streaming at Astronomer - How Astronomer uses dynamic DAGs to run Spark Streaming jobs with Airflow.
- Airflow Breeze - Development and Test Environment for Apache Airflow - Screencast showing how to use the Breeze environment by [Jarek Potiuk](https://github.com/potiuk).
- Developing elegant workflows in Python code with Apache Airflow - [Michał Karzyński](https://twitter.com/postrational) at [Europython](https://ep2018.europython.eu/) gives a brief introduction to Airflow concepts including the role of workflow managers, DAGs and operators. Link includes both video and slides.
- How I learned to time travel, or, data pipelining and scheduling with Airflow - Comprehensive deck by [Laura Lorenz](https://twitter.com/lalorenz6) for why Airflow is necessary and how [Industry Dive](https://www.industrydive.com/) uses it.
- Introduction to Apache Airflow - Data Day Seattle 2016 - [Sid Anand](https://twitter.com/r39132) gives a thorough introduction to Airflow and how it was used at [Agari](https://www.agari.com/).
- Apache Airflow at WePay - [Chris Riccomini](https://twitter.com/criccomini) discusses why WePay chose Airflow and provides a detailed breakdown of their deployment and the infrastructure behind it.
- Elegant data pipelining with Apache Airflow - Talks from [Bolke de Bruin](https://twitter.com/bolke2028) and [Fokko Driesprong](https://twitter.com/fokkodriesprong) at PyData Amsterdam 2018 about methodologies that provide clarity in ETL using Airflow.
- Airflow @ Lyft - Talks from [Tao Feng](https://github.com/feng-tao) at SF big data analytics meetup about how Lyft monitors running Airflow in production.
- Manageable data pipelines with Airflow and Kubernetes - Talk by [Jarek Potiuk](https://github.com/potiuk) and [Szymon Przedwojski](https://github.com/sprzedwojski). An introductory talk on Airflow from GDG Warsaw DevFest 2018.
- First Warsaw Apache Airflow Meetup - Live streamed recording from the first Apache Airflow Meetup in Warsaw in October 2019.
- What's coming in Apache Airflow 2.0 - Joint talk by [Ash Berlin-Taylor](https://github.com/ashb), [Kaxil Naik](https://github.com/kaxil), [Jarek Potiuk](https://github.com/potiuk), [Kamil Breguła](https://github.com/mik-laj), [Daniel Imberman](https://github.com/dimberman), and [Tomek Urbaszek](https://github.com/turbaszek) at the [Online NYC Meetup, 13th of May 2020](https://www.meetup.com/NYC-Apache-Airflow-Meetup/events/270483933/).
- Apache Airflow in the Cloud: Programmatically orchestrating workloads with Python - Slides from [Kaxil Naik](https://twitter.com/kaxil) and [Satyasheel](https://twitter.com/ss6012)'s talk at PyData London 18 introducing the basics of Airflow and how to orchestrate workloads on Google Cloud Platform (GCP).
- Operating Data Pipeline With Airflow - Airflow Meetup April-2018 - [Ananth Packkildurai](https://twitter.com/ananthdurai) talks about scaling the Airflow Local Executor and best practices for operating data pipelines at [Slack](https://slack.com/).
- Apache Airflow @ Umuzi.org - [Sheena O'Connell](https://twitter.com/sheena_oconnell) discusses how South Africa-based tech bootcamp [Umuzi](https://www.umuzi.org/) uses Airflow.
- Data Pipeline Management - [Ben Goldberg](https://www.linkedin.com/in/benjamin-goldberg-50247169/) walks the Chicago Kubernetes Meetup through how [SpotHero](https://spothero.com/) uses Airflow. Additionally, Ben has a very [complete slide deck](https://docs.google.com/presentation/d/1hc12cFs5TmEajLwYNASLwz_C17Q5tyd__6oXzI16A9A/edit#slide=id.g320d39a12c_0_1017) on how Airflow fits within Kubernetes.
- Migrating Apache Oozie Workflows to Apache Airflow - Talk by [Szymon Przedwojski](https://github.com/sprzedwojski) at the Airflow Bay Area Meetup June 2018 about the Oozie-to-Airflow migration tool.
- Building data lakes with Apache Airflow - Talk by [Bas Harenslak](https://github.com/BasPH) and [Julian de Ruiter](https://github.com/jrderuiter) at the Amsterdam Apache Airflow September 2018 meetup about building data lakes with Apache Airflow as the spider in the web managing all data flows.
-
Libraries, Hooks, Utilities
- Airflow plugins - Central collection of repositories of various plugins for Airflow, including Mailchimp, Trello, SFTP, GitHub, etc.
- Domino - Domino is an open source Graphical User Interface platform for creating data and Machine Learning workflows (DAGs) with no-code, visually intuitive drag-and-drop actions. It is also a standard for publishing and sharing your Python code so it can be automatically used by anyone, directly in the GUI.
- Airflow-Helper - Sets up Airflow Variables, Connections, and Pools from a YAML configuration file.
- AirFly - Auto-generates Airflow's dag.py on the fly.
- DEAfrica Airflow - Airflow libraries used by [Digital Earth Africa](https://digitalearthafrica.org/), a humanitarian effort to utilize satellite imagery of Africa.
- fileflow - Collection of modules to support large data transfers between Airflow operators through either local file system or S3. This addresses a gap where data is too large for XCOMs but too small or inconvenient for loading directly in the operator. Built by [Industry Dive](https://www.industrydive.com/).
- fairflow - Library to abstract away Airflow's Operators with functional pieces that transform the data from one operator to another.
- airflow-maintenance-dags - [Clairvoyant](http://clairvoyantsoft.com/) has a repo of Airflow DAGs that operate on Airflow itself, clearing out various bits of the backing metadata store.
- whirl - Fast iterative local development and testing of Apache Airflow workflows.
- airflow-code-editor - A plugin for Apache Airflow that allows you to edit DAGs in browser.
- Pylint-Airflow - A Pylint plugin for static code analysis on Airflow code.
- afctl - A CLI tool that includes everything required to create, manage and deploy airflow projects faster and smoother.
- Dag Dependencies viewer - A plugin which creates a view to visualize dependencies between the Airflow DAGs
- Airflow ECR Plugin - Plugin to refresh AWS ECR login token at regular intervals. This is helpful where DockerOperator needs to pull images hosted on ECR.
- AirflowK8sDebugger - A library for generating Kubernetes pod YAML templates from an Airflow DAG using the KubernetesPodOperator.
- Oozie to Airflow - A tool to easily convert between [Apache Oozie](http://oozie.apache.org/) workflows and Apache Airflow workflows.
- Airflow Ditto - An extensible framework to do transformations to an Airflow DAG and convert it into another DAG which is flow-isomorphic with the original DAG, to be able to run it on different environments (e.g. on different clouds, or even different container frameworks - Apache Spark on YARN vs Kubernetes). Comes with out-of-the-box support for EMR-to-HDInsight-DAG transforms.
- gusty - Create a DAG using any number of YAML, Python, Jupyter Notebook, or R Markdown files that represent individual tasks in the DAG. gusty also configures dependencies, DAGs, and TaskGroups, features support for your local operators, and more. A fully containerized demo is available [here](https://github.com/chriscardillo/gusty-demo).
- DAG checks - The dag-checks consist of checks that can help you in maintaining your Apache Airflow instance.
- Airflow DVC plugin - Plugin for open-source version-control system for data science and Machine Learning pipelines - [DVC](https://dvc.org/).
- Airflow Vars - A CLI for variables management, created for CD-Pipelines in order to allow robust and safe variables management.
- test_dags - A more complete solution for DAG integrity tests (the [first circle of Data’s Inferno](https://medium.com/@ingwbaa/datas-inferno-7-circles-of-data-testing-hell-with-airflow-cef4adff58d8)).
- airflow-priority - Priority Tags (P1, P2, etc) for Airflow DAGs with automated alerting to Datadog, New Relic, Slack, Discord, and more
- airflow-config - [Pydantic](https://pydantic.dev) / [Hydra](https://hydra.cc) based configuration system for DAG and Task arguments
- airflow-supervisor - Easy-to-use [supervisor](http://supervisord.org) integration for long running or "always on" DAGs
-
Meetups
-
Commercial Airflow-as-a-service providers
- Qubole - Qubole is mainly known as a service-and-support company for Apache Hive, but also provides Airflow as a component of its platform.
- Astronomer.io - Astronomer provides complete ETL lifecycle solutions and appears to be entirely focused on providing Airflow-based products.
- AWS MWAA - Amazon Managed Workflows for Apache Airflow (MWAA) is a managed orchestration service for Apache Airflow that makes it easier to set up and operate end-to-end data pipelines in the cloud at scale.
-
Cloud Composer resources
- Cloud Composer
- Enabling Autoscaling in Google Cloud Composer - Supercharge your Cloud Composer deployment while saving up some cost during idle periods.
- Scale your Composer environment together with your business - The Celery Executor architecture and ways to ensure high scheduler performance.
- The Smarter Way of Scaling With Composer’s Airflow Scheduler on GKE - [Roy Berkowitz](https://www.linkedin.com/in/roy-berkowitz-19922aa9/) discusses more effective use of nodes in the Cloud Composer service.
- Better together: orchestrating your Data Fusion pipelines with Cloud Composer - [Rachael Deacon-Smith](https://www.linkedin.com/in/rachael-deacon-smith-82660172) provides an overview of the operator for Datafusion use case on Cloud Composer.
- pianka.sh - Provides commands missing from the gcloud tool to facilitate some administrative tasks.
-
Non-English resources
- Airflow Documentation-Chinese - (🇨🇳Chinese) [Apachecn](https://github.com/apachecn) has translated the Airflow official documentation.
- Gestion de Tâches avec Apache Airflow - (🇫🇷French) [Nicolas Crocfer](https://github.com/ncrocfer) - Overview of Airflow, basic concepts and how to write and trigger a DAG.
- Apache Airflow – Kaikki Mitä Meillä On, Lähtee Dageista - (🇫🇮Finnish) [Olli Iivonen](https://www.linkedin.com/in/oiivonen/)'s overview of Airflow, concepts and Airflow's usage at [Solita](https://www.solita.fi/).
- apache airflow 複数worker構成のalpine版docker imageを作った - (🇯🇵Japanese) [Akio Ohta](https://github.com/Drunkar) walks through his [Docker image](https://hub.docker.com/r/drunkar/airflow-alpine/) for deploying an Alpine-based Airflow system.
- Airflow - (🇻🇳Vietnamese) [Duyet Le](https://github.com/duyet) - Overview of Airflow concepts and basic usage, with a use case.
- Michael Yang's Airflow Chinese Blog Posts - Michael Yang's Chinese blog posts about data engineering with Apache Airflow, including basic tutorials and DevOps skills.
- Panduan Dasar Apache Airflow - (🇮🇩Indonesian) [Imam Digmi](https://github.com/imamdigmi) - Overview of Airflow concepts and basic usage, with a use case.
- Airflowはすごいぞ!100行未満で本格的なデータパイプライン - (🇯🇵Japanese) [Hank Ehly](https://github.com/hankehly) gives a comprehensive introduction to Airflow's main concepts, and demonstrates how to create a data pipeline in less than 100 lines of code.
- AirflowのタスクログをS3に保存する方法 - (🇯🇵Japanese) [Hank Ehly](https://github.com/hankehly) shows step-by-step how to configure sending task logs to AWS S3.
- 【徹底解説】Airflow Fluentd Elasticsearch Docker の連携方法 - (🇯🇵Japanese) [Hank Ehly](https://github.com/hankehly) describes how to handle worker task logs with Fluentd, Elasticsearch and Docker.
- Airflow - Automatizando seu fluxo de trabalho - (🇧🇷Portuguese) [Gilson Filho](https://github.com/gilsondev)'s overview of Airflow, concept and basic use.
-
Sample projects
- GitLab Data Team DAGs - Several DAGs used to build analytics for the GitLab platform.
- Google Cloud Platform Public Datasets Pipelines - Cloud-native, data pipeline architecture for onboarding datasets to the Google Cloud Public Datasets Program.
- deploy-airflow-on-ecs-fargate - Deploy to Amazon ECS Fargate. Demonstrates various features and configurations, such as autoscaling workers to zero, S3 remote logging and secret management.