{"id":13485713,"url":"https://github.com/puckel/docker-airflow","last_synced_at":"2025-03-27T19:31:42.963Z","repository":{"id":33541266,"uuid":"37187397","full_name":"puckel/docker-airflow","owner":"puckel","description":"Docker Apache Airflow","archived":false,"fork":false,"pushed_at":"2023-03-01T09:11:52.000Z","size":236,"stargazers_count":3773,"open_issues_count":268,"forks_count":541,"subscribers_count":102,"default_branch":"master","last_synced_at":"2024-10-30T20:45:44.643Z","etag":null,"topics":["airflow","docker","docker-airflow","management","scheduler","task","workflow"],"latest_commit_sha":null,"homepage":"","language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/puckel.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2015-06-10T09:19:44.000Z","updated_at":"2024-10-29T14:18:24.000Z","dependencies_parsed_at":"2024-01-14T08:07:28.671Z","dependency_job_id":null,"html_url":"https://github.com/puckel/docker-airflow","commit_stats":null,"previous_names":[],"tags_count":39,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/puckel%2Fdocker-airflow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/puckel%2Fdocker-airflow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/puckel%2Fdocker-airflow/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/puckel%2Fdocker-airflow/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/puckel","download_url":"https://codeload.github.com/puckel/doc
ker-airflow/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245910887,"owners_count":20692513,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["airflow","docker","docker-airflow","management","scheduler","task","workflow"],"created_at":"2024-07-31T18:00:30.378Z","updated_at":"2025-03-27T19:31:42.686Z","avatar_url":"https://github.com/puckel.png","language":"Shell","readme":"# docker-airflow\n[![CI status](https://github.com/puckel/docker-airflow/workflows/CI/badge.svg?branch=master)](https://github.com/puckel/docker-airflow/actions?query=workflow%3ACI+branch%3Amaster+event%3Apush)\n[![Docker Build status](https://img.shields.io/docker/build/puckel/docker-airflow?style=plastic)](https://hub.docker.com/r/puckel/docker-airflow/tags?ordering=last_updated)\n\n[![Docker Hub](https://img.shields.io/badge/docker-ready-blue.svg)](https://hub.docker.com/r/puckel/docker-airflow/)\n[![Docker Pulls](https://img.shields.io/docker/pulls/puckel/docker-airflow.svg)]()\n[![Docker Stars](https://img.shields.io/docker/stars/puckel/docker-airflow.svg)]()\n\nThis repository contains **Dockerfile** of [apache-airflow](https://github.com/apache/incubator-airflow) for [Docker](https://www.docker.com/)'s [automated build](https://registry.hub.docker.com/u/puckel/docker-airflow/) published to the public [Docker Hub Registry](https://registry.hub.docker.com/).\n\n## Informations\n\n* Based on Python (3.7-slim-buster) official Image [python:3.7-slim-buster](https://hub.docker.com/_/python/) and uses the official 
[Postgres](https://hub.docker.com/_/postgres/) as backend and [Redis](https://hub.docker.com/_/redis/) as queue\n* Install [Docker](https://www.docker.com/)\n* Install [Docker Compose](https://docs.docker.com/compose/install/)\n* Following the Airflow release from [Python Package Index](https://pypi.python.org/pypi/apache-airflow)\n\n## Installation\n\nPull the image from the Docker repository.\n\n    docker pull puckel/docker-airflow\n\n## Build\n\nOptionally install [Extra Airflow Packages](https://airflow.incubator.apache.org/installation.html#extra-package) and/or python dependencies at build time :\n\n    docker build --rm --build-arg AIRFLOW_DEPS=\"datadog,dask\" -t puckel/docker-airflow .\n    docker build --rm --build-arg PYTHON_DEPS=\"flask_oauthlib\u003e=0.9\" -t puckel/docker-airflow .\n\nor combined\n\n    docker build --rm --build-arg AIRFLOW_DEPS=\"datadog,dask\" --build-arg PYTHON_DEPS=\"flask_oauthlib\u003e=0.9\" -t puckel/docker-airflow .\n\nDon't forget to update the airflow images in the docker-compose files to puckel/docker-airflow:latest.\n\n## Usage\n\nBy default, docker-airflow runs Airflow with **SequentialExecutor** :\n\n    docker run -d -p 8080:8080 puckel/docker-airflow webserver\n\nIf you want to run another executor, use the other docker-compose.yml files provided in this repository.\n\nFor **LocalExecutor** :\n\n    docker-compose -f docker-compose-LocalExecutor.yml up -d\n\nFor **CeleryExecutor** :\n\n    docker-compose -f docker-compose-CeleryExecutor.yml up -d\n\nNB : If you want the example DAGs loaded (default=False), set the following environment variable :\n\n`LOAD_EX=y`\n\n    docker run -d -p 8080:8080 -e LOAD_EX=y puckel/docker-airflow\n\nIf you want to use Ad hoc query, make sure you've configured connections:\nGo to Admin -\u003e Connections, edit \"postgres_default\" and set these values (equivalent to values in airflow.cfg/docker-compose*.yml) :\n- Host : postgres\n- Schema : airflow\n- Login : airflow\n- 
Password : airflow\n\nFor encrypted connection passwords (in Local or Celery Executor), all containers must share the same fernet_key. By default docker-airflow generates the fernet_key at startup, so you have to set an environment variable in the docker-compose file (e.g. docker-compose-LocalExecutor.yml) to use the same key across containers. To generate a fernet_key :\n\n    docker run puckel/docker-airflow python -c \"from cryptography.fernet import Fernet; FERNET_KEY = Fernet.generate_key().decode(); print(FERNET_KEY)\"\n\n## Configuring Airflow\n\nIt's possible to set any configuration value for Airflow from environment variables, which take precedence over values in airflow.cfg.\n\nThe general rule is that the environment variable should be named `AIRFLOW__\u003csection\u003e__\u003ckey\u003e`, for example `AIRFLOW__CORE__SQL_ALCHEMY_CONN` sets the `sql_alchemy_conn` config option in the `[core]` section.\n\nCheck out the [Airflow documentation](http://airflow.readthedocs.io/en/latest/howto/set-config.html#setting-configuration-options) for more details.\n\nYou can also define connections via environment variables by prefixing them with `AIRFLOW_CONN_` - for example `AIRFLOW_CONN_POSTGRES_MASTER=postgres://user:password@localhost:5432/master` for a connection called \"postgres_master\". The value is parsed as a URI. This will work for hooks etc, but won't show up in the \"Ad-hoc Query\" section unless an (empty) connection is also created in the DB.\n\n## Custom Airflow plugins\n\nAirflow allows for custom user-created plugins which are typically found in the `${AIRFLOW_HOME}/plugins` folder. 
Documentation on plugins can be found [here](https://airflow.apache.org/plugins.html).\n\nIn order to incorporate plugins into your docker container:\n- Create the plugins folder `plugins/` with your custom plugins.\n- Mount the folder as a volume by doing either of the following:\n    - Include the folder as a volume on the command line `-v $(pwd)/plugins/:/usr/local/airflow/plugins`\n    - Use docker-compose-LocalExecutor.yml or docker-compose-CeleryExecutor.yml, which contain support for adding the plugins folder as a volume\n\n## Install custom python package\n\n- Create a file \"requirements.txt\" with the desired python modules\n- Mount this file as a volume `-v $(pwd)/requirements.txt:/requirements.txt` (or add it as a volume in docker-compose file)\n- The entrypoint.sh script executes the pip install command (with --user option)\n\n## UI Links\n\n- Airflow: [localhost:8080](http://localhost:8080/)\n- Flower: [localhost:5555](http://localhost:5555/)\n\n\n## Scale the number of workers\n\nEasy scaling using docker-compose:\n\n    docker-compose -f docker-compose-CeleryExecutor.yml scale worker=5\n\nThis can be used to scale to a multi-node setup using docker swarm.\n\n## Running other airflow commands\n\nIf you want to run other airflow sub-commands, such as `list_dags` or `clear`, you can do so like this:\n\n    docker run --rm -ti puckel/docker-airflow airflow list_dags\n\nor with your docker-compose set up like this:\n\n    docker-compose -f docker-compose-CeleryExecutor.yml run --rm webserver airflow list_dags\n\nYou can also use this to run a bash shell or any other command in the same environment that airflow would be run in:\n\n    docker run --rm -ti puckel/docker-airflow bash\n    docker run --rm -ti puckel/docker-airflow ipython\n\n# Simplified SQL database configuration using PostgreSQL\n\nIf the executor type is set to anything other than *SequentialExecutor* you'll need an SQL database.\nHere is a list of PostgreSQL configuration variables and their 
default values. They're used to compute\nthe `AIRFLOW__CORE__SQL_ALCHEMY_CONN` and `AIRFLOW__CELERY__RESULT_BACKEND` variables when needed for you\nif you don't provide them explicitly:\n\n| Variable            | Default value |  Role                |\n|---------------------|---------------|----------------------|\n| `POSTGRES_HOST`     | `postgres`    | Database server host |\n| `POSTGRES_PORT`     | `5432`        | Database server port |\n| `POSTGRES_USER`     | `airflow`     | Database user        |\n| `POSTGRES_PASSWORD` | `airflow`     | Database password    |\n| `POSTGRES_DB`       | `airflow`     | Database name        |\n| `POSTGRES_EXTRAS`   | empty         | Extras parameters    |\n\nYou can also use those variables to adapt your compose file to match an existing PostgreSQL instance managed elsewhere.\n\nPlease refer to the Airflow documentation to understand the use of extras parameters, for example in order to configure\na connection that uses TLS encryption.\n\nHere's an important thing to consider:\n\n\u003e When specifying the connection as URI (in AIRFLOW_CONN_* variable) you should specify it following the standard syntax of DB connections,\n\u003e where extras are passed as parameters of the URI (note that all components of the URI should be URL-encoded).\n\nTherefore you must provide extras parameters URL-encoded, starting with a leading `?`. For example:\n\n    POSTGRES_EXTRAS=\"?sslmode=verify-full\u0026sslrootcert=%2Fetc%2Fssl%2Fcerts%2Fca-certificates.crt\"\n\n# Simplified Celery broker configuration using Redis\n\nIf the executor type is set to *CeleryExecutor* you'll need a Celery broker. Here is a list of Redis configuration variables\nand their default values. 
They're used to compute the `AIRFLOW__CELERY__BROKER_URL` variable for you if you don't provide\nit explicitly:\n\n| Variable          | Default value | Role                           |\n|-------------------|---------------|--------------------------------|\n| `REDIS_PROTO`     | `redis://`    | Protocol                       |\n| `REDIS_HOST`      | `redis`       | Redis server host              |\n| `REDIS_PORT`      | `6379`        | Redis server port              |\n| `REDIS_PASSWORD`  | empty         | If Redis is password protected |\n| `REDIS_DBNUM`     | `1`           | Database number                |\n\nYou can also use those variables to adapt your compose file to match an existing Redis instance managed elsewhere.\n\n# Wanna help?\n\nFork, improve and PR.\n","funding_links":[],"categories":["HarmonyOS","Shell","workflow","Airflow deployment solutions","Soluções de deployment do Airflow"],"sub_categories":["Windows Manager"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpuckel%2Fdocker-airflow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpuckel%2Fdocker-airflow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpuckel%2Fdocker-airflow/lists"}