Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/xnuinside/airflow_in_docker_compose
Apache Airflow in Docker Compose (for both versions 1.10.* and 2.*)
- Host: GitHub
- URL: https://github.com/xnuinside/airflow_in_docker_compose
- Owner: xnuinside
- License: mit
- Created: 2019-07-30T10:05:50.000Z (over 5 years ago)
- Default Branch: main
- Last Pushed: 2023-11-23T13:19:03.000Z (about 1 year ago)
- Last Synced: 2024-12-13T01:12:01.187Z (27 days ago)
- Topics: airflow, apache-airflow, celery-executor, docker, docker-airflow, docker-compose, docker-compose-files, docker-compose-template
- Language: Python
- Homepage:
- Size: 513 KB
- Stars: 184
- Watchers: 13
- Forks: 78
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE
# You may also be interested in
* [airflow-helper](https://github.com/xnuinside/airflow-helper) - A fairly fresh command-line tool to set up Apache Airflow connections, variables & pools from a YAML config. Supports config inheritance and can pull settings from an existing server.
# Official Docker-Compose
Note that an official docker-compose.yml now exists: https://airflow.apache.org/docs/apache-airflow/stable/start/docker.html (it may be better to use it).

### Apache Airflow version 2.0.0
Airflow 2.0 is not 100% backward compatible with 1.10.*, which is why it was moved to a separate compose file. RBAC is now turned on by default, which means you need to create a user before you can use the Airflow UI. For this, a command to create a default user was added to the db_init service:

>> *airflow users create --firstname admin --lastname admin --email admin --password admin --username admin --role Admin*

Change the username and password as you like. By default it is login: admin, password: admin.
![New Apache Airflow 2.0](/docs/img/2.0.png?raw=true "Apache Airflow 2.0")
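As a rough sketch of what that init step looks like in compose terms (service name, image tag, and dependency wiring here are illustrative, not copied from this repo; see the actual docker-compose-2.0-*.yml files):

```yaml
# illustrative fragment of an init service that prepares the DB and
# creates the default admin user before the other services start
services:
  initdb:
    image: apache/airflow:2.0.0
    entrypoint: /bin/bash
    command:
      - -c
      - |
        airflow db init
        airflow users create \
          --firstname admin --lastname admin \
          --email admin --password admin \
          --username admin --role Admin
    depends_on:
      - postgres
```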
**Note:**
If you run docker-compose a second (or further) time, you will see this in the init_db log:
```
initdb_1 | admin already exist in the db
airflow_in_docker_compose_initdb_1 exited with code 0
```
**[docker-compose-with-celery-executor.yml](docker-compose-2.0-with-celery-executor.yml)**
**NOTE: if you previously ran Airflow 1.10, remove your DB volume files before running 2.0, or change the db init command to db upgrade.**
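The "remove your DB volume files" step can look like this, assuming the default `./database` host paths used by the volume mappings in this repo's compose files (stop the stack with `docker-compose down` first):

```bash
# wipe the old Postgres data and logs left over from an Airflow 1.10 run
rm -rf ./database/data ./database/logs
```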
```bash
git clone https://github.com/xnuinside/airflow_in_docker_compose
cd airflow_in_docker_compose
docker-compose -f docker-compose-2.0-with-celery-executor.yml up --build
```
### Apache Airflow 2.* with 2 Celery Workers (or more)
Because there was an issue about running Airflow 2.0 with 2 Celery workers, I thought it would be good to have a docker-compose file with such a setup.
I added it as separate compose file:
**[docker-compose-2.0-with-celery-executor-2-workers.yml](docker-compose-2.0-with-celery-executor-2-workers.yml)**
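In outline, a second worker is just another copy of the worker service pointing at the same broker and metadata DB. The fragment below is a sketch, not a copy of the linked file; service and image names are assumptions:

```yaml
# illustrative fragment: two identical Celery worker services
  worker_1:
    image: apache/airflow:2.0.0
    command: celery worker
  worker_2:
    image: apache/airflow:2.0.0
    command: celery worker
```

Alternatively, docker-compose can scale a single worker service with `docker-compose up --scale worker=2` (here `worker` stands for whatever the worker service is named in your compose file).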
To check that your workers are up and running, use the Flower UI (it is included in the docker-compose setup):
![Flower UI with 2 workers](/docs/img/flower.png?raw=true "Flower UI with 2 workers")

### Apache Airflow version 1.10.14:
```bash
git clone https://github.com/xnuinside/airflow_in_docker_compose
cd airflow_in_docker_compose

# to run airflow with 1 Celery worker
docker-compose up --build
```
Wait until all services are successfully up, then open http://localhost:8080/admin.
### FAQ & Help
Docker Compose behaves differently on different OSes with respect to file systems, access rights, etc. I tested this docker-compose setup mostly on macOS; from time to time I also run it on WSL (but not for every update).
In the issues you can find cases where something went wrong; maybe they will help you solve your own problem.
**Ubuntu Issues:**
1. [Permission denied error](https://github.com/xnuinside/airflow_in_docker_compose/issues/4)
**WSL Issues:**
1. [No DAGs in UI in Airflow 2.0 & failed airflow init on second runs](https://github.com/xnuinside/airflow_in_docker_compose/issues/10) - Not resolved yet
Also, at the end of this README there is a section https://github.com/xnuinside/airflow_in_docker_compose#for-windows-10-users with some information for WSL users. Maybe it can also help.
**Problem with connecting to PostgreSQL (on the first run)**:
If you give Docker few resources, or your machine has low performance, bringing up PostgreSQL for the first time can take a significant amount of time, and you may see errors like this:
```
Is the server running on host "postgres" (172.25.0.3) and accepting
initdb_1 | TCP/IP connections on port 5432?
```
Normally the auto-restarts I added in docker-compose mean that after 10-15 seconds all services will be up and running, but sometimes 3 retries are not enough.
In that case I recommend, on the first run, starting the postgres service separately until you see that Postgres is up and ready to accept connections:
```
docker-compose -f docker-compose-2.0-with-celery-executor-2-workers.yml up --build postgres
```
If you ran into any trouble and successfully solved it, please open an issue with the solution and I will add it to this README. Thank you!
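As an alternative to starting postgres separately, the wait can be automated with a compose healthcheck, sketched below. The user name `airflow` and service names are assumptions, and `depends_on` with `condition: service_healthy` requires a compose version that supports it (compose file format 2.1, or a recent Docker Compose implementing the compose spec):

```yaml
# illustrative fragment: make dependent services wait until Postgres is ready
services:
  postgres:
    image: postgres:13
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U airflow"]
      interval: 5s
      retries: 10
  initdb:
    depends_on:
      postgres:
        condition: service_healthy
```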
# Apache Airflow with Docker Compose examples
**UPD from July 2020:
These articles were created before the release of the official Apache Airflow Docker image, and they use puckel/docker-airflow.
Now the official image apache/airflow exists, so those docker-compose files became 'legacy'
and all their sources moved to 'docker_with_puckel_image'.
Main Docker Compose cluster based on the apache/airflow image**

Docker-compose config based on the official image (requires docker-compose version 3.7 or higher):
**[docker-compose-with-celery-executor.yml](docker-compose-with-celery-executor.yml)**
And the env file with config settings for Airflow (used in docker-compose-with-celery-executor.yml):
**[.env](.env)**

Source files for articles with descriptions on Medium:
**Apache Airflow with LocalExecutor:**
**Apache Airflow with CeleryExecutor:**
**Install Python dependencies to docker-compose cluster without re-build images**
![Main Apache Airflow UI](/docs/img/main.png?raw=true "Main Apache Airflow UI")
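The "install Python dependencies without rebuilding images" trick from the article works by mounting a host folder of pre-installed packages into the containers, together with a `.pth` file that adds it to `sys.path`. The fragment below is a sketch only; the container paths and service name are guesses, not taken from docker-compose-volume-packages.yml:

```yaml
# illustrative fragment: mount extra packages into a running cluster
# instead of rebuilding the image (paths are assumptions)
services:
  webserver:
    volumes:
      - ./packages:/opt/packages
      - ./packages.pth:/usr/local/lib/python3.7/site-packages/packages.pth
```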
![Version](/docs/img/version.png?raw=true "Version Screen")

### 10.12.2022:
1. Updated version to 2.5.0

### 20.09.2022:
1. Updated version to 2.4.0
2. All files with version 1.* & puckel images moved to the "archive" folder
3. 2.* became the default version
4. Updated docker-compose version

### 03.02.2021:
1. In the docker-compose files for Airflow 2.0, the **scheduler** service restart policy was changed to 'any', because for some reason it exits with code 0 if there is a DB error and init is not finished yet, so the restart policy 'on-failure' does not work.
2. Added an example for Apache Airflow 2.0 with 2 workers.

### 02.02.2021:
1. Added a FAQ section with issues that might help
2. Updated the fernet key in .env

### 18.12.2020:
1. Added a separate docker-compose file for Apache Airflow 2.0

### 16.12.2020:
1. Updated Apache Airflow version to 1.10.14
2. Changed the init db command to "airflow db init"

### 29.11.2020:
1. Updated Apache Airflow version to 1.10.12
2. Updated PostgreSQL to 13.1
3. Added restart_policy to services in docker-compose

### 07.2020:
1. All compose files with the puckel image moved to docker_with_puckel_image
2. Created a docker-compose config based on the official image (requires docker-compose version 3.7 or higher):
**[docker-compose-with-celery-executor.yml](docker-compose-with-celery-executor.yml)**
And the env file with config settings for Airflow (used in docker-compose-with-celery-executor.yml):
**[.env](.env)**
3. At the bottom of the README, added a note for Windows 10 users

### 21.07.2020:
1. Docker Compose files with puckel images moved to docker_with_puckel_image
2. Added docker-compose-with-celery.yml based on the official image.

### 18.12.19 changes:
1. Added samples for the article https://medium.com/@xnuinside/install-python-dependencies-to-docker-compose-cluster-without-re-build-images-8c63a431e11c (docker-compose-volume-packages.yml, packages.pth, commented lines added to the Dockerfile)
2. Added .dockerignore

### 29.11.19 changes:
1. Apache Airflow Image was updated to version 1.10.6
2. Added test_dag to airflow_files

## For Windows 10 Users
If you try to work on Windows 10 and run docker-compose on it, you will hit an issue with the **postgres** service:

FATAL: data directory "/var/lib/postgresql/data/pgdata" has wrong ownership

To solve this issue you must take additional steps (unfortunately there is no quicker workaround; check: https://forums.docker.com/t/data-directory-var-lib-postgresql-data-pgdata-has-wrong-ownership/17963/23 and https://forums.docker.com/t/trying-to-get-postgres-to-work-on-persistent-windows-mount-two-issues/12456/5?u=friism):
1. Create a docker volume:
```
docker volume create --name volume-postgresql -d local
```
2. In docker-compose.yml:

2.1 Add the volume at the top of the file, next to the 'networks' definition, like this:
```
networks:
  airflow:

volumes:
  volume-postgresql:
    external: true
```
2.2 Change the *postgres* service volumes:
was:
```
- ./database/data:/var/lib/postgresql/data/pgdata
- ./database/logs:/var/lib/postgresql/data/log
```
becomes:
```
- volume-postgresql:/var/lib/postgresql/data/pgdata
- volume-postgresql:/var/lib/postgresql/data/log
```
Or use WSL and run docker under it.
If you have never used docker with local folders mounted as volumes under WSL, you may first need to follow this article: https://nickjanetakis.com/blog/setting-up-docker-for-windows-and-wsl-to-work-flawlessly#ensure-volume-mounts-work because by default volumes are not mounted correctly and you will not see any 'dags' in Airflow.