Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/noelmcloughlin/airflow-component
Lightweight IaC Installer of Federated Apache-Airflow
airflow celery-workers federated rabbitmq-cluster salt
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/noelmcloughlin/airflow-component
- Owner: noelmcloughlin
- License: apache-2.0
- Created: 2021-08-27T08:24:23.000Z (over 3 years ago)
- Default Branch: master
- Last Pushed: 2021-09-09T12:30:44.000Z (over 3 years ago)
- Last Synced: 2024-11-05T13:52:29.630Z (3 months ago)
- Topics: airflow, celery-workers, federated, rabbitmq-cluster, salt
- Language: Jinja
- Homepage:
- Size: 1.01 MB
- Stars: 6
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
- Support: SUPPORT.md
Awesome Lists containing this project
- awesome-apache-airflow - Airflow-Component - Lightweight installer of federated Airflow-Airflow (RabbitMQ) reference architecture on Compute node(s). (Airflow deployment solutions)
README
# Lightweight federated Apache-Airflow Installer
Provision a federated (or single-node) reference deployment architecture of Apache Airflow (RabbitMQ, Postgres) via a lightweight, single-source-of-truth installer. The heavy lifting and CI are done by the [Saltstack-formulas community](https://github.com/saltstack-formulas).
![Airflow-Component](/templates/img/airflow-component.png?raw=true "Federated Airflow, Reference Deployment Architecture")

primary: controller01.main.net user: main\airflowservice - Active Scheduler, UI, worker
secondary: controller02.main.net user: main\airflowservice - Standby Scheduler, UI, worker
worker01/02: apples, applesdev, applestest
worker01/02: oranges, orangesdev, orangestest
worker01/02: edge
worker01/02: fog

# TL;DR
~/airflow-component/installer.sh | tee ~/iac-installer.log
# PREPARE
Declare your configuration in [sitedata.j2](https://github.com/noelmcloughlin/airflow-component/blob/master/sitedata.j2)
Commission your infrastructure in line with [our infra ticket guidelines](https://github.com/noelmcloughlin/airflow-component/blob/master/INFRA.md)
Log on as airflowservice on each participating host.
If applicable, ensure the proxy settings are exported in both ~/.bashrc and ~/airflow-component/installer.sh:
export HTTP_PROXY="http://myproxy:8080"
export HTTPS_PROXY="http://myproxy:8080"
export http_proxy="${HTTP_PROXY}"
export https_proxy="${HTTPS_PROXY}"
export no_proxy="localhost,*.net"
export PATH=${PATH}:/usr/local/bin

Plan to deploy primary/secondary hosts before workers.
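
Before running the installer behind a proxy, it can help to confirm that the exports above are actually visible in the current shell. The following is a minimal sketch, not part of airflow-component: `missing_proxy_vars` is a hypothetical helper name, and it only checks that the five variables from the list above are non-empty, not that the proxy is reachable.

```shell
# Hypothetical helper (an assumption, not shipped with airflow-component):
# print the name of every proxy variable from the PREPARE step that is
# unset or empty in the current shell. Prints nothing when all are set.
missing_proxy_vars() {
  for name in HTTP_PROXY HTTPS_PROXY http_proxy https_proxy no_proxy; do
    # Look up the variable whose name is stored in $name.
    # eval is safe here because the names come from the fixed list above.
    eval "value=\${$name:-}"
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}
```

Calling `missing_proxy_vars` after sourcing ~/.bashrc prints one variable name per line for anything still missing; empty output means all five are set. Hosts that do not sit behind a proxy can skip this check.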
Optionally wipe data on any or all servers before a reinstall. Normally this is not needed!
sudo rm -fr ~/.local /var/lib/rabbitmq/ /var/log/rabbitmq/ /usr/lib/systemd/system/rabbitmq-serv* /usr/lib/systemd/system/airflow-* /etc/rabbitmq/ /var/lib/pgsql /srv/salt && sudo reboot
# (RE)INSTALL/UPGRADE
Log on as airflowservice on each participating host and get the software. For hosts without network connectivity to your git server (for example, fog), use another method such as sftp; see [SUPPORT](https://github.com/noelmcloughlin/airflow-component/blob/master/SUPPORT.md)
cd && rm -fr airflow-component airflow-dags
for name in component dags; do
git clone https://github.com/noelmcloughlin/airflow-${name}
done && cd ~/dags && rm -fr * && cp -Rp ../airflow-dags/dags/* .; find . -name '*.py' -exec chmod +x {} +

On each participating host (begin with primary/secondary), install Airflow. This takes roughly 15-30 minutes, depending on compute resources:
~/airflow-component/installer.sh | tee ~/iac-installer.log
Note that the installation summary may report failed tasks. Evaluate the result as follows; for failures, see [SUPPORT](https://github.com/noelmcloughlin/airflow-component/blob/master/SUPPORT.md)
- Success if 0 task fails: cluster join worked too. OK!
- Success if 1 task fails: cluster join is best effort, other node was not ready (race condition). OK!
- Retryable if >1 task fails: sometimes the 2nd attempt just works! NOK!
- All other outcomes are failures.

Import variables:
airflow variables import ~/airflow-dags/variables.json
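
The rules given earlier for reading the installation summary (0 failed tasks, 1 failed task, more than 1) can be written down as a small decision function. This is only a sketch of those rules: `classify_install_result` is a hypothetical name, and it takes the failed-task count as an argument because the format of the installer log is not described here, so no log parser is shown.

```shell
# Hypothetical helper (an assumption, not part of airflow-component):
# map the number of failed tasks from the installation summary to the
# outcome described in this README.
classify_install_result() {
  failed="$1"
  case "$failed" in
    # Empty or non-numeric input is one of the "all other outcomes".
    ''|*[!0-9]*) echo "failure"; return 0 ;;
  esac
  if [ "$failed" -eq 0 ]; then
    echo "success"                    # cluster join worked too
  elif [ "$failed" -eq 1 ]; then
    echo "success (join best effort)" # other node was not ready yet
  else
    echo "retry"                      # a second attempt sometimes works
  fi
}
```

For example, `classify_install_result 1` reports success with a best-effort cluster join, while `classify_install_result 3` suggests rerunning the installer before turning to SUPPORT.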
# USER INTERFACE
Airflow:
- http://primary.main.net:18080 (user/pass: airflow/airflow or custom)
- http://secondary.main.net:18080 (user/pass: ditto)

RabbitMQ:
- http://primary.main.net:15672 (user/pass: airflow/airflow)
- http://secondary.main.net:15672 (user/pass: airflow/airflow)
- http://[worker-ipaddr]:15672 (user/pass: airflow/airflow)

Celery Flower:
- http://primary.main.net:5555
- http://secondary.main.net:5555
- http://[worker-ipaddr]:5555

# TROUBLESHOOTING
See [SUPPORT](https://github.com/noelmcloughlin/airflow-component/blob/master/SUPPORT.md)
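
As a first troubleshooting step, the endpoints listed under USER INTERFACE can be probed from the command line. The sketch below assumes `curl` is installed; `ui_urls` and `check_ui` are hypothetical helper names, and the ports (18080, 15672, 5555) are the ones listed in this README. A successful HTTP response only shows that something is listening, not that the service is healthy.

```shell
# Hypothetical helpers (assumptions, not part of airflow-component).

# Print the three UI URLs for one host, using the ports from this README.
ui_urls() {
  host="$1"
  printf 'http://%s:18080\n' "$host"  # Airflow UI
  printf 'http://%s:15672\n' "$host"  # RabbitMQ management UI
  printf 'http://%s:5555\n' "$host"   # Celery Flower
}

# Probe each URL for a host and report OK or FAIL per endpoint.
check_ui() {
  for url in $(ui_urls "$1"); do
    # --fail makes curl return non-zero on HTTP errors (status >= 400);
    # --max-time bounds each probe to 5 seconds.
    if curl --silent --fail --output /dev/null --max-time 5 "$url"; then
      echo "OK   $url"
    else
      echo "FAIL $url"
    fi
  done
}
```

For example, `check_ui primary.main.net` probes all three ports on the primary. On worker hosts, a FAIL on port 18080 is expected, since the Airflow UI is listed only for primary and secondary.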