https://github.com/andgineer/airflow
Apache Airflow 2 docker-compose environment with scheduler, workers, DB and live reload of DAGs
airflow anaconda docker docker-compose python
- Host: GitHub
- URL: https://github.com/andgineer/airflow
- Owner: andgineer
- Created: 2020-08-18T10:27:51.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2025-01-05T13:47:45.000Z (over 1 year ago)
- Last Synced: 2025-01-13T01:36:38.943Z (over 1 year ago)
- Topics: airflow, anaconda, docker, docker-compose, python
- Language: Python
- Homepage:
- Size: 358 KB
- Stars: 0
- Watchers: 4
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: README.md
# Apache Airflow 3 + Anaconda
Docker Compose environment for local debugging of Apache Airflow DAGs with live reload.
Includes the Airflow scheduler, Celery workers, a PostgreSQL database, and Miniconda, so your ETL pipelines can use machine learning and data science packages from Anaconda.
[Apache Airflow](https://airflow.apache.org/docs/stable/) is a workflow management platform for building and monitoring data pipelines. Pipelines are configured as Python code, enabling dynamic pipeline generation.
## Quick Start
```bash
./compose.sh build
./up.sh
```
**Access:**
- Airflow UI: http://127.0.0.1:8080/home (username: `admin`, password: `admin`)
- Celery Flower: http://127.0.0.1:5551
DAGs live in `etl/`, which is mounted into the containers, so edits to DAG files are picked up live.
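Containers may need some time to start after `./up.sh`. A minimal sketch for waiting until the endpoints listed above respond, using only the Python standard library. The helper name `wait_until_up` and the timeout values are illustrative and not part of this repository:

```python
import time
import urllib.error
import urllib.request

# URLs taken from the Access list above.
ENDPOINTS = {
    "Airflow UI": "http://127.0.0.1:8080/home",
    "Celery Flower": "http://127.0.0.1:5551",
}


def wait_until_up(url: str, timeout: float = 120.0, interval: float = 2.0) -> bool:
    """Poll `url` until it returns any HTTP response or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server answered (for example with 401 or 404), so it is up.
            return True
        except (urllib.error.URLError, OSError):
            # Connection refused or timed out: the service is not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

For example, `wait_until_up(ENDPOINTS["Airflow UI"])` returns `True` as soon as the UI answers and `False` if it stays unreachable for the whole timeout.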
## Demo DAG
The repository includes a demo DAG `HelloPandas` to verify everything works.
Check the `merge` task logs for: `Done. Returned value was: ('Hello', 'Pandas')`
## Database Connections
The environment creates a PostgreSQL database for ETL tasks. It runs on the same server as the Airflow metadata DB, in the `airflow-db` container.
**Required Airflow Connections:**
- `etl_db` - ETL tasks database
- `db_dev` - Development/business database for ETL operations
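One way to define such connections is through environment variables: Airflow reads connections from variables named `AIRFLOW_CONN_<CONN_ID>` whose value is a connection URI. The sketch below builds these variables for the two connection IDs above. The helper `airflow_conn_env`, the user names, passwords, and database names are placeholders, and using `airflow-db` as the host is an assumption based on the container name:

```python
from urllib.parse import quote


def airflow_conn_env(conn_id: str, user: str, password: str,
                     host: str, port: int, database: str) -> tuple[str, str]:
    """Return (env var name, Postgres connection URI) for an Airflow connection."""
    name = f"AIRFLOW_CONN_{conn_id.upper()}"
    # Percent-encode credentials so characters like '@' or '/' keep the URI valid.
    uri = (
        f"postgres://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}"
    )
    return name, uri


# Placeholder values: replace with the real credentials of your setup.
CONNECTIONS = [
    airflow_conn_env("etl_db", "etl_user", "etl_pass", "airflow-db", 5432, "etl"),
    airflow_conn_env("db_dev", "dev_user", "p@ss/word", "airflow-db", 5432, "dev"),
]

for name, uri in CONNECTIONS:
    print(f"{name}={uri}")
```

The printed `NAME=URI` lines can be placed in the environment of the Airflow containers. Connections can equally be created in the Airflow UI.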
## Configuration
**Scaling Workers:** Use `docker-compose up --scale <service>=<count>` (with the worker service name from the compose file) or deploy workers on separate machines.
**Email Notifications:** Configure SMTP server in `airflow.cfg`.
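As a rough sketch of the SMTP settings, the snippet below renders an `[smtp]` section with Python's `configparser`. The option names follow Airflow's `[smtp]` configuration section, but check them against the configuration reference of your Airflow version. The host and sender address are placeholders:

```python
import configparser
import io

# Placeholder SMTP settings; host and sender address are illustrative.
smtp_settings = {
    "smtp_host": "smtp.example.com",
    "smtp_port": "587",
    "smtp_starttls": "True",
    "smtp_ssl": "False",
    "smtp_mail_from": "airflow@example.com",
}

config = configparser.ConfigParser()
config["smtp"] = smtp_settings

# Render the section as it would appear in airflow.cfg.
buffer = io.StringIO()
config.write(buffer)
smtp_section = buffer.getvalue()
print(smtp_section)

# The same options expressed as AIRFLOW__{SECTION}__{KEY} environment variables.
smtp_env = {
    f"AIRFLOW__SMTP__{key.upper()}": value for key, value in smtp_settings.items()
}
```

The environment variable form in `smtp_env` can be convenient in a Docker Compose setup, because it overrides `airflow.cfg` without editing the file.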
## Development
### Testing
```bash
# Create/activate conda environment
. ./activate.sh
# Run tests
pytest
```
### Database Migrations
Define SQLAlchemy models in `etl/db/models/` (inherit from `db.models.Base`).
```bash
# Generate migration script
./alembic.sh revision --autogenerate -m "Schema changes."
# Apply migrations
./alembic.sh upgrade head
```
## Coverage Reports
- [Codecov](https://app.codecov.io/gh/andgineer/airflow/tree/master/etl)
- [Coveralls](https://coveralls.io/github/andgineer/airflow)