https://github.com/chandulal/airflow-testing

Airflow Unit Tests and Integration Tests

# Airflow Testing
This project contains different categories of tests with examples.

## Five Categories of Tests
1. DAG Validation Tests: To test the validity of the DAG, checking for typos and cycles.
2. DAG/Pipeline Definition Tests: To test the total number of tasks in the DAG, the upstream and downstream dependencies of each task, etc.
3. Unit Tests: To test the logic of custom Operators, custom Sensors, etc.
4. Integration Tests: To test the communication between tasks. For example, task 1 passes some information to task 2 using XComs.
5. End-to-End Pipeline Tests: To test and verify the integration between each task. You can also assert on the data after the E2E pipeline completes successfully.
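
The first two categories can be illustrated without Airflow at all. The sketch below is a minimal, dependency-free illustration and not the code from this repo: it models a pipeline as a plain dict mapping each task to its upstream tasks (the task names are hypothetical) and uses Python's standard `graphlib` module (Python 3.9+) to check for cycles, then asserts the task count and dependencies the way a definition test would. In a real test suite these checks would typically run against the DAG objects loaded by Airflow.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on (its upstream tasks).
PIPELINE = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}


def has_cycle(upstream):
    """Return True if the dependency graph contains a cycle."""
    try:
        # static_order() raises CycleError when no valid ordering exists.
        list(TopologicalSorter(upstream).static_order())
    except CycleError:
        return True
    return False


def downstream_of(upstream, task):
    """Return the set of tasks that list `task` as a direct upstream dependency."""
    return {name for name, deps in upstream.items() if task in deps}


# Category 1: DAG validation (no cycles).
assert not has_cycle(PIPELINE)

# Category 2: DAG definition (task count and dependencies).
assert len(PIPELINE) == 3
assert PIPELINE["load"] == {"transform"}
assert downstream_of(PIPELINE, "extract") == {"transform"}
```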

Clone this repo to run these tests on your local machine.

## Unit Tests

Unit tests cover all tests that fall under the first four categories.
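
As an illustration of categories 3 and 4, the sketch below tests the logic of an operator-like class and the value it hands to the next task. `RowCountOperator` is a hypothetical stand-in written as a plain Python class so the example runs without Airflow installed; a real custom operator would subclass Airflow's `BaseOperator`. The task instance is replaced by a `unittest.mock.Mock`, so the test only checks that `execute` pushes the expected value via `xcom_push`.

```python
from unittest.mock import Mock


class RowCountOperator:
    """Hypothetical stand-in for a custom operator (not a real Airflow class)."""

    def __init__(self, rows):
        self.rows = rows

    def execute(self, context):
        # Count non-empty rows and share the result with downstream tasks.
        count = sum(1 for row in self.rows if row)
        context["ti"].xcom_push(key="row_count", value=count)
        return count


def test_execute_pushes_row_count():
    ti = Mock()  # stands in for the Airflow task instance
    operator = RowCountOperator(rows=["a", "", "b"])

    result = operator.execute({"ti": ti})

    assert result == 2
    ti.xcom_push.assert_called_once_with(key="row_count", value=2)


test_execute_pushes_row_count()
```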

#### How to run?
1. Build the Airflow image. Go to the project root directory and run:

```docker build . -t airflow-test```

2. Run the unit tests in the Docker container. Use the directory that contains your clone for **{SourceDir}** (e.g., if you cloned the repo at `/User/username/airflow-testing/`, then
SourceDir is `/User/username`).

```docker run -ti -v {SourceDir}/airflow-testing:/opt --entrypoint /mnt/entrypoint.sh airflow-test run_unit_tests```
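
Since **{SourceDir}** is the parent of the clone rather than the clone itself, it is easy to get wrong. The sketch below derives it and assembles the command above; `unit_test_command` is a hypothetical helper, not part of this repo, and it only builds the argument list so you can inspect it before running anything.

```python
from pathlib import PurePosixPath


def unit_test_command(clone_path):
    """Build the `docker run` argument list for a clone at `clone_path` (hypothetical helper)."""
    clone = PurePosixPath(clone_path)
    source_dir = clone.parent  # {SourceDir} is the directory that contains the clone
    return [
        "docker", "run", "-ti",
        "-v", f"{source_dir}/{clone.name}:/opt",
        "--entrypoint", "/mnt/entrypoint.sh",
        "airflow-test", "run_unit_tests",
    ]


print(" ".join(unit_test_command("/User/username/airflow-testing/")))
# prints: docker run -ti -v /User/username/airflow-testing:/opt --entrypoint /mnt/entrypoint.sh airflow-test run_unit_tests
```

To execute the command instead of printing it, the list can be passed to `subprocess.run(cmd, check=True)`.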

## End-to-End Tests

End-to-End tests cover all tests in category five. To run these tests,
we need to set up an Airflow environment in minikube. We also need to set up
all the components required by your DAGs.

#### Minikube set up

Prerequisite:

```
git clone https://github.com/chandulal/airflow-testing.git
# Run only if VirtualBox is not installed. Recent Homebrew versions use
# `brew install --cask` in place of `brew cask install`.
brew install --cask virtualbox
```

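Install minikube:

```
# Recent Homebrew versions ship minikube as a regular formula
# (the original command was `brew cask install minikube`).
brew install minikube
brew install kubernetes-cli
minikube start --cpus 4 --memory 8192
```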

#### Mount DAGs, Plugins, etc.

Mount all your DAGs, plugins, etc. into minikube:

```
minikube mount {project dir}/src/main/python/:/data
```


#### Deploy Airflow in minikube

Open a new terminal. Go to the project root directory and run:

```
kubectl apply -f airflow.kube.yaml
```

Wait 3-4 minutes for all Airflow components to start.

This sets up the following components:

* Postgres (To store the metadata of airflow)
* Redis (Broker for celery executors)
* Airflow Scheduler
* Celery Workers
* Airflow Web Server
* Flower

#### Access Airflow
Get the minikube IP by running the ```minikube ip``` command.

Use the minikube IP to access:

**Airflow UI:** {minikube-ip}:31317

**Flower:** {minikube-ip}:32081
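
Because the components take a few minutes to start, it can help to poll the two ports before opening the UI or starting tests. The sketch below is one way to do that with only the standard library; `service_urls` and `wait_for_port` are hypothetical helpers, the IP address in the usage comment is a placeholder for the output of `minikube ip`, and the plain `http://` scheme is an assumption.

```python
import socket
import time

# NodePorts taken from the text above.
AIRFLOW_UI_PORT = 31317
FLOWER_PORT = 32081


def service_urls(minikube_ip):
    """Return the Airflow UI and Flower URLs for a given minikube IP (http scheme assumed)."""
    return {
        "airflow_ui": f"http://{minikube_ip}:{AIRFLOW_UI_PORT}",
        "flower": f"http://{minikube_ip}:{FLOWER_PORT}",
    }


def wait_for_port(host, port, timeout=240.0, interval=5.0):
    """Poll until a TCP connection to host:port succeeds; return False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example usage (placeholder IP; substitute the output of `minikube ip`):
# if wait_for_port("192.168.99.100", AIRFLOW_UI_PORT):
#     print(service_urls("192.168.99.100")["airflow_ui"])
```

An open port only shows that something is listening; the web server may still need a few more seconds before it serves pages.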

#### How does Airflow work in minikube?

![minikube_airflow_architecture](https://github.com/chandulal/airflow-testing/blob/master/how_minikube_work.png)

#### How to run these tests?

1. Install all the components your DAGs need in minikube. The integration tests
in this repo require MySQL and Presto on minikube:

```
kubectl apply -f {SourceDir}/k8s/mysql/mysql.kube.yaml
kubectl apply -f {SourceDir}/k8s/presto/presto.kube.yaml
```

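2. Run the integration tests in the Docker container. As in the unit test step, **{SourceDir}** is the absolute path of the directory that contains your clone of this repository. Replace **{minikube-ip}** with the output of `minikube ip`.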

```docker run -ti -v {SourceDir}/airflow-testing:/opt --entrypoint /mnt/entrypoint.sh airflow-test run_integration_tests {minikube-ip}```