Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/hieuung/simple-data-pipeline
Simple data-pipeline, deployment using containerization with Kubernetes and Helm
airflow containerization etl-pipeline self-learning
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/hieuung/simple-data-pipeline
- Owner: hieuung
- Created: 2024-02-17T11:01:36.000Z (10 months ago)
- Default Branch: master
- Last Pushed: 2024-02-25T18:37:40.000Z (10 months ago)
- Last Synced: 2024-02-25T19:42:59.384Z (10 months ago)
- Topics: airflow, containerization, etl-pipeline, self-learning
- Language: Python
- Homepage:
- Size: 41 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Simple-data-pipeline
A simple data pipeline for learning ETL, deployed using containerization with Kubernetes and Helm.

## Tech stack
- Containerization (Docker, Kubernetes, minikube, Helm)
- Airflow
- SQL (Postgres)

## Descriptions
This pipeline extracts data from the production database of the [Simple web-app](https://github.com/hieuung/Simple-web-app) project, transforms it, and loads it into a data sink. It uses Python and SQL (Postgres), with Airflow for job scheduling.

## Deployment
- Install [Docker](https://docs.docker.com/engine/install/ubuntu/), [Kubernetes](https://kubernetes.io/docs/tasks/tools/), [minikube](https://minikube.sigs.k8s.io/docs/start/), and [Helm](https://helm.sh/docs/intro/install/).
- Clone and build the [Simple web-app](https://github.com/hieuung/Simple-web-app) project, following its instructions.
- Clone this project to your local machine.
- (Optional) Build your own Airflow image (with your own DAGs) using the provided Dockerfile.
- Deploy Airflow on minikube using the built Airflow image (currently [my Airflow image](https://hub.docker.com/repository/docker/hieuung/dags/general)):
```
# Replace <path_to_this_repo> with the path to your local clone
cd <path_to_this_repo>/hieu_airflow/deployment
helm repo add apache-airflow https://airflow.apache.org
helm upgrade --install airflow apache-airflow/airflow --namespace airflow --create-namespace
helm upgrade -f values.yaml airflow apache-airflow/airflow --namespace airflow
```

## Result
- Verify the deployment and service:
```
kubectl get deployment -n airflow
```

```
kubectl get service -n airflow
```

> **_NOTE:_** Use `minikube tunnel` if the LoadBalancer does not expose an External-IP.
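
The External-IP check from the note above can also be scripted. The following is a minimal sketch, not part of this repository: it assumes `kubectl` is on your `PATH` and reads the JSON form of `kubectl get service`. The function names and the sample service names (`example-web`, `example-db`) are hypothetical and only illustrate the shape of the data.

```python
import json
import subprocess


def pending_load_balancers(service_list):
    """Return names of LoadBalancer services that have no external address yet."""
    pending = []
    for item in service_list.get("items", []):
        if item.get("spec", {}).get("type") != "LoadBalancer":
            continue  # only LoadBalancer services get an External-IP
        ingress = item.get("status", {}).get("loadBalancer", {}).get("ingress")
        if not ingress:  # missing or empty: the External-IP is still pending
            pending.append(item["metadata"]["name"])
    return pending


def fetch_services(namespace="airflow"):
    """Run kubectl and parse its JSON output (requires a reachable cluster)."""
    result = subprocess.run(
        ["kubectl", "get", "service", "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


# Hypothetical, hand-written sample in the shape of a Kubernetes ServiceList.
sample_services = {
    "items": [
        {
            "metadata": {"name": "example-web"},
            "spec": {"type": "LoadBalancer"},
            "status": {"loadBalancer": {}},
        },
        {
            "metadata": {"name": "example-db"},
            "spec": {"type": "ClusterIP"},
            "status": {"loadBalancer": {}},
        },
    ]
}

print(pending_load_balancers(sample_services))  # prints ['example-web']
```

Against a running cluster, `pending_load_balancers(fetch_services())` would list the services in the `airflow` namespace that still need `minikube tunnel`.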