https://github.com/alexvanzyl/airflow-lab-bda
Hands-on lab demonstrating reading from multiple sources and writing to JSON file.
https://github.com/alexvanzyl/airflow-lab-bda
Last synced: 3 months ago
JSON representation
Hands-on lab demonstrating reading from multiple sources and writing to JSON file.
- Host: GitHub
- URL: https://github.com/alexvanzyl/airflow-lab-bda
- Owner: alexvanzyl
- Created: 2022-06-01T13:38:56.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2022-06-03T16:27:19.000Z (almost 3 years ago)
- Last Synced: 2024-12-29T10:13:02.367Z (5 months ago)
- Language: Python
- Size: 6.84 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Airflow Lab
Hands-on lab demonstrating reading from multiple sources and writing to JSON file.
## Prerequisites
- Docker and Docker Compose
- Python 3.8+## Getting started
For Linux users to ensure files created in `dags`, `logs` and `plugins` will be created with the right user permission execute the below command at the root of the project:```shell
echo -e "AIRFLOW_UID=$(id -u)" > .env
```
### 1. Spin up the containers
Docker compose V2:
```shell
docker compose up -d
```
for older versions:
```shell
docker-compose up -d
```### 2. Create a new database in Postgres
```shell
# Execute psql on the running postgres container.
# When prompt for password use "airflow" (without quotes)
docker compose exec postgres psql -U airflow -W# Once in the psql shell create the database
airflow=# CREATE DATABASE airflow_lab_db;# You can confirm the DB was created with the below command
airflow=#\lList of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
----------------+---------+----------+------------+------------+---------------------
airflow | airflow | UTF8 | en_US.utf8 | en_US.utf8 |
airflow_lab_db | airflow | UTF8 | en_US.utf8 | en_US.utf8 |# Finally quit out of the shell
airflow=#\q
```### 3. Create new Postgres connection
- From the Airflow UI [http://localhost:8080/](http://localhost:8080/) Goto `Admin > Connections`
- Click on the "+" (Add new record) button
- Fill in the connection details:
- Connection Id: `postgres_conn`
- Connection Type: `Postgres`
- Host: `postgres`
- Schema: `airflow_lab_db`
- Login: `airflow`
- Password: `airflow`
- Port: `5432`
- Extra: `{"cursor": "realdictcursor"}` Reference: [here](https://airflow.apache.org/docs/apache-airflow-providers-postgres/stable/_api/airflow/providers/postgres/hooks/postgres/index.html#airflow.providers.postgres.hooks.postgres.CursorType)
- Finally save the connection.### 4. Seed the database with users
- From the Airflow UI [http://localhost:8080/](http://localhost:8080/) Goto `DAGS`
- Run the `seed_postgres_db_dag` dag; this can be done directly from the table by pressing the play button under "Actions"
- Once done you can pause it if you want