{"id":26893544,"url":"https://github.com/gades-dataeng/mod2-airflow","last_synced_at":"2025-10-21T18:03:01.176Z","repository":{"id":282963435,"uuid":"931117600","full_name":"GADES-DATAENG/mod2-airflow","owner":"GADES-DATAENG","description":"This repository contains the practical code and examples for the second class of the Fundamentals of Data Engineering with Python and SQL course. The focus is on introducing Apache Airflow, a powerful workflow orchestration tool widely used in modern data pipelines.","archived":false,"fork":false,"pushed_at":"2025-02-11T19:48:40.000Z","size":6299,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-17T22:41:52.154Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/GADES-DATAENG.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-02-11T18:40:24.000Z","updated_at":"2025-02-21T10:33:07.000Z","dependencies_parsed_at":"2025-03-17T22:51:56.562Z","dependency_job_id":null,"html_url":"https://github.com/GADES-DATAENG/mod2-airflow","commit_stats":null,"previous_names":["gades-dataeng/mod2-airflow"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GADES-DATAENG%2Fmod2-airflow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GADES-DATAENG%2Fmod2-airflow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/
GADES-DATAENG%2Fmod2-airflow/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GADES-DATAENG%2Fmod2-airflow/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/GADES-DATAENG","download_url":"https://codeload.github.com/GADES-DATAENG/mod2-airflow/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246558113,"owners_count":20796696,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-03-31T23:58:18.149Z","updated_at":"2025-10-21T18:03:01.170Z","avatar_url":"https://github.com/GADES-DATAENG.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# mod2-airflow\nThis repository contains the practical code and examples for the second class of the Fundamentals of Data Engineering with Python and SQL course. The focus is on introducing Apache Airflow, a powerful workflow orchestration tool widely used in modern data pipelines.\n\n## Setup Instructions\n\n### Step 1: Clone the Repository\n\nIf you haven't already cloned the repository, you can do so by running the following command:\n\n```bash\ngit clone git@github.com:GADES-DATAENG/mod2-airflow.git\ncd mod2-airflow\n```\n\n### Step 2: Create your .env file\nBefore starting the services, you need to create a .env file with the required variables. 
Please check the .env.template file and use it as a template for your .env file.\n```bash\ncp .env.template .env\n```\n\n### Step 3: Get your GCP service account JSON credentials file\nAfter downloading your GCP service account JSON credentials file, paste it into the `keys` folder with the name `gcp-key.json`.\nIf you don't have (or need) a GCP account yet, you can create an empty file with the name `gcp-key.json` instead.\n\n### Step 4: Start the Services with Docker Compose\nOnce the image is built, you can start the services (Airflow and other dependencies) using Docker Compose. Run the following command:\n```bash\ndocker-compose up -d\n```\n\nThis command starts all the containers defined in the `docker-compose.yml` file. It sets up Airflow and any necessary services, including BigQuery integration.\n\n### Step 5: Access the Services\n- **Airflow Web UI**: You can access the Airflow web interface at http://localhost:8080\n    - Default login credentials:\n        - **Username**: `airflow`\n        - **Password**: `airflow`\n\n## Environment Setup\n- The service account key file (`gcp-key.json`) should be inside the `keys` folder.\n\nEnsure that the key file is placed correctly in the repository folder as:\n```bash\n/mod2-airflow/keys/gcp-key.json\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgades-dataeng%2Fmod2-airflow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgades-dataeng%2Fmod2-airflow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgades-dataeng%2Fmod2-airflow/lists"}