{"id":21900836,"url":"https://github.com/rifa8/capstone-project-with-dynamic-dag","last_synced_at":"2026-04-13T13:32:52.683Z","repository":{"id":216501323,"uuid":"740942312","full_name":"rifa8/capstone-project-with-dynamic-dag","owner":"rifa8","description":"The project focuses on creating an ELT pipeline to consolidate data from diverse resources into a single source of truth in BigQuery. The heart of this project is the innovative use of Apache Airflow to design a dynamic Directed Acyclic Graph (DAG) that automates task generation based on predefined file configurations.","archived":false,"fork":false,"pushed_at":"2024-01-22T14:42:38.000Z","size":5028,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-22T06:12:56.195Z","etag":null,"topics":["dynamic-dag","elt","visualization"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rifa8.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-01-09T11:42:13.000Z","updated_at":"2024-07-14T19:37:49.000Z","dependencies_parsed_at":"2024-01-22T16:54:02.828Z","dependency_job_id":null,"html_url":"https://github.com/rifa8/capstone-project-with-dynamic-dag","commit_stats":null,"previous_names":["rifa8/capstone-project-with-dynamic-dag"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/rifa8/capstone-project-with-dynamic-dag","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rifa8%2Fcapstone-project-with-dynamic-d
ag","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rifa8%2Fcapstone-project-with-dynamic-dag/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rifa8%2Fcapstone-project-with-dynamic-dag/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rifa8%2Fcapstone-project-with-dynamic-dag/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rifa8","download_url":"https://codeload.github.com/rifa8/capstone-project-with-dynamic-dag/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rifa8%2Fcapstone-project-with-dynamic-dag/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31754993,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-13T13:27:56.013Z","status":"ssl_error","status_checked_at":"2026-04-13T13:21:23.512Z","response_time":93,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dynamic-dag","elt","visualization"],"created_at":"2024-11-28T15:10:53.046Z","updated_at":"2026-04-13T13:32:52.628Z","avatar_url":"https://github.com/rifa8.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Capstone Project Brief Data Engineering Team D (\"The Future\")\n\n## Constraints\n\n- Separate data comes from 
multiple sources such as databases, CSV files, and JSON files.\n- Constraints for each problem are defined in the project description section.\n\n## About the Project\n\n### Background\nAn edu-tech platform called \"pinter-skuy\" provides online courses facilitated by professional mentors, and anyone can enroll in these courses. As the business gains momentum, management wants to monitor and evaluate its online courses.\n\nTherefore, the information currently stored in different sources is to be consolidated into a **_single source of truth_** for subsequent analysis.\n\n## Tools and Frameworks\n![Github Badge](https://img.shields.io/badge/Github-black?logo=github)\n![Docker Badge](https://img.shields.io/badge/Docker-2496ED?logo=docker\u0026logoColor=fff\u0026style=flat-square)\n![Cloud-Shell](https://img.shields.io/badge/Cloud-Shell-blue?logo=googlecloud)\n![Python](https://img.shields.io/badge/Python-white?logo=python)\n![Postgres](https://img.shields.io/badge/Postgres-blue?logo=postgresql\u0026logoColor=white)\n![Airflow](https://img.shields.io/badge/Airflow-green?logo=apacheairflow\u0026logoColor=white)\n![Google_BigQuery_Badge](https://img.shields.io/badge/BigQuery-white?logo=googlebigquery)\n\n## ERD\n![ERD](imgs/erd.png)\n\n## Project Flowchart\n![flowchart](imgs/flowchart.png)\n\n## Running the Project\n```\ngit clone https://github.com/rifa8/capstone-project-with-dynamic-dag\n```\n\n```\ndocker compose up -d\n```\n\nThen open `localhost:8080` to access Airflow.\n```\nUsername: airflow\nPassword: airflow\n```\n![airflow](imgs/airflow.png)\n\nNext, set up connections in Airflow. Go to `Admin \u003e\u003e Connections` in the Airflow UI, then add a connection. 
In this project, there are two connections: `to_bq` for connecting to BigQuery and `pg_conn` for connecting to the PostgreSQL database.\n![to_bq](imgs/conn.png)\n\n\n![pg_conn](imgs/pg_conn.png)\n\nThen run the DAGs.\n\nTip: Let the DAGs run on their schedule rather than triggering them manually. The `ExternalTaskSensor` looks for the upstream run by logical date (offset by `execution_delta`), so a manually triggered run usually has no matching upstream run to detect.\n\nFirst, activate the DAG `dag_etl_to_dwh` and wait until its run succeeds. Then activate the DAG `dag_etl_to_datamart`. Its `ExternalTaskSensor` passes automatically because the awaited task in `dag_etl_to_dwh` is already in the `success` state, which is what `allowed_states=['success']` in the `dag_etl_to_datamart` script requires.\n```\n# Imports this excerpt relies on (Airflow 2.x import path)\nfrom datetime import timedelta\n\nfrom airflow.sensors.external_task import ExternalTaskSensor\n\n# ext_task_depen supplies the upstream dag_id, task_id, and minutes_delta\ntask_wait_ext_task = ExternalTaskSensor(\n    task_id=f\"wait_{ext_task_depen['dag_id']}_{ext_task_depen['task_id']}\",\n    external_dag_id=ext_task_depen['dag_id'],\n    external_task_id=ext_task_depen['task_id'],\n    allowed_states=['success'],\n    execution_delta=timedelta(minutes=ext_task_depen['minutes_delta']),\n)\n```\n\n![dwh](imgs/dag-dwh.png)\n\n\n![datamart](imgs/dag-datamart.png)\n\nFinally, check the tables in BigQuery.\n\nDataset `dwh`\n![dwh](imgs/dwh.png)\n\nDataset `datamart`\n![datamart](imgs/datamart.png)\n\nThe visualization results are available as a [dashboard PDF](visualization/Project-Team-D.pdf) and on [Looker Studio](https://lookerstudio.google.com/reporting/7b4f543d-1a82-4d73-a124-0cfb29a7e5a9/page/h81mD).\n\n## Authors\n- Yoga Martafian [![Github Badge](https://img.shields.io/badge/Github-black?logo=github)](https://github.com/artasaurrus)\n\n- Karno [![Github Badge](https://img.shields.io/badge/Github-black?logo=github)](https://github.com/Karnopunta)\n\n- Muhammad Rifa [![Github 
Badge](https://img.shields.io/badge/Github-black?logo=github)](https://github.com/rifa8)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frifa8%2Fcapstone-project-with-dynamic-dag","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frifa8%2Fcapstone-project-with-dynamic-dag","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frifa8%2Fcapstone-project-with-dynamic-dag/lists"}