{"id":18709153,"url":"https://github.com/jojoee/why-airflow","last_synced_at":"2025-06-12T09:37:18.689Z","repository":{"id":91701067,"uuid":"319184631","full_name":"jojoee/why-airflow","owner":"jojoee","description":"Apache Airflow - introduction and setup in Google Cloud Platform (GCP)","archived":false,"fork":false,"pushed_at":"2020-12-07T08:10:04.000Z","size":1350,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-05-19T08:11:37.134Z","etag":null,"topics":["airflow","installation","introduction","jupyter-notebook"],"latest_commit_sha":null,"homepage":"https://www.youtube.com/watch?v=Mf60CKCFOAU","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jojoee.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-12-07T02:42:10.000Z","updated_at":"2023-08-03T06:38:30.000Z","dependencies_parsed_at":null,"dependency_job_id":"261a4fce-3850-43bf-9b9f-9aeb3f9b3114","html_url":"https://github.com/jojoee/why-airflow","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/jojoee/why-airflow","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jojoee%2Fwhy-airflow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jojoee%2Fwhy-airflow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jojoee%2Fwhy-airflow/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jojoee%2Fwhy-airflow/ma
nifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jojoee","download_url":"https://codeload.github.com/jojoee/why-airflow/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jojoee%2Fwhy-airflow/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259439434,"owners_count":22857728,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["airflow","installation","introduction","jupyter-notebook"],"created_at":"2024-11-07T12:26:31.637Z","updated_at":"2025-06-12T09:37:18.665Z","avatar_url":"https://github.com/jojoee.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Why Apache Airflow\n\n[![Demo](https://raw.githack.com/jojoee/why-airflow/master/asset/demo.png)](https://github.com/jojoee/why-airflow/blob/master/why-airflow.ipynb)\n\n## Member\n- Nathachai Thongniran, ณัฐชัย ทองนิรันดร์, 6071009021\n- Sirichart Gobpradit, สิริชาติ กอบประดิษฐ์, 6170970021\n- Sittichai Tirasaengaroon, สิทธิชัย ติรแสงอรุณ, 6071033021\n\n## Content\n1. Study\n2. Install the tools in the Big Data ecosystem\n3. How to use\n4. Shared VM images\n5. Link to access your VMs for demo\n6. Video presentation: 12 mins\n7. 
Report\n    - Installation manual (to prevent copying VM from the class)\n    - Architecture diagram \u0026 Data flow diagram\n    - Explanation of your solution to the objective of your project\n    - Result (if any)\n    - Demo (screen capture with explanation)\n\n## CMD\n```bash\n# these commands use the Airflow 1.x CLI syntax\nssh admin@mongo\nssh admin@airflow-project\nairflow version\n# check whether anything is listening on port 8080 (Airflow's default webserver port)\nlsof -n -i:8080 | grep LISTEN\n\n# print the list of active DAGs\nairflow list_dags\n# print the list of tasks in the \"my_simple_dag_direction2\" DAG\nairflow list_tasks my_simple_dag_direction2\n# print the hierarchy of tasks in the \"my_simple_dag_direction2\" DAG\nairflow list_tasks my_simple_dag_direction2 --tree\n# run a single task for a given date without recording its state in the database\n# usage: airflow test \u003cdag_id\u003e \u003ctask_id\u003e \u003cdate\u003e\nairflow test my_tutorial templated 2015-06-01\nairflow test airflow_tutorial_v01 print_world 2017-07-01\nairflow test airflow_tutorial_v02 greet 2017-07-01\nairflow test airflow_tutorial_v02 greet 2017-07-04\nairflow test my_cryptocurrency save_data 2017-07-04\n# run the DAG for every schedule interval between the start (-s) and end (-e) dates\nairflow backfill my_tutorial -s 2015-06-01 -e 2015-06-07\nairflow test my_cryptocurrency save_data 2015-06-01\n```\n\n## Reference\n### Official document\n- https://airflow.apache.org/\n- https://airflow.apache.org/docs/apache-airflow/stable/faq.html\n- https://airflow.apache.org/docs/apache-airflow/stable/start.html\n\n### Comparison\n- http://bytepawn.com/luigi-airflow-pinball.html\n- https://www.quora.com/Which-is-a-better-data-pipeline-scheduling-platform-Airflow-or-Luigi\n\n### Others / Tutorial\n- https://github.com/hgrif/airflow-tutorial\n- https://gtoonstra.github.io/etl-with-airflow/\n- https://airflow.incubator.apache.org/tutorial.html\n- https://medium.com/@mozesr/basic-airflow-73361b62814f\n- https://hackernoon.com/airflow-the-missing-context-1a04b3a9475c\n- https://cloud.google.com/composer/docs/how-to/using/writing-dags\n- https://cwiki.apache.org/confluence/display/AIRFLOW/Common+Pitfalls\n- https://robinhood.engineering/why-robinhood-uses-airflow-aed13a9a90c8\n- 
https://engineering.pandora.com/apache-airflow-at-pandora-1d7a844d68ee\n- https://medium.com/handy-tech/airflow-tips-tricks-and-pitfalls-9ba53fba14eb\n- https://guptakumartanuj.wordpress.com/2018/01/08/airflow-installation-on-airflow/\n- https://medium.com/airbnb-engineering/airflow-a-workflow-management-platform-46318b977fd8\n- https://medium.com/wbaa/datas-inferno-7-circles-of-data-testing-hell-with-airflow-cef4adff58d8\n- https://towardsdatascience.com/getting-started-with-apache-airflow-df1aa77d7b1b\n- http://michal.karzynski.pl/blog/2017/03/19/developing-workflows-with-apache-airflow/\n- https://medium.com/@dustinstansbury/how-quizlet-uses-apache-airflow-in-practice-a903cbb5626d\n- https://medium.com/@dustinstansbury/understanding-apache-airflows-key-concepts-a96efed52b1a\n- https://medium.com/bluecore-engineering/were-all-using-airflow-wrong-and-how-to-fix-it-a56f14cb0753\n- https://towardsdatascience.com/why-quizlet-chose-apache-airflow-for-executing-data-workflows-3f97d40e9571\n- https://medium.com/@dustinstansbury/beyond-cron-an-introduction-to-workflow-management-systems-19987afcdb5e\n- https://blog.insightdatascience.com/airflow-101-start-automating-your-batch-workflows-with-ease-8e7d35387f94\n- https://blog.usejournal.com/testing-in-airflow-part-1-dag-validation-tests-dag-definition-tests-and-unit-tests-2aa94970570c\n- https://medium.com/inside-socialcops/how-to-create-a-workflow-in-apache-airflow-to-track-disease-outbreaks-in-india-8a7abb13bd41\n\n### Example code\n- https://github.com/kadnan/airflow-scraping\n- https://github.com/gtoonstra/etl-with-airflow\n- https://github.com/GoogleCloudPlatform/airflow-operator\n\n### Misc\n- https://github.com/puckel/docker-airflow\n\n### Others\n- Hooks 
concept\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjojoee%2Fwhy-airflow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjojoee%2Fwhy-airflow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjojoee%2Fwhy-airflow/lists"}