{"id":39446538,"url":"https://github.com/anilkulkarni87/airflow-docker","last_synced_at":"2026-01-18T04:25:28.386Z","repository":{"id":131014238,"uuid":"336749448","full_name":"anilkulkarni87/airflow-docker","owner":"anilkulkarni87","description":"This is my Apache Airflow Local development setup on Windows 10 WSL2/Mac using docker-compose. It will also include some sample DAGs and workflows.","archived":false,"fork":false,"pushed_at":"2024-02-09T18:26:57.000Z","size":109,"stargazers_count":22,"open_issues_count":0,"forks_count":10,"subscribers_count":4,"default_branch":"main","last_synced_at":"2024-05-07T18:26:55.342Z","etag":null,"topics":["airflow","airflow-community","airflow-dags","airflow-docker","airflow-testing","apache-superset","data-engineering","docker","python","sample-dags","sql","superset","treasuredata","workflows","wsl2","yaml"],"latest_commit_sha":null,"homepage":"https://anilkulkarni87.github.io/airflow-docker/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/anilkulkarni87.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2021-02-07T09:32:24.000Z","updated_at":"2024-05-06T12:30:21.000Z","dependencies_parsed_at":"2023-04-19T13:32:12.654Z","dependency_job_id":null,"html_url":"https://github.com/anilkulkarni87/airflow-docker","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/anilkulkarni87/airflow-docker","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anilkulkarni87%2Fairflow-docker","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anilkulkarni87%2Fairfl
ow-docker/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anilkulkarni87%2Fairflow-docker/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anilkulkarni87%2Fairflow-docker/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/anilkulkarni87","download_url":"https://codeload.github.com/anilkulkarni87/airflow-docker/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anilkulkarni87%2Fairflow-docker/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28529529,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-18T00:39:45.795Z","status":"online","status_checked_at":"2026-01-18T02:00:07.578Z","response_time":98,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["airflow","airflow-community","airflow-dags","airflow-docker","airflow-testing","apache-superset","data-engineering","docker","python","sample-dags","sql","superset","treasuredata","workflows","wsl2","yaml"],"created_at":"2026-01-18T04:25:27.564Z","updated_at":"2026-01-18T04:25:28.379Z","avatar_url":"https://github.com/anilkulkarni87.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\r\n  \u003ca href=\"\" rel=\"noopener\"\u003e\r\n \u003cimg width=100px height=100px 
src=\"https://cwiki.apache.org/confluence/download/attachments/145723561/airflow_white_bg.png?api=v2\" alt=\"Project logo\"\u003e\u003c/a\u003e\r\n\u003c/p\u003e\r\n\r\n\u003ch3 align=\"center\"\u003eAirflow Made Easy | Local Setup Using Docker\u003c/h3\u003e\r\n\r\n\r\n\r\n[![Execute Airflow Unit Tests](https://github.com/anilkulkarni87/airflow-docker/actions/workflows/main.yml/badge.svg)](https://github.com/anilkulkarni87/airflow-docker/actions/workflows/main.yml)\r\n\r\n[![Deploy GitHub Pages](https://github.com/anilkulkarni87/airflow-docker/actions/workflows/jekyll-gh-pages.yml/badge.svg)](https://github.com/anilkulkarni87/airflow-docker/actions/workflows/jekyll-gh-pages.yml)\r\n\r\n\u003cp align=\"center\"\u003e This is my Apache Airflow Local development setup using docker-compose. It will also include some sample DAGs and workflows.\r\n    \u003cbr\u003e \r\n\u003c/p\u003e\r\n\r\n#### Recent Updates:\r\n03-Dec-2023\r\n- Upgrade to airflow 2.7.3\r\n- Upgraded superset to add secret key\r\n- Added superset database connection image\r\n- Works on M1 Mac\r\n\r\n03-May-2022\r\n- Added Dockerfile to extend airflow image\r\n- Adding additional Pypi package (td-client)\r\n- Upgrade to Airflow 2.3.0\r\n\r\n29-Jun-2021\r\n- Updated image to Airflow 2.1.1\r\n- Leveraging _PIP_ADDITIONAL_REQUIREMENTS to install additional dependencies\r\n- Developing and testing operators for Treasure Data\r\n- Read more at [Treasure Data](./TreasureData.md)\r\n\r\n## 📝 Table of Contents\r\n\r\n- [About](#about)\r\n- [Data Engineering Projects](#projects)\r\n- [Data Visualization](#superset)\r\n- [Getting Started](#getting_started)\r\n- [Usage](#usage)\r\n- [Running the tests](#tests)\r\n- [Github Workflow](#githubworkflow)\r\n- [Built Using](#built_using)\r\n- [Authors](#authors)\r\n- [Acknowledgments](#acknowledgement)\r\n- [Cleanup](#cleanup)\r\n\r\n## 🧐 About \u003ca name = \"about\"\u003e\u003c/a\u003e\r\n\r\nSetup Apache Airflow 2.0 locally on Windows 10 (WSL2) via Docker Compose. 
The original docker-compose.yaml file was taken from the official [GitHub](https://github.com/apache/airflow/blob/master/docs/apache-airflow/start/docker-compose.yaml) repo. \r\n\r\nIt contains service definitions for:\r\n- airflow-scheduler\r\n- airflow-webserver\r\n- airflow-worker\r\n- airflow-init - initializes the database and creates the user\r\n- flower\r\n- redis\r\n- postgres - the backend for Airflow. I also create an additional database, `userdata`, as a backend for my data flow. This is not recommended; ideally, Airflow and your data live in separate databases.\r\n\r\nI have added an extra command to the docker-compose file that adds an Airflow database connection.\r\n\r\nDirectories I am mounting:\r\n- ./dags\r\n- ./logs\r\n- ./plugins\r\n- ./sql - for SQL files. We can leverage Jinja templating in our queries. Refer to the sample DAG.\r\n- ./test - unit tests for the Airflow DAGs.\r\n- ./pg-init-scripts - scripts that create the additional database in Postgres.\r\n\r\n## Data Engineering Projects \u003ca name = \"projects\"\u003e\u003c/a\u003e\r\nHere you will find some personal projects that I have worked on. They shed light on the Airflow features I have used and on lessons learned with other technologies. \r\n- Project 1 -\u003e [Get Covid testing data](./COVID_NY.md)\r\n\r\n## Data Visualization \u003ca name = \"superset\"\u003e\u003c/a\u003e\r\nA setup for experimenting with Apache Superset. Read more [here](./SUPERSET.md).\r\n\r\n## 🏁 Getting Started \u003ca name = \"getting_started\"\u003e\u003c/a\u003e\r\n\r\nThese instructions will get you a copy of the project up and running on your local machine for development and testing purposes. 
\r\n\r\nClone this repo to your machine, then run:\r\n\r\n```\r\ndocker-compose -f docker-compose.yaml up airflow-init\r\ndocker-compose -f docker-compose.yaml up\r\n```\r\n\r\n### Prerequisites\r\n\r\nWhat you need to install before running the project.\r\n\r\nYou should have [Docker](https://docs.docker.com/engine/installation/) and [Docker Compose](https://docs.docker.com/compose/install/) v1.27.0 or later installed on your machine.\r\n\r\n- Install and configure [WSL2](https://docs.microsoft.com/en-us/windows/wsl/tutorials/wsl-containers)\r\n- I also had to reset my Ubuntu installation, and that's when it asked me to create a user. \r\n\r\n### Installing\r\n\r\nA step-by-step series of commands that gets a development environment running.\r\n\r\nClone the repo:\r\n\r\n```\r\ngit clone\r\n```\r\n\r\nStart the Docker build:\r\n\r\n```\r\n# To extend the airflow image\r\ndocker-compose build\r\n\r\ndocker-compose -f docker-compose.yaml up airflow-init\r\n\r\ndocker-compose -f docker-compose.yaml up\r\n```\r\n\r\nKeep checking the Docker processes until all containers are healthy:\r\n\r\n```\r\ndocker ps\r\n```\r\n\r\nOnce all containers are healthy, add a connection to Postgres from the command line and then access the Airflow UI:\r\n\r\n```\r\ndocker exec -it airflow-docker_airflow-worker airflow connections add 'postgres_new' --conn-uri 'postgres://airflow:airflow@postgres:5432/airflow'\r\n```\r\n\r\n```\r\nhttp://localhost:8080\r\n```\r\n\r\n\r\nEnd with an example of getting some data out of the system or using it for a little demo.\r\n\r\n## 🔧 Running the tests \u003ca name = \"tests\"\u003e\u003c/a\u003e\r\n\r\nUnit tests for the Airflow DAGs are defined in the `test` folder. 
This folder is also mapped to the Docker containers in the docker-compose.yaml file.\r\nFollow the steps below to run the unit tests once the Docker containers are running:\r\n```\r\n./airflow bash\r\npython -m unittest discover -v\r\n```\r\n\r\n### Github Workflow for running tests \u003ca name=\"githubworkflow\"\u003e\u003c/a\u003e\r\nI had to create a separate docker-compose file to run the unit tests whenever I push code to master. \r\nPlease refer to:\r\n- [Docker Compose for github workflow](./docker-compose-githubworkflow.yaml)\r\n- [Workflow Yaml file](./.github/workflows/main.yml)\r\n\r\n\r\n### Break down into end to end tests\r\n\r\nAnother #TODO\r\n\r\n## 🎈 Usage \u003ca name=\"usage\"\u003e\u003c/a\u003e\r\n\r\nYou can now create new DAGs, place them on your local system, and see them appear in the web UI. Refer to the sample DAG in the repo. \r\n\r\n  ### ~~Important~~ : \r\n  ~~Edit the postgres_default connection from the UI or through the command line if you want to persist data in Postgres as part of the DAGs you create. Even better, you can always add a new connection.~~\r\n\r\n    Update: This is now taken care of in the updated Docker Compose file. 
The connection and the new database are created.\r\n\r\n  ```\r\n  ./airflow.sh bash\r\n\r\n  airflow connections add 'postgres_new' --conn-uri 'postgres://airflow:airflow@postgres:5432/airflow'\r\n  ```\r\n\r\n  Connect to Postgres and create a new database named `userdata`:\r\n\r\n  ```\r\n  docker exec -it airflowdocker_postgres_1 /bin/bash\r\n  psql -U airflow\r\n  create database userdata;\r\n  ```\r\n\r\n  Turn on the DAG: PostgreOperatorTest_Dag\r\n\r\n## ⛏️ Built Using \u003ca name = \"built_using\"\u003e\u003c/a\u003e\r\n\r\n- [Postgres](https://www.postgresql.org/) - Database\r\n- [Redis](https://redis.io/) \r\n- [Apache Airflow](https://airflow.apache.org/) \r\n- [Docker](https://www.docker.com/) - Build tool\r\n- [Apache Superset](https://superset.apache.org/) - For data visualization\r\n\r\n## ✍️ Authors \u003ca name = \"authors\"\u003e\u003c/a\u003e\r\n\r\n- The Airflow community\r\n- [@anilkulkarni87](https://github.com/anilkulkarni87) \r\n\r\n## 🎉 Acknowledgements \u003ca name = \"acknowledgement\"\u003e\u003c/a\u003e\r\n\r\n- [Apache Airflow](https://github.com/apache/airflow/blob/master/docs/apache-airflow/start/docker-compose.yaml)\r\n- Inspired by the Airflow community\r\n\r\n## Cleanup \u003ca name = \"cleanup\"\u003e\u003c/a\u003e\r\n\r\n```\r\ndocker-compose down --volumes --rmi all\r\n```\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fanilkulkarni87%2Fairflow-docker","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fanilkulkarni87%2Fairflow-docker","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fanilkulkarni87%2Fairflow-docker/lists"}