{"id":15135993,"url":"https://github.com/ninja-van/airflow-boilerplate","last_synced_at":"2025-07-12T13:06:55.079Z","repository":{"id":54122482,"uuid":"263588436","full_name":"ninja-van/airflow-boilerplate","owner":"ninja-van","description":"A complete development environment setup for working with Airflow","archived":false,"fork":false,"pushed_at":"2023-02-15T23:44:42.000Z","size":1009,"stargazers_count":128,"open_issues_count":3,"forks_count":54,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-04-06T16:39:38.024Z","etag":null,"topics":["airflow","apache-airflow","boilerplate","pycharm","python"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ninja-van.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-05-13T09:45:10.000Z","updated_at":"2025-03-27T00:40:22.000Z","dependencies_parsed_at":"2024-09-21T10:10:36.868Z","dependency_job_id":null,"html_url":"https://github.com/ninja-van/airflow-boilerplate","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ninja-van/airflow-boilerplate","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ninja-van%2Fairflow-boilerplate","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ninja-van%2Fairflow-boilerplate/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ninja-van%2Fairflow-boilerplate/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositorie
s/ninja-van%2Fairflow-boilerplate/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ninja-van","download_url":"https://codeload.github.com/ninja-van/airflow-boilerplate/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ninja-van%2Fairflow-boilerplate/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":264995367,"owners_count":23694947,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["airflow","apache-airflow","boilerplate","pycharm","python"],"created_at":"2024-09-26T06:03:09.897Z","updated_at":"2025-07-12T13:06:55.060Z","avatar_url":"https://github.com/ninja-van.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Airflow Boilerplate\nA complete development environment setup for working with Airflow, based on [this Medium article](https://medium.com/ninjavan-tech/setting-up-a-complete-local-development-environment-for-airflow-docker-pycharm-and-tests-3577ddb4ca94).\nIf you are interested in learning about the thoughts and processes behind this setup, do read the article. \nOtherwise, if you want to get hands-on immediately, you can skip it and just follow the instructions below \nto get started.\n\n![The overall setup diagram](/images/setup_diagram.png)\n\nThis boilerplate has more tools than was discussed in the article. 
In particular, it has the following things\nthat were not discussed in the article:\n- A sample DAG\n- A sample plugin\n- A sample test for the plugin\n- A sample helper method, `dags/common/stringcase.py`, accessible in both `dags/` and `plugins/`\n- A sample test for the helper method\n- A `spark-conf/` directory that is included in the Docker build step; you can explore this on your own\n- A `.pre-commit-config.yaml`\n\n# Getting Started\n\nInstall `docker` and `docker-compose` by following the official guides:\n- https://docs.docker.com/install/\n- https://docs.docker.com/compose/install/\n\nClone this repo and `cd` into it:\n```\ngit clone https://github.com/ninja-van/airflow-boilerplate.git \u0026\u0026 cd airflow-boilerplate\n```\n\nCreate a virtualenv for this project. Feel free to choose your preferred way of managing Python virtual \nenvironments. I usually do it this way:\n```\npip install virtualenv\nvirtualenv .venv\n```\n\nActivate the virtual environment:\n```\nsource .venv/bin/activate\n```\n\nInstall the requirements:\n```\npip install -r requirements-airflow.txt\npip install -r requirements-dev.txt\n```\n\nInstall the pre-commit hook:\n```\npre-commit install\n```\nThis ensures that, on each commit, any changed files go through the linter and formatter. On top of that,\nthe tests are run, too, to make sure that nothing is broken.\n\n# Setting up the Docker environment\n\nIf you only want the DB to be up because you will mostly work using PyCharm:\n```\ndocker-compose -f docker/docker-compose.yml up -d airflow_initdb\n```\n\nIf you want the whole suite of Airflow components to be up and running:\n```\ndocker-compose -f docker/docker-compose.yml up -d\n```\nThis brings up the Airflow `postgres` metadatabase, `scheduler`, and `webserver`.\n\nTo access the `webserver`, once the Docker container is up and healthy, go to `localhost:8080`. You can start\nplaying around with the sample DAGs. 
\n\n# Setting up PyCharm\n\nEnsure that your Project Interpreter is pointing to the correct virtual environment.\n\n![Ensure that your Project Interpreter is pointing to the correct virtual environment](/images/python_interpreter.png)\n\nMark both `dags/` and `plugins/` as source.\n\n![Mark dags and plugins directories as \"Sources Root\"](/images/mark_as_source.png)\n\nRun `source env.sh` on the terminal and copy the environment variables.\n\n![Run env.sh and copy the env vars](/images/run_env_sh.png)\n\nAdd a new Run/Debug Configuration with the following parameters:  \n- Name: `\u003cwhatever_you_want\u003e`   \n- Script path: `\u003cpath_to_your_virtualenv_airflow_executable\u003e`\n- Parameters: `test \u003cdag_id\u003e \u003ctask_id\u003e \u003cexecution_date\u003e` \n- Environment variables: `paste your env vars here`\n\n![Run/debug configurations](/images/run_debug_config.png)\n\nAdd those environment variables to your test configuration (pytest in my case), so that you can just hit \nthe run/debug button next to your test functions.\n\n![Run/debug configurations](/images/pytest_template.png)\n\n# Generating a new fernet key\nIncluded in this boilerplate is a pre-generated fernet key. There should not be any security concern here\nbecause after all you are meant to run this environment only locally. If you wish to have a new fernet key,\nyou can follow these steps below.\n\nGenerate a fernet key:\n```\npython -c \"from cryptography.fernet import Fernet; FERNET_KEY = Fernet.generate_key().decode(); print(FERNET_KEY)\"\n```\n\nCopy that fernet key to clipboard.\nIn `env.sh`, paste it here:\n```\nexport AIRFLOW__CORE__FERNET_KEY=\u003cYOUR_FERNET_KEY_HERE\u003e\n```\n\nIn `airflow.cfg`, paste it here:\n```\nfernet_key = \u003cYOUR_FERNET_KEY_HERE\u003e\n```\n\n# Caveats\n- The PyPi packages are installed during build time instead of run time, to minimise the start-up time of our \ndevelopment environment. 
As a side-effect, if there are any new PyPi packages, the images need to be rebuilt. \nYou can do so by passing the extra `--build` flag:\n  ```\n  docker-compose -f docker/docker-compose.yml up -d --build\n  ```\n- PyCharm cannot recognise custom plugins registered dynamically by Airflow, because the IDE does static analysis \nand the custom plugins are registered dynamically at runtime.\n\n![PyCharm failing to recognise custom plugin](/images/custom_plugin_not_recognised.png) \n\n- Not related to the build environment, but rather how Airflow works - some of the configs (like `rbac = True`) \nyou change in `airflow.cfg` might not be reflected immediately at runtime, because they are static \nconfigurations and are only evaluated once at startup. To solve that problem, just restart your `webserver`:\n  ```\n  docker-compose -f docker/docker-compose.yml restart airflow_webserver\n  ```\n- Not related to the build environment, but rather how Airflow works - you cannot have a\npackage/module with the same name in both `dags/` and `plugins/`. This will likely give you a `ModuleNotFoundError`.\n\n# Concluding tips\n- If you are only interested in using your IDE, and you do not need the Airflow `scheduler` or `webserver`, run:\n  ```\n  docker-compose -f docker/docker-compose.yml up -d airflow_initdb\n  ```\n\n- To remove the examples from the Webserver, change the following line in the `airflow.cfg`:\n  ```\n  load_examples = False\n  ```\n  Notice that `docker-compose` immediately picks up the changes in `airflow.cfg`.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fninja-van%2Fairflow-boilerplate","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fninja-van%2Fairflow-boilerplate","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fninja-van%2Fairflow-boilerplate/lists"}