{"id":27484545,"url":"https://github.com/praisetompane/app_etl","last_synced_at":"2025-04-16T16:44:18.175Z","repository":{"id":274220208,"uuid":"922266528","full_name":"praisetompane/app_etl","owner":"praisetompane","description":"A toy API driven ETL application to experiment with the Flask(with gunicorn), SQLAlchemy, Alembic and Postgres.","archived":false,"fork":false,"pushed_at":"2025-04-12T14:41:45.000Z","size":609,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-04-12T14:44:17.930Z","etag":null,"topics":["alembic","docker","flask","python","railway","sql","sqlalchemy","worldhealthorg"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/praisetompane.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-01-25T18:50:04.000Z","updated_at":"2025-04-12T14:42:03.000Z","dependencies_parsed_at":null,"dependency_job_id":"2d6979ca-a575-4bf1-824c-2cd75e37b05d","html_url":"https://github.com/praisetompane/app_etl","commit_stats":null,"previous_names":["praisetompane-toy-applications/app_etl","praisetompane/app_etl"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/praisetompane%2Fapp_etl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/praisetompane%2Fapp_etl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/praisetompane%2Fapp_etl/releases","manifests_ur
l":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/praisetompane%2Fapp_etl/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/praisetompane","download_url":"https://codeload.github.com/praisetompane/app_etl/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":249259170,"owners_count":21239422,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["alembic","docker","flask","python","railway","sql","sqlalchemy","worldhealthorg"],"created_at":"2025-04-16T16:44:17.492Z","updated_at":"2025-04-16T16:44:18.165Z","avatar_url":"https://github.com/praisetompane.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# app_etl\n![build status](https://github.com/praisetompane/app_etl/actions/workflows/app_etl.yaml/badge.svg)\n\n## Objectives\n- An API-driven ETL application to experiment with Flask (with gunicorn), SQLAlchemy, Alembic and Postgres.\n- Extract data from the World Health Organization.\n    - Supported Datasets\n        - Malaria Annual Confirmed Cases\n        - ...\n## Database\n- Structure after first run \u003cbr\u003e\n\n    ![](docs/app_etl_erd.png)\n\n- Connect to database\n    - The database is accessible on `localhost`, using the port and credentials specified in [env](.env).\n\n## Project Structure\n- docs: Project documentation lives in here.\n- src: production code lives in this folder and is divided into the modules below:\n    - app_etl: project package\n        - api:\n            - the API to the application lives in this module.\n            - 
the current implementation is a REST API, but a gRPC API, a CLI, etc. would also be implemented in here.\n        - config:\n            - configurable values live in here.\n            - these are values such as Hand Ranks, Card Ranks.\n                - as the system scales, you could migrate these into a database to allow changing\n                config independently, without restarting the application.\n        - core:\n            - the domain logic of the application lives in this module.\n        - gateway:\n            - all external interaction objects (e.g. files, external APIs, etc.) live in this module.\n        - model:\n            - The domain models for Poker live in this module.\n        - repository:\n            - Data interaction (persistence and access) concerns live in this module.\n        - app.py:\n            entry point to start up the application\n- tests: test code lives in this folder.\n    the tests are intentionally separated from production code.\n    - benefits:\n        - tests can run against an installed version after executing `pip install .`.\n        - tests can run against the local copy with an editable install after executing `pip install --editable .`.\n        - when using Docker, the entire app_etl folder can be copied without needing to exclude tests, which we don't release to PROD.\n    - more in-depth discussion here: https://docs.pytest.org/en/latest/explanation/goodpractices.html#choosing-a-test-layout-import-rules\n\n- utilities: any useful scripts, such as curl \u0026 Postman requests, JSON payloads, software installations, etc.\n\n## Dependencies\n- [Docker](https://docs.docker.com/get-started/)\n\n## Setup Instructions\n- The repository is configured to use [devcontainers](https://containers.dev) for development.\n    - [Developing inside a Container](https://code.visualstudio.com/docs/devcontainers/containers)\n\n## Run Program\n- The system starts up automatically when the project is loaded into an editor that supports 
devcontainers.\n    - If you would like to run the prod image, change `dockerfile: Dockerfile.dev` to `dockerfile: Dockerfile` in [docker-compose](docker-compose.debug.yml).\n- Run an ETL\n    ```shell\n    # specifically imports malaria_annual_confirmed_cases\n    ./utilities/curl/malaria/malaria_annual_confirmed_cases.sh\n    ```\n- Debugging\n    - Run in debug mode and debug with VSCode:\n        - Open the \"Run and Debug\" view.\n        - Click the green play button.\u003cbr\u003e  \n            ![start system output](./docs/vscode_debugging.png)\u003cbr\u003e\n        - Allow debugging without frozen modules by clicking \"Debug Anyway\" once the debugger is installed and ready.\n            ![bypass frozen modules](./docs/vscode_debugging_frozen.png)\n        - The server will report the host and port in the terminal output at the bottom.\u003cbr\u003e\n        - From here you can debug as normal (i.e. add breakpoints, step into code definitions, evaluate code snippets, etc.) \u003cbr\u003e\n\n    - If you would like to debug the prod image, change `dockerfile: Dockerfile.dev` to `dockerfile: Dockerfile` in [docker-compose.debug](docker-compose.debug.yml).\n\n## Testing\n- Run unit and integration tests\n    ```shell\n    pytest\n    ```\n- Run End to End tests\n    - Not Implemented\n\n## Database State Management\n\n- The database state (i.e. 
tables, stored procedures, indexes, etc.) is managed using [Alembic](https://alembic.sqlalchemy.org/en/latest/).\n    - Migrations location: src/app_etl/migrations\n    - Migrations naming scheme: YYYY_MM_DD_HHMM-rev_name\n        - uses Alembic's full revision scheme defined in alembic.ini\n        - example: `2025_02_08_0825-98af2865f6fc_create_schema_etl`\n    - Current database state can be queried with `SELECT * FROM public.alembic_version;`\n- To upgrade the database to the latest migration:\n    ```shell\n    alembic upgrade head\n    ```\n- To downgrade the database to the base state:\n    ```shell\n    alembic downgrade base\n    ```\n\n## Git Conventions\n- **NB:** The `main` branch is locked and all changes must come through a Pull Request.\n- Commit Messages:\n    - Provide concise commit messages that describe what you have done.\n        ```shell\n        # example:\n        git commit -m \"feat(core): algorithm\" -m \"implement my new shiny faster algorithm\"\n        ```\n    - screenshot of GitHub view\n    - references:\n        - https://www.conventionalcommits.org/en/v1.0.0/\n        - https://www.freecodecamp.org/news/how-to-write-better-git-commit-messages/\n\n**Disclaimer**: This is still a work in progress.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpraisetompane%2Fapp_etl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpraisetompane%2Fapp_etl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpraisetompane%2Fapp_etl/lists"}