{"id":26562629,"url":"https://github.com/rogarol/data_engineer_challenge","last_synced_at":"2026-04-14T19:32:09.545Z","repository":{"id":283552658,"uuid":"948537024","full_name":"rogarol/data_engineer_challenge","owner":"rogarol","description":"Data Engineer Coding Challenge by Globant","archived":false,"fork":false,"pushed_at":"2025-03-20T20:41:38.000Z","size":322,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-01T19:19:23.873Z","etag":null,"topics":["docker","fastapi","python","uv","uvicorn"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rogarol.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-03-14T14:08:37.000Z","updated_at":"2025-03-20T20:41:42.000Z","dependencies_parsed_at":"2025-03-22T07:15:21.725Z","dependency_job_id":null,"html_url":"https://github.com/rogarol/data_engineer_challenge","commit_stats":null,"previous_names":["rogarol/data_engineer_challenge"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/rogarol/data_engineer_challenge","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rogarol%2Fdata_engineer_challenge","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rogarol%2Fdata_engineer_challenge/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rogarol%2Fdata_engineer_challenge/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposito
ries/rogarol%2Fdata_engineer_challenge/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rogarol","download_url":"https://codeload.github.com/rogarol/data_engineer_challenge/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rogarol%2Fdata_engineer_challenge/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31812968,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-14T18:05:02.291Z","status":"ssl_error","status_checked_at":"2026-04-14T18:05:01.765Z","response_time":153,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["docker","fastapi","python","uv","uvicorn"],"created_at":"2025-03-22T15:18:25.183Z","updated_at":"2026-04-14T19:32:09.530Z","avatar_url":"https://github.com/rogarol.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Data Engineer Challenge\n\nA REST API built with **FastAPI**, running on **Uvicorn**, containerized via **Docker**, and powered by a **MySQL** database. 
Everything runs on **Azure** cloud services.\n\nThe API is already available on Azure at the following link:\n\nhttps://deccontainerappv4.ambitioushill-fad7020d.eastus.azurecontainerapps.io/docs\n\n## Table of Contents\n\n- [Features](#features)\n- [Tech Stack](#tech-stack)\n- [Architecture](#architecture)\n- [API Guide](#api-guide)\n- [Prerequisites](#prerequisites)\n- [Libraries](#libraries)\n- [Deployment](#deployment)\n\n## Features\n\n- **DB Migration:** Load historical data for three tables (departments, jobs, employees).\n    - Receive historical data from CSV files\n    - Upload these files to the new DB\n    - Insert batch transactions (1 to 1000 rows) with one request\n\n- **Business Metrics:** Generate two reports, each served by its own endpoint:\n    - Number of employees hired for each job and department in 2021, divided by quarter. The table must be ordered alphabetically by department and job.\n    - The ID, name, and number of employees hired for each department that hired more employees than the mean number hired in 2021 across all departments, ordered by the number of employees hired (descending).\n\n## Architecture\n\n![Architecture Diagram](/diagrams/DEC_ArquitectureDiagram.png)\n\n## Tech Stack\n\n- **FastAPI:** For building the REST API.\n- **Uvicorn:** ASGI server for running the application.\n- **Docker:** For containerization and deployment.\n- **MySQL Database:** Open-source relational database.\n- **Azure:** For cloud services.\n\n## Database DDL\n\n    ```sql\n    USE db_dec;\n\n    CREATE TABLE departments (\n        id INT PRIMARY KEY,\n        department VARCHAR(100) NOT NULL UNIQUE\n    );\n\n    CREATE TABLE jobs (\n        id INT PRIMARY KEY,\n        job VARCHAR(100) NOT NULL UNIQUE\n    );\n\n    CREATE TABLE hired_employees (\n        id INT PRIMARY KEY,\n        name VARCHAR(100) NULL,\n        datetime datetime NULL,\n        department_id INT NULL,\n        job_id INT NULL,\n    
    CONSTRAINT fk_department FOREIGN KEY (department_id) REFERENCES departments(id),\n        CONSTRAINT fk_job FOREIGN KEY (job_id) REFERENCES jobs(id)\n    );\n    ```\n\n## API Guide\n\n![Endpoints Guide](/diagrams/DEC_EndpointsGuide.png)\n\n### Prerequisites\n\n- [Docker](https://www.docker.com/get-started) and [Docker Compose](https://docs.docker.com/compose/install/)\n- Python 3.9\n- Git\n- MySQL Workbench\n\n### Libraries\n\nThe project uses the following libraries:\n- fastapi\n- uvicorn\n- sqlalchemy\n- pandas\n- azure-storage-blob\n- mysql-connector-python\n\n### Deployment\n\nFor local deployment, build and run the image with Docker from the repository root:\n    \n    ```bash\n    docker build -t [image_name] .\n    docker run --env-file [env_variables_file] -p 80:80 [image_name]\n    ```\n\nFor deployment to Azure, run the scripts in the deploy folder in the following order:\n\n    ```bash\n    cd deploy\n    sh create_resources.sh\n    sh deploy.sh\n    ```\n\nAfter the first deployment, if you change the app, run only the deploy script, since all the Azure resources already exist.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frogarol%2Fdata_engineer_challenge","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frogarol%2Fdata_engineer_challenge","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frogarol%2Fdata_engineer_challenge/lists"}