https://github.com/rogarol/data_engineer_challenge
Data Engineer Coding Challenge by Globant
- Host: GitHub
- URL: https://github.com/rogarol/data_engineer_challenge
- Owner: rogarol
- Created: 2025-03-14T14:08:37.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-03-20T20:41:38.000Z (over 1 year ago)
- Last Synced: 2025-06-01T19:19:23.873Z (about 1 year ago)
- Topics: docker, fastapi, python, uv, uvicorn
- Language: Python
- Homepage:
- Size: 314 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Data Engineer Challenge
A REST API built with **FastAPI**, served by **Uvicorn**, containerized with **Docker**, and backed by a **MySQL** database. Everything runs on **Azure** cloud services.
The API is already deployed on Azure and available at the following link:
https://deccontainerappv4.ambitioushill-fad7020d.eastus.azurecontainerapps.io/docs
## Table of Contents
- [Features](#features)
- [Architecture](#architecture)
- [Tech Stack](#tech-stack)
- [Database DDL](#database-ddl)
- [API Guide](#api-guide)
- [Prerequisites](#prerequisites)
- [Libraries](#libraries)
- [Deployment](#deployment)
## Features
- **DB Migration:** Load historical data for 3 tables (departments, jobs, employees).
  - Receive historical data from CSV files.
  - Upload these files to the new DB.
  - Insert batch transactions (1 up to 1000 rows) with one request.
- **Business Metrics:** Generate two reports (one endpoint for each):
  - Number of employees hired for each job and department in 2021, divided by quarter. The table must be ordered alphabetically by department and job.
  - List of ids, names and number of employees hired for each department that hired more employees than the mean number of employees hired in 2021 across all departments, ordered by the number of employees hired (descending).
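The batch limit above (1 up to 1000 rows per request) can be enforced before anything reaches the database. The following is a minimal sketch, not the project's actual code: the function name `validate_batch` and the choice to require every `hired_employees` column as a key (values may still be `None`) are assumptions made for illustration.

```python
from typing import Any

MIN_BATCH_ROWS = 1
MAX_BATCH_ROWS = 1000  # upper bound stated in the feature list

# Column names of the hired_employees table; requiring all of them
# as keys is an assumption of this sketch.
REQUIRED_KEYS = {"id", "name", "datetime", "department_id", "job_id"}


def validate_batch(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return the rows unchanged if the batch is acceptable, else raise ValueError."""
    if not MIN_BATCH_ROWS <= len(rows) <= MAX_BATCH_ROWS:
        raise ValueError(
            f"batch must contain {MIN_BATCH_ROWS} to {MAX_BATCH_ROWS} rows, "
            f"got {len(rows)}"
        )
    for index, row in enumerate(rows):
        missing = REQUIRED_KEYS - row.keys()
        if missing:
            raise ValueError(f"row {index} is missing keys: {sorted(missing)}")
    return rows
```

In a FastAPI endpoint, a `ValueError` raised here would typically be translated into a 4xx response; how the project does this is not shown in this README.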
## Architecture

## Tech Stack
- **FastAPI:** For building the REST API.
- **Uvicorn:** ASGI server for running the application.
- **Docker:** For containerization and deployment.
- **MySQL Database:** Open-source relational database (RDBMS).
- **Azure:** For cloud services.
## Database DDL
```sql
USE db_dec;

CREATE TABLE departments (
    id INT PRIMARY KEY,
    department VARCHAR(100) NOT NULL UNIQUE
);

CREATE TABLE jobs (
    id INT PRIMARY KEY,
    job VARCHAR(100) NOT NULL UNIQUE
);

CREATE TABLE hired_employees (
    id INT PRIMARY KEY,
    name VARCHAR(100) NULL,
    datetime DATETIME NULL,
    department_id INT NULL,
    job_id INT NULL,
    CONSTRAINT fk_department FOREIGN KEY (department_id) REFERENCES departments(id),
    CONSTRAINT fk_job FOREIGN KEY (job_id) REFERENCES jobs(id)
);
```
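Given the tables above, the two reports from the Features section could be queried roughly as follows. This is a sketch under stated assumptions, not the project's actual queries: it uses Python's built-in `sqlite3` with an in-memory database so that it runs without a MySQL server, the sample rows are made up, and the second report counts only 2021 hires and averages over departments with at least one 2021 hire. On MySQL, `QUARTER(datetime)` and `YEAR(datetime)` would replace the `strftime` expressions.

```python
import sqlite3

# SQLite rendering of the DDL above (types simplified for the demo).
SCHEMA = """
CREATE TABLE departments (id INTEGER PRIMARY KEY, department TEXT NOT NULL UNIQUE);
CREATE TABLE jobs (id INTEGER PRIMARY KEY, job TEXT NOT NULL UNIQUE);
CREATE TABLE hired_employees (
    id INTEGER PRIMARY KEY,
    name TEXT,
    datetime TEXT,
    department_id INTEGER REFERENCES departments(id),
    job_id INTEGER REFERENCES jobs(id)
);
"""

# Report 1: 2021 hires per department and job, one column per quarter.
# Rows with a NULL department or job are dropped by the inner joins.
HIRES_BY_QUARTER = """
SELECT d.department, j.job,
       SUM(CASE WHEN e.quarter = 1 THEN 1 ELSE 0 END) AS q1,
       SUM(CASE WHEN e.quarter = 2 THEN 1 ELSE 0 END) AS q2,
       SUM(CASE WHEN e.quarter = 3 THEN 1 ELSE 0 END) AS q3,
       SUM(CASE WHEN e.quarter = 4 THEN 1 ELSE 0 END) AS q4
FROM (
    SELECT department_id, job_id,
           (CAST(strftime('%m', datetime) AS INTEGER) + 2) / 3 AS quarter
    FROM hired_employees
    WHERE strftime('%Y', datetime) = '2021'
) AS e
JOIN departments AS d ON d.id = e.department_id
JOIN jobs AS j ON j.id = e.job_id
GROUP BY d.department, j.job
ORDER BY d.department, j.job
"""

# Report 2: departments whose 2021 hires exceed the mean over departments.
DEPARTMENTS_ABOVE_MEAN = """
WITH hires_2021 AS (
    SELECT department_id, COUNT(*) AS hired
    FROM hired_employees
    WHERE strftime('%Y', datetime) = '2021' AND department_id IS NOT NULL
    GROUP BY department_id
)
SELECT d.id, d.department, h.hired
FROM hires_2021 AS h
JOIN departments AS d ON d.id = h.department_id
WHERE h.hired > (SELECT AVG(hired) FROM hires_2021)
ORDER BY h.hired DESC
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO departments VALUES (?, ?)",
                     [(1, "Engineering"), (2, "Sales"), (3, "Support")])
    conn.executemany("INSERT INTO jobs VALUES (?, ?)",
                     [(1, "Analyst"), (2, "Developer")])
    conn.executemany("INSERT INTO hired_employees VALUES (?, ?, ?, ?, ?)", [
        (1, "Ana", "2021-01-15 09:00:00", 1, 2),
        (2, "Ben", "2021-05-20 09:00:00", 1, 2),
        (3, "Cat", "2021-11-02 09:00:00", 1, 1),
        (4, "Dan", "2021-02-10 09:00:00", 2, 1),
        (5, "Eve", "2020-07-01 09:00:00", 2, 1),  # outside 2021, ignored
        (6, "Fay", "2021-08-08 09:00:00", 3, 1),
    ])
    return conn


def hires_by_quarter(conn: sqlite3.Connection) -> list:
    return conn.execute(HIRES_BY_QUARTER).fetchall()


def departments_above_mean(conn: sqlite3.Connection) -> list:
    return conn.execute(DEPARTMENTS_ABOVE_MEAN).fetchall()
```

Calling `hires_by_quarter(build_demo_db())` returns one tuple per department and job pair, and `departments_above_mean(...)` returns `(id, department, hired)` tuples.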
## API Guide

### Prerequisites
- [Docker](https://www.docker.com/get-started) and [Docker Compose](https://docs.docker.com/compose/install/)
- Python 3.9
- Git
- MySQL Workbench
### Libraries
The project uses the following libraries:
- fastapi
- uvicorn
- sqlalchemy
- pandas
- azure-storage-blob
- mysql-connector-python
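With `sqlalchemy` and `mysql-connector-python` from the list above, the database connection is usually described by a URL of the form `mysql+mysqlconnector://user:password@host:port/database`. The sketch below builds such a URL from environment variables using only the standard library. The variable names (`DB_USER`, `DB_PASSWORD`, `DB_HOST`, `DB_PORT`, `DB_NAME`) and the function name are hypothetical; the names this project actually expects in its env file are not documented here.

```python
import os
from typing import Mapping
from urllib.parse import quote_plus


def build_mysql_url(env: Mapping[str, str] = os.environ) -> str:
    """Build a SQLAlchemy URL for the mysql-connector-python driver.

    All variable names are assumptions made for this sketch.
    """
    user = env["DB_USER"]
    # Percent-encode the password so characters like '@' or '/' do not break the URL.
    password = quote_plus(env["DB_PASSWORD"])
    host = env["DB_HOST"]
    port = env.get("DB_PORT", "3306")   # default MySQL port
    name = env.get("DB_NAME", "db_dec")  # database name used in the DDL
    return f"mysql+mysqlconnector://{user}:{password}@{host}:{port}/{name}"
```

The resulting string can be passed to `sqlalchemy.create_engine(...)`.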
### Deployment
For local deployment, build and run the image with Docker from the repository root:
```bash
docker build -t [image_name] .
docker run --env-file [env_variables_file] -p 80:80 [image_name]
```
To deploy to Azure, run the scripts in the `deploy` folder in the following order:
```bash
cd deploy
sh create_resources.sh
sh deploy.sh
```
After the first deployment, if you change the app, run only the deploy script (`deploy.sh`), since all the Azure resources already exist.