https://github.com/adilsaid64/real-time-data-monitoring

Exploring what a real-time data drift monitoring solution could look like within MLOps

Host: GitHub
URL: https://github.com/adilsaid64/real-time-data-monitoring
Owner: adilsaid64
Created: 2025-07-10T22:35:01.000Z (6 months ago)
Default Branch: main
Last Pushed: 2025-08-03T22:14:30.000Z (5 months ago)
Last Synced: 2025-08-04T00:26:07.508Z (5 months ago)
Topics: data, datadrift, grafana, machine-learning, mlops, mlops-workflow, prometheus, python, software-engineering
Language: Python
Homepage:
Size: 952 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# Real-Time Data Drift Monitoring

Exploring what a real-time data drift monitoring solution could look like within MLOps.

## How It Works

1. A baseline dataset (reference) is loaded at startup.
2. Incoming feature data is buffered in a rolling window.
3. Once the buffer is full:
   - A KS test is run per feature.
   - P-values and drift flags are recorded.
   - Metrics are exposed to Prometheus.
4. Grafana visualizes:
   - Number of features drifting
   - Feature-level p-values & drift flags
   - Last drift timestamp
   - Historical drift trends
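
The buffering and testing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `DriftMonitor` class, the `window_size` and `alpha` parameters, and the feature name `x` are all hypothetical, and it assumes a sliding window in which the test is rerun on every new row once the buffer is full. The test itself uses `scipy.stats.ks_2samp`.

```python
from collections import deque

import numpy as np
from scipy.stats import ks_2samp


class DriftMonitor:
    """Buffers incoming rows and runs a per-feature two-sample KS test once the window is full."""

    def __init__(self, reference, window_size=200, alpha=0.05):
        self.reference = reference  # dict: feature name -> 1-D array of baseline values
        self.alpha = alpha  # p-value threshold below which a feature is flagged (assumed)
        self.buffer = deque(maxlen=window_size)  # rolling window of the most recent rows

    def add(self, row):
        """Add one row (dict of feature -> value); return results when the window is full, else None."""
        self.buffer.append(row)
        if len(self.buffer) < self.buffer.maxlen:
            return None
        results = {}
        for name, baseline in self.reference.items():
            current = np.array([r[name] for r in self.buffer])
            p_value = float(ks_2samp(baseline, current).pvalue)
            results[name] = {"p_value": p_value, "drift": p_value < self.alpha}
        return results


# Usage: a baseline centred on 0 and an incoming stream centred on 3.
rng = np.random.default_rng(0)
monitor = DriftMonitor({"x": rng.normal(0.0, 1.0, 1000)}, window_size=200)
result = None
for value in rng.normal(3.0, 1.0, 200):
    result = monitor.add({"x": float(value)})
print(result)  # the shifted feature "x" is flagged as drifting
```

A fixed threshold such as `alpha=0.05` applied to many features at once will produce occasional false alarms, so a real deployment may want a stricter threshold or a multiple-testing correction.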

![system-diagram](assets/system-diagram.png)
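
As a rough idea of how the per-feature results might be turned into the metrics Grafana reads, here is a sketch using the `prometheus_client` library. The metric names, the `feature` label, and the `record` helper are assumptions made for illustration; the project's real metric names may differ.

```python
import time

from prometheus_client import CollectorRegistry, Gauge, generate_latest

# A dedicated registry keeps this example isolated from any global metrics.
registry = CollectorRegistry()

# Hypothetical metric names, chosen to mirror the dashboard panels listed above.
P_VALUE = Gauge("feature_drift_p_value", "KS test p-value per feature", ["feature"], registry=registry)
DRIFT_FLAG = Gauge("feature_drift_flag", "1 if the feature is flagged as drifting, else 0", ["feature"], registry=registry)
DRIFTING_FEATURES = Gauge("drifting_features", "Number of features currently flagged", registry=registry)
LAST_DRIFT_TS = Gauge("last_drift_timestamp_seconds", "Unix time of the most recent drift detection", registry=registry)


def record(results):
    """Update the gauges from a dict of feature -> {"p_value": float, "drift": bool}."""
    drifting = 0
    for name, res in results.items():
        P_VALUE.labels(feature=name).set(res["p_value"])
        DRIFT_FLAG.labels(feature=name).set(1 if res["drift"] else 0)
        drifting += int(res["drift"])
    DRIFTING_FEATURES.set(drifting)
    if drifting:
        LAST_DRIFT_TS.set(time.time())  # only moves forward when drift is seen


record({"age": {"p_value": 0.5, "drift": False}, "income": {"p_value": 0.001, "drift": True}})
print(generate_latest(registry).decode())  # text format that Prometheus scrapes
```

In a long-running service, `prometheus_client.start_http_server` could serve these gauges on a port for Prometheus to scrape. Historical trends then come from Prometheus storing each scrape, not from the application itself.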

## Running the Project

### 1. Install Dependencies

Use [`uv`](https://github.com/astral-sh/uv) to manage the Python environment:

```bash
uv venv
source .venv/bin/activate
uv sync
```

### 2. Start All Services
Use Docker Compose to spin up the Model Server, Metric Server, Prometheus and Grafana:

```bash
docker compose up --build
```

To scale the Model Server to handle more requests, you can use:
```bash
docker compose up --build --scale model-server=10
```

This will start 10 instances of the model server, allowing it to handle more concurrent requests.

Requests to the prediction API are sent to the API Gateway (NGINX), which load balances across the model server replicas.

See [nginx.conf](nginx/nginx.conf).

### 3. Run the Drift Monitor
To simulate a live data stream:
- Without Drift (Normal Scenario):
```bash
uv run run.py --drift false
```

- With Drift (Simulated Drift Scenario):
```bash
uv run run.py --drift true
```
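
To give a feel for what the two scenarios could mean, here is a small sketch of a simulated stream with a `--drift` switch. It is not the contents of `run.py`: the feature names, the size of the mean shift, and the `stream`, `str_to_bool`, and `main` helpers are all hypothetical.

```python
import argparse
import random


def str_to_bool(text):
    """Parse 'true'/'false' CLI values; argparse's type=bool would treat any non-empty string as True."""
    lowered = text.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected true or false, got {text!r}")


def stream(n_rows, drift, seed=0):
    """Yield rows of made-up features; with drift=True, feature_a is mean-shifted in the second half."""
    rng = random.Random(seed)
    for i in range(n_rows):
        shift = 2.0 if drift and i >= n_rows // 2 else 0.0
        yield {"feature_a": rng.gauss(shift, 1.0), "feature_b": rng.gauss(5.0, 2.0)}


def main(argv=None):
    parser = argparse.ArgumentParser(description="Simulate a feature stream")
    parser.add_argument("--drift", type=str_to_bool, default=False)
    args = parser.parse_args(argv)
    rows = list(stream(1000, args.drift))
    first = sum(r["feature_a"] for r in rows[:500]) / 500
    second = sum(r["feature_a"] for r in rows[500:]) / 500
    print(f"drift={args.drift} mean(first half)={first:.2f} mean(second half)={second:.2f}")
    return first, second


# Usage: compare the two scenarios on the same random seed.
main(["--drift", "false"])
main(["--drift", "true"])
```

With drift enabled, the mean of `feature_a` in the second half sits about 2.0 higher than in the no-drift run, which is the kind of distribution change a KS test picks up.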

## Dashboards

Access the Grafana dashboard at http://localhost:4000/

### No Drift Scenario

![No Drift Scenario](assets/no-drift.png)

### Drift Scenario

![Drift Scenario](assets/drift.png)

## Load Testing With Locust

To load test with [Locust](https://docs.locust.io/en/stable/quickstart.html), run the following from the repository root:

```bash
locust
```

Then open your browser and navigate to `http://localhost:8089` to access the Locust web interface. From there, you can start your load tests by specifying the target URL and the number of users to simulate.

The target URL is: `http://localhost:8002/get-prediction`

Currently working on scaling up the metrics endpoint.
