https://github.com/fabioba/mlops-architecture
This is an overview of a MLOps architecture that includes both Airflow and MLflow running on separate Docker containers.
- Host: GitHub
- URL: https://github.com/fabioba/mlops-architecture
- Owner: fabioba
- Created: 2022-10-12T17:10:21.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2022-10-18T20:49:26.000Z (over 3 years ago)
- Last Synced: 2025-04-04T02:41:30.414Z (about 1 year ago)
- Topics: airflow, docker, docker-compose, machine-learning, mlflow, mlops, python, workflow
- Language: Python
- Homepage:
- Size: 142 KB
- Stars: 21
- Watchers: 1
- Forks: 2
- Open Issues: 0
# AIRFLOW_MLFLOW_DOCKER

## Table of contents
- [Background](#background)
- [Tools Overview](#tools_overview)
- [Getting started](#getting_started)
  * [Docker Compose configuration](#docker_config)
  * [Airflow](#airflow)
  * [MLflow](#mlflow)
- [Connect Airflow to MLflow](#airflow_and_mlflow)
- [References](#references)
## Background
The goal of this project is to create an ecosystem in which to run **Data Pipelines** and monitor **Machine Learning Experiments**.
## Tools Overview
From `Airflow` documentation:
```
Apache Airflow is an open-source platform for developing, scheduling, and monitoring batch-oriented workflows
```
From `MLflow` documentation:
```
MLflow is an open source platform for managing the end-to-end machine learning lifecycle
```
From `Docker` documentation:
```
Docker Compose is a tool for defining and running multi-container Docker applications.
```
## Getting Started
The first step in structuring this project is to run `Airflow` and `MLflow` side by side, which is done with `docker compose`.
### Docker Compose Configuration
Create `docker-compose.yaml`, which contains the configuration of the Docker containers that run the `Airflow` and `MLflow` services.
Each service runs in its own container:
* airflow-webserver
* airflow-scheduler
* airflow-worker
* airflow-triggerer
* mlflow
To create and start all the containers, run the following command from a terminal:
```
docker compose up -d
```
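Once the containers are up, it can be handy to confirm that the web servers are actually listening. The following is a minimal sketch, not part of the repository: `is_port_open` and `SERVICES` are hypothetical names, and the ports are assumed to be the ones this README uses for the `Airflow` server (`8080`) and the `MLflow` server (`600`).

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolved host names.
        return False


# Assumed host ports: 8080 for the Airflow webserver, 600 for MLflow.
SERVICES = {"airflow-webserver": 8080, "mlflow": 600}

if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "up" if is_port_open("localhost", port) else "not reachable"
        print(f"{name} (localhost:{port}): {state}")
```

A service may need some time after `docker compose up -d` before it accepts connections, so a "not reachable" result right after startup is not necessarily an error.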
### Airflow
To access the `Airflow` server, visit the page: `localhost:8080`

And take a step into the `Airflow` world!
To start creating DAGs, create an empty folder named `dags` and populate it with as many scripts as you need.
```bash
└── dags
    └── example_dag.py
```
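The layout above can be produced with a short script. This is a minimal sketch under a few assumptions: the DAG source targets the Airflow 2.x API, `docker-compose.yaml` mounts `./dags` into the `Airflow` containers (as the official Airflow Compose file does), and `scaffold_dags`, `say_hello`, the `dag_id` and the `task_id` are hypothetical names chosen for illustration.

```python
from pathlib import Path

# Source of a minimal DAG, assuming the Airflow 2.x API.
# The dag_id, task_id and say_hello() are hypothetical names.
EXAMPLE_DAG = '''\
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def say_hello():
    print("Hello from Airflow!")


with DAG(
    dag_id="example_dag",
    start_date=datetime(2022, 10, 1),
    schedule_interval=None,  # run only when triggered manually
    catchup=False,
) as dag:
    PythonOperator(task_id="say_hello", python_callable=say_hello)
'''


def scaffold_dags(root: Path = Path(".")) -> Path:
    """Create <root>/dags/example_dag.py and return the path of the file."""
    dags_dir = root / "dags"
    dags_dir.mkdir(parents=True, exist_ok=True)
    dag_file = dags_dir / "example_dag.py"
    dag_file.write_text(EXAMPLE_DAG)
    return dag_file
```

Calling `scaffold_dags()` from the project root writes `dags/example_dag.py`. If the folder is mounted as assumed, the scheduler should pick the DAG up and list it in the `Airflow` UI.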
### MLflow
To monitor `MLflow` experiments through its server, visit the page: `localhost:600`

## Connect Airflow to MLflow
To establish a connection between `Airflow` and `MLflow`, define the URI of the `MLflow server`:
```python
import mlflow
mlflow.set_tracking_uri('http://mlflow:600')  # MLflow server inside the Docker network
```
After that, create a new connection on `Airflow` that points to that port.
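As an illustration of how a task might use that URI, here is a minimal sketch of a callable that logs to `MLflow` from inside an `Airflow` container. It assumes the `mlflow` package is installed in the `Airflow` image and that the Compose service is named `mlflow`, so that the host name resolves on the Compose network. `tracking_uri`, `train_and_log`, the experiment name and the logged values are hypothetical placeholders, not results from this project.

```python
import os


def tracking_uri() -> str:
    """Return the MLflow tracking URI to use inside the Compose network.

    MLFLOW_TRACKING_URI is the environment variable MLflow itself reads;
    the fallback is the URI given in this README.
    """
    return os.environ.get("MLFLOW_TRACKING_URI", "http://mlflow:600")


def train_and_log(learning_rate: float = 0.01) -> None:
    """Hypothetical task callable: logs one parameter and one metric."""
    # Imported lazily so the DAG file can be parsed without loading mlflow.
    import mlflow

    mlflow.set_tracking_uri(tracking_uri())
    mlflow.set_experiment("airflow_demo")  # hypothetical experiment name
    with mlflow.start_run():
        mlflow.log_param("learning_rate", learning_rate)
        mlflow.log_metric("accuracy", 0.9)  # placeholder value, not a real result
```

The function can be passed as the `python_callable` of a `PythonOperator`. From the host machine, outside the containers, the same server is reached at `http://localhost:600` rather than `http://mlflow:600`.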

## References
* [Airflow Docker](https://airflow.apache.org/docs/apache-airflow/2.0.1/start/docker.html)
* [What is Airflow?](https://airflow.apache.org/docs/apache-airflow/stable/index.html)
* [MLflow](https://mlflow.org/docs/latest/index.html)