https://github.com/piesposito/tand

TanD - Train and Deploy is a no-code framework to automatize the Machine Learning workflow.

data-science fastapi machine-learning mlflow pytorch sklearn workflow-automation


# TanD - Train and Deploy

TanD is a simple, no-code, flexible and customizable framework to automate the Machine Learning workflow.

With TanD you can go through the whole ML workflow without writing a single line of code (for both `sklearn` and `torch` based models). By creating a project template and setting some configurations in a `.json` file, you can train an ML model of your choice, store it in `mlflow` to control its lifecycle, and create a ready-to-deploy API to serve it.

Although TanD lets you run your workflows (from train to deploy) with no code at all, it is highly customizable, letting you introduce your own chunks of code to enhance your modelling pipelines in any way you want.

Our mission is to let you avoid repetitive tasks so you can focus on what matters. TanD brings Machine-Learning laziness to a whole new level.

## Roadmap
The project's roadmap (in no particular order of priority) is:
* Create project templates (`torch` and `sklearn`) for regression tasks in structured data;
* ~Create a `Dockerfile` in project templates to ease deployment~ OK;
* ~Create a `cron` job in Docker to update model parameters~ OK;
* Create tutorials for train and deploy with `tand`;
* Create project templates (`torch` / `transformers`) for classification tasks in text data;
* Create project templates (`torch`) for classification in image data;
* Create `documentation` for the project

# Index
* [Install](#Install)
* [Documentation](#Documentation)
* [Quick start](#Quick-start)

## Install

To install `tand`, you can use `pip`:

```
pip install train-and-deploy
```

You can also clone the repo and `pip install .` it locally:

```
git clone https://github.com/piEsposito/TanD.git
cd TanD
pip install .
```

## Documentation
Documentation for `tand.util` and explanation of project templates:
* [util](doc/util.md)

Documentation for `tand.deployment` with utils to ease deployment to cloud:
* [deployment](doc/deployment.md)

Documentation for project templates:
* [PyTorch structured data classification task](doc/pytorch-structured-classification.md)
* [PyTorch structured data regression task](doc/pytorch-structured-regression.md)
* [Sklearn structured data classification task](doc/sklearn-structured-classification.md)


---

## Quick start

After installing `tand`, you can train and deploy a sample project using the [UCI heart disease dataset](https://www.kaggle.com/ronitf/heart-disease-uci). You can run the same process on your own dataset by changing only the `.csv` file and a few configuration values. By following these steps, you will:

* Train a `torch` model with a dataset and log all of its metrics to `mlflow`;
* Automatically generate a `fastapi` based API service for serving the model (which receives a json with the features);
* Deploy the model to AWS ElasticBeanstalk with two lines of code.

To create the project with a `torch` based model, in an empty folder, type:

```
tand-create-project --template pytorch-structured-classification
```

That will create all the needed files in the folder. We should first check `config.json`:

```json
{
  "train": {
    "__help": "Configurations for the training of the project (of the model, etc...)",
    "data_path": "data/data.csv",
    "labels_column": "target",

    "log_every": 250,
    "epochs": 50,

    "hidden_dim": 256,
    "batch_size": 32,
    "device": "cpu",
    "labels": ["no_heart_disease", "heart_disease"],
    "to_drop": []
  },

  "test": {
    "__help": "Configurations for the testing of this project (train + app)"
  },

  "app": {
    "__help": "Configurations for the service generated by this project",
    "token": null
  },

  "mlflow": {
    "__help": "Configurations for the mlflow model manager of the project",
    "model_name": "pytorch-classifier-nn-heart-disease",
    "experiment_name": "heart-disease-experiment"
  }
}
```

The project is all set, but it is important to check:
* If the `data_path` attribute of `train` is set properly;
* If the `labels_column` attribute of `train` is set according to the dataset label column;
* If the `labels` attribute of `train` is set in proper order with the names.
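
The checks above can also be scripted before training. The following is a minimal sketch and not part of `tand`: the helper name `check_train_config` is hypothetical, and it assumes the dataset is a comma-separated file with a header row. It can only confirm that the number of names in `labels` matches the number of distinct values in the label column; whether the names are in the right order still has to be verified by hand.

```python
import csv
import io


def check_train_config(train_cfg, csv_text):
    """Return a list of problems found in the `train` section (hypothetical helper)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return ["dataset has no rows"]
    columns = list(rows[0].keys())
    problems = []

    # The label column must exist in the dataset header.
    label_col = train_cfg["labels_column"]
    if label_col not in columns:
        return [f"labels_column {label_col!r} is not a column of the dataset"]

    # Columns listed in `to_drop` should exist as well.
    for col in train_cfg.get("to_drop", []):
        if col not in columns:
            problems.append(f"to_drop column {col!r} is not a column of the dataset")

    # One label name is expected per distinct value of the label column.
    classes = sorted({row[label_col] for row in rows})
    if len(classes) != len(train_cfg["labels"]):
        problems.append(
            f"{len(train_cfg['labels'])} label names given for {len(classes)} classes"
        )
    return problems


train_cfg = {
    "labels_column": "target",
    "labels": ["no_heart_disease", "heart_disease"],
    "to_drop": [],
}
sample = "age,sex,target\n63,1,1\n41,0,0\n"
print(check_train_config(train_cfg, sample))  # prints []
```

On a real project, `train_cfg` would come from the `train` section of `config.json` and `csv_text` from the file at `data_path`.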

We should also look at the `mlflow` paths for both the database and model logging. To keep it simple, we will use `sqlite` and local storage, but you can point them to remote buckets and databases in a production environment. They are set in every file of the `env_files` folder as:

```
MLFLOW_TRACKING_URI=sqlite:///database.db
MLFLOW_DEFAULT_ARTIFACT_ROOT=./mlruns/
```

Feel free to change them according to the [`mlflow` documentation](https://www.mlflow.org/docs/latest/tracking.html).

Training the model is as easy as running:

```
source env_files/train.env
python train.py
```

You can then see lots of metrics for the model in `mlflow` by starting its server:

```
bash mlflow-server.sh
```

If you are running an `mlflow` experiment for the first time on a project, the model will automatically be set for production. If you rerun the experiment with a different dataset or parameters, you can set the production model in `mlflow` by following the [documentation](https://www.mlflow.org/docs/latest/model-registry.html). That will be useful once you deploy it.

Training also creates `request_model.json`, which is used both to validate request bodies for the API service and to reorder their fields to match what the model expects. This file is also used by the unit tests for the API (which are generated automatically as well).

```json
{
  "age": 1,
  "sex": 1,
  "cp": 1,
  "trestbps": 1,
  "chol": 1,
  "fbs": 1,
  "restecg": 1,
  "thalach": 1,
  "exang": 1,
  "oldpeak": 1,
  "slope": 1,
  "ca": 1,
  "thal": 1
}
```
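
To illustrate how a file like this can drive both validation and reordering, here is a minimal sketch. It is not the code `tand` generates: the function name `order_features` is hypothetical, and it assumes that the key order of `request_model.json` is the feature order the model expects and that all features are numeric.

```python
def order_features(request_model, body):
    """Validate `body` against `request_model` and return its values in model order."""
    missing = [key for key in request_model if key not in body]
    extra = [key for key in body if key not in request_model]
    if missing or extra:
        raise ValueError(f"missing features: {missing}; unexpected features: {extra}")
    # Python dicts keep insertion order, so iterating over the model
    # yields the features in the order they appear in the JSON file.
    return [float(body[key]) for key in request_model]


request_model = {"age": 1, "sex": 1, "cp": 1}  # shortened for the example
body = {"cp": 3, "age": 63, "sex": 1}          # keys arrive in arbitrary order
print(order_features(request_model, body))     # prints [63.0, 1.0, 3.0]
```

A body with a missing or unknown key raises `ValueError`, which an API layer could translate into a client error response.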

### Moving to the API creation

If you want to generate value with an ML model, you should deploy it. `tand` helps you by creating the API, a `Dockerfile` and all the configuration files needed to deploy the model with no code. Notice that the API is protected with a token, which defaults to `TOKEN123`, but you can change it in `env_files/app.env` and `env_files/docker_app.env`.

The API contains simple token authentication, a `/` route for health checking, `/update-model` for model updating (POST to it with proper credentials and it fetches the latest production model from `mlflow`), and `/predict`, which grabs the features from the request body and returns the prediction.

To test the API, just run:

```
source env_files/app.env
pytest
```

All tests should pass.

You now have some options for deployment: run the app directly on an arbitrary VM, build and run the image generated by the `Dockerfile`, or use the `tand` features designed for AWS ElasticBeanstalk, which let you deploy your model cheaply and scalably. We will cover all three:

To run the app, just type:

```
uvicorn app:app --reload --host 0.0.0.0 --port 8000
```

You can test it with:

```
curl --header "Content-Type: application/json" \
--header "TOKEN: $API_TOKEN" \
--request POST \
--data '{"age":1,"sex":1,"cp":1,"trestbps":1,"chol":1,"fbs":1,"restecg":1,"thalach":1,"exang":1,"oldpeak":1,"slope":1,"ca":1,"thal":1}' \
http://localhost:8000/predict
```

Remember to `source env_files/app.env` before performing the request or else it will return status 401 Unauthorized.
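
The same request can be sent from Python. This is a minimal sketch using only the standard library; the function names `build_predict_request` and `predict` are hypothetical, and the response is returned as raw text because its exact format is not described here.

```python
import json
import urllib.request


def build_predict_request(base_url, token, features):
    """Build a POST request for the /predict route with the TOKEN header set."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/predict",
        data=json.dumps(features).encode("utf-8"),
        # Header names are case-insensitive in HTTP; urllib stores "TOKEN" as "Token".
        headers={"Content-Type": "application/json", "TOKEN": token},
        method="POST",
    )


def predict(base_url, token, features):
    """Send the request and return the response body as text."""
    request = build_predict_request(base_url, token, features)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


features = {
    "age": 1, "sex": 1, "cp": 1, "trestbps": 1, "chol": 1, "fbs": 1, "restecg": 1,
    "thalach": 1, "exang": 1, "oldpeak": 1, "slope": 1, "ca": 1, "thal": 1,
}
request = build_predict_request("http://localhost:8000", "TOKEN123", features)
print(request.get_method(), request.full_url)  # prints POST http://localhost:8000/predict
```

With the app running, `predict("http://localhost:8000", os.environ["API_TOKEN"], features)` (after `import os`) performs the actual call; a wrong token makes `urlopen` raise `urllib.error.HTTPError` with status 401.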

You can build the Docker image with:

```
docker build . -t tand-app:v1
```

And run it with:

```
docker run -p 8000:8000 --env-file env_files/docker_app.env tand-app:v1
```

You can test it with:

```
curl --header "Content-Type: application/json" \
--header "TOKEN: $API_TOKEN" \
--request POST \
--data '{"age":1,"sex":1,"cp":1,"trestbps":1,"chol":1,"fbs":1,"restecg":1,"thalach":1,"exang":1,"oldpeak":1,"slope":1,"ca":1,"thal":1}' \
http://localhost:8000/predict
```

Remember to `source env_files/app.env` before performing the request or else it will return status 401 Unauthorized.

Last, we can deploy it to AWS ElasticBeanstalk. To do that, first [set your AWS credentials](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html) on your machine and then [install the `eb` CLI](https://docs.amazonaws.cn/en_us/elasticbeanstalk/latest/dg/eb-cli3-install-linux.html). That should be done in your root environment, not in a `conda` or `virtualenv` environment.

You can generate the configurations with:

```
tand-prepare-aws-eb-deployment --init-git
```

We pass the `--init-git` flag because the `eb` CLI uses some files from the `.git` repository to upload the files to be deployed.

That will generate `deploy-aws-eb.sh`, which will be run for deployment. It will also generate `.ebextensions` containing:
* `cron.config` - which runs, on each instance, a daily task to update the instance's ML model by fetching the latest production one from `mlflow` (this is only useful when a cloud-based `mlflow` backend is set);
* `options.config` - which sets the API token and `mlflow` backend env variables for the deployment; and
* `scaling.config` - which sets the scalability configurations for the deployment, including the maximum and minimum number of replicas and the criteria for scaling (defaults to latency).

To finally deploy it to AWS, run:

```
bash deploy-aws-eb.sh
```

It takes about 5 minutes, after which you can run `eb open` to get the link and then try it with:

```
curl --header "Content-Type: application/json" \
--header "TOKEN: $API_TOKEN" \
--request POST \
--data '{"age":1,"sex":1,"cp":1,"trestbps":1,"chol":1,"fbs":1,"restecg":1,"thalach":1,"exang":1,"oldpeak":1,"slope":1,"ca":1,"thal":1}' \
http://YOUR_LINK_GOES_HERE/predict
```

Remember to properly set the token for testing.

And with that, we showed how to train and deploy a model with `tand` using a couple of terminal commands and no coding at all.

---

###### Made by Pi Esposito