https://github.com/j-buitrago/Scalable-R-API
The main purpose of this project is to provide a method to create a scalable API using R
- Host: GitHub
- URL: https://github.com/j-buitrago/Scalable-R-API
- Owner: j-buitrago
- Created: 2020-01-16T21:06:05.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2020-01-17T14:37:29.000Z (over 5 years ago)
- Last Synced: 2024-08-13T07:13:30.858Z (8 months ago)
- Topics: docker, kubernetes, plumber, plumber-api, r
- Language: R
- Homepage:
- Size: 115 KB
- Stars: 1
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- jimsghstars - j-buitrago/Scalable-R-API - The main purpose of this project is to provide a method to create a scalable API using R (R)
README
# Scalable-R-API
## Objective
The main purpose of this project is to provide a method to create a scalable API using R. The easiest way to create an API in R is with the Plumber package.
If you have used this package you probably know that, by default, Plumber gives you a synchronous API: a single R process handles one request at a time. This is a serious problem if you want to use it in a production environment. There are many ways to work around this; in this project I'm going to use Docker and Kubernetes to fix the problem.
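For context, a Plumber API is just an R file of annotated functions. This is a minimal, generic sketch (not the code in this repository) of what such a file looks like:

```r
# api.R -- minimal, generic Plumber API (illustrative sketch only)

#* Echo back a message
#* @param msg The message to echo
#* @get /echo
function(msg = "") {
  list(msg = paste("The message is:", msg))
}
```

A single R process serves these endpoints one request at a time, which is why a slow endpoint blocks every other caller.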
## Create a simple ML model
To use a real model, we can create one with the popular ```mtcars``` dataset. The objective is not to create a great model; we are just simulating a real situation in which an ML model serves predictions via an API.
Execute this command to create the object ```RfModel.RDS```. I recommend the trimmer package to reduce the size of your ML model.
```
Rscript ./R/createModel.R
```
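The real training code lives in ```./R/createModel.R```. Purely as an illustration (and assuming ```cyl``` is the target variable, since the example prediction later in this README is ```"6"```), a script of this kind could look like:

```r
# Illustrative sketch of a createModel.R-style script.
# Assumption: cyl is the target; see ./R/createModel.R for the real code.
library(randomForest)

data(mtcars)
set.seed(42)

# Treat the number of cylinders as a class so predictions come back as labels like "6"
mtcars$cyl <- as.factor(mtcars$cyl)
rf_model <- randomForest(cyl ~ ., data = mtcars)

# Save the fitted model so the API can load it at request time
saveRDS(rf_model, "RfModel.RDS")
```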
## Build a Docker container

- I will use a pre-built R image (R 3.5.0). To pull this image run:
```
docker pull rocker/r-ver:3.5.0
```

- Now we can create our Docker container with our Plumber API. The information needed to build the container is in ```./Dockerfile```.
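For reference, a Plumber container's entrypoint typically launches the API with a one-line R call like the one below. This is only a sketch: the real entrypoint is whatever the repository's ```Dockerfile``` defines, and the file name ```R/PredictRf.R``` is an assumption based on the file referenced later in this README.

```r
# Hypothetical container entrypoint: serve the API on all interfaces, port 8000
plumber::plumb("R/PredictRf.R")$run(host = "0.0.0.0", port = 8000)
```

With the Dockerfile in place, build the image: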
```
docker build -t plumber-example .
```

To run our container we just have to execute:
```
docker run --rm -p 8000:8000 plumber-example
```

If everything is working correctly, you should see the Plumber API start up and begin listening on port 8000.
It's time to use our API to make predictions. If you run the next command you should receive ```["6"]```, the prediction of our Random Forest model.
```
curl -d '{"data":{"mpg":21,"disp":160,"hp":100,"drat":3.9,"wt":2.62,"qsec":16.46,"vs":0,"am":1,"gear":4,"carb":4}}' http://127.0.0.1:8000/prediction
```
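If you prefer to stay in R, the same request can be sent with the ```httr``` package. This is just a sketch and assumes the container is published on port 8000 as above:

```r
# Sketch: call the prediction endpoint from R (assumes the API is on localhost:8000)
library(httr)

new_car <- list(
  mpg = 21, disp = 160, hp = 100, drat = 3.9, wt = 2.62,
  qsec = 16.46, vs = 0, am = 1, gear = 4, carb = 4
)

resp <- POST(
  "http://127.0.0.1:8000/prediction",
  body = list(data = new_car),
  encode = "json"
)

content(resp)  # expected: the predicted class, e.g. "6"
```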
Nice, we can use our model. But what happens if we send the following request first and then immediately try to make a prediction? Our prediction will take five seconds:
```
curl http://127.0.0.1:8000/asynchronousTest
```

If we go to the file ```./R/PredictRf``` we can check that ```asynchronousTest``` just waits 5 seconds and returns "OK".
We can use this function to check that, for now, our API is synchronous.
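For reference, the endpoint behind that test can be as simple as the annotated function below; this is a sketch of the idea, not necessarily the exact code in ```./R/PredictRf```:

```r
#* Block the single R process for 5 seconds, then answer
#* @get /asynchronousTest
function() {
  Sys.sleep(5)
  "OK"
}
```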
## Kubernetes to scale R API

To use Kubernetes I'm going to use minikube; here is the documentation to install it: https://kubernetes.io/es/docs/tasks/tools/install-minikube/
Start our Kubernetes cluster and deploy the API. Note that ```eval $(minikube docker-env)``` points your shell at minikube's Docker daemon, so you may need to run the ```docker build``` step again afterwards so the image is available inside the cluster:
```
minikube start
eval $(minikube docker-env)
kubectl apply -f deployment.yaml
```
You can check that you have a pod named plumber-example-... running:

```
kubectl get pods --output=wide
```

We have to expose the deployment as a service in order to consume our API:
```
kubectl expose deployment plumber-example --type=LoadBalancer --name=plumber-service
```

Using minikube, you can find the IP and port where the API is running:
```
minikube service plumber-service
```

If you are not using minikube, just execute:
```
kubectl describe services plumber-service
```

What happens if we now use ```asynchronousTest``` to check our API? At this moment our API is still synchronous, but this is where we can use Kubernetes to scale it! Execute this command and try again:
```
kubectl scale deployment/plumber-example --replicas=3
```

At this point you should have an asynchronous API thanks to Kubernetes! To check that you have three different pods running, execute ```kubectl get pods --output=wide``` and you should see three plumber-example pods listed.
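To double-check from R that requests are now handled concurrently, you can fire several slow requests at once and compare timings: with one replica they queue up, with three replicas each should come back in roughly five seconds. This is a sketch; the base URL is a placeholder for whatever ```minikube service plumber-service``` reports:

```r
# Sketch: send three concurrent requests and time them (Linux/macOS; mclapply runs serially on Windows)
library(httr)
library(parallel)

base_url <- "http://192.168.99.100:30000"  # placeholder: use the URL printed by `minikube service plumber-service`

elapsed <- mclapply(1:3, function(i) {
  system.time(GET(paste0(base_url, "/asynchronousTest")))["elapsed"]
}, mc.cores = 3)

unlist(elapsed)
# With 1 replica: roughly 5, 10, 15 seconds (requests are queued)
# With 3 replicas: roughly 5 seconds each (requests are served in parallel)
```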
## Resources
- https://kubernetes.io/docs/tutorials/stateless-application/expose-external-ip-address/
- https://medium.com/tmobile-tech/using-docker-to-deploy-an-r-plumber-api-863ccf91516d
- https://kubernetes.io/docs/setup/learning-environment/minikube/
- https://kubernetes.io/docs/concepts/overview/working-with-objects/kubernetes-objects/