{"id":14067480,"url":"https://github.com/j-buitrago/Scalable-R-API","last_synced_at":"2025-07-30T01:31:10.346Z","repository":{"id":223337842,"uuid":"234411820","full_name":"j-buitrago/Scalable-R-API","owner":"j-buitrago","description":"The main purpose of this project is provide a method to create a scalable API using R","archived":false,"fork":false,"pushed_at":"2020-01-17T14:37:29.000Z","size":118,"stargazers_count":1,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-08-13T07:13:30.858Z","etag":null,"topics":["docker","kubernetes","plumber","plumber-api","r"],"latest_commit_sha":null,"homepage":"","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/j-buitrago.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-01-16T21:06:05.000Z","updated_at":"2023-03-25T05:21:21.000Z","dependencies_parsed_at":null,"dependency_job_id":"619eaa8e-8a64-438c-8823-46c04240ece3","html_url":"https://github.com/j-buitrago/Scalable-R-API","commit_stats":null,"previous_names":["j-buitrago/scalable-r-api"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/j-buitrago%2FScalable-R-API","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/j-buitrago%2FScalable-R-API/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/j-buitrago%2FScalable-R-API/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/j-buitrago%2FScalable-R-API/manifests
","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/j-buitrago","download_url":"https://codeload.github.com/j-buitrago/Scalable-R-API/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":228065328,"owners_count":17863978,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["docker","kubernetes","plumber","plumber-api","r"],"created_at":"2024-08-13T07:05:37.127Z","updated_at":"2024-12-04T07:31:10.724Z","avatar_url":"https://github.com/j-buitrago.png","language":"R","funding_links":[],"categories":["R"],"sub_categories":[],"readme":"# Scalable-R-API\n\n## Objective\n\nThe main purpose of this project is to provide a method for creating a scalable API using R. The easiest way to create an API in R is with the Plumber package: \u003chttps://cran.r-project.org/web/packages/plumber/index.html\u003e\n\nIf you have used this package, you probably know that by default Plumber serves requests synchronously: a single R process handles one request at a time. This is a serious problem if you want to use it in a production environment. There are probably many ways to work around this; in this project I'm going to use Docker and Kubernetes to address it.\n\n\n## Create a simple ML model\n\nTo work with a real model, we can train one on the popular ```mtcars``` dataset. The objective is not to create a great model; we are just simulating a real situation in which an ML model serves predictions via an API.\n\nExecute this command to create the object ```RfModel.RDS```. 
I recommend the trimmer package (\u003chttps://cran.r-project.org/web/packages/trimmer/index.html\u003e) to simplify your ML model.\n\n```\nRscript ./R/createModel.R\n```\n\n## Build a Docker container\n\n- I will use an older R image (3.5.0). To pull this image, run:\n\n```\ndocker pull rocker/r-ver:3.5.0\n```\n\n- Now we can build a Docker image containing our Plumber API. The information needed to build it is in ```./Dockerfile```:\n\n```\ndocker build -t plumber-example .\n```\n\nTo run our container, we just have to execute:\n\n```\ndocker run --rm -p 8000:8000 plumber-example\n```\n\nIf everything is working correctly, you should see something like this:\n\n![DockerRunning](https://github.com/j-buitrago/Scalable-R-API/blob/master/images/DockerRunning.png)\n\n\nIt's time to use our API to make predictions. If you run the next command, you should receive [\"6\"], the prediction of our Random Forest model.\n\n```\ncurl -d '{\"data\":{\"mpg\":21,\"disp\":160,\"hp\":100,\"drat\":3.9,\"wt\":2.62,\"qsec\":16.46,\"vs\":0,\"am\":1,\"gear\":4,\"carb\":4}}' http://127.0.0.1:8000/prediction\n```\n\nNice, we can use our model. But what happens if we send the following request first and then immediately try to make a prediction? The prediction will take about five seconds, because it has to wait for the first request to finish.\n\n```\ncurl http://127.0.0.1:8000/asynchronousTest\n```\n\nIf we open the file ```./R/PredictRf```, we can see that ```asynchronousTest``` just waits 5 seconds and returns \"OK\".\nWe can use this endpoint to confirm that, for now, our API is synchronous.\n\n## Kubernetes to scale R API\n\nTo run Kubernetes locally I'm going to use minikube; the installation documentation is here: https://kubernetes.io/es/docs/tasks/tools/install-minikube/\n\nStart the Kubernetes cluster and deploy the API:\n\n```\nminikube start\neval $(minikube docker-env)\nkubectl apply -f deployment.yaml\n```\n\nYou can check that you have a pod named plumber-example-... 
running:\n\n```\nkubectl get pods --output=wide\n```\n\nWe have to expose the deployment as a service to be able to consume our API.\n\n```\nkubectl expose deployment plumber-example --type=LoadBalancer --name=plumber-service\n```\n\nWith minikube, you can find the IP and port of the running API with:\n\n```\nminikube service plumber-service\n```\n\nIf you are not using minikube, just execute:\n\n```\nkubectl describe services plumber-service\n```\n\nWhat happens if we now use ```asynchronousTest``` to check our API? At this moment the API is still synchronous, but this is where we can use Kubernetes to scale it! Execute this command and try again:\n\n```\nkubectl scale deployment/plumber-example --replicas=3\n```\n\nAt this point you should have an API that behaves asynchronously thanks to Kubernetes: each pod still handles one request at a time, but the service distributes requests across the three replicas. To check that you have three different pods running, execute ```kubectl get pods --output=wide``` and you should see this:\n\n![ThreePods](https://github.com/j-buitrago/Scalable-R-API/blob/master/images/ThreePods.png)\n\n\n## Resources\n\n- https://kubernetes.io/docs/tutorials/stateless-application/expose-external-ip-address/\n- https://medium.com/tmobile-tech/using-docker-to-deploy-an-r-plumber-api-863ccf91516d\n- https://kubernetes.io/docs/setup/learning-environment/minikube/\n- https://kubernetes.io/docs/concepts/overview/working-with-objects/kubernetes-objects/\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fj-buitrago%2FScalable-R-API","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fj-buitrago%2FScalable-R-API","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fj-buitrago%2FScalable-R-API/lists"}