{"id":29244231,"url":"https://github.com/hypnosapos/cartpole-rl-remote","last_synced_at":"2025-10-08T22:58:36.229Z","repository":{"id":29571737,"uuid":"120626003","full_name":"hypnosapos/cartpole-rl-remote","owner":"hypnosapos","description":"CartPole game by Reinforcement Learning, a journey from training to inference","archived":false,"fork":false,"pushed_at":"2025-06-16T08:35:34.000Z","size":84063,"stargazers_count":25,"open_issues_count":90,"forks_count":3,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-06-16T09:52:41.515Z","etag":null,"topics":["artificial-intelligence","cartpole","keras","keras-neural-networks","kubeflow","kubernetes-cluster","machine-learning","mlflow","mlops","polyaxon","pytorch","qlearning","reinforcement-learning","seldon","seldon-core","tensorflow"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hypnosapos.png","metadata":{"files":{"readme":"README.rst","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2018-02-07T14:32:24.000Z","updated_at":"2025-02-27T07:38:59.000Z","dependencies_parsed_at":"2023-02-10T08:02:51.600Z","dependency_job_id":"87978181-32eb-4438-8983-66463885f58a","html_url":"https://github.com/hypnosapos/cartpole-rl-remote","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/hypnosapos/cartpole-rl-remote","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hypnosapos%2Fcartpole-rl-remote","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/
repositories/hypnosapos%2Fcartpole-rl-remote/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hypnosapos%2Fcartpole-rl-remote/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hypnosapos%2Fcartpole-rl-remote/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hypnosapos","download_url":"https://codeload.github.com/hypnosapos/cartpole-rl-remote/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hypnosapos%2Fcartpole-rl-remote/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263403959,"owners_count":23461234,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","cartpole","keras","keras-neural-networks","kubeflow","kubernetes-cluster","machine-learning","mlflow","mlops","polyaxon","pytorch","qlearning","reinforcement-learning","seldon","seldon-core","tensorflow"],"created_at":"2025-07-03T21:09:08.928Z","updated_at":"2025-10-08T22:58:36.222Z","avatar_url":"https://github.com/hypnosapos.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"Cartpole RL Remote\n==================\n\n.. image:: https://circleci.com/gh/hypnosapos/cartpole-rl-remote/tree/master.svg?style=svg\n   :target: https://circleci.com/gh/hypnosapos/cartpole-rl-remote/tree/master\n   :alt: Build Status\n.. 
image:: https://app.fossa.io/api/projects/git%2Bgithub.com%2Fhypnosapos%2Fcartpole-rl-remote.svg?type=shield\n   :target: https://app.fossa.io/projects/git%2Bgithub.com%2Fhypnosapos%2Fcartpole-rl-remote?ref=badge_shield\n   :alt: License status\n.. image:: https://badges.frapsoft.com/os/v1/open-source.svg?v=103\n   :alt: We love OpenSource\n\n\nThis project plays the `CartPole \u003chttps://gym.openai.com/envs/CartPole-v0/\u003e`_ game using Reinforcement Learning\nand explores how to train different model experiments with enough observability (metrics/monitoring).\n\nThe model is basically divided into three parts: neural network model, QLearning algorithm and application runner.\n\nWe want to show you a journey from custom training of models to a production platform based on open source (training/inference).\n\nRequirements\n============\n\nBasic scenario (Station #1):\n\n- Make (gcc)\n- Docker (18+)\n- Docker compose (version 17.06.+, compose file format 3.3)\n- python \u003e= 3.5\n\nAdvanced scenarios (Station #2 and #3):\n\n- kubernetes (1.14+)\n- polyaxon (0.5.6)\n- seldon (0.4.0)\n\nStation #1: Custom trainer and metrics collection\n=================================================\n\nAs with any other software development, machine learning code must follow the same best practices, so\nit's very important to keep in mind that our code should run in any environment, on a laptop or on any cloud (avoiding vendor-specific services and ensuring portability).\n\nOur first attempt was to train the CartPole model with our own trainer, a multiprocessing python module;\nby default it tries to use one processor for each hyperparameter combination (model experiment).\n\n*NOTE*: Yes, we could have tried tensorboard callbacks for the underlying Keras model (or tensorflow models), but we wanted to explore other ways too.\n\nCollecting metrics with visdom\n------------------------------\n\nWe trust in logs, so all details of model training should be outlined using 
built-in logging libraries, and then the instrumentation\nmay come from tools that process these log lines. As a first approach we used a log handler for the Visdom server in order to send metrics to an external site.\n\nUsing python virtual env\n^^^^^^^^^^^^^^^^^^^^^^^^\n\nRequirements:\n\n- Python (3.5+)\n\nTo create a local virtual env for python, type::\n\n   make venv\n\nWhen this virtual env is activated, we can use the ``cartpole`` command client directly::\n\n   source .venv/bin/activate\n   cartpole --help\n\n\nTwo arguments set up metrics collection: ``--metrics-engine`` and ``--metrics-config``.\n\nThe simplest way to train the model and collect metrics with visdom (through a docker container) is the following command::\n\n   make train-dev\n\nChange the default hyperparameter values in the Makefile if you want another combination.\n\n*NOTE*: Render mode is activated with the ``-r`` argument if you want to watch the CartPole game during training.\n\nThe Visdom server should be available at http://localhost:8097 with metrics and model evaluation results; this process also outputs an **h5** file with the best trained model.\n\n\nUsing docker compose\n^^^^^^^^^^^^^^^^^^^^\n\nIf you prefer to use docker containers for everything, launch this command (powered by docker-compose)::\n\n   make train-docker-visdom\n\n\n\n.. 
image:: assets/cartpole-visdom.gif\n   :alt: Basic Scenario - Visdom\n\nUsing docker log drivers, EFK in action\n---------------------------------------\n\nOk, it's possible to implement our own metrics collector, but as we are using containers, couldn't we use docker log drivers to extract metrics from log lines?\nYes, of course.\n\nWe've created a fluentd conf file (under the **scaffold/efk/fluentd** directory) that specifies the regex pattern for the log lines of interest; fluentd sends the metrics to elasticsearch,\nand visualizations of the metrics are available through a kibana dashboard.\n\nTo run this stack, type::\n\n   make train-docker-efk\n\n\nThe Kibana URL is http://localhost:5601. Set ``cartpole-*`` as the index pattern.\nIn the **scaffold/efk/kibana** directory you can find a kibana dashboard JSON file that you can import to view all charts about cartpole model experiments.\n\n.. image:: assets/cartpole-efk.gif\n   :alt: Basic Scenario - EFK\n\nUsing ModelDB as experiment repository\n--------------------------------------\n\nModelDB is an MIT-licensed project that lets us track our model experiments easily.\n\nTry it by typing::\n\n   make train-docker-modeldb\n\nThe frontend is available at: http://localhost:3000\n\n.. 
image:: assets/cartpole-modeldb.gif\n   :alt: Basic Scenario - ModelDB\n\nStation #2: Advanced training with Polyaxon\n===========================================\n\nWell, we have a simple model trainer with a simple hyperparameter tuning implementation (something like the well-known grid search algorithm).\n\nA few weeks ago I discovered `polyaxon \u003chttp://polyaxon.com\u003e`_, whose goal is to train models seamlessly.\nThe challenge now is to create a polyaxon wrapper to train multiple experiments.\n\nIt uses kubernetes as its platform, so the first thing we need is to create a cluster (take a look at `k8s-gke \u003chttps://github.com/hypnosapos/k8s-gke\u003e`_)::\n\n   export GCP_CREDENTIALS=/your_path/gcp.json\n   export GCP_ZONE=europe-west4-a\n   export GCP_PROJECT_ID=\u003cmy_project\u003e\n   export GKE_CLUSTER_NAME=cartpole\n   export GITHUB_TOKEN=\u003cgithubtoken\u003e\n   git clone https://github.com/hypnosapos/k8s-gke.git\n   make -C k8s-gke gke-bastion gke-create-cluster gke-tiller-helm gke-proxy gke-ui-login-skip gke-ui\n\n\nWe'll use a ZFS server to create shared volumes in the same GCP_ZONE (feel free to change the volume driver)::\n\n   make -C scaffold/polyaxon gke-polyaxon-nfs\n\n\nInstall the polyaxon components on kubernetes and configure the polyaxon client in the gke-bastion container::\n\n   make -C scaffold/polyaxon gke-polyaxon-preinstall gke-polyaxon-install gke-polyaxon-cartpole-init\n\n\nFinally, let's deploy our experiment group with this command::\n\n   make  -C scaffold/polyaxon gke-polyaxon-cartpole\n\n\nYou can use the gke-bastion container as a proxy for gcloud, kubectl or polyaxon commands directly, e.g.::\n\n   docker exec -it gke-bastion sh -c \"kubectl get pods -w -n polyaxon\"\n   docker exec -it gke-bastion sh -c \"polyaxon project experiments\"\n\nHere are some screenshots of the web console and command client:\n\n.. image:: assets/polyaxon.png\n   :alt: Polyaxon\n\n.. image:: assets/polyaxon_charts.png\n   :alt: Polyaxon Chart\n\n.. 
image:: assets/polyaxon_cli.png\n   :alt: Polyaxon Command Client\n\nStation #3: Model inference with Seldon\n=======================================\n\nThe idea is to get trained models and deploy them within `Seldon \u003chttps://seldon.io\u003e`_.\n\nTo create your own seldon images, use::\n\n    make seldon-build seldon-push\n\nThis command uses the official seldon wrapper to build and push your docker images.\nThe image build process mainly attaches the best-scoring model (h5 file) so that it is served through the \"predict\" entry method for client requests once the seldon microservice is ready.\nNote that trained models are moved from the default \".models\" local directory to the *scaffold/seldon* directory to be included in the docker image, but you can choose another location,\neven cloud storage such as S3, GCS, ... (if you are thinking about linking the output directory used in the polyaxon training stage, you're right).\n\nWe provide some docker images for this PoC with different scores under the `dockerhub org hypnosapos \u003chttps://hub.docker.com/r/hypnosapos/cartpolerlremoteagent/tags/\u003e`_.\n\nDeploy Seldon\n-------------\n\nWe're going to use the same kubernetes cluster, but you may use another.\n\nDeploy Seldon::\n\n   make gke-seldon-install\n\n\nDeploy CartPole within Seldon\n-----------------------------\n\nDeploy different seldon graphs for the CartPole model by choosing one of [model, abtest, router] for the SELDON_MODEL_TYPE variable::\n\n   SELDON_MODEL_TYPE=router make gke-seldon-cartpole\n\nTake a look at the files under the **scaffold/k8s/seldon** directory.\n\nLet's deploy a router (it uses the epsilon-greedy router from the seldon team) with three branches: two for \"untrained\" models ('cartpole-0' and 'cartpole-1', low score metric),\nand one branch with a \"max_score\" ('cartpole-2', score metric 7000, the max value in training).\nThe default branch is 0 ('cartpole-0') at the beginning; as requests are received the router will 
redirect traffic automatically to branch 2 ('cartpole-2') according to the best-scoring model.\n\nCheck that the pods are ready::\n\n   docker exec -it gke-bastion sh -c \"kubectl get pods -l seldon-app=cartpole-router -w -n seldon\"\n   NAME                                               READY     STATUS    RESTARTS   AGE\n   cartpole-router-cartpole-router-6678798bf4-4sz7x   5/5       Running   0          2m\n\n   docker exec -it gke-bastion sh -c 'kubectl get pods -l seldon-app=cartpole-router -o jsonpath=\"{.items[*].spec.containers[*].image}\" -n seldon | tr -s \"[[:space:]]\" \"\\n\"'\n   hypnosapos/cartpolerlremoteagent:untrained\n   hypnosapos/cartpolerlremoteagent:untrained\n   hypnosapos/cartpolerlremoteagent:max_score\n   seldonio/mab_epsilon_greedy:1.1\n   seldonio/engine:0.1.6\n\n\n\nRun remote agent\n----------------\n\nYou have to get the external IP from svc/seldon-apiserver to set the RUN_MODEL_IP variable.\n\nTo get model predictions, launch these commands in your shell::\n\n  export RUN_MODEL_IP=$(docker exec -it gke-bastion sh -c \\\n  'kubectl get svc seldon-apiserver -n seldon -o jsonpath=\"{.status.loadBalancer.ingress[0].ip}\"')\n  make docker-visdom\n  make run-dev\n\n\nModel metrics in running mode will be collected on `local visdom server \u003chttp://localhost:8059\u003e`_.\n\nTake a look at the grafana dashboard to view seldon metrics. Since the *seldon-core-analytics* helm chart was installed\nwith the loadbalancer endpoint type, find its public IP to get access.\n\n.. 
image:: assets/seldon.png\n   :alt: Seldon router\n\nOpen-source platforms\n=====================\n\nWe love OSS, we're K8S lovers, and we think this container orchestration engine is key to the future of hybrid-cloud or multi-cloud strategies.\nImagine being able to specify an AI pipeline (ETL or other intensive pre-processing tasks, model evaluation, serving, etc.) via a declarative or programmatic DSL.\n\nWe recommend taking a look at one of the most widely contributed OSS projects, `kubeflow \u003chttps://www.kubeflow.org/\u003e`_ (we're focused on it today); some of the products shown above can be integrated with it, and you can contribute to it or extend it for your needs.\n\nLicense\n=======\n\n.. image:: https://app.fossa.io/api/projects/git%2Bgithub.com%2Fhypnosapos%2Fcartpole-rl-remote.svg?type=large\n   :target: https://app.fossa.io/projects/git%2Bgithub.com%2Fhypnosapos%2Fcartpole-rl-remote?ref=badge_large\n   :alt: License Check\n\nAuthors\n=======\n\n- David Suarez   - `davsuacar \u003chttp://github.com/davsuacar\u003e`_\n- Enrique Garcia - `engapa \u003chttp://github.com/engapa\u003e`_\n- Leticia Garcia - `laetitiae \u003chttp://github.com/laetitiae\u003e`_\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhypnosapos%2Fcartpole-rl-remote","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhypnosapos%2Fcartpole-rl-remote","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhypnosapos%2Fcartpole-rl-remote/lists"}