{"id":16540014,"url":"https://github.com/rickstaa/pytorch-rl-on-google-ai-example","last_synced_at":"2025-10-28T15:08:12.050Z","repository":{"id":104253161,"uuid":"283686601","full_name":"rickstaa/pytorch-RL-on-google-AI-example","owner":"rickstaa","description":"A sample repository for performing RL (DQL) on the google ai platform.","archived":false,"fork":false,"pushed_at":"2020-11-04T09:42:59.000Z","size":31,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-06-03T09:19:19.365Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rickstaa.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-07-30T06:18:19.000Z","updated_at":"2020-11-04T09:43:01.000Z","dependencies_parsed_at":null,"dependency_job_id":"a4f8ec8f-d29c-4174-aab7-d5212a4d7ad6","html_url":"https://github.com/rickstaa/pytorch-RL-on-google-AI-example","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/rickstaa/pytorch-RL-on-google-AI-example","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rickstaa%2Fpytorch-RL-on-google-AI-example","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rickstaa%2Fpytorch-RL-on-google-AI-example/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rickstaa%2Fpytorch-RL-on-google-AI-example/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/h
osts/GitHub/repositories/rickstaa%2Fpytorch-RL-on-google-AI-example/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rickstaa","download_url":"https://codeload.github.com/rickstaa/pytorch-RL-on-google-AI-example/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rickstaa%2Fpytorch-RL-on-google-AI-example/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":260600021,"owners_count":23034623,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-11T18:51:21.759Z","updated_at":"2025-10-28T15:08:11.933Z","avatar_url":"https://github.com/rickstaa.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Pytorch RL on Google AI example\n\nThis repository contains a quick example on how to push a `PyTorch` based, deep q-learning\n(DQL) job to the Google ai platform while viewing the results using `tensorboard`. It was based on the\n[Google container documentation](https://cloud.google.com/ai-platform/training/docs/custom-containers-training) and uses the code of [the DQL Pong example of @colinskow's move37 class](https://github.com/colinskow/move37). This repository is not meant to be an in-depth review of either Reinforcement learning (RL), docker or\nGoogle cloud but mainly serves as a template for performing PyTorch based RL in the cloud. 
For more information on these topics, see the following resources:\n\n- Check out the videos of [@colinskow for an excellent overview of RL](https://www.youtube.com/watch?v=14BfO5lMiuk\u0026list=PLWzQK00nc192L7UMJyTmLXaHa3KcO0wBT).\n- Check out the video of [@kodekloud for an introduction to Docker](https://www.youtube.com/watch?v=zJ6WbK9zFpI\u0026t=3488s).\n- For help with setting up and using the Google cloud platform, please check out the [Google documentation](https://cloud.Google.com/ai-platform/docs/).\n\n## Requirements\n\n- A [Google cloud account](https://cloud.Google.com/free).\n- A [Google project for which billing is enabled](https://cloud.Google.com/resource-manager/docs/creating-managing-projects).\n- The [Google SDK](https://cloud.Google.com/sdk/docs).\n- [Docker](https://docs.docker.com/engine/install/ubuntu/).\n- [NVIDIA-docker](https://github.com/NVIDIA/nvidia-docker#quickstart) (OPTIONAL).\n- [Available global GPU quota](https://cloud.Google.com/compute/quotas) (OPTIONAL).\n\n## How does this work\n\nTo train RL algorithms on the Google AI platform, we need the following components:\n\n- A Python training script.\n- A Python requirements file.\n- A Docker file.\n- [The Tensorboard package](https://pypi.org/project/tensorboard/).\n- Access to the [Google container registry](https://cloud.Google.com/container-registry/docs/quickstart).\n- A [Google cloud bucket](https://cloud.Google.com/storage/docs/json_api/v1/buckets).\n\n### The python training script\n\nFor the training script, I used the `dqn_basic` script of [@colinskow's move37 class](https://github.com/colinskow/move37). Three small modifications were made to this script to use it with the Google AI platform. 
First, I added the `--model-dir` argument to the script so that we can specify the Google cloud bucket location where the model and the TensorBoard logs are stored:\n\n```python\nparser.add_argument(\n    \"--model-dir\", default=\".\", help=\"The directory to store the model\"\n)\n```\n\nNext, I used this argument to set the `log_dir` of the tensorboard `SummaryWriter` object:\n\n```python\nwriter = SummaryWriter(\n    comment=\"-\" + args.env,\n    log_dir=os.path.join(args.model_dir if args.model_dir else \".\", model_dir_name),\n)\n```\n\nLastly, I used the `gsutil` command-line tool to copy the trained model to the Google cloud bucket that is specified in the `--model-dir` argument:\n\n```python\n# Copy the saved model to the bucket with the gsutil CLI.\n# subprocess.call returns the exit code instead of raising on failure.\nretval = subprocess.call(\n    [\n        \"gsutil\",\n        \"cp\",\n        tmp_model_file,\n        os.path.join(args.model_dir, tmp_model_file),\n    ],\n    stdout=sys.stdout,\n    stderr=sys.stderr,\n)\nif retval != 0:\n    raise Exception(\n        \"Could not save the model. The supplied Google cloud \"\n        \"bucket does not exist! Shutting down training.\"\n    )\n```\n\nAlternatively, this can also be achieved with the `storage` module of the `google-cloud-storage` package (`from google.cloud import storage`; [see the Google documentation for more information](https://cloud.Google.com/storage/docs/uploading-objects#storage-upload-object-code-sample)). To use this method, comment out the code on [L180-L199](https://github.com/rickstaa/Pytorch_RL_on_google_AI_example/blob/8af3960064e1b67cfcc3efbdcbd020b3bb4c6153/dqn_basic.py#L180-L199) and [L271-L289](https://github.com/rickstaa/Pytorch_RL_on_google_AI_example/blob/8af3960064e1b67cfcc3efbdcbd020b3bb4c6153/dqn_basic.py#L271-L289) of the [dqn_basic.py](https://github.com/rickstaa/Pytorch_RL_on_google_AI_example/blob/master/dqn_basic.py) file.\n\n### Docker file\n\nThe Docker file is used to create the RL training container we want to push to the Google AI platform. 
Most of the steps in the Docker file set up the required dependencies and copy the required script files. The most important component is the `ENTRYPOINT` at the bottom of the file:\n\n```dockerfile\nENTRYPOINT [\"python3\", \"dqn_basic.py\"]\n```\n\nThis entry point makes sure our training script is executed when we deploy the docker image to the Google AI platform. After the docker image is deployed, the Google AI platform will allocate the required resources (CPU, GPU and memory) for running our training. These resources are released again when the training script finishes. As a result, you only pay for the resources you used.\n\n## Training in the cloud steps (CPU)\n\n### Create a Google cloud bucket\n\nTo obtain the lowest training cost, you must place the container registry, the storage bucket and the Google AI server in the same region. You therefore first have to choose a region in which you want to perform the training. An overview of the possible Google computing regions and zones can be found [here](https://cloud.google.com/compute/docs/regions-zones/). An overview of the Google container registry regions can be found [here](https://cloud.google.com/container-registry/docs/pushing-and-pulling). After you have chosen your region and container registry host, export them as environment variables:\n\n```bash\nexport REGION=europe-west1\nexport CONTAINER_REGION=eu.gcr.io\n```\n\nNext, you can create a Google cloud bucket:\n\n```bash\nexport PROJECT_ID=$(gcloud config list project --format \"value(core.project)\")\nexport BUCKET_NAME=${PROJECT_ID}-${REGION}-pytorch_dql_pong\ngsutil mb -l ${REGION} gs://${BUCKET_NAME}\n```\n\nPlease note that the cost you pay for storing/retrieving data in/from your bucket depends on the region you choose. 
An overview of the Google cloud storage pricing can be found [here](https://cloud.google.com/storage/pricing).\n\n### Build and test the container locally\n\nBefore pushing the containerized RL training job to the AI platform, we first want to test whether the container executes without errors. To do so, first export the following bash environment variables:\n\n```bash\nexport PROJECT_ID=$(gcloud config list project --format \"value(core.project)\")\nexport IMAGE_REPO_NAME=pytorch_dql_pong_gpu_container\nexport IMAGE_TAG=dql_pytorch_gpu\nexport IMAGE_URI=eu.gcr.io/$PROJECT_ID/$IMAGE_REPO_NAME:$IMAGE_TAG\n```\n\nYou are free to modify these variables in any way you like. After you have set these variables, you can build the docker image:\n\n```bash\ndocker build -f Dockerfile -t $IMAGE_URI ./\n```\n\nIf the container has been built successfully, we can test it:\n\n```bash\ndocker run $IMAGE_URI\n```\n\n### Push the container to the container Registry\n\nIf the container with your RL algorithm executes successfully, you can push it to the Google container registry:\n\n```bash\ndocker push $IMAGE_URI\n```\n\n### Push the RL training job\n\nFinally, we are ready to submit the training job to AI Platform Training using the gcloud tool:\n\n```bash\nexport JOB_NAME=pytorch_dql_pong_job_$(date +%Y%m%d_%H%M%S)\nexport OUTPUT_PATH=gs://$BUCKET_NAME/$JOB_NAME\ngcloud ai-platform jobs submit training $JOB_NAME \\\n --region $REGION \\\n --scale-tier BASIC \\\n --master-image-uri $IMAGE_URI \\\n -- \\\n --no-cuda \\\n --model-dir $OUTPUT_PATH\n```\n\nThe most important arguments are the following:\n\n- `--region`: The region from which you want to use the computing resources.\n- `--scale-tier`: This is the type of computing resource you use for performing the training job (see [ai-platform pricing page](https://cloud.Google.com/ai-platform/training/pricing) for more information).\n- `--master-image-uri`: The URI to your Docker image.\n- `--no-cuda`: Forces the CPU to be used 
even if a GPU is available.\n- `--model-dir`: The URI to the Google cloud bucket.\n\n### Check the results\n\nAfter the training job has been deployed, you can visualize the results directly from the Google cloud bucket by pointing tensorboard at the job's output path (for example, `tensorboard --logdir=$OUTPUT_PATH`).\n\n## Training in the cloud steps (GPU)\n\nTo use a GPU during training, you have to change the job submit command. In the job submit command, change the `--scale-tier` option from `BASIC` to `BASIC_GPU` and drop the `--no-cuda` argument:\n\n```bash\nexport JOB_NAME=pytorch_dql_pong_job_$(date +%Y%m%d_%H%M%S)\nexport OUTPUT_PATH=gs://$BUCKET_NAME/$JOB_NAME\ngcloud ai-platform jobs submit training $JOB_NAME \\\n --region $REGION \\\n --scale-tier BASIC_GPU \\\n --master-image-uri $IMAGE_URI \\\n -- \\\n --model-dir $OUTPUT_PATH\n```\n\n⚠️💰 Please keep in mind that changing the scale-tier from `BASIC` to `BASIC_GPU` increases the training cost! For an overview of the cost of training in the cloud, see [the Google AI documentation](https://cloud.Google.com/ai-platform/training/pricing).\n\n## Hyperparameter tuning\n\nAdditionally, as explained in the [Google documentation](https://cloud.google.com/ai-platform/training/docs/using-containers), you can also perform hyperparameter tuning in the cloud. In this example, I try to tune the `batch_size` hyperparameter. 
You can push a hyperparameter tuning job to the Google AI platform by supplying the job submit command with the hyperparameter `config.yaml` file:\n\n```bash\nexport JOB_NAME=pytorch_dql_pong_job_$(date +%Y%m%d_%H%M%S)\nexport OUTPUT_PATH=gs://$BUCKET_NAME/$JOB_NAME\ngcloud ai-platform jobs submit training $JOB_NAME \\\n  --region $REGION \\\n  --scale-tier BASIC \\\n  --master-image-uri $IMAGE_URI \\\n  --config config.yaml \\\n  -- \\\n  --no-cuda \\\n  --model-dir $OUTPUT_PATH\n```\n\n### Clean up\n\nWhen you are finished with this example, you can delete the Google cloud bucket created above using the following command:\n\n```bash\ngsutil rm -r gs://${BUCKET_NAME}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frickstaa%2Fpytorch-rl-on-google-ai-example","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frickstaa%2Fpytorch-rl-on-google-ai-example","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frickstaa%2Fpytorch-rl-on-google-ai-example/lists"}