{"id":17888284,"url":"https://github.com/squat/kubeconeu2018","last_synced_at":"2025-04-03T02:42:50.023Z","repository":{"id":73956893,"uuid":"131829702","full_name":"squat/kubeconeu2018","owner":"squat","description":"KubeCon EU 2018 talk on automating GPU infrastructure for Kubernetes on Container Linux","archived":false,"fork":false,"pushed_at":"2018-05-15T14:41:55.000Z","size":21,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-02-08T16:44:33.969Z","etag":null,"topics":["container-linux","gpu","kubecon","kubenetes","nvidia","terraform"],"latest_commit_sha":null,"homepage":null,"language":"HCL","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/squat.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-05-02T09:37:26.000Z","updated_at":"2019-05-14T22:38:03.000Z","dependencies_parsed_at":"2023-03-13T20:17:28.851Z","dependency_job_id":null,"html_url":"https://github.com/squat/kubeconeu2018","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/squat%2Fkubeconeu2018","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/squat%2Fkubeconeu2018/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/squat%2Fkubeconeu2018/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/squat%2Fkubeconeu2018/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/squat","download_url":"https://codeload.github.com/squat/kubeconeu2018/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246927809,"owners_count":20856193,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["container-linux","gpu","kubecon","kubenetes","nvidia","terraform"],"created_at":"2024-10-28T13:37:00.740Z","updated_at":"2025-04-03T02:42:50.002Z","avatar_url":"https://github.com/squat.png","language":"HCL","funding_links":[],"categories":[],"sub_categories":[],"readme":"# KubeCon EU 2018\n\nThis repository contains the demo code for my KubeCon EU 2018 talk about automating GPU infrastructure for Kubernetes on Container Linux.\n\n[![youtube](https://img.youtube.com/vi/i6V4KPh_D5g/0.jpg)](https://www.youtube.com/watch?v=i6V4KPh_D5g)\n[![asciicast](https://asciinema.org/a/DE7RVqDsHSPjackcPmQwFElaX.png)](https://asciinema.org/a/DE7RVqDsHSPjackcPmQwFElaX)\n\n## Prerequisites\n\nYou will need a Google Cloud account with available quota for NVIDIA GPUs.\n\n## Getting Started\n\nEdit the `require.tf` Terraform file and uncomment and add the details for your Google Cloud project:\n\n```sh\n$EDITOR require.tf\n```\n\nModify the provided `terraform.tfvars` file to suit your project:\n\n```sh\n$EDITOR terraform.tfvars\n```\n\n## Running\n\n1. create cluster:\n\n    ```sh\n    terraform apply --auto-approve\n    ```\n\n2. get nodes:\n\n    ```sh\n    export KUBECONFIG=\"$(pwd)\"/assets/auth/kubeconfig\n    watch -n 1 kubectl get nodes\n    ```\n\n3. create GPU manifests:\n\n    ```sh\n    kubectl apply -f manifests\n    ```\n\n4. check status of driver installer:\n\n    ```sh\n    kubectl logs $(kubectl get pods -n kube-system | grep nvidia-driver-installer | awk '{print $1}') -c modulus -n kube-system -f\n    ```\n\n5. check status of device plugin:\n\n    ```sh\n    kubectl logs $(kubectl get pods -n kube-system | grep nvidia-gpu-device-plugin | awk '{print $1}' | head -n1 | tail -n1) -n kube-system -f\n    ```\n\n6. verify worker node has allocatable GPUs:\n\n    ```sh\n    kubectl describe node $(kubectl get nodes | grep worker | awk '{print $1}')\n    ```\n\n7. let's inspect the GPU workload:\n\n    ```sh\n    less manifests/darkapi.yaml\n    ```\n\n8. let's see if the GPU workload has been scheduled:\n\n    ```sh\n    watch -n 2 kubectl get pods\n    kubectl logs $(kubectl get pods | grep darkapi | awk '{print $1}') -f\n    ```\n\n9. for fun, let's test the GPU workload:\n\n    ```sh\n    export INGRESS=$(terraform output | grep ingress_static_ip | awk '{print $3}')\n    ~/code/darkapi/client http://$INGRESS/api/yolo\n    ```\n\n10. finally, let's clean up:\n\n    ```sh\n    terraform destroy --auto-approve\n    ```\n\n## Projects Leveraged In This Demo\n\n| Component                | URL                                                                                                          |\n|:------------------------:|:------------------------------------------------------------------------------------------------------------:|\n| Kubernetes installer     | https://github.com/poseidon/typhoon                                                                          |\n| GPU driver installer     | https://github.com/squat/modulus                                                                             |\n| Kubernetes device plugin | https://github.com/kubernetes/kubernetes/blob/master/cluster/addons/device-plugins/nvidia-gpu/daemonset.yaml |\n| sample workload          | https://github.com/squat/darkapi                                                                             |\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsquat%2Fkubeconeu2018","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsquat%2Fkubeconeu2018","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsquat%2Fkubeconeu2018/lists"}