Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/afritzler/kubernetes-gpu
Install NVIDIA GPU Support on CoreOS based Kubernetes Clusters
https://github.com/afritzler/kubernetes-gpu
coreos gpu infrastructure kubernetes machinelearning nvidia
Last synced: about 1 month ago
JSON representation
Install NVIDIA GPU Support on CoreOS based Kubernetes Clusters
- Host: GitHub
- URL: https://github.com/afritzler/kubernetes-gpu
- Owner: afritzler
- License: other
- Created: 2018-05-15T08:08:02.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2018-06-15T16:08:53.000Z (over 6 years ago)
- Last Synced: 2024-11-10T07:45:10.505Z (3 months ago)
- Topics: coreos, gpu, infrastructure, kubernetes, machinelearning, nvidia
- Homepage:
- Size: 19.5 KB
- Stars: 2
- Watchers: 3
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE.md
- Codeowners: CODEOWNERS
Awesome Lists containing this project
README
# kubernetes-gpu
Install NVIDIA GPU Support on CoreOS based Kubernetes Cluster# Prerequisits
* CoreOS based Kubernetes cluster with GPU nodes (e.g. AWS P2 instances)# Installation
First install the nvidia driver via this daemonset
```
kubectl apply -f https://raw.githubusercontent.com/afritzler/kubernetes-gpu/master/k8s-nvidia-driver.yaml
```
Wait until the init container finishes on each node and install the device plugin
```
kubectl apply -f https://raw.githubusercontent.com/afritzler/kubernetes-gpu/master/k8s-nvidia-deviceplugin.yaml```
# Run
To run an example training on a GPU node, start first a base image with Tensorflow with GPU support & Keras
```
kubectl apply -f https://raw.githubusercontent.com/afritzler/deeplearning-workbench/master/manifests/dl-workbench.yaml
```
Now `exec` into the container and start an example Keras traing
```
kubectl exec -it deeplearning-workbench-8676458f5d-p4d2v -- /bin/bash
cd /keras/example
python imdb_cnn.py
```# Open Issues
- [ ] Label GPU nodes and add NodeSelector to daemonset# Acknowledgments & References
* Build and install NVIDIA driver on CoreOS: https://github.com/squat/modulus
* Nvidia Device Plugin: https://github.com/kubernetes/kubernetes/blob/master/cluster/addons/device-plugins/nvidia-gpu/daemonset.yaml