Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/baguasys/operator
Kubernetes operator for Bagua distributed training job.
- Host: GitHub
- URL: https://github.com/baguasys/operator
- Owner: BaguaSys
- License: mit
- Created: 2021-06-11T02:35:12.000Z (over 3 years ago)
- Default Branch: master
- Last Pushed: 2023-02-07T00:04:54.000Z (almost 2 years ago)
- Last Synced: 2023-03-04T04:35:17.939Z (almost 2 years ago)
- Language: Go
- Homepage: https://baguasys.github.io/tutorials/kubernetes-integration/index.html
- Size: 33.7 MB
- Stars: 9
- Watchers: 7
- Forks: 5
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Kubernetes operator for Bagua jobs
This repository implements a Kubernetes operator for Bagua distributed training jobs. It supports both static and elastic workloads. See the [CRD definition](https://github.com/BaguaSys/operator/blob/preonline/config/crd/bases/bagua.kuaishou.com_baguas.yaml).
### Prerequisites
- Kubernetes
- kubectl

### Installation
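Before installing, it can help to confirm that the required tools are on `PATH`. The following is a minimal sketch, not part of this repository: the function name `check_tools` is hypothetical, and `go` is included only on the assumption that the local `go run` path will be used.

```shell
# Report which of the given commands are available on PATH.
# Returns non-zero if any of them are missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# kubectl is a listed prerequisite; go is assumed here because of the `go run` step.
check_tools kubectl go || echo "install the missing tools before continuing" >&2
```

If both tools are found, `kubectl cluster-info` is a quick way to confirm that kubectl can reach a cluster.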
#### Run the operator locally
```shell
git clone https://github.com/BaguaSys/operator.git
cd operator

# Install the CRD
kubectl apply -f config/crd/bases/bagua.kuaishou.com_baguas.yaml

# Run the operator locally
go run ./main.go
```
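After applying the CRD, it may be worth confirming from another terminal that the API server has registered it. The sketch below derives the CRD name from the manifest filename, assuming the file follows the `<group>_<plural>.yaml` naming convention (which would give `baguas.bagua.kuaishou.com`). The helper names `crd_name_from_file` and `check_crd` are hypothetical and not part of this repository.

```shell
# Derive "<plural>.<group>" from a manifest named "<group>_<plural>.yaml".
# Assumption: the file follows that naming convention.
crd_name_from_file() {
  base=$(basename "$1" .yaml)
  echo "${base##*_}.${base%_*}"
}

# Ask the API server for the CRD; fails if it is not registered.
check_crd() {
  kubectl get crd "$(crd_name_from_file "$1")"
}

crd_name_from_file config/crd/bases/bagua.kuaishou.com_baguas.yaml
# prints: baguas.bagua.kuaishou.com
```

With a cluster available, `check_crd config/crd/bases/bagua.kuaishou.com_baguas.yaml` performs the lookup. `kubectl get -f config/crd/bases/bagua.kuaishou.com_baguas.yaml` is an alternative that does not rely on the naming convention.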
#### Deploy the operator
Install Bagua on an existing Kubernetes cluster.
```shell
kubectl apply -f https://raw.githubusercontent.com/BaguaSys/operator/master/deploy/deployment.yaml
```
Enjoy! Bagua will create resources in the `bagua` namespace.

### Examples
Sample manifests are available in `config/samples`. Run them as follows:
- static mode
```shell
kubectl apply -f config/samples/bagua_v1alpha1_bagua_static.yaml
```
Verify pods are running
```shell
kubectl get pods
NAME                           READY   STATUS    RESTARTS   AGE
bagua-sample-static-master-0   1/1     Running   0          45s
bagua-sample-static-worker-0   1/1     Running   0          45s
bagua-sample-static-worker-1   1/1     Running   0          45s
```
- elastic mode
```shell
kubectl apply -f config/samples/bagua_v1alpha1_bagua_elastic.yaml
```
Verify pods are running
```shell
kubectl get pods
NAME                            READY   STATUS    RESTARTS   AGE
bagua-sample-elastic-etcd-0     1/1     Running   0          63s
bagua-sample-elastic-worker-0   1/1     Running   0          63s
bagua-sample-elastic-worker-1   1/1     Running   0          63s
bagua-sample-elastic-worker-2   1/1     Running   0          63s
```
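Rather than reading the pod table by eye, the check can be scripted. The following is a minimal sketch that reads `kubectl get pods` output in the column layout shown above and fails if any pod's STATUS is not `Running`. The function name `all_running` is hypothetical and not part of this repository.

```shell
# Read `kubectl get pods` output on stdin (header line first).
# Print any pod whose STATUS column is not "Running" and return non-zero.
all_running() {
  awk 'NR > 1 && $3 != "Running" { print "not running: " $1; bad = 1 }
       END { exit bad + 0 }'
}

# Example using output in the format shown above (no cluster needed).
all_running <<'EOF'
NAME                            READY   STATUS    RESTARTS   AGE
bagua-sample-elastic-etcd-0     1/1     Running   0          63s
bagua-sample-elastic-worker-0   1/1     Running   0          63s
EOF
echo "exit status: $?"
# prints: exit status: 0
```

Against a live cluster this would be `kubectl get pods | all_running`. `kubectl wait --for=condition=Ready pod --all --timeout=120s` is a built-in alternative that blocks until the pods are ready.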