https://github.com/boozallen/goobernetes
On-prem GitHub Actions runners, backed by Kubernetes
- Host: GitHub
- URL: https://github.com/boozallen/goobernetes
- Owner: boozallen
- Created: 2021-08-06T16:49:45.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2022-01-18T09:56:54.000Z (almost 3 years ago)
- Last Synced: 2024-10-30T22:39:49.895Z (2 months ago)
- Language: Dockerfile
- Size: 729 KB
- Stars: 36
- Watchers: 4
- Forks: 6
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE.md
Awesome Lists containing this project
- [awesome-runners](https://jonico.github.io/awesome-runners/)
README
# Goobernetes - the Kubernetes cluster of GitHub Actions runners
This repo contains all the code and a walkthrough for building an on-premises, auto-scaling Kubernetes cluster of GitHub Actions runners.
## But why though
Booz Allen Hamilton is a customer of GitHub Enterprise Server (and Cloud). We frequently need to stay on-premises for regulatory reasons, but don't want to compromise on offering an excellent developer experience to our teammates. With a wide variety of projects and people in our shared environment, we cannot spend time managing bespoke software dependencies on co-tenanted compute and troubleshooting interactions between project requirements. It is simply not an option at our scale.
The end goal of this project is a minimally viable product of ephemeral Linux-based Kubernetes runners capable of Docker-in-Docker and easy customization. It should be portable enough to host in your Kubernetes ecosystem of choice. Source code, dependencies, and directions are listed below.
## Disclaimer
This works for us, in production, for the several thousand users in our installation. I think it _should_ work with GitHub AE and Enterprise Cloud too, but I haven't tested that and YMMV. If you try it, let us know how it's working!
## Documentation
There are a couple of layers to this solution, and each has its own docs. If you're going through this for the first time, we're going to assume you have the following things already set up:
- GitHub Enterprise Server (v3.0+) and that you have admin access to it.
- GitHub [Packages](https://docs.github.com/en/[email protected]/admin/packages) and [Actions](https://docs.github.com/en/[email protected]/admin/github-actions/enabling-github-actions-for-github-enterprise-server) are both already set up correctly and enabled.
- VMs provisioned however you need in order to start building them into a Kubernetes cluster. There are about a billion ways you can do this, so the setup directions assume minimal on-premises VMs and plain `kubectl` without anything fancy on top.

The foundation (for us) is on-prem hosting in vSphere. It's in the same hosting cluster as GitHub Enterprise Server. Each worker node is running Red Hat Enterprise Linux, mostly to be the same as all the other ancillary boxes that support GitHub. The software and configuration of these nodes are controlled by Ansible playbooks. Most of these have been omitted because they're very specific to our implementation - things such as baseline configuration, software installs, etc. A single playbook that sets up a worker node is included in the [ansible](ansible) directory, but its inventory file is not. If you prefer, the directions to do these steps manually are in the [cluster setup](docs/kubernetes/SETUP.md) page. It should be possible to do this same task with many other combinations of virtualization platforms, operating systems, and configuration management tooling.

The next building block is the Kubernetes applications. Here are some quick facts about how we set that up:
- [Flannel](https://github.com/flannel-io/flannel) controls the networking.
- [Cert-manager](https://cert-manager.io/) provides certificate generation and management.
- [Actions Runner Controller](https://github.com/actions-runner-controller/actions-runner-controller) is what actually connects to GitHub Enterprise Server and manages/scales the runners.
- [Helm](https://helm.sh/) is how all of the deployments are managed.

The next part of this solution is the Docker images used as runners. These are what get deployed as pods for GitHub to use. There are currently five versions we've created, listed below:
- Ubuntu 20.04 (focal)
- Debian 10 (buster)
- Debian 11 (bullseye)
- CentOS 7 (centos7)
- CentOS 8 (centos8)

The Dockerfiles and all of the other software needed for each are in the [images](images) directory. The extra scripts and such provide additional software, install and configure the runner agent to automatically join the enterprise worker pool, configure logging, etc. In general, software that isn't commonly available in that distribution's default repositories is controlled by a shell script in the [software](images/software) directory.
:information_source: Looking to add a tool cache to the image so the team isn't downloading the same thing a million times? Look for directions [here](docs/TOOL-CACHE.md)!
The most visible part of the configuration for deploying these runners is in the [deployments](deployments) directory. This directory only contains the YAML files that define the runner deployments. Here you'll find settings such as the resources allotted to any given worker and how the controller scales that deployment.
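
As a rough illustration of what such a file can look like, here is a minimal sketch of a runner deployment and its autoscaler using Actions Runner Controller's `actions.summerwind.dev/v1alpha1` resources. It is not copied from the [deployments](deployments) directory: the names, namespace, enterprise slug, image reference, label, and every number below are placeholder assumptions, and the field names should be checked against the controller version you actually install.

```yaml
# Illustrative sketch only; all names and values are placeholders.
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: example-runners            # hypothetical name
  namespace: runners               # hypothetical namespace
spec:
  template:
    spec:
      enterprise: your-enterprise  # placeholder enterprise slug
      image: registry.example.com/your-org/runner-ubuntu-focal:latest  # placeholder image
      labels:
        - ubuntu-focal             # hypothetical label that workflows can target
      ephemeral: true              # each runner pod handles one job, then is replaced
      dockerdWithinRunnerContainer: true  # Docker-in-Docker inside the runner container
      resources:                   # how much CPU and memory each worker gets
        requests:
          cpu: "1"
          memory: 2Gi
        limits:
          cpu: "2"
          memory: 4Gi
---
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: example-runners-autoscaler # hypothetical name
  namespace: runners
spec:
  scaleTargetRef:
    name: example-runners          # must match the RunnerDeployment above
  minReplicas: 2
  maxReplicas: 10
  scaleDownDelaySecondsAfterScaleOut: 300  # wait before scaling down, to limit flapping
  metrics:
    - type: PercentageRunnersBusy
      scaleUpThreshold: "0.75"     # scale up when more than 75% of runners are busy
      scaleDownThreshold: "0.25"   # scale down when fewer than 25% are busy
      scaleUpFactor: "2"
      scaleDownFactor: "0.5"
```

The deployment deliberately leaves out `replicas` so that the autoscaler alone decides how many runner pods exist.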
Lastly, the [workflows](github/workflows) directory provides the CI/CD pipeline for building, testing, and deploying the images. Of course we're going to use GitHub to build GitHub! :tada:
## I just want the images to use
Neat! Look at the packages to the right to download the latest image. They're built monthly and on pull request against the `main` branch.
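
For a sense of how a monthly-plus-pull-request build schedule can be expressed, here is a minimal workflow sketch. It is not one of the files in this repository: the workflow name, cron time, runner label, Dockerfile path, and image tag are all assumptions made for illustration.

```yaml
# Illustrative sketch only; not one of this repository's workflows.
name: Build runner image (example)

on:
  schedule:
    - cron: "0 6 1 * *"        # assumed time: 06:00 UTC on the first day of each month
  pull_request:
    branches:
      - main                   # build on pull requests against main

jobs:
  build:
    runs-on: ubuntu-latest     # assumption; swap in your own self-hosted runner labels
    steps:
      - uses: actions/checkout@v4
      - name: Build the image
        # Hypothetical Dockerfile path and tag; adjust to the image you want to build.
        run: docker build --file images/example.Dockerfile --tag example-runner:test .
```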
## How-to docs, next steps, and FAQs
- [Cluster setup](docs/kubernetes/SETUP.md)
- [Resetting the cluster](docs/kubernetes/RESET.md)
- [Docker images](docs/docker/BUILD.md)
- [Initial setup in GHES](docs/github/SETUP.md)
- [Next steps](docs/NEXT-STEPS.md)
- [Tips](docs/TIPS.md)

## Sources
You should read these, as they're all excellent and can provide more insight into the customization options and updates than are available in this repository.
- Kubernetes controller for self-hosted runners, on [GitHub](https://github.com/actions-runner-controller/actions-runner-controller), is the glue that makes this entire solution possible.
- Docker image for runners that can automatically join, which solved a good bit of getting the runner agent started automatically on each pod, [Write up](https://sanderknape.com/2020/03/self-hosted-github-actions-runner-kubernetes/) and [GitHub](https://github.com/SanderKnape/github-runner).
- GitHub's repository used to generate their runners' images ([GitHub](https://github.com/actions/virtual-environments)), where I got the idea of using shell scripts to layer discrete dependency management on top of a base image. I can't provision full VMs as runners in my environment, as it appears the repository supports, but several of the [software](../images/software) scripts are copy/pasted directly out of that repo.

## Other Resources
- Don't know what the whole Kubernetes thing is about? Here's some help:
  - The [Kubernetes Aquarium](https://medium.com/@AnneLoVerso/the-kubernetes-aquarium-6a3d1d7a2afd)
  - The Cloud Native Computing Foundation's book, [The Illustrated Children's Guide to Kubernetes](https://www.cncf.io/phippy/the-childrens-illustrated-guide-to-kubernetes/)
- What helped me to understand this whole concept shift is to think that Kubernetes is to containers as KVM/vSphere/Hyper-V is to virtual machines. It's probably not a perfect metaphor, but it helped. :smile: