https://github.com/colab-coop/coopernetes

Coopernetes is a tool to demystify Kubernetes and create a basic, no-nonsense cluster that coops and other organizations can use to host applications.

This is the set of terraform, helm, and docker configurations required to manage, operate, and deploy to a no-nonsense version of Kubernetes we call Coopernetes. This project is still in very early alpha development and is currently used only by Colab Coop (https://colab.coop) and itme (https://itme.company). If you are interested in hosting containers and applications on a managed Kubernetes cluster using Coopernetes, or in deploying the infrastructure yourself, please reach out to [email protected].

## Branches and Releases
- `master` is our primary working branch. It is intended to be generic, and can be cloned and used by anyone to launch a cluster from scratch.
- `itme` and `colab` correspond to the configurations of the two organizations currently using coopernetes. We each have slightly different needs and architectures, so we're using branches to track the individual changes until we can merge them back to master.

## Tools
All commands can be installed with `brew install`, except for helm plugins, which use `helm plugin install`.

To manage the AWS infrastructure:
- `terraform` (We use 0.12.24 in this repo. Installing `tfswitch` lets you switch easily between terraform versions for different projects. You can install it by following the directions at https://tfswitch.warrensbox.com/Install/)
- `awscli`
- `wget`

To manage and deploy applications on kubernetes:
- `kubectl`
- `helm`
- `helmfile`
- the helm-diff plugin: `helm plugin install https://github.com/databus23/helm-diff`
- the helm-secrets plugin: `helm plugin install https://github.com/zendesk/helm-secrets`
- `gnu-getopt`: used by helm-secrets
- `velero`: used for backup and restore
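
Before going further, it can help to confirm that the tools above are actually on your `PATH`. The following is a minimal sketch and not part of the repository: `missing_tools` is a hypothetical helper name, and the binary names passed to it are assumptions based on the default Homebrew installs (for example, `awscli` provides the `aws` binary).

```shell
#!/bin/sh
# Hypothetical helper (not part of coopernetes): print every command name
# from the arguments that cannot be resolved on PATH.
missing_tools() {
  for tool in "$@"; do
    # `command -v` succeeds only when the shell can find the command.
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Binary names below are assumptions derived from the tool list above.
missing=$(missing_tools terraform aws wget kubectl helm helmfile velero)
if [ -n "$missing" ]; then
  echo "Not found on PATH:"
  echo "$missing"
else
  echo "All tools found."
fi
```

The helm plugins are not separate binaries, so they are not covered here; `helm plugin list` shows which plugins are installed.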

## Creating a cluster
Based on the example at https://github.com/terraform-aws-modules/terraform-aws-eks/blob/7de18cd9cd882f6ad105ca375b13729537df9e68/examples/managed_node_groups/main.tf
1. Run `terraform init && terraform apply` in the sops folder to set up the sops configuration used for secrets management.
   - Secrets are encrypted with helm-secrets, which is configured using a `.sops.yaml` file in the root folder. In this repo, that file is symlinked to `terraform/sops/generated/sops.yaml`, since the KMS key is generated by terraform.
1. Create a new folder by copying an existing cluster config, then edit its `terraform.tfvars` file with the details of the new cluster.
1. From inside the `terraform/ENV/eks` folder, run `terraform apply`.
   - Terraform generates a couple of config files needed by helmfile / helm. They are put in `terraform/eks/generated`, and the other programs that rely on them link to the files there.
   - The first of these files is `helmfile.yaml`, a set of non-secret values generated or configured in terraform that helmfile needs. This file is imported as the default environment in helmfile, giving charts and configuration access to terraform values. Anything non-secret you want to pass from terraform to helmfile should live here.
1. Configure kubectl with the generated kubeconfig, supplying your AWS profile after `--profile=` and the cluster name after `--name`: `aws eks --region us-east-1 --profile= update-kubeconfig --name `
1. Run `helmfile apply` in the root folder.
1. Once you deploy your first application with an Ingress, run `kubectl get ingress --all-namespaces` to list the address associated with the ingress. That is the load balancer for all inbound requests on the cluster. Create a DNS entry pointing to this load balancer for every service you want to expose.
1. Port forward into Kibana by running the log-aggregator command under "Working with the cluster" below, then open the Discover menu item, set the index pattern to `kubernetes_cluster*`, and choose `@timestamp` as the time field. Kibana is then ready.
1. Once the velero client is installed, run a couple of commands to configure and set up backups:
   - Run `velero client config set namespace=system-backups`. This tells velero which namespace we installed it in.
   - Run `velero backup create test-backup` to test the backup functionality.
   - Run `velero schedule create daily-cluster-backup --schedule="0 0 * * *"` to set up a backup schedule for the cluster.
1. Once prometheus-operator is installed, add the following dashboard to Grafana: https://grafana.com/grafana/dashboards/8670.
   - You can reach the Grafana dashboard by finding the grafana pod in the system-monitoring namespace and then running `kubectl port-forward -n system-monitoring 3000:3000`, with the pod name placed before the port mapping.
   - You can log in with the user `admin` and the password `prom-operator`. Since you need access to the cluster to port forward, these account credentials can be shared freely.
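
The command-line part of the steps above can be summarized as a single ordered checklist. The sketch below only prints the commands and runs none of them. `cluster_steps` is a hypothetical helper, and the values `staging`, `my-profile` and `my-cluster` are made-up placeholders. It also assumes that the sops folder lives at `terraform/sops` and that a freshly copied cluster folder needs `terraform init` before `terraform apply`.

```shell
#!/bin/sh
# Hypothetical helper: print the cluster-creation commands in order.
# Nothing is executed; the output is meant to be read or copied.
cluster_steps() {
  env_name=$1   # folder name under terraform/ (placeholder)
  region=$2     # AWS region
  profile=$3    # AWS CLI profile (placeholder)
  cluster=$4    # EKS cluster name (placeholder)
  cat <<EOF
(cd terraform/sops && terraform init && terraform apply)
(cd terraform/$env_name/eks && terraform init && terraform apply)
aws eks --region $region --profile=$profile update-kubeconfig --name $cluster
helmfile apply
velero client config set namespace=system-backups
velero backup create test-backup
velero schedule create daily-cluster-backup --schedule="0 0 * * *"
EOF
}

# Example invocation with made-up values.
cluster_steps staging us-east-1 my-profile my-cluster
```

The cron expression `0 0 * * *` in the last command fires once a day at 00:00.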

## Setting up CD for a new project:
All deployment-related files, including the chart, helmfile, and Dockerfile, should live in a folder called `.deploy` in the root of the repository.

To deploy, launch the `coopernetes-deploy` container in CircleCI and use `coopctl`, which performs the first two steps below. The remaining items are follow-ups to handle yourself.
1. Builds a docker image using the Dockerfile at `.deploy/Dockerfile` and the project root as the context.
1. Calls `helmfile apply .deploy/helmfile.yaml`.
1. Run `velero schedule create daily--backup --schedule="0 0 * * *" --include-namespaces ` to set up a backup schedule for the namespace, filling in the namespace name where the command leaves a gap.
1. Keep in mind that nodes have a maximum number of pods they can support, as listed at https://github.com/awslabs/amazon-eks-ami/blob/master/files/eni-max-pods.txt
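
The per-node pod limits in that list follow from the network interfaces of each instance type. As a rough sketch, they can be reproduced with the formula AWS documents for the default VPC CNI: ENIs × (IPv4 addresses per ENI − 1) + 2. The `max_pods` function name is hypothetical, and the ENI and address counts used below are the published figures for `t3.medium` and `m5.large`; check the EC2 documentation for the instance type you actually run.

```shell
#!/bin/sh
# Hypothetical helper: compute the pod limit of an EKS node from its
# ENI count and the number of IPv4 addresses each ENI supports.
max_pods() {
  enis=$1
  ips_per_eni=$2
  # One address per ENI is taken by the interface itself; the +2 covers
  # pods that use host networking.
  echo $(( $enis * ($ips_per_eni - 1) + 2 ))
}

max_pods 3 6    # t3.medium: prints 17
max_pods 3 10   # m5.large: prints 29
```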

If you are using a custom chart for the project, we recommend putting it at `.deploy/chart/`.
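
A quick way to catch a misplaced file before a deploy is to check the `.deploy` layout up front. This is a minimal sketch and not something `coopctl` is known to do: `check_deploy_layout` is a hypothetical helper that only looks for the two files named above and treats the chart folder as optional.

```shell
#!/bin/sh
# Hypothetical pre-deploy check: report expected files that are missing
# from a project's .deploy folder. Returns non-zero if any are missing.
check_deploy_layout() {
  root=${1:-.}
  rc=0
  for f in .deploy/Dockerfile .deploy/helmfile.yaml; do
    if [ ! -f "$root/$f" ]; then
      echo "missing: $f"
      rc=1
    fi
  done
  return $rc
}

# Offline demonstration in a throwaway directory.
demo=$(mktemp -d)
mkdir -p "$demo/.deploy"
touch "$demo/.deploy/Dockerfile"
check_deploy_layout "$demo" || true   # prints "missing: .deploy/helmfile.yaml"
rm -rf "$demo"
```

In a CI job, `check_deploy_layout . || exit 1` would stop the build early when a file is absent.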

## Relevant documents / blog posts for installation:
1. https://cert-manager.io/docs/tutorials/acme/ingress/
1. https://cert-manager.io/docs/installation/kubernetes/

## Working with the cluster:
- *log-aggregator (if installed)*: `kubectl port-forward deployment/efk-kibana 5601 -n system-logging`
- *grafana*: `kubectl port-forward -n system-monitoring prometheus-operator-grafana-RANDOM-ID 3000:3000`
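
Because the Grafana pod name ends in a random ID, it has to be looked up before port forwarding. One possible approach, sketched here under the assumption that the pod name contains `-grafana-` as in the command above, is to filter the output of `kubectl get pods -o name`. `first_grafana_pod` is a hypothetical helper, and the pod names in the demonstration are made up.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get pods -o name` output on stdin
# and print the first entry that looks like a Grafana pod.
first_grafana_pod() {
  grep -- '-grafana-' | head -n 1
}

# Real use (requires cluster access):
#   pod=$(kubectl get pods -n system-monitoring -o name | first_grafana_pod)
#   kubectl port-forward -n system-monitoring "$pod" 3000:3000

# Offline demonstration with made-up pod names.
printf '%s\n' \
  pod/prometheus-operator-operator-7c9d8-abcde \
  pod/prometheus-operator-grafana-5d9f7c-x2k4q | first_grafana_pod
```

`kubectl get pods -o name` prints entries in the `pod/NAME` form, which `kubectl port-forward` accepts directly.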

## Potential improvements

#### Use AWS spot instances / autoscalers to reduce costs:
Autoscaling is not currently set up in the cluster, but it can be enabled by installing the cluster-autoscaler as outlined in this doc: https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/autoscaling.md. Autoscaling will mean the cluster shrinks and grows based on our capacity needs. If we annotate our deployed services correctly with expected CPU and memory usage, this will allow the cluster to scale up and down to meet demand.
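
The cluster-autoscaler reacts to pods that cannot be scheduled because of their CPU and memory requests, so those requests need to be set on each workload. As a hedged illustration, the sketch below prints a `kubectl set resources` command for a deployment. `requests_cmd`, the deployment name `my-app`, the namespace `my-namespace` and the request values are all hypothetical. For workloads managed through helmfile, the same values would more likely belong in the chart's values so that the next `helmfile apply` does not overwrite them.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a command that sets CPU and memory
# requests on a deployment.
requests_cmd() {
  namespace=$1
  deployment=$2
  cpu=$3
  memory=$4
  echo "kubectl set resources deployment $deployment -n $namespace --requests=cpu=$cpu,memory=$memory"
}

# Example with made-up names and values.
requests_cmd my-namespace my-app 250m 256Mi
```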

Spot instances are likely not worth our time to investigate: they are often cheaper than on-demand instances, but they come with no guaranteed availability. We are probably better off with reserved instances, since our capacity is relatively consistent, but we figured it might be a worthwhile exploration if someone is interested.
https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/spot-instances.md

#### Migrate to single helm chart / terraform module
`helmfile` is great for managing infrastructure installed on a case-by-case basis, but in order to package up coopernetes so that it's easier to use, we will eventually want to create a master helm chart with all the basic installation and configuration options, similar to how https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack installs a bunch of different services using a mix of custom manifests and subcharts.

The benefit of this approach is that we replace the entire helmfile with a single configurable master chart that installs the appropriate backup services, metrics, ingress, etc. The current repo is somewhat brittle and not easy to share widely with many organizations. However, part of the reason for that brittleness is that terraform and helmfile are tightly integrated, allowing us to configure both AWS and kubernetes with the same repo. To maintain the same level of interoperability, we would likely want to create a coopernetes terraform module that installs all the AWS-specific resources we need for the master chart. Then, any new team that wanted to deploy coopernetes could do so with a terraform module and this master chart.