https://github.com/snigdhasambitak/cks
Practice questions for Certified Kubernetes Security Specialist (CKS) exam
apparmor audit-log cks falco kube-bench kubernetes opa runsc trivy
- Host: GitHub
- URL: https://github.com/snigdhasambitak/cks
- Owner: snigdhasambitak
- License: mit
- Created: 2023-01-06T14:50:05.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2024-05-08T11:56:45.000Z (about 2 years ago)
- Last Synced: 2024-10-15T05:01:52.719Z (over 1 year ago)
- Topics: apparmor, audit-log, cks, falco, kube-bench, kubernetes, opa, runsc, trivy
- Homepage:
- Size: 1.36 MB
- Stars: 50
- Watchers: 4
- Forks: 36
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
- [CKS Simulator Kubernetes 1.25](#cks-simulator-kubernetes-125)
* [Pre Setup](#pre-setup)
* [Question 1 | Contexts](#question-1--contexts)
* [Question 2 | Runtime Security with Falco](#question-2--runtime-security-with-falco)
* [Question 3 | Apiserver Security](#question-3--apiserver-security)
* [Question 4 | Pod Security Standard](#question-4--pod-security-standard)
* [Question 5 | CIS Benchmark](#question-5--cis-benchmark)
* [Question 6 | Verify Platform Binaries](#question-6--verify-platform-binaries)
* [Question 7 | Open Policy Agent](#question-7--open-policy-agent)
* [Question 8 | Secure Kubernetes Dashboard](#question-8--secure-kubernetes-dashboard)
* [Question 9 | AppArmor Profile](#question-9--apparmor-profile)
* [Question 10 | Container Runtime Sandbox gVisor](#question-10--container-runtime-sandbox-gvisor)
* [Question 11 | Secrets in ETCD](#question-11--secrets-in-etcd)
* [Question 12 | Hack Secrets](#question-12---hack-secrets)
* [Question 13 | Restrict access to Metadata Server](#question-13--restrict-access-to-metadata-server)
* [Question 14 | Syscall Activity](#question-14--syscall-activity)
* [Question 15 | Configure TLS on Ingress](#question-15--configure-tls-on-ingress)
* [Question 16 | Docker Image Attack Surface](#question-16--docker-image-attack-surface)
* [Question 17 | Audit Log Policy](#question-17--audit-log-policy)
* [Question 18 | Investigate Break-in via Audit Log](#question-18--investigate-break-in-via-audit-log)
* [Question 19 | Immutable Root FileSystem](#question-19--immutable-root-filesystem)
* [Question 20 | Update Kubernetes](#question-20--update-kubernetes)
* [Question 21 | Image Vulnerability Scanning](#question-21--image-vulnerability-scanning)
* [Question 22 | Manual Static Security Analysis](#question-22--manual-static-security-analysis)
- [CKS Simulator Preview Kubernetes 1.25](#cks-simulator-preview-kubernetes-125)
* [Preview Question 1](#preview-question-1)
- [Answer:](#answer--20)
* [Part 1 - check existing RBAC rules](#part-1---check-existing-rbac-rules)
* [Part 2 - create additional RBAC rules](#part-2---create-additional-rbac-rules)
* [Preview Question 2](#preview-question-2)
- [Answer:](#answer--21)
* [Preview Question 3](#preview-question-3)
- [Answer:](#answer--22)
- [CKS Tips Kubernetes 1.25](#cks-tips-kubernetes-125)
* [Knowledge](#knowledge)
+ [Pre-Knowledge](#pre-knowledge)
+ [Knowledge](#knowledge-1)
+ [Approach](#approach)
+ [Content](#content)
- [CKS Exam Info](#cks-exam-info)
* [Read the Curriculum](#read-the-curriculum)
* [Read the Handbook](#read-the-handbook)
* [Read the important tips](#read-the-important-tips)
* [Read the FAQ](#read-the-faq)
- [Kubernetes documentation](#kubernetes-documentation)
- [CKS clusters](#cks-clusters)
- [The Test Environment / Browser Terminal](#the-test-environment---browser-terminal)
* [Laggin](#laggin)
* [Kubectl autocompletion and commands](#kubectl-autocompletion-and-commands)
* [Copy & Paste](#copy---paste)
* [Percentages and Score](#percentages-and-score)
* [Notepad & Skipping Questions](#notepad---skipping-questions)
* [Contexts](#contexts)
- [PSI Bridge](#psi-bridge)
- [Browser Terminal Setup](#browser-terminal-setup)
* [Minimal Setup](#minimal-setup)
+ [Alias](#alias)
+ [Vim](#vim)
* [Optional Setup](#optional-setup)
+ [Fast dry-run output](#fast-dry-run-output)
+ [Fast pod delete](#fast-pod-delete)
+ [Persist bash settings](#persist-bash-settings)
+ [Alias Namespace](#alias-namespace)
* [Be fast](#be-fast)
* [Vim](#vim-1)
+ [toggle vim line numbers](#toggle-vim-line-numbers)
+ [copy&paste](#copy-paste)
+ [Indent multiple lines](#indent-multiple-lines)
* [Split terminal screen](#split-terminal-screen)
# CKS Simulator Kubernetes 1.25
https://killer.sh
## Pre Setup
Once you've gained access to your terminal it might be wise to spend ~1 minute to set up your environment. You could set these:
```sh
alias k=kubectl # will already be pre-configured
export do="--dry-run=client -o yaml" # k create deploy nginx --image=nginx $do
export now="--force --grace-period 0" # k delete pod x $now
```
Vim
The following settings will already be configured in your real exam environment in ~/.vimrc. But it can never hurt to be able to type these down:
```sh
set tabstop=2
set expandtab
set shiftwidth=2
```
More setup suggestions are in the tips section.
## Question 1 | Contexts
#### Task weight: 1%
You have access to multiple clusters from your main terminal through kubectl contexts. Write all context names into /opt/course/1/contexts, one per line.
From the kubeconfig extract the certificate of user restricted@infra-prod and write it decoded to /opt/course/1/cert.
#### Answer:
Maybe the fastest way is just to run:
```sh
k config get-contexts # copy by hand
k config get-contexts -o name > /opt/course/1/contexts
```
Or using jsonpath:
```sh
k config view -o jsonpath="{.contexts[*].name}"
k config view -o jsonpath="{.contexts[*].name}" | tr " " "\n" # new lines
k config view -o jsonpath="{.contexts[*].name}" | tr " " "\n" > /opt/course/1/contexts
```
The content could then look like:
```sh
#/opt/course/1/contexts
gianna@infra-prod
infra-prod
restricted@infra-prod
workload-prod
workload-stage
```
For the certificate we could just run
```
k config view --raw
```
And copy it manually. Or we do:
```sh
k config view --raw -ojsonpath="{.users[2].user.client-certificate-data}" | base64 -d > /opt/course/1/cert
```
Or even:
```sh
k config view --raw -ojsonpath="{.users[?(.name == 'restricted@infra-prod')].user.client-certificate-data}" | base64 -d > /opt/course/1/cert
# /opt/course/1/cert
-----BEGIN CERTIFICATE-----
MIIDHzCCAgegAwIBAgIQN5Qe/Rj/PhaqckEI23LPnjANBgkqhkiG9w0BAQsFADAV
MRMwEQYDVQQDEwprdWJlcm5ldGVzMB4XDTIwMDkyNjIwNTUwNFoXDTIxMDkyNjIw
NTUwNFowKjETMBEGA1UEChMKcmVzdHJpY3RlZDETMBEGA1UEAxMKcmVzdHJpY3Rl
ZDCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAL/Jaf/QQdijyJTWIDij
qa5p4oAh+xDBX3jR9R0G5DkmPU/FgXjxej3rTwHJbuxg7qjTuqQbf9Fb2AHcVtwH
gUjC12ODUDE+nVtap+hCe8OLHZwH7BGFWWscgInZOZW2IATK/YdqyQL5OKpQpFkx
iAknVZmPa2DTZ8FoyRESboFSTZj6y+JVA7ot0pM09jnxswstal9GZLeqioqfFGY6
YBO/Dg4DDsbKhqfUwJVT6Ur3ELsktZIMTRS5By4Xz18798eBiFAHvgJGq1TTwuPM
EhBfwYwgYbalL8DSHeFrelLBKgciwUKjr1lolnnuc1vhkX1peV1J3xrf6o2KkyMc
lY0CAwEAAaNWMFQwDgYDVR0PAQH/BAQDAgWgMBMGA1UdJQQMMAoGCCsGAQUFBwMC
MAwGA1UdEwEB/wQCMAAwHwYDVR0jBBgwFoAUPrspZIWR7YMN8vT5DF3s/LvpxPQw
DQYJKoZIhvcNAQELBQADggEBAIDq0Zt77gXI1s+uW46zBw4mIWgAlBLl2QqCuwmV
kd86eH5bD0FCtWlb6vGdcKPdFccHh8Z6z2LjjLu6UoiGUdIJaALhbYNJiXXi/7cf
M7sqNOxpxQ5X5hyvOBYD1W7d/EzPHV/lcbXPUDYFHNqBYs842LWSTlPQioDpupXp
FFUQPxsenNXDa4TbmaRvnK2jka0yXcqdiXuIteZZovp/IgNkfmx2Ld4/Q+Xlnscf
CFtWbjRa/0W/3EW/ghQ7xtC7bgcOHJesoiTZPCZ+dfKuUfH6d1qxgj6Jwt0HtyEf
QTQSc66BdMLnw5DMObs4lXDo2YE6LvMrySdXm/S7img5YzU=
-----END CERTIFICATE-----
```
## Question 2 | Runtime Security with Falco
#### Task weight: 4%
Use context: `kubectl config use-context workload-prod`
Falco is installed with default configuration on node `cluster1-node1`. Connect using `ssh cluster1-node1`. Use it to:
Find a Pod running image nginx which creates unwanted package management processes inside its container.
Find a Pod running image httpd which modifies `/etc/passwd`.
Save the Falco logs for case 1 under `/opt/course/2/falco.log` in format:
`time-with-nanoseconds,container-id,container-name,user-name`
No other information should be in any line. Collect the logs for at least 30 seconds.
Afterwards remove the threats (both 1 and 2) by scaling the replicas of the Deployments that control the offending Pods down to 0.
#### Answer:
Falco, the open-source cloud-native runtime security project, is the de facto Kubernetes threat detection engine.
NOTE: Other tools you might have to be familiar with are sysdig or tracee.
#### Use Falco as service
First we can investigate Falco config a little:
```sh
➜ ssh cluster1-node1
➜ root@cluster1-node1:~# service falco status
● falco.service - LSB: Falco syscall activity monitoring agent
Loaded: loaded (/etc/init.d/falco; generated)
Active: active (running) since Sat 2020-10-10 06:36:15 UTC; 2h 1min ago
...
➜ root@cluster1-node1:~# cd /etc/falco
➜ root@cluster1-node1:/etc/falco# ls
falco.yaml falco_rules.local.yaml falco_rules.yaml k8s_audit_rules.yaml rules.available rules.d
```
This is the default configuration. If we look into falco.yaml we can see:
```sh
# /etc/falco/falco.yaml
...
# Where security notifications should go.
# Multiple outputs can be enabled.
syslog_output:
enabled: true
...
```
This means that Falco is writing into syslog, hence we can do:
```sh
➜ root@cluster1-node1:~# cat /var/log/syslog | grep falco
Sep 15 08:44:04 ubuntu2004 falco: Falco version 0.29.1 (driver version 17f5df52a7d9ed6bb12d3b1768460def8439936d)
Sep 15 08:44:04 ubuntu2004 falco: Falco initialized with configuration file /etc/falco/falco.yaml
Sep 15 08:44:04 ubuntu2004 falco: Loading rules from file /etc/falco/falco_rules.yaml:
...
```
Yep, quite some action going on in there. Let's investigate the first offending Pod:
```sh
➜ root@cluster1-node1:~# cat /var/log/syslog | grep falco | grep nginx | grep process
Sep 16 06:23:47 ubuntu2004 falco: 06:23:47.376241377: Error Package management process launched in container (user=root user_loginuid=-1 command=apk container_id=7a5ea6a080d1 container_name=nginx image=docker.io/library/nginx:1.19.2-alpine)
...
➜ root@cluster1-node1:~# crictl ps -id 7a5ea6a080d1
CONTAINER ID IMAGE NAME ... POD ID
7a5ea6a080d1b 6f715d38cfe0e nginx ... 7a864406b9794
➜ root@cluster1-node1:~# crictl pods -id 7a864406b9794
POD ID ... NAME NAMESPACE ...
7a864406b9794 ... webapi-6cfddcd6f4-ftxg4 team-blue ...
```
First Pod is webapi-6cfddcd6f4-ftxg4 in Namespace team-blue.
```sh
➜ root@cluster1-node1:~# cat /var/log/syslog | grep falco | grep httpd | grep passwd
Sep 16 06:23:48 ubuntu2004 falco: 06:23:48.830962378: Error File below /etc opened for writing (user=root user_loginuid=-1 command=sed -i $d /etc/passwd parent=sh pcmdline=sh -c echo hacker >> /etc/passwd; sed -i '$d' /etc/passwd; true file=/etc/passwdngFmAl program=sed gparent= ggparent= gggparent= container_id=b1339d5cc2de image=docker.io/library/httpd)
➜ root@cluster1-node1:~# crictl ps -id b1339d5cc2de
CONTAINER ID IMAGE NAME ... POD ID
b1339d5cc2dee f6b40f9f8ad71 httpd ... 595af943c3245
➜ root@cluster1-node1:~# crictl pods -id 595af943c3245
POD ID ... NAME NAMESPACE ...
595af943c3245 ... rating-service-68cbdf7b7-v2p6g team-purple ...
```
Second Pod is rating-service-68cbdf7b7-v2p6g in Namespace team-purple.
#### Eliminate offending Pods
The logs from before should allow us to find and "eliminate" the offending Pods:
```sh
➜ k get pod -A | grep webapi
team-blue webapi-6cfddcd6f4-ftxg4 1/1 Running
➜ k -n team-blue scale deploy webapi --replicas 0
deployment.apps/webapi scaled
➜ k get pod -A | grep rating-service
team-purple rating-service-68cbdf7b7-v2p6g 1/1 Running
➜ k -n team-purple scale deploy rating-service --replicas 0
deployment.apps/rating-service scaled
```
#### Use Falco from command line
We can also use Falco directly from command line, but only if the service is disabled:
```sh
➜ root@cluster1-node1:~# service falco stop
➜ root@cluster1-node1:~# falco
Thu Sep 16 06:33:11 2021: Falco version 0.29.1 (driver version 17f5df52a7d9ed6bb12d3b1768460def8439936d)
Thu Sep 16 06:33:11 2021: Falco initialized with configuration file /etc/falco/falco.yaml
Thu Sep 16 06:33:11 2021: Loading rules from file /etc/falco/falco_rules.yaml:
Thu Sep 16 06:33:11 2021: Loading rules from file /etc/falco/falco_rules.local.yaml:
Thu Sep 16 06:33:11 2021: Loading rules from file /etc/falco/k8s_audit_rules.yaml:
Thu Sep 16 06:33:12 2021: Starting internal webserver, listening on port 8765
06:33:17.382603204: Error Package management process launched in container (user=root user_loginuid=-1 command=apk container_id=7a5ea6a080d1 container_name=nginx image=docker.io/library/nginx:1.19.2-alpine)
...
```
We can see that rule files are loaded and logs printed afterwards.
#### Create logs in correct format
The task requires us to store logs for "unwanted package management processes" in format time,container-id,container-name,user-name. The output from falco shows entries for "Error Package management process launched" in a default format. Let's find the proper file that contains the rule and change it:
```sh
➜ root@cluster1-node1:~# cd /etc/falco/
➜ root@cluster1-node1:/etc/falco# grep -r "Package management process launched" .
./falco_rules.yaml: Package management process launched in container (user=%user.name user_loginuid=%user.loginuid
➜ root@cluster1-node1:/etc/falco# cp falco_rules.yaml falco_rules.yaml_ori
➜ root@cluster1-node1:/etc/falco# vim falco_rules.yaml
```
Find the rule which looks like this:
```yaml
# Container is supposed to be immutable. Package management should be done in building the image.
- rule: Launch Package Management Process in Container
desc: Package management process ran inside container
condition: >
spawned_process
and container
and user.name != "_apt"
and package_mgmt_procs
and not package_mgmt_ancestor_procs
and not user_known_package_manager_in_container
output: >
Package management process launched in container (user=%user.name user_loginuid=%user.loginuid
command=%proc.cmdline container_id=%container.id container_name=%container.name image=%container.image.repository:%container.image.tag)
priority: ERROR
tags: [process, mitre_persistence]
```
Should be changed into the required format:
```yaml
# Container is supposed to be immutable. Package management should be done in building the image.
- rule: Launch Package Management Process in Container
desc: Package management process ran inside container
condition: >
spawned_process
and container
and user.name != "_apt"
and package_mgmt_procs
and not package_mgmt_ancestor_procs
and not user_known_package_manager_in_container
output: >
Package management process launched in container %evt.time,%container.id,%container.name,%user.name
priority: ERROR
tags: [process, mitre_persistence]
```
For all available fields we can check https://falco.org/docs/rules/supported-fields, which should be allowed to open during the exam.
Next we check the logs in our adjusted format:
```sh
➜ root@cluster1-node1:/etc/falco# falco | grep "Package management"
06:38:28.077150666: Error Package management process launched in container 06:38:28.077150666,090aad374a0a,nginx,root
06:38:33.058263010: Error Package management process launched in container 06:38:33.058263010,090aad374a0a,nginx,root
06:38:38.068693625: Error Package management process launched in container 06:38:38.068693625,090aad374a0a,nginx,root
06:38:43.066159360: Error Package management process launched in container 06:38:43.066159360,090aad374a0a,nginx,root
06:38:48.059792139: Error Package management process launched in container 06:38:48.059792139,090aad374a0a,nginx,root
06:38:53.063328933: Error Package management process launched in container 06:38:53.063328933,090aad374a0a,nginx,root
```
This looks much better. Copy&paste the output into file /opt/course/2/falco.log on your main terminal. The content should be cleaned like this:
```sh
# /opt/course/2/falco.log
06:38:28.077150666,090aad374a0a,nginx,root
06:38:33.058263010,090aad374a0a,nginx,root
06:38:38.068693625,090aad374a0a,nginx,root
06:38:43.066159360,090aad374a0a,nginx,root
06:38:48.059792139,090aad374a0a,nginx,root
06:38:53.063328933,090aad374a0a,nginx,root
06:38:58.070912841,090aad374a0a,nginx,root
06:39:03.069592140,090aad374a0a,nginx,root
06:39:08.064805371,090aad374a0a,nginx,root
06:39:13.078109098,090aad374a0a,nginx,root
06:39:18.065077287,090aad374a0a,nginx,root
06:39:23.061012151,090aad374a0a,nginx,root
```
For a few entries it should be fast to just clean it up manually. If there are larger amounts of entries we could do:
```sh
cat /opt/course/2/falco.log.dirty | cut -d" " -f 9 > /opt/course/2/falco.log
```
The tool cut will split input into fields using space as the delimiter (-d" "). We then only select the 9th field using -f 9.
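To see the field selection in isolation, here is a minimal sketch that runs on one sample line copied from the output above. The helper name `clean_falco_lines` is made up for illustration, and the commented pipeline at the end is an assumption about how one might collect for a bit more than 30 seconds with the coreutils `timeout` command.

```sh
# Hypothetical helper: keep only the 9th space-separated field of each line read from stdin
clean_falco_lines() {
  cut -d" " -f 9
}

# Sample line in the adjusted output format, copied from the falco output above
sample='06:38:28.077150666: Error Package management process launched in container 06:38:28.077150666,090aad374a0a,nginx,root'
printf '%s\n' "$sample" | clean_falco_lines

# Possible use on the node (assumes the falco service is stopped; the output path is a placeholder):
# timeout 35s falco | grep "Package management" | clean_falco_lines > /tmp/falco.log
```

The field number depends on the rule's output text: if the wording before the comma-separated values changes, the 9 has to change with it.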
#### Local falco rules
There is also a file /etc/falco/falco_rules.local.yaml in which we can override existing default rules. This is a much cleaner solution for production. Choose the faster way for you in the exam if nothing is specified in the task.
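A rough sketch of that cleaner variant, assuming (as in the Falco version shown here) that a rule redefined under the same name in the local rules file replaces the default one. The variable `RULES_FILE` is only for illustration and defaults to a temp file so the snippet can be tried safely. On the node it would point to /etc/falco/falco_rules.local.yaml.

```sh
# Assumption: RULES_FILE is a placeholder; on the node use /etc/falco/falco_rules.local.yaml
RULES_FILE="${RULES_FILE:-$(mktemp)}"

# Append the rule (same name as the default one) with the output format required by the task
cat >> "$RULES_FILE" <<'EOF'
- rule: Launch Package Management Process in Container
  desc: Package management process ran inside container
  condition: >
    spawned_process
    and container
    and user.name != "_apt"
    and package_mgmt_procs
    and not package_mgmt_ancestor_procs
    and not user_known_package_manager_in_container
  output: >
    Package management process launched in container %evt.time,%container.id,%container.name,%user.name
  priority: ERROR
  tags: [process, mitre_persistence]
EOF
```

This leaves falco_rules.yaml untouched, so no backup copy of the default rules is needed.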
## Question 3 | Apiserver Security
#### Task weight: 3%
Use context: `kubectl config use-context workload-prod`
You received a list from the DevSecOps team which performed a security investigation of the k8s cluster1 (workload-prod). The list states the following about the apiserver setup:
* Accessible through a NodePort Service
Change the apiserver setup so that:
* Only accessible through a ClusterIP Service
#### Answer:
In order to modify the parameters for the apiserver, we first ssh into the master node and check which parameters the apiserver process is running with:
```sh
➜ ssh cluster1-controlplane1
➜ root@cluster1-controlplane1:~# ps aux | grep kube-apiserver
root 13534 8.6 18.1 1099208 370684 ? Ssl 19:55 8:40 kube-apiserver --advertise-address=192.168.100.11 --allow-privileged=true --anonymous-auth=true --authorization-mode=Node,RBAC --client-ca-file=/etc/kubernetes/pki/ca.crt --enable-admission-plugins=NodeRestriction --enable-bootstrap-token-auth=true --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key --etcd-servers=https://127.0.0.1:2379 --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname --kubernetes-service-node-port=31000 --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt --proxy-client-key-
...
```
We may notice the following argument:
```sh
--kubernetes-service-node-port=31000
```
We can also check the Service and see it's of type NodePort:
```sh
➜ root@cluster1-controlplane1:~# kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes NodePort 10.96.0.1 443:31000/TCP 5d2h
```
The apiserver runs as a static Pod, so we can edit the manifest. But before we do this we also create a copy in case we mess things up:
```sh
➜ root@cluster1-controlplane1:~# cp /etc/kubernetes/manifests/kube-apiserver.yaml ~/3_kube-apiserver.yaml
➜ root@cluster1-controlplane1:~# vim /etc/kubernetes/manifests/kube-apiserver.yaml
```
We should remove the insecure settings:
```yaml
# /etc/kubernetes/manifests/kube-apiserver.yaml
apiVersion: v1
kind: Pod
metadata:
annotations:
kubeadm.kubernetes.io/kube-apiserver.advertise-address.endpoint: 192.168.100.11:6443
creationTimestamp: null
labels:
component: kube-apiserver
tier: control-plane
name: kube-apiserver
namespace: kube-system
spec:
containers:
- command:
- kube-apiserver
- --advertise-address=192.168.100.11
- --allow-privileged=true
- --authorization-mode=Node,RBAC
- --client-ca-file=/etc/kubernetes/pki/ca.crt
- --enable-admission-plugins=NodeRestriction
- --enable-bootstrap-token-auth=true
- --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
- --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
- --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
- --etcd-servers=https://127.0.0.1:2379
- --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
- --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
- --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
# - --kubernetes-service-node-port=31000 # delete or set to 0
- --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
- --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
...
```
Once the changes are made, give the apiserver some time to start up again. Check the apiserver's Pod status and the process parameters:
```sh
➜ root@cluster1-controlplane1:~# kubectl -n kube-system get pod | grep apiserver
kube-apiserver-cluster1-controlplane1 1/1 Running 0 38s
➜ root@cluster1-controlplane1:~# ps aux | grep kube-apiserver | grep node-port
```
The apiserver got restarted without the insecure settings. However, the Service kubernetes will still be of type NodePort:
```sh
➜ root@cluster1-controlplane1:~# kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes NodePort 10.96.0.1 443:31000/TCP 5d3h
```
We need to delete the Service for the changes to take effect:
```sh
➜ root@cluster1-controlplane1:~# kubectl delete svc kubernetes
service "kubernetes" deleted
```
After a few seconds:
```sh
➜ root@cluster1-controlplane1:~# kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 443/TCP 6s
```
This should satisfy the DevSecOps team.
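If you want a quick check of the manifest itself, something like the following sketch could help. The function name `nodeport_flag_set` is hypothetical. It only looks for an active (not commented out) `--kubernetes-service-node-port` argument with a non-zero value.

```sh
# Hypothetical helper: succeed if the manifest still sets a non-zero NodePort for the kubernetes Service
nodeport_flag_set() {
  # $1 = path to the kube-apiserver manifest; lines starting with "#" do not match
  grep -Eq '^[[:space:]]*-[[:space:]]*--kubernetes-service-node-port=[1-9]' "$1"
}

# Possible use on the controlplane node:
# nodeport_flag_set /etc/kubernetes/manifests/kube-apiserver.yaml && echo "flag is still set"
```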
## Question 4 | Pod Security Standard
#### Task weight: 8%
Use context: `kubectl config use-context workload-prod`
There is Deployment `container-host-hacker` in Namespace `team-red` which mounts `/run/containerd` as a hostPath volume on the Node where it's running. This means that the Pod can access various data about other containers running on the same Node.
To prevent this configure Namespace `team-red` to enforce the `baseline` Pod Security Standard. Once completed, delete the Pod of the Deployment mentioned above.
Check the ReplicaSet events and write the event/log lines containing the reason why the Pod isn't recreated into `/opt/course/4/logs`.
#### Answer:
Making Namespaces use Pod Security Standards works via labels. We can simply edit it:
```sh
k edit ns team-red
```
Now we configure the requested label:
```yaml
# kubectl edit namespace team-red
apiVersion: v1
kind: Namespace
metadata:
labels:
kubernetes.io/metadata.name: team-red
pod-security.kubernetes.io/enforce: baseline # add
name: team-red
...
```
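The same label can also be set without opening an editor. A small sketch using `kubectl label`, wrapped in a made-up helper `label_ns_baseline`, assuming kubectl points at the right context:

```sh
# Hypothetical helper: set the enforce label for the baseline Pod Security Standard on a Namespace
label_ns_baseline() {
  # $1 = Namespace name; --overwrite replaces an existing value of the label
  kubectl label namespace "$1" pod-security.kubernetes.io/enforce=baseline --overwrite
}

# Possible use:
# label_ns_baseline team-red
```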
This should already be enough for the default Pod Security Admission Controller to pick up on that change. Let's test it and delete the Pod to see if it'll be recreated or not. It should fail!
```sh
➜ k -n team-red get pod
NAME READY STATUS RESTARTS AGE
container-host-hacker-dbf989777-wm8fc 1/1 Running 0 115s
➜ k -n team-red delete pod container-host-hacker-dbf989777-wm8fc
pod "container-host-hacker-dbf989777-wm8fc" deleted
➜ k -n team-red get pod
No resources found in team-red namespace.
```
Usually the ReplicaSet of a Deployment would recreate the Pod if deleted, here we see this doesn't happen. Let's check why:
```sh
➜ k -n team-red get rs
NAME DESIRED CURRENT READY AGE
container-host-hacker-dbf989777 1 0 0 5m25s
➜ k -n team-red describe rs container-host-hacker-dbf989777
Name: container-host-hacker-dbf989777
Namespace: team-red
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
...
Warning FailedCreate 2m41s replicaset-controller Error creating: pods "container-host-hacker-dbf989777-bjwgv" is forbidden: violates PodSecurity "baseline:latest": hostPath volumes (volume "containerdata")
Warning FailedCreate 2m2s (x9 over 2m40s) replicaset-controller (combined from similar events): Error creating: pods "container-host-hacker-dbf989777-kjfpn" is forbidden: violates PodSecurity "baseline:latest": hostPath volumes (volume "containerdata")
```
There we go! Finally we write the reason into the requested file so that Mr Scoring will be happy too!
```sh
# /opt/course/4/logs
Warning FailedCreate 2m2s (x9 over 2m40s) replicaset-controller (combined from similar events): Error creating: pods "container-host-hacker-dbf989777-kjfpn" is forbidden: violates PodSecurity "baseline:latest": hostPath volumes (volume "containerdata")
```
Pod Security Standards can give a great base level of security! But when one wants to adjust levels like baseline or restricted in more detail, this isn't possible, and 3rd party solutions like OPA could be looked at.
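Instead of copying the event lines by hand, they could also be filtered out of the describe output. A minimal sketch, where `failed_create_events` is a made-up helper name and the ReplicaSet name is the one from this example run:

```sh
# Hypothetical helper: print only the FailedCreate event lines of a ReplicaSet
failed_create_events() {
  # $1 = Namespace, $2 = ReplicaSet name
  kubectl -n "$1" describe rs "$2" | grep FailedCreate
}

# Possible use (your ReplicaSet name will differ):
# failed_create_events team-red container-host-hacker-dbf989777 > /opt/course/4/logs
```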
## Question 5 | CIS Benchmark
#### Task weight: 3%
Use context: `kubectl config use-context infra-prod`
You're asked to evaluate specific settings of `cluster2` against the CIS Benchmark recommendations. Use the tool kube-bench which is already installed on the nodes.
Connect using `ssh cluster2-controlplane1` and `ssh cluster2-node1`.
On the master node ensure (correct if necessary) that the CIS recommendations are set for:
* The --profiling argument of the kube-controller-manager
* The ownership of directory /var/lib/etcd
On the worker node ensure (correct if necessary) that the CIS recommendations are set for:
* The permissions of the kubelet configuration /var/lib/kubelet/config.yaml
* The --client-ca-file argument of the kubelet
#### Answer:
##### Number 1
First we ssh into the master node and run kube-bench against the master components:
```sh
➜ ssh cluster2-controlplane1
➜ root@cluster2-controlplane1:~# kube-bench run --targets=master
...
== Summary ==
41 checks PASS
13 checks FAIL
11 checks WARN
0 checks INFO
```
We see some passes, fails and warnings. Let's check the required task (1) of the controller manager:
```sh
➜ root@cluster2-controlplane1:~# kube-bench run --targets=master | grep kube-controller -A 3
1.3.1 Edit the Controller Manager pod specification file /etc/kubernetes/manifests/kube-controller-manager.yaml
on the master node and set the --terminated-pod-gc-threshold to an appropriate threshold,
for example:
--terminated-pod-gc-threshold=10
--
1.3.2 Edit the Controller Manager pod specification file /etc/kubernetes/manifests/kube-controller-manager.yaml
on the master node and set the below parameter.
--profiling=false
1.3.6 Edit the Controller Manager pod specification file /etc/kubernetes/manifests/kube-controller-manager.yaml
on the master node and set the --feature-gates parameter to include RotateKubeletServerCertificate=true.
--feature-gates=RotateKubeletServerCertificate=true
```
There we see 1.3.2 which suggests to set --profiling=false, so we obey:
```sh
➜ root@cluster2-controlplane1:~# vim /etc/kubernetes/manifests/kube-controller-manager.yaml
```
Edit the corresponding line:
```yaml
# /etc/kubernetes/manifests/kube-controller-manager.yaml
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: null
labels:
component: kube-controller-manager
tier: control-plane
name: kube-controller-manager
namespace: kube-system
spec:
containers:
- command:
- kube-controller-manager
- --allocate-node-cidrs=true
- --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
- --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
- --bind-address=127.0.0.1
- --client-ca-file=/etc/kubernetes/pki/ca.crt
- --cluster-cidr=10.244.0.0/16
- --cluster-name=kubernetes
- --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
- --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
- --controllers=*,bootstrapsigner,tokencleaner
- --kubeconfig=/etc/kubernetes/controller-manager.conf
- --leader-elect=true
- --node-cidr-mask-size=24
- --port=0
- --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
- --root-ca-file=/etc/kubernetes/pki/ca.crt
- --service-account-private-key-file=/etc/kubernetes/pki/sa.key
- --service-cluster-ip-range=10.96.0.0/12
- --use-service-account-credentials=true
- --profiling=false # add
...
```
We wait for the Pod to restart, then run kube-bench again to check if the problem was solved:
```sh
➜ root@cluster2-controlplane1:~# kube-bench run --targets=master | grep kube-controller -A 3
1.3.1 Edit the Controller Manager pod specification file /etc/kubernetes/manifests/kube-controller-manager.yaml
on the master node and set the --terminated-pod-gc-threshold to an appropriate threshold,
for example:
--terminated-pod-gc-threshold=10
--
1.3.6 Edit the Controller Manager pod specification file /etc/kubernetes/manifests/kube-controller-manager.yaml
on the master node and set the --feature-gates parameter to include RotateKubeletServerCertificate=true.
--feature-gates=RotateKubeletServerCertificate=true
```
Problem solved and 1.3.2 is passing:
```sh
root@cluster2-controlplane1:~# kube-bench run --targets=master | grep 1.3.2
[PASS] 1.3.2 Ensure that the --profiling argument is set to false (Scored)
```
##### Number 2
Next task (2) is to check the ownership of directory /var/lib/etcd, so we first have a look:
```sh
➜ root@cluster2-controlplane1:~# ls -lh /var/lib | grep etcd
drwx------ 3 root root 4.0K Sep 11 20:08 etcd
```
Looks like user root and group root. Also possible to check using:
```sh
➜ root@cluster2-controlplane1:~# stat -c %U:%G /var/lib/etcd
root:root
```
But what has kube-bench to say about this?
```sh
➜ root@cluster2-controlplane1:~# kube-bench run --targets=master | grep "/var/lib/etcd" -B5
1.1.12 On the etcd server node, get the etcd data directory, passed as an argument --data-dir,
from the below command:
ps -ef | grep etcd
Run the below command (based on the etcd data directory found above).
For example, chown etcd:etcd /var/lib/etcd
```
To comply we run the following:
```sh
➜ root@cluster2-controlplane1:~# chown etcd:etcd /var/lib/etcd
➜ root@cluster2-controlplane1:~# ls -lh /var/lib | grep etcd
drwx------ 3 etcd etcd 4.0K Sep 11 20:08 etcd
```
This looks better. We run kube-bench again, and make sure test 1.1.12. is passing.
```sh
➜ root@cluster2-controlplane1:~# kube-bench run --targets=master | grep 1.1.12
[PASS] 1.1.12 Ensure that the etcd data directory ownership is set to etcd:etcd (Scored)
```
Done.
##### Number 3
To continue with number (3), we'll head to the worker node and ensure that the kubelet configuration file has the minimum necessary permissions as recommended:
```sh
➜ ssh cluster2-node1
➜ root@cluster2-node1:~# kube-bench run --targets=node
...
== Summary ==
13 checks PASS
10 checks FAIL
2 checks WARN
0 checks INFO
```
Also here some passes, fails and warnings. We check the permission level of the kubelet config file:
```sh
➜ root@cluster2-node1:~# stat -c %a /var/lib/kubelet/config.yaml
777
```
777 is a highly permissive access level and not recommended by the kube-bench guidelines:
```sh
➜ root@cluster2-node1:~# kube-bench run --targets=node | grep /var/lib/kubelet/config.yaml -B2
4.1.9 Run the following command (using the config file location identified in the Audit step)
chmod 644 /var/lib/kubelet/config.yaml
```
We obey and set the recommended permissions:
```sh
➜ root@cluster2-node1:~# chmod 644 /var/lib/kubelet/config.yaml
➜ root@cluster2-node1:~# stat -c %a /var/lib/kubelet/config.yaml
644
```
And check if test 4.1.9 is passing:
```sh
➜ root@cluster2-node1:~# kube-bench run --targets=node | grep 4.1.9
[PASS] 4.1.9 Ensure that the kubelet configuration file has permissions set to 644 or more restrictive (Scored)
```
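"644 or more restrictive" means no permission bit outside of rw-r--r-- may be set, which is not the same as the number being smaller. A small sketch of that bit check, with a made-up helper name `mode_within_644` and GNU `stat` assumed (as used above):

```sh
# Hypothetical helper: succeed if the file has no permission bits beyond 644 (rw-r--r--)
mode_within_644() {
  # $1 = file to check; stat -c %a prints the octal mode, e.g. 777 or 600
  mode=$(stat -c %a "$1") || return 2
  # Prefix with 0 so the shell reads the mode as octal, then mask out the allowed bits
  [ $(( 0$mode & ~0644 )) -eq 0 ]
}

# Possible use on the worker node:
# mode_within_644 /var/lib/kubelet/config.yaml || echo "too permissive"
```

With this check 600 and 640 pass, while 700 fails because of the owner execute bit, even though 700 grants nothing to group and others.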
##### Number 4
Finally for number (4), let's check whether --client-ca-file argument for the kubelet is set properly according to kube-bench recommendations:
```sh
➜ root@cluster2-node1:~# kube-bench run --targets=node | grep client-ca-file
[PASS] 4.2.3 Ensure that the --client-ca-file argument is set as appropriate (Automated)
```
This looks passing with 4.2.3. The other ones are about the file that the parameter points to and can be ignored here.
To further investigate we run the following command to locate the kubelet config file, and open it:
```sh
➜ root@cluster2-node1:~# ps -ef | grep kubelet
root 5157 1 2 20:28 ? 00:03:22 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --network-plugin=cni --pod-infra-container-image=k8s.gcr.io/pause:3.2
root 19940 11901 0 22:38 pts/0 00:00:00 grep --color=auto kubelet
```
```sh
➜ root@cluster2-node1:~# vim /var/lib/kubelet/config.yaml
```
```yaml
# /var/lib/kubelet/config.yaml
apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 0s
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt
...
```
The clientCAFile points to the location of the certificate, which is correct.
## Question 6 | Verify Platform Binaries
#### Task weight: 2%
(can be solved in any kubectl context)
There are four Kubernetes server binaries located at `/opt/course/6/binaries`. You're provided with the following verified sha512 values for these:
###### kube-apiserver
`f417c0555bc0167355589dd1afe23be9bf909bf98312b1025f12015d1b58a1c62c9908c0067a7764fa35efdac7016a9efa8711a44425dd6692906a7c283f032c`
###### kube-controller-manager
`60100cc725e91fe1a949e1b2d0474237844b5862556e25c2c655a33boa8225855ec5ee22fa4927e6c46a60d43a7c4403a27268f96fbb726307d1608b44f38a60`
###### kube-proxy
`52f9d8ad045f8eee1d689619ef8ceef2d86d50c75a6a332653240d7ba5b2a114aca056d9e513984ade24358c9662714973c1960c62a5cb37dd375631c8a614c6`
###### kubelet
`4be40f2440619e990897cf956c32800dc96c2c983bf64519854a3309fa5aa21827991559f9c44595098e27e6f2ee4d64a3fdec6baba8a177881f20e3ec61e26c`
Delete those binaries that don't match the sha512 values above.
#### Answer:
We check the directory:
```sh
➜ cd /opt/course/6/binaries
➜ ls
kube-apiserver kube-controller-manager kube-proxy kubelet
```
To generate the sha512 sum of a binary we do:
```sh
➜ sha512sum kube-apiserver
f417c0555bc0167355589dd1afe23be9bf909bf98312b1025f12015d1b58a1c62c9908c0067a7764fa35efdac7016a9efa8711a44425dd6692906a7c283f032c kube-apiserver
```
Looking good, next:
```sh
➜ sha512sum kube-controller-manager
60100cc725e91fe1a949e1b2d0474237844b5862556e25c2c655a33b0a8225855ec5ee22fa4927e6c46a60d43a7c4403a27268f96fbb726307d1608b44f38a60 kube-controller-manager
```
Okay, next:
```sh
➜ sha512sum kube-proxy
52f9d8ad045f8eee1d689619ef8ceef2d86d50c75a6a332653240d7ba5b2a114aca056d9e513984ade24358c9662714973c1960c62a5cb37dd375631c8a614c6 kube-proxy
```
Also good, and finally:
```sh
➜ sha512sum kubelet
7b720598e6a3483b45c537b57d759e3e82bc5c53b3274f681792f62e941019cde3d51a7f9b55158abf3810d506146bc0aa7cf97b36f27f341028a54431b335be kubelet
```
Catch! Binary kubelet has a different hash!
But did we actually compare everything properly before? Let's have a closer look at kube-controller-manager again:
```sh
➜ sha512sum kube-controller-manager > compare
➜ vim compare
```
Edit to only have the provided hash and the generated one in one line each:
```sh
# ./compare
60100cc725e91fe1a949e1b2d0474237844b5862556e25c2c655a33b0a8225855ec5ee22fa4927e6c46a60d43a7c4403a27268f96fbb726307d1608b44f38a60
60100cc725e91fe1a949e1b2d0474237844b5862556e25c2c655a33boa8225855ec5ee22fa4927e6c46a60d43a7c4403a27268f96fbb726307d1608b44f38a60
```
Looks right at first glance, but if we do:
```sh
➜ cat compare | uniq
60100cc725e91fe1a949e1b2d0474237844b5862556e25c2c655a33b0a8225855ec5ee22fa4927e6c46a60d43a7c4403a27268f96fbb726307d1608b44f38a60
60100cc725e91fe1a949e1b2d0474237844b5862556e25c2c655a33boa8225855ec5ee22fa4927e6c46a60d43a7c4403a27268f96fbb726307d1608b44f38a60
```
This shows they are different, by just one character actually.
To complete the task we do:
```sh
rm kubelet kube-controller-manager
```
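Comparing 128-character hashes by eye is error-prone, as the single differing character above shows. As an alternative sketch, `sha512sum -c` can do the comparison for us. The example below is self-contained and uses made-up stand-in files in a scratch directory rather than the real binaries; the file name `verified.sha512` is an assumption for illustration.

```sh
# Work in a scratch directory with two stand-in "binaries".
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'good binary\n' > kube-proxy
printf 'original kubelet\n' > kubelet

# Record the "verified" hashes in the format sha512sum itself emits.
sha512sum kube-proxy kubelet > verified.sha512

# Simulate a binary that changed after the hashes were recorded.
printf 'tampered kubelet\n' > kubelet

# -c re-hashes every listed file and reports OK or FAILED per file.
# It exits non-zero on a mismatch, hence the "|| true".
result=$(sha512sum -c verified.sha512 2>/dev/null) || true
echo "$result"

# Files reported as FAILED are the ones the task asks us to delete.
failed=$(echo "$result" | grep 'FAILED$' | cut -d: -f1)
echo "to delete: $failed"

cd / && rm -rf "$dir"
```

In the exam scenario the list file would be built from the hashes given in the task, one `<hash>  <filename>` pair per line.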
## Question 7 | Open Policy Agent
#### Task weight: 6%
Use context: `kubectl config use-context infra-prod`
The Open Policy Agent and Gatekeeper have been installed to, among other things, enforce blacklisting of certain image registries. Alter the existing constraint and/or template to also blacklist images from very-bad-registry.com.
Test it by creating a single Pod using image very-bad-registry.com/image in Namespace default; it shouldn't work.
You can also verify your changes by looking at the existing Deployment untrusted in Namespace default, which uses an image from the new untrusted source. The OPA constraint should throw violation messages for this one.
#### Answer:
We look at the existing OPA constraints, which Gatekeeper implements using CRDs:
```sh
➜ k get crd
NAME CREATED AT
blacklistimages.constraints.gatekeeper.sh 2020-09-14T19:29:31Z
configs.config.gatekeeper.sh 2020-09-14T19:29:04Z
constraintpodstatuses.status.gatekeeper.sh 2020-09-14T19:29:05Z
constrainttemplatepodstatuses.status.gatekeeper.sh 2020-09-14T19:29:05Z
constrainttemplates.templates.gatekeeper.sh 2020-09-14T19:29:05Z
requiredlabels.constraints.gatekeeper.sh 2020-09-14T19:29:31Z
```
So we can do:
```sh
➜ k get constraint
NAME AGE
blacklistimages.constraints.gatekeeper.sh/pod-trusted-images 10m
NAME AGE
requiredlabels.constraints.gatekeeper.sh/namespace-mandatory-labels 10m
```
and then look at the one that is probably about blacklisting images:
```sh
k edit blacklistimages pod-trusted-images
```
```yaml
# kubectl edit blacklistimages pod-trusted-images
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: BlacklistImages
metadata:
...
spec:
  match:
    kinds:
    - apiGroups:
      - ""
      kinds:
      - Pod
```
It looks like this constraint simply applies the template to all Pods, no arguments passed. So we edit the template:
```sh
k edit constrainttemplates blacklistimages
```
```yaml
# kubectl edit constrainttemplates blacklistimages
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
...
spec:
  crd:
    spec:
      names:
        kind: BlacklistImages
  targets:
  - rego: |
      package k8strustedimages
      images {
        image := input.review.object.spec.containers[_].image
        not startswith(image, "docker-fake.io/")
        not startswith(image, "google-gcr-fake.com/")
        not startswith(image, "very-bad-registry.com/") # ADD THIS LINE
      }
      violation[{"msg": msg}] {
        not images
        msg := "not trusted image!"
      }
    target: admission.k8s.gatekeeper.sh
```
We simply have to add another line. After editing we try to create a Pod of the bad image:
```sh
➜ k run opa-test --image=very-bad-registry.com/image
Error from server ([denied by pod-trusted-images] not trusted image!): admission webhook "validation.gatekeeper.sh" denied the request: [denied by pod-trusted-images] not trusted image!
```
Nice! After some time we can also see that Pods of the existing Deployment "untrusted" will be listed as violators:
```sh
➜ k describe blacklistimages pod-trusted-images
...
Total Violations: 2
Violations:
Enforcement Action: deny
Kind: Namespace
Message: you must provide labels: {"security-level"}
Name: sidecar-injector
Enforcement Action: deny
Kind: Pod
Message: not trusted image!
Name: untrusted-68c4944d48-tfsnb
Namespace: default
Events:
```
Great, OPA fights bad registries!
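To get a feel for what the prefix checks in the template do, the same decision can be mimicked outside the cluster. This is only an illustrative sketch in shell: Gatekeeper evaluates the Rego policy, not this script, and the function name `image_allowed` is hypothetical. It checks a single image name, whereas the template inspects the containers of a Pod.

```sh
# Hypothetical stand-in for the template's prefix checks: succeeds only
# if the image does not start with any blacklisted registry prefix.
image_allowed() {
  case "$1" in
    docker-fake.io/*|google-gcr-fake.com/*|very-bad-registry.com/*) return 1 ;;
    *) return 0 ;;
  esac
}

for image in nginx:1.21 very-bad-registry.com/image docker-fake.io/tool; do
  if image_allowed "$image"; then
    echo "$image: allowed"
  else
    echo "$image: not trusted image!"
  fi
done
```

Adding a registry to the blacklist corresponds to adding one more prefix pattern, just like the one extra `not startswith(...)` line in the template.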
## Question 8 | Secure Kubernetes Dashboard
#### Task weight: 3%
Use context: `kubectl config use-context workload-prod`
The Kubernetes Dashboard is installed in Namespace kubernetes-dashboard and is configured to:
* Allow users to "skip login"
* Allow insecure access (HTTP without authentication)
* Allow basic authentication
* Allow access from outside the cluster
You are asked to make it more secure by:
* Deny users to "skip login"
* Deny insecure access, enforce HTTPS (self signed certificates are ok for now)
* Add the --auto-generate-certificates argument
* Enforce authentication using a token (with possibility to use RBAC)
* Allow only cluster internal access
#### Answer:
Head to https://github.com/kubernetes/dashboard/tree/master/docs to find documentation about the dashboard. This link is not on the allowed list of URLs during the real exam. This means you should be provided with all information necessary in case of a task like this.
First we have a look in Namespace kubernetes-dashboard:
```sh
➜ k -n kubernetes-dashboard get pod,svc
NAME READY STATUS RESTARTS AGE
pod/dashboard-metrics-scraper-7b59f7d4df-fbpd9 1/1 Running 0 24m
pod/kubernetes-dashboard-6d8cd5dd84-w7wr2 1/1 Running 0 24m
NAME TYPE ... PORT(S) AGE
service/dashboard-metrics-scraper ClusterIP ... 8000/TCP 24m
service/kubernetes-dashboard NodePort ... 9090:32520/TCP,443:31206/TCP 24m
```
We can see one running dashboard Pod and a NodePort Service exposing it. Let's try to connect to it via the NodePort, using the IP of any Node:
(your port might be different)
```sh
➜ k get node -o wide
NAME                     STATUS   ROLES    AGE   VERSION   INTERNAL-IP      ...
cluster1-controlplane1   Ready    master   37m   v1.24.1   192.168.100.11   ...
cluster1-node1           Ready    <none>   36m   v1.24.1   192.168.100.12   ...
cluster1-node2           Ready    <none>   34m   v1.24.1   192.168.100.13   ...
➜ curl http://192.168.100.11:32520
ceb8989101b
Successfully tagged registry.killer.sh:5000/image-verify:v2
ceb8989101bccd9f6b9c3b4c6c75f6c3561f19a5b784edd1f1a36fa0fb34a9df
```
We can then test our changes by running the container locally:
```sh
➜ :/opt/course/16/image$ podman run registry.killer.sh:5000/image-verify:v2
Thu Sep 16 06:01:47 UTC 2021
uid=101(myuser) gid=102(myuser) groups=102(myuser)
Thu Sep 16 06:01:48 UTC 2021
uid=101(myuser) gid=102(myuser) groups=102(myuser)
Thu Sep 16 06:01:49 UTC 2021
uid=101(myuser) gid=102(myuser) groups=102(myuser)
```
Looking good, so we push:
```sh
➜ :/opt/course/16/image$ podman push registry.killer.sh:5000/image-verify:v2
Getting image source signatures
Copying blob cd0853834d88 done
Copying blob 5298d0709c3e skipped: already exists
Copying blob e6688e911f15 done
Copying blob dbc406096645 skipped: already exists
Copying blob 98895ed393d9 done
Copying config ceb8989101 done
Writing manifest to image destination
Storing signatures
```
And we update the Deployment to use the new image:
```sh
k -n team-blue edit deploy image-verify
```
```yaml
# kubectl -n team-blue edit deploy image-verify
apiVersion: apps/v1
kind: Deployment
metadata:
...
spec:
...
  template:
  ...
    spec:
      containers:
      - image: registry.killer.sh:5000/image-verify:v2 # change
```
And afterwards we can verify our changes by looking at the Pod logs:
```sh
➜ k -n team-blue logs -f -l id=image-verify
Fri Sep 25 21:06:55 UTC 2020
uid=101(myuser) gid=102(myuser) groups=102(myuser)
```
Also to verify our changes even further:
```sh
➜ k -n team-blue exec image-verify-55fbcd4c9b-x2flc -- curl
OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "exec: \"curl\": executable file not found in $PATH": unknown
command terminated with exit code 126
➜ k -n team-blue exec image-verify-55fbcd4c9b-x2flc -- nginx -v
nginx version: nginx/1.18.0
```
Another task solved.
## Question 17 | Audit Log Policy
#### Task weight: 7%
Use context: `kubectl config use-context infra-prod`
Audit Logging has been enabled in the cluster with an Audit Policy located at `/etc/kubernetes/audit/policy.yaml` on `cluster2-controlplane1`.
Change the configuration so that only one backup of the logs is stored.
Alter the Policy in a way that it only stores logs:
* From Secret resources, level Metadata
* From "system:nodes" userGroups, level RequestResponse
After you have altered the Policy, make sure to empty the log file so it only contains entries according to your changes, for example using `truncate -s 0 /etc/kubernetes/audit/logs/audit.log`.
###### NOTE: You can use jq to render JSON in a more readable way. cat data.json | jq
#### Answer:
First we check the apiserver configuration and change as requested:
```sh
➜ ssh cluster2-controlplane1
➜ root@cluster2-controlplane1:~# cp /etc/kubernetes/manifests/kube-apiserver.yaml ~/17_kube-apiserver.yaml # backup
➜ root@cluster2-controlplane1:~# vim /etc/kubernetes/manifests/kube-apiserver.yaml
```
```yaml
# /etc/kubernetes/manifests/kube-apiserver.yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubeadm.kubernetes.io/kube-apiserver.advertise-address.endpoint: 192.168.100.21:6443
  creationTimestamp: null
  labels:
    component: kube-apiserver
    tier: control-plane
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-apiserver
    - --audit-policy-file=/etc/kubernetes/audit/policy.yaml
    - --audit-log-path=/etc/kubernetes/audit/logs/audit.log
    - --audit-log-maxsize=5
    - --audit-log-maxbackup=1 # CHANGE
    - --advertise-address=192.168.100.21
    - --allow-privileged=true
...
```
###### NOTE: You should know how to enable Audit Logging completely yourself as described in the docs. Feel free to try this in another cluster in this environment.
Now we look at the existing Policy:
```sh
➜ root@cluster2-controlplane1:~# vim /etc/kubernetes/audit/policy.yaml
```
```yaml
# /etc/kubernetes/audit/policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata
```
We can see that this simple Policy logs everything on Metadata level. So we change it to the requirements:
```yaml
# /etc/kubernetes/audit/policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# log Secret resources audits, level Metadata
- level: Metadata
  resources:
  - group: ""
    resources: ["secrets"]
# log node related audits, level RequestResponse
- level: RequestResponse
  userGroups: ["system:nodes"]
# for everything else don't log anything
- level: None
```
After saving the changes we have to restart the apiserver:
```sh
➜ root@cluster2-controlplane1:~# cd /etc/kubernetes/manifests/
➜ root@cluster2-controlplane1:/etc/kubernetes/manifests# mv kube-apiserver.yaml ..
➜ root@cluster2-controlplane1:/etc/kubernetes/manifests# watch crictl ps # wait for apiserver gone
➜ root@cluster2-controlplane1:/etc/kubernetes/manifests# truncate -s 0 /etc/kubernetes/audit/logs/audit.log
➜ root@cluster2-controlplane1:/etc/kubernetes/manifests# mv ../kube-apiserver.yaml .
```
Once the apiserver is running again we can check the new logs and scroll through some entries:
```sh
cat audit.log | tail | jq
```
```json
{
"kind": "Event",
"apiVersion": "audit.k8s.io/v1",
"level": "Metadata",
"auditID": "e598dc9e-fc8b-4213-aee3-0719499ab1bd",
"stage": "RequestReceived",
"requestURI": "...",
"verb": "watch",
"user": {
"username": "system:serviceaccount:gatekeeper-system:gatekeeper-admin",
"uid": "79870838-75a8-479b-ad42-4b7b75bd17a3",
"groups": [
"system:serviceaccounts",
"system:serviceaccounts:gatekeeper-system",
"system:authenticated"
]
},
"sourceIPs": [
"192.168.102.21"
],
"userAgent": "manager/v0.0.0 (linux/amd64) kubernetes/$Format",
"objectRef": {
"resource": "secrets",
"apiVersion": "v1"
},
"requestReceivedTimestamp": "2020-09-27T20:01:36.238911Z",
"stageTimestamp": "2020-09-27T20:01:36.238911Z",
"annotations": {
"authentication.k8s.io/legacy-token": "..."
}
}
```
Above we logged a watch action by OPA Gatekeeper for Secrets, level Metadata.
```json
{
"kind": "Event",
"apiVersion": "audit.k8s.io/v1",
"level": "RequestResponse",
"auditID": "c90e53ed-b0cf-4cc4-889a-f1204dd39267",
"stage": "ResponseComplete",
"requestURI": "...",
"verb": "list",
"user": {
"username": "system:node:cluster2-controlplane1",
"groups": [
"system:nodes",
"system:authenticated"
]
},
"sourceIPs": [
"192.168.100.21"
],
"userAgent": "kubelet/v1.19.1 (linux/amd64) kubernetes/206bcad",
"objectRef": {
"resource": "configmaps",
"namespace": "kube-system",
"name": "kube-proxy",
"apiVersion": "v1"
},
"responseStatus": {
"metadata": {},
"code": 200
},
"responseObject": {
"kind": "ConfigMapList",
"apiVersion": "v1",
"metadata": {
"selfLink": "/api/v1/namespaces/kube-system/configmaps",
"resourceVersion": "83409"
},
"items": [
{
"metadata": {
"name": "kube-proxy",
"namespace": "kube-system",
"selfLink": "/api/v1/namespaces/kube-system/configmaps/kube-proxy",
"uid": "0f1c3950-430a-4543-83e4-3f9c87a478b8",
"resourceVersion": "232",
"creationTimestamp": "2020-09-26T20:59:50Z",
"labels": {
"app": "kube-proxy"
},
"annotations": {
"kubeadm.kubernetes.io/component-config.hash": "..."
},
"managedFields": [
{
...
}
]
},
...
}
]
},
"requestReceivedTimestamp": "2020-09-27T20:01:36.223781Z",
"stageTimestamp": "2020-09-27T20:01:36.225470Z",
"annotations": {
"authorization.k8s.io/decision": "allow",
"authorization.k8s.io/reason": ""
}
}
```
And in the one above we logged a list action by system:nodes for ConfigMaps, level RequestResponse.
Because all JSON entries are written in a single line in the file we could also run some simple verifications on our Policy:
```sh
# shows Secret entries
cat audit.log | grep '"resource":"secrets"' | wc -l
# confirms Secret entries are only of level Metadata
cat audit.log | grep '"resource":"secrets"' | grep -v '"level":"Metadata"' | wc -l
# shows RequestResponse level entries
cat audit.log | grep '"level":"RequestResponse"' | wc -l
# shows RequestResponse level entries are only for system:nodes
cat audit.log | grep '"level":"RequestResponse"' | grep -v "system:nodes" | wc -l
```
Looks like our job is done.
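To see how these grep-based checks behave, here is a small self-contained sketch that runs the two "should be zero" checks against a made-up three-line log. The sample events are heavily shortened and invented for illustration; real audit entries carry many more fields.

```sh
# Three made-up single-line audit events standing in for audit.log.
log=$(mktemp)
cat > "$log" <<'EOF'
{"level":"Metadata","verb":"watch","user":{"groups":["system:serviceaccounts"]},"objectRef":{"resource":"secrets"}}
{"level":"RequestResponse","verb":"list","user":{"groups":["system:nodes"]},"objectRef":{"resource":"configmaps"}}
{"level":"RequestResponse","verb":"get","user":{"groups":["system:nodes"]},"objectRef":{"resource":"nodes"}}
EOF

# Secret entries that are NOT at level Metadata (we want 0).
# grep -c exits non-zero when it counts 0 lines, hence the "|| true".
bad_secrets=$(grep '"resource":"secrets"' "$log" | grep -vc '"level":"Metadata"') || true
# RequestResponse entries that are NOT from system:nodes (we want 0).
bad_rr=$(grep '"level":"RequestResponse"' "$log" | grep -vc 'system:nodes') || true

echo "Secret entries not at Metadata level: $bad_secrets"
echo "RequestResponse entries outside system:nodes: $bad_rr"
rm -f "$log"
```

Both counters being 0 indicates that the sample log contains nothing the Policy should have filtered out.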
## Question 18 | Investigate Break-in via Audit Log
#### Task weight: 4%
Use context: `kubectl config use-context infra-prod`
Namespace security contains five Secrets of type Opaque which can be considered highly confidential. The latest Incident-Prevention-Investigation revealed that ServiceAccount p.auster had too broad access to the cluster for some time. This SA should've never had access to any Secrets in that Namespace.
Find out which Secrets in Namespace security this SA did access by looking at the Audit Logs under /opt/course/18/audit.log.
Change the password to any new string of only those Secrets that were accessed by this SA.
###### NOTE: You can use jq to render JSON in a more readable way. cat data.json | jq
#### Answer:
First we look at the Secrets this is about:
```sh
➜ k -n security get secret | grep Opaque
kubeadmin-token   Opaque   1   37m
mysql-admin       Opaque   1   37m
postgres001       Opaque   1   37m
postgres002       Opaque   1   37m
vault-token       Opaque   1   37m
```
Next we investigate the Audit Log file:
```sh
➜ cd /opt/course/18
➜ :/opt/course/18$ ls -lh
total 7.1M
-rw-r--r-- 1 k8s k8s 7.5M Sep 24 21:31 audit.log
➜ :/opt/course/18$ cat audit.log | wc -l
4451
```
Audit Logs can be huge and it's common to limit the amount by creating an Audit Policy and to transfer the data into systems like Elasticsearch. In this case we have a simple JSON export, but it already contains 4451 lines.
We should try to filter the file down to relevant information:
```sh
➜ :/opt/course/18$ cat audit.log | grep "p.auster" | wc -l
28
```
Not too bad, only 28 logs for ServiceAccount p.auster.
```sh
➜ :/opt/course/18$ cat audit.log | grep "p.auster" | grep Secret | wc -l
2
```
And only 2 logs related to Secrets...
```sh
➜ :/opt/course/18$ cat audit.log | grep "p.auster" | grep Secret | grep list | wc -l
0
➜ :/opt/course/18$ cat audit.log | grep "p.auster" | grep Secret | grep get | wc -l
2
```
No list actions, which is good, but 2 get actions, so we check these out:
```sh
cat audit.log | grep "p.auster" | grep Secret | grep get | jq
```
```json
{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "RequestResponse",
  "auditID": "74fd9e03-abea-4df1-b3d0-9cfeff9ad97a",
  "stage": "ResponseComplete",
  "requestURI": "/api/v1/namespaces/security/secrets/vault-token",
  "verb": "get",
  "user": {
    "username": "system:serviceaccount:security:p.auster",
    "uid": "29ecb107-c0e8-4f2d-816a-b16f4391999c",
    "groups": [
      "system:serviceaccounts",
      "system:serviceaccounts:security",
      "system:authenticated"
    ]
  },
  ...
  "userAgent": "curl/7.64.0",
  "objectRef": {
    "resource": "secrets",
    "namespace": "security",
    "name": "vault-token",
    "apiVersion": "v1"
  },
  ...
}
{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "RequestResponse",
  "auditID": "aed6caf9-5af0-4872-8f09-ad55974bb5e0",
  "stage": "ResponseComplete",
  "requestURI": "/api/v1/namespaces/security/secrets/mysql-admin",
  "verb": "get",
  "user": {
    "username": "system:serviceaccount:security:p.auster",
    "uid": "29ecb107-c0e8-4f2d-816a-b16f4391999c",
    "groups": [
      "system:serviceaccounts",
      "system:serviceaccounts:security",
      "system:authenticated"
    ]
  },
  ...
  "userAgent": "curl/7.64.0",
  "objectRef": {
    "resource": "secrets",
    "namespace": "security",
    "name": "mysql-admin",
    "apiVersion": "v1"
  },
  ...
}
```
There we see that Secrets vault-token and mysql-admin were accessed by p.auster. Hence we change the passwords for those.
```sh
➜ echo new-vault-pass | base64
bmV3LXZhdWx0LXBhc3MK
➜ k -n security edit secret vault-token
➜ echo new-mysql-pass | base64
bmV3LW15c3FsLXBhc3MK
➜ k -n security edit secret mysql-admin
```
Audit Logs ftw.
By running `cat audit.log | grep "p.auster" | grep Secret | grep password` we can see that passwords are stored in the Audit Logs, because they store the complete content of Secrets. It's never a good idea to reveal passwords in logs. In this case it would probably be sufficient to only store Metadata level information of Secrets, which can be controlled via an Audit Policy.
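One detail about the `echo ... | base64` commands used above: `echo` appends a newline, and that newline becomes part of the encoded password. Whether this matters depends on how the consuming application reads the Secret, so treat the following as a sketch of the difference rather than a required change. It needs nothing but `base64` and `printf`.

```sh
# echo appends a newline, so the encoded value ends in "K" (the encoded "\n").
with_newline=$(echo new-vault-pass | base64)
# printf '%s' emits exactly the given bytes, so no newline is encoded.
without_newline=$(printf '%s' new-vault-pass | base64)

echo "$with_newline"      # bmV3LXZhdWx0LXBhc3MK
echo "$without_newline"   # bmV3LXZhdWx0LXBhc3M=
```

Decoding both values with `base64 -d` shows the same visible text, but only the first one carries the trailing newline.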
## Question 19 | Immutable Root FileSystem
#### Task weight: 2%
Use context: `kubectl config use-context workload-prod`
The Deployment `immutable-deployment` in Namespace `team-purple` should run immutable, it's created from file `/opt/course/19/immutable-deployment.yaml`. Even after a successful break-in, it shouldn't be possible for an attacker to modify the filesystem of the running container.
Modify the Deployment in a way that no processes inside the container can modify the local filesystem, only `/tmp` directory should be writeable. Don't modify the Docker image.
Save the updated YAML under `/opt/course/19/immutable-deployment-new.yaml` and update the running Deployment.
#### Answer:
Processes in containers can write to the local filesystem by default. This increases the attack surface when a non-malicious process gets hijacked. Preventing applications from writing to disk, or allowing writes only to certain directories, can mitigate the risk. If there is for example a bug in Nginx which allows an attacker to overwrite any file inside the container, then this only works if the Nginx process itself can write to the filesystem in the first place.
Making the root filesystem readonly can be done in the Docker image itself or in a Pod declaration.
Let us first check the Deployment `immutable-deployment` in Namespace `team-purple`:
```sh
➜ k -n team-purple edit deploy -o yaml
```
```yaml
# kubectl -n team-purple edit deploy -o yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: team-purple
  name: immutable-deployment
  labels:
    app: immutable-deployment
  ...
spec:
  replicas: 1
  selector:
    matchLabels:
      app: immutable-deployment
  template:
    metadata:
      labels:
        app: immutable-deployment
    spec:
      containers:
      - image: busybox:1.32.0
        command: ['sh', '-c', 'tail -f /dev/null']
        imagePullPolicy: IfNotPresent
        name: busybox
      restartPolicy: Always
...
```
The container has write access to the Root File System, as there are no restrictions defined for the Pods or containers by an existing SecurityContext. And based on the task we're not allowed to alter the Docker image.
So we modify the YAML manifest to include the required changes:
```sh
cp /opt/course/19/immutable-deployment.yaml /opt/course/19/immutable-deployment-new.yaml
vim /opt/course/19/immutable-deployment-new.yaml
```
```yaml
# /opt/course/19/immutable-deployment-new.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: team-purple
  name: immutable-deployment
  labels:
    app: immutable-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: immutable-deployment
  template:
    metadata:
      labels:
        app: immutable-deployment
    spec:
      containers:
      - image: busybox:1.32.0
        command: ['sh', '-c', 'tail -f /dev/null']
        imagePullPolicy: IfNotPresent
        name: busybox
        securityContext:                  # add
          readOnlyRootFilesystem: true    # add
        volumeMounts:                     # add
        - mountPath: /tmp                 # add
          name: temp-vol                  # add
      volumes:                            # add
      - name: temp-vol                    # add
        emptyDir: {}                      # add
      restartPolicy: Always
```
SecurityContexts can be set on Pod or container level, here the latter was asked. Enforcing readOnlyRootFilesystem: true will render the root filesystem readonly. We can then allow some directories to be writable by using an emptyDir volume.
Once the changes are made, let us update the Deployment:
```sh
➜ k delete -f /opt/course/19/immutable-deployment-new.yaml
deployment.apps "immutable-deployment" deleted
➜ k create -f /opt/course/19/immutable-deployment-new.yaml
deployment.apps/immutable-deployment created
```
We can verify if the required changes are propagated:
```sh
➜ k -n team-purple exec immutable-deployment-5b7ff8d464-j2nrj -- touch /abc.txt
touch: /abc.txt: Read-only file system
command terminated with exit code 1
➜ k -n team-purple exec immutable-deployment-5b7ff8d464-j2nrj -- touch /var/abc.txt
touch: /var/abc.txt: Read-only file system
command terminated with exit code 1
➜ k -n team-purple exec immutable-deployment-5b7ff8d464-j2nrj -- touch /etc/abc.txt
touch: /etc/abc.txt: Read-only file system
command terminated with exit code 1
➜ k -n team-purple exec immutable-deployment-5b7ff8d464-j2nrj -- touch /tmp/abc.txt
➜ k -n team-purple exec immutable-deployment-5b7ff8d464-j2nrj -- ls /tmp
abc.txt
```
The Deployment has been updated so that the container's file system is read-only, and the updated YAML has been placed under the required location. Sweet!
## Question 20 | Update Kubernetes
#### Task weight: 8%
Use context: `kubectl config use-context workload-stage`
The cluster is running Kubernetes `1.24.7`, update it to `1.25.2`.
Use `apt` package manager and `kubeadm` for this.
Use `ssh cluster3-controlplane1` and `ssh cluster3-node1` to connect to the instances.
#### Answer:
Let's have a look at the current versions:
```sh
➜ k get node
NAME                     STATUS   ROLES           AGE   VERSION
cluster3-controlplane1   Ready    control-plane   21d   v1.24.7
cluster3-node1           Ready    <none>          21d   v1.24.7
```
###### Control Plane Master Components
First we should update the control plane components running on the master node, so we drain it:
```sh
➜ k drain cluster3-controlplane1 --ignore-daemonsets
```
Next we ssh into it and check versions:
```sh
➜ ssh cluster3-controlplane1
➜ root@cluster3-controlplane1:~# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.1", GitCommit:"3ddd0f45aa91e2f30c70734b175631bec5b5825a", GitTreeState:"clean", BuildDate:"2022-05-24T12:24:38Z", GoVersion:"go1.18.2", Compiler:"gc", Platform:"linux/amd64"}
➜ root@cluster3-controlplane1:~# kubelet --version
Kubernetes v1.23.1
```
We see `kubeadm` is already installed in the required version. Else we would need to install it:
```sh
# not necessary because here kubeadm is already installed in correct version
apt-mark unhold kubeadm
apt-mark hold kubectl kubelet
apt install kubeadm=1.24.1-00
apt-mark hold kubeadm
```
Check what kubeadm has available as an upgrade plan:
```sh
➜ root@cluster3-controlplane1:~# kubeadm upgrade plan
...
[upgrade/config] Making sure the configuration is correct:
[upgr