{"id":13533563,"url":"https://github.com/asobti/kube-monkey","last_synced_at":"2025-04-10T04:53:53.511Z","repository":{"id":37285068,"uuid":"75889576","full_name":"asobti/kube-monkey","owner":"asobti","description":"An implementation of Netflix's Chaos Monkey for Kubernetes clusters","archived":false,"fork":false,"pushed_at":"2024-06-16T19:34:50.000Z","size":31566,"stargazers_count":3003,"open_issues_count":27,"forks_count":253,"subscribers_count":55,"default_branch":"master","last_synced_at":"2025-04-03T02:55:19.175Z","etag":null,"topics":["chaos-monkey","go","kubernetes","netflix-chaos-monkey"],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/asobti.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-12-08T01:06:28.000Z","updated_at":"2025-04-02T07:13:50.000Z","dependencies_parsed_at":"2024-02-03T20:43:33.337Z","dependency_job_id":"dbb7550a-d5d6-40de-9f7f-734bc892b91e","html_url":"https://github.com/asobti/kube-monkey","commit_stats":{"total_commits":334,"total_committers":50,"mean_commits":6.68,"dds":0.781437125748503,"last_synced_commit":"7dcb3ca1a629a179d66cae3e71bbd136d9bd11dd"},"previous_names":[],"tags_count":7,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/asobti%2Fkube-monkey","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/asobti%2Fkube-monkey/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/asobti%2Fkub
e-monkey/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/asobti%2Fkube-monkey/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/asobti","download_url":"https://codeload.github.com/asobti/kube-monkey/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248161255,"owners_count":21057553,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chaos-monkey","go","kubernetes","netflix-chaos-monkey"],"created_at":"2024-08-01T07:01:21.025Z","updated_at":"2025-04-10T04:53:53.478Z","avatar_url":"https://github.com/asobti.png","language":"Go","funding_links":[],"categories":["Go","Testing","Chaos engineering","Chaos testing","Tools and Libraries","Kubernetes","kubernetes","🏗相关开源项目","Repositories","Notable Tools","3. 
Fault Injection"],"sub_categories":["[Jenkins](#jenkins)","Testing and Troubleshooting","Kubernetes tools","测试","Cloud"],"readme":"[![Build](https://github.com/asobti/kube-monkey/actions/workflows/go.yml/badge.svg)](https://github.com/asobti/kube-monkey/actions/workflows/go.yml)\n[![Go Report Card](https://goreportcard.com/badge/github.com/asobti/kube-monkey)](https://goreportcard.com/report/github.com/asobti/kube-monkey)\n[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)\n[![Docker Pulls](https://img.shields.io/docker/pulls/ayushsobti/kube-monkey?label=Docker%20pulls\u0026logo=docker)](https://hub.docker.com/r/ayushsobti/kube-monkey/)\n[![Artifact Hub](https://img.shields.io/endpoint?url=https://artifacthub.io/badge/repository/kubemonkey)](https://artifacthub.io/packages/search?repo=kubemonkey)\n\nkube-monkey is an implementation of [Netflix's Chaos Monkey](https://github.com/Netflix/chaosmonkey) for [Kubernetes](http://kubernetes.io/) clusters. It randomly deletes Kubernetes (k8s) pods in the cluster encouraging and validating the development of failure-resilient services.\n\nJoin us at [#kube-monkey](https://kubernetes.slack.com/messages/kube-monkey) on Kubernetes Slack.\n\n---\n\nkube-monkey runs at a pre-configured hour (`run_hour`, defaults to 8 am) on weekdays, and builds a schedule of deployments that will face a random\nPod death sometime during the same day. 
The time-range during the day when the random pod Death might occur is configurable and defaults to 10 am to 4 pm.\n\nkube-monkey can be configured with a list of namespaces\n* to blacklist (any deployments within a blacklisted namespace will not be touched)\n\nTo disable the blacklist provide `[\"\"]` in the `blacklisted_namespaces` config.param.\n\n## Opting-In to Chaos\n\nkube-monkey works on an opt-in model and will only schedule terminations for Kubernetes (k8s) apps that have explicitly agreed to have their pods terminated by kube-monkey.\n\nOpt-in is done by setting the following labels on a k8s app:\n\n**`kube-monkey/enabled`**: Set to **`\"enabled\"`** to opt-in to kube-monkey  \n**`kube-monkey/mtbf`**: Mean time between failure (in days). For example, if set to **`\"3\"`**, the k8s app can expect to have a Pod\nkilled approximately every third weekday.  \n**`kube-monkey/identifier`**: A unique identifier for the k8s apps. This is used to identify the pods\nthat belong to a k8s app as Pods inherit labels from their k8s app. So, if kube-monkey detects that app `foo` has enrolled to be a victim, kube-monkey will look for all pods that have the label `kube-monkey/identifier: foo` to determine which pods are candidates for killing. The recommendation is to set this value to be the same as the app's name.  \n**`kube-monkey/kill-mode`**: Default behavior is for kube-monkey to kill only ONE pod of your app. You can override this behavior by setting the value to:\n* `kill-all` if you want kube-monkey to kill **ALL** of your pods regardless of status (including not ready and not running pods). Does not require `kill-value`. **Use this label carefully.**\n* `fixed` if you want to kill a specific number of running pods with `kill-value`. If you overspecify, it will kill **all** running pods and issue a warning.\n* `random-max-percent` to specify a *maximum* `%` with `kill-value` that can be killed. 
At the scheduled time, a uniform *random specified* `%` of the running pods will be terminated.\n* `fixed-percent` to specify a *fixed* `%` with `kill-value` that can be killed. At the scheduled time, a specified *fixed* `%` of the running pods will be terminated.\n\n\n**`kube-monkey/kill-value`**: Specify value for kill-mode\n* if `fixed`, provide an integer of pods to kill\n* if `random-max-percent`, provide a number from `0`-`100` to specify the max `%` of pods kube-monkey can kill\n* if `fixed-percent`, provide a number from `0`-`100` to specify the `%` of pods to kill\n\n#### Example of opted-in Deployment killing one pod per purge\n\n```yaml\n---\napiVersion: apps/v1\nkind: Deployment\nmetadata:\n  name: monkey-victim\n  namespace: app-namespace\nspec:\n  template:\n    metadata:\n      labels:\n        kube-monkey/enabled: enabled\n        kube-monkey/identifier: monkey-victim\n        kube-monkey/mtbf: '2'\n        kube-monkey/kill-mode: \"fixed\"\n        kube-monkey/kill-value: '1'\n[... omitted ...]\n```\n\nFor newer versions of kubernetes you may need to add the labels to the k8s app metadata as well.\n\n```yaml\n---\napiVersion: apps/v1\nkind: Deployment\nmetadata:\n  name: monkey-victim\n  namespace: app-namespace\n  labels:\n    kube-monkey/enabled: enabled\n    kube-monkey/identifier: monkey-victim\n    kube-monkey/mtbf: '2'\n    kube-monkey/kill-mode: \"fixed\"\n    kube-monkey/kill-value: '1'\nspec:\n  template:\n    metadata:\n      labels:\n        kube-monkey/enabled: enabled\n        kube-monkey/identifier: monkey-victim\n[... 
omitted ...]\n```\n\n### Overriding the apiserver\n#### Use cases:\n* Since client-go does not support [cluster dns](https://github.com/kubernetes/client-go/blob/master/rest/config.go#L331) explicitly with a `// TODO: switch to using cluster DNS.` note in the code, you may need to override the apiserver.\n* If you are running an unauthenticated system, you may need to force the http apiserver endpoint.\n\n#### To override the apiserver specify in the config.toml file\n```toml\n[kubernetes]\nhost=\"https://your-apiserver-url.com:apiport\"\n```\n\n## How kube-monkey works\n\n#### Scheduling time\nScheduling happens once a day on Weekdays - this is when a schedule for terminations for the current day is generated. During scheduling, kube-monkey will:  \n1. Generate a list of eligible k8s apps (k8s apps that have opted-in and are not blacklisted, if specified, and are whitelisted, if specified)\n2. For each eligible k8s app, flip a biased coin (bias determined by `kube-monkey/mtbf`) to determine if a pod for that k8s app should be killed today\n3. For each victim, calculate a random time when a pod will be killed\n\n#### Termination time\nThis is the randomly generated time during the day when a victim k8s app will have a pod killed.\nAt termination time, kube-monkey will:\n1. Check if the k8s app is still eligible (has not opted-out or been blacklisted or removed from the whitelist since scheduling)\n2. Check if the k8s app has updated kill-mode and kill-value\n3. 
Depending on kill-mode and kill-value, execute pods\n\n## Docker Images\n\nDocker images for kube-monkey can be found at [DockerHub](https://hub.docker.com/r/ayushsobti/kube-monkey/tags/)\n\n## Building\n\nClone the repository and build the container.\n\n```bash\ngo get github.com/asobti/kube-monkey\ncd $GOPATH/src/github.com/asobti/kube-monkey\nmake build\nmake container\n```\n\n## Configuring\nkube-monkey is configured by environment variables or a toml file placed at `/etc/kube-monkey/config.toml` and expects the configmap to exist before the kube-monkey deployment.\n\nConfiguration keys and descriptions can be found in [`config/param/param.go`](https://github.com/asobti/kube-monkey/blob/master/internal/pkg/config/param/param.go)\n\n#### Example config.toml file\n```toml\n[kubemonkey]\ndry_run = true                           # Terminations are only logged\nrun_hour = 8                             # Run scheduling at 8am on weekdays\nstart_hour = 10                          # Don't schedule any pod deaths before 10am\nend_hour = 16                            # Don't schedule any pod deaths after 4pm\nblacklisted_namespaces = [\"kube-system\"] # Critical apps live here\ntime_zone = \"America/New_York\"           # Set tzdata timezone example. 
Note the field is time_zone not timezone\n```\n\n#### Example environment variables\n```\nKUBEMONKEY_DRY_RUN=true\nKUBEMONKEY_RUN_HOUR=8\nKUBEMONKEY_START_HOUR=10\nKUBEMONKEY_END_HOUR=16\nKUBEMONKEY_BLACKLISTED_NAMESPACES=kube-system\nKUBEMONKEY_TIME_ZONE=America/New_York\n```\n#### Example Config to test kube-monkey works by enabling debug mode\n\nNote: this will keep attacking pods every 60s regardless of what you configured for `start_hour` and `end_hour`.\n\n```toml\n[debug]\nenabled= true\nschedule_immediate_kill= true\n```\n\n## Notifications\n\nKube-monkey supports notifications and can notify an endpoint of your choice after an attack.\nIt can be a Slack webhook or a custom API.\n\n#### Example Config for posting attack notifications to an HTTP endpoint\n```toml\n[notifications]\n  enabled = true\n  reportSchedule = true\n  [notifications.attacks]\n    endpoint = \"http://url1\"\n    message = \"message1\"\n    headers = [\"header1Key:header1Value\",\"header2Key:header2/Value\"]\n```\n\n#### Placeholders\n\nThe message supports the following placeholders:\n* `{$name}`: victim's name\n* `{$kind}`: victim's kind\n* `{$namespace}`: victim's namespace\n* `{$timestamp}`: attack's time from Unix epoch in milliseconds\n* `{$time}`: attack's time\n* `{$date}`: attack's date\n* `{$error}`: result's error, if any\n* `{$kubemonkeyid}`: kube-monkey id (set via the `KUBE_MONKEY_ID` environment variable; empty otherwise)\n\n```\n  message: '{\n            \"what\": \"Kube-monkey({$kubemonkeyid}) attack of {$name} in {$namespace}\",\n            \"who\": \"{$name}\",\n            \"when\": {$timestamp}\n           }'\n```\n\nThe header supports a special placeholder to retrieve the value of an environment variable.\nThis is useful when calling an API that has a protected endpoint.\nA typical scenario is passing an API token to the kube-monkey container: the token is stored in a Kubernetes Secret and exposed to the container through an environment variable.\n\n```\nheaders = 
[\"api-key:{$env:API_TOKEN}\", \"Content-Type:application/json\"]\n```\n\n`{$env:API_TOKEN}` will be replaced by the value of the `API_TOKEN` environment variable.\n\nNote that if the environment variable does not exist, the notification call will NOT be cancelled. The value will resolve to an empty string, and a warning will show up in the logs.\n\n## Deploying\n\n**Manually**\n1. First, deploy the expected `kube-monkey-config-map` configmap in the namespace you intend to run kube-monkey in (for example, the `kube-system` namespace). Make sure to define the keyname as `config.toml`\n\n\u003e For example `kubectl create configmap km-config --from-file=config.toml=km-config.toml` or `kubectl apply -f km-config.yaml`\n\n2. Run kube-monkey as a k8s app within the Kubernetes cluster, in a namespace that has permissions to kill Pods in other namespaces (e.g. `kube-system`).\n\nSee the [`examples/`](https://github.com/asobti/kube-monkey/tree/master/examples) directory for example Kubernetes yaml files.\n\n3. You should be able to see debug logs with `kubectl logs -f deployment.apps/kube-monkey --namespace=kube-system`, where `deployment.apps/kube-monkey` is the k8s deployment for kube-monkey.\n\n\n**Helm Chart**  \n\nSee [How to install kube-monkey with Helm](helm/kubemonkey/README.md).\n\n## Logging\n\nkube-monkey uses [glog](https://github.com/golang/glog) and supports all command-line features for glog. 
To specify a custom v level or a custom log directory on the pod, see  `args: [\"-v=5\", \"-log_dir=/path/to/custom/log\"]` in the [example deployment file](https://github.com/asobti/kube-monkey/tree/master/examples/deployment.yaml)\n\n\u003e **Standardized glog levels `grep -r V\\([0-9]\\) *`**\n\u003e\n\u003e L0: None\n\u003e\n\u003e L1: Highest Level current status info and Errors with Terminations\n\u003e\n\u003e L2: Successful terminations\n\u003e\n\u003e L3: More detailed schedule status info\n\u003e\n\u003e L4: Debugging verbose schedule and config info\n\u003e\n\u003e L5: Auto-resolved inconsequential issues\n\nMore resources: See the [k8s logging page](https://kubernetes.io/docs/concepts/cluster-administration/logging/) suggesting [community conventions for logging severity](https://github.com/kubernetes/community/blob/master/contributors/devel/logging.md)\n\n## Instructions on how to get this working on OpenShift 3.x\n\n```\ngit clone https://github.com/asobti/kube-monkey.git\ncd examples\noc login http://someserver/ -u system:admin\noc project kube-system\noc create -f configmap.yaml\noc -n kube-system adm policy add-role-to-user -z deployer system:deployer\noc -n kube-system adm policy add-role-to-user -z builder system:image-builder\noc -n kube-system adm policy add-role-to-group system:image-puller system:serviceaccounts:kube-system\noc run kube-monkey --image=docker.io/ayushsobti/kube-monkey:v0.4.0 --command -- /kube-monkey -v=5 -log_dir=/var/log/kube-monkey\noc volume dc/kube-monkey --add --name=kubeconfigmap -m /etc/kube-monkey -t configmap --configmap-name=kube-monkey-config-map\n```\n\n### OpenShift 4.x\n\n```\ngit clone https://github.com/asobti/kube-monkey.git\ncd examples\noc login http://someserver/ -u system:admin\noc project kube-system\noc create -f configmap.yaml\noc -n kube-system adm policy add-cluster-role-to-user edit -z default --rolebinding-name kube-monkey-edit\noc run kube-monkey --image=docker.io/ayushsobti/kube-monkey:v0.3.0 
--command -- /kube-monkey -v=5 -log_dir=/var/log/kube-monkey\noc set volume dc/kube-monkey --add --name=kubeconfigmap -m /etc/kube-monkey -t configmap --configmap-name=kube-monkey-config-map\n```\n\n## Ways to contribute\n\nSee [How to Contribute](CONTRIBUTING.md)\n\n## License\nThis project is licensed under the Apache License v2.0 - see the [LICENSE](LICENSE) file for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fasobti%2Fkube-monkey","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fasobti%2Fkube-monkey","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fasobti%2Fkube-monkey/lists"}