{"id":49571443,"url":"https://github.com/jebinjeb/k8s-evacuator","last_synced_at":"2026-05-03T14:31:19.054Z","repository":{"id":345919221,"uuid":"1187795725","full_name":"jebinjeb/k8s-evacuator","owner":"jebinjeb","description":"Advanced Kubernetes node evacuation tool with safe batching, workload-aware draining, and progressive (controlled) pod eviction.","archived":false,"fork":false,"pushed_at":"2026-03-23T06:39:12.000Z","size":36,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-03-24T03:16:07.974Z","etag":null,"topics":["cloud-native","cluster-operations","devops","k8s","kubectl","kubernetes","node-drain","platform-engineering","pod-eviction","sre","statefulset"],"latest_commit_sha":null,"homepage":"https://github.com/jebinjeb/k8s-evacuation#readme","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jebinjeb.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-21T07:08:08.000Z","updated_at":"2026-03-23T06:43:54.000Z","dependencies_parsed_at":"2026-03-22T01:03:24.324Z","dependency_job_id":null,"html_url":"https://github.com/jebinjeb/k8s-evacuator","commit_stats":null,"previous_names":["jebinjeb/k8s-evacuation","jebinjeb/k8s-evacuator"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/jebinjeb/k8s-evacuator","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jebinjeb%2Fk8s-evacuator",
"tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jebinjeb%2Fk8s-evacuator/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jebinjeb%2Fk8s-evacuator/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jebinjeb%2Fk8s-evacuator/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jebinjeb","download_url":"https://codeload.github.com/jebinjeb/k8s-evacuator/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jebinjeb%2Fk8s-evacuator/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32573320,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-03T06:36:36.687Z","status":"ssl_error","status_checked_at":"2026-05-03T06:36:09.306Z","response_time":103,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cloud-native","cluster-operations","devops","k8s","kubectl","kubernetes","node-drain","platform-engineering","pod-eviction","sre","statefulset"],"created_at":"2026-05-03T14:31:18.397Z","updated_at":"2026-05-03T14:31:19.044Z","avatar_url":"https://github.com/jebinjeb.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🚀 Kubernetes Node 
Evacuator\n\n![Python](https://img.shields.io/badge/python-3.8%2B-blue)\n![Kubernetes](https://img.shields.io/badge/kubernetes-compatible-green)\n![License](https://img.shields.io/badge/license-MIT-blue)\n![Status](https://img.shields.io/badge/status-active-success)\n![Contributions](https://img.shields.io/badge/contributions-welcome-brightgreen)\n![Maintenance](https://img.shields.io/badge/maintained-yes-green)\n\n\u003e 🧠 A smarter, workload-aware alternative to kubectl drain using the Kubernetes Eviction API\n\nA Python-based tool to safely evacuate pods from a Kubernetes node, with **intelligent pod ordering, batch support, StatefulSet handling**, optional **Prometheus metrics**, and live CLI progress tracking.\n\n---\n\n## ✨ Why not just `kubectl drain`?\n\nWhile `kubectl drain` is safe, it is **not workload-aware**:\n\n- Evicts pods in a mostly **flat / unordered way**\n- No control over **batching strategies**\n- No awareness of **resource impact per workload**\n- Limited visibility into **progress and recovery**\n\n👉 This tool solves these problems with **controlled, observable, and intelligent evacuation**.\n\n---\n\n## ⚙️ Eviction Strategy (Production-Safe)\n\nThis tool uses the **Kubernetes Eviction API** — the same mechanism as `kubectl drain`.\n\n### ✅ Guarantees\n\n- ✅ Graceful pod eviction (no abrupt termination)  \n- ✅ Respects **PodDisruptionBudgets (PDB)**  \n- ✅ Waits for **replacement pods to become Ready**  \n- ✅ Ensures workloads reach **desired state before continuing**  \n- ❌ Does **NOT** force delete pods by default  \n\n---\n\n### ⚙️ Behavior\n\n- Uses **`policy/v1` Eviction API**\n- Retries when blocked by PDB (`429 Too Many Requests`)\n- Supports configurable fallback strategies:\n  - Graceful delete\n  - Force delete *(optional, last resort)*\n\n---\n\n\u003e ⚠️ Designed for **zero-downtime or minimal-disruption operations** in production environments.\n\n---\n## Features\n\n- **Safe pod evacuation** without modifying 
Deployment/StatefulSet specs.  \n- **Pod-aware batching**: one-by-one, fixed-size batch, or all-at-once.  \n- **StatefulSet support**: pods are evicted in ordinal order to preserve stability.  \n- **Grouping strategies**:\n  - owner (default for workloads) → evacuates pods workload by workload.\n  - spread → evicts pods from multiple workloads evenly across batches to minimize impact per workload.\n    - Supports controlled execution via `--max-batches` (process N batches and exit).\n- **Pre- and post-checks**: waits for workloads to reach **desired state**.  \n- **Excludes**: DaemonSets, Jobs, completed/failed pods, mirror pods.  \n- **Optional metrics**: push per-pod progress and status to Prometheus Pushgateway.  \n- **Live progress tracking** in CLI.  \n- Fully configurable **timeout** and **retry** logic.\n- **Dry-run mode** for safe validation.\n\n---\n\n## TODO\n\n- **Rollback / Retry**: If eviction fails midway, optionally roll back already evicted pods or retry safely.\n- **Advanced Dry-Run Simulation**: Check if pods can actually be scheduled on other nodes before eviction.\n  - NodeSelector / Affinity rules\n  - Taints \u0026 tolerations\n  - Available CPU / memory\n- Consider per-workload max-unavailable in spread mode to avoid evicting all replicas at once.\n\n## 🚧 Experimental / In-Progress Features\n\nThe following features are under development and may evolve:\n\n### 🔹 `--batch-size 0` (Dynamic Batching)\nAutomatically calculates a safe batch size based on workload size.\n\n- Adapts eviction speed to the number of replicas\n- Prevents over-eviction of small workloads\n- Designed to maintain minimum availability during evacuation\n\n---\n\n### ⚡ `--evict-all-safe` (Fast Path Eviction)\nEvicts all pods in a workload at once when considered safe.\n\n- Useful for stateless or highly replicated workloads\n- Skips batching for faster node evacuation\n- Automatically avoids unsafe scenarios (e.g., StatefulSets)\n\n---\n\n### 🛡️ `--respect-pdb` (PDB-Aware 
Eviction)\nAdjusts eviction behavior based on PodDisruptionBudgets.\n\n- Uses Kubernetes PDB limits to determine safe eviction count\n- Prevents disruption beyond allowed thresholds\n- Aligns evacuation strategy with cluster safety policies\n\n---\n\n\u003e ⚠️ These features are experimental and may change in future releases.\n\n\n## 📦 Installation\n\n### 🔹 Prerequisites\n\n- Python **3.8+**\n- Access to a Kubernetes cluster\n- `kubeconfig` configured (e.g., `~/.kube/config`)  \n  or running inside a cluster (in-cluster config)\n\n---\n\n### 🔹 Install Dependencies\n\n```bash\npip install kubernetes prometheus-client\n```\n\n### 🔹 Run the Tool\n```bash\npython k8s_evacuator.py --node \u003cnode_name\u003e\n```\n\n## 🏗️ Architecture\n\n```mermaid\nflowchart TD\n    A[User CLI] --\u003e B[Evacuator Engine]\n\n    B --\u003e C[Cordon Node]\n    B --\u003e D[Fetch Pods on Node]\n\n    D --\u003e E[Filter Pods]\n    E --\u003e F{Grouping Strategy}\n\n    F --\u003e|owner| G[Group by Workload]\n    F --\u003e|spread| H[Spread Across Workloads]\n\n    G --\u003e I[Sort Pods]\n    H --\u003e I\n\n    I --\u003e J{Batching Strategy}\n\n    J --\u003e|fixed| K[Static Batching]\n    J --\u003e|dynamic| L[Dynamic Batch Calculation]\n    J --\u003e|evict-all-safe| M[Fast Path Eviction]\n\n    K --\u003e N[Eviction Loop]\n    L --\u003e N\n    M --\u003e N\n\n    N --\u003e O[Evict via Eviction API]\n\n    O --\u003e|PDB Block 429| P[Retry / Backoff]\n    P --\u003e O\n\n    O --\u003e Q[Scheduler]\n    Q --\u003e R[New Pod Placement]\n\n    R --\u003e S[Replacement Pods]\n    S --\u003e T[Readiness Check]\n\n    T --\u003e U[Desired State Validation]\n\n    U --\u003e|Not Ready| N\n    U --\u003e|Ready| V[Next Batch]\n\n    B --\u003e W[Prometheus Pushgateway]\n    W --\u003e X[Metrics]\n\n    style A fill:#f9f,stroke:#333\n    style B fill:#bbf,stroke:#333\n    style O fill:#fbb,stroke:#333\n```\n\n## 📊 Metrics (Prometheus Pushgateway)\n\nThis tool can optionally push per-pod 
evacuation metrics to a Prometheus Pushgateway if `prometheus_client` is installed and the `--pushgateway` flag is provided.\n\n### Metric\n\n| Metric Name                | Labels                     | Description                                                                 |\n|-----------------------------|----------------------------|-----------------------------------------------------------------------------|\n| `evacuation_pod_status`     | `pod`, `namespace`, `status` | Tracks the status of each pod during evacuation. `status` can be: `evicted`, `ready`, or `failed`. |\n\n### Status Labels\n\n- **evicted** → Pod eviction has been initiated.  \n- **ready** → Replacement pod is running and ready.  \n- **failed** → Pod eviction failed or replacement pod did not become ready.  \n\n### Usage\n\n```bash\npython k8s_evacuator.py --node \u003cnode_name\u003e --pushgateway http://pushgateway.example.com:9091\n```\n\n## 📊 Metrics TODO\n\n### 🚀 1. Pod Movement Metric\n\nTracks how pods move between nodes during evacuation.\n\n#### Metric Definition\n\n| Metric Name | Labels | Description |\n|------------|--------|------------|\n| `evacuation_pod_movement` | `old_pod`, `old_node`, `new_pod`, `new_node`, `namespace`, `status` | Tracks pod migration from source node to destination node |\n\n#### Label Details\n\n| Label | Description | Example |\n|------|------------|--------|\n| `old_pod` | Original pod name | `nginx-abc123` |\n| `old_node` | Source node (hostname) | `k3s-lab-worker` |\n| `new_pod` | New pod name after rescheduling | `nginx-xyz789` |\n| `new_node` | Destination node | `k3s-lab-worker-2` |\n| `namespace` | Kubernetes namespace | `default` |\n| `status` | Movement result | `moved`, `failed` |\n\n#### Examples\n\n| Scenario | Metric |\n|--------|--------|\n| Deployment Pod | `evacuation_pod_movement{old_pod=\"nginx-abc\", old_node=\"node1\", new_pod=\"nginx-def\", new_node=\"node2\", namespace=\"default\", status=\"moved\"} 1` |\n| StatefulSet Pod | 
`evacuation_pod_movement{old_pod=\"mysql-0\", old_node=\"node1\", new_pod=\"mysql-0\", new_node=\"node2\", namespace=\"db\", status=\"moved\"} 1` |\n| Failed Evacuation | `evacuation_pod_movement{old_pod=\"redis-abc\", old_node=\"node1\", new_pod=\"unknown\", new_node=\"unknown\", namespace=\"cache\", status=\"failed\"} 1` |\n\n---\n\n### 📦 2. Pod Status Metric\n\nTracks the lifecycle of pods during evacuation.\n\n#### Metric Definition\n\n| Metric Name | Labels | Description |\n|------------|--------|------------|\n| `evacuation_pod_status` | `pod`, `namespace`, `status` | Tracks pod state transitions during evacuation |\n\n#### Label Details\n\n| Label | Description | Example |\n|------|------------|--------|\n| `pod` | Pod name | `nginx-abc123` |\n| `namespace` | Kubernetes namespace | `default` |\n| `status` | Pod state | `evicted`, `ready`, `failed` |\n\n#### Examples\n\n| Status | Metric |\n|-------|--------|\n| Evicted | `evacuation_pod_status{pod=\"nginx-abc\", namespace=\"default\", status=\"evicted\"} 1` |\n| Ready | `evacuation_pod_status{pod=\"nginx-def\", namespace=\"default\", status=\"ready\"} 1` |\n\n---\n\n### ⏱️ 3. 
Pod Rescheduling Duration (Optional)\n\nTracks time taken for pods to become ready after eviction.\n\n#### Metric Definition\n\n| Metric Name | Labels | Description |\n|------------|--------|------------|\n| `evacuation_pod_reschedule_duration_seconds` | `pod`, `namespace` | Time taken for pod rescheduling |\n\n#### Example\n\n| Metric Type | Example |\n|------------|--------|\n| Bucket | `evacuation_pod_reschedule_duration_seconds_bucket{pod=\"nginx\", namespace=\"default\", le=\"5\"} 1` |\n| Sum | `evacuation_pod_reschedule_duration_seconds_sum{pod=\"nginx\", namespace=\"default\"} 3.2` |\n| Count | `evacuation_pod_reschedule_duration_seconds_count{pod=\"nginx\", namespace=\"default\"} 1` |\n\n---\n\n### 🔍 Useful Prometheus Queries\n\n| Use Case | Query |\n|--------|------|\n| Pods moved from a node | `evacuation_pod_movement{old_node=\"k3s-lab-worker\", status=\"moved\"}` |\n| Distribution across nodes | `sum by (new_node) (evacuation_pod_movement{status=\"moved\"})` |\n| Failed evacuations | `evacuation_pod_movement{status=\"failed\"}` |\n| Total evicted pods | `count(evacuation_pod_status{status=\"evicted\"})` |\n\n---\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjebinjeb%2Fk8s-evacuator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjebinjeb%2Fk8s-evacuator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjebinjeb%2Fk8s-evacuator/lists"}