{"id":49888893,"url":"https://github.com/23seriy/devops-ai-workflows","last_synced_at":"2026-06-03T03:00:44.653Z","repository":{"id":355098255,"uuid":"1225195521","full_name":"23seriy/devops-ai-workflows","owner":"23seriy","description":"Curated collection of AI-agent workflows, prompts \u0026 rules for DevOps/SRE — Kubernetes debugging, AWS audits, Terraform plan reviews, CI/CD triage, Dockerfile reviews, secrets scanning \u0026 incident response. Works with Windsurf, Cursor, Claude Code or any LLM.","archived":false,"fork":false,"pushed_at":"2026-06-02T19:32:08.000Z","size":224,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-02T21:24:45.308Z","etag":null,"topics":["ai-agent","ai-workflows","aws","chatops","cicd","cursor","devops","docker","incident-response","kubernetes","llm","observability","platform-engineering","prompts","security","sre","terraform","windsurf"],"latest_commit_sha":null,"homepage":"","language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/23seriy.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-30T03:26:52.000Z","updated_at":"2026-06-02T19:32:13.000Z","dependencies_parsed_at":"2026-05-15T20:01:01.490Z","dependency_job_id":null,"html_url":"https://github.com/23seriy/devops-ai-workflows","commit_stats":null,"previous_names":["23seriy/devops-ai-workflows"],"tags_count":1,"template":false,"temp
late_full_name":null,"purl":"pkg:github/23seriy/devops-ai-workflows","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/23seriy%2Fdevops-ai-workflows","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/23seriy%2Fdevops-ai-workflows/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/23seriy%2Fdevops-ai-workflows/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/23seriy%2Fdevops-ai-workflows/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/23seriy","download_url":"https://codeload.github.com/23seriy/devops-ai-workflows/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/23seriy%2Fdevops-ai-workflows/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33845770,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-03T02:00:06.370Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agent","ai-workflows","aws","chatops","cicd","cursor","devops","docker","incident-response","kubernetes","llm","observability","platform-engineering","prompts","security","sre","terraform","windsurf"],"created_at":"2026-05-15T20:00:51.943Z","updated_at":"2026-06-03T03:00:44.645Z","avatar_url":"https://github.com/23seriy.png","language":"Shell","fu
nding_links":[],"categories":[],"sub_categories":[],"readme":"# devops-ai-workflows\n\nA growing collection of **AI-agent workflows, prompts, and rules** for day-to-day DevOps / SRE / platform work.\n\n\u003e Note: \"workflows\" here means **Claude Code slash commands / AI-agent workflows** — *not* GitHub Actions.\n\n## What's inside\n\n| Folder | Purpose | Audience |\n|---|---|---|\n| [`.claude/commands/`](./.claude/commands) | Workflow definitions, auto-discovered as slash commands by Claude Code. | Everyone |\n| [`prompts/`](./prompts) | Reusable system / task prompts (incident triage, code review, post-mortem, etc.) | Any LLM |\n| [`rules/`](./rules) | Reusable safety rule sets to load into Claude Code (via `CLAUDE.md` `@`-reference) or any other agent | Any agent |\n| [`scripts/`](./scripts) | Standalone shell scripts referenced by workflows | Anyone with a shell |\n\n## Available workflows\n\n### Kubernetes\n\n| Workflow | Slash command | Description | Prerequisites |\n|---|---|---|---|\n| [k8s-debug](./.claude/commands/k8s-debug.md) | `/k8s-debug` | General-purpose, read-only cluster diagnostics across nodes, pods, workloads, networking, storage, RBAC, events, and resource pressure. | `kubectl`. Optional: `jq`, metrics-server. |\n| [k8s-workload-debug](./.claude/commands/k8s-workload-debug.md) | `/k8s-workload-debug` | Deep-dive on a single Deployment / StatefulSet / DaemonSet / Job / Pod: rollout, spec, probes, resources, logs, networking, storage, config. | `kubectl`. Optional: `jq`, metrics-server. |\n| [k8s-rbac-audit](./.claude/commands/k8s-rbac-audit.md) | `/k8s-rbac-audit` | RBAC risk audit — wildcards, cluster-admin bindings, risky verb/resource combos, over-privileged ServiceAccounts, anonymous access. | `kubectl`, `jq`. Optional: `kubectl-who-can`. 
|\n| [k8s-cost-hotspots](./.claude/commands/k8s-cost-hotspots.md) | `/k8s-cost-hotspots` | Find waste: over-provisioned workloads, missing requests/limits, idle workloads, orphan PVCs/PVs, idle LoadBalancers. | `kubectl`, `jq`, metrics-server. |\n| [k8s-upgrade-readiness](./.claude/commands/k8s-upgrade-readiness.md) | `/k8s-upgrade-readiness` | Pre-flight before a control-plane / node upgrade: deprecated APIs, version skew, PDB gaps, expiring certs, broken webhooks. | `kubectl`. Optional: `kubent` or `pluto`, `helm`. |\n| [helm-release-debug](./.claude/commands/helm-release-debug.md) | `/helm-release-debug` | Diagnose a stuck or failed Helm release: history, values diff, hook failures, rendered manifest vs cluster, workload health. | `helm` v3, `kubectl`. Optional: `jq`, `yq`. |\n| [helm-chart-review](./.claude/commands/helm-chart-review.md) | `/helm-chart-review` | Review a Helm chart for security, reliability, and best practices: resource specs, probes, security context, PDBs, anti-affinity, RBAC. | Helm chart source. Optional: `helm` CLI. |\n\n### AWS / Cloud\n\n| Workflow | Slash command | Description | Prerequisites |\n|---|---|---|---|\n| [aws-account-audit](./.claude/commands/aws-account-audit.md) | `/aws-account-audit` | Read-only AWS account security \u0026 hygiene audit: IAM, S3, EC2, RDS, CloudTrail, encryption, GuardDuty, SecurityHub. | `aws` CLI. Optional: `jq`. |\n| [aws-cost-quickscan](./.claude/commands/aws-cost-quickscan.md) | `/aws-cost-quickscan` | Find AWS cost waste: idle EC2/RDS, unattached EBS, old snapshots, expensive log groups, NAT data processing, missing Savings Plans. | `aws` CLI, Cost Explorer enabled. Optional: `jq`. |\n| [aws-vpc-debug](./.claude/commands/aws-vpc-debug.md) | `/aws-vpc-debug` | Diagnose VPC connectivity: trace path across SGs, NACLs, route tables, NAT/IGW/TGW, VPC endpoints, DNS, and flow logs. | `aws` CLI. Optional: `jq`, `dig`. 
|\n| [aws-iam-policy-review](./.claude/commands/aws-iam-policy-review.md) | `/aws-iam-policy-review` | Explain an IAM policy and flag risks: admin-equivalent access, privilege escalation paths, wildcard actions, missing conditions. | `aws` CLI. Optional: `jq`. |\n\n### IaC\n\n| Workflow | Slash command | Description | Prerequisites |\n|---|---|---|---|\n| [terraform-plan-review](./.claude/commands/terraform-plan-review.md) | `/terraform-plan-review` | Explain a Terraform plan and flag risky changes: destroys, replacements, security group mutations, IAM changes, blast radius. | `terraform plan` output. Optional: `terraform` CLI, `jq`. |\n\n### Containers \u0026 CI/CD\n\n| Workflow | Slash command | Description | Prerequisites |\n|---|---|---|---|\n| [ci-debug](./.claude/commands/ci-debug.md) | `/ci-debug` | Diagnose a failing CI/CD pipeline: parse build logs from Jenkins, GitHub Actions, GitLab CI, or Bitbucket Pipelines. Root cause analysis and fix suggestions. | Build log output. Optional: repo source, CI config file. |\n| [jenkins-pipeline-review](./.claude/commands/jenkins-pipeline-review.md) | `/jenkins-pipeline-review` | Review Jenkinsfile / shared-library Groovy for security risks, anti-patterns, missing error handling, credential leaks, CPS issues, and build config cross-references. | Jenkinsfile(s) or `vars/*.groovy`. Optional: `repositories_v2.json`. |\n| [release-checklist](./.claude/commands/release-checklist.md) | `/release-checklist` | Pre-release safety gate: scope, deploy order, rollback, tests, monitoring, and communication before production release. | PR/diff summary. Optional: test results, plans, diffs. |\n| [dockerfile-review](./.claude/commands/dockerfile-review.md) | `/dockerfile-review` | Review Dockerfiles for security, size, caching, and best practices. Flags CVE-prone bases, leaked secrets, missing health checks. | Dockerfile(s). Optional: `docker`, `trivy`. 
|\n\n### Security\n\n| Workflow | Slash command | Description | Prerequisites |\n|---|---|---|---|\n| [secrets-leak-scan](./.claude/commands/secrets-leak-scan.md) | `/secrets-leak-scan` | Scan git repo history for leaked secrets: API keys, passwords, tokens, private keys. Uses gitleaks, trufflehog, or regex fallback. | Git repo. Optional: `gitleaks`, `trufflehog`. |\n| [repo-health](./.claude/commands/repo-health.md) | `/repo-health` | Audit repository hygiene: README, license, CI, branch/release hygiene, tracked secrets, ownership, and automation gaps. | Local git repo. Optional: `gh`, `jq`. |\n\n### Observability \u0026 Incident\n\n| Workflow | Slash command | Description | Prerequisites |\n|---|---|---|---|\n| [incident-triage](./.claude/commands/incident-triage.md) | `/incident-triage` | Guided first 15 minutes of a production incident: timeline, blast radius, evidence gathering, mitigation suggestions. | Access to affected environment. |\n\nMore on the way — see [Roadmap](#roadmap).\n\n## Prompts\n\nReusable system prompts you can paste into any AI agent for common DevOps tasks:\n\n| Prompt | What it does |\n|---|---|\n| [incident-commander](./prompts/incident-commander.md) | Puts the AI in incident-commander mode: timeline, blast radius, action tracking, status updates. |\n| [postmortem-writer](./prompts/postmortem-writer.md) | Generates a blameless post-mortem from incident notes: timeline, root cause, impact, action items. |\n| [code-review-devops](./prompts/code-review-devops.md) | Reviews IaC / pipeline / Docker / K8s code with a security-first DevOps lens. |\n| [pr-description](./prompts/pr-description.md) | Generates a PR description from a diff: what, why, how, testing, risk, rollback plan. |\n| [explain-like-a-senior](./prompts/explain-like-a-senior.md) | Explains infrastructure code to junior engineers: what it does, why, gotchas, and how it fits together. 
|\n| [runbook-from-incident](./prompts/runbook-from-incident.md) | Converts incident notes or post-mortems into reusable runbooks with diagnosis, mitigation, escalation, and follow-up steps. |\n\n## Rules\n\nReusable, agent-agnostic safety rule sets. Reference them from a project's `CLAUDE.md` (e.g. `@rules/kubernetes.md`), paste into a system prompt, or include as context:\n\n| Rule file | What it does |\n|---|---|\n| [devops-agent.md](./rules/devops-agent.md) | Safety guardrails for AI in DevOps repos: never modify prod without confirmation, prefer read-only, never hardcode secrets, always check context, GitOps awareness, multi-repo coordination. |\n| [terraform.md](./rules/terraform.md) | Terraform-specific: state safety, ForceNew attribute warnings, provider/module pinning, workspace safety, import workflow, `prevent_destroy` reminders. |\n| [kubernetes.md](./rules/kubernetes.md) | Kubernetes-specific: context verification, dry-run first, Helm safety, ArgoCD/GitOps awareness, secret handling, debugging approach, RBAC best practices. |\n\n## Scripts\n\nStandalone shell utilities referenced by workflows or useful on their own:\n\n| Script | Usage |\n|---|---|\n| [k8s-snapshot.sh](./scripts/k8s-snapshot.sh) | `./k8s-snapshot.sh [namespace\\|all] [output-dir]` — dump cluster state (nodes, pods, events, services, top) to a timestamped Markdown file. |\n| [aws-whoami.sh](./scripts/aws-whoami.sh) | `./aws-whoami.sh [profile]` — quick AWS identity check: caller, region, account alias, org, SSO role. |\n| [stale-branches.sh](./scripts/stale-branches.sh) | `./stale-branches.sh [days] [--remote]` — list git branches older than N days with last commit info. |\n| [validate-repo.sh](./scripts/validate-repo.sh) | `./scripts/validate-repo.sh` — validate workflow frontmatter, README links, script executability, and optional lint checks. |\n\n## Using a workflow\n\n### In Claude Code\n\nClone the repo and run Claude Code from the repo root. 
Every workflow under [`.claude/commands/`](./.claude/commands) is auto-discovered as a slash command — `.claude/commands/k8s-debug.md` is invoked as `/k8s-debug`, etc.\n\n### In other AI agents\n\nOpen the matching file in [`.claude/commands/`](./.claude/commands) and either:\n\n- paste the relevant section into the agent's chat, or\n- include the file as context and ask the agent to follow it.\n\n### As a plain human workflow\n\nEvery workflow is just Markdown with shell commands. You can run the steps yourself in a terminal — no AI required.\n\n## Repo layout\n\n```\ndevops-ai-workflows/\n├── .claude/commands/        # Workflow definitions (Claude Code slash commands, flat)\n├── prompts/                 # Reusable LLM prompts\n├── rules/                   # Editor/agent rule files\n├── scripts/                 # Standalone shell helpers\n├── CONTRIBUTING.md\n├── LICENSE\n└── README.md\n```\n\n## Roadmap\n\nIdeas I plan to add (PRs welcome):\n\n**AWS / cloud**\n- [ ] `/aws-eks-debug` — bridge EKS + Kubernetes: node groups, OIDC, add-ons, IAM roles for service accounts\n- [ ] `/aws-rds-health` — RDS/Aurora diagnostics: events, metrics, parameter groups, replication lag\n- [ ] `/aws-lambda-debug` — Lambda diagnostics: errors, throttles, DLQ, VPC/ENI, CloudWatch logs\n- [ ] `/aws-ecs-service-debug` — ECS/Fargate service rollout failures: task events, target group health, IAM roles\n\n**IaC**\n- [ ] `/terraform-state-debug` — diagnose locks, drift, orphans\n- [ ] `/iac-secrets-scan` — repo-wide hardcoded-secret sweep\n\n**Containers \u0026 CI/CD**\n- [ ] `/image-cve-triage` — prioritise CVE scanner output by exploitability + fix availability\n- [ ] `/github-actions-review` — security review of GitHub Actions workflow files\n\n**Observability \u0026 incident**\n- [ ] `/prometheus-query-helper` — intent → PromQL with rationale\n- [ ] `/log-pattern-extract` — cluster repeated errors out of a log dump\n- [ ] `/postmortem` — blameless post-mortem from a transcript\n- [ ] 
`/runbook-from-incident` — turn a resolved incident into a reusable runbook\n\n**Networking / database**\n- [ ] `/dns-debug` — multi-resolver dig, propagation, DNSSEC\n- [ ] `/tls-cert-audit` — chain inspection, expiry, weak ciphers across a list of hosts\n- [ ] `/postgres-health` — bloat, long queries, replication lag, missing indexes\n- [ ] `/redis-health` — memory pressure, slow log, persistence config, eviction patterns\n- [ ] `/db-migration-review` — flag risky migration patterns\n\n**Security \u0026 repo hygiene**\n- [ ] `/cve-impact-assessment` — given a CVE, check whether your stack is affected\n- [x] `/repo-health` — README, license, CI, branch protection, stale branches\n- [ ] `/dependency-upgrade-plan` — group outdated deps by risk and suggest batching\n\n## Contributing\n\nSee [CONTRIBUTING.md](./CONTRIBUTING.md). The short version:\n\n1. Add the canonical workflow to `.claude/commands/\u003cname\u003e.md`.\n2. Update the **Available workflows** table in this README.\n3. Keep workflows **read-only by default**. Anything mutating must be opt-in (e.g. a `DEEP=yes` flag) and clearly flagged.\n\n## License\n\n[MIT](./LICENSE) — use freely, attribution appreciated but not required.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F23seriy%2Fdevops-ai-workflows","html_url":"https://awesome.ecosyste.ms/projects/github.com%2F23seriy%2Fdevops-ai-workflows","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F23seriy%2Fdevops-ai-workflows/lists"}