awesome-cloud-native
A curated list of cloud native tools that are actually production-ready. Curated by Podo Stack.
https://github.com/igusev/awesome-cloud-native
🏗️ Platform Engineering
- OpenFeature - Does for feature flags what OpenTelemetry did for observability. One API, swap LaunchDarkly for Flagsmith with a single line — the hundreds of flag checks scattered across your codebase stay untouched. `CNCF Incubating`
- Backstage - Software catalog that finally answers "who owns the payment service?" Developers drop a `catalog-info.yaml` next to their code, Backstage auto-discovers it, and the Scaffolder turns a three-day new-service bootstrap into three minutes. `CNCF Incubating`
- Dapr - Sidecar that exposes distributed systems patterns (state, pub/sub, secrets, service invocation) through plain HTTP or gRPC calls. Swap Redis for Postgres with a YAML change, not a code rewrite. `CNCF Graduated`
- Crossplane - Your cloud resources become Kubernetes objects with a continuous reconciliation loop. Someone edits an RDS instance by hand? Crossplane fixes it back — no Terraform drift, no CI wrapper. `CNCF Graduated`
- Kargo - Continuous promotion engine from the Argo CD team. Warehouse → Freight → Stage → Promotion makes GitOps promotion declarative instead of a shell script that rewrites `values.yaml`.
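
The reconciliation loop mentioned in the Crossplane entry can be sketched in a few lines. This is a generic illustration of the pattern, not Crossplane's implementation; `FakeCloud`, `reconcile`, and the resource names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FakeCloud:
    """Hypothetical stand-in for a cloud API; holds observed resource state."""
    resources: dict = field(default_factory=dict)


def reconcile(desired: dict, cloud: FakeCloud) -> list[str]:
    """Make observed state match desired state and return the actions taken."""
    actions = []
    for name, spec in desired.items():
        observed = cloud.resources.get(name)
        if observed is None:
            cloud.resources[name] = dict(spec)
            actions.append(f"create {name}")
        elif observed != spec:
            cloud.resources[name] = dict(spec)  # revert manual drift
            actions.append(f"update {name}")
    return actions


desired = {"orders-db": {"engine": "postgres", "size": "db.t3.medium"}}
cloud = FakeCloud()
print(reconcile(desired, cloud))  # first pass creates the resource
cloud.resources["orders-db"]["size"] = "db.t3.large"  # someone edits by hand
print(reconcile(desired, cloud))  # next pass reverts the edit
print(reconcile(desired, cloud))  # steady state: nothing to do
```

A real controller runs this comparison continuously against live APIs; the point here is only that drift is corrected by comparing desired and observed state on every pass.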
⚡ Autoscaling
- Cluster Autoscaler - The battle-tested autoscaler that works through Node Groups. Slower than Karpenter but multi-cloud and familiar.
- Karpenter - Provisions the exact node your pods need in seconds, not minutes. No node groups — just right-sized instances from any available type. `AWS` `GCP`
- FinOps patterns
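
To illustrate the "right-sized instance" idea from the Karpenter entry, here is a minimal sketch that picks the cheapest instance type able to hold a set of pending pods. The catalog, prices, and `pick_instance` are hypothetical, and real provisioners weigh many more constraints (zones, taints, spot availability).

```python
# Hypothetical catalog: (name, vCPU, memory GiB, hourly price in USD).
CATALOG = [
    ("small", 2, 4, 0.05),
    ("medium", 4, 16, 0.17),
    ("large", 8, 32, 0.34),
]


def pick_instance(pods, catalog=CATALOG):
    """Return the cheapest instance type that fits all pending pods, or None."""
    cpu = sum(p["cpu"] for p in pods)
    mem = sum(p["mem_gib"] for p in pods)
    fitting = [t for t in catalog if t[1] >= cpu and t[2] >= mem]
    return min(fitting, key=lambda t: t[3])[0] if fitting else None


pending = [{"cpu": 1.5, "mem_gib": 6}, {"cpu": 1, "mem_gib": 4}]
print(pick_instance(pending))
```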
📊 Observability
📈 Metrics & Telemetry
- VictoriaMetrics - Drop-in Prometheus-compatible TSDB that's significantly leaner on memory and disk. Built for billion-series workloads without the Thanos/Cortex operational complexity.
- Thanos - Long-term storage, global query, and HA for Prometheus. Writes TSDB blocks to S3-compatible object storage and queries across them plus live Prometheus in one PromQL. `CNCF Incubating`
- OpenTelemetry Collector - Vendor-neutral telemetry gateway for traces, metrics, and logs. Receivers, processors, and exporters as pluggable modules — ship to Prometheus, Loki, Tempo, Datadog, or anything OTLP-compatible. `CNCF Graduated`
- Grafana Alloy - One agent to replace Prometheus, Promtail, and an OTel Collector. Programmable config language and built-in clustering that distributes scrape targets across instances — vendor-neutral despite the Grafana branding.
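
One common way to distribute scrape targets across clustered agents, as the Grafana Alloy entry describes, is rendezvous hashing. The sketch below shows that general technique under the assumption of a static instance list; it is not taken from Alloy's code, and `owner` and `assign` are hypothetical names.

```python
import hashlib


def owner(target: str, instances: list[str]) -> str:
    """Pick the instance with the highest hash score for this target."""
    def score(instance: str) -> int:
        digest = hashlib.sha256(f"{instance}|{target}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    return max(instances, key=score)


def assign(targets, instances):
    """Map each instance to the scrape targets it owns."""
    plan = {i: [] for i in instances}
    for t in targets:
        plan[owner(t, instances)].append(t)
    return plan


targets = [f"pod-{n}:9090" for n in range(6)]
for instance, owned in assign(targets, ["agent-a", "agent-b", "agent-c"]).items():
    print(instance, owned)
```

With this scheme, removing an instance only moves the targets that instance owned; every other target keeps its owner.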
🔥 Continuous Profiling
- Pyroscope - Continuous profiling with flame graphs and differential profiling — compare today's deployment to yesterday's and find which function started eating CPU. Part of Grafana Labs, supports SDK-based and eBPF collection across most languages.
- Parca - Pure-eBPF continuous profiling with zero instrumentation and under 1% overhead. Profiles land in FrostDB (columnar) so you can diff two deploys and see exactly which function changed. `CNCF Sandbox`
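
Differential profiling, which both entries above mention, boils down to subtracting per-function sample counts between two profiles. A minimal sketch, assuming each profile is already flattened to a function-to-samples mapping (the function names and numbers are made up):

```python
def diff_profiles(before: dict, after: dict) -> list[tuple[str, int]]:
    """Return (function, sample delta) pairs, largest increase first."""
    names = set(before) | set(after)
    deltas = [(n, after.get(n, 0) - before.get(n, 0)) for n in names]
    return sorted(deltas, key=lambda d: (-d[1], d[0]))


yesterday = {"parse_json": 120, "render": 300, "db_query": 80}
today = {"parse_json": 125, "render": 290, "db_query": 410}
for name, delta in diff_profiles(yesterday, today):
    print(f"{name:12} {delta:+d}")
```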
📜 Kyverno Policies
🔐 Supply Chain & Runtime Security
- Trivy - Single-binary CVE scanner for container images, IaC, and filesystems. No daemon, no config — drop it into CI and fail the build on criticals before they reach a registry.
- Falco - Watches Linux syscalls via eBPF while your containers run — every file open, every process spawn, every network connection. Real-time threat detection in the kernel, not "scan and report later". `CNCF Graduated`
- cosign - Signs container images like a wax seal on a letter — break the signature, everyone knows. Part of Sigstore, and keyless mode with GitHub OIDC means no private keys to manage. `Sigstore`
- Tetragon - Doesn't just watch syscalls: it can SIGKILL the offending process from inside the kernel before the call completes. From the Cilium team, under 1% overhead, and Kubernetes-aware policies as CRDs instead of Lua scripts.
- Chainguard Images - Distroless-style base images with daily rebuilds and aggressive CVE tracking. About 5 CVEs per year instead of Alpine's 150 — nothing to exploit because there's almost nothing there.
- Falco vs Tetragon - kernel enforcement
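
The "fail the build on criticals" step from the Trivy entry can be sketched as a small gate over a JSON scan report. The sketch assumes a report with a top-level `Results` list whose items carry an optional `Vulnerabilities` list with `VulnerabilityID` and `Severity` fields; the sample data and the `critical_findings` and `gate` helpers are hypothetical.

```python
import json


def critical_findings(report: dict) -> list[str]:
    """Collect the IDs of all CRITICAL vulnerabilities in a scan report."""
    found = []
    for result in report.get("Results") or []:
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") == "CRITICAL":
                found.append(vuln.get("VulnerabilityID", "unknown"))
    return found


def gate(report: dict) -> int:
    """Return a CI exit code: 1 if any critical finding exists, else 0."""
    return 1 if critical_findings(report) else 0


# Made-up report with placeholder IDs, shaped like the assumed JSON layout.
sample = json.loads("""
{"Results": [
  {"Target": "app:latest", "Vulnerabilities": [
    {"VulnerabilityID": "EXAMPLE-0001", "Severity": "CRITICAL"},
    {"VulnerabilityID": "EXAMPLE-0002", "Severity": "LOW"}
  ]},
  {"Target": "Dockerfile"}
]}
""")
print(critical_findings(sample), gate(sample))
```

In practice the scanner's own `--severity` and `--exit-code` flags cover the simple case; a script like this is mainly useful when the gate needs custom logic such as allowlists.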
🌐 Networking & Service Mesh
- Istio Ambient - Service mesh without sidecars. Uses a node-level ztunnel for L4 and on-demand waypoint proxies for L7 — pay only for what you need. `CNCF Graduated`
- Cilium - Replaces kube-proxy with eBPF — O(1) lookups instead of walking iptables chains. Also does identity-based security and multi-cluster mesh. `CNCF Graduated`
- Deep dive - proxy replacement, Hubble, identity policies, egress gateway, cluster mesh
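
The O(1) claim in the Cilium entry comes down to a hash-map lookup versus a sequential rule walk. The following toy comparison illustrates the difference in lookup cost; it models neither iptables nor eBPF maps in detail, and the addresses and backend names are made up.

```python
def chain_lookup(rules, key):
    """Walk rules in order, as a sequential chain does; also count comparisons."""
    for steps, (match, backend) in enumerate(rules, start=1):
        if match == key:
            return backend, steps
    return None, len(rules)


def map_lookup(table, key):
    """Single hash lookup, independent of how many services exist."""
    return table.get(key)


services = [((f"10.96.0.{i}", 80), f"backend-{i}") for i in range(1, 201)]
table = dict(services)
key = ("10.96.0.200", 80)
print(chain_lookup(services, key))  # last rule: 200 comparisons
print(map_lookup(table, key))
```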
🧩 Runtime
⚙️ Workload Scheduling
- Koordinator - Colocation scheduler that runs best-effort batch jobs on the CPU your latency-sensitive services reserved but aren't actually using. Alibaba reports ~15% → 50%+ cluster utilization in production, with hardware-level LLC and memory-bandwidth isolation to stop noisy neighbors. `CNCF Sandbox`
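
The colocation idea can be sketched as simple arithmetic: CPU that latency-sensitive pods requested but are not using, minus a safety buffer, becomes capacity for best-effort work. The formula and the 10% default below are illustrative assumptions, not Koordinator's actual calculation.

```python
def reclaimable_cpu(requested_m: int, used_m: int, margin_percent: int = 10) -> int:
    """Millicores requested by latency-sensitive pods but idle right now."""
    idle = requested_m - used_m
    buffer = requested_m * margin_percent // 100  # headroom for sudden bursts
    return max(0, idle - buffer)


# A node where services reserved 8 cores but currently use 1.2 of them.
print(reclaimable_cpu(requested_m=8000, used_m=1200))
```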
🚀 GitOps
- Flux - GitOps toolkit that actually waits for your deployments to be ready, not just applied. Handles Helm, Kustomize, and multi-tenancy. `CNCF Graduated`
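
The difference between "applied" and "ready" can be illustrated with a polling loop that only reports success once the workload's status says so. This is a generic sketch, not Flux's implementation; `get_status` is a hypothetical callback standing in for a real health check.

```python
import time


def wait_until_ready(get_status, timeout_s: float = 5.0, interval_s: float = 0.01) -> bool:
    """Poll get_status() until it returns "Ready" or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if get_status() == "Ready":
            return True
        time.sleep(interval_s)
    return False


# Simulated rollout: the manifest is applied first and becomes ready later.
statuses = iter(["Applied", "Progressing", "Ready"])
print(wait_until_ready(lambda: next(statuses)))
```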
📨 Messaging
- RabbitMQ Cluster Operator - Declarative RabbitMQ on Kubernetes — nodes, quorum queues, streams, users, and broker policies as CRDs. Raft-backed data safety, streams for replay, and non-voter replicas that decouple durability from consensus latency.
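
Why non-voter replicas do not add to consensus latency follows from the general Raft commit rule: a write commits once a strict majority of voters acknowledges it, and non-voters are simply not counted. A minimal sketch of that rule, with hypothetical node names and no claim about RabbitMQ internals:

```python
def is_committed(acks: set[str], voters: set[str]) -> bool:
    """A write commits once a strict majority of voters has acknowledged it."""
    return len(acks & voters) > len(voters) // 2


voters = {"rabbit-0", "rabbit-1", "rabbit-2"}
# "rabbit-3" replicates the data but is a non-voter, so its ack is ignored.
print(is_committed({"rabbit-0", "rabbit-3"}, voters))
print(is_committed({"rabbit-0", "rabbit-1"}, voters))
```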
🖼️ Container Images
Image Distribution & Caching
- Stargz Snapshotter - Start containers before the image fully downloads. A container typically reads only ~6% of its image data at startup, so why pull 100%?
- Spegel - Nodes share container images directly with each other — no registry involved. Stateless P2P caching that speeds up scaling and cuts egress costs. `CNCF Sandbox`
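
The peer-to-peer idea from the Spegel entry can be sketched as a lookup that checks other nodes' caches by digest before going anywhere else. The registry fallback, the data layout, and `resolve_layer` are assumptions made for illustration.

```python
def resolve_layer(digest: str, peers: dict, registry: dict):
    """Return (source, blob), preferring a peer node's cache over the registry."""
    for node, cache in peers.items():
        if digest in cache:
            return node, cache[digest]
    return "registry", registry[digest]  # assumed fallback on a cache miss


registry = {"sha256:aaa": b"layer-a", "sha256:bbb": b"layer-b"}
peers = {"node-1": {}, "node-2": {"sha256:aaa": b"layer-a"}}
print(resolve_layer("sha256:aaa", peers, registry))  # served by node-2
print(resolve_layer("sha256:bbb", peers, registry))  # no peer has it
```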
Contributing
Karpenter drift detection