An open API service indexing awesome lists of open source software.

SRE

Site reliability engineering (SRE) is a set of principles and practices that applies aspects of software engineering to infrastructure and operations problems. Its main goal is to create scalable and highly reliable software systems. SRE is closely related to DevOps, a set of practices that combines software development and IT operations, and has also been described as a specific implementation of DevOps.

https://github.com/tedilabs/terraform-aws-container

🌳 A sustainable Terraform Package which creates resources for Container Services on AWS

aws aws-ecr aws-eks devops hacktoberfest hcl2 iac lang-hcl sre tedilabs terraform terraform-aws terraform-module terraform-modules type-module

Last synced: 23 Feb 2026

https://github.com/nobl9/sloctl

A command line tool to cast SLO spells 🪄

cli go golang nobl9 reliability slo sre

Last synced: 27 Feb 2026

https://github.com/ory/jobs

Want to build the next generation identity stack? You've come to the right place!

go hiring jobs kubernetes open-source opensource ory react sre

Last synced: 17 Mar 2025

https://github.com/icco/postmortems

Postmortem metadata from danluu/post-mortems.

hacktoberfest postmortem-metadata sre

Last synced: 21 Mar 2025

https://github.com/sitectl/cuttle

Blue Box SRE Operations Platform

ansible bastion bluebox elk operations sensu sre

Last synced: 11 Apr 2025

https://github.com/apiaryio/heroku-datadog-drain-golang

Funnel metrics from multiple Heroku apps into DataDog using statsd.

datadog golang heroku metrics sre statsd

Last synced: 03 Oct 2025

https://github.com/excoriate/go-terradagger

TerraDagger is a Go package for managing your infrastructure-as-code through containers.

cli devops ecs example sre tooling

Last synced: 13 Apr 2025

https://github.com/ramizpolic/sre-playground

A set of Site Reliability Engineering notes & challenges

cicd cloud guide infrastructure site-reliability-engineer sre tasks

Last synced: 14 Apr 2025

https://github.com/seveas/herd

Massively parallel ssh client

cli orchestration sre ssh sysadmin system-administration

Last synced: 25 Jun 2025

https://github.com/fkie-cad/logprep

Log data preprocessing, generation, and shipping in Python

etl kafka log logdata loggenerator logshipper opensearch preprocessing python soar sre

Last synced: 02 Mar 2026

https://github.com/enola-dev/enola

Enola 🕵🏾‍♀️ Holmes was an SRE.

graph graphviz mermaid modeling rdf semantic-web sre visualization

Last synced: 16 Jun 2025

https://github.com/keycloak/keycloak-sre-sig

Keycloak's Site Reliability Engineers Special Interest Group (Keycloak SRE SIG): To improve the lives of people running and operating Keycloak

keycloak sig sre

Last synced: 12 Apr 2025

https://github.com/lwindolf/multi-status

Aggregator PWA for status pages of online services. Know which of your 3rd party SaaS/PaaS are having issues right now.

cloud devops monitoring paas pwa saas sre

Last synced: 11 Apr 2025

https://github.com/tedilabs/terraform-aws-network

🌳 A sustainable Terraform Package which creates VPC resources (VPC, Subnet, NACL, NAT Gateway, Route Table) on AWS

aws aws-vpc devops hacktoberfest hcl2 iac lang-hcl sre tedilabs terraform terraform-aws terraform-module terraform-modules

Last synced: 15 Apr 2025

https://github.com/immobiliare/collectd-haproxy-plugin

Collectd plugin to pull metrics from HAProxy instances

collectd collectd-plugin grafana haproxy metrics monitoring sre

Last synced: 01 Apr 2025

https://github.com/better-sre/config

Config files, Dockerfiles, and Taskfiles for developers.

awesome-taskfile docker dotfiles flutter go-task golang python rust sre taskfile

Last synced: 02 May 2025

https://github.com/hequan2017/raptor

Raptor (猛禽) operations platform. The project is complete and will no longer be updated.

cmdb devops gin go golang jenkins rbac sre vue

Last synced: 07 May 2025

https://github.com/dynatrace-oss/customersuccess

Open source solutions that help you level up your observability game with Dynatrace.

adoption ai automation dashboards dynatrace intelligence notebooks observability obsolescence software sre value workflows

Last synced: 07 Jan 2026

https://github.com/grafana/xk6-chaos

xk6 extension for running chaos experiments with k6 💣

chaos chaos-engineering k6-extension reliability sre testing xk6

Last synced: 01 Oct 2025

https://github.com/operate-first/operations

The sig-operations repository.

site-reliability-engineering sre

Last synced: 16 Jan 2026

https://github.com/anjakammer/devops-and-sre

An online course @ HTW Berlin

devops gitops operations sre

Last synced: 21 Jan 2026

https://github.com/k8sgpt-ai/community

Community Management for K8sGPT

devops kubernetes openai sre tooling

Last synced: 15 Apr 2025

https://github.com/nathanielvarona/pritunl-client-github-action

Establish automated secure Pritunl VPN connections with Pritunl Client in GitHub Actions, supporting OpenVPN and WireGuard.

cicd devops github-actions hacktoberfest openvpn pritunl pritunl-vpn sre vpn-client vpn-server wireguard

Last synced: 10 Mar 2026

https://github.com/be-next/awesome-performance-engineering

A curated, opinionated collection of tools and resources dedicated to Performance Engineering, covering both Observability and Performance Testing.

awesome awesome-list devops load-testing monitoring observability performance performance-engineering performance-testing sre

Last synced: 08 Mar 2026

https://github.com/googlecloudplatform/reliable-app-platforms

An MVP of a platform for delivering reliable applications on Google Cloud

gke google-cloud kubernetes reliability slos sre terraform

Last synced: 20 Oct 2025

https://github.com/microsoft/tdslib

Open implementation of the TDS protocol (version 7.4) in managed C# code.

dotnet sqlserver sre tds

Last synced: 17 Aug 2025

https://github.com/devopsext/sre

Golang SRE framework for logs, metrics, traces, and events. It supports Jaeger, Prometheus, Datadog, OpenTelemetry, New Relic, and Grafana.

events logs metrics observability sre traces

Last synced: 12 Jan 2026

https://github.com/tedilabs/terraform-aws-domain

🌳 A sustainable Terraform Package which creates resources for Domain Services on AWS

aws aws-route53 devops hacktoberfest hcl2 iac lang-hcl sre tedilabs terraform terraform-aws terraform-module terraform-modules

Last synced: 15 Apr 2025

https://github.com/butuzov/todayilearned

Because I Can't Trust My Memory

bash go jupyter linux python sre

Last synced: 17 Mar 2025

https://github.com/wpjunior/multi-burn-rate-calculator

Calculator to view detection time using error budget consumption rates, based on lessons from the Site Reliability Engineering Workbook

error-budget sli slo sre

Last synced: 17 Mar 2026
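
For orientation, the arithmetic behind burn-rate alerting can be sketched roughly as follows. This is a minimal illustration and not the linked calculator's code: the function names are hypothetical, and the model assumes a constant error ratio that starts from zero errors.

```python
from typing import Optional


def burn_rate(error_ratio: float, slo: float) -> float:
    """Error budget consumption speed; 1.0 means the budget lasts exactly one SLO window."""
    return error_ratio / (1.0 - slo)


def budget_consumed(rate: float, elapsed_hours: float, window_hours: float) -> float:
    """Fraction of the window's error budget spent after elapsed_hours at a constant burn rate."""
    return rate * elapsed_hours / window_hours


def detection_hours(error_ratio: float, slo: float,
                    threshold_rate: float, alert_window_hours: float) -> Optional[float]:
    """Hours until the burn rate averaged over the alert window reaches the threshold.

    Assumes no errors before the incident and a constant error ratio afterwards.
    Returns None if the alert would never fire at this error ratio.
    """
    rate = burn_rate(error_ratio, slo)
    if rate < threshold_rate:
        return None
    return alert_window_hours * threshold_rate / rate


if __name__ == "__main__":
    # Illustrative values only: 99.9% SLO over a 30-day (720 h) window.
    slo, window = 0.999, 720.0
    print(budget_consumed(14.4, 1.0, window))       # budget fraction spent in one hour
    print(detection_hours(0.05, slo, 14.4, 1.0))    # hours until a 1 h / 14.4x alert fires
```

Under these assumptions, a higher threshold or a longer alert window delays detection, while a higher error ratio shortens it.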

https://github.com/diogopms/monit-docker

Monit is a free, open source utility for managing and monitoring processes, programs, files, directories, and filesystems on a UNIX system. Monit conducts automatic maintenance and repair and can execute meaningful causal actions in error situations.

devops docker kubernetes monit monitoring sre status

Last synced: 25 Oct 2025

https://github.com/aptible/unpage

Unpage is the open source framework for building SRE agents with infrastructure context and secure access to any dev tool.

agent agentic-workflow agents ai-agent ai-sre aiops automation devops dspy incident-response incident-response-tooling mcp monitoring observability site-reliability-engineering sre sre-agent

Last synced: 08 Sep 2025

https://github.com/rootlyhq/terraform-provider-rootly

Terraform provider for Rootly - manage incident management, on-call schedules, workflows, and alerts as code

devops go golang hashicorp iac incident-management incident-response infrastructure-as-code on-call rootly site-reliability-engineering sre terraform terraform-provider

Last synced: 11 Mar 2026

https://github.com/angelopoerio/oom-notifier

Notifies about OOM-killed processes, reporting the full command line

devops kubernetes linux observability rust site-reliability-engineering sre

Last synced: 17 Jan 2026

https://github.com/luan78zaoha/kaldi-timit-sre-ivector

Develops a speaker recognition model based on i-vectors using the TIMIT database

chinese i-vector kaldi speaker-recognition speaker-verification sre

Last synced: 11 Mar 2025

https://github.com/quzanh1130/multi_metrics_to_compare_images

Comparing two images by using 9 metrics: VIFP, PSNR, SSIM, FSIM, RMSE, ISSM, SRE, SAM, UIQ.

compare-image fsim issm psnr rmse sam sre ssim uiq vifp

Last synced: 28 Oct 2025

https://github.com/tedilabs/terraform-aws-data

🌳 A sustainable Terraform Package which creates resources for Data Services on AWS

aws aws-athena devops hacktoberfest hcl2 iac lang-hcl sre tedilabs terraform terraform-aws terraform-module terraform-modules

Last synced: 03 Oct 2025

https://github.com/tedilabs/k8s-repository

♻️ Repository for Reusable Kubernetes App Manifests with Kustomize

devops gitops hacktoberfest k8s kubernetes kustomize lang-yaml sre tedilabs

Last synced: 19 Oct 2025

https://github.com/last9/last9-integrations

Sample applications of supported integrations by Last9 Products

integrations last9 reliability-engineering sre timeseries-database

Last synced: 28 Apr 2025

https://github.com/rsionnach/nthlayer

Generate the complete reliability stack from a service spec in 5 minutes. Dashboards, alerts, SLOs, PagerDuty - zero toil.

alerts devops grafana monitoring observability pagerduty prometheus python slo sre

Last synced: 18 Jan 2026

https://github.com/dkorunic/axfr2hosts

Fetches one or more DNS zones via AXFR and dumps them in Unix hosts format for local use

bind bind9 bind9-dns dns dns-server domain linux networking security sre sysops unix zone

Last synced: 12 Apr 2025

https://github.com/tedilabs/terraform-aws-db

🌳 A sustainable Terraform Package which creates resources for Databases on AWS

aws aws-db aws-elasticache aws-rds devops hacktoberfest hcl2 iac lang-hcl sre tedilabs terraform terraform-aws terraform-module terraform-modules

Last synced: 15 Apr 2025

https://github.com/bjarneo/gecho

Gecho - an HTTP request echo debugging service

debugging devops echo golang http http-server request sre

Last synced: 25 Apr 2025

https://github.com/todd-dsm/mac-ops

QnD Automation to build a MacBook Pro for DevOps

customizable devops devops-tools macbook-configuration macbook-setup macos sre

Last synced: 13 Apr 2025

https://github.com/chatwoot/faultline

An open-source AI agent for infrastructure debugging.

ai ai-agents ai-sre sre

Last synced: 24 Feb 2026

https://github.com/guilhem/devops-training

DevOps culture training

agile cloud devops hugo lean reveal-js sre

Last synced: 19 Mar 2025

https://github.com/gopatchy/bkl

Layered Configuration Language

configuration deployment devops json k8s kubernetes sre toml yaml

Last synced: 17 Jan 2026

https://github.com/avivl/cloud-sre-agent

An autonomous SRE agent that monitors cloud logs across multiple platforms, leveraging AI models from various providers to detect anomalies, perform root cause analysis, and automate remediation by creating GitHub Pull Requests.

ai-agents ai-ops automation aws cloud devops gcp gemini-ai google-cloud incident-response llm log-analysis log-monitoring platform-engineering python resilience sre vertex-ai

Last synced: 09 Mar 2026

https://github.com/xe-nvdk/terraform-recipes

This is the repo where I save #Terraform recipes, mostly posted on cduser.com

devops iaac infrastructure-as-code sre terraform

Last synced: 11 Apr 2025

https://github.com/woodprogrammer/postgresql-connection-manager

A project to manage PostgreSQL connections via cgroup v2

cgroups devops pg postgresql sre

Last synced: 28 Apr 2025

https://github.com/input-output-hk/devshell-capsules

Space Capsules for the Modern DevShell

devshell sre

Last synced: 13 Oct 2025

https://github.com/fluxninja/aperture-go

SDK to interact with Aperture Agent

concurrency-limiter flow-control rate-limiter sdk sre

Last synced: 14 Oct 2025

https://github.com/apiaryio/ivy

A Node.js queue library focused on easy, yet flexible task execution.

sre

Last synced: 30 Jul 2025

https://github.com/tedilabs/terraform-aws-firewall

🌳 A sustainable Terraform Package which creates resources for Firewall Services on AWS

aws aws-firewall aws-waf devops hacktoberfest hcl2 iac lang-hcl sre tedilabs terraform terraform-aws terraform-module terraform-modules

Last synced: 21 Jan 2026

https://github.com/madetech/productionisation

The Made Tech Productionisation Checklist for Software Projects

checklist productionise sre

Last synced: 12 Apr 2025

https://github.com/tedilabs/terraform-aws-vpc-connectivity

🌳 A sustainable Terraform Package which creates VPC Connectivity resources (Private Link, Client VPN, Site-to-Site VPN, DX, VPC Lattice) on AWS

aws aws-client-vpn aws-direct-connect aws-dx aws-site-to-site-vpn aws-vpc aws-vpc-lattice aws-vpc-private-link aws-vpn devops hacktoberfest hcl2 iac lang-hcl sre tedilabs terraform terraform-aws terraform-module terraform-modules

Last synced: 24 Oct 2025

https://github.com/antolius/deployments-and-disasters

A tabletop RPG for practicing incident management.

rpg sre training

Last synced: 05 May 2025

https://github.com/excoriate/daggerx

DaggerX is a Go package 📦 that helps you stay DRY (avoid repeating yourself) while developing Dagger modules.

cli devops ecs example sre tooling

Last synced: 03 Jul 2025

https://github.com/tedilabs/.github

📣 Default community health files for @tedilabs organization on GitHub

devops github hacktoberfest sre tedilabs

Last synced: 15 Apr 2025

https://github.com/apiaryio/docker-base-images

Base docker images for Apiary applications

sre

Last synced: 26 Jun 2025

https://github.com/tedilabs/terraform-aws-misc

🌳 A sustainable Terraform Package which creates MISC resources on AWS

aws devops hacktoberfest hcl2 iac lang-hcl sre tedilabs terraform terraform-aws terraform-module terraform-modules

Last synced: 28 Oct 2025

https://github.com/linhng98/mess-around

A playground to demonstrate many awesome DevOps tools, enforce the GitOps pattern, and build a scalable and sustainable application cluster

devops homelab kubernetes mess-around sre

Last synced: 17 Jan 2026

https://github.com/deoops-net/dotam

toil for developers

automa golang pipline sre toil workflow

Last synced: 10 Mar 2026

https://github.com/diptochakrabarty/learn_devops_with_projects

Learn DevOps through practical projects. Covers tech stacks including k8s, Ansible, Docker, Python, and more

ansible devops golang hacktoberfest kubernetes python sre

Last synced: 13 Jun 2025

https://github.com/tedilabs/terraform-aws-ml

🌳 A sustainable Terraform Package which creates Machine Learning resources on AWS

aws devops hacktoberfest hcl2 iac lang-hcl sre tedilabs terraform terraform-aws terraform-module terraform-modules

Last synced: 16 Feb 2026

https://github.com/shantoroy/site-reliability-engineering-101

This GitHub repository contains a comprehensive tutorial on Site Reliability Engineering (SRE), covering topics such as SLAs, SLOs, SLIs, Chaos Engineering, monitoring, alerting, and much more. It also includes bonus content on SRE best practices. Follow along with the #100daysofSRE challenge and improve your reliability engineering skills.

100daysofcode alerting automation chaos-engineering devops devsecops monitoring reliability-engineering service-level-agreement service-level-indicator service-level-objective site-reliability-engineering sre

Last synced: 28 Dec 2025

https://github.com/powerhome/keess

Keep secrets and configmaps synchronized across clusters and namespaces

pac sre

Last synced: 04 Mar 2026

https://github.com/ajinkyakadam/systemhealthai

An AI SRE for triaging system health

agents ai aiops devops devops-tools llm llmops mlops observability sre

Last synced: 03 Nov 2025

https://github.com/lawouach/ebpf-2021-talk

Code for my talk at the eBPF 2021 conference

devops ebpf reliability reliably sre

Last synced: 12 Apr 2025

https://github.com/certwatch-app/cw-agent

SSL/TLS certificate monitoring agent for Kubernetes and on-prem infrastructure. Scan certificates, detect expiration, validate chains, and sync to CertWatch cloud.

certificate cli cloud-native devops golang kubernetes monitoring security sre ssl tls

Last synced: 13 Jan 2026

https://github.com/guilt/chaossquirrel

Like Netflix's Chaos Monkey, packaged to run standalone.

chaos-monkey reliability-engineering sre

Last synced: 12 Aug 2025

https://github.com/skyzyx/engineering-for-site-reliability

Overall map of topics to cover for my “Engineering for Site Reliability” blog series.

ci-cd cicd devops docker security site-reliability site-reliability-engineering sre terraform

Last synced: 25 Mar 2025