SRE
Site reliability engineering (SRE) is a set of principles and practices that incorporates aspects of software engineering and applies them to infrastructure and operations problems. The main goals are to create scalable and highly reliable software systems. Site reliability engineering is closely related to DevOps, a set of practices that combine software development and IT operations, and SRE has also been described as a specific implementation of DevOps.
- GitHub: https://github.com/topics/sre
- Wikipedia: https://en.wikipedia.org/wiki/Site_reliability_engineering
- Aliases: site-reliability-engineering
- Last updated: 2026-03-25 00:29:22 UTC
https://github.com/tedilabs/terraform-aws-quicksight
🌳 A sustainable Terraform Package to manage QuickSight resources on AWS
aws aws-data aws-quicksight devops hacktoberfest hcl2 iac lang-hcl sre tedilabs terraform terraform-aws terraform-module terraform-modules
Last synced: 18 Jul 2025
https://github.com/jnbdz/site-reliability-engineer-quickstarts
🦾 Site Reliability Engineer | Quickstarts 🦾
quickstart quickstarts site-reliability-engineering site-reliability-engineering-sre sre
Last synced: 03 Mar 2026
https://github.com/charles-adedotun/kubepulse
Intelligent Kubernetes health monitoring with AI-powered diagnostics, predictive analytics, and auto-remediation
ai claude cloud-native devops go kubernetes monitoring react sre typescript
Last synced: 28 Jul 2025
https://github.com/jrhrmsll/tsgen
tsgen is a small Go program that simulates HTTP request faults and shows how Prometheus alerts based on the Multiwindow, Multi-Burn-Rate Alerts technique work.
golang grafana monitoring prometheus sre
Last synced: 22 Feb 2026
https://github.com/cheesebanana/yellowstack
Real-time Python script runner with scheduling, logging, and OpenAI-assisted debugging
automation aws devops flask job-scheduler openai python rest-api scheduler scripts sre
Last synced: 12 Jun 2025
https://github.com/safoorsafdar/safoorsafdarcom
Source code for the personal website safoorsafdar.com
azure cloud-architect devops docker kubernetes observability personal-website prometheus sre
Last synced: 18 Jan 2026
https://github.com/philyuchkoff/howtheysre
A curated collection of publicly available resources on how technology and tech-savvy organizations around the world practice Site Reliability Engineering (SRE)
Last synced: 11 Mar 2025
https://github.com/aronmilenait/aronmilenait.github.io
My blog and portfolio as a Software Developer transitioning to DevOps and SRE.
blog devops linux portfolio software-development sre
Last synced: 21 Jun 2025
https://github.com/omaciasd/sre
SRE Challenges.
aws devops docker kubernetes linux python sre vagrant
Last synced: 30 Dec 2025
https://github.com/omers/sre-devops-tools
Tools and useful sources for SRE and DevOps
awsome awsome-list data devops monitoring sre tools
Last synced: 25 Feb 2025
https://github.com/logan-bobo/user_infomation_api_infrastructure
The infrastructure for my user information REST API project.
aws containers devops docker infrastructure-as-code kubernetes sre system-engineering terraform
Last synced: 30 Mar 2025
https://github.com/rrabelloo/homebrew-formae
Unofficial Homebrew tap for Formae, a modern Infrastructure-as-Code platform.
devops iac infrastructure-as-code platform-engineering sre
Last synced: 04 Mar 2026
https://github.com/ishantanu/gcp-status-exporter
A Prometheus Exporter for generating metrics for GCP Service Status and Incidents 🚀
gcp opentelemetry prometheus-exporter sre
Last synced: 10 Jul 2025
https://github.com/awcodify/awesome-monitoring
This repository is a curated collection of valuable monitoring tools, resources, and best practices for developers, sysadmins, and DevOps professionals. It covers various aspects of monitoring, including infrastructure, applications, logs, networks, cloud, and Kubernetes.
alerting devops infrastructure logging logs metrics monitoring sre sysadmin
Last synced: 22 Feb 2026
https://github.com/codreum/terraform-aws-dns-monitoring-pro
Production-grade Route 53 DNS observability (Pro). Templates-only repo; module delivered via Codreum private Terraform registry.
aws cloudwatch cloudwatch-alarms cloudwatch-logs commercial contributor-insights dashboards dns dns-monitoring dnsci dnsciz incident-response infrastructure-as-code observability reliability route53 saas sre terraform terraform-templates
Last synced: 05 Feb 2026
https://github.com/robson-teixeira/jaeger-opentelemetry-tracking
Repository for the Alura course "Rastreamento: fazendo tracing com Jaeger e OpenTelemetry" (Tracking: tracing with Jaeger and OpenTelemetry).
alura container docker grafana grafana-loki jaeger java jdk nginx opentelemetry postgresql prometheus rastreamento redis spring sre tracing
Last synced: 27 Aug 2025
https://github.com/amsa-2425-gei-udl/laboratoris
Material aimed at students who wish to broaden their knowledge of systems administration and virtualization.
devops labs sre sys-admin teaching-materials
Last synced: 11 Apr 2025
https://github.com/peopledoc/jarvis
SRE toolbox
approved-public ghec-mig-migrated sre team-sre
Last synced: 04 Apr 2025
https://github.com/deimosfr/mytechnotebook
My Tech Notebook
coding database dev devops kubernetes sre technology
Last synced: 05 May 2025
https://github.com/christiangalsterer/pg-promise-prometheus-exporter
A prometheus exporter for pg-promise
grafana-dashboard metrics monitoring node-js nodejs observability pg-promise postgres postgresql prometheus prometheus-exporter sre
Last synced: 14 Jun 2025
https://github.com/expeor/aws-automation
AWS operations automation CLI - multi-account/multi-region support, Excel report generation
automation aws cli compliance cost-optimization devops inventory multi-account multi-region ops python security-audit sre
Last synced: 14 Jan 2026