Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
SRE
Site reliability engineering (SRE) is a set of principles and practices that incorporates aspects of software engineering and applies them to infrastructure and operations problems. The main goals are to create scalable and highly reliable software systems. Site reliability engineering is closely related to DevOps, a set of practices that combine software development and IT operations, and SRE has also been described as a specific implementation of DevOps.
- GitHub: https://github.com/topics/sre
- Wikipedia: https://en.wikipedia.org/wiki/Site_reliability_engineering
- Aliases: site-reliability-engineering
- Last updated: 2025-01-22 00:29:42 UTC
https://github.com/bregman-arie/devops-exercises
Linux, Jenkins, AWS, SRE, Prometheus, Docker, Python, Ansible, Git, Kubernetes, Terraform, OpenStack, SQL, NoSQL, Azure, GCP, DNS, Elastic, Network, Virtualization. DevOps Interview Questions
ansible aws azure coding containers devops docker git interview interview-questions kubernetes linux openstack production-engineer prometheus python sql sre terraform
Last synced: 20 Jan 2025
https://github.com/milanm/devops-roadmap
DevOps Roadmap for 2024, with learning resources
aws azure computer-science continous-delivery continuous-integration developer-roadmap devops devops-roadmap docker go grafana jira kubernetes linux prometheus python roadmap sre study-plan
Last synced: 21 Jan 2025
https://github.com/upgundecha/howtheysre
A curated collection of publicly available resources on how technology and tech-savvy organizations around the world practice Site Reliability Engineering (SRE)
alerting chaos-engineering dev-ops devops hacktoberfest hacktoberfest-accepted incident-management incident-response infrastructure ml-ops monitoring observability on-call post-mortem reliability security site-reliability-engineering software-engineering sre sre-culture
Last synced: 21 Jan 2025
https://github.com/linkedin/school-of-sre
At LinkedIn, we use this curriculum to onboard our entry-level talent into the SRE role.
git hadoop linux mysql networking nosql python security sre system-design
Last synced: 21 Jan 2025
https://github.com/runatlantis/atlantis
Terraform Pull Request Automation
atlantis automation devops go golang hacktoberfest sre tacos terraform
Last synced: 20 Jan 2025
https://github.com/mxssl/sre-interview-prep-guide
Site Reliability Engineer Interview Preparation Guide
interview-preparation preparation site-reliability-engineer sre sre-interview study
Last synced: 05 Dec 2024
https://github.com/isno/thebytebook
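⭐ [Open-source book] An in-depth explanation of kernel networking, Kubernetes, ServiceMesh, containers, and other cloud-native technologies. A practice-tested guide to DevOps and SRE. If you find an error, please open an issue.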
cloud-native container devops distributed-systems finops kubernetes networking paas paxos raft service-mesh sre
Last synced: 21 Jan 2025
https://github.com/hjacobs/kubernetes-failure-stories
Compilation of public failure/horror stories related to Kubernetes
failures incidents kubernetes post-mortem postmortem production-engineering reliability sre
Last synced: 17 Jan 2025
https://github.com/stackstorm/st2
StackStorm (aka "IFTTT for Ops") is event-driven automation for auto-remediation, incident responses, troubleshooting, deployments, and more for DevOps and SREs. Includes rules engine, workflow, 160 integration packs with 6000+ actions (see https://exchange.stackstorm.org) and ChatOps. Installer at https://docs.stackstorm.com/install/index.html
auto-remediation automation chatops cicd deployment devops ifttt python sre st2 stackstorm workflows
Last synced: 20 Jan 2025
https://github.com/k8sgpt-ai/k8sgpt
Giving Kubernetes Superpowers to everyone
ai devops kubernetes llama openai sre tooling
Last synced: 20 Jan 2025
https://github.com/rundeck/rundeck
Enable Self-Service Operations: Give specific users access to your existing tools, services, and scripts
ansible audit automation category-distributed deployment devops devops-team devops-tools hacktoberfest java operations ops orchestration runbook rundeck scheduler sre
Last synced: 21 Jan 2025
https://github.com/jonmosco/kube-ps1
Kubernetes prompt info for bash and zsh
bash containers kubectl kubernetes kubernetes-helper prompts sre zsh
Last synced: 21 Jan 2025
https://github.com/antonputra/tutorials
DevOps Tutorials
ansible aws devops gcp kubernetes packer serverless sre terraform
Last synced: 21 Jan 2025
https://github.com/leandromoreira/cdn-up-and-running
CDN Up and Running - Building a CDN from Scratch to Learn about CDN, Nginx, Lua, Prometheus, Grafana, Load balancing, and Containers.
cdn docker-compose grafana load-balancer lua luajit nginx openresty prometheus sre tutorial wrk
Last synced: 17 Jan 2025
https://github.com/bregman-arie/sre-checklist
A checklist for anyone practicing Site Reliability Engineering
automation checklist gitops kubernetes reliability-engineering sre terraform
Last synced: 17 Jan 2025
https://github.com/anzhihe/learning
Learning Shell, Python, Golang, System, Network
applescript awk django django-rest-framework gin golang linux-system mysql network operating-systems performance programming python sed shell sql sre
Last synced: 16 Jan 2025
https://github.com/chaostoolkit/chaostoolkit
Chaos Engineering Toolkit & Orchestration for Developers
automation chaos-engineering chaostoolkit devops-tools reliability reliability-engineering resiliency sre
Last synced: 21 Jan 2025
https://github.com/alibaba/sreworks
Cloud Native DataOps & AIOps Platform
aiops application cloudnative dataops devops engineering flink k8s kubernetes maintenance oam operation ops saas sre
Last synced: 17 Jan 2025
https://github.com/chame1eon/jnitrace
A Frida based tool that traces usage of the JNI API in Android apps.
android frida jni jni-api reverse-engineering sre tracer
Last synced: 17 Jan 2025
https://github.com/google/cloudprober
[Moved to cloudprober/cloudprober] Active monitoring software to detect failures before your customers do.
blackbox cloud cloudprober devops distributed-monitoring gcp golang google grafana k8s kubernetes monitoring observability ping probe prober prometheus sre stackdriver
Last synced: 28 Sep 2024
https://github.com/dastergon/postmortem-templates
A collection of postmortem templates
devops incident-reporting incident-reports incident-response post-mortem postmortem postmortem-templates site-reliability site-reliability-engineering sre
Last synced: 30 Nov 2024
https://github.com/briefercloud/layerform
Layerform helps engineers create reusable environment stacks using plain .tf files. Ideal for multiple "staging" environments.
dev-environment developer-tools devops platform-engineering sre terraform
Last synced: 19 Jan 2025
https://github.com/jaegertracing/jaeger-ui
Web UI for Jaeger
apm distributed-tracing hacktoberfest jaeger javascript monitoring opentracing react reactjs site-reliability-engineering sre trace tracing typescript ui
Last synced: 16 Jan 2025
https://github.com/idoavrah/terraform-tui
Terraform textual UI
devops iac productivity sre terraform tui
Last synced: 16 Jan 2025
https://github.com/unixorn/git-extra-commands
A collection of git utilities, useful extra git scripts, tutorials and other useful articles.
antigen bash collection devops devops-tools git hacktoberfest oh-my-zsh oh-my-zsh-plugin prezto shell-script shell-scripts sre zgenom zsh-plugin zsh-plugins
Last synced: 19 Jan 2025
https://github.com/mikeroyal/nixos-guide
NixOS Guide. Learn all about the immutable Nix Operating System and the declarative Nix Expression Language.
apple-silicon declarative-language functional-programming home-manager libadwaita nix nix-darwin nix-flake nix-packages nix-shell nixops nixos nixos-config nixos-expression nixos-module nixos-service nixpkgs self-hosting sre wsl2
Last synced: 17 Jan 2025
https://github.com/azure/caf-terraform-landingzones
This solution, offered by the Open-Source community, will no longer receive contributions from Microsoft. Customers are encouraged to transition to Microsoft Azure Verified Modules for continued support and updates from Microsoft. Please note, this repository is scheduled for decommissioning and will be removed on July 1, 2025.
azure azure-resource-manager devops enterprise platform platform-engineering sre terraform
Last synced: 30 Sep 2024
https://github.com/braintree/runbook
A framework for gradual system automation
automation automation-framework devops devops-tools operations ops opseng orchestration orchestration-framework remote-execution ruby rubygem runbook runbook-command runbook-configuration runbook-dsl runbook-generators runbooks sre sshkit
Last synced: 04 Nov 2024
https://github.com/jetstack/version-checker
Kubernetes utility for exposing image versions in use, compared to latest available upstream, as metrics.
docker gcr go grafana grafana-dashboard image kubernetes prometheus quay sre utility version
Last synced: 16 Jan 2025
https://github.com/upgundecha/howtheyaws
A curated collection of publicly available resources on how technology and tech-savvy organizations around the world use Amazon Web Services (AWS)
amazon-web-services automation aws cloud cloud-computing cloud-native devops hacktoberfest hacktoberfest-accepted hacktoberfest2021 infrastructure-as-code sre
Last synced: 15 Jan 2025
https://github.com/opslane/opslane
Making on-call suck less for engineers
aiops alerts copilot debugging gen-ai monitoring oncall oncall-engineers rag runbooks site-reliability-engineering sre
Last synced: 04 Nov 2024
https://github.com/robusta-dev/holmesgpt
On-Call Assistant for Prometheus Alerts - Get a head start on fixing alerts with AI investigation
aiops chatbot chatops devops devops-tools incident incident-management incident-response jira kubernetes llm llm-agent llm-framework llms monitoring observability prometheus site-reliability-engineering slack sre
Last synced: 18 Jan 2025
https://github.com/kaytu-io/kaytu
The Kaytu CLI improves the efficiency of cloud workloads by analyzing historical usage and providing tailored recommendations, such as changing instance sizes. This ensures you only pay for the resources you actually need without compromising stability.
cloud-optimization cloud-spend rightsizing sre workload-optimization
Last synced: 04 Nov 2024
https://github.com/unixorn/sysadmin-reading-list
A reading and viewing list for larval stage SREs and sysadmins
aws azure best-practices cloud devoops devops gcloud gcp hacktoberfest handbook infrastructure linux linux-administration reading-list sre sysadmin
Last synced: 17 Jan 2025
https://github.com/cloudprober/cloudprober
Active monitoring software to detect failures before your customers do.
blackbox-monitoring cloud cloud-monitoring cloudwatch datadog devops golang google grafana k8s kubernetes monitoring observability prober prometheus slo sre stackdriver synthetic-monitoring uptime
Last synced: 17 Jan 2025
https://github.com/ryan4yin/knowledge
(Chinese only) Everything I know: DevOps & CloudNative, Linux, Embedded, Homelab, Music, Blockchain, AI, etc.
container devops devops-notes embedded kubernetes linux music sre
Last synced: 19 Jan 2025
https://github.com/squzy/squzy
Squzy is a high-performance open-source monitoring, incident, and alert system written in Golang with Bazel and love. Welcome to free SRE.
bazel docker golang grpc monitoring opensource opensource-monitoring prometheus sitemap sre zabbix
Last synced: 28 Oct 2024
https://github.com/clivern/gauntlet
🔖 Guides, Articles, Podcasts, Videos and Notes to Build Reliable Large-Scale Distributed Systems.
automation aws containers continuous-delivery continuous-integration database-replication devops digitalocean distributed-systems docker failover hacktoberfest high-availability kubernetes load-balancer microservice microservices-architecture scalability sre
Last synced: 19 Jan 2025
https://github.com/chris-short/devops-readme.md
What to Read to Learn More About DevOps
blame cloud cloud-native continuous-delivery continuous-deployment continuous-integration culture devops devops-journey failure leader lean monitoring release site-reliability-engineering sre stress systems systems-administration systems-engineering
Last synced: 13 Jan 2025
https://github.com/dingguodong/linuxbashshellscriptforops
Linux Bash Shell Script and Python Script For Ops and Devops
automation bash-script devops linux ops python-script repository-python repository-shell sre
Last synced: 17 Jan 2025
https://github.com/teivah/sre-roadmap
An Opinionated Roadmap to Become an SRE (Concepts > Tools)
Last synced: 16 Dec 2024
https://github.com/googlecloudplatform/cloud-ops-sandbox
Cloud Operations Sandbox is an open source collection of tools that helps practitioners to learn O11y and R9y practices from Google and apply them using Cloud Operations suite of tools.
cloud cloud-native cloud-operations cloudops devops google-cloud opencensus opentelemetry operations ops-management profiler samples sre stackdriver stackdriver-logs stackdriver-monitoring stackdriver-sandbox stackdriver-trace
Last synced: 19 Jan 2025
https://github.com/rishiloyola/sre-interviews
Curated list of good SRE interview questions.
interview interview-preparation site-reliability-engineering sre
Last synced: 11 Dec 2024
https://github.com/mehrdadrad/tcpprobe
Modern TCP tool and service for network performance observability.
docker http http2 https k8s kubernetes monitoring observability probe socket sre tcp
Last synced: 15 Jan 2025
https://github.com/ozontech/file.d
A blazing fast tool for building data pipelines: read, process and output events. Our community: https://t.me/file_d_community
actions clickhouse elasticsearch events file gelf go http input json kafka logs observability output pipeline processing reading sre throttle tracing
Last synced: 18 Jan 2025
https://github.com/k8sgpt-ai/k8sgpt-operator
Automatic SRE Superpowers within your Kubernetes cluster
devops kubernetes openai sre tooling
Last synced: 19 Jan 2025
https://github.com/getsavvyinc/savvy-cli
Automatically capture and surface your team's tribal knowledge
ai automation charmbracelet cli devops devtool go golang incident-response llm oncall oncall-engineers playbooks runbooks security-audit sre support support-engineers terminal
Last synced: 28 Nov 2024
https://github.com/waltenne/guiadevopsbrasil
Repository for sharing free content about DevOps
devops devops-tools docker documentation iac linux python sre terraform windows
Last synced: 31 Oct 2024
https://github.com/notharshhaa/into-the-devops
Linux, Jenkins, AWS, SRE, Prometheus, Docker, Python, Ansible, Git, Kubernetes, Terraform, OpenStack, SQL, NoSQL, Azure, GCP, DNS, Elastic, Network, Virtualization. DevOps Interview Questions
ansible aws azure coding containers devops docker git interview-preparation interview-questions kubernetes linux openstack production-engineer prometheus python sql sre terraform
Last synced: 21 Jan 2025
https://github.com/actionjack/so-you-want-to-onboard-a-devops-engineer
Guidance on how to make your environment easier to onboard for Web Ops Engineers, SREs, and DevOps Practitioners
culture devops devops-practitioners mentoring onboard ops-engineers sre starters
Last synced: 17 Nov 2024
https://github.com/steve-mt/awesome-slo
Curated list of resources on SLOs
awesome awesome-list sli slo sre
Last synced: 23 Nov 2024
https://github.com/chame1eon/jnitrace-engine
Engine used by jnitrace to intercept JNI API calls.
android frida jni jni-api reverse-engineering sre tracer
Last synced: 21 Jan 2025
https://github.com/google/marmot
Marmot workflow execution engine
devops devops-services devops-tools go golang google google-cloud kubernetes kubernetes-operator network network-monitoring sre
Last synced: 09 Nov 2024
https://github.com/datadog/chaos-controller
Datadog Failure Injection System for Kubernetes
chaos chaos-engineering chaos-monkey k8s kubernetes sre
Last synced: 19 Jan 2025
https://github.com/kgoralski/microservice-production-readiness-checklist
The principles that help to deploy safely to the production environment. If you like it:
aws checklist cloud kubernetes microservices resiliency sre
Last synced: 01 Nov 2024
https://github.com/seznam/slo-exporter
Slo-exporter computes standardized SLI and SLO metrics based on events coming from various data sources.
alerting exporter grafana monitoring prometheus service-level-indicator service-level-objective sli slo slo-exporter sre sre-workbook
Last synced: 19 Jan 2025
https://github.com/windvalley/gossh
🚀🚀 A high-performance and high-concurrency ssh tool written in Go. It is 10 times faster than Ansible. If you need much more performance and better ease of use, you will love it.
ansible batchssh cli devops gossh high-concurrency multissh ops parallel-ssh sa sre ssh ssh-client sshbatch
Last synced: 18 Jan 2025
https://github.com/dastergon/wheel-of-misfortune
A role-playing game for incident management training
chaos-engineering devops incident-management incident-response incident-scenario oncall-engineers postmortem reliability site-reliability site-reliability-engineering sre
Last synced: 15 Dec 2024
https://github.com/last9/slo-computer
SLOs, error windows, and alerts are complicated. Here is an attempt to make them easy.
metrics observability service-level-indicator service-level-objective sla sli slo sre sre-team
Last synced: 16 Nov 2024
https://github.com/apiaryio/s3-streaming-upload
s3-streaming-upload is a Node.js library that listens to your stream and uploads its data to Amazon S3 using the ManagedUpload API.
Last synced: 02 Nov 2024
https://github.com/adhorn/aws-chaos-scripts
DEPRECATED: Collection of Python scripts to run failure injection on AWS infrastructure
amazon-web-services aws chaos-engineering chaos-monkey deprecated software-engineering sre
Last synced: 13 Nov 2024
https://github.com/getstrake/developer-cost-guide
SQL code for developers to understand AWS cloud costs. Reduce time spent on billing, get back to engineering. Created and maintained by the team at Macroscope.
aws cloud cost-estimation finops sre
Last synced: 24 Nov 2024
https://github.com/macbre/index-digest
Analyses your database queries and schema and suggests indices and schema improvements
code-quality database-perfomance database-queries dba digest docker-image index linter mariadb mysql performance python query-digest schema slow-queries sql sql-logs sqlcheck sre sustainable-software
Last synced: 18 Jan 2025
https://github.com/thoughtbot/flightdeck
Terraform modules for rapidly building production-grade Kubernetes clusters following SRE practices
Last synced: 11 Nov 2024
https://github.com/dentrax/falco-gpt
AI-generated remediations for Falco audit events
audit-log chatgpt devops falco golang kubernetes openai sre sysdig threat-hunting tooling
Last synced: 11 Oct 2024
https://github.com/ivanilves/travelgrunt
navigate inside [mono]repos effortlessly!
devops fatigue git monorepo platform shell sre system-administration systems terraform terragrunt
Last synced: 16 Jan 2025
https://github.com/marceloboeira/sre
📚 Index for my study topics
coursera courses distributed-systems functional-programming infrastructure-as-code nosql oncall operating-systems site-reliability-engineering software-engineering sre system-programming terraform
Last synced: 26 Oct 2024
https://github.com/qainsights/performance-engineers-devops
This repository helps performance testers and engineers who want to dive into the DevOps and SRE world.
aws-devops chaos chaos-engineering devops docker engineering engineers kubernetes linux microsoft performance performance-engineers-devops rancher roadmap site-reliability-engineering sre testing
Last synced: 13 Dec 2024
https://github.com/adhorn/aws-fis-templates-cdk
Collection of AWS Fault Injection Simulator (FIS) experiment templates deployable via the AWS CDK
amazon-web-services automation aws aws-fis cdk-examples cdk-library chaos-engineering chaos-testing devops-tools sre testing
Last synced: 19 Nov 2024
https://github.com/bjarneo/rip
Rest in peace(s) - HTTP/UDP load testing tool
ddos go golang http learning-by-doing load-testing rip security-tools sre sre-infra udp-flood
Last synced: 10 Nov 2024
https://github.com/ari-hacks/command-line-cheat-sheet
📝 A place to quickly look up commands (bash, vim, git, AWS, Docker, Terraform, Ansible, kubectl)
ansible aws bash command-line devops docker git k8s kubectl kubernetes sre terraform vim
Last synced: 15 Jan 2025
https://github.com/tedilabs/terraform-aws-account
🌳 A sustainable Terraform Package which creates Account & IAM resources on AWS
aws aws-iam devops hacktoberfest hcl2 iac lang-hcl sre tedilabs terraform terraform-aws terraform-module terraform-modules
Last synced: 19 Dec 2024
https://github.com/microsoft/sqlcallstackresolver
A sample tool for users of Microsoft SQL Server to aid in troubleshooting otherwise difficult to diagnose issues. Provided AS-IS - see SUPPORT.md.
azuresql azuresqldb azuresqlmanagedinstance callstack debugging debugging-symbol msdia140 pdb pdb-files sqlserver sqlserver-2017 sqlserver-2019 sqlserver-2022 sre symbols tool xevent xevents
Last synced: 10 Jan 2025
https://github.com/blacklane/kiev
A set of tools to do distributed logging for Ruby web applications
distributed-tracing elk-stack logging ruby sre
Last synced: 16 Jan 2025
https://github.com/loftwah/loftwahs-cheatsheet
My own personal tech cheatsheet. This covers the stuff I use quite regularly.
bash devops hacktoberfest linux nodejs python sre typescript
Last synced: 09 Nov 2024
https://github.com/getyourguide/istio-config-validator
go121 istio sre validation virtualservice
Last synced: 14 Nov 2024
https://github.com/tedilabs/terraform-aws-container
🌳 A sustainable Terraform Package which creates resources for Container Services on AWS
aws aws-ecr aws-eks devops hacktoberfest hcl2 iac lang-hcl sre tedilabs terraform terraform-aws terraform-module terraform-modules
Last synced: 08 Nov 2024
https://github.com/ory/jobs
Want to build the next generation identity stack? You've come to the right place!
go hiring jobs kubernetes open-source opensource ory react sre
Last synced: 27 Oct 2024
https://github.com/k8sgpt-ai/docs
Documentation for K8sGPT
ai chatgpt docs kubernetes sre
Last synced: 15 Jan 2025
https://github.com/sitectl/cuttle
Blue Box SRE Operations Platform
ansible bastion bluebox elk operations sensu sre
Last synced: 07 Nov 2024
https://github.com/icco/postmortems
Postmortem metadata from danluu/post-mortems.
hacktoberfest postmortem-metadata sre
Last synced: 28 Oct 2024
https://github.com/ramizpolic/sre-playground
A set of Site Reliability Engineering notes & challenges
cicd cloud guide infrastructure site-reliability-engineer sre tasks
Last synced: 15 Nov 2024
https://github.com/last9/openmetrics-registry
Do more with your metrics
exporter hcl modules open-metrics openmetrics prometheus registry sre
Last synced: 02 Jan 2025
https://github.com/fkie-cad/logprep
Log data preprocessing, generation, and shipping in Python
etl kafka log logdata loggenerator logshipper opensearch preprocessing python soar sre
Last synced: 21 Jan 2025
https://github.com/seveas/herd
Massively parallel ssh client
cli orchestration sre ssh sysadmin system-administration
Last synced: 26 Nov 2024
https://github.com/lwindolf/multi-status
Aggregator PWA for status pages of online services. Know which of your 3rd party SaaS/PaaS are having issues right now.
cloud devops monitoring paas pwa saas sre
Last synced: 08 Dec 2024
https://github.com/immobiliare/collectd-haproxy-plugin
Collectd plugin to pull metrics from HAProxy instances
collectd collectd-plugin grafana haproxy metrics monitoring sre
Last synced: 02 Nov 2024
https://github.com/tedilabs/terraform-aws-load-balancer
🌳 A sustainable Terraform Package which creates resources for Load Balancers on AWS
aws aws-alb aws-clb aws-elb aws-load-balancer aws-nlb devops hacktoberfest hcl2 iac lang-hcl sre tedilabs terraform terraform-aws terraform-module terraform-modules
Last synced: 08 Nov 2024