{"id":13434900,"url":"https://github.com/dastergon/awesome-chaos-engineering","last_synced_at":"2025-09-28T21:30:54.264Z","repository":{"id":38338038,"uuid":"98446777","full_name":"dastergon/awesome-chaos-engineering","owner":"dastergon","description":"A curated list of Chaos Engineering resources.","archived":false,"fork":false,"pushed_at":"2023-12-28T19:30:06.000Z","size":250,"stargazers_count":5826,"open_issues_count":28,"forks_count":639,"subscribers_count":311,"default_branch":"master","last_synced_at":"2024-05-23T09:51:21.616Z","etag":null,"topics":["awesome","awesome-list","chaos","chaos-community","chaos-engineering","chaos-monkey","chaos-testing","netflix-chaos-monkey","resilience","simian-army","site-reliability-engineering"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"cc0-1.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dastergon.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2017-07-26T17:05:05.000Z","updated_at":"2024-05-22T22:36:47.000Z","dependencies_parsed_at":"2024-01-03T06:23:52.760Z","dependency_job_id":null,"html_url":"https://github.com/dastergon/awesome-chaos-engineering","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dastergon%2Fawesome-chaos-engineering","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dastergon%2Fawesome-chaos-engineering/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dastergon%2Fawesome-chaos-engineering/releases","manifests_url"
:"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dastergon%2Fawesome-chaos-engineering/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dastergon","download_url":"https://codeload.github.com/dastergon/awesome-chaos-engineering/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":234563136,"owners_count":18853060,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["awesome","awesome-list","chaos","chaos-community","chaos-engineering","chaos-monkey","chaos-testing","netflix-chaos-monkey","resilience","simian-army","site-reliability-engineering"],"created_at":"2024-07-31T03:00:26.430Z","updated_at":"2025-09-28T21:30:49.028Z","avatar_url":"https://github.com/dastergon.png","language":null,"funding_links":[],"categories":["HarmonyOS","Others","Community Lists","Other Lists","Reliability","SRE/DevOps/WebOps","awesome-list","Capabilities","Code signing ###","Related Awesome Lists","In Russian","🔥 Chaos Engineering","10. 
Conferences and Talks","DevOps \u0026 SRE","Related"],"sub_categories":["Windows Manager","TeX Lists","Resilience","Chess :chess_pawn:","Orchestration \u0026 CD","Introduction","Artigos e Tutoriais","Example Projects","Tools \u0026 Integrations"],"readme":"# Awesome Chaos Engineering [![Awesome](https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg)](https://github.com/sindresorhus/awesome)\n\nA curated list of awesome [Chaos Engineering](http://principlesofchaos.org/) resources.\n\n#### What is Chaos Engineering?\n\u003e Chaos Engineering is the discipline of experimenting on a distributed system in order to build confidence in the system’s capability to withstand turbulent conditions in production. - [Principles Of Chaos Engineering](http://principlesofchaos.org/) website.\n\n## Contents\n- [Culture](#culture)\n- [Books](#books)\n- [Education](#education)\n- [Notable Tools](#notable-tools)\n- [Papers](#papers)\n- [Gamedays](#gamedays)\n- [Blogs \u0026 Newsletters](#blogs--newsletters)\n- [Conferences \u0026 Meetups](#conferences--meetups)\n- [Forums](#forums)\n- [Twitter](#twitter)\n\n## Culture\n* [Principles Of Chaos Engineering](http://principlesofchaos.org/)\n* [Chaos Community](http://chaos.community/)\n* [Chaos Engineering](https://www.infoq.com/articles/chaos-engineering)\n* [O'Reilly Velocity San Jose 2017: Precision Chaos](https://www.youtube.com/watch?v=C11LNUEaHuo)\n* [The Discipline of Chaos Engineering](https://www.gremlin.com/blog/the-discipline-of-chaos-engineering/)\n* [Chaos Monkey for Fun and Profit](https://sharpend.io/chaos-monkey-for-fun-and-profit/)\n* [Fault Injection in Production: Making the case for resilience testing](https://queue.acm.org/detail.cfm?id=2353017)\n* [Lord of Chaos - Becoming a Chaos Engineer](https://vimeo.com/groups/jz2016/videos/181925286)\n* [Chaos testing - Preventing failure by 
instigation](http://www.cakesolutions.net/teamblogs/chaos-testing-preventing-failure-by-instiga)\n* [Orchestrated Chaos](https://docs.google.com/presentation/d/1zzHS3qoPGzwsSna5-uk3Xt7LW_3Fr6ag8JDkeyrKwL4/edit#slide=id.p)\n* Choose your own adventure: Chaos Engineering - [Video](https://www.infoq.com/presentations/adopt-chaos-engineering) \u0026 [Slides](https://www.slideshare.net/NoraJones1/choose-your-own-adventure-qcon-2017-1)\n* [AMA Chaos Engineering + DiRT](http://pages.catchpoint.com/AMA-Chaos-DiRT.html)\n* [SRECON17: Principles of Chaos Engineering](https://www.usenix.org/conference/srecon17americas/program/presentation/rosenthal)\n* [Chaos \u0026 Intuition Engineering at Netflix](https://www.youtube.com/watch?v=Q4nniyAarbs)\n* [Mastering Chaos - A Netflix Guide to Microservices](https://www.youtube.com/watch?v=CZ3wIuvmHeM)\n* [Too big to test: Breaking a production brokerage platform without causing financial devastation](https://conferences.oreilly.com/velocity/devops-web-performance-ny-2015/public/schedule/detail/45012)\n* [Inside Azure Search: Chaos Engineering](https://azure.microsoft.com/en-us/blog/inside-azure-search-chaos-engineering/)\n* [Netflix, the Simian Army, and the culture of freedom and responsibility](https://devops.com/netflix-the-simian-army-and-the-culture-of-freedom-and-responsibility/)\n* [FIT: Failure Injection Testing](https://medium.com/netflix-techblog/fit-failure-injection-testing-35d8e2a9bb2)\n* [The Netflix Simian Army](https://medium.com/netflix-techblog/the-netflix-simian-army-16e57fbab116)\n* [Automated Failure Testing](https://medium.com/netflix-techblog/automated-failure-testing-86c1b8bc841f)\n* [The Verification of a Distributed System by Caitie McCaffrey](http://queue.acm.org/detail.cfm?ref=rss\u0026id=2889274)\n* [The Journey to Chaos Engineering begins with a single step - Bruce Wong and James Burns (Twilio)](https://www.youtube.com/watch?v=rKAo2wANiHM)\n* [Chaos Engineering by Lorin 
Hochstein](https://www.youtube.com/watch?v=vq4QZ4_YDok)\n* [Aaron Rinehart - ChaoSlingr: Introducing Security based Chaos Testing](https://www.youtube.com/watch?v=BLRb-E0G5zk)\n* [Chaos Engineering - Casey Rosenthal](https://www.youtube.com/watch?v=6OIOpx_dVFY)\n* The Road to Chaos - Velocity 2017- [video](https://www.youtube.com/watch?v=FCZVAZaXIjs) \u0026 [slides](https://github.com/norajones/Presentations/blob/master/The%20Road%20To%20Chaos%20-%20Velocity%202017.pdf)\n* [How Netflix DDoS’d Itself To Help Protect the Entire Internet](https://www.wired.com/story/netflix-ddos-attack)\n* [10 Years of Crashing Google](https://www.usenix.org/conference/lisa15/conference-program/presentation/krishnan)\n* [Weathering the Unexpected](http://queue.acm.org/detail.cfm?id=2371516)\n* [SRECON17: Breaking Things on Purpose](https://youtu.be/h_-shm0SL08)\n* [PuppetConf 2016: Chaos Patterns - Architecting for Failure in Distributed Systems](https://youtu.be/V3P35N_HXNQ)\n* [Ship More, Sink Less - Changing Chaos Engineering and Distributed Tracing](https://youtu.be/nr2KWbyWAmA)\n* [Cloudcast - Discipline of Chaos Engineering](http://www.thecloudcast.net/2017/05/the-cloudcast-299-discipline-of-chaos.html)\n* [Software Engineering Daily - Failure Injection with Kolton Andrus podcast](https://softwareengineeringdaily.com/2017/03/29/failure-injection-with-kolton-andrus/)\n* [Responding to Failures in Playback Features with Haley Tucker podcast](https://www.infoq.com/podcasts/netflix-haley-tucker?utm_campaign=infoq_content\u0026utm_source=twitter\u0026utm_medium=feed\u0026utm_term=architecture-design)\n* [\"Antics, drift, and chaos\" by Lorin Hochstein](https://youtu.be/SM2uXpmyJmA)\n* [re:invent 2017: Nora Jones Describes Why We Need More Chaos - Chaos Engineering, That Is](https://youtu.be/rgfww8tLM0A)\n* [Failure Friday: Four Years On](https://www.pagerduty.com/blog/failure-fridays-four-years/)\n* [Monkeys \u0026 Lemurs and Locusts, Oh 
my!](https://www.slideshare.net/zgrinch/monkeys-lemurs-and-locusts-oh-my)\n* [Practical Chaos Engineering](https://youtu.be/Yn4tYxqzFVU)\n* [Chaos Day in the Met Office Cloud](https://www.cloudreach.com/fr/blog/training-cloud-operations-teams-met-office/)\n* [Cloud Native and Chaos Engineering](https://medium.com/chaosiq/cloud-native-and-chaos-engineering-20842ee2fa8a)\n* [Chaos Engineering with Kolton Andrus](https://softwareengineeringdaily.com/2018/02/02/chaos-engineering-with-kolton-andrus/)\n* [Chaos Engineering: the history, principles, and practice](https://www.gremlin.com/community/tutorials/chaos-engineering-the-history-principles-and-practice/)\n* [Embracing the Chaos of Chaos Engineering](https://blog.codeship.com/embracing-the-chaos-of-chaos-engineering/)\n* [Designing Services for Resilience: Netflix Lessons](https://www.infoq.com/presentations/netflix-microservices-resiliency)\n* [Chaos Engineering: A cheat sheet](https://www.techrepublic.com/article/chaos-engineering-a-cheat-sheet/)\n* [How to convince your boss and make them say “Yes!” to Chaos Engineering?](https://medium.com/@crochefolle/how-to-convince-your-boss-to-make-them-say-yes-to-chaos-engineering-796ba119bd7)\n* [Why the World Needs More Resilient Systems](https://www.infoq.com/news/2018/03/resilient-systems-chaos-engineer)\n* [Chaos Architecture](https://www.infoq.com/presentations/chaos-architecture-mindset)\n* [Gremlin’s Tammy Bütow on the Business Side of Chaos Engineering](https://thenewstack.io/gremlins-tammy-butow-on-the-business-side-of-chaos-engineering/)\n* [Kubernetes Chaos Engineering: Lessons Learned](https://learnk8s.io/blog/kubernetes-chaos-engineering-lessons-learned)\n* [Chaos Engineering: managing complexity by breaking things](https://hub.packtpub.com/chaos-engineering-managing-complexity-by-breaking-things/)\n* [Podcast:Database Chaos with Tammy Butow](https://softwareengineeringdaily.com/2018/04/10/database-chaos-with-tammy-butow/)\n* [LinkedOut: A Request-Level 
Failure Injection Framework](https://engineering.linkedin.com/blog/2018/05/linkedout--a-request-level-failure-injection-framework)\n* [GOTO 2018 - Breaking Things on Purpose - Kolton Andrus](https://youtu.be/S89ox7oQn8s)\n* [Why should Chaos be part of your Distributed Systems Engineering?](https://medium.com/@bbideep/why-should-chaos-be-part-of-your-distributed-systems-engineering-5bcb21497660)\n* [Brian Holt - Chaos Monkeys in Your Browser What Chaos Engineering Means For the Front End](https://www.youtube.com/watch?v=A4_rRj-4Mv0)\n* [Chaos Engineering: Why the World Needs More Resilient Systems](https://www.youtube.com/watch?time_continue=242\u0026v=Khqf0XltR_M)\n* QCon·Beijing 2017: The Practice of Failure Management and Fault Injection at Alibaba E-Commerce Platforms - [video](http://www.infoq.com/cn/presentations/ali-electricity-supplier-fault-management-and-fault-drills-practice) \u0026 [speech draft](http://jm.taobao.org/2017/06/22/20170622/) (Chinese speech)\n* [Orchestrating Chaos using Grab's Experimentation Platform](https://engineering.grab.com/chaos-engineering)\n* [Breaking to Learn: Chaos Engineering Explained](https://blog.newrelic.com/engineering/chaos-engineering-explained/)\n* [Chaos Engineering Traps](https://medium.com/@njones_18523/chaos-engineering-traps-e3486c526059)\n* [Chaos Engineering - The Art of Breaking Things Purposefully](https://medium.com/@adhorn/chaos-engineering-ab0cc9fbd12a)\n* [Disasterpiece Theater: Slack’s process for approachable Chaos Engineering](https://slack.engineering/disasterpiece-theater-slacks-process-for-approachable-chaos-engineering-3434422afb54)\n* [Taming chaos: Preparing for your next incident](https://www.oreilly.com/ideas/taming-chaos-preparing-for-your-next-incident)\n* [The Future of Chaos Engineering w/ Conde Nast](https://www.youtube.com/watch?v=RqM2sMt11Bw)\n* [Chaos Engineering For People Systems w/ Dave Rensin of Google](https://www.youtube.com/watch?v=sn6wokyCZSA)\n* [Performing chaos engineering 
in a serverless world (AWS re:Invent 2019 CMY301)](https://www.youtube.com/watch?v=vbyjpMeYitA)\n* [Building Confidence in Healthcare Systems through Chaos Engineering](https://www.infoq.com/presentations/cerner-resiliency)\n* [Break Your App before Someone Else Does](https://www.infoq.com/presentations/test-android-apk/)\n* [Preparing for Traffic Spikes with Chaos Engineering](https://www.bigmarker.com/gremlin/Preparing-for-Traffic-Spikes-with-Chaos-Engineering)\n* [Automating Chaos Engineering GameDays with Terraform](https://www.youtube.com/watch?v=NOOgKNbW0gk)\n* [Postmortem Culture: Learning from failure](https://www.youtube.com/watch?v=JtLrlDNdJzg\u0026feature=youtu.be)\n* [Problem Detection by John Allspaw](https://www.youtube.com/watch?v=NxctiGRI2y8)\n* [New Paradigms for the Next Era of Security](https://www.rsaconference.com/industry-topics/webcast/35-new-paradigms-for-the-next-era-of-security)\n* [Cloud-Native Chaos Engineering](https://dev.to/umamukkara/chaos-engineering-for-cloud-native-systems-2fjn)\n* [Building resilient services at Prime Video with chaos engineering](https://aws.amazon.com/blogs/opensource/building-resilient-services-at-prime-video-with-chaos-engineering/)\n* [Making Chaos Part of Kubernetes/OpenShift Performance and Scalability Tests](https://www.openshift.com/blog/making-chaos-part-of-kubernetes/openshift-performance-and-scalability-tests)\n* [Lucky Lotto, chaos engineering but for teams](https://danlebrero.com/2021/06/30/cto-dairy-lucky-lotto-chaos-engineering-for-teams/)\n* [Using Fault Injection Testing to Improve DoorDash Reliability](https://doordash.engineering/2022/04/25/using-fault-injection-testing-to-improve-doordash-reliability/)\n* [Chaos Engineering At Ant Group](https://medium.com/@monkeysuzie/chaos-engineering-at-ant-group-30c15cb6ab69)\n\n## Books\n* [Chaos Engineering: Building Confidence in System Behavior through Experiment](http://www.oreilly.com/webops-perf/free/chaos-engineering.csp)\n* [Site Reliability 
Engineering: How Google Runs Production Systems](https://landing.google.com/sre/book.html)\n* [The Practice of Cloud System Administration: Designing and Operating Large Distributed Systems](http://the-cloud-book.com/)\n* [Antifragile Systems and Teams](http://www.oreilly.com/webops-perf/free/antifragile-systems-and-teams.csp)\n* [The InfoQ eMag: Chaos Engineering](https://www.infoq.com/minibooks/emag-chaos-engineering)\n* [Learning Chaos Engineering](http://shop.oreilly.com/product/0636920251897.do)\n* [Chaos Engineering: System Resilience in Practice](https://www.oreilly.com/library/view/chaos-engineering/9781492043850/)\n* [Chaos Engineering: Crash test your applications](https://www.manning.com/books/chaos-engineering)\n* [Security Chaos Engineering: Gaining Confidence in Resilience and Safety at Speed and Scale](https://www.oreilly.com/library/view/security-chaos-engineering/9781492080350/)\n* [Chaos Engineering Observability](https://www.humio.com/resources/reports/chaos-observability/)\n\n## Education\n* A Chaos Engineering Bootcamp for O'Reilly Velocity 2017 - [Slides](https://speakerdeck.com/tammybutow/chaos-engineering-bootcamp) \u0026 [Source code](https://github.com/tammybutow/chaos_engineering_bootcamp)\n* [Your First Chaos Experiment](https://www.gremlin.com/community/tutorials/your-first-chaos-experiment)\n* [Chaos Engineering 101](https://sharpend.io/chaos-engineering-101/)\n* [A Primer on Automating Chaos](https://www.gremlin.com/community/tutorials/a-primer-on-automating-chaos)\n* [Intro to Chaos Engineering](https://www.youtube.com/watch?v=qHykK5pFRW4)\n* [Learn the basics of the Chaos Toolkit](https://www.katacoda.com/chaostoolkit/courses/01-chaostoolkit-getting-started)\n* [Build System Confidence with Chaos Engineering](https://medium.com/chaosiq/improve-your-cloud-native-devops-flow-with-chaos-engineering-dc32836c2d9a)\n* [How we break things at Twitter: failure 
testing](https://blog.twitter.com/engineering/en_us/a/2015/how-we-break-things-at-twitter-failure-testing.html)\n* [Run Chaos Experiments Without Risking Your Job](https://blog.loadmill.com/run-chaos-experiments-without-risking-your-job-2c8a5f4b0bfc)\n* [A Guide to Your First Chaos Day](https://victorops.com/blog/a-guide-to-your-first-chaos-day)\n* [Planning Your Own Chaos Day](https://www.gremlin.com/community/tutorials/planning-your-own-chaos-day/)\n* [How To Install Distributed Tensorflow on GCP and Perform Chaos Engineering Experiments](https://www.gremlin.com/community/tutorials/how-to-install-distributed-tensorflow-on-gcp-and-perform-chaos-engineering-experiments/)\n* [Monitoring Your Chaos Experiments](https://www.brighttalk.com/webcast/15087/316835)\n* [Increasing the Resilience of APIs with Chaos Engineering](https://www.infoq.com/news/2018/05/gremlin-api-chaos)\n* [3 key steps for running chaos engineering experiments](https://www.infoworld.com/article/3268017/devops/3-key-steps-for-running-chaos-engineering-experiments.html)\n* [Exploring Multi-level Weaknesses using Automated Chaos Experiments](https://medium.com/chaosiq/exploring-multi-level-weaknesses-using-automated-chaos-experiments-aa30f0605ce)\n* [Chaos Monkey Guide for Engineers](https://www.gremlin.com/chaos-monkey/)\n* [Chaos Engineering for Serverless](https://www.youtube.com/playlist?list=PL70SCo-0vujiQkPAOGuZP-kNZZkzcPVKD)\n* [Network Fire Drills with Chaos Engineering](https://speakerdeck.com/homingli/network-automation-meetup-network-fire-drills-with-chaos-engineering)\n* [DevOps Foundations: Chaos Engineering](https://www.linkedin.com/learning/devops-foundations-chaos-engineering/)\n* [Resilience Engineering: Short Course](http://csel.org.ohio-state.edu/ResilienceEngineering.html)\n* [The Chaos Engineering Collection](https://medium.com/@adhorn/the-chaos-engineering-collection-5e188d6a90e2)\n* [PenTester Academy](https://www.pentesteracademy.com/onlinelabs)\n* [Consul and Chaos 
Engineering](https://learn.hashicorp.com/tutorials/consul/introduction-chaos-engineering?in=consul/resiliency)\n\n## Notable Tools\n* [Chaos Monkey](https://github.com/Netflix/chaosmonkey) - A resiliency tool that helps applications tolerate random instance failures.\n* [orchestrator](https://github.com/github/orchestrator) - MySQL replication topology management and HA.\n* [kube-monkey](https://github.com/asobti/kube-monkey) - An implementation of Netflix's Chaos Monkey for Kubernetes clusters.\n* [Gremlin Inc.](https://www.gremlin.com/) - Failure as a Service.\n* [Chaos Toolkit](https://github.com/chaostoolkit/chaostoolkit) - A chaos engineering toolkit to help you build confidence in your software system.\n* [steadybit](https://www.steadybit.com/) - A Chaos Engineering platform (SaaS or On-Prem) with auto discovery features, different attack types, user management and many more.\n* [PowerfulSeal](https://github.com/bloomberg/powerfulseal) - Adds chaos to your Kubernetes clusters, so that you can detect problems in your systems as early as possible. It kills targeted pods and takes VMs up and down.\n* [drax](https://github.com/dcos-labs/drax) -  DC/OS Resilience Automated Xenodiagnosis tool. It helps to test DC/OS deployments by applying a Chaos Monkey-inspired, proactive and invasive testing approach.\n* [Wiremock](http://wiremock.org/) - API mocking (Service Virtualization) which enables modeling real world faults and delays\n* [MockLab](http://get.mocklab.io/) - API mocking (Service Virtualization) as a service which enables modeling real world faults and delays.\n* [Pod-Reaper](https://github.com/target/pod-reaper) - A rules based pod killing container. 
Pod-Reaper was designed to kill pods that meet specific conditions that can be used for Chaos testing in Kubernetes.\n* [Muxy](https://github.com/mefellows/muxy/) - A chaos testing tool for simulating real-world distributed system failures.\n* [Toxiproxy](https://github.com/Shopify/toxiproxy) - A TCP proxy to simulate network and system conditions for chaos and resiliency testing.\n* Chaos engineering for Docker:\n  * [Pumba](https://github.com/gaia-adm/pumba) - Chaos testing and network emulation for Docker containers (and clusters).\n  * [Blockade](https://github.com/worstcase/blockade) - Docker-based utility for testing network failures and partitions in distributed applications.\n* [chaos-lambda](https://github.com/bbc/chaos-lambda) - Randomly terminate ASG instances during business hours.\n* [Namazu](https://github.com/osrg/namazu) - Programmable fuzzy scheduler for testing distributed systems.\n* [Chaos Monkey for Spring Boot](https://codecentric.github.io/chaos-monkey-spring-boot/) - Injects latencies, exceptions, and terminations into Spring Boot applications.\n* [Byte-Monkey](https://github.com/mrwilson/byte-monkey) - Bytecode-level fault injection for the JVM. It works by instrumenting application code on the fly to deliberately introduce faults like exceptions and latency.\n* [GomJabbar](https://github.com/outbrain/GomJabbar) - ChaosMonkey for your private cloud.\n* [Turbulence](https://github.com/cppforlife/turbulence-release) - Tool focused on BOSH environments capable of stressing VMs, manipulating network traffic, and more. 
It is very similar to Gremlin.\n* [chaosblade](https://github.com/chaosblade-io/chaosblade) - An Easy to Use and Powerful Chaos Engineering Toolkit.\n* [KubeInvaders](https://github.com/lucky-sideburn/KubeInvaders) - Gamified Chaos engineering tool for Kubernetes Clusters.\n* [Cthulhu](https://github.com/xmatters/cthulhu-chaos-testing) - Chaos Engineering tool that helps evaluate the resiliency of microservice systems by simulating various disaster scenarios against a target infrastructure in a data-driven manner.\n* [VMware Mangle](https://vmware.github.io/mangle/) - Orchestrating Chaos Engineering.\n* [Byteman](https://byteman.jboss.org/) - A Swiss Army Knife for Byte Code Manipulation.\n* [Litmus](https://github.com/litmuschaos/litmus) - Framework for Kubernetes environments that enables users to run test suites, capture logs, generate reports and perform chaos tests.\n* [Perses](https://github.com/nicolasmanic/perses) - A project to cause (controlled) destruction to a JVM application.\n* [ChaosKube](https://github.com/linki/chaoskube) - chaoskube periodically kills random pods in your Kubernetes cluster.\n* [Chaos Mesh](https://github.com/chaos-mesh/chaos-mesh) - Chaos Mesh is a cloud-native Chaos Engineering platform that orchestrates chaos on Kubernetes environments.\n* [failure-lambda](https://github.com/gunnargrosch/failure-lambda) - A small Node module for injecting failure into AWS Lambda using latency, exception, statuscode or diskspace.\n* [aws-chaos-scripts](https://github.com/adhorn/aws-chaos-scripts) - Collection of Python scripts to run failure injection on AWS infrastructure.\n* [chaos-ssm-documents](https://github.com/adhorn/chaos-ssm-documents) - Collection of AWS SSM Documents to perform Chaos Engineering experiments.\n* [aws-lambda-chaos-injection](https://github.com/adhorn/aws-lambda-chaos-injection) - A library injecting chaos into AWS Lambda. 
It offers simple python decorators to do delay, exception and statusCode injection and a Class to add delay to any 3rd party dependencies.\n* [chaos-dingo](https://github.com/jmspring/chaos-dingo) - A tool to mess with Azure services using the Azure NodeJS SDK.\n* [Chaos HTTP Proxy](https://github.com/bouncestorage/chaos-http-proxy) - Introduce failures into HTTP requests via a proxy server\n* [Chaos Lemur](https://github.com/strepsirrhini-army/chaos-lemur) - A self-hostable application to randomly destroy virtual machines in a BOSH-managed environment\n* [Simoorg](https://github.com/linkedin/simoorg) - Linkedin’s very own failure inducer framework.\n* [react-chaos](https://github.com/jchiatt/react-chaos) - A chaos engineering tool for your React apps\n* [vue-chaos](https://github.com/aviadhahami/vue-chaos) - A chaos engineering tool for your Vue apps\n* [Chaos Engine](https://github.com/ThalesGroup/chaos-engine) - tool designed to intermittently destroy or degrade application resources running in cloud based infrastructure. [Documentation](https://thalesgroup.github.io/chaos-engine/)\n* [kubedoom](https://github.com/storax/kubedoom) - Kill Kubernetes pods by playing Id's DOOM.\n* [kubethanos](https://github.com/berkay-dincer/kubethanos) - Kills half of your randomly selected Kubernetes pods.\n* [go-fault](https://github.com/github/go-fault) - Fault injection middleware in Go\n* [Proofdock's Chaos Engineering Platform](https://proofdock.io) - A chaos engineering platform that seamlessly integrates in Azure DevOps and has a focus on the Azure cloud platform.\n* [Pystol](https://www.pystol.org/docs) - Pystol is a fault injection platform allowing users to execute fault injection Actions in cloud-native environments in a controlled and prescribed way.\n* [AWSSSMChaosRunner](https://github.com/amzn/awsssmchaosrunner) - Amazon's light-weight open-source library for chaos engineering on AWS. 
It can be used for [EC2](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/concepts.html), [ECS (with EC2 launch type)](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/getting-started-ecs-ec2.html) and [Fargate](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/getting-started-fargate.html).\n* [Kraken](https://github.com/cloud-bulldozer/kraken) - Chaos and resiliency testing tool for Kubernetes and OpenShift.\n* [kube-burner](https://github.com/cloud-bulldozer/kube-burner) - A tool aimed at stressing Kubernetes clusters by creating or deleting a high quantity of objects.\n* [Chaos Experimentation Framework](https://github.com/lyft/clutch) - An extensible platform for infrastructure management including Chaos Engineering.\n* [NetHavoc](https://www.cavisson.com/nethavoc-resilience-testing-solution/) - A Chaos Engineering Tool for Linux, K8s, Windows, PCF, Cloud, and Containers for injecting Resource, Infrastructure, Network, and Application failures.\n* [gorm-sqlchaos](https://github.com/u2386/gorm-sqlchaos) - A runtime SQL manipulator for your Golang applications based on gorm.\n* [Chaos Frontend Toolkit](https://chaos-frontend-toolkit.web.app/) - A set of tools to apply Chaos Engineering to the frontend.\n* [Mitigant](https://mitigant.io/) - The Continuous Security Verification Platform, which enables confidence in cloud security posture by leveraging security chaos engineering.\n\n## Retired tools\n* [The Simian Army](https://github.com/Netflix/SimianArmy) - A suite of tools for keeping your cloud operating in top form.\n* [ChaoSlingr](https://github.com/Optum/ChaoSlingr) - Introducing Security Chaos Engineering. 
ChaoSlingr focuses primarily on the experimentation on AWS Infrastructure to proactively instrument system security failure through experimentation.\n\n## Cloud Services\n* [Testing Amazon Aurora Using Fault Injection Queries](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/AuroraMySQL.Managing.html#AuroraMySQL.Managing.FaultInjectionQueries)\n* [Azure Chaos Studio](https://aka.ms/azurechaosstudio) - A managed fault injection service for Azure applications. See also [Azure Fault Analysis Service](https://docs.microsoft.com/azure/service-fabric/service-fabric-testability-overview) for Azure Service Fabric applications.\n* [Security Chaos Engineering for Cloud Services](https://medium.com/@run2obtain/from-resilience-to-dependability-security-chaos-engineering-for-cloud-services-9c6d6d152ed2)\n\n## Papers\n* [Maelstrom: Mitigating Datacenter-level Disasters by Draining Interdependent Traffic Safely and Efficiently](https://www.usenix.org/system/files/osdi18-veeraraghavan.pdf)\n* [Simple Testing Can Prevent Most Critical Failures: An Analysis of Production Failures in Distributed Data-Intensive Systems](https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-yuan.pdf)\n* [Automating Failure Testing Research at Internet Scale ](https://people.ucsc.edu/~palvaro/fit-ldfi.pdf)\n* [Principles of Antifragile Software](https://arxiv.org/abs/1404.3056)\n* [Why is random testing effective for partition tolerance bugs?](https://dl.acm.org/citation.cfm?id=3177123.3158134)\n* [Chaos Engineering](https://arxiv.org/abs/1702.05843)\n* [A Platform for Automating Chaos Experiments](https://arxiv.org/abs/1702.05849)\n* [A Chaos Engineering System for Live Analysis and Falsification of Exception-handling in the JVM](https://arxiv.org/abs/1805.05246)\n* [TripleAgent: Monitoring, Perturbation And Failure-obliviousness for Automated Resilience Improvement in Java Applications](https://arxiv.org/abs/1812.10706)\n* [Lineage-driven Fault 
Injection](https://dl.acm.org/citation.cfm?id=2723711)\n* [Antifragility is a Fragile Concept](https://www.linkedin.com/pulse/antifragility-fragile-concept-casey-rosenthal/)\n* [Chaos Engineering Security](https://jaxenter.com/chaos-engineering-security-163358.html)\n* [Security Chaos Engineering: A new paradigm for cybersecurity](https://opensource.com/article/18/1/new-paradigm-cybersecurity)\n* [Security Challenges around Chaos Engineering](https://www.conjur.org/blog/security-challenges-around-chaos-engineering/)\n* [CloudStrike: Security Chaos Engineering for Cloud Services](https://www.researchgate.net/publication/335922038_Security_Chaos_Engineering_for_Cloud_Services)\n* [Observability and Chaos Engineering on System Calls for Containerized Applications in Docker](https://arxiv.org/abs/1907.13039)\n* [Maximizing Error Injection Realism for Chaos Engineering with System Calls](https://arxiv.org/abs/2006.04444)\n* [Chaos Engineering of Ethereum Blockchain Clients](https://arxiv.org/abs/2111.00221)\n\n## Gamedays\n* [Target: What is a Gameday?](https://tech.target.com/2019/05/09/chaos-engineering-at-Target.html) - Chaos Gamedays experience by Target.\n* [Codecentric: Chaos Engineering Gamedays](https://blog.codecentric.de/en/2018/08/chaos-engineering-gameday/) - Chaos Gamedays by Codecentric.\n* [New Relic: How to run a Gameday?](https://blog.newrelic.com/engineering/how-to-run-a-game-day/) - Chaos Gamedays experience by New Relic.\n* [Dius: Gamedays resources](https://dius.com.au/resources/game-day/) - Resources for getting started with GameDay and Chaos Engineering.\n* [Gremlin: Gamedays](https://www.gremlin.com/gameday/) - Resources for getting started with GameDay and Chaos Engineering.\n* [Gremlin: What is a Chaos Day?](https://www.gremlin.com/community/tutorials/planning-your-own-chaos-day/#what-is-a-chaos-day) - What is a Gameday according to Gremlin.\n* [Gremlin: Why run a Chaos 
Day?](https://www.gremlin.com/community/tutorials/planning-your-own-chaos-day/#why-run-a-chaos-day) - Reasons to run Gamedays according to Gremlin.\n* [Gremlin: How to run a Gameday?](https://www.gremlin.com/community/tutorials/how-to-run-a-gameday/) - Methodology to run Gamedays according to Gremlin.\n* [Gremlin DB: Breaking Dynamo DB](https://www.gremlin.com/community/tutorials/gremlin-gameday-breaking-dynamodb/) - Example of a Gameday with DynamoDB by Gremlin.\n* [Gremlin: Introduction to Gameday](https://www.gremlin.com/community/tutorials/introduction-to-gamedays/) - What is a Gameday according to Gremlin.\n* [Gremlin: Planning your own Chaos Day](https://www.gremlin.com/community/tutorials/planning-your-own-chaos-day/) - Example of a Gameday with DynamoDB by Gremlin.\n* [Gremlin: Inside Gremlin 2019 Gremlin Gamedays Roadmap](https://www.gremlin.com/community/tutorials/inside-gremlin-2019-gremlin-gamedays-roadmap/) - Chaos Gamedays experience by Gremlin.\n* [Gremlin: What I learned running the Chaos Lab with Kafka](https://www.gremlin.com/community/tutorials/what-i-learned-running-the-chaos-lab-kafka-breaks/) - Example of a Gameday with Kafka by Gremlin.\n* [Chaos Toolkit: Chaos Engineering with Humans in the loop](https://medium.com/chaos-toolkit/chaos-engineering-with-humans-in-the-loop-f4854900b1eb) - Article about Chaos Gamedays.\n* [GoCardless: All fun and games until you start with Gamedays](https://gocardless.com/blog/game-days-at-gc/) - Article about Chaos Gamedays.\n* [InfoQ: Gamedays - Achieving Resilience through Chaos Engineering](https://www.infoq.com/presentations/gameday-chaos-engineering) - InfoQ Presentation with experiences about Chaos Gamedays.\n\n## Blogs \u0026 Newsletters\n* [Netflix Technology Blog](https://medium.com/@NetflixTechBlog) - Learn more about how Netflix designs, builds, and operates our systems and engineering organizations.\n* [Production Ready](https://tinyletter.com/production-ready) - A mailing list about building resilient 
infrastructure and tools.\n* [SRE Weekly](https://sreweekly.com/) - Weekly Site Reliability Newsletter.\n* [Site Reliability Engineering resources](https://github.com/dastergon/awesome-sre) - A curated list of awesome Site Reliability and Production Engineering resources.\n* [SysAdvent](https://sysadvent.blogspot.com) - One article for each day of December, ending on the 25th article.\n* [Gremlin Blog](https://blog.gremlininc.com) - Blogs on Chaos Engineering from Gremlin Inc.\n* [O’Reilly Systems Engineering and Operations Newsletter](http://www.oreilly.com/webops-perf/newsletter.html) - Weekly systems engineering and operations news and insights from industry insiders.\n* [LaunchDarkly Blog](http://blog.launchdarkly.com/) - Continuous delivery and feature flags blog.\n* [Verica](https://www.verica.io/) - Chaos engineering, security chaos engineering and continuous verification.\n* [Proofdock](https://medium.com/proofdock) - Reliability, resilience and chaos engineering with a focus on MS Azure.\n* [LitmusChaos Blog](https://dev.to/t/litmuschaos/latest) - Blogs on Chaos Engineering from LitmusChaos.\n* [ChaosEngineering.news](https://chaosengineering.news/) - Chaos Engineering newsletter. All things chaos engineering, directly to your inbox!\n* [Chaos Mesh Blog](https://chaos-mesh.org/blog) - Blogs on Chaos Engineering from Chaos Mesh.\n* [Chaos Experimentation Framework](https://eng.lyft.com/chaos-experimentation-an-open-source-framework-built-on-top-of-envoy-proxy-df87519ed681) - Chaos Experimentation, an open-source framework built on top of Envoy Proxy.\n* [Squadcast](https://squadcast.com/blog) - Blog on Site Reliability Engineering.\n* [steadybit Blog](https://www.steadybit.com/blog) - Blogs on Chaos Engineering, Resilience, SRE and OPS from steadybit.\n\n## Podcasts\n* [Break Things On Purpose](https://podcasts.apple.com/us/podcast/break-things-on-purpose/id1460542551) - Monthly podcast about Chaos Engineering presented by Gremlin Inc. 
Also available on Spotify, Google Play, and Stitcher.\n\n## Conferences \u0026 Meetups\n* [Chaos Carnival](https://chaoscarnival.io/) - A global two-day virtual conference for Cloud Native Chaos Engineering.\n* [Chaos Conf](https://chaosconf.splashthat.com/) - A day of Chaos Engineering demos and expert advice, with a chance to connect with peers putting chaos into practice at their companies.\n* [SRECon Conferences](https://www.usenix.org/conferences/byname/925) - The official SRE conference.\n* [LISA Conferences](https://www.usenix.org/conferences/byname/5) - Prominent conference about SysAdmin/DevOps/SRE.\n* [O'Reilly Velocity Conference](https://conferences.oreilly.com/velocity/) - Prominent conference about Systems Engineering/DevOps/SRE.\n* [Chaos Engineering Community Meetup Group](https://www.meetup.com/Chaos-Engineering-Community/) - Bay Area Meetup group for Chaos Engineers.\n* [London Chaos Engineering Community](https://www.meetup.com/London-Chaos-Engineering-Community/) - London Area Meetup group for Chaos Engineers.\n* [Stockholm Chaos Engineering Meetup](https://www.meetup.com/Stockholm-Chaos-Engineering-Community/) - Stockholm Meetup group for Chaos Engineers.\n* [Chaos Engineering Community](https://www.meetup.com/pro/chaos/) - A collection of meetups across the globe about Chaos Engineering.\n* [Conf42.com: Chaos Engineering](https://conf42.com) - Chaos Engineering for practitioners and adopters - London UK, 23 Jan 2020.\n* [Kubernetes Chaos Engineering Meetup Group India](https://www.meetup.com/Kubernetes-Chaos-Engineering-Meetup-Group-India/) - India Meetup group for Chaos Engineers.\n\n## Forums\n* [Chaos Community Google Group](https://groups.google.com/forum/#!forum/chaos-community)\n* [Chaos Engineering LinkedIn Group](https://www.linkedin.com/groups/7057761)\n* [Chaos Engineering Slack Community](https://gremlin.com/community)\n* [CNCF Chaos Engineering Working Group](https://groups.google.com/forum/#!forum/chaoseng-wg)\n* CNCF Chaos Engineering 
Working Group Slack: #chaosengineering (slack.cncf.io)\n* [CNCF Chaos Engineering Working Group GitHub](https://github.com/chaoseng/wg-chaoseng)\n* [Chaos Toolkit Slack Community](https://join.chaostoolkit.org)\n* [Litmus Chaos Engineering Slack Community](https://slack.litmuschaos.io/)\n\n## Contributing\n\nPlease take a look at the [contribution guidelines](CONTRIBUTING.md) first. Contributions are always welcome!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdastergon%2Fawesome-chaos-engineering","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdastergon%2Fawesome-chaos-engineering","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdastergon%2Fawesome-chaos-engineering/lists"}