https://github.com/postsuite/railway-derailer
💥 [Platform] Derailer is a tool that helps test for how services handle random dependency failures, using Railway.
chaos-engineering chaos-monkey java observability quarkus railway-app
- Host: GitHub
- URL: https://github.com/postsuite/railway-derailer
- Owner: PostSuite
- License: mit
- Created: 2025-02-04T19:13:23.000Z (about 24 hours ago)
- Default Branch: dev
- Last Pushed: 2025-02-04T21:09:57.000Z (about 22 hours ago)
- Last Synced: 2025-02-04T21:27:24.060Z (about 22 hours ago)
- Topics: chaos-engineering, chaos-monkey, java, observability, quarkus, railway-app
- Language: Java
- Homepage: https://derailer.postsuite.xyz
- Size: 9.77 KB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Railway Chaos Derailer
![Github Repository Header](https://github.com/user-attachments/assets/0af361a4-2a0a-4362-800b-36d19299ee9d)
---
Derailer is a tool that helps test how services handle random dependency failures
on [Railway](https://railway.com). Inspired by [Netflix's Chaos Monkey](https://github.com/Netflix/chaosmonkey).

Derailer randomly removes instances of running services, or removes the service entirely, allowing you to build more
resilient services by finding missed opportunities to handle failures gracefully.

> [!CAUTION]
> You should avoid deploying this onto your customer-facing environment, as it may have visible impact. We suggest
> sticking to staging!

**If you've read the above disclaimer and understand it, click to deploy:**
[![Deploy on Railway](https://railway.com/button.svg)](https://railway.com/template/WRXzro?referralCode=A6ij-A)
---
## ▶️ Derailment Process
Every so often (every `FREQUENCY_MINUTES`), the Derailer lists your Railway services and filters them to ones that
aren't in the configured blacklist.

Using this list, it chooses a number of your services to 'Derail' (set by `BLAST_RADIUS`). This is done by **Aborting
Deployments**: it removes the currently active deployment of each chosen service.

After the defined `DURATION_MINUTES`, it re-deploys any affected services.
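
The selection step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class name `DerailmentSelection`, the method `chooseTargets`, and the use of plain service ID strings are all assumptions made for the example, and the real implementation may pick targets differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical sketch of the "filter, then pick BLAST_RADIUS services" step.
final class DerailmentSelection {

    // Returns up to blastRadius service IDs, chosen at random from the
    // services that are not in the blacklist.
    static List<String> chooseTargets(List<String> serviceIds, Set<String> blacklist,
                                      int blastRadius, Random random) {
        List<String> candidates = new ArrayList<>();
        for (String id : serviceIds) {
            if (!blacklist.contains(id)) {
                candidates.add(id);
            }
        }
        // Shuffle so that taking the first N entries is a uniform random pick.
        Collections.shuffle(candidates, random);
        int count = Math.min(Math.max(blastRadius, 0), candidates.size());
        return new ArrayList<>(candidates.subList(0, count));
    }

    public static void main(String[] args) {
        // Example service IDs are made up for illustration.
        List<String> services = List.of("api", "worker", "web", "derailer");
        Set<String> blacklist = Set.of("derailer");
        System.out.println(chooseTargets(services, blacklist, 2, new Random()));
    }
}
```

If fewer services are eligible than `BLAST_RADIUS` asks for, this sketch simply derails all eligible services rather than failing.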
**The derailed services are stored in a persistent volume (on SQLite), meaning re-deploying this service will restore
any affected services when it comes online!**

---
## 👀 User Interface
> [!WARNING]
> This user interface is not password protected. We suggest
> using [a Cloudflare tunnel](https://railway.com/template/cf-tunnel), or implementing your own authentication in a
> forked version.

Derailer comes with a (very primitive) user interface for controlling how the derailer behaves.

**Features:**
- Viewing an active derailment and what services it's impacting
- Rolling back an active derailment if it's causing a headache
- Viewing when the last derailment happened
- Viewing when the next derailment will occur
- Viewing a history of which services were impacted for each past derailment
- Delaying derailments for a certain time period (to allow for some rest!)

---
## 🤖 REST API
> [!WARNING]
> This API is not password protected. We suggest
> using [a Cloudflare tunnel](https://railway.com/template/cf-tunnel), or implementing your own authentication in a
> forked version.

For ease of use, all the features mentioned in our user interface are supported within a REST API. Check out the
[API documentation](docs/openapi/openapi.json).

---
## 🧩 Configuration Options
### Environment Variables
| Variable | Description | Default Value |
|-------------------|-----------------------------------------------------------------------------------------|-----------------|
| RAILWAY_API_KEY | Your Railway API key, required to list your services and fetch your running deployments | |
| FREQUENCY_MINUTES | How often (in minutes) to run the chaos, cannot be less than `DURATION_MINUTES` | 10 |
| DURATION_MINUTES | How long (in minutes) to keep services removed / deployments aborted | 60 |
| BLAST_RADIUS | How many services to impact in each run of the Derailer | 2 |

The default configuration also uses `RAILWAY_SERVICE_ID` (automatically added by Railway) to blacklist the Derailer
application from being Derailed.

### Blacklisting services
By default, the only service automatically blacklisted is the Derailer application. However, you may want to blacklist
some UIs, or essential services (think carefully about how reliable they are!) to avoid everything falling on its face.

This can be configured in `src/main/resources/application.properties`.
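
To make the defaults concrete, here is a rough sketch of how the environment variables from the table and the self-blacklisting via `RAILWAY_SERVICE_ID` could fit together. It is an assumption-laden illustration: the `DerailerSettings` class, `fromEnv`, and `intOrDefault` are hypothetical names, and the project itself most likely relies on Quarkus configuration in `application.properties` rather than reading a map by hand.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical settings holder; not the project's actual configuration class.
final class DerailerSettings {
    final int frequencyMinutes;
    final int durationMinutes;
    final int blastRadius;
    final Set<String> blacklist;

    private DerailerSettings(int frequencyMinutes, int durationMinutes,
                             int blastRadius, Set<String> blacklist) {
        this.frequencyMinutes = frequencyMinutes;
        this.durationMinutes = durationMinutes;
        this.blastRadius = blastRadius;
        this.blacklist = blacklist;
    }

    // Builds settings from an environment-style map, using the defaults
    // listed in the table (10, 60 and 2). The table's constraint between
    // FREQUENCY_MINUTES and DURATION_MINUTES is not checked here.
    static DerailerSettings fromEnv(Map<String, String> env) {
        String apiKey = env.get("RAILWAY_API_KEY");
        if (apiKey == null || apiKey.isBlank()) {
            throw new IllegalStateException("RAILWAY_API_KEY is required");
        }
        Set<String> blacklist = new HashSet<>();
        String ownServiceId = env.get("RAILWAY_SERVICE_ID");
        if (ownServiceId != null && !ownServiceId.isBlank()) {
            // The Derailer blacklists itself so it is never derailed.
            blacklist.add(ownServiceId);
        }
        return new DerailerSettings(
                intOrDefault(env, "FREQUENCY_MINUTES", 10),
                intOrDefault(env, "DURATION_MINUTES", 60),
                intOrDefault(env, "BLAST_RADIUS", 2),
                blacklist);
    }

    // Parses an integer variable, falling back to the default when unset.
    static int intOrDefault(Map<String, String> env, String name, int defaultValue) {
        String raw = env.get(name);
        if (raw == null || raw.isBlank()) {
            return defaultValue;
        }
        return Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        // Values below are placeholders for illustration only.
        DerailerSettings settings = fromEnv(Map.of(
                "RAILWAY_API_KEY", "example-key",
                "RAILWAY_SERVICE_ID", "derailer-service-id",
                "BLAST_RADIUS", "3"));
        System.out.println(settings.blastRadius + " services per run, blacklist=" + settings.blacklist);
    }
}
```

Any extra services you want to protect would be added to the same blacklist set, however that list ends up being configured in your fork.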
---
## 💻 Running Locally
Running locally allows you to quickly test your changes using the Quarkus framework:

1. Copy `.env-example` into a file called `.env` and fill in all the environment variables
2. Run `./gradlew quarkusDev` (or the equivalent task from the Gradle Quarkus menu) to run in dev mode

---
## 🚀 Testing
To run tests, execute:
```shell
./gradlew test
```