Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/aurora-is-near/borealis-engine-supervisor

Supervisor for borealis-engine
https://github.com/aurora-is-near/borealis-engine-supervisor

Last synced: 10 days ago
JSON representation

Supervisor for borealis-engine

Awesome Lists containing this project

README

        

# borealis-engine-supervisor

Supervisor starts a borealis-engine instance as a subprocess and monitors it through a Prometheus server.
If the metric variable at Prometheus hasn't increased by an expected amount in the timeframe, a signal is sent to the subprocess to reset it.
In case the borealis-engine instance still isn't producing output after state reset, another signal is sent to it to terminate it.

## Running

Configure the supervisor through env variables.
For example, to start a supervisor that fetches the `engine_last_block_height_processed` metric from `http://127.0.0.1:8041` every 10 seconds and sends SIGTERM signal to the subprocess if the metric data hasn't grown by at least one, do:

```
$ export SUPERVISOR_PROMURL='http://127.0.0.1:8041'
$ export SUPERVISOR_METRIC=engine_last_block_height_processed
$ export SUPERVISOR_WARMUPDURATION=15
$ export SUPERVISOR_CHECKDURATION=10
$ export SUPERVISOR_METRICDELTA=1
$ export SUPERVISOR_FAILSIGNAL=9
$ export SUPERVISOR_HANGSIGNAL=15
$ borealis-engine-supervisor /path/to/borealis-engine [ENGINE-ARGS...]
```

All env variables of supervisor are also passed to borealis-engine.

## Configuration

| Env variable | Meaning |
|-----------------------------|-------------------------------------------------------------------------------------------|
| `SUPERVISOR_PROMURL` | Address of the prometheus metrics exporter (http://127.0.0.1:8041) |
| `SUPERVISOR_METRIC` | Name of the metric to test. |
| `SUPERVISOR_WARMUPDURATION` | Seconds of warmup period after starting subprocess or state reset. |
| `SUPERVISOR_CHECKDURATION` | Seconds between metric checks. |
| `SUPERVISOR_METRICDELTA` | Expected metric delta between checks. |
| `SUPERVISOR_FAILSIGNAL` | Signal to send if no increment of metric can be detected between first and second check. |
| `SUPERVISOR_HANGSIGNAL` | Signal to send if no increment of metric can be detected between i and i+1 check (i > 1). |