https://github.com/aashahin/bunbreaker

Bun-native circuit breaker with built-in retry, abort-aware fetch, error mapping, and diagnostics. Zero dependencies — Redis + SQLite + Memory tiered storage, Bun.cron health probes, ElysiaJS & Hono adapters.
breaker bun bunjs circuit-breaker elysia elysiajs hono honojs redis sqlite

# bunbreaker

> Bun-native circuit breaker with built-in retry, abort-aware fetch, error mapping, and diagnostics. Zero dependencies — Redis + SQLite + Memory tiered storage, Bun.cron health probes, ElysiaJS & Hono adapters.

**Minimum runtime: Bun >= 1.3.12**

## Features

- **Circuit Breaker** — CLOSED → OPEN → HALF_OPEN state machine with configurable thresholds
- **Capacity Limiter** — Per-breaker concurrent execution semaphore with dedicated `CapacityExceededError`
- **Built-in Retry** — Exponential backoff with jitter, per-error retryability, total time budgets
- **Abort-Aware Fetch** — `fetchWithBreaker()` cancels TCP connections on timeout via `AbortController`
- **Dual Threshold Modes** — Absolute failure count or percentage-based (like Opossum's `errorThresholdPercentage`)
- **Error Classification** — Built-in classifier for HTTP status, network errors, timeouts. Fully overridable
- **Error Mapping** — Transform `CircuitOpenError` → your domain errors before they leave the breaker
- **Three-Tier Storage** — Redis (primary) → SQLite (fallback + audit) → Memory (last resort)
- **Auto-Failover** — Redis meta-breaker detects failures, switches to SQLite, and startup failures retry in the background
- **Sliding Window** — True sliding window via Redis sorted sets + atomic Lua scripts
- **Health Probes** — `Bun.cron` in-process scheduler probes OPEN circuits automatically
- **Fallback Queue** — SQLite-backed bounded outbox replays events when services recover
- **Diagnostics** — Per-breaker stats + aggregate snapshot for health endpoints
- **Alert Adapters** — Resend, Telegram, Webhook (pure functions, zero coupling)
- **Framework Adapters** — ElysiaJS and Hono (optional, thin wrappers)
- **Disposable** — `await using cb = await createBreaker(...)` with `Symbol.asyncDispose`
- **Zero npm dependencies** — Built entirely on Bun primitives

## Quick Start

```ts
import { createBreaker, telegramAlert } from "bunbreaker";

const cb = await createBreaker({
  redisUrl: process.env.REDIS_URL,
  sqlite: { path: "./bunbreaker.db" },
});

// Create a named circuit breaker
const paymentBreaker = cb.for("payment-api", {
  failureThreshold: 5,
  windowSecs: 60,
  resetTimeoutSecs: 30,
  timeoutMs: 8000,
  fallback: async () => ({ status: "queued" }),
});

// Execute a protected call
const result = await paymentBreaker.execute(
  () =>
    fetch("https://payments.example.com/charge", {
      method: "POST",
      body: JSON.stringify(payload),
    }),
  payload, // optional: enqueued when the circuit is OPEN
);

// Subscribe to events
cb.events
  .on("opened", telegramAlert(process.env.TG_TOKEN!, process.env.TG_CHAT!))
  .on("closed", (e) => console.log(`${e.name} recovered`))
  .on("*", (e) => metrics.increment(`breaker.${e.type}`));

// Health status
const health = cb.health();
// → { currentLayer: "redis", redis: { open: false, failures: 0, recoversAt: null } }

// Diagnostics
const snap = await cb.diagnostics();
// → { summary: { openBreakers: 0, totalRequests: 42, ... }, breakers: [...] }

// Graceful shutdown
await cb.shutdown();
```

## Retry

Built-in retry with exponential backoff, jitter, and a hard total time budget. Each retry attempt gets its own timeout, capped by any remaining `maxRetryTimeMs` budget.

`execute()` uses a promise race for timeouts, so it cannot cancel the underlying work. The caller is released at `timeoutMs`, but the original promise may still run in the background. For production mutations, prefer `executeWithAbort()` or `fetchWithBreaker()`. To avoid duplicate side effects, `execute()` does not retry `BreakerTimeoutError` by default; provide `retry.shouldRetry` if you explicitly want that behavior for idempotent work.

```ts
const breaker = cb.for("flaky-api", {
  failureThreshold: 5,
  windowSecs: 60,
  resetTimeoutSecs: 30,
  timeoutMs: 3000,
  retry: {
    retries: 3,
    factor: 2, // exponential backoff factor (default: 2)
    minTimeoutMs: 250, // minimum delay between retries
    maxTimeoutMs: 5000, // maximum delay between retries
    maxRetryTimeMs: 10000, // hard total wall-clock budget for all retries
    shouldRetry: (err) => {
      // Override per-error retryability (default: uses error classifier)
      return !(err instanceof PaymentError);
    },
    onRetry: (err, attempt, retriesLeft) => {
      logger.warn(`Retry ${attempt}, ${retriesLeft} left`, err);
    },
  },
});

// Only the FINAL error (after all retries) counts toward the breaker
const result = await breaker.execute(() => callFlakyService());
```
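
How `factor`, `minTimeoutMs`, `maxTimeoutMs`, and `maxRetryTimeMs` interact can be pictured with a small standalone sketch. This is not bunbreaker's internal code: `backoffDelayMs`, `withFullJitter`, and `planRetries` are hypothetical helpers, the "full jitter" variant is an assumption (the README does not say which jitter strategy is used), and time spent inside each attempt is ignored for simplicity.

```ts
// Illustrative only: hypothetical helpers, not part of the bunbreaker API.
interface BackoffOptions {
  retries: number;
  factor: number; // exponential growth per retry
  minTimeoutMs: number; // delay before the first retry
  maxTimeoutMs: number; // cap applied to every delay
  maxRetryTimeMs: number; // total budget across all delays
}

// Un-jittered delay before retry number `attempt` (1-based).
function backoffDelayMs(attempt: number, opts: BackoffOptions): number {
  const raw = opts.minTimeoutMs * Math.pow(opts.factor, attempt - 1);
  return Math.min(raw, opts.maxTimeoutMs);
}

// Assumed "full jitter": pick a random whole-millisecond delay in [0, baseMs].
function withFullJitter(baseMs: number, random: () => number = Math.random): number {
  return Math.floor(random() * (baseMs + 1));
}

// Delays that fit into the total budget, ignoring time spent in the attempts.
function planRetries(opts: BackoffOptions): number[] {
  const delays: number[] = [];
  let spent = 0;
  for (let attempt = 1; attempt <= opts.retries; attempt++) {
    const delay = backoffDelayMs(attempt, opts);
    if (spent + delay > opts.maxRetryTimeMs) break; // budget exhausted
    delays.push(delay);
    spent += delay;
  }
  return delays;
}

const retryPlan = planRetries({
  retries: 3,
  factor: 2,
  minTimeoutMs: 250,
  maxTimeoutMs: 5000,
  maxRetryTimeMs: 10000,
});
console.log(retryPlan); // delays in ms: 250, 500, 1000
```

With the configuration from the example above, the un-jittered delays stay well inside the 10-second budget; lowering `maxRetryTimeMs` to 1000 in this sketch would leave room for only the first two.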

## Abort-Aware Fetch

`fetchWithBreaker()` creates a per-attempt `AbortController` that actually cancels the TCP connection on timeout — unlike `execute(() => fetch(...))` which just races the promise. Caller-provided abort signals are treated as caller cancellation, so they do not count as upstream failures or trigger retries.

```ts
import { fetchWithBreaker } from "bunbreaker";

const breaker = cb.for("external-api", {
  failureThreshold: 5,
  windowSecs: 60,
  resetTimeoutSecs: 30,
  timeoutMs: 5000,
});

// Basic usage: abort on timeout, classify 5xx responses automatically
const response = await fetchWithBreaker(breaker, "https://api.example.com/data");

// With per-fetch retry (independent from breaker's retry config)
// Named differently from `response` above so both calls can share one scope
const postResponse = await fetchWithBreaker(
  breaker,
  "https://api.example.com/data",
  { method: "POST", body: JSON.stringify(data) },
  {
    timeoutMs: 3000, // override breaker's timeout for this call
    retry: { retries: 2, minTimeoutMs: 100 },
  },
);
```

You can also use `executeWithAbort()` directly for non-fetch workloads that support cancellation:

```ts
const result = await breaker.executeWithAbort(async (signal) => {
  const response = await fetch("https://api.example.com/stream", { signal });
  return response.json();
});
```
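
The distinction between a timeout abort and a caller abort can be sketched in isolation. The helper below is hypothetical (`createAttemptScope` and `AbortCause` are not bunbreaker exports); it only illustrates one way a per-attempt `AbortController` could be linked to a caller-provided signal while remembering which side cancelled first.

```ts
// Illustrative only: a hypothetical helper, not bunbreaker's implementation.
type AbortCause = "caller" | "timeout" | "none";

interface AttemptScope {
  signal: AbortSignal; // pass this to fetch() or other cancellable work
  cause(): AbortCause; // who aborted first, if anyone
  dispose(): void; // clear the timer and detach the caller listener
}

function createAttemptScope(timeoutMs: number, callerSignal?: AbortSignal): AttemptScope {
  const controller = new AbortController();
  let cause: AbortCause = "none";

  // Only the first abort wins, so the recorded cause stays stable.
  const abortAs = (c: AbortCause): void => {
    if (cause === "none") {
      cause = c;
      controller.abort();
    }
  };

  const onCallerAbort = (): void => abortAs("caller");
  if (callerSignal?.aborted) {
    onCallerAbort();
  } else {
    callerSignal?.addEventListener("abort", onCallerAbort, { once: true });
  }

  const timer = setTimeout(() => abortAs("timeout"), timeoutMs);

  return {
    signal: controller.signal,
    cause: () => cause,
    dispose: () => {
      clearTimeout(timer);
      callerSignal?.removeEventListener("abort", onCallerAbort);
    },
  };
}

// Usage: the caller cancels before the 5-second attempt timeout fires.
const callerController = new AbortController();
const attempt = createAttemptScope(5000, callerController.signal);
callerController.abort();
console.log(attempt.signal.aborted, attempt.cause()); // true "caller"
attempt.dispose();
```

A wrapper built this way could treat `cause() === "timeout"` as an upstream failure and `cause() === "caller"` as a neutral cancellation, which matches the behavior described for caller-provided signals.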

## Percentage-Based Thresholding

Instead of a fixed failure count, trip the circuit when the error rate exceeds a percentage. Set `volumeThreshold` to require a minimum request volume before the rate is evaluated, which prevents false positives under low traffic.

```ts
const breaker = cb.for("high-traffic-api", {
  percentageThreshold: 50, // trip at 50% error rate
  volumeThreshold: 20, // need at least 20 requests before evaluating
  windowSecs: 60,
  resetTimeoutSecs: 30,
  timeoutMs: 5000,
});
```

> **Note**: Use `failureThreshold` OR `percentageThreshold`, not both.
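
The decision rule can be written down as a small pure function. This is a sketch rather than the library's code: `shouldOpen` and `WindowCounts` are hypothetical names, and treating the threshold as inclusive (trip at exactly 50%) is an assumption based on the "trip at 50% error rate" comment above.

```ts
// Illustrative only: hypothetical names, not part of the bunbreaker API.
interface WindowCounts {
  total: number; // requests observed in the current window
  failures: number; // counted failures in the current window
}

interface PercentageConfig {
  percentageThreshold: number; // 0-100
  volumeThreshold: number; // minimum requests before evaluating
}

function shouldOpen(counts: WindowCounts, cfg: PercentageConfig): boolean {
  // Too little traffic: a couple of failures must not trip the circuit.
  if (counts.total === 0 || counts.total < cfg.volumeThreshold) return false;
  const errorRate = (counts.failures / counts.total) * 100;
  return errorRate >= cfg.percentageThreshold; // assumption: inclusive comparison
}

const cfg: PercentageConfig = { percentageThreshold: 50, volumeThreshold: 20 };
console.log(shouldOpen({ total: 3, failures: 2 }, cfg)); // false: below volume threshold
console.log(shouldOpen({ total: 20, failures: 10 }, cfg)); // true: 50% of 20 requests
```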

## Capacity Limiter

Limit the number of concurrent in-flight executions per breaker. When the limit is reached, new requests are rejected immediately (via fallback or `CapacityExceededError`) — even if the circuit is CLOSED.

This prevents overwhelming a slow or degraded upstream service with unbounded concurrency.

```ts
const breaker = cb.for("payment-api", {
  failureThreshold: 5,
  windowSecs: 60,
  resetTimeoutSecs: 30,
  timeoutMs: 8000,
  capacity: 40, // max 40 concurrent requests
});

// If 40 requests are already in-flight, this rejects immediately
const result = await breaker.execute(() => paymentService.charge(body));
```

Capacity rejections emit `capacity_rejected` and do not enqueue payloads, because the upstream circuit is not OPEN. Timed-out calls release capacity at the breaker timeout, even if non-abortable work continues in the background.
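
Conceptually, the limiter is a non-blocking semaphore: a slot is either available right now or the call is rejected. A minimal sketch of that idea follows; `CapacityLimiter`, `CapacitySketchError`, and `runLimited` are hypothetical names, and the sketch does not model the timeout-based release described above.

```ts
// Illustrative only: a hypothetical non-blocking semaphore, not bunbreaker's code.
class CapacityLimiter {
  private inFlight = 0;
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Returns false immediately instead of waiting for a free slot.
  tryAcquire(): boolean {
    if (this.inFlight >= this.capacity) return false;
    this.inFlight++;
    return true;
  }

  release(): void {
    if (this.inFlight > 0) this.inFlight--;
  }

  get active(): number {
    return this.inFlight;
  }
}

class CapacitySketchError extends Error {}

// Runs fn only if a slot is free; the slot is returned on success or failure.
async function runLimited<T>(limiter: CapacityLimiter, fn: () => Promise<T>): Promise<T> {
  if (!limiter.tryAcquire()) throw new CapacitySketchError("capacity reached");
  try {
    return await fn();
  } finally {
    limiter.release();
  }
}

const limiter = new CapacityLimiter(2);
console.log(limiter.tryAcquire(), limiter.tryAcquire(), limiter.tryAcquire()); // true true false
```

Rejecting instead of queueing keeps latency bounded for the caller: work that cannot start immediately fails fast and can go to a fallback.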

## Enabled Kill-Switch

Disable a circuit breaker at runtime without removing it. When `enabled` is `false`, all calls pass straight through to the wrapped function with no circuit breaker logic — no state checks, no failure counting, no timeout racing. Stats are still tracked for observability.

```ts
const breaker = cb.for("payment-api", {
  failureThreshold: 5,
  windowSecs: 60,
  resetTimeoutSecs: 30,
  timeoutMs: 8000,
  enabled: process.env.PAYMENT_BREAKER_ENABLED !== "false", // runtime kill-switch
});

breaker.setEnabled(false); // runtime toggle for an existing breaker
breaker.setEnabled(true);
```

## Error Classification

The built-in classifier decides which errors count toward the threshold and which are retryable:

| Error Type | Counts? | Retries? | Trips? |
|------------|---------|----------|--------|
| 5xx Server | ✅ | ✅ | ✅ |
| 429 Rate Limited | ✅ | ✅ | ✅ |
| Other 4xx Client | ❌ | ❌ | n/a |
| Network failure | ✅ | ✅ | ✅ |
| Timeout | ✅ | ✅ | ✅ |
| Validation/Business | ❌ | ❌ | n/a |

### Custom Error Classifier

```ts
const breaker = cb.for("service", {
  failureThreshold: 5,
  windowSecs: 60,
  resetTimeoutSecs: 30,
  timeoutMs: 5000,
  errorClassifier: (err) => ({
    shouldCount: true, // count toward failure threshold
    shouldRetry: true, // eligible for retry
    shouldTrip: false, // count for health metrics, but don't trip the circuit
  }),
});
```

The `shouldTrip` field lets you separate "count for health metrics" from "trigger OPEN". For example, you might want to track 429s in failure stats but not trip the circuit for rate limiting.
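
The 429 case could look roughly like the classifier below. It is a sketch under stated assumptions: the error shape (an object carrying a numeric `status`) and the names `hasStatus` and `classifyWithSoft429` are hypothetical, and only the three returned fields come from the example above.

```ts
// Illustrative only: assumes upstream errors expose a numeric `status` field.
interface Classification {
  shouldCount: boolean;
  shouldRetry: boolean;
  shouldTrip: boolean;
}

function hasStatus(err: unknown): err is { status: number } {
  return (
    typeof err === "object" &&
    err !== null &&
    typeof (err as { status?: unknown }).status === "number"
  );
}

function classifyWithSoft429(err: unknown): Classification {
  if (hasStatus(err)) {
    if (err.status === 429) {
      // Track rate limiting in stats and retry it, but never open the circuit for it.
      return { shouldCount: true, shouldRetry: true, shouldTrip: false };
    }
    if (err.status >= 500) {
      return { shouldCount: true, shouldRetry: true, shouldTrip: true };
    }
    // Remaining statuses are treated as caller mistakes, not upstream failures.
    return { shouldCount: false, shouldRetry: false, shouldTrip: false };
  }
  // No status at all (network failure, timeout): treat as an upstream failure.
  return { shouldCount: true, shouldRetry: true, shouldTrip: true };
}

console.log(classifyWithSoft429({ status: 429 }).shouldTrip); // false
```

A function of this shape could be passed as `errorClassifier`, provided the errors thrown by the wrapped call actually carry a status.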

## Error Mapping

Map breaker errors to your application's domain errors before they leave the library:

```ts
import { CircuitOpenError, BreakerTimeoutError } from "bunbreaker";

const breaker = cb.for("payment-api", {
  failureThreshold: 5,
  windowSecs: 60,
  resetTimeoutSecs: 30,
  timeoutMs: 5000,
  errorMapper: (err, ctx) => {
    if (err instanceof CircuitOpenError) {
      return new ThirdPartyCircuitOpenError(ctx.name);
    }
    if (err instanceof BreakerTimeoutError) {
      return new ThirdPartyTimeoutError(ctx.name);
    }
    return err instanceof Error ? err : new Error(String(err));
  },
});
```

The `ctx` parameter includes `{ name, state, config }` for context-aware mapping.

## Diagnostics

Get runtime stats for all registered breakers:

```ts
const snapshot = await cb.diagnostics();
// {
//   generatedAt: "2024-01-15T10:30:00.000Z",
//   storeHealth: { currentLayer: "redis", ... },
//   summary: {
//     registeredBreakers: 3,
//     openBreakers: 1,
//     halfOpenBreakers: 0,
//     closedBreakers: 2,
//     totalRequests: 1542,
//     totalFailures: 23,
//     totalTimeouts: 5,
//     totalRejects: 12,
//   },
//   breakers: [
//     {
//       name: "payment-api",
//       state: "CLOSED",
//       config: { ... },
//       stats: {
//         createdAt: 1705312200000,
//         useCount: 500,
//         successCount: 487,
//         failureCount: 13,
//         rejectCount: 0,
//         timeoutCount: 3,
//         retryCount: 8,
//         lastUsedAt: 1705312500000,
//         lastOpenedAt: 1705312100000,
//         lastClosedAt: 1705312150000,
//       },
//     },
//     ...
//   ],
// }
```

Per-breaker stats are also available directly:

```ts
const stats = breaker.getStats();
```

## Production Behavior

### Redis startup recovery

If Redis initialization fails during `createBreaker()`, bunbreaker starts with SQLite/Memory and retries Redis initialization every `redisReconnectIntervalMs` milliseconds. Once Redis connects, it becomes the active distributed store again. Set `redisReconnectIntervalMs: 0` to disable this startup retry loop.

Use `redisKeyPrefix` when multiple applications, environments, or tenants share one Redis deployment:

```ts
const cb = await createBreaker({
  redisUrl: process.env.REDIS_URL,
  redisKeyPrefix: "prod:checkout",
});
```

### SQLite fallback

If the configured SQLite path cannot be opened, bunbreaker throws during `createBreaker()` by default so production deployments do not silently lose queue and audit durability. For tests or emergency degraded mode, opt in explicitly:

```ts
const cb = await createBreaker({
  sqlite: {
    path: "/var/lib/app/bunbreaker.db",
    allowInMemoryFallback: true,
  },
});
```

### Health probe gating

Registering a probe gates recovery by default. An OPEN circuit with a registered probe stays OPEN after `resetTimeoutSecs` until the probe succeeds, then moves to HALF_OPEN for the next real trial request.

```ts
cb.probe("payment-api", {
  url: "https://payments.example.com/health",
  timeoutMs: 1000,
  gateHalfOpen: true, // default
});
```

Set `gateHalfOpen: false` if you want probes for observability only and prefer timer-based OPEN -> HALF_OPEN recovery.
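
The gating rule can be summarized as a pure function over an OPEN circuit. The sketch below uses hypothetical names (`OpenCircuit`, `stateAfterOpen`) and assumes that both conditions must hold before a gated circuit leaves OPEN: the reset timer has elapsed and the most recent probe succeeded. The README does not specify what happens when a probe succeeds before the timer elapses, so that ordering is an assumption.

```ts
// Illustrative only: hypothetical model of OPEN → HALF_OPEN gating.
type CircuitState = "CLOSED" | "OPEN" | "HALF_OPEN";

interface OpenCircuit {
  openedAtMs: number; // when the circuit entered OPEN
  resetTimeoutSecs: number;
  hasProbe: boolean; // a probe is registered for this breaker
  gateHalfOpen: boolean; // probe option, true by default
  lastProbeSucceeded: boolean; // outcome of the most recent probe since opening
}

function stateAfterOpen(c: OpenCircuit, nowMs: number): CircuitState {
  const timerElapsed = nowMs - c.openedAtMs >= c.resetTimeoutSecs * 1000;
  if (!timerElapsed) return "OPEN";

  // With gating, the timer alone is not enough: a probe must have succeeded.
  const gated = c.hasProbe && c.gateHalfOpen;
  if (gated && !c.lastProbeSucceeded) return "OPEN";

  return "HALF_OPEN"; // the next real request acts as the trial call
}

const gatedCircuit: OpenCircuit = {
  openedAtMs: 0,
  resetTimeoutSecs: 30,
  hasProbe: true,
  gateHalfOpen: true,
  lastProbeSucceeded: false,
};
console.log(stateAfterOpen(gatedCircuit, 31_000)); // "OPEN": timer elapsed, probe still failing
console.log(stateAfterOpen({ ...gatedCircuit, lastProbeSucceeded: true }, 31_000)); // "HALF_OPEN"
```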

### Queue bounds

Fallback queue writes are bounded by pending count, serialized payload size, and pending TTL. Queue write failures emit `queue_error` but do not block the fallback response.

Replay handlers have a per-event timeout (`handlerTimeoutMs`, default 30s) so one stuck handler cannot hold the replay lock forever. Set it to `0` only when your handler already has its own hard timeout.
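
The count and size bounds amount to a check before each queue write. The sketch below is an assumption-laden illustration: `checkEnqueue` and the `reason` strings are hypothetical, `0` is interpreted as "no limit", payloads are measured as UTF-8 JSON, and the pending TTL purge is left out.

```ts
// Illustrative only: hypothetical pre-write check, not bunbreaker's queue code.
interface QueueLimits {
  queueMaxPending: number; // assumed: 0 means no limit
  queueMaxPayloadBytes: number; // assumed: 0 means no limit
}

type EnqueueDecision =
  | { ok: true }
  | { ok: false; reason: "too_many_pending" | "payload_too_large" };

function checkEnqueue(
  payload: unknown,
  pendingCount: number,
  limits: QueueLimits,
): EnqueueDecision {
  if (limits.queueMaxPending > 0 && pendingCount >= limits.queueMaxPending) {
    return { ok: false, reason: "too_many_pending" };
  }
  // Measure the serialized form, since that is what would be stored.
  const bytes = new TextEncoder().encode(JSON.stringify(payload)).length;
  if (limits.queueMaxPayloadBytes > 0 && bytes > limits.queueMaxPayloadBytes) {
    return { ok: false, reason: "payload_too_large" };
  }
  return { ok: true };
}

const queueLimits: QueueLimits = { queueMaxPending: 2, queueMaxPayloadBytes: 10 };
console.log(checkEnqueue({ a: 1 }, 0, queueLimits)); // accepted: 7 bytes, queue empty
console.log(checkEnqueue({ a: 1 }, 2, queueLimits)); // rejected: too_many_pending
```

In the flow described above, a rejected write would surface as a `queue_error` event while the fallback response is still returned to the caller.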

## Framework Adapters

By default, framework adapters inject only the breaker helper. Health and diagnostics routes are opt-in because diagnostics can expose breaker names and configuration. Enable them only behind internal/admin routing.

### ElysiaJS

```ts
import { Elysia } from "elysia";
import { createBreaker } from "bunbreaker";
import { elysiaBreaker } from "bunbreaker/elysia";

const cb = await createBreaker({ redisUrl: process.env.REDIS_URL });

const app = new Elysia()
  .use(
    elysiaBreaker(cb, {
      healthRoutes: {
        healthPath: "/internal/health/circuits",
        diagnosticsPath: "/internal/health/circuits/diagnostics",
      },
    }),
  )
  .post("/checkout", async ({ breaker, body }) => {
    return await breaker
      .for("payment-api", {
        failureThreshold: 5,
        windowSecs: 60,
        resetTimeoutSecs: 30,
        timeoutMs: 8000,
      })
      .execute(() => paymentService.charge(body));
  });
```

### Hono

```ts
import { Hono } from "hono";
import { createBreaker } from "bunbreaker";
import { honoBreaker } from "bunbreaker/hono";

const cb = await createBreaker({ redisUrl: process.env.REDIS_URL });
const app = new Hono();

app.use(
  "*",
  honoBreaker(cb, {
    healthRoutes: {
      healthPath: "/internal/health/circuits",
      diagnosticsPath: "/internal/health/circuits/diagnostics",
    },
  }),
);

app.get("/resource", async (c) => {
  const result = await c.var.breaker
    .for("upstream", {
      failureThreshold: 5,
      windowSecs: 60,
      resetTimeoutSecs: 30,
      timeoutMs: 5000,
    })
    .execute(() => fetchUpstream());
  return c.json(result);
});
```

### Bun.serve (Standalone)

```ts
import { createBreaker } from "bunbreaker";

await using cb = await createBreaker({
  sqlite: { path: "./bunbreaker.db" },
});

const apiBreaker = cb.for("external-api", {
  failureThreshold: 3,
  windowSecs: 30,
  resetTimeoutSecs: 15,
  timeoutMs: 5000,
});

Bun.serve({
  async fetch(req) {
    const url = new URL(req.url);

    if (url.pathname === "/health") {
      return Response.json(cb.health());
    }

    if (url.pathname === "/diagnostics") {
      return Response.json(await cb.diagnostics());
    }

    if (url.pathname === "/api/data") {
      try {
        const data = await apiBreaker.execute(() =>
          fetch("https://api.example.com/data").then((r) => r.json()),
        );
        return Response.json(data);
      } catch (err) {
        return Response.json({ error: "Service unavailable" }, { status: 503 });
      }
    }

    return new Response("Not found", { status: 404 });
  },
  port: 3000,
});
```

## Configuration

### `createBreaker(config)`

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `redisUrl` | `string?` | — | Redis connection URL. Omit for SQLite/Memory only |
| `redisKeyPrefix` | `string` | `bunbreaker` | Redis key/channel namespace for shared Redis deployments |
| `redisReconnectIntervalMs` | `number` | `30000` | Retry Redis startup initialization after failure. Set `0` to disable |
| `sqlite.path` | `string` | `./bunbreaker.db` | SQLite database file path |
| `sqlite.allowInMemoryFallback` | `boolean` | `false` | Continue with non-durable in-memory SQLite if the configured path fails |
| `sqlite.auditRetentionSecs` | `number` | `2592000` (30d) | Retain audit transition events. Set `0` to retain forever |
| `sqlite.deliveredRetentionSecs` | `number` | `604800` (7d) | Retain delivered events. Set `0` to retain forever |
| `sqlite.deadRetentionSecs` | `number` | `2592000` (30d) | Retain dead and stale pending events. Set `0` to retain forever |
| `sqlite.autoPurge` | `boolean` | `true` | Auto-purge old events |
| `sqlite.purgeSchedule` | `string` | `0 3 * * *` | Purge cron (UTC) |
| `probeSchedule` | `string` | `* * * * *` | Health probe cron (UTC) |
| `memoryCacheTtlMs` | `number` | `7000` | Memory cache TTL in ms |

### `.for(name, config)` — Breaker Config

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `failureThreshold` | `number?` | — | Absolute failure count to trigger OPEN |
| `percentageThreshold` | `number?` | — | Error % (0–100) to trigger OPEN |
| `volumeThreshold` | `number?` | — | Minimum requests before percentage check |
| `windowSecs` | `number` | — | Sliding window duration in seconds |
| `resetTimeoutSecs` | `number` | — | Seconds in OPEN before HALF_OPEN |
| `timeoutMs` | `number` | — | Max ms to wait for fn() |
| `capacity` | `number?` | — | Max concurrent in-flight executions |
| `enabled` | `boolean?` | `true` | Set `false` to bypass all breaker logic |
| `retry` | `RetryConfig?` | — | Retry configuration (see below) |
| `errorMapper` | `ErrorMapper?` | — | Map errors to domain types |
| `errorClassifier` | `function?` | — | Override default classification |
| `fallback` | `function?` | — | Called when OPEN instead of throwing |
| `queueOnOpen` | `boolean?` | `true` | Enqueue payloads when OPEN |
| `queueMaxPending` | `number` | `10000` | Max pending queued payloads per breaker. Set `0` to disable |
| `queueMaxPayloadBytes` | `number` | `262144` | Max serialized queued payload size. Set `0` to disable |
| `queuePendingTtlSecs` | `number` | `604800` | Purge stale pending rows before enqueue. Set `0` to disable |
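
As a rough picture of how `failureThreshold` and `windowSecs` interact, the sketch below keeps failure timestamps in memory and counts only those inside the window. It mirrors the idea of a sorted set pruned by timestamp, but it is a hypothetical single-process illustration (`SlidingWindowCounter` is not a bunbreaker export) and says nothing about the Redis Lua scripts the library actually uses.

```ts
// Illustrative only: in-memory sliding window, not the Redis-backed implementation.
class SlidingWindowCounter {
  private timestamps: number[] = [];
  private readonly windowMs: number;

  constructor(windowSecs: number) {
    this.windowMs = windowSecs * 1000;
  }

  // Records a failure and returns how many failures are inside the window.
  recordFailure(nowMs: number): number {
    this.prune(nowMs);
    this.timestamps.push(nowMs);
    return this.timestamps.length;
  }

  count(nowMs: number): number {
    this.prune(nowMs);
    return this.timestamps.length;
  }

  // Drop entries that are a full window old or older.
  private prune(nowMs: number): void {
    const cutoff = nowMs - this.windowMs;
    this.timestamps = this.timestamps.filter((t) => t > cutoff);
  }
}

const failureThreshold = 3;
const failureWindow = new SlidingWindowCounter(60);
failureWindow.recordFailure(0);
failureWindow.recordFailure(10_000);
const failuresInWindow = failureWindow.recordFailure(20_000);
console.log(failuresInWindow >= failureThreshold); // true: three failures within 60 seconds
```

Because old entries age out individually, a burst that happened just before a fixed-bucket boundary still counts, which is the practical difference from a tumbling window.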

### `RetryConfig`

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `retries` | `number` | — | Number of retry attempts |
| `factor` | `number` | `2` | Exponential backoff factor |
| `minTimeoutMs` | `number` | `250` | Minimum delay between retries |
| `maxTimeoutMs` | `number` | `5000` | Maximum delay between retries |
| `maxRetryTimeMs` | `number` | `Infinity` | Hard total wall-clock budget |
| `shouldRetry` | `function?` | — | Override per-error retryability |
| `onRetry` | `function?` | — | Called on each retry attempt |

### `ProbeConfig`

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `url` | `string` | — | GET endpoint to probe |
| `expectedStatus` | `number` | `200` | Status required for a successful probe |
| `timeoutMs` | `number` | `3000` | Probe request timeout |
| `gateHalfOpen` | `boolean` | `true` | Require probe success before OPEN -> HALF_OPEN |

### `ReplayerConfig`

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `batchSize` | `number` | `50` | Events to process per batch |
| `batchDelayMs` | `number` | `1000` | Delay between batches |
| `maxJitterMs` | `number` | `5000` | Random startup delay before replay |
| `lockTtlSecs` | `number` | `30` | Distributed replay lock TTL |
| `handlerTimeoutMs` | `number` | `30000` | Per-event handler timeout. Set `0` to disable |

## Architecture

```
┌─────────────────────────────────────────────────────┐
│                 BunbreakerInstance                  │
│  .for()  .events  .health()  .diagnostics()         │
├─────────────────────────────────────────────────────┤
│                   CircuitBreaker                    │
│  execute() → raceWithTimeout → classify → threshold │
│  executeWithAbort() → AbortController → classify    │
│  retry integration → only final error counts        │
├─────────────────────────────────────────────────────┤
│                    StoreManager                     │
│  Redis → SQLite → Memory (fallback)                 │
│  Meta-breaker on Redis itself                       │
├──────────┬──────────────┬───────────────────────────┤
│RedisStore│ SQLiteStore  │ MemoryStore               │
│Sorted set│ WAL mode     │ Map + TTL                 │
│Lua atomic│ Audit log    │ Last resort               │
│Pub/Sub   │ Event queue  │                           │
└──────────┴──────────────┴───────────────────────────┘
```

## API Reference

### `CircuitBreaker`

| Method | Description |
|--------|-------------|
| `execute(fn, payload?)` | Execute with timeout race + optional retry |
| `executeWithAbort(fn, payload?)` | Execute with `AbortSignal` on timeout |
| `executeSelfTimed(fn, payload?)` | Execute without timeout (caller manages timeout) |
| `getState()` | Get current circuit state |
| `getStats()` | Get diagnostics stats snapshot |
| `setEnabled(enabled)` | Toggle breaker logic at runtime |

### `BunbreakerInstance`

| Method | Description |
|--------|-------------|
| `for(name, config)` | Create or retrieve a named breaker |
| `events` | Typed event emitter |
| `probe(name, config)` | Register a health probe |
| `health()` | Get store health status |
| `queue` | Access the local event queue |
| `replayer(config?)` | Create an event replayer |
| `diagnostics()` | Get full diagnostics snapshot |
| `maintenance()` | Manual SQLite VACUUM |
| `shutdown()` | Graceful shutdown |

### Standalone Functions

| Function | Description |
|----------|-------------|
| `fetchWithBreaker(breaker, input, init?, options?)` | Abort-aware fetch with circuit breaker |
| `executeWithRetry(fn, ctx)` | Pure retry engine (no breaker dependency) |
| `classifyError(err)` | Default error classifier |

### Alert Adapters

Alert adapters default to a 5000ms network timeout and swallow/log delivery failures so alerts never crash protected application code.

```ts
cb.events
  .on("opened", telegramAlert(token, chatId, { timeoutMs: 3000 }))
  .on("opened", webhookAlert(url, { Authorization: "Bearer ..." }, { timeoutMs: 3000 }))
  .on("opened", resendAlert(apiKey, ["ops@example.com"], "alerts@example.com", { timeoutMs: 3000 }));
```

## Events

```ts
cb.events
  .on("opened", (e) => { /* e.name, e.failures, e.ts */ })
  .on("closed", (e) => { /* e.name, e.ts */ })
  .on("half_open", (e) => { /* e.name, e.ts */ })
  .on("rejected", (e) => { /* e.name, e.ts */ })
  .on("capacity_rejected", (e) => { /* e.name, e.capacity, e.ts */ })
  .on("fallback", (e) => { /* e.name, e.ts */ })
  .on("ignored_error", (e) => { /* e.name, e.reason, e.ts */ })
  .on("queue_error", (e) => { /* e.name, e.reason, e.ts */ })
  .on("queue_purge_warning", (e) => { /* e.name, e.deadCount, e.ts */ })
  .on("*", (e) => { /* wildcard: all events */ });
```

## License

MIT