https://github.com/eumemic/agentfx

A typed effect system whose production interpreter is a durable distributed runtime (iii). One program, two interpreters, differentially tested.
Host: GitHub
URL: https://github.com/eumemic/agentfx
Owner: eumemic
License: mit
Created: 2026-06-06T02:22:23.000Z (29 days ago)
Default Branch: main
Last Pushed: 2026-06-06T05:00:29.000Z (29 days ago)
Last Synced: 2026-06-06T06:12:45.724Z (29 days ago)
Topics: agents, at-least-once, distributed-systems, durable-execution, effect-system, functional-programming, idempotency, iii, interpreter, type-safety, typescript
Language: TypeScript
Size: 63.5 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# agentfx

**A small typed effect system whose production interpreter is a durable distributed runtime.**

You write one typed program from a tiny algebra. A *reference* interpreter (`runMemory`) runs
it in-process, which is great for tests. A *distributing* interpreter (`runDist`) runs the same
program on [iii](https://iii.dev): an at-least-once durable queue with atomic state, so it
survives process death. The two are differentially tested.

```mermaid
flowchart TD
  prog["one typed program: the Effect algebra"]
  prog --> walk["walk · one reified tree, one interpreter core"]
  walk -->|runMemory| mem["in-process promise pool · fast · for tests"]
  walk -->|runDist| dist["iii backend"]
  dist --> q["durable queue · at-least-once · DLQ · survives kill -9"]
  dist --> st["atomic state · idempotent claim-once"]
  mem -.->|"differentially tested: identical results"| dist
```

The thesis in a line: **a typed effect algebra on top, a durable runtime underneath, and the
boundary between "elegant" and "durable" is the interpreter**. The payoff is a fan-out that
survives a `kill -9`, which a pure in-memory `Observable` never could.

> **Status: a working spike**, not production. The example "LLM" is a `setTimeout` and tasks are
> `toUpperCase`/`length`; the point is the control plane and the lowering, not inference. See
> **Limits** for exactly what's real.

---

## Two type laws (enforced by the compiler)

Checked by `tsc` in [`src/laws.ts`](src/laws.ts) via `@ts-expect-error`: if a guard stopped
firing, the directive would go unused and the build would fail.

**1. You cannot `retry` a non-idempotent effect** (the cold-resubscribe / double-charge footgun
is a *compile error*). Only `task(...)` mints the `Replayable` brand; `flatMap` strips it.

```ts
retry(charge.effect({ amt: 10 }), 3); // ✓ a task is Replayable
retry(succeed(5), 3);                 // ✗ compile error: not Replayable
```
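One way such a brand can be encoded is with a phantom `unique symbol` property that only one constructor is allowed to attach. The sketch below is a minimal illustration under that assumption; the `Effect` shape and the function bodies are hypothetical stand-ins that reuse the README's names, not agentfx's actual definitions.

```ts
// Hypothetical sketch: names mirror the README, the definitions are illustrative only.
declare const replayable: unique symbol;

interface Effect<A> {
  readonly run: () => A;
}

// A phantom brand: it exists only at the type level and has no runtime footprint.
type Replayable<A> = Effect<A> & { readonly [replayable]: true };

const succeed = <A>(value: A): Effect<A> => ({ run: () => value });

// Only `task` mints the brand; the cast is confined to this constructor.
const task = <A>(impl: () => A): Replayable<A> => ({ run: impl }) as Replayable<A>;

// `flatMap` returns a plain Effect, so the brand does not survive sequencing.
const flatMap = <A, B>(e: Effect<A>, f: (a: A) => Effect<B>): Effect<B> => ({
  run: () => f(e.run()).run(),
});

// `retry` accepts only branded effects and re-runs them up to `attempts` times.
const retry = <A>(e: Replayable<A>, attempts: number): Effect<A> => ({
  run: () => {
    let lastError: unknown;
    for (let i = 0; i < attempts; i++) {
      try {
        return e.run();
      } catch (err) {
        lastError = err;
      }
    }
    throw lastError;
  },
});

let calls = 0;
const flaky = task(() => {
  calls += 1;
  if (calls < 3) throw new Error("transient");
  return "ok";
});
const result = retry(flaky, 3).run(); // the first two attempts throw, the third returns

// @ts-expect-error a plain effect carries no Replayable brand
retry(succeed(5), 3);
// @ts-expect-error flatMap returned a plain Effect, so the brand is gone
retry(flatMap(flaky, (s) => succeed(s.length)), 3);
```

The two rejected calls only construct values and never run, so the file stays executable while the compiler still checks that the guards fire.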

**2. You cannot run an effect without supplying its capabilities** `R`; `provide` rejects keys
that aren't required.

```ts
runMemory(prog, { llm, db }); // ✓
runMemory(prog, { llm });     // ✗ compile error: missing `db`
```
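A rough sketch of how a requirement parameter `R` can force the caller to supply every capability. It assumes the simplest possible encoding, an effect as a function of its environment, which is not how agentfx represents effects (the README describes a reified tree). `Eff`, `Llm`, `Db`, `andThen`, and `run` are hypothetical names.

```ts
// Hypothetical encoding for illustration: an effect is a function of its environment.
type Eff<R, A> = (env: R) => A;

interface Llm { complete(prompt: string): string }
interface Db { save(row: string): void }

const askLlm = (prompt: string): Eff<{ llm: Llm }, string> =>
  (env) => env.llm.complete(prompt);

const saveRow = (row: string): Eff<{ db: Db }, void> =>
  (env) => env.db.save(row);

// Sequencing combines the requirements of both steps into one environment type.
const andThen = <R1, A, R2, B>(e: Eff<R1, A>, f: (a: A) => Eff<R2, B>): Eff<R1 & R2, B> =>
  (env) => f(e(env))(env);

// Running demands an environment that covers everything the program requires.
const run = <R, A>(e: Eff<R, A>, env: R): A => e(env);

const saved: string[] = [];
const llm: Llm = { complete: (p) => p.toUpperCase() };
const db: Db = { save: (row) => { saved.push(row); } };

const prog = andThen(askLlm("hi"), (answer) => saveRow(answer));
run(prog, { llm, db }); // accepted: both capabilities are present

// @ts-expect-error `db` is missing, so this value cannot serve as the environment for `prog`
const incomplete: Parameters<typeof prog>[0] = { llm };
```

The rejected line is written as a typed assignment rather than a call so that nothing with a missing capability ever executes.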

## One program, two interpreters

```ts
import { flatMap, forEachTask, map, runMemory, runDist, makeIIIBackend } from "agentfx";

// `items`, `upper`, and `lengthOf` are assumed to be defined elsewhere:
// an input array and two registered tasks.
const program = flatMap(
  forEachTask(items, upper, 3),                              // type-safe distributable fan-out
  (uppers) => map(forEachTask(uppers, lengthOf, 2),
                  (lens) => ({ uppers, totalLen: lens.reduce((a, b) => a + b, 0) })),
);

await runMemory(program, {});                  // in-process promise pool: instant, for tests
await runDist(program, {}, makeIIIBackend());  // iii: durable queue, crash-surviving
```

[`ex-differential.ts`](src/ex-differential.ts) runs both interpreters on the same program. The
happy path is byte-identical, and a throwing task surfaces a typed `fail()` under **both**, with
no hang and no unhandled rejection. The error *shapes* differ by design: `runMemory` returns the
raw cause, `runDist` an aggregated batch error. [`ex-durability.ts`](src/ex-durability.ts)
survives an executor `kill -9` mid-batch.
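The idea of differential testing can be sketched on a toy algebra: two interpreters walk the same reified tree, and a check compares their outcomes. Everything here (`Eff`, `runDirect`, `runQueued`, `agree`) is hypothetical and far smaller than agentfx's algebra; the second interpreter uses an explicit work queue only to stand in for a queue-backed runtime.

```ts
// Hypothetical mini-algebra: a reified tree of numeric effects.
type Eff =
  | { tag: "Succeed"; value: number }
  | { tag: "Fail"; error: string }
  | { tag: "Sum"; children: Eff[] };

type Outcome = { ok: true; value: number } | { ok: false; error: string };

// Interpreter 1: direct recursion, depth-first.
function runDirect(e: Eff): Outcome {
  switch (e.tag) {
    case "Succeed":
      return { ok: true, value: e.value };
    case "Fail":
      return { ok: false, error: e.error };
    case "Sum": {
      let total = 0;
      for (const child of e.children) {
        const r = runDirect(child);
        if (!r.ok) return r;
        total += r.value;
      }
      return { ok: true, value: total };
    }
  }
}

// Interpreter 2: an explicit work queue, breadth-first.
function runQueued(e: Eff): Outcome {
  const pending: Eff[] = [e];
  let total = 0;
  for (;;) {
    const next = pending.shift();
    if (next === undefined) return { ok: true, value: total };
    if (next.tag === "Succeed") total += next.value;
    else if (next.tag === "Fail") return { ok: false, error: next.error };
    else pending.push(...next.children);
  }
}

// The differential check: successes must match exactly; failures must both fail,
// but which error surfaces first is allowed to differ between interpreters.
function agree(e: Eff): boolean {
  const a = runDirect(e);
  const b = runQueued(e);
  if (a.ok && b.ok) return a.value === b.value;
  return !a.ok && !b.ok;
}

const happy: Eff = {
  tag: "Sum",
  children: [
    { tag: "Succeed", value: 1 },
    { tag: "Sum", children: [{ tag: "Succeed", value: 2 }, { tag: "Succeed", value: 3 }] },
  ],
};
const failing: Eff = {
  tag: "Sum",
  children: [{ tag: "Succeed", value: 1 }, { tag: "Fail", error: "boom" }],
};
```

Comparing only the success flag on the failure path mirrors the distinction drawn above between identical results and differing error shapes.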

## The algebra

| combinator | meaning |
|---|---|
| `succeed` / `failWith` | pure success / typed failure |
| `flatMap` / `map` | sequence (unions `R` and `E`; strips `Replayable`) |
| `catchAll(e, h)` | recover from a typed failure with another effect |
| `retry(e, n)` | re-run; **requires `Replayable`** |
| `provide(e, layer)` | discharge capabilities from `R` |
| `task(fnId, keyOf, impl)` | a distributable, idempotent unit; `impl` gets a `TaskCtx.idempotencyKey` |
| `forEachTask(items, task, n)` | type-safe distributable fan-out (children guaranteed `Remote`) |
| `forEachPar(items, f, n)` | generic fan-out; **`runDist` requires `task` children** (closures don't serialize → it throws) |

## Typing the wire (optional): `schemaTask`

The cross-worker boundary is stringly-typed by default. `schemaTask` closes that gap with one
[zod](https://zod.dev) schema that becomes three things at once:

- **static types**: `In`/`Out` are inferred via `z.infer`, so callers are compile-time checked
- **a published JSON Schema**: sent to the engine as `request_format`/`response_format`; read it
  back with `iii trigger engine::functions::info --json '{"function_id":"agentfx::greet"}'`
  (discoverable by the console and LLM tool-use)
- **runtime validation**: a bad payload is rejected at the executor and rides the `fail()`
  channel as a `ZodError`

```ts
import { z } from "zod";
import { schemaTask } from "agentfx"; // assumed to be exported alongside the other combinators

const greet = schemaTask(
  "agentfx::greet",
  { input: z.object({ name: z.string().min(1), times: z.number().int().min(1).max(5) }),
    output: z.string() },
  async ({ name, times }) => Array.from({ length: times }, () => `hi ${name}`).join(" "),
);

greet.effect({ name: "ada", times: 3 });       // ✓ typed
greet.effect({ name: "ada", times: "lots" });  // ✗ compile error (inferred from the schema)
```

`npm run schema` runs it live. (Caveat: the engine stores the schema under `request_schema` in
`functions::info`; the `iii trigger  --help` view doesn't surface it yet, which is a CLI display
gap.)

A contract violation is just a typed failure, so `catchAll` recovers it into a branch:
`catchAll(parseAmount.effect(s), () => succeed(0))` turns a bad parse into a fallback,
identically under both interpreters (`npm run catch`). The recovered value matches across
backends; the raw error *shape* differs by design (structured `ZodError` in-process, a message
string over the wire).
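The recover-into-a-branch behaviour can be sketched with plain result values. This is a simplified stand-in under stated assumptions: agentfx's `catchAll` works on effects, whereas the `Result` type, the hand-written `parseAmount` check, and the helper bodies below are hypothetical and involve no zod schema.

```ts
// Hypothetical stand-ins: a typed failure is a value, not a thrown exception.
type Result<E, A> = { ok: true; value: A } | { ok: false; error: E };

const succeed = <A>(value: A): Result<never, A> => ({ ok: true, value });
const failWith = <E>(error: E): Result<E, never> => ({ ok: false, error });

// A contract check that reports a violation on the failure channel.
const parseAmount = (s: string): Result<string, number> => {
  const n = Number(s);
  return s.trim() !== "" && Number.isFinite(n) ? succeed(n) : failWith(`not a number: ${s}`);
};

// Recovery: a success passes through, a failure is replaced by the handler's result.
const catchAll = <E, A, B>(
  r: Result<E, A>,
  handler: (error: E) => Result<never, B>,
): Result<never, A | B> => (r.ok ? r : handler(r.error));

const good = catchAll(parseAmount("12.5"), () => succeed(0));   // keeps the parsed value
const bad = catchAll(parseAmount("twelve"), () => succeed(0));  // falls back to 0
```

After `catchAll`, the error type is `never`, so the type system records that the failure has been handled.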

## Types from other workers' contracts (any language)

`schemaTask` types the surface *you* author. But you also call functions implemented by *other*
workers, maybe in Python or Rust. Those workers declare their own contracts
(`request_format`/`response_format`), which the engine stores. `npm run gen` walks the live
registry and writes `src/contracts.generated.ts`, a typed `Contracts` map of every declared
function, regardless of language. `remote(fnId)` is statically typed from it:

```ts
// pymath::add is implemented in PYTHON (pymath_worker.py); its contract is declared there.
// `be` is assumed to be a backend created elsewhere, e.g. with makeIIIBackend().
const r = await runDist(remote("pymath::add")({ a: 2, b: 3 }), {}, be);
//        remote("pymath::add") : (input: { a: number; b: number }) => Effect<…, { sum: number }>
//        r.value is typed { sum: number }, derived from the Python worker, not hand-written.

remote("pymath::add")({ a: 1, b: "two" }); // ✗ compile error (b: number, from the Python contract)
```

The engine is the IDL: a worker's contract flows to each consumer in the consumer's preferred
typed language. That's the polyglot-substrate property made concrete: you work in TypeScript and
still call anything in any language, type-safe. `npm run gen` regenerates against whatever's
running. (`remote()` calls run under `runDist` only, since there's no in-process impl for
another worker's function.)

## A real agent (Claude + a polyglot tool)

The point of all the machinery: a real LLM agent whose tools are iii workers, in any language,
with durable, preemptible turn semantics. `ex-agent.ts` runs Claude (`claude-opus-4-8`) in a ReAct
loop whose `add` tool is the **Python** `pymath::add` worker, invoked through the typed `remote()`
client:

```
Claude → tool call `add` → remote("pymath::add") → runDist → iii engine → Python worker → result
```

```bash
npm run agent          # "what is 21 + 21, then add 100?" → calls the Python tool twice → 142
npm run preempt-agent  # a new message mid-turn preempts the in-flight turn
```

`ex-preempt-agent.ts` answers the question this repo kept circling: **does preemption compose with
a real, multi-step LLM turn?** A new user message arrives mid-turn; the in-flight inference is
aborted (soft: this saves tokens) and every tool call is gated on the epoch fence (hard). Result:
only the latest message's tool fires; the superseded turn is fenced out. The fence sits at the tool
boundary, so it holds regardless of how long or variable the real inference is.
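The fencing rule can be sketched with an in-memory counter. This is only an illustration of the gating logic: `EpochFence` and `callTool` are hypothetical names, and a single-process counter stands in for the atomic state operation that the real fence relies on.

```ts
// Hypothetical in-memory stand-in for an atomically updated epoch in a state store.
class EpochFence {
  private current = 0;

  // A new message bumps the epoch and returns the value its turn must carry.
  begin(): number {
    this.current += 1;
    return this.current;
  }

  // A tool call is admitted only if its turn's epoch is still the latest one.
  admits(epoch: number): boolean {
    return epoch === this.current;
  }
}

// Gate at the tool boundary: a superseded turn never reaches the tool.
function callTool(fence: EpochFence, epoch: number, tool: () => string, fired: string[]): boolean {
  if (!fence.admits(epoch)) return false;
  fired.push(tool());
  return true;
}

const fence = new EpochFence();
const fired: string[] = [];

const firstTurn = fence.begin();   // first message starts a turn
const secondTurn = fence.begin();  // a new message arrives before the first turn's tool runs

const stale = callTool(fence, firstTurn, () => "first", fired);    // fenced out
const fresh = callTool(fence, secondTurn, () => "second", fired);  // admitted
```

In a distributed setting the check and the tool dispatch would have to be tied to an atomic state operation; this sketch sidesteps that by running in one thread.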

Needs `ANTHROPIC_API_KEY` (and optional `ANTHROPIC_BASE_URL`) plus the `pymath` worker running
(`python pymath_worker.py`). The model call is non-streaming with adaptive thinking and an
`AbortSignal` for preemption.

## A durable agent loop (aios-style): the loop IS the queue

`ex-agent.ts` above runs the ReAct loop **in the driver**, as a normal `for`-loop in one process.
Kill that process mid-turn and the turn is gone. [`src/harness/`](src/harness/) re-expresses it the
way [aios](https://github.com/eumemic/aios) does: **there is no loop.** An append-only event log is
the source of truth, a step function is re-entered by durable wake jobs, and a driver crash mid-turn
resumes from the log. The agent's reasoning loop itself is durable.

| aios concept | here, on agentfx + iii |
|---|---|
| append-only session event log | per-event keys in file-backed `iii-state` ([`log.ts`](src/harness/log.ts)); monotonic seq via atomic `state::update increment` |
| the "loop" is a job queue re-entering a step | `wake` jobs on the iii durable queue → `agentfx::harness::step` ([`worker.ts`](src/harness/worker.ts), [`step.ts`](src/harness/step.ts)) |
| `reacting_to` watermark; status derived from the log | [`sweep.ts`](src/harness/sweep.ts): `needsInference` = ∃ stimulus with `seq >` the max assistant `reacting_to`. No status column. |
| every tool is async; the model stays responsive | tools are fire-and-forget durable-queue jobs ([`tools.ts`](src/harness/tools.ts)); the step returns without awaiting them |
| no compaction: windowing, not summary | [`window.ts`](src/harness/window.ts): turn-aware, drop-from-front, cache-stable prefix |
| tool result IS the idempotency record | dedup on `tool_result` existence + a per-worker in-flight guard |
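The watermark rule in the table, "some stimulus has a `seq` greater than the highest assistant `reacting_to`", can be written as a pure function over the log. The event shape below is a guess for illustration: `LogEvent`, the `reactingTo` field name, and the choice of user messages and tool results as stimuli are assumptions, not the contents of `sweep.ts`.

```ts
// Hypothetical event shape: stimuli carry a seq, assistant events carry a watermark.
type LogEvent =
  | { seq: number; kind: "user" | "tool_result" }
  | { seq: number; kind: "assistant"; reactingTo: number };

// Status is derived from the log alone: inference is needed when the newest
// stimulus is newer than anything an assistant event has reacted to.
function needsInference(log: readonly LogEvent[]): boolean {
  let watermark = 0;
  let latestStimulus = 0;
  for (const ev of log) {
    if (ev.kind === "assistant") watermark = Math.max(watermark, ev.reactingTo);
    else latestStimulus = Math.max(latestStimulus, ev.seq);
  }
  return latestStimulus > watermark;
}

// A six-event turn: user message, two tool round-trips, final answer.
const log: LogEvent[] = [
  { seq: 1, kind: "user" },
  { seq: 2, kind: "assistant", reactingTo: 1 },
  { seq: 3, kind: "tool_result" },
  { seq: 4, kind: "assistant", reactingTo: 3 },
  { seq: 5, kind: "tool_result" },
  { seq: 6, kind: "assistant", reactingTo: 5 },
];

const afterFirstToolResult = needsInference(log.slice(0, 3)); // a tool result is unanswered
const afterFinalAnswer = needsInference(log);                 // everything has been reacted to
```

Because the answer is recomputed from the log on every wake, there is no separate status field that could drift out of sync after a crash.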

The model is an injected `Model` interface, so the same harness runs a deterministic stub (for the
crash proof) or real Claude:

```bash
# start the engine first:  (cd ../quickstart && iii --config config.yaml)
npm run harness:durable    # spawns a worker, sends "21+21 then +100", HARD-KILLs it mid-turn
                           #   (process-group kill, first tool in flight), spawns a second worker →
                           #   the queue redelivers the in-flight job, the loop RESUMES from the log
                           #   and finishes 142 on the new worker. PASS prints the w1→w2 hand-off.
npm run harness:claude     # the same durable loop driven by real Claude (claude-opus-4-8), whose
                           #   `add` tool is the Python pymath::add worker. Needs ANTHROPIC_API_KEY
                           #   (+ ANTHROPIC_BASE_URL) and pymath_worker.py running.
npm run harness:worker -- --model stub   # run a standalone (killable) worker; or use --model claude
```

The crash proof's log makes the hand-off explicit. Each event records which worker wrote it:

```
#1 [client] user: what is 21+21, then add 100?
#2 [w1]     assistant tools=[add(21,21)]   ← w1 starts the turn, then is HARD-KILLED mid-tool
#3 [w2]     tool_result add → 42           ← w2 resumes from the file-backed log…
#4 [w2]     assistant tools=[add(42,100)]
#5 [w2]     tool_result add → 142
#6 [w2]     assistant "142"                 ← …and finishes. The loop is the queue.
```

**Scope:** a faithful core, not all of aios. It is single-session, the model is injected, and
there is no multi-channel support, memory stores, sandbox, or permissions.

**Honest limits:**

- Durability is across *worker* crashes: the in-memory `builtin` queue redelivers. An *engine*
  restart loses in-flight wakes; the file-backed log survives, but resuming would need a startup
  recovery sweep, which is not built.
- Cross-worker concurrency on one session is a non-goal (the dedup guards are per-worker).
- At-least-once means a crash between a tool's side effect and its result event re-runs the tool.
  `ToolImpl` gets `ctx.idempotencyKey` (the `toolCallId`) so a real tool can dedupe.
- A model call that errors past a bounded retry records a durable error event instead of spinning.
- Append is two ops (claim seq, then write), so a crash between them leaves a benign seq *gap*;
  `readLog`/`sweep` tolerate gaps.
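The two-step append and the resulting gap can be sketched with an in-memory log. This is a simplified model, not the code in `log.ts`: `EventLog`, `claimSeq`, `write`, and `read` are hypothetical names, and a `Map` stands in for per-event keys in the state store.

```ts
interface StoredEvent {
  seq: number;
  payload: string;
}

// Hypothetical in-memory stand-in for a log kept as one key per event.
class EventLog {
  private lastSeq = 0;
  private readonly events = new Map<number, string>();

  // Op 1: claim the next sequence number.
  claimSeq(): number {
    this.lastSeq += 1;
    return this.lastSeq;
  }

  // Op 2: write the event under the claimed number.
  write(seq: number, payload: string): void {
    this.events.set(seq, payload);
  }

  // Reading scans every claimed seq and skips the ones that were never written.
  read(): StoredEvent[] {
    const out: StoredEvent[] = [];
    for (let seq = 1; seq <= this.lastSeq; seq++) {
      const payload = this.events.get(seq);
      if (payload !== undefined) out.push({ seq, payload });
    }
    return out;
  }
}

const eventLog = new EventLog();
eventLog.write(eventLog.claimSeq(), "user: hello");       // seq 1, both ops complete
eventLog.claimSeq();                                      // seq 2 claimed, then a simulated crash
eventLog.write(eventLog.claimSeq(), "assistant: hi");     // seq 3, both ops complete

const seqs = eventLog.read().map((e) => e.seq);           // seq 2 is absent, order is preserved
```

The gap is benign here because readers depend only on the ordering of sequence numbers, never on their being contiguous.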

## How `runDist` lowers to iii

| node | `runMemory` | `runDist` (iii) |
|---|---|---|
| `task().effect()` standalone | local impl | **direct `w.trigger`** (a non-durable RPC) |
| `forEachTask` / `Par` of tasks | promise pool, concurrency `n` | durable queue: wave-throttled to `n`, at-least-once, crash-surviving, failures surfaced |
| `retry` | loop | loop (each attempt re-runs the child) |
| `flatMap` / `map` / `catchAll` / `provide` | in-process | in-process (driver) |
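The `runMemory` column's "promise pool, concurrency `n`" can be sketched as a small helper in which `n` workers pull the next index until the input is exhausted. `forEachPool` is a hypothetical name and this is one common way to bound concurrency, not necessarily the one agentfx uses.

```ts
// A bounded promise pool: at most `n` calls to `f` are in flight at any time.
async function forEachPool<A, B>(
  items: readonly A[],
  f: (a: A) => Promise<B>,
  n: number,
): Promise<B[]> {
  const results = new Array<B>(items.length);
  let next = 0;

  // Each worker claims the next unprocessed index, so results keep input order.
  const worker = async (): Promise<void> => {
    while (next < items.length) {
      const i = next++;
      results[i] = await f(items[i] as A);
    }
  };

  const workers = Math.max(1, Math.min(n, items.length));
  await Promise.all(Array.from({ length: workers }, () => worker()));
  return results;
}

// Usage: track how many calls overlap while uppercasing five items with n = 2.
let inFlight = 0;
let peak = 0;
const upper = async (s: string): Promise<string> => {
  inFlight += 1;
  peak = Math.max(peak, inFlight);
  await new Promise((resolve) => setTimeout(resolve, 5));
  inFlight -= 1;
  return s.toUpperCase();
};

const done = forEachPool(["a", "b", "c", "d", "e"], upper, 2);
```

A pool keeps all `n` slots busy as soon as any call finishes, whereas a wave-throttled batch, as described for `runDist`, would wait for a whole wave before starting the next one.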

> Preemption (`switchMap` → an atomic *epoch fence*, so only the latest input's tool fires) is a
> **separate stream-level lowering** in [`src/demo.ts`](src/demo.ts) + [`src/iii.ts`](src/iii.ts),
> not an `Effect` combinator. Run it with `npm run preempt`.

## Run it

```bash
npm install
# start an iii engine with iii-state + iii-queue workers:  iii --config config.yaml
npm run executor       # terminal A: registers tasks + the batch subscriber (stays running)
npm run differential   # terminal B: happy path identical + failure path surfaced
npm run durability     # terminal B: fans out 12 tasks; now `pkill -9 -f ex-executor` in
                       #             terminal A mid-run; redelivery still finishes all 12
npm run typecheck      # the type laws are the test
```

## Design notes

- **Reified, not final.** `Effect` is a tagged-union *data* tree, so interpreters can
  walk it. TS has no GADTs, so the tree erases intermediate types (`FlatMap`/`Par`/`CatchAll`);
  that erasure is contained to the constructors in `effect.ts`. (This is why Effect-TS chose
  fibers/tagless-final; the trade-off is real and named.)
- **Closures don't distribute.** `runDist` only fans out `task` effects (a registered `fnId` +
  serializable args), never arbitrary closures, for the same reason RPC can't ship a lambda.
- **Backend is pluggable.** `runDist` targets a `Backend` interface; `runtime-iii.ts` is one
  implementation. The algebra doesn't know about iii.

## Limits (honest)

- Example tasks are pure; the "LLM" is a timer. Wire a real model into a `task` and it works.
- **Durability is across consumer crashes** (the engine holds messages and redelivers them when a
  consumer restarts). The default iii `builtin` broker is in-memory, so it is **not** durable
  across an *engine* restart; that needs a persistent queue adapter.
- **At-least-once, not exactly-once.** A crash *after* a side effect but *before* its result is
  recorded re-runs the task on redelivery. `task` impls get `ctx.idempotencyKey` precisely so a
  real side-effecting impl can dedupe at its provider; pure tasks need nothing.
- A standalone `task().effect()` under `runDist` is a **direct (non-durable) call**; durability
  applies to `forEachTask`/`Par` batches.
- The preemption fence's correctness depends on the iii engine's `state::update` being
  linearizable and returning a consistent atomic pre-image; demonstrated, not formally verified.
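Deduplicating at the provider with an idempotency key, as the at-least-once bullet suggests, might look like the sketch below. `PaymentProvider` and its `charge` method are hypothetical; the only element taken from the text is that the implementation receives a stable key per logical call.

```ts
// Hypothetical provider that remembers which idempotency keys it has already applied.
class PaymentProvider {
  private readonly applied = new Map<string, number>();
  charges = 0;

  charge(idempotencyKey: string, amount: number): number {
    const prior = this.applied.get(idempotencyKey);
    if (prior !== undefined) return prior; // redelivery: return the recorded result, charge nothing

    this.charges += 1;                     // the real side effect happens once per key
    this.applied.set(idempotencyKey, amount);
    return amount;
  }
}

const provider = new PaymentProvider();

// First delivery performs the charge; assume the worker crashes before recording the result.
const firstDelivery = provider.charge("order-42", 10);

// The queue redelivers the same task with the same key, and the provider dedupes it.
const redelivery = provider.charge("order-42", 10);

// A different logical call carries a different key and is charged normally.
const otherCall = provider.charge("order-43", 5);
```

The design choice is that exactly-once is approximated at the edge that owns the side effect, while the queue itself only promises at-least-once delivery.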

## License

MIT — see [LICENSE](LICENSE).