https://github.com/arianxdev/constitution-sim

Stress-test constitutions with AI-powered agentic politicians before trying them out on a real nation.

ai-agents constitution multiagent-systems

# Constitution-Sim

**Stress-test constitutions with AI-powered agentic politicians before trying them out on a real nation!**

`constitution-sim` is a research-grade multi-agent AI simulator. You give it
a constitution and a scenario; it spins up an LLM-powered agent for each
political role (Executive, Legislature, Judiciary, Media, Bureaucracy)
and lets them act under the rules you wrote, turn by turn. Every action
is checked by a rules engine, every event is logged, and every run is
reproducible from a seed.

## Why

Politicians are not utility-maximisers reading from a spec — they
deliberate, bargain, posture, and reach for legitimacy. The interesting
question is *how the rules of a constitution shape that behaviour*. So
the agents here are LLMs (OpenAI / Anthropic) instructed with a
role-specific persona, the constitution they live under, their own
goals and utility weights, and a memory of their own recent decisions.
They never get to mutate the world directly — every move passes the
typed rules engine first.

A deterministic heuristic agent is still available as a no-LLM fallback,
so the project also runs offline / in CI / with zero API keys.

## Features

- **AI cognition is the default.** When `OPENAI_API_KEY` or
`ANTHROPIC_API_KEY` is in the environment, `constitution-sim run`
uses LLM-powered agents out of the box. With no key, it falls back to
a deterministic heuristic — same CLI, same outputs, no setup required.
- **Role-specific personas.** Each role (Executive, Legislature,
Judiciary, Media, Bureaucracy) gets its own LLM system prompt. The
Executive is ambitious; the Judiciary is reactive; the Media chases a
narrative; the Bureaucracy implements steadily.
- **Agent memory & shared history.** Each agent remembers its own recent decisions and can see a public history of what other actors just did (if the constitution allows).
- **Inter-agent deliberation.** Each turn features a deliberation phase where agents can negotiate, threaten, or signal intent by sending messages to each other's inboxes.
- **Schema-driven constitutions.** Strict Pydantic v2 models; YAML in, typed objects out. Constitutions can enforce communication limits (e.g. authoritarian gag orders).
- **Rules engine is source of truth.** Agents propose typed actions; the engine accepts or rejects with a reason. The LLM cannot mutate state directly.
- **Partial observability.** Each role gets a state view filtered by its `observation_limits`.
- **Institutional metrics.** Power concentration, deadlock, trust
volatility, legitimacy, corruption pressure, emergency-power drift.
- **Repeated-run evaluation harness.** Multi-seed runs with pandas /
matplotlib output.
- **Deterministic when seeded** (heuristic mode is byte-for-byte
reproducible; LLM mode is reproducible up to provider variance).
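
The partial-observability feature above can be pictured with a small sketch. It is illustrative only: the `Role` dataclass, the `build_state_view` helper, and the reading of `observation_limits` as a set of hidden field names are assumptions, not the project's actual models.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Role:
    """Hypothetical stand-in for the project's Role model."""
    name: str
    # Assumption: observation_limits names the state fields hidden from this role.
    observation_limits: frozenset = field(default_factory=frozenset)


def build_state_view(world_state: dict, role: Role) -> dict:
    """Return a filtered copy of the state; the original dict is not modified."""
    return {k: v for k, v in world_state.items() if k not in role.observation_limits}


world_state = {"public_trust": 0.61, "num_pending_bills": 3, "num_active_laws": 7}
bureaucracy = Role("Bureaucracy", frozenset({"num_pending_bills"}))

view = build_state_view(world_state, bureaucracy)
print(view)  # {'public_trust': 0.61, 'num_active_laws': 7}
```

Returning a copy rather than the state itself mirrors the idea that agents only read a view and never hold a reference to the canonical state.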

*Figure: per-metric plots with panels for `num_pending_bills`, `corruption_proxy`, `num_active_laws`, `power_concentration`, `public_trust`, `emergency_turns`, and `trust_volatility`.*

## Requirements

- Python 3.10+ (target: 3.14)
- `pydantic >= 2`, `PyYAML`, `pandas`, `matplotlib`, `seaborn`
- For AI cognition: `openai` (and/or `anthropic`)

## Install

```bash
git clone https://github.com/arianXdev/constitution-sim.git
cd constitution-sim
pip install -e ".[dev,llm]" # core + tests + LLM SDKs (recommended)
# or, no-LLM-only install:
pip install -e ".[dev]"
```

This exposes a `constitution-sim` console entry point.

## Quickstart (AI-powered)

```bash
export OPENAI_API_KEY=sk-...
constitution-sim run \
  --constitution constitutions/advanced_constitution.yaml \
  --scenario constitutions/scenario.yaml \
  --turns 20 --seed 42 \
  --log /tmp/cs/events.jsonl \
  --metrics-out /tmp/cs/metrics.csv
```

That's it. The default `--agent-type auto` notices the key, spins up
LLM-powered Executive / Legislature / Judiciary / Media / Bureaucracy
agents, and runs the simulation. You'll see a one-liner telling you
which provider was picked.

Want to force a provider explicitly?

```bash
constitution-sim run --agent-type openai --model gpt-4o-mini ...
constitution-sim run --agent-type anthropic --model claude-sonnet-4-5 ...
```

Want deterministic, no-API runs (for tests / reproducibility)?

```bash
constitution-sim run --agent-type heuristic ...
```

## The four CLI subcommands

```bash
# 1. Validate a constitution YAML against the schema.
constitution-sim validate --constitution constitutions/advanced_constitution.yaml

# 2. Run a simulation (single seed or multi-seed evaluation).
constitution-sim run \
  --constitution constitutions/advanced_constitution.yaml \
  --scenario constitutions/scenario.yaml \
  --turns 30 --runs 5 --seed 42 \
  --log /tmp/cs/events.jsonl \
  --metrics-out /tmp/cs/metrics.csv \
  --plot-dir /tmp/cs/plots

# 3. Replay a recorded event log (structured summary, not re-execution).
constitution-sim replay --log /tmp/cs/eval_logs/run_0_events.jsonl --show-first 5

# 4. Compare two evaluations (e.g. two constitutions).
constitution-sim compare --a /tmp/cs/metrics_A.csv --b /tmp/cs/metrics_B.csv
```
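
If you want to inspect an event log yourself, a structured summary along the lines of `replay` might look like the sketch below. The field names `actor`, `accepted`, and `reason` are assumptions made for illustration; check a real `events.jsonl` for the actual schema before relying on them.

```python
import io
import json
from collections import Counter


def summarize_events(lines) -> dict:
    """Count accepted and rejected action attempts per actor from JSONL lines."""
    accepted, rejected = Counter(), Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        event = json.loads(line)
        # Assumption: each event carries an "actor" name and an "accepted" flag.
        bucket = accepted if event.get("accepted") else rejected
        bucket[event.get("actor", "unknown")] += 1
    return {"accepted": dict(accepted), "rejected": dict(rejected)}


# In-memory sample standing in for a real log file (invented data).
sample_log = io.StringIO(
    '{"turn": 0, "actor": "Executive", "accepted": true}\n'
    '{"turn": 0, "actor": "Judiciary", "accepted": false, "reason": "not permitted"}\n'
    '{"turn": 1, "actor": "Executive", "accepted": true}\n'
)
summary = summarize_events(sample_log)
print(summary)  # {'accepted': {'Executive': 2}, 'rejected': {'Judiciary': 1}}
```

The same function accepts an open file handle, since it only iterates over lines.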

## What the LLM sees

For each turn, the LLM agent is prompted with:

- A role-specific persona (Executive / Legislature / …).
- The constitution's name, description, and the list of other roles.
- Its own declared goals and utility weights (from the YAML).
- A partial state view filtered by its `observation_limits`.
- **Public political history**: recent public actions taken by all actors.
- **Inbox messages**: any negotiation/signals received during the turn's deliberation phase.
- A short memory of its own recent decisions (and whether they were legal).
- The exact set of typed actions it's allowed to return.

It replies with one JSON object describing a single action. If the LLM
returns malformed JSON or an action outside its permission set, the
agent silently falls back to the deterministic heuristic policy, so a
bad reply never aborts the simulation.
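
The fallback behaviour described above could be sketched roughly as follows. This is a minimal illustration, not the project's `LLMAgent`: the `ALLOWED` table, the action names, and both helper functions are hypothetical.

```python
import json

# Hypothetical permission table: role -> action types it may return.
ALLOWED = {"Legislature": {"propose_bill", "vote"}}


def heuristic_action(role: str) -> dict:
    """Stand-in for the deterministic fallback policy."""
    return {"type": "noop", "actor": role, "source": "heuristic"}


def parse_llm_action(raw: str, role: str) -> dict:
    """Return the LLM's action if it is well-formed and permitted, else the fallback."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return heuristic_action(role)  # malformed JSON
    action_type = data.get("type") if isinstance(data, dict) else None
    if not isinstance(action_type, str) or action_type not in ALLOWED.get(role, set()):
        return heuristic_action(role)  # wrong shape, or action outside the permission set
    return {**data, "actor": role, "source": "llm"}


print(parse_llm_action('{"type": "vote", "bill_id": 1}', "Legislature")["source"])  # llm
print(parse_llm_action("I think we should vote", "Legislature")["source"])          # heuristic
print(parse_llm_action('{"type": "declare_emergency"}', "Legislature")["source"])   # heuristic
```

Tagging each action with a `source` field is a choice made for this sketch, so that a silent fallback still leaves a trace for later analysis.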

## Project structure

```
src/constitution_sim/
  models/     Pydantic schemas: Constitution, Role, Rule, WorldState, actions
  core/       SimulationEngine, RulesEngine, Scheduler, EventLogger
  agents/     BaseAgent, DeterministicHeuristicAgent, LLMAgent, providers
  scenarios/  Shock model + ScenarioEngine
  analysis/   MetricsCollector, Evaluator, plot
  app/        CLI (validate / run / replay / compare)
constitutions/
  simple_constitution.yaml
  advanced_constitution.yaml
  strong_executive_constitution.yaml
  scenario.yaml
docs/
  architecture.md
  tutorial.md
tests/
```

## Tests

```bash
pytest -q
```

All tests should pass. `tests/test_determinism.py` explicitly asserts
that two heuristic-mode runs with the same seed produce byte-identical
event logs. `tests/test_llm_agent.py::test_live_openai_smoke` runs a
real LLM round-trip when `OPENAI_API_KEY` is set, and is automatically
skipped otherwise.
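
The shape of such a determinism check can be sketched with a toy stand-in. `run_toy_simulation` below is hypothetical and much simpler than a real heuristic-mode run; it only shows the two ingredients that make byte-identical logs plausible: a locally seeded RNG and a stable serialization order.

```python
import json
import random


def run_toy_simulation(seed: int, turns: int = 5) -> bytes:
    """Hypothetical stand-in for a heuristic-mode run; returns the JSONL log as bytes."""
    rng = random.Random(seed)  # local RNG, independent of global random state
    lines = []
    for turn in range(turns):
        event = {
            "turn": turn,
            "actor": rng.choice(["Executive", "Legislature"]),
            "trust_delta": round(rng.uniform(-0.1, 0.1), 6),
        }
        # sort_keys keeps the serialized key order stable across runs
        lines.append(json.dumps(event, sort_keys=True))
    return ("\n".join(lines) + "\n").encode("utf-8")


def test_same_seed_gives_identical_log():
    # Same seed, same turn count: the encoded logs must match byte for byte.
    assert run_toy_simulation(42) == run_toy_simulation(42)


test_same_seed_gives_identical_log()
```

Under pytest the final call is unnecessary; it is included so the snippet also runs as a plain script.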

## Headline experiment

Compare a balanced constitution against a strong-executive one (3 runs ×
12 turns, seed 11). Relative to the balanced constitution, the
strong-executive YAML pushes `power_concentration` from ~0.47 to ~0.92
and adds illegal-action attempts to the log: laws are written by one
actor and the judiciary is unable to push back. That's the framework
working as intended; see `docs/tutorial.md` for a walkthrough.

## Design highlights

- `WorldState` is the single canonical truth; agents only ever see a
`StateView`.
- Every action attempt is recorded in the JSONL event log, including
the rules-engine reason for any rejection.
- `Role.observation_limits` lets the constitution define what each role
can see (e.g. the Bureaucracy doesn't see pending bills in
`advanced_constitution.yaml`).
- `Role.utility_weights` drives heuristic voting and is surfaced to
LLM agents in their prompt as part of the persona.
- `RulesEngine` does both permission checks AND state-level legality
checks (you can't vote on a non-existent bill, you can't declare
emergency powers if the constitution doesn't allow them).
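
The two layers of checking mentioned in the last bullet can be illustrated with a short sketch. It is not the project's `RulesEngine`: the `Verdict` and `World` dataclasses, the `PERMISSIONS` table, and the action names are all assumptions chosen to match the examples in the text.

```python
from dataclasses import dataclass, field


@dataclass
class Verdict:
    accepted: bool
    reason: str


@dataclass
class World:
    """Hypothetical, heavily reduced world state."""
    pending_bills: set = field(default_factory=set)
    emergency_allowed: bool = False


# Hypothetical permission table: role -> action types it may attempt.
PERMISSIONS = {"Legislature": {"vote"}, "Executive": {"declare_emergency"}}


def check(role: str, action: dict, world: World) -> Verdict:
    """Accept or reject a proposed action without mutating the world."""
    kind = action.get("type")
    # Layer 1: permission check (is this role allowed to attempt this action at all?)
    if kind not in PERMISSIONS.get(role, set()):
        return Verdict(False, f"{role} may not perform {kind!r}")
    # Layer 2: state-level legality (does the action make sense in the current state?)
    if kind == "vote" and action.get("bill_id") not in world.pending_bills:
        return Verdict(False, f"bill {action.get('bill_id')!r} does not exist")
    if kind == "declare_emergency" and not world.emergency_allowed:
        return Verdict(False, "constitution does not allow emergency powers")
    return Verdict(True, "ok")


world = World(pending_bills={1})
print(check("Legislature", {"type": "vote", "bill_id": 1}, world))  # accepted
print(check("Legislature", {"type": "vote", "bill_id": 9}, world))  # rejected: no such bill
print(check("Executive", {"type": "declare_emergency"}, world))     # rejected: not allowed
```

Returning a reason alongside the verdict is what lets every rejection be written to the event log with an explanation.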

See [`docs/architecture.md`](docs/architecture.md) for the full design
and [`docs/tutorial.md`](docs/tutorial.md) for an end-to-end "use it
like I'm 10" walkthrough.

## Out of scope (intentional)

This is an MVP, not a finished research instrument. The following are
explicit non-goals at this stage:

- Persistent economic/demographic simulation (state variables are scalars, not vector economies).
- Fine-tuned LLMs or RL self-play.