{"id":48962049,"url":"https://github.com/waldiez/wactorz","last_synced_at":"2026-04-18T02:21:31.617Z","repository":{"id":348742430,"uuid":"1170120579","full_name":"waldiez/wactorz","owner":"waldiez","description":"Real-time, async multi-agent orchestration system built on the Actor Model with MQTT pub/sub.","archived":false,"fork":false,"pushed_at":"2026-04-18T00:28:02.000Z","size":19629,"stargazers_count":10,"open_issues_count":1,"forks_count":2,"subscribers_count":2,"default_branch":"main","last_synced_at":"2026-04-18T01:28:52.724Z","etag":null,"topics":["agents","home-automation","iot","spawning"],"latest_commit_sha":null,"homepage":"https://wactorz.waldiez.io","language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/waldiez.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":"CODEOWNERS","security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":"NOTICE.md","maintainers":null,"copyright":null,"agents":"docs/agents.md","dco":null,"cla":null}},"created_at":"2026-03-01T18:21:31.000Z","updated_at":"2026-04-18T00:22:00.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/waldiez/wactorz","commit_stats":null,"previous_names":["waldiez/wactorz"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/waldiez/wactorz","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/waldiez%2Fwactorz","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/waldiez%2Fwactorz/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposit
ories/waldiez%2Fwactorz/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/waldiez%2Fwactorz/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/waldiez","download_url":"https://codeload.github.com/waldiez/wactorz/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/waldiez%2Fwactorz/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31953567,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-18T00:39:45.007Z","status":"online","status_checked_at":"2026-04-18T02:00:07.018Z","response_time":103,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agents","home-automation","iot","spawning"],"created_at":"2026-04-18T02:21:30.423Z","updated_at":"2026-04-18T02:21:31.598Z","avatar_url":"https://github.com/waldiez.png","language":"HTML","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Wactorz\n\n**Actor-Model Multi-Agent Framework**  \n_Technical Reference \u0026 Developer Guide_\n\n---\n\n## Table of Contents\n\n1. [What is Wactorz?](#1-what-is-wactorz)\n2. [Architecture](#2-architecture)\n3. [Agent Types](#3-agent-types)\n4. [Spawning Agents at Runtime](#4-spawning-agents-at-runtime)\n5. [Agent-to-Agent Communication](#5-agent-to-agent-communication)\n6. [Health Monitoring \u0026 Error Recovery](#6-health-monitoring--error-recovery)\n7. 
[Persistence \u0026 State](#7-persistence--state)\n8. [Memory \u0026 User Facts](#8-memory--user-facts)\n9. [Reactive Pipelines](#9-reactive-pipelines)\n10. [TopicBus — Reactive Pub/Sub Coordination](#10-topicbus--reactive-pubsub-coordination)\n11. [Code Safety \u0026 Validation](#11-code-safety--validation)\n12. [LLM Cost Tracking](#12-llm-cost-tracking)\n13. [Interfaces](#11-interfaces)\n14. [MQTT Topic Reference](#12-mqtt-topic-reference)\n15. [Built-in Specialist Agents](#13-built-in-specialist-agents)\n16. [Catalog Agent — Pre-built Recipe Library](#14-catalog-agent--pre-built-recipe-library)\n17. [Remote Nodes \u0026 Edge Deployment](#15-remote-nodes--edge-deployment)\n18. [Installation \u0026 Configuration](#16-installation--configuration)\n19. [Troubleshooting](#17-troubleshooting)\n20. [File Structure](#appendix-file-structure)\n\n---\n\n## 1. What is Wactorz?\n\nWactorz is an asynchronous, actor-model multi-agent framework built from scratch in Python. It allows an LLM orchestrator (\"main\") to spawn, coordinate, monitor, and retire live software agents at runtime — without any code restart or predefined agent types.\n\nThe core idea is simple: you talk to the system in natural language. The LLM writes Python code, wraps it in a `\u003cspawn\u003e` block, and a new agent appears — running in its own async actor, connected to all other agents via MQTT and direct actor messaging, and persisting its state to disk automatically.\n\nWactorz was born out of the need for a framework that could operate on real-world IoT data streams at the edge — something existing agent frameworks (LangGraph, CrewAI, AutoGen) were not designed for. 
It is lightweight enough to run on modest hardware, offline-capable, and fully async.\n\n### Design Principles\n\n- **Everything is an Actor** — agents communicate via messages, not function calls\n- **Agents are spawned at runtime** — no hardcoded types, no restart required\n- **MQTT is the nervous system** — all events, heartbeats, and results flow through topics\n- **Persistence is automatic** — every agent survives a crash and restores its state\n- **The LLM is the orchestrator** — it decides what agents to create and how to wire them\n- **Errors are first-class** — structured error events trigger real recovery actions\n- **Memory is persistent** — conversation history is summarized, user facts are extracted and remembered across restarts\n\n---\n\n## 2. Architecture\n\n### The Actor Model\n\nEach agent is an Actor: an independent unit with its own async message loop, mailbox (`asyncio.Queue`), and lifecycle (`CREATED → RUNNING → PAUSED → STOPPED / FAILED`). Actors never share memory. 
They communicate by sending typed `Message` objects to each other via the `ActorRegistry`, which maps actor IDs to actor instances.\n\n```\nMessage flow:\n\n  Actor A                Registry              Actor B\n  ───────               ──────────             ───────\n  send(B_id, TASK, {…}) ──────────────────►  mailbox.put(msg)\n                                              message_loop picks it up\n                                              handle_message(msg) fires\n                        ◄─────────────────── send(A_id, RESULT, {…})\n  handle_message fires\n  future.set_result(…)\n```\n\n### Intent Routing\n\nEvery user message goes through a single cheap LLM call that classifies it into one of four categories before any further processing:\n\n| Intent | Description | Route |\n|--------|-------------|-------|\n| `ACTUATE` | Immediate one-shot Home Assistant device control — turn on/off lights, set temperature, lock/unlock, open/close covers | → ephemeral `OneOffActuatorAgent` |\n| `HA` | Home Assistant management and automation CRUD — list devices/entities/areas, create/edit/delete automations | → `home-assistant-agent` |\n| `PIPELINE` | Reactive rule — \"if X then Y\", \"when X send me a message\", any event-driven logic | → `PlannerAgent` |\n| `OTHER` | General conversation, coding, questions, everything else | → `main` LLM |\n\nThis replaces all previous keyword heuristics with a single LLM classification step. 
Explicit prefixes (`coordinate:`, `plan:`, `pipeline:`) always take precedence over classification.\n\n### Core Components\n\n| File | Layer | Role |\n|------|-------|------|\n| `core/actor.py` | Core | Base Actor class — mailbox, lifecycle, heartbeat, spawn, send, persist/recall |\n| `core/registry.py` | Core | ActorSystem \u0026 ActorRegistry — actor registration, message routing, broadcast |\n| `core/topic_bus.py` | Core | TopicBus — reactive pub/sub coordination layer: TopicContract with observed schema introspection, TopicRegistry for topic-based agent discovery, SharedStateHub for retained world state, StreamWindow for temporal reasoning |\n| `agents/main_actor.py` | Agent | The LLM orchestrator — intent classification, spawns agents, routes requests, memory \u0026 user facts |\n| `agents/monitor_agent.py` | Agent | Health watcher — detects crashes, fires recovery actions, notifies user |\n| `agents/llm_agent.py` | Agent | Base LLM agent with rolling history summarization, cost tracking, streaming, and 5 providers |\n| `agents/dynamic_agent.py` | Agent | Runtime-generated agents — executes LLM-written Python code in a sandboxed namespace |\n| `agents/planner_agent.py` | Agent | Multi-step task planner + reactive pipeline builder — decomposes tasks, fans out to workers, synthesizes results |\n| `agents/installer_agent.py` | Agent | Package manager — installs pip packages locally and on remote nodes via SSH |\n| `agents/catalog_agent.py` | Agent | Recipe library — holds pre-built agent configs and spawns them on request without requiring code |\n| `agents/manual_agent.py` | Agent | PDF specialist — 3-layer search strategy to find and extract manual content |\n| `agents/home_assistant_agent.py` | Agent | Unified HA agent — hardware recommendations and automation CRUD via HA REST API |\n| `agents/one_off_actuator_agent.py` | Agent | Ephemeral one-shot HA actuator — resolves natural language to HA service calls, executes, reports, then deletes itself |\n| 
`agents/home_assistant_map_agent.py` | Agent | Live entity/location map via HA WebSocket |\n| `agents/home_assistant_state_bridge_agent.py` | Agent | HA `state_changed` → MQTT bridge |\n| `agents/home_assistant_actuator_agent.py` | Agent | Reactive MQTT→HA actuator — subscribes to topics, calls HA services |\n| `interfaces/chat_interfaces.py` | I/O | CLI (streaming), REST, Discord, WhatsApp — all call `process_user_input[_stream]` |\n| `monitor_server.py` | I/O | MQTT→WebSocket bridge that feeds the live dashboard |\n| `monitor.html` | I/O | Real-time web dashboard — agent cards, logs, cost meters, error alerts |\n\n---\n\n## 3. Agent Types\n\n### LLMAgent (base)\n\nAll LLM-backed agents inherit from `LLMAgent`, which inherits from `Actor`. It manages conversation history with automatic rolling summarization (persisted to disk), tracks token usage and cost across 5 providers, and supports both blocking and streaming responses.\n\n**Supported LLM providers:**\n\n| Provider | Key | Notes |\n|----------|-----|-------|\n| Anthropic Claude | `ANTHROPIC_API_KEY` | Default |\n| OpenAI | `OPENAI_API_KEY` | `--llm openai` |\n| Ollama | _(none)_ | Local models, `--llm ollama --ollama-model llama3` |\n| NVIDIA NIM | `NIM_API_KEY` | Free tier 1000 req/month, `--llm nim --nim-model meta/llama-3.3-70b-instruct` |\n| Google Gemini | `GEMINI_API_KEY` or `GOOGLE_API_KEY` | Free tier available, `--llm gemini --gemini-model gemini-2.5-flash` |\n\n### DynamicAgent\n\nThe heart of Wactorz. When the LLM writes a spawn block, a `DynamicAgent` is created with that code compiled into its namespace. 
Three optional async functions can be defined:\n\n```python\nasync def setup(agent):\n    # Runs once at startup\n    await agent.log('ready')\n\nasync def process(agent):\n    # Runs in a loop every poll_interval seconds\n    data = read_sensor()\n    await agent.publish('sensors/temp', data)\n\nasync def handle_task(agent, payload):\n    # Runs on demand when a task arrives\n    city = payload.get('city', 'Athens')\n    return {'temp': fetch_weather(city)}\n```\n\n**The `agent` API (available inside all three functions):**\n\n| Method | Description |\n|--------|-------------|\n| `await agent.log(msg)` | Publish a log event |\n| `await agent.publish(topic, data)` | Publish to an MQTT topic |\n| `agent.persist(key, value)` / `agent.recall(key)` | Durable key-value state |\n| `agent.state[\"key\"]` | In-memory dict (cleared on restart) |\n| `agent.llm.chat(prompt)` | Call the LLM |\n| `agent.send_to(name, payload)` | Send a task to another agent by name |\n| `agent.delegate(name, payload)` | Same, with cleaner syntax |\n| `agent.send_to_many(tasks)` | Fan-out to multiple agents in parallel |\n| `agent.agents()` | List all currently running agents |\n\n### MainActor\n\nThe user-facing orchestrator. Every message you type is processed by main, which:\n\n1. Intercepts slash-commands (`/rules`, `/memory`, `/webhook`, `/topics`, etc.) without any LLM call\n2. Classifies intent with a single LLM call: `ACTUATE`, `HA`, `PIPELINE`, or `OTHER`\n3. Routes `ACTUATE` requests to an ephemeral `OneOffActuatorAgent`\n4. Routes `HA` requests to `home-assistant-agent`\n5. Routes `PIPELINE` requests to `PlannerAgent`\n6. Handles `OTHER` with its own streaming LLM conversation\n7. Extracts and persists user facts in the background after every response\n8. Drains any pending monitor notifications and prepends them to the response\n9. 
Parses `\u003cspawn\u003e` blocks in the LLM output and creates agents automatically\n\n### PlannerAgent\n\nSpawned on-demand for two distinct modes:\n\n**Task planning mode** (complex multi-step tasks):\n1. Check plan cache — reuse plan structure if the task is similar to a recent one (24h TTL)\n2. Discover all running worker agents\n3. Ask the LLM to decompose the task into a dependency graph of steps\n4. Spawn any missing agents declared in the plan (with `spawn_config`)\n5. Execute parallel steps with `asyncio.gather`, inject context into dependent steps\n6. Synthesize all results into a clean user-facing answer\n7. Cache the plan to disk, self-terminate after 2 seconds\n\n**Pipeline mode** (reactive if/when/whenever rules):\n1. Query `home-assistant-agent` for real entity IDs from your HA instance\n2. Feasibility check — verifies required entity types exist, surfaces a clear error if not\n3. LLM designs the agent wiring using canonical patterns (see Section 9)\n4. Spawn `ha_actuator` agents (for HA service calls) and `dynamic` agents (for filtering, webcam, notifications)\n5. Register each rule in main's pipeline registry for persistence and listing\n\n**Trigger the planner explicitly or automatically:**\n\n```\ncoordinate: get the weather in Athens and search for AI news, then combine them\nplan: load the Philips manual and answer the cleaning question\n@planner   any complex multi-step task\nif the door opens send me a Discord message    ← auto-detected as PIPELINE\n```\n\n---\n\n## 4. Spawning Agents at Runtime\n\nSimply describe what you want in the chat. The LLM will write the code and wrap it in a `\u003cspawn\u003e` block. 
You never need to write code yourself.\n\n### The Spawn Block\n\n```json\n\u003cspawn\u003e\n{\n  \"name\": \"weather-agent\",\n  \"type\": \"dynamic\",\n  \"description\": \"Fetches live weather from wttr.in\",\n  \"install\": [\"httpx\"],\n  \"poll_interval\": 3600,\n  \"code\": \"\n    async def setup(agent):\n        await agent.log('Weather agent ready')\n\n    async def handle_task(agent, payload):\n        import httpx\n        city = payload.get('city', 'Athens')\n        async with httpx.AsyncClient() as c:\n            r = await c.get(f'https://wttr.in/{city}?format=j1')\n        return r.json()['current_condition'][0]\n  \"\n}\n\u003c/spawn\u003e\n```\n\n### Spawn Options\n\n| Field | Description |\n|-------|-------------|\n| `name` | Unique agent name. Use `\"replace\": true` to hot-swap a running agent |\n| `type` | `\"dynamic\"` (runtime code), `\"llm\"` (pure conversation), `\"manual\"` (PDF search) |\n| `node` | Remote node name to spawn on (e.g. `\"rpi-kitchen\"`). Omit to run locally |\n| `install` | List of pip packages to install before spawning. Fast-path skips if already importable |\n| `poll_interval` | Seconds between `process()` calls. Use `3600` for infrequent background tasks |\n| `replace` | If `true`, stops the existing agent with this name before spawning the new one |\n| `code` | The Python source. May define `setup()`, `process()`, and/or `handle_task()` |\n| `system_prompt` | For `type: \"llm\"` agents — the LLM's persona and instructions |\n| `description` | Human-readable description shown in the dashboard and used by the planner |\n\nAgents with packages in `\"install\"` are spawned in the background. A fast-path checks whether packages are already importable first — if they are, spawning is instant. All spawned agents are saved to the spawn registry and automatically restored on the next startup.\n\n---\n\n## 5. 
Agent-to-Agent Communication\n\nAgents can talk to each other directly — no LLM involved, pure actor messaging with futures for synchronous results.\n\n### From inside a DynamicAgent\n\n```python\nasync def handle_task(agent, payload):\n    # Ask another agent and wait for the result\n    weather = await agent.delegate('weather-agent', {'city': 'Athens'})\n\n    # Fan-out to multiple agents in parallel\n    results = await agent.send_to_many([\n        ('weather-agent', {'city': 'Athens'}),\n        ('news-agent',    {'query': 'AI today'}),\n    ])\n\n    # List all running agents\n    workers = agent.agents()\n    # [{'name': 'weather-agent', 'type': 'DynamicAgent', ...}, ...]\n```\n\n### Addressing Agents in Chat\n\n```\n@agent-name  your message here    — route directly to that agent\n@main        your message here    — route to the main orchestrator\n@planner     your complex task    — explicitly trigger the planner\n```\n\n---\n\n## 6. Health Monitoring \u0026 Error Recovery\n\nWactorz has a four-layer error handling system. Errors are first-class events, not just log lines.\n\n### Layer 1 — DynamicAgent: 5-Layer Self-Healing\n \nSee [Section 11 — Code Safety \u0026 Validation](#11-code-safety--validation) for the full breakdown. In summary:\n \n1. **Prompt** — LLM is told which methods are sync vs async\n2. **Sanitizer** — `await` on sync methods is stripped at compile time\n3. **`_AwaitableNone`** — sync methods return an awaitable sentinel\n4. **Callback wrapper** — `TypeError` in subscribe callbacks is caught and suppressed\n5. 
**LLM self-correction** — runtime errors in `setup()` trigger automatic fix + retry (2x)\n \nAdditionally, a **pre-exec safety validator** blocks dangerous code patterns (shell execution, file deletion, eval), and **timeout guards** prevent runaway `process()` and `handle_task()` calls.\n\n\n### Layer 2 — MonitorAgent: Error Registry \u0026 Recovery\n\nThe monitor subscribes to error events from all agents and maintains an error registry. Recovery decisions:\n\n| Severity | Action |\n|----------|--------|\n| `warning` | Log it, let the agent recover on its own |\n| `critical` / `degraded` | Attempt restart (up to 3 times) |\n| `fatal` (compile/setup) | Do NOT restart — the code is broken. Notify user to fix it |\n\n**Heartbeat liveness:** every actor publishes a heartbeat every 10 seconds. The monitor reads `metrics.last_heartbeat` directly, so even idle agents (installer, manual-agent) are never falsely flagged as unresponsive. Infrastructure agents (monitor, installer, main, code-agent, anomaly-detector, home-assistant-agent) are excluded from user-facing notifications.\n\n### Layer 3 — MainActor: User Notification\n\nMonitor notifications are queued and prepended to the user's next response with severity icons:\n\n- 🔴 **critical** — agent is broken, needs attention\n- 🟡 **warning** — agent had issues, monitor is handling it\n- ✅ **recovered** — agent is running normally again\n\n### Layer 4 — PlannerAgent: Graceful Fallback\n\nIf a worker agent returns an error during a planner step, the planner logs it and falls back to asking main's LLM directly for that step — so the user gets a partial answer rather than a silent failure.\n\n---\n\n## 7. Persistence \u0026 State\n\nEvery actor has access to a simple key-value persistence API backed by pickle files in the `state/` directory. 
State is written to disk **immediately on every `persist()` call** — not just on graceful shutdown — so no state is ever lost on Ctrl+C or crashes.\n\n```python\n# Inside any agent\nagent.persist('my_key', {'count': 42, 'data': [...]})   # write (immediate disk write)\nvalue = agent.recall('my_key', default={})               # read\n```\n\nUsed internally for:\n\n- Conversation history (`LLMAgent`) — sanitized on every load, with rolling summarization\n- Rolling summary (`LLMAgent`) — compressed history surviving beyond the context window\n- User facts (`MainActor`) — durable facts extracted from every conversation exchange\n- Pipeline rules (`MainActor`) — spawn registry for reactive rules, with agent lists\n- Notification webhook URLs (`MainActor`) — auto-injected into pipeline prompts\n- Plan cache (`PlannerAgent`) — 24h TTL, invalidated if required agents are gone\n- Loaded PDF content (`ManualAgent`) — avoids re-downloading on repeated questions\n- Spawn registry (`MainActor`) — restores all agents on startup\n\n### Rolling Conversation History\n\n`LLMAgent` keeps conversation history bounded and lossless via automatic rolling summarization:\n\n- History is kept in RAM up to `summarize_threshold` messages (default: 30)\n- When that threshold is exceeded, the **oldest half** is compressed into a dense factual summary using the LLM (~400 tokens)\n- The summary is prepended as context to every subsequent LLM call — no facts are ever dropped\n- A chain of summaries accumulates over time as the conversation grows\n- Both `conversation_history` and `history_summary` are persisted after every exchange\n\nConversation history is sanitized on every load — any corrupted entries are stripped before the API is called. If you encounter a corrupted history from a previous session, run `fix_history.py` once to clean it up.\n\n---\n\n## 8. 
Memory \u0026 User Facts\n\nMain automatically extracts and remembers durable facts from every conversation — no explicit commands needed.\n\n### How It Works\n\nAfter every response, main runs a background LLM task that scans the exchange for durable facts worth remembering long-term:\n\n- Home Assistant URLs and entity IDs\n- User name and preferences\n- Webhook URLs and API keys\n- Device names, locations, and areas\n- Any explicit configuration or setup details mentioned by the user\n\nThese are stored in a persistent `_user_facts` dict and injected into main's system prompt on every startup, so main always knows who you are and what your setup looks like — even after a restart.\n\n### Memory Commands\n\n| Command | Description |\n|---------|-------------|\n| `/memory` | Show all stored user facts and the current conversation summary |\n| `/memory clear` | Wipe all facts and the conversation summary |\n| `/memory forget \u003ckey\u003e` | Remove one specific fact by its key |\n\n### Notification Webhooks\n\nWebhook URLs for Discord, Slack, and Telegram are stored separately and automatically injected into pipeline prompts — so generated pipeline agents always use your real URL without you having to provide it again.\n\n| Command | Description |\n|---------|-------------|\n| `/webhook` | List stored webhook URLs |\n| `/webhook discord \u003curl\u003e` | Save a Discord webhook URL |\n| `/webhook slack \u003curl\u003e` | Save a Slack webhook URL |\n| `/webhook telegram \u003curl\u003e` | Save a Telegram webhook URL |\n\nYou can also paste a webhook URL directly into any message — it is detected automatically and saved.\n\n---\n\n## 9. Reactive Pipelines\n\nWactorz can set up persistent reactive rules that run continuously in the background. 
Any message describing conditional or event-driven behavior is automatically routed to the pipeline builder via the `PIPELINE` intent.\n\n### Natural Language Examples\n\n```\nif the door opens, send me a Discord message\nwhen the temperature in the kitchen goes above 28 degrees, turn on the air conditioner\nif a person is detected on my webcam, turn on the living room lights\nwhenever the lamp in the living room turns on, notify me on Discord\n```\n\nNo prefix needed — the intent classifier recognises these automatically.\n\n### How It Works\n\nThe `PlannerAgent` handles pipeline requests:\n\n1. **Entity discovery** — queries `home-assistant-agent` for real entity IDs from your HA instance\n2. **Feasibility check** — verifies the required entity types exist; surfaces a clear error if not\n3. **Agent design** — LLM selects the correct wiring pattern and generates spawn configs with real entity IDs\n4. **Spawning** — agents are created and registered in the spawn registry (auto-restore on restart)\n5. **Rule registration** — the rule is saved in main's pipeline registry with its agent list\n\n### Wiring Patterns\n\nThe pipeline builder uses six canonical patterns:\n\n| Pattern | Trigger | Action | Agents spawned |\n|---------|---------|--------|----------------|\n| 1 | HA sensor state change | HA service call (light/switch/climate) | dynamic filter agent + `ha_actuator` |\n| 2 | HA sensor state change | Discord/webhook notification | dynamic agent |\n| 3 | Webcam object detection | HA service call | dynamic YOLO agent + `ha_actuator` |\n| 4 | Webcam object detection | Discord/webhook notification | dynamic YOLO agent + dynamic notify agent |\n| 5 | Timer/schedule | HA service call | dynamic timer agent + `ha_actuator` |\n| 6 | MQTT sensor data + condition (e.g. 
temp \u003e 20 AND lamp is on) | HA service call | dynamic monitor agent + `ha_actuator` |\n\nPattern 1 requires a dynamic filter agent because HA state is nested under `new_state.state` — the `ha_actuator`'s `detection_filter` only matches top-level payload keys, so the filter agent extracts the state and re-publishes a clean trigger.\n\n### Pipeline Commands\n\n| Command | Description |\n|---------|-------------|\n| `/rules` | List all active pipeline rules with agent status (green/red) and creation time |\n| `/rules delete \u003crule_id\u003e` | Stop all agents for a rule and remove it from the registry |\n\n### HomeAssistantActuatorAgent\n\nThe actuator end of every HA pipeline. Each instance subscribes to one or more MQTT topics, evaluates optional HA entity conditions, enforces a configurable cooldown, and calls HA services via a persistent WebSocket connection.\n\n```\nDynamicAgent (sensor/filter) → MQTT topic → HomeAssistantActuatorAgent → HA service call\n```\n\nOne instance is spawned per automation, configured with an `ActuatorConfig`:\n\n```python\nActuatorConfig(\n    automation_id    = \"person-light\",\n    mqtt_topics      = [\"custom/detections/living-room\"],\n    detection_filter = {\"detected\": True},\n    cooldown_seconds = 10.0,\n    conditions       = [\n        ActuatorCondition(entity_id=\"sun.sun\", attribute=\"state\", operator=\"eq\", value=\"below_horizon\")\n    ],\n    actions          = [\n        ActuatorAction(domain=\"light\", service=\"turn_on\", entity_id=\"light.living_room\")\n    ],\n)\n```\n\nDetection filter values can be plain literals (equality) or operator dicts such as `{\"gte\": 0.7}`. Supported operators: `eq`, `ne`, `gt`, `lt`, `gte`, `lte`. Conditions use AND logic and query live HA entity state via WebSocket.\n\n---\n\n\n## 10. TopicBus — Reactive Pub/Sub Coordination\n \nThe TopicBus is Wactorz's shift from name-based RPC to topic-based reactive coordination. 
Instead of the planner hardcoding which agent to call by name, agents declare what data they **produce** and **consume** — and the system wires them automatically by topic compatibility.\n \n### Components\n \n| Component | Role |\n| --- | --- |\n| `TopicContract` | What an agent declares it produces/consumes — topics, schemas, triggers |\n| `TopicRegistry` | Global index of all live contracts, queryable by topic pattern or keyword |\n| `SharedStateHub` | Retained MQTT topics for world state (HA entities, presence, energy) |\n| `StreamWindow` | Sliding time window over a topic stream for temporal reasoning |\n| `TopicBus` | Ties everything together — registry, state hub, auto-wiring |\n \n### Topic Namespaces\n \n| Topic Pattern | Description |\n| --- | --- |\n| `home/state/{domain}/{entity_id}` | HA entity states (retained, updated by bridge) |\n| `home/presence/{zone}` | Occupancy/presence per zone (retained) |\n| `home/energy/current` | Current energy consumption (retained) |\n| `agents/{name}/data/{key}` | Agent-published data (retained world state) |\n| `custom/{agent}/{stream}` | Agent-to-agent data streams |\n| `wactorz/intents/{id}` | Planner-published task intents |\n| `wactorz/results/{id}` | Agent-published results to planner intents |\n \n### Observed Schema Introspection\n \nThe #1 failure mode when wiring LLM-generated agents together is the **vocabulary mismatch**: a producer publishes `{\"temp\": 30.5}` but the consumer reads `payload[\"temperature\"]`. The field names are semantically identical but syntactically different — resulting in a `KeyError` at runtime.\n \nWactorz solves this with **observed_samples** — a first-class field on `TopicContract` that auto-captures the actual field names from real published messages:\n \n**Auto-capture on publish:** Every time `agent.publish(topic, data)` is called with a dict payload, `TopicContract.update_observed()` records the real field names and types. 
This happens automatically — agents don't need to call anything extra.\n \n```python\n# Producer publishes:\nawait agent.publish('sensors/data', {'temp': 30.5, 'humidity': 47.7})\n \n# TopicContract auto-captures:\n# observed_samples = {\n#   'sensors/data': {\n#     'fields':  {'temp': 'float', 'humidity': 'float'},\n#     'example': {'temp': 30.5, 'humidity': 47.7}\n#   }\n# }\n```\n \n**Planner reads before generating:** Before writing consumer code, the planner checks `observed_samples` on registered contracts. If none exist yet, it falls back to `_sample_live_topics()` — a single MQTT connection subscribes to all known topics with a global timeout and captures one real message per topic.\n \nThe result is injected into the LLM prompt:\n \n```\n═══ LIVE TOPIC SAMPLES (use EXACTLY these field names in code!) ═══\n  Topic: sensors/data  (published by temp-simulator)\n    Fields: {'temp': 'float', 'humidity': 'float'}\n    Example payload: {'temp': 30.5, 'humidity': 47.7}\n \nCRITICAL: Use payload['temp'] — NOT payload['temperature'].\n```\n \n### TopicContract Safety Guards\n \n| Guard | What It Catches |\n| --- | --- |\n| String → List coercion | `publishes=\"topic\"` becomes `[\"topic\"]` in `__post_init__`. Prevents char-by-char iteration that would register 32 single-character \"topics\" |\n| Bogus topic filter | Strips entries like `\"publishes\"`, `\"subscribes\"` that leak from LLM kwarg name/value confusion |\n| Kwarg aliases in `declare_contract()` | `schema=...` maps to `produces_schema`. Also accepts `output_schema`, `topics`, `subscribe`, etc. 
|\n| Manifest propagation | `observed_samples` included in the agent's retained MQTT manifest so the planner has schema data even across restarts |\n \n### Auto-Wiring\n \nThe TopicBus automatically detects wiring opportunities when contracts are registered:\n \n```\n[TopicBus] Auto-wiring opportunity: temp-simulator → mean-logger via sensors/data\n```\n \nThe planner's `/bus` command shows the full registry:\n \n```\nTopicBus — Reactive Pub/Sub Registry\n  agents with contracts : 2\n  published topics      : 1\n  subscribed topics     : 1\n  auto-wiring pairs     : 1\n \n  [temp-simulator]\n    publishes : sensors/data\n    OBSERVED on 'sensors/data': fields={'temp': 'float', 'humidity': 'float'}\n  [mean-logger]\n    subscribes: sensors/data\n \nAuto-wiring opportunities:\n  temp-simulator → mean-logger  via sensors/data\n```\n \n### StreamWindow\n \nAgents can create sliding time windows over MQTT topic streams for temporal reasoning — without implementing their own ring buffers:\n \n```python\nasync def setup(agent):\n    agent.state['w'] = agent.window('sensors/temp', seconds=300)  # NO await\n \nasync def process(agent):\n    w = agent.state['w']\n    if w.rising(threshold=3.0):\n        await agent.alert('Temperature rising fast!')\n    if w.absent_for(60):\n        await agent.alert('Sensor stopped publishing!')\n    avg = w.mean('value')\n```\n \nAvailable methods: `mean`, `min`, `max`, `count`, `rising`, `falling`, `stable`, `absent_for`, `event_count`, `latest`, `values`.\n \n\n## 11. Code Safety \u0026 Validation\n \nLLM-generated code compiles but frequently fails at runtime. Wactorz uses a **5-layer defense** to catch and recover from these failures, plus a **pre-exec safety validator** and **timeout guards** to prevent dangerous or runaway code.\n \n### 5-Layer Self-Healing Defense\n \nEach layer catches what the previous one missed:\n \n| Layer | Where It Runs | What It Does | Coverage |\n| --- | --- | --- | --- |\n| **1. 
Prompt Engineering** | Planner LLM prompt | Explicitly lists sync vs async methods with `(SYNC, NO await!)` labels | ~70% — LLM may ignore instructions |\n| **2. Code Sanitizer** (`_sanitize_code`) | DynamicAgent compile time | Regex strips `await` from `agent.subscribe()`, `.persist()`, `.recall()`, etc. Also removes LLM self-setup blocks (openai/anthropic imports, API key assignments) | ~95% — misses dynamic patterns |\n| **3. `_AwaitableNone` sentinel** | Every sync API method | Sync methods return a sentinel whose `__await__` completes immediately. `await agent.subscribe(...)` silently works instead of crashing | Catches anything the sanitizer missed |\n| **4. Callback wrapper** (`_safe_invoke`) | Subscribe listener | Catches `TypeError` from `await None` inside callbacks, logs once, suppresses for all subsequent messages | Prevents infinite error spam |\n| **5. LLM self-correction** (`_run_setup`) | Setup phase | If `setup()` raises a runtime error, the traceback + API docs are sent to the LLM, code is fixed, recompiled, and retried (up to 2 attempts) | Last resort — works for most fixable errors |\n \n### Pre-Exec Safety Validator\n \n`_validate_code_safety()` runs after sanitization but before `exec()`. 
It scans for dangerous patterns in two tiers:\n \n**Blocked (code won't run):**\n \n| Pattern | Reason |\n| --- | --- |\n| `os.system()`, `os.popen()`, `os.exec*()` | Shell execution |\n| `os.remove()`, `os.rmdir()`, `shutil.rmtree()` | File/directory deletion |\n| `subprocess` with `rm` | Destructive shell commands |\n| `eval()`, `__import__()` | Arbitrary code execution |\n| `open()` in write mode | Use `agent.persist()` instead |\n| Raw `socket.socket()` | Use httpx or `agent.publish` |\n \n**Warned (runs but logged):**\n \n| Pattern | Reason |\n| --- | --- |\n| `subprocess` usage | Ensure necessary |\n| `ctypes` | Low-level C interface |\n| `pickle.loads` | Deserialization risk if data is untrusted |\n| `while True:` without `await` | May block event loop |\n \nThis is a best-effort blocklist, not a sandbox. For true isolation in multi-tenant scenarios, run DynamicAgents in a subprocess with seccomp or Docker.\n \n### Planner Code Validator\n \n`_validate_pipeline_code()` runs on every plan right after the LLM generates it and before any agents are spawned:\n \n| Check | Action |\n| --- | --- |\n| `await` on sync methods | Strips `await` from `agent.subscribe()`, `.persist()`, `.recall()`, etc. |\n| Raw `aiomqtt.Client()` | Rewrites to `agent.subscribe()` pattern |\n| Direct HA REST API calls | Flags `/api/services/` patterns — should use `ha_actuator` instead |\n \n### Timeout Guards\n \n| Guard | Timeout | What Happens on Timeout |\n| --- | --- | --- |\n| `process()` loop | 120 seconds | Logs \"likely a blocking call without `run_in_executor`\", publishes error, continues with backoff |\n| `handle_task()` | 60 seconds | Returns error result to caller so the planner doesn't hang waiting |\n \n---\n\n## 12. LLM Cost Tracking\n\nEvery LLM call across every agent accumulates token usage into three counters: `total_input_tokens`, `total_output_tokens`, and `total_cost_usd`. 
These are visible per-agent in the dashboard and via `/cost` in the CLI.\n\nCost is tracked for all five providers (Anthropic, OpenAI, Ollama free, NIM free/paid, Google Gemini). The `HomeAssistantAgent` tracks costs across all 7 of its internal LLM calls: classification, hardware selection, correction retry, automation generation, delete confirmation, edit identification, and edit generation.\n\n### Google Gemini Pricing (per 1M tokens, standard context ≤200K)\n\n| Model | Input | Output | Notes |\n|-------|-------|--------|-------|\n| `gemini-2.5-flash-lite` | $0.10 | $0.40 | Cheapest, fast, free tier |\n| `gemini-2.0-flash` | $0.10 | $0.40 | Fast \u0026 capable, free tier |\n| `gemini-2.5-flash` | $0.30 | $2.50 | Default, hybrid reasoning, free tier |\n| `gemini-2.5-pro` | $1.25 | $10.00 | Best for coding \u0026 complex tasks |\n| `gemini-3.1-pro` | $2.00 | $12.00 | Flagship, no free tier |\n\nPro models charge 2x for prompts above 200K tokens. Get a free API key at [aistudio.google.com](https://aistudio.google.com).\n\n---\n \n\n## 13. 
Interfaces\n\n### CLI (Streaming)\n\n```bash\npython -m wactorz                                              # Anthropic Claude (default)\npython -m wactorz --llm openai\npython -m wactorz --llm ollama --ollama-model llama3\npython -m wactorz --llm nim --nim-model meta/llama-3.3-70b-instruct\npython -m wactorz --llm gemini                                         # gemini-2.5-flash default\npython -m wactorz --llm gemini --gemini-model gemini-2.5-pro\npython -m wactorz --interface discord --discord-token YOUR_TOKEN\n```\n\n**CLI commands:**\n\n| Command | Description |\n|---------|-------------|\n| `/agents` | List all running agents with type and status |\n| `/nodes` | List remote nodes with online/offline status and their agents |\n| `/rules` | List all active pipeline rules |\n| `/rules delete \u003cid\u003e` | Stop and delete a pipeline rule by its ID |\n| `/memory` | Show stored user facts and conversation summary |\n| `/memory clear` | Wipe all stored memory |\n| `/memory forget \u003ckey\u003e` | Remove a specific stored fact |\n| `/webhook \u003cservice\u003e \u003curl\u003e` | Store a notification webhook URL |\n| `/topics` | List known MQTT topics and their publishing agents |\n| `/cost` | Show per-agent token usage and cost breakdown |\n| `/clear` | Clear the main agent's conversation history |\n| `/clear-plans` | Wipe the planner's plan cache |\n| `/deploy \u003cnode-name\u003e` | Bootstrap a new remote node via SSH |\n| `/deploy-pkg \u003chost\u003e \u003cpkg...\u003e` | Install pip packages on a remote node |\n| `/migrate \u003cagent\u003e \u003cnode\u003e` | Move a running agent to a different node |\n| `/help` | Show all available commands |\n| `@agent-name` | Route your next message directly to a specific agent |\n\n### REST API\n\nStart with `--interface rest` (default port 8080). Send `POST` requests to `/chat` with `{\"message\": \"...\"}`. Responses are blocking (non-streaming). 
Suitable for integration with other services.\n\nThe Home Assistant map snapshot is also available at `GET /api/ha-map/latest`. It returns the latest cached map payload from `HomeAssistantMapAgent`, or `404` if no snapshot has been fetched yet.\n\n### Discord\n\nSet `DISCORD_BOT_TOKEN` and start with `--interface discord`. The bot responds when **mentioned** (e.g. `@YourBot turn on the lights`). Make sure to enable the **Message Content Intent** in your Discord Developer Portal under Bot → Privileged Gateway Intents.\n\n### Telegram\n\nSet `TELEGRAM_BOT_TOKEN` and start with `--interface telegram`. The bot responds to any direct message — no prefix needed. Each user runs their own bot with their own token, so it is self-hosted and independent.\n\n```bash\npython -m wactorz --interface telegram\n```\n\n**Setup steps:**\n1. Create a bot via [@BotFather](https://t.me/BotFather) → `/newbot` → copy the token\n2. Add `TELEGRAM_BOT_TOKEN=\u003ctoken\u003e` to your `.env`\n3. Start wactorz and send `/start` to your bot — it replies with your numeric user ID\n4. Add `TELEGRAM_ALLOWED_USER_ID=\u003cid\u003e` to your `.env` to lock the bot to only you\n\n```env\nTELEGRAM_BOT_TOKEN=7123456789:AAF...\nTELEGRAM_ALLOWED_USER_ID=123456789\n```\n\n\u003e **Privacy \u0026 security notes:**\n\u003e - Telegram bots are publicly discoverable by username. Without `TELEGRAM_ALLOWED_USER_ID` set, anyone who finds your bot can send it messages and consume your LLM credits. **Always set it.**\n\u003e - Your bot token is a secret — treat it like a password. Never commit it to git. Make sure `.env` is in your `.gitignore`.\n\u003e - Messages pass through Telegram's servers. If end-to-end privacy is a hard requirement, consider the REST or CLI interface instead.\n\u003e - If your token is ever exposed (e.g. 
accidentally shared), revoke it immediately via BotFather: `/mybots` → select your bot → API Token → Revoke.\n\n### WhatsApp\n\nSet `TWILIO_ACCOUNT_SID`, `TWILIO_AUTH_TOKEN`, and `TWILIO_WHATSAPP_FROM` and start with `--interface whatsapp`. Wactorz runs an aiohttp webhook server that receives incoming messages from Twilio. The same `process_user_input()` pipeline handles all interfaces.\n\n### Live Dashboard\n\nStart `monitor_server.py` alongside wactorz. Open `monitor.html` in a browser. The dashboard shows real-time agent cards, log streams, token cost meters, spawn/stop controls, and error alerts — all fed via MQTT over WebSocket.\n\n---\n\n## 14. MQTT Topic Reference\n\n| Topic | Description |\n|-------|-------------|\n| `agents/{id}/heartbeat` | Liveness pulse every 10s — name, state, metrics |\n| `agents/{id}/logs` | Log events, spawn notifications, user interactions |\n| `agents/{id}/errors` | Structured error events with phase, severity, traceback |\n| `agents/{id}/alert` | Alert events (heartbeat timeout or error escalation) |\n| `agents/{id}/metrics` | Token usage, cost, tasks completed after each LLM call |\n| `agents/{id}/completed` | Task completion notification with result preview |\n| `agents/{id}/actuations` | Fired by `HomeAssistantActuatorAgent` on each HA service call |\n| `agents/by-name/{name}/task` | Address a task to an agent by name (used by remote agents) |\n| `agents/{id}/manifest` | Retained capability manifest — publishes, subscribes, schemas, observed samples |\n| `agents/{name}/data/{key}` | Agent-published world state (retained, via SharedStateHub) |\n| `system/health` | Global health snapshot every 15s — running/stopped/failed counts |\n| `homeassistant/state_changes/{domain}/{entity_id}` | HA state changes (published by StateBridgeAgent) |\n| `homeassistant/map/entities_with_location` | Live entity/location map (published by MapAgent) |\n| `custom/detections/{slug}` | Object detection events from YOLO pipeline agents |\n| 
`custom/triggers/{slug}` | Filtered state triggers re-published by pipeline filter agents |\n| `nodes/{name}/spawn` | Spawn a new agent on a remote node |\n| `nodes/{name}/stop` | Stop a named agent on a remote node |\n| `nodes/{name}/migrate` | Move an agent from this node to another |\n| `nodes/{name}/list` | Request list of agents running on a node |\n| `nodes/{name}/heartbeat` | Node liveness pulse — agent list, broker, timestamp |\n| `nodes/{name}/migrate_result` | Migration success/failure notification |\n\n---\n\n## 15. Built-in Specialist Agents\n\n### ManualAgent — PDF Specialist\n\nFinds and extracts product manuals from the web using a 3-layer search strategy:\n\n1. **Direct URL construction** — for known brands (e.g. Philips), tries manufacturer CDNs directly with a HEAD request\n2. **DuckDuckGo search** — with multiple key name fallbacks (`href`, `url`, `link`)\n3. **Bing HTML scrape** — parses HTML for PDF links and trusted manual site URLs\n\nPDF content is extracted in memory (`pdfplumber` → `pymupdf` fallback) and stored in the agent's persistence so repeat questions don't require re-downloading.\n\n### HomeAssistantAgent — HA Automation\n\nConnects to your Home Assistant instance (set `HA_URL` and `HA_TOKEN`) and handles intents, classified by a cheap single-token LLM call:\n\n| Intent | Description |\n|--------|-------------|\n| `recommend_hardware` | Suggests devices and entities for an automation request |\n| `create_automation` | Generates and inserts a new automation via the HA REST API |\n| `edit_automation` | Identifies which automation to change and applies the update |\n| `delete_automation` | Finds and deletes an automation by name (fuzzy matching) |\n| `list_automations` | Returns a formatted list of all automations |\n| `list_areas` | Lists all Home Assistant areas |\n| `list_devices` | Lists all devices |\n| `list_entities` | Lists all entities |\n\nDevice and automation data is cached (30s TTL). 
The agent includes a self-correction loop for hardware selection — if the LLM returns `can_fulfill=true` with an empty hardware list, it prompts for a correction automatically.\n\n### OneOffActuatorAgent — One-Shot HA Actuation\n\nSpawned by `MainActor` only for `ACTUATE` intent requests: immediate device control where the whole user request is about acting on Home Assistant devices right now.\n\nExamples:\n\n- `turn on the living room light`\n- `turn off the office light`\n- `set heating to 23 degrees`\n- `lock the front door`\n- `turn on the hallway light and turn off the kitchen light`\n\nFlow:\n\n1. Fetch the full Home Assistant device/entity map with location context\n2. Ask the configured LLM to resolve the natural-language request into a JSON array of Home Assistant service calls\n3. Execute those calls via the Home Assistant WebSocket API\n4. Send the result back to `MainActor`\n5. Publish metrics, unregister, stop, and delete its own persistence directory\n\nThe agent is ephemeral by design. Unlike `HomeAssistantAgent`, it does not handle listing, discovery, automation CRUD, or persistent rules.\n\n### HomeAssistantMapAgent — Live Entity Map\n\nMaintains a live, location-enriched map of every HA device and entity. On startup it fetches and caches the latest snapshot locally without dispatching it, then keeps a persistent WebSocket connection to Home Assistant and re-fetches the full device/entity/location dataset every time the entity registry changes. 
Event-driven and manual refreshes dispatch the result to MQTT or forward it directly to another actor by name.\n\n**Published topic** (default): `homeassistant/map/entities_with_location`\n\n**Task commands** (sent to agent mailbox):\n\n| Command | Description |\n|---------|-------------|\n| `refresh` | Force an immediate rebuild and publish |\n| `refresh simple` | Force an immediate rebuild and publish without entity states |\n| `status` | Return connection state, event counter, and last error |\n\nConfigure with `HA_MAP_AGENT_OUTPUT_TOPIC` and optionally `HA_MAP_AGENT_TARGET_ACTOR` (routes the payload to another actor instead of MQTT).\n\nThe latest cached snapshot is exposed through the REST API at `GET /api/ha-map/latest`.\n\nWhen a map payload is too large for one MQTT message, the agent emits a `home_assistant_map_update_chunked` manifest first, followed by one or more `home_assistant_map_update_chunk` messages on the same topic carrying a base64-encoded JSON payload.\n\n### HomeAssistantStateBridgeAgent — State Change Bridge\n\nBridges every Home Assistant `state_changed` event to MQTT. Used as the trigger source for all HA-based reactive pipelines.\n\n**Published topic** (default): `homeassistant/state_changes/{domain}/{entity_id}`\n\nKey options:\n- `HA_STATE_BRIDGE_DOMAINS` — comma-separated allow-list (e.g. `light,switch,sensor`); empty = all domains\n- `HA_STATE_BRIDGE_PER_ENTITY` — `1` (default) splits into per-entity sub-topics; `0` sends everything to one topic\n\n**Task commands**: `status`\n\n### HomeAssistantActuatorAgent — Reactive Actuator\n\nSee [Section 9 — Reactive Pipelines](#9-reactive-pipelines) for full documentation.\n\n### CodeAgent \u0026 MLAgent\n\nPre-built agents for code execution and ML inference. `CodeAgent` runs arbitrary Python in a sandboxed subprocess. `MLAgent` wraps YOLO and anomaly detection models (`AnomalyDetectorAgent`) for computer vision tasks over MQTT — useful for smart building sensor streams.\n\n---\n\n## 16. 
Catalog Agent — Pre-built Recipe Library\n\nThe `CatalogAgent` is a built-in agent that starts with the system and holds a library of ready-made agent recipes. Instead of writing spawn code from scratch, you ask the catalog to spawn a named agent for you — it handles everything including injecting the code, schemas, and capabilities into main's existing spawn pipeline.\n\n### Why It Exists\n\nSome agents are too useful to re-invent every session but too specific to hardcode into `cli.py` as permanent agents. The catalog is the middle ground: recipes live in the `catalogue_agents/` folder as plain Python files, the catalog loads them at startup, and any agent — main, planner, or the user directly — can request a spawn by name.\n\n### Usage\n\n**Direct (from CLI):**\n\n```text\n@catalog spawn image-gen-agent\n@catalog spawn doc-to-pptx-agent\n@catalog list\n@catalog info doc-to-pptx-agent\n```\n\n**Natural language via main:**\n\n```text\n\"spawn the image generation agent\"\n\"what agents can you spawn for me?\"\n\"I need to convert a PDF to PowerPoint\"\n```\n\nMain discovers the catalog via `/capabilities` and routes through it automatically.\n\n### Available Actions\n\n| Action | Payload | Description |\n|--------|---------|-------------|\n| `list` | `{\"action\": \"list\"}` | Returns all available recipes with name, description, and capabilities |\n| `info` | `{\"action\": \"info\", \"agent\": \"name\"}` | Returns full recipe metadata (without the code string) |\n| `spawn` | `{\"action\": \"spawn\", \"agent\": \"name\"}` | Spawns the named agent via main's spawn pipeline; saves to spawn registry |\n\nSpawned agents are registered in main's spawn registry — they survive restarts just like any manually spawned agent.\n\n### Built-in Recipes\n\n| Recipe | Description | Key Dependencies |\n|--------|-------------|-----------------|\n| `image-gen-agent` | Generates images from text prompts using NVIDIA NIM FLUX.1-dev. Returns absolute PNG path. 
| `requests`, NIM API key |\n| `doc-to-pptx-agent` | Converts PDF or TXT documents into PowerPoint presentations. Extracts real embedded images from the PDF first; falls back to NIM FLUX generation for slides without images. | `pymupdf`, `pdfplumber`, `pptxgenjs` (Node.js) |\n\n### Adding New Recipes\n\nDrop a Python file into `catalogue_agents/` with an `AGENT_CODE` string (the same format as any dynamic agent), then add its entry to `catalog_agent.py`:\n\n```python\n# In catalog_agent.py — _build_catalog()\ncode = _load_recipe(\"my_new_agent.py\")\nif code:\n    catalog[\"my-new-agent\"] = {\n        \"name\":         \"my-new-agent\",\n        \"type\":         \"dynamic\",\n        \"description\":  \"What this agent does\",\n        \"capabilities\": [\"keyword1\", \"keyword2\"],\n        \"input_schema\":  { \"param\": \"str — description\" },\n        \"output_schema\": { \"result\": \"str\" },\n        \"poll_interval\": 3600,\n        \"code\":          code,\n    }\n```\n\nNo changes to `cli.py` or any other file are needed. After the next restart, the recipe is available system-wide.\n\n### image-gen-agent\n\nGenerates images from text prompts via NVIDIA NIM FLUX.1-dev and saves them as PNG files. Requires a free NIM API key (1000 credits/month at [build.nvidia.com](https://build.nvidia.com)).\n\n**Setup:**\n\n```text\n@main remember nim_api_key = nvapi-xxxxxxxxxxxxxxxx\n```\n\n**Task payload:**\n\n```json\n{\n  \"prompt\": \"minimalist flat illustration of renewable energy\",\n  \"output_path\": \"C:/Users/you/Documents/slide.png\",\n  \"width\": 1024,\n  \"height\": 576,\n  \"steps\": 20\n}\n```\n\n**Result:** `{ \"image_path\": \"...\", \"width\": 1024, \"height\": 576, \"size_kb\": 312, \"error\": null }`\n\n### doc-to-pptx-agent\n\nConverts a PDF or TXT document into a polished PowerPoint presentation in four steps:\n\n1. **Read** — extracts text via `pdfplumber` (PDF) or plain read (TXT)\n2. 
**Extract images** — pulls real embedded images from the PDF using PyMuPDF; filters out small decorations (configurable minimum size); assigns images to slides by source-page proximity\n3. **LLM outline** — calls the LLM to produce a structured JSON outline: slide titles, bullets, theme colors, and per-slide image prompts\n4. **Build** — generates and runs a `pptxgenjs` Node.js script that assembles the final `.pptx` with two-column layouts (text left, image right) for content slides\n\nSlides that received a real PDF image skip NIM generation. Slides without one fall back to `image-gen-agent` (if running) or remain text-only.\n\n**Task payload:**\n\n```json\n{\n  \"file_path\": \"C:/Users/you/Documents/report.pdf\",\n  \"output_path\": \"C:/Users/you/Documents/report.pptx\",\n  \"slide_count\": 8,\n  \"theme\": \"dark executive\",\n  \"nim_fallback\": true,\n  \"min_img_width\": 200,\n  \"min_img_height\": 150\n}\n```\n\n**Result:** `{ \"pptx_path\": \"...\", \"slide_count\": 8, \"title\": \"...\", \"images_extracted\": 5, \"images_generated\": 3, \"error\": null }`\n\n---\n\n## 17. Remote Nodes \u0026 Edge Deployment\n\nWactorz can run agents on any machine on your network — Raspberry Pi, VM, cloud server, or any device with Python 3.10+. The edge node only needs a single file and one pip package.\n\n### How It Works\n\n```\n[Main machine]                        [Raspberry Pi / Edge node]\nmain_actor ──MQTT──► nodes/{name}/spawn ──► remote_runner.py\n                                               │  compiles + runs agent\n                                               │  heartbeats every 10s\ndashboard  ◄──MQTT── agents/{id}/heartbeat ◄───┘\n```\n\nThe `remote_runner.py` is fully self-contained — it reimplements the DynamicAgent contract inline without importing anything from the wactorz package. 
Remote agents appear in the dashboard and respond to MQTT commands exactly like local agents.\n\n### Edge Node Requirements\n\n```bash\n# That's it — one package, one file\npip install aiomqtt --break-system-packages\npython3 remote_runner.py --broker 192.168.1.10 --name rpi-kitchen\n```\n\nThe broker address must be reachable **from the Pi** (your main machine's LAN IP, not `localhost`).\n\n### Deploying a Node\n\nThe installer agent handles SSH deployment — no manual file copying needed.\n\n**From the CLI:**\n\n```\n/deploy rpi-kitchen\n```\n\nThis will:\n\n1. Discover the Pi on your LAN (mDNS first, then port-22 scan)\n2. Prompt for SSH user, password, and your MQTT broker IP\n3. Upload `remote_runner.py` via SFTP\n4. Install `aiomqtt` on the Pi\n5. Start the runner in the background\n6. The node appears in `/nodes` within ~15 seconds\n\n**From the chat:**\n\n```\nset up my Raspberry Pi at 192.168.1.50 as a node called rpi-kitchen\n```\n\nThe LLM will call `delegate_to_installer` with a `node_deploy` action automatically.\n\n### Spawning Agents on a Remote Node\n\nAdd `\"node\"` to any spawn block:\n\n```json\n\u003cspawn\u003e\n{\n  \"name\": \"temp-sensor\",\n  \"node\": \"rpi-kitchen\",\n  \"type\": \"dynamic\",\n  \"description\": \"Reads temperature from DHT22 and publishes to MQTT\",\n  \"poll_interval\": 30,\n  \"code\": \"\nasync def setup(agent):\n    await agent.log('Sensor ready on ' + agent.node)\n\nasync def process(agent):\n    import random\n    temp = round(20 + random.uniform(-2, 2), 1)\n    await agent.publish('sensors/temperature', {'value': temp, 'unit': 'C', 'node': agent.node})\n  \"\n}\n\u003c/spawn\u003e\n```\n\nOr just ask in chat: _\"spawn a temperature sensor agent on rpi-kitchen\"_\n\n### Installing Packages on a Node\n\nBefore spawning an agent that needs hardware libraries:\n\n```\n/deploy-pkg 192.168.1.50 adafruit-circuitpython-dht RPi.GPIO\n```\n\nOr include `\"install\"` in the spawn block — the remote runner will pip-install 
them before starting the agent.\n\n### Migrating Agents Between Nodes\n\nMove a running agent to a different machine without stopping it manually:\n\n```\n/migrate temp-sensor rpi-bedroom\n```\n\nOr via chat: _\"move temp-sensor to rpi-bedroom\"_\n\nThe system stops the agent on its current node, starts it fresh on the target, and updates the spawn registry so it restores to the right machine on the next restart.\n\n### Viewing Connected Nodes\n\n```\n/nodes\n```\n\nOutput:\n\n```\n  local                online   @main @monitor @installer @home-assistant-agent\n  rpi-kitchen          online   @temp-sensor\n  rpi-bedroom          OFFLINE  (no agents)\n```\n\nA node is considered online if it sent a heartbeat in the last 30 seconds.\n\n### Remote Agent API\n\nRemote agents have the same `agent.*` API as local agents, with one addition and one limitation:\n\n| Feature | Local | Remote |\n|---------|-------|--------|\n| `agent.publish(topic, data)` | YES | YES |\n| `agent.log(msg)` / `agent.alert(msg)` | YES | YES |\n| `agent.persist(key, val)` / `agent.recall(key)` | YES | YES (JSON file on the Pi) |\n| `agent.send_to(name, payload)` | YES | YES (via MQTT round-trip) |\n| `agent.node` | NO | YES (node name string) |\n| `agent.llm.chat(prompt)` | YES | NO (no LLM provider on edge) |\n\nFor LLM reasoning from a remote agent, use `agent.send_to('main', {'text': prompt})` — main will call its LLM and return the result over MQTT.\n\n### Installer Agent — Remote Actions\n\nThe installer agent handles three actions for node management:\n\n| Action | Description |\n|--------|-------------|\n| `node_deploy` | Full bootstrap: upload runner + install aiomqtt + start process |\n| `node_install` | Install pip packages on a running node via SSH |\n| `node_run` | Run any shell command on a remote node via SSH |\n\nAll three accept `host`, `user`, and either `password` or `key_path` for SSH auth.\n\n---\n\n## 18. 
Installation \u0026 Configuration\n\n### Quick Start\n\n```bash\ngit clone https://github.com/waldiez/wactorz\ncd wactorz\npython -m venv myenv\n\n# Windows\nmyenv\\Scripts\\activate\n# Mac/Linux\nsource myenv/bin/activate\n\npip install -r requirements.txt\n\n# Set your LLM key\nexport ANTHROPIC_API_KEY=sk-ant-...\n\n# Optional: Home Assistant\nexport HA_URL=http://homeassistant.local:8123\nexport HA_TOKEN=your_long_lived_token\n\n# Start\npython -m wactorz\n```\n\n### MQTT Broker\n\nWactorz requires an MQTT broker. The simplest option is Mosquitto running locally:\n\n```bash\n# Windows (after installing Mosquitto)\nmosquitto -v\n\n# Docker\ndocker run -it -p 1883:1883 eclipse-mosquitto\n```\n\nBy default Wactorz connects to `localhost:1883`. Override with `--mqtt-host` and `--mqtt-port`.\n\n### Environment Variables\n\n| Variable | Description |\n|----------|-------------|\n| `ANTHROPIC_API_KEY` | Claude API key (primary LLM) |\n| `OPENAI_API_KEY` | OpenAI key (alternative LLM) |\n| `NIM_API_KEY` | NVIDIA NIM key (free tier — get at build.nvidia.com) |\n| `GEMINI_API_KEY` or `GOOGLE_API_KEY` | Google Gemini API key (free tier — get at aistudio.google.com) |\n| `HA_URL` / `HOME_ASSISTANT_URL` | Home Assistant base URL (e.g. `http://homeassistant.local:8123`) |\n| `HA_TOKEN` / `HOME_ASSISTANT_TOKEN` | HA long-lived access token |\n| `HA_MAP_AGENT_OUTPUT_TOPIC` | MQTT topic for `HomeAssistantMapAgent` (default: `homeassistant/map/entities_with_location`) |\n| `HA_MAP_AGENT_TARGET_ACTOR` | Route map updates to a named actor instead of MQTT |\n| `HA_STATE_BRIDGE_OUTPUT_TOPIC` | Base MQTT topic for `HomeAssistantStateBridgeAgent` (default: `homeassistant/state_changes`) |\n| `HA_STATE_BRIDGE_DOMAINS` | Comma-separated domain allow-list for state bridge (e.g. 
`light,switch,sensor`; empty = all) |\n| `HA_STATE_BRIDGE_PER_ENTITY` | `1` (default) = per-entity sub-topics; `0` = single shared topic |\n| `DISCORD_BOT_TOKEN` | Discord bot token (for `--interface discord`) |\n| `TELEGRAM_BOT_TOKEN` | Telegram bot token from BotFather (for `--interface telegram`) |\n| `TELEGRAM_ALLOWED_USER_ID` | Optional — restrict Telegram bot to a single numeric user ID |\n| `TWILIO_ACCOUNT_SID` | Twilio account SID (for `--interface whatsapp`) |\n| `TWILIO_AUTH_TOKEN` | Twilio auth token |\n| `TWILIO_WHATSAPP_FROM` | Twilio WhatsApp sender number |\n\n---\n\n## 19. Troubleshooting\n\n### Conversation history corruption (400 Bad Request loop)\n\nIf you see repeated `400` errors from the Anthropic API with `\"Input should be a valid dictionary\"`, the persisted conversation history has been corrupted. Run the included cleanup script once:\n\n```bash\npython fix_history.py\n```\n\nThen restart Wactorz. The LLM agent also sanitizes history on every load and before every API call as a belt-and-suspenders guard.\n\n### Spawned agent takes too long to appear\n\nWactorz checks whether required packages are already importable before calling the installer. If a package is already installed, the agent spawns instantly. If the installer is called, it echoes the `task_id` back in its reply so the waiting future resolves immediately rather than sitting at the timeout.\n\n### Pipeline rule set up but not triggering\n\n1. Check `/rules` — verify all agents show green status\n2. Check that `HomeAssistantStateBridgeAgent` is running (look for it in `/agents`)\n3. Verify the entity ID is correct — run `@home-assistant-agent list_entities` to check\n4. For HA state triggers the dynamic filter agent must be subscribed to the correct MQTT topic\n\n### False \"unresponsive\" alerts for healthy agents\n\nThe monitor uses two liveness signals: `STATUS_RESPONSE` messages and `metrics.last_heartbeat` (updated every 10 seconds automatically). 
Infrastructure agents (monitor, installer, main, code-agent, anomaly-detector, home-assistant-agent) are excluded from user-facing notifications even if they are temporarily quiet.\n\n### Discord bot not responding\n\nEnsure **Message Content Intent** is enabled in the Discord Developer Portal (Bot → Privileged Gateway Intents). The bot responds when mentioned — e.g. `@YourBot hello`.\n\n### Telegram bot not responding\n\n- Check that `TELEGRAM_BOT_TOKEN` is set correctly in `.env`\n- If `TELEGRAM_ALLOWED_USER_ID` is set, send `/start` first to confirm your user ID matches\n- The bot uses long polling — no public server or webhook needed\n\n---\n\n## Appendix: File Structure\n\n```\nwactorz/\n├── main.py                                    Entry point — CLI args, actor system setup, supervision tree\n├── remote_runner.py                           Self-contained edge node runner — deploy to any Pi or machine\n├── monitor_server.py                          MQTT → WebSocket bridge for dashboard\n├── monitor.html                               Live web dashboard\n├── fix_history.py                             One-time corrupted history cleanup utility\n├── requirements.txt\n│\n├── core/\n│   ├── actor.py                               Base Actor — mailbox, lifecycle, heartbeat, spawn, supervisor\n│   ├── registry.py                            ActorSystem, ActorRegistry, Supervisor — routing \u0026 OTP restarts\n│   └── topic_bus.py                           TopicBus — reactive pub/sub coordination, schema introspection,\n│                                              TopicContract, TopicRegistry, SharedStateHub, StreamWindow\n│\n├── agents/\n│   ├── llm_agent.py                           LLMAgent — 5 providers, rolling summarization, cost tracking\n│   ├── main_actor.py                          MainActor — intent routing, memory, user facts, pipeline rules\n│   ├── dynamic_agent.py                       DynamicAgent — runtime code executor, error events\n│   ├── 
planner_agent.py                       PlannerAgent — task planning + reactive pipeline builder\n│   ├── monitor_agent.py                       MonitorAgent — heartbeat, error registry, recovery\n│   ├── installer_agent.py                     InstallerAgent — pip install locally + SSH deploy to remote nodes\n│   ├── catalog_agent.py                       CatalogAgent — pre-built recipe library, spawns agents by name\n│   ├── manual_agent.py                        ManualAgent — 3-layer PDF search and extraction\n│   ├── home_assistant_agent.py                HomeAssistantAgent — HA automation CRUD (LLM-backed, intent routing)\n│   ├── home_assistant_map_agent.py            HomeAssistantMapAgent — live entity/location map via HA WebSocket\n│   ├── home_assistant_state_bridge_agent.py   HomeAssistantStateBridgeAgent — HA state_changed → MQTT bridge\n│   ├── home_assistant_actuator_agent.py       HomeAssistantActuatorAgent — reactive MQTT→HA service actuator\n│   ├── code_agent.py                          CodeAgent — sandboxed Python execution\n│   └── ml_agent.py                            MLAgent, YOLOAgent, AnomalyDetectorAgent\n│\n└── interfaces/\n    └── chat_interfaces.py                     CLI (with /deploy, /migrate, /nodes), REST, Discord, WhatsApp\n\ncatalogue_agents/                              Pre-built agent recipe files (loaded by CatalogAgent at startup)\n├── __init__.py\n├── image_gen_agent.py                         NIM FLUX.1-dev image generation\n└── doc_to_pptx_agent.py                      PDF/TXT → PowerPoint conversion with real image extraction\n\nstate/                                         Persisted agent state (auto-created, never commit to git)\n├── main/state.pkl                             Spawn registry, pipeline rules, user facts, webhook URLs, history\n├── planner/state.pkl                          Plan cache\n└── {agent-name}/state.pkl                     Per-agent persistent state\n```\n\n---\n\n_Wactorz — the 24/7 agents built 
for the physical world._\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwaldiez%2Fwactorz","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwaldiez%2Fwactorz","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwaldiez%2Fwactorz/lists"}