{"id":45106597,"url":"https://github.com/tvanfossen/entropic","last_synced_at":"2026-05-30T07:01:07.323Z","repository":{"id":335108925,"uuid":"1119611149","full_name":"tvanfossen/entropic","owner":"tvanfossen","description":"Local-first agentic inference engine in C/C++. Multi-tier model routing, grammar-constrained output, MCP tool servers. Embeddable via C ABI.","archived":false,"fork":false,"pushed_at":"2026-05-28T13:37:47.000Z","size":43610,"stargazers_count":2,"open_issues_count":19,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-28T15:14:58.430Z","etag":null,"topics":["agentic-ai","agentic-framework","cpp","cpp20","cuda","edge-ai","embedded-ai","gbnf","gguf","grammar-constrained-decoding","inference-engine","llama-cpp","llm","local-llm","mcp","on-device-ai","privacy-first","tool-calling"],"latest_commit_sha":null,"homepage":"https://github.com/tvanfossen/entropic","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tvanfossen.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":"docs/security.md","support":null,"governance":null,"roadmap":"docs/roadmap.md","authors":"AUTHORS","dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":"NOTICE","maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-12-19T14:52:43.000Z","updated_at":"2026-05-28T13:37:51.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/tvanfossen/entropic","commit_stats":null,"previous_names":["tvanfossen/entropi","tvanfossen/entropic"],"tags_count":44,"template":false,"template_full_name":null,"purl":"pkg:github/tvanf
ossen/entropic","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tvanfossen%2Fentropic","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tvanfossen%2Fentropic/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tvanfossen%2Fentropic/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tvanfossen%2Fentropic/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tvanfossen","download_url":"https://codeload.github.com/tvanfossen/entropic/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tvanfossen%2Fentropic/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33682998,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-30T02:00:06.278Z","response_time":92,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agentic-ai","agentic-framework","cpp","cpp20","cuda","edge-ai","embedded-ai","gbnf","gguf","grammar-constrained-decoding","inference-engine","llama-cpp","llm","local-llm","mcp","on-device-ai","privacy-first","tool-calling"],"created_at":"2026-02-19T22:02:00.741Z","updated_at":"2026-05-30T07:01:07.304Z","avatar_url":"https://github.com/tvanfossen.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 
Entropic\n\n\u003e Local-first agentic inference engine — your models, your hardware, your control\n\n**API reference:** [tvanfossen.github.io/entropic](https://tvanfossen.github.io/entropic/) — auto-generated from doxygen on every release\n\n## What Is Entropic?\n\nEntropic is a **C inference engine** that turns a local GGUF model into a\nmulti-tier, tool-calling AI system. It runs entirely on your hardware — no\ncloud, no API keys, no telemetry. You control the model, the prompts, the\ntools, and the data.\n\nThe name comes from information theory: every handoff between human intent,\nprompt, and model is a lossy translation. Information decays at each boundary.\nEntropic is purpose-built to manage that decay — structured context management,\nidentity-based delegation, grammar-constrained output, and tool-augmented\nreasoning minimize what gets lost along the way.\n\n## Why Entropic?\n\n**You want local AI that actually does things**, not just answers questions.\n\nMost local inference tools give you a model and a chat loop. Entropic gives you\nan **engine** — the infrastructure between your application and the model that\nhandles the hard parts:\n\n| Problem | How Entropic Solves It |\n|---------|----------------------|\n| Model just generates text | Agentic loop: generate → parse tool calls → execute → re-generate |\n| One model, one personality | Identity system: same model serves multiple roles with different prompts, tools, and constraints |\n| Context gets stale | Auto-compaction: summarizes old context to stay within window |\n| Output format unpredictable | Grammar constraints: GBNF grammars force structured output |\n| Need tools but no cloud | MCP tool servers: filesystem, bash, git, web — all local, plugin architecture |\n| Privacy concerns | Zero network calls. Everything stays on your machine. |\n\n## Who Is It For?\n\nEntropic is an engine, not an application. 
It's for developers building\nAI-powered software that needs to run locally:\n\n- **CLI/TUI tools** that use AI for code generation, analysis, or planning\n- **Game engines** that need NPC dialogue or decision-making from a local model\n- **Embedded systems** with on-device inference (CPU-only static build available)\n- **Education platforms** running student-facing AI without cloud dependencies\n- **Privacy-sensitive applications** where data cannot leave the device\n\n## Architecture\n\n```\n┌─────────────────────────────────────────────────────────┐\n│  Your Application                                       │\n│  C/C++ (direct linkage) · Python (ctypes wrapper)       │\n├─────────────────────────────────────────────────────────┤\n│  librentropic.so — C API                                │\n│                                                         │\n│  ┌─────────┐ ┌────────┐ ┌──────┐ ┌───────┐ ┌────────┐ │\n│  │ Engine  │ │Inference│ │ MCP  │ │Config │ │Storage │ │\n│  │ Loop    │ │Backend │ │Servers│ │Loader │ │Backend │ │\n│  │         │ │        │ │      │ │       │ │        │ │\n│  │ Context │ │ Prompt │ │ Tool │ │ YAML  │ │ SQLite │ │\n│  │ Routing │ │ Cache  │ │ Auth │ │ Layer │ │ Audit  │ │\n│  │ Delegate│ │ Grammar│ │Plugin│ │ Merge │ │ Session│ │\n│  └─────────┘ └────────┘ └──────┘ └───────┘ └────────┘ │\n├─────────────────────────────────────────────────────────┤\n│  llama.cpp (CUDA / Vulkan / CPU)                        │\n└─────────────────────────────────────────────────────────┘\n```\n\nPure C at all `.so` boundaries. No C++ ABI crossing. 
Any language that can call\nC functions can use the engine.\n\n## Quick Start\n\nFor consumer install paths (tarball or `pip install entropic-engine`),\nread **[docs/getting-started.md](docs/getting-started.md)** — it covers C/C++ direct\nlinking and the Python wrapper end-to-end.\n\nFor contributors building from source:\n\n### Prerequisites\n\n- Linux (tested on Ubuntu 24.04)\n- cmake 3.21+, C++20 compiler\n- NVIDIA GPU with 16GB+ VRAM (or CPU-only for smaller models)\n- Python 3.10+ (for invoke task runner and pre-commit hooks)\n\n### Build\n\n```bash\ngit clone --recurse-submodules https://github.com/tvanfossen/entropic.git\ncd entropic\n\npython3 -m venv .venv\n.venv/bin/pip install -e \".[dev]\"     # invoke + pre-commit + gcovr + ruff + mypy\n.venv/bin/pre-commit install\n\n# CUDA build (default)\ninv build --clean\n\n# CPU-only build\ninv build --cpu\n```\n\n### Run an Example\n\n```bash\ninv example -n pychess     # Multi-tier chess (C++)\ninv example -n explorer    # Interactive REPL (C++)\ninv example -n headless    # Minimal C harness\n```\n\n## Usage\n\n### C API\n\n```c\n#include \u003centropic/entropic.h\u003e\n\nentropic_handle_t h = NULL;\nentropic_create(\u0026h);\nentropic_configure_dir(h, \".myapp\");  // Layered config resolution\n\n// Streaming generation with full engine pipeline\nentropic_run_streaming(h, \"What is 2+2?\", on_token, NULL, NULL);\n\n// Conversation persists across calls\nentropic_run_streaming(h, \"Explain your reasoning\", on_token, NULL, NULL);\n\n// Manage conversation\nsize_t count;\nentropic_context_count(h, \u0026count);   // Message count\nentropic_context_clear(h);           // New session\n\nentropic_destroy(h);\n```\n\n### Python Wrapper (ctypes)\n\nThe Python package is a thin ctypes binding over the C ABI — no OOP\nwrapper. 
See [docs/getting-started.md](docs/getting-started.md) for the full\nwalkthrough including `entropic install-engine` (downloads the matching\n`librentropic.so` from GitHub Releases).\n\n```python\nimport ctypes\nfrom entropic import (\n    entropic_create, entropic_configure_dir,\n    entropic_run_streaming, entropic_destroy,\n)\n\nhandle = ctypes.c_void_p()\nentropic_create(ctypes.byref(handle))\nentropic_configure_dir(handle, b\".myapp\")\n\n@ctypes.CFUNCTYPE(None, ctypes.c_char_p, ctypes.c_size_t, ctypes.c_void_p)\ndef on_token(tok, n, ud):\n    print(tok.decode(\"utf-8\", errors=\"replace\"), end=\"\", flush=True)\n\nentropic_run_streaming(handle, b\"What is 2 + 2?\", on_token, None, None)\nentropic_destroy(handle)\n```\n\n### Configuration\n\nConfiguration loads in layers, listed from lowest to highest priority (later layers override earlier ones):\n\n1. **Compiled defaults** — struct initializers built into the engine\n2. **Consumer defaults** (`default_config.yaml` in CWD) — shipped with your app\n3. **Global** (`~/.entropic/config.yaml`) — user machine-wide settings\n4. **Project local** (`{project_dir}/config.local.yaml`) — per-project overrides\n5. 
**Environment** (`ENTROPIC_*` variables)\n\nMinimal config:\n\n```yaml\nmodels:\n  lead:\n    path: primary           # Resolves via bundled model registry\n    adapter: qwen35\n    context_length: 16384\n  default: lead\n\nrouting:\n  enabled: false\n\nmcp:\n  enable_entropic: true     # Internal tools (delegate, complete, etc.)\n  enable_filesystem: false\n  enable_bash: false\n\npermissions:\n  auto_approve: true        # Skip tool approval prompts\n```\n\n### Session Logging\n\nWhen using `entropic_configure_dir()`, the engine automatically creates:\n\n| File | Contents |\n|------|----------|\n| `{project_dir}/session.log` | Engine operations — config, routing, timing, errors |\n| `{project_dir}/session_model.log` | Full user/assistant exchanges — streamed in real time |\n\n## Features\n\nComprehensive engine capability inventory grouped by domain, with the\nprimary source file for each item so the README doubles as a navigation\nmap. For the historical \"how we got here\" narrative see\n[docs/roadmap.md](docs/roadmap.md); for the layering behind these\nfeatures see [docs/architecture-cpp.md](docs/architecture-cpp.md).\n\n### Agentic Loop\n\n- Generate → parse tool calls → execute → re-generate, with explicit\n  state machine (IDLE → GENERATING → EXECUTING → VERIFYING → COMPLETE) — `src/core/engine.cpp::loop, execute_iteration`, `include/entropic/core/engine_types.h::AgentState`\n- Streaming generation with cancel-token plumbed end-to-end — `src/core/response_generator.cpp::generate_streaming`\n- Per-iteration hook lifecycle: `ON_LOOP_ITERATION`, `ON_CONTEXT_ASSEMBLE`, `PRE_GENERATE` (cancellable), `POST_GENERATE` (revisable), `PRE_TOOL_CALL`, `POST_TOOL_CALL` — `src/core/engine.cpp::execute_iteration, dispatch_post_generate`\n- `LoopMetrics` surfaced via `last_loop_metrics()` and per-tier accessors (iterations, tool calls, tokens, duration, errors) — `include/entropic/core/engine_types.h::LoopMetrics`\n- Iteration count + budget exposed to the model each turn 
as a system reminder so identity prompts can teach budget-aware rules the model can actually enforce — `src/core/response_generator.cpp::inject_engine_state_reminder`\n- Anti-spiral primitive: warns the model after N consecutive calls of the same tool; threshold via `LoopConfig.max_consecutive_same_tool` — `src/mcp/tool_executor.cpp::update_anti_spiral_tracking`\n- Synthetic completion forced when iteration cap is hit, with terminal-reason metadata for caller diagnostics — `src/core/engine.cpp::loop` (the post-while terminal-reason block)\n- Per-identity overrides for `max_iterations` and `max_tool_calls_per_turn` from frontmatter — `src/facade/entropic.cpp::cache_per_tier_frontmatter`, `include/entropic/core/engine_types.h::effective_max_iterations`\n\n### Identity System \u0026 Delegation\n\n- Multi-tier identity-based delegation: same model, different per-identity prompt + tools + grammar + delegation targets — `src/core/identity_manager.cpp`, `src/core/delegation.cpp`\n- Tier locks at first generation; children run the engine loop independently with their own context and locked tier — `src/core/response_generator.cpp::lock_tier_if_needed`\n- `entropic.delegate` (single child) and `entropic.pipeline` (multi-stage) directives — `src/mcp/servers/entropic_server.cpp`, `src/core/directives.cpp`\n- `entropic.complete` for explicit child-completion summaries — `src/mcp/servers/entropic_server.cpp`\n- Delegation depth tracking + ancestor-tier stack for cycle detection and context-bleed isolation — `include/entropic/core/engine_types.h::delegation_depth, delegation_ancestor_tiers`\n- Single-delegate relay: skip lead re-synthesis when only one delegate fired and the parent identity allows passthrough — `src/core/engine.cpp::finalize_delegation_result, relay_partial_result`\n- Budget-exhausted child relay: validate-and-relay partial summaries with `[partial — budget_exhausted]` prefix and verdict-tagged metadata, instead of silently dropping — 
`src/core/engine.cpp::relay_partial_result`, `src/core/delegation.cpp::build_child_result`\n- Conversation parent/child ID linking through SQLite storage — `src/storage/database.cpp`, `include/entropic/core/engine_types.h::parent_conversation_id`\n- Worktree session isolation for delegation — `src/core/worktree.cpp`, `include/entropic/core/worktree.h`\n\n### Validation\n\n- Constitutional validator with critique-revise sub-loop and explicit `ValidationVerdict` enum (passed / revised / rejected_max_revisions / skipped) — `src/core/constitutional_validator.cpp`, `include/entropic/types/validation.h`\n- Per-tier `validation_rules` from identity frontmatter; constitution is background context when per-tier rules exist — `src/core/constitutional_validator.cpp::build_critique_prompt, set_tier_rules`\n- Validator can revise content via `POST_GENERATE` hook (Path A: structured JSON, Path B: full message-context revision) — `src/core/constitutional_validator.cpp::handle_hook, attempt_revision`\n- Validation rule visibility on retry: when validator rejects, the next turn's system reminder surfaces the violated rule text — `src/core/engine.cpp::capture_validation_feedback`, `src/core/response_generator.cpp::inject_engine_state_reminder`\n- `ON_COMPLETE` hook for summary validation; can reject and inject feedback as a user-role message tagged `[CITATION VALIDATION] …` — `src/core/engine.cpp::fire_complete_hook`\n- Per-identity validation enable/disable overrides — `src/core/constitutional_validator.cpp::set_identity_validation`\n\n### Tools / MCP\n\nBuilt-in MCP servers (each shipped as a plugin):\n\n| Server | Tools | Source |\n|--------|-------|--------|\n| `entropic` | delegate, pipeline, complete, diagnose, inspect, context_inspect | `src/mcp/servers/entropic_server.cpp` |\n| `filesystem` | read_file, write_file, edit_file, glob, grep | `src/mcp/servers/filesystem.cpp` |\n| `bash` | execute | `src/mcp/servers/bash.cpp` |\n| `git` | status, diff, log, commit, branch, 
checkout | `src/mcp/servers/git.cpp` |\n| `diagnostics` | diagnostics, check_errors | `src/mcp/servers/diagnostics.cpp` |\n| `web` | web_fetch, web_search | `src/mcp/servers/web.cpp` |\n\nExternal MCP servers connect at runtime via stdio or SSE transport:\n\n```c\nentropic_register_mcp_server(h, \"chess\",\n    \"{\\\"command\\\":\\\"python3\\\",\\\"args\\\":[\\\"chess_server.py\\\"]}\");\n```\n\nCross-cutting machinery:\n\n- Plugin architecture: external servers loaded via `dlopen` + `entropic_create_server()` factory at runtime — `src/mcp/server_manager.cpp`, `include/entropic/interfaces/i_mcp_server.h`\n- Stdio + SSE transports for external clients — `src/mcp/transport_stdio.cpp`, `src/mcp/transport_sse.cpp`, `src/mcp/external_client.cpp`\n- Reconnect policy + health monitoring for external MCP — `src/mcp/reconnect_policy.cpp`, `src/mcp/health_monitor.cpp`\n- Per-identity `allowed_tools` authorization from frontmatter — `src/facade/engine_handle.h::tier_allowed_tools`, `src/prompts/manager.cpp`\n- MCP key-set authorization for granular per-tool permissions — `src/mcp/mcp_key_set.cpp`, `src/mcp/mcp_authorization.cpp`\n- Permission persist interface (allow/deny across sessions) — `src/mcp/permission_manager.cpp`, `src/storage/permission_persister.cpp`\n- Tool-call history ring buffer for diagnostics and validator critique-prompt assembly — `src/mcp/tool_call_history.cpp`\n- Duplicate tool-call detection with circuit breaker after N exact repeats — `src/mcp/tool_executor.cpp::check_duplicate, handle_duplicate`\n- Typed `result_kind` on every `POST_TOOL_CALL` payload (`ok` / `ok_empty` / `error` / `rejected_duplicate` / `rejected_schema` / `rejected_precondition`) — `include/entropic/types/tool_result.h`\n- Empty-vs-content success distinction (`ok_empty`) for pivot-on-empty rules — `src/mcp/tool_result_classify.cpp::is_effectively_empty`\n- Tightened error detection (case-insensitive, bracketed, JSON-shaped `{\"error\": …}`) — 
`src/mcp/tool_result_classify.cpp::looks_like_tool_error`\n- Tool result preview length cap with truncation marker (`max_tool_result_bytes`, default 16 KB; 0 disables) — `src/mcp/tool_result_classify.cpp::truncate_to_cap`, `src/mcp/tool_executor.cpp::apply_result_size_cap`\n- UTF-8 sanitization on inbound tool results: invalid bytes replaced with U+FFFD before any downstream JSON serialization — `src/mcp/utf8_sanitize.cpp`, `include/entropic/mcp/utf8_sanitize.h`\n- Path-traversal protection in the filesystem server — `src/mcp/servers/filesystem.cpp`\n- Tier-scoped assembled-prompt inspection: `entropic.inspect identity \u003ctier\u003e` returns the full system prompt — `src/facade/entropic.cpp::sp_get_identities, build_assembled_prompt_for_tier`\n- Tool argument JSON schema validation (required-fields + enum + type checks) — `src/mcp/tool_executor.cpp::check_required_fields, check_enum`\n\n### Inference\n\n- llama.cpp backends as compile-time variants: CUDA, Vulkan, CPU — `src/inference/llama_cpp_backend.cpp`, `src/inference/orchestrator.cpp`\n- Native SASS kernels for Maxwell → Blackwell; PTX fallback for Jetson / embedded — see `docs/dist-README.md` for the arch matrix\n- Multiple chat adapters (Qwen3.5, generic) pluggable via `AdapterRegistry` — `src/inference/adapters/`, `src/inference/adapter_manager.cpp`\n- VRAM lifecycle state machine (COLD / WARM / ACTIVE) with explicit load / activate / unload boundaries — `src/inference/llama_cpp_backend.cpp`, `src/inference/orchestrator.cpp`\n- Streaming generation with token observers — per-call `on_token` AND a global `set_stream_observer()` that fires for every token from every entry point — `src/core/response_generator.cpp::stream_token_callback`, `src/facade/entropic.cpp::entropic_set_stream_observer`\n- Persistent stream observer: registered once, fires across `entropic_run_streaming()` reassignment so consumers don't lose events on inner loop iterations — `src/core/response_generator.cpp::stream_observer_`\n- GBNF 
grammar constraints applied per-tier:\n\n  ```gbnf\n  root ::= analysis best-move tool-call\n  analysis ::= \"Analysis: \" sentence \"\\n\"\n  best-move ::= \"Best move: \" uci \"\\n\"\n  tool-call ::= \"\u003ctool_call\u003e\" json \"\u003c/tool_call\u003e\"\n  ```\n\n  Output is structurally correct by construction; no post-processing, no retries — `src/inference/grammar_registry.cpp`\n\n- LoRA adapter hot-swap (`entropic_adapter_load`, `_unload`, `_swap`, `_state`, `_info`) — `src/facade/entropic.cpp` (entropic_adapter_*)\n- Logprob evaluation API for downstream perplexity / scoring — `src/inference/llama_cpp_backend.cpp`\n- Per-tier `GenerationParams`: temperature, top_p, top_k, repeat_penalty, max_tokens, seed, reasoning_budget — `src/inference/profile_registry.cpp`, `src/types/config.cpp`\n- KV-cache prefix save/load via `llama_state_seq_get_data` / `llama_state_seq_set_data` (host-memory prompt cache); identity / constitution / tool prefixes hot-swap on identity change — `src/inference/prompt_cache.cpp`\n- Multimodal support hooks (vision adapter slot wired in v1.9.11) — `src/inference/image_preprocessor.cpp`, `src/inference/adapters/qwen35_adapter.cpp`\n\n### Configuration\n\n- Layered config resolution (highest priority wins) — `src/config/loader.cpp`, `src/config/env_overrides.cpp`:\n\n  1. Compiled defaults (struct initializers in the engine)\n  2. Consumer defaults — `default_config.yaml` next to your binary\n  3. Global — `~/.entropic/config.yaml`\n  4. Project local — `{project_dir}/config.local.yaml`\n  5. 
Environment — `ENTROPIC_*` variables\n\n- Cross-field validation: routing references, default tier consistency, threshold ordering, identity references — `src/config/validate.cpp`\n- Bundled model registry (`path: primary` resolves via `bundled_models.yaml` to a vetted GGUF) — `src/config/bundled_models.cpp`, `data/bundled_models.yaml`\n- `entropic download primary` fetches into `~/.entropic/models/` with resume support — `src/cli/download.cpp`\n- Bundled data file discovery: compile-time `DATA_DIR` define, overridable at runtime via `config_dir` — `src/config/data_dir.cpp`\n- Per-identity frontmatter parsing: `allowed_tools`, `validation_rules`, `relay`, `max_iterations`, `max_tool_calls_per_turn` — `src/prompts/manager.cpp`, `include/entropic/prompts/manager.h::IdentityFrontmatter`\n\n### Storage / Persistence\n\n- SQLite-backed conversation storage with parent/child linkage — `src/storage/database.cpp`, `src/storage/backend.cpp`\n- Pluggable `StorageInterface` for downstream backends — `include/entropic/interfaces/i_storage.h`, `src/storage/c_interface.cpp`\n- Compaction strategies: token-budget triggers, custom compactor registration, snapshot-via-storage path — `src/core/compaction.cpp`, `src/core/compactor_registry.cpp`\n- Audit log: structured per-action records of tool dispatches, permission decisions, identity transitions — `src/storage/audit_logger.cpp`, `src/storage/audit_entry.cpp`\n- Session loggers: `session.log` (engine ops, routing, timing) + `session_model.log` (full transcripts streamed live) — `src/types/session_logger.cpp`\n\n### Hooks (Extensibility)\n\n20+ hook points across the loop lifecycle. 
All registration / dispatch lives in `src/core/hook_registry.cpp` and `include/entropic/types/hooks.h`.\n\n- **Lifecycle**: `ON_LOOP_START`, `ON_LOOP_ITERATION`, `ON_LOOP_END` — fired in `src/core/engine.cpp::loop`\n- **Generation**: `PRE_GENERATE` (cancellable), `POST_GENERATE` (content-revisable), `ON_STREAM_TOKEN` — `src/core/engine.cpp::execute_iteration`, `src/core/response_generator.cpp::stream_token_callback`\n- **Tools**: `PRE_TOOL_CALL`, `POST_TOOL_CALL`, `ON_PERMISSION_CHECK` — `src/mcp/tool_executor.cpp::fire_pre_tool_hook, fire_post_tool_hook`\n- **Delegation**: `ON_DELEGATE`, `ON_COMPLETE`, `ON_TIER_SELECTED` — `src/core/engine.cpp::fire_complete_hook`\n- **Context**: `ON_CONTEXT_ASSEMBLE`, `ON_ADAPTER_SWAP` — `src/core/context_manager.cpp`, `src/inference/adapter_manager.cpp`\n\nHook semantics:\n\n- Pre-hooks return non-zero to cancel the operation\n- Post-hooks return modified content to revise the engine's view\n- Info-level hooks are fire-and-forget\n- Multiple callbacks per hook point with priority ordering\n- Registration / deregistration at any time during engine lifetime — `src/facade/entropic_hooks.cpp`\n\n### External MCP Bridge\n\n- `entropic mcp-bridge` — pure stdio↔unix-socket relay (v2.1.7+, gh#34): forwards JSON-RPC bytes between an MCP client (Claude Code, VSCode, etc.) and a running engine's external bridge socket. Owns no engine instance and loads no model; an engine host (TUI / consumer app / future headless server) must already be running for the same project directory. 
Fails fast with a diagnostic naming the canonical path + socket when no engine is reachable — `src/cli/mcp_bridge.cpp`, `src/facade/external_bridge.cpp`\n- Multi-client subscription: TUI + Claude Code can both receive `ask_complete` / progress events simultaneously — `src/facade/external_bridge.cpp::subscribe, broadcast_notification`\n- Async ask via push notification + `ask_status` polling for long-running tasks — `src/facade/external_bridge.cpp::run_async_ask, handle_ask_status`\n- Phase observer: VERIFYING → \"validating\" / \"revising\" sub-phases surfaced to bridge subscribers — `src/facade/external_bridge.cpp::attach_phase_observer, phase_observer_cb`\n- Generation counter on the phase observer: in-flight stale callbacks are guaranteed no-ops after detach (race-safe under concurrent cancel) — `src/facade/external_bridge.cpp::observer_call_is_stale`, `include/entropic/mcp/external_bridge.h::observer_gen_`\n- Cancel-on-clear semantics: `entropic.context_clear` interrupts and drains in-flight async tasks before returning — `src/facade/external_bridge.cpp::cancel_inflight_async_tasks`\n\n### Distribution\n\n- **Pure C ABI** at every `.so` boundary — opaque handles, error codes via enum, no C++ ABI crossing, no exceptions across the boundary — `include/entropic/entropic.h`, `src/facade/entropic.cpp`\n- `find_package(entropic 2.1)` CMake support with imported target `entropic::entropic` — `cmake/entropic-config.cmake.in`, `CMakeLists.txt`\n- Tarball layout: `bin/`, `lib/`, `include/`, `share/` — standard Unix prefix — see `docs/releasing.md`\n- `pip install entropic-engine` — pure-Python ~50 KB ctypes wrapper + `entropic install-engine` subcommand that fetches the matching tarball from GitHub Releases (playwright pattern) — `python/src/entropic/`\n- `$ENTROPIC_LIB` / `$ENTROPIC_HOME` env vars for custom install resolution — `python/src/entropic/_loader.py`\n- `entropic` CLI subcommands: `mcp-bridge`, `download`, `version`, plus wrapper-side `install-engine` — 
`src/cli/main.cpp`, `python/src/entropic/cli.py`\n- Reference examples: `headless` (C), `pychess` (C++ multi-tier showcase), `explorer` (interactive REPL), `openai-server` (OpenAI-compat HTTP front-end with chat/completions, completions, models, models/{name}, health, SSE streaming) — `examples/`\n\n### Observability\n\n- Per-subsystem structured logging via `spdlog` (e.g. `mcp.tool_executor`, `core.response_generator`, `inference.orchestrator`) — `src/types/logging.cpp`\n- `LoopMetrics` exposed through `last_loop_metrics()` and per-tier accessor maps — `include/entropic/core/engine.h::last_loop_metrics, per_tier_metrics`\n- `ThroughputTracker` integration on `GenerationResult` — `src/inference/throughput_tracker.cpp`\n- `entropic_throughput_tok_per_sec()` facade query — `src/facade/entropic.cpp`\n- 20+ hook points doubling as telemetry consumption points — `src/core/hook_registry.cpp`\n- `session.log` + `session_model.log` per project_dir — `src/types/session_logger.cpp`\n- `doxygen-guard` enforces inline documentation on every function (`@brief`, exemption tag, `@version` bumped on body change) — `.doxygen-guard.yaml`, `.pre-commit-config.yaml`\n- `knots` enforces code-quality budget per function (cognitive complexity ≤ 15, McCabe ≤ 15, nesting ≤ 4, SLOC ≤ 50, ABC ≤ 10, returns ≤ 3) — `.pre-commit-config.yaml`\n- Per-library coverage gates via `gcovr` enforced by `inv check-coverage` — `tasks.py::check_coverage`, `.pre-commit-config.yaml`\n\n### Quality / Testing\n\n- 16 pre-commit hooks: trim whitespace, end-of-file, ruff, ruff-format, flake8, knots, doxygen-guard, build, unit tests, per-library coverage, plus standard pre-commit checks — `.pre-commit-config.yaml`\n- Unit + regression tests (Catch2 v3 BDD style) — `tests/unit/`\n- Model tests, GPU-recommended (CPU works but is impractically slow), developer-run; results attached to the GitHub Release as `model-results-vX.Y.Z.json` at each x.y.0 — `tests/model/`, `tasks.py::test`\n- 
`tests/distribution-smoke-consumer/` exercises the `find_package(entropic)` consumer experience end-to-end — `tests/distribution-smoke-consumer/`\n- Test gating per CLAUDE.md:\n\n  | Gate | When | What runs |\n  |---|---|---|\n  | Pre-commit | Every commit | Unit tests (CPU, no GPU) |\n  | Minor version | Each x.y.0 bump | Full model/benchmark suite (GPU) |\n  | Patch version | Each x.y.z bump | Unit tests only |\n\n### Privacy\n\n- Zero network calls in the inference hot path\n- All processing local; no telemetry collected\n- Conversation data, prompts, and model outputs stay on the host\n- Only outbound network call is the optional `web` MCP server's `web_fetch` / `web_search` tools, which the consumer opts into explicitly via config — `src/mcp/servers/web.cpp`\n\n## Bundled Models\n\n| Key | Model | Size | VRAM |\n|-----|-------|------|------|\n| `primary` | Qwen3.5-35B-A3B-UD-IQ3_XXS | 13.1 GB | 15+ GB |\n| `mid` | Qwen3.5-9B-Q8_0 | 9.5 GB | 12+ GB |\n| `lightweight` | Qwen3.5-4B-Q8_0 | 4.5 GB | 8+ GB |\n\nUse `path: primary` (or `mid`, `lightweight`) in config — the engine resolves\nto the full model path via the bundled registry.\n\n## Build Presets\n\n| Preset | Description | Use Case |\n|--------|-------------|----------|\n| `full` | CUDA, all servers, tests | Development workstation |\n| `dev` | CPU, debug, tests | Fast iteration |\n| `minimal-static` | CPU, static `.a`, minimal servers | Embedded consumer |\n| `game` | CUDA, minimal MCP servers | Game engine integration |\n| `coverage` | CPU, gcov instrumentation | Coverage analysis |\n\n```bash\ninv build --clean              # full (CUDA)\ninv build --cpu                # dev (CPU)\ninv test --cpu --no-build      # unit + regression tests\ninv test --model --no-build    # model tests (GPU recommended)\n```\n\n## Examples\n\n| Example | Demonstrates | Language |\n|---------|-------------|----------|\n| `headless/main.c` | Pure-C minimal harness — CI smoke target | C |\n| `pychess/main.cpp` | Multi-tier 
pipeline, grammar, delegation, external MCP | C++ |\n| `explorer/main.cpp` | Interactive C++ REPL for poking at the engine | C++ |\n| `openai-server/src/main.cpp` | OpenAI-compat HTTP front-end (chat/completions, models, SSE streaming) | C++ |\n\nEach example has its own `default_config.yaml` (consumer defaults) and\n`.{name}/` directory (session logs, local config).\n\n## Privacy\n\nEntropic runs entirely on your local hardware. No data is sent to external\nservers. No telemetry is collected. Your prompts, conversations, and model\noutputs never leave your machine.\n\n## Disclaimer\n\nEntropic runs AI models locally on your hardware. AI-generated outputs may be\ninaccurate, biased, or inappropriate. Users are solely responsible for\nevaluating and using any generated content. This software does not provide\nprofessional, legal, medical, or financial advice.\n\n## Documentation\n\n- [docs/getting-started.md](docs/getting-started.md) — install + first call\n- [docs/architecture-cpp.md](docs/architecture-cpp.md) — library design\n- [docs/roadmap.md](docs/roadmap.md) — version targeting\n- [docs/contributing.md](docs/contributing.md) — dev setup, gates, branching\n- [docs/releasing.md](docs/releasing.md) — release workflow\n- [docs/security.md](docs/security.md) — vulnerability reporting\n\n## License\n\nApache-2.0. See [`LICENSE`](LICENSE) for the canonical text and\n[`NOTICE`](NOTICE) for third-party attribution. 
Contributors retain\ncopyright in their contributions; see [`CONTRIBUTING.md`](CONTRIBUTING.md)\nfor the DCO sign-off process.\n\nVersions 2.0.0 through 2.2.1 were released under LGPL-3.0-or-later\nwith a linking exception; those releases remain under that license.\nThe relicense to Apache-2.0 takes effect at v2.2.2 and forward, and\nrestores the permissive license used for v1.x.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftvanfossen%2Fentropic","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftvanfossen%2Fentropic","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftvanfossen%2Fentropic/lists"}