{"id":49320007,"url":"https://github.com/synpulse8-opensource/pulse8-ai-aacl-pattern","last_synced_at":"2026-04-26T17:02:59.099Z","repository":{"id":336490165,"uuid":"1098235476","full_name":"synpulse8-opensource/pulse8-ai-aacl-pattern","owner":"synpulse8-opensource","description":"Specification for the Adaptive Anti-Corruption Layer Pattern","archived":false,"fork":false,"pushed_at":"2025-11-17T16:02:16.000Z","size":44,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-02-05T04:25:20.591Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/synpulse8-opensource.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-11-17T12:38:22.000Z","updated_at":"2025-11-18T11:40:30.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/synpulse8-opensource/pulse8-ai-aacl-pattern","commit_stats":null,"previous_names":["synpulse8-opensource/pulse8-ai-aacl-pattern"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/synpulse8-opensource/pulse8-ai-aacl-pattern","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/synpulse8-opensource%2Fpulse8-ai-aacl-pattern","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/synpulse8-opensource%2Fpulse8-ai-aacl-pattern/tags","releases_url":"https://repos.
ecosyste.ms/api/v1/hosts/GitHub/repositories/synpulse8-opensource%2Fpulse8-ai-aacl-pattern/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/synpulse8-opensource%2Fpulse8-ai-aacl-pattern/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/synpulse8-opensource","download_url":"https://codeload.github.com/synpulse8-opensource/pulse8-ai-aacl-pattern/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/synpulse8-opensource%2Fpulse8-ai-aacl-pattern/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32305043,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-26T09:34:17.070Z","status":"ssl_error","status_checked_at":"2026-04-26T09:34:00.993Z","response_time":129,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-04-26T17:02:58.950Z","updated_at":"2026-04-26T17:02:59.089Z","avatar_url":"https://github.com/synpulse8-opensource.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Adaptive Anti-Corruption Layer Pattern\n\n### Problem solved: LLM-produced format hallucinations\n\nThis repository formalises the Adaptive Anti-Corruption Layer (AACL) design pattern for integrating probabilistic LLM agents with deterministic systems via a self-healing mechanism. 
The AACL provides a normalisation boundary that converts ambiguous LLM outputs into strictly typed inputs, returning structured correction signals that allow the model to self-correct at runtime. This eliminates silent format corruption and enhances reliable agentic behaviour without model retraining.\n\nDesign Pattern: Adaptive Anti-Corruption Layer (AACL)\n\nContext: Integrating probabilistic LLM agents with deterministic systems\n\nProblem: LLMs produce chaotic, non-type-safe outputs; direct integration causes silent format corruption\n\nSolution: Two-layer architecture with normalisation boundary that provides structured feedback\n\nResult: Self-correcting system where structured feedback enables runtime error correction\n\nContents: Pattern Specification (this file), Reference Implementation (Python, in /src), Pattern Validation with adversarial testing (Python, in /pattern_validation)\n\n---\n\n### Architectural Assumptions\n\nThe self-healing mechanism requires an **agentic loop architecture** where the LLM can receive tool execution feedback and retry with corrected inputs. Specifically, the system must support:\n\n1. **Function/tool calling** — LLM can invoke tools with parameters\n2. **Error propagation** — Structured errors returned to LLM context\n3. **Iterative retry** — LLM can re-plan and retry after failures\n4. **State persistence** — Conversation state maintained across tool calls\n\n**Framework example:**\n\n- **LangGraph**: Requires a state saver, an agent node (runnable), a conditional edge to a tool node, and an edge back to the agent node for retry\n\nIf your system lacks an agentic loop (i.e., one-shot tool calls with no retry), the AACL pattern still provides value by preventing silent format corruption, but self-healing requires the retry mechanism.\n\n## Premise\n\nLLMs are semantic sequence models. They are not type-safe, not schema-stable, and not reliable data serialisers. Therefore:\n\n**LLMs must provide values. 
Code must provide *structure*.**\n\nAttempting to treat model-generated JSON or arguments as authoritative structure guarantees silent format corruption, brittle parsing, and cascading failure.\n\nThe correct architecture is a **two-layer boundary** separating free-form model output from deterministic business logic.\n\n```\nLLM (semantic planner)\n↓\nInterface Layer (normalisation + validation + structured errors)\n↓\nImplementation Layer (strict types, pure logic)\n```\n\nThis boundary is where the system becomes self-correcting. The interface boundary is the only location where ambiguity is allowed to exist. Once execution passes into the implementation layer, ambiguity must be ***ZERO***.\n\n**Structured output belongs in function results, not token streams.** Use function calling to receive structured data from code, not to parse it from LLM-generated text. Need the JSON visible to users? Put the function result in your output queue, the same place the response stream goes.\n\n## Failure Modes of Raw LLM Output (Non-Exhaustive)\n\nLLMs freely interchange:\n\n* `\"true\"`, `\"True\"`, `\"yes\"`, `\"1\"`, `True`\n* `5`, `\"05\"`, `\"five\"`, `\"5\"`\n* `\"null\"`, `\"none\"`, `None`, `\"n/a\"`, `\"\"`\n* `\"a.com b.com\"`, `\"a.com,b.com\"`, `[\"a.com\", \"b.com\"]`\n* Dates in any human-readable permutation\n\nPassing these directly to an API layer introduces **silent format corruption**, the worst class of system failure: a call may appear to work, then break later for no apparent reason.\n\n## Architecture\n\nThis two-layer boundary is the core of the Adaptive Anti-Corruption Layer (AACL). The LLM operates in a semantic space; the implementation layer operates in a typed deterministic space. The AACL is the boundary that translates between them through normalisation + structured failure signals.\n\n### 1. 
Interface Layer (LLM-Facing)\n\n**Function**: Convert *arbitrary* inputs into typed inputs.\n\nRequirements:\n\n* Accept union and ambiguous input types\n* Normalise to canonical representations\n* Validate according to strict schema expectations\n* Return **structured error messages** when normalisation fails\n\nThis layer must be **total**:  \nEvery input either normalises or fails with an LLM-usable correction signal.\n\n### 2. Implementation Layer (Logic-Facing)\n\n**Function**: Perform business operations with strict typing.\n\n* No normalisation\n* No LLM-awareness\n* No ambiguity handling\n* Pure deterministic execution\n\nIf incorrect values reach this layer, **the architecture is wrong**.\n\n## Minimal Example (Python)\n\nThe pattern uses a two-file structure. 
See full reference implementation in `/src/mcp_tools/tavily_search/`.\n\n### Interface Layer (`interface.py`) — LLM-Facing\n\n```python\nfrom fastmcp import FastMCP\nfrom .implementations.tavily_impl import tavily_search_impl\n\nmcp = FastMCP(\"My MCP Server\")\n\n@mcp.tool()\ndef search_web(\n    query,\n    search_depth=\"basic\",\n    max_results=5,\n    include_domains=None,\n    time_range=None\n) -\u003e dict:\n    \"\"\"\n    Search the web using Tavily's search API.\n    \n    Args:\n        query: Search query (required)\n        search_depth: \"basic\" or \"advanced\" (Optional, defaults to \"basic\")\n        max_results: Number of results, 1-10 (Optional, defaults to 5)\n        include_domains: Domain filter (comma/space-separated or list) (Optional)\n        time_range: Time filter (\"day\", \"week\", \"month\", \"year\") (Optional)\n    \n    Returns:\n        Dictionary containing search results\n    \n    Raises:\n        ValueError: With structured correction signals for invalid inputs\n    \"\"\"\n    # Normalize ambiguous inputs to canonical forms\n    search_depth = _normalize_search_depth(search_depth)\n    include_domains = _normalize_domains(include_domains)\n    time_range = _normalize_optional_string(time_range)\n    \n    # Validate with structured error messages\n    if time_range and time_range not in (\"day\", \"week\", \"month\", \"year\"):\n        raise ValueError(\n            \"INVALID_TIME_RANGE: expected one of ['day', 'week', 'month', 'year']; \"\n            f\"received '{time_range}'. 
Retry with a valid value.\"\n        )\n    \n    # Pass typed, normalized inputs to implementation\n    return tavily_search_impl(\n        query=query,\n        search_depth=search_depth,\n        max_results=int(max_results),\n        include_domains=include_domains,\n        time_range=time_range\n    )\n\ndef _normalize_optional_string(value):\n    \"\"\"Normalize null-like values to None.\"\"\"\n    if value is None:\n        return None\n    if isinstance(value, str):\n        s = value.strip().lower()\n        if s in (\"\", \"null\", \"none\", \"n/a\", \"na\"):\n            return None\n    return value\n\ndef _normalize_domains(value):\n    \"\"\"Normalize a domain filter to a list of domain strings, or None.\"\"\"\n    value = _normalize_optional_string(value)\n    if value is None:\n        return None\n    if isinstance(value, str):\n        # Accept comma- and/or space-separated strings\n        value = value.replace(\",\", \" \").split()\n    if not isinstance(value, (list, tuple)):\n        raise ValueError(\n            \"INVALID_DOMAINS: expected a list or a comma/space-separated string; \"\n            f\"received {value!r}. 
Retry with a valid value.\"\n        )\n    return [str(d).strip() for d in value] or None\n\ndef _normalize_search_depth(depth):\n    \"\"\"Normalize search depth to 'basic' or 'advanced'.\"\"\"\n    if not depth:\n        return \"basic\"\n    d = str(depth).strip().lower()\n    if d in (\"advanced\", \"deep\", \"thorough\"):\n        return \"advanced\"\n    return \"basic\"\n```\n\n### Implementation Layer (`implementations/tavily_impl.py`) — Logic-Facing\n\n```python\nimport os\n\nfrom tavily import TavilyClient\n\ndef tavily_search_impl(\n    query: str,\n    search_depth: str,\n    max_results: int,\n    include_domains: list[str] | None,\n    time_range: str | None\n) -\u003e dict:\n    \"\"\"\n    Pure implementation - expects strictly typed, normalized inputs.\n    No validation or normalization should happen here.\n    \"\"\"\n    client = TavilyClient(api_key=os.getenv(\"TAVILY_API_KEY\"))\n    \n    params = {\n        \"query\": query,\n        \"search_depth\": search_depth,\n        \"max_results\": max_results\n    }\n    \n    if include_domains:\n        params[\"include_domains\"] = include_domains\n    if time_range:\n        params[\"time_range\"] = time_range\n    \n    return client.search(**params)\n```\n\n## Why This Is Self-Healing\n\nThe loop:\n\n1. LLM emits a parameter in an arbitrary format.\n2. Interface layer attempts normalisation.\n3. If normalisation succeeds → call implementation logic. \n4. If normalisation fails → return a structured correction signal. \n5. LLM re-plans and retries (ReAct pattern, no human involvement).\n\nThis produces **adaptive convergence**:\nThe system self-heals at runtime by guiding the LLM to correct inputs without human supervision.\n\n## Using This Pattern for Structured LLM Outputs\n\n### Why You Never Let the LLM Produce JSON\n\nThis is not about syntax errors. 
It is about responsibility boundaries.\n\n**JSON is a deterministic serialisation format.** \n**LLMs are probabilistic sequence models.**\n\nIf the LLM is responsible for producing formatted JSON, you guarantee:\n\n* Silent type drift (\"5\" instead of 5)\n* Mixed boolean encodings (\"true\" vs true vs \"yes\")\n* Key-order instability (breaks hashing, caching, diff-based tests)\n* Schema drift over iterative refinement \n* Random breakage triggered by prompt context state\n\nThese are not mistakes. They are the statistical nature of token generation.\n\nOnce the model is permitted to define structure, the structure becomes non-deterministic.\n\n### Apply the Same Two-Layer Architecture\n\n```\nLLM → untyped values\nInterface Layer → normalization + schema enforcement\nImplementation Layer → constructs JSON deterministically\n```\nThe model never **formats** JSON. \n\nNeed the structured data visible in the user interface? That's fine: your function returns it, and your application layer displays it. 
The point is the LLM doesn't generate the structure, your code does.\n\n## Example\n\nSee full reference implementation in `/src/mcp_tools/structured_output/`.\n\n### Interface Layer (LLM-Facing)\n\n```python\nfrom fastmcp import FastMCP\nfrom .implementations.json_writer_impl import create_structured_data_impl\n\nmcp = FastMCP(\"My MCP Server\")\n\n@mcp.tool()\ndef create_json(x, y, flag) -\u003e dict:\n    \"\"\"\n    Create and return structured JSON from LLM-provided values.\n    \n    Args:\n        x: Integer value (accepts \"5\", \"05\", 5)\n        y: String value\n        flag: Boolean flag (accepts \"true\", \"yes\", \"1\", True, etc.)\n    \n    Returns:\n        Dictionary with deterministic structure (ready for JSON serialization)\n    \n    Raises:\n        ValueError: With structured correction signals for invalid inputs\n    \"\"\"\n    # Normalize integer with structured error\n    try:\n        x = int(x)\n    except (ValueError, TypeError):\n        raise ValueError(\n            f\"TYPE_ERROR: field 'x' must be an integer; \"\n            f\"received {repr(x)} (type: {type(x).__name__}). 
\"\n            \"Retry with a valid integer value.\"\n        )\n    \n    # Normalize string\n    y = str(y)\n    \n    # Normalize boolean from various representations\n    if isinstance(flag, str):\n        flag = flag.strip().lower() in (\"true\", \"1\", \"yes\", \"on\")\n    else:\n        flag = bool(flag)\n    \n    # Pass typed inputs to implementation\n    return create_structured_data_impl(x=x, y=y, flag=flag)\n```\n### Implementation Layer (Logic-Facing) (Strict, Deterministic JSON)\n\n```python\ndef create_structured_data_impl(x: int, y: str, flag: bool) -\u003e dict:\n    \"\"\"\n    Construct JSON structure deterministically from typed values.\n    \n    The LLM never generates JSON - it only provides values.\n    Code defines keys, order, and types.\n    \"\"\"\n    # Structure is defined by code, not LLM\n    return {\n        \"x\": x,\n        \"y\": y,\n        \"flag\": flag\n    }\n```\n## Why This Works\n\n| Responsibility            | LLM | Interface Layer | Implementation Layer |\n|---------------------------|-----|-----------------|----------------------|\n| Interpret Intent          | Yes | No              | No                   |\n| Normalise Values          | No  | Yes             | No                   |\n| Enforce Schema            | No  | Yes             | No                   |\n| Construct Data Structures | No  | No              | Yes                  |\n| Serialise Data            | No  | No              | Yes                  |\n\n## Core Principle\n\n**LLMs Plan. 
Code Types.**\nNever let the model define structure.\nAlways enforce structure at the boundary.\n\n## Summary\n\n| Layer                | Handles                    | Must Be               | Failure Mode                                         | Output                    |\n|----------------------|----------------------------|-----------------------|------------------------------------------------------|---------------------------|\n| LLM                  | Semantics                  | Flexible              | Format Hallucination                                 | Unstructured Values       |\n| Interface Layer      | Normalisation + Validation | Total / Deterministic | Structured Correction (Intentional Exception Raised) | Typed Inputs              |\n| Implementation Layer | Business Logic             | Pure / Strict         | Hard Failure (if reached incorrectly)                | Stable Data / JSON / YAML |\n\nThe invariant:\n**If the implementation layer sees garbage, the interface layer is incorrect.**\n\nThis pattern is general and applies to every LLM-tooling integration, including MCP, ReAct, function-calling APIs, and agentic planning systems. This architecture is not a workaround for LLM weaknesses. 
The Adaptive Anti-Corruption Layer is the correct separation of concerns for any system in which a probabilistic language generator interacts with deterministic software components.\n\n## Related Patterns\n\n| Pattern                   | Relationship                                                                                                       |\n|---------------------------|--------------------------------------------------------------------------------------------------------------------|\n| DDD Anti-Corruption Layer | Conceptual ancestor — but assumes deterministic upstream domain                                                    |\n| Adapter Pattern           | Handles interface mismatch, but not semantic ambiguity                                                             |\n| Retry with Backoff        | Handles failure, but not interpretation                                                                            |\n| ReAct                     | Handles iterative convergence, but focuses on LLM output instead of relying on code as the known-good system       |\n\n---\n\n## Pattern Formalisation (Appendix)\n\n*For pattern catalogue inclusion.*\n\n### Applicability\n\n**Use the AACL pattern when:**\n* Integrating LLMs with deterministic APIs, databases, or business logic\n* Building function-calling or tool-use systems\n* Creating MCP servers or LLM integration points\n* Generating structured data (JSON, YAML) from LLM outputs\n\n**Do not use when:**\n* Input is already strictly typed (traditional API)\n* Format variation is acceptable downstream\n* Only semantic correctness matters (content, not format)\n\n### Consequences\n\n**Benefits:**\n* Eliminates silent format corruption\n* Enables self-healing via structured errors\n* Clear separation of concerns\n* Works with any LLM (no retraining) that supports function calling\n* Composable with existing frameworks\n\n**Trade-offs:**\n* Requires two-layer architecture\n* Normalisation adds (minimal) 
latency\n* Interface must evolve with edge cases\n* Does not solve content hallucinations\n\n### Known Uses\n\n* OpenAI/Anthropic function calling APIs\n* MCP server implementations\n* LangChain custom tools\n* Agentic systems\n* Any LLM-to-database/API boundary\n\n### Forces Resolved\n\nThe pattern balances:\n* **Flexibility vs. Correctness**: LLM freedom + type safety\n* **Fail-Fast vs. Teach**: Structured errors guide correction\n* **When to Normalise vs. Validate**: Intentional design choice per parameter\n* **Boundary Location**: Interface handles ambiguity, implementation stays pure\n\nThe resolution: Interface layer is **total** (handles all inputs), implementation is **pure** (assumes correctness).\n\n---\n\n## About This Pattern\n\n**Author:** Morgan Lee  \n**Organisation:** Synpulse8  \n**First Published:** 12 November 2025\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsynpulse8-opensource%2Fpulse8-ai-aacl-pattern","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsynpulse8-opensource%2Fpulse8-ai-aacl-pattern","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsynpulse8-opensource%2Fpulse8-ai-aacl-pattern/lists"}