https://github.com/ndrean/zexplorer

HTML processor engine on steroids. Give eyes to your LLM
Host: GitHub
URL: https://github.com/ndrean/zexplorer
Owner: ndrean
License: mit
Created: 2025-07-27T23:29:03.000Z (12 months ago)
Default Branch: main
Last Pushed: 2026-03-06T18:25:33.000Z (4 months ago)
Last Synced: 2026-03-06T21:47:46.544Z (4 months ago)
Topics: css-parser, css-sanitization, html-parser, javascript-tools, lexbor, mcp-server, quickjs-ng, sanitize-html, sqlite, thorvg, yoga, zig, zig-package
Language: Zig
Homepage: https://ndrean.github.io/zexplorer
Size: 107 MB
Stars: 5
Watchers: 1
Forks: 0
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
- Security: SECURITY.md

# zexplorer (`zxp`)

![Zig support](https://img.shields.io/badge/Zig-0.15.2-color?logo=zig&color=%23f3ab20)

`zexplorer` is a fast, zero-dependency HTML+JS engine. Think `ffmpeg` for the web.

You can use it as a _command-line tool_, an _HTTP dev-server_ or an _MCP server_ for LLM agents — no browser, no Node.js, no Python, no runtime.

The MCP service gives your LLM agent eyes and persistent local storage, zero infra.










**TL;DR**:

- Cold start: ~3ms

- Memory: ~12MB

- Zero dependencies. Single statically-compiled binary.

- Stateless by default, stateful on demand, with a zero-config embedded SQLite store for local persistence.

- Pipelines: Native support for parsing Markdown, CSV, and SVG.

- Outputs: Return raw data (JSON, strings, binary arrays), Markdown or render layouts (**Flexbox**) to PNG, JPEG, WEBP, and PDF.

- MCP service (the token saver): let the LLM run scripts server-side (scrape, transform, or render) and get back the result.

- Usage: Composable CLI tool or a high-concurrency HTTP rendering service.

---

## What can it do?

It can:

- **Scrape** — fetch a URL, hydrate React, render Vue/Svelte/Lit/SolidJS, WebComponents, extract data. No headless browser.

- **Stream** — consume LLM output via SSE (currently local Ollama, extendable to  any OpenAI-compatible endpoint); receive HTML chunks and rebuild a live DOM incrementally.

- **Expose** — serve as an MCP server so LLM agents (Claude Desktop, Gemini CLI…) can `run_script` using the custom API, or use the shortcuts `render_html`, `render_markdown`, `render_url`, and receive data or screenshots directly in the conversation.

- **Render** — `Flexbox` based only, all static:  HTML+JS+SVG such as D3, Chart.js, Leaflet, ECharts. Basic support for Canvas API, output PNG/JPEG/WEBP/PDF.

- **Generate** — design an SVG in Figma, plug in data, batch-produce OG images or PDF reports.

- **Sanitize** — DOM+CSS-aware HTML sanitization (stylesheets, inline styles, XSS/mXSS). Built-in.

- **Run JS** — execute ES2020 scripts against a real DOM with fetch, timers, workers, and an event loop.

- **Store & Persist** - drop text, blobs, images in the local storage, no ceremony.

**Limitations**:

- No TypeScript support. JSX is supported via "tagged templates" (using `htm`).

- Cannot scrape arbitrary bot-protected public websites.

- Cannot paint complex CSS: no 2D grid, no `position: fixed`, no CSS functions or variables, no media queries, and no complex canvas, among others.

## Security

If you use your own trusted code, you can skip sanitization entirely. For untrusted content:

> [!WARNING]

> All layers are _best-effort_ — see [SECURITY.md](https://github.com/ndrean/zexplorer/blob/main/SECURITY.md) for full details.

>

> - **Content sanitization** — DOM+CSS-aware: stylesheets, inline styles, iframes, SVG/MathML, DOM clobbering, URI schemas, XSS/mXSS. Tested against [H5SC](https://github.com/cure53/H5SC), [OWASP](https://cheatsheetseries.owasp.org/cheatsheets/DOM_based_XSS_Prevention_Cheat_Sheet.html), [PortSwigger](https://portswigger.net/web-security/cross-site-scripting/cheat-sheet), and [DOMPurify](https://github.com/cure53/DOMPurify).

> - **Filesystem sandbox** — kernel-enforced `openat()` with symlink blocking, traversal rejection, cross-device check.

> - **Network hardening** — timeouts, redirect/size limits, SSRF pre-flight filtering, HTTPS-only remote imports.

> - **Resource limits** — worker fan-out caps, busy-loop interrupts, max stack/GC/memory, wall-clock deadlines.
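
To make the "SSRF pre-flight filtering" idea concrete, here is a minimal sketch of what such a check can look like. It is not zexplorer's actual filter: the function names `parseIPv4`, `isBlockedHost` and `preflight` are hypothetical, and the blocked ranges are just the usual loopback, link-local and private IPv4 ranges. A real filter would also need to check the addresses a hostname resolves to.

```javascript
// Hypothetical pre-flight check, for illustration only (not zexplorer's code).

// Returns the four octets of a dotted IPv4 literal, or null if it is not one.
function parseIPv4(host) {
  const parts = host.split(".");
  if (parts.length !== 4) return null;
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  return octets.every((n) => n >= 0 && n <= 255) ? octets : null;
}

// True for loopback, link-local and private targets given as names or literals.
function isBlockedHost(host) {
  const h = host.toLowerCase();
  if (h === "localhost" || h.endsWith(".localhost")) return true;
  if (h === "[::1]" || h === "::1") return true;
  const ip = parseIPv4(h);
  if (!ip) return false; // ordinary names still need a check after DNS resolution
  const [a, b] = ip;
  return (
    a === 0 || a === 10 || a === 127 ||
    (a === 169 && b === 254) ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

// Decides whether a URL may be fetched at all, before any request is sent.
function preflight(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return { ok: false, reason: "invalid URL" };
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    return { ok: false, reason: "scheme not allowed" };
  }
  if (isBlockedHost(url.hostname)) {
    return { ok: false, reason: "private or loopback host" };
  }
  return { ok: true, reason: "" };
}
```

Parsing with `new URL` first means the host is checked in its normalized form rather than as the raw string the caller supplied.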

## Examples

| Example | What it shows | Output | CLI | Server |
| ------- | ------------- | ------ | :---: | :---: |
| [MCP server](#mcp-server) | Give Claude Desktop / Gemini visual eyes | PNG | – | ✓ |
| [LLM generative UI](#generative-template) | Ollama/OpenAI SSE → DOM → image | WEBP | ✓ | ✓ |
| [Dynamic HTML card](#use-dynamic-html-with-htm-and-paint) | `htm` tagged templates → paintDOM | PNG | ✓ | ✓ |
| [CSS grid / flexbox layout](#render-an-html-file-in-the-terminal) | grid-1D + flexbox → terminal image | PNG | ✓ | ✓ |
| [Scrape Hacker News](#scrape-hacker-news) | fetch → DOM query → structured data | JSON | ✓ | ✓ |
| [Vercel SPA scrape](#scrape-a-vercel-site-in-less-than-1s) | Next.js hydration → `waitForSelector` | JSON | ✓ | ✓ |
| [Vercel site snapshot](#render-the-vercel-side) | SSR page → inlined images → render | WEBP | ✓ | ✓ |
| [ECharts](#echarts) | ECharts SVG → rasterize | WEBP | ✓ | ✓ |
| [Leaflet map PDF](#generate-a-leaflet-map-pdf-report) | GeoJSON route → OSM tiles → SVG → PDF | PDF | – | ✓ |

---

### MCP server

Start the server (the `.` sets the sandbox root for file access and the SQLite store):

```sh
./zig-out/bin/zxp serve .
```

**Connect Claude Desktop** — add to `~/Library/Application Support/Claude/claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "zexplorer": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "http://localhost:9984/mcp"]
    }
  }
}
```

**Available tools:**

| Tool | What it does |
| ---- | ----------- |
| `render_html` | Render an HTML string → PNG/WEBP/JPEG (base64 image in MCP response) |
| `render_markdown` | Render GFM Markdown → image |
| `render_url` | Fetch a URL, run its scripts, render → image |
| `run_script` | Execute arbitrary JavaScript in the headless DOM+JS engine; returns text, JSON, or an image |
| `get_zxp_docs` | Return API docs and worked examples; call this before writing a `run_script` |
| `store_save` | Persist text or binary data (e.g. a rendered PNG) to a local SQLite store |
| `store_get` | Retrieve a stored entry by name; `data` is an ArrayBuffer |
| `store_list` | List store entries (metadata only) |
| `store_delete` | Delete a store entry by name |

The typical LLM workflow is: call `get_zxp_docs` to learn the `zxp.*` API, then call `run_script` with composed JavaScript to scrape, render, or process data. `store_*` lets the LLM persist intermediate results across stateless tool calls.

Your local storage is just:

```js
// zexplorer runs this instantly. No DB connection setup needed.
const pageTitle = document.querySelector('title').textContent;
zxp.store.save("last_scraped_title", pageTitle); // Saved instantly to SQLite
zxp.store.get("last_scraped_title");
```

**Smoke-test with curl:**

```sh
# Text result
curl -s -X POST http://localhost:9984/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"run_script","arguments":{"script":"const a=10,b=32; `The answer is ${a+b}`"}}}'
# → {"jsonrpc":"2.0","id":1,"result":{"content":[{"type":"text","text":"The answer is 42"}]}}

# Image result
curl -s -X POST http://localhost:9984/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"render_html","arguments":{"html":"<h1>Hello MCP!</h1>","width":400}}}'
# → {"jsonrpc":"2.0","id":2,"result":{"content":[{"type":"image","data":"iVBORw0KGgo....","mimeType":"image/png"}]}}
```
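
The same two calls can be composed from a script. The sketch below only builds the JSON-RPC `tools/call` envelope and reads the `content` array, using the request and response shapes shown in the curl examples. The helper names `buildToolCall` and `readToolResult` are hypothetical and not part of zexplorer.

```javascript
// Hypothetical helpers mirroring the JSON-RPC shapes from the curl examples.

// Builds the body of a `tools/call` request.
function buildToolCall(id, name, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Normalizes the `content` parts of a response into text or image entries.
function readToolResult(response) {
  if (response.error) {
    throw new Error(`tool call failed: ${response.error.message}`);
  }
  const content = (response.result && response.result.content) || [];
  return content.map((part) =>
    part.type === "image"
      ? { type: "image", mime: part.mimeType, base64: part.data }
      : { type: part.type, text: part.text }
  );
}

// Example: the body to POST to http://localhost:9984/mcp.
const body = JSON.stringify(
  buildToolCall(1, "run_script", { script: "const a=10,b=32; `The answer is ${a+b}`" })
);
console.log(body);
```

Send `body` with `Content-Type: application/json`, parse the JSON reply, and pass it to `readToolResult` to get either the text or the base64 image.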

**Use `run_script` to build a D3 chart from CSV data** — the LLM composes the JS and gets an image back:

Source: 

```sh

curl -s -X POST http://localhost:9984/run --data-binary @src/examples/d3_chart/output_chart.webp

```






### Generative template

You can have an LLM generate HTML with CSS: the engine has built-in support for SSE (the `text/event-stream` content type).

We showcase the local provider `ollama` with the 4.7 GB model "qwen2.5-coder:7b". This can be extended to any provider (OpenAI, Anthropic, Gemini) if you adapt the LLM response parsing.

- Our local LLM `ollama` is up and running: `curl -s http://localhost:11434/api/tags | head -c 200` returns _{"models":[{"name":"qwen2.5-coder:7b",....}_.

- The dev-server is up and running: `./zig-out/bin/zxp serve .`

**First example**: render a generative `` component.

```html



```

Let's "live-serve" this component in a browser. The browser sends a GET request to the dev-server, which in turn reaches the LLM. Depending on the mood of the LLM, you can get this image:



**Second example**: interactive generative form

The HTML below is a form in which we select a more elaborate prompt. On submission, a JavaScript snippet POSTs the prompt to the dev-server "/render_llm" endpoint.

The form has a textarea, populated by four quick-prompt buttons, and a submit button.

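```html
<!-- Markup reconstructed from the script below: ids, class and field names are
     the ones the script uses. The original tags and the data-prompt texts were
     lost, so "…" marks a gap. -->
<h2>Interactive prompt (POST → base64)</h2>

<div class="quick-prompts">
  <button type="button" data-prompt="…">Table</button>
  <button type="button" data-prompt="…">KPI cards</button>
  <button type="button" data-prompt="…">Progress steps</button>
  <button type="button" data-prompt="…">Invoice</button>
</div>

<form id="gen-form">
  <textarea name="prompt">A responsive table with 3 columns: Name, Status, Amount. Include 5 realistic sample rows. Use a blue header.</textarea>
  <div>
    <label>Width <input name="width" type="number"></label>
    <label>Format
      <select name="format">
        <option>PNG</option>
        <option>WebP</option>
        <option>JPEG</option>
      </select>
    </label>
    <label>Model <input name="model"></label>
    <label>Ollama URL <input name="base_url"></label>
  </div>
  <button id="gen-btn" type="submit">Generate</button>
</form>

<div id="status" class="status"></div>
<img id="result" alt="Generated UI">

<script>
  const form   = document.getElementById('gen-form');
  const btn    = document.getElementById('gen-btn');
  const status = document.getElementById('status');
  const result = document.getElementById('result');

  // Quick-prompt buttons fill the textarea.
  document.querySelectorAll('.quick-prompts button').forEach(b => {
    b.addEventListener('click', () => {
      form.prompt.value = b.dataset.prompt;
    });
  });

  form.addEventListener('submit', async e => {
    e.preventDefault();
    const prompt = form.prompt.value.trim();
    if (!prompt) return;

    btn.disabled = true;
    status.className = 'status';
    status.textContent = 'Generating… (this may take a few seconds)';
    result.style.opacity = '0.4';

    try {
      const res = await fetch('http://localhost:9984/render_llm', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          prompt,
          model:    form.model.value,
          base_url: form.base_url.value,
          width:    parseInt(form.width.value, 10) || 800,
          format:   form.format.value,
        }),
      });
      if (!res.ok) {
        const text = await res.text();
        throw new Error(`HTTP ${res.status}: ${text}`);
      }
      // The server answers with a base64 payload and its MIME type.
      const { data, mime } = await res.json();
      result.src = `data:${mime};base64,${data}`;
      result.style.opacity = '1';
      status.textContent = '';
    } catch (err) {
      status.className = 'status error';
      status.textContent = `Error: ${err.message}`;
      result.style.opacity = '1';
    } finally {
      btn.disabled = false;
    }
  });
</script>
```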




We chose to render a table (a POST request to "/render_llm"). You can get, for example:






#### Note about SSE format

| Provider | Content path | End signal |
| -- | -- | -- |
| OpenAI / groq / mistral / together / ollama_v1 | `.choices[0].delta.content` | `data: [DONE]` |
| Anthropic | `.delta.text` (on `content_block_delta` events) | `event: message_stop` |
| Gemini | `.candidates[0].content.parts[0].text` | `.finishReason == "STOP"` |
| Ollama | `.message.content` | no SSE, raw NDJSON |
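
Before any content path can be applied, the stream has to be cut into complete lines, because a network chunk can end in the middle of one. The following is a minimal sketch of that framing step, assuming plain `data:` / `event:` SSE lines or raw NDJSON lines. `createLineFramer` is a hypothetical name and this is not the engine's own parser.

```javascript
// Hypothetical line framer for SSE or NDJSON streams (illustration only).
// Call push(chunk) for every chunk received; onLine gets one entry per line.
function createLineFramer(onLine) {
  let buffer = "";
  return function push(chunk) {
    buffer += chunk;
    const lines = buffer.split("\n");
    buffer = lines.pop(); // the last piece may be an unfinished line
    for (const raw of lines) {
      const line = raw.trim();
      if (line === "" || line.startsWith(":")) continue; // separator or SSE comment
      if (line.startsWith("event:")) {
        onLine({ field: "event", value: line.slice(6).trim() });
      } else if (line.startsWith("data:")) {
        onLine({ field: "data", value: line.slice(5).trim() });
      } else {
        onLine({ field: "data", value: line }); // raw NDJSON line
      }
    }
  };
}

// Example: a payload split across two chunks is delivered once, in full.
const seen = [];
const push = createLineFramer((entry) => seen.push(entry));
push('data: {"a":1}\n\nda');
push("ta: [DONE]\n");
console.log(seen);
```

Keeping the unfinished tail in `buffer` is the whole point of the design: each `value` handed to `onLine` is a complete payload that can be parsed on its own.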

```txt

if openai || groq || mistral || together || ollama_v1 → .choices[0].delta.content

if anthropic                                           → .delta.text

if gemini                                             → .candidates[0].content.parts[0].text

```
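
The mapping above can be sketched as a small lookup. This is an illustration of the content paths listed in the table, not the engine's Zig implementation; `extractText` and the provider keys are hypothetical names.

```javascript
// Hypothetical extractor: one accessor per content path from the table.
const CONTENT_PATHS = {
  openai: (j) => j.choices?.[0]?.delta?.content,
  anthropic: (j) => j.delta?.text,
  gemini: (j) => j.candidates?.[0]?.content?.parts?.[0]?.text,
  ollama: (j) => j.message?.content,
};

// Providers that reuse the OpenAI streaming shape.
const OPENAI_COMPATIBLE = new Set(["openai", "groq", "mistral", "together", "ollama_v1"]);

// Returns the text carried by one payload, "" if it carries none,
// or null when the payload is the OpenAI-style end signal.
function extractText(provider, payload) {
  if (payload === "[DONE]") return null;
  const key = OPENAI_COMPATIBLE.has(provider) ? "openai" : provider;
  const pick = CONTENT_PATHS[key];
  if (!pick) throw new Error(`unknown provider: ${provider}`);
  return pick(JSON.parse(payload)) ?? "";
}

// Example: the same call site works for two different providers.
console.log(extractText("groq", '{"choices":[{"delta":{"content":"<div>"}}]}'));
console.log(extractText("ollama", '{"message":{"content":"</div>"}}'));
```

Optional chaining keeps the accessors safe on payloads that carry no text, such as role-only or metadata events.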

Due to our limitations in the CSS that we are able to render, and because we are using a small model, we had to set hard and explicit constraints in our prompt.

<details><summary>The general system prompt that we use</summary>

```zig

pub const default_system =

    "You are a UI generator. Output ONLY raw HTML — no markdown, no code fences, no backticks, no explanation. " ++

    "Start your response directly with an HTML tag. " ++

    "Use ONLY  and  elements — NEVER , , , , , . " ++

    "Use ONLY inline styles with these CSS properties: display (flex or block), flex-direction, " ++

    "justify-content, align-items, flex-wrap, gap, padding, margin, color, background, " ++

    "font-size, font-weight, border-radius, width, height, border, text-align, white-space. " ++

    "No external fonts. No CSS variables. No animations. No  tags. " ++

    "CRITICAL: Every opened <div> MUST be explicitly closed with </div> before opening the next sibling <div>. " ++

    "CARD PATTERN — row of sibling cards, each card has stacked label+value (NEVER nest cards): " ++

    "<div style=\"display:flex;flex-direction:row;gap:16px;padding:16px\">" ++

    "<div style=\"width:30%;background:#fff;border-radius:8px;padding:16px\">" ++

    "<div style=\"font-size:13px;color:#666\">Label A</div>" ++

    "<div style=\"font-size:24px;font-weight:bold\">Value A</div>" ++

    "</div>" ++

    "<div style=\"width:30%;background:#fff;border-radius:8px;padding:16px\">" ++

    "<div style=\"font-size:13px;color:#666\">Label B</div>" ++

    "<div style=\"font-size:24px;font-weight:bold\">Value B</div>" ++

    "</div>" ++

    "</div> " ++

    "TABLE PATTERN — outer column container, SIBLING row divs inside it (NEVER nest rows): " ++

    "<div style=\"display:flex;flex-direction:column\">" ++

    "<div style=\"display:flex;flex-direction:row\">" ++

    "<div style=\"width:40%\">Header A</div><div style=\"width:60%\">Header B</div>" ++

    "</div>" ++

    "<div style=\"display:flex;flex-direction:row\">" ++

    "<div style=\"width:40%\">Cell A1</div><div style=\"width:60%\">Cell B1</div>" ++

    "</div>" ++

    "</div>";

```

</details>

### ECharts

An example that builds an [ECharts](https://github.com/apache/echarts) chart as SVG and rasterizes it to an image.

We use the following functions: `zxp.loadHTML()`, `zxp.runScripts()`, `new XMLSerializer()` (to serialize the SVG), `zxp.paintSVG()`, `zxp.encode()` (to generate the WEBP-encoded binary), and `zxp.fs.writeFileSync()` (to save it locally).

Source: <https://github.com/ndrean/zexplorer/blob/main/src/examples/echarts/echarts_svg.html>

The dev-server is up and running. We send a POST request to the endpoint where the payload is the snippet:

```sh

curl -s -X POST http://localhost:9984/run --data-binary @src/examples/echarts/run_svg.js

```

<img src="https://github.com/ndrean/zexplorer/blob/main/src/examples/echarts/echarts_svg.png" alt="ECharts SVG chart" width="400">

<br>

### Use dynamic HTML with `htm` and paint

<details><summary>We use htm to build dynamic HTML and render it as an image</summary>

[Source](https://github.com/ndrean/zexplorer/blob/main/src/examples/frameworks/htm/teset_html.html)

```html

<html>

  <head>

    <script>

      const { html } = zxp; // embedded in the code

      const name = "Zexplorer";

      const version = "0.1.0";

      const features = ["Lexbor DOM", "QuickJS", "Yoga Layout", "ThorVG"];

      const card = html`

        <div style=${{

          background: "#1a1a2e",

          color: "#e0e0e0",

          padding: "20px",

        }}>

          <div style=${{

            background: "#16213e",

            padding: "10px",

            color: "#f7a41d",

          }}>

            ${name} v${version}

          </div>

          <ul style=${{ padding: "10px" }}>

            ${features.map(

              (f) => html`

                <li style=${{

                  background: "#0f3460",

                  padding: "5px",

                  margin: "4px",

                  color: "#e94560",

                }}>${f}</li>

              `

            )}

          </ul>

        </div>

      `;

      document.body.appendChild(card);