{"id":50508541,"url":"https://github.com/murdinc/lcdata","last_synced_at":"2026-06-02T18:02:05.375Z","repository":{"id":351404234,"uuid":"1210677990","full_name":"murdinc/lcdata","owner":"murdinc","description":"A declarative agentic LLM execution engine ","archived":false,"fork":false,"pushed_at":"2026-05-01T02:51:08.000Z","size":20239,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-05-01T04:19:25.803Z","etag":null,"topics":["agentic-ai","agents","anthropic","cli","declarative","golang","json-config","llm","nlp","ollama","openai","orchestration","pipeline","rest-api","self-hosted","speech-to-text","sse","text-to-speech","websocket","workflow"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/murdinc.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-14T16:44:30.000Z","updated_at":"2026-05-01T02:51:12.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/murdinc/lcdata","commit_stats":null,"previous_names":["murdinc/lcdata"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/murdinc/lcdata","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/murdinc%2Flcdata","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/murdinc%2Flcdata/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/murdinc%2Flcdata/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/murdinc%2Flcdata/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/murdinc","download_url":"https://codeload.github.com/murdinc/lcdata/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/murdinc%2Flcdata/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33833277,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-02T02:00:07.132Z","response_time":109,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agentic-ai","agents","anthropic","cli","declarative","golang","json-config","llm","nlp","ollama","openai","orchestration","pipeline","rest-api","self-hosted","speech-to-text","sse","text-to-speech","websocket","workflow"],"created_at":"2026-06-02T18:02:04.115Z","updated_at":"2026-06-02T18:02:05.362Z","avatar_url":"https://github.com/murdinc.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# lcdata\n\nA declarative agentic LLM execution engine. Drop a folder into `nodes/`, write a JSON config, and the engine exposes it as a REST + WebSocket API endpoint. No code changes. No recompile. Nodes compose into pipelines with conditional branching, parallel execution, loops, and fan-out — all in JSON.\n\nBuilt as a single Go binary with a file-first design: the binary is the engine, `nodes/` is the content.\n\n---\n\n## Quick Start\n\n```bash\n# Build\ngo build -o lcdata .\n\n# First run creates lcdata.json with defaults\n./lcdata serve\n\n# List available nodes\n./lcdata list\n\n# Run a node from the CLI\n./lcdata run llm_chat --input message=\"Hello\"\n\n# Validate all node configs\n./lcdata validate\n\n# Show the execution graph for a pipeline\n./lcdata graph smart_assistant\n```\n\n---\n\n## The Core Idea\n\nEvery node is a directory:\n\n```\nnodes/\n  my_agent/\n    my_agent.json     ← node config\n    system.md         ← optional system prompt (LLM nodes)\n```\n\nThe `type` field determines how it runs. Nodes can be wired together into pipelines using Go template expressions (`{{.step_id.field}}`). Adding a new agent means dropping a folder — no Go code required.\n\n---\n\n## Node Types\n\n| Type | Description |\n|------|-------------|\n| `llm` | LLM call — Anthropic Claude, Ollama, or OpenAI-compatible |\n| `http` | Outbound HTTP request with templated URL, headers, and body |\n| `search` | Web search — Brave API or SearXNG |\n| `file` | File operations — read, write, append, exists, delete, list |\n| `command` | Shell command with streaming stdout |\n| `transform` | Template-based data reshaping, no external call |\n| `database` | SQL query — Postgres or SQLite |\n| `stt` | Speech-to-text — Deepgram, OpenAI Whisper, or whisper.cpp (local) |\n| `tts` | Text-to-speech — ElevenLabs, OpenAI, or Piper (local) |\n| `vector` | Vector store operations via springg — upsert, search, get, delete |\n| `embedding` | Generate embedding vectors — OpenAI or Ollama |\n| `scaffold` | Node self-builder — create, read, list, or delete node configs at runtime |\n| `pipeline` | Orchestrates other nodes — sequential, switch, parallel, loop, map |\n\n---\n\n## Node Config Reference\n\n### LLM Node\n\n```json\n{\n  \"name\": \"llm_chat\",\n  \"description\": \"General-purpose chat using Claude\",\n  \"type\": \"llm\",\n  \"provider\": \"anthropic\",\n  \"model\": \"claude-sonnet-4-6\",\n  \"system_prompt_file\": \"system.md\",\n  \"temperature\": 0.7,\n  \"max_tokens\": 4096,\n  \"stream\": true,\n  \"tools\": [\"web_search\", \"read_file\"],\n  \"retry_count\": 2,\n  \"retry_delay\": \"1s\",\n  \"structured_output\": {\n    \"intent\": { \"type\": \"string\" },\n    \"confidence\": { \"type\": \"number\" }\n  },\n  \"input\": {\n    \"message\": { \"type\": \"string\", \"required\": true },\n    \"history\":  { \"type\": \"array\",  \"required\": false }\n  },\n  \"output\": {\n    \"response\": { \"type\": \"string\" },\n    \"usage\":    { \"type\": \"object\" }\n  }\n}\n```\n\n- **providers:** `anthropic`, `ollama`, `openai`\n- **`stream: true`** emits `chunk` events over WebSocket/SSE as tokens arrive — works with and without tools\n- **`structured_output`** — when set, the LLM response is parsed as JSON and each field is merged into the output map alongside `response`\n- **`history`** input — pass an array of `{role, content}` objects for multi-turn conversations\n- **`max_history`** — trim `history` to the most recent N entries before sending to the model (prevents context overflow)\n- **`tools`** — list of node names the LLM can invoke as tools. Anthropic and Ollama run an agentic loop (up to 10 turns). Streaming is fully supported with tools — text tokens stream as they arrive; tool calls execute between turns.\n- **`retry_count` / `retry_delay`** — retry on API error with exponential backoff + jitter (e.g. `\"retry_count\": 3, \"retry_delay\": \"1s\"`)\n\n**Date/time template functions** — available in system prompts and all templates:\n\n| Function | Returns |\n|---|---|\n| `{{now}}` | Current time as RFC3339 string |\n| `{{date}}` | Current date as `YYYY-MM-DD` |\n| `{{datetime}}` | Current date+time as `YYYY-MM-DD HH:MM:SS` |\n\n### HTTP Node\n\n```json\n{\n  \"name\": \"fetch_url\",\n  \"type\": \"http\",\n  \"method\": \"GET\",\n  \"url\": \"{{.input.url}}\",\n  \"strip_html\": true,\n  \"headers\": {\n    \"Authorization\": \"Bearer {{.input.token}}\"\n  },\n  \"body\": \"{\\\"query\\\": \\\"{{.input.q}}\\\"}\",\n  \"input\": {\n    \"url\": { \"type\": \"string\", \"required\": true }\n  },\n  \"output\": {\n    \"status\": { \"type\": \"number\" },\n    \"body\":   { \"type\": \"string\" }\n  }\n}\n```\n\n- **`strip_html: true`** strips tags, scripts, and styles — returns clean plain text\n- URL, headers, and body are all Go templates rendered against the run context\n\n### Search Node\n\n```json\n{\n  \"name\": \"web_search\",\n  \"type\": \"search\",\n  \"search_provider\": \"brave\",\n  \"search_count\": 10,\n  \"input\": {\n    \"query\": { \"type\": \"string\", \"required\": true }\n  },\n  \"output\": {\n    \"results\": { \"type\": \"array\" },\n    \"count\":   { \"type\": \"number\" }\n  }\n}\n```\n\n- **providers:** `brave`, `searxng`\n- Returns `results` as an array of `{title, url, description}` objects\n\n### File Node\n\n```json\n{\n  \"name\": \"read_file\",\n  \"type\": \"file\",\n  \"operation\": \"read\",\n  \"input\": {\n    \"path\": { \"type\": \"string\", \"required\": true }\n  },\n  \"output\": {\n    \"content\": { \"type\": \"string\" },\n    \"size\":    { \"type\": \"number\" }\n  }\n}\n```\n\n- **operations:** `read`, `write`, `append`, `exists`, `delete`, `list`\n- `write` and `append` require `input.content`\n- Parent directories are created automatically on `write`/`append`\n\n### Command Node\n\n```json\n{\n  \"name\": \"run_script\",\n  \"type\": \"command\",\n  \"command\": \"bash\",\n  \"args\": [\"scripts/process.sh\"],\n  \"timeout\": \"10m\",\n  \"env\": {\n    \"TARGET\": \"{{.input.target}}\"\n  },\n  \"input\": {\n    \"target\": { \"type\": \"string\", \"required\": true }\n  },\n  \"output\": {\n    \"stdout\":    { \"type\": \"string\" },\n    \"exit_code\": { \"type\": \"number\" }\n  }\n}\n```\n\n- Stdout is streamed line-by-line as `chunk` events over WebSocket/SSE\n- `timeout` accepts Go duration strings: `30s`, `5m`, `1h`\n\n### Transform Node\n\n```json\n{\n  \"name\": \"format_report\",\n  \"type\": \"transform\",\n  \"template\": \"# Report\\n\\n{{.input.title}}\\n\\n{{.input.body}}\",\n  \"input\": {\n    \"title\": { \"type\": \"string\", \"required\": true },\n    \"body\":  { \"type\": \"string\", \"required\": true }\n  },\n  \"output\": {\n    \"result\": { \"type\": \"string\" }\n  }\n}\n```\n\n### Database Node\n\n```json\n{\n  \"name\": \"query_users\",\n  \"type\": \"database\",\n  \"driver\": \"sqlite\",\n  \"connection\": \"./data.db\",\n  \"query\": \"SELECT * FROM users WHERE name = ?\",\n  \"params\": [\"{{.input.name}}\"],\n  \"input\": {\n    \"name\": { \"type\": \"string\", \"required\": true }\n  },\n  \"output\": {\n    \"rows\":  { \"type\": \"array\" },\n    \"count\": { \"type\": \"number\" }\n  }\n}\n```\n\n- **drivers:** `sqlite`, `postgres`\n- `params` values are Go templates rendered against the run context\n- Rows stream as `chunk` events; final output is `{rows, count}`\n\n### STT Node\n\n```json\n{\n  \"name\": \"transcribe\",\n  \"type\": \"stt\",\n  \"provider\": \"deepgram\",\n  \"model\": \"nova-2\",\n  \"language\": \"en\",\n  \"input\": {\n    \"audio_url\": { \"type\": \"string\", \"required\": true }\n  },\n  \"output\": {\n    \"transcript\": { \"type\": \"string\" },\n    \"confidence\": { \"type\": \"number\" },\n    \"words\":      { \"type\": \"array\" },\n    \"duration\":   { \"type\": \"number\" },\n    \"language\":   { \"type\": \"string\" }\n  }\n}\n```\n\n- **providers:** `deepgram` (pre-recorded REST API), `openai` / `whisper` (multipart upload), `whisper-cpp` (local)\n- Deepgram accepts an audio URL in `input.audio_url`; OpenAI/Whisper fetches the URL then uploads it\n- `whisper-cpp` runs locally via the `whisper-cli` binary — no API key required\n\n**Local STT with whisper.cpp:**\n\n```json\n{\n  \"name\": \"transcribe_local\",\n  \"type\": \"stt\",\n  \"provider\": \"whisper-cpp\",\n  \"model\": \"/path/to/ggml-base.en.bin\",\n  \"language\": \"en\",\n  \"input\": {\n    \"audio_url\": { \"type\": \"string\", \"required\": true }\n  },\n  \"output\": {\n    \"transcript\": { \"type\": \"string\" },\n    \"confidence\": { \"type\": \"number\" },\n    \"words\":      { \"type\": \"array\" },\n    \"language\":   { \"type\": \"string\" }\n  }\n}\n```\n\n- `model` — path to a whisper.cpp GGML model file (e.g. `ggml-base.en.bin`, `ggml-large-v3.bin`). Falls back to `whisperCppModel` in env config.\n- Binary defaults to `whisper-cli` on `$PATH`. Override with `whisperCppBin` in env config or `WHISPER_CPP_BIN` env var.\n- Models: download from [ggerganov/whisper.cpp](https://github.com/ggerganov/whisper.cpp) or via `bash models/download-ggml-model.sh base.en`\n\n### TTS Node\n\n```json\n{\n  \"name\": \"speak\",\n  \"type\": \"tts\",\n  \"provider\": \"elevenlabs\",\n  \"model\": \"eleven_multilingual_v2\",\n  \"voice_id\": \"21m00Tcm4TlvDq8ikWAM\",\n  \"input\": {\n    \"text\": { \"type\": \"string\", \"required\": true }\n  },\n  \"output\": {\n    \"audio_base64\": { \"type\": \"string\" },\n    \"content_type\":  { \"type\": \"string\" },\n    \"size_bytes\":    { \"type\": \"number\" }\n  }\n}\n```\n\n- **providers:** `elevenlabs`, `openai`, `piper`\n- Returns audio as a base64-encoded string. `content_type` is `audio/mpeg` for cloud providers, `audio/wav` for Piper.\n\n**Local TTS with Piper:**\n\n```json\n{\n  \"name\": \"speak_local\",\n  \"type\": \"tts\",\n  \"provider\": \"piper\",\n  \"voice_id\": \"/path/to/en_US-lessac-medium.onnx\",\n  \"input\": {\n    \"text\": { \"type\": \"string\", \"required\": true }\n  },\n  \"output\": {\n    \"audio_base64\": { \"type\": \"string\" },\n    \"content_type\":  { \"type\": \"string\" },\n    \"size_bytes\":    { \"type\": \"number\" }\n  }\n}\n```\n\n- `voice_id` — path to a Piper `.onnx` voice model file (required). No API key needed.\n- Binary defaults to `piper` on `$PATH`. Override with `piperBin` in env config or `PIPER_BIN` env var.\n- Voice models: download from [rhasspy/piper](https://github.com/rhasspy/piper) releases\n\n### Vector Node\n\nBacked by [springg](https://github.com/murdinc/springg) — a local vector store with cosine similarity search, WAL persistence, and optional S3 backup.\n\n**Operations:** `upsert`, `search`, `get`, `delete`, `create_index`, `delete_index`\n\n**upsert** — add or update a vector:\n\n```json\n{\n  \"name\": \"memory_store\",\n  \"type\": \"vector\",\n  \"operation\": \"upsert\",\n  \"index\": \"assistant_memory\",\n  \"input\": {\n    \"id\":       { \"type\": \"string\", \"required\": true },\n    \"vector\":   { \"type\": \"array\",  \"required\": true },\n    \"metadata\": { \"type\": \"object\", \"required\": false }\n  },\n  \"output\": {\n    \"id\":    { \"type\": \"string\" },\n    \"added\": { \"type\": \"boolean\" }\n  }\n}\n```\n\n**search** — find top-k similar vectors by cosine similarity:\n\n```json\n{\n  \"name\": \"memory_search\",\n  \"type\": \"vector\",\n  \"operation\": \"search\",\n  \"index\": \"assistant_memory\",\n  \"top_k\": 5,\n  \"input\": {\n    \"vector\": { \"type\": \"array\",  \"required\": true },\n    \"k\":      { \"type\": \"number\", \"required\": false }\n  },\n  \"output\": {\n    \"results\": { \"type\": \"array\" },\n    \"count\":   { \"type\": \"number\" }\n  }\n}\n```\n\nEach result in `results` is `{id, score, metadata}` where `score` is cosine similarity (0–1).\n\n**get** — fetch a stored vector by ID:\n\n```json\n{\n  \"name\": \"memory_get\",\n  \"type\": \"vector\",\n  \"operation\": \"get\",\n  \"index\": \"assistant_memory\",\n  \"input\": {\n    \"id\": { \"type\": \"string\", \"required\": true }\n  },\n  \"output\": {\n    \"id\":       { \"type\": \"string\" },\n    \"vector\":   { \"type\": \"array\" },\n    \"metadata\": { \"type\": \"object\" }\n  }\n}\n```\n\n**delete** — remove a vector by ID:\n\n```json\n{\n  \"name\": \"memory_delete\",\n  \"type\": \"vector\",\n  \"operation\": \"delete\",\n  \"index\": \"assistant_memory\",\n  \"input\": {\n    \"id\": { \"type\": \"string\", \"required\": true }\n  }\n}\n```\n\n**create_index** / **delete_index** — manage indexes (typically run once at setup):\n\n```json\n{\n  \"name\": \"memory_init\",\n  \"type\": \"vector\",\n  \"operation\": \"create_index\",\n  \"index\": \"assistant_memory\",\n  \"dimensions\": 1536\n}\n```\n\n- `index` — name of the springg index (required on all operations)\n- `dimensions` — vector size; must match your embedding model (required for `create_index`)\n- `top_k` — default number of results for `search` (default: 10); overridable per-call via `input.k`\n- Vectors are stored as 32-bit floats; pass any JSON number array as `input.vector`\n\n### Embedding Node\n\nGenerates embedding vectors from text. Pair with the `vector` node to build RAG pipelines.\n\n```json\n{\n  \"name\": \"embed_text\",\n  \"type\": \"embedding\",\n  \"provider\": \"openai\",\n  \"model\": \"text-embedding-3-small\",\n  \"input\": {\n    \"text\": { \"type\": \"string\", \"required\": true }\n  },\n  \"output\": {\n    \"vector\":     { \"type\": \"array\" },\n    \"dimensions\": { \"type\": \"number\" },\n    \"model\":      { \"type\": \"string\" }\n  }\n}\n```\n\n- **providers:** `openai` (default model: `text-embedding-3-small`), `ollama` (model required, e.g. `nomic-embed-text`)\n- Output `vector` can be passed directly to a `vector` node's `upsert` or `search` input\n\n---\n\n### Scaffold Node\n\nThe `scaffold` type lets lcdata **create new nodes at runtime** — it can build itself. Combined with an LLM node and the built-in `design_node` pipeline, you can describe a new capability in plain English and have it live in the engine within seconds.\n\nOperations: `list`, `read`, `create`, `delete`.\n\n```json\n{\n  \"name\": \"scaffold_list\",\n  \"type\": \"scaffold\",\n  \"operation\": \"list\"\n}\n```\n\n```json\n{\n  \"name\": \"scaffold_create\",\n  \"type\": \"scaffold\",\n  \"operation\": \"create\",\n  \"input\": {\n    \"name\":          { \"type\": \"string\", \"required\": true },\n    \"config\":        { \"type\": \"string\", \"required\": true },\n    \"system_prompt\": { \"type\": \"string\" }\n  }\n}\n```\n\n**`list`** — returns `{nodes: [...summaries], count}`. No inputs required.\n\n**`read`** — input: `name`. Returns `{name, path, config (string), object (parsed)}`.\n\n**`create`** — inputs: `name`, `config` (JSON string or object — validated before writing), optional `system_prompt` (written as `system.md`). Returns `{name, path, created: true}`. The hot-reload watcher picks up the new node within 200ms automatically.\n\n**`delete`** — input: `name`. Removes the node directory. Hot-reload deregisters it.\n\n#### Self-building: `design_node` pipeline\n\nThe built-in `design_node` pipeline takes a natural language description and creates a working node:\n\n```bash\ncurl -X POST http://localhost:8080/api/nodes/design_node/run \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"input\": {\n      \"name\": \"sentiment_score\",\n      \"description\": \"Classify the sentiment of a text as positive, negative, or neutral with a confidence score\"\n    }\n  }'\n```\n\nInternally it chains: `scaffold_list` → `scaffold_designer_llm` (Claude with full schema reference) → `scaffold_create`. The new node is live as soon as the response returns.\n\nThe `scaffold_designer_llm` node uses a comprehensive system prompt (`nodes/scaffold_designer_llm/system.md`) covering every node type, field, template syntax, naming conventions, and design guidelines so the LLM generates configs that pass validation on the first attempt.\n\n**Provider note:** `design_node` defaults to `anthropic` / `claude-opus-4-5` for the designer LLM. You can edit `nodes/scaffold_designer_llm/scaffold_designer_llm.json` to switch to any other provider/model.\n\n---\n\n## Pipelines\n\nPipelines wire nodes together. Each step's output is available to all subsequent steps via `{{.step_id.field}}`. The pipeline's `output` block defines what gets returned to the caller.\n\n```json\n{\n  \"name\": \"my_pipeline\",\n  \"type\": \"pipeline\",\n  \"steps\": [ ... ],\n  \"input\": { ... },\n  \"output\": {\n    \"answer\": \"{{.final_step.response}}\",\n    \"items\":  \"{{.gather.results}}\"\n  }\n}\n```\n\n### Sequential Step\n\n```json\n{\n  \"id\": \"summarize\",\n  \"node\": \"summarize\",\n  \"input\": {\n    \"message\": \"{{.fetch.body}}\"\n  }\n}\n```\n\n### Error Handling\n\nAny step can specify `on_error` to run a fallback node instead of aborting the pipeline. The handler node receives `input.error` and `input.step_id`; its output replaces the failed step's output.\n\n```json\n{\n  \"id\": \"fetch\",\n  \"node\": \"fetch_url\",\n  \"input\": { \"url\": \"{{.input.url}}\" },\n  \"on_error\": \"fallback_handler\"\n}\n```\n\n### Switch (Conditional Branching)\n\nRoutes to a different node based on a runtime value. LLM outputs like `{\"intent\": \"search\"}` are automatically normalized to `\"search\"`.\n\n```json\n{\n  \"id\": \"route\",\n  \"switch\": \"{{.classify.intent}}\",\n  \"cases\": {\n    \"search\":   { \"node\": \"web_search\",  \"input\": { \"query\":   \"{{.input.message}}\" } },\n    \"chat\":     { \"node\": \"llm_chat\",    \"input\": { \"message\": \"{{.input.message}}\" } },\n    \"default\":  { \"node\": \"llm_chat\",    \"input\": { \"message\": \"{{.input.message}}\" } }\n  }\n}\n```\n\n### Parallel\n\nAll branches run concurrently. Outputs are namespaced: `{{.gather.web.results}}`, `{{.gather.db.rows}}`.\n\n```json\n{\n  \"id\": \"gather\",\n  \"parallel\": [\n    { \"id\": \"web\", \"node\": \"web_search\", \"input\": { \"query\": \"{{.input.topic}}\" } },\n    { \"id\": \"db\",  \"node\": \"db_lookup\",  \"input\": { \"term\":  \"{{.input.topic}}\" } }\n  ]\n}\n```\n\n### Loop\n\nRepeats inner steps until a condition is true or `max_iterations` is reached. Each iteration shares the same run context so later iterations can reference earlier ones.\n\n```json\n{\n  \"id\": \"refine\",\n  \"loop\": {\n    \"max_iterations\": 5,\n    \"until\": \"{{gt (toFloat .evaluate.score) 0.8}}\",\n    \"steps\": [\n      { \"id\": \"draft\",    \"node\": \"llm_writer\",    \"input\": { \"topic\": \"{{.input.topic}}\" } },\n      { \"id\": \"evaluate\", \"node\": \"llm_evaluator\",  \"input\": { \"draft\": \"{{.draft.text}}\" } }\n    ]\n  }\n}\n```\n\n### Map (Fan-out)\n\nRuns a node once per item in an array. Results are collected into a new context key.\n\n```json\n{\n  \"id\": \"fetch_all\",\n  \"map\": {\n    \"over\":        \"{{.search.results}}\",\n    \"as\":          \"result\",\n    \"node\":        \"fetch_url\",\n    \"concurrency\": 3,\n    \"input\": {\n      \"url\": \"{{.result.url}}\"\n    },\n    \"collect_as\": \"pages\"\n  }\n}\n```\n\n- `as` names the current item for use in `input` templates\n- `collect_as` sets the context key where the result array lands\n- `concurrency` controls how many items run in parallel (default: 1 = sequential)\n\n---\n\n## Run Context\n\nAll steps in a run share a thread-safe `RunContext`. Steps read via templates and write by returning their output fields. Keys are namespaced by step ID.\n\n```\ninput.message          → user-provided input\nclassify.intent        → written by the \"classify\" step\ngather.web.results     → written by branch \"web\" inside parallel step \"gather\"\nfetch_all.0            → written by map step \"fetch_all\" for item 0\npages                  → the collected array from a map step's collect_as\n```\n\n### Template Functions\n\n| Function | Usage |\n|----------|-------|\n| `{{.step.field}}` | Access any step output |\n| `{{.input.field}}` | Access run inputs |\n| `{{toJSON .value}}` | Marshal to JSON string |\n| `{{fromJSON .str}}` | Parse JSON string |\n| `{{toFloat .value}}` | Convert to float64 |\n| `{{toInt .value}}` | Convert to int |\n| `{{default val fallback}}` | Use fallback if val is empty/nil |\n| `{{join arr sep}}` | Join string array with separator |\n| `{{gt a b}}` / `{{lt a b}}` | Comparison (for loop conditions) |\n\nSimple path references like `{{.step.field}}` preserve the original type (array, map, number). Complex templates like `\"https://{{.host}}/{{.path}}\"` produce strings.\n\n---\n\n## Built-in Nodes\n\nThe `nodes/` directory ships with these ready-to-use nodes:\n\n| Node | Type | What it does |\n|------|------|-------------|\n| `llm_chat` | llm | Streaming chat with Claude (sonnet-4-6) |\n| `classify_intent` | llm | Classifies input as: search, database, chat, command |\n| `classify_sentiment` | llm | Returns `{sentiment, confidence, explanation}` |\n| `extract_entities` | llm | Returns `{people, organizations, locations, dates, topics}` |\n| `summarize` | llm | Condenses text to 2-4 sentences |\n| `translate` | llm | Translates to any language |\n| `fetch_url` | http | Fetches a URL, strips HTML to plain text |\n| `web_search` | search | Brave web search → `[{title, url, description}]` |\n| `read_file` | file | Reads a file from disk |\n| `write_file` | file | Writes a file to disk |\n| `research_pipeline` | pipeline | web_search → fetch pages → summarize each → synthesize answer |\n| `smart_assistant` | pipeline | Classify intent → route to research, translate, or chat |\n| `analyze_document` | pipeline | Read file → parallel analysis → compose report → write file |\n\n### Pipeline: `research_pipeline`\n\n```\nweb_search\n  └── map: fetch_url (concurrency: 3)\n        └── map: summarize (concurrency: 3)\n              └── llm_chat (synthesize)\n```\n\nReturns `{response, search_results, summaries}`.\n\n### Pipeline: `smart_assistant`\n\n```\nclassify_intent\n  └── switch on intent:\n        \"search\"   → research_pipeline\n        \"translate\" → translate\n        \"default\"  → llm_chat\n```\n\nReturns `{response, intent}`.\n\n### Pipeline: `analyze_document`\n\n```\nread_file\n  └── parallel:\n        ├── summarize\n        ├── extract_entities\n        └── classify_sentiment\n              └── llm_chat (compose report)\n                    └── write_file\n```\n\nReturns `{report, output_path, summary, sentiment}`.\n\n---\n\n## API\n\n### Discovery\n\n```\nGET /api/nodes           → list all nodes with descriptions and I/O schemas\nGET /api/nodes/{name}    → full node spec\nGET /api/health          → health check\nGET /api/info            → server version and capabilities\n```\n\n### Execution\n\n```\nPOST /api/nodes/{name}/run     → synchronous, waits for full result\nPOST /api/nodes/{name}/stream  → Server-Sent Events, streams as it runs\nGET  /ws/nodes/{name}          → WebSocket, bidirectional streaming\nPOST /api/nodes/{name}/audio   → multipart upload: POST audio file, runs node with audio_url set\n```\n\n**Audio upload** — `multipart/form-data` with fields:\n- `audio` (required) — audio file (WAV, MP3, OGG, FLAC, M4A, WebM)\n- `env` (optional) — environment name (default: `\"default\"`)\n\nThe file is saved to a temp path and passed as `input.audio_url` to the node. Works with any `stt` node; `whisper-cpp` reads it directly without an HTTP round-trip.\n\n**Request body:**\n```json\n{\n  \"input\": { \"message\": \"What is the capital of France?\" },\n  \"env\":    \"default\"\n}\n```\n\n**Response:**\n```json\n{\n  \"run_id\":      \"a3f9b2c1\",\n  \"node\":        \"smart_assistant\",\n  \"status\":      \"completed\",\n  \"output\": {\n    \"response\": \"The capital of France is Paris.\",\n    \"intent\":   \"chat\"\n  },\n  \"steps\": [\n    { \"id\": \"classify\", \"node\": \"classify_intent\", \"status\": \"completed\", \"duration_ms\": 180 },\n    { \"id\": \"handle\",   \"node\": \"llm_chat\",         \"status\": \"completed\", \"duration_ms\": 620 }\n  ],\n  \"duration_ms\": 800\n}\n```\n\n### Run Management\n\n```\nGET  /api/runs           → list recent runs\nGET  /api/runs/{id}      → get run status and full result\nPOST /api/runs/{id}/cancel → cancel an in-progress run\n```\n\n### Streaming Events\n\nAll streaming connections (WebSocket and SSE) receive the same event stream:\n\n```json\n{\"event\":\"run_started\",    \"run_id\":\"abc\", \"node\":\"smart_assistant\"}\n{\"event\":\"step_started\",   \"run_id\":\"abc\", \"step_id\":\"classify\",  \"node\":\"classify_intent\"}\n{\"event\":\"step_completed\", \"run_id\":\"abc\", \"step_id\":\"classify\",  \"output\":{\"intent\":\"search\"}, \"duration_ms\":180}\n{\"event\":\"step_started\",   \"run_id\":\"abc\", \"step_id\":\"handle\",    \"node\":\"research_pipeline\"}\n{\"event\":\"chunk\",          \"run_id\":\"abc\", \"step_id\":\"synthesize\",\"data\":\"Based on the search results...\"}\n{\"event\":\"map_progress\",   \"run_id\":\"abc\", \"step_id\":\"fetch_all\", \"progress\":3, \"total\":10}\n{\"event\":\"step_completed\", \"run_id\":\"abc\", \"step_id\":\"handle\",    \"output\":{...}, \"duration_ms\":4200}\n{\"event\":\"run_completed\",  \"run_id\":\"abc\", \"output\":{...}, \"duration_ms\":4380}\n```\n\n**Event types:** `run_started`, `run_completed`, `run_failed`, `run_cancelled`, `step_started`, `step_completed`, `step_failed`, `chunk`, `loop_iteration`, `map_progress`, `retry`\n\n---\n\n## Configuration\n\n### Server Config (`lcdata.json`)\n\nCreated automatically on first run.\n\n```json\n{\n  \"port\":                8080,\n  \"jwt_secret\":          \"change-this-in-production\",\n  \"require_jwt\":         true,\n  \"nodes_path\":          \"./nodes\",\n  \"env\":                 \"default\",\n  \"log_level\":           \"info\",\n  \"max_concurrent_runs\": 10,\n  \"run_timeout\":         \"5m\",\n  \"run_history\":         100,\n  \"store_path\":          \"./lcdata.db\",\n  \"rate_limit_rps\":      0,\n  \"rate_limit_burst\":    0\n}\n```\n\n- **`store_path`** — SQLite file for run history persistence (created automatically)\n- **`rate_limit_rps`** — requests per second per JWT `sub` claim (0 = disabled); falls back to remote IP when no JWT is present\n- **`rate_limit_burst`** — bucket size for bursts (default: `rps * 2`)\n\n### Credentials (`~/lcdataenv.json`)\n\nLookup order: `~/lcdataenv.json` → `./nodes/env.json`. All fields also fall back to environment variables.\n\n```json\n{\n  \"environments\": {\n    \"default\": {\n      \"anthropicKey\":    \"sk-ant-...\",\n      \"ollamaEndpoint\":  \"http://localhost:11434\",\n      \"openaiKey\":       \"sk-...\",\n      \"braveKey\":        \"BSA...\",\n      \"searxngEndpoint\": \"http://localhost:8888\",\n      \"elevenlabsKey\":   \"\",\n      \"deepgramKey\":     \"\",\n      \"whisperCppBin\":   \"/usr/local/bin/whisper-cli\",\n      \"whisperCppModel\": \"/path/to/ggml-base.en.bin\",\n      \"piperBin\":        \"/usr/local/bin/piper\",\n      \"springgEndpoint\": \"http://localhost:8181\",\n      \"springgKey\":      \"\",\n      \"dbConnections\": {\n        \"main\": \"postgres://user:pass@localhost:5432/mydb\"\n      }\n    },\n    \"production\": {\n      \"anthropicKey\": \"sk-ant-...\",\n      \"dbConnections\": {\n        \"main\": \"postgres://user:pass@prod:5432/mydb?sslmode=require\"\n      }\n    }\n  }\n}\n```\n\n**Environment variable fallbacks:**\n\n| Config key | Env var |\n|-----------|---------|\n| `anthropicKey` | `ANTHROPIC_API_KEY` |\n| `openaiKey` | `OPENAI_API_KEY` |\n| `ollamaEndpoint` | `OLLAMA_ENDPOINT` |\n| `braveKey` | `BRAVE_API_KEY` |\n| `searxngEndpoint` | `SEARXNG_ENDPOINT` |\n| `elevenlabsKey` | `ELEVENLABS_API_KEY` |\n| `deepgramKey` | `DEEPGRAM_API_KEY` |\n| `whisperCppBin` | `WHISPER_CPP_BIN` |\n| `whisperCppModel` | `WHISPER_CPP_MODEL` |\n| `piperBin` | `PIPER_BIN` |\n| `springgEndpoint` | `SPRINGG_ENDPOINT` |\n| `springgKey` | `SPRINGG_KEY` |\n\n---\n\n## CLI\n\n```\nlcdata serve                                        start the HTTP + WebSocket server\nlcdata init [name] [type]                           scaffold a new node directory\nlcdata list                                         list all nodes\nlcdata show [name]                                  show full node config\nlcdata run [name] --input key=val --env prod        run a node locally (no server)\nlcdata run [name] --input message=-                 read one input value from stdin\nlcdata validate                                     validate all node configs\nlcdata graph [name]                                 print execution tree with icons\nlcdata generate-jwt --client my-service             generate a signed JWT\nlcdata generate-jwt --client svc --allow node1,node2  JWT scoped to specific nodes\nlcdata version                                      show version\n```\n\n**Graph output example:**\n```\n◼ analyze_document  (pipeline)\n├── [read]\n│   ▤ read_file  (file)\n├── [analyze] parallel (3 branches)\n│   ├── branch \"summary\"\n│   │   ◆ summarize  (llm)\n│   ├── branch \"entities\"\n│   │   ◆ extract_entities  (llm)\n│   └── branch \"sentiment\"\n│       ◆ classify_sentiment  (llm)\n├── [compose]\n│   ◆ llm_chat  (llm)\n└── [write]\n    ▤ write_file  (file)\n```\n\n**Node type icons:** ◆ llm · ◈ http · ⊕ search · ▤ file · ▶ command · ▣ database · ◇ transform · ◼ pipeline · ◎ stt · ◉ tts\n\n---\n\n## Auth\n\nJWT authentication is enabled by default. Disable with `\"require_jwt\": false` in `lcdata.json`.\n\nGenerate a token:\n```bash\nlcdata generate-jwt --client my-service\n```\n\nUse in requests:\n```\nAuthorization: Bearer \u003ctoken\u003e\n```\n\n---\n\n## Operational Features\n\n### Hot Reload\n\nThe server watches the `nodes/` directory with fsnotify. Adding, editing, or removing a node config takes effect within 200ms — no restart required. Discovery endpoints always reflect the current state.\n\n### Run Persistence\n\nCompleted runs are persisted to SQLite (`store_path` in config). The `/api/runs` endpoint returns the most recent N runs (`run_history`), merging in-flight runs with persisted ones.\n\n### Cost Tracking\n\nLLM nodes report `input_tokens` and `output_tokens` in their output under a `usage` key. The runner aggregates token counts across all steps and exposes them on the run record:\n\n```json\n{\n  \"run_id\": \"a3f9b2c1\",\n  \"input_tokens\": 1240,\n  \"output_tokens\": 387,\n  \"steps\": [\n    { \"id\": \"classify\", \"input_tokens\": 320,  \"output_tokens\": 12  },\n    { \"id\": \"answer\",   \"input_tokens\": 920,  \"output_tokens\": 375 }\n  ]\n}\n```\n\n### Retry\n\nNodes that fail due to transient errors (API timeouts, rate limits) retry automatically with exponential backoff and ±25% jitter. Configure per-node:\n\n```json\n{\n  \"retry_count\": 3,\n  \"retry_delay\": \"1s\"\n}\n```\n\n`retry` events are emitted on each attempt so streaming clients can observe them.\n\n---\n\n## Project Structure\n\n```\nlcdata/\n  main.go\n  go.mod\n  lcdata.json.example\n  lcdataenv.json.example\n  DESIGN.md\n  cmd/\n    root.go           Cobra root + global flags\n    serve.go          HTTP server, WebSocket, SSE, JWT middleware, rate limiting\n    init.go           lcdata init (scaffold node directory)\n    list.go           lcdata list\n    show.go           lcdata show\n    run.go            lcdata run (stdin support via key=-)\n    validate.go       lcdata validate\n    graph.go          lcdata graph (ASCII tree)\n    jwt.go            lcdata generate-jwt (with --allow node scoping)\n    version.go        lcdata version\n  internal/lcdata/\n    config.go         Server config (lcdata.json)\n    environment.go    Credentials config (lcdataenv.json)\n    node.go           Node struct, JSON loading, field schema, input validation\n    pipeline.go       Step, SwitchCase, LoopConfig, MapConfig types\n    runner.go         Run lifecycle, async execution, node hot-swap\n    watcher.go        fsnotify hot reload (debounced, 200ms)\n    store.go          SQLite run history persistence\n    retry.go          Exponential backoff with ±25% jitter\n    context.go        RunContext, template rendering, type preservation\n    stream.go         Event types, Run struct, StepResult (with token counts)\n    flow.go           Pipeline execution, switch/parallel/loop/map, on_error\n    executor.go       Per-type dispatch with retry wrapper\n    executor_llm.go   Anthropic (tool use loop), Ollama, OpenAI (SSE streaming)\n    executor_http.go  HTTP requests + HTML stripping\n    executor_search.go Brave API + SearXNG\n    executor_file.go  File read/write/append/exists/delete/list\n    executor_cmd.go   Command execution with streaming stdout\n    executor_xfm.go   Transform (Go template rendering)\n    executor_db.go    Database — SQLite + Postgres via database/sql\n    executor_stt.go   STT — Deepgram, OpenAI Whisper, whisper.cpp (local+file path)\n    executor_tts.go   TTS — ElevenLabs, OpenAI, Piper (local, returns base64 audio)\n    executor_vector.go Vector store — springg (upsert, search, get, delete, create/delete index)\n    executor_embed.go  Embeddings — OpenAI, Ollama (returns float64 vector)\n  nodes/\n    llm_chat/\n    classify_intent/\n    classify_sentiment/\n    extract_entities/\n    summarize/\n    translate/\n    fetch_url/\n    web_search/\n    read_file/\n    write_file/\n    research_pipeline/\n    smart_assistant/\n    analyze_document/\n```\n\n---\n\n## Dependencies\n\n```\ngithub.com/anthropics/anthropic-sdk-go  v0.2.0-alpha.4\ngithub.com/fsnotify/fsnotify            v1.9.0\ngithub.com/go-chi/chi/v5                v5.2.3\ngithub.com/go-chi/cors                  v1.2.2\ngithub.com/golang-jwt/jwt/v5            v5.3.0\ngithub.com/gorilla/websocket            v1.5.3\ngithub.com/lib/pq                       v1.10.9\ngithub.com/spf13/cobra                  v1.8.0\nmodernc.org/sqlite                      v1.37.0\n```\n\nGo 1.23+ (uses `log/slog`, `math/rand/v2`)\n\n---\n\n## Comparison\n\n| | lcdata | LangChain | n8n |\n|---|---|---|---|\n| Config format | JSON files | Python code | Visual UI |\n| Add new agent | Drop a folder | Write a class | Drag nodes |\n| Streaming | All node types, unified events | Per-chain, varies | Limited |\n| Self-describing API | `/api/nodes` live registry | No | Workflow export |\n| Flow control | switch/parallel/loop/map in JSON | Graph edges in code | Node connections |\n| Provider switch | Change one JSON field | Change class import | Reconfigure credential |\n| Deploy | Single Go binary | Python env + deps | Node.js app |\n| CLI mode | `lcdata run name` | Script the chain | No |\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmurdinc%2Flcdata","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmurdinc%2Flcdata","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmurdinc%2Flcdata/lists"}