https://github.com/murdinc/lcdata
A declarative agentic LLM execution engine
- Host: GitHub
- URL: https://github.com/murdinc/lcdata
- Owner: murdinc
- Created: 2026-04-14T16:44:30.000Z (2 months ago)
- Default Branch: master
- Last Pushed: 2026-05-01T02:51:08.000Z (about 2 months ago)
- Last Synced: 2026-05-01T04:19:25.803Z (about 2 months ago)
- Topics: agentic-ai, agents, anthropic, cli, declarative, golang, json-config, llm, nlp, ollama, openai, orchestration, pipeline, rest-api, self-hosted, speech-to-text, sse, text-to-speech, websocket, workflow
- Language: Go
- Homepage:
- Size: 19.3 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# lcdata
A declarative agentic LLM execution engine. Drop a folder into `nodes/`, write a JSON config, and the engine exposes it as a REST + WebSocket API endpoint. No code changes. No recompile. Nodes compose into pipelines with conditional branching, parallel execution, loops, and fan-out — all in JSON.
Built as a single Go binary with a file-first design: the binary is the engine, `nodes/` is the content.
---
## Quick Start
```bash
# Build
go build -o lcdata .
# First run creates lcdata.json with defaults
./lcdata serve
# List available nodes
./lcdata list
# Run a node from the CLI
./lcdata run llm_chat --input message="Hello"
# Validate all node configs
./lcdata validate
# Show the execution graph for a pipeline
./lcdata graph smart_assistant
```
---
## The Core Idea
Every node is a directory:
```
nodes/
  my_agent/
    my_agent.json   ← node config
    system.md       ← optional system prompt (LLM nodes)
```
The `type` field determines how it runs. Nodes can be wired together into pipelines using Go template expressions (`{{.step_id.field}}`). Adding a new agent means dropping a folder — no Go code required.
---
## Node Types
| Type | Description |
|------|-------------|
| `llm` | LLM call — Anthropic Claude, Ollama, or OpenAI-compatible |
| `http` | Outbound HTTP request with templated URL, headers, and body |
| `search` | Web search — Brave API or SearXNG |
| `file` | File operations — read, write, append, exists, delete, list |
| `command` | Shell command with streaming stdout |
| `transform` | Template-based data reshaping, no external call |
| `database` | SQL query — Postgres or SQLite |
| `stt` | Speech-to-text — Deepgram, OpenAI Whisper, or whisper.cpp (local) |
| `tts` | Text-to-speech — ElevenLabs, OpenAI, or Piper (local) |
| `vector` | Vector store operations via springg — upsert, search, get, delete |
| `embedding` | Generate embedding vectors — OpenAI or Ollama |
| `scaffold` | Node self-builder — create, read, list, or delete node configs at runtime |
| `pipeline` | Orchestrates other nodes — sequential, switch, parallel, loop, map |
---
## Node Config Reference
### LLM Node
```json
{
"name": "llm_chat",
"description": "General-purpose chat using Claude",
"type": "llm",
"provider": "anthropic",
"model": "claude-sonnet-4-6",
"system_prompt_file": "system.md",
"temperature": 0.7,
"max_tokens": 4096,
"stream": true,
"tools": ["web_search", "read_file"],
"retry_count": 2,
"retry_delay": "1s",
"structured_output": {
"intent": { "type": "string" },
"confidence": { "type": "number" }
},
"input": {
"message": { "type": "string", "required": true },
"history": { "type": "array", "required": false }
},
"output": {
"response": { "type": "string" },
"usage": { "type": "object" }
}
}
```
- **providers:** `anthropic`, `ollama`, `openai`
- **`stream: true`** emits `chunk` events over WebSocket/SSE as tokens arrive — works with and without tools
- **`structured_output`** — when set, the LLM response is parsed as JSON and each field is merged into the output map alongside `response`
- **`history`** input — pass an array of `{role, content}` objects for multi-turn conversations
- **`max_history`** — trim `history` to the most recent N entries before sending to the model (prevents context overflow)
- **`tools`** — list of node names the LLM can invoke as tools. Anthropic and Ollama run an agentic loop (up to 10 turns). Streaming is fully supported with tools — text tokens stream as they arrive; tool calls execute between turns.
- **`retry_count` / `retry_delay`** — retry on API error with exponential backoff + jitter (e.g. `"retry_count": 3, "retry_delay": "1s"`)
**Date/time template functions** — available in system prompts and all templates:
| Function | Returns |
|---|---|
| `{{now}}` | Current time as RFC3339 string |
| `{{date}}` | Current date as `YYYY-MM-DD` |
| `{{datetime}}` | Current date+time as `YYYY-MM-DD HH:MM:SS` |
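As a rough illustration of how helpers like these can be exposed to Go templates, the sketch below registers `now`, `date`, and `datetime` through `template.FuncMap`. The function names `timeFuncs`, `renderPrompt`, and `fixedClock`, and the injected clock, are assumptions made for this example; lcdata's own implementation may differ.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
	"time"
)

// timeFuncs returns template helpers named after the functions in the table
// above. The clock is injected (an assumption of this sketch) so that output
// can be made deterministic.
func timeFuncs(clock func() time.Time) template.FuncMap {
	return template.FuncMap{
		"now":      func() string { return clock().Format(time.RFC3339) },
		"date":     func() string { return clock().Format("2006-01-02") },
		"datetime": func() string { return clock().Format("2006-01-02 15:04:05") },
	}
}

// renderPrompt parses prompt with the helpers registered and executes it.
func renderPrompt(prompt string, clock func() time.Time) (string, error) {
	tmpl, err := template.New("prompt").Funcs(timeFuncs(clock)).Parse(prompt)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, nil); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// fixedClock returns a constant instant, used here for a reproducible demo.
func fixedClock() time.Time {
	return time.Date(2026, time.April, 14, 16, 44, 30, 0, time.UTC)
}

func main() {
	out, err := renderPrompt("Today is {{date}}. Timestamp: {{now}}", fixedClock)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // Today is 2026-04-14. Timestamp: 2026-04-14T16:44:30Z
}
```

Passing `time.Now` instead of `fixedClock` gives the live behaviour described in the table.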
### HTTP Node
```json
{
"name": "fetch_url",
"type": "http",
"method": "GET",
"url": "{{.input.url}}",
"strip_html": true,
"headers": {
"Authorization": "Bearer {{.input.token}}"
},
"body": "{\"query\": \"{{.input.q}}\"}",
"input": {
"url": { "type": "string", "required": true }
},
"output": {
"status": { "type": "number" },
"body": { "type": "string" }
}
}
```
- **`strip_html: true`** strips tags, scripts, and styles — returns clean plain text
- URL, headers, and body are all Go templates rendered against the run context
### Search Node
```json
{
"name": "web_search",
"type": "search",
"search_provider": "brave",
"search_count": 10,
"input": {
"query": { "type": "string", "required": true }
},
"output": {
"results": { "type": "array" },
"count": { "type": "number" }
}
}
```
- **providers:** `brave`, `searxng`
- Returns `results` as an array of `{title, url, description}` objects
### File Node
```json
{
"name": "read_file",
"type": "file",
"operation": "read",
"input": {
"path": { "type": "string", "required": true }
},
"output": {
"content": { "type": "string" },
"size": { "type": "number" }
}
}
```
- **operations:** `read`, `write`, `append`, `exists`, `delete`, `list`
- `write` and `append` require `input.content`
- Parent directories are created automatically on `write`/`append`
### Command Node
```json
{
"name": "run_script",
"type": "command",
"command": "bash",
"args": ["scripts/process.sh"],
"timeout": "10m",
"env": {
"TARGET": "{{.input.target}}"
},
"input": {
"target": { "type": "string", "required": true }
},
"output": {
"stdout": { "type": "string" },
"exit_code": { "type": "number" }
}
}
```
- Stdout is streamed line-by-line as `chunk` events over WebSocket/SSE
- `timeout` accepts Go duration strings: `30s`, `5m`, `1h`
### Transform Node
```json
{
"name": "format_report",
"type": "transform",
"template": "# Report\n\n{{.input.title}}\n\n{{.input.body}}",
"input": {
"title": { "type": "string", "required": true },
"body": { "type": "string", "required": true }
},
"output": {
"result": { "type": "string" }
}
}
```
### Database Node
```json
{
"name": "query_users",
"type": "database",
"driver": "sqlite",
"connection": "./data.db",
"query": "SELECT * FROM users WHERE name = ?",
"params": ["{{.input.name}}"],
"input": {
"name": { "type": "string", "required": true }
},
"output": {
"rows": { "type": "array" },
"count": { "type": "number" }
}
}
```
- **drivers:** `sqlite`, `postgres`
- `params` values are Go templates rendered against the run context
- Rows stream as `chunk` events; final output is `{rows, count}`
### STT Node
```json
{
"name": "transcribe",
"type": "stt",
"provider": "deepgram",
"model": "nova-2",
"language": "en",
"input": {
"audio_url": { "type": "string", "required": true }
},
"output": {
"transcript": { "type": "string" },
"confidence": { "type": "number" },
"words": { "type": "array" },
"duration": { "type": "number" },
"language": { "type": "string" }
}
}
```
- **providers:** `deepgram` (pre-recorded REST API), `openai` / `whisper` (multipart upload), `whisper-cpp` (local)
- Deepgram accepts an audio URL in `input.audio_url`; OpenAI/Whisper fetches the URL then uploads it
- `whisper-cpp` runs locally via the `whisper-cli` binary — no API key required
**Local STT with whisper.cpp:**
```json
{
"name": "transcribe_local",
"type": "stt",
"provider": "whisper-cpp",
"model": "/path/to/ggml-base.en.bin",
"language": "en",
"input": {
"audio_url": { "type": "string", "required": true }
},
"output": {
"transcript": { "type": "string" },
"confidence": { "type": "number" },
"words": { "type": "array" },
"language": { "type": "string" }
}
}
```
- `model` — path to a whisper.cpp GGML model file (e.g. `ggml-base.en.bin`, `ggml-large-v3.bin`). Falls back to `whisperCppModel` in env config.
- Binary defaults to `whisper-cli` on `$PATH`. Override with `whisperCppBin` in env config or `WHISPER_CPP_BIN` env var.
- Models: download from [ggerganov/whisper.cpp](https://github.com/ggerganov/whisper.cpp) or via `bash models/download-ggml-model.sh base.en`
### TTS Node
```json
{
"name": "speak",
"type": "tts",
"provider": "elevenlabs",
"model": "eleven_multilingual_v2",
"voice_id": "21m00Tcm4TlvDq8ikWAM",
"input": {
"text": { "type": "string", "required": true }
},
"output": {
"audio_base64": { "type": "string" },
"content_type": { "type": "string" },
"size_bytes": { "type": "number" }
}
}
```
- **providers:** `elevenlabs`, `openai`, `piper`
- Returns audio as a base64-encoded string. `content_type` is `audio/mpeg` for cloud providers, `audio/wav` for Piper.
**Local TTS with Piper:**
```json
{
"name": "speak_local",
"type": "tts",
"provider": "piper",
"voice_id": "/path/to/en_US-lessac-medium.onnx",
"input": {
"text": { "type": "string", "required": true }
},
"output": {
"audio_base64": { "type": "string" },
"content_type": { "type": "string" },
"size_bytes": { "type": "number" }
}
}
```
- `voice_id` — path to a Piper `.onnx` voice model file (required). No API key needed.
- Binary defaults to `piper` on `$PATH`. Override with `piperBin` in env config or `PIPER_BIN` env var.
- Voice models: download from [rhasspy/piper](https://github.com/rhasspy/piper) releases
### Vector Node
Backed by [springg](https://github.com/murdinc/springg) — a local vector store with cosine similarity search, WAL persistence, and optional S3 backup.
**Operations:** `upsert`, `search`, `get`, `delete`, `create_index`, `delete_index`
**upsert** — add or update a vector:
```json
{
"name": "memory_store",
"type": "vector",
"operation": "upsert",
"index": "assistant_memory",
"input": {
"id": { "type": "string", "required": true },
"vector": { "type": "array", "required": true },
"metadata": { "type": "object", "required": false }
},
"output": {
"id": { "type": "string" },
"added": { "type": "boolean" }
}
}
```
**search** — find top-k similar vectors by cosine similarity:
```json
{
"name": "memory_search",
"type": "vector",
"operation": "search",
"index": "assistant_memory",
"top_k": 5,
"input": {
"vector": { "type": "array", "required": true },
"k": { "type": "number", "required": false }
},
"output": {
"results": { "type": "array" },
"count": { "type": "number" }
}
}
```
Each result in `results` is `{id, score, metadata}` where `score` is cosine similarity (0–1).
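For reference, cosine similarity between two vectors can be computed as in the sketch below. This is the textbook formula, not springg's code; `cosineSimilarity` is a hypothetical helper. In general the value ranges from -1 to 1, and how springg maps it onto the 0–1 range quoted above is not stated here.

```go
package main

import (
	"fmt"
	"math"
)

// cosineSimilarity returns dot(a, b) / (|a| * |b|).
// It reports an error for empty, mismatched, or zero-length vectors.
func cosineSimilarity(a, b []float32) (float64, error) {
	if len(a) == 0 || len(a) != len(b) {
		return 0, fmt.Errorf("vectors must be non-empty and equal length (got %d and %d)", len(a), len(b))
	}
	var dot, normA, normB float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		normA += x * x
		normB += y * y
	}
	if normA == 0 || normB == 0 {
		return 0, fmt.Errorf("cosine similarity is undefined for a zero vector")
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB)), nil
}

func main() {
	score, err := cosineSimilarity([]float32{1, 2, 3}, []float32{2, 4, 6})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%.3f\n", score) // 1.000 (parallel vectors)
}
```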
**get** — fetch a stored vector by ID:
```json
{
"name": "memory_get",
"type": "vector",
"operation": "get",
"index": "assistant_memory",
"input": {
"id": { "type": "string", "required": true }
},
"output": {
"id": { "type": "string" },
"vector": { "type": "array" },
"metadata": { "type": "object" }
}
}
```
**delete** — remove a vector by ID:
```json
{
"name": "memory_delete",
"type": "vector",
"operation": "delete",
"index": "assistant_memory",
"input": {
"id": { "type": "string", "required": true }
}
}
```
**create_index** / **delete_index** — manage indexes (typically run once at setup):
```json
{
"name": "memory_init",
"type": "vector",
"operation": "create_index",
"index": "assistant_memory",
"dimensions": 1536
}
```
- `index` — name of the springg index (required on all operations)
- `dimensions` — vector size; must match your embedding model (required for `create_index`)
- `top_k` — default number of results for `search` (default: 10); overridable per-call via `input.k`
- Vectors are stored as 32-bit floats; pass any JSON number array as `input.vector`
### Embedding Node
Generates embedding vectors from text. Pair with the `vector` node to build RAG pipelines.
```json
{
"name": "embed_text",
"type": "embedding",
"provider": "openai",
"model": "text-embedding-3-small",
"input": {
"text": { "type": "string", "required": true }
},
"output": {
"vector": { "type": "array" },
"dimensions": { "type": "number" },
"model": { "type": "string" }
}
}
```
- **providers:** `openai` (default model: `text-embedding-3-small`), `ollama` (model required, e.g. `nomic-embed-text`)
- Output `vector` can be passed directly to a `vector` node's `upsert` or `search` input
---
### Scaffold Node
The `scaffold` type lets lcdata **create new nodes at runtime** — it can build itself. Combined with an LLM node and the built-in `design_node` pipeline, you can describe a new capability in plain English and have it live in the engine within seconds.
Operations: `list`, `read`, `create`, `delete`.
```json
{
"name": "scaffold_list",
"type": "scaffold",
"operation": "list"
}
```
```json
{
"name": "scaffold_create",
"type": "scaffold",
"operation": "create",
"input": {
"name": { "type": "string", "required": true },
"config": { "type": "string", "required": true },
"system_prompt": { "type": "string" }
}
}
```
**`list`** — returns `{nodes: [...summaries], count}`. No inputs required.
**`read`** — input: `name`. Returns `{name, path, config (string), object (parsed)}`.
**`create`** — inputs: `name`, `config` (JSON string or object — validated before writing), optional `system_prompt` (written as `system.md`). Returns `{name, path, created: true}`. The hot-reload watcher picks up the new node within 200ms automatically.
**`delete`** — input: `name`. Removes the node directory. Hot-reload deregisters it.
#### Self-building: `design_node` pipeline
The built-in `design_node` pipeline takes a natural language description and creates a working node:
```bash
curl -X POST http://localhost:8080/api/nodes/design_node/run \
-H "Content-Type: application/json" \
-d '{
"input": {
"name": "sentiment_score",
"description": "Classify the sentiment of a text as positive, negative, or neutral with a confidence score"
}
}'
```
Internally it chains: `scaffold_list` → `scaffold_designer_llm` (Claude with full schema reference) → `scaffold_create`. The new node is live as soon as the response returns.
The `scaffold_designer_llm` node uses a comprehensive system prompt (`nodes/scaffold_designer_llm/system.md`) covering every node type, field, template syntax, naming conventions, and design guidelines so the LLM generates configs that pass validation on the first attempt.
**Provider note:** `design_node` defaults to `anthropic` / `claude-opus-4-5` for the designer LLM. You can edit `nodes/scaffold_designer_llm/scaffold_designer_llm.json` to switch to any other provider/model.
---
## Pipelines
Pipelines wire nodes together. Each step's output is available to all subsequent steps via `{{.step_id.field}}`. The pipeline's `output` block defines what gets returned to the caller.
```json
{
"name": "my_pipeline",
"type": "pipeline",
"steps": [ ... ],
"input": { ... },
"output": {
"answer": "{{.final_step.response}}",
"items": "{{.gather.results}}"
}
}
```
### Sequential Step
```json
{
"id": "summarize",
"node": "summarize",
"input": {
"message": "{{.fetch.body}}"
}
}
```
### Error Handling
Any step can specify `on_error` to run a fallback node instead of aborting the pipeline. The handler node receives `input.error` and `input.step_id`; its output replaces the failed step's output.
```json
{
"id": "fetch",
"node": "fetch_url",
"input": { "url": "{{.input.url}}" },
"on_error": "fallback_handler"
}
```
### Switch (Conditional Branching)
Routes to a different node based on a runtime value. LLM outputs like `{"intent": "search"}` are automatically normalized to `"search"`.
```json
{
"id": "route",
"switch": "{{.classify.intent}}",
"cases": {
"search": { "node": "web_search", "input": { "query": "{{.input.message}}" } },
"chat": { "node": "llm_chat", "input": { "message": "{{.input.message}}" } },
"default": { "node": "llm_chat", "input": { "message": "{{.input.message}}" } }
}
}
```
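One plausible way to implement the normalization and `default` fallback is sketched below. The exact rules lcdata applies are not documented here, so `normalizeSwitchValue` and `selectCase` are hypothetical helpers that only cover the single-field JSON object case mentioned above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// normalizeSwitchValue reduces a rendered switch expression to a plain case key.
// A JSON object with exactly one string field (e.g. {"intent": "search"}) is
// unwrapped to that string; anything else is trimmed of whitespace and quotes.
func normalizeSwitchValue(raw string) string {
	s := strings.TrimSpace(raw)
	var obj map[string]any
	if err := json.Unmarshal([]byte(s), &obj); err == nil && len(obj) == 1 {
		for _, v := range obj {
			if str, ok := v.(string); ok {
				return str
			}
		}
	}
	return strings.Trim(s, `"`)
}

// selectCase picks the node for the normalized value, falling back to "default".
func selectCase(raw string, cases map[string]string) (string, bool) {
	if node, ok := cases[normalizeSwitchValue(raw)]; ok {
		return node, true
	}
	node, ok := cases["default"]
	return node, ok
}

func main() {
	cases := map[string]string{"search": "web_search", "chat": "llm_chat", "default": "llm_chat"}
	node, _ := selectCase(`{"intent": "search"}`, cases)
	fmt.Println(node) // web_search
}
```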
### Parallel
All branches run concurrently. Outputs are namespaced: `{{.gather.web.results}}`, `{{.gather.db.rows}}`.
```json
{
"id": "gather",
"parallel": [
{ "id": "web", "node": "web_search", "input": { "query": "{{.input.topic}}" } },
{ "id": "db", "node": "db_lookup", "input": { "term": "{{.input.topic}}" } }
]
}
```
### Loop
Repeats inner steps until a condition is true or `max_iterations` is reached. Each iteration shares the same run context so later iterations can reference earlier ones.
```json
{
"id": "refine",
"loop": {
"max_iterations": 5,
"until": "{{gt (toFloat .evaluate.score) 0.8}}",
"steps": [
{ "id": "draft", "node": "llm_writer", "input": { "topic": "{{.input.topic}}" } },
{ "id": "evaluate", "node": "llm_evaluator", "input": { "draft": "{{.draft.text}}" } }
]
}
}
```
### Map (Fan-out)
Runs a node once per item in an array. Results are collected into a new context key.
```json
{
"id": "fetch_all",
"map": {
"over": "{{.search.results}}",
"as": "result",
"node": "fetch_url",
"concurrency": 3,
"input": {
"url": "{{.result.url}}"
},
"collect_as": "pages"
}
}
```
- `as` names the current item for use in `input` templates
- `collect_as` sets the context key where the result array lands
- `concurrency` controls how many items run in parallel (default: 1 = sequential)
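The fan-out behaviour can be pictured as a bounded worker pattern. The sketch below is an assumption about the general technique (a semaphore channel plus `sync.WaitGroup`), not lcdata's actual `flow.go`; `mapConcurrent` is a hypothetical name. Results keep the order of the input array regardless of completion order.

```go
package main

import (
	"fmt"
	"sync"
)

// mapConcurrent applies fn to every item with at most `concurrency` calls in
// flight at once. A concurrency below 1 is treated as 1 (sequential).
func mapConcurrent[T, R any](items []T, concurrency int, fn func(T) (R, error)) ([]R, error) {
	if concurrency < 1 {
		concurrency = 1
	}
	results := make([]R, len(items))
	errs := make([]error, len(items))
	sem := make(chan struct{}, concurrency) // counting semaphore
	var wg sync.WaitGroup

	for i, item := range items {
		wg.Add(1)
		sem <- struct{}{} // blocks while `concurrency` workers are busy
		go func(i int, item T) {
			defer wg.Done()
			defer func() { <-sem }()
			results[i], errs[i] = fn(item) // each goroutine owns index i
		}(i, item)
	}
	wg.Wait()

	for i, err := range errs {
		if err != nil {
			return nil, fmt.Errorf("item %d: %w", i, err)
		}
	}
	return results, nil
}

func main() {
	urls := []string{"a", "b", "c"}
	pages, err := mapConcurrent(urls, 3, func(u string) (string, error) {
		return "page:" + u, nil
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(pages) // [page:a page:b page:c]
}
```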
---
## Run Context
All steps in a run share a thread-safe `RunContext`. Steps read via templates and write by returning their output fields. Keys are namespaced by step ID.
```
input.message → user-provided input
classify.intent → written by the "classify" step
gather.web.results → written by branch "web" inside parallel step "gather"
fetch_all.0 → written by map step "fetch_all" for item 0
pages → the collected array from a map step's collect_as
```
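A minimal sketch of a thread-safe context with dotted-path lookup is shown below, assuming outputs are stored as nested `map[string]any` values keyed by step ID. The type and method names are hypothetical, and the sketch resolves map keys only (not array indexes such as `fetch_all.0`).

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// runContext is an illustrative stand-in for a shared, thread-safe run context.
type runContext struct {
	mu   sync.RWMutex
	data map[string]any
}

// newRunContext seeds the context with the caller's input under "input".
func newRunContext(input map[string]any) *runContext {
	return &runContext{data: map[string]any{"input": input}}
}

// set stores a step's output under its step ID.
func (c *runContext) set(stepID string, output any) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.data[stepID] = output
}

// get walks a dotted path such as "gather.web.results" through nested maps.
// The returned value is shared with the context, not copied.
func (c *runContext) get(path string) (any, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	var cur any = c.data
	for _, key := range strings.Split(path, ".") {
		obj, ok := cur.(map[string]any)
		if !ok {
			return nil, false
		}
		if cur, ok = obj[key]; !ok {
			return nil, false
		}
	}
	return cur, true
}

func main() {
	rc := newRunContext(map[string]any{"message": "hello"})
	rc.set("classify", map[string]any{"intent": "search"})
	intent, _ := rc.get("classify.intent")
	fmt.Println(intent) // search
}
```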
### Template Functions
| Function | Usage |
|----------|-------|
| `{{.step.field}}` | Access any step output |
| `{{.input.field}}` | Access run inputs |
| `{{toJSON .value}}` | Marshal to JSON string |
| `{{fromJSON .str}}` | Parse JSON string |
| `{{toFloat .value}}` | Convert to float64 |
| `{{toInt .value}}` | Convert to int |
| `{{default val fallback}}` | Use fallback if val is empty/nil |
| `{{join arr sep}}` | Join string array with separator |
| `{{gt a b}}` / `{{lt a b}}` | Comparison (for loop conditions) |
Simple path references like `{{.step.field}}` preserve the original type (array, map, number). Complex templates like `"https://{{.host}}/{{.path}}"` produce strings.
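The distinction between simple references and complex templates could be implemented roughly as follows. This is a sketch under assumptions: the regular expression, the `resolve` helper, and the nested-map context are illustrative, not taken from lcdata's `context.go`.

```go
package main

import (
	"bytes"
	"fmt"
	"regexp"
	"strings"
	"text/template"
)

// simpleRef matches a template that is nothing but a single dotted path,
// e.g. "{{.search.results}}".
var simpleRef = regexp.MustCompile(`^\{\{\s*\.([A-Za-z0-9_]+(?:\.[A-Za-z0-9_]+)*)\s*\}\}$`)

// resolve returns the referenced value unchanged for simple path references
// and a rendered string for anything more complex.
func resolve(expr string, ctx map[string]any) (any, error) {
	if m := simpleRef.FindStringSubmatch(expr); m != nil {
		var cur any = ctx
		for _, key := range strings.Split(m[1], ".") {
			obj, ok := cur.(map[string]any)
			if !ok {
				return nil, fmt.Errorf("cannot look up %q in a non-object value", key)
			}
			if cur, ok = obj[key]; !ok {
				return nil, fmt.Errorf("key %q not found", key)
			}
		}
		return cur, nil // original type preserved
	}
	tmpl, err := template.New("expr").Parse(expr)
	if err != nil {
		return nil, err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, ctx); err != nil {
		return nil, err
	}
	return buf.String(), nil
}

func main() {
	ctx := map[string]any{
		"search": map[string]any{"results": []any{"a", "b"}},
		"host":   "example.com",
		"path":   "docs",
	}
	v, _ := resolve("{{.search.results}}", ctx)
	fmt.Printf("%T\n", v) // []interface {}
	s, _ := resolve("https://{{.host}}/{{.path}}", ctx)
	fmt.Printf("%T %v\n", s, s) // string https://example.com/docs
}
```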
---
## Built-in Nodes
The `nodes/` directory ships with these ready-to-use nodes:
| Node | Type | What it does |
|------|------|-------------|
| `llm_chat` | llm | Streaming chat with Claude (sonnet-4-6) |
| `classify_intent` | llm | Classifies input as: search, database, chat, command |
| `classify_sentiment` | llm | Returns `{sentiment, confidence, explanation}` |
| `extract_entities` | llm | Returns `{people, organizations, locations, dates, topics}` |
| `summarize` | llm | Condenses text to 2-4 sentences |
| `translate` | llm | Translates to any language |
| `fetch_url` | http | Fetches a URL, strips HTML to plain text |
| `web_search` | search | Brave web search → `[{title, url, description}]` |
| `read_file` | file | Reads a file from disk |
| `write_file` | file | Writes a file to disk |
| `research_pipeline` | pipeline | web_search → fetch pages → summarize each → synthesize answer |
| `smart_assistant` | pipeline | Classify intent → route to research, translate, or chat |
| `analyze_document` | pipeline | Read file → parallel analysis → compose report → write file |
### Pipeline: `research_pipeline`
```
web_search
  └── map: fetch_url (concurrency: 3)
        └── map: summarize (concurrency: 3)
              └── llm_chat (synthesize)
```
Returns `{response, search_results, summaries}`.
### Pipeline: `smart_assistant`
```
classify_intent
  └── switch on intent:
        "search"    → research_pipeline
        "translate" → translate
        "default"   → llm_chat
```
Returns `{response, intent}`.
### Pipeline: `analyze_document`
```
read_file
  └── parallel:
        ├── summarize
        ├── extract_entities
        └── classify_sentiment
  └── llm_chat (compose report)
  └── write_file
```
Returns `{report, output_path, summary, sentiment}`.
---
## API
### Discovery
```
GET /api/nodes → list all nodes with descriptions and I/O schemas
GET /api/nodes/{name} → full node spec
GET /api/health → health check
GET /api/info → server version and capabilities
```
### Execution
```
POST /api/nodes/{name}/run → synchronous, waits for full result
POST /api/nodes/{name}/stream → Server-Sent Events, streams as it runs
GET /ws/nodes/{name} → WebSocket, bidirectional streaming
POST /api/nodes/{name}/audio → multipart upload: POST audio file, runs node with audio_url set
```
**Audio upload** — `multipart/form-data` with fields:
- `audio` (required) — audio file (WAV, MP3, OGG, FLAC, M4A, WebM)
- `env` (optional) — environment name (default: `"default"`)
The file is saved to a temp path and passed as `input.audio_url` to the node. Works with any `stt` node; `whisper-cpp` reads it directly without an HTTP round-trip.
**Request body:**
```json
{
"input": { "message": "What is the capital of France?" },
"env": "default"
}
```
**Response:**
```json
{
"run_id": "a3f9b2c1",
"node": "smart_assistant",
"status": "completed",
"output": {
"response": "The capital of France is Paris.",
"intent": "chat"
},
"steps": [
{ "id": "classify", "node": "classify_intent", "status": "completed", "duration_ms": 180 },
{ "id": "handle", "node": "llm_chat", "status": "completed", "duration_ms": 620 }
],
"duration_ms": 800
}
```
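A Go client for the synchronous endpoint might look like the sketch below. The struct names, the `runNode` helper, and the assumption that a successful run returns HTTP 200 are illustrative; only the URL path and JSON field names come from the request and response shown above. The `main` function talks to an in-process stand-in server so the sketch runs without a live lcdata instance.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

type runRequest struct {
	Input map[string]any `json:"input"`
	Env   string         `json:"env,omitempty"`
}

type runResponse struct {
	RunID      string         `json:"run_id"`
	Node       string         `json:"node"`
	Status     string         `json:"status"`
	Output     map[string]any `json:"output"`
	DurationMS int64          `json:"duration_ms"`
}

// decodeRunResponse parses the JSON body returned by the run endpoint.
func decodeRunResponse(data []byte) (*runResponse, error) {
	var out runResponse
	if err := json.Unmarshal(data, &out); err != nil {
		return nil, err
	}
	return &out, nil
}

// runNode POSTs to /api/nodes/{name}/run and waits for the full result.
// An empty token skips the Authorization header.
func runNode(client *http.Client, baseURL, token, node string, input map[string]any) (*runResponse, error) {
	payload, err := json.Marshal(runRequest{Input: input, Env: "default"})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/nodes/"+node+"/run", bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	if resp.StatusCode != http.StatusOK { // assumed success code
		return nil, fmt.Errorf("run failed: %s: %s", resp.Status, body)
	}
	return decodeRunResponse(body)
}

func main() {
	// Stand-in server that replays the sample response from this README.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"run_id":"a3f9b2c1","node":"smart_assistant","status":"completed","output":{"response":"The capital of France is Paris.","intent":"chat"},"duration_ms":800}`)
	}))
	defer srv.Close()

	res, err := runNode(srv.Client(), srv.URL, "", "smart_assistant",
		map[string]any{"message": "What is the capital of France?"})
	if err != nil {
		panic(err)
	}
	fmt.Println(res.Status, res.Output["response"])
}
```

Against a real server, pass `http://localhost:8080` as `baseURL` and a token from `lcdata generate-jwt` when JWT auth is enabled.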
### Run Management
```
GET /api/runs → list recent runs
GET /api/runs/{id} → get run status and full result
POST /api/runs/{id}/cancel → cancel an in-progress run
```
### Streaming Events
All streaming connections (WebSocket and SSE) receive the same event stream:
```json
{"event":"run_started", "run_id":"abc", "node":"smart_assistant"}
{"event":"step_started", "run_id":"abc", "step_id":"classify", "node":"classify_intent"}
{"event":"step_completed", "run_id":"abc", "step_id":"classify", "output":{"intent":"search"}, "duration_ms":180}
{"event":"step_started", "run_id":"abc", "step_id":"handle", "node":"research_pipeline"}
{"event":"chunk", "run_id":"abc", "step_id":"synthesize","data":"Based on the search results..."}
{"event":"map_progress", "run_id":"abc", "step_id":"fetch_all", "progress":3, "total":10}
{"event":"step_completed", "run_id":"abc", "step_id":"handle", "output":{...}, "duration_ms":4200}
{"event":"run_completed", "run_id":"abc", "output":{...}, "duration_ms":4380}
```
**Event types:** `run_started`, `run_completed`, `run_failed`, `run_cancelled`, `step_started`, `step_completed`, `step_failed`, `chunk`, `loop_iteration`, `map_progress`, `retry`
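A client can decode these events into a single struct and dispatch on the `event` field. The sketch below assumes one JSON object per line, optionally prefixed with `data:` as in standard Server-Sent Events framing; the wire framing lcdata uses is not spelled out above, and the `event` struct and helper names are hypothetical.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// event mirrors the fields that appear in the sample stream above.
type event struct {
	Event      string          `json:"event"`
	RunID      string          `json:"run_id"`
	StepID     string          `json:"step_id,omitempty"`
	Node       string          `json:"node,omitempty"`
	Data       any             `json:"data,omitempty"` // a string in the chunk sample
	Output     json.RawMessage `json:"output,omitempty"`
	DurationMS int64           `json:"duration_ms,omitempty"`
	Progress   int             `json:"progress,omitempty"`
	Total      int             `json:"total,omitempty"`
}

// parseEventLine decodes one line. It returns ok=false for blank lines and
// for lines that do not carry a JSON object (for example SSE comments).
func parseEventLine(line string) (event, bool, error) {
	var ev event
	line = strings.TrimSpace(line)
	if rest, found := strings.CutPrefix(line, "data:"); found {
		line = strings.TrimSpace(rest)
	}
	if !strings.HasPrefix(line, "{") {
		return ev, false, nil
	}
	if err := json.Unmarshal([]byte(line), &ev); err != nil {
		return ev, false, err
	}
	return ev, true, nil
}

// readEvents scans r line by line and passes each decoded event to handle.
func readEvents(r io.Reader, handle func(event)) error {
	sc := bufio.NewScanner(r)
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024) // allow large step outputs
	for sc.Scan() {
		ev, ok, err := parseEventLine(sc.Text())
		if err != nil {
			return err
		}
		if ok {
			handle(ev)
		}
	}
	return sc.Err()
}

func main() {
	stream := strings.NewReader(`data: {"event":"run_started","run_id":"abc","node":"smart_assistant"}

data: {"event":"chunk","run_id":"abc","step_id":"synthesize","data":"Based on the search results..."}
`)
	err := readEvents(stream, func(ev event) {
		if ev.Event == "chunk" {
			fmt.Println(ev.Data) // Based on the search results...
		}
	})
	if err != nil {
		panic(err)
	}
}
```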
---
## Configuration
### Server Config (`lcdata.json`)
Created automatically on first run.
```json
{
"port": 8080,
"jwt_secret": "change-this-in-production",
"require_jwt": true,
"nodes_path": "./nodes",
"env": "default",
"log_level": "info",
"max_concurrent_runs": 10,
"run_timeout": "5m",
"run_history": 100,
"store_path": "./lcdata.db",
"rate_limit_rps": 0,
"rate_limit_burst": 0
}
```
- **`store_path`** — SQLite file for run history persistence (created automatically)
- **`rate_limit_rps`** — requests per second per JWT `sub` claim (0 = disabled); falls back to remote IP when no JWT is present
- **`rate_limit_burst`** — bucket size for bursts (default: `rps * 2`)
### Credentials (`~/lcdataenv.json`)
Lookup order: `~/lcdataenv.json` → `./nodes/env.json`. All fields also fall back to environment variables.
```json
{
"environments": {
"default": {
"anthropicKey": "sk-ant-...",
"ollamaEndpoint": "http://localhost:11434",
"openaiKey": "sk-...",
"braveKey": "BSA...",
"searxngEndpoint": "http://localhost:8888",
"elevenlabsKey": "",
"deepgramKey": "",
"whisperCppBin": "/usr/local/bin/whisper-cli",
"whisperCppModel": "/path/to/ggml-base.en.bin",
"piperBin": "/usr/local/bin/piper",
"springgEndpoint": "http://localhost:8181",
"springgKey": "",
"dbConnections": {
"main": "postgres://user:pass@localhost:5432/mydb"
}
},
"production": {
"anthropicKey": "sk-ant-...",
"dbConnections": {
"main": "postgres://user:pass@prod:5432/mydb?sslmode=require"
}
}
}
}
```
**Environment variable fallbacks:**
| Config key | Env var |
|-----------|---------|
| `anthropicKey` | `ANTHROPIC_API_KEY` |
| `openaiKey` | `OPENAI_API_KEY` |
| `ollamaEndpoint` | `OLLAMA_ENDPOINT` |
| `braveKey` | `BRAVE_API_KEY` |
| `searxngEndpoint` | `SEARXNG_ENDPOINT` |
| `elevenlabsKey` | `ELEVENLABS_API_KEY` |
| `deepgramKey` | `DEEPGRAM_API_KEY` |
| `whisperCppBin` | `WHISPER_CPP_BIN` |
| `whisperCppModel` | `WHISPER_CPP_MODEL` |
| `piperBin` | `PIPER_BIN` |
| `springgEndpoint` | `SPRINGG_ENDPOINT` |
| `springgKey` | `SPRINGG_KEY` |
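The fallback rule (config value first, then environment variable) can be sketched as below for three of the keys. The struct and function names are hypothetical, and `getenv` is passed in so the lookup can be exercised without touching the real process environment.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// environment holds a subset of the credential fields shown above.
type environment struct {
	AnthropicKey   string `json:"anthropicKey"`
	OpenAIKey      string `json:"openaiKey"`
	OllamaEndpoint string `json:"ollamaEndpoint"`
}

type envFile struct {
	Environments map[string]environment `json:"environments"`
}

// firstNonEmpty returns value unless it is empty, in which case fallback wins.
func firstNonEmpty(value, fallback string) string {
	if value != "" {
		return value
	}
	return fallback
}

// loadEnvironment selects the named environment from the JSON document and
// fills empty fields from environment variables via getenv.
func loadEnvironment(data []byte, name string, getenv func(string) string) (environment, error) {
	var f envFile
	if err := json.Unmarshal(data, &f); err != nil {
		return environment{}, err
	}
	env, ok := f.Environments[name]
	if !ok {
		return environment{}, fmt.Errorf("environment %q not found", name)
	}
	env.AnthropicKey = firstNonEmpty(env.AnthropicKey, getenv("ANTHROPIC_API_KEY"))
	env.OpenAIKey = firstNonEmpty(env.OpenAIKey, getenv("OPENAI_API_KEY"))
	env.OllamaEndpoint = firstNonEmpty(env.OllamaEndpoint, getenv("OLLAMA_ENDPOINT"))
	return env, nil
}

func main() {
	raw := []byte(`{"environments":{"default":{"anthropicKey":"sk-ant-example"}}}`)
	env, err := loadEnvironment(raw, "default", os.Getenv)
	if err != nil {
		panic(err)
	}
	fmt.Println(env.AnthropicKey) // sk-ant-example
}
```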
---
## CLI
```
lcdata serve start the HTTP + WebSocket server
lcdata init [name] [type] scaffold a new node directory
lcdata list list all nodes
lcdata show [name] show full node config
lcdata run [name] --input key=val --env prod run a node locally (no server)
lcdata run [name] --input message=- read one input value from stdin
lcdata validate validate all node configs
lcdata graph [name] print execution tree with icons
lcdata generate-jwt --client my-service generate a signed JWT
lcdata generate-jwt --client svc --allow node1,node2 JWT scoped to specific nodes
lcdata version show version
```
**Graph output example:**
```
◼ analyze_document (pipeline)
├── [read]
│ ▤ read_file (file)
├── [analyze] parallel (3 branches)
│ ├── branch "summary"
│ │ ◆ summarize (llm)
│ ├── branch "entities"
│ │ ◆ extract_entities (llm)
│ └── branch "sentiment"
│ ◆ classify_sentiment (llm)
├── [compose]
│ ◆ llm_chat (llm)
└── [write]
▤ write_file (file)
```
**Node type icons:** ◆ llm · ◈ http · ⊕ search · ▤ file · ▶ command · ▣ database · ◇ transform · ◼ pipeline · ◎ stt · ◉ tts
---
## Auth
JWT authentication is enabled by default. Disable with `"require_jwt": false` in `lcdata.json`.
Generate a token:
```bash
lcdata generate-jwt --client my-service
```
Use in requests:
```
Authorization: Bearer <token>
```
---
## Operational Features
### Hot Reload
The server watches the `nodes/` directory with fsnotify. Adding, editing, or removing a node config takes effect within 200ms — no restart required. Discovery endpoints always reflect the current state.
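The project structure lists the watcher as debounced at 200ms. A generic debounce can be sketched with `time.AfterFunc` as below; the fsnotify wiring is omitted, and `debouncer` and `countReloads` are hypothetical names for this illustration only.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// debouncer delays a callback until no new trigger has arrived for `delay`.
type debouncer struct {
	mu    sync.Mutex
	delay time.Duration
	timer *time.Timer
}

// trigger (re)starts the timer; only the last trigger in a burst fires fn.
func (d *debouncer) trigger(fn func()) {
	d.mu.Lock()
	defer d.mu.Unlock()
	if d.timer != nil {
		d.timer.Stop()
	}
	d.timer = time.AfterFunc(d.delay, fn)
}

// countReloads simulates a burst of file events and reports how many times
// the reload callback actually ran.
func countReloads(events int, delayMS int) int {
	delay := time.Duration(delayMS) * time.Millisecond
	d := &debouncer{delay: delay}
	var reloads atomic.Int32
	for i := 0; i < events; i++ {
		d.trigger(func() { reloads.Add(1) })
	}
	time.Sleep(4 * delay) // wait long enough for the final timer to fire
	return int(reloads.Load())
}

func main() {
	fmt.Println("reloads after 5 rapid events:", countReloads(5, 200))
}
```

A burst of writes from an editor saving a node config would collapse into a single reload in this model.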
### Run Persistence
Completed runs are persisted to SQLite (`store_path` in config). The `/api/runs` endpoint returns the most recent N runs (`run_history`), merging in-flight runs with persisted ones.
### Cost Tracking
LLM nodes report `input_tokens` and `output_tokens` in their output under a `usage` key. The runner aggregates token counts across all steps and exposes them on the run record:
```json
{
"run_id": "a3f9b2c1",
"input_tokens": 1240,
"output_tokens": 387,
"steps": [
{ "id": "classify", "input_tokens": 320, "output_tokens": 12 },
{ "id": "answer", "input_tokens": 920, "output_tokens": 375 }
]
}
```
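The aggregation amounts to summing per-step counts. A minimal sketch, using hypothetical `stepUsage` and `runUsage` types whose JSON tags follow the record above:

```go
package main

import (
	"encoding/json"
	"fmt"
)

type stepUsage struct {
	ID           string `json:"id"`
	InputTokens  int    `json:"input_tokens"`
	OutputTokens int    `json:"output_tokens"`
}

type runUsage struct {
	RunID        string      `json:"run_id"`
	InputTokens  int         `json:"input_tokens"`
	OutputTokens int         `json:"output_tokens"`
	Steps        []stepUsage `json:"steps"`
}

// aggregateUsage totals input and output tokens across all steps of a run.
func aggregateUsage(runID string, steps []stepUsage) runUsage {
	total := runUsage{RunID: runID, Steps: steps}
	for _, s := range steps {
		total.InputTokens += s.InputTokens
		total.OutputTokens += s.OutputTokens
	}
	return total
}

func main() {
	usage := aggregateUsage("a3f9b2c1", []stepUsage{
		{ID: "classify", InputTokens: 320, OutputTokens: 12},
		{ID: "answer", InputTokens: 920, OutputTokens: 375},
	})
	out, err := json.MarshalIndent(usage, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // totals: 1240 input tokens, 387 output tokens
}
```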
### Retry
Nodes that fail due to transient errors (API timeouts, rate limits) retry automatically with exponential backoff and ±25% jitter. Configure per-node:
```json
{
"retry_count": 3,
"retry_delay": "1s"
}
```
`retry` events are emitted on each attempt so streaming clients can observe them.
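The backoff schedule can be sketched as follows. The doubling factor, the reading of `retry_count` as the number of retries after the first attempt, and the helper names are assumptions of this example; only "exponential backoff with ±25% jitter" comes from the description above.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand/v2"
	"time"
)

// errTransient stands in for a retryable API failure.
var errTransient = errors.New("transient failure")

// backoffDelay returns base * 2^attempt scaled by a factor in [0.75, 1.25).
// rnd must return a value in [0, 1), such as rand.Float64.
func backoffDelay(base time.Duration, attempt int, rnd func() float64) time.Duration {
	delay := base << attempt
	jitter := 1 + (rnd()*2-1)*0.25
	return time.Duration(float64(delay) * jitter)
}

// retry calls fn until it succeeds or retryCount retries have been used.
// sleep is injected so callers can observe or skip the waits.
func retry(retryCount int, base time.Duration, sleep func(time.Duration), fn func() error) error {
	for attempt := 0; ; attempt++ {
		err := fn()
		if err == nil {
			return nil
		}
		if attempt >= retryCount {
			return fmt.Errorf("gave up after %d attempts: %w", attempt+1, err)
		}
		sleep(backoffDelay(base, attempt, rand.Float64))
	}
}

// noSleep skips waiting; handy when exercising the retry logic.
func noSleep(time.Duration) {}

func main() {
	calls := 0
	err := retry(3, time.Second, func(d time.Duration) {
		fmt.Println("would wait", d) // real code would call time.Sleep(d)
	}, func() error {
		calls++
		if calls < 3 {
			return errTransient
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err) // calls: 3 err: <nil>
}
```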
---
## Project Structure
```
lcdata/
  main.go
  go.mod
  lcdata.json.example
  lcdataenv.json.example
  DESIGN.md
  cmd/
    root.go               Cobra root + global flags
    serve.go              HTTP server, WebSocket, SSE, JWT middleware, rate limiting
    init.go               lcdata init (scaffold node directory)
    list.go               lcdata list
    show.go               lcdata show
    run.go                lcdata run (stdin support via key=-)
    validate.go           lcdata validate
    graph.go              lcdata graph (ASCII tree)
    jwt.go                lcdata generate-jwt (with --allow node scoping)
    version.go            lcdata version
  internal/lcdata/
    config.go             Server config (lcdata.json)
    environment.go        Credentials config (lcdataenv.json)
    node.go               Node struct, JSON loading, field schema, input validation
    pipeline.go           Step, SwitchCase, LoopConfig, MapConfig types
    runner.go             Run lifecycle, async execution, node hot-swap
    watcher.go            fsnotify hot reload (debounced, 200ms)
    store.go              SQLite run history persistence
    retry.go              Exponential backoff with ±25% jitter
    context.go            RunContext, template rendering, type preservation
    stream.go             Event types, Run struct, StepResult (with token counts)
    flow.go               Pipeline execution, switch/parallel/loop/map, on_error
    executor.go           Per-type dispatch with retry wrapper
    executor_llm.go       Anthropic (tool use loop), Ollama, OpenAI (SSE streaming)
    executor_http.go      HTTP requests + HTML stripping
    executor_search.go    Brave API + SearXNG
    executor_file.go      File read/write/append/exists/delete/list
    executor_cmd.go       Command execution with streaming stdout
    executor_xfm.go       Transform (Go template rendering)
    executor_db.go        Database — SQLite + Postgres via database/sql
    executor_stt.go       STT — Deepgram, OpenAI Whisper, whisper.cpp (local+file path)
    executor_tts.go       TTS — ElevenLabs, OpenAI, Piper (local, returns base64 audio)
    executor_vector.go    Vector store — springg (upsert, search, get, delete, create/delete index)
    executor_embed.go     Embeddings — OpenAI, Ollama (returns float64 vector)
  nodes/
    llm_chat/
    classify_intent/
    classify_sentiment/
    extract_entities/
    summarize/
    translate/
    fetch_url/
    web_search/
    read_file/
    write_file/
    research_pipeline/
    smart_assistant/
    analyze_document/
```
---
## Dependencies
```
github.com/anthropics/anthropic-sdk-go v0.2.0-alpha.4
github.com/fsnotify/fsnotify v1.9.0
github.com/go-chi/chi/v5 v5.2.3
github.com/go-chi/cors v1.2.2
github.com/golang-jwt/jwt/v5 v5.3.0
github.com/gorilla/websocket v1.5.3
github.com/lib/pq v1.10.9
github.com/spf13/cobra v1.8.0
modernc.org/sqlite v1.37.0
```
Go 1.23+ (uses `log/slog`, available since Go 1.21, and `math/rand/v2`, available since Go 1.22)
---
## Comparison
| | lcdata | LangChain | n8n |
|---|---|---|---|
| Config format | JSON files | Python code | Visual UI |
| Add new agent | Drop a folder | Write a class | Drag nodes |
| Streaming | All node types, unified events | Per-chain, varies | Limited |
| Self-describing API | `/api/nodes` live registry | No | Workflow export |
| Flow control | switch/parallel/loop/map in JSON | Graph edges in code | Node connections |
| Provider switch | Change one JSON field | Change class import | Reconfigure credential |
| Deploy | Single Go binary | Python env + deps | Node.js app |
| CLI mode | `lcdata run name` | Script the chain | No |