{"id":46079825,"url":"https://github.com/tjamescouch/gro","last_synced_at":"2026-03-01T15:13:24.424Z","repository":{"id":338182290,"uuid":"1156829355","full_name":"tjamescouch/gro","owner":"tjamescouch","description":"Provider-agnostic LLM CLI wrapper (claude/openai/gemini)","archived":false,"fork":false,"pushed_at":"2026-02-27T15:54:28.000Z","size":2091,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-02-27T20:20:23.048Z","etag":null,"topics":["agent-framework","agent-runtime","ai-agents","ai-infrastructure","ai-runtime","autonomous-agents","context-management","llm","llm-agents","llm-runtime","long-context","mcp","model-context-protocol","multi-agent"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tjamescouch.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":"AUDIT.md","citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-02-13T05:11:49.000Z","updated_at":"2026-02-27T15:54:32.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/tjamescouch/gro","commit_stats":null,"previous_names":["tjamescouch/gro"],"tags_count":16,"template":false,"template_full_name":null,"purl":"pkg:github/tjamescouch/gro","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tjamescouch%2Fgro","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tjamescouch%2Fgro/tags","releases_url":"https://re
pos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tjamescouch%2Fgro/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tjamescouch%2Fgro/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tjamescouch","download_url":"https://codeload.github.com/tjamescouch/gro/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tjamescouch%2Fgro/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29939299,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-28T13:49:17.081Z","status":"ssl_error","status_checked_at":"2026-02-28T13:48:50.396Z","response_time":90,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agent-framework","agent-runtime","ai-agents","ai-infrastructure","ai-runtime","autonomous-agents","context-management","llm","llm-agents","llm-runtime","long-context","mcp","model-context-protocol","multi-agent"],"created_at":"2026-03-01T15:13:23.925Z","updated_at":"2026-03-01T15:13:24.407Z","avatar_url":"https://github.com/tjamescouch.png","language":"TypeScript","readme":"# gro\n\n\u003cimg width=\"200\" height=\"200\" alt=\"ChatGPT Image Feb 22, 2026, 03_15_50 AM\" src=\"https://github.com/user-attachments/assets/3e168b1c-4d4b-4eca-b898-c9b6826ab1a0\" /\u003e\n\n**Provider-agnostic LLM agent 
runtime** with virtual memory, semantic retrieval, streaming tool-use, and context management.\n\n`gro` runs persistent agent loops against any LLM provider — Anthropic, OpenAI, Google, xAI, or local — with automatic context paging, semantic page retrieval, MCP tool integration, and AgentChat network support. For an interactive TUI, see [gtui](https://github.com/tjamescouch/gtui), available as an [npm package](https://www.npmjs.com/package/@tjamescouch/gtui). This software is intended to be run in a containerized environment to protect the host machine.\n\ngro is also the runtime at the heart of thesystem — a one-command macOS dev shop with built-in agentchat networking, secure key proxying, and multi-agent swarms.\n\n```sh\nbrew install thesystem\nthesystem gro\n```\n\n[![npm version](https://img.shields.io/npm/v/@tjamescouch/gro.svg)](https://www.npmjs.com/package/@tjamescouch/gro)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)\n[![Node.js 18+](https://img.shields.io/badge/node-%3E%3D18-brightgreen.svg)](https://nodejs.org)\n\n---\n\n## Install\n\n```sh\nnpm install -g @tjamescouch/gro\n```\n\nRequires Node.js 18+.\n\n---\n\n## Quick Start\n\n```sh\n# One-shot prompt (Anthropic by default)\nexport ANTHROPIC_API_KEY=sk-...\ngro \"explain the CAP theorem in two sentences\"\n\n# Interactive conversation with virtual memory\ngro -i\n\n# Resume last session (like `claude --continue`)\ngro -c\n\n# Use a specific model (provider auto-inferred)\nexport OPENAI_API_KEY=sk-...\ngro -m gpt-4.1 \"hello\"\n\n# Pipe mode (like `claude -p`)\necho \"summarize this\" | gro -p\n```\n\n---\n\n## Providers\n\nProvider is auto-inferred from model name — `-m claude-sonnet-4-5` uses Anthropic, `-m gpt-4.1` uses OpenAI.\n\n| Provider | Example Models | Env var | Required? 
|\n|----------|---------------|---------|-----------|\n| **Anthropic** (default) | `claude-haiku-4-5`, `claude-sonnet-4-5`, `claude-opus-4-5` | `ANTHROPIC_API_KEY` | Yes (default provider) |\n| **OpenAI** | `gpt-4.1`, `gpt-4.1-mini`, `o3`, `o4-mini` | `OPENAI_API_KEY` | Only if using OpenAI models |\n| **Google** | `gemini-2.5-flash`, `gemini-2.5-pro` | `GOOGLE_API_KEY` | Only if using Gemini models |\n| **xAI** | `grok-4`, `grok-4-latest` | `XAI_API_KEY` | Only if using Grok models |\n| **Groq** | `llama-3.3-70b-versatile` | `GROQ_API_KEY` | Only if using Groq-hosted models |\n| **Local** | `llama3`, `mistral`, `qwen` | — | No key needed (Ollama / LM Studio) |\n\n### Setting API Keys\n\n**macOS** — store keys in Keychain (persistent, secure):\n\n```sh\ngro --set-key anthropic    # prompted for key, stored in macOS Keychain\ngro --set-key openai\ngro --set-key xai\ngro --set-key google\ngro --set-key groq\n```\n\n**Linux / CI** — use environment variables:\n\n```sh\nexport ANTHROPIC_API_KEY=sk-ant-...\nexport OPENAI_API_KEY=sk-...\nexport XAI_API_KEY=xai-...\nexport GOOGLE_API_KEY=AIza...\nexport GROQ_API_KEY=gsk_...\n```\n\nKey resolution order: macOS Keychain → environment variable. 
You only need to set keys for providers you use.\n\n---\n\n## Virtual Memory\n\ngro includes a swim-lane **VirtualMemory** system that manages context as a sliding window, allowing agents to work across arbitrarily long conversations without burning tokens on stale context.\n\n```sh\ngro -i --gro-memory virtual        # default in interactive mode\ngro -i --gro-memory simple         # unbounded buffer, no paging\ngro -i --gro-memory fragmentation  # zero-cost stochastic paging\ngro -i --gro-memory hnsw           # semantic similarity retrieval\n```\n\n**How it works:**\n\n- Messages are partitioned into swim lanes: `assistant` / `user` / `system` / `tool`\n- When working memory exceeds the high-watermark, old messages are summarized and paged to disk\n- Summaries preserve `@@ref('id')@@` markers — load any page back on demand\n- High-importance messages (tagged `@@important@@`) survive compaction\n- Summarization uses a configurable, cheaper model\n\n```sh\n# Use Haiku for compression, Sonnet for reasoning\ngro -i -m claude-sonnet-4-5 --summarizer-model claude-haiku-4-5\n```\n\n### Semantic Retrieval\n\nWhen an embedding API key is available (OpenAI or Google), gro automatically surfaces relevant paged context before each turn. The agent doesn't need to know what it needs — the runtime finds it.\n\n**How it works:**\n\n- Page summaries are embedded into a vector index on creation\n- Before each turn, the current conversation is compared against the index by cosine similarity\n- The most relevant unloaded page is automatically paged back into context via `VirtualMemory.ref()`\n- Existing pages in the `@@ref@@` index are backfilled on startup\n\nThis makes compaction reversible. The runtime can compact aggressively — summarizing and paging old context to disk — and trust that if the agent needs something it paged out, semantic retrieval will fault it back in. 
The result is a memory hierarchy analogous to virtual memory in an operating system: working set in context, everything else on disk, page faults handled automatically.\n\nThe retrieval mechanism is embedding-based similarity search — the same technique underlying RAG — but over the agent's own paged-out session history rather than an external corpus. Retrieved content is the agent's own prior reasoning, not new information, so there is no integration cost.\n\n**Explicit search:** The agent can also search by meaning directly:\n\n```\n@@search('query')@@\n```\n\nThis returns the top matching pages and loads them into context. Use when the agent knows it needs something but not which page contains it.\n\n**Graceful degradation:** If no embedding API key is configured, semantic retrieval is silently disabled. All other memory features — `@@ref@@`, compaction, importance tagging — work exactly as before.\n\n### Memory Modes\n\n| Mode | Description | Cost |\n|------|-------------|------|\n| `virtual` | Swim-lane LLM summarization with semantic retrieval (default) | Low (summarizer model + embedding calls) |\n| `simple` | Unbounded buffer, no paging | None |\n| `fragmentation` | Age-biased random sampling | None |\n| `hnsw` | Standalone semantic similarity index (no paging) | Embedding calls only |\n\n---\n\n## Extended Thinking\n\nControl reasoning depth dynamically with the `@@thinking()@@` stream marker. 
The level selects the model tier and allocates thinking tokens (Anthropic extended thinking; OpenAI reasoning tokens).\n\n```sh\ngro -i -m claude-sonnet-4-5 \"solve this complex problem\"\n# Agent can emit @@think@@ to escalate to Opus when stuck\n```\n\n| Level | Tier | Use case |\n|-------|------|----------|\n| `0.0–0.24` | Haiku / flash-lite | Fast, cheap — formatting, lookups, routine transforms |\n| `0.25–0.64` | Sonnet / flash | Balanced — most tasks requiring judgment or code |\n| `0.65–1.0` | Opus / pro | Deep reasoning — architecture, when stuck, low confidence |\n\nThe thinking budget **decays ×0.6 per idle round** unless renewed. Agents naturally step down from expensive tiers when not actively working hard problems.\n\n**Token reservation:** 30% of `max_tokens` is reserved for completion output to prevent truncation on high-budget calls. Example: `max_tokens=4096, thinking=0.8` → ~2293 thinking tokens, ~1803 output tokens.\n\n---\n\n## Prompt Caching\n\nAnthropic prompt caching is **enabled by default**. System prompts and tool definitions are cached automatically, reducing cost by ~90% on repeat calls. Cache hits are reported: `[cache read: 7993 tokens]`.\n\n```sh\ngro --no-prompt-caching   # disable if needed\n```\n\n---\n\n## Batch Summarization\n\nWhen `enableBatchSummarization` is set, context compaction queues summarization requests to the Anthropic Batch API (50% cost discount, async). The agent continues immediately with a placeholder summary. A background worker polls for completion and updates pages on disk.\n\n---\n\n## Stream Markers\n\ngro parses inline `@@marker()@@` directives from model output and acts on them in real-time. Markers are stripped before display — users never see them. 
Models use them as a runtime control plane.\n\n| Marker | Effect |\n|--------|--------|\n| `@@model-change('opus')@@` | Hot-swap to a different model mid-conversation |\n| `@@thinking(0.85)@@` | Set thinking level — controls model tier and token budget |\n| `@@importance('0.9')@@` | Tag message importance (0–1) for compaction priority |\n| `@@important@@` | Line is reproduced verbatim in all summaries |\n| `@@ephemeral@@` | Line may be omitted from summaries entirely |\n| `@@ref('id')@@` | Load a paged memory block into context |\n| `@@unref('id')@@` | Release a loaded page to free context budget |\n| `@@search('query')@@` | Find pages by meaning — top results auto-load into context |\n\nSee [`STREAM_MARKERS.md`](./STREAM_MARKERS.md) for the complete reference.\n\n---\n\n## MCP Support\n\ngro discovers MCP servers from Claude Code's config (`~/.claude/settings.json`) automatically. Provide an explicit config with `--mcp-config`.\n\n```sh\ngro --mcp-config ./my-servers.json \"use the filesystem tool to list files\"\ngro --no-mcp                       \"no tools\"\n```\n\n---\n\n## Containerized Deployment\n\nFor production or multi-agent workloads, run gro inside an isolated container using [thesystem](https://github.com/tjamescouch/thesystem). 
This provides API key isolation (keys never leave the host), session persistence across runs, and sandboxed execution inside a Lima VM + Podman container.\n\n```sh\n# Install thesystem and boot the environment\nbrew tap tjamescouch/thesystem \u0026\u0026 brew install thesystem\nthesystem init \u0026\u0026 thesystem keys set anthropic sk-ant-...\nthesystem start\n\n# Drop into an interactive gro session inside a pod (resumes last session)\nthesystem gro\nthesystem gro -P openai -m gpt-4.1\n\n# Fresh session (no resume) — equivalent to `claude -p` behavior\nthesystem gro --no-continue\n```\n\nSee the [thesystem README](https://github.com/tjamescouch/thesystem) for full setup and multi-agent swarm configuration.\n\n## AgentChat Integration\n\nRun gro as a persistent agent connected to an AgentChat network:\n\n```sh\ngro -i --persistent --system-prompt-file _base.md --mcp-config agentchat-mcp.json\n```\n\n**Persistent mode** (`--persistent`) keeps the agent in a continuous tool-calling loop. If the model stops calling tools, gro injects a system nudge to resume listening. The loop is indefinite: `agentchat_listen` → process → respond → repeat.\n\nAn external process manager (systemd, supervisor, Docker) handles process lifecycle. Auto-save triggers every 10 tool rounds.\n\n---\n\n## PLASTIC Mode (Self-Modifying Agent)\n\nPLASTIC mode lets an agent read, modify, and reload its own source code at runtime. The agent runs from a writable overlay directory (`~/.gro/plastic/overlay/`) — a copy of the stock `dist/` tree. It can edit files in the overlay, emit `@@reboot@@` to restart, and come back running the modified code.\n\n```sh\n# Run with PLASTIC enabled (containerized)\nthesystem gro --plastic\n\n# Or directly (not recommended outside containers)\nGRO_PLASTIC=1 gro -i\n```\n\n### How It Works\n\n1. **Boot**: stock `dist/main.js` diverts to `plastic/bootstrap.js`, which loads `overlay/main.js`\n2. 
**Read**: the agent's source is pre-chunked into virtual memory pages (`@@ref('pg_src_...')@@`)\n3. **Write**: `write_source` tool modifies files in the overlay (with syntax validation)\n4. **Reboot**: `@@reboot@@` marker saves state and exits with code 75; an outer runner restarts\n5. **Crash recovery**: if the overlay crashes on boot, it's wiped and re-initialized from stock\n\n### Safety\n\nPLASTIC mode is **training-only infrastructure**. It is designed for supervised experimentation in disposable containerized environments, not production use.\n\n**Always run PLASTIC inside a container.** The `thesystem gro --plastic` command provides:\n- Isolated Podman container (no host filesystem access)\n- API keys injected via proxy (never on disk inside the container)\n- Persistent volume for sessions (survives container restarts)\n- Reboot loop with a 20-restart cap (prevents infinite loops)\n\n**What the agent can modify:**\n- Its own runtime code in `~/.gro/plastic/overlay/`\n- Its version string, tool definitions, marker handling, memory system\n\n**What the agent cannot do:**\n- Escape the container or access the host\n- Modify the stock install (`/usr/local/lib/node_modules/...` is read-only)\n- Survive a container rebuild (`--rebuild` wipes everything)\n- Persist changes across stock upgrades (overlay is wiped when npm version increases)\n\n**Risk mitigations:**\n- `write_source` validates JavaScript syntax before writing — rejects broken code\n- Overlay crash triggers automatic fallback to stock code\n- Reboot cap (20) prevents runaway restart loops\n- Version mismatch detection wipes stale overlays on genuine upgrades\n- All modifications are confined to the overlay — `rm -rf ~/.gro/plastic/overlay/` restores stock behavior\n\n\u003e **Do not run PLASTIC mode on a host machine with access to sensitive data, credentials, or production systems.** The agent has a shell tool and can execute arbitrary code. 
Container isolation is your primary safety boundary.\n\n---\n\n## Shell Tool\n\nEnable a built-in `shell` tool for executing commands:\n\n```sh\ngro -i --bash \"help me debug this\"\n```\n\nCommands run with a 120s timeout and 30 KB output cap. The tool is opt-in and not available by default.\n\n---\n\n## Built-in Tools\n\nThese tools are always available (no flags required):\n\n| Tool | Description |\n|------|-------------|\n| `Read` | Read file contents with optional line range |\n| `Write` | Write content to a file (creates parent dirs) |\n| `Glob` | Find files by glob pattern (`.gitignore`-aware) |\n| `Grep` | Search file contents with POSIX regex |\n| `apply_patch` | Apply unified diffs to files |\n| `gro_version` | Runtime identity and version info |\n| `memory_status` | VirtualMemory statistics |\n| `memory_report` | Memory performance and tuning recommendations |\n| `memory_tune` | Auto-tune memory parameters |\n| `compact_context` | Force immediate context compaction |\n| `cleanup_sessions` | Remove orphaned sessions older than 48 hours |\n\n---\n\n## Options\n\n```\n-P, --provider                  openai | anthropic | google | xai | local\n-m, --model                     Model name (provider auto-inferred)\n--base-url                      API base URL override\n--system-prompt                 System prompt text\n--system-prompt-file            Read system prompt from file\n--append-system-prompt          Append to system prompt\n--append-system-prompt-file     Append system prompt from file\n--context-tokens                Working memory budget in tokens (default: 8192)\n--max-turns                     Max tool rounds per turn (default: 100; gro is designed for sustained autonomous work. 
Use --max-turns 10 for tighter human-in-the-loop control.)\n--summarizer-model              Model for context summarization\n--gro-memory                    virtual | simple | fragmentation | hnsw\n--mcp-config                    MCP servers config (JSON file or inline string)\n--no-mcp                        Disable MCP server connections\n--no-prompt-caching             Disable Anthropic prompt caching\n--bash                          Enable built-in shell tool\n--persistent                    Persistent agent mode (continuous loop)\n--output-format                 text | json | stream-json (default: text)\n-p, --print                     Print response and exit (non-interactive)\n-c, --continue                  Continue most recent session\n-r, --resume [id]               Resume session by ID\n-i, --interactive               Interactive conversation mode\n--verbose                       Verbose output\n-V, --version                   Show version\n-h, --help                      Show help\n```\n\n---\n\n## Session Persistence\n\nSessions are saved automatically to `.gro/`:\n\n```\n.gro/\n  context/\n    \u003csession-id\u003e/\n      messages.json    # full message history\n      meta.json        # model, provider, timestamps\n  pages/               # VirtualMemory paged summaries\n```\n\nResume with `-c` (most recent) or `-r \u003cid\u003e` (specific). 
Disable with `--no-session-persistence`.\n\n---\n\n## Architecture\n\n```\nsrc/\n  main.ts                      # CLI entry, flag parsing, agent loop\n  session.ts                   # Session persistence and tool-pair sanitization\n  errors.ts                    # Typed error hierarchy (GroError)\n  logger.ts                    # Logger with ANSI color support\n  stream-markers.ts            # Stream marker parser and dispatcher\n  spend-meter.ts               # Token cost tracking\n  drivers/\n    anthropic.ts               # Native Anthropic Messages API (no SDK)\n    streaming-openai.ts        # OpenAI-compatible streaming driver\n    types.ts                   # ChatDriver interface, message types\n    batch/\n      anthropic-batch.ts       # Anthropic Batch API client\n  memory/\n    agent-memory.ts            # AgentMemory interface\n    virtual-memory.ts          # Swim-lane paged context\n    simple-memory.ts           # Unbounded buffer\n    fragmentation-memory.ts    # Stochastic sampling pager\n    hnsw-memory.ts             # Semantic similarity retrieval\n    embedding-provider.ts      # Provider-agnostic embedding client (OpenAI / Google)\n    page-search-index.ts       # Flat cosine similarity vector index\n    semantic-retrieval.ts      # Auto-retrieval orchestrator\n    summarization-queue.ts     # Async batch summarization queue\n    batch-worker.ts            # Background batch worker\n    batch-worker-manager.ts    # Worker lifecycle manager\n    memory-metrics.ts          # Performance metrics\n    memory-tuner.ts            # Auto-tuning logic\n    vector-index.ts            # HNSW vector index\n  mcp/\n    client.ts                  # MCP client manager\n  tools/\n    bash.ts                    # Built-in shell tool (--bash flag)\n    read.ts / write.ts         # File I/O\n    glob.ts / grep.ts          # File search\n    agentpatch.ts              # Unified patch application\n    version.ts                 # gro_version introspection\n    
memory-status.ts           # VirtualMemory stats\n    memory-report.ts           # Performance report\n    memory-tune.ts             # Auto-tune\n    compact-context.ts         # Manual compaction trigger\n    cleanup-sessions.ts        # Session cleanup\n  plastic/\n    bootstrap.ts               # PLASTIC overlay loader with crash fallback\n    init.ts                    # Overlay setup, source page generation\n    write-source.ts            # write_source tool (overlay file modification)\n  utils/\n    rate-limiter.ts            # Token bucket rate limiter\n    timed-fetch.ts             # Fetch with configurable timeout\n    retry.ts                   # Exponential backoff retry logic\n  runtime/\n    config-manager.ts          # Runtime configuration\n    directive-parser.ts        # Stream directive parsing\n  tui/\n    main.ts                    # Terminal UI entry\n    ui/                        # Blessed TUI panels\n```\n\n---\n\n## Development\n\n```sh\ngit clone https://github.com/tjamescouch/gro.git\ncd gro\nnpm install\nnpm run build\nnpm test\n```\n\n---\n\n## License\n\nMIT © [tjamescouch](https://github.com/tjamescouch)\n\n---\n\n## For Agents\n\nBoot context and stream marker reference: [`_base.md`](./_base.md)\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftjamescouch%2Fgro","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftjamescouch%2Fgro","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftjamescouch%2Fgro/lists"}