{"id":30737722,"url":"https://github.com/inference-gateway/cli","last_synced_at":"2026-07-01T01:01:05.231Z","repository":{"id":309132624,"uuid":"1034725719","full_name":"inference-gateway/cli","owner":"inference-gateway","description":"A Git-first CLI coding agent that turns ideas, issues, and tasks into real code changes.","archived":false,"fork":false,"pushed_at":"2026-06-30T23:32:04.000Z","size":6448,"stargazers_count":4,"open_issues_count":17,"forks_count":2,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-07-01T00:08:53.428Z","etag":null,"topics":["cli","coder","coding","coding-tools","computer-use","inference-gateway","llms","telegrambot"],"latest_commit_sha":null,"homepage":"https://docs.inference-gateway.com/cli","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/inference-gateway.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2025-08-08T21:57:04.000Z","updated_at":"2026-06-30T23:32:06.000Z","dependencies_parsed_at":"2025-08-10T04:09:37.279Z","dependency_job_id":"896a96b4-570d-4171-bbd5-856fa15b1d2f","html_url":"https://github.com/inference-gateway/cli","commit_stats":null,"previous_names":["inference-gateway/cli"],"tags_count":328,"template":false,"template_full_name":null,"purl":"pkg:github/inference-gateway/cli","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/inference-gateway%2Fcli","tags_url":"https://repos
.ecosyste.ms/api/v1/hosts/GitHub/repositories/inference-gateway%2Fcli/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/inference-gateway%2Fcli/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/inference-gateway%2Fcli/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/inference-gateway","download_url":"https://codeload.github.com/inference-gateway/cli/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/inference-gateway%2Fcli/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34988714,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-30T02:00:05.919Z","response_time":92,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cli","coder","coding","coding-tools","computer-use","inference-gateway","llms","telegrambot"],"created_at":"2025-09-03T21:46:52.433Z","updated_at":"2026-07-01T01:01:05.217Z","avatar_url":"https://github.com/inference-gateway.png","language":"Go","funding_links":[],"categories":["Utilities"],"sub_categories":["🚆 \u003ca name=\"travel-and-transportation\"\u003e\u003c/a\u003eTravel \u0026 Transportation"],"readme":"\u003cdiv align=\"center\"\u003e\n\n# Inference Gateway CLI\n\n[![Go 
Version](https://img.shields.io/github/go-mod/go-version/inference-gateway/cli?style=flat-square\u0026logo=go)](https://golang.org)\n[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg?style=flat-square)](LICENSE)\n[![Build Status](https://img.shields.io/github/actions/workflow/status/inference-gateway/cli/ci.yml?style=flat-square\u0026logo=github)](https://github.com/inference-gateway/cli/actions)\n[![Go Report Card](https://goreportcard.com/badge/github.com/inference-gateway/cli?style=flat-square)](https://goreportcard.com/report/github.com/inference-gateway/cli)\n[![Release](https://img.shields.io/github/v/release/inference-gateway/cli?style=flat-square\u0026logo=github)](https://github.com/inference-gateway/cli/releases)\n\nAn agentic command-line assistant that writes code, understands project context, and uses tools to perform real tasks.\n\n\u003c/div\u003e\n\n## ⚠️ Warning\n\n\u003e **Early Development Stage**: This project is in its early development\n\u003e stage and breaking changes are expected until it reaches a stable version.\n\u003e\n\u003e Always use pinned versions by specifying a specific version tag when\n\u003e downloading binaries or using install scripts.\n\n## Table of Contents\n\n- [Features](#features)\n- [Installation](#installation)\n- [Quick Start](#quick-start)\n- [Claude Code Mode (Subscription)](#claude-code-mode-subscription)\n- [Commands](#commands)\n- [Tools for LLMs](#tools-for-llms)\n- [Configuration](#configuration)\n- [Cost Tracking](#cost-tracking)\n- [Tool Approval System](#tool-approval-system)\n- [Shortcuts](#shortcuts)\n- [Channels (Remote Messaging)](#channels-remote-messaging)\n- [Heartbeat (Periodic Wake-Up)](#heartbeat-periodic-wake-up)\n- [Agent Skills](#agent-skills)\n- [Computer Use](#computer-use)\n- [Persistent Memory](#persistent-memory)\n- [Reminders \u0026 Command Hooks](#reminders--command-hooks)\n- [Global Flags](#global-flags)\n- [Examples](#examples)\n- [Development](#development)\n- 
[License](#license)\n\n## Features\n\n- **Automatic Gateway Management**: Automatically downloads and runs the Inference Gateway binary (no Docker required!)\n- **Zero-Configuration Setup**: Start chatting immediately with just your API keys in a `.env` file\n- **Interactive Chat**: Chat with models using an interactive interface\n- **Status Monitoring**: Check gateway health and resource usage\n- **Conversation History**: Store and retrieve past conversations with multiple storage backends\n  - [Conversation Storage](docs/conversation-storage.md) - Detailed storage backend documentation\n  - [Conversation Title Generation](docs/conversation-title-generation.md) - AI-powered title generation system\n  - [Database Migrations](docs/database-migrations.md) - Schema migration system for the SQLite/Postgres backends\n- **Conversation Versioning**: Navigate back in time to previous conversation points (double ESC)\n  - View message history with timestamps\n  - Restore conversation to any previous user message\n  - Permanent deletion of messages after restore point\n  - [Learn more →](docs/conversation-versioning.md)\n- **Configuration Management**: Manage gateway settings via YAML config\n- **Project Initialization**: Set up local project configurations\n- **Tool Execution**: LLMs can execute allowed commands and tools - [See all tools →](docs/tools-reference.md)\n- **Tool Approval System**: User approval workflow for sensitive operations with real-time diff visualization\n- **Agent Modes**: Three operational modes for different workflows:\n  - **Standard Mode** (default): Normal operation with all configured tools and approval checks\n  - **Plan Mode**: Read-only mode for planning and analysis without execution - [Learn more →](docs/plan-mode.md)\n  - **Auto-Accept Mode**: All tools auto-approved for rapid execution (YOLO mode)\n  - Toggle between modes with **Shift+Tab**\n- **Token Usage Tracking**: Accurate token counting with polyfill support for providers that don't 
return usage metrics\n- **Cost Tracking**: Real-time cost calculation for API usage with per-model breakdown and configurable pricing\n- **Inline History Auto-Completion**: Smart command history suggestions with inline completion\n- **GitHub Issue References (`#`)**: Type `#` in chat to open a dropdown of the current\n  repo's open issues. Selecting one inserts a `#N` token that is highlighted in the input\n  and, on submit, expanded inline into the issue's title, body, and recent comments - so\n  the agent works from full context without a redundant `gh issue view` lookup. Gracefully\n  no-ops when `gh` is not installed or the repo has no remote.\n- **Customizable Keybindings**: Fully configurable keyboard shortcuts for the chat interface\n- **Model Thinking Visualization**: When models use extended thinking,\n  their internal reasoning process is displayed as collapsible blocks above responses (toggle with **ctrl+k** by default, configurable via `display_toggle_thinking`)\n- **Extensible Shortcuts System**: Create custom commands with AI-powered snippets - [Learn more →](docs/shortcuts-guide.md)\n- **MCP Server Support**: Direct integration with Model Context Protocol servers for extended tool capabilities -\n  [Learn more →](docs/mcp-integration.md)\n- **Web Terminal Interface**: Browser-based terminal access with tabbed sessions for remote access and multi-session workflows - [Learn more →](docs/web-terminal.md)\n- **Remote Messaging Channels**: Control the agent from Telegram, WhatsApp, and other platforms via a pluggable channel system - [Learn more →](docs/channels.md)\n- **Speech-to-Text (Whisper)**: Dictate into chat with `/voice` and transcribe inbound Telegram voice messages, locally and offline -\n  off by default - [Learn more →](docs/speech-to-text.md)\n- **Scheduled Tasks**: Ask the agent (over Telegram, etc.) 
to run a prompt on a cron schedule and deliver the result back through the same channel -\n  recurring (\"send me a quote every morning\") or one-off (\"remind me at 6pm today\") - [Learn more →](docs/scheduling.md)\n- **Heartbeat (Periodic Wake-Up)**: Wake the agent on a fixed interval to check for pending todos and background work,\n  with a separate configurable system prompt - off by default - [Learn more →](docs/heartbeat.md)\n- **Agent Skills**: Drop-in `SKILL.md` instruction folders the agent discovers and loads on demand;\n  install them straight from GitHub with `infer skills install` - on by default - [Learn more →](docs/skills.md)\n- **Persistent Memory**: Cross-session memory stored as individual Markdown fact-files with an\n  auto-maintained `MEMORY.md` index that is injected at session start - on by default - [Learn more →](#persistent-memory)\n- **Subagents**: Spawn parallel `infer agent` subprocesses from chat with the `Agent` tool to fan out\n  independent work (research, edits, investigations) and fold their results back into the conversation\n- **Computer Use**: Let the agent drive the desktop - mouse, keyboard, screenshots, app focus - across\n  macOS, X11, and Wayland - off by default - [Learn more →](#computer-use)\n- **Reminders \u0026 Command Hooks**: Inject system reminders or run shell commands at agent-loop hook\n  points to enforce project conventions - [Learn more →](#reminders--command-hooks)\n\n## Installation\n\n### Using npm/npx (Recommended)\n\nIf you already have Node.js (\u003e= 18), run the CLI with npx - no Go toolchain or manual\ndownload required. 
The matching native binary is fetched and cached on first use:\n\n```bash\n# Run without installing\nnpx @inference-gateway/cli@latest --help\nnpx @inference-gateway/cli@latest chat\n```\n\nOr install it globally:\n\n```bash\nnpm install -g @inference-gateway/cli\ninfer --help\n```\n\n\u003e **Not recommended for production.** For production or CI, prefer the\n\u003e [install script](#using-install-script), [container image](#using-container-image), or\n\u003e [building from source](#build-from-source). Prebuilt binaries cover Linux and macOS on\n\u003e amd64/arm64 - on Windows, use WSL.\n\n### Using Go Install\n\n```bash\ngo install github.com/inference-gateway/cli@latest\n```\n\nThis installs the binary as `cli`. To rename it to `infer`:\n\n```bash\nmv $(go env GOPATH)/bin/cli $(go env GOPATH)/bin/infer\n```\n\nOr use an alias:\n\n```bash\nalias infer=\"$(go env GOPATH)/bin/cli\"\n```\n\n### Using Container Image\n\n```bash\n# Create network and deploy inference gateway first\ndocker network create inference-gateway\ndocker run -d --name inference-gateway --network inference-gateway \\\n  --env-file .env \\\n  ghcr.io/inference-gateway/inference-gateway:latest\n\n# Pull and run the CLI\ndocker pull ghcr.io/inference-gateway/cli:latest\ndocker run -it --rm --network inference-gateway ghcr.io/inference-gateway/cli:latest chat\n```\n\n### Using Install Script\n\n```bash\n# Latest version\ncurl -fsSL https://raw.githubusercontent.com/inference-gateway/cli/main/install.sh | bash\n\n# Specific version\ncurl -fsSL https://raw.githubusercontent.com/inference-gateway/cli/main/install.sh | bash -s -- --version v0.77.0\n\n# Custom installation directory\ncurl -fsSL https://raw.githubusercontent.com/inference-gateway/cli/main/install.sh | bash -s -- --install-dir $HOME/.local/bin\n```\n\n### Manual Download\n\nDownload the latest release binary for your platform from the [releases page](https://github.com/inference-gateway/cli/releases).\n\n**Verify the binary** (recommended 
for security):\n\n```bash\n# Download binary and checksums\ncurl -L -o infer-darwin-amd64 \\\n  https://github.com/inference-gateway/cli/releases/latest/download/infer-darwin-amd64\ncurl -L -o checksums.txt \\\n  https://github.com/inference-gateway/cli/releases/latest/download/checksums.txt\n\n# Verify checksum\nshasum -a 256 infer-darwin-amd64\ngrep infer-darwin-amd64 checksums.txt\n\n# Install\nchmod +x infer-darwin-amd64\nsudo mv infer-darwin-amd64 /usr/local/bin/infer\n```\n\nFor advanced verification with Cosign signatures, see [Binary Verification Guide](docs/binary-verification.md).\n\n### Build from Source\n\n```bash\ngit clone https://github.com/inference-gateway/cli.git\ncd cli\ngo build -o infer cmd/infer/main.go\nsudo mv infer /usr/local/bin/\n```\n\n## Quick Start\n\n1. **Initialize your project**:\n\n```bash\ninfer init\n```\n\nThis creates a `.infer/` directory with configuration and shortcuts.\n\n2. **Set up your environment** (create `.env` file):\n\n```env\nANTHROPIC_API_KEY=your_key_here\nOPENAI_API_KEY=your_key_here\nDEEPSEEK_API_KEY=your_key_here\n```\n\n3. 
**Start chatting**:\n\n```bash\ninfer chat\n```\n\n### Next Steps\n\nNow that you're up and running, explore these guides:\n\n- **[Commands Reference](docs/commands-reference.md)** - Complete command documentation\n- **[Tools Reference](docs/tools-reference.md)** - Available tools for LLMs\n- **[Configuration Guide](docs/configuration-reference.md)** - Full configuration options\n- **[Directory Structure](docs/directory-structure.md)** - Map of every file and subdirectory the CLI creates under `.infer/` and `~/.infer/`\n- **[Web Terminal](docs/web-terminal.md)** - Browser-based terminal interface\n- **[Shortcuts Guide](docs/shortcuts-guide.md)** - Custom shortcuts and AI-powered snippets\n- **[A2A Agents](docs/agents-configuration.md)** - Agent-to-agent communication setup\n\n## Claude Code Mode (Subscription)\n\nSave on API costs by using your Claude Max or Pro subscription instead of pay-as-you-go API pricing.\n\n### Overview\n\nClaude Code mode enables you to use your **Claude Max or Pro subscription** ($100-200/month fixed cost)\ninstead of paying per token via the Anthropic API. This is ideal for heavy users who want\npredictable monthly costs.\n\n**Cost Comparison:**\n\n| Mode                 | Pricing                                 | Best For                                |\n| -------------------- | --------------------------------------- | --------------------------------------- |\n| **Gateway Mode**     | Pay per token ($3-$75 per million)      | API users, multi-provider needs         |\n| **Claude Code Mode** | Fixed monthly ($100-200)                | Heavy Claude users, cost predictability |\n\n### Prerequisites\n\n- **Claude Max or Pro subscription** - Required ($100-200/month)\n- **Claude Code CLI** - Official CLI from Anthropic\n\nInstall the Claude Code CLI:\n\n```bash\nnpm install -g @anthropic-ai/claude-code\n```\n\n### Setup\n\n1. 
**Configure for Claude Code mode**:\n\nEdit `.infer/config.yaml`:\n\n```yaml\n# Enable Claude Code mode\nclaude_code:\n  enabled: true\n  cli_path: claude  # or /usr/local/bin/claude if not in PATH\n  timeout: 600\n  max_output_tokens: 32000\n  thinking_budget: 10000\n\n# Disable gateway mode\ngateway:\n  run: false\n\n# Set model (no provider prefix needed)\nagent:\n  model: claude-sonnet-4-5-20250929\n```\n\n2. **Authenticate with your subscription**:\n\n```bash\ninfer claude-code setup\n```\n\nThis opens your browser to authenticate with your Claude Max/Pro account.\n\n3. **Verify authentication**:\n\n```bash\ninfer claude-code test\n```\n\n4. **Use normally**:\n\n```bash\ninfer chat  # Now using your subscription!\n```\n\n### Available Commands\n\n- `infer claude-code setup` - Authenticate with Claude subscription\n- `infer claude-code test` - Test authentication and CLI integration\n\n### Configuration Options\n\n```yaml\nclaude_code:\n  enabled: true                  # Enable/disable Claude Code mode\n  cli_path: claude               # Path to claude binary\n  timeout: 600                   # Command timeout in seconds\n  max_output_tokens: 32000       # Maximum output tokens per request\n  thinking_budget: 10000         # Token budget for extended thinking\n```\n\n**Environment Variables:**\n\n```bash\nexport INFER_CLAUDE_CODE_ENABLED=true\nexport INFER_CLAUDE_CODE_CLI_PATH=/usr/local/bin/claude\nexport INFER_CLAUDE_CODE_TIMEOUT=600\n```\n\n### Features and Limitations\n\n| Feature               | Gateway Mode                            | Claude Code Mode                         |\n| --------------------- | --------------------------------------- | ---------------------------------------- |\n| **Cost**              | Pay-per-token                           | Fixed monthly                            |\n| **Providers**         | All providers (Anthropic, OpenAI, etc.) 
| Claude only                              |\n| **Models**            | All provider models                     | Claude models only                       |\n| **Images**            | ✓ Supported                             | ✗ Not supported (stripped from messages) |\n| **Prompt Caching**    | ✓ Supported                             | ✗ Not available via CLI                  |\n| **Streaming**         | ✓ Supported                             | ✓ Supported                              |\n| **Tool Execution**    | ✓ Supported                             | ✓ Supported                              |\n| **Extended Thinking** | ✓ Supported                             | ✓ Supported                              |\n| **Authentication**    | API keys                                | Browser login                            |\n\n**Supported Models:**\n\nThe following Claude models are available via Claude Code subscription mode:\n\n**Claude 4.5 Series (Latest):**\n\n- `claude-opus-4-5` - Most capable Claude model (vision support)\n- `claude-haiku-4-5-20251001` - Fastest Claude model (vision support)\n- `claude-sonnet-4-5-20250929` - Latest Sonnet model (default, vision support)\n\n**Claude 4.1 Series:**\n\n- `claude-opus-4-1-20250805` - Claude 4.1 Opus (vision support)\n- `claude-sonnet-4-1-20250805` - Claude 4.1 Sonnet (vision support)\n\n**Claude 4 Series:**\n\n- `claude-opus-4-20250514` - Claude 4 Opus (vision support)\n- `claude-sonnet-4-20250514` - Claude 4 Sonnet (vision support)\n\n**Claude 3.7 Series:**\n\n- `claude-3-7-sonnet-20250219` - Claude 3.7 Sonnet (vision support)\n\n**Claude 3.5 Series:**\n\n- `claude-3-5-haiku-20241022` - Claude 3.5 Haiku (vision support)\n\n**Claude 3 Series:**\n\n- `claude-3-haiku-20240307` - Claude 3 Haiku (vision support)\n- `claude-3-opus-20240229` - Claude 3 Opus (vision support)\n\n**Note:** All modern Claude models support vision capabilities. 
The Claude Code CLI automatically strips images\nfrom messages when using subscription mode.\n\n### Troubleshooting\n\n**CLI Not Found:**\n\n```bash\n# Check if Claude CLI is installed\nwhich claude\n\n# If not found, install it\nnpm install -g @anthropic-ai/claude-code\n\n# Or set a custom path in .infer/config.yaml:\n#   claude_code:\n#     cli_path: /full/path/to/claude\n```\n\n**Authentication Issues:**\n\n```bash\n# Re-authenticate\ninfer claude-code setup\n\n# Test authentication\ninfer claude-code test\n```\n\n**Update CLI:**\n\n```bash\nnpm update -g @anthropic-ai/claude-code\n```\n\n### Switching Between Modes\n\nTo switch between gateway and Claude Code modes, edit `.infer/config.yaml`:\n\n**To Claude Code mode:**\n\n```yaml\n# .infer/config.yaml\nclaude_code:\n  enabled: true\ngateway:\n  run: false\nagent:\n  model: claude-sonnet-4-5-20250929  # No provider prefix\n```\n\n**To Gateway mode:**\n\n```yaml\n# .infer/config.yaml\nclaude_code:\n  enabled: false\ngateway:\n  run: true\nagent:\n  model: anthropic/claude-sonnet-4-5-20250929  # With provider prefix\n```\n\n## Commands\n\nThe CLI provides several commands for different workflows. 
For detailed documentation, see [Commands Reference](docs/commands-reference.md).\n\n### Core Commands\n\n**`infer init`** - Initialize a new project with configuration and shortcuts\n\n```bash\ninfer init              # Initialize project configuration\ninfer init --userspace  # Initialize user-level configuration\n```\n\n**`infer chat`** - Start an interactive chat session with model selection\n\n```bash\n# Terminal mode (default)\ninfer chat\n\n# Web terminal mode with browser interface\ninfer chat --web\ninfer chat --web --port 8080  # Custom port\n```\n\n**Features:** Model selection, real-time streaming, scrollable history, three agent modes (Standard/Plan/Auto-Accept).\n\n**Web Mode Features:**\n\n- Browser-based terminal using xterm.js\n- Multiple independent tabbed sessions\n- Automatic session cleanup on inactivity\n- Each tab manages its own `infer chat` process with isolated containers\n- Access from any device on the network\n- Responsive terminal sizing with horizontal padding\n\n**`infer agent`** - Execute autonomous tasks in background mode\n\n```bash\n# Start new agent sessions\ninfer agent \"Please fix the github issue 38\"\ninfer agent --model \"openai/gpt-4\" \"Implement feature from issue #42\"\ninfer agent \"Analyze this UI issue\" --files screenshot.png\n\n# Resume existing sessions\ninfer conversations list  # Find session IDs\ninfer agent \"continue fixing the bug\" --session-id abc-123-def\ninfer agent \"analyze new logs\" --session-id abc-123 --files error.log\n```\n\n**Features:** Autonomous execution, multimodal support (images/files), parallel tool execution, **session resumption**.\n\n### Configuration Commands\n\n**`infer config`** - Manage CLI configuration settings\n\n```bash\n# Read any value (effective config: defaults + ~/.infer + .infer + env)\ninfer config get agent.model\ninfer config get                       # dump the whole effective config\n\n# Agent configuration\ninfer config set agent.model 
\"deepseek/deepseek-v4-pro\"\ninfer config set agent.max_turns 100\ninfer config set agent.verbose_tools true\n\n# Tool management\ninfer config set tools.enabled true\ninfer config set tools.bash.enabled true\ninfer config set tools.safety.require_approval true\n\n# Export configuration\ninfer config set export.summary_model \"anthropic/claude-4.1-haiku\"\n\n# Write to userspace (~/.infer/config.yaml) instead of the project\ninfer config set agent.model \"openai/gpt-4o\" --userspace\n```\n\n\u003e System prompts live in `prompts.yaml` (e.g. `prompts.agent.system_prompt`), not\n\u003e in `config.yaml`, so they are edited there rather than via `config set`.\n\nSee [Commands Reference](docs/commands-reference.md#configuration-management) for all configuration options.\n\n### Agent Management\n\n**`infer agents`** - Manage A2A (Agent-to-Agent) agent configurations\n\n```bash\ninfer agents init                    # Initialize agents configuration\ninfer agents add browser-agent       # Add an agent from the registry with defaults\ninfer agents add custom https://...  
# Add a custom agent\ninfer agents list                    # List all agents\n```\n\nFor detailed A2A setup, see [A2A Agents Configuration](docs/agents-configuration.md); for how\nconnections are established and tasks polled, see [A2A Connections](docs/a2a-connections.md).\n\n### Utility Commands\n\n**`infer status`** - Check gateway health and resource usage\n\n```bash\ninfer status\n```\n\n**`infer conversations`** - List and manage conversation history\n\n```bash\ninfer conversations list                    # List all saved conversations\ninfer conversations list --limit 20         # List first 20 conversations\ninfer conversations list --offset 40 -l 20  # Paginate: conversations 41-60\ninfer conversations list --format json      # Output as JSON\n\ninfer conversations show \u003csession-id\u003e                  # Show a conversation's entries\ninfer conversations show \u003csession-id\u003e --include-hidden # Include hidden entries (e.g. system reminders)\ninfer conversations show \u003csession-id\u003e --format json    # One JSON object per line (jq-friendly)\n```\n\n**`infer conversation-title`** - Manage AI-powered conversation titles\n\n```bash\ninfer conversation-title generate  # Generate titles for all conversations\ninfer conversation-title status    # Show generation status\n```\n\n**`infer skills`** - Manage Agent Skills (reusable `SKILL.md` instruction folders)\n\n```bash\ninfer skills list                                # List discovered skills\ninfer skills install skill-creator               # Install a skill from GitHub\ninfer skills install acme/internal-comms --user  # Install to ~/.infer/skills\ninfer skills uninstall pdf                        # Remove a skill by name\n```\n\nSee [Agent Skills](#agent-skills) and [docs/skills.md](docs/skills.md) for the format.\n\n**`infer export`** - Export a conversation to a Markdown file\n\n```bash\ninfer conversations list      # Find the session ID\ninfer export \u003csession-id\u003e     # Writes 
.infer/chat_export_\u003ctimestamp\u003e.md\n```\n\n**`infer version`** - Display CLI version information\n\n```bash\ninfer version\n```\n\n## Tools for LLMs\n\nWhen tool execution is enabled, LLMs can use various tools to interact with your system. Below is a\nsummary of available tools. For detailed documentation, parameters, and examples, see\n[Tools Reference](docs/tools-reference.md).\n\nTools are grouped by category below. Many are gated behind a config flag (noted per group);\nthe always-available set is registered for every session. There is **no built-in GitHub tool** -\nuse the `gh` CLI through Bash (or the built-in `/scm` shortcuts) for GitHub operations.\n\n**Core file \u0026 search** (always available):\n\n| Tool | Purpose | Approval |\n| ------ | --------- | ---------- |\n| **Read** | Read file contents with line ranges | No |\n| **Write** | Write content to files | Yes |\n| **Edit** | Exact string replacements in files | Yes |\n| **MultiEdit** | Multiple atomic edits to a single file | Yes |\n| **Delete** | Delete files and directories | Yes |\n| **Grep** | Search files with regex (ripgrep/Go) | No |\n| **Tree** | Display directory structure | No |\n\n**Shell** (Bash is always available; the background-shell trio needs `tools.bash.background_shells.enabled`):\n\n| Tool | Purpose | Approval |\n| ------ | --------- | ---------- |\n| **Bash** | Execute shell commands (per-mode allow-list) | Optional |\n| **BashOutput** | Read new output from a running background shell | Yes |\n| **KillShell** | Terminate a background shell | Yes |\n| **ListShells** | List background shells and their state | Yes |\n\n**Task \u0026 planning** (`AskUserQuestion` needs `tools.ask_user_question.enabled`):\n\n| Tool | Purpose | Approval |\n| ------ | --------- | ---------- |\n| **TodoWrite** | Create and manage task lists | No |\n| **RequestPlanApproval** | Submit a plan for approval and persist it (plan mode) | No |\n| **AskUserQuestion** | Ask the user multiple-choice 
questions (plan mode) | No |\n\n**Web** (`WebSearch`/`WebFetch` need their respective config flag):\n\n| Tool | Purpose | Approval |\n| ------ | --------- | ---------- |\n| **WebSearch** | Search the web (DuckDuckGo/Google) | Yes |\n| **WebFetch** | Fetch content from a URL | Yes |\n\n**Subagents** (the `Agent` tool and its companions, enabled by default):\n\n| Tool | Purpose | Approval |\n| ------ | --------- | ---------- |\n| **Agent** | Spawn an `infer agent` subprocess to run work in parallel | Yes |\n| **ListSubagents** | List spawned subagents and their status | No |\n| **GetSubagentResult** | Re-read a finished subagent's last message | No |\n| **ReadSubagentScreen** | Capture an interactive subagent's terminal screen | No |\n| **SendSubagentInput** | Type into an interactive subagent's TUI | Yes |\n| **CloseSubagent** | Stop a subagent or tidy a finished pane | Yes |\n| **ApproveSubagent** | Relay an approval decision to a waiting subagent | Yes |\n\n**Computer Use** (require `computer_use.enabled`; these bypass the approval prompt and run silently):\n\n| Tool | Purpose | Approval |\n| ------ | --------- | ---------- |\n| **MouseMove** / **MouseClick** / **MouseScroll** | Control the mouse | No |\n| **KeyboardType** | Type text or send key combinations | No |\n| **GetFocusedApp** / **ActivateApp** | Query or focus an application | No |\n| **GetLatestScreenshot** | Read the latest streamed screenshot | No |\n\n**Memory, scheduling \u0026 A2A** (each gated by its own flag):\n\n| Tool | Purpose | Approval | Enabled by |\n| ------ | --------- | ---------- | ------------ |\n| **Memory** | Persistent, cross-session fact storage | No | `memory.enabled` (default on) |\n| **Schedule** | Cron-driven recurring/one-off tasks via the originating channel | Yes | `tools.schedule.enabled` |\n| **A2A_SubmitTask** | Submit a task to an A2A agent | Yes | A2A enabled |\n| **A2A_QueryAgent** | Query an A2A agent's capabilities | No | A2A enabled |\n| **A2A_QueryTask** | Check 
an A2A task's status | No | A2A enabled |\n\n\u003e **Approval** reflects the default policy. The global default is `tools.safety.require_approval: true`,\n\u003e so a tool is **No** only where it is explicitly exempt in code (read-only file/search tools, Memory,\n\u003e the plan/question tools, subagent reads, and computer-use). Bash is governed instead by the per-mode\n\u003e bash allow-list. Override any tool with `tools.\u003cname\u003e.require_approval`.\n\u003e\n\u003e **MCP tools** are not listed here - they are discovered and registered dynamically at runtime from your\n\u003e configured MCP servers and surface as `MCP_\u003cserver\u003e_\u003ctool\u003e` (see [MCP Integration](docs/mcp-integration.md)).\n\n**Tool Configuration:**\n\nTools can be enabled/disabled and configured individually:\n\n```bash\n# Enable/disable specific tools\ninfer config set tools.bash.enabled true\ninfer config set tools.write.enabled true\n\n# Configure tool settings\ninfer config set tools.grep.backend ripgrep\n# List values are comma-separated and replace the whole list\ninfer config set tools.web_fetch.allowed_domains \"example.com,github.com\"\n```\n\n**Customising Tool Descriptions:**\n\nThe description each tool exposes to the LLM is configurable in\n`.infer/prompts.yaml` under the `tools` key - useful when a model\nmisinterprets a default or when you want to nudge usage:\n\n```yaml\n# .infer/prompts.yaml\ntools:\n  Bash:\n    description: |-\n      Execute allowed bash commands securely. Only pre-approved\n      commands from the allowed list can be executed.\n  Read:\n    description: |-\n      Reads a file from the local filesystem. Always prefer reading\n      whole files unless the file is very large.\n```\n\nAny tool you omit falls back to the in-code default. Env-var override:\n`INFER_PROMPTS_TOOLS_\u003cUPPER_SNAKE_NAME\u003e_DESCRIPTION` (e.g.\n`INFER_PROMPTS_TOOLS_BASH_DESCRIPTION`). 
MCP tool descriptions are not\nconfigurable here - they come from the MCP server at runtime.\n\nSee [Tools Reference](docs/tools-reference.md) for complete documentation.\n\n## Configuration\n\nThe CLI uses a 2-layer configuration system (project and userspace config files) that environment variables and command-line flags can override.\n\n### Configuration Quick Start\n\nCreate a minimal configuration:\n\n```yaml\n# .infer/config.yaml\ngateway:\n  url: http://localhost:8080\n  docker: true  # Use Docker mode (or false for binary mode)\n\ntools:\n  enabled: true\n  bash:\n    enabled: true\n\nagent:\n  model: \"deepseek/deepseek-v4-pro\"\n  system_prompt: \"You are a helpful assistant\"  # Base identity\n  custom_instructions: \"\"  # Additional instructions appended to system prompt\n  max_turns: 50\n\nchat:\n  theme: tokyo-night\n```\n\n### Configuration Layers\n\nSettings are resolved in priority order; a higher layer overrides the ones below it:\n\n1. **Environment Variables** (`INFER_*`) - Highest priority\n2. **Command Line Flags**\n3. **Project Config** (`.infer/config.yaml`)\n4. **Userspace Config** (`~/.infer/config.yaml`)\n5. 
**Built-in Defaults** - Lowest priority\n\n**Example:**\n\n```bash\n# Set via environment variable (highest priority)\nexport INFER_AGENT_MODEL=\"openai/gpt-4\"\n\n# Or via config file\ninfer config set agent.model \"deepseek/deepseek-v4-pro\"\n\n# Or via command flag\ninfer chat --model \"anthropic/claude-4\"\n```\n\n### Key Configuration Options\n\n- **gateway.url** - Gateway URL (default: `http://localhost:8080`)\n- **gateway.docker** - Use Docker mode vs binary mode (default: `true`)\n- **tools.enabled** - Enable/disable all tools (default: `true`)\n- **agent.model** - Default model for agent operations\n- **agent.system_prompt** - Base identity for the agent (e.g., `\"You are a helpful assistant\"`)\n- **agent.custom_instructions** - Additional instructions appended after the system prompt\n- **agent.max_turns** - Maximum turns for agent sessions (default: `50`)\n- **chat.theme** - Chat interface theme (default: `tokyo-night`)\n- **chat.status_bar.enabled** - Enable/disable status bar (default: `true`)\n- **chat.status_bar.indicators** - Configure individual status indicators (all enabled by default except `max_output`)\n- **web.enabled** - Enable web terminal mode (default: `false`)\n- **web.port** - Web server port (default: `3000`)\n- **web.host** - Web server host (default: `localhost`)\n- **web.session_inactivity_mins** - Session timeout in minutes (default: `5`)\n\n### Environment Variables\n\nAll configuration can be set via environment variables with the `INFER_` prefix:\n\n```bash\nexport INFER_GATEWAY_URL=\"http://localhost:8080\"\nexport INFER_AGENT_MODEL=\"deepseek/deepseek-v4-pro\"\nexport INFER_TOOLS_BASH_ENABLED=true\nexport INFER_CHAT_THEME=\"tokyo-night\"\n\n# Web terminal configuration\nexport INFER_WEB_PORT=3000\nexport INFER_WEB_HOST=\"localhost\"\nexport INFER_WEB_SESSION_INACTIVITY_MINS=5\n```\n\n**Format:** `INFER_\u003cPATH\u003e` where dots become underscores.\nExample: `agent.model` → `INFER_AGENT_MODEL`\n\nFor complete configuration 
documentation, including all options and environment variables, see [Configuration Reference](docs/configuration-reference.md).\n\n## Cost Tracking\n\nThe CLI automatically tracks API costs based on token usage for all providers and models.\nCosts are calculated in real-time with support for both aggregate totals and per-model breakdowns.\n\n### Viewing Costs\n\nUse the `/cost` command in any chat session to see the cost breakdown:\n\n```bash\n# In chat, use the /cost shortcut\n/cost\n```\n\nThis displays:\n\n- **Total session cost** in USD\n- **Input/output costs** separately\n- **Per-model breakdown** when using multiple models\n- **Token usage** for each model\n\n**Status Bar**: Session costs are also displayed in the status bar (e.g., `💰 $0.0234`) if enabled.\n\n### Configuring Pricing\n\nThe CLI includes hardcoded pricing for 30+ models across all major providers\n(Anthropic, OpenAI, Google, DeepSeek, Groq, Mistral, Cohere, etc.).\nPrices are updated regularly to match current provider pricing.\n\nThe model picker groups models into three categories you can filter with the\n`[1] All` / `[2] Free` / `[3] Paid` / `[4] Pro` tabs:\n\n- **Free** - no per-token cost (e.g. 
local Ollama, Gemma).\n- **Paid** - billed per token at the listed `$input/$output per MTok` rate.\n- **Pro** - gated behind a paid **Pro subscription** (some Ollama Cloud models).\n  These have no per-token price but are not free, so they are marked\n  `pro subscription` instead of `free` to avoid the misleading label.\n\n**Override pricing** for specific models or add pricing for custom models:\n\n```yaml\n# .infer/config.yaml\npricing:\n  enabled: true\n  currency: \"USD\"\n  custom_prices:\n    # Override existing model pricing\n    \"openai/gpt-4o\":\n      input_price_per_mtoken: 2.50    # Price per million input tokens\n      output_price_per_mtoken: 10.00  # Price per million output tokens\n\n    # Add pricing for custom/local models\n    \"ollama/llama3.2\":\n      input_price_per_mtoken: 0.0\n      output_price_per_mtoken: 0.0\n\n    \"custom-fine-tuned-model\":\n      input_price_per_mtoken: 5.00\n      output_price_per_mtoken: 15.00\n\n    # Mark a model as Pro-subscription only (no per-token cost, but gated)\n    \"ollama_cloud/deepseek-v4-pro\":\n      input_price_per_mtoken: 0.0\n      output_price_per_mtoken: 0.0\n      requires_pro: true\n```\n\n\u003e **Note:** A custom entry fully replaces the default for that model. 
Omitting\n\u003e `requires_pro` in a custom override resets it to `false`, so re-state\n\u003e `requires_pro: true` if you override a model the CLI flags as Pro by default.\n\n**Via environment variables:**\n\n```bash\n# Disable cost tracking entirely\nexport INFER_PRICING_ENABLED=false\n\n# Override specific model pricing (use underscores in model names)\nexport INFER_PRICING_CUSTOM_PRICES_OPENAI_GPT_4O_INPUT_PRICE_PER_MTOKEN=3.00\nexport INFER_PRICING_CUSTOM_PRICES_OPENAI_GPT_4O_OUTPUT_PRICE_PER_MTOKEN=12.00\n\n# Hide cost from status bar\nexport INFER_CHAT_STATUS_BAR_INDICATORS_COST=false\n```\n\n**Status Bar Configuration:**\n\n```yaml\n# .infer/config.yaml\nchat:\n  status_bar:\n    enabled: true\n    indicators:\n      cost: true  # Show/hide cost indicator\n```\n\n### Cost Calculation\n\n- Costs are calculated as: `(tokens / 1,000,000) × price_per_million_tokens`\n- Prices are per million tokens (input and output priced separately)\n- Models without pricing data (Ollama, free tiers) show $0.00\n- Token counts use actual usage from providers or polyfilled estimates\n\n## Tool Approval System\n\nThe CLI includes a comprehensive approval system for sensitive tool operations, providing security and\nvisibility into what actions LLMs are taking.\n\n### How It Works\n\nWhen a tool requiring approval is executed:\n\n1. **Validation**: Tool arguments are validated\n2. **Approval Prompt**: User sees tool details with:\n   - Tool name and parameters\n   - Real-time diff preview (for file modifications)\n   - Approve/Reject/Auto-approve options\n3. 
**Execution**: Tool runs only if approved\n\n### Default Approval Requirements\n\nThe global default is `tools.safety.require_approval: true`, so **any tool not explicitly exempt requires\napproval**; override per tool with `tools.\u003cname\u003e.require_approval`.\n\n| Tool | Requires Approval | Reason |\n| ------ | ------------------- | --------- |\n| Write, Edit, MultiEdit, Delete | Yes | Create / modify / remove files |\n| Schedule, Agent | Yes | Side effects (scheduled jobs, spawned subprocesses) |\n| WebSearch, WebFetch | Yes | Make external requests (global default) |\n| A2A_SubmitTask | Yes | Dispatches work to another agent |\n| Bash | Optional | Governed by the per-mode bash allow-list |\n| Read, Grep, Tree | No | Read-only operations |\n| Memory, TodoWrite | No | Local agent state (explicitly exempt) |\n| Computer-use tools | No | Run silently in the background |\n| A2A_QueryAgent, A2A_QueryTask | No | Read-only A2A queries |\n\n### Approval Configuration\n\nConfigure approval globally or per tool:\n\n```bash\n# Global default: require approval for every tool that is not exempt\ninfer config set tools.safety.require_approval true\n# Per-tool overrides (same keys as the YAML below)\ninfer config set tools.write.require_approval true\ninfer config set tools.bash.require_approval false\n```\n\nOr via configuration file:\n\n```yaml\ntools:\n  safety:\n    require_approval: true  # Global default\n  write:\n    require_approval: true\n  bash:\n    require_approval: false  # Override for bash\n```\n\n### Approval UI Controls\n\n- **y / Enter** - Approve execution\n- **n / Esc** - Reject execution\n- **a** - Auto-approve (disables approval for the rest of the session)\n\n## Shortcuts\n\nThe CLI provides an extensible shortcuts system for quickly executing common commands with `/shortcut-name` syntax.\n\n**Subcommands:** Shortcuts can have subcommands for organized command groups (e.g., `/git status`,\n`/scm issues`). 
This allows related operations to be grouped under a single shortcut name with multiple\nactions.\n\n### Built-in Shortcuts\n\n**Conversation \u0026 session:**\n\n- `/new [title]` - Start a new conversation (optionally titled)\n- `/clear` - Save the current conversation and start a new one\n- `/compact` - Save the conversation and start a new session seeded with a summary\n- `/conversations` - Open the conversation selection dropdown\n- `/context` - Show context-window usage\n- `/cost` - Show session cost breakdown with per-model details\n- `/copy [text|markdown|json]` - Copy the conversation to the clipboard (aliases: `txt`, `md`)\n- `/model [model-name] [prompt]` - Switch model, or run a single prompt against a specific model then restore\n- `/theme` - Switch chat theme\n- `/voice [seconds]` - Record from the microphone and transcribe to the input with Whisper (requires `speech_to_text.enabled`)\n- `/help [shortcut]` - Show available shortcuts\n- `/exit` - Exit the chat session\n\n**Panels \u0026 views:**\n\n- `/diff` - Open the changes panel (interactive diff viewer)\n- `/explorer` - Open the file explorer (tree + fuzzy finder) - [Learn more →](docs/explorer.md)\n- `/tasks` - Show the A2A task-management interface (requires A2A) - [Learn more →](docs/tasks-management.md)\n- `/release-notes [version]` - Show release notes from GitHub Releases (latest, or a specific version)\n\n**Project setup:**\n\n- `/init` - Generate an `AGENTS.md` by analyzing the project\n- `/init-github-action` - Set up a GitHub Action via an interactive wizard\n\n**Git Shortcuts** (created by `infer init`):\n\n- `/git status` - Show working tree status\n- `/git commit` - Generate AI commit message from staged changes\n- `/git push` - Push commits to remote\n- `/git log` - Show commit logs\n\n**SCM Shortcuts** (GitHub integration):\n\n- `/scm issues` - List GitHub issues\n- `/scm issue \u003cnumber\u003e` - Show issue details\n- `/scm pr-create [context]` - Generate AI-powered PR 
plan\n\n**Other Shortcuts** (created by `infer init`):\n\n- `/mcp [list|add|remove|enable|disable]` - Manage MCP servers\n- `/shells` - List running and recent background shell processes\n- `/export` - Export the current conversation to markdown\n- `/env` - Generate a `.env.example` with all provider API keys\n- `/agents [list|add|remove|enable|disable]` - Manage A2A agents\n- `/skills [list|install|uninstall]` - Manage Agent Skills\n\n### AI-Powered Snippets\n\nCreate shortcuts that use LLMs to transform data:\n\n```yaml\n# .infer/shortcuts/custom-example.yaml\nshortcuts:\n  - name: analyze-diff\n    description: \"Analyze git diff with AI\"\n    command: bash\n    args:\n      - -c\n      - |\n        diff=$(git diff)\n        jq -n --arg diff \"$diff\" '{diff: $diff}'\n    snippet:\n      prompt: |\n        Analyze this diff and suggest improvements:\n        ```diff\n        {diff}\n        ```\n      template: |\n        ## Analysis\n        {llm}\n```\n\n### Custom Shortcuts\n\nCreate custom shortcuts by adding YAML files to `.infer/shortcuts/`:\n\n```yaml\n# .infer/shortcuts/custom-dev.yaml\nshortcuts:\n  - name: tests\n    description: \"Run all tests\"\n    command: go\n    args:\n      - test\n      - ./...\n\n  - name: build\n    description: \"Build the project\"\n    command: go\n    args:\n      - build\n      - -o\n      - infer\n      - .\n```\n\nUse with `/tests` or `/build`.\n\n**With Subcommands:**\n\n```yaml\n# .infer/shortcuts/custom-docker.yaml\nshortcuts:\n  - name: docker\n    description: \"Docker operations\"\n    command: docker\n    subcommands:\n      - name: build\n        description: \"Build Docker image\"\n        args:\n          - build\n          - -t\n          - myapp\n          - .\n      - name: run\n        description: \"Run Docker container\"\n        args:\n          - run\n          - -p\n          - \"8080:8080\"\n          - myapp\n```\n\nUse as: `/docker build` or `/docker run`.\n\nFor complete shortcuts documentation, 
including advanced features and examples, see [Shortcuts Guide](docs/shortcuts-guide.md).\n\n## Channels (Remote Messaging)\n\nControl the agent remotely from messaging platforms like Telegram or\nWhatsApp. Messages sent to a bot are forwarded to the agent, and the\nagent's responses are sent back through the same platform.\n\n### Setup (Telegram)\n\n**1. Create a Telegram bot** by messaging [@BotFather](https://t.me/BotFather) and sending `/newbot`. Copy the bot token.\n\n**2. Get your chat ID** by messaging your bot, then visiting:\n\n```text\nhttps://api.telegram.org/bot\u003cYOUR_TOKEN\u003e/getUpdates\n```\n\nFind `\"chat\":{\"id\":123456789}` in the response.\n\n**3. Configure** in `.infer/channels.yaml` (seeded by `infer init`):\n\n```yaml\n---\nenabled: true\n\ntelegram:\n  enabled: true\n  bot_token: \"${INFER_CHANNELS_TELEGRAM_BOT_TOKEN}\"\n  allowed_users:\n    - \"123456789\"\n  poll_timeout: 30\n```\n\nOr via environment variables:\n\n```bash\nexport INFER_CHANNELS_ENABLED=true\nexport INFER_CHANNELS_TELEGRAM_ENABLED=true\nexport INFER_CHANNELS_TELEGRAM_BOT_TOKEN=\"123456:ABC-DEF...\"\nexport INFER_CHANNELS_TELEGRAM_ALLOWED_USERS=\"123456789\"\n```\n\n**4. Start the channel listener:**\n\n```bash\ninfer channels-manager\n```\n\n**5. Send a message** to your bot in Telegram - the agent will respond.\n\nEach incoming message triggers `infer agent --session-id \u003cid\u003e` as a\nsubprocess with a persistent session per sender.\n\n### Tool Approval\n\nBy default, sensitive tools (Write, Edit, Delete, Bash) require user approval\nbefore executing. The agent sends an approval prompt to the channel and waits\nfor the user to reply \"yes\" or \"no\". Read-only tools (Read, Grep, Tree) execute\nwithout approval.\n\nThis reuses the existing `tools.*.require_approval` configuration. 
To disable, set in `.infer/channels.yaml`:\n\n```yaml\nrequire_approval: false  # default: true\n```\n\nOr: `INFER_CHANNELS_REQUIRE_APPROVAL=false`\n\nApprovals time out after 5 minutes and are automatically rejected.\n\n### Security\n\n- **Allowlist-only access**: Only chat IDs in `allowed_users` can interact with the agent\n- **Empty allowlist = reject all**: If no users are configured, all messages are rejected (secure by default)\n- **Per-channel allowlists**: Each channel (Telegram, WhatsApp) has its own independent allowlist\n- **Tool approval by default**: Sensitive tools require explicit user approval before executing\n- **Use environment variables** for tokens - never commit secrets to config files\n\n### Supported Channels\n\n| Channel  | Status    | Transport                   |\n| -------- | --------- | --------------------------- |\n| Telegram | Available | Long-polling (Bot API)      |\n| WhatsApp | Planned   | Webhook (Meta Business API) |\n\nFor a complete working example with Docker Compose, see [examples/telegram-channel](examples/telegram-channel/).\n\nFor detailed documentation including custom channel development, see [Channels Documentation](docs/channels.md).\n\n### Scheduled Tasks\n\nWhen the channels-manager daemon is running, you can ask the bot to schedule\nprompts on a cron schedule. The agent's response is delivered back through\nthe originating channel (e.g. the Telegram chat where you set it up).\n\n\u003e *\"Send me an inspiring quote every day at 8 AM\"* - recurring\n\u003e *\"Remind me at 6pm today to call mum\"* - one-off (deletes itself after firing)\n\nEnable in `.infer/config.yaml`:\n\n```yaml\ntools:\n  schedule:\n    enabled: true               # disabled by default\n    require_approval: true      # default; recommended\n```\n\nOr via env var: `INFER_TOOLS_SCHEDULE_ENABLED=true`.\n\nJobs are persisted as YAML in `~/.infer/schedules/\u003cid\u003e.yaml` and hot-reloaded\nby the daemon (no restart needed). 
The channel and recipient are derived\nautomatically from the session - the LLM never has to guess them.\n\nContainer deployments must set `TZ` to your zone (e.g. `TZ=Europe/Berlin`) so cron\nexpressions are interpreted in local time. The binary embeds the IANA zone\ndatabase so this works on any base image.\n\nFor the full guide, including the cron syntax primer and end-to-end Telegram\nwalkthroughs, see [Scheduling Documentation](docs/scheduling.md).\n\n## Heartbeat (Periodic Wake-Up)\n\nHeartbeat wakes the agent on a fixed interval - without any user input -\nso it can check for pending todos, background tasks, or anything else\nyour system prompt tells it to monitor. It runs alongside the scheduler\ninside the `infer channels-manager` daemon and is **disabled by default**.\n\nUnlike the [Schedule](docs/scheduling.md) tool (which the LLM uses to\ncreate user-driven cron jobs that deliver to a channel), heartbeat is a\nsingle global tick the operator configures once. Output goes to logs;\nthe agent itself decides whether to send a Telegram message, open a PR,\nor just no-op.\n\nEnable in `.infer/heartbeat.yaml` (seeded by `infer init`):\n\n```yaml\n---\nenabled: true\ninterval: 1h            # Go duration: 30s, 5m, 1h, 24h\ninitial_delay: 1m       # delay before first tick\nmodel: \"\"               # optional override; empty = agent.model\nprompt: \"Heartbeat tick - check for any pending tasks, todos, or background work and act on them.\"\n```\n\nThe **system prompt** for heartbeat runs lives in `.infer/prompts.yaml`\nunder `agent.system_prompt_heartbeat` so you can tune the agent's\nwake-up behaviour separately from chat-mode behaviour.\n\nThen start the daemon:\n\n```bash\ninfer channels-manager\n```\n\nHeartbeat alone is a valid run mode - you don't need any channel\nenabled to use it. 
The daemon hosts whichever of channels / scheduler /\nheartbeat are turned on.\n\nOr via env vars:\n\n```bash\nexport INFER_HEARTBEAT_ENABLED=true\nexport INFER_HEARTBEAT_INTERVAL=30m\n```\n\nFor the full guide, including configuration reference and common\npatterns (TODO sweeps, CI watchdogs), see [Heartbeat Documentation](docs/heartbeat.md).\n\n## Agent Skills\n\nAgent Skills are reusable, model-readable instruction folders. Each skill is a folder containing a\n`SKILL.md` file with YAML frontmatter (`name`, `description`) - the same contract used by the open\n`.agents/skills/` standard, so existing skill folders drop in unchanged. The agent discovers skills at\nstartup and loads a skill's full instructions on demand when they are relevant.\n\nSkills are scanned from three locations (highest precedence first; first match wins on a name collision):\n\n- `.infer/skills/\u003cname\u003e/SKILL.md` - project\n- `.agents/skills/\u003cname\u003e/SKILL.md` - open standard\n- `~/.infer/skills/\u003cname\u003e/SKILL.md` - user-global\n\nSkills are **enabled by default**; disable with `agent.skills.enabled=false` (or `INFER_AGENT_SKILLS_ENABLED=false`).\n\n```bash\ninfer skills list                          # Discover skills (works even when disabled)\ninfer skills install skill-creator         # Install from github.com/inference-gateway/skills\ninfer skills install acme/internal-comms   # Install from github.com/acme/skills\ninfer skills uninstall pdf                 # Remove a skill folder by name\n```\n\n`install` accepts a skill name, an `org/skill` pair, or a full GitHub tree URL; set `GITHUB_TOKEN`\n(or `GH_TOKEN`) to raise the rate limit and reach private repositories. See\n[docs/skills.md](docs/skills.md) for the authoring format.\n\n## Computer Use\n\nWhen enabled, the agent can control the desktop - move and click the mouse, scroll, type text and key\ncombinations, focus applications, and read screenshots. 
The display backend is detected automatically\nacross **macOS** (via a bundled Swift bridge), **X11**, and **Wayland**.\n\nComputer Use is **off by default**. Turn it on in `computer_use.yaml` (or `infer config set computer_use.enabled true`):\n\n```yaml\n# .infer/computer_use.yaml\nenabled: true\nrate_limit:\n  enabled: true\nscreenshot:\n  streaming_enabled: true   # also registers the GetLatestScreenshot tool\n```\n\nTools: `MouseMove`, `MouseClick`, `MouseScroll`, `KeyboardType`, `GetFocusedApp`, `ActivateApp`, and\n`GetLatestScreenshot`. They run silently in the background (bypassing the approval prompt) and are\ngoverned by `computer_use.enabled` plus the configured rate limits. On macOS an optional **floating\nprogress window** can mirror what the agent is doing. For a sandboxed desktop to drive, see\n[examples/computer-use](examples/computer-use/).\n\n## Persistent Memory\n\nThe agent keeps a durable, cross-session memory: individual Markdown **fact-files** under a global\ndirectory (`~/.infer/memory` by default), catalogued by a `MEMORY.md` index. The index is injected\ninto context at session start, and the agent reads or writes individual facts on demand through the\n`Memory` tool. A session reminder nudges it to consult and keep memory up to date.\n\nMemory is **enabled by default**. 
Configure it in `memory.yaml`:\n\n```yaml\n# .infer/memory.yaml\nenabled: true\ndir: \"\"           # \"\" =\u003e ~/.infer/memory\nmax_chars: 4000   # cap on the injected MEMORY.md index\n```\n\nTurn it off with `memory.enabled=false` (or `INFER_MEMORY_ENABLED=false`); the memory-consult\nreminder below is pruned automatically when memory is disabled.\n\n## Reminders \u0026 Command Hooks\n\nTwo lightweight extension points fire at fixed **agent-loop hook points** - `pre_session`,\n`pre_stream`, `post_stream`, `pre_tool`, `post_tool`, `pre_queue_drain`, `post_queue_drain`, and\n`post_session`:\n\n- **Reminders** (`reminders.yaml`) inject a `\u003csystem-reminder\u003e` text block at a hook point, gated by a\n  trigger (`always`, `interval`, `turns_before_max`, or `once`). Reminders ship **enabled** with two\n  defaults: `todo-hygiene` (nudges the agent to keep a todo list) and `memory-consult` (points it at\n  the memory index; auto-pruned when memory is off).\n- **Command Hooks** (`hooks.yaml`) run a shell command at a hook point - the executable sibling of\n  reminders. 
They are **off by default**; each command still faces the per-mode bash allow-list when\n  the agent runs it, so allow-list the command and set `enabled: true` to turn hooks on.\n\n```yaml\n# .infer/hooks.yaml\nenabled: true\nhooks:\n  - name: gofmt\n    hook: post_session\n    command: \"gofmt -w .\"\n    timeout: 30   # seconds; 0 -\u003e default 30\n```\n\n## Global Flags\n\n- `-v, --verbose`: Enable verbose output\n- `--config \u003cpath\u003e`: Specify custom config file path\n\n## Examples\n\n### Docker Compose Examples\n\nEach directory under [`examples/`](examples/) is a self-contained, runnable setup with its own README\nand Docker Compose file:\n\n| Example | Demonstrates |\n| --------- | -------------- |\n| [basic](examples/basic/) | Minimal gateway + CLI setup to get started |\n| [a2a](examples/a2a/) | Agent-to-Agent: multiple agents, a demo site, and a VNC container |\n| [mcp](examples/mcp/) | MCP server integration with a sample server and config |\n| [computer-use](examples/computer-use/) | Computer Use driving a sandboxed Ubuntu GUI container |\n| [model-switching](examples/model-switching/) | Switching models mid-session, with a small frontend |\n| [shortcuts](examples/shortcuts/) | Custom `/`-shortcuts wired through config |\n| [web-terminal](examples/web-terminal/) | Browser-based, multi-tab web terminal |\n| [telegram-channel](examples/telegram-channel/) | Driving the agent from a Telegram channel |\n\n### Basic Workflow\n\n```bash\n# Initialize project\ninfer init\n\n# Start interactive chat\ninfer chat\n\n# Execute autonomous task\ninfer agent \"Fix the bug in issue #42\"\n\n# Check gateway status\ninfer status\n```\n\n### Working on a GitHub Issue\n\n```bash\n# Start chat\ninfer chat\n\n# In chat, use shortcuts to get context\n/scm issue 123\n\n# Discuss with AI, let it use tools to:\n# - Read files\n# - Search codebase\n# - Make changes\n# - Run tests\n\n# Generate PR plan when ready\n/scm pr-create Fixes the authentication timeout 
issue\n```\n\n### Configuration Example\n\n```bash\n# Set default model\ninfer config set agent.model \"deepseek/deepseek-v4-pro\"\n\n# Enable bash tool\ninfer config set tools.bash.enabled true\n\n# Configure web search\ninfer config set tools.web_search.enabled true\n\n# Check current configuration\ninfer config get\n```\n\n### Web Terminal Example\n\n```bash\n# Start web terminal server\ninfer chat --web\n\n# Open browser to http://localhost:3000\n# Click \"+\" to create new terminal tabs\n# Each tab is an independent chat session\n\n# Custom port for remote access\ninfer chat --web --port 8080 --host 0.0.0.0\n\n# Configure via config file\ncat \u003e .infer/config.yaml \u003c\u003cEOF\nweb:\n  enabled: true\n  port: 3000\n  host: \"localhost\"\n  session_inactivity_mins: 10  # Auto-cleanup after 10 minutes\nEOF\n\ninfer chat --web  # Uses config file settings\n```\n\n**Use Cases:**\n\n- Remote access to CLI from any device\n- Multiple parallel chat sessions in browser tabs\n- Team collaboration with shared terminal access\n- Persistent sessions with automatic cleanup\n\n## Development\n\nFor development, use [Task](https://taskfile.dev) for build automation:\n\n```bash\ntask build # Build binary\ntask test  # Run tests\ntask fmt   # Format code\ntask lint  # Run linter\n```\n\nSee [CLAUDE.md](CLAUDE.md) for detailed development documentation.\n\n## License\n\nApache 2.0 License - see [LICENSE](LICENSE) file for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finference-gateway%2Fcli","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Finference-gateway%2Fcli","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finference-gateway%2Fcli/lists"}