{"id":50718213,"url":"https://github.com/nujovich/hermes-telemetry","last_synced_at":"2026-06-09T21:00:46.673Z","repository":{"id":361589367,"uuid":"1254916915","full_name":"nujovich/hermes-telemetry","owner":"nujovich","description":"Budget enforcement + observability plugin for Hermes Agent. Stops runaway costs before they happen.","archived":false,"fork":false,"pushed_at":"2026-06-07T18:50:44.000Z","size":801,"stargazers_count":7,"open_issues_count":5,"forks_count":2,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-07T20:20:36.429Z","etag":null,"topics":["agent-telemetry","ai-cost-tracking","budget-enforcement","hermes-agent","hermes-plugin","llm-budget","llm-observability","token-tracking"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nujovich.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-31T06:51:19.000Z","updated_at":"2026-06-07T18:50:49.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/nujovich/hermes-telemetry","commit_stats":null,"previous_names":["nujovich/hermes-telemetry"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/nujovich/hermes-telemetry","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nujovich%2Fhermes-telemetry","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nujovich%2Fhermes-telemetry/tags","rele
ases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nujovich%2Fhermes-telemetry/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nujovich%2Fhermes-telemetry/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nujovich","download_url":"https://codeload.github.com/nujovich/hermes-telemetry/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nujovich%2Fhermes-telemetry/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34125332,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-09T02:00:06.510Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agent-telemetry","ai-cost-tracking","budget-enforcement","hermes-agent","hermes-plugin","llm-budget","llm-observability","token-tracking"],"created_at":"2026-06-09T21:00:29.327Z","updated_at":"2026-06-09T21:00:46.646Z","avatar_url":"https://github.com/nujovich.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# hermes-telemetry ☤\n\n\u003e *Observability + budget guardrails for [Hermes Agent](https://github.com/NousResearch/hermes-agent)*\n\n**Budget enforcement + observability for Hermes Agent. 
The only plugin that can stop a run before it overspends.**\n\nA comprehensive telemetry plugin that captures real usage data, enforces budget limits, and provides detailed cost analysis for AI agent operations. Built for the [Hermes Agent Challenge](https://dev.to/devteam/join-the-hermes-agent-challenge-1000-in-prizes-13cd) by [Nadia Ujovich](https://nadiaujovich.dev).\n\n**The differentiator: it can _stop_ work that's about to overspend — not just report it after the fact.** Set a daily cap below current spend, and the next cron run is blocked by the budget:\n\n![Budget enforcement demo: a $0.001 daily global cap is set, current spend already exceeds it, and the next marketing cron run is blocked by the resulting hard breach](docs/budget_enforcement.gif)\n\n*`/budget set global daily 0.001` writes the cap to `budget.yaml`; current spend ($0.0102) already exceeds it, so `/budget` re-renders at 1020% `[daily]` — a hard breach — and the next marketing cron run is blocked by the budget.*\n\n[![Hermes Agent](https://raw.githubusercontent.com/NousResearch/hermes-agent/HEAD/assets/banner.png)](https://raw.githubusercontent.com/NousResearch/hermes-agent/HEAD/assets/banner.png)\n\n[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://camo.githubusercontent.com/08cef40a9105b6526ca22088bc514fbfdbc9aac1ddbf8d4e6c750e3a88a44dca/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4c6963656e73652d4d49542d626c75652e737667) [![Tests: 94 passing](https://img.shields.io/badge/Tests-94%20passing-green.svg)](https://camo.githubusercontent.com/89bc4bc6079d0e919e0c1363852fe900e05cb49429800097aa3ca83908c5cd59/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f54657374732d393425323070617373696e672d677265656e2e737667) [![Provider 
Support](https://img.shields.io/badge/Providers-OpenRouter%20%7C%20OpenAI%20%7C%20Anthropic-orange.svg)](https://camo.githubusercontent.com/cf0938e4acec0cd17c14dcf61a72734ffd03e8fff8eb44e359994f6ea773bfad/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f50726f7669646572732d4f70656e526f757465722532302537432532304f70656e4149253230253743253230416e7468726f7069632d6f72616e67652e737667) [![Challenge Entry](https://img.shields.io/badge/Hermes%20Agent-Challenge%20Entry-purple.svg)](https://camo.githubusercontent.com/d0c993fdf35127e435629279025d4b1892e351f5e04ce1547329686aa4223366/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4865726d65732532304167656e742d4368616c6c656e6765253230456e7472792d707572706c652e737667)\n\n-----\n\nHermes Agent runs autonomously — across sessions, platforms, and cron jobs — which \nmeans it can keep spending even when you're not watching.  \n**hermes-telemetry lives inside the runtime** and enforces hard budget limits before \nthe next LLM call is made.\n\n\u003e This plugin addresses [NousResearch/hermes-agent#6642](https://github.com/NousResearch/hermes-agent/issues/6642) — \n\u003e the open feature request for a first-class telemetry and budget subsystem for Hermes Agent.\n\n\n```\nYour Hermes session\n  ↓ every API call\nhermes-telemetry (native plugin)\n  → tracks tokens + cost in real time\n  → enforces budget limits mid-session\n  → logs to SQLite with WAL mode\n  → syncs OpenRouter pricing automatically\n  ↓ if budget OK\nLLM provider\n```\n\n\u003e **Not a log reader.** TokenTelemetry and similar tools read what already happened.\n\u003e hermes-telemetry hooks into the Hermes runtime and can *stop* what’s about to happen.\n\n-----\n\n**Design principle:** observability is invisible to the model. Everything goes through hooks. 
The only user-facing surfaces are `/stats` and `/budget`.\n\n-----\n\n## Table of Contents\n\n- [Screenshots](#screenshots)\n  - [Dashboard (Web UI)](#dashboard-web-ui)\n  - [Slash Commands](#slash-commands)\n- [What It Measures](#what-it-measures)\n- [Installation](#installation)\n- [Quick Start](#quick-start)\n- [Setup Wizard](#setup-wizard)\n- [Dashboard (Web UI)](#dashboard-web-ui-1)\n  - [Auto-Refresh](#auto-refresh)\n  - [Features](#features)\n- [Slash Commands](#slash-commands-1)\n  - [/stats](#stats)\n  - [/budget](#budget)\n- [Configuration](#configuration)\n  - [pricing.yaml](#pricingyaml)\n  - [budget.yaml](#budgetyaml)\n- [Pricing Auto-Refresh](#pricing-auto-refresh)\n  - [How It Works](#how-it-works)\n  - [Estimated-Price Models](#estimated-price-models)\n  - [CLI Usage](#cli-usage)\n- [Architecture](#architecture)\n  - [Hook Pipeline](#hook-pipeline)\n  - [Database Schema](#database-schema)\n  - [Concurrency Model](#concurrency-model)\n- [Budget Enforcement](#budget-enforcement)\n  - [How It Works](#how-it-works-1)\n  - [Enforcement Levels](#enforcement-levels)\n  - [Estimated Data and Budget Degradation](#estimated-data-and-budget-degradation)\n- [Provider Probe: Verifying Your Provider](#provider-probe-verifying-your-provider)\n- [Proof of Concept](#proof-of-concept)\n  - [Setup](#setup)\n  - [Pricing Capture](#pricing-capture)\n  - [Budget Enforcement Test](#budget-enforcement-test)\n  - [Cron Job Cost Comparison](#cron-job-cost-comparison)\n  - [Results Summary](#results-summary)\n- [Comparison](#comparison)\n- [Running Tests](#running-tests)\n- [Data Location](#data-location)\n- [Known Limitations](#known-limitations)\n- [Troubleshooting](#troubleshooting)\n- [License](#license)\n- [Hermes Agent Challenge](#hermes-agent-challenge)\n\n-----\n\n## Screenshots\n\n### Dashboard (Web UI)\n\nA standalone HTML dashboard for users who prefer a visual interface over slash commands. 
Served locally, reads directly from the telemetry SQLite database.\n\n[![Dashboard overview](https://github.com/nujovich/hermes-telemetry/raw/main/docs/screenshots/dashboard-overview.png)](https://github.com/nujovich/hermes-telemetry/blob/main/docs/screenshots/dashboard-overview.png)\n\n*The dashboard auto-refreshes every 30 seconds. Shows sessions, API calls, tokens, cost, budget status, daily cost trends, top tools, cost by cron job, provider distribution, and recent sessions.*\n\n### Slash Commands\n\n#### `/stats` — Session analytics\n\n[![Stats output](https://github.com/nujovich/hermes-telemetry/raw/main/docs/screenshots/stats-output.png)](https://github.com/nujovich/hermes-telemetry/blob/main/docs/screenshots/stats-output.png)\n\n#### `/budget` — Current spending vs limits\n\n[![Budget output](https://github.com/nujovich/hermes-telemetry/raw/main/docs/screenshots/budget-output.png)](https://github.com/nujovich/hermes-telemetry/blob/main/docs/screenshots/budget-output.png)\n\n#### `/stats cron week` — Cron job cost breakdown\n\n[![Cron output](https://github.com/nujovich/hermes-telemetry/raw/main/docs/screenshots/cron-output.png)](https://github.com/nujovich/hermes-telemetry/blob/main/docs/screenshots/cron-output.png)\n\n#### `/stats providers` — Real vs estimated usage + estimated-price warning\n\n[![Providers output](https://github.com/nujovich/hermes-telemetry/raw/main/docs/screenshots/providers-output.png)](https://github.com/nujovich/hermes-telemetry/blob/main/docs/screenshots/providers-output.png)\n\n-----\n\n## What It Measures\n\n|Metric                                   |Source                         |Real or Estimated       |\n|-----------------------------------------|-------------------------------|------------------------|\n|Tokens in / out per API call             |`post_api_request.usage`       |✅ Real (from provider)  |\n|Cache read / write tokens                |`post_api_request.usage`       |✅ Real (from provider)  |\n|Reasoning tokens     
                    |`post_api_request.usage`       |✅ Real (from provider)  |\n|API call latency                         |`post_api_request.api_duration`|✅ Real (ms)             |\n|Tool call latency \u0026 success/failure      |`post_tool_call`               |✅ Real                  |\n|Session / cron job wall time             |`started_at` → `ended_at`      |✅ Real                  |\n|Model \u0026 provider name                    |`post_api_request`             |✅ Real                  |\n|Platform (cli / cron / telegram / …)     |`on_session_start.platform`    |✅ Real                  |\n|Cron job ID                              |Parsed from `session_id`       |✅ Real                  |\n|Subagent invocation count                |`subagent_stop` hook           |✅ Real (proxy)          |\n|**Cost (USD)**                           |Local pricing table × tokens   |⚠️ **Estimated**         |\n|Tokens when provider returns `usage=None`|Fallback approximation         |⚠️ **Estimated, flagged**|\n\nCost is always an **estimate** computed from a locally-maintained pricing table. No external pricing API is called. When the provider returns no usage data, tokens are estimated from a pre-request approximation + response length and the row is flagged as `estimated=1`, so `/stats` and `/budget` show a `~` prefix and an “estimated data” percentage.\n\n-----\n\n## Installation\n\nHermes plugins are **opt-in** — you must both install and enable the plugin.\n\n### Option A: Install from GitHub\n\n```\nhermes plugins install nujovich/hermes-telemetry\nhermes plugins enable hermes-telemetry\n```\n\n### Option B: Manual install\n\n```\ngit clone https://github.com/nujovich/hermes-telemetry ~/.hermes/plugins/hermes-telemetry\nhermes plugins enable hermes-telemetry\n```\n\n**Important:** restart the Hermes gateway after enabling:\n\n```\nhermes gateway restart\n```\n\n\u003e **Note:** Plugin changes only take effect after a gateway restart. 
The gateway loads the plugin registry at startup. If you enable a plugin and cron jobs don’t appear in `/stats cron week`, this is the most likely cause.\n\n-----\n\n## Quick Start\n\n1. Install and enable the plugin (see above)\n1. Restart the gateway\n1. Run any session, then type `/stats` to see captured data\n1. Optionally configure `pricing.yaml` and `budget.yaml` (see below)\n\nThat’s it. The plugin captures data automatically — no agent action required.\n\n-----\n\n## Setup Wizard\n\nhermes-telemetry includes a first-time setup wizard that runs automatically on first\nplugin load when `pricing.yaml` and/or `budget.yaml` are missing. It can also be\ntriggered manually at any time with the `/setup` slash command.\n\n### Auto-setup (first load)\n\nOn first load, if either config file is missing, the plugin auto-generates defaults:\n\n- **Pricing:** fetches all models with fixed pricing from the OpenRouter API and merges\n  them with ~30 built-in defaults (Anthropic, OpenAI, DeepSeek, Google, Meta, Nous).\n  New prices take effect immediately — no gateway restart needed.\n- **Budget:** writes a conservative global budget (`$5.00/day`, `$100.00/month`) with\n  an 80% soft warning and 100% hard cap.\n\n### `/setup` slash command\n\nUse `/setup` to check configuration status or reconfigure individual files.\n\n```\n/setup                     → show current status (which files exist)\n/setup pricing auto        → built-in defaults + fetch from OpenRouter API\n/setup pricing minimal     → built-in defaults only (~30 models, no network)\n/setup pricing skip        → skip (unrecognized models will record $0.00 cost)\n/setup budget default      → recommended global budget ($5/day, $100/month)\n/setup budget custom       → instructions for setting your own limits manually\n/setup budget skip         → no enforcement (costs still tracked)\n```\n\n#### Pricing options\n\n| Option | Models | Network |\n|--------|--------|---------|\n| `auto` | ~30 built-in + all OpenRouter 
fixed-price models | Yes (OpenRouter API) |\n| `minimal` | ~30 built-in only | No |\n| `skip` | None — models will record `$0.00` cost | No |\n\n#### Budget options\n\n| Option | Behavior |\n|--------|----------|\n| `default` | Global: `$5.00/day`, `$100.00/month`. Soft warning at 80%, hard block at 100% |\n| `custom` | Prints the `/budget set` commands for manual configuration |\n| `skip` | Costs tracked but never enforced |\n\n### Re-running setup\n\nSetup skips files that already exist. To reconfigure:\n\n```bash\n# Reprice from scratch\nrm ~/.hermes/telemetry/pricing.yaml\n/setup pricing auto\n\n# Reset budget\nrm ~/.hermes/telemetry/budget.yaml\n/setup budget default\n```\n\n\u003e **Note:** Pricing changes take effect immediately without a gateway restart. Budget\n\u003e changes require a restart.\n\n-----\n\n## Slash Commands\n\n### `/stats`\n\n```\n/stats                  → last 24h summary (sessions, tokens, cost, top tools)\n/stats today            → same as /stats\n/stats week             → last 7 days\n/stats month            → last 30 days\n/stats cron             → breakdown by cron_job_id (last 7 days)\n/stats cron week        → cron breakdown, last 7 days\n/stats cron month       → cron breakdown, last 30 days\n/stats cron today       → cron breakdown, last 24 hours\n/stats providers        → per-provider: real vs estimated calls + cost (last 24h)\n/stats providers week   → provider breakdown, last 7 days\n/stats models           → per-model breakdown within each provider (last 24h)\n/stats models week      → per-model breakdown, last 7 days\n/stats raw [N]          → last N raw run records (default 20, max 200)\n```\n\n**Example output (`/stats`):**\n\n```\nhermes-telemetry — last 24 h\n============================================\n  Sessions      : 14\n  Success rate  : 92.9%  (ok=13, failed=1)\n  API calls     : 47\n  Tool calls    : 183\n  Tokens in     : 1,240,500\n  Tokens out    : 87,300\n  Cost (est.)   
: $0.004822\n  Avg latency   : 1.2s\n  Avg duration  : 48.3s\n\n  Top tools:\n  Tool                            Calls  Failures   Avg ms\n  --------------------------------------------------------\n  read_file                          92         0      12ms\n  terminal                           51         3     340ms\n  write_file                         28         0      18ms\n```\n\n**Example output (`/stats cron week`):**\n\n```\nhermes-telemetry — cron jobs (last 7 days)\n========================================================================\n  Job ID               Runs    OK  Fail     Tok-in    Tok-out         Cost   Avg dur\n  --------------------------------------------------------------------------\n  09dd0c24f29b            3     3     0   892,341    12,405    $0.314378     2.1m\n  d68c2728b513            1     1     0   445,119     8,200    $2.225595     4.7m\n```\n\n**Example output (`/stats providers`):**\n\n```\nhermes-telemetry — providers (last 24 h)\n========================================================================\n  Provider                     Calls   Real   Est   Est%         Cost\n  -------------------------------------------------------------------\n  openrouter                      66     66      0     0%    $0.916782\n\n  Est% = share of calls where the provider returned no usage data\n  (tokens estimated locally).\n  If Est% \u003e 0 for your main provider, budget hard-verdicts may be\n  degraded to soft under on_estimated.mode: warn_only.\n```\n\n**Example output (`/stats models`):**\n\n```\nhermes-telemetry — models (last 24 h)\n================================================================================================\n  Provider             Model                                           Calls   Real   Est         Cost\n  ----------------------------------------------------------------------------------------------\n  openrouter           owl-alpha                                          66     66     0    $0.000000\n  
openrouter           anthropic/claude-sonnet-4-6                        42     42     0    $0.314378\n  openrouter           anthropic/claude-opus-4-7                           8      8     0    $2.225595\n\n  Rows are grouped by provider, then by calls (desc). A model showing $0.00 has no price entry\n  in pricing.yaml — run /setup pricing auto to refresh, or add it manually.\n```\n\nBreaks each provider's spend down to individual models. Rows are grouped by provider (ascending), then ordered by call count within each provider; the `Model` column is kept wide so dated model keys stay readable. Columns: `Calls` (total), `Real` (calls with provider-reported usage), `Est` (calls with locally estimated tokens), and `Cost`. A model showing `$0.000000` has no price entry in `pricing.yaml`.\n\n### `/budget`\n\n```\n/budget                             → status of every scope (spent / limit / %)\n/budget cron                        → per-cron-job budgets, with soft/hard flags\n/budget set global daily 5.00       → set or raise a limit (persists + hot-reloads)\n/budget set cron_job daily 1.00     → set default per-cron-job limit\n/budget set sender daily 2.00       → set default per-sender limit\n```\n\n**Example output (`/budget`):**\n\n```\nhermes-telemetry — budget status\n============================================================\n  global                       $   0.1812 / $    2.00      9%  [daily]\n\n  Legend:  (blank)=ok  !=soft (≥80%)  █=hard (≥100%)  ~est=estimated data\n```\n\n**Status flags:**\n\n|Flag   |Meaning                                                    |\n|-------|-----------------------------------------------------------|\n|(blank)|Within budget (`\u003c 80%`)                                    |\n|`!`    |Soft warning (≥ 80%) — notice injected into conversation   |\n|`█`    |Hard breach (≥ 100%) — tool calls blocked, cron jobs paused|\n|`~est` |Verdict based partly on estimated (usage=None) data        |\n\n-----\n\n## Dashboard (Web UI)\n\nA 
standalone HTML dashboard for users who prefer a visual interface over slash commands. Zero dependencies — uses only Python stdlib.\n\n### Auto-Refresh\n\nThe dashboard auto-refreshes every 30 seconds. No manual reload needed.\n\n### Features\n\n- **Summary cards**: Sessions, OK/failed, API calls, tokens in, cost\n- **Budget bar**: Real-time spend vs limit with progress indicator\n- **Daily cost chart**: 7-day line chart of spending\n- **Top tools chart**: Bar chart of most-used tools\n- **Cost by cron job**: Per-job cost breakdown\n- **Provider distribution**: Donut chart (nous / openrouter / anthropic)\n- **Cron jobs table**: Runs, tokens, cost, avg duration, last run\n- **Recent sessions table**: All sessions with platform, model, status, cost\n- **Time range selector**: Last 24h / 7 days / 30 days\n\n### Usage\n\n```\ncd ~/.hermes/plugins/hermes-telemetry/dashboard\npython3 serve.py                  # http://localhost:8765 (loopback only)\npython3 serve.py --port 9090      # custom port, still loopback\npython3 serve.py 9090             # positional port (back-compat)\n```\n\nThen open `http://localhost:8765` in your browser.\n\n### Accessing the dashboard from another host\n\nThe dashboard has **no authentication** — anyone who can reach the port sees\nevery captured token, cost, and tool-call detail. 
By default it binds to\n`127.0.0.1`, which is unreachable from other machines.\n\nIf your Hermes server is headless (Pi, VPS, NAS) and you browse from a laptop,\ntwo options:\n\n**Recommended — SSH tunnel** (no server-side change, leaves the safe default in\nplace):\n\n```bash\n# Start the dashboard on the server as usual\nssh server \"cd ~/.hermes/plugins/hermes-telemetry/dashboard \u0026\u0026 python3 serve.py \u0026\"\n\n# Tunnel from your client\nssh -L 8765:localhost:8765 -N server \u0026\n\n# Browse on the client\nopen http://localhost:8765\n```\n\n**Trusted-LAN shortcut — `--host 0.0.0.0`:**\n\n```bash\npython3 serve.py --host 0.0.0.0\n```\n\nThe script prints a warning when binding to any non-loopback interface. Only\nuse this on a network where you trust every host. **Do not expose to the\npublic internet or to networks that include untrusted hosts** — the dashboard\nships without an auth layer by design (see CONTRIBUTING.md if you want to add\none).\n\n-----\n\n## Configuration\n\nConfiguration lives in `~/.hermes/telemetry/`:\n\n```\n~/.hermes/telemetry/\n├── telemetry.db      ← SQLite database (WAL mode)\n├── telemetry.log     ← plugin log (errors / debug)\n├── pricing.yaml      ← optional pricing overrides\n└── budget.yaml       ← optional spend budgets\n```\n\nIf these files don’t exist, the plugin still works — it just uses defaults (all models at $0.00, budgets disabled).\n\n### `pricing.yaml`\n\nOverride model prices in USD per 1 million tokens. 
Without overrides, unknown models log a one-time warning and record cost as `$0.00`.\n\n**Full format:**\n\n```yaml\nmodels:\n  # Free model\n  \"openrouter/owl-alpha\":\n    input: 0.00\n    output: 0.00\n\n  # Paid model with full cache/reasoning split\n  \"openrouter/anthropic/claude-sonnet-4-6\":\n    input: 3.00\n    output: 15.00\n    cache_read: 0.30\n    cache_write: 3.75\n    reasoning: 15.00\n\n  # Minimal override (cache prices derived from multipliers)\n  \"openrouter/anthropic/claude-opus-4-7\":\n    input: 5.00\n    output: 25.00\n\ndefaults:\n  cache_read_multiplier: 0.10   # cache_read = input * 0.10 if not specified\n  cache_write_multiplier: 1.25  # cache_write = input * 1.25 if not specified\n```\n\n**Matching rules (in order):**\n\n1. Exact match (case-insensitive) against `models:` keys in your YAML\n1. Exact match against the built-in pricing table (~35 models)\n1. Longest-prefix match (e.g. `claude-sonnet` matches `claude-sonnet-4-6-future`)\n1. Unknown → `$0.00` with a one-time warning in `telemetry.log`\n\nThe built-in table covers: Anthropic (Claude 3/4 family), OpenAI (GPT-4o, GPT-4, o1, o3, o4), DeepSeek, Gemini, Llama, and Hermes models. Prices sourced from official provider pages (May 2026).\n\n### `budget.yaml`\n\nConfigure spend guardrails. 
No file → budgets disabled.\n\n```yaml\nbudgets:\n  global:\n    daily_usd: 2.00\n    monthly_usd: 50.00\n  per_cron_job:\n    default:\n      daily_usd: 1.00\n    overrides:\n      daily_email_report:\n        daily_usd: 3.00\n  per_sender:\n    default:\n      daily_usd: 2.00\n    overrides:\n      premium_user_123:\n        daily_usd: 5.00\n\nthresholds:\n  soft_pct: 0.80    # warn at 80% of limit\n  hard_pct: 1.00    # enforce at 100%\n\non_estimated:\n  mode: enforce     # warn_only | enforce\n```\n\n**Scope resolution:**\n\n|Scope         |How spend is calculated                                      |\n|--------------|-------------------------------------------------------------|\n|`global`      |All sessions + all cron jobs combined                        |\n|`per_cron_job`|Sessions where `cron_job_id` matches (excludes subagent cost)|\n|`per_sender`  |Sessions from a specific sender (multi-user gateways)        |\n\n**Window math:** daily and monthly windows are computed in the user’s local timezone. 
A cron job that runs at 11:59 PM and another at 12:01 AM count against different daily windows.\n\n-----\n\n## Pricing Auto-Refresh\n\nThe plugin can automatically fetch model pricing from OpenRouter’s public API, eliminating the need to manually maintain `pricing.yaml` for hundreds of models.\n\n### How It Works\n\n- **Source**: OpenRouter public API (`https://openrouter.ai/api/v1/models`) — no auth required\n- **Frequency**: Once per 24 hours (tracked via sentinel file)\n- **Trigger**: Automatically on plugin load (gateway startup), or manually via CLI\n- **Merge strategy**:\n  - User overrides in `pricing.yaml` are **always preserved** — manual entries take priority over auto-fetched ones\n  - New models from the API are added automatically\n  - Previously auto-fetched models are updated when prices change\n  - Models are tagged with `_auto: true` and `_source: openrouter` for traceability\n\n### Estimated-Price Models\n\nSome OpenRouter models have no fixed pricing (e.g. `auto` routing, experimental models). 
These are represented with negative prices in the API.\n\nThe plugin handles these safely:\n\n- Prices are normalized to `$0.00` (they don’t inflate cost calculations)\n- Flagged with `_estimated_price: true` in `pricing.yaml`\n- The budget engine detects when spend uses these models\n\n**Budget degradation logic:**\n\n|Condition                               |Effect                                                                                                                               |\n|----------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------|\n|`on_estimated.mode: warn_only` (default)|If \u003e0% of calls use estimated-price models, **hard verdicts are degraded to soft** — the user gets a warning but tools aren’t blocked|\n|`on_estimated.mode: enforce`            |Hard verdicts take effect regardless                                                                                                 |\n\n### CLI Usage\n\n```\n# Dry run — see what would change\npython -m hermes_telemetry.pricing_refresh --check\n\n# Apply changes\npython -m hermes_telemetry.pricing_refresh\n\n# Verbose output\npython -m hermes_telemetry.pricing_refresh --verbose\n```\n\n**Example output:**\n\n```\nINFO OpenRouterSource: fetched 320 models\nUpdated 3 model(s):\n\n  ~ stepfun/step-3.7-flash  (openrouter)\n      input: 0.9999 → 0.2000\n      output: 9.9999 → 1.1500\n\n  + anthropic/claude-opus-4.8  (openrouter)\n      input=5.0000 output=25.0000\n\n  ⚠  Model(s) with estimated pricing: openrouter/auto, openrouter/bodybuilder, openrouter/pareto-code\n```\n\n### Extending with New Sources\n\nAdd new pricing providers by subclassing `PricingSource`:\n\n```python\nfrom hermes_telemetry.pricing_refresh import PricingSource, register_source\n\nclass AnthropicSource(PricingSource):\n    name = \"anthropic\"\n\n    def fetch(self) -\u003e dict[str, dict]:\n        # Fetch 
from Anthropic's pricing page or API\n        ...\n\nregister_source(AnthropicSource)\n```\n\nSources are registered in `pricing_refresh.py` and fetched in parallel on each refresh cycle.\n\n-----\n\n## Architecture\n\n### Hook Pipeline\n\nThe plugin registers 10 hooks (out of 16 available in Hermes) plus 2 slash commands:\n\n```\nHook                      Purpose\n─────────────────────────────────────────────────────────────\non_session_start          Create run row, extract cron_job_id\npre_api_request           Stash approx_input_tokens for fallback\npost_api_request          PRIMARY: record tokens, cost, latency\npost_tool_call            Record tool name, success, duration\npost_llm_call             Refresh session end timestamp\nsubagent_stop             Record delegate_task proxy on parent\non_session_end            Set final status (ok/error/interrupted)\non_session_finalize       Safety net: ensure run is closed\npre_llm_call              Soft budget alerts + capture sender_id\npre_tool_call             Hard budget enforcement (tool-gate)\n```\n\n**Why `post_api_request` is the primary hook for tokens:** The Hermes conversation loop can make multiple API calls per turn (retries, reasoning models, tool calls). Only `post_api_request` carries the canonical `usage` dict with token counts and cost data. `pre_llm_call` fires once per turn with no token data. `post_llm_call` fires after the tool loop with no token data.\n\n**Cron job identification:** There is no `cron_job_id` in any hook. The plugin extracts it from the `session_id`, which follows the format `cron_{job_id}_{YYYYMMDD_HHMMSS}` (confirmed in Hermes source). 
An anchored regex handles job IDs that contain underscores.\n\n### Database Schema\n\nSQLite with WAL mode, per-thread connections, schema v3:\n\n**`runs`** — one row per session (CLI session or cron job execution):\n\n|Column                    |Description                                                                     |\n|--------------------------|--------------------------------------------------------------------------------|\n|`session_id`              |Primary key (`{YYYYMMDD_HHMMSS}_{uuid6}` for CLI, `cron_{job_id}_{ts}` for cron)|\n|`platform`                |`cli`, `cron`, `telegram`, `discord`, etc.                                      |\n|`cron_job_id`             |Extracted from session_id when platform=cron                                    |\n|`model`                   |Model name (updated from last API call)                                         |\n|`provider`                |Provider name (e.g. `openrouter`, `anthropic`)                                  |\n|`started_at` / `ended_at` |ISO-8601 UTC timestamps                                                         |\n|`status`                  |`running`, `ok`, `error`, `interrupted`                                         |\n|`tokens_in` / `tokens_out`|Accumulated across all API calls in the session                                 |\n|`cost_usd`                |Accumulated estimated cost                                                      |\n|`duration_ms`             |Wall time (ms) via `julianday()`                                                |\n|`api_calls` / `tool_calls`|Counters                                                                        |\n|`parent_session_id`       |Reserved for future parent-child linking (not populated in v0.2)                |\n|`estimated_llm_calls`     |Count of calls where provider returned `usage=None`                             |\n|`sender_id`               |For per-sender budgets (set via `pre_llm_call`)                                 
|\n\n**`llm_calls`** — one row per individual API call:\n\nAll of `runs` token/cost columns, plus `cache_read_tokens`, `cache_write_tokens`, `reasoning_tokens`, `estimated` (boolean).\n\n**`tool_calls`** — one row per tool execution:\n\n`session_id`, `ts`, `tool_name`, `ok` (boolean), `latency_ms`.\n\n**`budget_alerts`** — anti-spam ledger:\n\n`scope`, `scope_id`, `window`, `period_key`, `level`, `fired_at`, `spent_usd`, `limit_usd`. Unique constraint prevents duplicate alerts.\n\n### Concurrency Model\n\nCron jobs run in a `ThreadPoolExecutor` (Hermes `cron/scheduler.py`). Multiple jobs can write to the DB simultaneously from different threads.\n\n**Design:** per-thread SQLite connections via `threading.local()`. Each thread opens its own connection to the same WAL-mode DB file. A serializable `_schema_lock` protects DDL migrations on first connect (WAL mode switch requires a brief lock that `busy_timeout` alone doesn’t handle).\n\n`busy_timeout=5000` ensures write collisions retry for 5 seconds before raising. `synchronous=NORMAL` balances durability with write performance (safe for WAL mode).\n\n-----\n\n## Budget Enforcement\n\n\u003e See the budget enforcement demo at the top of this README for an end-to-end walkthrough.\n\n### How It Works\n\nEvery time the agent is about to do work, the plugin checks:\n\n1. **`pre_llm_call`** (fires once per turn): evaluates all applicable budget scopes. If any has a `soft` or `hard` verdict that hasn’t been alerted yet this window, injects a one-time notice into the conversation context (anti-spam via `budget_alerts` table). Captures `sender_id`.\n1. **`pre_tool_call`** (fires before every tool): re-evaluates budgets. If any scope is in `hard` breach, returns `{\"action\":\"block\",\"message\":...}` which aborts the tool call.\n1. 
**For cron jobs with `hard` breach:** additionally calls `cron.jobs.pause_job` to pause future runs.\n\n### Enforcement Levels\n\nHermes does **not** expose a way to abort an in-flight model call from a plugin. `pre_llm_call` / `pre_api_request` returns can’t cancel a call. So enforcement is honest about its reach:\n\n|Level                  |Trigger                                  |Effect                                    |Repeat?                            |\n|-----------------------|-----------------------------------------|------------------------------------------|-----------------------------------|\n|**Soft** (≥ `soft_pct`)|Spend reaches 80% of limit (configurable)|One-time notice injected into conversation|Once per window per scope          |\n|**Hard** (≥ `hard_pct`)|Spend reaches 100% of limit              |Every subsequent tool call is blocked     |Every tool call until window resets|\n|**Cron pause**         |Any hard `cron_job` verdict              |Job is paused for future runs             |Once per window per scope          |\n\nThe model response already in flight still completes and is billed. What’s prevented is *further* tool-driven work.\n\n### Estimated Data and Budget Degradation\n\nWhen the provider returns `usage=None`, the plugin estimates tokens and flags the row as `estimated=1`. Since these estimates may be inaccurate, the budget engine offers a safety valve:\n\n**`on_estimated.mode: warn_only` (default):** If a hard verdict rests partly on estimated rows, it is **degraded to soft** — the user gets a warning but tools aren’t blocked. Rationale: a budget built on estimates shouldn’t hard-stop work.\n\n**`on_estimated.mode: enforce`:** Hard verdicts take effect regardless of estimate quality. 
Use this when you trust your provider’s usage data (Est% = 0) or when estimates are acceptable.\n\nThe `/stats providers` command shows the `Est%` column so you can see at a glance whether your provider returns real usage data.\n\n**Estimated-price models:** Some models (e.g. OpenRouter `auto` routing) have no fixed pricing. These are flagged with `_estimated_price: true` in `pricing.yaml` and normalized to `$0.00`. If \u003e0% of calls use these models, budget hard-verdicts are also degraded to soft under `warn_only` mode. See [Pricing Auto-Refresh](#pricing-auto-refresh) for details.\n\n-----\n\n## Provider Probe: Verifying Your Provider Returns Real Usage\n\nRun this **once** after enabling the plugin:\n\n1. Run one short session (any minimal task works)\n1. Execute `/stats providers`\n1. Look at the `Est%` column for your provider:\n- **`0%`** → provider returns real usage data. Budget verdicts are based on real numbers. Set `on_estimated.mode: enforce` for strict enforcement. ✅\n- **`\u003e 0%`** → provider omits usage in some responses. Those calls are estimated and flagged. Budget hard-verdicts will be degraded to soft under `warn_only`. The `telemetry.log` will have a **one-time WARNING** per provider. 
⚠️\n\n-----\n\n## Proof of Concept\n\nThe following PoC was executed live to validate the plugin end-to-end.\n\n### Setup\n\n- **Hermes gateway** running on Linux (WSL), model `openrouter/owl-alpha` (free tier)\n- **Plugin:** hermes-telemetry v0.2.0, loaded in gateway process\n- **DB:** `/home/nujovich/.hermes/telemetry/telemetry.db` (schema v3, WAL mode)\n- **6 cron jobs** configured, 2 used for this PoC\n\n### Pricing Capture\n\nAdded models to `~/.hermes/telemetry/pricing.yaml`:\n\n```yaml\nmodels:\n  \"openrouter/owl-alpha\":\n    input: 0.00\n    output: 0.00\n  \"openrouter/anthropic/claude-sonnet-4-6\":\n    input: 3.00\n    output: 15.00\n    cache_read: 0.30\n    cache_write: 3.75\n  \"openrouter/anthropic/claude-opus-4-7\":\n    input: 5.00\n    output: 25.00\n    cache_read: 0.50\n    cache_write: 6.25\n```\n\nSet `on_estimated.mode: enforce` for deterministic enforcement.\n\n### Budget Enforcement Test\n\n**Step 1 — Trigger a hard breach:**\n\n- Budget: `global.daily_usd: 0.001` ($0.001/day)\n- Ran MCP Lead Gen job (model: `claude-sonnet-4-6`, ~$3/$15 per 1M)\n- Result: job spent $0.1812 on first run → **18,120% of daily limit** → █ hard breach → **job auto-paused**\n\n```\n█ global    $0.1812 / $0.00    18120%  [daily]\n                         ↑ (0.001 rounded to 0.00 in display)\n```\n\n**Step 2 — Raise budget and resume:**\n\n```\n/budget set global daily 2.00\n```\n\nResult after `/budget set`:\n\n```\nglobal    $0.1812 / $2.00    9%  [daily]\n```\n\n**Step 3 — Verify job runs normally:**\n\n- MCP Lead Gen re-ran successfully under the $2.00 daily budget\n- Second run confirmed: `state: scheduled`, `paused_at: null`\n\n### Cron Job Cost Comparison\n\n|Job                 |Model              |Price (input/output) |\n|--------------------|-------------------|---------------------|\n|MCP Lead Gen        |`claude-sonnet-4-6`|$3.00 / $15.00 per 1M|\n|Marketing Highlights|`claude-opus-4-7`  |$5.00 / $25.00 per 1M|\n|Base sessions (CLI) |`owl-alpha`       
 |$0.00 / $0.00 (free) |\n\n**Results from SQLite (`/stats` after all runs):**\n\n- **CLI sessions** (owl-alpha, free): ~1M tokens in → **$0.00**\n- **MCP Lead Gen** (claude-sonnet-4-6): ~892K tokens in → **$0.314**\n- **Marketing Highlights** (claude-opus-4-7): ~445K tokens in → **$2.23** (opus is ~5-8x more expensive per token)\n\n### Results Summary\n\n|Component                            |Status                                             |\n|-------------------------------------|---------------------------------------------------|\n|Token capture from provider          |✅ Real usage (`estimated=0`)                       |\n|Cost estimation with pricing table   |✅ Accurate to pricing YAML                         |\n|Cron job session tracking            |✅ Captured via `session_id` regex                  |\n|Budget soft alerts                   |✅ One-time context injection                       |\n|Budget hard enforcement              |✅ Paused job at $0.001/day                         |\n|Budget hot-reload via `/budget set`  |✅ Cache cleared, new limit active                  |\n|Multi-model cost comparison          |✅ Sonnet vs Opus vs Free                           |\n|Pricing auto-refresh (OpenRouter API)|✅ 320 models fetched, manual overrides preserved   |\n|Estimated-price model handling       |✅ Negative prices → $0.00, budget degradation      |\n|Dashboard (HTML, auto-refresh 30s)   |✅ Charts, tables, budget bar, provider distribution|\n|94 tests pass                        |✅                                                  |\n\n-----\n\n## Comparison\n\n|                  |hermes-telemetry|TokenTelemetry       |Martin Loop         |\n|------------------|----------------|---------------------|--------------------|\n|Hermes-native     |✅ Native plugin |❌ Reads external logs|❌ No Hermes support |\n|Budget enforcement|✅ Stops the run |❌ Observe only       |✅ But not for Hermes|\n|Real-time         |✅ Pre-call      |❌ Post-hoc           |✅ Pre-attempt     
  |\n|Requires Hermes   |✅ Hermes only   |Any agent            |Claude Code / Codex |\n|Local dashboard   |✅               |✅ (more complete)    |❌                   |\n|Open source       |✅ MIT           |✅ MIT                |✅ MIT               |\n\n**When to use TokenTelemetry instead:** if you need a multi-agent dashboard (Claude Code + Codex + Hermes in one place), TokenTelemetry is the right choice. hermes-telemetry is purpose-built for Hermes operators who need budget enforcement, not just visibility.\n\n-----\n\n## Running Tests\n\n```\ncd hermes-telemetry\npip install pytest pyyaml\npytest tests/ -v\n```\n\n**Test suite (94 tests):**\n\n|File                             |Tests|Coverage                                                                                                                       |\n|---------------------------------|-----|-------------------------------------------------------------------------------------------------------------------------------|\n|`test_db.py`                     |15   |Schema v1→v3 migrations, CRUD, aggregations, concurrent WAL writes (10 threads × 5 writes)                                     |\n|`test_pricing.py`                |17   |Cache/reasoning split, no double-counting of `prompt_tokens`, YAML overrides, prefix matching, unknown model handling          |\n|`test_init.py`                   |6    |Cron session ID regex, tool success/failure parsing                                                                            |\n|`test_budget.py`                 |17   |ok/soft/hard verdicts, estimated-to-soft degradation, anti-spam ledger, cron pause, per-scope routing, `/budget set` hot-reload|\n|`test_stats_providers.py`        |8    |Real vs estimated per provider, `/stats providers` output format, Nous warning dedup                                           |\n|`test_subagent_reconciliation.py`|4    |Parent + child hook sequence, token reconciliation, no double-counting                                     
                    |\n\nNo live Hermes is required — all tests are self-contained with in-memory SQLite.\n\n-----\n\n## Data Location\n\n```\n~/.hermes/telemetry/\n├── telemetry.db        ← SQLite (WAL mode, ~70KB base + growth)\n├── telemetry.log       ← Plugin log (errors, debug, one-time warnings)\n├── pricing.yaml        ← Your model price overrides\n└── budget.yaml         ← Your spend guardrails\n```\n\nThe DB grows over time. For high-frequency cron jobs, consider periodic cleanup of old rows (not yet automated — see [Known Limitations](#known-limitations)).\n\n-----\n\n## Known Limitations\n\n**Enforcement gaps:**\n\n- **No true mid-call abort.** `pre_llm_call` / `pre_api_request` cannot cancel an in-flight model call. The response that’s already generating will complete and be billed. The tool-gate (`pre_tool_call`) stops *subsequent* work at the next tool boundary.\n- **Runaway text-only sessions.** A session that generates text without calling any tools never hits the tool-gate. If this becomes a problem, a pre-flight check in `on_session_start` for cron jobs could abort before the first LLM call.\n\n**Subagent attribution:**\n\n- Child agents (`delegate_task`) run as their own sessions. Their tokens are captured independently and included in **global** totals. But there is no parent→child link in any hook — so `per_cron_job` budgets **exclude** subagent cost. Use the `global` budget for a cap that captures delegated work.\n\n**Pricing refresh only for OpenRouter models:**\n\n- `pricing.yaml` is updated with OpenRouter models via OpenRouter API, preserving those entered manually by the user.\n\n**DB retention:**\n\n- `telemetry.db` grows without bound. No automatic purge of old rows. For \u003e100K rows, consider manual cleanup or a retention policy (not yet implemented).\n\n**Gateway restart required:**\n\n- Enabling the plugin takes effect only after gateway restart. 
Cron runs that started before the restart won’t have telemetry.\n\n-----\n\n## Troubleshooting\n\n**`/stats cron week` shows “No cron runs in the last 7 days”:**\n\nThe gateway loaded before the plugin was enabled. Restart the gateway:\n\n```\nhermes gateway restart\n```\n\nThen re-run a cron job.\n\n**`/budget` shows `$0.00` as the limit:**\n\nThe limit is cached in memory at gateway start. If you edited `budget.yaml` directly, the cache is stale. Use `/budget set global daily \u003camount\u003e` to hot-reload, or restart the gateway.\n\n**Cost is $0.00 for all sessions:**\n\nYour model isn’t in the pricing table. Check `telemetry.log` for a one-time warning like:\n\n```\nhermes-telemetry: unknown model 'openrouter/some-model' — cost recorded as $0.00\n```\n\nAdd it to `pricing.yaml`.\n\n**Provider Est% \u003e 0:**\n\nYour provider returns `usage=None` for some/all calls. Tokens are estimated. Check `/stats providers` to see which providers are affected. If Est% is 100% for your main provider, all spend is estimated and budget hard-verdicts degrade to soft under `warn_only` mode.\n\n**Plugin not loading at all:**\n\nCheck `telemetry.log` for errors. Common causes:\n\n- Missing `pyyaml` in the gateway’s venv: `pip install pyyaml`\n- Plugin not in `plugins.enabled` in config.yaml\n- Syntax error in `pricing.yaml` or `budget.yaml`\n\n-----\n\n## License\n\nMIT — see [LICENSE](https://github.com/nujovich/hermes-telemetry/blob/main/LICENSE).\n\n-----\n\n## Hermes Agent Challenge\n\nThis plugin was built for the [**Hermes Agent Challenge**](https://dev.to/devteam/join-the-hermes-agent-challenge-1000-in-prizes-13cd) — a $1,000 competition to build the most useful Hermes Agent plugins and extensions.\n\n**🔗 Challenge Entry:** [hermes-telemetry on dev.to](https://dev.to/devteam/join-the-hermes-agent-challenge-1000-in-prizes-13cd)\n\n**🛠️ Built by:** [Nadia Ujovich](https://github.com/nujovich)\n\n**💡 Why this plugin:** Every AI system needs observability and cost control. 
This plugin gives Hermes Agent users the visibility to optimize their workflows and the guardrails to prevent bill shock — essential for production deployments and automated cron jobs.\n\n-----\n\n*Made with ☕ for the Hermes Agent ecosystem*\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnujovich%2Fhermes-telemetry","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnujovich%2Fhermes-telemetry","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnujovich%2Fhermes-telemetry/lists"}