https://github.com/edihasaj/spendwatch

Token/$ leaderboards across coding agents (Claude Code, Codex) — see which tool calls, commands, and prompts spend the most, so you know what to automate or fix.
bun claude-code cli codex coding-agents cost llm observability tokens

Host: GitHub
URL: https://github.com/edihasaj/spendwatch
Owner: edihasaj
License: mit
Created: 2026-06-09T23:19:26.000Z (about 1 month ago)
Default Branch: master
Last Pushed: 2026-06-11T21:21:06.000Z (about 1 month ago)
Last Synced: 2026-06-11T23:12:14.484Z (about 1 month ago)
Topics: bun, claude-code, cli, codex, coding-agents, cost, llm, observability, tokens
Language: TypeScript
Size: 56.6 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# spendwatch

[![license: MIT](https://img.shields.io/badge/license-MIT-amber)](LICENSE) ![runtime: Bun](https://img.shields.io/badge/runtime-Bun-black) ![deps: zero](https://img.shields.io/badge/deps-zero-brightgreen)

Token/$ leaderboards **across coding agents** (Claude Code, Codex, …): which **tool calls**, **shell commands**, and **prompts** spend the most — so you know what to automate (build a CLI for `ssh`/`git diff`/`docker compose`…) or fix (a tool dumping huge results into context).

Parses each agent's local transcripts. Runs on Bun with zero runtime dependencies. Each agent gets its own section, with a cross-agent overview on top.

> **Privacy:** spendwatch reads your local transcript files and writes local reports only. It makes **no network calls** and sends nothing anywhere.

## Install

Requires [Bun](https://bun.sh). Run from source, or compile a standalone binary:

```sh
git clone https://github.com/edihasaj/spendwatch && cd spendwatch
bun run report # run directly
bun run build # → bin/spendwatch (self-contained, embeds Bun + SQLite)
install -m 0755 bin/spendwatch /opt/homebrew/bin/ # optional: put on PATH
```

## Usage

```sh
bun src/cli.ts report # all agents, past 30 days
bun src/cli.ts report --agent codex # one agent (claude,codex,copilot,gemini)
bun src/cli.ts report --project chat-sql # filter by project substring
bun src/cli.ts report --days 7 --json
bun src/cli.ts report --html # also write a standalone HTML report
bun src/cli.ts report --open # write HTML + open it in the browser
bun src/cli.ts report --brief # TL;DR: total, biggest hog, top automate targets
bun src/cli.ts report --sqlite # append a snapshot to spendwatch.db
bun src/cli.ts report --account work # filter by account (email/label)
bun src/cli.ts watch # live leaderboard, refreshes as sessions write
bin/spendwatch report # compiled binary (bun run build)
```

`--html [path]` writes a self-contained, shareable HTML file (default `spendwatch-report.html`). It contains the cross-agent overview, per-agent tabs, and every table with heat bars. **Click any tool/command row to drill into the actual invocations** (which files, which commands, with counts and token cost); this is the "what to automate" view. `--open` writes the same file and opens it in the browser.

`--sqlite [path]` appends a normalized snapshot (one run per call) so you can build spend history and query with SQL — tables: `runs`, `agent_account`, `tools`, `commands`, `prompts`, `models`, `projects`, `samples`.

## Multi-account

Accounts are auto-detected per agent (Claude `~/.claude.json` email, Codex auth JWT email). Reports **tag by account but sum per agent**. For multiple accounts in different config dirs, create `~/.config/spendwatch/config.json` (or `$SPENDWATCH_CONFIG`):

```json
{
  "roots": [
    { "agent": "claude", "account": "work", "path": "~/.claude/projects" },
    { "agent": "claude", "account": "personal", "path": "~/personal/.claude/projects" },
    { "agent": "codex", "account": "work", "path": "~/.codex/sessions" }
  ]
}
```

Each account shows under **BY ACCOUNT**; the agent total (and the overview) sums across them. Filter with `--account <label>`, where the label is an account email or label (for example `--account work`).
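The "tag by account, sum per agent" behaviour can be pictured with a small sketch. This is not spendwatch's code: the `Root` and `Spend` shapes, `expandHome`, `summarize`, and the `/home/me` home directory are all hypothetical names chosen for illustration.

```typescript
// Hypothetical shapes for illustration; spendwatch's real types may differ.
type Root = { agent: string; account: string; path: string };
type Spend = { agent: string; account: string; usd: number };

// Expand a leading "~" against an explicit home directory.
function expandHome(p: string, home: string): string {
  if (p === "~") return home;
  return p.startsWith("~/") ? home + p.slice(1) : p;
}

// Keep a per-account breakdown while summing one total per agent.
function summarize(rows: Spend[]): {
  byAgent: Map<string, number>;
  byAccount: Map<string, number>;
} {
  const byAgent = new Map<string, number>();
  const byAccount = new Map<string, number>();
  for (const r of rows) {
    byAgent.set(r.agent, (byAgent.get(r.agent) ?? 0) + r.usd);
    const key = `${r.agent}/${r.account}`;
    byAccount.set(key, (byAccount.get(key) ?? 0) + r.usd);
  }
  return { byAgent, byAccount };
}

// Two config roots for the same agent, resolved against an assumed home dir.
const roots: Root[] = [
  { agent: "claude", account: "work", path: "~/.claude/projects" },
  { agent: "claude", account: "personal", path: "~/personal/.claude/projects" },
];
const resolved = roots.map((r) => ({ ...r, path: expandHome(r.path, "/home/me") }));

// Made-up spend figures, only to show the two views side by side.
const totals = summarize([
  { agent: "claude", account: "work", usd: 2 },
  { agent: "claude", account: "personal", usd: 3 },
  { agent: "codex", account: "work", usd: 1 },
]);
console.log(resolved.map((r) => r.path), totals.byAgent.get("claude"));
```

Here the two Claude accounts stay visible as separate `claude/work` and `claude/personal` entries, while the per-agent map holds a single Claude total.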

## Sources

| agent | logs | token usage |
|---------|------|-------------|
| Claude Code | `~/.claude/projects/**/*.jsonl` | ✓ |
| Codex CLI | `~/.codex/sessions/**/rollout-*.jsonl` | ✓ |
| Copilot CLI | `~/.config/github-copilot` | ✗ (binary Xodus store, no usage) |
| Gemini CLI | `~/.gemini` | ✓ when present |

Copilot/Gemini are detected and reported as footnotes until parseable logs exist. Adding a new agent = one parser emitting the shared `Event` model + a `sources.ts` entry.
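To make the "one parser emitting the shared `Event` model" idea concrete, here is a rough sketch for an imaginary agent. The `AgentEvent` type stands in for the shared model and is an assumption, as are the `ExampleLine` log format and the `parseExampleJsonl` function; the real field names in spendwatch may differ.

```typescript
// Hypothetical stand-in for the shared event model.
type AgentEvent = {
  agent: string;
  session: string;
  tool: string | null;
  inputTokens: number;
  outputTokens: number;
};

// Hypothetical log line format for an imaginary agent called "example".
type ExampleLine = {
  session_id: string;
  tool?: string;
  usage?: { input: number; output: number };
};

// JSONL files can end in a half-written line while a session is live,
// so parse failures are treated as "skip", not as fatal errors.
function tryParse(line: string): ExampleLine | null {
  try {
    return JSON.parse(line) as ExampleLine;
  } catch {
    return null;
  }
}

// Turn raw JSONL text into normalized events, one per line that has usage data.
function parseExampleJsonl(text: string): AgentEvent[] {
  const events: AgentEvent[] = [];
  for (const line of text.split("\n")) {
    if (line.trim() === "") continue;
    const row = tryParse(line);
    if (row === null || row.usage === undefined) continue;
    events.push({
      agent: "example",
      session: row.session_id,
      tool: row.tool ?? null,
      inputTokens: row.usage.input,
      outputTokens: row.usage.output,
    });
  }
  return events;
}

const sample = [
  '{"session_id":"s1","tool":"bash","usage":{"input":120,"output":30}}',
  '{"session_id":"s1","usage":{"input":80,"output":10}}',
  '{"session_id":"s1","tool":"re', // truncated line, skipped
].join("\n");
const parsed = parseExampleJsonl(sample);
console.log(parsed.length);
```

Once every agent's logs are reduced to one event shape, the aggregation and rendering code does not need to know which agent produced them.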

## What to automate

The **AUTOMATE — top targets** list (top of the report, and in `--brief`) ranks shell commands by **cost × frequency × friction**. This is the build-a-CLI/MCP shortlist. Friction is measured, not guessed: it comes from Claude's `tool_result.is_error` flag and Codex exit codes. A `why` column names the driver:

- `frequent + costly` — high spend, lots of calls → wrap it in one tool
- `fails N% · flaky` — the command errors often (the agent retries, burning tokens)
- `exit-127 · agent guessing` — command-not-found: the agent doesn't know this tool → give it a CLI/MCP

> **Honesty note:** spendwatch does **not** invent a "you saved $X" number — that's an unmeasurable counterfactual. It shows *observed* cost and *measured* friction (error/retry rate). After you automate something, its cost and error rate drop and you see it in the data.
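One way to read "cost × frequency × friction" is sketched below. The README does not give the exact formula, so the friction term (`1 + error rate`), the 20% flakiness threshold, and the names `CommandStats`, `score`, `why`, and `rankTargets` are all assumptions for illustration.

```typescript
// Hypothetical per-command statistics; field names are illustrative.
type CommandStats = {
  command: string;
  usd: number;      // observed spend attributed to this command
  calls: number;    // how often it was invoked
  errors: number;   // invocations that reported an error
  notFound: number; // invocations that exited with 127 (command not found)
};

function errorRate(s: CommandStats): number {
  return s.calls === 0 ? 0 : s.errors / s.calls;
}

// Assumed scoring: friction scales the score from 1x (never fails) to 2x (always fails).
function score(s: CommandStats): number {
  return s.usd * s.calls * (1 + errorRate(s));
}

// Pick a label for the `why` column. The 0.2 threshold is an assumption.
function why(s: CommandStats): string {
  if (s.notFound > 0) return "exit-127 · agent guessing";
  const rate = errorRate(s);
  if (rate >= 0.2) return `fails ${Math.round(rate * 100)}% · flaky`;
  return "frequent + costly";
}

// Highest score first; the input array is left untouched.
function rankTargets(stats: CommandStats[]): CommandStats[] {
  return [...stats].sort((a, b) => score(b) - score(a));
}

// Made-up numbers, only to exercise the three labels.
const ranked = rankTargets([
  { command: "foo", usd: 1, calls: 4, errors: 4, notFound: 4 },
  { command: "docker compose", usd: 2, calls: 10, errors: 5, notFound: 0 },
  { command: "git diff", usd: 4, calls: 10, errors: 0, notFound: 0 },
]);
for (const s of ranked) console.log(s.command, score(s), why(s));
```

Note that the sketch only ranks by observed quantities, in keeping with the honesty note: nothing here estimates savings.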

## What the numbers mean

- **$** — from real per-request `usage` fields (Claude: `cache_creation`/`cache_read`; Codex: `last_token_usage`, summed — verified equal to cumulative). Prices/MTok: Fable $10/$50, Opus $5/$25, Sonnet $3/$15, Haiku $1/$5, gpt-5 tier $1.25/$10 (Codex $ is an estimate — it's largely subscription-billed). Cache: write 1.25× (5m)/2× (1h), read 0.1×.
- **ctx $** (per tool/command) — est. cost a call's *results* impose on the session: result tokens (chars/4) × one cache write + a 0.1× reread on every later request in that session. Big outputs early in long sessions cost the most.
- **BY COMMAND** — shell calls split by executable (`echo`, `docker`, `grep`, `ssh`…), skipping `cd X &&`/env/wrappers.
- **BY COMMAND — DEEP** — executable + subcommand (`git diff`, `docker compose`, `pnpm lint`) and, for `ssh`, the remote command head. This is the "what to build a CLI for" list.
- **BY PROMPT** — spend attributed to the active prompt (`⑂` = subagent / Codex task).
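The **ctx $** estimate above can be worked through as a small function. This is a sketch of the arithmetic as described, not the code in `src/pricing.ts`: the name `ctxCostUsd` is hypothetical, and it assumes the cache multipliers scale the model's input price per MTok and that the 5-minute write multiplier applies.

```typescript
// Multipliers taken from the list above.
const CACHE_WRITE_5M = 1.25; // one cache write when the result enters context
const CACHE_READ = 0.1;      // reread on each later request in the session

// Estimated cost a tool result imposes on the rest of the session.
function ctxCostUsd(
  resultChars: number,     // size of the tool result in characters
  laterRequests: number,   // requests made after the result entered context
  inputUsdPerMTok: number, // model input price per million tokens
): number {
  const mtok = resultChars / 4 / 1_000_000; // chars/4 token estimate, in MTok
  const write = mtok * inputUsdPerMTok * CACHE_WRITE_5M;
  const rereads = mtok * inputUsdPerMTok * CACHE_READ * laterRequests;
  return write + rereads;
}

// Same 400,000-character result at a $3/MTok input price:
const early = ctxCostUsd(400_000, 10, 3); // 10 requests still to come
const late = ctxCostUsd(400_000, 0, 3);   // last call of the session
console.log(early.toFixed(3), late.toFixed(3));
```

With these inputs the early result comes to roughly $0.675 and the late one to roughly $0.375, which is the sense in which big outputs early in long sessions cost the most.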

## Layout

- `src/pricing.ts` — price table + cost functions (incl. cache multipliers)
- `src/parse.ts` — Claude JSONL → events; `commandPath()` deep shell breakdown
- `src/codex.ts` — Codex rollout JSONL → same events
- `src/sources.ts` — agent registry: log locations, discovery, account detection, config roots
- `src/aggregate.ts` — per-session fold → leaderboards + per-account + drill-down samples
- `src/db.ts` — SQLite snapshot writer (`bun:sqlite`)
- `src/html.ts` — standalone HTML report with tabs + drill-down
- `src/scan.ts` — file walk + incremental offsets (watch mode)
- `src/render.ts` / `src/cli.ts` — tables, cross-agent overview, report/watch
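As an illustration of the kind of deep shell breakdown attributed to `commandPath()` above, here is a simplified sketch. It is not the real implementation: `commandPathSketch`, the `WRAPPERS` list, and the `HAS_SUBCOMMAND` list are assumptions, it splits naively on `&&` without honouring quotes, and it leaves out the `ssh` remote-command handling.

```typescript
// Assumed lists; the real tool may recognize different wrappers and executables.
const WRAPPERS = new Set(["sudo", "env", "time", "nohup"]);
const HAS_SUBCOMMAND = new Set(["git", "docker", "pnpm", "npm", "bun"]);

function commandPathSketch(cmd: string): { exe: string; deep: string } {
  // Drop leading `cd X &&` segments so the real command is attributed.
  const segments = cmd
    .split("&&")
    .map((s) => s.trim())
    .filter((s) => s !== "");
  while (segments.length > 1 && /^cd(\s|$)/.test(segments[0] ?? "")) {
    segments.shift();
  }

  // Skip VAR=value assignments and wrapper commands in front of the executable.
  const words = (segments[0] ?? "").split(/\s+/).filter((w) => w !== "");
  let i = 0;
  while (i < words.length) {
    const w = words[i] ?? "";
    const isEnvAssignment = /^[A-Za-z_][A-Za-z0-9_]*=/.test(w);
    if (!isEnvAssignment && !WRAPPERS.has(w)) break;
    i++;
  }

  // For known multi-command tools, append the first non-flag argument.
  const exe = words[i] ?? "";
  const sub = HAS_SUBCOMMAND.has(exe)
    ? words.slice(i + 1).find((w) => !w.startsWith("-"))
    : undefined;
  return { exe, deep: sub === undefined ? exe : `${exe} ${sub}` };
}

console.log(commandPathSketch("cd app && FOO=1 git diff --stat"));
console.log(commandPathSketch("sudo docker compose up -d"));
```

The `exe` field corresponds to the **BY COMMAND** grouping and `deep` to **BY COMMAND — DEEP**.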

## Test

```sh
bun test # claude + codex fixtures with hand-computed costs, deep-command,
# incremental append, multi-account sum/breakout, sqlite round-trip
```

## Contributing

Adding another agent is one parser emitting the shared `Event` model plus a `sources.ts` entry. PRs welcome.

## License

MIT. See [LICENSE](LICENSE).
