{"id":48974685,"url":"https://github.com/sergey-homenko/llm_cost_tracker","last_synced_at":"2026-04-29T15:01:08.532Z","repository":{"id":351869220,"uuid":"1212854568","full_name":"sergey-homenko/llm_cost_tracker","owner":"sergey-homenko","description":"Rails-native LLM cost ledger: track spend by user, feature, provider, and model with self-hosted ActiveRecord storage and budget guardrails.","archived":false,"fork":false,"pushed_at":"2026-04-22T20:10:42.000Z","size":620,"stargazers_count":5,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-23T13:33:58.008Z","etag":null,"topics":["ai","anthropic","cost-management","cost-tracking","deepseek","faraday","finops","gemini","llm","llmops","openai","openrouter","rails","ruby"],"latest_commit_sha":null,"homepage":"https://rubygems.org/gems/llm_cost_tracker","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sergey-homenko.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":".github/CONTRIBUTING.md","funding":null,"license":"LICENSE.txt","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-16T19:53:59.000Z","updated_at":"2026-04-22T20:10:44.000Z","dependencies_parsed_at":"2026-04-21T11:00:55.809Z","dependency_job_id":null,"html_url":"https://github.com/sergey-homenko/llm_cost_tracker","commit_stats":null,"previous_names":["sergey-homenko/llm_cost_tracker"],"tags_count":10,"template":false,"template_full_name":null,"purl":"pkg:github/sergey-homenko/llm_cost_tracker","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sergey-homenko%2Fllm_cost_tracker","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sergey-homenko%2Fllm_cost_tracker/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sergey-homenko%2Fllm_cost_tracker/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sergey-homenko%2Fllm_cost_tracker/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sergey-homenko","download_url":"https://codeload.github.com/sergey-homenko/llm_cost_tracker/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sergey-homenko%2Fllm_cost_tracker/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32226667,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-24T13:21:15.438Z","status":"ssl_error","status_checked_at":"2026-04-24T13:21:15.005Z","response_time":64,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","anthropic","cost-management","cost-tracking","deepseek","faraday","finops","gemini","llm","llmops","openai","openrouter","rails","ruby"],"created_at":"2026-04-18T08:12:48.928Z","updated_at":"2026-04-29T15:01:08.525Z","avatar_url":"https://github.com/sergey-homenko.png","language":"Ruby","funding_links":[],"categories":["Ruby"],"sub_categories":[],"readme":"# LLM Cost Tracker\n\nA Rails-native ledger for what your LLM calls actually cost.\n\n[![Gem Version](https://img.shields.io/gem/v/llm_cost_tracker.svg)](https://rubygems.org/gems/llm_cost_tracker)\n[![CI](https://github.com/sergey-homenko/llm_cost_tracker/actions/workflows/ruby.yml/badge.svg)](https://github.com/sergey-homenko/llm_cost_tracker/actions)\n[![codecov](https://codecov.io/gh/sergey-homenko/llm_cost_tracker/branch/main/graph/badge.svg)](https://codecov.io/gh/sergey-homenko/llm_cost_tracker)\n\nIf you have OpenAI, Anthropic, or Gemini in production and someone keeps asking \"where did that bill come from?\", this gem records every call into your own database, prices it locally, and gives you a dashboard you can mount in five minutes. No proxy, no SaaS account, no extra service to deploy.\n\nIt is not Langfuse, Helicone, or LiteLLM. It does not capture prompts, score completions, or replay traces. It does one thing: tells you which provider, which model, which feature, and which user burned how much money. That's the entire pitch.\n\nRequires Ruby 3.3+, ActiveSupport 7.1+, Faraday 2.0+. ActiveRecord storage and the dashboard need Rails 7.1+.\n\n![Dashboard overview](docs/dashboard-overview.png)\n\n## Quickstart\n\nAdd to your Gemfile alongside whatever LLM client you already use:\n\n```ruby\ngem \"llm_cost_tracker\"\ngem \"openai\"  # or \"anthropic\", \"ruby_llm\", or your existing client\n```\n\nInstall, migrate, verify:\n\n```bash\nbin/rails generate llm_cost_tracker:install --dashboard --prices\nbin/rails db:migrate\nbin/rails llm_cost_tracker:doctor\n```\n\nDrop this into `config/initializers/llm_cost_tracker.rb`:\n\n```ruby\nLlmCostTracker.configure do |config|\n  config.storage_backend = :active_record\n  config.default_tags    = -\u003e { { environment: Rails.env } }\n  config.instrument :openai\nend\n```\n\nNow every OpenAI call is recorded. Wrap calls in `with_tags` to attribute spend to a user, feature, or anything else you care about:\n\n```ruby\nLlmCostTracker.with_tags(user_id: Current.user\u0026.id, feature: \"chat\") do\n  client = OpenAI::Client.new(api_key: ENV[\"OPENAI_API_KEY\"])\n  client.responses.create(model: \"gpt-4o\", input: \"Hello\")\nend\n```\n\nVisit `/llm-costs` for the dashboard. **Mount it behind your app's auth before deploying** — the gem doesn't ship with one, on purpose.\n\n## What you get\n\n- Local ActiveRecord ledger of every call: provider, model, token breakdown, cost, latency, tags, response IDs\n- Auto-capture for RubyLLM and the official `openai` and `anthropic` Ruby SDKs, plus Faraday middleware for `ruby-openai`, the Gemini REST API, and any client you can inject middleware into\n- Server-rendered dashboard (plain ERB, zero JavaScript) with overview, models, calls, tags, CSV export, and a data-quality page\n- Local pricing snapshots refreshed daily from the official provider pricing pages, applied with `bin/rails llm_cost_tracker:prices:refresh`\n- Monthly / daily / per-call budget guardrails with notify, raise, or block-requests behaviour\n- Tag-based attribution that survives concurrency — Puma threads and Sidekiq fibers don't bleed into each other\n\n## What it deliberately doesn't do\n\n- **Doesn't run as a proxy.** Calls go directly from your app to the provider.\n- **Doesn't store prompts or completions.** Token counts, model, cost, tags, response IDs only. Nothing else.\n- **Doesn't promise invoice-grade accuracy.** It uses official provider pricing pages, but enterprise rates, batch discounts on unsupported endpoints, and modality tiers are not always modeled. `provider_response_id` is stored as a join key for whoever does that reconciliation.\n- **Doesn't ship with auth on the dashboard.** It's a Rails Engine; mount it behind whatever your app already uses (Devise, basic auth, Cloudflare Access, your own session middleware).\n- **Doesn't centralize multi-service visibility.** One Rails monolith — perfect fit. Six services in four languages — wrong tool, look at a proxy or API-layer gateway.\n\n## Capturing calls\n\nThree paths, in order of preference. Use the first one that fits your stack.\n\n### 1. SDK integrations\n\nDrop-in for RubyLLM and the official `openai` and `anthropic` gems. `config.instrument` patches tested SDK methods so you don't change a single call site:\n\n```ruby\nLlmCostTracker.configure do |config|\n  config.instrument :openai      # or :anthropic / :ruby_llm\nend\n\nLlmCostTracker.with_tags(feature: \"support_chat\") do\n  Anthropic::Client.new.messages.create(\n    model: \"claude-sonnet-4-6\",\n    max_tokens: 1024,\n    messages: [{ role: \"user\", content: \"Hello\" }]\n  )\nend\n```\n\nCaptures usage, model, latency, response ID, cache tokens, and reasoning tokens whenever the SDK exposes them. Provider SDKs are not added as gem dependencies — you install whichever you actually use.\n\nEnabled integrations are checked at boot: the client gem must be loaded, meet the minimum supported version, and expose the expected classes and methods. If the contract check fails, boot raises instead of silently missing spend.\n\nThis patches **only** RubyLLM and the official Ruby SDKs. `ruby-openai` (alexrudall) and any custom client go through Faraday middleware below.\n\n### 2. Faraday middleware\n\nFor `ruby-openai`, the Gemini REST API, custom Faraday clients, or anything OpenAI-compatible (OpenRouter, DeepSeek, LiteLLM proxies):\n\n```ruby\nconn = Faraday.new(url: \"https://api.openai.com\") do |f|\n  f.use :llm_cost_tracker, tags: -\u003e { { feature: \"chat\", user_id: Current.user\u0026.id } }\n  f.request :json\n  f.response :json\n  f.adapter Faraday.default_adapter\nend\n```\n\nTags can be a hash or a callable evaluated per request. Place the middleware where it sees the final response body — in practice, before the JSON parser.\n\nStreaming works through the same path: the middleware tees the `on_data` callback so your code keeps receiving chunks normally, and the final usage gets recorded once the stream finishes. OpenAI streams need `stream_options: { include_usage: true }` for the final usage event.\n\nPer-client setup snippets for `ruby-openai`, Azure OpenAI, LiteLLM proxy, and Gemini live in [`docs/cookbook.md`](docs/cookbook.md).\n\n### 3. Manual `track` / `track_stream`\n\nWhen you have a client that doesn't expose Faraday and isn't an official SDK — internal gateways, homegrown wrappers, batch jobs replaying historical usage:\n\n```ruby\nLlmCostTracker.track(\n  provider: :anthropic,\n  model: \"claude-sonnet-4-6\",\n  input_tokens: 1500,\n  output_tokens: 320,\n  feature: \"summarizer\",\n  user_id: current_user.id\n)\n```\n\nFor streaming the same way, `track_stream` accepts a block, parses provider events automatically, and records once the stream finishes. Full reference in [`docs/streaming.md`](docs/streaming.md).\n\n## Tags: who burned this money\n\nTags answer the only question that matters in attribution: which feature, which user, which job, which tenant. They're free-form strings, indexed (JSONB on Postgres, fallback elsewhere), and queryable from both Ruby and the dashboard.\n\n```ruby\nLlmCostTracker.with_tags(user_id: current_user.id, feature: \"support_chat\", trace_id: request.uuid) do\n  client.chat(parameters: { model: \"gpt-4o\", messages: [...] })\nend\n```\n\n`with_tags` is thread- and fiber-isolated, so concurrent requests in Puma or jobs in Sidekiq don't bleed into each other. A `default_tags` callable on configuration runs on every event for things you always want — `environment`, `region`, deployment SHA. Explicit tags passed to `track` win over scoped tags, scoped tags win over defaults.\n\nWhat you put in tags is **your** input — they're queryable strings. Don't put prompts, completions, emails, or secrets there. Use IDs.\n\n## Pricing\n\nBuilt-in prices live in `lib/llm_cost_tracker/prices.json` and are refreshed daily from official provider pricing pages by an automated CI workflow that opens a PR on every change. Most apps run on bundled prices and never think about this.\n\nWhen you want to control updates yourself — for negotiated rates, gateway-specific model IDs, or pinned reviews — generate a local snapshot:\n\n```bash\nbin/rails generate llm_cost_tracker:prices\n```\n\n```ruby\nconfig.prices_file = Rails.root.join(\"config/llm_cost_tracker_prices.yml\")\n```\n\nRefresh on demand from the maintained snapshot:\n\n```bash\nbin/rails llm_cost_tracker:prices:refresh\n```\n\nExplain why a model is priced or unknown:\n\n```bash\nPROVIDER=openai MODEL=gpt-4o bin/rails llm_cost_tracker:prices:explain\n```\n\nPrecedence is `pricing_overrides` → `prices_file` → bundled. Provider-qualified keys like `openai/gpt-4o-mini` win over model-only keys. Full pricing reference: [`docs/pricing.md`](docs/pricing.md).\n\n## Budgets\n\nBudgets are guardrails, not transactional caps:\n\n```ruby\nconfig.monthly_budget           = 500.00\nconfig.daily_budget             = 50.00\nconfig.per_call_budget          = 2.00\nconfig.budget_exceeded_behavior = :block_requests   # or :notify, :raise\nconfig.on_budget_exceeded       = -\u003e(data) { SlackNotifier.notify(\"#alerts\", \"...\") }\n```\n\n`:block_requests` reads ledger totals before a call goes out and stops it if you're already over. Under concurrency multiple workers can pass preflight at the same time and collectively overshoot — this catches the next call after the overshoot becomes visible, not the overshoot itself. For a strict cap, use a provider-side limit or a transactional counter outside the gem.\n\nFull behavior, error class, and preflight details: [`docs/budgets.md`](docs/budgets.md).\n\n## Querying\n\nWhen you want to slice spend from a console, scheduled job, or your own admin page:\n\n```ruby\nLlmCostTracker::LlmApiCall.this_month.cost_by_model\nLlmCostTracker::LlmApiCall.this_month.cost_by_tag(\"feature\")\nLlmCostTracker::LlmApiCall.daily_costs(days: 7)\nLlmCostTracker::LlmApiCall.by_tags(user_id: 42, feature: \"chat\").this_month.total_cost\n```\n\nA text report is also one rake task away:\n\n```bash\nDAYS=7 bin/rails llm_cost_tracker:report\n```\n\nFull scope and helper reference: [`docs/querying.md`](docs/querying.md).\n\n## Dashboard\n\nMount the engine wherever you want — it's plain ERB, no JavaScript bundle, no asset pipeline gymnastics:\n\n```ruby\n# config/routes.rb\nmount LlmCostTracker::Engine =\u003e \"/llm-costs\"\n```\n\nPages: overview (spend trend, budget status, anomaly banner), models, calls (filterable, paginated, CSV export), tags, data quality. Reads `llm_api_calls`, so use `:active_record` storage if you want to mount it.\n\nAuth is your job. Examples for basic auth and Devise: [`docs/dashboard.md`](docs/dashboard.md).\n\n## Supported providers\n\n| Provider | Auto-detected | Coverage |\n|---|:---:|---|\n| OpenAI | Yes | GPT-5.5/5.4/5.2/5.1/5 + pro/mini/nano variants, GPT-4.1, GPT-4o, o1/o3/o4-mini |\n| Anthropic | Yes | Claude Opus 4.7/4.6/4.5/4.1/4, Sonnet 4.6/4.5/4, Haiku 4.5 |\n| Google Gemini | Yes | Gemini 2.5 Pro/Flash/Flash-Lite, 2.0 Flash/Flash-Lite |\n| OpenRouter | Yes | OpenAI-compatible usage; provider-prefixed model IDs are normalized |\n| DeepSeek | Yes | OpenAI-compatible usage; add `pricing_overrides` for DeepSeek-specific rates |\n| Other OpenAI-compatible hosts | Configurable | Register the host via `config.openai_compatible_providers` |\n| Anything else | Configurable | Custom parser — see [`docs/extending.md`](docs/extending.md) |\n\nRubyLLM chat, embedding, and transcription calls are captured through RubyLLM's provider layer when `config.instrument :ruby_llm` is enabled.\n\nEndpoints covered end-to-end: OpenAI Chat Completions / Responses / Completions / Embeddings, Anthropic Messages, Gemini `generateContent` and `streamGenerateContent`, plus their OpenAI-compatible equivalents. Streaming is captured for Faraday paths and official OpenAI / Anthropic SDK stream helpers whenever the provider emits final-usage events.\n\n## Privacy\n\nBy design, **no prompt or response content is ever stored.** Per call, the ledger holds: provider, model, token counts, cost, latency, tags, response ID, timestamp. That's it. No request bodies, no headers, no completions. Warning logs strip query strings before logging URLs.\n\nTags carry whatever your app passes — they are application-controlled input, treat them accordingly. Use `user_id`, not the user's email; use a feature key, not the input prompt.\n\n## Documentation\n\nDeeper guides live in `docs/`. Reference pages are being filled out as content\nmoves out of this README; the inline sections above remain canonical where a page\nis still brief.\n\n- [Configuration reference](docs/configuration.md)\n- [Pricing \u0026 price refresh](docs/pricing.md)\n- [Budgets \u0026 guardrails](docs/budgets.md)\n- [Querying \u0026 reports](docs/querying.md)\n- [Dashboard mounting](docs/dashboard.md)\n- [Streaming capture](docs/streaming.md)\n- [Extending](docs/extending.md)\n- [Production operations](docs/operations.md)\n- [Upgrading](docs/upgrading.md)\n- [Cookbook — per-client recipes](docs/cookbook.md)\n- [Architecture \u0026 design rules](docs/architecture.md)\n\n## Known limitations\n\n- `:block_requests` is best-effort under concurrency, not a transactional cap.\n- Streaming usage capture relies on the provider emitting a final-usage event. Missing events are stored with `usage_source: \"unknown\"` so they appear on the data-quality page rather than vanishing.\n- `provider_response_id` is stored only when the provider exposes a stable ID. Gemini is best-effort and varies by endpoint.\n- Cache write TTL variants on Anthropic (1h vs 5min writes) are not modeled separately yet.\n\n## Development\n\n```bash\nbundle install\nbin/check       # rubocop + rspec + coverage gate\n```\n\nArchitecture rules and conventions for contributions live in [`AGENTS.md`](AGENTS.md) and [`docs/architecture.md`](docs/architecture.md).\n\n## License\n\nMIT — see [LICENSE.txt](LICENSE.txt).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsergey-homenko%2Fllm_cost_tracker","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsergey-homenko%2Fllm_cost_tracker","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsergey-homenko%2Fllm_cost_tracker/lists"}