{"id":48531316,"url":"https://github.com/developedby-siva/token-scope","last_synced_at":"2026-04-08T00:01:28.689Z","repository":{"id":345198438,"uuid":"1184836147","full_name":"DevelopedBy-Siva/token-scope","owner":"DevelopedBy-Siva","description":"Profile your LLM payloads. Find the waste. Cut the cost. Field-level token attribution, cost leak detection, and payload optimization for any LLM API.","archived":false,"fork":false,"pushed_at":"2026-03-18T06:26:00.000Z","size":44,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-03-18T19:22:55.386Z","etag":null,"topics":["docker","fastapi","github-actions","llm","openai","tiktoken"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/DevelopedBy-Siva.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-18T01:24:13.000Z","updated_at":"2026-03-18T06:31:54.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/DevelopedBy-Siva/token-scope","commit_stats":null,"previous_names":["developedby-siva/token-scope"],"tags_count":7,"template":false,"template_full_name":null,"purl":"pkg:github/DevelopedBy-Siva/token-scope","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DevelopedBy-Siva%2Ftoken-scope","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DevelopedBy-Siva%2Ftoken-scope/tags"
,"releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DevelopedBy-Siva%2Ftoken-scope/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DevelopedBy-Siva%2Ftoken-scope/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/DevelopedBy-Siva","download_url":"https://codeload.github.com/DevelopedBy-Siva/token-scope/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DevelopedBy-Siva%2Ftoken-scope/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31533824,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-07T16:28:08.000Z","status":"ssl_error","status_checked_at":"2026-04-07T16:28:06.951Z","response_time":105,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["docker","fastapi","github-actions","llm","openai","tiktoken"],"created_at":"2026-04-08T00:01:11.626Z","updated_at":"2026-04-08T00:01:28.667Z","avatar_url":"https://github.com/DevelopedBy-Siva.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# TokenScope\n\n**Profile your LLM payloads. Find the waste. Cut the cost.**\n\n```bash\npip install llm-tokenscope\n```\n\nYour LLM bill isn't just your prompt — it's JSON. 
Keys, nested objects, tool schemas, conversation history, metadata, context chunks. By the time it hits the API, your code has assembled something expensive that you've never actually looked at.\n\nTokenScope shows you which fields are burning your budget, detects structural waste, and generates an HTML report after your session.\n\n---\n\n## SDK\n\nWrap your existing client. Zero changes to your app logic.\n\n### OpenAI / OpenAI-compatible\n\n```python\nfrom tokenscope import TokenScope\nfrom openai import OpenAI\n\nwith TokenScope.wrap(OpenAI()) as client:\n    client.chat.completions.create(\n        model=\"gpt-4o\",\n        messages=[{\"role\": \"user\", \"content\": \"Hello\"}],\n    )\n# HTML report written to ./reports/\n```\n\nWorks with any OpenAI-compatible API — OpenAI, Together, Anyscale, Ollama, anything.\n\n```python\n# Ollama example — cost shows $0.00, all other profiling works normally\nwith TokenScope.wrap(OpenAI(base_url=\"http://localhost:11434/v1\", api_key=\"ollama\")) as client:\n    client.chat.completions.create(model=\"llama3\", messages=[...])\n```\n\n### Anthropic SDK\n\n```python\nimport anthropic\nfrom tokenscope import TokenScope\n\nwith TokenScope.wrap(anthropic.Anthropic()) as client:\n    client.messages.create(\n        model=\"claude-3-7-sonnet-20250219\",\n        max_tokens=1024,\n        messages=[{\"role\": \"user\", \"content\": \"Hello\"}],\n    )\n```\n\n### LangChain\n\n```python\nfrom tokenscope import TokenScope\nfrom langchain_openai import ChatOpenAI\n\nhandler = TokenScope.langchain_handler()\n\nllm = ChatOpenAI(model=\"gpt-4o\", callbacks=[handler])\nllm.invoke(\"Hello\")\n\nhandler.scope.report()  # write report manually\n```\n\nOr as a context manager:\n\n```python\nwith TokenScope.langchain_handler() as handler:\n    chain.invoke({\"input\": \"...\"}, config={\"callbacks\": [handler]})\n```\n\n### Attach extra context for analysis\n\nProfile data that's generated in your app but stripped before the API 
call:\n\n```python\nwith TokenScope.wrap(OpenAI()) as client:\n    client.chat.completions.create(\n        model=\"gpt-4o\",\n        messages=[{\"role\": \"user\", \"content\": \"Summarize\"}],\n        extra_data={\"retrieved_chunks\": chunks},  # stripped before API call, included in leak analysis\n    )\n```\n\n### Manual report control\n\n```python\nscope = TokenScope()\nclient = scope.wrap_openai(OpenAI())\n\nclient.chat.completions.create(...)\nclient.chat.completions.create(...)\n\nprint(scope.session.total_input_tokens)\nprint(scope.session.total_cost_usd)\nprint(scope.session.total_tokens_saveable)\n\nscope.report()  # write report whenever you want\n```\n\n---\n\n## The Report\n\nAfter your session, a self-contained HTML report is written to `./reports/tokenscope_\u003ctimestamp\u003e.html`.\n\n**Session summary** — total calls, input tokens, output tokens, analyzed tokens, total cost, tokens saveable.\n\n**Per call** — top fields by token cost, detected cost leaks with severity and savings estimate, cost and duration.\n\n---\n\n## Cost Leaks Detected\n\n| Rule                | What It Catches                                      | Severity  |\n| ------------------- | ---------------------------------------------------- | --------- |\n| `VERBOSE_SCHEMA`    | Tool/function descriptions over 200 tokens           | 🔴 High   |\n| `BLOATED_ARRAY`     | Arrays with 3+ similar items that could be trimmed   | 🔴 High   |\n| `DUPLICATE_CONTENT` | Same content appearing in multiple fields            | 🔴 High   |\n| `REPEATED_KEYS`     | Same key appearing 5+ times across the payload       | 🟡 Medium |\n| `LOW_SIGNAL_FIELDS` | UUIDs, timestamps, IDs the model doesn't reason over | 🟡 Medium |\n| `DEEP_NESTING`      | Objects nested 4+ levels deep                        | 🟢 Low    |\n\n---\n\n## REST API\n\n**Base URL:** `https://token-scope.onrender.com`\n\n### `POST /api/v1/analyze`\n\n```bash\ncurl -X POST https://token-scope.onrender.com/api/v1/analyze \\\n  -H 
\"Content-Type: application/json\" \\\n  -d '{\n    \"payload\": {\n      \"model\": \"gpt-4o\",\n      \"messages\": [{\"role\": \"user\", \"content\": \"Hello\"}]\n    },\n    \"model_id\": \"gpt-4o\",\n    \"requests_per_day\": 100\n  }'\n```\n\n**Response includes:**\n\n- `total_tokens` — exact tiktoken count\n- `top_fields` — top 5 most expensive leaf fields\n- `leaks` — detected cost leaks with severity and savings estimate\n- `optimization` — cleaned payload with tokens saved\n- `cost` — per-request cost breakdown\n- `monthly` — projected monthly cost at given request volume\n- `all_models` — cost comparison across all supported models\n\n### `GET /api/v1/health`\n\n```json\n{ \"status\": \"ok\", \"version\": \"0.2.0\" }\n```\n\n---\n\n## Supported Models\n\n| Model             | Provider  | Input (per 1M) | Output (per 1M) |\n| ----------------- | --------- | -------------- | --------------- |\n| GPT-4o            | OpenAI    | $2.50          | $10.00          |\n| GPT-4o mini       | OpenAI    | $0.15          | $0.60           |\n| GPT-4 Turbo       | OpenAI    | $10.00         | $30.00          |\n| o3                | OpenAI    | $10.00         | $40.00          |\n| o3-mini           | OpenAI    | $1.10          | $4.40           |\n| Claude 3.7 Sonnet | Anthropic | $3.00          | $15.00          |\n| Claude 3.5 Sonnet | Anthropic | $3.00          | $15.00          |\n| Claude 3.5 Haiku  | Anthropic | $0.80          | $4.00           |\n| Claude 3 Haiku    | Anthropic | $0.25          | $1.25           |\n| Gemini 2.0 Flash  | Google    | $0.10          | $0.40           |\n| Gemini 1.5 Pro    | Google    | $1.25          | $5.00           |\n| Gemini 1.5 Flash  | Google    | $0.075         | $0.30           |\n\nPricing is stored in `src/tokenscope/prices.json` and loaded at runtime. A warning is shown if the file is more than 60 days old. 
Unknown models show `$0.00` — all other profiling works normally.\n\n---\n\n## How Token Counting Works\n\nTokenScope uses `tiktoken` — OpenAI's tokenizer. Token counting is deterministic and runs locally: no API calls are made and no data leaves your machine.\n\n**Accuracy:** Exact for OpenAI models. ~95% for Claude. ~90% for Gemini.\n\n**Two token counts per call:**\n\n- **Input tokens** — tiktoken count of what was actually sent to the API\n- **Analyzed tokens** — full payload count including `extra_data`. This is the count that leak detection runs against.\n\nPer-field attribution is proportionally estimated. The session total is always exact.\n\n---\n\n## Project Structure\n\n```\ntoken-scope/\n├── pyproject.toml\n├── Dockerfile\n│\n├── src/\n│   └── tokenscope/\n│       ├── __init__.py          ← public API: TokenScope, TokenScopeSession\n│       ├── client.py            ← OpenAI wrapper, Anthropic wrapper, LangChain handler\n│       ├── reporter.py          ← writes reports/ HTML\n│       ├── prices.json          ← pricing data, update without touching code\n│       └── core/\n│           ├── tokenizer.py     ← tiktoken wrapper, per-field attribution\n│           ├── parser.py        ← JSON tree walker\n│           ├── leak_detector.py ← 6-rule waste detection\n│           ├── payload_optimizer.py\n│           └── calculator.py    ← token counts → dollar costs\n│\n├── api/\n│   ├── main.py                  ← FastAPI app\n│   ├── routes.py                ← /analyze, /health\n│   └── models.py                ← Pydantic request/response models\n│\n└── tests/\n    ├── test_calculator.py\n    ├── test_leak_detector.py\n    ├── test_payload_optimizer.py\n    ├── test_tokenizer_parser.py\n    └── test_api.py\n```\n\n---\n\n## Running Locally\n\n```bash\ngit clone https://github.com/DevelopedBy-Siva/token-scope\ncd token-scope\n\n# SDK only\npip install -e .\n\n# API\npip install -e \".[api]\"\nuvicorn api.main:app --reload\n\n# Tests\npip install -e \".[dev]\"\npytest\n```\n\n## Docker\n\n```bash\ndocker 
build -t tokenscope .\ndocker run -p 8000:8000 tokenscope\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdevelopedby-siva%2Ftoken-scope","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdevelopedby-siva%2Ftoken-scope","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdevelopedby-siva%2Ftoken-scope/lists"}