{"id":49546460,"url":"https://github.com/BlockRunAI/blockrun-llm","last_synced_at":"2026-05-19T09:01:03.162Z","repository":{"id":330921274,"uuid":"1123804298","full_name":"BlockRunAI/blockrun-llm","owner":"BlockRunAI","description":"Python SDK for BlockRun — 41+ AI models, pay-per-call with USDC. OpenAI-compatible. Zero rate limits.","archived":false,"fork":false,"pushed_at":"2026-04-20T17:22:43.000Z","size":195,"stargazers_count":5,"open_issues_count":0,"forks_count":3,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-20T19:29:49.189Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/BlockRunAI.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2025-12-27T16:55:11.000Z","updated_at":"2026-04-20T17:45:42.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/BlockRunAI/blockrun-llm","commit_stats":null,"previous_names":["blockrunai/blockrun-llm"],"tags_count":4,"template":false,"template_full_name":null,"purl":"pkg:github/BlockRunAI/blockrun-llm","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BlockRunAI%2Fblockrun-llm","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BlockRunAI%2Fblockrun-llm/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BlockRunAI%2Fblockrun-llm/releases","manifes
ts_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BlockRunAI%2Fblockrun-llm/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/BlockRunAI","download_url":"https://codeload.github.com/BlockRunAI/blockrun-llm/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BlockRunAI%2Fblockrun-llm/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33209392,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-19T07:54:09.561Z","status":"ssl_error","status_checked_at":"2026-05-19T07:54:08.508Z","response_time":58,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-05-02T20:00:18.032Z","updated_at":"2026-05-19T09:01:03.126Z","avatar_url":"https://github.com/BlockRunAI.png","language":"Python","funding_links":[],"categories":["Resources \u0026 Directories"],"sub_categories":["Infrastructure"],"readme":"# BlockRun LLM SDK (Python)\r\n\r\n\u003e **blockrun-llm** is a Python SDK for accessing 80+ large language models (GPT-5.x, Claude 4.x, Gemini 3.x, DeepSeek, Grok 4.x, GLM, MiniMax, Moonshot and more) plus image / video / music generation, Grok Live Search, prediction-market data (Predexon), Exa neural web search, and Pyth-backed market data — all with automatic pay-per-request USDC micropayments via the x402 protocol. 
No API keys required; your wallet signature is your authentication. Built for AI agents that need to operate autonomously.\r\n\u003e\r\n\u003e 🆓 **Includes 8 fully-free NVIDIA-hosted models** — DeepSeek V4 Flash (1M context), Nemotron Nano Omni (vision), Qwen3 Next + Coder, Llama 4 Maverick, Mistral Small 4, plus `gpt-oss-120b/20b` (hidden from `/v1/models` but direct calls still work). Zero USDC, no rate-limit gimmicks. Use `routing_profile=\"free\"` or call any `nvidia/*` model directly.\r\n\r\n[![PyPI](https://img.shields.io/pypi/v/blockrun-llm.svg)](https://pypi.org/project/blockrun-llm/)\r\n[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)\r\n\r\n**BlockRun assumes Claude Code as the agent runtime.**\r\n\r\n## Supported Chains\r\n\r\n| Chain | Network | Payment | Status |\r\n|-------|---------|---------|--------|\r\n| **Base** | Base Mainnet (Chain ID: 8453) | USDC | ✅ Primary |\r\n| **Base Testnet** | Base Sepolia (Chain ID: 84532) | Testnet USDC | ✅ Development |\r\n| **Solana** | Solana Mainnet | USDC (SPL) | ✅ New |\r\n\r\n\u003e **XRPL (RLUSD):** Use [blockrun-llm-xrpl](https://pypi.org/project/blockrun-llm-xrpl/) for XRPL payments\r\n\r\n**Protocol:** x402 v2\r\n\r\n## Installation\r\n\r\n```bash\r\npip install blockrun-llm              # Base chain (EVM/USDC) — includes all core deps\r\npip install blockrun-llm[solana]      # Base + Solana (USDC SPL) payments\r\npip install blockrun-llm[dev]         # Base + dev tools (pytest, black, ruff, mypy)\r\npip install blockrun-llm[dev,solana]  # Everything\r\n```\r\n\r\n## Quick Start\r\n\r\n```python\r\nfrom blockrun_llm import LLMClient\r\n\r\nclient = LLMClient()  # Uses BLOCKRUN_WALLET_KEY (never sent to server)\r\nresponse = client.chat(\"openai/gpt-5.2\", \"Hello!\")\r\n```\r\n\r\nThat's it. The SDK handles x402 payment automatically.\r\n\r\n### Try It Free (No USDC Required)\r\n\r\nWant to kick the tires before funding a wallet? 
Route to BlockRun's free NVIDIA tier:\r\n\r\n```python\r\nfrom blockrun_llm import LLMClient\r\n\r\nclient = LLMClient()  # Wallet still required for signing, but $0 charged\r\n\r\n# Option 1: call a free model directly\r\nresponse = client.chat(\"nvidia/qwen3-next-80b-a3b-thinking\", \"Explain x402 in 1 sentence\")\r\n\r\n# Option 2: let the smart router pick the best free model per request\r\nresult = client.smart_chat(\"What is 2+2?\", routing_profile=\"free\")\r\nprint(result.model)     # e.g. 'nvidia/deepseek-v4-flash' (cheapest capable for SIMPLE tier)\r\nprint(result.response)  # '4'\r\n```\r\n\r\n**Available free models** (input + output both $0, all NVIDIA-hosted):\r\n\r\n| Model ID | Context | Best For |\r\n|----------|---------|----------|\r\n| `nvidia/deepseek-v4-flash` | 1M | DeepSeek V4 Flash — 284B / 13B active MoE, ~5× faster than V4 Pro. Best free chat / summarization / light reasoning |\r\n| `nvidia/nemotron-3-nano-omni-30b-a3b-reasoning` | 256K | Only vision-capable free model — text + images + video (≤2 min) + audio (≤1 hr) |\r\n| `nvidia/qwen3-next-80b-a3b-thinking` | 131K | 116 tok/s reasoning with thinking mode |\r\n| `nvidia/mistral-small-4-119b` | 131K | 114 tok/s — fastest free chat |\r\n| `nvidia/llama-4-maverick` | 131K | Meta Llama 4 Maverick MoE |\r\n| `nvidia/qwen3-coder-480b` | 131K | Coding-optimised 480B MoE |\r\n| `nvidia/gpt-oss-120b` | 128K | OpenAI open-weight 120B — 123 tok/s. Hidden from `/v1/models` (so SmartChat won't auto-pick it) but direct calls still work |\r\n| `nvidia/gpt-oss-20b` | 128K | OpenAI open-weight 20B — 155 tok/s. Hidden from `/v1/models` but direct calls still work |\r\n\r\n\u003e Need V4-Pro-class reasoning? 
Use the paid `deepseek/deepseek-v4-pro` ($0.50/$1.00 with the 75% promo through 2026-05-31) — `nvidia/deepseek-v4-pro` is hidden because NVIDIA's NIM deployment is hung; backend MODEL_REDIRECTS forwards calls to V4 Flash.\r\n\r\n\u003e **Privacy note for `gpt-oss-120b/20b`**: NVIDIA's free build.nvidia.com tier reserves the right to use prompts/outputs for service improvement. The models are hidden from `/v1/models` so SmartChat won't auto-route to them, but direct calls still work — use them only when prompts contain no sensitive data.\r\n\r\n## Solana Support\r\n\r\nPay for AI calls with Solana USDC via [sol.blockrun.ai](https://sol.blockrun.ai):\r\n\r\n```python\r\nfrom blockrun_llm import SolanaLLMClient\r\n\r\n# SOLANA_WALLET_KEY env var (bs58-encoded Solana secret key)\r\nclient = SolanaLLMClient()\r\n\r\n# Or pass key directly\r\nclient = SolanaLLMClient(private_key=\"your-bs58-solana-key\")\r\n\r\n# Same API as LLMClient\r\nresponse = client.chat(\"openai/gpt-5.2\", \"gm Solana\")\r\nprint(response)\r\n\r\n# DeepSeek on Solana\r\nanswer = client.chat(\"deepseek/deepseek-chat\", \"Explain Solana consensus\", temperature=0.5)\r\n```\r\n\r\n**Setup:**\r\n```bash\r\npip install blockrun-llm[solana]\r\nexport SOLANA_WALLET_KEY=\"your-bs58-solana-key\"\r\n```\r\n\r\n**Endpoint:** `https://sol.blockrun.ai/api`\r\n**Payment:** Solana USDC (SPL Token, mainnet)\r\n\r\n## Smart Routing (ClawRouter)\r\n\r\nLet the SDK automatically pick the cheapest capable model for each request:\r\n\r\n```python\r\nfrom blockrun_llm import LLMClient\r\n\r\nclient = LLMClient()\r\n\r\n# Auto-routes to cheapest capable model\r\nresult = client.smart_chat(\"What is 2+2?\")\r\nprint(result.response)  # '4'\r\nprint(result.model)     # 'moonshot/kimi-k2.6' (Moonshot flagship — vision + reasoning_content)\r\nprint(f\"Saved {result.routing.savings * 100:.0f}%\")  # 'Saved 94%'\r\n\r\n# Complex reasoning task -\u003e routes to reasoning model\r\nresult = client.smart_chat(\"Prove the Riemann 
hypothesis step by step\")\r\nprint(result.model)  # 'deepseek/deepseek-reasoner'\r\n```\r\n\r\n### Routing Profiles\r\n\r\n| Profile | Description | Best For |\r\n|---------|-------------|----------|\r\n| `free` | NVIDIA free tier — smart-routes across 9 models (DeepSeek V4 Pro/Flash, Nemotron Nano Omni, Qwen3, GLM-4.7, Llama 4, Mistral) | Zero-cost testing, dev, prod |\r\n| `eco` | Cheapest models per tier (DeepSeek, NVIDIA) | Cost-sensitive production |\r\n| `auto` | Best balance of cost/quality (default) | General use |\r\n| `premium` | Top-tier models (OpenAI, Anthropic) | Quality-critical tasks |\r\n\r\n```python\r\n# Use premium models for complex tasks\r\nresult = client.smart_chat(\r\n    \"Write production-grade async Python code\",\r\n    routing_profile=\"premium\"\r\n)\r\nprint(result.model)  # 'openai/gpt-5.4'\r\n```\r\n\r\n### How It Works\r\n\r\nClawRouter uses a 14-dimension rule-based classifier to analyze each request:\r\n\r\n- **Token count** - Short vs long prompts\r\n- **Code presence** - Programming keywords\r\n- **Reasoning markers** - \"prove\", \"step by step\", etc.\r\n- **Technical terms** - Architecture, optimization, etc.\r\n- **Creative markers** - Story, poem, brainstorm, etc.\r\n- **Agentic patterns** - Multi-step, tool use indicators\r\n\r\nThe classifier runs in \u003c1ms, 100% locally, and routes to one of four tiers:\r\n\r\n| Tier | Example Tasks | Auto Profile Model |\r\n|------|---------------|-------------------|\r\n| SIMPLE | \"What is 2+2?\", definitions | moonshot/kimi-k2.6 |\r\n| MEDIUM | Code snippets, explanations | google/gemini-2.5-flash |\r\n| COMPLEX | Architecture, long documents | google/gemini-3.1-pro |\r\n| REASONING | Proofs, multi-step reasoning | deepseek/deepseek-reasoner |\r\n\r\n## How It Works\r\n\r\n1. You send a request to BlockRun's API\r\n2. The API returns a 402 Payment Required with the price\r\n3. The SDK automatically signs a USDC payment on Base\r\n4. 
The request is retried with the payment proof\r\n5. You receive the AI response\r\n\r\n**Your private key never leaves your machine** - it's only used for local signing.\r\n\r\n## Available Models\r\n\r\n### OpenAI GPT-5.5 Family\r\nReleased 2026-04-23 — first fully retrained base since GPT-4.5. 1M context, 128K output, native agent + computer use.\r\n\r\n| Model | Input Price | Output Price | Context |\r\n|-------|-------------|--------------|---------|\r\n| `openai/gpt-5.5` | $5.00/M | $30.00/M | 1M |\r\n\r\n### OpenAI GPT-5.4 Family\r\n| Model | Input Price | Output Price | Context |\r\n|-------|-------------|--------------|---------|\r\n| `openai/gpt-5.4` | $2.50/M | $15.00/M | 1M |\r\n| `openai/gpt-5.4-pro` | $30.00/M | $180.00/M | 1M |\r\n| `openai/gpt-5.4-mini` | $0.75/M | $4.50/M | 400K |\r\n| `openai/gpt-5.4-nano` | $0.20/M | $1.25/M | 1M |\r\n\r\n### OpenAI GPT-5 Family\r\n| Model | Input Price | Output Price | Context |\r\n|-------|-------------|--------------|---------|\r\n| `openai/gpt-5.3` | $1.75/M | $14.00/M | 128K |\r\n| `openai/gpt-5.2` | $1.75/M | $14.00/M | 400K |\r\n| `openai/gpt-5-mini` | $0.25/M | $2.00/M | 200K |\r\n| `openai/gpt-5.2-pro` | $21.00/M | $168.00/M | 400K |\r\n| `openai/gpt-5.3-codex` | $1.75/M | $14.00/M | 400K |\r\n\r\n### OpenAI O-Series (Reasoning)\r\n| Model | Input Price | Output Price | Context |\r\n|-------|-------------|--------------|---------|\r\n| `openai/o1` | $15.00/M | $60.00/M | 200K |\r\n| `openai/o1-mini` | $1.10/M | $4.40/M | 128K |\r\n| `openai/o3` | $2.00/M | $8.00/M | 200K |\r\n| `openai/o3-mini` | $1.10/M | $4.40/M | 128K |\r\n\r\n### Anthropic Claude\r\n| Model | Input Price | Output Price | Context | Notes |\r\n|-------|-------------|--------------|---------|-------|\r\n| `anthropic/claude-opus-4.7` | $5.00/M | $25.00/M | 1M | Most capable Claude — agentic coding + adaptive thinking, 128K output |\r\n| `anthropic/claude-opus-4.6` | $5.00/M | $25.00/M | 200K | Hidden from `/v1/models` (superseded by 4.7); 
direct calls still work |\r\n| `anthropic/claude-opus-4.5` | $5.00/M | $25.00/M | 200K | |\r\n| `anthropic/claude-sonnet-4.6` | $3.00/M | $15.00/M | 200K | |\r\n| `anthropic/claude-haiku-4.5` | $1.00/M | $5.00/M | 200K | |\r\n\r\n### Google Gemini\r\n| Model | Input Price | Output Price | Context |\r\n|-------|-------------|--------------|---------|\r\n| `google/gemini-3.1-pro` | $2.00/M | $12.00/M | 1M |\r\n| `google/gemini-3-pro-preview` | $2.00/M | $12.00/M | 1M |\r\n| `google/gemini-3-flash-preview` | $0.50/M | $3.00/M | 1M |\r\n| `google/gemini-2.5-pro` | $1.25/M | $10.00/M | 1M |\r\n| `google/gemini-2.5-flash` | $0.30/M | $2.50/M | 1M |\r\n| `google/gemini-3.1-flash-lite` | $0.25/M | $1.50/M | 1M |\r\n| `google/gemini-2.5-flash-lite` | $0.10/M | $0.40/M | 1M |\r\n\r\n### DeepSeek\r\n\r\nV4 family launched 2026-04-24. DeepSeek upstream now serves the legacy\r\n`deepseek-chat` / `deepseek-reasoner` aliases as V4 Flash non-thinking /\r\nthinking modes. V4 Pro is the new flagship paid SKU — 1.6T MoE / 49B active,\r\n1M context, MMLU-Pro 87.5, GPQA 90.1, SWE-bench 80.6, LiveCodeBench 93.5.\r\n\r\n| Model | Input Price | Output Price | Context | Notes |\r\n|-------|-------------|--------------|---------|-------|\r\n| `deepseek/deepseek-v4-pro` | $0.50/M | $1.00/M | 1M | V4 flagship — strongest open-weight reasoner. 
**75% off until 2026-05-31** (list $2.00/$4.00) |\r\n| `deepseek/deepseek-chat` | $0.20/M | $0.40/M | 1M | V4 Flash non-thinking (paid endpoint with 5MB request bodies; same upstream as `nvidia/deepseek-v4-flash`) |\r\n| `deepseek/deepseek-reasoner` | $0.20/M | $0.40/M | 1M | V4 Flash thinking (same upstream as `deepseek-chat`, thinking enabled by default) |\r\n\r\n### MiniMax\r\n| Model | Input Price | Output Price | Context |\r\n|-------|-------------|--------------|---------|\r\n| `minimax/minimax-m2.7` | $0.30/M | $1.20/M | 200K |\r\n\r\n### ZAI\r\n\r\nThe GLM-5 family bills as **flat $0.001/call** (no token counting) — `/v1/models` reports them under `billing_mode: \"flat\"`. Per-call pricing makes them cheapest-of-class for short prompts.\r\n\r\n| Model | Price | Context | Notes |\r\n|-------|-------|---------|-------|\r\n| `zai/glm-5.1` | $0.001/call | 200K | Z.AI's latest flagship — #1 open-source on SWE-Bench Pro, 8-hour autonomous execution |\r\n| `zai/glm-5` | $0.001/call | 200K | |\r\n| `zai/glm-5-turbo` | $0.001/call | 200K | |\r\n\r\n### NVIDIA (Free \u0026 Hosted)\r\n\r\nFree tier refreshed 2026-04-28: added `nvidia/deepseek-v4-flash` (1M context)\r\nand `nvidia/nemotron-3-nano-omni-30b-a3b-reasoning` (vision). 
`nvidia/gpt-oss-120b`\r\nand `nvidia/gpt-oss-20b` were briefly delisted over privacy concerns\r\n(NVIDIA's free build.nvidia.com tier reserves the right to use prompts for\r\nservice improvement) but **re-enabled 2026-04-30 with `available: true` +\r\n`hidden: true`** — they no longer appear in `/v1/models` (so SmartChat won't\r\nauto-pick them) but direct calls by full ID still return HTTP 200.\r\n`nvidia/deepseek-v4-pro`, `nvidia/deepseek-v3.2`, and `nvidia/glm-4.7` are\r\nhidden because NVIDIA's NIM deployment is hung — backend MODEL_REDIRECTS\r\nauto-forwards calls to V4 Flash / qwen3-coder.\r\n\r\n| Model | Input Price | Output Price | Context | Notes |\r\n|-------|-------------|--------------|---------|-------|\r\n| `nvidia/deepseek-v4-flash` | **FREE** | **FREE** | 1M | DeepSeek V4 Flash — 284B / 13B active MoE, ~5× faster than V4 Pro. Best free chat / summarization |\r\n| `nvidia/nemotron-3-nano-omni-30b-a3b-reasoning` | **FREE** | **FREE** | 256K | First vision-capable free model — RGB images, mp4 video |\r\n| `nvidia/qwen3-next-80b-a3b-thinking` | **FREE** | **FREE** | 131K | Reasoning flagship — 116 tok/s, thinking mode |\r\n| `nvidia/mistral-small-4-119b` | **FREE** | **FREE** | 131K | Fastest chat — 114 tok/s |\r\n| `nvidia/llama-4-maverick` | **FREE** | **FREE** | 131K | Meta Llama 4 Maverick MoE |\r\n| `nvidia/qwen3-coder-480b` | **FREE** | **FREE** | 131K | Coding-optimised 480B MoE |\r\n| `nvidia/gpt-oss-120b` | **FREE** | **FREE** | 128K | OpenAI open-weight 120B — 123 tok/s. Hidden from `/v1/models`; direct calls work |\r\n| `nvidia/gpt-oss-20b` | **FREE** | **FREE** | 128K | OpenAI open-weight 20B — 155 tok/s. 
Hidden from `/v1/models`; direct calls work |\r\n| `moonshot/kimi-k2.5` | $0.60/M | $3.00/M | 262K | Kimi K2.5 direct from Moonshot (replaces `nvidia/kimi-k2.5`) |\r\n| `moonshot/kimi-k2.6` | $0.95/M | $4.00/M | 256K | Moonshot flagship (vision + reasoning_content) |\r\n\r\n### Testnet Models (Base Sepolia)\r\n| Model | Price |\r\n|-------|-------|\r\n| `openai/gpt-oss-20b` | $0.001/request |\r\n| `openai/gpt-oss-120b` | $0.002/request |\r\n\r\n*Testnet models use flat pricing (no token counting) for simplicity.*\r\n\r\n### Verifying Models End-to-End\r\n\r\nThe SDK ships two runnable sweep scripts under `examples/`:\r\n\r\n```bash\r\n# Chat LLMs — every chat model the SDK exposes\r\npython examples/sweep_all_chat_models.py --output-json sweep-results.json\r\n\r\n# Image + music models (video excluded — long polling, expensive per clip)\r\npython examples/sweep_all_media_models.py --output-json sweep-media-results.json\r\n```\r\n\r\nEach script captures per-model status, latency, token counts, and per-call\r\ncost, prints a grouped report, and exits non-zero if any expected-to-work\r\nmodel fails. Useful before a release or after router/catalog changes.\r\n\r\n`smart_chat()` and `chat()` accept an optional `fallback_models=[...]` list —\r\non timeout / 5xx / network error the SDK transparently walks the chain\r\nbefore raising. 
`smart_chat()` populates this from the tier's fallback list\r\nautomatically.\r\n\r\n### Image Generation\r\n\r\n| Model | Price |\r\n|-------|-------|\r\n| `openai/dall-e-3` | $0.04/image |\r\n| `openai/gpt-image-1` | $0.02/image |\r\n| `openai/gpt-image-2` | $0.06/image (reasoning-driven, multilingual text rendering, character consistency) |\r\n| `google/nano-banana` | $0.05/image |\r\n| `google/nano-banana-pro` | $0.10/image |\r\n| `xai/grok-imagine-image` | $0.02/image |\r\n| `xai/grok-imagine-image-pro` | $0.07/image |\r\n| `zai/cogview-4` | $0.015/image |\r\n\r\nImage editing (`client.edit`): `openai/gpt-image-1` and `openai/gpt-image-2` both support the `/v1/images/image2image` endpoint.\r\n\r\n### Video Generation\r\n| Model | Price |\r\n|-------|-------|\r\n| `xai/grok-imagine-video` | $0.05/sec (8s default → $0.42/clip) |\r\n| `bytedance/seedance-1.5-pro` | $0.03/sec (5s default, up to 10s, 720p) |\r\n| `bytedance/seedance-2.0-fast` | $0.15/sec (~60-80s gen, sweet-spot price/quality) |\r\n| `bytedance/seedance-2.0` | $0.30/sec (720p Pro) |\r\n\r\n```python\r\nfrom blockrun_llm import VideoClient\r\n\r\nclient = VideoClient()\r\nresult = client.generate(\"a red apple slowly spinning on a wooden table\")\r\nprint(result.data[0].url)            # permanent MP4 URL\r\nprint(result.data[0].duration_seconds)  # 8\r\n\r\n# Image-to-video\r\nresult = client.generate(\r\n    \"the subject turns its head and smiles\",\r\n    image_url=\"https://example.com/portrait.jpg\",\r\n)\r\n```\r\n\r\n## Voice Calls (`VoiceClient`)\r\n\r\n`VoiceClient` wraps `POST /v1/voice/call` (paid, $0.54/call) and\r\n`GET /v1/voice/call/{call_id}` (free polling) — AI-powered outbound phone\r\ncalls powered by Bland.ai. The agent dials the recipient and runs a real-time\r\nconversation based on your `task` instructions. 
US + Canada destinations.\r\n\r\n```python\r\nfrom blockrun_llm import VoiceClient\r\n\r\nclient = VoiceClient()\r\n\r\n# Initiate (paid $0.54)\r\nresult = client.call(\r\n    to=\"+14155552671\",\r\n    task=\"You are a friendly assistant calling to confirm a 3pm dentist appointment.\",\r\n    voice=\"maya\",          # nat / josh / maya / june / paige / derek / florian\r\n    max_duration=5,        # minutes (1–30)\r\n)\r\nprint(result[\"call_id\"])\r\n\r\n# Poll for transcript + recording (free)\r\nstatus = client.get_status(result[\"call_id\"])\r\nprint(status.get(\"status\"), status.get(\"recording_url\"))\r\n```\r\n\r\nBring your own caller-ID: pass `from_=\"+14155552671\"` (must be a BlockRun\r\nphone number you own; buy via `PhoneClient.buy_number()` or\r\n`/v1/phone/numbers/buy`). If you omit `from_` and your wallet owns exactly one\r\nactive number, the backend auto-picks it; with multiple active numbers you'll\r\nget a `400 ambiguous_from` and the error body lists your candidates.\r\n\r\n## Phone Numbers (`PhoneClient`)\r\n\r\n`PhoneClient` wraps `/v1/phone/*` — Twilio-backed phone lookup and\r\nwallet-bound number provisioning. 
Buy a number once to use it as caller ID in\r\n`VoiceClient`; the number is leased for 30 days and tied to your wallet.\r\n\r\n```python\r\nfrom blockrun_llm import PhoneClient\r\n\r\nclient = PhoneClient()\r\n\r\n# Carrier + line-type lookup ($0.01)\r\ninfo = client.lookup(\"+14155552671\")\r\n\r\n# Carrier + SIM-swap/forwarding fraud signals ($0.05)\r\nfraud = client.lookup_fraud(\"+14155552671\")\r\n\r\n# Buy a number — 30-day lease, wallet-bound ($5.00).\r\n# Payment is held until Twilio confirms the purchase, so failed buys never charge you.\r\nbought = client.buy_number(country=\"US\", area_code=\"415\")\r\nprint(bought[\"phone_number\"], bought[\"expires_at\"])\r\n\r\n# List, renew, release\r\nprint(client.list_numbers())                    # $0.001\r\nclient.renew_number(bought[\"phone_number\"])     # $5.00, +30 days\r\nclient.release_number(bought[\"phone_number\"])   # free\r\n```\r\n\r\n## Surf — Crypto Intelligence (`SurfClient`)\r\n\r\n`SurfClient` wraps `/v1/surf/*` — the asksurf.ai partner gateway, ~83 crypto\r\nendpoints across exchanges, on-chain SQL, prediction markets (Polymarket +\r\nKalshi), wallets, social analytics, and project intelligence. 
Tiered pricing:\r\n$0.001 / $0.005 / $0.020 per call (tier 1 / 2 / 3).\r\n\r\n```python\r\nfrom blockrun_llm import SurfClient\r\n\r\nclient = SurfClient()\r\n\r\n# Discovery\r\nprint(SurfClient.endpoints())                       # full catalog\r\nprint(client.price(\"market/ranking\"))               # 0.001\r\nprint(client.endpoint_info(\"onchain/sql\"))          # {'method': 'POST', 'tier': 3, ...}\r\n\r\n# GET — pass query params (validated against the catalog)\r\nbtc_price = client.get(\"exchange/price\", {\"pair\": \"BTC/USDT\"})\r\nholders   = client.get(\"token/holders\", {\"address\": \"0x...\", \"chain\": \"ethereum\"})\r\n\r\n# POST — JSON body\r\nrows = client.post(\"onchain/sql\", {\"query\": \"SELECT count() FROM ethereum.blocks\"})\r\n\r\n# Generic helper — auto-routes GET vs POST from the catalog\r\nresult = client.call(\"token/holders\", params={\"address\": \"0x...\", \"chain\": \"ethereum\"})\r\n```\r\n\r\n## Standalone Search (`SearchClient`)\r\n\r\n`SearchClient` wraps `POST /v1/search` — standalone Grok Live Search with\r\nautomatic x402 payment. Pricing: `$0.025/source + margin`\r\n(10 sources ≈ `$0.26`).\r\n\r\n```python\r\nfrom blockrun_llm import SearchClient\r\n\r\nclient = SearchClient()\r\nresult = client.search(\r\n    \"Latest news on x402 adoption\",\r\n    sources=[\"x\", \"web\"],\r\n    max_results=10,\r\n)\r\nprint(result.summary)\r\nfor url in result.citations or []:\r\n    print(url)\r\n```\r\n\r\n## Market Data (`PriceClient`)\r\n\r\nPyth-backed realtime quotes and OHLC history across crypto, FX, commodities\r\nand 12 global equity markets. Crypto / FX / commodity are **fully free**\r\nacross price, history and list; stocks (`stocks/{market}` and the `usstock`\r\nlegacy alias) charge `$0.001` per price or history call. 
Pass\r\n`require_wallet=False` when you only need free endpoints.\r\n\r\n```python\r\nfrom blockrun_llm import PriceClient\r\n\r\n# Free usage — no wallet\r\np = PriceClient(require_wallet=False)\r\nbtc = p.price(\"crypto\", \"BTC-USD\")\r\neur = p.price(\"fx\", \"EUR-USD\")\r\nsymbols = p.list_symbols(\"crypto\", q=\"sol\", limit=20)\r\n\r\n# Paid — requires a wallet\r\np2 = PriceClient()\r\naapl = p2.price(\"stocks\", \"AAPL\", market=\"us\")\r\nbars = p2.history(\r\n    \"stocks\", \"AAPL\",\r\n    market=\"us\",\r\n    resolution=\"D\",\r\n    from_ts=1_700_000_000,\r\n    to_ts=1_710_000_000,\r\n)\r\n```\r\n\r\nSupported stock markets: `us, hk, jp, kr, gb, de, fr, nl, ie, lu, cn, ca`.\r\n\r\n## Prediction Markets (Powered by Predexon v2)\r\n\r\nAccess real-time prediction market data from Polymarket, Kalshi, Limitless, sports, and Binance Futures via [Predexon](https://predexon.com). No API keys needed — pay-per-request via x402. Tier 1 endpoints are $0.001/call, Tier 2 (wallet identity / clustering) are $0.005/call.\r\n\r\nEach method below is available on `LLMClient` (Base), `AsyncLLMClient`, and `SolanaLLMClient`.\r\n\r\n### Typed helpers\r\n\r\n| Method | Endpoint | Tier |\r\n|---|---|---|\r\n| `pm_markets(**filters)` | canonical cross-venue markets | 1 |\r\n| `pm_listings(**filters)` | venue-native executable listings | 1 |\r\n| `pm_outcome(predexon_id)` | resolve a canonical outcome | 1 |\r\n| `pm_polymarket_markets(**filters)` | Polymarket markets (offset pagination) | 1 |\r\n| `pm_polymarket_events(**filters)` | Polymarket events (offset pagination) | 1 |\r\n| `pm_polymarket_markets_keyset(**filters)` | Polymarket markets, cursor pagination | 1 |\r\n| `pm_polymarket_events_keyset(**filters)` | Polymarket events, cursor pagination | 1 |\r\n| `pm_polymarket_positions(**filters)` | per-wallet open positions + PnL | 1 |\r\n| `pm_polymarket_trades(**filters)` | recent trades (token, side, price, tx_hash) | 1 |\r\n| `pm_polymarket_leaderboard(**filters)` | 
trader leaderboard (window, sort_by) | 1 |\r\n| `pm_kalshi_markets(**filters)` | Kalshi event contracts | 1 |\r\n| `pm_limitless_markets(**filters)` | Limitless binary AMM markets | 1 |\r\n| `pm_sports_categories()` | available sports categories | 1 |\r\n| `pm_sports_markets(**filters)` | sports markets grouped by game | 1 |\r\n| `pm_wallet_identity(wallet)` | identity + profile for one wallet | 2 |\r\n| `pm_wallet_identities(addresses)` | bulk identity for ≤200 wallets (POST) | 2 |\r\n| `pm_wallet_cluster(address)` | on-chain transfer + identity-proof cluster | 2 |\r\n\r\n```python\r\nfrom blockrun_llm import LLMClient\r\n\r\nclient = LLMClient()\r\n\r\n# Canonical cross-venue snapshot\r\nmarkets = client.pm_markets(status=\"active\", limit=20)\r\nlistings = client.pm_listings(venue=\"polymarket\", limit=20)\r\n\r\n# Polymarket\r\nevents = client.pm_polymarket_events(limit=10)\r\npositions = client.pm_polymarket_positions(user=\"0xABC123...\")\r\ntop = client.pm_polymarket_leaderboard(window=\"7d\", sort_by=\"pnl\", limit=10)\r\n\r\n# Sports + Kalshi + Limitless\r\ngames = client.pm_sports_markets(league=\"NBA\", limit=10)\r\nkalshi = client.pm_kalshi_markets(limit=10)\r\nlimitless = client.pm_limitless_markets(limit=10)\r\n\r\n# Wallet identity (Tier 2)\r\nprofile = client.pm_wallet_identity(\"0xABC123...\")\r\nbatch = client.pm_wallet_identities([\"0xABC...\", \"0xDEF...\"])\r\ncluster = client.pm_wallet_cluster(\"0xABC123...\")\r\n```\r\n\r\n### Generic passthrough\r\n\r\nFor endpoints without a typed helper, drop down to `pm()` (GET) or `pm_query()`\r\n(POST). 
Same pricing tiers, same return shape:\r\n\r\n```python\r\ncandles = client.pm(\"polymarket/candlesticks/0x1234abcd...\")  # OHLCV\r\nbtc = client.pm(\"binance/candles/BTCUSDT\")                    # crypto candles\r\npairs = client.pm(\"matching-markets/pairs\")                   # cross-platform pairs\r\n```\r\n\r\n## Exa Web Search (Powered by Exa)\r\n\r\nAccess [Exa](https://exa.ai)'s neural web search via x402. No API keys needed — pay-per-request in USDC. Available on both `LLMClient` (Base, recommended) and `SolanaLLMClient` (Solana).\r\n\r\n| Endpoint | Method | Price |\r\n|---|---|---|\r\n| `exa_search` | Neural/keyword web search | $0.01/request |\r\n| `exa_find_similar` | Find semantically similar pages | $0.01/request |\r\n| `exa_contents` | Extract full text from URLs | $0.002/URL |\r\n| `exa_answer` | AI answer grounded in web search | $0.01/request |\r\n\r\n```python\r\nfrom blockrun_llm import LLMClient\r\n\r\nclient = LLMClient()  # uses BLOCKRUN_WALLET_KEY (Base USDC)\r\n\r\n# Neural web search ($0.01/request)\r\nresults = client.exa_search(\"latest AI safety research\", numResults=5)\r\nresults = client.exa_search(\"bitcoin ETF news\", category=\"news\", numResults=10)\r\n\r\n# Find similar pages ($0.01/request)\r\nsimilar = client.exa_find_similar(\"https://openai.com/research/gpt-4\", numResults=5)\r\n\r\n# Extract content from URLs ($0.002/URL)\r\ncontent = client.exa_contents([\"https://arxiv.org/abs/2303.08774\"])\r\ncontent = client.exa_contents(\r\n    [\"https://example.com/page1\", \"https://example.com/page2\"],\r\n    text=True,\r\n    highlights=True,\r\n)\r\n\r\n# AI-generated answer from live web ($0.01/request)\r\nanswer = client.exa_answer(\"What is the current state of AI safety research?\")\r\n\r\n# Generic proxy for any Exa endpoint\r\nresult = client.exa(\"search\", {\"query\": \"transformer architecture\", \"numResults\": 5})\r\n```\r\n\r\nFor Solana payments use `from blockrun_llm import SolanaLLMClient` — same 
method\r\nnames, same call shape; the Solana gateway requires the backend to be configured\r\nwith `EXA_API_KEY`, so prefer Base unless you need SOL/SPL settlement.\r\n\r\n## Standalone Search\r\n\r\nSearch web, X/Twitter, and news without using a chat model:\r\n\r\n```python\r\nfrom blockrun_llm import LLMClient\r\n\r\nclient = LLMClient()\r\n\r\nresult = client.search(\"latest AI agent frameworks 2026\")\r\nprint(result.summary)\r\nfor cite in result.citations or []:\r\n    print(f\"  - {cite}\")\r\n\r\n# Filter by source type and date range\r\nresult = client.search(\r\n    \"BlockRun x402\",\r\n    sources=[\"web\", \"x\"],\r\n    from_date=\"2026-01-01\",\r\n    max_results=5,\r\n)\r\n```\r\n\r\n## Image Editing (img2img)\r\n\r\nEdit existing images with text prompts:\r\n\r\n```python\r\nfrom blockrun_llm import LLMClient, ImageClient\r\n\r\n# Via LLMClient\r\nclient = LLMClient()\r\nresult = client.image_edit(\r\n    prompt=\"Make the sky purple and add northern lights\",\r\n    image=\"data:image/png;base64,...\",  # base64 or URL\r\n    model=\"openai/gpt-image-1\",\r\n)\r\nprint(result.data[0].url)\r\n\r\n# Via ImageClient\r\nimg_client = ImageClient()\r\nresult = img_client.edit(\"Add a rainbow\", image=\"https://example.com/photo.jpg\")\r\n```\r\n\r\n## Usage Examples\r\n\r\n### Simple Chat\r\n\r\n```python\r\nfrom blockrun_llm import LLMClient\r\n\r\nclient = LLMClient()  # Uses BLOCKRUN_WALLET_KEY (never sent to server)\r\n\r\nresponse = client.chat(\"openai/gpt-5.2\", \"Explain quantum computing\")\r\nprint(response)\r\n\r\n# With system prompt\r\nresponse = client.chat(\r\n    \"anthropic/claude-sonnet-4.6\",\r\n    \"Write a haiku\",\r\n    system=\"You are a creative poet.\"\r\n)\r\n```\r\n\r\n### Real-time Search (Live Search)\r\n\r\n**Note:** Live Search can take 30-120+ seconds as it searches multiple sources. 
The SDK automatically uses a 5-minute timeout for search requests.\r\n\r\n```python\r\nfrom blockrun_llm import LLMClient\r\n\r\nclient = LLMClient()\r\n\r\n# Simple: Enable live search with search=True (default 10 sources, ~$0.26)\r\nresponse = client.chat(\r\n    \"openai/gpt-5.2\",\r\n    \"What are the latest posts from @blockrunai?\",\r\n    search=True\r\n)\r\nprint(response)\r\n\r\n# Custom: Limit sources to reduce cost (5 sources, ~$0.13)\r\nresponse = client.chat(\r\n    \"openai/gpt-5.2\",\r\n    \"What's trending on X?\",\r\n    search_parameters={\"mode\": \"on\", \"max_search_results\": 5}\r\n)\r\n\r\n# Custom timeout (if 5 min isn't enough)\r\nclient = LLMClient(search_timeout=600.0)  # 10 minutes\r\n```\r\n\r\n### Check Spending\r\n\r\n```python\r\nfrom blockrun_llm import LLMClient\r\n\r\nclient = LLMClient()\r\n\r\nresponse = client.chat(\"openai/gpt-5.2\", \"Explain quantum computing\")\r\nprint(response)\r\n\r\n# Check how much was spent\r\nspending = client.get_spending()\r\nprint(f\"Spent ${spending['total_usd']:.4f} across {spending['calls']} calls\")\r\n```\r\n\r\n### Full Chat Completion\r\n\r\n```python\r\nfrom blockrun_llm import LLMClient\r\n\r\nclient = LLMClient()  # Uses BLOCKRUN_WALLET_KEY (never sent to server)\r\n\r\nmessages = [\r\n    {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\r\n    {\"role\": \"user\", \"content\": \"How do I read a file in Python?\"}\r\n]\r\n\r\nresult = client.chat_completion(\"openai/gpt-5.2\", messages)\r\nprint(result.choices[0].message.content)\r\n```\r\n\r\n### Async Usage\r\n\r\n```python\r\nimport asyncio\r\nfrom blockrun_llm import AsyncLLMClient\r\n\r\nasync def main():\r\n    async with AsyncLLMClient() as client:\r\n        # Simple chat\r\n        response = await client.chat(\"openai/gpt-5.2\", \"Hello!\")\r\n        print(response)\r\n\r\n        # Multiple requests concurrently\r\n        tasks = [\r\n            client.chat(\"openai/gpt-5.2\", \"What is 2+2?\"),\r\n  
          client.chat(\"anthropic/claude-sonnet-4.6\", \"What is 3+3?\"),\r\n            client.chat(\"google/gemini-2.5-flash\", \"What is 4+4?\"),\r\n        ]\r\n        responses = await asyncio.gather(*tasks)\r\n        for r in responses:\r\n            print(r)\r\n\r\nasyncio.run(main())\r\n```\r\n\r\n### List Available Models\r\n\r\n```python\r\nfrom blockrun_llm import LLMClient\r\n\r\nclient = LLMClient()\r\nmodels = client.list_models()\r\n\r\nfor model in models:\r\n    print(f\"{model['id']}: ${model['inputPrice']}/M input, ${model['outputPrice']}/M output\")\r\n```\r\n\r\n## Testnet Usage\r\n\r\nFor development and testing without real USDC, use the testnet:\r\n\r\n```python\r\nfrom blockrun_llm import testnet_client\r\n\r\n# Create testnet client (uses Base Sepolia)\r\nclient = testnet_client()  # Uses BLOCKRUN_WALLET_KEY\r\n\r\n# Chat with testnet model\r\nresponse = client.chat(\"openai/gpt-oss-20b\", \"Hello!\")\r\nprint(response)\r\n\r\n# Check testnet USDC balance\r\nbalance = client.get_balance()\r\nprint(f\"Testnet USDC: ${balance:.4f}\")\r\n```\r\n\r\n### Testnet Setup\r\n\r\n1. Get testnet ETH from [Alchemy Base Sepolia Faucet](https://www.alchemy.com/faucets/base-sepolia)\r\n2. Get testnet USDC from [Circle USDC Faucet](https://faucet.circle.com/)\r\n3. Set your wallet key: `export BLOCKRUN_WALLET_KEY=0x...`\r\n\r\n### Available Testnet Models\r\n\r\n- `openai/gpt-oss-20b` - $0.001/request (flat price)\r\n- `openai/gpt-oss-120b` - $0.002/request (flat price)\r\n\r\n### Manual Testnet Configuration\r\n\r\n```python\r\nfrom blockrun_llm import LLMClient\r\n\r\n# Or configure manually\r\nclient = LLMClient(api_url=\"https://testnet.blockrun.ai/api\")\r\nresponse = client.chat(\"openai/gpt-oss-20b\", \"Hello!\")\r\n```\r\n\r\n## Billing \u0026 Cost Tracking\r\n\r\nEvery paid call appends one line to `~/.blockrun/cost_log.jsonl` capturing\r\ntimestamp, endpoint, cost, and (when available) `model`, `wallet`, `network`,\r\nand `client_kind`. 
The SDK ships a small reader / exporter on top so you can\r\naudit spending without leaving the Python ecosystem.\r\n\r\n### CLI\r\n\r\n```bash\r\n# Aggregated summary, default grouped by endpoint\r\npython -m blockrun_llm.billing summary\r\n\r\n# Group by model / month / wallet / network / client_kind / day\r\npython -m blockrun_llm.billing summary --group-by model\r\npython -m blockrun_llm.billing summary --group-by month --from 2026-04-01\r\n\r\n# Filter by wallet (when one machine drives multiple keys)\r\npython -m blockrun_llm.billing summary --wallet 0xCC8c... --network base-mainnet\r\n\r\n# Export per-call records\r\npython -m blockrun_llm.billing export csv  --from 2026-05-01 --output may.csv\r\npython -m blockrun_llm.billing export json --to 2026-05-09\r\n```\r\n\r\n### Python API\r\n\r\n```python\r\nfrom blockrun_llm import (\r\n    get_cost_log_summary,\r\n    export_cost_log_csv,\r\n    export_cost_log_json,\r\n)\r\n\r\nsummary = get_cost_log_summary(group_by=\"model\", from_date=\"2026-04-01\")\r\nprint(summary[\"total_usd\"], summary[\"calls\"])\r\nfor model, slot in summary[\"groups\"].items():\r\n    print(f\"  {model:40s}  {slot['calls']:\u003e5}  ${slot['cost_usd']:.4f}\")\r\n\r\n# Returns CSV / JSON text; pass output_path to also write to disk\r\ncsv_text  = export_cost_log_csv(\"bill.csv\", from_date=\"2026-05-01\")\r\njson_text = export_cost_log_json(from_date=\"2026-05-01\")\r\n```\r\n\r\n### Example output\r\n\r\nReal session — four cheap chat calls across providers, then queried by model:\r\n\r\n```\r\n$ python -m blockrun_llm.billing summary --from 2026-05-10 --group-by model\r\n================================================================\r\nBLOCKRUN — LOCAL COST LOG SUMMARY\r\n================================================================\r\n  log file       : /Users/me/.blockrun/cost_log.jsonl\r\n  from           : 2026-05-10\r\n  group_by       : model\r\n  total          : $0.0070 (9 calls)\r\n\r\n  KEY                             
CALLS        COST\r\n  ----------------------------  -------  ----------\r\n  deepseek/deepseek-chat              2     $0.0020\r\n  google/gemini-2.5-flash-lite        1     $0.0010\r\n  anthropic/claude-haiku-4.5          1     $0.0010\r\n  zai/glm-5-turbo                     1     $0.0010\r\n  unknown                             4     $0.0020\r\n```\r\n\r\nThe four `unknown` calls are pre-existing entries from before this release;\r\nthey had only `{ts, endpoint, cost_usd}`, so the model column reads `unknown`.\r\nCalls made after upgrading carry the full metadata (wallet / network /\r\nclient_kind / model). CSV export shows it directly:\r\n\r\n```\r\n$ python -m blockrun_llm.billing export csv --from 2026-05-10 | head -3\r\nts_iso,endpoint,model,wallet,network,client_kind,cost_usd\r\n2026-05-10T03:38:28.198937+00:00,/v1/chat/completions,deepseek/deepseek-chat,0xCC8c...5EF8,base-mainnet,LLMClient,0.001\r\n2026-05-10T03:38:31.192060+00:00,/v1/chat/completions,google/gemini-2.5-flash-lite,0xCC8c...5EF8,base-mainnet,LLMClient,0.001\r\n```\r\n\r\n### Scope\r\n\r\nThe cost log is per-machine. It records calls made by this Python SDK only;\r\ncalls from other clients (TS SDK, MCP, raw curl) are not included. For\r\norganization-wide billing, query the gateway's authoritative ledger.\r\n\r\n## Environment Variables\r\n\r\n| Variable | Description | Required |\r\n|----------|-------------|----------|\r\n| `BLOCKRUN_WALLET_KEY` | Your Base chain wallet private key | Yes (or pass to constructor) |\r\n| `BLOCKRUN_API_URL` | API endpoint | No (default: https://blockrun.ai/api) |\r\n\r\n## Setting Up Your Wallet\r\n\r\n1. Create a wallet on the Base network (Coinbase Wallet, MetaMask, etc.)\r\n2. Get some ETH on Base for gas (small amount, ~$1)\r\n3. Get USDC on Base for API payments\r\n4. 
Export your private key and set it as `BLOCKRUN_WALLET_KEY`\r\n\r\n```bash\r\n# .env file\r\nBLOCKRUN_WALLET_KEY=0x...your_private_key_here\r\n```\r\n\r\n## Error Handling\r\n\r\n```python\r\nfrom blockrun_llm import LLMClient, APIError, PaymentError\r\n\r\nclient = LLMClient()\r\n\r\ntry:\r\n    response = client.chat(\"openai/gpt-5.2\", \"Hello!\")\r\nexcept PaymentError as e:\r\n    print(f\"Payment failed: {e}\")\r\n    # Check your USDC balance\r\nexcept APIError as e:\r\n    print(f\"API error ({e.status_code}): {e}\")\r\n```\r\n\r\n## Testing\r\n\r\n### Running Unit Tests\r\n\r\nUnit tests do not require API access or funded wallets:\r\n\r\n```bash\r\npytest tests/unit                    # Run unit tests only\r\npytest tests/unit --cov              # Run with coverage report\r\npytest tests/unit -v                 # Verbose output\r\n```\r\n\r\n### Running Integration Tests\r\n\r\nIntegration tests call the production API and require:\r\n- A funded Base wallet with USDC ($1+ recommended)\r\n- `BLOCKRUN_WALLET_KEY` environment variable set\r\n- Estimated cost: ~$0.05 per test run\r\n\r\n```bash\r\nexport BLOCKRUN_WALLET_KEY=0x...\r\npytest tests/integration             # Run integration tests only\r\npytest                               # Run all tests\r\n```\r\n\r\nIntegration tests are automatically skipped if `BLOCKRUN_WALLET_KEY` is not set.\r\n\r\n## Security\r\n\r\n### Private Key Safety\r\n\r\n- **Private key stays local**: Your key is only used for signing on your machine\r\n- **No custody**: BlockRun never holds your funds\r\n- **Verify transactions**: All payments are on-chain and verifiable\r\n\r\n### Best Practices\r\n\r\n**Private Key Management:**\r\n- Use environment variables, never hard-code keys\r\n- Use dedicated wallets for API payments (separate from main holdings)\r\n- Set spending limits by only funding payment wallets with small amounts\r\n- Never commit `.env` files to version control\r\n- Rotate keys periodically\r\n\r\n**Input 
Validation:**\r\nThe SDK validates all inputs before API requests:\r\n- Private keys (format, length, valid hex)\r\n- API URLs (HTTPS required for production, HTTP allowed for localhost)\r\n- Model names and parameters (ranges for max\\_tokens, temperature, top\\_p)\r\n\r\n**Error Sanitization:**\r\nAPI errors are automatically sanitized to prevent sensitive information leaks.\r\n\r\n**Monitoring:**\r\n```python\r\naddress = client.get_wallet_address()\r\nprint(f\"View transactions: https://basescan.org/address/{address}\")\r\n```\r\n\r\n**Keep Updated:**\r\n```bash\r\npip install --upgrade blockrun-llm  # Get security patches\r\n```\r\n\r\n## Agent Wallet Setup\r\n\r\nOne-line setup for agent runtimes (Claude Code skills, MCP servers, etc.):\r\n\r\n```python\r\nfrom blockrun_llm import setup_agent_wallet\r\n\r\n# Auto-creates wallet if none exists, returns ready client\r\nclient = setup_agent_wallet()\r\nresponse = client.chat(\"openai/gpt-5.4\", \"Hello!\")\r\n```\r\n\r\nFor Solana:\r\n\r\n```python\r\nfrom blockrun_llm import setup_agent_solana_wallet\r\n\r\nclient = setup_agent_solana_wallet()\r\nresponse = client.chat(\"anthropic/claude-sonnet-4.6\", \"Hello!\")\r\n```\r\n\r\nCheck wallet status:\r\n\r\n```python\r\nfrom blockrun_llm import status\r\n\r\nstatus()\r\n# Wallet: 0xCC8c...5EF8\r\n# Balance: $5.30 USDC\r\n```\r\n\r\n## Wallet Scanning\r\n\r\nThe SDK auto-detects wallets from any provider on your system:\r\n\r\n```python\r\nfrom blockrun_llm.wallet import scan_wallets\r\nfrom blockrun_llm.solana_wallet import scan_solana_wallets\r\n\r\n# Scans ~/.\u003cdir\u003e/wallet.json for Base wallets\r\nbase_wallets = scan_wallets()\r\n\r\n# Scans ~/.\u003cdir\u003e/solana-wallet.json\r\nsol_wallets = scan_solana_wallets()\r\n```\r\n\r\n`get_or_create_wallet()` checks scanned wallets first, so if you already have a wallet from another BlockRun tool, it will be reused automatically.\r\n\r\n## Response Caching\r\n\r\nThe SDK caches responses to avoid duplicate 
payments:\r\n\r\n```python\r\nfrom blockrun_llm import clear_cache\r\n\r\n# Automatic TTLs by endpoint:\r\n# - Prediction Markets: 30 minutes\r\n# - Search: 15 minutes\r\n# - Models: 24 hours\r\n# - Chat/Image: no cache (every call is unique)\r\n\r\n# Manual cache management\r\nremoved = clear_cache()  # Remove all cached responses\r\n```\r\n\r\nPer-session spending is also available on any client (see also\r\n[Billing \u0026 Cost Tracking](#billing--cost-tracking) for the full surface):\r\n\r\n```python\r\nfrom blockrun_llm import LLMClient\r\n\r\nclient = LLMClient()\r\nresponse = client.chat(\"openai/gpt-5.2\", \"Hello!\")\r\n\r\nspending = client.get_spending()\r\nprint(f\"Session: ${spending['total_usd']:.4f} across {spending['calls']} calls\")\r\n```\r\n\r\n## Anthropic SDK Compatibility\r\n\r\nUse the official Anthropic Python SDK with BlockRun's API gateway and automatic x402 payments:\r\n\r\n```bash\r\npip install blockrun-llm[anthropic]\r\n```\r\n\r\n```python\r\nfrom blockrun_llm import AnthropicClient\r\n\r\nclient = AnthropicClient()  # Auto-detects wallet, auto-pays\r\n\r\nresponse = client.messages.create(\r\n    model=\"claude-sonnet-4-6\",\r\n    max_tokens=1024,\r\n    messages=[{\"role\": \"user\", \"content\": \"Hello!\"}]\r\n)\r\nprint(response.content[0].text)\r\n\r\n# Works with any BlockRun model in Anthropic format\r\nresponse = client.messages.create(\r\n    model=\"openai/gpt-5.4\",\r\n    max_tokens=1024,\r\n    messages=[{\"role\": \"user\", \"content\": \"Hello from GPT!\"}]\r\n)\r\n```\r\n\r\nThe `AnthropicClient` wraps `anthropic.Anthropic` with a custom httpx transport that handles x402 payment signing transparently. 
Your private key never leaves your machine.\r\n\r\n## Links\r\n\r\n- [Website](https://blockrun.ai)\r\n- [Documentation](https://github.com/BlockRunAI/awesome-blockrun/tree/main/docs)\r\n- [GitHub](https://github.com/blockrunai/blockrun-llm)\r\n- [Telegram](https://t.me/+mroQv4-4hGgzOGUx)\r\n\r\n## Frequently Asked Questions\r\n\r\n### What is blockrun-llm?\r\nblockrun-llm is a Python SDK that provides pay-per-request access to 43+ large language models from OpenAI, Anthropic, Google, DeepSeek, NVIDIA, ZAI, and more. It uses the x402 protocol for automatic USDC micropayments — no API keys, no subscriptions, no vendor lock-in.\r\n\r\n### How does payment work?\r\nWhen you make an API call, the SDK automatically handles x402 payment. It signs a USDC transaction locally using your wallet private key (which never leaves your machine), and includes the payment proof in the request header. Settlement is non-custodial and instant on Base or Solana.\r\n\r\n### What is smart routing / ClawRouter?\r\nClawRouter is a built-in smart routing engine that analyzes your request across 14 dimensions and automatically picks the cheapest model capable of handling it. Routing happens locally in under 1ms. It can save up to 92% on LLM costs compared to using premium models for every request.\r\n\r\n### How much does it cost?\r\nPay only for what you use. Prices start at **FREE** (11 NVIDIA-hosted models). Paid models start at $0.10/M tokens. There are no minimums, subscriptions, or monthly fees. $5 in USDC gets you thousands of requests.\r\n\r\n### Can I use it with Solana?\r\nYes. Install with `pip install blockrun-llm[solana]` and use `SolanaLLMClient` instead of `LLMClient`. 
Same API, different payment chain.\r\n\r\n## License\r\n\r\nMIT\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FBlockRunAI%2Fblockrun-llm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FBlockRunAI%2Fblockrun-llm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FBlockRunAI%2Fblockrun-llm/lists"}