{"id":50445491,"url":"https://github.com/maureranton/artificialanalysis-ai-parser","last_synced_at":"2026-05-31T21:02:57.145Z","repository":{"id":357006443,"uuid":"1234949934","full_name":"MaurerAnton/artificialanalysis-ai-parser","owner":"MaurerAnton","description":"Parser for artificialanalysis.ai — extract AI model pricing, benchmarks \u0026 speed without an API key. Python (CLI) + JavaScript (browser \u0026 Node.js). Rewrites the broken demianarc/artificialanalysisscrapper.","archived":false,"fork":false,"pushed_at":"2026-05-10T21:44:08.000Z","size":25,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-05-10T23:24:04.954Z","etag":null,"topics":["ai-models","artificial-analysis","artificialanalysis","benchmarks","data-extraction","llm","model-data","parser","pricing","python","rsc","scraper"],"latest_commit_sha":null,"homepage":null,"language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/MaurerAnton.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-10T21:08:07.000Z","updated_at":"2026-05-10T21:44:12.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/MaurerAnton/artificialanalysis-ai-parser","commit_stats":null,"previous_names":["maureranton/artificialanalysis-ai-parser"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/MaurerAnton/artificialanalysis-ai-parser
","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MaurerAnton%2Fartificialanalysis-ai-parser","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MaurerAnton%2Fartificialanalysis-ai-parser/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MaurerAnton%2Fartificialanalysis-ai-parser/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MaurerAnton%2Fartificialanalysis-ai-parser/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/MaurerAnton","download_url":"https://codeload.github.com/MaurerAnton/artificialanalysis-ai-parser/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MaurerAnton%2Fartificialanalysis-ai-parser/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33748607,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-31T02:00:06.040Z","response_time":95,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-models","artificial-analysis","artificialanalysis","benchmarks","data-extraction","llm","model-data","parser","pricing","python","rsc","scraper"],"created_at":"2026-05-31T21:02:55.048Z","updated_at":"2026-05-31T21:02:57.130Z","avatar_url":"https://github.com/MaurerAnton.png","language":"JavaScript","funding_links":[],"categories":
[],"sub_categories":[],"readme":"# artificialanalysis-ai-parser\n\nParser for [artificialanalysis.ai](https://artificialanalysis.ai) — extracts AI model data (pricing, benchmarks, speed) **without an API key**.\n\n## Why?\n\nThe idea started from [demianarc/artificialanalysisscrapper](https://github.com/demianarc/artificialanalysisscrapper) — a Python scraper that fetched model data from the Artificial Analysis Next.js RSC endpoint. It was a clever approach: the site's React Server Components stream exposed the full dataset (`hostsModels`) in a single 10 MB response, no authentication needed.\n\nHowever, after the site's redesign (\"A new look for Artificial Analysis\"), the old line-based parser broke completely. The RSC format changed from simple `key:value` pairs to a chunk-referenced wire format with `I[...]` inline references and `$c:props:...` circular links.\n\nThis project:\n\n- **Rewrites the extraction** using regex + bracket-counting instead of line-based parsing\n- **Deduplicates** 867 host-model pairs down to 326 unique models (keeping first occurrence with full non-circular data)\n- **Cleans** the output to only essential fields (pricing, IQ, speed, context window)\n- **Outputs** `models.json` — 314 models with full input/output/cache pricing, ready for downstream use\n\nThe result is a self-contained Python script with zero dependencies beyond the standard library.\n\n## Quick start\n\n### C++\n\n```bash\ng++ -std=c++17 -O2 artificialanalysis.ai-parser.cpp -lcurl -o aaparser\n./aaparser --minimal --pretty          # fetch + save to models.json\n```\n\nRequires: `libcurl`, `nlohmann/json` (header-only, auto-downloaded if missing).\n\n### Python\n\n```bash\npython3 artificialanalysis.ai-parser.py --minimal --pretty\n```\n\n### JavaScript (Node.js)\n\n```js\n// Node.js — works without CORS restrictions\nconst { AAParser } = require('./artificialanalysis.ai-parser.js');\n\n// CommonJS has no top-level await, so wrap the call in an async function\n(async () =\u003e {\n  const models = await AAParser.fetch({ minimal: true });\n  console.log(models[0].name,
models[0].price_1m_input_tokens);\n})();\n```\n\n\u003e **Note:** The JS parser does **not** work directly in the browser. The RSC endpoint requires the custom `rsc` header which triggers a CORS preflight, and the server does not return `Access-Control-Allow-Headers`. Use it in Node.js or through a CORS proxy.\n\n### Output\n\n```\nDownloading RSC data from https://artificialanalysis.ai/leaderboards/providers?_rsc=hgvan ...\nDownloaded 10,481,155 bytes\nExtracted 867 raw entries (host-model pairs)\nDeduplicated to 326 unique models\nModels with pricing: 314\n\nSaved 314 models to models.json (134,549 bytes)\n\nTop model: GPT-5.5 (xhigh) (OpenAI)\n  IQ: 60.24 | Coding: 59.12 | Math: None\n  Price: $5.00 in / $30.00 out\n  Speed: 57 tok/s\n```\n\n## models.json structure\n\nEach entry:\n\n| Field | Description |\n|---|---|\n| `name` | Model name |\n| `creator` | AI lab / company |\n| `slug` | URL-friendly identifier |\n| `intelligence_index` | AA Intelligence Index score |\n| `coding_index` | AA Coding Index score |\n| `math_index` | AA Math Index score |\n| `price_1m_input_tokens` | Input price per 1M tokens (USD) |\n| `price_1m_output_tokens` | Output price per 1M tokens (USD) |\n| `price_1m_cache_hit` | Cache hit price per 1M tokens (USD) |\n| `blended_price_3_1` | Blended price at 3:1 input:output ratio |\n| `context_window_tokens` | Context window size |\n| `output_tokens_per_second` | Generation speed |\n| `time_to_first_token_ms` | Latency to first token |\n| `reasoning` | Whether it's a reasoning model |\n| `open_weights` | Whether weights are open |\n\n## Data coverage\n\n| Metric | Coverage |\n|---|---|\n| Pricing (input/output) | 100% (314/314) |\n| Intelligence Index | 87% |\n| Coding Index | 90% |\n| Math Index | 60% |\n| Speed (tok/s) | 100% |\n| Cache pricing | 33% |\n\n## How it works\n\n```text\nartificialanalysis.ai\n  └─ /leaderboards/providers?_rsc=hgvan\n       └─ Next.js RSC stream (10 MB, text/x-component)\n            └─ Contains \"hostsModels\":[{...}] 
with ~867 entries\n                 └─ Extract JSON via bracket-counting\n                      └─ Deduplicate by model_id\n                           └─ Clean \u0026 output models.json\n```\n\nThe RSC endpoint requires specific headers (`rsc: 1`, `next-router-state-tree`, `next-url`) but no cookies or authentication.\n\n## Limitations\n\n- **No API key = fragile.** The RSC endpoint is an internal Next.js mechanism. If the site changes its chunk format again, the bracket-counting may need updating.\n- **Circular references.** From the 2nd entry onward, some nested model fields use `$c:props:...` reference strings instead of actual values. We keep only the *first* occurrence per `model_id` (which has full data).\n- **Official API is preferred** for production use. This parser is a workaround for when you don't have (or don't want) an API key. See [artificialanalysis.ai/documentation](https://artificialanalysis.ai/documentation) for the free API tier (1,000 req/day).\n\n## Companion: interactive cost calculator\n\n`dashboard.html` — a dark-themed token cost dashboard that lets you see how much you'd spend using different AI model providers.\n\n`compact-dashboard.html` — a lightweight version: no charts, 4 top models compared side by side. Each model card shows estimated total cost for your token data at a glance.\n\n**Try it live:**  \n[Full dashboard](https://maureranton.github.io/dashboard/dashboard.html) — charts, model selector, date range filter  \n[Compact dashboard](https://maureranton.github.io/dashboard/compact-dashboard.html) — 4 models, instant cost comparison\n\n**To run locally:**\n\n1. Open `dashboard.html` or `compact-dashboard.html` in a browser (or serve via any HTTP server)\n2. They load `paths.json` → `data.json` + `models.json`\n3. Select a model — prices auto-fill from Artificial Analysis data\n4. 
Tweak token counts — costs recalculate instantly\n\nExample files included:\n- `example-paths.json` — points to `example-data.json` and `models.json`\n- `example-data.json` — 7 days of synthetic token data for demo\n\nTo use your own data, rename `example-paths.json` → `paths.json`, point it at your data file, and update your `data.json` with real token counts.\n\n## License\n\nGPL-3.0 — Copyright (C) 2026 Anton Maurer\n\n## Credits\n\n- Original scraping concept by [demianarc/artificialanalysisscrapper](https://github.com/demianarc/artificialanalysisscrapper)\n- Model data source: [artificialanalysis.ai](https://artificialanalysis.ai)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmaureranton%2Fartificialanalysis-ai-parser","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmaureranton%2Fartificialanalysis-ai-parser","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmaureranton%2Fartificialanalysis-ai-parser/lists"}