{"id":50610313,"url":"https://github.com/blackwell-systems/gcf","last_synced_at":"2026-06-06T03:02:29.248Z","repository":{"id":362364814,"uuid":"1258711561","full_name":"blackwell-systems/gcf","owner":"blackwell-systems","description":"GCF: token-optimized wire format for LLM tool responses. 84% fewer tokens than JSON, 34% fewer than TOON, 100% comprehension accuracy at scale.","archived":false,"fork":false,"pushed_at":"2026-06-03T23:05:04.000Z","size":6,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-03T23:06:03.382Z","etag":null,"topics":["ai-agents","code-intelligence","context-window","format","gcf","graph","llm","mcp","model-context-protocol","specification","token-optimization","wire-format"],"latest_commit_sha":null,"homepage":null,"language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/blackwell-systems.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-06-03T21:03:08.000Z","updated_at":"2026-06-03T23:05:07.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/blackwell-systems/gcf","commit_stats":null,"previous_names":["blackwell-systems/gcf"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/blackwell-systems/gcf","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blackwell-systems%2Fgcf","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blackwell-systems%2Fgcf/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blackwell-systems%2Fgcf/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blackwell-systems%2Fgcf/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/blackwell-systems","download_url":"https://codeload.github.com/blackwell-systems/gcf/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blackwell-systems%2Fgcf/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33967641,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-06T02:00:07.033Z","response_time":107,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agents","code-intelligence","context-window","format","gcf","graph","llm","mcp","model-context-protocol","specification","token-optimization","wire-format"],"created_at":"2026-06-06T03:02:27.559Z","updated_at":"2026-06-06T03:02:29.234Z","avatar_url":"https://github.com/blackwell-systems.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://github.com/blackwell-systems\"\u003e\u003cimg src=\"https://raw.githubusercontent.com/blackwell-systems/blackwell-docs-theme/main/badge-trademark.svg\" alt=\"Blackwell Systems\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://gcformat.com/playground.html\"\u003e\u003cimg src=\"https://img.shields.io/badge/playground-live-blue.svg\" alt=\"Playground\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/blackwell-systems/gcf\"\u003e\u003cimg src=\"https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/blackwell-systems/gcf/main/assets/downloads-badge.json\" alt=\"Downloads\"\u003e\u003c/a\u003e\n  \u003ca href=\"LICENSE\"\u003e\u003cimg src=\"https://img.shields.io/badge/license-MIT-blue.svg\" alt=\"License\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n# GCF: Graph Compact Format\n\n**The most token-efficient wire format for LLMs. Bidirectional: cheaper to read and cheaper to write.**\n\nTwo encoding profiles, one grammar:\n\n- **Graph profile**: code graph payloads (symbols, edges, distance groups). 79% fewer tokens than JSON.\n- **Tabular profile**: any structured data (arrays, nested objects, mixed types). 34% fewer tokens than TOON.\n\n```\nTool  ───▶  encode()  ───▶  GCF  ───▶  LLM  ───▶  GCF  ───▶  Agent/Tool\n```\n\n### vs JSON: 79% fewer tokens, JSON can't even count at scale\n\n```\nTokens (500 symbols):\n\n  JSON   ████████████████████████████████████████████████████  53,341\n  TOON   ████████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░  16,378\n  GCF    ███████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░  11,090  ◀ winner\n```\n\n### vs TOON: 34% fewer tokens on their own benchmark\n\n```\nToken efficiency (TOON's datasets, TOON's tokenizer):\n\n  Mixed-structure data:\n  TOON   ████████████████████████████████████████████████████  227,896\n  GCF    ██████████████████████████████████░░░░░░░░░░░░░░░░░  170,367  ◀ 34% smaller\n\n  Semi-uniform data (most common real-world pattern):\n  TOON   ████████████████████████████████████████████████████  154,032\n  GCF    ████████████████████████████████████░░░░░░░░░░░░░░░  108,158  ◀ 42% smaller\n\n  Flat tabular data:\n  TOON   ████████████████████████████████████████████████████   67,837\n  GCF    ██████████████████████████████████████████████████░░   66,026  ◀ 3% smaller\n```\n\n### LLM comprehension: 100% accuracy at the lowest token cost\n\n```\nAccuracy at 500 symbols (13 structured extraction questions):\n\n  GCF    ████████████████████████████████████████████████████  100%   ✓ (13/13)\n  TOON   ████████████████████████████████████████████████░░░░  92.3%  (12/13)\n  JSON   ██████████████████████████████████████░░░░░░░░░░░░░░  76.9%  ✗ (10/13)\n```\n\nGCF beats TOON on accuracy AND uses 32% fewer tokens. JSON fails on counting tasks because field-name repetition at scale overwhelms the model's attention.\n\n---\n\n### Try it\n\n```bash\npip install gcf-python                    # Python\nnpm install @blackwell-systems/gcf        # TypeScript\ngo get github.com/blackwell-systems/gcf-go  # Go\ncargo add gcf                             # Rust\n```\n\n### Encode any structured data (tabular profile)\n\n```python\nfrom gcf import encode_generic\n\noutput = encode_generic({\n    \"employees\": [\n        {\"id\": 1, \"name\": \"Alice\", \"department\": \"Engineering\", \"salary\": 95000},\n        {\"id\": 2, \"name\": \"Bob\", \"department\": \"Sales\", \"salary\": 72000},\n        {\"id\": 3, \"name\": \"Carol\", \"department\": \"Marketing\", \"salary\": 85000},\n    ],\n})\n```\n\n```\n## employees [3]{id,name,department,salary}\n1|Alice|Engineering|95000\n2|Bob|Sales|72000\n3|Carol|Marketing|85000\n```\n\nOne header declares field names. Rows are positional values only. No field names repeated per record. Works on any JSON: arrays, nested objects, primitives.\n\n### Graph profile (code intelligence, MCP tools)\n\nFor data with nodes, edges, and distance groups:\n\n```python\nfrom gcf import encode, Payload, Symbol, Edge\n\noutput = encode(Payload(\n    tool=\"context_for_task\", token_budget=5000, tokens_used=1847,\n    symbols=[\n        Symbol(qualified_name=\"pkg.Auth\", kind=\"function\", score=0.78, provenance=\"lsp\", distance=0),\n        Symbol(qualified_name=\"pkg.Server\", kind=\"function\", score=0.54, provenance=\"lsp\", distance=1),\n    ],\n    edges=[Edge(source=\"pkg.Server\", target=\"pkg.Auth\", edge_type=\"calls\")],\n))\n```\n\n```\nGCF tool=context_for_task budget=5000 tokens=1847 symbols=2 edges=1\n## targets\n@0 fn pkg.Auth 0.78 lsp\n## related\n@1 fn pkg.Server 0.54 lsp\n## edges [1]\n@0\u003c@1 calls\n```\n\nLocal IDs (`@0`, `@1`) replace full names in edges. 233 tokens instead of 965 for JSON.\n\n**[Try it live in the playground](https://gcformat.com/playground.html)** with real-time three-way comparison (JSON vs TOON vs GCF).\n\n---\n\n### At a glance\n\n| | GCF | TOON | JSON |\n|---|---|---|---|\n| **Input tokens (500 symbols)** | 11,090 | 16,378 | 53,341 |\n| **Output tokens (100 symbols)** | 5,619 | 11,650 | 22,180 |\n| **Comprehension accuracy** | 100% (13/13) | 92.3% (12/13) | 76.9% (10/13) |\n| **Generation validity** | 5/5 | 5/5 | N/A |\n| **Session dedup (5th call)** | 92.7% savings | N/A | N/A |\n| **Delta encoding** | 81.2% savings | N/A | N/A |\n| **Semi-uniform data** | native | falls back | verbose |\n| **Best for** | graph data, MCP tools, multi-turn, agent output | flat tables | nothing at scale |\n\n---\n\n## How it works\n\n### Graph profile\n\nExploits three properties of graph-structured data:\n\n1. **Positional fields.** One header declares field names. Rows are values only.\n2. **Local IDs.** `@0`, `@1`. Edges reference by ID, not by repeating full identifiers.\n3. **Hierarchical grouping.** Section headers (`## targets`) replace per-record metadata.\n\n### Tabular profile\n\nExploits two properties of structured data:\n\n1. **Tabular headers.** `## name [count]{field1,field2}` declares field names once. Rows are pipe-separated values.\n2. **Section headers.** `## key` for nested objects. `key=value` for primitives.\n\nBoth profiles share the same grammar: `##` headers, `@` IDs, positional fields. The savings are structural and grow with payload size.\n\n## Example (graph profile)\n\n**JSON (965 tokens):**\n```json\n{\n  \"tool\": \"context_for_task\",\n  \"tokens_used\": 1847,\n  \"token_budget\": 5000,\n  \"symbols\": [\n    { \"qualified_name\": \"github.com/org/repo/pkg.AuthMiddleware\", \"kind\": \"function\", \"score\": 0.78, \"provenance\": \"lsp_resolved\", \"distance\": 0 },\n    { \"qualified_name\": \"github.com/org/repo/pkg.NewServer\", \"kind\": \"function\", \"score\": 0.54, \"provenance\": \"lsp_resolved\", \"distance\": 1 }\n  ],\n  \"edges\": [\n    { \"source\": \"github.com/org/repo/pkg.NewServer\", \"target\": \"github.com/org/repo/pkg.AuthMiddleware\", \"edge_type\": \"calls\" }\n  ]\n}\n```\n\n**GCF (233 tokens):**\n```\nGCF tool=context_for_task budget=5000 tokens=1847 symbols=2 edges=1\n## targets\n@0 fn github.com/org/repo/pkg.AuthMiddleware 0.78 lsp_resolved\n## related\n@1 fn github.com/org/repo/pkg.NewServer 0.54 lsp_resolved\n## edges [1]\n@0\u003c@1 calls\n```\n\nSame information. 75.9% fewer tokens.\n\n## It gets cheaper over time\n\n**Session deduplication:** Symbols sent in prior responses become bare references. By the 5th tool call: 92.7% savings vs JSON.\n\n**Delta encoding:** When the context changes slightly between queries, send only the diff. 81.2% additional savings on re-queries.\n\nNo other format has these. They're possible because GCF was designed for multi-turn LLM tool interactions, not generic data serialization.\n\n## Benchmarks\n\n### Comprehension accuracy (500 symbols, 13 extraction questions)\n\n| Format | Accuracy | Tokens | vs JSON |\n|--------|----------|--------|---------|\n| **GCF** | **100%** (13/13) | **11,090** | **79% fewer** |\n| TOON | 92.3% (12/13) | 16,378 | 69% fewer |\n| JSON | 76.9% (10/13) | 53,341 | baseline |\n\nEval: [gcf-go/eval](https://github.com/blackwell-systems/gcf-go/tree/main/eval)\n\n### Token efficiency ([TOON's own benchmark](https://github.com/blackwell-systems/toon/tree/gcf-comparison), their datasets, their tokenizer)\n\n| Track | GCF | TOON | Result |\n|-------|-----|------|--------|\n| Mixed-structure (nested, semi-uniform) | 170,367 | 227,896 | **GCF 34% smaller** |\n| Flat-only (tabular) | 66,026 | 67,837 | **GCF 3% smaller** |\n| Semi-uniform event logs | 108,158 | 154,032 | **GCF 42% smaller** |\n\nFork with reproducible results: [blackwell-systems/toon@gcf-comparison](https://github.com/blackwell-systems/toon/tree/gcf-comparison)\n\n## Specification\n\nFull grammar, encoding rules, session statefulness, delta encoding, and tabular profile: [SPEC.md](SPEC.md)\n\n## Implementations\n\n| Language | Package | Repository |\n|----------|---------|-----------|\n| Go | `go get github.com/blackwell-systems/gcf-go` | [gcf-go](https://github.com/blackwell-systems/gcf-go) |\n| TypeScript | `npm install @blackwell-systems/gcf` | [gcf-typescript](https://github.com/blackwell-systems/gcf-typescript) |\n| Python | `pip install gcf-python` | [gcf-python](https://github.com/blackwell-systems/gcf-python) |\n| Rust | `cargo add gcf` | [gcf-rust](https://github.com/blackwell-systems/gcf-rust) |\n| Swift | Swift Package Manager | [gcf-swift](https://github.com/blackwell-systems/gcf-swift) |\n| Kotlin | JitPack (`com.github.blackwell-systems:gcf-kotlin`) | [gcf-kotlin](https://github.com/blackwell-systems/gcf-kotlin) |\n| MCP Proxy | `pip install gcf-proxy` | [gcf-proxy](https://github.com/blackwell-systems/gcf-proxy) |\n\nZero runtime dependencies. MIT licensed. Spec is stable. The proxy is a drop-in wrapper for any existing MCP server (zero code changes).\n\nAll implementations support both graph profile (`encode`/`Encode`) and tabular profile (`encode_generic`/`encodeGeneric`/`EncodeGeneric`).\n\n## Documentation\n\nFull guides, API reference, benchmarks, and integration patterns: **[gcformat.com](https://gcformat.com/)**\n\n- [Getting Started](https://gcformat.com/guide/getting-started.html)\n- [Format Overview](https://gcformat.com/guide/format-overview.html)\n- [Session Deduplication](https://gcformat.com/guide/sessions.html)\n- [Delta Encoding](https://gcformat.com/guide/delta.html)\n- [MCP Integration](https://gcformat.com/guide/mcp.html)\n- [Benchmarks](https://gcformat.com/guide/benchmarks.html)\n- [GCF vs TOON](https://gcformat.com/guide/vs-toon.html)\n- [MCP Proxy Guide](https://gcformat.com/guide/proxy.html)\n- [Playground](https://gcformat.com/playground.html)\n- [Syntax Cheatsheet](https://gcformat.com/reference/cheatsheet.html)\n- [Token Savings Proof](https://gcformat.com/reference/token-savings-proof.html)\n\n## Use cases\n\n- **MCP tool responses.** Any [MCP](https://modelcontextprotocol.io/) server returning structured data. GCF delivers more context per token budget with better comprehension accuracy than JSON.\n- **Agent-to-agent communication.** Agents passing context in multi-agent workflows. 75% fewer tokens per handoff.\n- **LLM structured output.** LLMs produce valid GCF with a 3-line primer. 52% fewer output tokens than TOON.\n- **Code intelligence.** Graph profile with local IDs, edges, and distance grouping for symbols, call hierarchies, and dependency graphs.\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fblackwell-systems%2Fgcf","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fblackwell-systems%2Fgcf","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fblackwell-systems%2Fgcf/lists"}