{"id":49487170,"url":"https://github.com/0bserver07/bourbaki","last_synced_at":"2026-05-14T04:01:30.066Z","repository":{"id":337541708,"uuid":"1153241866","full_name":"0bserver07/bourbaki","owner":"0bserver07","description":"An autonomous agent for mathematical reasoning and proof","archived":false,"fork":false,"pushed_at":"2026-05-12T02:50:57.000Z","size":727,"stargazers_count":3,"open_issues_count":8,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-05-12T04:09:24.267Z","etag":null,"topics":["ai-agent","arxiv","autonomous-agent","computer-algebra","fastapi","lean4","math-research","mathematics","oeis","proof-assistant","pydantic-ai","sympy","theorem-proving"],"latest_commit_sha":null,"homepage":"https://yad.codes/posts/building-bourbaki/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/0bserver07.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-02-09T04:26:34.000Z","updated_at":"2026-05-12T02:51:02.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/0bserver07/bourbaki","commit_stats":null,"previous_names":["0bserver07/bourbaki"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/0bserver07/bourbaki","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/0bserver07%2Fbourbaki","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/0bserver07%2Fbourbaki/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/0bserver07%2Fbourbaki/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/0bserver07%2Fbourbaki/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/0bserver07","download_url":"https://codeload.github.com/0bserver07/bourbaki/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/0bserver07%2Fbourbaki/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33009919,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-13T13:14:54.681Z","status":"online","status_checked_at":"2026-05-14T02:00:06.663Z","response_time":57,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agent","arxiv","autonomous-agent","computer-algebra","fastapi","lean4","math-research","mathematics","oeis","proof-assistant","pydantic-ai","sympy","theorem-proving"],"created_at":"2026-05-01T02:11:23.848Z","updated_at":"2026-05-14T04:01:30.058Z","avatar_url":"https://github.com/0bserver07.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"assets/banner.svg\" alt=\"bourbaki - An autonomous agent for mathematical reasoning and proof.\" width=\"100%\"\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cstrong\u003eAn autonomous agent for mathematical reasoning and proof.\u003c/strong\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"#how-it-works\"\u003eHow It Works\u003c/a\u003e \u0026middot;\n  \u003ca href=\"#quick-start\"\u003eQuick Start\u003c/a\u003e \u0026middot;\n  \u003ca href=\"#tools\"\u003eTools\u003c/a\u003e \u0026middot;\n  \u003ca href=\"#skills\"\u003eSkills\u003c/a\u003e \u0026middot;\n  \u003ca href=\"#autonomous-mode\"\u003eAutonomous Mode\u003c/a\u003e \u0026middot;\n  \u003ca href=\"#commands\"\u003eCommands\u003c/a\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/Python-3.11+-3776AB?logo=python\u0026logoColor=white\" alt=\"Python 3.11+\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/Bun-1.0+-F9F1E1?logo=bun\u0026logoColor=black\" alt=\"Bun 1.0+\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/Lean_4-Mathlib-4B32C3\" alt=\"Lean 4\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/Pydantic_AI-Agent-E92063?logo=pydantic\u0026logoColor=white\" alt=\"Pydantic AI\"\u003e\n  \u003cimg src=\"https://img.shields.io/github/license/0bserver07/bourbaki\" alt=\"License\"\u003e\n\u003c/p\u003e\n\n---\n\n[Claude Code](https://docs.anthropic.com/en/docs/claude-code/overview) gives an LLM a shell and dev tools so it can write and run code. Bourbaki does the same thing for math: it gives an LLM a computer algebra system (SymPy), a proof assistant (Lean 4), and research APIs (OEIS, arXiv).\n\nYou ask a question in the TUI, the agent computes, verifies, looks things up, and streams the answer back. If it writes a proof, it can formalize it. If it makes a claim, it can check it.\n\n## How It Works\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"assets/agent-loop.svg\" alt=\"Bourbaki agent loop — TUI, Backend, and Tools\" width=\"100%\"\u003e\n\u003c/p\u003e\n\n1. You ask a question in the TUI\n2. The backend agent reasons about the approach\n3. It calls tools: SymPy for computation, Lean for verification, OEIS/arXiv for lookup\n4. Results feed back into the agent, which iterates if needed\n5. A scratchpad enforces limits and deduplicates repeated calls\n6. The final answer streams back to the TUI as it's generated\n\nThe TUI is a pure display client. All reasoning, tool calls, and state live in the Python backend.\n\n## Quick Start\n\n```bash\n# Clone the repo\ngit clone https://github.com/0bserver07/bourbaki.git\ncd bourbaki\n\n# Start the backend\ncd backend\npip install -e .\nuvicorn bourbaki.main:app --reload --port 8000\n\n# In another terminal — start the TUI\nbun install\nbun start\n```\n\nThe TUI connects to `localhost:8000` by default. Override with `BOURBAKI_BACKEND_URL`.\n\n### Prerequisites\n\n- [Python 3.11+](https://python.org)\n- [Bun](https://bun.sh) v1.0+\n- An LLM API key (set `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, or `GOOGLE_API_KEY`)\n- [Lean 4](https://lean-lang.org/) with Mathlib (optional, for formal verification)\n\n## Tools\n\n| Tool | What it does |\n|------|-------------|\n| **Symbolic Compute** | Native SymPy: simplification, integration, solving, 30+ operations |\n| **Lean Prover** | Lean 4 + Mathlib, machine-checked formal proofs |\n| **Sequence Lookup** | OEIS: identify and explore integer sequences |\n| **Paper Search** | arXiv: find relevant papers and results |\n| **Web Search** | Exa: search the web for mathematical references |\n\n## Skills\n\nSkills are proof techniques loaded from `SKILL.md` files. They tell the agent how to approach a specific type of proof step by step, instead of letting it improvise.\n\n21 built-in skills across five categories:\n\n- **Basic:** induction, strong induction, direct proof, contradiction, pigeonhole, counting\n- **Analysis:** epsilon-delta, convergence tests, sequence limits, inequality chains\n- **Geometry:** coordinate proof, synthetic construction, transformations\n- **Algebra:** group homomorphisms, ring ideals, polynomials\n- **Advanced:** extremal arguments, probabilistic method, conjecture exploration, formalization, proof explanation\n\nSkills can be added at three levels: built-in (`src/skills/`), user (`~/.bourbaki/skills/`), or project (`.bourbaki/skills/`).\n\n## Autonomous Mode\n\nLong-running proof search via a proposer-builder-reviewer loop driven by GLM-5.1 and a warm `LeanREPLSession`. One proposal per iteration, bounded by `max_iterations` (default 50, 8 for interactive). Every reported solve is gated by a `lean_prover` whole-file compile — no REPL-only claims.\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"assets/prover-loop.svg\" alt=\"Proposer-Builder-Reviewer loop\" width=\"100%\"\u003e\n\u003c/p\u003e\n\nDrive the loop from `backend/bourbaki/benchmarks/minif2f.py::attempt_proof_loop` or from the FastAPI `/query` endpoint with `use_loop=True`. (The TUI's `/prove \u003cid\u003e` command still points at the legacy `/autonomous/start` route, which now returns HTTP 410 Gone — the legacy pipeline was deleted in commit `2113629`. Rewiring the TUI to the new loop is tracked separately.)\n\n## Results\n\nVerified pass rates on miniF2F valid (every solve confirmed by `lean_prover` standalone compile — see [`docs/REALITY_CHECK.md`](docs/REALITY_CHECK.md) for the audit of the earlier REPL-only era):\n\n| Date | Approach | Verified | Sample |\n|------|----------|---------:|--------|\n| 2026-02-22 (audit) | v0.2.1 code, lean_prover-gated | 6.2% (15/244) | full 244 |\n| 2026-03-08 (v0.2.2) | + REPL pipe-recovery + tactic blocklist | 25.8% (63/244) | full 244 |\n| 2026-04-01 | + HILBERT decomposer + in-context solving | 50.0% (5/10) | 10-problem |\n| 2026-04-25 | **proposer-builder-reviewer loop (GLM-5.1)** | **90.0% (9/10)** | 10-problem · 0 false positives |\n| 2026-05-09 | same loop on a wider sample | **62.9% (22/35)** | 35-problem stratified · 0 false positives |\n\nThe 2026-02-17 v0.2.0 and 2026-02-18 v0.2.1 releases claimed 91.8% / 94.3% on the valid/test splits. Both numbers were inflated ~15× by REPL false positives and were retracted in the v0.2.2 audit (both GitHub releases now read \"RETRACTED (inflated numbers)\" in their titles). The current proposer-builder-reviewer architecture (commits `49211ce` through `2113629`) replaces the prior HILBERT-style pipeline; the full 244-problem run with the new architecture is pending (tracked in [issue #14](https://github.com/0bserver07/bourbaki/issues/14)).\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"assets/benchmark-history.svg\" alt=\"miniF2F verified pass-rate history\" width=\"100%\"\u003e\n\u003c/p\u003e\n\n## Example Usage\n\n**Prove a theorem:**\n```\n❯ Prove that the sum of the first n integers equals n(n+1)/2\n\n⏺ Thinking...\n⏺ Symbolic Compute (expression=Sum(k, (k, 1, n)))\n  ⎿ Computed result\n⏺ Lean Prover\n  ⎿ ✓ Verified in 2.3s\n\nProof by induction. Base case: n = 1, sum = 1 = 1·2/2. ✓\nInductive step: assume ∑_{k=1}^{n} k = n(n+1)/2.\nThen ∑_{k=1}^{n+1} k = n(n+1)/2 + (n+1) = (n+1)(n+2)/2. ∎\n```\n\n**Compute symbolically:**\n```\n❯ Factor 84 and find its divisors\n\n⏺ Symbolic Compute (operation=factor, expression=84)\n  ⎿ Computed result\n\n84 = 2² × 3 × 7\nDivisors: {1, 2, 3, 4, 6, 7, 12, 14, 21, 28, 42, 84}\n```\n\n**Identify a sequence:**\n```\n❯ What sequence is 1, 1, 2, 3, 5, 8, 13?\n\n⏺ Sequence Lookup (query=\"1,1,2,3,5,8,13\")\n  ⎿ Found 1 results\n\nA000045 — Fibonacci numbers: F(n) = F(n-1) + F(n-2) with F(0) = 0 and F(1) = 1.\n```\n\n## Commands\n\n| Command | What it does |\n|---------|-------------|\n| `/help` | Show all commands |\n| `/model \u003cname\u003e` | Switch LLM model |\n| `/skills` | List available proof technique skills |\n| `/problems` | Browse the problem database |\n| `/prove \u003cid\u003e` | Start proof attempt (legacy TUI handler still POSTs to `/autonomous/start`, which now returns 410; use the `attempt_proof_loop` driver or `/query` with `use_loop=True` for the new loop) |\n| `/pause` | Pause proof search (legacy, 410) |\n| `/progress` | Show proof search progress (legacy, 410) |\n| `/sessions` | List saved sessions |\n| `/new` | Start a new session |\n| `/export [format]` | Export last answer (latex, lean, markdown) |\n| `/debug` | Toggle debug mode |\n| `/clear` | Clear the screen |\n\n## Architecture\n\n```\nsrc/                          React + Ink TUI (display client)\n├── components/               UI components (Input, AgentEventView, AnswerView)\n├── hooks/                    useAgentRunner (SSE bridge), useModelSelection\n└── skills/                   21 SKILL.md proof technique files\n\nbackend/bourbaki/             Python backend (owns all state)\n├── agent/                    Pydantic AI agent, prompts, scratchpad, event mapper\n├── tools/                    SymPy, Lean 4, OEIS, arXiv, Web Search, Skills\n├── sessions/                 Persistence + context compaction\n├── prover/                   Proposer-builder-reviewer-memory loop\n├── autonomous/               Phase-3 vestige — only `tactics.py` survives (blocklist)\n├── benchmarks/               miniF2F + PutnamBench runners\n├── problems/                 13 classic problems database\n└── server/routes/            FastAPI endpoints (query, sessions, skills, ...)\n```\n\n## Tech Stack\n\n- **Backend:** Python, FastAPI, Pydantic AI, SymPy, httpx\n- **TUI:** Bun, React + Ink, TypeScript\n- **Verification:** Lean 4 + Mathlib\n- **Sequences:** OEIS API\n- **Papers:** arXiv API\n\n## Credits\n\nNamed after [Nicolas Bourbaki](https://en.wikipedia.org/wiki/Nicolas_Bourbaki), the collective pseudonym of a group of mathematicians who tried to rewrite all of mathematics from scratch using set theory.\n\n## License\n\n[MIT License](LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F0bserver07%2Fbourbaki","html_url":"https://awesome.ecosyste.ms/projects/github.com%2F0bserver07%2Fbourbaki","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F0bserver07%2Fbourbaki/lists"}