{"id":48229148,"url":"https://github.com/multikernel/branching","last_synced_at":"2026-04-04T19:29:19.228Z","repository":{"id":337708422,"uuid":"1154856614","full_name":"multikernel/branching","owner":"multikernel","description":"BranchContext gives AI agents and automated workflows copy-on-write branching over filesystems and processes","archived":false,"fork":false,"pushed_at":"2026-03-14T03:12:00.000Z","size":320,"stargazers_count":7,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-03-14T13:57:45.079Z","etag":null,"topics":["ai-agents","sandboxing"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/multikernel.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-02-10T21:14:38.000Z","updated_at":"2026-03-14T03:12:05.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/multikernel/branching","commit_stats":null,"previous_names":["multikernel/branching"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/multikernel/branching","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/multikernel%2Fbranching","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/multikernel%2Fbranching/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/multikernel%2Fbranching/releases","manifests_url":
"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/multikernel%2Fbranching/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/multikernel","download_url":"https://codeload.github.com/multikernel/branching/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/multikernel%2Fbranching/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31410686,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-04T10:20:44.708Z","status":"ssl_error","status_checked_at":"2026-04-04T10:20:06.846Z","response_time":60,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agents","sandboxing"],"created_at":"2026-04-04T19:29:19.130Z","updated_at":"2026-04-04T19:29:19.210Z","avatar_url":"https://github.com/multikernel.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# BranchContext\n\nLet AI agents try things without consequences.\n\nWhen an agent explores multiple strategies - applying different patches,\ntrying different prompts, or testing alternative approaches - it normally\nhas to snapshot the workspace, run the attempt, then clean up the mess\nbefore trying the next one. 
BranchContext eliminates that overhead.\n\nFork the workspace into parallel copy-on-write branches, run speculative\nattempts in each, commit the winner, and abort the rest - instantly.\nNo snapshots, no cleanup, no leftover state.\n\nBased on the paper [Fork, Explore, Commit: OS Primitives for Agentic Exploration](https://arxiv.org/abs/2602.08199).\n\n## Install\n\n```\npip install BranchContext\n```\n\nRequires Python \u003e= 3.10. No external dependencies.\n\n### Docker\n\nThe Docker image ships with [BranchFS](https://github.com/multikernel/branchfs)\nbuilt in -- no need to install FUSE, compile Rust, or configure any filesystem\nyourself. Just pull the image and go:\n\n```bash\ndocker pull multikernel/branching\n```\n\nRun directly with `docker run`:\n\n```bash\ndocker run --rm --device /dev/fuse --cap-add SYS_ADMIN \\\n  --security-opt apparmor:unconfined \\\n  -v $(pwd):/src multikernel/branching run -- make test\n```\n\nOr use the `branching-docker` wrapper (in `integration/docker/`), which handles\nthe Docker flags for you:\n\n```bash\nbranching-docker -w ./myproject run -- make test\nbranching-docker -w . speculate -c \"./fix_a.sh\" -c \"./fix_b.sh\"\nbranching-docker -w . best-of-n -n 5 -- ./solve.py\n```\n\nThe `-w` flag specifies a host directory to use as the workspace. It is\nbind-mounted into the container, and BranchFS is mounted on top automatically.\nCommitted changes are written back to the host directory.\n\nTo build the image from source:\n\n```bash\ndocker build -t branching -f integration/docker/Dockerfile .\n```\n\n## Quick start\n\n```python\nimport subprocess\n\nfrom branching import Workspace\n\nws = Workspace(\"/mnt/workspace\")\n\n# Auto-commit on success, auto-abort on exception\nwith ws.branch(\"attempt\") as b:\n    subprocess.run([\"agent\", \"--workdir\", str(b.path)], check=True)\n```\n\nThe agent writes to `b.path`, which is an isolated copy-on-write view.\nIf the command succeeds, changes are merged back into the workspace. 
If it\nraises, everything is rolled back - the workspace is untouched.\n\n## Agent patterns\n\nBranchContext ships with seven high-level patterns that cover the most common\nagent workflows. Each is a callable class: instantiate with config, call with\na workspace.\n\n### Parallel speculation (first wins)\n\nRun multiple strategies in parallel. The first one that succeeds gets\ncommitted; the rest are aborted.\n\nUse when you have several plausible approaches and care about latency more\nthan optimality: bug fixes where any passing patch is good enough, tool\nselection where multiple tools could work, or prompt variants where you\njust need one that doesn't error out. Pairs naturally with the ``n=``\nparameter in OpenAI's Chat Completions API to race N variations in\nparallel.\n\n```python\nfrom pathlib import Path\n\nfrom branching import Workspace, Speculate\nimport openai\n\nws = Workspace(\"/mnt/workspace\")\n\nclient = openai.OpenAI()\nresp = client.chat.completions.create(\n    model=\"gpt-4o\", n=5,\n    messages=[{\"role\": \"user\", \"content\": prompt}],\n)\n\ndef make_candidate(code: str):\n    def candidate(path: Path) -\u003e bool:\n        (path / \"fix.py\").write_text(code)\n        return run_tests(path)\n    return candidate\n\ncandidates = [make_candidate(c.message.content) for c in resp.choices]\noutcome = Speculate(candidates, first_wins=True, timeout=60)(ws)\n\nif outcome.committed:\n    print(f\"Fix {outcome.winner.branch_index} succeeded!\")\n```\n\n### Best-of-N with scoring\n\nRun N candidates in parallel and commit the highest-scoring success.\n\nUse when quality matters more than speed: code generation where you want\nthe cleanest output across multiple temperatures, translation with a BLEU\nscorer picking the best variant, or any task with a reliable quality metric.\nPairs naturally with the ``n=`` parameter in OpenAI's Chat Completions API\nto generate N variations in a single call, then test each in an isolated\nbranch.\n\nCandidates can return ``bool`` or ``(bool, float)``. 
Scoring is flexible:\npass pre-computed ``scores`` (e.g. from logprobs), provide an ``evaluate``\ncallback for post-execution scoring, or let candidates score themselves.\n\n```python\nfrom branching import BestOfN\nimport openai\n\nclient = openai.OpenAI()\nresp = client.chat.completions.create(\n    model=\"gpt-4o\", n=5, logprobs=True, top_logprobs=1,\n    messages=[{\"role\": \"user\", \"content\": prompt}],\n)\n\n# Pre-computed confidence scores from logprobs\nlogprob_scores = [\n    sum(t.logprob for t in c.logprobs.content) / len(c.logprobs.content)\n    for c in resp.choices\n]\n\n# Candidates just apply code and test -- return bare bool\ncandidates = [make_test(c.message.content) for c in resp.choices]\n\n# BestOfN picks the highest-logprob passing candidate\noutcome = BestOfN(candidates, scores=logprob_scores)(ws)\n```\n\n#### RL training rollouts\n\nPass ``commit=False`` to collect scores from all candidates without\nmodifying the workspace. Every branch runs to completion and aborts --\nthe base stays pristine for the next batch. This gives you cheap,\nisolated rollout environments for policy gradient methods like GRPO.\n\n```python\nfrom branching import Workspace, BestOfN\n\nws = Workspace(\"/mnt/workspace\")\n\nfor prompt in training_batch:\n    candidates = [make_candidate(prompt) for _ in range(N)]\n    outcome = BestOfN(candidates, commit=False)(ws)\n\n    # All N results available -- extract (success, score) for training\n    rewards = [(r.success, r.score) for r in outcome.all_results]\n    trainer.step(prompt, rewards)\n```\n\n### Reflexion (retry with feedback)\n\nRun a task, and if it fails, generate a critique and feed it back into the\nnext attempt. 
The agent learns from its mistakes across retries.\n\nUse when failures carry diagnostic signal: fixing test failures where the\nerror log tells you what went wrong, iterating on a solution where a\nvalidator explains why it was rejected, or multi-step plans where each\nfailed attempt narrows the search space for the next one.\n\n```python\nfrom branching import Reflexion\n\ndef task(path: Path, attempt: int, feedback: str | None) -\u003e bool:\n    if feedback:\n        (path / \"critique.txt\").write_text(feedback)\n    return run_and_test(path)\n\ndef critique(path: Path) -\u003e str:\n    return analyze_failure(path / \"test_output.log\")\n\noutcome = Reflexion(task, max_retries=3, critique=critique)(ws)\n```\n\n### Tree of Thoughts\n\nExplore multiple strategies in parallel, optionally expanding the best one\ninto deeper sub-strategies across multiple levels.\n\nUse when the problem has hierarchical structure: architectural decisions\nwhere you first pick a framework then optimize within it, multi-stage\npipelines where each stage has variants worth exploring, or planning tasks\nwhere high-level strategies each decompose into tactical choices.\n\n```python\nfrom branching import TreeOfThoughts\n\ndef strategy_a(path: Path) -\u003e tuple[bool, float]:\n    apply_approach_a(path)\n    return run_tests(path), evaluate_quality(path)\n\ndef strategy_b(path: Path) -\u003e tuple[bool, float]:\n    apply_approach_b(path)\n    return run_tests(path), evaluate_quality(path)\n\noutcome = TreeOfThoughts(\n    [strategy_a, strategy_b],\n    max_depth=2,\n    expand=lambda path, depth: generate_refinements(path),\n)(ws)\n```\n\n### Beam Search\n\nKeep the top-K branches alive at each depth level instead of just one\nwinner. Interpolates between BestOfN (all parallel, one level) and\nTreeOfThoughts (one winner per level). 
At each level, all candidates\nacross all beams are scored globally and only the top-K survive.\n\nInspired by [EnCompass](https://arxiv.org/abs/2512.03571), which showed\nthat multi-level beam search outperforms both BestOfN and single-winner\ntree search for hierarchical agent tasks.\n\nUse when the problem has hierarchical structure *and* you want to hedge\nacross multiple promising directions: multi-step code migrations where\nseveral rewrite strategies look viable at each stage, planning tasks where\npruning to one path too early loses good alternatives, or any setting where\nTreeOfThoughts' single-winner-per-level is too aggressive.\n\n```python\nfrom pathlib import Path\n\nfrom branching import BeamSearch\n\ndef strategy_a(path: Path) -\u003e tuple[bool, float]:\n    apply_approach_a(path)\n    return run_tests(path), evaluate_quality(path)\n\ndef strategy_b(path: Path) -\u003e tuple[bool, float]:\n    apply_approach_b(path)\n    return run_tests(path), evaluate_quality(path)\n\n# strategy_c and strategy_d are assumed to be defined the same way as above\noutcome = BeamSearch(\n    [strategy_a, strategy_b, strategy_c, strategy_d],\n    expand=lambda path, depth: generate_refinements(path),\n    beam_width=2,\n    max_depth=3,\n)(ws)\n```\n\n### Tournament (pairwise elimination)\n\nRun N candidates in parallel, then narrow to one through pairwise\nelimination via a judge function. 
The convergent dual of Tree of Thoughts:\nstarts wide, narrows to one.\n\nUse when you have a reliable pairwise comparator but no absolute scoring\nfunction: patch selection where an LLM judge picks the better diff,\nA/B-style evaluation where candidates are compared head-to-head, or\nany setting where relative ranking is easier than absolute scoring.\n\n```python\nfrom branching import Tournament\n\ndef make_patch(code: str):\n    def candidate(path: Path) -\u003e bool:\n        (path / \"fix.patch\").write_text(code)\n        return apply_and_test(path)\n    return candidate\n\ncandidates = [make_patch(p) for p in generate_patches(n=8)]\n\ndef judge(path_a: Path, path_b: Path) -\u003e int:\n    # 0 = a wins, 1 = b wins\n    return llm_compare(path_a / \"diff.patch\", path_b / \"diff.patch\")\n\noutcome = Tournament(candidates, judge=judge)(ws)\n```\n\n### Cascaded speculation (adaptive fan-out)\n\nStart with one attempt. If it fails, widen to more parallel candidates,\neach informed by error context from prior failures. 
Repeat with increasing\nfan-out until one succeeds or all waves are exhausted.\n\nInspired by [Cascade Speculative Drafting](https://arxiv.org/abs/2312.11462),\nwhich applies the same start-cheap-escalate-on-failure principle to LLM\ntoken generation.\n\nUse when most tasks succeed on the first try and you want to minimize\nwasted compute: coding agents where one LLM call usually works but\noccasionally needs retries with error feedback, test-fix loops where the\nerror log from a failed attempt is the best guide for the next one, or\nany workload with variable difficulty where paying for N parallel branches\nupfront is wasteful.\n\n```python\nfrom branching import Cascaded\n\ndef solve(path: Path, feedback: list[str]) -\u003e tuple[bool, str]:\n    result = run_agent(path, prior_errors=feedback)\n    if result.tests_pass:\n        return True, \"\"\n    return False, result.error_output\n\noutcome = Cascaded(solve, fan_out=(1, 2, 4), timeout=120)(ws)\n```\n\nThe task returns `(success, error_context)`. On failure, the error string\nis collected and passed as feedback to subsequent waves. On success, it is\nignored. Empty error strings are silently dropped.\n\n## Lower-level usage\n\nThe patterns above are built on two lower-level primitives you can use\ndirectly when you need more control.\n\n### Branching with manual control\n\n```python\nwith ws.branch(\"attempt\", on_success=None, on_error=None) as b:\n    result = run_agent(workdir=b.path)\n    if result.confident:\n        b.commit()\n    else:\n        b.abort()\n```\n\n### Nested branches\n\nBranches can nest - useful for hierarchical exploration (e.g. 
pick a\nstrategy, then explore variants within it).\n\n```python\nwith ws.branch(\"strategy_a\") as a:\n    apply_strategy(a.path)\n\n    with a.branch(\"variant_1\") as v1:\n        tweak(v1.path)\n        # v1 auto-commits into a on success\n\n    # a auto-commits into main on success\n```\n\n### Process forking\n\nFor crash-prone agent code, `BranchContext` runs each task in a forked child\nprocess with its own process group. The child is automatically killed on\ntimeout or context exit.\n\nFor sandboxing (filesystem confinement, resource limits, syscall filtering),\ncombine with [sandlock](https://github.com/multikernel/sandlock).\n\n```python\nfrom branching import BranchContext\n\nwith ws.branch(\"forked\", on_success=None, on_error=None) as fb:\n    with BranchContext(run_agent, workspace=fb.path) as ctx:\n        try:\n            ctx.wait(timeout=30)\n            fb.commit()\n        except ProcessBranchError:\n            fb.abort()\n```\n\nRun N tasks in parallel, each in its own forked process:\n\n```python\nwith BranchContext.create(\n    targets=[task_a, task_b, task_c],\n    workspaces=[ws_a.path, ws_b.path, ws_c.path],\n) as contexts:\n    for ctx in contexts:\n        ctx.wait(timeout=60)\n```\n\n## CLI\n\nThe `branching` command exposes the agent patterns as shell commands.\nAuto-detects the workspace from your current directory, or pass `-w PATH`.\nAll commands support `--json` for machine-readable output.\n\n### run\n\nRun a command in a new branch. Commits on exit 0, aborts on non-zero.\n\n```bash\nbranching run -- ./build.sh\nbranching run --on-error none -- python train.py\nbranching run --ask -- make test          # prompt before commit/abort\n```\n\n### speculate\n\nRace N commands in parallel branches. 
First success wins.\n\n```bash\nbranching speculate -c \"./fix_a.sh\" -c \"./fix_b.sh\" -c \"./fix_c.sh\"\nbranching speculate --timeout 60 -c \"python solve_v1.py\" -c \"python solve_v2.py\"\n```\n\n### best-of-n\n\nRun CMD N times in parallel, commit the highest-scoring success.\n\nThe child process can write a score to fd 3 (`echo 0.95 \u003e\u00263`).\nIf nothing is written, score defaults to 1.0 for success / 0.0 for failure.\nEach child receives `BRANCHING_ATTEMPT` (0-indexed) in its environment.\n\n```bash\nbranching best-of-n -n 5 -- ./solve.py\nbranching best-of-n -n 3 --timeout 120 --json -- python attempt.py\nbranching best-of-n -n 3 -- bash -c 'python run.py \u0026\u0026 echo \"$SCORE\" \u003e\u00263'\n```\n\n### reflexion\n\nSequential retry with optional critique feedback loop.\n\nThe child receives `BRANCHING_ATTEMPT` (0-indexed) and `BRANCHING_FEEDBACK`\n(empty on first attempt, critique output on retries) in its environment.\n\n```bash\nbranching reflexion --retries 5 -- ./fix.sh\nbranching reflexion --retries 3 --critique \"./review.sh\" -- ./solve.py\nbranching reflexion --retries 3 --critique \"python critique.py\" --json -- python agent.py\n```\n\n### status\n\nShow workspace info and active branches.\n\n```bash\nbranching status\nbranching status --json\n```\n\n## How it works\n\nBranchContext uses [BranchFS](https://github.com/multikernel/branchfs), a\ncopy-on-write FUSE filesystem, to create instant, zero-cost branches of your\nworkspace. Branches are virtual paths within a single mount, with\nfirst-winner-commit semantics.\n\nYou just create a `Workspace` pointed at a mounted BranchFS path.\n\nProcess forking (`BranchContext`) uses `fork(2)` + process groups to run\neach task in an isolated child process. The child's working directory is set\nto the branch path, and `mprotect(2)` enforces copy-on-write invariants on\nparent memory regions.\n\nBranchContext focuses purely on branching. 
For sandboxing (filesystem\nconfinement, syscall filtering, resource limits), use\n[sandlock](https://github.com/multikernel/sandlock) alongside branching --\nthe two are designed to compose together.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmultikernel%2Fbranching","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmultikernel%2Fbranching","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmultikernel%2Fbranching/lists"}