{"id":51076996,"url":"https://github.com/zozo123/t2t-vs-grch38","last_synced_at":"2026-06-23T15:02:06.689Z","repository":{"id":364827207,"uuid":"1269356477","full_name":"zozo123/t2t-vs-grch38","owner":"zozo123","description":null,"archived":false,"fork":false,"pushed_at":"2026-06-14T16:11:59.000Z","size":371,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-14T18:09:16.451Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/zozo123.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-06-14T16:01:23.000Z","updated_at":"2026-06-14T16:12:02.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/zozo123/t2t-vs-grch38","commit_stats":null,"previous_names":["zozo123/t2t-vs-grch38"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/zozo123/t2t-vs-grch38","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zozo123%2Ft2t-vs-grch38","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zozo123%2Ft2t-vs-grch38/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zozo123%2Ft2t-vs-grch38/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zozo123%2Ft2t-vs-grch38/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/
GitHub/owners/zozo123","download_url":"https://codeload.github.com/zozo123/t2t-vs-grch38/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zozo123%2Ft2t-vs-grch38/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34694786,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-23T02:00:07.161Z","response_time":65,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-06-23T15:02:04.446Z","updated_at":"2026-06-23T15:02:06.684Z","avatar_url":"https://github.com/zozo123.png","language":"Shell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# What T2T changed in the human reference\n\n**A genome-wide GRCh38 -\u003e T2T-CHM13 delta, computed by Claude Code with [OpenClaw Crabbox](https://github.com/openclaw/crabbox) on [islo.dev](https://islo.dev).**\n\nLive site: https://zozo123.github.io/t2t-vs-grch38/\n\nThis is the follow-up to the reference-broadcast demo:\nhttps://zozo123.github.io/genomics-sandboxes/\n\n## Result\n\nT2T-CHM13 closes many gaps in GRCh38. 
Running the same CpG-island kernel across both references shows:\n\n| metric | value |\n|---|---:|\n| Newly usable sequence in T2T | 179.6 Mb |\n| GRCh38 gap sequence removed | 150.6 Mb |\n| GRCh38 candidate CpG islands | 264,816 |\n| T2T candidate CpG islands | 305,308 |\n| Extra candidate islands | 40,492 |\n| Candidate-island density in added sequence | 225.4/Mb |\n\nThe added sequence is not one signal:\n\n- chrY and chr9 add large amounts of sequence with little candidate-island gain, consistent with newly resolved heterochromatin and repeat-rich sequence.\n- chr13, chr15, chr21, and chr22 gain many candidate islands, consistent with acrocentric/rDNA-rich sequence and with the known tendency of simple CpG-island rules to over-call GC/CpG-rich repeats.\n\nThis is not framed as a novel biological discovery. It is a compact, reproducible measurement of how completing the reference changes what a transparent sequence-composition caller sees.\n\n## Harness\n\nThe harness is a Claude Code agent driving OpenClaw Crabbox:\n\n```bash\nexport ISLO_API_KEY=...\n./crabbox.sh t2t\n```\n\nThe harness follows the official Crabbox Islo-provider model:\n\n1. `crabbox run --provider islo --keep --lease-output ...` warms an Islo sandbox with UCSC `hs1.fa.gz` (T2T-CHM13v2.0), splits it by chromosome, and keeps the sandbox.\n2. The harness saves that kept sandbox as an Islo snapshot, because the paper is explicitly testing snapshot broadcast.\n3. Each chromosome runs through `crabbox run --provider islo --islo-snapshot-name \u003csnapshot\u003e ...`, so Crabbox owns repo sync, guardrails, timing, and run lifecycle while Islo owns sandbox state and streaming exec.\n4. The reducer merges T2T per-chromosome JSON against the GRCh38 genome-wide receipts.\n\nIf `ISLO_API_KEY` / `CRABBOX_ISLO_API_KEY` is missing, the script prints a warning and falls back to the direct Islo CLI path used to generate the published receipts. 
The preferred path is OpenClaw Crabbox.\n\nMeasured run:\n\n| metric | value |\n|---|---:|\n| T2T warm-up | 77.6 s |\n| snapshot save | 15.0 s |\n| 24-way fan-out | 60.0 s |\n| snapshot size | 1009.5 MB |\n\n## Files\n\n| file | purpose |\n|---|---|\n| `index.html`, `styles.css`, `script.js` | static GitHub Pages paper |\n| `data/compare.json` | comparison receipts used by the figures |\n| `crabbox.sh` | genome-wide OpenClaw Crabbox harness with `run` and `t2t` modes |\n| `scripts/t2t_warmup.sh` | T2T warm-up: download, split, index |\n| `scripts/compute.py` | per-chromosome MAP kernel |\n| `scripts/reduce_compare.py` | GRCh38 vs T2T reducer |\n\n`crabbox run --provider islo` requires `ISLO_API_KEY`. New islo.dev accounts can use coupon `YOSSI150` for 150 free credits.\n\n## Caveats\n\nThe CpG-island calls are candidates from a simple sequence-composition rule. The 1987 Gardiner-Garden and Frommer criterion is transparent and useful for this demonstration, but it is known to over-call in some GC-rich repeat contexts. The result should be read as a reference-composition comparison, not as promoter annotation, methylation measurement, or medical inference.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzozo123%2Ft2t-vs-grch38","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzozo123%2Ft2t-vs-grch38","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzozo123%2Ft2t-vs-grch38/lists"}