{"id":50565307,"url":"https://github.com/velvetmonkey/flywheel-concept","last_synced_at":"2026-06-04T14:01:07.529Z","repository":{"id":356679813,"uuid":"1233626449","full_name":"velvetmonkey/flywheel-concept","owner":"velvetmonkey","description":"Pre-registered cross-model latent alignment as a falsifiable research programme on whether neural networks reveal structured concept geometry.","archived":false,"fork":false,"pushed_at":"2026-05-09T07:42:49.000Z","size":176,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-09T08:42:31.859Z","etag":null,"topics":["concept-geometry","cross-model-alignment","falsification","flywheel","interpretability","obsidian","research"],"latest_commit_sha":null,"homepage":null,"language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/velvetmonkey.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-09T06:55:04.000Z","updated_at":"2026-05-09T07:42:51.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/velvetmonkey/flywheel-concept","commit_stats":null,"previous_names":["velvetmonkey/flywheel-concept"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/velvetmonkey/flywheel-concept","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/velvetmonkey%2Fflywheel-concept","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/velvetmonkey%2Fflywheel-concept/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/velvetmonkey%2Fflywheel-concept/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/velvetmonkey%2Fflywheel-concept/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/velvetmonkey","download_url":"https://codeload.github.com/velvetmonkey/flywheel-concept/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/velvetmonkey%2Fflywheel-concept/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33907694,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-04T02:00:06.755Z","response_time":64,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["concept-geometry","cross-model-alignment","falsification","flywheel","interpretability","obsidian","research"],"created_at":"2026-06-04T14:01:06.383Z","updated_at":"2026-06-04T14:01:07.516Z","avatar_url":"https://github.com/velvetmonkey.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n  \u003cimg src=\"header.png\" alt=\"Flywheel\" width=\"256\"/\u003e\n  \u003ch1\u003eFlywheel Concept\u003c/h1\u003e\n  \u003cp\u003e\u003cstrong\u003eA falsifiable study of whether cross-model activations reveal structured concept geometry — or shared training-corpus artefact.\u003c/strong\u003e\u003c/p\u003e\n\u003c/div\u003e\n\n\u003e **Status: v0.1 working draft post-Trial-2 (2026-05-10) — RESHAPE-NARROW active; repo freeze pending council pass.**\n\u003e Sibling [`flywheel-geometry`](https://github.com/velvetmonkey/flywheel-geometry) Method 6 falsifier resolved **FAIL** on 2026-05-10 (full audit: [trial2-postmortem.md](https://github.com/velvetmonkey/flywheel-geometry/blob/main/docs/trial2-postmortem.md)). After two same-day turns on Concept's gate-spec, the active resolution is **RESHAPE-NARROW** (turn-2; supersedes turn-1 stay-the-course same-day). The active gate-spec is in the vault at `tech/flywheel/flywheel-concept-falsification-gate.md` (Resolution v2 section).\n\u003e\n\u003e The narrowed bridge claim, Bet 0 BM25 safety floor, three-control decision rule (B1 + B2 + C1 lexical/corpus-frequency, all simultaneously), and 6 structured concept families panel are not yet reflected in the body of this README or in [`docs/pre-registration.md`](./docs/pre-registration.md) — those updates ship at the `v0.1-prereg-frozen` freeze tag, after a council pass on the README and pre-registration doc rewrites. Until then, the body below describes the original v0.1-09 spec; the **active** spec is in the vault gate-spec linked above.\n\nA telescope's value is judged by what it reveals about stars, not by what it tells you about itself. Neural networks reveal traces — activations, behaviours, distances. The science is measuring which traces faithfully reveal latent concept structure, and which are artefacts of the instrument.\n\n---\n\n## What this is\n\nFlywheel Concept is a research programme. Not a product, not a benchmark, not a model evaluator, not a cosmology. Recent interpretability work — Goodfire AI's manifold-steering programme (Lubana et al., 2025–2026), the Platonic Representation Hypothesis (Huh et al., 2024), Hindupur–Lubana–Fel–Ba's *Projecting Assumptions* (NeurIPS 2025) — shows neural networks across architectures encode meaning on curved manifolds and converge on shared latent geometry without coordinated training. The question Concept addresses is the next one: *when is that convergence revealing real structure, and when is it shared training-corpus artefact?*\n\nThe work is vault-first and pre-pilot. When the pilot ships, numbers will be published regardless of outcome.\n\n---\n\n## How we'll know this works\n\nConcept commits, before any experiment runs, to a single bridge claim:\n\n\u003e **Pre-registered cross-model latent alignment under structural transforms predicts task transfer with incremental coefficient of determination ΔR² ≥ 0.10, bootstrap 95% CI excluding 0, holding on ≥2 of 3 task domains AND on the code-heavy holdout model.**\n\nFull metric definitions, bootstrap procedure, and unit-of-analysis discussion live in [`benchmark/metrics.md`](./benchmark/metrics.md). Protocol freeze rules in [`docs/pre-registration.md`](./docs/pre-registration.md).\n\n**Tasks** (three relational-structure domains, all literature-grounded — full scope in [`benchmark/tasks.md`](./benchmark/tasks.md)):\n\n- **BATS semantic subsets** (Gladkova et al. 2016) — relational linguistic structure. Lexicographic + encyclopedic subsets only; inflectional and derivational morphology subsets excluded as token-level rather than concept-level.\n- **WordNet taxonomic distance** — hierarchical concept structure.\n- **Color-circumplex ordering** — perceptual concept geometry, the canonical curved-manifold case.\n\n**Models** (five open-weight models, ≥2 model families, ≥2 training distributions — full table in [`benchmark/models.md`](./benchmark/models.md)):\n\n- Llama 3.1 8B (Meta, RedPajama-style corpus)\n- Gemma 2 9B (Google, proprietary public-corpus mix)\n- Pythia 12B (EleutherAI, the Pile)\n- Qwen 2.5 Coder 7B (Alibaba, code-heavy corpus) — cross-distribution stress test\n- Mistral 7B (Mistral, mixed corpus)\n\nThe Qwen-Coder run is a **cross-distribution stress holdout**, not an independently decisive falsifier. A code-heavy model still has substantial natural-language pretraining (documentation, comments, package names, tutorials), so alignment success does not prove cross-corpus concept geometry, and alignment failure does not prove the effect is corpus artefact (could be tokenization, scale, post-training, layer choice, or task mismatch). Outcome is diagnostic and feeds the interpretation; it is not on its own the gate.\n\n**Baselines** (two, both literature-grounded):\n\n- **B1 — Raw per-model probe**: per-model linear/MLP probe on residual stream. The artefact-y baseline Concept claims to beat.\n- **B2 — Cross-model linear probe transfer**: train probe on model A, evaluate on model B's activations after a best-fit linear map. Literature-grounded (Conneau et al. cross-lingual transfer; Bansal et al. model stitching).\n\n**Decision rule** (precise version in [`benchmark/metrics.md`](./benchmark/metrics.md)): ΔR² ≥ 0.10 over both B1 and B2, bootstrap 95% CI (BCa, 1000 resamples, clustered by concept-item / relation-group / model-pair) excluding 0, on ≥2 of 3 domains AND on the Qwen-Coder cross-distribution holdout. Any post-hoc parameter selection or task-set change after the protocol-freeze tag is automatic refutation. Pre-registration in [Flywheel Ideas](https://github.com/velvetmonkey/flywheel-ideas) lands when the protocol freezes.\n\n**If the bridge claim fails to beat both baselines on the agreed criteria, the negative result is the launch artifact.**\n\n---\n\n## What this is not\n\n- Not a leaderboard. The benchmark exists only to falsify the bridge claim. There is no ranking of models by score.\n- Not the Universal Semantic Coordinate System. USCS is held in [docs/philosophy.md](./docs/philosophy.md) as the upper bound this work could become; the bridge claim is the necessary condition, not the conclusion.\n- Not a finance product. No finance-domain experiments before the method is proven on cyclic time, scalar order, and concept-heavy taxonomies.\n- Not a cosmological claim about reality. The cosmological reading lives in [docs/philosophy.md](./docs/philosophy.md), owned and linked, not on this front door.\n- Not a rotator-tier claim. Rotator usefulness — can we move along the geometry, can behaviour follow geodesic interventions — is Tier 3 of the model-comparison ladder. Concept is Tier 1: instrument fidelity.\n\n---\n\n## How this slots in\n\nConcept is umbrella to Geometry's research lane. Geometry's adversarial-screen discipline produced the falsified probe that motivated Concept. Concept does not replace Geometry; it is what Geometry's empirical floor was pointing at.\n\n## The Flywheel Suite\n\n- **[flywheel-memory](https://github.com/velvetmonkey/flywheel-memory)** — local-first MCP server. Hybrid BM25 + semantic search, knowledge graph, safe writes over an Obsidian vault.\n- **[flywheel-crank](https://github.com/velvetmonkey/flywheel-crank)** — Obsidian plugin. Visual layer over Memory's graph: sidebar, vault health, semantic search UI.\n- **[flywheel-ideas](https://github.com/velvetmonkey/flywheel-ideas)** — falsifiable decision ledger. Pre-registered assumptions, multi-model AI council dissent, outcome-driven refutation propagation.\n- **[flywheel-geometry](https://github.com/velvetmonkey/flywheel-geometry)** — geodesic retrieval extension. Pre-registered study of cross-domain bridge-finding via activation manifolds.\n- **flywheel-concept** *(this repo)* — research programme on whether cross-model activations reveal structured concept geometry.\n\n---\n\n## Status\n\n**Draft pre-registration. No pilot run yet.** This repo holds:\n\n- The bridge claim, the task list, the model list, the baseline list, and the decision rule (above).\n- The frozen protocol document at [`docs/pre-registration.md`](./docs/pre-registration.md). Currently `DRAFT` — moves to `FROZEN` at a tagged commit when the user signs off.\n- The evidence record at [`evidence/cheap-probe-360/`](./evidence/cheap-probe-360/) — the failed introspective-probe sweep that motivated this programme. Mirrored from [`flywheel-geometry`](https://github.com/velvetmonkey/flywheel-geometry).\n- No baseline numbers, no pilot results, no claim of empirical outcome.\n\nConcept follows [Flywheel Geometry](https://github.com/velvetmonkey/flywheel-geometry) — Geometry's pilot lands first, Concept's protocol freezes after, pre-registration in Flywheel Ideas at protocol freeze.\n\n---\n\n## Evidence\n\nThe programme stands on one piece of preserved evidence at [`evidence/cheap-probe-360/`](./evidence/cheap-probe-360/): the 360-call adversarial sweep on @slashreboot's introspective coordinate elicitation probe (12 concepts × 6 prompt variants × 5 runs, against `claude-sonnet-4-6`, 2026-05-08). Pre-registered Spearman rank-correlation pass criterion `ρ \u003e 0.5` between core and adversarial variants. **All four screens failed** (A: 0.078, B: 0.290, C: 0.159, E: 0.108). The pre-registered failure-signal trap (variant D, coherence-pressure framing) fired at ρ = 0.759 with 18× core's ground-truth pair separation. Two load-bearing assumptions in [Flywheel Ideas](https://github.com/velvetmonkey/flywheel-ideas) refuted: `asm-HvE9muhM` and `asm-VotY4n8g`, both parented to idea `idea-b4ZeRCoa`.\n\nThe bridge claim above is the next assumption in that lineage. The chain itself is the work.\n\n---\n\n## Credit \u0026 Collaboration\n\nConcept exists because of work that is not Concept's:\n\n- [Goodfire AI](https://goodfire.ai/) — Ekdeep Singh Lubana, Thomas Fel, [@GoodfireAI](https://x.com/GoodfireAI) — for the manifold-steering programme that established representation-and-behaviour geometry as evidence of structure in data.\n- [@slashreboot](https://x.com/slashreboot) — Matthew — for publishing the introspective coordinate elicitation probe with full chain-of-thought traces, the data that made the falsification possible.\n- Hindupur, Lubana, Fel \u0026 Ba — *Projecting Assumptions: The Duality Between Sparse Autoencoders and Concept Geometry* (NeurIPS 2025, [arXiv:2503.01822](https://arxiv.org/abs/2503.01822)) — for the instrument-fidelity framing.\n- Anthropic NLA team — for the Natural Language Autoencoders publication that defined the cleanest available baseline for introspective decoding.\n- BetaTomorrow / Deep Manifold — *Single Token Geometry 04* — for the framing of activations and behaviour as traces of boundary-conditioned neural computation, not the manifold itself.\n- Huh, Cheung, Wang \u0026 Isola — for the [Platonic Representation Hypothesis](https://arxiv.org/abs/2405.07987).\n\nCoauthorship on derivative academic work belongs upstream.\n\n---\n\n## Philosophy\n\nThe cosmological reading, the Universal Semantic Coordinate System reframe, and the four-piece suite headline are held in [docs/philosophy.md](./docs/philosophy.md). Owned, not hidden. Read it for the story so far. Read this README for what is being claimed and how it could be wrong.\n\n---\n\n## License\n\nApache 2.0. See [LICENSE](./LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvelvetmonkey%2Fflywheel-concept","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvelvetmonkey%2Fflywheel-concept","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvelvetmonkey%2Fflywheel-concept/lists"}