{"id":31756763,"url":"https://github.com/pathwaycom/bdh","last_synced_at":"2025-10-09T19:23:35.132Z","repository":{"id":317544440,"uuid":"1067119192","full_name":"pathwaycom/bdh","owner":"pathwaycom","description":"Baby Dragon Hatchling (BDH) – Architecture and Code","archived":false,"fork":false,"pushed_at":"2025-10-01T15:08:12.000Z","size":989,"stargazers_count":11,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-10-01T16:26:58.943Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pathwaycom.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-09-30T12:05:01.000Z","updated_at":"2025-10-01T16:25:44.000Z","dependencies_parsed_at":"2025-10-01T16:29:02.065Z","dependency_job_id":"fae04b73-7560-4d66-8fb4-17ded9149edd","html_url":"https://github.com/pathwaycom/bdh","commit_stats":null,"previous_names":["pathwaycom/bdh"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/pathwaycom/bdh","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pathwaycom%2Fbdh","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pathwaycom%2Fbdh/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pathwaycom%2Fbdh/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pathwaycom%2Fbdh/manif
ests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pathwaycom","download_url":"https://codeload.github.com/pathwaycom/bdh/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pathwaycom%2Fbdh/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279001979,"owners_count":26083243,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-09T02:00:07.460Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-10-09T19:23:32.974Z","updated_at":"2025-10-09T19:23:35.125Z","avatar_url":"https://github.com/pathwaycom.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"## Baby Dragon Hatchling\nThis repository contains source code from the paper: Adrian Kosowski, Przemysław Uznański, Jan Chorowski, Zuzanna Stamirowska, Michał Bartoszkiewicz, _\"The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain\"_, [link](https://doi.org/10.48550/arXiv.2509.26507).\n\n## Architecture\n\u003cimg src=\"figs/architecture.png\" width=\"600\"/\u003e \n\n## Relation to Transformers\n\u003cimg src=\"figs/vocab.png\" width=\"600\"/\u003e \n\n## Scaling laws\n\u003cimg src=\"figs/bdh_scaling.png\" width=\"600\"/\u003e \n\n## Abstract:\nThe relationship between computing systems and the brain has served as 
motivation for pioneering theoreticians since John von Neumann and Alan Turing. \nUniform, scale-free biological networks, such as the brain, have powerful properties, including generalizing over time, which is the main barrier for Machine Learning on the path to Universal Reasoning Models.\n\nWe introduce `Dragon Hatchling' (BDH), a new Large Language Model architecture based on a scale-free biologically inspired network of $n$ locally-interacting neuron particles. BDH couples strong theoretical foundations and inherent interpretability without sacrificing Transformer-like performance.\n\nBDH is a practical, performant state-of-the-art \nattention-based state space sequence learning architecture. \nIn addition to being a graph model, BDH admits a GPU-friendly formulation.\nIt exhibits Transformer-like scaling laws: we find empirically that BDH rivals GPT2-architecture Transformer performance on language and translation tasks, at the same number of parameters (10M to 1B), for the same training data.\n\nBDH provides theoretical foundations for understanding model behavior in the limit of large size and reasoning time. \nOur results, formalized as a chain of reductions of expressiveness in the framework of computational Complexity Theory and Distributed Computing, and combined with findings on the BDH model, show a macro-to-micro correspondence of function between the general attention mechanisms in state-of-the-art Language Models, and attention mechanisms observed in the brain. These attention mechanisms formally converge as closed-form local graph dynamics at neurons and synapses: _the equations of reasoning_.\n\nBDH can be represented as a brain model. It contains $n$ neurons, organized as an excitatory circuit and an inhibitory circuit with integrate-and-fire thresholding of input signals at neurons. 
The working memory of BDH during inference entirely relies on synaptic plasticity with Hebbian learning using spiking neurons, at potentiation scales of minutes for the brain (up to hundreds of tokens). We confirm empirically that specific, individual synapses strengthen connection whenever BDH hears or reasons about a specific concept while processing language inputs. The neuron interaction network of BDH is a graph of high modularity with heavy-tailed degree distribution. The BDH model is biologically plausible, explaining one possible mechanism which human neurons could use to achieve speech.\n\nBDH is designed for interpretability. Activation vectors of BDH are sparse and positive. We demonstrate monosemanticity in BDH on language tasks, including representation of concept abstractions, which happens even for small models, below 100M-parameter scale. Interpretability of state, which goes beyond interpretability of neurons and model parameters, is an inherent feature of the BDH architecture. \n\nWe believe BDH opens the door to a new theory of _Thermodynamic Limit_ behavior for language and reasoning models, with the ultimate goal of Probably Approximately Correct (PAC)-like bounds for generalization of reasoning over time.\n\n## Running the code\n\nTo train and sample from the BDH model on a toy language modeling task please do:\n1. `pip install -r requirements.txt`\n2. `python train.py`\n\n## Acknowledgements\nWe thank Andrej Karpathy for the [nanoGPT](https://github.com/karpathy/nanoGPT/) code and the tiny Shakespeare dataset used in this demonstration.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpathwaycom%2Fbdh","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpathwaycom%2Fbdh","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpathwaycom%2Fbdh/lists"}