{"id":34034363,"url":"https://github.com/lucidrains/hyper-connections","last_synced_at":"2026-02-04T16:11:47.641Z","repository":{"id":269603122,"uuid":"907965459","full_name":"lucidrains/hyper-connections","owner":"lucidrains","description":"Attempt to make multiple residual streams from Bytedance's Hyper-Connections paper accessible to the public","archived":false,"fork":false,"pushed_at":"2026-01-15T01:04:46.000Z","size":371,"stargazers_count":140,"open_issues_count":3,"forks_count":13,"subscribers_count":2,"default_branch":"main","last_synced_at":"2026-01-15T07:43:13.198Z","etag":null,"topics":["artificial-intelligence","deep-learning","residuals"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lucidrains.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-12-24T18:14:23.000Z","updated_at":"2026-01-15T01:04:47.000Z","dependencies_parsed_at":"2024-12-24T18:59:55.663Z","dependency_job_id":null,"html_url":"https://github.com/lucidrains/hyper-connections","commit_stats":null,"previous_names":["lucidrains/hyper-connections"],"tags_count":62,"template":false,"template_full_name":null,"purl":"pkg:github/lucidrains/hyper-connections","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2Fhyper-connections","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2Fhyper-connections/tags","releases_url":"https://repos.ec
osyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2Fhyper-connections/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2Fhyper-connections/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lucidrains","download_url":"https://codeload.github.com/lucidrains/hyper-connections/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2Fhyper-connections/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29089924,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-04T03:31:03.593Z","status":"ssl_error","status_checked_at":"2026-02-04T03:29:50.742Z","response_time":62,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","deep-learning","residuals"],"created_at":"2025-12-13T19:43:50.096Z","updated_at":"2026-02-04T16:11:47.634Z","avatar_url":"https://github.com/lucidrains.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cimg src=\"./hyper-connections.png\" width=\"450px\"\u003e\u003c/img\u003e\n\n## Hyper Connections\n\nAttempt to make multiple residual streams, proposed in [Hyper-Connections paper](https://arxiv.org/abs/2409.19606) out of Bytedance AI lab, accessible as an easy to use library, as well 
as for following any new research in this direction.\n\n[Write up on mHC from Subhadip Mitra](https://subhadipmitra.com/blog/2026/deepseek-mhc-manifold-constrained-hyper-connections/)\n\n## Install\n\n```bash\n$ pip install hyper-connections\n```\n\n## Usage\n\n```python\nimport torch\nfrom torch import nn\n\n# a single branch layer\n\nbranch = nn.Linear(512, 512)\n\n# before\n\nresidual = torch.randn(2, 1024, 512)\n\nresidual = branch(residual) + residual\n\n# after, say 4 streams in paper\n\nfrom hyper_connections import get_init_and_expand_reduce_stream_functions\n\ninit_hyper_conn, expand_stream, reduce_stream = get_init_and_expand_reduce_stream_functions(4)\n\n# 1. wrap your branch function\n\nhyper_conn_branch = init_hyper_conn(dim = 512, branch = branch)\n\n# 2. expand to 4 streams, this must be done before your trunk, typically a for-loop with many branch functions\n\nresidual = expand_stream(residual)\n\n# 3. forward your residual as usual into the wrapped branch function(s)\n\nresidual = hyper_conn_branch(residual) \n\n# 4. reduce 4 streams with a summation, this has to be done after your for-loop trunk. for transformer, unsure whether to do before or after final norm\n\nresidual = reduce_stream(residual)\n```\n\nOr doing it manually, as in the paper\n\n```python\nimport torch\nfrom torch import nn\n\n# a single branch layer\n\nbranch = nn.Linear(512, 512)\n\n# before\n\nresidual = torch.randn(2, 1024, 512)\n\nresidual = branch(residual) + residual\n\n# after, say 4 streams in paper\n\nfrom hyper_connections import get_init_and_expand_reduce_stream_functions\n\ninit_hyper_conn, expand_stream, reduce_stream = get_init_and_expand_reduce_stream_functions(4)\n\n# 1. instantiate hyper connection with correct number of streams (4 in this case) - or use the init function above\n\nhyper_conn = init_hyper_conn(dim = 512)\n\n# 2. expand to 4 streams\n\nresidual = expand_stream(residual)\n\n# 3. 
forward your residual into the hyper connection to get the branch input and an add-residual function (learned betas)\n\nbranch_input, add_residual = hyper_conn(residual)\n\nbranch_output = branch(branch_input)\n\nresidual = add_residual(branch_output)\n\n# or you can do it in one line like so -\u003e residual = hyper_conn.decorate_branch(branch)(residual)\n\n# 4. reduce 4 streams with a summation, this has to be done after your for-loop trunk\n\nresidual = reduce_stream(residual)\n```\n\nTo compare hyper connections to a plain residual without changing the code, just pass `disable = True` when fetching the functions\n\n```python\nget_init_and_expand_reduce_stream_functions(4, disable = True)\n```\n\nTo use the fractionated feature dimensions proposed in [a follow-up paper](https://arxiv.org/abs/2503.14125) by the same authors, just instantiate with `num_fracs` greater than `1` like so\n\n```python\nget_init_and_expand_reduce_stream_functions(1, num_fracs = 4) # also allows you to mix streams and fractions of feature dimension\n```\n\n## Citation\n\n```bibtex\n@article{Zhu2024HyperConnections,\n    title   = {Hyper-Connections},\n    author  = {Defa Zhu and Hongzhi Huang and Zihao Huang and Yutao Zeng and Yunyao Mao and Banggu Wu and Qiyang Min and Xun Zhou},\n    journal = {ArXiv},\n    year    = {2024},\n    volume  = {abs/2409.19606},\n    url     = {https://api.semanticscholar.org/CorpusID:272987528}\n}\n```\n\n```bibtex\n@misc{Rubin2024,\n    author  = {Ohad Rubin},\n    url     = {https://medium.com/@ohadrubin/exploring-weight-decay-in-layer-normalization-challenges-and-a-reparameterization-solution-ad4d12c24950}\n}\n```\n\n```bibtex\n@article{Zhu2025FracConnectionsFE,\n    title   = {Frac-Connections: Fractional Extension of Hyper-Connections},\n    author  = {Defa Zhu and Hongzhi Huang and Jundong Zhou and Zihao Huang and Yutao Zeng and Banggu Wu and Qiyang Min and Xun Zhou},\n    journal = {ArXiv},\n    year    = {2025},\n    volume  = {abs/2503.14125},\n    url     = 
{https://api.semanticscholar.org/CorpusID:277104144}\n}\n```\n\n```bibtex\n@misc{xie2025mhcmanifoldconstrainedhyperconnections,\n    title   = {mHC: Manifold-Constrained Hyper-Connections}, \n    author  = {Zhenda Xie and Yixuan Wei and Huanqi Cao and Chenggang Zhao and Chengqi Deng and Jiashi Li and Damai Dai and Huazuo Gao and Jiang Chang and Liang Zhao and Shangyan Zhou and Zhean Xu and Zhengyan Zhang and Wangding Zeng and Shengding Hu and Yuqing Wang and Jingyang Yuan and Lean Wang and Wenfeng Liang},\n    year    = {2025},\n    eprint  = {2512.24880},\n    archivePrefix = {arXiv},\n    primaryClass = {cs.CL},\n    url     = {https://arxiv.org/abs/2512.24880}, \n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flucidrains%2Fhyper-connections","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flucidrains%2Fhyper-connections","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flucidrains%2Fhyper-connections/lists"}