{"id":37609014,"url":"https://github.com/fla-org/flash-bidirectional-linear-attention","last_synced_at":"2026-01-16T10:16:16.865Z","repository":{"id":263847527,"uuid":"891572405","full_name":"fla-org/flash-bidirectional-linear-attention","owner":"fla-org","description":"Triton implement of bi-directional (non-causal) linear attention","archived":false,"fork":false,"pushed_at":"2025-02-04T07:05:46.000Z","size":80,"stargazers_count":40,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-02-04T08:19:11.602Z","etag":null,"topics":["computer-vision","machine-learning-systems","triton-lang"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fla-org.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-11-20T15:14:53.000Z","updated_at":"2025-02-04T07:14:14.000Z","dependencies_parsed_at":"2025-01-13T12:43:27.848Z","dependency_job_id":"982308c4-9610-44aa-8bf5-b8be928d9981","html_url":"https://github.com/fla-org/flash-bidirectional-linear-attention","commit_stats":null,"previous_names":["hp-l33/flash-bidirectional-linear-attention","fla-org/flash-bidirectional-linear-attention"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/fla-org/flash-bidirectional-linear-attention","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fla-org%2Fflash-bidirectional-linear-attention","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fla-org%2Fflash-bidirectional-linear-atte
ntion/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fla-org%2Fflash-bidirectional-linear-attention/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fla-org%2Fflash-bidirectional-linear-attention/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fla-org","download_url":"https://codeload.github.com/fla-org/flash-bidirectional-linear-attention/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fla-org%2Fflash-bidirectional-linear-attention/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28478049,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-16T06:30:42.265Z","status":"ssl_error","status_checked_at":"2026-01-16T06:30:16.248Z","response_time":107,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-vision","machine-learning-systems","triton-lang"],"created_at":"2026-01-16T10:16:16.301Z","updated_at":"2026-01-16T10:16:16.841Z","avatar_url":"https://github.com/fla-org.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\n# Flash Bi-directional Linear Attention\n\n\u003c/div\u003e\n\nThe aim of this repository is to implement **bi-directional linear attention** for **non-causal** modeling using Triton.\n\n\u003cdiv 
align=\"center\"\u003e\n  \u003cimg width=\"600\" alt=\"image\" src=\"https://res.cloudinary.com/dunty6aot/image/upload/v1735544947/387246938-cd89a618-5d54-41b7-9055-36ba28b29fbd-2_tailvo.png\"\u003e\n\u003c/div\u003e\n\n\n\nThis project is currently maintained by an individual and remains a work in progress. As the maintainer is still in the early stages of learning Triton, many implementations may not be optimal. **Contributions and suggestions are welcome!**\n\n# Update\n* [2025-02-04] Updated PolaFormer\n* [2024-12-30] Optimized the backpropagation speed of `linear_attn`.\n* [2024-12-28] Updated `simple_la`, which is a simple form of `linear_attn` without the norm term.\n\n# Models\nRoughly sorted by the order in which support was added to FBi-LA.\n\n| Year    | Model     | Title                                                                  | Paper                                     | Code                                                          | `fla` impl                                                                                                           |\n| :------ | :-------- | :--------------------------------------------------------------------- | :---------------------------------------: | :-----------------------------------------------------------: | :------------------------------------------------------------------------------------------------------------------: |\n| 2024 | LinFusion | LinFusion: 1 GPU, 1 Minute, 16K Image                                  | [arxiv](https://arxiv.org/abs/2409.02097) | [official](https://github.com/Huage001/LinFusion)             | [code](https://github.com/hp-l33/flash-bidirectional-linear-attention/blob/main/fbi_la/layers/linfusion.py)           |\n| 2024 | MLLA      | Demystify Mamba in Vision: A Linear Attention Perspective              | [arxiv](https://arxiv.org/abs/2405.16605) | [official](https://github.com/LeapLabTHU/MLLA)                | 
[code](https://github.com/hp-l33/flash-bidirectional-linear-attention/blob/main/fbi_la/layers/mlla.py)                |\n| 2023 | Focused-LA| FLatten Transformer: Vision Transformer using Focused Linear Attention | [arxiv](https://arxiv.org/abs/2308.00442) | [official](https://github.com/LeapLabTHU/FLatten-Transformer) | [code](https://github.com/hp-l33/flash-bidirectional-linear-attention/blob/main/fbi_la/layers/focused_la.py)          |\n| 2025 | PolaFormer| PolaFormer: Polarity-aware Linear Attention for Vision Transformers    | [arxiv](https://arxiv.org/abs/2501.15061) | [official](https://github.com/ZacharyMeng/PolaFormer) | [code](https://github.com/hp-l33/flash-bidirectional-linear-attention/blob/main/fbi_la/layers/polaformer.py)                 |\n\nMore models will be implemented gradually.\n\n\n# Usage\n\n## Installation\n``` shell\ngit clone https://github.com/fla-org/flash-bidirectional-linear-attention.git\npip install -e flash-bidirectional-linear-attention/.\n```\n\n## Integrated Models\nThis library has integrated some models, which can be called directly. Taking [LinFusion](https://github.com/Huage001/LinFusion) as an example:\n``` python\nimport torch\nfrom diffusers import AutoPipelineForText2Image\nfrom fbi_la.models import LinFusion\n\nsd_repo = \"Lykon/dreamshaper-8\"\n\npipeline = AutoPipelineForText2Image.from_pretrained(\n    sd_repo, torch_dtype=torch.float16, variant=\"fp16\"\n).to(torch.device(\"cuda\"))\n\nlinfusion = LinFusion.construct_for(pipeline)\n\nimage = pipeline(\n    \"An astronaut floating in space. 
Beautiful view of the stars and the universe in the background.\",\n    generator=torch.manual_seed(123)\n).images[0]\n```\n\n# Benchmarks\nTested on an A800 80G GPU.\n``` shell\nB8-H16-D64:\n         T  torch_fwd  triton_fwd  torch_bwd  triton_bwd\n0    128.0   0.063488    0.049152   0.798720    0.651264\n1    256.0   0.080896    0.056320   0.796672    0.625664\n2    512.0   0.111616    0.058368   0.798720    0.630784\n3   1024.0   0.169984    0.090112   0.864256    0.719872\n4   2048.0   0.300032    0.151552   1.624064    0.702464\n5   4096.0   0.532480    0.276480   3.058176    1.324032\n6   8192.0   1.005568    0.521216   5.880320    2.556928\n7  16384.0   1.924608    0.980992  11.540992    5.022208\n```\n\n\u003cdiv align=\"center\"\u003e\n  \u003cimg width=\"600\" alt=\"image\" src=\"https://res.cloudinary.com/dunty6aot/image/upload/v1735545026/817a5a20-2cc5-48e8-b8dd-01b63753926b_mbbnfk.png\"\u003e\n\u003c/div\u003e\n\n# TODO\n- improve memory efficiency during backpropagation\n- implement more models\n  - VSSD\n  - RALA\n\n# Acknowledgments\nThanks to the following repositories for their inspiration:\n- [flash-attention](https://github.com/Dao-AILab/flash-attention)\n- [flash-linear-attention](https://github.com/sustcsonglin/flash-linear-attention)\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffla-org%2Fflash-bidirectional-linear-attention","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffla-org%2Fflash-bidirectional-linear-attention","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffla-org%2Fflash-bidirectional-linear-attention/lists"}