{"id":15600963,"url":"https://github.com/lucidrains/linformer","last_synced_at":"2025-04-04T14:02:48.281Z","repository":{"id":50666061,"uuid":"275479127","full_name":"lucidrains/linformer","owner":"lucidrains","description":"Implementation of Linformer for Pytorch","archived":false,"fork":false,"pushed_at":"2024-01-05T20:39:57.000Z","size":27,"stargazers_count":276,"open_issues_count":6,"forks_count":26,"subscribers_count":8,"default_branch":"master","last_synced_at":"2025-03-28T13:04:03.792Z","etag":null,"topics":["artificial-intelligence","attention-mechanism","deep-learning","transformer"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lucidrains.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2020-06-28T00:53:05.000Z","updated_at":"2025-03-27T06:47:55.000Z","dependencies_parsed_at":"2024-01-16T15:40:49.277Z","dependency_job_id":"fa1f8d0f-f77a-4764-9ee2-db39a2ade638","html_url":"https://github.com/lucidrains/linformer","commit_stats":{"total_commits":24,"total_committers":2,"mean_commits":12.0,"dds":0.08333333333333337,"last_synced_commit":"970e1c3f12fbdc494f60a2bed5e807563850f585"},"previous_names":[],"tags_count":10,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2Flinformer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2Flinformer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2Flinformer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/h
osts/GitHub/repositories/lucidrains%2Flinformer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lucidrains","download_url":"https://codeload.github.com/lucidrains/linformer/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247182933,"owners_count":20897486,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","attention-mechanism","deep-learning","transformer"],"created_at":"2024-10-03T02:10:28.184Z","updated_at":"2025-04-04T14:02:48.255Z","avatar_url":"https://github.com/lucidrains.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"## Linformer for PyTorch\n\nAn implementation of Linformer in PyTorch. Linformer comes with two deficiencies: (1) it does not work for the auto-regressive case, and (2) it assumes a fixed sequence length. 
However, if benchmarks show it to perform well enough, it will be added to \u003ca href=\"https://github.com/lucidrains/linear-attention-transformer\"\u003ethis repository\u003c/a\u003e as a self-attention layer to be used in the encoder.\n\nLinformer has been \u003ca href=\"https://ai.facebook.com/blog/how-facebook-uses-super-efficient-ai-models-to-detect-hate-speech/\"\u003eput into production\u003c/a\u003e by Facebook!\n\n## Install\n\n```bash\n$ pip install linformer\n```\n\n## Usage\n\nLinformer language model\n\n```python\nimport torch\nfrom linformer import LinformerLM\n\nmodel = LinformerLM(\n    num_tokens = 20000,\n    dim = 512,\n    seq_len = 4096,\n    depth = 12,\n    heads = 8,\n    dim_head = 128,        # be able to set the dimension of each head in multi-head attention\n    k = 256,               # this is the k that the key/values are projected to along the sequence dimension\n    one_kv_head = True,    # share one key/value head across all heads\n    share_kv = False,      # share the same projection for keys and values\n    reversible = True      # make network reversible, like Reformer\n)\n\nx = torch.randint(0, 20000, (1, 4096))\nmodel(x) # (1, 4096, 20000)\n```\n\nLinformer\n\n```python\nimport torch\nfrom linformer import Linformer\n\nmodel = Linformer(\n    dim = 512,\n    seq_len = 4096,\n    depth = 12,\n    heads = 8,\n    k = 256,\n    one_kv_head = True,\n    share_kv = True\n)\n\nx = torch.randn(1, 4096, 512)\nmodel(x) # (1, 4096, 512)\n```\n\nSingle Self-Attention layer\n\n```python\nimport torch\nfrom linformer import LinformerSelfAttention\n\nattn = LinformerSelfAttention(\n    dim = 512,\n    seq_len = 4096,\n    heads = 8,\n    k = 256,\n    one_kv_head = True,\n    share_kv = True\n)\n\nx = torch.randn(1, 4096, 512)\nattn(x) # (1, 4096, 512)\n```\n\nSelf-Attention layer above receiving contextual keys. 
The sequence length is validated against the length of the contextual keys rather than the length of the source sequence.\n\n```python\nimport torch\nfrom linformer import LinformerSelfAttention\n\nattn = LinformerSelfAttention(\n    dim = 512,\n    seq_len = 8192,\n    heads = 8,\n    k = 256,\n    one_kv_head = True,\n    share_kv = True\n)\n\nx = torch.randn(1, 2048, 512)       # source sequence\ncontext = torch.randn(1, 8192, 512) # contextual keys\nattn(x, context) # (1, 2048, 512)\n```\n\n## Citations\n\n```bibtex\n@misc{wang2020linformer,\n    title={Linformer: Self-Attention with Linear Complexity},\n    author={Sinong Wang and Belinda Z. Li and Madian Khabsa and Han Fang and Hao Ma},\n    year={2020},\n    eprint={2006.04768},\n    archivePrefix={arXiv},\n    primaryClass={cs.LG}\n}\n```\n\n```bibtex\n@inproceedings{kitaev2020reformer,\n    title       = {Reformer: The Efficient Transformer},\n    author      = {Nikita Kitaev and Lukasz Kaiser and Anselm Levskaya},\n    booktitle   = {International Conference on Learning Representations},\n    year        = {2020},\n    url         = {https://openreview.net/forum?id=rkgNKkHtvB}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flucidrains%2Flinformer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flucidrains%2Flinformer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flucidrains%2Flinformer/lists"}