{"id":21529591,"url":"https://github.com/sooftware/luna-transformer","last_synced_at":"2025-04-09T23:51:31.126Z","repository":{"id":41088977,"uuid":"390810322","full_name":"sooftware/luna-transformer","owner":"sooftware","description":"A PyTorch Implementation of the Luna: Linear Unified Nested Attention","archived":false,"fork":false,"pushed_at":"2021-07-29T18:15:00.000Z","size":24,"stargazers_count":41,"open_issues_count":0,"forks_count":2,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-09T23:51:26.671Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sooftware.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-07-29T18:01:35.000Z","updated_at":"2024-11-10T08:46:52.000Z","dependencies_parsed_at":"2022-07-30T20:18:17.262Z","dependency_job_id":null,"html_url":"https://github.com/sooftware/luna-transformer","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sooftware%2Fluna-transformer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sooftware%2Fluna-transformer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sooftware%2Fluna-transformer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sooftware%2Fluna-transformer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sooftware","download_url":"https://codeload.github.com/sooftware/luna-transformer/tar.gz/refs/heads/main","host":{"n
ame":"GitHub","url":"https://github.com","kind":"github","repositories_count":248131455,"owners_count":21052819,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-24T01:58:16.412Z","updated_at":"2025-04-09T23:51:31.091Z","avatar_url":"https://github.com/sooftware.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n\u003cp  align=\"center\"\u003e\u003cimg src=\"https://user-images.githubusercontent.com/42150335/127541215-931f2271-5c17-4672-a328-c8fafc4a8da9.png\" height=100\u003e\n  \n\n\u003cdiv align=\"center\"\u003e\n\n**Unofficial PyTorch implementation of [Luna: Linear Unified Nested Attention](https://arxiv.org/abs/2106.01540.pdf)**\n\n  \n\u003c/div\u003e\n  \n***\n  \n  \nThe quadratic computational and memory complexities of the Transformer’s attention mechanism have limited its scalability for modeling long sequences. In\nthis paper, we propose Luna, a linear unified nested attention mechanism that\napproximates softmax attention with two nested linear attention functions, yielding\nonly linear (as opposed to quadratic) time and space complexity. As compared to\na more traditional attention mechanism, Luna introduces an additional sequence\nwith a fixed length as input and an additional corresponding output, which allows\nLuna to perform attention operation linearly, while also storing adequate contextual\ninformation. 
We perform extensive evaluations on three benchmarks of sequence\nmodeling tasks: long-context sequence modeling, neural machine translation and\nmasked language modeling for large-scale pretraining. Competitive or even better\nexperimental results demonstrate both the effectiveness and efficiency of Luna\ncompared to a variety of strong baseline methods including the full-rank attention\nand other efficient sparse and dense attention methods.\n\n![image](https://user-images.githubusercontent.com/42150335/127543497-0b4a5513-4ac6-48c7-9595-d38c880ad8ed.png)\n\n## Installation\nThis project recommends Python 3.7 or higher.\nWe recommend creating a new virtual environment for this project (using virtualenv or conda).\n  \n### Prerequisites\n* NumPy: `pip install numpy` (Refer [here](https://github.com/numpy/numpy) if you have problems installing NumPy).\n* PyTorch: Refer to the [PyTorch website](http://pytorch.org/) to install the version that matches your environment.  \n  \n### Install from source\nCurrently we only support installation from source code using setuptools. 
Check out the source code and run the\nfollowing command:  \n  \n```\npip install -e .\n```\n\n## Usage\n\n```python\nimport torch\nfrom luna_transformer import LunaTransformerEncoder\n\n# Batch of 3 token-id sequences of length 9 (every id must be below vocab_size)\nDUMMY_INPUTS = torch.LongTensor([\n    [2, 3, 3, 3, 3, 3, 2, 2, 0],\n    [2, 3, 3, 3, 3, 3, 2, 3, 2],\n    [2, 3, 3, 3, 3, 3, 2, 2, 0],\n])\n# Length of each sequence in the batch\nDUMMY_INPUT_LENGTHS = torch.LongTensor([9, 8, 7])\n\nmodel = LunaTransformerEncoder(vocab_size=4, d_model=512, num_layers=6,\n                               num_attention_heads=8, project_embedding_length=32,\n                               dropout_p=0.1, max_length=1024)\noutputs = model(DUMMY_INPUTS, DUMMY_INPUT_LENGTHS)\n```\n\n## Troubleshooting and Contributing\nIf you have any questions, bug reports, or feature requests, please [open an issue](https://github.com/sooftware/conformer/issues) on GitHub or   \ncontact sh951011@gmail.com.\n  \nI appreciate any kind of feedback or contribution.  Feel free to proceed with small issues such as bug fixes and documentation improvements.  For major contributions and new features, please discuss them with the collaborators in the corresponding issues.  \n  \n## Code Style\nI follow [PEP-8](https://www.python.org/dev/peps/pep-0008/) for code style. The style of docstrings is especially important for generating documentation.\n\n## Author\n  \n* Soohwan Kim [@sooftware](https://github.com/sooftware)\n* Contact: sh951011@gmail.com\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsooftware%2Fluna-transformer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsooftware%2Fluna-transformer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsooftware%2Fluna-transformer/lists"}