{"id":14970719,"url":"https://github.com/rishit-dagli/nystromformer","last_synced_at":"2025-10-15T00:24:50.737Z","repository":{"id":57160143,"uuid":"526596866","full_name":"Rishit-dagli/Nystromformer","owner":"Rishit-dagli","description":"An implementation of the Nyströmformer, using Nystrom method to approximate standard self attention","archived":false,"fork":false,"pushed_at":"2022-08-21T06:51:39.000Z","size":184,"stargazers_count":55,"open_issues_count":0,"forks_count":4,"subscribers_count":5,"default_branch":"main","last_synced_at":"2024-10-11T04:41:15.712Z","etag":null,"topics":["artificial-intelligence","attention-mechanism","deep-learning","keras","machine-learning","nystrom","nystromformer","tensorflow","transformer"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Rishit-dagli.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":null,"support":null}},"created_at":"2022-08-19T12:22:58.000Z","updated_at":"2024-07-03T10:08:45.000Z","dependencies_parsed_at":"2022-08-24T13:10:36.451Z","dependency_job_id":null,"html_url":"https://github.com/Rishit-dagli/Nystromformer","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Rishit-dagli%2FNystromformer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Rishit-dagli%2FNystromformer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Rishit-dagli%2FNystromformer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/rep
ositories/Rishit-dagli%2FNystromformer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Rishit-dagli","download_url":"https://codeload.github.com/Rishit-dagli/Nystromformer/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":219862887,"owners_count":16555951,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","attention-mechanism","deep-learning","keras","machine-learning","nystrom","nystromformer","tensorflow","transformer"],"created_at":"2024-09-24T13:44:02.297Z","updated_at":"2025-10-15T00:24:50.643Z","avatar_url":"https://github.com/Rishit-dagli.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Nystromformer [![Twitter](https://img.shields.io/twitter/url?style=social\u0026url=https%3A%2F%2Fgithub.com%2FRishit-dagli%2FNystromformer)](https://twitter.com/intent/tweet?text=Wow:\u0026url=https%3A%2F%2Fgithub.com%2FRishit-dagli%2FNystromformer)\n\n[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Rishit-dagli/Nystromformer/blob/main/example/nystromformer-example.ipynb)\n![PyPI](https://img.shields.io/pypi/v/Nystromformer)\n[![Run Tests](https://github.com/Rishit-dagli/Nystromformer/actions/workflows/tests.yml/badge.svg)](https://github.com/Rishit-dagli/Nystromformer/actions/workflows/tests.yml)\n[![Upload Python 
Package](https://github.com/Rishit-dagli/Nystromformer/actions/workflows/python-publish.yml/badge.svg)](https://github.com/Rishit-dagli/Nystromformer/actions/workflows/python-publish.yml)\n[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)\n[![codecov](https://codecov.io/gh/Rishit-dagli/Nystromformer/branch/main/graph/badge.svg?token=CTXN1T8P2Q)](https://codecov.io/gh/Rishit-dagli/Nystromformer)\n\n\n![GitHub License](https://img.shields.io/github/license/Rishit-dagli/Nystromformer)\n[![GitHub stars](https://img.shields.io/github/stars/Rishit-dagli/Nystromformer?style=social)](https://github.com/Rishit-dagli/Nystromformer/stargazers)\n[![GitHub followers](https://img.shields.io/github/followers/Rishit-dagli?label=Follow\u0026style=social)](https://github.com/Rishit-dagli)\n[![Twitter Follow](https://img.shields.io/twitter/follow/rishit_dagli?style=social)](https://twitter.com/intent/follow?screen_name=rishit_dagli)\n\nAn implementation of the [Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention](https://arxiv.org/abs/2102.03902) paper by Xiong et al. The self-attention mechanism that encodes the influence or dependence of other tokens on each specific token is a key component of the performance of Transformers. 
This implementation uses the Nyström method to approximate standard self-attention with O(n) complexity, which lets it scale linearly with sequence length.\n\n![](media/nystromformer.png)\n\n## Installation\n\nRun the following to install:\n\n```sh\npip install nystromformer\n```\n\n## Developing nystromformer\n\nTo install `nystromformer`, along with the tools you need to develop and test it, run the following in your virtualenv:\n\n```sh\ngit clone https://github.com/Rishit-dagli/Nystromformer.git\n# or clone your own fork\n\ncd Nystromformer\npip install -e .[dev]\n```\n\nTo run the rank and shape tests, run the following:\n\n```\npytest -v --disable-warnings --cov\n```\n\n## Usage\n\n### Nystrom Attention\n\n```py\nimport tensorflow as tf\nfrom nystromformer import NystromAttention\n\nattn = NystromAttention(\n    dim = 512,\n    dim_head = 64,\n    heads = 8,\n    num_landmarks = 256,    # number of landmarks\n    pinv_iterations = 6,    # number of Moore-Penrose iterations for approximating the pseudoinverse; the paper recommends 6\n    residual = True         # whether to add an extra residual connection with the value; reportedly converges faster when turned on\n)\n\nx = tf.random.normal((1, 16384, 512))\nmask = tf.ones((1, 16384), dtype=tf.bool)\n\nattn(x, mask = mask) # (1, 16384, 512)\n```\n\n### Nystromformer\n\n```py\nimport tensorflow as tf\nfrom nystromformer import Nystromformer\n\nmodel = Nystromformer(\n    dim = 512,\n    dim_head = 64,\n    heads = 8,\n    depth = 6,\n    num_landmarks = 256,\n    pinv_iterations = 6\n)\n\nx = tf.random.normal((1, 16384, 512))\nmask = tf.ones((1, 16384), dtype=tf.bool)\n\nmodel(x, mask = mask) # (1, 16384, 512)\n```\n\n## Want to Contribute 🙋‍♂️?\n\nAwesome! If you want to contribute to this project, you're always welcome! See [Contributing Guidelines](CONTRIBUTING.md). 
You can also look at the [open issues](https://github.com/Rishit-dagli/Nystromformer/issues) for more information about current or upcoming tasks.\n\n## Want to discuss? 💬\n\nHave questions or doubts, or want to share your opinions? You're always welcome to [start a discussion](https://github.com/Rishit-dagli/Nystromformer/discussions).\n\n## Citation\n\n```bibtex\n@misc{xiong2021nystromformer,\n    title   = {Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention},\n    author  = {Yunyang Xiong and Zhanpeng Zeng and Rudrasis Chakraborty and Mingxing Tan and Glenn Fung and Yin Li and Vikas Singh},\n    year    = {2021},\n    eprint  = {2102.03902},\n    archivePrefix = {arXiv},\n    primaryClass = {cs.CL}\n}\n```\n\n[Yannic Kilcher's Video](https://www.youtube.com/watch?v=m-zrcmRd7E4)\n\n[PyTorch Implementation (mlpen)](https://github.com/mlpen/Nystromformer)\n\n[PyTorch Implementation (lucidrains)](https://github.com/lucidrains/nystrom-attention)\n\n## License\n\n```\nCopyright 2020 Rishit Dagli\n\nLicensed under the Apache License, Version 2.0 (the \"License\");\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n    http://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an \"AS IS\" BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frishit-dagli%2Fnystromformer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frishit-dagli%2Fnystromformer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frishit-dagli%2Fnystromformer/lists"}