{"id":19380902,"url":"https://github.com/openmachine-ai/transformer-tricks","last_synced_at":"2025-05-16T14:07:58.759Z","repository":{"id":227170644,"uuid":"770650946","full_name":"OpenMachine-ai/transformer-tricks","owner":"OpenMachine-ai","description":"A collection of tricks and tools to speed up transformer models","archived":false,"fork":false,"pushed_at":"2025-04-02T00:45:13.000Z","size":10667,"stargazers_count":159,"open_issues_count":1,"forks_count":9,"subscribers_count":6,"default_branch":"main","last_synced_at":"2025-05-12T23:11:20.989Z","etag":null,"topics":["ai","arxiv","arxiv-papers","llm","llm-inference","llmops","machine-learning","python","transformer","transformer-models","transformer-pytorch"],"latest_commit_sha":null,"homepage":"","language":"TeX","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/OpenMachine-ai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-03-11T23:08:59.000Z","updated_at":"2025-05-12T07:24:19.000Z","dependencies_parsed_at":"2024-04-21T20:53:55.786Z","dependency_job_id":"cb261644-0bbb-42da-8245-bce67f7b9d86","html_url":"https://github.com/OpenMachine-ai/transformer-tricks","commit_stats":null,"previous_names":["openmachine-ai/transformer-tricks"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenMachine-ai%2Ftransformer-tricks","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenMachine-ai%2Ftransformer-tricks/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hos
ts/GitHub/repositories/OpenMachine-ai%2Ftransformer-tricks/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenMachine-ai%2Ftransformer-tricks/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/OpenMachine-ai","download_url":"https://codeload.github.com/OpenMachine-ai/transformer-tricks/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254544146,"owners_count":22088807,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","arxiv","arxiv-papers","llm","llm-inference","llmops","machine-learning","python","transformer","transformer-models","transformer-pytorch"],"created_at":"2024-11-10T09:15:15.332Z","updated_at":"2025-05-16T14:07:53.732Z","avatar_url":"https://github.com/OpenMachine-ai.png","language":"TeX","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003ch1 align=\"center\"\u003e Transformer Tricks\n\n  \u003ca href=\"https://transformertricks.substack.com\"\u003e\u003cimg src=\"https://img.shields.io/badge/Substack-FF6719?logo=substack\u0026logoColor=fff\"\u003e\u003c/a\u003e\n  [![PyPI](https://img.shields.io/pypi/v/transformer-tricks)](https://pypi.org/project/transformer-tricks)\n  \u003ca href=\"https://pepy.tech/projects/transformer-tricks\"\u003e\u003cimg src=\"https://static.pepy.tech/badge/transformer-tricks\" alt=\"PyPI Downloads\"\u003e\u003c/a\u003e\n\u003c/h1\u003e\n\nA collection of tricks to simplify and speed up transformer models:\n- Slim attention: [[paper]](https://arxiv.org/abs/2503.05840), 
[[video]](https://youtu.be/uVtk3B6YO4Y), [[podcast]](https://notebooklm.google.com/notebook/ac47a53c-866b-4271-ab79-bc48d1b41722/audio), [[notebook]](https://colab.research.google.com/github/OpenMachine-ai/transformer-tricks/blob/main/notebooks/slimAttn_paper.ipynb), [[code-readme]](doc/slimAttn.md), [[reddit]](https://www.reddit.com/r/LocalLLaMA/comments/1j9wkc2/slim_attention_cut_your_context_memory_in_half)\n- Matrix-shrink \\[work in progress\\]: [[paper]](https://docs.google.com/viewer?url=https://raw.githubusercontent.com/OpenMachine-ai/transformer-tricks/refs/heads/main/doc/matShrink.pdf)\n- Flash normalization: [[paper]](https://arxiv.org/abs/2407.09577), [[podcast]](https://notebooklm.google.com/notebook/0877599c-720c-49b5-b451-8a41af592dd1/audio), [[notebook]](https://colab.research.google.com/github/OpenMachine-ai/transformer-tricks/blob/main/notebooks/flashNorm_paper.ipynb), [[code-readme]](doc/flashNorm.md)\n- Precomputing the first layer: [[paper]](https://arxiv.org/abs/2402.13388), [[podcast]](https://notebooklm.google.com/notebook/7794278e-de6a-40fc-ab1c-3240a40e55d5/audio)\n- Removing weights from skipless transformers: [[paper]](https://arxiv.org/abs/2404.12362), [[podcast]](https://notebooklm.google.com/notebook/0875eef7-094e-4c30-bc13-90a1a074c949/audio), [[notebook]](https://colab.research.google.com/github/OpenMachine-ai/transformer-tricks/blob/main/notebooks/removeWeights_paper.ipynb)\n\nMany of these tricks follow a recent trend of removing parts from neural networks such as [RMSNorm’s](https://arxiv.org/abs/1910.07467) removal of mean centering from LayerNorm, [PaLM's](https://arxiv.org/abs/2204.02311) removal of bias-parameters, [decoder-only transformer's](https://arxiv.org/abs/1801.10198) removal of the encoder stack, and of course [transformer’s](https://arxiv.org/abs/1706.03762) revolutionary removal of recurrent layers. \n\nFor example, our FlashNorm removes the weights from RMSNorm and merges them with the next linear layer. 
And slim attention removes the entire V-cache from the context memory for MHA transformers.\n\n---\n\n## Installation\n\nInstall the transformer tricks package:\n```bash\npip install transformer-tricks\n```\n\nAlternatively, to run from the latest repo:\n```bash\ngit clone https://github.com/OpenMachine-ai/transformer-tricks.git\ncd transformer-tricks\npip3 install --quiet -r requirements.txt\n```\n\n---\n\n## Documentation\nFollow the links below for documentation of the Python code in this directory:\n- [Slim attention](doc/slimAttn.md)\n- [Flash normalization](doc/flashNorm.md)\n\n---\n\n## Notebooks\nThe papers are accompanied by the following Jupyter notebooks:\n- Slim attention: \u003ca href=\"https://colab.research.google.com/github/OpenMachine-ai/transformer-tricks/blob/main/notebooks/slimAttn_paper.ipynb\"\u003e\u003cimg src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Colab\" height=\"20\"\u003e\u003c/a\u003e\n- Flash normalization: \u003ca href=\"https://colab.research.google.com/github/OpenMachine-ai/transformer-tricks/blob/main/notebooks/flashNorm_example.ipynb\"\u003e\u003cimg src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Colab\" height=\"20\"\u003e\u003c/a\u003e \u003ca href=\"https://colab.research.google.com/github/OpenMachine-ai/transformer-tricks/blob/main/notebooks/flashNorm_paper.ipynb\"\u003e\u003cimg src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Colab\" height=\"20\"\u003e\u003c/a\u003e\n- Removing weights from skipless transformers: \u003ca href=\"https://colab.research.google.com/github/OpenMachine-ai/transformer-tricks/blob/main/notebooks/removeWeights_paper.ipynb\"\u003e\u003cimg src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Colab\" height=\"20\"\u003e\u003c/a\u003e\n\n---\n## Newsletter\nPlease subscribe to our [newsletter](https://transformertricks.substack.com) on Substack to get the latest news about this project. 
We will never send you more than one email per month.\n\n[![Substack](https://img.shields.io/badge/Substack-FF6719?logo=substack\u0026logoColor=fff)](https://transformertricks.substack.com)\n\n---\n\n## Contributing\nWe pay cash for high-impact contributions. Please check out [CONTRIBUTING](doc/CONTRIBUTING.md) for how to get involved.\n\n---\n\n## Sponsors\nThe Transformer Tricks project is currently sponsored by [OpenMachine](https://openmachine.ai). We'd love to hear from you if you'd like to join us in supporting this project.\n\n---\n\n### Please give us a ⭐ if you like this repo, and check out [TinyFive](https://github.com/OpenMachine-ai/tinyfive)\n\n---\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopenmachine-ai%2Ftransformer-tricks","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fopenmachine-ai%2Ftransformer-tricks","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopenmachine-ai%2Ftransformer-tricks/lists"}