Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/robflynnyh/hydra-linear-attention
Implementation of: Hydra Attention: Efficient Attention with Many Heads (https://arxiv.org/abs/2209.07484)
attention efficient-attention linear-attention machine-learning transformers
Last synced: 9 days ago
- Host: GitHub
- URL: https://github.com/robflynnyh/hydra-linear-attention
- Owner: robflynnyh
- Created: 2023-01-05T21:34:29.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2023-01-08T11:39:40.000Z (almost 2 years ago)
- Last Synced: 2024-08-03T09:09:42.688Z (4 months ago)
- Topics: attention, efficient-attention, linear-attention, machine-learning, transformers
- Language: Python
- Homepage:
- Size: 8.79 KB
- Stars: 10
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# hydra-linear-attention
Implementation of the method described in this paper: https://arxiv.org/pdf/2209.07484.pdf. The code is mostly taken from the appendix of the paper and is pretty simple.
- Basically it's linear attention with the number of heads equal to the feature dimension. They use L2 normalisation as the kernel function rather than softmax, which is what allows the "head" dimension to be scaled up this far and makes it fast (see the sketch below).
- I'm not sure it's accurate to describe this as similar to regular attention; it seems closer to something like squeeze-and-excitation layers.
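
For reference, here is a minimal sketch of the idea described above, not the repo's actual code: q and k are L2-normalised, the element-wise product of k and v is summed over tokens into a single global feature vector, and each token's q gates that vector. The function name and the `(batch, tokens, features)` layout are my assumptions; the paper's appendix and this repo may organise things slightly differently.

```python
import torch
import torch.nn.functional as F

def hydra_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """Sketch of Hydra attention: linear attention with as many heads as features.

    q, k, v: (batch, tokens, features). Each feature acts as its own head,
    and L2 normalisation plays the role of the softmax kernel.
    """
    q = F.normalize(q, dim=-1)             # L2-normalise along the feature dim
    k = F.normalize(k, dim=-1)
    kv = (k * v).sum(dim=1, keepdim=True)  # global summary over tokens: (batch, 1, features)
    return q * kv                          # gate each token's features: (batch, tokens, features)

# Example usage: cost is linear in the number of tokens
x_q = torch.randn(2, 16, 64)
x_k = torch.randn(2, 16, 64)
x_v = torch.randn(2, 16, 64)
out = hydra_attention(x_q, x_k, x_v)       # shape (2, 16, 64)
```

Note there is no tokens-by-tokens attention matrix anywhere: the only interaction between tokens is the single summed `kv` vector, which is why the comparison to squeeze-and-excitation-style global gating feels apt.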