https://github.com/kyegomez/hedgehog
Implementation of the model "Hedgehog" from the paper: "The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry"
ai attention attention-is-all-you-need attention-mechanisms feedforward ffns ml mlps multi-modal neural-nets open-source opensource-ai softmax
- Host: GitHub
- URL: https://github.com/kyegomez/hedgehog
- Owner: kyegomez
- License: mit
- Created: 2024-02-09T02:15:22.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-03-11T02:20:29.000Z (over 2 years ago)
- Last Synced: 2025-07-13T01:21:46.263Z (11 months ago)
- Topics: ai, attention, attention-is-all-you-need, attention-mechanisms, feedforward, ffns, ml, mlps, multi-modal, neural-nets, open-source, opensource-ai, softmax
- Language: Python
- Homepage: https://discord.gg/GYbXvDGevY
- Size: 2.16 MB
- Stars: 14
- Watchers: 2
- Forks: 0
- Open Issues: 3
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
README
# HedgeHog
Implementation of the model "Hedgehog" from the paper: "The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry". The paper uses trainable MLPs as feature maps so that linear attention mimics the softmax attention of a transformer. It supposedly hits SOTA on WikiText for subquadratic models. I've also been thinking about replacing softmax with MLPs. This past month we saw dozens of papers on Mamba and convolutions, but MLPs might have undiscovered powers.
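
To make the idea concrete, here is a minimal, dependency-free sketch of linear attention with a learnable feature map. It is not this repository's API: the names `feature_map`, `linear_attention`, `quadratic_attention`, `Wq`, and `Wk` are hypothetical, and the feature map `[exp(Wx), exp(-Wx)]` is an assumption chosen for illustration (a single linear layer with no bias, and random weights instead of trained ones). The sketch only shows why the computation is subquadratic: the key/value summaries are built once and reused by every query.

```python
import math
import random


def feature_map(x, W):
    """Map a vector x to positive features [exp(Wx), exp(-Wx)].

    W is a list of rows standing in for learnable weights (hypothetical;
    random here, trained to mimic softmax attention in the real method).
    """
    proj = [sum(w_i * x_i for w_i, x_i in zip(row, x)) for row in W]
    return [math.exp(p) for p in proj] + [math.exp(-p) for p in proj]


def linear_attention(Q, K, V, Wq, Wk):
    """Non-causal linear attention, linear in the sequence length."""
    phi_q = [feature_map(q, Wq) for q in Q]
    phi_k = [feature_map(k, Wk) for k in K]
    f, d_v = len(phi_k[0]), len(V[0])

    # Summaries shared by all queries:
    #   S = sum_j phi(k_j) v_j^T   (f x d_v),   z = sum_j phi(k_j)   (f)
    S = [[0.0] * d_v for _ in range(f)]
    z = [0.0] * f
    for pk, v in zip(phi_k, V):
        for a in range(f):
            z[a] += pk[a]
            for b in range(d_v):
                S[a][b] += pk[a] * v[b]

    out = []
    for pq in phi_q:
        denom = sum(pq[a] * z[a] for a in range(f))
        out.append(
            [sum(pq[a] * S[a][b] for a in range(f)) / denom for b in range(d_v)]
        )
    return out


def quadratic_attention(Q, K, V, Wq, Wk):
    """Reference version: same weights, but with an explicit n x n matrix."""
    phi_q = [feature_map(q, Wq) for q in Q]
    phi_k = [feature_map(k, Wk) for k in K]
    out = []
    for pq in phi_q:
        scores = [sum(a * b for a, b in zip(pq, pk)) for pk in phi_k]
        total = sum(scores)
        weights = [s / total for s in scores]  # each row sums to 1
        out.append(
            [sum(w * v[b] for w, v in zip(weights, V)) for b in range(len(V[0]))]
        )
    return out


rng = random.Random(0)


def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


n, d = 6, 4  # sequence length and head dimension (arbitrary toy sizes)
Q, K, V = rand_matrix(n, d), rand_matrix(n, d), rand_matrix(n, d)
Wq, Wk = rand_matrix(d, d), rand_matrix(d, d)

fast = linear_attention(Q, K, V, Wq, Wk)
slow = quadratic_attention(Q, K, V, Wq, Wk)
max_diff = max(abs(a - b) for r1, r2 in zip(fast, slow) for a, b in zip(r1, r2))
print("max difference between linear and quadratic forms:", max_diff)
```

Both functions compute the same output up to floating-point error; the difference is that `linear_attention` never materializes the n x n attention matrix. In the paper's setup the feature-map weights would be trained so that these attention weights match softmax attention, which this sketch does not attempt.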
# License
MIT