https://github.com/collabora/sigmareparam-pytorch
An unofficial PyTorch implementation of σReparam from the paper "Stabilizing Transformer Training by Preventing Attention Entropy Collapse"
- Host: GitHub
- URL: https://github.com/collabora/sigmareparam-pytorch
- Owner: collabora
- License: MIT
- Created: 2023-03-15T14:47:01.000Z (about 3 years ago)
- Default Branch: main
- Last Pushed: 2023-03-15T14:50:43.000Z (about 3 years ago)
- Last Synced: 2024-12-31T21:22:19.957Z (over 1 year ago)
- Homepage: https://collabora.github.io/sigmareparam-pytorch
- Size: 1.95 KB
- Stars: 4
- Watchers: 7
- Forks: 1
- Open Issues: 0