Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/johnma2006/mamba-minimal
Simple, minimal implementation of the Mamba SSM in one file of PyTorch.
- Host: GitHub
- URL: https://github.com/johnma2006/mamba-minimal
- Owner: johnma2006
- License: apache-2.0
- Created: 2023-12-20T10:39:47.000Z (6 months ago)
- Default Branch: master
- Last Pushed: 2024-01-21T06:13:08.000Z (5 months ago)
- Last Synced: 2024-02-18T06:32:44.042Z (4 months ago)
- Language: Python
- Homepage:
- Size: 47.9 KB
- Stars: 1,759
- Watchers: 22
- Forks: 109
- Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE
Lists
- awesome-stars - johnma2006/mamba-minimal - Simple, minimal implementation of the Mamba SSM in one file of PyTorch. (Python)
- AiTreasureBox - johnma2006/mamba-minimal - Simple, minimal implementation of the Mamba SSM in one file of PyTorch. (Repos)
README
## mamba-minimal
Simple, minimal implementation of Mamba in one file of PyTorch.
Featuring:
* Equivalent numerical output to the official implementation for both the forward and backward pass
* Simplified, readable, annotated code

Does NOT include:
* Speed. The official implementation is heavily optimized, and these optimizations are core contributions of the Mamba paper. I kept most implementations simple for readability.
* Proper parameter initialization (though this could be added without sacrificing readability)

## Demo
See [demo.ipynb](demo.ipynb) for examples of prompt completions.
```python
from model import Mamba
from transformers import AutoTokenizer

model = Mamba.from_pretrained('state-spaces/mamba-370m')
tokenizer = AutoTokenizer.from_pretrained('EleutherAI/gpt-neox-20b')

# `generate` is a helper defined in demo.ipynb
generate(model, tokenizer, 'Mamba is the')
```
> Mamba is the world's longest venomous snake with an estimated length of over 150 m. With such a large size and a venomous bite, Mamba kills by stabbing the victim (which is more painful and less effective than a single stab of the bite)

150 meters... 🫢 scary!
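The `generate` helper used above lives in demo.ipynb rather than in model.py. As an illustration of what such a helper typically does, here is a hypothetical greedy-decoding sketch; the function name, signature, and greedy argmax choice are assumptions for this example, not the repo's actual code (which may sample from the distribution instead):

```python
import torch

def generate(model, tokenizer, prompt, max_new_tokens=50):
    # Hypothetical sketch of an autoregressive decoding loop.
    # Assumes model(input_ids) returns logits of shape (batch, seq_len, vocab_size).
    input_ids = tokenizer(prompt, return_tensors='pt').input_ids
    for _ in range(max_new_tokens):
        logits = model(input_ids)                          # (batch, seq_len, vocab)
        next_id = logits[:, -1].argmax(-1, keepdim=True)   # greedy: pick most likely token
        input_ids = torch.cat([input_ids, next_id], dim=1) # append and re-run the full sequence
    return tokenizer.decode(input_ids[0])
```

Re-running the model on the full prefix at every step is O(L²) overall; the official implementation avoids this by carrying the SSM's recurrent state between steps.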
## References
The Mamba architecture was introduced in [Mamba: Linear-Time Sequence Modeling with Selective State Spaces](https://arxiv.org/abs/2312.00752) by [Albert Gu](https://twitter.com/_albertgu?lang=en) and [Tri Dao](https://twitter.com/tri_dao?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Eauthor).
The official implementation is here: https://github.com/state-spaces/mamba/tree/main
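For intuition about what the optimized kernels compute, the selective SSM recurrence at the heart of Mamba, h_t = Ā_t h_{t-1} + B̄_t x_t, y_t = C_t h_t + D x_t, can be written as a naive sequential scan. The sketch below is illustrative only (shapes and names are assumptions, not this repo's code), and is exactly the kind of readable-but-slow loop the official CUDA kernels replace:

```python
import torch

def selective_scan_sequential(x, delta, A, B, C, D):
    """Naive per-timestep selective scan.

    Shapes (b = batch, l = seq len, d = channels, n = state size):
      x: (b, l, d)   delta: (b, l, d)
      A: (d, n)      B: (b, l, n)   C: (b, l, n)   D: (d,)
    """
    b, l, d = x.shape
    n = A.shape[1]
    # Discretize the continuous parameters: A_bar = exp(delta * A),
    # and fold B_bar and x together (simplified Euler step for B).
    deltaA = torch.exp(delta.unsqueeze(-1) * A)                       # (b, l, d, n)
    deltaBx = delta.unsqueeze(-1) * B.unsqueeze(2) * x.unsqueeze(-1)  # (b, l, d, n)

    h = torch.zeros(b, d, n)
    ys = []
    for t in range(l):
        h = deltaA[:, t] * h + deltaBx[:, t]      # h_t = A_bar_t * h_{t-1} + B_bar_t * x_t
        ys.append((h * C[:, t].unsqueeze(1)).sum(-1))  # y_t = C_t h_t, per channel
    return torch.stack(ys, dim=1) + x * D         # skip connection: y += D * x
```

Because delta, B, and C vary with the timestep t, the recurrence is input-dependent ("selective") and cannot be computed as a plain convolution, which is why the official implementation relies on a hardware-aware parallel scan.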