Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/johnma2006/mamba-minimal
Simple, minimal implementation of the Mamba SSM in one file of PyTorch.
- Host: GitHub
- URL: https://github.com/johnma2006/mamba-minimal
- Owner: johnma2006
- License: apache-2.0
- Created: 2023-12-20T10:39:47.000Z (about 1 year ago)
- Default Branch: master
- Last Pushed: 2024-03-08T04:13:40.000Z (10 months ago)
- Last Synced: 2025-01-03T18:05:50.995Z (9 days ago)
- Language: Python
- Homepage:
- Size: 47.9 KB
- Stars: 2,680
- Watchers: 24
- Forks: 197
- Open Issues: 19
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- Awesome-Mamba - Mamba-minimal-pytorch
- Awesome-state-space-models - Mamba-minimal-in-PyTorch
- AiTreasureBox - johnma2006/mamba-minimal - Simple, minimal implementation of the Mamba SSM in one file of PyTorch. (Repos)
- StarryDivineSky - johnma2006/mamba-minimal - mamba-minimal is a Mamba model implemented in PyTorch. It provides a simplified, single-file Mamba implementation whose numerical output matches the official implementation. The project prioritizes readability: the code is annotated, but it omits the speed optimizations and parameter initialization of the official implementation, with the goal of helping users understand how the Mamba model works. Users can look at demo.ipynb for example code and use the project for tasks such as text generation. The project is based on the paper "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" by Albert Gu and Tri Dao, and it references the official implementation. (A01_Text Generation_Text Dialogue / Large language dialogue models and data)
README
## mamba-minimal
Simple, minimal implementation of Mamba in one file of PyTorch.
Featuring:
* Equivalent numerical output to the official implementation for both the forward and backward pass
* Simplified, readable, annotated code

Does NOT include:
* Speed. The official implementation is heavily optimized, and these optimizations are core contributions of the Mamba paper. I kept most of the implementation simple for readability; see the sketch after this list for what that tradeoff looks like.
* Proper parameter initialization (though this could be added without sacrificing readability)
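To make the speed tradeoff concrete: the selective scan at the heart of Mamba can be written as a plain recurrence over time steps, which is easy to read but serializes the sequence dimension, and it is exactly this loop that the official hardware-aware CUDA scan parallelizes. The sketch below is illustrative, with shapes assumed for the example; consult `selective_scan` in model.py for the repo's actual code.

```python
import torch

def sequential_scan(u, delta, A, B, C, D):
    """Naive sequential SSM scan: readable but slow.
    Assumed shapes (for illustration):
      u: (b, l, d) input      delta: (b, l, d) per-step discretization
      A: (d, n)    B: (b, l, n)    C: (b, l, n)    D: (d,)
    Returns y: (b, l, d).
    """
    b, l, d = u.shape
    n = A.shape[1]
    # Zero-order-hold discretization: A_bar = exp(delta * A), B_bar*u ~= delta * B * u
    deltaA = torch.exp(delta.unsqueeze(-1) * A)                        # (b, l, d, n)
    deltaB_u = delta.unsqueeze(-1) * B.unsqueeze(2) * u.unsqueeze(-1)  # (b, l, d, n)

    x = torch.zeros(b, d, n, device=u.device)
    ys = []
    for t in range(l):  # the sequential bottleneck the official scan removes
        x = deltaA[:, t] * x + deltaB_u[:, t]        # state update
        y = (x * C[:, t].unsqueeze(1)).sum(-1)       # project state to output, (b, d)
        ys.append(y)
    return torch.stack(ys, dim=1) + u * D            # (b, l, d) with skip connection
```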
## Demo
See [demo.ipynb](demo.ipynb) for examples of prompt completions.
```python
from model import Mamba
from transformers import AutoTokenizer

model = Mamba.from_pretrained('state-spaces/mamba-370m')
tokenizer = AutoTokenizer.from_pretrained('EleutherAI/gpt-neox-20b')

# generate() is defined in demo.ipynb
generate(model, tokenizer, 'Mamba is the')
```
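Note that `generate` lives in [demo.ipynb](demo.ipynb), not in model.py. A minimal sketch of what such a sampling helper might look like, assuming the model returns logits of shape `(batch, seq_len, vocab_size)` as this repo's `Mamba` does; the signature, defaults, and top-k scheme here are illustrative, not the notebook's exact code:

```python
import torch
import torch.nn.functional as F

def generate(model, tokenizer, prompt, n_tokens_to_gen=50, temperature=1.0, top_k=40):
    # Hypothetical autoregressive sampling loop; the real helper is in demo.ipynb.
    model.eval()
    input_ids = tokenizer(prompt, return_tensors='pt').input_ids
    for _ in range(n_tokens_to_gen):
        with torch.no_grad():
            logits = model(input_ids)[:, -1]  # logits at the last position
        logits = logits / temperature
        if top_k is not None:
            # Mask out everything below the k-th best logit
            kth_best = torch.topk(logits, top_k).values[:, -1, None]
            logits = logits.masked_fill(logits < kth_best, float('-inf'))
        probs = F.softmax(logits, dim=-1)
        next_token = torch.multinomial(probs, num_samples=1)
        input_ids = torch.cat([input_ids, next_token], dim=1)
    return tokenizer.decode(input_ids[0])
```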
> Mamba is the world's longest venomous snake with an estimated length of over 150 m. With such a large size and a venomous bite, Mamba kills by stabbing the victim (which is more painful and less effective than a single stab of the bite)

150 meters... 🫢 scary!
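The numerical-equivalence claim in the feature list can be spot-checked by running both implementations over the same pretrained weights and inputs. A hedged sketch under assumed shapes and tolerances; loading the official model is left as a comment since its API and CUDA requirements live outside this repo:

```python
import torch
from model import Mamba

torch.manual_seed(0)
minimal = Mamba.from_pretrained('state-spaces/mamba-370m')
input_ids = torch.randint(0, 50000, (1, 16))  # ids assumed to lie within the vocab

with torch.no_grad():
    logits_minimal = minimal(input_ids)  # (1, 16, vocab_size)

# logits_official = ... run https://github.com/state-spaces/mamba on the same
# weights and input_ids, then compare:
# assert torch.allclose(logits_minimal, logits_official, atol=1e-4)
```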
## References
The Mamba architecture was introduced in [Mamba: Linear-Time Sequence Modeling with Selective State Spaces](https://arxiv.org/abs/2312.00752) by [Albert Gu](https://twitter.com/_albertgu?lang=en) and [Tri Dao](https://twitter.com/tri_dao?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Eauthor).
The official implementation is here: https://github.com/state-spaces/mamba/tree/main