Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/shjwudp/mamba-jax
- Host: GitHub
- URL: https://github.com/shjwudp/mamba-jax
- Owner: shjwudp
- Created: 2023-12-10T16:30:53.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-03-18T06:03:08.000Z (10 months ago)
- Last Synced: 2024-11-24T00:36:20.419Z (about 1 month ago)
- Language: Python
- Homepage:
- Size: 171 KB
- Stars: 5
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Mamba-JAX
Welcome to the Mamba-JAX repository! This is a passion project that implements a Mamba training pipeline on top of the JAX framework.
At present, I can train a small-scale Mamba model on my MacBook. Below is the loss convergence curve from training on Shakespeare's works.
![Loss convergence curve trained on Shakespeare](assets/shakespeare-loss-curve.png)
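For readers new to Mamba, the recurrence at the heart of the model (the selective state space scan from the paper cited at the end of this README) can be sketched in a few lines of JAX. This is a minimal sequential reference under stated assumptions, not necessarily how this repository implements it: the function name `selective_scan`, the argument shapes, and the toy sizes are all hypothetical and chosen for illustration only.

```python
import jax
import jax.numpy as jnp


def selective_scan(x, delta, A, B, C):
    """Sequential sketch of the selective SSM recurrence (illustrative only).

    x:     (L, D) input sequence
    delta: (L, D) positive step sizes (input-dependent in Mamba)
    A:     (D, N) per-channel diagonal state matrix
    B, C:  (L, N) input-dependent projections
    Returns y with shape (L, D).
    """
    # Discretize: A_bar = exp(delta * A), and B_bar * x is approximated by delta * B * x.
    A_bar = jnp.exp(delta[:, :, None] * A[None, :, :])       # (L, D, N)
    Bx = delta[:, :, None] * B[:, None, :] * x[:, :, None]   # (L, D, N)

    def step(h, inputs):
        a_t, bx_t, c_t = inputs
        h = a_t * h + bx_t   # state update, shape (D, N)
        y_t = h @ c_t        # readout, shape (D,)
        return h, y_t

    h0 = jnp.zeros(A.shape, dtype=x.dtype)
    _, y = jax.lax.scan(step, h0, (A_bar, Bx, C))
    return y


# Toy usage with random inputs (sizes are arbitrary assumptions).
L, D, N = 16, 4, 8
k1, k2, k3, k4, k5 = jax.random.split(jax.random.PRNGKey(0), 5)
x = jax.random.normal(k1, (L, D))
delta = jax.nn.softplus(jax.random.normal(k2, (L, D)))  # keep step sizes positive
A = -jnp.exp(jax.random.normal(k3, (D, N)))             # negative entries keep the state bounded
B = jax.random.normal(k4, (L, N))
C = jax.random.normal(k5, (L, N))
y = selective_scan(x, delta, A, B, C)
print(y.shape)
```

`jax.lax.scan` keeps the loop inside a single compiled computation, which is convenient on CPU. It still processes the sequence one step at a time, so it is a readable baseline rather than a hardware-aware kernel.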
# TODOs
- [x] Implement gradient accumulation.
- [ ] Implement activation checkpointing to reduce memory use.
- [ ] Add GPU-aware selective scan.

# Citation
```
@article{mamba,
title={Mamba: Linear-Time Sequence Modeling with Selective State Spaces},
author={Gu, Albert and Dao, Tri},
journal={arXiv preprint arXiv:2312.00752},
year={2023}
}
```