https://github.com/branebb/mamba
Interactive demo of the Mamba state-space language model with pretrained weights and a Streamlit interface.
mamba-state-space-models python pytorch streamlit
- Host: GitHub
- URL: https://github.com/branebb/mamba
- Owner: branebb
- Created: 2024-09-15T19:34:15.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2026-01-18T16:35:40.000Z (5 months ago)
- Last Synced: 2026-01-19T00:46:12.695Z (5 months ago)
- Topics: mamba-state-space-models, python, pytorch, streamlit
- Language: Python
- Homepage:
- Size: 16.6 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Mamba Architecture
This project implements a **Mamba-inspired language model** using **PyTorch**, along with an interactive **Streamlit-based web application** for real-time text generation. The model architecture is based on the key ideas introduced in the *Mamba* paper, focusing on **linear-time sequence modeling with selective state space models (SSMs)** instead of attention mechanisms.
The application allows users to select pretrained Mamba models and generate text completions from custom prompts through a simple web interface.
---
## Project Overview
Recent advances in sequence modeling have shown that attention mechanisms are not the only viable approach for modeling long-range dependencies. The **Mamba architecture** introduces a novel alternative based on **selective state space models**, enabling **linear-time** and **memory-efficient** sequence processing.
This project provides:
- A **custom PyTorch implementation** of a simplified Mamba-style language model
- Support for loading **official pretrained Mamba weights**
- An **interactive Streamlit demo** for autoregressive text generation
---
## Architecture Overview
The model is a residual stack of **Mamba blocks** placed between a token embedding and an output projection. Its main components are:
- Token embeddings with tied output projection
- Pre-normalization using **RMSNorm**
- Input projection with gating
- Depthwise 1D convolution for local mixing
- **Selective State Space Model (SSM)** with learnable parameters
- Linear-time **selective scan** over the sequence
- Residual connections
This design enables efficient autoregressive generation while avoiding quadratic attention costs.
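The selective scan listed above can be sketched as a plain sequential loop. This is a minimal reference sketch rather than the repository's own code: the function name `selective_scan`, the tensor shapes, and the simplified discretization of `B` (multiplying by `delta` directly) are assumptions made for illustration.

```python
import torch


def selective_scan(x, delta, A, B, C, D):
    """Sequential reference scan for a selective SSM (illustrative sketch).

    Assumed shapes (b = batch, l = length, d = channels, n = state size):
      x, delta: (b, l, d)   A: (d, n)   B, C: (b, l, n)   D: (d,)
    """
    b, l, d = x.shape
    n = A.shape[1]

    # Discretize per time step: A_bar = exp(delta * A).
    # For B we assume the simplified form B_bar * x = delta * B * x.
    dA = torch.exp(delta.unsqueeze(-1) * A)                       # (b, l, d, n)
    dBx = delta.unsqueeze(-1) * B.unsqueeze(2) * x.unsqueeze(-1)  # (b, l, d, n)

    h = torch.zeros(b, d, n, dtype=x.dtype, device=x.device)
    ys = []
    for t in range(l):
        # State update: h_t = A_bar_t * h_{t-1} + B_bar_t * x_t
        h = dA[:, t] * h + dBx[:, t]
        # Readout: contract the state dimension with C_t
        ys.append((h * C[:, t].unsqueeze(1)).sum(dim=-1))         # (b, d)

    y = torch.stack(ys, dim=1)                                    # (b, l, d)
    return y + x * D                                              # skip term
```

The loop performs one fixed-size state update per position, so its cost grows linearly with the sequence length `l`. Because `delta`, `B`, and `C` vary per token, the recurrence is input-dependent, which is what "selective" refers to.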
---
## Reference Paper
The implementation is **inspired by** the following paper:
> **Mamba: Linear-Time Sequence Modeling with Selective State Spaces**
> Albert Gu, Tri Dao
> arXiv:2312.00752 (2023)
> https://arxiv.org/abs/2312.00752
This project does **not** aim to be a full or optimized reproduction of the original implementation, but rather a **simplified and educational adaptation** of its core ideas.
---
## Features
- **Mamba-inspired architecture** with selective state space models
- **Linear-time sequence processing**
- Autoregressive text generation
- Support for pretrained Mamba models (`130M`, `370M`)
- Interactive **Streamlit web interface**
- GPU acceleration via CUDA (if available)
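
Autoregressive generation, as listed above, amounts to repeatedly sampling the next token and appending it to the input. The following is a minimal sketch under stated assumptions: `model` is any callable mapping token ids of shape `(1, l)` to logits of shape `(1, l, vocab)`, and the names `generate`, `toy_model`, and `top_k` are illustrative, not taken from this repository.

```python
import torch


@torch.no_grad()
def generate(model, input_ids, max_new_tokens=20, top_k=None):
    """Append `max_new_tokens` sampled tokens to `input_ids` of shape (1, l)."""
    for _ in range(max_new_tokens):
        logits = model(input_ids)[:, -1]            # (1, vocab) next-token scores
        if top_k is not None:
            # Keep only the k largest logits; mask the rest out.
            kth = torch.topk(logits, top_k).values[:, -1, None]
            logits = logits.masked_fill(logits < kth, float("-inf"))
        probs = torch.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)   # (1, 1)
        input_ids = torch.cat([input_ids, next_id], dim=1)
    return input_ids


def toy_model(ids, vocab=10):
    """Stand-in model (hypothetical): prefers the token (id + 1) % vocab."""
    logits = torch.zeros(ids.shape[0], ids.shape[1], vocab, device=ids.device)
    logits.scatter_(2, ((ids + 1) % vocab).unsqueeze(-1), 5.0)
    return logits


# Use CUDA when it is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
prompt = torch.tensor([[3]], device=device)
out = generate(toy_model, prompt, max_new_tokens=4, top_k=1)
```

With `top_k=1` the sampling step becomes greedy, so the toy model extends the prompt `3` to `3, 4, 5, 6, 7`. A real model would be paired with a tokenizer to turn text into ids and back.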
---
## Tech Stack
- **Python 3.8+**
- **PyTorch** – model implementation and inference
- **Hugging Face Transformers** – pretrained weights and tokenizer
- **Einops** – tensor manipulation
- **Streamlit** – web-based user interface
---
## Requirements
- Python 3.8 or higher
- PyTorch (CPU or CUDA version)
- CUDA-compatible GPU (optional, recommended for larger models)
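
The requirements above can be checked before installing anything else. This is a small sketch, assuming a hypothetical helper named `environment_report`; it only uses the standard library unless PyTorch is already installed.

```python
import importlib.util
import sys


def environment_report(min_python=(3, 8)):
    """Report whether the interpreter and optional GPU support look usable."""
    report = {
        "python_ok": sys.version_info >= min_python,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        import torch  # imported lazily so the check works without PyTorch

        report["cuda_available"] = torch.cuda.is_available()
    return report


print(environment_report())
```

A `cuda_available` value of `False` is not an error: the models can run on the CPU, only more slowly.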
---
## Installation
1. **Clone the repository**
```bash
git clone https://github.com/branebb/mamba.git
cd mamba
```
2. **Create and activate a virtual environment**
```bash
python -m venv venv
# Windows
venv\Scripts\activate
# Linux / macOS
source venv/bin/activate
```
3. **Upgrade pip**
```bash
python -m pip install --upgrade pip
```
4. **Install dependencies**
```bash
pip install -r requirements.txt
```
---
## Usage
This project provides an interactive web-based interface for text generation using pretrained Mamba language models.
### Running the Application
To start the application, run the Streamlit app from the project root:
```bash
streamlit run app.py
```
---
## Pretrained Models
This project supports loading **official pretrained Mamba language models** released by the authors of the Mamba architecture. The models are automatically downloaded from the Hugging Face Hub at runtime.
### Available Models
- `state-spaces/mamba-130m`
  A lightweight Mamba model suitable for fast experimentation and limited hardware.
- `state-spaces/mamba-370m`
  A larger model with improved generation quality and stronger language modeling capabilities.
### Tokenizer
All models use the tokenizer from:
- `EleutherAI/gpt-neox-20b`
This tokenizer is compatible with the pretrained Mamba checkpoints and is required for correct text encoding and decoding.
### Loading Mechanism
Pretrained weights and configuration files are fetched automatically using the Hugging Face `transformers` library. No manual download is required.
The models are loaded once at application startup and cached to avoid repeated initialization.
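
The load-once behaviour described above can be illustrated with a memoized loader. This sketch uses `functools.lru_cache` from the standard library as a stand-in; `load_model` and `LOAD_CALLS` are hypothetical names, and the body returns a placeholder instead of downloading real weights. In a Streamlit app the same role is commonly played by the `st.cache_resource` decorator, though whether this project uses it is an assumption.

```python
from functools import lru_cache

LOAD_CALLS = []  # records each real load, for demonstration only


@lru_cache(maxsize=None)
def load_model(name):
    """Hypothetical loader: the body runs once per distinct model name."""
    LOAD_CALLS.append(name)
    # A real loader would fetch the config and weights here and build the model.
    return {"name": name}


first = load_model("state-spaces/mamba-130m")
second = load_model("state-spaces/mamba-130m")   # served from the cache
```

Both calls return the same object and the loader body runs only once, so switching back to a previously selected model does not trigger a second initialization.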