https://github.com/deepbiolab/mamba-in-depth
A comprehensive learning guide for understanding S4 (Structured State Space Sequence Model) and Mamba architectures, from fundamentals through implementation, with curated resources and hands-on examples.
- Host: GitHub
- URL: https://github.com/deepbiolab/mamba-in-depth
- Owner: deepbiolab
- License: mit
- Created: 2025-03-30T14:22:21.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2025-03-31T15:44:04.000Z (about 2 months ago)
- Last Synced: 2025-03-31T17:00:03.762Z (about 2 months ago)
- Topics: learning-resources, mamba, state-space-models
- Language: Jupyter Notebook
- Homepage:
- Size: 3.43 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# S4 & Mamba In-Depth Guide
## 📚 Learning Path Overview
This guide provides a structured learning path to understand S4 (Structured State Space Sequence Model) and Mamba from fundamentals to implementation. Follow these steps sequentially for the best learning experience.
### 1️⃣ Introduction to Sequence Modeling
**Goal**: Understand the basics and motivation behind state space models
- 📖 Start with [A Visual Guide to Mamba and State Space Models](https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-mamba-and-state)
  - Focus on [Part 1: The Problem with Transformers](https://newsletter.maartengrootendorst.com/i/141228095/part-the-problem-with-transformers)
  - This explains why we need alternatives to Transformers and introduces S4

### 2️⃣ Deep Dive into State Space Models (SSMs)
**Goal**: Master the fundamental concepts of State Space Models
1. 🎥 Watch [State Space Models (SSMs) and Mamba](https://www.youtube.com/watch?v=g1AqUhP00Do) by Serrano.Academy
   - Provides excellent visualizations and examples of SSMs
2. 📘 Continue with [Part 2: The State Space Model](https://newsletter.maartengrootendorst.com/i/141228095/part-the-state-space-model-ssm)
   - Builds on the theoretical foundation (a small numerical sketch of the discretized recurrence appears at the end of this section)
3. 💻 Study [The Annotated S4](https://srush.github.io/annotated-s4/#part-1b-addressing-long-range-dependencies-with-hippo)
> Note: Since the original repo is outdated, use the [s4_in_depth.ipynb](./s4_in_depth.ipynb) notebook in this repository
>
> For Chinese readers: I have translated the author's original The Annotated S4 into Chinese; it is available in the [cn folder](./cn/The-Annotated-S4-CN-Part1.pdf).
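
To make the recurrence behind Part 2 and The Annotated S4 concrete, here is a minimal NumPy sketch of a discretized SSM. It is illustrative only: `A`, `B`, `C` are random placeholder matrices, the state size and step size are arbitrary, and the bilinear discretization follows The Annotated S4 rather than any code in this repository.

```python
# Toy sketch of a discretized State Space Model (not code from this repo).
# Continuous form:   x'(t) = A x(t) + B u(t),   y(t) = C x(t)
# Discretized here with the bilinear (Tustin) transform, as in The Annotated S4.
import numpy as np

def discretize(A, B, C, step):
    """Bilinear discretization: returns (A_bar, B_bar, C)."""
    I = np.eye(A.shape[0])
    BL = np.linalg.inv(I - (step / 2.0) * A)
    A_bar = BL @ (I + (step / 2.0) * A)
    B_bar = (BL * step) @ B
    return A_bar, B_bar, C

def run_ssm(A_bar, B_bar, C, u):
    """Unroll the recurrence x_k = A_bar x_{k-1} + B_bar u_k,  y_k = C x_k."""
    x = np.zeros(A_bar.shape[0])
    ys = []
    for u_k in u:
        x = A_bar @ x + B_bar[:, 0] * u_k
        ys.append(C @ x)
    return np.array(ys)

# Tiny example: state size 4, scalar input, sequence length 16
rng = np.random.default_rng(0)
A, B, C = rng.normal(size=(4, 4)), rng.normal(size=(4, 1)), rng.normal(size=(1, 4))
u = rng.normal(size=16)
y = run_ssm(*discretize(A, B, C, step=0.1), u)
print(y.shape)  # (16, 1)
```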

### 3️⃣ Mamba Architecture
**Goal**: Understand how Mamba builds upon and improves SSMs
- 📖 Read [Part 3: Visual Guide to Mamba](https://newsletter.maartengrootendorst.com/i/141228095/part-mamba-a-selective-ssm)
  - Explains how Mamba extends and improves upon S4 (a conceptual sketch of the selective mechanism follows below)
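
As a rough picture of what "selective" means here, the sketch below makes the step size Δ and the projections B and C functions of each input token, while A stays a per-layer parameter. This is a hypothetical, heavily simplified PyTorch sketch (a plain sequential loop and a simplified discretization); the real Mamba block adds gating, a convolution branch, and a hardware-aware parallel scan, so refer to The Annotated Mamba and the official repository for the actual implementation.

```python
# Conceptual sketch of a "selective" SSM (hypothetical, simplified).
# In S4, (A_bar, B_bar, C) are fixed per layer; in Mamba, Δ, B, and C are
# computed from the input, so the recurrence can decide what to keep or forget.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelectiveSSMSketch(nn.Module):
    def __init__(self, d_model: int, d_state: int = 16):
        super().__init__()
        self.A_log = nn.Parameter(torch.randn(d_model, d_state))  # A stays input-independent
        self.proj_delta = nn.Linear(d_model, d_model)              # Δ(x): per-token step size
        self.proj_B = nn.Linear(d_model, d_state)                  # B(x): input-dependent
        self.proj_C = nn.Linear(d_model, d_state)                  # C(x): input-dependent

    def forward(self, x):                                 # x: (batch, length, d_model)
        A = -torch.exp(self.A_log)                        # negative A for a stable recurrence
        delta = F.softplus(self.proj_delta(x))            # (batch, length, d_model)
        Bx, Cx = self.proj_B(x), self.proj_C(x)           # (batch, length, d_state)
        h = x.new_zeros(x.shape[0], x.shape[2], A.shape[1])
        ys = []
        for t in range(x.shape[1]):                       # sequential scan; Mamba uses a parallel scan kernel
            A_bar = torch.exp(delta[:, t, :, None] * A)               # discretize A per token
            B_bar = delta[:, t, :, None] * Bx[:, t, None, :]          # simplified Δ·B(x_t)
            h = A_bar * h + B_bar * x[:, t, :, None]                  # h_t = A_bar h_{t-1} + B_bar x_t
            ys.append((h * Cx[:, t, None, :]).sum(-1))                # y_t = C(x_t) · h_t
        return torch.stack(ys, dim=1)                     # (batch, length, d_model)

print(SelectiveSSMSketch(8)(torch.randn(2, 32, 8)).shape)  # torch.Size([2, 32, 8])
```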

### 4️⃣ Hands-On Implementation
**Goal**: Get practical experience with Mamba
- 💻 Study [The Annotated Mamba](https://srush.github.io/annotated-mamba/hard.html)
  - Detailed implementation walkthrough
  - Includes code examples and explanations

### 5️⃣ Applications & Real-World Usage
**Goal**: Explore practical applications and stay updated
1. 📚 Explore [Awesome-Mamba-Collection](https://github.com/XiudingCai/Awesome-Mamba-Collection?tab=readme-ov-file#head18)
   - Comprehensive overview of Mamba applications
   - Covers NLP, time series analysis, and more
2. 🔍 Additional Resources:
   - [Official Mamba Repository](https://github.com/state-spaces/mamba)
   - 📄 [Research Paper: Mamba: Linear-Time Sequence Modeling with Selective State Spaces](https://arxiv.org/abs/2312.00752)

> 💡 **Tip**: Follow this path sequentially for the best learning experience. Each section builds upon the knowledge from previous sections.
## 🚀 Implementation Example
This repository contains a PyTorch implementation of the S4 (Structured State Space Sequence Model) for sequence modeling and classification tasks. The implementation focuses on the MNIST dataset with two main tasks:
1. MNIST Sequence Modeling: Predict next pixel value given history (784 pixels x 256 values)
2. MNIST Classification: Predict digit class using sequence model (784 pixels => 10 classes); a small data-preparation sketch for both tasks follows below.
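
A hypothetical data-preparation sketch for these two tasks is shown below; the actual pipeline lives in [train_s4.py](./train_s4.py) and may differ in details such as quantization or normalization.

```python
# Hypothetical sketch: turning an MNIST image into a length-784 pixel sequence.
import torch
from torchvision import datasets, transforms

mnist = datasets.MNIST(root="./data", download=True, transform=transforms.ToTensor())
image, label = mnist[0]                      # image: (1, 28, 28), values in [0, 1]

pixels = (image.flatten() * 255).long()      # length-784 sequence of ints in [0, 255]

# Task 1 (sequence modeling): predict pixel t from pixels < t (784 steps x 256 values)
inputs, targets = pixels[:-1], pixels[1:]

# Task 2 (classification): map the whole 784-step sequence to one of 10 digit classes
sequence, digit_class = pixels, label

print(pixels.shape, inputs.shape, targets.shape, digit_class)
```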

### Key Components
- **SSM Kernel**: Basic implementation of State Space Model kernel
- **S4 Kernel**: Advanced implementation with HiPPO-based initialization (see the initialization sketch after this list)
- **Sequence Processing**: Layer normalization, SSM/S4 layer, and MLP components
- **Model Architecture**: Stacked sequence model with configurable parameters
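
The HiPPO-based initialization mentioned above can be sketched as follows, using the HiPPO-LegS construction described in The Annotated S4; this is a minimal sketch, not necessarily the exact code in this repository.

```python
# Sketch of the HiPPO-LegS matrix used to initialize the SSM's A matrix,
# following the construction shown in The Annotated S4.
import numpy as np

def make_hippo(N: int) -> np.ndarray:
    """A[n, k] = -sqrt((2n+1)(2k+1)) for n > k, -(n+1) on the diagonal, 0 above it."""
    p = np.sqrt(1 + 2 * np.arange(N))
    A = p[:, None] * p[None, :]
    A = np.tril(A) - np.diag(np.arange(N))
    return -A

print(make_hippo(4))
```

Initializing A this way lets the hidden state compress the input history onto a polynomial basis, which is what gives S4 its long-range behaviour.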

### Usage
```bash
# Run the S4 model for both the sequence-modeling and classification tasks
python train_s4.py
```

> Notes
>
> - Model uses reduced dimensions for demonstration
> - Supports both CNN and RNN modes (see the toy equivalence check below)

For detailed implementation, see [train_s4.py](./train_s4.py).
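
The CNN and RNN modes refer to the two equivalent ways an SSM layer can be computed: as a causal convolution with a precomputed kernel (parallel, convenient for training) or as a step-by-step recurrence (convenient for autoregressive generation). The toy check below uses placeholder random matrices and a single-input single-output SSM to verify that both views produce the same outputs; it is a sketch of the idea, not this repository's implementation.

```python
# Toy check: "RNN mode" (recurrence) and "CNN mode" (causal convolution) of an
# SSM compute the same outputs.
#   Recurrent view:      x_k = A_bar x_{k-1} + B_bar u_k,   y_k = C x_k
#   Convolutional view:  y = u * K,  K = (C B_bar, C A_bar B_bar, C A_bar^2 B_bar, ...)
import numpy as np

rng = np.random.default_rng(0)
N, L = 4, 8                                    # state size, sequence length
A_bar = 0.5 * rng.normal(size=(N, N))          # pretend these are already discretized
B_bar = rng.normal(size=(N, 1))
C = rng.normal(size=(1, N))
u = rng.normal(size=L)

# RNN mode: step through the sequence one element at a time
x, y_rnn = np.zeros(N), []
for u_k in u:
    x = A_bar @ x + B_bar[:, 0] * u_k
    y_rnn.append((C @ x)[0])

# CNN mode: precompute the kernel once, then apply it as a causal convolution
K = np.array([(C @ np.linalg.matrix_power(A_bar, k) @ B_bar)[0, 0] for k in range(L)])
y_cnn = np.convolve(u, K)[:L]

print(np.allclose(y_rnn, y_cnn))  # True: both modes give the same result
```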