Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/AmeenAli/HiddenMambaAttn
Official PyTorch Implementation of "The Hidden Attention of Mamba Models"
- Host: GitHub
- URL: https://github.com/AmeenAli/HiddenMambaAttn
- Owner: AmeenAli
- Created: 2024-03-01T19:39:16.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2024-05-27T06:14:23.000Z (8 months ago)
- Last Synced: 2024-08-01T04:02:12.468Z (6 months ago)
- Language: Python
- Size: 13.7 MB
- Stars: 177
- Watchers: 5
- Forks: 10
- Open Issues: 2
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- Awesome-state-space-models - Vision
README
🐍 The Hidden Attention of Mamba Models 🐍
Ameen Ali1 \*, Itamar Zimerman1 \* and Lior Wolf1
[email protected], [email protected], [email protected]
1 Tel Aviv University
(\*) equal contribution

## Official PyTorch Implementation of "The Hidden Attention of Mamba Models"
The Mamba layer offers an efficient state space model (SSM) that is highly effective in modeling multiple domains, including long-range sequences and images. SSMs are commonly viewed as dual models: one trains in parallel on the entire sequence using convolutions, and deploys in an autoregressive manner. We add a third view and show that such models can also be viewed as attention-driven models. This new perspective enables us to compare their underlying mechanisms to those of the self-attention layers in transformers, and allows us to peer inside the inner workings of the Mamba model with explainability methods.
You can access the paper here: [The Hidden Attention of Mamba Models](https://arxiv.org/abs/2403.01590)
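As a rough illustration of this attention view, the output of a diagonal selective SSM recurrence can be materialized as a causal "attention" matrix whose entry (t, s) combines C at step t, the cumulative decay of A between s and t, and B at step s. The sketch below is illustrative only (notation loosely follows the paper; the shapes and function names are our own, not the repo's code):

```python
import numpy as np

def hidden_attention(A_bar, B_bar, C):
    """Materialize the implicit attention matrix of a diagonal selective SSM.

    A_bar, B_bar, C: arrays of shape (L, N) holding per-step discretized
    parameters (L = sequence length, N = state size). Entry (t, s) is
    C_t * (prod of A_bar over steps s+1..t) * B_bar_s, summed over the state.
    """
    L, N = A_bar.shape
    attn = np.zeros((L, L))
    for t in range(L):
        for s in range(t + 1):
            decay = A_bar[s + 1 : t + 1].prod(axis=0) if t > s else np.ones(N)
            attn[t, s] = (C[t] * decay * B_bar[s]).sum()
    return attn

def ssm_scan(A_bar, B_bar, C, x):
    """Reference recurrence: h_t = A_bar_t * h_{t-1} + B_bar_t * x_t, y_t = <C_t, h_t>."""
    L, N = A_bar.shape
    h = np.zeros(N)
    y = np.zeros(L)
    for t in range(L):
        h = A_bar[t] * h + B_bar[t] * x[t]
        y[t] = (C[t] * h).sum()
    return y
```

Multiplying the materialized matrix by the input sequence reproduces the recurrent scan's output, which is exactly the sense in which the SSM is "attention-driven": the matrix is lower-triangular (causal) and data-dependent through B and C.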
## Set Up Environment
- Python 3.10.13
- `conda create -n your_env_name python=3.10.13`
- Activate Env
- `conda activate your_env_name`
- CUDA TOOLKIT 11.8
- `conda install nvidia/label/cuda-11.8.0::cuda-toolkit`
- torch 2.1.1 + cu118
- `pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118`
- Requirements: vim_requirements.txt
- `pip install -r vim/vim_requirements.txt`
- Install jupyter
- `pip install jupyter`
- Install ``causal_conv1d`` and ``mamba`` from *our source*
- `cd causal-conv1d`
- `pip install --editable .`
- `cd ..`
- `pip install --editable mamba-1p1p1`
## Pre-Trained Weights
We have used the official weights provided by [Vim](https://github.com/hustvl/Vim), which can be downloaded from here:
| Model | #param. | Top-1 Acc. | Top-5 Acc. | Huggingface Repo |
|:------------------------------------------------------------------:|:-------------:|:----------:|:----------:|:----------:|
| [Vim-tiny](https://huggingface.co/hustvl/Vim-tiny-midclstok) | 7M | 76.1 | 93.0 | https://huggingface.co/hustvl/Vim-tiny-midclstok |
| [Vim-tiny+](https://huggingface.co/hustvl/Vim-tiny-midclstok) | 7M | 78.3 | 94.2 | https://huggingface.co/hustvl/Vim-tiny-midclstok |
| [Vim-small](https://huggingface.co/hustvl/Vim-small-midclstok) | 26M | 80.5 | 95.1 | https://huggingface.co/hustvl/Vim-small-midclstok |
| [Vim-small+](https://huggingface.co/hustvl/Vim-small-midclstok) | 26M | 81.6 | 95.4 | https://huggingface.co/hustvl/Vim-small-midclstok |

**Notes:**
- In all of our experiments, we have worked with [Vim-small](https://huggingface.co/hustvl/Vim-small-midclstok).

## Vision-Mamba Explainability Notebook
Follow the instructions in the `vim/vmamba_xai.ipynb` notebook to run single-image inference with the three explainability methods introduced in the paper.
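For intuition on how per-layer attention matrices can be turned into a relevance map, a standard technique in this line of work (popularized alongside the [Transformer-Explainability](https://github.com/hila-chefer/Transformer-Explainability) repo this project builds on) is attention rollout. The sketch below is a generic illustration of that idea, not the notebook's actual code:

```python
import numpy as np

def attention_rollout(attn_mats):
    """Aggregate per-layer attention into a single relevance map.

    attn_mats: list of (L, L) row-stochastic attention matrices, one per
    layer, ordered from the first layer to the last.
    """
    L = attn_mats[0].shape[0]
    rollout = np.eye(L)
    for A in attn_mats:
        # Mix in the identity to account for residual connections,
        # then renormalize rows so each still sums to 1.
        A_res = 0.5 * (A + np.eye(L))
        A_res = A_res / A_res.sum(axis=-1, keepdims=True)
        rollout = A_res @ rollout
    return rollout
```

Row t of the result scores how much each input position contributes to output position t after propagating attention through every layer; for Mamba, the per-layer matrices would be the hidden attention matrices described above rather than softmax attention.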
## To-Do
- XAI - Single Image Inference Notebook
- XAI - Segmentation Experiments

For the segmentation experiment, please check out our [follow-up work](https://github.com/Itamarzimm/UnifiedImplicitAttnRepr/tree/main).
## Citation
If you find our work useful, please consider citing us:
```latex
@misc{ali2024hidden,
title={The Hidden Attention of Mamba Models},
author={Ameen Ali and Itamar Zimerman and Lior Wolf},
year={2024},
eprint={2403.01590},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
```
## Acknowledgement
This repository is heavily based on [Vim](https://github.com/hustvl/Vim), [Mamba](https://github.com/state-spaces/mamba) and [Transformer-Explainability](https://github.com/hila-chefer/Transformer-Explainability). We thank the authors for their wonderful work.