Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/AmeenAli/HiddenMambaAttn
Official PyTorch Implementation of "The Hidden Attention of Mamba Models"
- Host: GitHub
- URL: https://github.com/AmeenAli/HiddenMambaAttn
- Owner: AmeenAli
- Created: 2024-03-01T19:39:16.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2024-05-27T06:14:23.000Z (8 months ago)
- Last Synced: 2024-08-01T04:02:12.468Z (6 months ago)
- Language: Python
- Size: 13.7 MB
- Stars: 177
- Watchers: 5
- Forks: 10
- Open Issues: 2
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- Awesome-state-space-models - Vision
README
🐍 The Hidden Attention of Mamba Models 🐍
Ameen Ali1 \*, Itamar Zimerman1 \* and Lior Wolf1
[email protected], [email protected], [email protected]
1 Tel Aviv University
(\*) equal contribution

## Official PyTorch Implementation of "The Hidden Attention of Mamba Models"
The Mamba layer offers an efficient state space model (SSM) that is highly effective in modeling multiple domains, including long-range sequences and images. SSMs are commonly viewed as dual models: one trains in parallel on the entire sequence using convolutions, and deploys in an autoregressive manner. We add a third view and show that such models can also be viewed as attention-driven models. This new perspective enables us to compare their underlying mechanisms to those of the self-attention layers in transformers, and allows us to peer inside the inner workings of the Mamba model with explainability methods.
You can access the paper here: [The Hidden Attention of Mamba Models](https://arxiv.org/abs/2403.01590)
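As a rough illustration of this attention view, the output of a diagonal selective SSM recurrence can be materialized as a causal "attention" matrix whose entry (t, s) combines C at step t, the cumulative decay of A between s and t, and B at step s. The sketch below is illustrative only (notation loosely follows the paper; the shapes and function names are our own, not the repo's code):

```python
import numpy as np

def hidden_attention(A_bar, B_bar, C):
    """Materialize the implicit attention matrix of a diagonal selective SSM.

    A_bar, B_bar, C: arrays of shape (L, N) holding per-step discretized
    parameters (L = sequence length, N = state size). Entry (t, s) is
    C_t * (prod of A_bar over steps s+1..t) * B_bar_s, summed over the state.
    """
    L, N = A_bar.shape
    attn = np.zeros((L, L))
    for t in range(L):
        for s in range(t + 1):
            decay = A_bar[s + 1 : t + 1].prod(axis=0) if t > s else np.ones(N)
            attn[t, s] = (C[t] * decay * B_bar[s]).sum()
    return attn

def ssm_scan(A_bar, B_bar, C, x):
    """Reference recurrence: h_t = A_bar_t * h_{t-1} + B_bar_t * x_t, y_t = <C_t, h_t>."""
    L, N = A_bar.shape
    h = np.zeros(N)
    y = np.zeros(L)
    for t in range(L):
        h = A_bar[t] * h + B_bar[t] * x[t]
        y[t] = (C[t] * h).sum()
    return y
```

Multiplying the materialized matrix by the input sequence reproduces the recurrent scan's output, which is exactly the sense in which the SSM is "attention-driven": the matrix is lower-triangular (causal) and data-dependent through B and C.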
## Set Up Environment
- Python 3.10.13
- `conda create -n your_env_name python=3.10.13`
- Activate Env
- `conda activate your_env_name`
- CUDA TOOLKIT 11.8
- `conda install nvidia/label/cuda-11.8.0::cuda-toolkit`
- torch 2.1.1 + cu118
- `pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118`
- Requirements: vim_requirements.txt
- `pip install -r vim/vim_requirements.txt`
- Install jupyter
- `pip install jupyter`
- Install ``causal_conv1d`` and ``mamba`` from *our source*
- `cd causal-conv1d`
- `pip install --editable .`
- `cd ..`
- `pip install --editable mamba-1p1p1`
## Pre-Trained Weights
We have used the official weights provided by [Vim](https://github.com/hustvl/Vim), which can be downloaded from here:
| Model | #param. | Top-1 Acc. | Top-5 Acc. | Huggingface Repo |
|:------------------------------------------------------------------:|:-------------:|:----------:|:----------:|:----------:|
| [Vim-tiny](https://huggingface.co/hustvl/Vim-tiny-midclstok) | 7M | 76.1 | 93.0 | https://huggingface.co/hustvl/Vim-tiny-midclstok |
| [Vim-tiny+](https://huggingface.co/hustvl/Vim-tiny-midclstok) | 7M | 78.3 | 94.2 | https://huggingface.co/hustvl/Vim-tiny-midclstok |
| [Vim-small](https://huggingface.co/hustvl/Vim-small-midclstok) | 26M | 80.5 | 95.1 | https://huggingface.co/hustvl/Vim-small-midclstok |
| [Vim-small+](https://huggingface.co/hustvl/Vim-small-midclstok) | 26M | 81.6 | 95.4 | https://huggingface.co/hustvl/Vim-small-midclstok |

**Notes:**
- In all of our experiments, we have worked with [Vim-small](https://huggingface.co/hustvl/Vim-small-midclstok).

## Vision-Mamba Explainability Notebook
Follow the instructions in the `vim/vmamba_xai.ipynb` notebook to run single-image inference with the three explainability methods introduced in the paper.
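For intuition on how per-layer attention matrices can be turned into a relevance map, a standard technique in this line of work (popularized alongside the [Transformer-Explainability](https://github.com/hila-chefer/Transformer-Explainability) repo this project builds on) is attention rollout. The sketch below is a generic illustration of that idea, not the notebook's actual code:

```python
import numpy as np

def attention_rollout(attn_mats):
    """Aggregate per-layer attention into a single relevance map.

    attn_mats: list of (L, L) row-stochastic attention matrices, one per
    layer, ordered from the first layer to the last.
    """
    L = attn_mats[0].shape[0]
    rollout = np.eye(L)
    for A in attn_mats:
        # Mix in the identity to account for residual connections,
        # then renormalize rows so each still sums to 1.
        A_res = 0.5 * (A + np.eye(L))
        A_res = A_res / A_res.sum(axis=-1, keepdims=True)
        rollout = A_res @ rollout
    return rollout
```

Row t of the result scores how much each input position contributes to output position t after propagating attention through every layer; for Mamba, the per-layer matrices would be the hidden attention matrices described above rather than softmax attention.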
## To-Do
- XAI - Single Image Inference Notebook
- XAI - Segmentation Experiments

For the segmentation experiment, please check out our [follow-up work](https://github.com/Itamarzimm/UnifiedImplicitAttnRepr/tree/main).
## Citation
If you find our work useful, please consider citing us:
```latex
@misc{ali2024hidden,
title={The Hidden Attention of Mamba Models},
author={Ameen Ali and Itamar Zimerman and Lior Wolf},
year={2024},
eprint={2403.01590},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
```
## Acknowledgement
This repository is heavily based on [Vim](https://github.com/hustvl/Vim), [Mamba](https://github.com/state-spaces/mamba) and [Transformer-Explainability](https://github.com/hila-chefer/Transformer-Explainability). We thank the authors for their wonderful work.