https://github.com/OpenSparseLLMs/MoM

Last synced: over 1 year ago
JSON representation

Host: GitHub
URL: https://github.com/OpenSparseLLMs/MoM
Owner: OpenSparseLLMs
License: apache-2.0
Created: 2025-02-19T03:11:55.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2025-03-20T02:51:51.000Z (over 1 year ago)
Last Synced: 2025-03-31T06:41:18.971Z (over 1 year ago)
Language: Python
Size: 1.15 MB
Stars: 73
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

Awesome-Agent-Memory - [code

README

          


# MoM: Mixture-of-Memories

[![arXiv](https://img.shields.io/badge/Arxiv-2502.13685-b31b1b.svg?logo=arXiv)](https://arxiv.org/abs/2502.13685)

[![huggingface weights](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Weights-ffc107?color=ffc107&logoColor=white)](https://huggingface.co/linear-moe-hub)

[![zhihu](https://img.shields.io/badge/Zhihu-Intro-blue?logo=zhihu)](https://zhuanlan.zhihu.com/p/25066090353)

[![stars](https://img.shields.io/github/stars/OpenSparseLLMs/MoM)](https://github.com/OpenSparseLLMs/MoM/stargazers)



Welcome to MoM! This repository provides the implementation of [MoM: Linear Sequence Modeling with Mixture-of-Memories](https://arxiv.org/abs/2502.13685), on huggingface eco-system. MoM is compatible with all kinds of linear sequence modeling methods like: linear attention, SSM, linear RNN, etc. **Here is an introductory artical about MoM (in Chinese) on [Zhihu](https://zhuanlan.zhihu.com/p/25066090353)**.



  





Figure 1: MoM Architecture



## Installation

The following requirements should be satisfied:

- [PyTorch](https://pytorch.org/) >= 2.5

- [Triton](https://github.com/openai/triton) >=3.0

- [einops](https://einops.rocks/)

- [transformers](https://github.com/huggingface/transformers) >=4.45.0

- [datasets](https://github.com/huggingface/datasets) >=3.3.0

- [causal-conv1d](https://github.com/Dao-AILab/causal-conv1d) >=1.4.0

Install the package from source:

```bash

pip install -e .

```

## Getting Started

### Data Preparation

Before training, make sure to preprocess your data by following the steps outlined in [training/README.md](training/README.md).

### Training From Scratch

To start training with default setup, simply run:

```bash

cd training

bash train.sh \

  nodes=4 \

  gpus=8 \

  type=mom \

  lr=3e-4 \

  steps=30720 \

  batch=8 \

  update=1 \

  warmup=1024 \

  context=2048 \

  path=SlimPajama/mom-15B \

  project=SlimPajama \

  model=configs/mom_340M.json \

  tokenizer=fla-hub/gla-1.3B-100B \

  data=SlimPajama-627B \

  cache=data/chunk1/train

```

You can also

- Modify the script to adjust the modeling and training settings.

- e.g., modify [examples/configs/mom_340M.json](examples/configs/mom_340M.json) to adjust the MoM model structure.

### Evaluation

To evaluate model checkpoints on **commonsense reasoning benchmarks**, we recommend you to run:

```bash

MODEL_PATH=training/SlimPajama/mom-15B/checkpoint-30720

accelerate launch --multi_gpu evals/harness.py --model hf \

    --model_args pretrained=$MODEL_PATH,dtype=bfloat16 \

    --tasks arc_easy,arc_challenge,hellaswag,lambada_standard,piqa,winogrande,wikitext \

    --output_path eval_results \

    --batch_size 32 \

    --device cuda

```

To evaluate model checkpoints on **recall-intensive tasks**, we recommend you to run:

1. Install lm_eval

```bash

cd lm-eval-harness

pip install -e .

```

2. Run the script:

```bash

MODEL_PATH=../training/SlimPajama/mom-15B/checkpoint-30720

CUDA_VISIBLE_DEVICES=0,1,2,3,4 python launch_local.py \

    --batch-size 32 \

    -t based_squad \

    -t based_swde \

    -t based_fda \

    -t based_drop \

    -t based_triviaqa \

    -t based_nq_2048 \

    -m $MODEL_PATH \

    --context_length 2048 \

    --answer_length 48 \

    --cutting_context \

    --limit -1 \

    -p

```

## Acknowledgement

This repo builds upon the open-source [flash-linear-attention](https://github.com/fla-org/flash-linear-attention) and the evaluation code is based on [prefix-linear-attention](https://github.com/HazyResearch/prefix-linear-attention). Happy experimenting! 🔥🚀🔥

## Citation

If you find this repo useful, please consider citing our paper:

```bib

@article{du2025mom,

  title={MoM: Linear Sequence Modeling with Mixture-of-Memories},

  author={Du, Jusen and Sun, Weigao and Lan, Disen and Hu, Jiaxi and Cheng, Yu},

  journal={arXiv preprint arXiv:2502.13685},

  year={2025}

}

```

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/OpenSparseLLMs/MoM

Awesome Lists containing this project

README