https://github.com/ilyalasy/moe-routing
Analysis of token routing for different implementations of Mixture of Experts
artificial-intelligence deep-learning interpretable-deep-learning mixture-of-experts
- Host: GitHub
- URL: https://github.com/ilyalasy/moe-routing
- Owner: ilyalasy
- Created: 2024-02-27T13:00:01.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-03-22T12:13:25.000Z (about 2 years ago)
- Last Synced: 2025-04-04T22:51:12.562Z (about 1 year ago)
- Topics: artificial-intelligence, deep-learning, interpretable-deep-learning, mixture-of-experts
- Language: Jupyter Notebook
- Homepage:
- Size: 882 KB
- Stars: 9
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Token Routing Analysis of Mixture of Experts LLMs
## Install
```
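# Install this repository's dependencies
pip install -r requirements.txt
# Clone and install ColossalAI next to this repository
cd ..
git clone https://github.com/hpcaitech/ColossalAI
pip install -U ./ColossalAI
# Install the dependencies of the OpenMoE example
cd ColossalAI/examples/language/openmoe
pip install -r requirements.txt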
```
## Run OpenMoE Inference on RedPajama
```
./scripts/token-routing.sh
```
## Analyse token routing data
See [EDA notebook](https://github.com/Misterion777/moe-experiments/blob/main/notebooks/routing_eda.ipynb)
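
The notebook linked above holds the actual analysis. As a rough illustration of the kind of statistic a routing analysis might start from, the sketch below computes the fraction of tokens sent to each expert in each layer. It assumes routing data shaped as `(layer, token, expert)` tuples; that format, the `expert_load` helper, and the toy records are hypothetical and are not taken from this repository's output.

```python
from collections import Counter, defaultdict


def expert_load(records, num_experts):
    """Return, per layer, the fraction of routed tokens each expert received.

    `records` is an iterable of (layer, token, expert) tuples.
    This record format is an assumption made for illustration only.
    """
    counts = defaultdict(Counter)
    for layer, _token, expert in records:
        counts[layer][expert] += 1

    load = {}
    for layer, counter in counts.items():
        total = sum(counter.values())
        # Experts that never received a token get a load of 0.0.
        load[layer] = [counter[e] / total for e in range(num_experts)]
    return load


# Hypothetical toy records: (layer index, token string, expert index)
records = [
    (0, "the", 1),
    (0, "cat", 1),
    (0, "sat", 0),
    (0, "down", 3),
    (1, "the", 2),
    (1, "cat", 2),
]

load = expert_load(records, num_experts=4)
print(load[0])  # [0.25, 0.5, 0.0, 0.25]
```

A load vector far from uniform, such as layer 1 in the toy data where every token goes to expert 2, is the sort of imbalance such an analysis would surface.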
## TODO
- [x] Support Mixtral
- [x] Support DeepSeek