Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/luodian/generalizable-mixture-of-experts
GMoE could be the next backbone model for many kinds of generalization tasks.
- Host: GitHub
- URL: https://github.com/luodian/generalizable-mixture-of-experts
- Owner: Luodian
- License: mit
- Created: 2022-05-28T04:17:42.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-03-21T18:23:06.000Z (over 1 year ago)
- Last Synced: 2024-10-22T23:36:40.069Z (21 days ago)
- Topics: deep-learning, domain-generalization, mixture-of-experts, pytorch, pytorch-implementation
- Language: Python
- Homepage:
- Size: 2.04 MB
- Stars: 290
- Watchers: 15
- Forks: 35
- Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Welcome to Generalizable Mixture-of-Experts for Domain Generalization
🔥 Our paper [Sparse Mixture-of-Experts are Domain Generalizable Learners](https://openreview.net/forum?id=RecZ9nB9Q4) has officially been accepted at ICLR 2023 for an oral presentation.
🔥 The GMoE-S/16 model currently [ranks first](https://paperswithcode.com/sota/domain-generalization-on-domainnet) on multiple DG datasets without extra pre-training data. (Our GMoE-S/16 is initialized from [DeiT-S/16](https://github.com/facebookresearch/deit/blob/main/README_deit.md), which was pretrained only on ImageNet-1K 2012.)
Wondering why GMoE performs so well? 🤯 Let's investigate the generalization ability of the model architecture itself and see the great potential of the sparse Mixture-of-Experts (MoE) architecture.
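To make the idea concrete, here is a minimal, hedged sketch of a top-k gated sparse MoE feed-forward block in plain PyTorch. It is illustrative only: the layer structure, expert count, and dimensions below are our own choices, and GMoE itself builds on Tutel's optimized MoE implementation rather than this Python loop.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKSparseMoE(nn.Module):
    """Minimal top-k gated mixture-of-experts FFN (conceptual sketch only)."""

    def __init__(self, dim, hidden_dim, num_experts=6, k=2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(dim, num_experts, bias=False)   # router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden_dim), nn.GELU(), nn.Linear(hidden_dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x):                          # x: (tokens, dim)
        weights, idx = self.gate(x).topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)       # renormalize over the k chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):                 # each token only visits its k experts
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

tokens = torch.randn(8, 384)                       # 384 = DeiT-S/16 embedding width
print(TopKSparseMoE(dim=384, hidden_dim=1536)(tokens).shape)   # torch.Size([8, 384])
```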
### Preparation
```sh
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116
python3 -m pip uninstall tutel -y
python3 -m pip install --user --upgrade git+https://github.com/microsoft/tutel@main
pip3 install -r requirements.txt
```
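A quick, hedged way to confirm the install worked before moving on; the `tutel.moe` import path is taken from the upstream Tutel project and may differ across versions.

```python
# Sanity check for the Preparation step (import path assumed from upstream Tutel).
import torch
from tutel import moe as tutel_moe

print('CUDA available:', torch.cuda.is_available())
print('Tutel MoE layer:', tutel_moe.moe_layer)
```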
### Datasets
```sh
python3 -m domainbed.scripts.download \
--data_dir=./domainbed/data
```
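To double-check what the download step fetched, here is a hedged sketch using the DomainBed-style dataset registry (the `DATASETS` list and `num_environments` helper are assumed from upstream DomainBed; verify against `domainbed/datasets.py` in this repo).

```python
# Hedged sketch: list the registered datasets and how many domains each defines.
from domainbed import datasets

print(datasets.DATASETS)                        # registered dataset names
print(datasets.num_environments('OfficeHome'))  # number of OfficeHome domains
```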
### Environments
Environment details used in the paper for the main experiments on an NVIDIA V100 GPU.
```shell
Environment:
Python: 3.9.12
PyTorch: 1.12.0+cu116
Torchvision: 0.13.0+cu116
CUDA: 11.6
CUDNN: 8302
NumPy: 1.19.5
PIL: 9.2.0
```
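To compare your local setup against the versions above, a small check using standard version attributes (not a script shipped with the repo):

```python
# Print the versions that matter for reproducing the paper's environment.
import torch, torchvision, numpy, PIL

print('PyTorch:', torch.__version__)
print('Torchvision:', torchvision.__version__)
print('CUDA:', torch.version.cuda)
print('CUDNN:', torch.backends.cudnn.version())
print('NumPy:', numpy.__version__)
print('PIL:', PIL.__version__)
```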
## Start Training
Train a model:
```sh
python3 -m domainbed.scripts.train \
    --data_dir=./domainbed/data/OfficeHome/ \
    --algorithm GMOE \
    --dataset OfficeHome \
    --test_env 2
```
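If you want one run per held-out domain rather than a single `--test_env`, a hedged convenience loop (plain `subprocess`, not a script shipped with the repo; OfficeHome has 4 domains):

```python
# Launch one training run per held-out OfficeHome environment.
import subprocess

for test_env in range(4):
    subprocess.run([
        'python3', '-m', 'domainbed.scripts.train',
        '--data_dir=./domainbed/data/OfficeHome/',
        '--algorithm', 'GMOE',
        '--dataset', 'OfficeHome',
        '--test_env', str(test_env),
    ], check=True)
```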
## Hyper-params
We put the hparams for each dataset in
```sh
./domainbed/hparams_registry.py
```
Basically, you just need to choose `--algorithm` and `--dataset`. The optimal hparams will be loaded accordingly.
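For reference, a hedged sketch of how the resolved hparams can be inspected (the `default_hparams` helper name is assumed from upstream DomainBed's `hparams_registry`; check this repo's copy):

```python
# Print the default hyper-parameters resolved for GMOE on OfficeHome.
from domainbed import hparams_registry

hparams = hparams_registry.default_hparams('GMOE', 'OfficeHome')
for name, value in sorted(hparams.items()):
    print(f'{name}: {value}')
```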
## License
This source code is released under the MIT license, included [here](LICENSE).
## Acknowledgement
The MoE module is built on [Tutel MoE](https://github.com/microsoft/tutel).