Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/google-research/vmoe
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/google-research/vmoe
- Owner: google-research
- License: apache-2.0
- Created: 2021-10-08T09:52:37.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2024-07-25T11:16:45.000Z (about 2 months ago)
- Last Synced: 2024-07-25T12:44:49.749Z (about 2 months ago)
- Language: Jupyter Notebook
- Size: 1.71 MB
- Stars: 545
- Watchers: 14
- Forks: 48
- Open Issues: 14
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
README
# Scaling Vision with Sparse Mixture of Experts
This repository contains the code for training and fine-tuning Sparse MoE models
for vision (V-MoE) on ImageNet-21k, reproducing the results presented in
the paper:

- [Scaling Vision with Sparse Mixture of Experts](https://arxiv.org/abs/2106.05974), by
  Carlos Riquelme, Joan Puigcerver, Basil Mustafa, Maxim Neumann,
  Rodolphe Jenatton, André Susano Pinto, Daniel Keysers, and Neil Houlsby.

We will soon provide a colab analysing one of the models that we have released,
as well as "config" files to train from scratch and fine-tune checkpoints. Stay
tuned.

We also provide checkpoints, a notebook, and a config for Efficient Ensemble of
Experts (E3), presented in the paper:

- [Sparse MoEs meet Efficient Ensembles](https://openreview.net/forum?id=i0ZM36d2qU&noteId=Rtlnlx5PzY), by
  James Urquhart Allingham, Florian Wenzel, Zelda E Mariet, Basil Mustafa,
  Joan Puigcerver, Neil Houlsby, Ghassen Jerfel, Vincent Fortuin,
  Balaji Lakshminarayanan, Jasper Snoek, Dustin Tran, Carlos Riquelme Ruiz,
  and Rodolphe Jenatton.

## Installation
Simply clone this repository.
The file `requirements.txt` contains the requirements that can be installed
via PyPI. However, we recommend installing `jax`, `flax` and `optax`
directly from GitHub, since we use some of the latest features that are not part
of any release yet.

In addition, you also have to clone the
[Vision Transformer](https://github.com/google-research/vision_transformer)
repository, since we use some parts of it.

If you want to use RandAugment to train models (which we recommend if you train
on ImageNet-21k or ILSVRC2012 from scratch), you must also clone the
[Cloud TPU](https://github.com/tensorflow/tpu) repository, and name it
`cloud_tpu`.

## Checkpoints
We release the checkpoints containing the weights of some models that we trained
on ImageNet (either ILSVRC2012 or ImageNet-21k). All checkpoints contain an
index file (with `.index` extension) and one or multiple data files (
with extension `.data-nnnnn-of-NNNNN`, called *shards*). In the following
list, we indicate *only the prefix* of each checkpoint.
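
As an illustration of the naming scheme described above, the following sketch expands a checkpoint prefix into the file names one would expect to find. The helper `checkpoint_files` is hypothetical and not part of this repository. It assumes that shard indices are zero-based and zero-padded to five digits, which is how we read the `.data-nnnnn-of-NNNNN` pattern. The shard count used in the example is a placeholder, not the real number of shards of any released checkpoint.

```python
from typing import List


def checkpoint_files(prefix: str, num_shards: int) -> List[str]:
    """Lists the file names expected for a checkpoint with the given prefix.

    Hypothetical helper for illustration only; it is not part of the
    repository and does not access any storage.
    """
    if num_shards < 1:
        raise ValueError("A checkpoint needs at least one data shard.")
    # Every checkpoint has exactly one index file.
    files = [f"{prefix}.index"]
    for shard in range(num_shards):
        # Assumption: shard indices start at 0 and are padded to five digits,
        # following the `.data-nnnnn-of-NNNNN` pattern.
        files.append(f"{prefix}.data-{shard:05d}-of-{num_shards:05d}")
    return files


if __name__ == "__main__":
    prefix = "gs://vmoe_checkpoints/vmoe_s32_last2_ilsvrc2012_randaug_light1"
    # num_shards=2 is a placeholder; list the bucket to find the real count.
    for name in checkpoint_files(prefix, num_shards=2):
        print(name)
```

Since the actual number of shards differs per checkpoint, listing the files that share the prefix remains the reliable way to discover them.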
We recommend using [gsutil](https://cloud.google.com/storage/docs/gsutil) to
obtain the full list of files, download them, etc.

- V-MoE S/32, 8 experts on the last two odd blocks, trained from scratch on
  ILSVRC2012 with RandAugment for 300 epochs:
  `gs://vmoe_checkpoints/vmoe_s32_last2_ilsvrc2012_randaug_light1`.
  - Fine-tuned on ILSVRC2012 with a resolution of 384 pixels:
    `gs://vmoe_checkpoints/vmoe_s32_last2_ilsvrc2012_randaug_light1_ft_ilsvrc2012`
- V-MoE S/32, 8 experts on the last two odd blocks, trained from scratch on
  ILSVRC2012 with RandAugment for 1000 epochs:
  `gs://vmoe_checkpoints/vmoe_s32_last2_ilsvrc2012_randaug_medium`.
- V-MoE B/16, 8 experts on every odd block, trained from scratch on ImageNet-21k
  with RandAugment: `gs://vmoe_checkpoints/vmoe_b16_imagenet21k_randaug_strong`.
  - Fine-tuned on ILSVRC2012 with a resolution of 384 pixels:
    `gs://vmoe_checkpoints/vmoe_b16_imagenet21k_randaug_strong_ft_ilsvrc2012`
- E3 S/32, 8 experts on the last two odd blocks, with two ensemble
  members (i.e., the 8 experts are partitioned into two groups), trained from
  scratch on ILSVRC2012 with RandAugment for 300 epochs:
  `gs://vmoe_checkpoints/eee_s32_last2_ilsvrc2012`
  - Fine-tuned on CIFAR100:
    `gs://vmoe_checkpoints/eee_s32_last2_ilsvrc2012_ft_cifar100`

## Disclaimers
This is not an officially supported Google product.