https://github.com/rkv0id/fair-distribert
BERT distillation and quantization in a distributed setting using the FairScale library.
- Host: GitHub
- URL: https://github.com/rkv0id/fair-distribert
- Owner: rkv0id
- Created: 2022-03-17T23:33:27.000Z (about 3 years ago)
- Default Branch: main
- Last Pushed: 2022-03-18T23:08:39.000Z (about 3 years ago)
- Last Synced: 2025-01-03T12:30:04.049Z (5 months ago)
- Language: Python
- Size: 9.77 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# fair-distribert
BERT distillation and quantization in a distributed setting using the [FairScale](https://fairscale.readthedocs.io/) library.
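
The distillation side follows the standard soft-target recipe: the student is trained against a temperature-softened copy of the teacher's logits, blended with the ordinary cross-entropy loss on the gold labels. The repository's exact loss is not reproduced here; the following is a minimal sketch of that objective, with the temperature `T` and mixing weight `alpha` as assumed hyperparameters:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: KL divergence between temperature-scaled distributions,
    # rescaled by T^2 so gradient magnitudes stay comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: plain cross-entropy against the gold MRPC labels.
    hard = F.cross_entropy(student_logits, labels)
    # alpha balances imitating the teacher against fitting the labels.
    return alpha * soft + (1.0 - alpha) * hard
```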
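For the quantization side, a common post-training route for BERT-style models is PyTorch's dynamic quantization, which converts the `Linear` layers to int8 weights at inference time. A minimal sketch, assuming post-training dynamic quantization of a fine-tuned student (the repository may use a different scheme):

```python
import torch
from transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)
model.eval()

# Quantize the Linear layers' weights to int8; activations stay in float
# and are quantized on the fly, so no calibration pass is required.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
```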
The repository contains several versions of the training code for the [MRPC](https://paperswithcode.com/dataset/mrpc) paraphrase task from the [GLUE](https://gluebenchmark.com) benchmark, each using a different level of parallelism built from [PyTorch](https://pytorch.org/docs/stable/distributed.html) and FairScale constructs (minimal sketches of the completed variants follow the list):
- [x] [Distributed Data Parallel](https://pytorch.org/tutorials/intermediate/ddp_tutorial.html)
- [x] [Sharded Data Parallel](https://fairscale.readthedocs.io/en/stable/api/nn/sharded_ddp.html)
- [x] [Fully Sharded Data Parallel](https://fairscale.readthedocs.io/en/stable/api/nn/fsdp.html)
- [ ] [Model Off-loading](https://fairscale.readthedocs.io/en/stable/api/experimental/nn/offload_model.html)
- [ ] [Activation Checkpointing](https://fairscale.readthedocs.io/en/stable/api/nn/checkpoint/checkpoint_activations.html)
- [ ] [Pipeline Parallelism](https://fairscale.readthedocs.io/en/stable/api/nn/pipe.html)
- [ ] [SlowMo Distributed Data Parallel](https://fairscale.readthedocs.io/en/stable/api/experimental/nn/slowmo_ddp.html)
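
For reference, a minimal sketch of the plain DDP variant, assuming one process per GPU launched with `torchrun` (the model and hyperparameters here are placeholders, not the repository's exact setup):

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from transformers import BertForSequenceClassification

def setup_model():
    # torchrun populates RANK, LOCAL_RANK and WORLD_SIZE for each process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = BertForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2  # MRPC is a binary paraphrase task
    ).cuda(local_rank)
    # Every rank keeps a full replica; DDP all-reduces gradients in backward().
    return DDP(model, device_ids=[local_rank])
```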
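The Sharded Data Parallel variant pairs FairScale's `OSS` optimizer wrapper, which shards optimizer state across ranks (ZeRO stage 2 style), with `ShardedDataParallel`. A sketch under the same assumptions:

```python
import torch
from fairscale.nn.data_parallel import ShardedDataParallel as ShardedDDP
from fairscale.optim.oss import OSS

# model: an already-built BERT on the local GPU, process group initialized.
# OSS shards AdamW's optimizer state across ranks instead of replicating it.
optimizer = OSS(params=model.parameters(), optim=torch.optim.AdamW, lr=2e-5)
# ShardedDDP reduces each gradient only to the rank that owns its shard.
model = ShardedDDP(model, optimizer)
```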
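Fully Sharded Data Parallel goes one step further and shards the parameters and gradients themselves, gathering full weights only around each layer's forward and backward pass. A minimal FairScale sketch (`mixed_precision` and `reshard_after_forward` are illustrative flags, not necessarily the repository's configuration):

```python
from fairscale.nn.data_parallel import FullyShardedDataParallel as FSDP

# Parameters, gradients and optimizer state are all sharded across ranks;
# full weights are materialized only transiently during forward/backward.
model = FSDP(model, mixed_precision=True, reshard_after_forward=True)
```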