https://github.com/moskomule/simple_transformers
Simple transformer implementations that I can understand
- Host: GitHub
- URL: https://github.com/moskomule/simple_transformers
- Owner: moskomule
- Created: 2021-02-17T13:35:25.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2021-12-28T05:15:48.000Z (almost 3 years ago)
- Last Synced: 2024-07-04T00:59:57.696Z (4 months ago)
- Topics: transformers
- Language: Python
- Homepage:
- Size: 119 KB
- Stars: 20
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Simple Transformers
Simple transformer implementations that I can understand.
## Requirements
```commandline
conda create -n transformer python=3.9
conda activate transformer
conda install -c pytorch -c conda-forge pytorch torchvision cudatoolkit=11.1
pip install -U homura-core chika rich
# for NLP also install
pip install -U datasets tokenizers
# To use checkpointing
pip install -U fairscale
# To accelerate (probably only slightly),
pip install -U opt_einsum
```

## Examples
### Language Modeling
#### GPT
Train GPT-like models on wikitext or GigaText. Three Transformer block variants are currently available for comparison:
improved pre-LN (`ipre_ln`), pre-LN (`pre_ln`), and post-LN (`post_ln`).

```commandline
python gpt.py [--model.block {ipre_ln, pre_ln, post_ln}] [--amp]
```

#### BERT

Work in progress.
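
#### Pre-LN vs. post-LN (sketch)

For the GPT block options above, the difference between `pre_ln` and `post_ln` is only where layer normalization sits relative to the residual connection. The following is a minimal sketch of that ordering, not the code in this repository: it uses plain Python lists instead of PyTorch so that it runs without dependencies, the normalization has no learned scale or bias, and `toy_sublayer` is a hypothetical stand-in for attention or an MLP. The improved pre-LN variant is not covered here because this README does not define it.

```python
import math
from typing import Callable, List

Vector = List[float]
Sublayer = Callable[[Vector], Vector]


def layer_norm(x: Vector, eps: float = 1e-5) -> Vector:
    """Normalize a vector to zero mean and unit variance (no learned scale or bias)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def post_ln_step(x: Vector, sublayer: Sublayer) -> Vector:
    """Post-LN: add the residual first, then normalize the sum."""
    return layer_norm([a + b for a, b in zip(x, sublayer(x))])


def pre_ln_step(x: Vector, sublayer: Sublayer) -> Vector:
    """Pre-LN: normalize only the sublayer input; the residual path is untouched."""
    return [a + b for a, b in zip(x, sublayer(layer_norm(x)))]


def toy_sublayer(x: Vector) -> Vector:
    """Hypothetical stand-in for attention or an MLP: halves its input."""
    return [0.5 * v for v in x]


x = [1.0, 2.0, 3.0, 4.0]
post = post_ln_step(x, toy_sublayer)
pre = pre_ln_step(x, toy_sublayer)
```

In this toy setup the post-LN output is itself normalized, while the pre-LN output keeps the mean of the input because the identity path bypasses the normalization.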
### Image Recognition
Train ImageNet classification models.
Currently, ViT and CaiT are implemented.
```commandline
# single process training
python vit.py [--amp] [--model.ema]
# for multi-process training,
python -m torch.distributed.launch --nproc_per_node=2 vit.py ...
```

## Acknowledgement
For this project, I learned a lot from Andrej's [minGPT](https://github.com/karpathy/mingpt),
Ross's [timm](https://github.com/rwightman/pytorch-image-models), and
FAIR's [ClassyVision](https://github.com/facebookresearch/ClassyVision).