Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/ethanhe42/gpt2-tpu
pretraining gpt2 on TPUs
- Host: GitHub
- URL: https://github.com/ethanhe42/gpt2-tpu
- Owner: ethanhe42
- Created: 2023-09-13T05:39:24.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2023-09-18T05:45:22.000Z (about 1 year ago)
- Last Synced: 2024-04-14T17:45:06.022Z (7 months ago)
- Size: 3.91 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# gpt2-tpu
Pretraining GPT-2 on TPUs. Verified on a TPU v3-8.
![image](https://github.com/yihui-he/gpt2-tpu/assets/10027339/2859929b-1340-4190-92dc-3c183f7a8b81)
[full log](https://wandb.ai/yihuihe/tpu-research/runs/8a33wa61?workspace=user-yihuihe)
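
Before launching, it can help to confirm that torch_xla actually sees the TPU. A minimal check, assuming torch_xla is already installed on the TPU VM (this snippet is not part of the repo):

```python
# Sanity check: confirm torch_xla can reach the TPU before training.
# Assumes torch_xla is installed on the TPU VM (not part of this repo).
import torch_xla.core.xla_model as xm

device = xm.xla_device()  # e.g. "xla:0" once PJRT_DEVICE=TPU is set
print("XLA device:", device)
```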
```bash
pip3 install -r requirements.txt
# optional
python3 -m wandb login
# choose single machine, TPU, 8 cores
python3 -m accelerate config
git clone https://github.com/huggingface/transformers
cd transformers/examples/pytorch/language-modeling/
export ALLOW_MULTIPLE_LIBTPU_LOAD=1
export PJRT_DEVICE=TPU
export WANDB_PROJECT='tpu-research'
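# ALLOW_MULTIPLE_LIBTPU_LOAD=1 lets more than one process load libtpu,
# PJRT_DEVICE=TPU selects torch_xla's PJRT runtime, and WANDB_PROJECT
# names the Weights & Biases project that the run logs to.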
accelerate launch run_clm.py \
--model_name_or_path gpt2 \
--dataset_name Skylion007/openwebtext \
--learning_rate 6e-4 \
--adam_beta1 0.9 \
--adam_beta2 0.95 \
--weight_decay 1e-1 \
--warmup_steps 2000 \
--per_device_train_batch_size 14 \
--per_device_eval_batch_size 14 \
--do_train \
--do_eval \
--preprocessing_num_workers 96 \
--save_steps 0.01 \
--save_total_limit 1 \
--evaluation_strategy steps \
--eval_steps 0.01 \
--bf16 True \
--output_dir /tmp/test-clm
```

For a quick smoke test, the much smaller `wikitext-2-raw-v1` dataset can be swapped in for OpenWebText (e.g. `--dataset_name wikitext --dataset_config_name wikitext-2-raw-v1`).
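
These flags follow the GPT-2/GPT-3-style recipe: AdamW with betas (0.9, 0.95), weight decay 0.1, and 2000 warmup steps. With 8 TPU cores and `--per_device_train_batch_size 14`, the global batch is 8 × 14 = 112 sequences, and in recent `transformers` versions `--save_steps`/`--eval_steps` values below 1 are read as fractions of total training steps. Roughly, the Trainer's default optimizer and linear schedule reduce to the sketch below (illustrative only; `model` and `total_steps` are stand-ins, not objects from run_clm.py):

```python
# Illustrative sketch of the optimizer/schedule implied by the flags above;
# run_clm.py builds these internally via the Trainer. `model` and
# `total_steps` are hypothetical stand-ins.
import torch
from transformers import get_linear_schedule_with_warmup

model = torch.nn.Linear(768, 768)  # stand-in for the GPT-2 model
total_steps = 100_000              # stand-in; the Trainer derives this from the dataset

optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=6e-4,                       # --learning_rate
    betas=(0.9, 0.95),             # --adam_beta1 / --adam_beta2
    weight_decay=0.1,              # --weight_decay
)
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=2000,         # --warmup_steps
    num_training_steps=total_steps,
)
```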