https://github.com/freedomintelligence/try_phoenix2
Phoenix2 code in dev
- Host: GitHub
- URL: https://github.com/freedomintelligence/try_phoenix2
- Owner: FreedomIntelligence
- Created: 2023-10-30T13:18:18.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-10-30T14:25:58.000Z (over 2 years ago)
- Last Synced: 2025-03-11T22:19:45.731Z (about 1 year ago)
- Language: Python
- Size: 67.4 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# try_Phoenix2
Phoenix2 code in dev
## Dependencies
- module load cuda11.8/toolkit/11.8.0
- pip3 install -U xformers --index-url https://download.pytorch.org/whl/cu118
- pip install packaging
- pip uninstall -y ninja && pip install ninja
- pip install flash-attn --no-build-isolation
- pip install -r requirements.txt tokenizers sentencepiece
## Structure
- TinyLlama: the original pretraining code
- TinyLlama_deepspeed_check1: uses the DeepSpeed strategy
  - pip install deepspeed
  - Documentation:
    - Code: https://lightning.ai/docs/pytorch/stable/advanced/model_parallel/deepspeed.html
    - Parameters: https://lightning.ai/docs/pytorch/stable/api/lightning.pytorch.strategies.DeepSpeedStrategy.html#lightning.pytorch.strategies.DeepSpeedStrategy
  - Changes:
    - Switched the strategy: pretrain.py lines 10, 95
    - To enable checkpointing: pretrain.py line 141
    - scripts/convert_zero_checkpoint.py: ZeRO-3 checkpoint conversion
- TinyLlama_deepspeed_check2: uses the DeepSpeed strategy
  - Changes:
    - To enable checkpointing: lit-gpt/model/gpt.forward lines 69-119
- TinyLlama_collosal:
  - Changes:
    - Switched the strategy: pretrain.py lines 10, 95
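
The strategy switch listed above can be driven by a DeepSpeed config. The following is a minimal sketch, not the repository's actual settings: the helper name `build_deepspeed_config` and the chosen values are assumptions for illustration. Lightning's `DeepSpeedStrategy` accepts such a dict (or a path to a JSON file) through its `config` argument.

```python
import json


def build_deepspeed_config(zero_stage: int, micro_batch_size: int = 1) -> dict:
    """Return a minimal DeepSpeed config dict for the given ZeRO stage.

    Hypothetical helper: the values below are illustrative assumptions,
    not the settings used in pretrain.py.
    """
    if zero_stage not in (1, 2, 3):
        raise ValueError("zero_stage must be 1, 2, or 3")
    return {
        "train_micro_batch_size_per_gpu": micro_batch_size,
        "bf16": {"enabled": True},  # assumed bf16 mixed precision
        "zero_optimization": {"stage": zero_stage},
    }


config = build_deepspeed_config(zero_stage=3)
print(json.dumps(config, indent=2))
```

The resulting dict could then be passed as `DeepSpeedStrategy(config=config)` where the training script constructs its strategy.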
## Usage
```
sbatch multinode_pretrain.sh
```
### Data
```
python scripts/prepare_starcoder.py --source_path /path/to/starcoderdata/ --tokenizer_path data/llama --destination_path data/slim_star_combined --split train --percentage 1.0
python scripts/prepare_slimpajama.py --source_path /path/to/SlimPajama --tokenizer_path data/llama --destination_path data/slim_star_combined --split validation --percentage 1.0
python scripts/prepare_slimpajama.py --source_path /path/to/SlimPajama --tokenizer_path data/llama --destination_path data/slim_star_combined --split train --percentage 1.0
```
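
The three preparation commands share the same tokenizer and destination paths and differ only in script, source path, and split. As a sketch, they can be generated from one table; the names `JOBS` and `build_commands` are hypothetical, and the placeholder paths are kept exactly as in the commands above.

```python
import shlex

TOKENIZER = "data/llama"
DEST = "data/slim_star_combined"

# (script, source path, split) for each preparation step.
JOBS = [
    ("scripts/prepare_starcoder.py", "/path/to/starcoderdata/", "train"),
    ("scripts/prepare_slimpajama.py", "/path/to/SlimPajama", "validation"),
    ("scripts/prepare_slimpajama.py", "/path/to/SlimPajama", "train"),
]


def build_commands(percentage: float = 1.0) -> list:
    """Build the argument list for each data preparation command."""
    commands = []
    for script, source, split in JOBS:
        commands.append([
            "python", script,
            "--source_path", source,
            "--tokenizer_path", TOKENIZER,
            "--destination_path", DEST,
            "--split", split,
            "--percentage", str(percentage),
        ])
    return commands


# Dry run: print the commands instead of executing them.
for cmd in build_commands():
    print(shlex.join(cmd))
```

To actually run a step, each list can be handed to `subprocess.run(cmd, check=True)`.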
### Multi-node training
```
sbatch multinode_pretrain.sh
```
## Compute indicators for choosing a strategy
```
python pre_train_math.py
```
```
-----------Model_Size and GPU_Mem-----------
+--------------+------------------------+----------------------+
| Model size/B | ratio(NHIDDEN/NLAYERS) | Usable_mem_per_GPU/G |
+--------------+------------------------+----------------------+
| 1.18 | 93 | 79 |
+--------------+------------------------+----------------------+
-----------With Mixed Precision(bp16)-----------
-----Memory_reference_indicator(Batch_size=1)-----
+-------------------------+----------+------------------+-------------------+
| Module | Size/B | Eval_memory/GB | Train_momery/GB |
+=========================+==========+==================+===================+
| emb | 0.07 | 0.14 | 1.12 |
+-------------------------+----------+------------------+-------------------+
| one_layer | 0.05 | 0.1 | 0.81 |
+-------------------------+----------+------------------+-------------------+
| input | 0 | 0.01 | 0.01 |
+-------------------------+----------+------------------+-------------------+
| activation(batchsize=1) | 1.77 | 3.54 | 3.54 |
+-------------------------+----------+------------------+-------------------+
| ALL | 2.95 | 5.91 | 22.39 |
+-------------------------+----------+------------------+-------------------+
-----Strategy_reference_indicator(Batch_size=1)-----
+------------+--------------------------+---------------------------+
| Strategy | Eval_memory_per_gpu/GB | Train_momery_per_gpu/GB |
+============+==========================+===========================+
| Zero1 | 2.35 | 8.44 |
+------------+--------------------------+---------------------------+
| Zero2 | 2.35 | 6.11 |
+------------+--------------------------+---------------------------+
| Zero3 | 0.03 | 3.79 |
+------------+--------------------------+---------------------------+
---------------------Strategy_Recommand---------------------
Recommand_Strategy:
+--------+------+------+------+---------------------------+-----------------+
| Zero | DP | TP | PP | Train_momery_per_gpu/GB | Trianing_days |
+========+======+======+======+===========================+=================+
| Zero1 | 80 | 1 | 1 | 8.44 | 0.01 |
+--------+------+------+------+---------------------------+-----------------+
Please find the best batch_size by adjusting BATCH_SIZE
```
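
The per-GPU training figures in the strategy table are consistent with the usual ZeRO accounting for mixed-precision Adam: 2 bytes per parameter for weights, 2 for gradients, and 12 for optimizer states, with each ZeRO stage partitioning one more of these across the data-parallel group. The sketch below is an assumption about that accounting, not the code of `pre_train_math.py`; the function name `model_state_gb` is hypothetical.

```python
BYTES_WEIGHTS = 2   # bf16 parameters
BYTES_GRADS = 2     # bf16 gradients
BYTES_OPTIM = 12    # fp32 master weights + Adam momentum + Adam variance


def model_state_gb(params_billion: float, dp: int, stage: int) -> float:
    """Per-GPU memory (GB) for model states under a ZeRO stage.

    stage 0 means no partitioning; billions of parameters times
    bytes per parameter gives GB (1 GB = 1e9 bytes).
    """
    weights, grads, optim = BYTES_WEIGHTS, BYTES_GRADS, BYTES_OPTIM
    if stage >= 1:
        optim /= dp    # ZeRO-1 partitions optimizer states
    if stage >= 2:
        grads /= dp    # ZeRO-2 also partitions gradients
    if stage >= 3:
        weights /= dp  # ZeRO-3 also partitions parameters
    return params_billion * (weights + grads + optim)


for stage in (1, 2, 3):
    print(f"Zero{stage}: {model_state_gb(1.18, dp=80, stage=stage):.2f} GB")
```

With 1.18B parameters and DP=80, adding the activation (3.54 GB) and input (0.01 GB) rows from the table to these model-state values lands close to the `Train_momery_per_gpu/GB` column; small differences are expected because the model size is rounded.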
## To-do list
Reimplement the code with DeepSpeed, Colossal-AI, and Megatron-DeepSpeed, respectively.