https://github.com/murapadev/phinetuning
A repository dedicated to finetuning Phi-2 models using advanced machine learning techniques. This includes training scripts, model evaluation methods, and data processing tools.
- Host: GitHub
- URL: https://github.com/murapadev/phinetuning
- Owner: murapadev
- License: MIT
- Created: 2024-02-12T15:25:40.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-03-03T18:53:48.000Z (4 months ago)
- Last Synced: 2025-03-28T06:33:39.899Z (3 months ago)
- Topics: deep-learning, finetuning, machine-learning, model-training, models, natural-language-processing, nlp, phi2, python, pytorch, transformers
- Language: Python
- Homepage:
- Size: 16.6 KB
- Stars: 3
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Phinetuning
Advanced finetuning toolkit for Phi-2 and other language models, featuring distributed training, hyperparameter optimization, and mixed precision training.
## Table of Contents
- [Features](#features)
- [Installation](#installation)
- [Usage](#usage)
  - [Training](#training)
  - [Generation](#generation)
- [Advanced Configuration](#advanced-configuration)
- [Examples](#examples)
- [Contributing](#contributing)
- [License](#license)

## Features
- Distributed training support
- Mixed precision training (FP16/BF16)
- 4-bit and 8-bit quantization
- Hyperparameter optimization using Optuna
- Data augmentation techniques
- Advanced text generation with various decoding strategies
- Streaming generation support
- Comprehensive logging and error handling
- Support for multiple model architectures

## Installation
```bash
git clone https://github.com/murapadev/phinetuning.git
cd phinetuning
pip install -r requirements.txt
```
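After installing, you can sanity-check that the core dependencies import and that the base model loads (a sketch assuming `requirements.txt` provides `transformers` and `torch`; note the first run downloads several GB of weights):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# First run downloads microsoft/phi-2 (several GB) from the Hugging Face Hub.
tok = AutoTokenizer.from_pretrained("microsoft/phi-2")
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2")
print(model.config.model_type, model.num_parameters())
```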
## Usage

### Training
Basic training command:
```bash
python train.py \
--model_path microsoft/phi-2 \
--dataset_path your_dataset.json \
--output_dir ./results \
--batch_size 1 \
--grad_accum_steps 4
```
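The JSON schema that `--dataset_path` expects is not spelled out here; as an assumption, instruction-tuning data is commonly a flat list of records with a text field, which loads cleanly with the Hugging Face `datasets` library:

```python
from datasets import load_dataset

# Assumed layout: a JSON array of records such as
# [{"text": "Instruct: ...\nOutput: ..."}, ...];
# check train.py for the field names it actually reads.
ds = load_dataset("json", data_files="your_dataset.json", split="train")
print(ds.column_names, ds[0])
```

Enable advanced features: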
```bash
python train.py \
--model_path microsoft/phi-2 \
--dataset_path your_dataset.json \
--output_dir ./results \
--mixed_precision bf16 \
--load_in_4bit \
--distributed \
--tune_hyperparams \
--n_trials 10 \
--augment_data
```

### Generation
Basic text generation:
```bash
python generate.py \
--model_path ./results/final_model \
--input_text "Your prompt here" \
--max_length 200
```

Advanced generation with sampling:
```bash
python generate.py \
--model_path ./results/final_model \
--input_file inputs.json \
--output_file outputs.json \
--temperature 0.7 \
--top_p 0.95 \
--top_k 50 \
--do_sample \
--streaming \
--mixed_precision
```
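For reference, token-by-token output of the kind `--streaming` enables is typically built on `transformers`' `TextStreamer`; a minimal sketch of that pattern (an assumption about generate.py, not its verified internals):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

tok = AutoTokenizer.from_pretrained("./results/final_model")
model = AutoModelForCausalLM.from_pretrained("./results/final_model")

# TextStreamer prints tokens to stdout as they are generated.
streamer = TextStreamer(tok, skip_prompt=True)
inputs = tok("Your prompt here", return_tensors="pt")
model.generate(**inputs, max_new_tokens=200, streamer=streamer)
```

## Advanced Configuration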
### Training Arguments
- `--mixed_precision`: Choose between 'no', 'fp16', or 'bf16'
- `--load_in_4bit`: Enable 4-bit quantization
- `--load_in_8bit`: Enable 8-bit quantization
- `--distributed`: Enable distributed training
- `--tune_hyperparams`: Enable Optuna hyperparameter tuning
- `--augment_data`: Enable data augmentation
- `--max_seq_length`: Maximum sequence length (default: 2048)
- `--learning_rate`: Learning rate (default: 2e-4)
- `--batch_size`: Per device batch size
- `--grad_accum_steps`: Gradient accumulation steps
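The quantization and precision flags above typically map onto standard Hugging Face components; a minimal sketch of how a `--load_in_4bit` plus `bf16` setup is usually wired (an assumption, not necessarily what train.py does internally):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# NF4 4-bit weights with bf16 compute: the usual pairing for
# --load_in_4bit combined with --mixed_precision bf16.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2",
    quantization_config=bnb,
    device_map="auto",  # needs the bitsandbytes package and a CUDA GPU
)
```

### Generation Arguments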
- `--temperature`: Controls sampling randomness; lower values are more deterministic
- `--top_k`: Top-k sampling parameter
- `--top_p`: Nucleus sampling parameter
- `--num_beams`: Number of beams for beam search
- `--repetition_penalty`: Penalize repeated tokens
- `--streaming`: Enable token-by-token generation
- `--mixed_precision`: Enable mixed precision inference
- `--load_in_4bit`: Enable 4-bit quantization for memory efficiency
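Most of these flags line up one-to-one with keyword arguments of `transformers`' `generate()`; a sketch of the likely mapping (assumed, since generate.py's internals are not shown here):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("./results/final_model")
model = AutoModelForCausalLM.from_pretrained("./results/final_model")

inputs = tok("Your prompt here", return_tensors="pt")
out = model.generate(
    **inputs,
    max_length=200,          # --max_length
    do_sample=True,          # --do_sample
    temperature=0.7,         # --temperature
    top_p=0.95,              # --top_p
    top_k=50,                # --top_k
    repetition_penalty=1.1,  # --repetition_penalty
)
print(tok.decode(out[0], skip_special_tokens=True))
```

## Examples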
1. **Distributed training with mixed precision:**
```bash
python train.py \
--model_path microsoft/phi-2 \
--dataset_path your_dataset.json \
--distributed \
--mixed_precision bf16 \
--batch_size 2 \
--grad_accum_steps 4 \
--learning_rate 2e-4 \
--max_steps 10000
```
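As an aside, `--distributed` with `--mixed_precision bf16` most likely sits on top of `torch.distributed` or Hugging Face Accelerate; a toy, runnable sketch of that training-loop shape (the linear model and random data below are stand-ins, not the repo's code):

```python
import torch
from accelerate import Accelerator

accelerator = Accelerator(mixed_precision="bf16")  # mirrors --mixed_precision bf16

# Toy stand-ins so the sketch runs anywhere; train.py would use phi-2 and a
# real dataset. Launch with `accelerate launch script.py` for multi-GPU runs.
model = torch.nn.Linear(8, 1)
opt = torch.optim.AdamW(model.parameters(), lr=2e-4)
loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(torch.randn(32, 8), torch.randn(32, 1)),
    batch_size=4,
)

model, opt, loader = accelerator.prepare(model, opt, loader)
for x, y in loader:
    loss = torch.nn.functional.mse_loss(model(x), y)
    accelerator.backward(loss)
    opt.step()
    opt.zero_grad()
```

2. **Hyperparameter optimization:**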
```bash
python train.py \
--model_path microsoft/phi-2 \
--dataset_path your_dataset.json \
--tune_hyperparams \
--n_trials 20 \
--load_in_4bit
```
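For intuition, `--tune_hyperparams` with `--n_trials 20` suggests an Optuna study along these lines (a sketch; the real search space lives in train.py, and `train_and_eval` below is a hypothetical stand-in):

```python
import optuna

def train_and_eval(learning_rate, grad_accum_steps):
    # Hypothetical stand-in for one fine-tuning run; swap in a call into the
    # repo's training code and return the real validation loss.
    return learning_rate * grad_accum_steps  # dummy value so the sketch runs

def objective(trial):
    lr = trial.suggest_float("learning_rate", 1e-5, 5e-4, log=True)
    accum = trial.suggest_categorical("grad_accum_steps", [2, 4, 8])
    return train_and_eval(learning_rate=lr, grad_accum_steps=accum)

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=20)
print(study.best_params)
```

3. **Batch generation with advanced sampling:**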
```bash
python generate.py \
--model_path ./results/final_model \
--input_file inputs.json \
--output_file outputs.json \
--batch_size 4 \
--temperature 0.7 \
--top_p 0.95 \
--do_sample \
--mixed_precision
```

## Contributing
We welcome contributions! Please:
1. Fork the repository
2. Create a feature branch: `git checkout -b feature-name`
3. Commit your changes: `git commit -am 'Add feature'`
4. Push to the branch: `git push origin feature-name`
5. Submit a pull request

## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.