https://github.com/freedomintelligence/finetune_chatgpt
The example for finetuning chatgpt.
https://github.com/freedomintelligence/finetune_chatgpt
Last synced: about 1 year ago
JSON representation
The example for finetuning chatgpt.
- Host: GitHub
- URL: https://github.com/freedomintelligence/finetune_chatgpt
- Owner: FreedomIntelligence
- License: apache-2.0
- Created: 2023-08-24T15:12:29.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2023-08-28T07:01:55.000Z (almost 3 years ago)
- Last Synced: 2025-03-30T19:22:48.273Z (about 1 year ago)
- Language: Python
- Size: 131 KB
- Stars: 27
- Watchers: 7
- Forks: 6
- Open Issues: 0
-
Metadata Files:
- Readme: Readme.md
- License: LICENSE
Awesome Lists containing this project
README
# ChatGPT Fine-Tuning Guide
## Introduction
This repository provides an example on how to fine-tune the ChatGPT model for a specific task.
It draws inspiration from the [GrammarGPT project](https://github.com/FreedomIntelligence/GrammarGPT).
You're free to modify the input files as per your requirements.
## Steps to Fine-Tune Your Model
### 1. Setup
Ensure you have the latest OpenAI package installed (version `openai>=0.27.9`).
```bash
pip install -r requirements.txt
```
### 2. Data Preparation
Prepare your custom training data (e.g., `train_data.jsonl`) and validation data (e.g., `dev_data.jsonl`).
**Note:** The size of the input file is currently capped at 50MB. For more details, refer to the [OpenAI documentation on dataset preparation](https://platform.openai.com/docs/guides/fine-tuning/preparing-your-dataset).
### 3. Model Training
To train your model, use the `train_model.py` script:
```bash
python train_model.py
```
**Important:** In `train_model.py`, ensure that you first upload the custom data to OpenAI. Once the upload is successful, the training will commence. The training process might take several hours, so please be patient.
### 4. Model Testing
To test your fine-tuned model, use the `test_model.py` script:
```bash
python test_model.py
```
**Note:** You need to get the model id before the test.
## Experimental Results
| Model | # Param. | Data| Word-level (P/R/F) | Char-level(P/R/F) |
|:--------------------------|:---------|:----|:-------------------|:--------------------------|
| S2S_BART | 375M | 1061| 21.08/10.54/17.57 | 22.09/10.62/18.16 |
| GrammarGPT | 7B | 1061| **42.42**/16.87/32.56 | **46.67**/18.58/**35.84** |
| Fine-tuning GPT-3.5 Turbo | - | 1061| 36.16/**34.75**/**35.87** | 36.17/**33.69**/35.65 |
## Additional Resources
- Official OpenAI API documentation on fine-tuning: [API Reference](https://platform.openai.com/docs/api-reference/fine-tuning/create)
- OpenAI Guide on creating a fine-tuned model: [Python Guide](https://platform.openai.com/docs/guides/fine-tuning/create-a-fine-tuned-model)
- Video tutorial on fine-tuning ChatGPT: [YouTube Tutorial](https://www.youtube.com/watch?v=_yzmQbez7gk)
## Acknowledgments
Special thanks to the following repository for its invaluable insights:
[OpenAI Fine-tuning Guide](https://github.com/horosin/open-finetuning/tree/main)