https://github.com/albedim/nlp-face-tuning
A simple repository for getting started with fine-tuning ML models. You can download pre-trained models, fine-tune them, and test them with a few commands.
- Host: GitHub
- URL: https://github.com/albedim/nlp-face-tuning
- Owner: albedim
- Created: 2025-04-07T23:03:09.000Z (about 1 year ago)
- Default Branch: master
- Last Pushed: 2025-04-30T12:17:46.000Z (about 1 year ago)
- Last Synced: 2025-04-30T13:48:47.470Z (about 1 year ago)
- Topics: ai, fine-tuning, machine-learning, ml-models
- Language: Python
- Size: 317 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: readme.md
# !!! DISCLAIMER !!!
This repository is still under development. If you run into errors, please report them in the issues section; pull requests with fixes are welcome.
The training part is currently very rough: some important training parameters cannot be passed on the command line when the script is launched and are instead hard-coded.
If, while experimenting, you find parameters that should be set for each training run and therefore belong on the command line, please open an issue and a pull request. Thanks.
## Install dependencies:
```bash
pip install -r requirements.txt
```
# !!! This repository launches and trains models on the GPU, so CUDA 12.6 must be installed !!!
Link to install CUDA:
https://developer.nvidia.com/cuda-12-6-0-download-archive
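To confirm that the GPU is visible before training, a quick check along these lines may help. This is a minimal sketch that assumes PyTorch is among the installed requirements (the requirements are not listed here); `cuda_status` is a hypothetical helper name, not part of the repository.

```python
def cuda_status() -> str:
    """Return a one-line summary of whether PyTorch can see a CUDA GPU."""
    try:
        import torch  # assumption: PyTorch is installed via requirements.txt
    except ImportError:
        return "PyTorch is not installed"
    if not torch.cuda.is_available():
        return "PyTorch is installed, but no CUDA device is available"
    # torch.version.cuda is the CUDA version PyTorch was built against
    return f"CUDA {torch.version.cuda} on {torch.cuda.get_device_name(0)}"


if __name__ == "__main__":
    print(cuda_status())
```

If the reported CUDA version is missing or no device is found, it is worth checking the driver and CUDA installation before starting a training run.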
## Install the pre-trained model:
NOTE:
The model can be changed, but this is not recommended because the repository has only been tested with gemma-2-2b-it:
```bash
huggingface-cli download google/gemma-2-2b-it --local-dir ./models/base/gemma-2-2b-it
```
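After the download finishes, a small sanity check of the target folder can catch an interrupted or partial download. The sketch below is illustrative only: `missing_model_files` is a hypothetical helper, and it assumes the weights are stored as `*.safetensors` files next to the standard `config.json`. Calling `missing_model_files("./models/base/gemma-2-2b-it")` should return an empty list when both are present.

```python
from pathlib import Path


def missing_model_files(model_dir: str) -> list[str]:
    """List expected files that are absent from a downloaded model folder."""
    root = Path(model_dir)
    missing = []
    # config.json is the standard Hugging Face model configuration file
    if not (root / "config.json").is_file():
        missing.append("config.json")
    # Assumption: weights are stored as one or more *.safetensors files
    if not any(root.glob("*.safetensors")):
        missing.append("*.safetensors")
    return missing
```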
---------------------------------------------------------------
## Dataset:
Place a plain-text file named "raw_dataset.txt" in the /dataset folder. It should contain example responses, one per line. Then run dataset_generator.py.
This generates a file called "fine_tune_dataset.jsonl" in the /dataset folder, which is the file to use for training.
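The conversion from one response per line to JSONL can be pictured roughly as follows. This is a sketch, not the code in dataset_generator.py: the function name `raw_to_jsonl` and the single-field `{"text": ...}` record are assumptions, and the real script may use a different schema.

```python
import json
from pathlib import Path


def raw_to_jsonl(raw_path: str, out_path: str, field: str = "text") -> int:
    """Write one JSON object per non-empty line of raw_path; return the count."""
    lines = Path(raw_path).read_text(encoding="utf-8").splitlines()
    examples = [line.strip() for line in lines if line.strip()]
    with open(out_path, "w", encoding="utf-8") as out:
        for example in examples:
            # Assumption: a single-field record; the real schema may differ
            out.write(json.dumps({field: example}, ensure_ascii=False) + "\n")
    return len(examples)
```

For instance, `raw_to_jsonl("dataset/raw_dataset.txt", "dataset/sketch_dataset.jsonl")` would write the sketch output next to the real dataset without overwriting it.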
## Training
The model can be fine-tuned using this command:
```bash
python fine_tune.py <model_path> <fine_tuned_model_name> <dataset_name> <epochs>
```
1. < model_path >
Example: to train the default downloaded model (gemma-2-2b-it) after completing ALL the previous steps,
for < model_path > enter: models/base/gemma-2-2b-it
2. < fine_tuned_model_name >
The name to give the model after fine-tuning. It will be saved under the following path:
models/finetuned/*
3. < dataset_name >
The name of the dataset to use, namely the one generated previously (fine_tune_dataset.jsonl)
4. < epochs >
The number of training epochs (int)
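The four parameters above, plus any training settings that are currently hard-coded, could be exposed through `argparse`. The following is only a sketch, assuming the four parameters are passed positionally in the order listed; the optional `--learning-rate` and `--batch-size` flags and their defaults are hypothetical and do not exist in the repository.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a command-line parser mirroring the four documented parameters."""
    parser = argparse.ArgumentParser(description="Fine-tune a local model (sketch)")
    parser.add_argument("model_path", help="e.g. models/base/gemma-2-2b-it")
    parser.add_argument("fine_tuned_model_name", help="saved under models/finetuned/")
    parser.add_argument("dataset_name", help="e.g. fine_tune_dataset.jsonl")
    parser.add_argument("epochs", type=int, help="number of training epochs")
    # Hypothetical optional flags with illustrative defaults, not in the repository
    parser.add_argument("--learning-rate", type=float, default=2e-4)
    parser.add_argument("--batch-size", type=int, default=1)
    return parser
```

Calling `build_parser().parse_args()` inside the training script would then turn each hard-coded setting into a value that can be overridden per run.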
### After each training run, the fine-tuned model's folder (models/finetuned/{MODEL_NAME}) contains a file called benchmarks.png, a graph that shows the loss as a function of training iterations.
## How to run models:
To run a model, use this command:
```bash
python run_model.py <model_path> [max_tokens]
```
1. < model_path >
All the downloaded models are located in models/base/*.
Example: to run a pre-trained model (gemma-2-2b-it) after completing ALL the previous steps,
for < model_path > enter: models/base/gemma-2-2b-it
Example: to run a fine-tuned model after completing ALL the previous steps,
for < model_path > enter: models/finetuned/{MODEL_NAME}
2. < max_tokens >
Optional. The maximum number of tokens the model may generate per response
### The model is launched and answers the prompts in the file "test/questions.txt"
### The responses are saved to one of the following paths:
1. "test/base/%Y-%m-%d_%H-%M-%S.json" if launching a base model.
2. "test/finetuned/%Y-%m-%d_%H-%M-%S.json" if launching a fine-tuned model.
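The naming scheme for the result files can be reproduced with `datetime.strftime`. This is a sketch under stated assumptions: `output_path` is a hypothetical helper, and it decides between the two folders by looking for a `finetuned` component in the model path, which may not be how run_model.py does it.

```python
from datetime import datetime
from pathlib import Path


def output_path(model_path: str, now: datetime) -> Path:
    """Build the results path for a run, based on where the model lives."""
    # Assumption: a model is fine-tuned if its path has a "finetuned" folder
    kind = "finetuned" if "finetuned" in Path(model_path).parts else "base"
    return Path("test") / kind / now.strftime("%Y-%m-%d_%H-%M-%S.json")
```

For example, `output_path("models/base/gemma-2-2b-it", datetime.now())` yields a path under test/base/ stamped with the current date and time.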