https://github.com/coderpat/croissant-llm-training
Repository containing the code for training the CroissantLLM
- Host: GitHub
- URL: https://github.com/coderpat/croissant-llm-training
- Owner: CoderPat
- Created: 2024-02-01T15:45:34.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-02-02T13:00:16.000Z (about 2 years ago)
- Last Synced: 2024-02-02T14:24:41.453Z (about 2 years ago)
- Language: Python
- Size: 14.6 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# CroissantLLM: Training Repository
## Installation Instructions
As a prerequisite, make sure you have [ducttape](https://github.com/CoderPat/ducttape) and [(mini)conda](https://docs.conda.io/en/latest/miniconda.html) installed.
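Before going further, it can help to confirm that the required tools are actually reachable from your shell. The following is a minimal sketch, not part of the repository: the `check_tool` helper is a hypothetical name, and `git` is included only because the clone step below needs it.

```bash
# Hypothetical helper (not part of the repository):
# returns 0 if the given command is on PATH, non-zero otherwise.
check_tool() {
    command -v "$1" >/dev/null 2>&1
}

missing=0
for tool in git conda ducttape; do
    if check_tool "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
        missing=1
    fi
done

if [ "$missing" -ne 0 ]; then
    echo "Install the missing tools before continuing." >&2
fi
```

The loop only reports what it finds and does not abort, so it is safe to paste into an interactive session.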
First, clone this repository and its submodules:
```bash
git clone --recurse-submodules git@github.com:CoderPat/croissant-llm-training.git
```
Then, to create a new conda environment with all the necessary dependencies, run the following command:
```bash
export CONDA_HOME="/path/to/(mini)conda3"
bash setup/conda.sh
```
## Running pipelines
The core experimentation and training pipelines rely on ducttape, and are defined in `main.tape`.
Configuration files for different models and datasets are defined in `configs/`.
Start by creating a configuration file with user-dependent variables (like the output folder) in the `configs/*_uservars.conf` file associated with your chosen `.tconf`. E.g., for the `configs/croissant_llm.tconf` configuration, create a `configs/croissant_llm_uservars.conf` file with the following content:
```
global {
ducttape_output=/path/to/output
repo=/path/to/croissant-llm-training
(...)
# use a simple shell submitter
# we are forced to explicitly set the submitter parameters
# to make it compatible with other submitters (ie the slurm submitter)
submitter=shell
dump_account=none
dump_partition=none
(...)
}
```
We provide a template of the user variables we use on Jean Zay.
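If you prefer to generate the user variables file from the shell, a heredoc is one way to do it. The sketch below is an illustration under stated assumptions: the `OUTPUT_DIR`, `REPO_DIR`, and `CONF_FILE` variable names are hypothetical, the paths are placeholders, and only the variables shown in the example above are written. Any variables elided with `(...)` in that example still have to be added by hand.

```bash
# Hypothetical variable names and placeholder paths: replace with your own.
OUTPUT_DIR="${OUTPUT_DIR:-/path/to/output}"
REPO_DIR="${REPO_DIR:-/path/to/croissant-llm-training}"
CONF_FILE="${CONF_FILE:-configs/croissant_llm_uservars.conf}"

mkdir -p "$(dirname "$CONF_FILE")"

# Write only the variables shown in the README example.
cat > "$CONF_FILE" <<EOF
global {
    ducttape_output=${OUTPUT_DIR}
    repo=${REPO_DIR}

    # add the remaining variables required by your .tconf here

    # use a simple shell submitter
    submitter=shell
    dump_account=none
    dump_partition=none
}
EOF

echo "Wrote $CONF_FILE"
```

Setting the three variables in the environment before running the snippet overrides the placeholder defaults.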
Then, you can run one of the pipelines specified in `main.tape` by running ducttape with the corresponding configuration file:
```bash
conda activate towerllm-env
ducttape main.tape -C configs/croissant_llm.tconf
```
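Since each configuration is paired with a `*_uservars.conf` file, a small wrapper can catch a missing file before ducttape starts. This is a sketch only: `run_pipeline` is a hypothetical helper, and it assumes the user variables file sits next to the configuration and follows the naming pattern described above.

```bash
# Hypothetical helper (not part of the repository).
# Usage: run_pipeline <path/to/config>
run_pipeline() {
    local config="$1"
    # Drop the extension and append the suffix used for user variables.
    local uservars="${config%.*}_uservars.conf"

    if [ ! -f "$uservars" ]; then
        echo "Missing user variables file: $uservars" >&2
        return 1
    fi

    # Same invocation as above, with the config path passed through.
    ducttape main.tape -C "$config"
}
```

Pass it the same configuration path you would give to `-C`. The function returns a non-zero status without calling ducttape when the expected user variables file is absent.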