https://github.com/coderpat/croissant-llm-training

Repository containing the code for training the CroissantLLM

Host: GitHub
URL: https://github.com/coderpat/croissant-llm-training
Owner: CoderPat
Created: 2024-02-01T15:45:34.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2024-02-02T13:00:16.000Z (about 2 years ago)
Last Synced: 2024-02-02T14:24:41.453Z (about 2 years ago)
Language: Python
Size: 14.6 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# CroissantLLM: Training Repository

## Installation Instructions

As a prerequisite, make sure you have [ducttape](https://github.com/CoderPat/ducttape) and [(mini)conda](https://docs.conda.io/en/latest/miniconda.html) installed.
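
A quick way to confirm both prerequisites are reachable is to look them up on `PATH`. This is a minimal sketch, not part of the repository: the helper name `missing_tools` is hypothetical, and it assumes both tools are installed as the `ducttape` and `conda` executables.

```bash
# Hypothetical helper (not part of this repository).
# Prints the name of every given tool that is not found on PATH, one per line.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Assumption: the prerequisites are exposed as `ducttape` and `conda`.
missing="$(missing_tools ducttape conda)"
if [ -n "$missing" ]; then
  echo "Missing prerequisites:" $missing >&2
fi
```

If nothing is printed, both commands were found and you can continue with the steps below.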

First, clone this repository and its submodules:

```bash
git clone --recurse-submodules git@github.com:CoderPat/croissant-llm-training.git
```

Then, to create a new conda environment with all the necessary dependencies, run the following commands:

```bash
export CONDA_HOME="/path/to/(mini)conda3"
bash setup/conda.sh
```

## Running pipelines

The core experimentation and training pipelines rely on ducttape, and are defined in `main.tape`.
Configuration files for different models and datasets are defined in `configs/`.

Start by creating a configuration file for user-dependent variables (such as the output folder), named `configs/*_uservars.conf` to match your chosen `.tconf`. For example, for the `configs/croissant_llm.tconf` configuration, create a `configs/croissant_llm_uservars.conf` file with the following content:
```
global {
ducttape_output=/path/to/output
repo=/path/to/croissant-llm-training

(...)
# use a simple shell submitter
# we are forced to explicitly set the submitter parameters
# to make it compatible with other submitters (ie the slurm submitter)
submitter=shell
dump_account=none
dump_partition=none
(...)
}
```
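
The naming convention above (replace the `.tconf` suffix with `_uservars.conf`) can be sketched in shell. The helper name `uservars_path` is hypothetical and not part of the repository; the snippet only derives the expected file name and warns if that file does not exist yet.

```bash
# Hypothetical helper (not part of this repository).
# Maps a .tconf path to the matching user-variables file name.
uservars_path() {
  printf '%s_uservars.conf\n' "${1%.tconf}"
}

tconf="configs/croissant_llm.tconf"
uservars="$(uservars_path "$tconf")"

# Warn (without failing) if the user-variables file has not been created yet.
if [ ! -f "$uservars" ]; then
  echo "Create $uservars before running ducttape" >&2
fi
```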

We provide a template of the user variables we use on Jean Zay.

Then you can run one of the pipelines specified in `main.tape` by running ducttape with the corresponding configuration file:

```bash
conda activate towerllm-env
ducttape main.tape -C configs/croissant_llm.tconf
```
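
Before launching a long run, it can help to check that the files this section mentions are in place. The following is a sketch under stated assumptions: `require_files` is a hypothetical helper, the script is run from the repository root, and the configuration is the `configs/croissant_llm.tconf` file named earlier together with its `_uservars.conf` companion.

```bash
# Hypothetical helper (not part of this repository).
# Returns 0 only if every given path is a readable regular file;
# reports each missing path on stderr.
require_files() {
  local path status=0
  for path in "$@"; do
    if [ ! -f "$path" ] || [ ! -r "$path" ]; then
      echo "missing: $path" >&2
      status=1
    fi
  done
  return "$status"
}

# Assumption: run from the repository root, with the file names used in this section.
if require_files main.tape configs/croissant_llm.tconf configs/croissant_llm_uservars.conf; then
  ducttape main.tape -C configs/croissant_llm.tconf
fi
```

The check only confirms that the files exist; it does not validate their contents.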
