# EasyLM
Large language models (LLMs) made easy: EasyLM is a one-stop solution for
pre-training, fine-tuning, evaluating, and serving LLMs in JAX/Flax. EasyLM can
scale up LLM training to hundreds of TPU/GPU accelerators by leveraging
JAX's pjit functionality.

Building on top of Hugging Face's [transformers](https://huggingface.co/docs/transformers/main/en/index)
and [datasets](https://huggingface.co/docs/datasets/index), this repo provides
an easy-to-use and easy-to-customize codebase for training large language models
without the complexity of many other frameworks.
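
For illustration, here is a minimal sketch of the Hugging Face building blocks this sits on. It is not EasyLM's own code; the model name, dataset, and sequence length below are arbitrary choices.

``` python
# Illustrative only: load a tokenizer from `transformers` and a text dataset
# from `datasets`, then tokenize it for language modeling.
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # any HF tokenizer works here
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")

def tokenize(batch):
    # Truncate raw text into fixed-length token sequences.
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])
print(tokenized[0]["input_ids"][:10])
```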

EasyLM is built with JAX/Flax. By leveraging JAX's pjit utility, EasyLM can
train large models that don't fit on a single accelerator by sharding
the model weights and training data across multiple accelerators. Currently,
EasyLM supports multi-TPU/GPU training on a single host as well as multi-host
training on Google Cloud TPU Pods.
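
For a rough sense of the underlying mechanism, the sketch below shows the sharding idea in plain JAX. It is not EasyLM's actual code (EasyLM wraps pjit in its own utilities), and the shapes and mesh axis name are arbitrary.

``` python
# Illustrative only: shard a large weight matrix across devices so no single
# accelerator has to hold it all, and let jit propagate the sharding.
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec

# Build a 1D device mesh with a single model-parallel axis named "mp".
mesh = Mesh(np.array(jax.devices()), ("mp",))

# Place the weight so its last axis is split across the "mp" axis.
weight = jnp.zeros((4096, 4096))
weight = jax.device_put(weight, NamedSharding(mesh, PartitionSpec(None, "mp")))

@jax.jit
def forward(w, x):
    # jit propagates the sharding, so the matmul runs model-parallel.
    return x @ w

out = forward(weight, jnp.ones((8, 4096)))
print(out.sharding)
```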

Currently, the following models are supported:
* [LLaMA](https://arxiv.org/abs/2302.13971)
* [LLaMA 2](https://arxiv.org/abs/2307.09288)
* [LLaMA 3](https://llama.meta.com/llama3/)

## Discord Server
We run an unofficial Discord community (unaffiliated with Google) for discussion related to training LLMs in JAX. [Follow this link to join the Discord server](https://discord.gg/Rf4drG3Bhp). We have dedicated channels for several JAX-based LLM frameworks, including EasyLM, [JaxSeq](https://github.com/Sea-Snell/JAXSeq), [Alpa](https://github.com/alpa-projects/alpa) and [Levanter](https://github.com/stanford-crfm/levanter).

## Models Trained with EasyLM
### OpenLLaMA
OpenLLaMA is our permissively licensed reproduction of LLaMA, which can be used
for commercial purposes. Check out the [project main page here](https://github.com/openlm-research/open_llama).
The OpenLLaMA weights can serve as a drop-in replacement for the LLaMA weights in EasyLM.
Please refer to the [LLaMA documentation](docs/llama.md) for more details.

### Koala
Koala is our new chatbot fine-tuned on top of LLaMA. If you are interested in
our Koala chatbot, you can check out the [blogpost](https://bair.berkeley.edu/blog/2023/04/03/koala/)
and [documentation for running it locally](docs/koala.md).

## Installation
The installation method differs between GPU hosts and Cloud TPU hosts. On either
host type, the first step is to clone the repository from GitHub.

``` shell
git clone https://github.com/young-geng/EasyLM.git
cd EasyLM
export PYTHONPATH="${PWD}:$PYTHONPATH"
```

#### Installing on GPU Host
The GPU environment can be installed via [Anaconda](https://www.anaconda.com/products/distribution).

``` shell
conda env create -f scripts/gpu_environment.yml
conda activate EasyLM
```

#### Installing on Cloud TPU Host
The TPU host VM comes with Python and pip pre-installed. Simply run the following
script to set up the TPU host.

``` shell
./scripts/tpu_vm_setup.sh
```
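
On either host type, a quick way to confirm the environment can see your accelerators is to query JAX for its devices. This is a generic JAX check, not an EasyLM command.

``` python
# Sanity check after installation: list the accelerators JAX can see.
import jax

print(jax.devices())       # should show GPU or TPU devices rather than only CPU
print(jax.device_count())  # number of accelerators visible to this host
```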

## [Documentation](docs/README.md)
The EasyLM documentation can be found in the [docs](docs/) directory.

## Reference
If you find EasyLM useful in your research or applications, please cite it using the following BibTeX entry:
```
@software{geng2023easylm,
  author = {Geng, Xinyang},
  title = {EasyLM: A Simple And Scalable Training Framework for Large Language Models},
  month = mar,
  year = 2023,
  url = {https://github.com/young-geng/EasyLM}
}
```

## Credits
* The LLaMA implementation is from [JAX_llama](https://github.com/Sea-Snell/JAX_llama)
* The JAX/Flax GPT-J and RoBERTa implementations are from [transformers](https://huggingface.co/docs/transformers/main/en/index)
* Most of the JAX utilities are from [mlxu](https://github.com/young-geng/mlxu)
* The codebase is heavily inspired by [JAXSeq](https://github.com/Sea-Snell/JAXSeq)