Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/young-geng/EasyLM
Large language models (LLMs) made easy: EasyLM is a one-stop solution for pre-training, finetuning, evaluating, and serving LLMs in JAX/Flax.
- Host: GitHub
- URL: https://github.com/young-geng/EasyLM
- Owner: young-geng
- License: apache-2.0
- Created: 2022-11-22T12:55:20.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-08-13T05:55:05.000Z (4 months ago)
- Last Synced: 2024-10-18T22:07:02.574Z (about 2 months ago)
- Topics: chatbot, deep-learning, flax, jax, language-model, large-language-models, llama, natural-language-processing, transformer
- Language: Python
- Homepage:
- Size: 378 KB
- Stars: 2,394
- Watchers: 43
- Forks: 253
- Open Issues: 29
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- Awesome-LLM-Productization - EasyLM - EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Flax. (Note from the repo: here comes the details of [Jax](https://github.com/google/jax) and [Flax](https://github.com/google/flax)) (Models and Tools / Full LLM Lifecycle)
- awesome-open-chatgpt - young-geng/EasyLM
- awesome-llm - Code
- awesome-genai - EasyLM - Jax Based (Training and Fine Tuning Libraries)
- StarryDivineSky - young-geng/EasyLM
- awesome-jax - EasyLM - LLMs made easy: Pre-training, finetuning, evaluating and serving LLMs in JAX/Flax. (Libraries)
README
# EasyLM
Large language models (LLMs) made easy: EasyLM is a one-stop solution for
pre-training, finetuning, evaluating, and serving LLMs in JAX/Flax. EasyLM can
scale up LLM training to hundreds of TPU/GPU accelerators by leveraging
JAX's pjit functionality.

Building on top of Hugging Face's [transformers](https://huggingface.co/docs/transformers/main/en/index)
and [datasets](https://huggingface.co/docs/datasets/index), this repo provides
an easy-to-use and easy-to-customize codebase for training large language models
without the complexity found in many other frameworks.

EasyLM is built with JAX/Flax. By leveraging JAX's pjit utility, EasyLM is able
to train large models that don't fit on a single accelerator by sharding
the model weights and training data across multiple accelerators. Currently,
EasyLM supports multi-TPU/GPU training on a single host as well as multi-host
training on Google Cloud TPU Pods.

The following models are currently supported:
* [LLaMA](https://arxiv.org/abs/2302.13971)
* [LLaMA 2](https://arxiv.org/abs/2307.09288)
* [LLaMA 3](https://llama.meta.com/llama3/)

## Discord Server
We are running an unofficial Discord community (unaffiliated with Google) for discussion related to training LLMs in JAX. [Follow this link to join the Discord server](https://discord.gg/Rf4drG3Bhp). We have dedicated channels for several JAX-based LLM frameworks, including EasyLM, [JaxSeq](https://github.com/Sea-Snell/JAXSeq), [Alpa](https://github.com/alpa-projects/alpa), and [Levanter](https://github.com/stanford-crfm/levanter).

## Models Trained with EasyLM
### OpenLLaMA
OpenLLaMA is our permissively licensed reproduction of LLaMA, which can be used
for commercial purposes. Check out the [project main page here](https://github.com/openlm-research/open_llama).
OpenLLaMA can serve as a drop-in replacement for the LLaMA weights in EasyLM.
Please refer to the [LLaMA documentation](docs/llama.md) for more details.

### Koala
Koala is our new chatbot fine-tuned on top of LLaMA. If you are interested in
our Koala chatbot, you can check out the [blogpost](https://bair.berkeley.edu/blog/2023/04/03/koala/)
and [documentation for running it locally](docs/koala.md).

## Installation
The installation method differs between GPU hosts and Cloud TPU hosts. The first
step is to clone the repository from GitHub.

```shell
git clone https://github.com/young-geng/EasyLM.git
cd EasyLM
export PYTHONPATH="${PWD}:$PYTHONPATH"
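# Optional sanity check (not part of the upstream instructions):
# the repository root should now appear on PYTHONPATH.
echo "$PYTHONPATH" | grep "$PWD" || echo "PYTHONPATH is missing $PWD - re-run the export above"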
```

#### Installing on GPU Host
The GPU environment can be installed via [Anaconda](https://www.anaconda.com/products/distribution).

```shell
conda env create -f scripts/gpu_environment.yml
conda activate EasyLM
```

#### Installing on Cloud TPU Host
The TPU host VM comes with Python and PIP pre-installed. Simply run the following
script to set up the TPU host.

```shell
./scripts/tpu_vm_setup.sh
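# Optional check (hypothetical, not part of the upstream script):
# after setup, JAX should be able to enumerate the host's accelerators.
python3 -c "import jax; print(jax.device_count())" || echo "JAX is not installed yet"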
```

## [Documentation](docs/README.md)
The EasyLM documentation can be found in the [docs](docs/) directory.

## Reference
If you find EasyLM useful in your research or applications, please cite it using the following BibTeX:
```bibtex
@software{geng2023easylm,
  author = {Geng, Xinyang},
  title = {EasyLM: A Simple And Scalable Training Framework for Large Language Models},
  month = mar,
  year = 2023,
  url = {https://github.com/young-geng/EasyLM}
}
```

## Credits
* The LLaMA implementation is from [JAX_llama](https://github.com/Sea-Snell/JAX_llama)
* The JAX/Flax GPT-J and RoBERTa implementations are from [transformers](https://huggingface.co/docs/transformers/main/en/index)
* Most of the JAX utilities are from [mlxu](https://github.com/young-geng/mlxu)
* The codebase is heavily inspired by [JAXSeq](https://github.com/Sea-Snell/JAXSeq)
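
As a closing illustration of the pjit-style sharding this README describes, here is a minimal sketch of sharding a batch across all available devices along a data-parallel mesh axis. This is a hypothetical example, not EasyLM's actual code, and it uses the `jax.sharding` APIs that subsume `pjit` in recent JAX releases; on a single CPU it degenerates to one device but runs unchanged:

```python
# Hypothetical sketch of pjit-style sharding; NOT EasyLM code.
import jax
import jax.numpy as jnp
import numpy as np
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

devices = np.array(jax.devices())        # 1 CPU locally, many chips on a TPU pod
mesh = Mesh(devices, axis_names=("dp",)) # one data-parallel mesh axis

def layer(x, w):
    # Toy "model": a single dense layer with tanh activation.
    return jnp.tanh(x @ w)

x = jnp.ones((2 * len(devices), 4))      # batch size divisible by device count
w = jnp.ones((4, 4))

# Shard the batch along "dp"; replicate the weights on every device.
x = jax.device_put(x, NamedSharding(mesh, P("dp", None)))
w = jax.device_put(w, NamedSharding(mesh, P()))

y = jax.jit(layer)(x, w)                 # compiled once, runs sharded on the mesh
print(y.shape)                           # (2 * device_count, 4)
```

The same pattern scales to model parallelism by adding mesh axes (e.g. `("dp", "mp")`) and partitioning the weight matrices rather than replicating them.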