LiBai(李白): A Toolbox for Large-Scale Distributed Parallel Training
https://github.com/Oneflow-Inc/libai

LiBai




## Introduction

**English** | [简体中文](/README_zh-CN.md)

LiBai is a large-scale open-source model training toolbox based on OneFlow. The main branch works with OneFlow 0.7.0.

Highlights

- **Support a collection of parallel training components**

LiBai provides several forms of parallelism, including Data Parallelism, Tensor Parallelism, and Pipeline Parallelism, and it can be extended with new ones.

- **Varied training techniques**

LiBai provides many out-of-the-box training techniques, such as Distributed Training, Mixed Precision Training, Activation Checkpointing, Recomputation, Gradient Accumulation, and Zero Redundancy Optimizer (ZeRO).

- **Support for both CV and NLP tasks**

LiBai has predefined data processing for both CV and NLP datasets such as CIFAR, ImageNet, and BERT Dataset.

- **Easy to use**

LiBai's components are designed to be modular and easy to use:
- LazyConfig system for more flexible syntax and no predefined structures
- Friendly trainer and engine
- Can be used as a library for building research projects. See [projects/](/projects) for some projects built on LiBai

- **High Efficiency**
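
As a rough illustration of the Data Parallelism and Tensor Parallelism named in the highlights above, the sketch below simulates both on a single linear layer. It uses plain Python lists in place of tensors and devices; the helper names (`matmul`, `split_columns`, `tensor_parallel_linear`, `data_parallel_linear`) are hypothetical and are not part of LiBai or OneFlow. It only shows that both ways of splitting the work reproduce the single-device result.

```python
# Illustrative only: plain Python lists stand in for tensors and "devices".
# None of these helpers belong to LiBai or OneFlow.

def matmul(a, b):
    """Multiply matrix a (m x k) by matrix b (k x n), both lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def split_columns(w, parts):
    """Split matrix w column-wise into `parts` equal shards."""
    n = len(w[0])
    assert n % parts == 0, "column count must divide evenly across shards"
    step = n // parts
    return [[row[i * step:(i + 1) * step] for row in w] for i in range(parts)]


def tensor_parallel_linear(x, w, parts):
    """Each simulated device multiplies x by its own column shard of w;
    the partial outputs are concatenated along the feature dimension."""
    partials = [matmul(x, shard) for shard in split_columns(w, parts)]
    return [sum((p[r] for p in partials), []) for r in range(len(x))]


def data_parallel_linear(x, w, parts):
    """Each simulated device holds a full copy of w and processes one slice
    of the batch; the per-device outputs are stacked back together."""
    assert len(x) % parts == 0, "batch size must divide evenly across devices"
    step = len(x) // parts
    out = []
    for i in range(parts):
        out.extend(matmul(x[i * step:(i + 1) * step], w))
    return out


x = [[1, 2], [3, 4], [5, 6], [7, 8]]   # batch of 4 samples, 2 features each
w = [[1, 0, 2, 1], [0, 1, 1, 3]]       # 2 x 4 weight matrix

reference = matmul(x, w)                # single-device result
tp_out = tensor_parallel_linear(x, w, parts=2)
dp_out = data_parallel_linear(x, w, parts=2)
print(tp_out == reference, dp_out == reference)
```

Tensor parallelism shards the weights, so each device stores only part of the model, while data parallelism replicates the weights and shards the batch. In a real distributed run the concatenation and stacking steps correspond to communication between devices.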

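The "more flexible syntax and no predefined structures" attributed to the LazyConfig system in the highlights can be pictured with a generic deferred-construction pattern: a config records what to build and with which arguments, and nothing is built until later. The sketch below is a minimal stand-in under that assumption. `LazySpec`, `build`, `Linear`, and `Classifier` are hypothetical names for illustration, not LiBai's actual API; see the LiBai documentation for the real interface.

```python
# Hypothetical sketch of a "lazy" config. LazySpec and build are illustrative
# names and are NOT LiBai's actual API.

class LazySpec:
    """Records a callable and its keyword arguments without calling it."""

    def __init__(self, target, **kwargs):
        self.target = target
        self.kwargs = kwargs


def build(spec):
    """Recursively construct the object described by a LazySpec.
    Values that are not LazySpec instances are returned unchanged."""
    if isinstance(spec, LazySpec):
        return spec.target(**{k: build(v) for k, v in spec.kwargs.items()})
    return spec


class Linear:
    def __init__(self, in_features, out_features):
        self.in_features = in_features
        self.out_features = out_features


class Classifier:
    def __init__(self, backbone, num_classes):
        self.backbone = backbone
        self.num_classes = num_classes


# The config is ordinary Python: no schema has to be declared up front.
model_cfg = LazySpec(
    Classifier,
    backbone=LazySpec(Linear, in_features=16, out_features=32),
    num_classes=10,
)

# Fields can be overridden before anything is constructed.
model_cfg.kwargs["backbone"].kwargs["out_features"] = 64

model = build(model_cfg)
print(model.backbone.out_features, model.num_classes)
```

Because construction is deferred, the same config object can be edited from the command line or from another config file before the model, optimizer, or dataloader is actually created.
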
## Installation

See [Installation instructions](https://libai.readthedocs.io/en/latest/tutorials/get_started/Installation.html).

## Getting Started

See [Quick Run](https://libai.readthedocs.io/en/latest/tutorials/get_started/quick_run.html) for the basic usage of LiBai.

## Documentation

See LiBai's [documentation](https://libai.readthedocs.io/en/latest/index.html) for full API documentation and tutorials.

## ChangeLog

**Beta 0.3.0** was released on 03/11/2024. The general changes in version **0.3.0** are as follows:

**Features:**
- Support mock transformers, see [Mock transformers](https://github.com/Oneflow-Inc/libai/tree/main/projects/mock_transformers#readme)
- Support lm-evaluation-harness for model evaluation
- User Experience Optimization

**New Supported Models:**
- These models are natively supported by libai



| Models | 2D(tp+pp) Inference | 3D Parallel Training |
|---|---|---|
| BLOOM | ✔ | - |
| ChatGLM | ✔ | ✔ |
| Couplets | ✔ | ✔ |
| DALLE2 | ✔ | - |
| Llama2 | ✔ | ✔ |
| MAE | ✔ | ✔ |
| Stable_Diffusion | - | - |

**New Mock Models:**
- These models are extended and implemented by libai through mocking transformers.



| Models | Tensor Parallel | Pipeline Parallel |
|---|---|---|
| BLOOM | ✔ | - |
| GPT2 | ✔ | - |
| LLAMA | ✔ | - |
| LLAMA2 | ✔ | - |
| Baichuan | ✔ | - |
| OPT | ✔ | - |

See [changelog](./changelog.md) for details and release history.

## Contributing

We appreciate all contributions to improve LiBai. See [CONTRIBUTING](./CONTRIBUTING.md) for the contribution guidelines.

## License

This project is released under the [Apache 2.0 license](LICENSE).

## Citation

If you find this project useful for your research, please consider citing it:

```BibTeX
@misc{of2021libai,
author = {Xingyu Liao and Peng Cheng and Tianhe Ren and Depeng Liang and
Kai Dang and Yi Wang and Xiaoyu Xu},
title = {LiBai},
howpublished = {\url{https://github.com/Oneflow-Inc/libai}},
year = {2021}
}
```

## Join the WeChat group

![LiBai_Wechat_QRcode](./docs/source/tutorials/assets/LiBai_Wechat.png)