# 🌟 Simplifine 🌟

Simplifine lets you invoke LLM finetuning with just one line of code using any Hugging Face dataset or model.
> The easiest, fully open-source LLM finetuning library!

**Get free Simplifine Cloud Credits to finetune [here](https://www.simplifine.com/api-key-interest)**

## Roadmap
- **Comprehensive documentation update incoming (by Aug 9th, 2024) to match the new config files.**

## 🔄 Updates
**v0.0.8 (2024-08-08)**
- **Bug Fixes:** Code clean up and trainer fixes.
- **New Feature:** Ability to define more complex configuration files for the trainer.
- **Examples:** New examples on cloud training and on training a fake news detector.

**v0.0.71 (2024-07-25)**
- **Bug Fixes:** Resolved issues that prevented the library from loading on certain configurations.
- **New Feature:** Added support for installing directly from git. Added support for Hugging Face API Tokens to access restricted models.
- **Documentation:** Updated examples.

## 🚀 Features

- **Supervised Fine Tuning** 🧑‍🏫
- **Question-Answer Finetuning** ❓➕
- **Contrastive Loss for Embedding Tasks** 🌌
- **Multi-label Classification Finetuning** 🏷️
- **WandB Logging** 📊
- **In-built Evaluation Tools** 📈
- **Automated Finetuning Parameters** 🤖
- **State-of-the-art Optimization Techniques (DeepSpeed, FSDP)** 🏎️

## 📦 Installation

```bash
pip install simplifine-alpha
```

Or you can install the package from source. To do so, download the contents of this repository, navigate to the package folder, and run the following command:

```bash
pip install .
```

You can also install directly from GitHub using the following command:
```bash
pip install git+https://github.com/simplifine-llm/Simplifine.git
```
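
To sanity-check the installation, you can try importing the package. This is a minimal sketch: the module name `simplifine_alpha` is an assumption based on the PyPI package name `simplifine-alpha`, and PyTorch is assumed to be pulled in as a dependency.

```python
# Quick post-install check.
# Assumption: the installed module is importable as `simplifine_alpha`
# (mirroring the PyPI package name `simplifine-alpha`).
import importlib

import torch

module = importlib.import_module("simplifine_alpha")
print("simplifine_alpha imported from:", module.__file__)
print("CUDA available:", torch.cuda.is_available())
```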

## ๐Ÿ Quickstart

For a more comprehensive example, see this [notebook](https://github.com/simplifine-llm/Simplifine/blob/main/examples/cloud_quickstart.ipynb) in the examples folder.

Further examples of how to use the train engine are also located in the examples folder.
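
If you just want a feel for the kind of workload Simplifine automates, the sketch below is a plain Hugging Face `Trainer` run over a tiny model and a public dataset. It is deliberately **not** the Simplifine API (the notebook above shows the supported one-line calls); it only illustrates the supervised finetuning loop that the library wraps, and the model and dataset names are interchangeable examples.

```python
# Reference sketch using plain Hugging Face tooling, NOT the Simplifine API.
# It shows the supervised finetuning loop that Simplifine's one-line calls wrap.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "sshleifer/tiny-gpt2"     # tiny model so the sketch runs on CPU
dataset_name = "yelp_review_full"      # any Hugging Face dataset with a text column

tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Tokenize a small slice of the dataset for causal language modeling.
dataset = load_dataset(dataset_name, split="train[:200]")
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=dataset.column_names,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="sft-demo",
        per_device_train_batch_size=2,
        max_steps=10,          # a few steps only; this is just a demo
        report_to="none",
    ),
    train_dataset=dataset,
    # mlm=False gives standard next-token (causal LM) labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

With Simplifine, the intent is that boilerplate like the above collapses into a single configured call; see the notebook and the examples folder for the supported workflow.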

## ๐Ÿค Contributing

We are looking for contributors! Please send an email to [[email protected]](mailto:[email protected]) to get onboarded, or add your name to the waitlist on [www.simplifine.com](http://www.simplifine.com).

## 📄 License

Simplifine is licensed under the GNU General Public License Version 3. See the LICENSE file for more details.

## 📚 Documentation
A major overhaul of the documentation is in the works (to be done by Aug 11th, 2024). In the meantime, please use this [notebook](https://github.com/simplifine-llm/Simplifine/blob/main/examples/cloud_quickstart.ipynb) to learn how to use the library.

## 💬 Support

If you have any suggestions for new features you'd like to see implemented, please raise an issue and we will work hard to make it happen as soon as possible! For any other questions, feel free to contact us at [[email protected]](mailto:[email protected]).

## ⛮ General Compute Considerations

We currently support both DistributedDataParallel (DDP) and ZeRO from DeepSpeed.

**TL;DR**:
- **DDP** is useful when a model fits in a single GPU's memory (including its gradients and optimizer states).
- **ZeRO** is useful when a model must be sharded across multiple GPUs.

**Longer Version**:

- **DDP**: Distributed Data Parallel (DDP) creates a replica of the model on each processor (GPU). For example, imagine 8 GPUs, each fed a single data point; that makes an effective batch size of 8. The model replicas are then updated on each device. DDP speeds up training by parallelizing the data-feeding process, but it **fails** if a replica cannot fit in GPU memory. Remember, that memory must hold not only the parameters but also the gradients and optimizer states.

- **ZeRO**: ZeRO is a powerful optimization developed by DeepSpeed and comes in different stages (1, 2, and 3). Each stage shards different parts of the training state (optimizer states, gradients, and parameters). This is really useful if a model cannot fit in GPU memory. ZeRO also supports offloading to the CPU, making even more room for training larger models.

### Example Scenarios and Appropriate Optimization Methods:
1. **LLaMA-3-8b model with 16-bit precision**: Use ZeRO Stage 3 on 8 A100s.
2. **LLaMA-3-8b model with LoRA adapters**: Usually fine with DDP on A100s.
3. **GPT-2 with 16-bit precision**: Use DDP.
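
For scenario 1 above, the sharding and offload behaviour is typically described in a DeepSpeed configuration. The sketch below is a generic ZeRO Stage 3 config with CPU offload passed through Hugging Face `TrainingArguments`; it is not Simplifine-specific code, and the exact values are illustrative.

```python
# Generic DeepSpeed ZeRO Stage 3 configuration with CPU offload.
# Not Simplifine-specific; values are illustrative only.
from transformers import TrainingArguments

ds_config = {
    "zero_optimization": {
        "stage": 3,                              # shard optimizer states, gradients, and parameters
        "offload_optimizer": {"device": "cpu"},  # move optimizer states to CPU RAM
        "offload_param": {"device": "cpu"},      # move parameters to CPU RAM when idle
    },
    "bf16": {"enabled": True},                   # 16-bit training
    "train_micro_batch_size_per_gpu": "auto",    # let the HF integration fill these in
    "gradient_accumulation_steps": "auto",
}

args = TrainingArguments(
    output_dir="zero3-demo",
    per_device_train_batch_size=1,
    bf16=True,
    deepspeed=ds_config,  # accepts a dict or a path to a JSON config file
)
```

Note that CPU offload builds DeepSpeed's `cpu_adam` extension at runtime, which is where the `python3-dev` requirement in the FAQ section below comes from.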

## 🪲 FAQs and Bugs

**Issue: `RuntimeError: Error building extension 'cpu_adam'`**

This error occurs when `python-dev` is not installed and ZeRO is using CPU offload. To resolve this, try:

```bash
# Try sudo apt-get install python3-dev if the following fails.
apt-get install python-dev # for Python 2.x installs
apt-get install python3-dev # for Python 3.x installs
```

See this [link](https://stackoverflow.com/questions/21530577/fatal-error-python-h-no-such-file-or-directory) for more details.