https://github.com/huggingface/optimum-graphcore

Blazing fast training of 🤗 Transformers on Graphcore IPUs
fine-tuning graphcore machine-learning pytorch training transformers

[![examples](https://github.com/huggingface/optimum-graphcore/actions/workflows/test-examples.yml/badge.svg)](https://github.com/huggingface/optimum-graphcore/actions/workflows/test-examples.yml) [![pipelines](https://github.com/huggingface/optimum-graphcore/actions/workflows/test-pipelines.yml/badge.svg)](https://github.com/huggingface/optimum-graphcore/actions/workflows/test-pipelines.yml)



# Optimum Graphcore

🤗 Optimum Graphcore is the interface between the 🤗 Transformers library and [Graphcore IPUs](https://www.graphcore.ai/products/ipu).
It provides a set of tools enabling model parallelization and loading on IPUs, training, fine-tuning and inference on all the tasks already supported by 🤗 Transformers while being compatible with the 🤗 Hub and every model available on it out of the box.

## What is an Intelligence Processing Unit (IPU)?
Quote from the Hugging Face [blog post](https://huggingface.co/blog/graphcore#what-is-an-intelligence-processing-unit):
> IPUs are the processors that power Graphcore’s IPU-POD datacenter compute systems. This new type of processor is designed to support the very specific computational requirements of AI and machine learning. Characteristics such as fine-grained parallelism, low precision arithmetic, and the ability to handle sparsity have been built into our silicon.

> Instead of adopting a SIMD/SIMT architecture like GPUs, Graphcore’s IPU uses a massively parallel, MIMD architecture, with ultra-high bandwidth memory placed adjacent to the processor cores, right on the silicon die.

> This design delivers high performance and new levels of efficiency, whether running today’s most popular models, such as BERT and EfficientNet, or exploring next-generation AI applications.

## Poplar SDK setup
A Poplar SDK environment needs to be enabled to use this library. Please refer to Graphcore's [Getting Started](https://docs.graphcore.ai/en/latest/getting-started.html) guides.
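
Before importing the library, it can help to confirm that the current shell actually has an SDK enabled. The sketch below assumes that Graphcore's enable script exports a `POPLAR_SDK_ENABLED` environment variable pointing at the active SDK. That variable name is an assumption not stated in this README, and `active_poplar_sdk` is an illustrative helper, so check both against your SDK version.

```python
import os
from typing import Mapping, Optional


def active_poplar_sdk(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the path of the enabled Poplar SDK, or None if none is enabled.

    Assumption: the SDK's enable script exports POPLAR_SDK_ENABLED.
    """
    path = env.get("POPLAR_SDK_ENABLED", "").strip()
    return path or None


if __name__ == "__main__":
    sdk = active_poplar_sdk()
    if sdk:
        print(f"Poplar SDK enabled at: {sdk}")
    else:
        print("No Poplar SDK appears to be enabled in this shell.")
```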

## Install
To install the latest release of this package:

`pip install optimum-graphcore`

Optimum Graphcore is a fast-moving project, and you may want to install from source.

`pip install git+https://github.com/huggingface/optimum-graphcore.git`
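
After either install, one way to confirm which version ended up in the environment is to query the package metadata. This is a minimal sketch using only the Python standard library; `installed_version` is an illustrative helper name, and the only assumption is that the distribution is registered under the name used in the `pip` command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str = "optimum-graphcore") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version:
        print(f"optimum-graphcore {version} is installed")
    else:
        print("optimum-graphcore is not installed in this environment")
```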

### Installing in developer mode

If you are working on the `optimum-graphcore` code then you should use an editable install
by cloning and installing `optimum` and `optimum-graphcore`:

```
git clone https://github.com/huggingface/optimum --branch v1.6.1-release
git clone https://github.com/huggingface/optimum-graphcore
pip install -e optimum -e optimum-graphcore
```

Now whenever you change the code, you'll be able to run with those changes instantly.

## Running the examples

There are a number of examples provided in the `examples` directory. Each of these contains a README with command lines for running them on IPUs with Optimum Graphcore.

Please install the requirements for every example:

```
cd <example directory>
pip install -r requirements.txt
```
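
To install the requirements for several examples in one go, the per-example steps above can be scripted. This sketch assumes that it runs from the root of a clone of the repository and that each example directory holds its own `requirements.txt`, as the commands above suggest. The helper names are illustrative, and `dry_run=True` only prints the commands rather than running them.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def find_requirements(examples_dir: str = "examples") -> List[Path]:
    """Return every requirements.txt found under the examples directory, sorted."""
    root = Path(examples_dir)
    if not root.is_dir():
        return []
    return sorted(root.rglob("requirements.txt"))


def install_all(examples_dir: str = "examples", dry_run: bool = True) -> List[List[str]]:
    """Build (and, unless dry_run is set, run) one `pip install -r` command per file."""
    commands = [
        [sys.executable, "-m", "pip", "install", "-r", str(path)]
        for path in find_requirements(examples_dir)
    ]
    if not dry_run:
        for command in commands:
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    for command in install_all(dry_run=True):
        print(" ".join(command))
```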

## How to use Optimum Graphcore
🤗 Optimum Graphcore was designed with one goal in mind: **make training and evaluation straightforward for any 🤗 Transformers user while leveraging the complete power of IPUs.**
It requires minimal changes if you are already using 🤗 Transformers.

To immediately use a model on a given input (text, image, audio, ...), we support the `pipeline` API:

```diff
->>> from transformers import pipeline
+>>> from optimum.graphcore import pipeline

# Allocate a pipeline for sentiment-analysis
->>> classifier = pipeline('sentiment-analysis', model="distilbert-base-uncased-finetuned-sst-2-english")
+>>> classifier = pipeline('sentiment-analysis', model="distilbert-base-uncased-finetuned-sst-2-english", ipu_config="Graphcore/distilbert-base-ipu")
>>> classifier('We are very happy to introduce pipeline to the transformers repository.')
[{'label': 'POSITIVE', 'score': 0.9996947050094604}]
```
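
Written out without the diff markers, the IPU version of the snippet above looks roughly like this. The import is deferred into the function so that the module can be loaded on a machine without the Poplar SDK. The `make_classifier` wrapper and its `pipeline_factory` parameter are illustrative names for this sketch, not part of the library.

```python
from typing import Any, Callable, Optional


def make_classifier(pipeline_factory: Optional[Callable[..., Any]] = None) -> Any:
    """Create the sentiment-analysis pipeline shown in the diff above.

    pipeline_factory is a hypothetical hook for testing; by default the
    function uses optimum.graphcore.pipeline, which needs an IPU environment.
    """
    if pipeline_factory is None:
        from optimum.graphcore import pipeline as pipeline_factory

    return pipeline_factory(
        "sentiment-analysis",
        model="distilbert-base-uncased-finetuned-sst-2-english",
        ipu_config="Graphcore/distilbert-base-ipu",
    )


# Usage on a machine with IPUs available:
#   classifier = make_classifier()
#   classifier("We are very happy to introduce pipeline to the transformers repository.")
```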

It is also super easy to use the `Trainer` API:

```diff
-from transformers import Trainer, TrainingArguments
+from optimum.graphcore import IPUConfig, IPUTrainer, IPUTrainingArguments

-training_args = TrainingArguments(
+training_args = IPUTrainingArguments(
     per_device_train_batch_size=4,
     learning_rate=1e-4,
+    # Any IPUConfig on the Hub or stored locally
+    ipu_config_name="Graphcore/bert-base-ipu",
+)
+
+# Loading the IPUConfig needed by the IPUTrainer to compile and train the model on IPUs
+ipu_config = IPUConfig.from_pretrained(
+    training_args.ipu_config_name,
 )

 # Initialize our Trainer
-trainer = Trainer(
+trainer = IPUTrainer(
     model=model,
+    ipu_config=ipu_config,
     args=training_args,
     train_dataset=train_dataset if training_args.do_train else None,
     ...  # Other arguments
 )
```
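
Applied in full, the diff above leaves code along these lines. This is a sketch under a few assumptions: `model` and `train_dataset` already exist, the `build_ipu_trainer` wrapper is an illustrative name, and any further arguments your setup needs (for example an output directory, as with `TrainingArguments`) are passed through `**training_kwargs`. Imports are deferred so the function can be defined without an IPU environment.

```python
from typing import Any


def build_ipu_trainer(
    model: Any,
    train_dataset: Any,
    ipu_config_name: str = "Graphcore/bert-base-ipu",
    **training_kwargs: Any,
) -> Any:
    """Build an IPUTrainer following the migration shown in the diff above."""
    # Deferred import: optimum.graphcore requires an enabled Poplar SDK.
    from optimum.graphcore import IPUConfig, IPUTrainer, IPUTrainingArguments

    training_args = IPUTrainingArguments(
        per_device_train_batch_size=4,
        learning_rate=1e-4,
        # Any IPUConfig on the Hub or stored locally
        ipu_config_name=ipu_config_name,
        **training_kwargs,
    )

    # The IPUConfig is needed by the IPUTrainer to compile and train the model on IPUs.
    ipu_config = IPUConfig.from_pretrained(training_args.ipu_config_name)

    return IPUTrainer(
        model=model,
        ipu_config=ipu_config,
        args=training_args,
        train_dataset=train_dataset if training_args.do_train else None,
    )
```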

For more information, refer to the full [🤗 Optimum Graphcore documentation](https://huggingface.co/docs/optimum/graphcore_index).

## Supported models
The following model architectures and tasks are currently supported by 🤗 Optimum Graphcore:

| | Pre-Training | Masked LM | Causal LM | Seq2Seq LM (Summarization, Translation, etc.) | Sequence Classification | Token Classification | Question Answering | Multiple Choice | Image Classification | CTC |
|---|---|---|---|---|---|---|---|---|---|---|
| BART | ✅ | | ❌ | ✅ | ✅ | | ❌ | | | |
| BERT | ✅ | ✅ | ❌ | | ✅ | ✅ | ✅ | ✅ | | |
| ConvNeXt | ✅ | | | | | | | | ✅ | |
| DeBERTa | ✅ | ✅ | | | ✅ | ✅ | ✅ | | | |
| DistilBERT | ❌ | ✅ | | | ✅ | ✅ | ✅ | ✅ | | |
| GPT-2 | ✅ | | ✅ | | ✅ | ✅ | | | | |
| [GroupBERT](https://arxiv.org/abs/2106.05822) | ✅ | ✅ | ❌ | | ✅ | ✅ | ✅ | ✅ | | |
| HuBERT | ❌ | | | | ✅ | | | | | ✅ |
| LXMERT | ❌ | | | | | | ✅ | | | |
| RoBERTa | ✅ | ✅ | ❌ | | ✅ | ✅ | ✅ | ✅ | | |
| T5 | ✅ | | | ✅ | | | | | | |
| ViT | ❌ | | | | | | | | ✅ | |
| Wav2Vec2 | ✅ | | | | | | | | | ✅ |
| Whisper | ❌ | | | ✅ | | | | | | |

If you find any issue while using those, please open an issue or a pull request.