# Optimum Neuron

🤗 Optimum Neuron is the interface between the 🤗 Transformers library and AWS Accelerators including [AWS Trainium](https://aws.amazon.com/machine-learning/trainium/?nc1=h_ls) and [AWS Inferentia](https://aws.amazon.com/machine-learning/inferentia/?nc1=h_ls).
It provides a set of tools enabling easy model loading, training and inference on single- and multi-Accelerator settings for different downstream tasks.
The list of officially validated models and tasks is available [here](TODO:). Users can try other models and tasks with only a few changes.

## Install
To install the latest release of this package:

* For AWS Trainium (trn1) or AWS Inferentia2 (inf2)

```bash
pip install --upgrade-strategy eager optimum-neuron[neuronx]
```

* For AWS Inferentia (inf1)

```bash
pip install --upgrade-strategy eager optimum-neuron[neuron]
```

Optimum Neuron is a fast-moving project, and you may want to install it from source:

```bash
pip install git+https://github.com/huggingface/optimum-neuron.git
```

> Alternatively, you can install the package without pip as follows:
> ```bash
> git clone https://github.com/huggingface/optimum-neuron.git
> cd optimum-neuron
> python setup.py install
> ```

*Make sure that you have installed the Neuron driver and tools before installing `optimum-neuron`, [more extensive guide here](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/setup/torch-neuronx.html#setup-torch-neuronx).*
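
As a quick sanity check after installing, you can confirm the package is visible from Python. The snippet below uses only the standard library, so it makes no assumptions about the package's internals:

```python
import importlib.metadata

# Prints the installed optimum-neuron version if the install succeeded.
print(importlib.metadata.version("optimum-neuron"))
```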

Last but not least, don't forget to install the requirements for every example:

```bash
cd <example-folder>
pip install -r requirements.txt
```

## Quick Start

🤗 Optimum Neuron was designed with one goal in mind: **to make training and inference straightforward for any 🤗 Transformers user while leveraging the complete power of AWS Accelerators**.

### Training

There are two main classes one needs to know:
- `TrainiumArgumentParser`: inherits the original [HfArgumentParser](https://huggingface.co/docs/transformers/main/en/internal/trainer_utils#transformers.HfArgumentParser) in Transformers, with additional checks on the argument values to make sure that they will work well with AWS Trainium instances.
- [NeuronTrainer](https://huggingface.co/docs/optimum/neuron/package_reference/trainer): this version of the Trainer performs the proper checks and changes on the supported models to make them trainable on AWS Trainium instances.

The [NeuronTrainer](https://huggingface.co/docs/optimum/neuron/package_reference/trainer) is very similar to the [🤗 Transformers Trainer](https://huggingface.co/docs/transformers/main_classes/trainer), so adapting a script that uses the Trainer to make it work with Trainium mostly consists of swapping the `Trainer` class for the `NeuronTrainer` one.
That's how most of the [example scripts](https://github.com/huggingface/optimum-neuron/tree/main/examples) were adapted from their [original counterparts](https://github.com/huggingface/transformers/tree/main/examples/pytorch).

```diff
from transformers import TrainingArguments
+from optimum.neuron import NeuronTrainer as Trainer

training_args = TrainingArguments(
    # training arguments...
)

# A lot of code here

# Initialize our Trainer
trainer = Trainer(
    model=model,
    args=training_args,  # Original training arguments.
    train_dataset=train_dataset if training_args.do_train else None,
    eval_dataset=eval_dataset if training_args.do_eval else None,
    compute_metrics=compute_metrics,
    tokenizer=tokenizer,
    data_collator=data_collator,
)
```
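
For reference, here is what a minimal, self-contained training script could look like. This is a sketch, not an official example: the model, dataset, and hyperparameters are illustrative, and it assumes a Trainium instance with the Neuron SDK already set up.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    TrainingArguments,
)
from optimum.neuron import NeuronTrainer as Trainer

model_id = "bert-base-uncased"  # illustrative choice of model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

# Tokenize with static shapes: fixed-length padding avoids recompilations,
# since Neuron compiles the model for specific input shapes.
dataset = load_dataset("glue", "sst2")

def tokenize(batch):
    return tokenizer(
        batch["sentence"], padding="max_length", truncation=True, max_length=128
    )

dataset = dataset.map(tokenize, batched=True)

training_args = TrainingArguments(
    output_dir="bert-sst2-trainium",
    per_device_train_batch_size=8,
    num_train_epochs=1,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
)
trainer.train()
```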

### Inference

You can compile and export your 🤗 Transformers models to a serialized format before running inference on Neuron devices:

```bash
optimum-cli export neuron \
  --model distilbert-base-uncased-finetuned-sst-2-english \
  --batch_size 1 \
  --sequence_length 32 \
  --auto_cast matmul \
  --auto_cast_type bf16 \
  distilbert_base_uncased_finetuned_sst2_english_neuron/
```

The command above will export `distilbert-base-uncased-finetuned-sst-2-english` with static shapes: `batch_size=1` and `sequence_length=32`, and cast all `matmul` operations from FP32 to BF16. Check out the [exporter guide](https://huggingface.co/docs/optimum-neuron/guides/export_model) for more compilation options.
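
If you prefer to stay in Python, the export can also be driven through `from_pretrained` with `export=True`. The snippet below is a sketch of that path, mirroring the CLI options above; refer to the exporter guide for the authoritative set of arguments:

```python
from optimum.neuron import NeuronModelForSequenceClassification

# Same static shapes and casting options as the CLI command above.
compiler_args = {"auto_cast": "matmul", "auto_cast_type": "bf16"}
input_shapes = {"batch_size": 1, "sequence_length": 32}

model = NeuronModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased-finetuned-sst-2-english",
    export=True,
    **compiler_args,
    **input_shapes,
)
model.save_pretrained("distilbert_base_uncased_finetuned_sst2_english_neuron/")
```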

Then you can run the exported Neuron model on Neuron devices with the `NeuronModelForXXX` classes, which mirror the `AutoModelForXXX` classes in 🤗 Transformers:

```diff
from transformers import AutoTokenizer
-from transformers import AutoModelForSequenceClassification
+from optimum.neuron import NeuronModelForSequenceClassification

# PyTorch checkpoint
-model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
+model = NeuronModelForSequenceClassification.from_pretrained("distilbert_base_uncased_finetuned_sst2_english_neuron")

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
inputs = tokenizer("Hamilton is considered to be the best musical of past years.", return_tensors="pt")

logits = model(**inputs).logits
print(model.config.id2label[logits.argmax().item()])
# 'POSITIVE'
```
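
Optimum Neuron also provides a `pipeline` helper that mirrors the Transformers pipelines. A quick sketch, assuming the exported model directory from the steps above:

```python
from optimum.neuron import pipeline

# Load the pipeline from the directory produced by the export step.
classifier = pipeline(
    "text-classification",
    model="distilbert_base_uncased_finetuned_sst2_english_neuron/",
)
print(classifier("Hamilton is considered to be the best musical of past years."))
# [{'label': 'POSITIVE', 'score': ...}]
```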

### Documentation

Check out [the documentation of Optimum Neuron](https://huggingface.co/docs/optimum-neuron/index) for more advanced usage.

If you find any issues while using these, please open an issue or submit a pull request.