https://github.com/huggingface/optimum-neuron
Easy, fast and very cheap training and inference on AWS Trainium and Inferentia chips.
- Host: GitHub
- URL: https://github.com/huggingface/optimum-neuron
- Owner: huggingface
- License: apache-2.0
- Created: 2023-02-01T11:02:51.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-10-29T09:14:48.000Z (8 months ago)
- Last Synced: 2024-10-29T10:02:45.045Z (8 months ago)
- Language: Jupyter Notebook
- Size: 23.3 MB
- Stars: 206
- Watchers: 13
- Forks: 60
- Open Issues: 83
- Metadata Files:
  - Readme: README.md
  - License: LICENSE
README
# Optimum Neuron
🤗 Optimum Neuron is the interface between the 🤗 Transformers library and AWS Accelerators including [AWS Trainium](https://aws.amazon.com/machine-learning/trainium/?nc1=h_ls) and [AWS Inferentia](https://aws.amazon.com/machine-learning/inferentia/?nc1=h_ls).
It provides a set of tools enabling easy model loading, training and inference on single- and multi-Accelerator settings for different downstream tasks.
The list of officially validated models and tasks is available [here](TODO:). Users can try other models and tasks with only a few changes.

## Install
To install the latest release of this package:

* For AWS Trainium (trn1) or AWS Inferentia2 (inf2)
```bash
pip install --upgrade-strategy eager optimum-neuron[neuronx]
```

* For AWS Inferentia (inf1)
```bash
pip install --upgrade-strategy eager optimum-neuron[neuron]
```

Optimum Neuron is a fast-moving project, and you may want to install it from source:
```bash
pip install git+https://github.com/huggingface/optimum-neuron.git
```

> Alternatively, you can install the package without pip as follows:
> ```bash
> git clone https://github.com/huggingface/optimum-neuron.git
> cd optimum-neuron
> python setup.py install
> ```

*Make sure that you have installed the Neuron driver and tools before installing `optimum-neuron`; see the [more extensive setup guide](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/setup/torch-neuronx.html#setup-torch-neuronx).*
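Once the driver and SDK are in place, a quick import test can confirm the stack is usable before installing examples. This is a minimal sanity check, not part of the README, and it assumes the `torch-neuronx` and `torch-xla` packages that ship with the Neuron SDK setup:

```python
# Hypothetical sanity check (not part of optimum-neuron itself).
# Assumes torch-neuronx and torch-xla are installed per the Neuron setup guide.
import torch
import torch_neuronx  # registers the Neuron/XLA backend on trn1/inf2
import torch_xla.core.xla_model as xm

device = xm.xla_device()  # should resolve to a NeuronCore
x = torch.ones(2, 2, device=device)
print(x + x)  # a tiny op that exercises execution on the device
```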
Last but not least, don't forget to install the requirements for every example:
```bash
cd <example-folder>
pip install -r requirements.txt
```

## Quick Start
🤗 Optimum Neuron was designed with one goal in mind: **to make training and inference straightforward for any 🤗 Transformers user while leveraging the complete power of AWS Accelerators**.
### Training
There are two main classes one needs to know:
- `TrainiumArgumentParser`: inherits the original [HfArgumentParser](https://huggingface.co/docs/transformers/main/en/internal/trainer_utils#transformers.HfArgumentParser) from Transformers, with additional checks on argument values to make sure that they will work well with AWS Trainium instances.
- [NeuronTrainer](https://huggingface.co/docs/optimum/neuron/package_reference/trainer): this version of the trainer takes care of the proper checks and changes to supported models to make them trainable on AWS Trainium instances.

The [NeuronTrainer](https://huggingface.co/docs/optimum/neuron/package_reference/trainer) is very similar to the [🤗 Transformers Trainer](https://huggingface.co/docs/transformers/main_classes/trainer), and adapting a script that uses the Trainer to make it work with Trainium mostly consists of swapping the Trainer class for the NeuronTrainer one.
That's how most of the [example scripts](https://github.com/huggingface/optimum-neuron/tree/main/examples) were adapted from their [original counterparts](https://github.com/huggingface/transformers/tree/main/examples/pytorch):

```diff
from transformers import TrainingArguments
+from optimum.neuron import NeuronTrainer as Trainer

training_args = TrainingArguments(
# training arguments...
)

# A lot of code here
# Initialize our Trainer
trainer = Trainer(
model=model,
args=training_args, # Original training arguments.
train_dataset=train_dataset if training_args.do_train else None,
eval_dataset=eval_dataset if training_args.do_eval else None,
compute_metrics=compute_metrics,
tokenizer=tokenizer,
data_collator=data_collator,
)
```
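For a more complete picture, here is a self-contained sketch of the same swap, assuming a Trainium instance with `optimum-neuron[neuronx]`, `transformers`, and `datasets` installed; the model, dataset, and hyperparameters are illustrative and not from this README:

```python
# Illustrative end-to-end sketch of the NeuronTrainer swap (assumed setup:
# a trn1 instance with optimum-neuron[neuronx], transformers, and datasets).
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer, TrainingArguments
from optimum.neuron import NeuronTrainer as Trainer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# A small slice of SST-2, tokenized to a fixed length (static shapes suit Neuron).
dataset = load_dataset("glue", "sst2", split="train[:1000]")
dataset = dataset.map(
    lambda ex: tokenizer(ex["sentence"], truncation=True, padding="max_length", max_length=128),
    batched=True,
)

training_args = TrainingArguments(
    output_dir="bert_sst2_out",
    per_device_train_batch_size=8,
    num_train_epochs=1,
)

# Identical to the stock Trainer API; only the import changed.
trainer = Trainer(model=model, args=training_args, train_dataset=dataset, tokenizer=tokenizer)
trainer.train()
```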
### Inference

You can compile and export your 🤗 Transformers models to a serialized format before running inference on Neuron devices:
```bash
optimum-cli export neuron \
--model distilbert-base-uncased-finetuned-sst-2-english \
--batch_size 1 \
--sequence_length 32 \
--auto_cast matmul \
--auto_cast_type bf16 \
distilbert_base_uncased_finetuned_sst2_english_neuron/
```

The command above will export `distilbert-base-uncased-finetuned-sst-2-english` with static shapes (`batch_size=1` and `sequence_length=32`) and cast all `matmul` operations from FP32 to BF16. Check out the [exporter guide](https://huggingface.co/docs/optimum-neuron/guides/export_model) for more compilation options.
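The same export can also be done from Python. The snippet below is a sketch based on the `export=True` pattern described in the exporter documentation; treat the exact keyword arguments as assumptions to verify against the docs:

```python
# Programmatic export sketch (assumed API: export=True plus static input
# shapes and compiler options passed as keyword arguments).
from optimum.neuron import NeuronModelForSequenceClassification

model = NeuronModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased-finetuned-sst-2-english",
    export=True,           # compile the PyTorch checkpoint for Neuron
    batch_size=1,          # static shapes, matching the CLI flags above
    sequence_length=32,
    auto_cast="matmul",    # cast matmul ops...
    auto_cast_type="bf16", # ...from FP32 to BF16
)
model.save_pretrained("distilbert_base_uncased_finetuned_sst2_english_neuron/")
```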
Then you can run the exported Neuron model on Neuron devices with `NeuronModelForXXX` classes which are similar to `AutoModelForXXX` classes in 🤗 Transformers:
```diff
from transformers import AutoTokenizer
-from transformers import AutoModelForSequenceClassification
+from optimum.neuron import NeuronModelForSequenceClassification

# PyTorch checkpoint
-model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
+model = NeuronModelForSequenceClassification.from_pretrained("distilbert_base_uncased_finetuned_sst2_english_neuron")

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
inputs = tokenizer("Hamilton is considered to be the best musical of past years.", return_tensors="pt")

logits = model(**inputs).logits
print(model.config.id2label[logits.argmax().item()])
# 'POSITIVE'
```
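The exported model also plugs into a pipeline-style API. The snippet below is a sketch assuming the `pipeline` helper exposed by `optimum.neuron`; check the inference documentation for the exact entry point:

```python
# Pipeline-style inference sketch (assumes optimum.neuron exposes a
# transformers-compatible pipeline helper; verify against the docs).
from optimum.neuron import pipeline

classifier = pipeline(
    "text-classification",
    model="distilbert_base_uncased_finetuned_sst2_english_neuron",
)
print(classifier("Hamilton is considered to be the best musical of past years."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```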
### Documentation

Check out [the documentation of Optimum Neuron](https://huggingface.co/docs/optimum-neuron/index) for more advanced usage.
If you run into any issues, please open an issue or submit a pull request.