{"id":13478110,"url":"https://github.com/huggingface/optimum","last_synced_at":"2025-05-14T12:02:27.550Z","repository":{"id":37205852,"uuid":"387786881","full_name":"huggingface/optimum","owner":"huggingface","description":"🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization tools","archived":false,"fork":false,"pushed_at":"2025-04-30T21:16:39.000Z","size":5915,"stargazers_count":2878,"open_issues_count":369,"forks_count":532,"subscribers_count":56,"default_branch":"main","last_synced_at":"2025-05-07T11:41:41.033Z","etag":null,"topics":["graphcore","habana","inference","intel","onnx","onnxruntime","optimization","pytorch","quantization","tflite","training","transformers"],"latest_commit_sha":null,"homepage":"https://huggingface.co/docs/optimum/main/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/huggingface.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2021-07-20T12:36:40.000Z","updated_at":"2025-05-07T07:40:45.000Z","dependencies_parsed_at":"2024-01-22T11:41:53.487Z","dependency_job_id":"d427d0d1-a173-45fa-bd71-1db3454980ec","html_url":"https://github.com/huggingface/optimum","commit_stats":{"total_commits":1131,"total_committers":133,"mean_commits":8.503759398496241,"dds":0.7029177718832891,"last_synced_commit":"65a8a94adaf136dd677d28cfc837c0acfe993031"},"previous_names":[],"tags_count":73,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/huggingface%2Foptimum","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/huggingface%2Foptimum/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/huggingface%2Foptimum/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/huggingface%2Foptimum/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/huggingface","download_url":"https://codeload.github.com/huggingface/optimum/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253908981,"owners_count":21982685,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["graphcore","habana","inference","intel","onnx","onnxruntime","optimization","pytorch","quantization","tflite","training","transformers"],"created_at":"2024-07-31T16:01:52.646Z","updated_at":"2025-05-14T12:02:27.487Z","avatar_url":"https://github.com/huggingface.png","language":"Python","readme":"[![ONNX Runtime](https://github.com/huggingface/optimum/actions/workflows/test_onnxruntime.yml/badge.svg)](https://github.com/huggingface/optimum/actions/workflows/test_onnxruntime.yml)\n\n# Hugging Face Optimum\n\n🤗 Optimum is an extension of 🤗 Transformers and Diffusers, providing a set of optimization tools enabling maximum efficiency to train and run models on targeted hardware, while keeping things easy to use.\n\n## Installation\n\n🤗 Optimum can be installed using `pip` as follows:\n\n```bash\npython -m pip install optimum\n```\n\nIf you'd like to use the accelerator-specific features of 🤗 Optimum, you can install the required dependencies according to the table below:\n\n| Accelerator                                                                                                            | Installation                                                      |\n|:-----------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------|\n| [ONNX Runtime](https://huggingface.co/docs/optimum/onnxruntime/overview)                                               | `pip install --upgrade --upgrade-strategy eager optimum[onnxruntime]`      |\n| [ExecuTorch](https://github.com/huggingface/optimum-executorch)                                                         | `pip install --upgrade --upgrade-strategy eager optimum[executorch]`\n| [Intel Neural Compressor](https://huggingface.co/docs/optimum/intel/index)                                             | `pip install --upgrade --upgrade-strategy eager optimum[neural-compressor]`|\n| [OpenVINO](https://huggingface.co/docs/optimum/intel/index)                                                            | `pip install --upgrade --upgrade-strategy eager optimum[openvino]`         |\n| [NVIDIA TensorRT-LLM](https://huggingface.co/docs/optimum/main/en/nvidia_overview)                                     | `docker run -it --gpus all --ipc host huggingface/optimum-nvidia`          |\n| [AMD Instinct GPUs and Ryzen AI NPU](https://huggingface.co/docs/optimum/amd/index)                                    | `pip install --upgrade --upgrade-strategy eager optimum[amd]`              |\n| [AWS Trainum \u0026 Inferentia](https://huggingface.co/docs/optimum-neuron/index)                                           | `pip install --upgrade --upgrade-strategy eager optimum[neuronx]`          |\n| [Habana Gaudi Processor (HPU)](https://huggingface.co/docs/optimum/habana/index)                                       | `pip install --upgrade --upgrade-strategy eager optimum[habana]`           |\n| [FuriosaAI](https://huggingface.co/docs/optimum/furiosa/index)                                                         | `pip install --upgrade --upgrade-strategy eager optimum[furiosa]`          |\n\nThe `--upgrade --upgrade-strategy eager` option is needed to ensure the different packages are upgraded to the latest possible version.\n\nTo install from source:\n\n```bash\npython -m pip install git+https://github.com/huggingface/optimum.git\n```\n\nFor the accelerator-specific features, append `optimum[accelerator_type]` to the above command:\n\n```bash\npython -m pip install optimum[onnxruntime]@git+https://github.com/huggingface/optimum.git\n```\n\n## Accelerated Inference\n\n🤗 Optimum provides multiple tools to export and run optimized models on various ecosystems:\n\n- [ONNX](https://huggingface.co/docs/optimum/exporters/onnx/usage_guides/export_a_model) / [ONNX Runtime](https://huggingface.co/docs/optimum/onnxruntime/usage_guides/models)\n- [ExecuTorch](https://huggingface.co/docs/optimum-executorch/guides/export), PyTorch’s native solution to inference on the Edge, more details [here](https://pytorch.org/executorch/stable/)\n- TensorFlow Lite\n- [OpenVINO](https://huggingface.co/docs/optimum/intel/inference)\n- Habana first-gen Gaudi / Gaudi2, more details [here](https://huggingface.co/docs/optimum/main/en/habana/usage_guides/accelerate_inference)\n- AWS Inferentia 2 / Inferentia 1, more details [here](https://huggingface.co/docs/optimum-neuron/en/guides/models)\n- NVIDIA TensorRT-LLM , more details [here](https://huggingface.co/blog/optimum-nvidia)\n\nThe [export](https://huggingface.co/docs/optimum/exporters/overview) and optimizations can be done both programmatically and with a command line.\n\n\n### ONNX + ONNX Runtime\n\nBefore you begin, make sure you have all the necessary libraries installed :\n\n```bash\npip install optimum[exporters,onnxruntime]\n```\n\nIt is possible to export 🤗 Transformers and Diffusers models to the [ONNX](https://onnx.ai/) format and perform graph optimization as well as quantization easily.\n\nFor more information on the ONNX export, please check the [documentation](https://huggingface.co/docs/optimum/exporters/onnx/usage_guides/export_a_model).\n\nOnce the model is exported to the ONNX format, we provide Python classes enabling you to run the exported ONNX model in a seemless manner using [ONNX Runtime](https://onnxruntime.ai/) in the backend.\n\nMore details on how to run ONNX models with `ORTModelForXXX` classes [here](https://huggingface.co/docs/optimum/main/en/onnxruntime/usage_guides/models).\n\n\n### ExecuTorch\n\nBefore you begin, make sure you have all the necessary libraries installed :\n\n```bash\npip install optimum[exporters-executorch]\n```\n\nUsers can export 🤗 Transformers models to [ExecuTorch](https://github.com/pytorch/executorch) and run inference on edge devices within PyTorch's ecosystem.\n\nFor more information about export 🤗 Transformers to ExecuTorch, please check the doc for [Optimum-ExecuTorch](https://huggingface.co/docs/optimum-executorch/guides/export).\n\n\n### TensorFlow Lite\n\nBefore you begin, make sure you have all the necessary libraries installed :\n\n```bash\npip install optimum[exporters-tf]\n```\n\nJust as for ONNX, it is possible to export models to [TensorFlow Lite](https://www.tensorflow.org/lite) and quantize them.\nYou can find more information in our [documentation](https://huggingface.co/docs/optimum/main/exporters/tflite/usage_guides/export_a_model).\n\n### Intel (OpenVINO + Neural Compressor + IPEX)\n\nBefore you begin, make sure you have all the necessary [libraries installed](https://huggingface.co/docs/optimum/main/en/intel/installation).\n\nYou can find more information on the different integration in our [documentation](https://huggingface.co/docs/optimum/main/en/intel/index) and in the examples of [`optimum-intel`](https://github.com/huggingface/optimum-intel).\n\n\n### Quanto\n\n[Quanto](https://github.com/huggingface/optimum-quanto) is a pytorch quantization backenb which allowss you to quantize a model either using the python API or the `optimum-cli`.\n\nYou can see more details and [examples](https://github.com/huggingface/optimum-quanto/tree/main/examples) in the [Quanto](https://github.com/huggingface/optimum-quanto) repository.\n\n## Accelerated training\n\n🤗 Optimum provides wrappers around the original 🤗 Transformers [Trainer](https://huggingface.co/docs/transformers/main_classes/trainer) to enable training on powerful hardware easily.\nWe support many providers:\n\n- Habana's Gaudi processors\n- AWS Trainium instances, check [here](https://huggingface.co/docs/optimum-neuron/en/guides/distributed_training)\n- ONNX Runtime (optimized for GPUs)\n\n### Habana\n\nBefore you begin, make sure you have all the necessary libraries installed :\n\n```bash\npip install --upgrade --upgrade-strategy eager optimum[habana]\n```\n\nYou can find examples in the [documentation](https://huggingface.co/docs/optimum/habana/quickstart) and in the [examples](https://github.com/huggingface/optimum-habana/tree/main/examples).\n\n### ONNX Runtime\n\n\nBefore you begin, make sure you have all the necessary libraries installed :\n\n```bash\npip install optimum[onnxruntime-training]\n```\n\nYou can find examples in the [documentation](https://huggingface.co/docs/optimum/onnxruntime/usage_guides/trainer) and in the [examples](https://github.com/huggingface/optimum/tree/main/examples/onnxruntime/training).\n","funding_links":[],"categories":["Python","Table of Contents","其他_机器学习与深度学习","🤖 AI \u0026 Machine Learning","Optimizations and fine-tuning","Popular Libraries","Computation and Communication Optimisation","📋 Contents"],"sub_categories":["AI - Frameworks and Toolkits","⚡ 3. Inference Engines \u0026 Serving"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhuggingface%2Foptimum","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhuggingface%2Foptimum","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhuggingface%2Foptimum/lists"}