{"id":25081760,"url":"https://github.com/openvinotoolkit/nncf","last_synced_at":"2026-04-08T12:03:22.198Z","repository":{"id":37102160,"uuid":"263687600","full_name":"openvinotoolkit/nncf","owner":"openvinotoolkit","description":"Neural Network Compression Framework for enhanced OpenVINO™ inference","archived":false,"fork":false,"pushed_at":"2025-05-12T13:59:22.000Z","size":65815,"stargazers_count":1009,"open_issues_count":50,"forks_count":253,"subscribers_count":30,"default_branch":"develop","last_synced_at":"2025-05-12T15:02:06.303Z","etag":null,"topics":["bert","classification","compression","deep-learning","genai","llm","mixed-precision-training","nlp","object-detection","onnx","openvino","pruning","pytorch","quantization","quantization-aware-training","semantic-segmentation","sparsity","tensorflow","transformers"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/openvinotoolkit.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":"CODEOWNERS","security":"Security.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2020-05-13T16:41:05.000Z","updated_at":"2025-05-12T13:56:39.000Z","dependencies_parsed_at":"2023-09-26T13:10:14.042Z","dependency_job_id":"7353617d-462f-41dd-b804-ca43b827c2a5","html_url":"https://github.com/openvinotoolkit/nncf","commit_stats":{"total_commits":1805,"total_committers":69,"mean_commits":"26.159420289855074","dds":0.7623268698060942,"last_synced_commit":"15333d873ad4610a40ab80b5555dfe02f5c209cb"},"previous_names":[],"tags_count":29,"template":false,"template_full_name":null,"repositor
y_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openvinotoolkit%2Fnncf","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openvinotoolkit%2Fnncf/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openvinotoolkit%2Fnncf/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openvinotoolkit%2Fnncf/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/openvinotoolkit","download_url":"https://codeload.github.com/openvinotoolkit/nncf/tar.gz/refs/heads/develop","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253764021,"owners_count":21960498,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bert","classification","compression","deep-learning","genai","llm","mixed-precision-training","nlp","object-detection","onnx","openvino","pruning","pytorch","quantization","quantization-aware-training","semantic-segmentation","sparsity","tensorflow","transformers"],"created_at":"2025-02-07T05:18:26.514Z","updated_at":"2026-04-08T12:03:22.179Z","avatar_url":"https://github.com/openvinotoolkit.png","language":"Python","readme":"\u003cdiv align=\"center\"\u003e\n\n# Neural Network Compression Framework (NNCF)\n\n[Key Features](#key-features) •\n[Installation](#installation-guide) •\n[Documentation](#documentation) •\n[Usage](#usage) •\n[Tutorials and Samples](#demos-tutorials-and-samples) •\n[Third-party integration](#third-party-repository-integration) •\n[Model Zoo](./docs/ModelZoo.md)\n\n[![GitHub 
Release](https://img.shields.io/github/v/release/openvinotoolkit/nncf?color=green)](https://github.com/openvinotoolkit/nncf/releases)\n[![Website](https://img.shields.io/website?up_color=blue\u0026up_message=docs\u0026url=https%3A%2F%2Fdocs.openvino.ai%2Fnncf)](https://docs.openvino.ai/nncf)\n[![Apache License Version 2.0](https://img.shields.io/badge/license-Apache_2.0-green.svg)](LICENSE)\n[![PyPI Downloads](https://static.pepy.tech/badge/nncf)](https://pypi.org/project/nncf/)\n\n![Python](https://img.shields.io/badge/python-3.10+-blue)\n![Backends](https://img.shields.io/badge/backends-openvino_|_pytorch_|_onnx_-orange)\n![OS](https://img.shields.io/badge/OS-Linux_|_Windows_|_MacOS-blue)\n\n\u003c/div\u003e\n\nNeural Network Compression Framework (NNCF) provides a suite of post-training and training-time algorithms for\noptimizing inference of neural networks in [OpenVINO\u0026trade;](https://docs.openvino.ai) with a minimal accuracy drop.\n\nNNCF is designed to work with models from [PyTorch](https://pytorch.org/),\n[TorchFX](https://pytorch.org/docs/stable/fx.html),\n[ONNX](https://onnx.ai/) and [OpenVINO\u0026trade;](https://docs.openvino.ai).\n\nNNCF provides [samples](#demos-tutorials-and-samples) that demonstrate the usage of compression algorithms for different\nuse cases and models. See compression results achievable with the NNCF-powered samples on the [NNCF Model Zoo page](./docs/ModelZoo.md).\n\nThe framework is organized as a Python\\* package that can be built and used in a standalone mode. 
The framework\narchitecture is unified to make it easy to add different compression algorithms for the supported deep\nlearning frameworks.\n\n\u003ca id=\"key-features\"\u003e\u003c/a\u003e\n\n## Key Features\n\n### Post-Training Compression Algorithms\n\n| Compression algorithm                                                                                    | OpenVINO      | PyTorch      | TorchFX       | ONNX          |\n| :------------------------------------------------------------------------------------------------------- | :-----------: | :----------: | :-----------: | :-----------: |\n| [Post-Training Quantization](./docs/usage/post_training_compression/post_training_quantization/Usage.md) | Supported     | Supported    | Experimental  | Supported     |\n| [Weights Compression](./docs/usage/post_training_compression/weights_compression/Usage.md)               | Supported     | Supported    | Experimental  | Supported     |\n| [Activation Sparsity](./src/nncf/experimental/torch/sparsify_activations/ActivationSparsity.md)          | Not supported | Experimental | Not supported | Not supported |\n\n### Training-Time Compression Algorithms\n\n| Compression algorithm                                                                                                                         | PyTorch   |\n| :-------------------------------------------------------------------------------------------------------------------------------------------- | :-------: |\n| [Quantization Aware Training](./docs/usage/training_time_compression/quantization_aware_training/Usage.md)                                    | Supported |\n| [Weight-Only Quantization Aware Training with LoRA and NLS](./docs/usage/training_time_compression/quantization_aware_training_lora/Usage.md) | Supported |\n| [Pruning](./docs/usage/training_time_compression/pruning/Usage.md)                                                                            | Supported |\n\n- Automatic, configurable model graph transformation to obtain the compressed model.\n- Common interface for compression methods.\n- GPU-accelerated layers for faster compressed model fine-tuning.\n- Distributed training support.\n- Git patch for a prominent third-party repository ([huggingface-transformers](https://github.com/huggingface/transformers)) demonstrating the process of integrating NNCF into custom training pipelines.\n- Exporting PyTorch compressed models to ONNX\* checkpoints, ready to use with the [OpenVINO\u0026trade; toolkit](https://docs.openvino.ai).\n\n\u003ca id=\"documentation\"\u003e\u003c/a\u003e\n\n## Documentation\n\nThis documentation covers detailed information about NNCF algorithms and the functions needed to contribute to NNCF.\n\nThe latest user documentation for NNCF is available [here](https://docs.openvino.ai/nncf).\n\nNNCF API documentation can be found [here](https://openvinotoolkit.github.io/nncf/autoapi/nncf/).\n\n\u003ca id=\"usage\"\u003e\u003c/a\u003e\n\n## Usage\n\n### Post-Training Quantization\n\nThe NNCF PTQ is the simplest way to apply 8-bit quantization. 
To run the algorithm, you only need your model and a small (~300 samples) calibration dataset.\n\n[OpenVINO](https://github.com/openvinotoolkit/openvino) is the preferred backend to run PTQ with, while PyTorch and ONNX are also supported.\n\n\u003cdetails open\u003e\u003csummary\u003e\u003cb\u003eOpenVINO\u003c/b\u003e\u003c/summary\u003e\n\n```python\nimport nncf\nimport openvino as ov\nimport torch\nfrom torchvision import datasets, transforms\n\n# Instantiate your uncompressed model\nmodel = ov.Core().read_model(\"/model_path\")\n\n# Provide validation part of the dataset to collect statistics needed for the compression algorithm\nval_dataset = datasets.ImageFolder(\"/path\", transform=transforms.Compose([transforms.ToTensor()]))\ndataset_loader = torch.utils.data.DataLoader(val_dataset, batch_size=1)\n\n# Step 1: Initialize transformation function\ndef transform_fn(data_item):\n    images, _ = data_item\n    return images\n\n# Step 2: Initialize NNCF Dataset\ncalibration_dataset = nncf.Dataset(dataset_loader, transform_fn)\n# Step 3: Run the quantization pipeline\nquantized_model = nncf.quantize(model, calibration_dataset)\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\u003csummary\u003e\u003cb\u003ePyTorch\u003c/b\u003e\u003c/summary\u003e\n\n```python\nimport nncf\nimport torch\nfrom torchvision import datasets, models, transforms\n\n# Instantiate your uncompressed model\nmodel = models.mobilenet_v2()\n\n# Provide validation part of the dataset to collect statistics needed for the compression algorithm\nval_dataset = datasets.ImageFolder(\"/path\", transform=transforms.Compose([transforms.ToTensor()]))\ndataset_loader = torch.utils.data.DataLoader(val_dataset)\n\n# Step 1: Initialize the transformation function\ndef transform_fn(data_item):\n    images, _ = data_item\n    return images\n\n# Step 2: Initialize NNCF Dataset\ncalibration_dataset = nncf.Dataset(dataset_loader, transform_fn)\n# Step 3: Run the quantization pipeline\nquantized_model = nncf.quantize(model, calibration_dataset)\n```\n\n**NOTE** If the Post-Training Quantization algorithm does not meet quality requirements, you can fine-tune the quantized PyTorch model. You can find an example of the Quantization-Aware Training pipeline for a PyTorch model [here](examples/quantization_aware_training/torch/resnet18/README.md).\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\u003csummary\u003e\u003cb\u003eTorchFX\u003c/b\u003e\u003c/summary\u003e\n\n```python\nimport nncf\nimport torch.fx\nfrom torchvision import datasets, models, transforms\n\n# Instantiate your uncompressed model\nmodel = models.mobilenet_v2()\n\n# Provide validation part of the dataset to collect statistics needed for the compression algorithm\nval_dataset = datasets.ImageFolder(\"/path\", transform=transforms.Compose([transforms.ToTensor()]))\ndataset_loader = torch.utils.data.DataLoader(val_dataset)\n\n# Step 1: Initialize the transformation function\ndef transform_fn(data_item):\n    images, _ = data_item\n    return images\n\n# Step 2: Initialize NNCF Dataset\ncalibration_dataset = nncf.Dataset(dataset_loader, transform_fn)\n\n# Step 3: Export model to TorchFX\ninput_shape = (1, 3, 224, 224)\nex_input = torch.ones(input_shape)\nfx_model = torch.export.export_for_training(model, args=(ex_input,)).module()\n# or\n# fx_model = torch.export.export(model, args=(ex_input,)).module()\n\n# Step 4: Run the quantization pipeline\nquantized_fx_model = nncf.quantize(fx_model, calibration_dataset)\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\u003csummary\u003e\u003cb\u003eONNX\u003c/b\u003e\u003c/summary\u003e\n\n```python\nimport onnx\nimport nncf\nimport torch\nfrom torchvision import datasets, transforms\n\n# Instantiate your uncompressed model\nonnx_model = onnx.load_model(\"/model_path\")\n\n# Provide validation part of the dataset to collect statistics needed for the compression algorithm\nval_dataset = datasets.ImageFolder(\"/path\", transform=transforms.Compose([transforms.ToTensor()]))\ndataset_loader = torch.utils.data.DataLoader(val_dataset, 
batch_size=1)\n\n# Step 1: Initialize transformation function\ninput_name = onnx_model.graph.input[0].name\ndef transform_fn(data_item):\n    images, _ = data_item\n    return {input_name: images.numpy()}\n\n# Step 2: Initialize NNCF Dataset\ncalibration_dataset = nncf.Dataset(dataset_loader, transform_fn)\n# Step 3: Run the quantization pipeline\nquantized_model = nncf.quantize(onnx_model, calibration_dataset)\n```\n\n\u003c/details\u003e\n\n[//]: # (NNCF provides full  [samples]\u0026#40;#post-training-quantization-samples\u0026#41;, which demonstrate Post-Training Quantization usage for PyTorch, ONNX, and OpenVINO.)\n\n### Training-Time Quantization\n\nHere is an example of an Accuracy-Aware Quantization pipeline, where model weights and compression parameters may be fine-tuned to achieve a higher accuracy.\n\n\u003cdetails\u003e\u003csummary\u003e\u003cb\u003ePyTorch\u003c/b\u003e\u003c/summary\u003e\n\n```python\nimport nncf\nimport torch\nfrom torchvision import datasets, models, transforms\n\n# Instantiate your uncompressed model\nmodel = models.mobilenet_v2()\n\n# Provide validation part of the dataset to collect statistics needed for the compression algorithm\nval_dataset = datasets.ImageFolder(\"/path\", transform=transforms.Compose([transforms.ToTensor()]))\ndataset_loader = torch.utils.data.DataLoader(val_dataset)\n\n# Step 1: Initialize the transformation function\ndef transform_fn(data_item):\n    images, _ = data_item\n    return images\n\n# Step 2: Initialize NNCF Dataset\ncalibration_dataset = nncf.Dataset(dataset_loader, transform_fn)\n# Step 3: Run the quantization pipeline\nquantized_model = nncf.quantize(model, calibration_dataset)\n\n# Now use quantized_model as a usual torch.nn.Module\n# to fine-tune compression parameters along with the model weights\n\n# Save quantization modules and the quantized model parameters\ncheckpoint = {\n    'state_dict': quantized_model.state_dict(),\n    'nncf_config': nncf.torch.get_config(quantized_model),\n    ... # the rest of the user-defined objects to save\n}\ntorch.save(checkpoint, path_to_checkpoint)\n\n# ...\n\n# Load quantization modules and the quantized model parameters\nresuming_checkpoint = torch.load(path_to_checkpoint)\nnncf_config = resuming_checkpoint['nncf_config']\nstate_dict = resuming_checkpoint['state_dict']\n\nquantized_model = nncf.torch.load_from_config(model, nncf_config, example_input)\nquantized_model.load_state_dict(state_dict)\n# ... the rest of the usual PyTorch-powered training pipeline\n```\n\n\u003c/details\u003e\n\n\u003ca id=\"demos-tutorials-and-samples\"\u003e\u003c/a\u003e\n\n## Demos, Tutorials and Samples\n\nFor a quicker start with NNCF-powered compression, try the sample notebooks and scripts presented below.\n\n### Jupyter* Notebook Tutorials and Demos\n\nReady-to-run Jupyter* notebook tutorials and demos are available to explain and display NNCF compression algorithms for optimizing models for inference with the OpenVINO Toolkit:\n\n| Notebook Tutorial Name                                                                                                                                                                                                                                                                                                                                 |                                  Compression Algorithm                                  |  Backend   |               Domain                |\n|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------:|:----------:|:-----------------------------------:|\n| [BERT 
Quantization](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/language-quantize-bert)\u003cbr\u003e[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/language-quantize-bert/language-quantize-bert.ipynb) |                               Post-Training Quantization                                |  OpenVINO  |                 NLP                 |\n| [MONAI Segmentation Model Quantization](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/ct-segmentation-quantize)\u003cbr\u003e[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/openvinotoolkit/openvino_notebooks/HEAD?filepath=notebooks%2Fct-segmentation-quantize%2Fct-scan-live-inference.ipynb)     |                               Post-Training Quantization                                |  OpenVINO  |            Segmentation             |\n| [PyTorch Model Quantization](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/pytorch-post-training-quantization-nncf)                                                                                                                                                                                                      |                               Post-Training Quantization                                |  PyTorch   |        Image Classification         |\n| [YOLOv11 Quantization with Accuracy Control](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/yolov11-quantization-with-accuracy-control)                                                                                                                                                                                               |                    Post-Training Quantization with Accuracy Control                     |  OpenVINO  | Speech-to-Text,\u003cbr\u003eObject Detection |\n\nA list 
of notebooks demonstrating OpenVINO conversion and inference together with NNCF compression for models from various domains:\n\n| Demo Model                                                                                                                                                                                                                                                                                                                                        |               Compression Algorithm               |  Backend  |                                Domain                                |\n|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------:|:---------:|:--------------------------------------------------------------------:|\n| [YOLOv8](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/yolov8-optimization)\u003cbr\u003e[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/yolov8-optimization/yolov8-object-detection.ipynb)            |            Post-Training Quantization             | OpenVINO  |  Object Detection,\u003cbr\u003eKeyPoint Detection,\u003cbr\u003eInstance Segmentation   |\n| [EfficientSAM](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/efficient-sam)                                                                                                                                                                                                                                         |            Post-Training Quantization             | OpenVINO  |   
                       Image Segmentation                          |\n| [Segment Anything Model](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/segment-anything)                                                                                                                                                                                                                            |            Post-Training Quantization             | OpenVINO  |                          Image Segmentation                          |\n| [OneFormer](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/oneformer-segmentation)                                                                                                                                                                                                                                   |            Post-Training Quantization             | OpenVINO  |                          Image Segmentation                          |\n| [CLIP](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/clip-zero-shot-image-classification)                                                                                                                                                                                                                           |            Post-Training Quantization             | OpenVINO  |                            Image-to-Text                             |\n| [BLIP](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/blip-visual-language-processing)                                                                                                                                                                                                                               |            Post-Training Quantization             | OpenVINO  |                            Image-to-Text                             |\n| [Latent 
Consistency Model](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/latent-consistency-models-image-generation)                                                                                                                                                                                                |            Post-Training Quantization             | OpenVINO  |                            Text-to-Image                             |\n| [Distil-Whisper](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/distil-whisper-asr)                                                                                                                                                                                                                                  |            Post-Training Quantization             | OpenVINO  |                            Speech-to-Text                            |\n| [Whisper](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/whisper-subtitles-generation)\u003cbr\u003e[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/whisper-subtitles-generation/whisper-convert.ipynb) |            Post-Training Quantization             | OpenVINO  |                            Speech-to-Text                            |\n| [MMS Speech Recognition](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/mms-massively-multilingual-speech)                                                                                                                                                                                                           |            Post-Training Quantization             | OpenVINO  |                            Speech-to-Text                            |\n| [LLM Instruction 
Following](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/llm-question-answering)                                                                                                                                                                                                                   |                Weight Compression                 | OpenVINO  |                      NLP, Instruction Following                      |\n| [LLM Chat Bots](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/llm-chatbot)                                                                                                                                                                                                                                          |                Weight Compression                 | OpenVINO  |                            NLP, Chat Bot                             |\n\n### Post-Training Quantization and Weight Compression Examples\n\nCompact scripts demonstrating quantization/weight compression and corresponding inference speed boost:\n\n| Example Name                                                                                                                             |              Compression Algorithm               |  Backend   |         Domain         |\n|:-----------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------:|:----------:|:----------------------:|\n| [OpenVINO MobileNetV2](./examples/post_training_quantization/openvino/mobilenet_v2/README.md)                                            |            Post-Training Quantization            |  OpenVINO  |  Image Classification  |\n| [OpenVINO YOLO26](./examples/post_training_quantization/openvino/yolo26/README.md)                                                       |            Post-Training Quantization            |  
OpenVINO  |    Object Detection    |\n| [OpenVINO YOLOv8 QwAC](./examples/post_training_quantization/openvino/yolov8_quantize_with_accuracy_control/README.md)                   | Post-Training Quantization with Accuracy Control |  OpenVINO  |    Object Detection    |\n| [OpenVINO Anomaly Classification](./examples/post_training_quantization/openvino/anomaly_stfpm_quantize_with_accuracy_control/README.md) | Post-Training Quantization with Accuracy Control |  OpenVINO  | Anomaly Classification |\n| [PyTorch MobileNetV2](./examples/post_training_quantization/torch/mobilenet_v2/README.md)                                                |            Post-Training Quantization            |  PyTorch   |  Image Classification  |\n| [PyTorch SSD](./examples/post_training_quantization/torch/ssd300_vgg16/README.md)                                                        |            Post-Training Quantization            |  PyTorch   |    Object Detection    |\n| [TorchFX Resnet18](./examples/post_training_quantization/torch_fx/resnet18/README.md)                                                    |            Post-Training Quantization            |  TorchFX   |  Image Classification  |\n| [ONNX MobileNetV2](./examples/post_training_quantization/onnx/mobilenet_v2/README.md)                                                    |            Post-Training Quantization            |    ONNX    |  Image Classification  |\n| [ONNX YOLOv8 QwAC](./examples/post_training_quantization/onnx/yolov8_quantize_with_accuracy_control/README.md)                           | Post-Training Quantization with Accuracy Control |    ONNX    |    Object Detection    |\n| [ONNX TinyLlama WC](./examples/llm_compression/onnx/tiny_llama/README.md)                                                                |                Weight Compression                |    ONNX    |           LLM          |\n| [TorchFX TinyLlama WC](./examples/llm_compression/torch_fx/tiny_llama/README.md)                                
                         |                Weight Compression                |  TorchFX   |           LLM          |\n| [OpenVINO TinyLlama WC](./examples/llm_compression/openvino/tiny_llama/README.md)                                                        |                Weight Compression                |  OpenVINO  |           LLM          |\n| [OpenVINO TinyLlama WC with HS](./examples/llm_compression/openvino/tiny_llama_find_hyperparams/README.md)                               |  Weight Compression with Hyperparameters Search  |  OpenVINO  |           LLM          |\n| [ONNX TinyLlama WC with SE](./examples/llm_compression/onnx/tiny_llama_scale_estimation/README.md)                                       |     Weight Compression with Scale Estimation     |    ONNX    |           LLM          |\n\n### Quantization-Aware Training Examples\n\n| Example Name                                                                        |   Compression Algorithm     | Backend |        Domain        |\n|:------------------------------------------------------------------------------------|:---------------------------:|:-------:|:--------------------:|\n| [PyTorch Resnet18](./examples/quantization_aware_training/torch/resnet18/README.md) | Quantization-Aware Training | PyTorch | Image Classification |\n| [PyTorch Anomalib](./examples/quantization_aware_training/torch/anomalib/README.md) | Quantization-Aware Training | PyTorch | Anomaly Detection    |\n\n\u003ca id=\"third-party-repository-integration\"\u003e\u003c/a\u003e\n\n## Third-party Repository Integration\n\nNNCF may be easily integrated into training/evaluation pipelines of third-party repositories.\n\n### Used by\n\n- [HuggingFace Optimum Intel](https://huggingface.co/docs/optimum/intel/optimization_ov)\n\n  NNCF is used as a compression backend within the renowned `transformers` repository in HuggingFace Optimum Intel. 
For instance, the command below exports the [Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) model to OpenVINO format with INT4-quantized weights:\n\n  ```bash\n  optimum-cli export openvino -m meta-llama/Llama-3.2-3B-Instruct --weight-format int4 ./Llama-3.2-3B-Instruct-int4\n  ```\n\n- [Ultralytics](https://docs.ultralytics.com/integrations/openvino)\n\n  NNCF is integrated into the Intel OpenVINO export pipeline, enabling quantization for the exported models.\n\n- [ExecuTorch](https://github.com/pytorch/executorch/blob/main/examples/openvino/README.md)\n\n  NNCF is used as the primary quantization framework for the [ExecuTorch OpenVINO integration](https://docs.pytorch.org/executorch/main/build-run-openvino.html).\n\n- [torch.compile](https://docs.pytorch.org/tutorials/prototype/openvino_quantizer.html)\n\n  NNCF is used as the primary quantization framework for the [torch.compile OpenVINO integration](https://docs.openvino.ai/2026/openvino-workflow/torch-compile.html).\n\n- [OpenVINO Training Extensions](https://github.com/openvinotoolkit/training_extensions)\n\n  NNCF is integrated into OpenVINO Training Extensions as a model optimization backend. 
You can train, optimize, and\n  export new models based on available model templates as well as run the exported models with OpenVINO.\n\n- [Microsoft Olive](https://github.com/microsoft/olive)\n\n  NNCF is used to quantize OpenVINO IR and ONNX models for the [OpenVINO integration](https://microsoft.github.io/Olive/features/ihv-integration/openvino.html).\n\n\u003ca id=\"installation-guide\"\u003e\u003c/a\u003e\n\n## Installation Guide\n\nFor detailed installation instructions, refer to the [Installation](./docs/Installation.md) guide.\n\nNNCF can be installed as a regular PyPI package via pip:\n\n```bash\npip install nncf\n```\n\nNNCF is also available via [conda](https://anaconda.org/conda-forge/nncf):\n\n```bash\nconda install -c conda-forge nncf\n```\n\nThe system requirements of NNCF depend on the backend used. System requirements for each backend and\nthe matrix of corresponding versions can be found in [Installation.md](./docs/Installation.md).\n\n## NNCF Compressed Model Zoo\n\nA list of models and their compression results can be found on the [NNCF Model Zoo page](./docs/ModelZoo.md).\n\n## Citing\n\n```bibtex\n@article{kozlov2020neural,\n    title =   {Neural network compression framework for fast model inference},\n    author =  {Kozlov, Alexander and Lazarevich, Ivan and Shamporov, Vasily and Lyalyushkin, Nikolay and Gorbachev, Yury},\n    journal = {arXiv preprint arXiv:2002.08679},\n    year =    {2020}\n}\n```\n\n## Contributing Guide\n\nRefer to the [CONTRIBUTING.md](./CONTRIBUTING.md) file for guidelines on contributions to the NNCF repository.\n\n## Useful links\n\n- [Documentation](./docs)\n- [Examples](./examples)\n- [FAQ](./docs/FAQ.md)\n- [Notebooks](https://github.com/openvinotoolkit/openvino_notebooks#-model-training)\n- [HuggingFace Optimum Intel](https://huggingface.co/docs/optimum/intel/optimization_ov)\n- [OpenVINO Model Optimization Guide](https://docs.openvino.ai/nncf)\n- [OpenVINO Hugging Face 
page](https://huggingface.co/OpenVINO#models)\n- [OpenVINO Performance Benchmarks page](https://docs.openvino.ai/2026/about-openvino/performance-benchmarks.html)\n\n## Telemetry\n\nNNCF, as part of the OpenVINO™ toolkit, collects anonymous usage data for the purpose of improving OpenVINO™ tools.\nYou can opt out at any time by running the following command in the Python environment where you have NNCF installed:\n\n`opt_in_out --opt_out`\n\nMore information is available at [OpenVINO telemetry](https://docs.openvino.ai/2026/about-openvino/additional-resources/telemetry.html).\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopenvinotoolkit%2Fnncf","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fopenvinotoolkit%2Fnncf","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopenvinotoolkit%2Fnncf/lists"}