

Module to automatically maximize the utilization of GPU resources in a Kubernetes cluster through real-time dynamic partitioning and elastic quotas. Effortless optimization at its finest!





# Nebuly Operating System (nos)




If you like the project please support it by leaving a star ✨


`nos` is the open-source module for running AI workloads efficiently on Kubernetes,
increasing GPU utilization, cutting infrastructure costs, and improving workload performance.

Currently, the available features are:

* Dynamic GPU partitioning: lets you schedule Pods that request
fractions of a GPU. Partitioning is performed automatically in real time, based on the Pods pending and running in
the cluster, so that Pods can request only the resources they strictly need and GPUs are kept as fully utilized as possible.

* Elastic Resource Quota management: increases the number of Pods running on the
cluster by allowing namespaces to borrow reserved resource quotas from other namespaces, as long as those
namespaces are not using them.
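
To make the first feature concrete, a Pod can ask for a slice of a GPU through its resource limits. The manifest below is a minimal sketch, assuming MIG-based partitioning is enabled on the node and that the GPU model exposes a `1g.10gb` MIG profile. The Pod name, image, and resource name are illustrative assumptions, not values taken from this README, so adapt them to the profiles your GPUs actually support.

```yaml
# Hypothetical example: a Pod requesting one MIG slice instead of a whole GPU.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-fraction-example        # illustrative name
spec:
  restartPolicy: Never
  containers:
    - name: sleepy
      image: busybox:latest         # placeholder workload
      command: ["sleep", "120"]
      resources:
        limits:
          # Assumed resource name; the available MIG profiles depend on the GPU model.
          nvidia.com/mig-1g.10gb: 1
```

If no matching slice exists yet, a Pod like this stays pending until the partitioner creates one, which is the behavior the description above relies on.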
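
The quota borrowing described above could be configured with a resource along these lines. This is a sketch under stated assumptions: the API group `nos.nebuly.com/v1alpha1`, the `ElasticQuota` kind, and the `min`/`max` fields are assumptions, not details confirmed by this README, and the namespace and values are hypothetical. Check the CRDs installed in your cluster before relying on it.

```yaml
# Hypothetical example: namespace team-a is guaranteed 2 CPUs and may
# borrow unused quota from other namespaces up to 10 CPUs.
apiVersion: nos.nebuly.com/v1alpha1   # assumed API group and version
kind: ElasticQuota                    # assumed kind
metadata:
  name: quota-a                       # illustrative name
  namespace: team-a                   # illustrative namespace
spec:
  min:
    cpu: 2     # resources reserved for this namespace
  max:
    cpu: 10    # upper bound, including borrowed resources
```

The gap between `min` and `max` is what the namespace could borrow while other namespaces leave their reserved quota idle.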


## Getting started

### Prerequisites

* Kubernetes v1.23 or newer
* GPU support must be enabled
* Nebuly k8s-device-plugin (optional, required only if you want to enable MPS partitioning)
* cert-manager (optional, but recommended)

### Installation

You can install `nos` using Helm 3 (recommended).
You can find all the available configuration values in the Chart documentation.

    helm install oci:// \
      --version 0.1.2 \
      --namespace nebuly-nos \
      --generate-name

Alternatively, you can use Kustomize by cloning the repository and running `make deploy`.