{"id":14256730,"url":"https://github.com/Project-HAMi/HAMi","last_synced_at":"2025-08-12T20:33:07.812Z","repository":{"id":37448719,"uuid":"406346103","full_name":"Project-HAMi/HAMi","owner":"Project-HAMi","description":"Heterogeneous AI Computing Virtualization Middleware","archived":false,"fork":false,"pushed_at":"2024-10-29T09:36:51.000Z","size":126992,"stargazers_count":846,"open_issues_count":113,"forks_count":179,"subscribers_count":21,"default_branch":"master","last_synced_at":"2024-10-29T11:44:50.873Z","etag":null,"topics":["device-plugin","gpu-management","gpu-virtualization","kubernetes-gpu-cluster","vgpu"],"latest_commit_sha":null,"homepage":"http://project-hami.io/","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Project-HAMi.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS","dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-09-14T11:51:49.000Z","updated_at":"2024-10-29T10:53:30.000Z","dependencies_parsed_at":"2024-01-22T09:10:46.710Z","dependency_job_id":"fd64bb09-3551-462f-9618-457c73a0e54b","html_url":"https://github.com/Project-HAMi/HAMi","commit_stats":null,"previous_names":["project-hami/hami"],"tags_count":128,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Project-HAMi%2FHAMi","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Project-HAMi%2FHAMi/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Project-HAMi%2FHAMi/releases","manifests_url":"https://repos.ecosyste.ms/api/v
1/hosts/GitHub/repositories/Project-HAMi%2FHAMi/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Project-HAMi","download_url":"https://codeload.github.com/Project-HAMi/HAMi/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":229708170,"owners_count":18111226,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["device-plugin","gpu-management","gpu-virtualization","kubernetes-gpu-cluster","vgpu"],"created_at":"2024-08-22T07:01:06.472Z","updated_at":"2025-08-12T20:33:07.783Z","avatar_url":"https://github.com/Project-HAMi.png","language":"Go","readme":"English version | [中文版](README_cn.md)\n\n\u003cimg src=\"imgs/hami-horizontal-colordark.png\" width=\"600px\"\u003e\n\n[![LICENSE](https://img.shields.io/github/license/Project-HAMi/HAMi.svg)](/LICENSE)\n[![build status](https://github.com/Project-HAMi/HAMi/actions/workflows/ci.yaml/badge.svg)](https://github.com/Project-HAMi/HAMi/actions/workflows/ci.yaml)\n[![Releases](https://img.shields.io/github/v/release/Project-HAMi/HAMi)](https://github.com/Project-HAMi/HAMi/releases/latest)\n[![OpenSSF Best Practices](https://www.bestpractices.dev/projects/9416/badge)](https://www.bestpractices.dev/en/projects/9416)\n[![Go Report Card](https://goreportcard.com/badge/github.com/Project-HAMi/HAMi)](https://goreportcard.com/report/github.com/Project-HAMi/HAMi)\n[![codecov](https://codecov.io/gh/Project-HAMi/HAMi/branch/master/graph/badge.svg?token=ROM8CMPXZ6)](https://codecov.io/gh/Project-HAMi/HAMi)\n[![FOSSA 
Status](https://app.fossa.com/api/projects/git%2Bgithub.com%2FProject-HAMi%2FHAMi.svg?type=shield)](https://app.fossa.com/projects/git%2Bgithub.com%2FProject-HAMi%2FHAMi?ref=badge_shield)\n[![docker pulls](https://img.shields.io/docker/pulls/projecthami/hami.svg)](https://hub.docker.com/r/projecthami/hami)\n[![slack](https://img.shields.io/badge/Slack-Join%20Slack-blue)](https://cloud-native.slack.com/archives/C07T10BU4R2)\n[![discuss](https://img.shields.io/badge/Discuss-Ask%20Questions-blue)](https://github.com/Project-HAMi/HAMi/discussions)\n[![website](https://img.shields.io/badge/website-blue)](http://project-hami.io)\n[![Contact Me](https://img.shields.io/badge/Contact%20Me-blue)](https://github.com/Project-HAMi/HAMi#contact)\n\n## Project-HAMi: Heterogeneous AI Computing Virtualization Middleware\n\n## Introduction\n\nHAMi, formerly known as 'k8s-vGPU-scheduler', is a heterogeneous device management middleware for Kubernetes. It can manage different types of heterogeneous devices (such as GPUs and NPUs), share them among pods, and make better scheduling decisions based on device topology and scheduling policies.\n\nIt aims to bridge the gap between different heterogeneous devices and provide a unified management interface, with no changes required to user applications. As of December 2024, HAMi has been widely used not only in the Internet, public cloud, and private cloud sectors, but also broadly adopted in various vertical industries including finance, securities, energy, telecommunications, education, and manufacturing. More than 50 companies and institutions are not only end users but also active contributors. 
\n\n![cncf_logo](imgs/cncf-logo.png)\n\nHAMi is a sandbox and [landscape](https://landscape.cncf.io/?item=orchestration-management--scheduling-orchestration--hami) project of the  \n[Cloud Native Computing Foundation](https://cncf.io/) (CNCF), and a \n[CNAI Landscape project](https://landscape.cncf.io/?group=cnai\u0026item=cnai--general-orchestration--hami).\n\n\n## Device virtualization\n\nHAMi provides device virtualization for several heterogeneous devices, including GPUs, by supporting device sharing and device resource isolation. For the list of devices supporting device virtualization, see [supported devices](#supported-devices).\n\n### Device sharing\n\n- Allows partial device allocation by specifying device core usage.\n- Allows partial device allocation by specifying device memory.\n- Imposes a hard limit on streaming multiprocessors.\n- Requires zero changes to existing programs.\n- Supports the [dynamic-mig](docs/dynamic-mig-support.md) feature ([example](examples/nvidia/dynamic_mig_example.yaml)).\n\n\u003cimg src=\"./imgs/example.png\" width = \"500\" /\u003e \n\n### Device Resource Isolation\n\nA simple demonstration of device isolation: a task with the following resources will see 3000M of device memory inside the container:\n\n```yaml\n      resources:\n        limits:\n          nvidia.com/gpu: 1 # declare how many physical GPUs the pod needs\n          nvidia.com/gpumem: 3000 # each physical GPU allocates 3G of device memory to the pod\n```\n\n![img](./imgs/hard_limit.jpg)\n\n\u003e Note:\n1. **After installing HAMi, the value of `nvidia.com/gpu` registered on the node defaults to the number of vGPUs.**\n2. 
**When requesting resources in a pod, `nvidia.com/gpu` refers to the number of physical GPUs required by the current pod.**\n\n### Supported devices\n\n[NVIDIA GPU](https://github.com/Project-HAMi/HAMi#preparing-your-gpu-nodes)   \n[Cambricon MLU](docs/cambricon-mlu-support.md)   \n[HYGON DCU](docs/hygon-dcu-support.md)   \n[Iluvatar CoreX GPU](docs/iluvatar-gpu-support.md)   \n[Moore Threads GPU](docs/mthreads-support.md)   \n[HUAWEI Ascend NPU](https://github.com/Project-HAMi/ascend-device-plugin/blob/main/README.md)   \n[MetaX GPU](docs/metax-support.md)   \n\n## Architecture\n\n\u003cimg src=\"./imgs/hami-arch.png\" width = \"600\" /\u003e \n\nHAMi consists of several components: a unified mutating webhook, a unified scheduler extender, device plugins for each device type, and in-container virtualization techniques for each heterogeneous AI device.\n\n## Quick Start\n\n### Choose your orchestrator\n\n[![kube-scheduler](https://img.shields.io/badge/kube-scheduler-blue)](#prerequisites)\n[![volcano-scheduler](https://img.shields.io/badge/volcano-scheduler-orange)](docs/how-to-use-volcano-vgpu.md)\n\n### Prerequisites\n\nThe prerequisites for running the NVIDIA device plugin are listed below:\n\n- NVIDIA drivers \u003e= 440\n- nvidia-docker version \u003e 2.0\n- default runtime configured as nvidia for the containerd/docker/cri-o container runtime\n- Kubernetes version \u003e= 1.18\n- glibc \u003e= 2.17 \u0026 glibc \u003c 2.30\n- kernel version \u003e= 3.10\n- helm \u003e 3.0\n\n### Install\n\nFirst, label your GPU nodes for scheduling with HAMi by adding the label \"gpu=on\". 
Without this label, the nodes cannot be managed by our scheduler.\n\n```\nkubectl label nodes {nodeid} gpu=on\n```\n\nAdd our helm repo:\n\n```\nhelm repo add hami-charts https://project-hami.github.io/HAMi/\n```\n\nUse the following command for deployment:\n\n```\nhelm install hami hami-charts/hami -n kube-system\n```\n\nCustomize your installation by adjusting the [configs](docs/config.md).\n\nVerify your installation using the following command:\n\n```\nkubectl get pods -n kube-system\n```\n\nIf both the `hami-device-plugin` (formerly known as `vgpu-device-plugin`) and `hami-scheduler` (formerly known as `vgpu-scheduler`) pods are in the *Running* state, your installation is successful. You can try the examples [here](examples/nvidia/default_use.yaml).\n\n### WebUI\n\n[HAMi-WebUI](https://github.com/Project-HAMi/HAMi-WebUI) is available as of HAMi v2.4.\n\nFor the installation guide, click [here](https://github.com/Project-HAMi/HAMi-WebUI/blob/main/docs/installation/helm/index.md).\n\n### Monitor\n\nMonitoring is automatically enabled after installation. Obtain an overview of cluster information by visiting the following URL:\n\n```\nhttp://{scheduler ip}:{monitorPort}/metrics\n```\n\nThe default monitorPort is 31993; other values can be set using `--set devicePlugin.service.httpPort` during installation.\n\nGrafana dashboard [example](docs/dashboard.md)\n\n\u003e **Note** The status of a node won't be collected before you submit a task.\n\n## Notes\n\n- If you don't request vGPUs when using the device plugin with NVIDIA images, all the GPUs on the machine may be exposed inside your container.\n- Currently, A100 MIG is supported only in the \"none\" and \"mixed\" modes.\n- Tasks with the \"nodeName\" field cannot be scheduled at the moment; please use \"nodeSelector\" instead.\n\n## RoadMap, Governance \u0026 Contributing\n\nThe project is governed by a group of [Maintainers](./MAINTAINERS.md) and [Contributors](./AUTHORS.md). 
How they are selected and how they govern is outlined in our [Governance Document](https://github.com/Project-HAMi/community/blob/main/governance.md).\n\nIf you're interested in being a contributor and want to get involved in developing the HAMi code, please see [CONTRIBUTING](CONTRIBUTING.md) for details on submitting patches and the contribution workflow.\n\nSee the [RoadMap](docs/develop/roadmap.md) for anything you may be interested in.\n\n## Meeting \u0026 Contact\n\nThe HAMi community is committed to fostering an open and welcoming environment, with several ways to engage with other users and developers.\n\nIf you have any questions, please feel free to reach out to us through the following channels:\n\n- Regular Community Meeting: Friday at 16:00 UTC+8 (Chinese, weekly). [Convert to your timezone](https://www.thetimezoneconverter.com/?t=14%3A30\u0026tz=GMT%2B8\u0026).\n  - [Meeting Notes and Agenda](https://docs.google.com/document/d/1YC6hco03_oXbF9IOUPJ29VWEddmITIKIfSmBX8JtGBw/edit#heading=h.g61sgp7w0d0c)\n  - [Meeting Link](https://meeting.tencent.com/dm/Ntiwq1BICD1P)\n- Email: refer to the [MAINTAINERS.md](MAINTAINERS.md) to find the email addresses of all maintainers. 
Feel free to contact them via email to report any issues or ask questions.\n- [mailing list](https://groups.google.com/forum/#!forum/hami-project)\n- [slack](https://cloud-native.slack.com/archives/C07T10BU4R2) | [Join](https://slack.cncf.io/)\n\n## Talks and References\n\n|                  | Link                                                                                                                    |\n|------------------|-------------------------------------------------------------------------------------------------------------------------|\n| CHINA CLOUD COMPUTING INFRASTRUCTURE DEVELOPER CONFERENCE (Beijing 2024) | [Unlocking heterogeneous AI infrastructure on k8s clusters](https://live.csdn.net/room/csdnnews/3zwDP09S) Starting from 03:06:15 |\n| KubeDay(Japan 2024) | [Unlocking Heterogeneous AI Infrastructure K8s Cluster: Leveraging the Power of HAMi](https://www.youtube.com/watch?v=owoaSb4nZwg) |\n| KubeCon \u0026 AI_dev Open Source GenAI \u0026 ML Summit(China 2024) | [Is Your GPU Really Working Efficiently in the Data Center? N Ways to Improve GPU Usage](https://www.youtube.com/watch?v=ApkyK3zLF5Q) |\n| KubeCon \u0026 AI_dev Open Source GenAI \u0026 ML Summit(China 2024) | [Unlocking Heterogeneous AI Infrastructure K8s Cluster](https://www.youtube.com/watch?v=kcGXnp_QShs)                                     |\n| KubeCon(EU 2024)| [Cloud Native Batch Computing with Volcano: Updates and Future](https://youtu.be/fVYKk6xSOsw) |\n\n## License\n\nHAMi is under the Apache 2.0 license. 
See the [LICENSE](LICENSE) file for details.\n\n## Star History\n\n[![Star History Chart](https://api.star-history.com/svg?repos=Project-HAMi/HAMi\u0026type=Date)](https://star-history.com/#Project-HAMi/HAMi\u0026Date)\n","funding_links":[],"categories":["Go","GPU","分布式机器学习"],"sub_categories":["Scheduling"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FProject-HAMi%2FHAMi","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FProject-HAMi%2FHAMi","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FProject-HAMi%2FHAMi/lists"}