Tensor Fusion is a state-of-the-art GPU virtualization and pooling solution designed to optimize GPU cluster utilization to its fullest potential.
- Host: GitHub
- URL: https://github.com/nexusgpu/tensor-fusion
- Owner: NexusGPU
- License: apache-2.0
- Created: 2024-11-12T23:49:48.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2026-01-29T14:56:52.000Z (14 days ago)
- Last Synced: 2026-01-29T21:48:41.321Z (14 days ago)
- Topics: ai, amd-gpu, autoscaling, dynamic-resource-allocation, gpu, gpu-acceleration, gpu-pooling, gpu-scheduling, gpu-usage, gpu-virtualization, inference, karpenter, kubernetes, llm-serving, nvidia, pytorch, rcuda, remote-gpu, vgpu
- Language: Go
- Homepage: https://tensor-fusion.ai
- Size: 2.89 MB
- Stars: 117
- Watchers: 3
- Forks: 26
- Open Issues: 8
Metadata Files:
- Readme: README.md
- License: LICENSE
- Security: SECURITY.md
README
TensorFusion.AI
Fewer GPUs, More AI Apps.
Explore the docs »
View Demo | Report Bug | Request Feature
[![Contributors][contributors-shield]][contributors-url]
[![Forks][forks-shield]][forks-url]
[![Stargazers][stars-shield]][stars-url]
[![Apache 2.0 License][license-shield]][license-url]
[![LinkedIn][linkedin-shield]][linkedin-url]
[DeepWiki](https://deepwiki.com/NexusGPU/tensor-fusion)
Tensor Fusion is a state-of-the-art **GPU virtualization and pooling solution** designed to optimize GPU cluster utilization to its fullest potential.
## Highlights
#### Fractional Virtual GPU
#### Remote GPU Sharing over Ethernet/InfiniBand
#### GPU-first Scheduling and Auto-scaling
#### GPU Oversubscription and VRAM Expansion
#### GPU Pooling, Monitoring, Live Migration, Model Preloading and more
## Quick Start
### Onboard Your Own AI Infra
- [Deploy in Kubernetes with Cloud Console](https://tensor-fusion.ai/guide/getting-started/deployment-k8s)
- [Deploy in Kubernetes with Helm chart](https://tensor-fusion.ai/guide/recipes/deploy-k8s-local-mode) (see the sketch after this list)
- [Create new cluster in VM/BareMetal](https://tensor-fusion.ai/guide/getting-started/deployment-vm)
- [Run vGPU in VM Hypervisor](https://tensor-fusion.ai/guide/getting-started/deployment-vm)
- [Learn Essential Concepts & Architecture](https://tensor-fusion.ai/guide/getting-started/architecture)
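For the Helm-based Kubernetes path, the overall flow looks roughly like the following. The chart repository URL, chart name, and namespace below are illustrative assumptions, not confirmed values; follow the linked deployment guides for the authoritative steps.

```bash
# Rough sketch of a Helm-based install. The repo URL, chart name, and
# namespace are assumptions for illustration; see the deployment guides
# linked above for the real values.
helm repo add tensor-fusion https://nexusgpu.github.io/tensor-fusion   # assumed chart repo URL
helm repo update
helm install tensor-fusion tensor-fusion/tensor-fusion \
  --namespace tensor-fusion-sys --create-namespace                     # assumed chart and namespace names
kubectl get pods -n tensor-fusion-sys                                  # verify the control plane is running
```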
### Discussion
- Discord channel: [https://discord.gg/2bybv9yQNk](https://discord.gg/2bybv9yQNk)
- Discuss anything about TensorFusion: [Github Discussions](https://github.com/NexusGPU/tensor-fusion/discussions)
- Contact us via WeCom (Greater China region): [WeCom](https://work.weixin.qq.com/ca/cawcde42751d9f6a29)
- Email us: [support@tensor-fusion.com](mailto:support@tensor-fusion.com)
- Schedule [1:1 meeting with TensorFusion founders](https://tensor-fusion.ai/book-demo)
## Features & Roadmap
### Core GPU Virtualization Features
- [x] Fractional GPU and flexible oversubscription
- [x] Remote GPU sharing with state-of-the-art GPU-over-IP technology (less than 4% performance loss)
- [x] GPU VRAM expansion and hot/cold tiering
- [x] Non-NVIDIA GPU/NPU vendor support
### Pooling & Scheduling & Management
- [x] GPU/NPU pool management in Kubernetes
- [x] GPU-first scheduling and allocation at fine granularity: 1 TFLOPS of compute, 1% of a GPU, 1 MB of VRAM (see the illustrative sketch after this list)
- [x] GPU node auto provisioning/termination, Karpenter integration
- [x] GPU compaction/bin-packing
- [x] Take full control of GPU allocation with precision targeting by vendor, model, device index, and more
- [x] Seamless onboarding experience for PyTorch, TensorFlow, llama.cpp, vLLM, TensorRT, SGLang and all popular AI training/serving frameworks
- [x] Seamless migration from existing NVIDIA operator and device-plugin stack
- [x] Centralized Dashboard & Control Plane
- [x] GPU-first autoscaling policies that automatically set requests/limits/replicas
- [x] Request multiple vGPUs with group scheduling for large models
- [x] Support different QoS levels
- [x] Hardware-partitioned isolation modes, such as NVIDIA dynamic MIG
- [x] Support Kubernetes dynamic resource allocation (DRA) API
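To make the fine-grained allocation model above concrete, here is a purely illustrative sketch of how a workload might request a fractional vGPU. The annotation keys are hypothetical placeholders, not TensorFusion's actual API; consult the documentation for the real resource specification.

```bash
# Illustrative only: the annotation keys below are hypothetical placeholders
# (not TensorFusion's actual API). They show the allocation granularity the
# scheduler targets: TFLOPS of compute and MB of VRAM per workload.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: demo-inference
  annotations:
    example.tensor-fusion.ai/tflops-request: "10"    # hypothetical: ~10 TFLOPS of compute
    example.tensor-fusion.ai/vram-request: "4096Mi"  # hypothetical: 4 GiB of VRAM
spec:
  containers:
    - name: app
      image: pytorch/pytorch:latest
EOF
```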
### Enterprise Features
- [x] GPU live migration: snapshot and restore GPU context across clusters
- [ ] AI model registry and preloading: build your own private MaaS (Model-as-a-Service)
- [x] Advanced auto-scaling policies: scale to zero, rebalancing of hot GPUs
- [ ] Advanced observability features, detailed metrics & tracing/profiling of CUDA calls
- [x] Monetize your GPU cluster with multi-tenant usage measurement & billing reports
- [x] Enterprise-level high availability and resilience: topology-aware scheduling, GPU node auto-failover, etc.
- [x] Enterprise-level security, with complete on-premises deployment support
- [ ] Enterprise-level compliance: SSO/SAML support, advanced auditing, ReBAC control, SOC 2 and other compliance reports available
### Platform Support
- [x] Run on Linux Kubernetes clusters
- [x] Run on Linux VMs or Bare Metal (one-click onboarding to Edge K3S)
- [x] Run on Windows (not open source; contact us for support)
See the [open issues](https://github.com/NexusGPU/tensor-fusion/issues) for a full list of proposed features (and known issues).
## Contributing
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are **greatly appreciated**.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement".
Don't forget to give the project a star! Thanks again!
### Development Setup
This project patches Kubernetes scheduler components. Before building or running tests, you need to vendor dependencies and apply patches:
```bash
make vendor # Vendor dependencies and apply scheduler patches
```
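A typical local development loop then chains this with the lint and test targets referenced in the contribution steps below; only Makefile targets mentioned in this README are assumed here.

```bash
# Local development loop using only the Makefile targets referenced in this
# README (vendor, lint-fix, test); any other targets are not assumed.
make vendor     # vendor dependencies and apply scheduler patches (required first)
make lint-fix   # run the linter and auto-fix style issues
make test       # run the test suite
```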
### How to Contribute
1. Fork the Project
2. Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
3. Make your changes and run linter (`make lint-fix`)
4. Run tests to ensure everything works (`make test`)
5. Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
6. Push to the Branch (`git push origin feature/AmazingFeature`)
7. Open a Pull Request
### Top contributors
## ๐ท License
1. [TensorFusion main repo](https://github.com/NexusGPU/tensor-fusion) is open sourced under the [Apache 2.0 License](./LICENSE) and includes the **GPU pooling, scheduling, and management features**; you can use it for free and customize it as you want.
2. [vgpu.rs repo](https://github.com/NexusGPU/vgpu.rs) is open sourced under the [Apache 2.0 License](./LICENSE) and includes the **Fractional GPU** and **vGPU hypervisor features**; you can use it for free and customize it as you want.
3. **Advanced GPU virtualization and GPU-over-IP sharing features** are also free to use when your organization's total GPU count is less than 10, but the implementation is not fully open sourced; please [contact us](mailto:support@tensor-fusion.com) for more details.
4. Features mentioned in "**Enterprise Features**" above are paid, **licensed users can use these features in [TensorFusion Console](https://app.tensor-fusion.ai)**.
5. For large-scale deployments that involve the non-free features in #3 and #4, please [contact us](mailto:support@tensor-fusion.com); pricing details are available [here](https://tensor-fusion.ai/pricing).
[FOSSA license scan](https://app.fossa.com/projects/git%2Bgithub.com%2FNexusGPU%2Ftensor-fusion?ref=badge_large&issueType=license)
[contributors-shield]: https://img.shields.io/github/contributors/NexusGPU/tensor-fusion.svg?style=for-the-badge
[contributors-url]: https://github.com/NexusGPU/tensor-fusion/graphs/contributors
[forks-shield]: https://img.shields.io/github/forks/NexusGPU/tensor-fusion.svg?style=for-the-badge
[forks-url]: https://github.com/NexusGPU/tensor-fusion/network/members
[stars-shield]: https://img.shields.io/github/stars/NexusGPU/tensor-fusion.svg?style=for-the-badge
[stars-url]: https://github.com/NexusGPU/tensor-fusion/stargazers
[issues-shield]: https://img.shields.io/github/issues/NexusGPU/tensor-fusion.svg?style=for-the-badge
[issues-url]: https://github.com/NexusGPU/tensor-fusion/issues
[license-shield]: https://img.shields.io/github/license/NexusGPU/tensor-fusion.svg?style=for-the-badge
[license-url]: https://github.com/NexusGPU/tensor-fusion/blob/main/LICENSE
[linkedin-shield]: https://img.shields.io/badge/-LinkedIn-black.svg?style=for-the-badge&logo=linkedin&colorB=555
[linkedin-url]: https://www.linkedin.com/company/tensor-fusion/about
