# OCI AI Blueprints
**Deploy, scale, and monitor AI workloads with the OCI AI Blueprints platform, and reduce your GPU onboarding time from weeks to minutes.**

OCI AI Blueprints is a streamlined, no-code solution for deploying and managing Generative AI workloads on OCI Kubernetes Engine (OKE). By providing opinionated hardware recommendations, pre-packaged software stacks, and out-of-the-box observability tooling, OCI AI Blueprints helps you get your AI applications running quickly and efficiently, without wrestling with the complexities of infrastructure decisions, software compatibility, and MLOps best practices.

[![Install OCI AI Blueprints](https://raw.githubusercontent.com/oracle-quickstart/oci-ai-blueprints/9d1d61b3b79e61dabe19d1672c3e54704b294a93/docs/install.svg)](./GETTING_STARTED_README.md)

## Table of Contents
**Getting Started**
- [Install AI Blueprints](./GETTING_STARTED_README.md)
- [Access AI Blueprints Portal and API](./docs/api_documentation/accessing_oci_ai_blueprints/README.md)

**About OCI AI Blueprints**
- [What is OCI AI Blueprints?](./docs/about/README.md#what-is-oci-ai-blueprints)
- [Why use OCI AI Blueprints?](./docs/about/README.md#why-use-oci-ai-blueprints)
- [Features](./docs/about/README.md#features)
- [List of Blueprints](#blueprints)
- [FAQ](./docs/about/README.md#frequently-asked-questions-faq)
- [Support & Contact](#support--contact)

**API Reference**
- [API Reference Documentation](docs/api_documentation/README.md)

**Additional Resources**
- [Publish Custom Blueprints](./docs/custom_blueprints)
- [Installing Updates](./docs/installing_new_updates)
- [IAM Policies](./docs/iam_policies/README.md)
- [Repository Contents](./docs/about/README.md#repository-contents)
- [Known Issues](docs/known_issues/README.md)

## Getting Started
Install OCI AI Blueprints by clicking on the button below:

[![Install OCI AI Blueprints](https://raw.githubusercontent.com/oracle-quickstart/oci-ai-blueprints/9d1d61b3b79e61dabe19d1672c3e54704b294a93/docs/install.svg)](./GETTING_STARTED_README.md)

## Blueprints

Blueprints go beyond basic Terraform templates (an illustrative recipe sketch follows this list). Each blueprint:
- Offers validated hardware suggestions (e.g., optimal shapes, CPU/GPU configurations),
- Includes end-to-end application stacks customized for different GenAI use cases, and
- Comes with monitoring, logging, and auto-scaling configured out of the box.
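
To make that bundle concrete, the sketch below shows the rough shape of a blueprint recipe: a hardware shape, a packaged serving stack, and auto-scaling bounds expressed as one declarative document. The field names and values are illustrative only, not the exact OCI AI Blueprints recipe schema; the sample blueprint directories linked in the table below contain real recipe JSON.

```python
# Illustrative sketch only: these field names are hypothetical and do NOT
# reflect the exact OCI AI Blueprints recipe schema. Real recipes live in the
# sample blueprint directories linked in the table below.
vllm_inference_recipe = {
    "deployment_name": "llama3-vllm-demo",         # name shown for the deployment
    "recipe_mode": "service",                      # long-running inference service
    "node_shape": "VM.GPU.A10.2",                  # validated hardware suggestion
    "container_image": "vllm/vllm-openai:latest",  # pre-packaged serving stack
    "container_port": 8000,
    "model_id": "meta-llama/Meta-Llama-3-8B-Instruct",
    "autoscaling": {"min_replicas": 1, "max_replicas": 4},  # scaling configured out of the box
}
```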

After you install OCI AI Blueprints to an OKE cluster in your tenancy, you can deploy these pre-built blueprints (a minimal API deployment sketch follows the table):

| Blueprint | Description |
| ---------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------|
| [**LLM & VLM Inference with vLLM**](./docs/sample_blueprints/vllm-inference) | Deploy Llama 2/3/3.1 7B/8B models using NVIDIA GPU shapes and the vLLM inference engine with auto-scaling. |
| [**Fine-Tuning Benchmarking**](./docs/sample_blueprints/lora-benchmarking) | Run MLCommons quantized Llama-2 70B LoRA fine-tuning on A100 GPUs for performance benchmarking. |
| [**LoRA Fine-Tuning**](./docs/sample_blueprints/lora-fine-tuning) | LoRA fine-tuning of custom or HuggingFace models using any dataset. Includes flexible hyperparameter tuning. |
| [**Health Check**](./docs/sample_blueprints/gpu-health-check) | Comprehensive evaluation of GPU performance to ensure optimal hardware readiness before initiating any intensive computational workload.|
| [**CPU Inference**](./docs/sample_blueprints/cpu-inference) | Leverage Ollama to test CPU-based inference with models like Mistral, Gemma, and more. |
| [**Multi-node Inference with RDMA and vLLM**](./docs/multi_node_inference) | Deploy Llama-405B sized LLMs across multiple nodes with RDMA using H100 nodes with vLLM and LeaderWorkerSet. |
| [**Scaled Inference with vLLM**](./docs/auto_scaling) | Serve LLMs with auto-scaling using KEDA, which scales to multiple GPUs and nodes using application metrics such as inference latency. |
| [**LLM Inference with MIG**](./docs/mig_multi_instance_gpu) | Deploy LLMs to a fraction of a GPU with NVIDIA's multi-instance GPUs and serve them with vLLM. |
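
Deployments can be created from the AI Blueprints portal or by sending a recipe to the Blueprints API running in your cluster. The sketch below is a rough illustration of the API path; the endpoint URL, the `/deployment` route, the bearer-token auth header, and the recipe file name are assumptions for illustration, so consult the [API Reference Documentation](docs/api_documentation/README.md) for the actual routes and authentication flow.

```python
# Minimal sketch of submitting a blueprint recipe to the Blueprints REST API.
# ASSUMPTIONS: the base URL, the /deployment route, bearer-token auth, and the
# recipe file name are illustrative; see docs/api_documentation for the real
# endpoints and authentication flow.
import json
import requests

API_URL = "https://<your-blueprints-api-endpoint>"  # endpoint exposed by your installation
TOKEN = "<api-token>"                                # credential for your installation

# Load a recipe JSON, e.g. one adapted from a sample blueprint directory.
with open("vllm_inference_recipe.json") as f:
    recipe = json.load(f)

resp = requests.post(
    f"{API_URL}/deployment",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=recipe,
    timeout=30,
)
resp.raise_for_status()
print("Deployment submitted:", resp.json())
```

Once the deployment is running, it can be observed through the monitoring and logging that blueprints configure out of the box.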

## Support & Contact

If you have any questions, issues, or feedback, contact [[email protected]](mailto:[email protected]) or [[email protected]](mailto:[email protected]).