https://github.com/interactivetech/pdk-llm-rag-demo
Proof of Concept showcases how to continuously finetune and deploy of RAG systems using Pachyderm and Determined
https://github.com/interactivetech/pdk-llm-rag-demo
Last synced: about 1 year ago
JSON representation
Proof of Concept showcases how to continuously finetune and deploy of RAG systems using Pachyderm and Determined
- Host: GitHub
- URL: https://github.com/interactivetech/pdk-llm-rag-demo
- Owner: interactivetech
- Created: 2023-12-08T18:47:08.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-12-09T02:43:17.000Z (over 2 years ago)
- Last Synced: 2025-02-10T05:25:03.283Z (over 1 year ago)
- Language: Python
- Size: 2.63 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Continuous Retrieval Augmentation Generation (RAG) with the HPE MLOPs Platform
Author: andrew.mendez@hpe.com
This is a proof of concept showing how developers can create a Retrieval Augmentation Generation (RAG) system using Pachyderm and Determined AI.
This is a unique RAG system sitting on top of an MLOPs platform, allowing developers to continuously update and deploy a RAG application as more data is ingested.
We also provide an example of how developers can automatically trigger finetuning an LLM on a instruction tuning dataset.
We use the following stack:
* ChromaDB for the vector database
* Chainlit for the User Interface
* Mistral 7B Instruct for the large language model
* Determined for finetuning the Mistral Model
* Pachyderm to manage dataset versioning and pipeline orchestration.
# Pre-requisite
* This Demo requires running with an A100 80GB GPU.
* This Demo assumes you have pachyderm and determined installed on top of kubernetes. A guide will be provided soon to show how to install pachyderm and kubernetes.
# How to Run
* Run `Deploy RAG with PDK.pynb` to deploy a RAG system using a pretrained LLM
* Run `Finetune and Deploy RAG with PDK.ipynb` to both finetune an LLM and deploy a finetuned model.