https://github.com/redis-developer/gcp-redis-llm-stack
Reference architecture for LLM-based applications on Google Cloud Platform with Redis Enterprise as a high-performance data layer.
- Host: GitHub
- URL: https://github.com/redis-developer/gcp-redis-llm-stack
- Owner: redis-developer
- License: mit
- Created: 2023-06-29T18:37:32.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-12-11T16:39:43.000Z (5 months ago)
- Last Synced: 2025-04-12T03:53:04.997Z (about 1 month ago)
- Topics: caching, chatbot, gcp, google-cloud, llms, memory, palm-api, redis, redis-enterprise, vector-database, vertex-ai
- Language: Jupyter Notebook
- Homepage:
- Size: 3.25 MB
- Stars: 32
- Watchers: 6
- Forks: 12
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# Scalable LLM Architectures with Redis & GCP Vertex AI
☁️ [Generative AI](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/overview) with Google Vertex AI comes with a specialized [in-console studio experience](https://cloud.google.com/vertex-ai/docs/generative-ai/start/quickstarts/quickstart), a [dedicated API for Gemini](https://cloud.google.com/vertex-ai/docs/generative-ai/start/quickstarts/api-quickstart), and an easy-to-use [Python SDK](https://cloud.google.com/vertex-ai/docs/python-sdk/use-vertex-ai-python-sdk) designed for deploying and managing instances of Google's powerful language models.
⚡ Redis Enterprise offers fast and scalable [vector search](https://redis.io/solutions/vector-search/), with an API for index creation, index management, low-latency search, and hybrid filtering. Combined with its [versatile data structures](https://redis.io/docs/latest/develop/data-types/), Redis Enterprise is a strong foundation for building high-quality Large Language Model (LLM) apps.
>This repo serves as a foundational architecture for building LLM applications with Redis and GCP services.
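
To make "hybrid filtering" concrete, here is a minimal in-memory sketch of the idea: filter candidates on metadata first, then rank the survivors by vector similarity. It is only an illustration and does not use Redis. The `hybrid_search` helper, the `topic` field, and the toy 3-dimensional vectors are all assumptions made for this example; in the actual stack, Redis performs the filtering and ranking server-side over real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_search(docs, query_vector, k=2, **filters):
    """Keep docs whose metadata matches every filter, then return the top-k by similarity."""
    candidates = [
        d for d in docs
        if all(d.get(field) == value for field, value in filters.items())
    ]
    ranked = sorted(
        candidates,
        key=lambda d: cosine_similarity(d["vector"], query_vector),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy documents; real embeddings have hundreds of dimensions.
DOCS = [
    {"id": "doc:1", "topic": "billing", "vector": [1.0, 0.0, 0.0]},
    {"id": "doc:2", "topic": "billing", "vector": [0.0, 1.0, 0.0]},
    {"id": "doc:3", "topic": "support", "vector": [0.9, 0.1, 0.0]},
]

results = hybrid_search(DOCS, [1.0, 0.1, 0.0], k=2, topic="billing")
print([d["id"] for d in results])
```

Here `doc:3` is excluded by the metadata filter even though it is close to the query vector, which is the point of combining the two criteria.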
## Reference architecture

1. Primary Data Sources
2. Data Extraction and Loading
3. Large Language Models
- `text-embedding-gecko@003` for embeddings
- `gemini-1.5-flash-001` for LLM generation and chat
4. High-Performance Data Layer (Redis)
- Semantic caching to improve LLM response times and reduce associated costs
- Vector search for context retrieval from a knowledge base

**Open the code tutorial using the Colab notebook to get your hands dirty with Redis and Vertex AI on GCP.** It's a step-by-step walkthrough of setting up the required data, generating embeddings, and building RAG from scratch to create fast LLM apps, highlighting Redis vector search and semantic caching.
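
The semantic caching idea from the data layer can be sketched in a few lines of plain Python. This is an in-memory stand-in, not the notebook's implementation: `SemanticCache`, `toy_embed`, `fake_llm`, and the `0.95` threshold are hypothetical names and values chosen for illustration. In the real stack the embeddings would come from a Vertex AI embedding model and the cache entries would live in Redis.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: counts of the letters a-z."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


class SemanticCache:
    """Return a stored response when a new prompt is similar enough to an old one."""

    def __init__(self, embed, threshold=0.9):
        self.embed = embed
        self.threshold = threshold
        self.entries = []  # list of (prompt_vector, response)

    def check(self, prompt):
        vector = self.embed(prompt)
        scored = [(cosine_similarity(v, vector), resp) for v, resp in self.entries]
        if not scored:
            return None
        score, response = max(scored, key=lambda pair: pair[0])
        return response if score >= self.threshold else None

    def store(self, prompt, response):
        self.entries.append((self.embed(prompt), response))


def answer(prompt, cache, llm):
    """Serve from the cache when possible; otherwise call the LLM and cache the result."""
    cached = cache.check(prompt)
    if cached is not None:
        return cached, True
    response = llm(prompt)
    cache.store(prompt, response)
    return response, False


calls = []


def fake_llm(prompt):
    """Hypothetical LLM stub that records each call."""
    calls.append(prompt)
    return f"answer #{len(calls)}"


cache = SemanticCache(toy_embed, threshold=0.95)
first = answer("What is Redis?", cache, fake_llm)            # cache miss, LLM called
second = answer("what is redis", cache, fake_llm)            # near-duplicate, served from cache
third = answer("How do I deploy on GCP?", cache, fake_llm)   # unrelated, LLM called again
print(first, second, third, len(calls))
```

The threshold controls the trade-off: a higher value avoids returning a cached answer for a prompt that only looks similar, while a lower value saves more LLM calls.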
## Additional resources
- [Streamlit PDF chatbot example app](examples/chat-your-pdf/)
- [Redis vector search documentation](https://redis.io/docs/latest/develop/interact/search-and-query/query/vector-search/)
- [Get started with RedisVL](https://redis.io/blog/introducing-the-redis-vector-library-for-enhancing-genai-development/)
- [Google Vertex AI resources](https://cloud.google.com/vertex-ai)
- [More Redis AI resources](https://github.com/redis-developer)