Projects in Awesome Lists tagged with semantic-cache

https://github.com/codefuse-ai/modelcache

An LLM semantic caching system that aims to improve user experience by reducing response time through cached query-result pairs.

llm semantic-cache

Last synced: 15 May 2025

https://github.com/redis/redis-vl-python

Redis Vector Library (RedisVL) -- the AI-native Python client for Redis.

embedding large-language-models llm llmcache openai python redis retrieval-augmented-generation semantic-cache vector-database vector-search

Last synced: 14 May 2025

https://github.com/ferro-labs/ai-gateway

Unified AI gateway for 30+ LLMs (OpenAI, Anthropic, Bedrock, Azure, etc.) with caching, guardrails, A/B testing, and cost controls. A fast, scalable, Go-native alternative to LiteLLM and Kong AI Gateway.

ai-gateway ai-infrastructure gateway guardrails kong litellm llm llm-cost llm-proxy llm-strategy llmops mcp pii-detection prompt-management semantic-cache

Last synced: 24 May 2026

https://github.com/sensoris/semcache

Semantic caching layer for your LLM applications. Reuse responses and reduce token usage.

anthropic gemini genai llm openai semantic-cache

Last synced: 29 Aug 2025

https://github.com/redis/redis-vl-java

Redis Vector Library (RedisVL) -- the AI-native Java client for Redis.

agentic-ai ai embeddings generative-ai java llm llm-cache rag rag-chatbot redis semantic-cache semantic-routing vector-database vector-search vectors

Last synced: 18 Feb 2026

https://github.com/peva3/smarterrouter

SmarterRouter: An intelligent LLM gateway and VRAM-aware router for Ollama, llama.cpp, and OpenAI. Features semantic caching, model profiling, and automatic failover for local AI labs.

ai-cache ai-gateway docker fastapi gpu-monitoring llm llm-proxy llm-router local-llm model-serving ollama ollama-api openai-proxy self-hosted self-hosted-ai semantic-cache

Last synced: 27 Feb 2026

https://github.com/vcal-project/ai-firewall

OpenAI-compatible LLM gateway that reduces API costs using Redis exact cache and Qdrant semantic cache.

ai-cost-optimization ai-gateway ai-infrastructure llm openai qdrant redis rust semantic-cache vector-search

Last synced: 13 Jun 2026

https://github.com/jonathanscholtes/llm-performance-with-azure-cosmos-db-semantic-cache

Enhance LLM retrieval performance with Azure Cosmos DB Semantic Cache. Learn how to integrate and optimize caching strategies in real-world web applications.

azurecosmosdb semantic-cache vector-search

Last synced: 23 Jun 2025

https://github.com/reallyartificial/freeport

Open-source LLM Gateway. Multi-provider routing, fallback, semantic caching, cost tracking, guardrails. Drop-in OpenAI API replacement. Self-hosted.

ai ai-agents ai-infrastructure anthropic developer-tools gateway llm llm-gateway open-source openai self-hosted semantic-cache typescript

Last synced: 14 Jun 2026

https://github.com/nico-iaco/nexabudget-be

Backend for NexaBudget, a personal finance management app. This Spring Boot application provides a RESTful API for managing finances, including accounts, transactions, and budgets.

gemini-ai graalvm-native-image java-25 mongodb-atlas personal-finance redis-cache semantic-cache spring-ai spring-security springboot

Last synced: 16 May 2026

https://github.com/mrmushfiq/llm0-gateway

Self-hosted, OpenAI-compatible LLM gateway in a single Go binary. Routes to OpenAI, Anthropic, Gemini, and local Ollama with failover, two-tier caching (exact + semantic), per-key rate limits, and per-customer spend caps.

ai-gate ai-infrastructure anthropic chatgpt claude gemini golang gpt llm llm-gateway openai openai-compatible pgvector postgres rate-limiting redis self-hosted semantic-cache