Projects in Awesome Lists tagged with semantic-cache
A curated list of projects in awesome lists tagged with semantic-cache.
https://github.com/codefuse-ai/modelcache
An LLM semantic caching system that aims to improve user experience by reducing response time through cached query-result pairs.
Last synced: 15 May 2025
https://github.com/redis/redis-vl-python
Redis Vector Library (RedisVL) -- the AI-native Python client for Redis.
embedding large-language-models llm llmcache openai python redis retrieval-augmented-generation semantic-cache vector-database vector-search
Last synced: 14 May 2025
https://github.com/ferro-labs/ai-gateway
Unified AI gateway for 30+ LLMs (OpenAI, Anthropic, Bedrock, Azure, etc.) with caching, guardrails, A/B testing, and cost controls. A Go-native, fast, and scalable alternative to LiteLLM and Kong AI Gateway.
ai-gateway ai-infrastructure gateway guardrails kong litellm llm llm-cost llm-proxy llm-strategy llmops mcp pii-detection prompt-management semantic-cache
Last synced: 24 May 2026
https://github.com/sensoris/semcache
Semantic caching layer for your LLM applications. Reuse responses and reduce token usage.
anthropic gemini genai llm openai semantic-cache
Last synced: 29 Aug 2025
https://github.com/redis/redis-vl-java
Redis Vector Library (RedisVL) -- the AI-native Java client for Redis.
agentic-ai ai embeddings generative-ai java llm llm-cache rag rag-chatbot redis semantic-cache semantic-routing vector-database vector-search vectors
Last synced: 18 Feb 2026
https://github.com/peva3/smarterrouter
SmarterRouter: An intelligent LLM gateway and VRAM-aware router for Ollama, llama.cpp, and OpenAI. Features semantic caching, model profiling, and automatic failover for local AI labs.
ai-cache ai-gateway docker fastapi gpu-monitoring llm llm-proxy llm-router local-llm model-serving ollama ollama-api openai-proxy self-hosted self-hosted-ai semantic-cache
Last synced: 27 Feb 2026
https://github.com/vcal-project/ai-firewall
OpenAI-compatible LLM gateway that reduces API costs using Redis exact cache and Qdrant semantic cache.
ai-cost-optimization ai-gateway ai-infrastructure llm openai qdrant redis rust semantic-cache vector-search
Last synced: 13 Jun 2026
https://github.com/jonathanscholtes/llm-performance-with-azure-cosmos-db-semantic-cache
Enhance LLM retrieval performance with Azure Cosmos DB Semantic Cache. Learn how to integrate and optimize caching strategies in real-world web applications.
azurecosmosdb semantic-cache vector-search
Last synced: 23 Jun 2025
https://github.com/reallyartificial/freeport
Open-source LLM Gateway. Multi-provider routing, fallback, semantic caching, cost tracking, guardrails. Drop-in OpenAI API replacement. Self-hosted.
ai ai-agents ai-infrastructure anthropic developer-tools gateway llm llm-gateway open-source openai self-hosted semantic-cache typescript
Last synced: 14 Jun 2026
https://github.com/nico-iaco/nexabudget-be
Backend for NexaBudget, a personal finance management app. This Spring Boot application provides a RESTful API for managing finances, including accounts, transactions, and budgets.
gemini-ai graalvm-native-image java-25 mongodb-atlas personal-finance redis-cache semantic-cache spring-ai spring-security springboot
Last synced: 16 May 2026
https://github.com/mrmushfiq/llm0-gateway
Self-hosted, OpenAI-compatible LLM gateway in a single Go binary. Routes to OpenAI, Anthropic, Gemini, and local Ollama with failover, two-tier caching (exact + semantic), per-key rate limits, and per-customer spend caps.
ai-gate ai-infrastructure anthropic chatgpt claude gemini golang gpt llm llm-gateway openai openai-compatible pgvector postgres rate-limiting redis self-hosted semantic-cache
Last synced: 20 Apr 2026
https://github.com/rawcontext/reflex
Episodic memory and semantic cache proxy for LLM APIs, with a reported ~40% token savings.
agent-orchestration ai-agents context-graph developer-tools knowledge-graph llm-proxy semantic-cache token-optimization
Last synced: 11 Jan 2026
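Several of the gateways listed above describe two-tier caching (an exact-match cache in front of a semantic cache). The sketch below illustrates that general idea only; it is not taken from any of the listed projects. The names `TwoTierCache`, `embed`, and `cosine` are hypothetical, the bag-of-words `embed` function is a toy stand-in for a real embedding model, and the in-memory list stands in for a vector store, so the similarity threshold of 0.8 is an arbitrary assumption.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class TwoTierCache:
    """Hypothetical cache: exact-match lookup first, then nearest-neighbour lookup."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # assumed minimum similarity for a semantic hit
        self.exact = {}             # tier 1: prompt string -> response
        self.entries = []           # tier 2: list of (vector, response)

    def put(self, prompt, response):
        self.exact[prompt] = response
        self.entries.append((embed(prompt), response))

    def get(self, prompt):
        # Tier 1: identical prompt string.
        if prompt in self.exact:
            return self.exact[prompt], "exact"
        # Tier 2: linear scan for the most similar stored prompt.
        query = embed(prompt)
        best_score, best_response = 0.0, None
        for vector, response in self.entries:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_response = score, response
        if best_response is not None and best_score >= self.threshold:
            return best_response, "semantic"
        return None, "miss"
```

In this sketch, a lookup returns the cached response together with the tier that served it, so a caller can fall through to the LLM only on a `"miss"`. A real deployment would replace the linear scan with a vector index and tune the threshold against its own traffic.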