https://github.com/cachly-dev/sdk-python
Official Cachly sdk-python SDK
- Host: GitHub
- URL: https://github.com/cachly-dev/sdk-python
- Owner: cachly-dev
- Created: 2026-04-16T21:45:44.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2026-04-16T21:46:29.000Z (2 months ago)
- Last Synced: 2026-04-16T23:38:39.022Z (2 months ago)
- Language: Python
- Size: 98.6 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
README
# cachly Python SDK
> Official Python SDK for [cachly.dev](https://cachly.dev) —
> Managed Valkey/Redis cache built for AI apps. **GDPR-compliant · German servers · Live in 30 seconds.**
---
## Installation
```bash
pip install cachly
# or
uv add cachly
```
> Requires Python 3.10+. Uses `redis-py` and `numpy` (for semantic cache).
---
## Quick Start
```python
import os
from cachly import CachlyClient
cache = CachlyClient(url=os.environ["CACHLY_URL"])
# Set / Get
cache.set("user:42", {"name": "Alice"}, ttl=300)
user = cache.get("user:42") # returns dict or None
# Get-or-Set pattern
report = cache.get_or_set("report:monthly", lambda: db.run_expensive_report(), ttl=3600)
# Atomic counter
views = cache.incr("page:views")
cache.close()
```
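To make the get-or-set pattern concrete, here is a minimal in-memory sketch with JSON serialisation and TTL expiry. `InMemoryCache` is a hypothetical stand-in written for illustration only; it is not part of the SDK and makes no claim about how cachly implements these methods.

```python
import json
import time
from typing import Any, Callable, Optional


class InMemoryCache:
    """Hypothetical stand-in for a cache client; not part of the cachly SDK."""

    def __init__(self) -> None:
        # Maps key -> (JSON string, absolute expiry time or None).
        self._store = {}

    def set(self, key: str, value: Any, ttl: Optional[float] = None) -> None:
        expires_at = time.monotonic() + ttl if ttl is not None else None
        self._store[key] = (json.dumps(value), expires_at)

    def get(self, key: str) -> Any:
        entry = self._store.get(key)
        if entry is None:
            return None
        payload, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._store[key]  # lazily evict the expired entry
            return None
        return json.loads(payload)

    def get_or_set(self, key: str, fn: Callable[[], Any], ttl: Optional[float] = None) -> Any:
        cached = self.get(key)
        if cached is not None:
            return cached
        value = fn()  # the expensive call only runs on a miss
        self.set(key, value, ttl=ttl)
        return value


calls = []


def expensive_report() -> dict:
    calls.append(1)  # count how often the expensive path runs
    return {"total": 42}


demo = InMemoryCache()
first = demo.get_or_set("report:monthly", expensive_report, ttl=3600)
second = demo.get_or_set("report:monthly", expensive_report, ttl=3600)
```

In this sketch a stored `None` is indistinguishable from a miss, so a function that returns `None` is re-run on every call.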
Create your free instance at **[cachly.dev](https://cachly.dev)** — no credit card required.
---
## Async Usage
```python
import asyncio
import os

from cachly.asyncio import AsyncCachlyClient


async def main():
    cache = AsyncCachlyClient(url=os.environ["CACHLY_URL"])
    session_data = {"user_id": 42}  # placeholder payload
    await cache.set("session:abc", session_data, ttl=1800)
    data = await cache.get("session:abc")
    await cache.close()


asyncio.run(main())
```
---
## Semantic AI Cache
Cache LLM responses **by meaning**, not exact text. A prompt that is phrased differently but is semantically close enough (see `similarity_threshold` in the example) returns the cached answer, cutting OpenAI costs by up to 60 %.
```python
from cachly import SemanticOptions

# Assumes `cache` (a CachlyClient), `openai_client`, and `user_question` already exist.
result = cache.semantic.get_or_set(
    prompt=user_question,
    fn=lambda: openai_client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": user_question}],
    ),
    embed_fn=lambda text: openai_client.embeddings.create(
        model="text-embedding-3-small", input=text
    ).data[0].embedding,
    options=SemanticOptions(similarity_threshold=0.92, ttl_seconds=3600),
)
print("hit" if result.hit else "miss", result.value)
```
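The idea behind a similarity threshold can be sketched with plain cosine similarity over embedding vectors. `TinySemanticCache` and its methods are hypothetical names invented for this illustration; the SDK's `SemanticCache` may store and search embeddings quite differently.

```python
import math
from typing import Any, List, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class TinySemanticCache:
    """Hypothetical illustration only; not the SDK's SemanticCache."""

    def __init__(self, similarity_threshold: float = 0.92) -> None:
        self.similarity_threshold = similarity_threshold
        self._entries: List[tuple] = []  # (embedding, value) pairs

    def store(self, embedding: Sequence[float], value: Any) -> None:
        self._entries.append((list(embedding), value))

    def lookup(self, embedding: Sequence[float]) -> Optional[Any]:
        # Linear scan for the most similar stored embedding.
        best_value, best_score = None, -1.0
        for stored, value in self._entries:
            score = cosine_similarity(embedding, stored)
            if score > best_score:
                best_value, best_score = value, score
        # Only count it as a hit when the best match clears the threshold.
        if best_score >= self.similarity_threshold:
            return best_value
        return None


semantic = TinySemanticCache(similarity_threshold=0.92)
semantic.store([1.0, 0.0, 0.0], "cached answer")

near = semantic.lookup([0.98, 0.05, 0.0])  # nearly the same direction: hit
far = semantic.lookup([0.0, 1.0, 0.0])     # orthogonal vector: miss
```

Raising the threshold trades hit rate for precision: fewer paraphrases match, but the risk of returning an answer to a different question drops.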
---
## Batch API — Multiple Ops in One Round-Trip
Bundle GET/SET/DEL/EXISTS/TTL operations into **one** HTTP request (or Redis pipeline).
```python
import os
import time

from cachly import CachlyClient, BatchOp

cache = CachlyClient(
    url=os.environ["CACHLY_URL"],
    batch_url=os.environ.get("CACHLY_BATCH_URL"),  # optional
)

# `await` needs an async context, e.g. inside `async def main()`.
result = await cache.batch([
    BatchOp("get", "user:1"),
    BatchOp("get", "config:app"),
    BatchOp("set", "visits", str(time.time()), ttl=86400),
    BatchOp("exists", "session:xyz"),
    BatchOp("ttl", "token:abc"),
])

# Results come back in the same order as the operations.
user = result[0]     # str | None
config = result[1]   # str | None
ok = result[2]       # bool
present = result[3]  # bool
secs = result[4]     # int (-1 = no TTL, -2 = key missing)
```
Without `batch_url` the method falls back automatically to a **Redis pipeline** (one TCP round-trip).
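Because batch results are positional and the TTL result uses sentinel values, a small helper can make them easier to read. The sketch below is hypothetical: `label_results` is not an SDK function, and it uses plain `(op, key)` tuples and a hand-written result list instead of `BatchOp` and a real `BatchResult`.

```python
from typing import Any, Dict, List, Tuple

NO_TTL = -1       # key exists but has no expiry
KEY_MISSING = -2  # key does not exist


def label_results(ops: List[Tuple[str, str]], results: List[Any]) -> Dict[str, Any]:
    """Pair each operation with its positional result and decode TTL sentinels."""
    if len(ops) != len(results):
        raise ValueError("expected exactly one result per operation")
    labelled: Dict[str, Any] = {}
    for (op, key), value in zip(ops, results):
        if op == "ttl":
            if value == KEY_MISSING:
                value = "missing"
            elif value == NO_TTL:
                value = "no expiry"
        labelled[f"{op}:{key}"] = value
    return labelled


ops = [("get", "user:1"), ("exists", "session:xyz"), ("ttl", "token:abc")]
sample = label_results(ops, ['{"name": "Alice"}', True, -2])
```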
---
## Django / FastAPI Integration
```python
# FastAPI
import os

from fastapi import FastAPI
from cachly import CachlyClient

app = FastAPI()
cache = CachlyClient(url=os.environ["CACHLY_URL"])


@app.on_event("shutdown")
async def shutdown():
    cache.close()


# Plain `def` lets FastAPI run the blocking cache call in its threadpool.
@app.get("/data/{key}")
def get_data(key: str):
    # fetch_from_db is a placeholder for your own data-access function.
    return cache.get_or_set(key, lambda: fetch_from_db(key), ttl=60)
```
---
## AI Dev Brain — Persistent Memory for Your Coding Assistant
cachly ships a **30-tool MCP server** that gives Claude Code, Cursor, GitHub Copilot, and Windsurf a persistent memory across sessions — so they never forget your architecture, lessons learned, or last session context.
```bash
# One-time setup
npx @cachly-dev/init
```
Or configure manually in your editor (`~/.vscode/mcp.json` / `.cursor/mcp.json`):
```json
{
  "servers": {
    "cachly": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@cachly-dev/mcp-server"],
      "env": { "CACHLY_JWT": "your-jwt-token" }
    }
  }
}
```
Add to your AI assistant instructions (e.g. `.github/copilot-instructions.md`):
```markdown
## cachly AI Brain
At the START of every session:
session_start(instance_id = "your-instance-id", focus = "what you're working on today")
At the END of every session:
session_end(instance_id = "your-instance-id", summary = "...", files_changed = [...])
After any bug fix or deploy:
learn_from_attempts(instance_id = "your-instance-id", topic = "category:keyword",
outcome = "success", what_worked = "...", what_failed = "...", severity = "major")
```
`session_start` returns a full briefing in **one call**: last session summary, relevant lessons, open failures, brain health. 60 % fewer file reads, instant context, zero re-discovery.
→ Full docs: [cachly.dev/docs/ai-memory](https://cachly.dev/docs/ai-memory)
---
## LLM Response Caching Proxy
Use cachly as a **drop-in caching proxy** for OpenAI or Anthropic — no SDK changes needed:
```bash
# Instead of https://api.openai.com — use your cachly proxy URL:
OPENAI_BASE_URL=https://api.cachly.dev/v1/llm-proxy/YOUR_TOKEN/openai
# Anthropic:
ANTHROPIC_BASE_URL=https://api.cachly.dev/v1/llm-proxy/YOUR_TOKEN/anthropic
```
Identical requests are served from cache with `X-Cachly-Cache: HIT`. Check savings via `GET /v1/llm-proxy/YOUR_TOKEN/stats`.
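A small sketch of how a client might build these URLs and detect cache hits. The URL layout and the `X-Cachly-Cache` header come from the text above; the helper names (`proxy_base_url`, `stats_url`, `is_cache_hit`) are hypothetical and not part of any cachly package.

```python
from typing import Mapping

PROXY_ROOT = "https://api.cachly.dev/v1/llm-proxy"
PROVIDERS = ("openai", "anthropic")


def proxy_base_url(token: str, provider: str) -> str:
    """Build the base URL to use in place of the provider's own API host."""
    if provider not in PROVIDERS:
        raise ValueError(f"unsupported provider: {provider!r}")
    return f"{PROXY_ROOT}/{token}/{provider}"


def stats_url(token: str) -> str:
    """URL of the savings/statistics endpoint for a proxy token."""
    return f"{PROXY_ROOT}/{token}/stats"


def is_cache_hit(headers: Mapping[str, str]) -> bool:
    """Return True if the response headers mark the reply as served from cache."""
    # HTTP header names are case-insensitive, so compare in lower case.
    for name, value in headers.items():
        if name.lower() == "x-cachly-cache":
            return value.strip().upper() == "HIT"
    return False


openai_url = proxy_base_url("YOUR_TOKEN", "openai")
hit = is_cache_hit({"x-cachly-cache": "HIT"})
miss = is_cache_hit({"Content-Type": "application/json"})
```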
---
## Agent Workflow Persistence
Checkpoint agent workflow state so that an agent can resume from the last completed step after a crash:
```python
import json

import httpx

token = "YOUR_TOKEN"  # placeholder: your cachly API token
base = f"https://api.cachly.dev/v1/workflow/{token}"

# Save a checkpoint after each workflow step
httpx.post(f"{base}/checkpoints", json={
    "run_id": "my-run-123",
    "step_index": 0,
    "step_name": "research",
    "agent_name": "researcher",
    "status": "completed",
    "state": json.dumps({"topic": "AI caching", "results": []}),
})

# Resume: get the latest checkpoint for a run
checkpoint = httpx.get(f"{base}/runs/my-run-123/latest").json()
# → {"step_index": 2, "step_name": "write", "state": "...", "status": "completed"}
```
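Once the latest checkpoint is loaded, the agent still has to decide where to continue. The sketch below shows one plausible resume rule. It assumes that a `"completed"` checkpoint means the next step should run, that any other status means the same step should be retried, and that an empty response means a fresh run; none of these rules are documented here, and both helper names are hypothetical.

```python
import json
from typing import Any, Dict, Optional


def next_step_index(checkpoint: Optional[Dict[str, Any]]) -> int:
    """Decide which step to run next, given the latest checkpoint (or None)."""
    if not checkpoint:
        return 0  # no checkpoint yet: start from the first step
    if checkpoint.get("status") == "completed":
        return checkpoint["step_index"] + 1
    return checkpoint["step_index"]  # assumed: retry an unfinished step


def restore_state(checkpoint: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Decode the JSON-encoded `state` field back into a dict."""
    if not checkpoint or not checkpoint.get("state"):
        return {}
    return json.loads(checkpoint["state"])


latest = {
    "step_index": 2,
    "step_name": "write",
    "status": "completed",
    "state": json.dumps({"topic": "AI caching", "results": ["draft"]}),
}
resume_at = next_step_index(latest)
state = restore_state(latest)
```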
---
## Connection Pooling & Keep-Alive
```python
import os

from cachly import CachlyClient, CachlyConfig, PoolConfig

cache = CachlyClient(config=CachlyConfig(
    url=os.environ["CACHLY_URL"],
    pool=PoolConfig(
        keep_alive_s=30,         # PING every 30s (prevents firewall idle-disconnect)
        max_retries=10,          # reconnect retries with exponential backoff
        base_retry_delay_s=0.1,  # first retry delay
        max_retry_delay_s=10,    # retry delay cap
        idle_timeout_s=300,      # auto-disconnect after 5 min idle (0 = disabled)
        on_error=lambda e: print(f"cachly error: {e}"),
        on_reconnect=lambda: print("cachly reconnected"),
    ),
))
```
---
## Retry with Exponential Backoff
Every command is automatically retried on transient errors (`ConnectionError`, `TimeoutError`, `BusyLoadingError`, …) using AWS-style full-jitter backoff:
```python
import os

from cachly import CachlyClient, RetryConfig

cache = CachlyClient(
    url=os.environ["CACHLY_URL"],
    retry=RetryConfig(
        max_retries=3,      # retry up to 3× (default)
        base_delay_s=0.05,  # first retry after ~50ms
        max_delay_s=2.0,    # cap at 2s
    ),
)
```
Disable retries with `RetryConfig(max_retries=0)`.
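For reference, full-jitter backoff draws each delay uniformly between zero and an exponentially growing, capped ceiling. The sketch below illustrates that formula with the default values shown above; the function names are hypothetical, and whether the SDK numbers attempts from 0 is an assumption.

```python
import random
from typing import List


def backoff_ceiling(attempt: int, base_delay_s: float = 0.05, max_delay_s: float = 2.0) -> float:
    """Upper bound for the delay before retry number `attempt` (0-based)."""
    return min(max_delay_s, base_delay_s * (2 ** attempt))


def full_jitter_delay(attempt: int, base_delay_s: float = 0.05, max_delay_s: float = 2.0) -> float:
    """Full jitter: pick a random delay in [0, ceiling]."""
    return random.uniform(0.0, backoff_ceiling(attempt, base_delay_s, max_delay_s))


def ceilings(max_retries: int = 3, base_delay_s: float = 0.05, max_delay_s: float = 2.0) -> List[float]:
    """Ceilings for every retry of one command."""
    return [backoff_ceiling(n, base_delay_s, max_delay_s) for n in range(max_retries)]


default_ceilings = ceilings()  # three retries with the defaults above
delays = [full_jitter_delay(n) for n in range(3)]
```

Randomising the whole interval, rather than adding a small jitter to a fixed delay, spreads retries from many clients more evenly after a shared outage.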
---
## OpenTelemetry Tracing
```python
import os

from opentelemetry import trace
from cachly import CachlyClient

cache = CachlyClient(
    url=os.environ["CACHLY_URL"],
    otel_tracer=trace.get_tracer("my-app"),
)

# Every get/set/delete/incr produces OTEL spans:
# span: "cache.get" attributes: { cache.key: "user:42" }
# span: "cache.set" attributes: { cache.key: "user:42", cache.ttl: 300 }
```
---
## API Reference
| Method | Description |
|---|---|
| `CachlyClient(url, batch_url=None, pool=None)` | Create client from Redis URL |
| `get(key)` | Get value (`None` if missing); auto-deserialises JSON |
| `set(key, value, ttl=None)` | Set value, optional TTL in seconds |
| `delete(*keys)` | Delete one or more keys |
| `exists(key) → bool` | Check existence |
| `expire(key, seconds)` | Update TTL |
| `incr(key) → int` | Atomic increment |
| `get_or_set(key, fn, ttl=None)` | Get-or-set pattern |
| `batch(ops) → BatchResult` | Bulk ops in one round-trip |
| `semantic` | `SemanticCache` for AI workloads |
| `raw` | Direct `redis.Redis` access |
| `close()` | Close connection pool and stop keep-alive |
---
## Environment Variables
```bash
CACHLY_URL=redis://:your-password@my-app.cachly.dev:30101
CACHLY_BATCH_URL=https://api.cachly.dev/v1/cache/YOUR_TOKEN # optional
```
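Validating `CACHLY_URL` at startup can surface configuration mistakes before the first cache call. The sketch below uses only the standard library; `parse_cachly_url` is a hypothetical helper, and accepting the `rediss://` (TLS) scheme is an assumption based on redis-py's URL conventions rather than anything stated here.

```python
from urllib.parse import urlparse


def parse_cachly_url(url: str) -> dict:
    """Split a Redis-style URL into its parts and reject obviously bad values."""
    parsed = urlparse(url)
    if parsed.scheme not in ("redis", "rediss"):
        raise ValueError(f"unexpected scheme: {parsed.scheme!r}")
    if not parsed.hostname or parsed.port is None:
        raise ValueError("URL must include a host and a port")
    return {
        "tls": parsed.scheme == "rediss",
        "host": parsed.hostname,
        "port": parsed.port,
        # Report only whether a password is present, never the secret itself.
        "has_password": bool(parsed.password),
    }


info = parse_cachly_url("redis://:your-password@my-app.cachly.dev:30101")
```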
---
## Links
- 📖 [cachly.dev docs](https://cachly.dev/docs)
- 🧠 [AI Memory / MCP Server](https://cachly.dev/docs/ai-memory)
- 🐛 [Issues](https://github.com/cachly-dev/sdk-python/issues)
- 📦 [PyPI](https://pypi.org/project/cachly/)
---
MIT © [cachly.dev](https://cachly.dev)