https://github.com/cachly-dev/sdk-python

Official Cachly sdk-python SDK
Host: GitHub
URL: https://github.com/cachly-dev/sdk-python
Owner: cachly-dev
Created: 2026-04-16T21:45:44.000Z (2 months ago)
Default Branch: main
Last Pushed: 2026-04-16T21:46:29.000Z (2 months ago)
Last Synced: 2026-04-16T23:38:39.022Z (2 months ago)
Language: Python
Size: 98.6 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md

# cachly Python SDK

> Official Python SDK for [cachly.dev](https://cachly.dev), a managed Valkey/Redis cache built for AI apps. **GDPR-compliant · German servers · Live in 30 seconds.**

[![PyPI](https://img.shields.io/pypi/v/cachly?color=blue&logo=pypi)](https://pypi.org/project/cachly/)

[![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue?logo=python)](https://pypi.org/project/cachly/)

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](../../LICENSE)

[![GDPR: EU-only](https://img.shields.io/badge/GDPR-EU%20only-green)](https://cachly.dev/legal)

---

## Installation

```bash
pip install cachly
# or
uv add cachly
```

> Requires Python 3.10+. Uses `redis-py` and `numpy` (for semantic cache).

---

## Quick Start

```python
import os

from cachly import CachlyClient

cache = CachlyClient(url=os.environ["CACHLY_URL"])

# Set / Get
cache.set("user:42", {"name": "Alice"}, ttl=300)
user = cache.get("user:42")           # returns dict or None

# Get-or-Set pattern (`db` stands in for your own data-access layer)
report = cache.get_or_set("report:monthly", lambda: db.run_expensive_report(), ttl=3600)

# Atomic counter
views = cache.incr("page:views")

cache.close()
```

Create your free instance at **[cachly.dev](https://cachly.dev)** — no credit card required.
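
The get-or-set pattern from the Quick Start can be illustrated without a server. The sketch below is a minimal in-memory stand-in, not the SDK's implementation: `TinyCache` and `expensive_report` are hypothetical names, and the only point is that the loader function runs on a miss and is skipped on a hit.

```python
import time
from typing import Any, Callable, Optional


class TinyCache:
    """In-memory stand-in for a cache client (illustration only)."""

    def __init__(self) -> None:
        # key -> (value, absolute expiry on the monotonic clock, or None)
        self._store: dict = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._store[key]  # expired: drop the entry and report a miss
            return None
        return value

    def set(self, key: str, value: Any, ttl: Optional[float] = None) -> None:
        expires_at = time.monotonic() + ttl if ttl is not None else None
        self._store[key] = (value, expires_at)

    def get_or_set(self, key: str, fn: Callable[[], Any], ttl: Optional[float] = None) -> Any:
        cached = self.get(key)
        if cached is not None:
            return cached
        value = fn()  # the loader runs only on a miss
        self.set(key, value, ttl)
        return value


calls = []


def expensive_report() -> dict:
    calls.append(1)  # count how often the loader actually runs
    return {"total": 123}


cache = TinyCache()
first = cache.get_or_set("report:monthly", expensive_report, ttl=3600)
second = cache.get_or_set("report:monthly", expensive_report, ttl=3600)
print(len(calls))
```

One consequence of this simple design: a loader that legitimately returns `None` is never treated as cached, so it runs on every call.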

---

## Async Usage

```python
import asyncio
import os

from cachly.asyncio import AsyncCachlyClient

async def main():
    cache = AsyncCachlyClient(url=os.environ["CACHLY_URL"])
    session_data = {"user_id": 42}  # placeholder payload
    await cache.set("session:abc", session_data, ttl=1800)
    data = await cache.get("session:abc")
    await cache.close()

asyncio.run(main())
```

---

## Semantic AI Cache

Cache LLM responses **by meaning**, not exact text. The same prompt phrased differently returns the cached answer — cutting OpenAI costs by up to 60 %.

```python
from cachly import SemanticOptions

# `cache`, `openai_client`, and `user_question` are assumed to exist already.
result = cache.semantic.get_or_set(
    prompt=user_question,
    fn=lambda: openai_client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": user_question}]
    ),
    embed_fn=lambda text: openai_client.embeddings.create(
        model="text-embedding-3-small", input=text
    ).data[0].embedding,
    options=SemanticOptions(similarity_threshold=0.92, ttl_seconds=3600),
)

print("hit" if result.hit else "miss", result.value)
```
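
To make "by meaning" concrete, here is a minimal sketch of threshold-based lookup using cosine similarity. It is an illustration under stated assumptions, not the SDK's internals: `TinySemanticCache`, `toy_embed`, and `VOCAB` are hypothetical, and the keyword-count embedding only stands in for a real embedding model.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


class TinySemanticCache:
    """Linear-scan semantic cache (illustration only)."""

    def __init__(self, embed_fn: Callable[[str], List[float]], similarity_threshold: float = 0.92) -> None:
        self.embed_fn = embed_fn
        self.threshold = similarity_threshold
        self.entries: List[Tuple[List[float], str]] = []

    def store(self, prompt: str, value: str) -> None:
        self.entries.append((list(self.embed_fn(prompt)), value))

    def lookup(self, prompt: str) -> Optional[str]:
        query = self.embed_fn(prompt)
        best_score, best_value = 0.0, None
        for vector, value in self.entries:
            score = cosine_similarity(query, vector)
            if score > best_score:
                best_score, best_value = score, value
        # Serve the closest entry only if it clears the threshold.
        return best_value if best_score >= self.threshold else None


# Toy embedding: counts of a few keywords. A real embed_fn would call a model.
VOCAB = ("reset", "password", "invoice", "refund")


def toy_embed(text: str) -> List[float]:
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]


semantic = TinySemanticCache(toy_embed, similarity_threshold=0.92)
semantic.store("how do I reset my password", "Use the reset link.")
hit = semantic.lookup("password reset please")   # same keywords, different wording
miss = semantic.lookup("where is my invoice")    # unrelated topic
```

A higher threshold reduces the risk of serving an answer to a different question, at the cost of fewer hits.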

---

## Batch API — Multiple Ops in One Round-Trip

Bundle GET/SET/DEL/EXISTS/TTL operations into **one** HTTP request (or Redis pipeline).

```python
import asyncio
import os
import time

from cachly import CachlyClient, BatchOp

cache = CachlyClient(
    url=os.environ["CACHLY_URL"],
    batch_url=os.environ.get("CACHLY_BATCH_URL"),  # optional
)

async def main():
    # `await` must run inside a coroutine
    result = await cache.batch([
        BatchOp("get",    "user:1"),
        BatchOp("get",    "config:app"),
        BatchOp("set",    "visits", str(time.time()), ttl=86400),
        BatchOp("exists", "session:xyz"),
        BatchOp("ttl",    "token:abc"),
    ])

    user    = result[0]   # str | None
    config  = result[1]   # str | None
    ok      = result[2]   # bool
    present = result[3]   # bool
    secs    = result[4]   # int (-1 = no TTL, -2 = key missing)

asyncio.run(main())
```

Without `batch_url` the method falls back automatically to a **Redis pipeline** (one TCP round-trip).
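
For readers unfamiliar with pipelining, the sketch below shows roughly what such a fallback could look like on top of `redis-py`. It is not the SDK's code: `run_pipeline` and the `(command, key, value, ttl)` tuple shape are hypothetical stand-ins for `BatchOp`, and `client` is assumed to be a `redis.Redis` instance (or anything exposing the same `pipeline()` interface).

```python
from typing import Any, Optional, Sequence, Tuple

# (command, key, value, ttl) tuples stand in for the SDK's BatchOp objects.
Op = Tuple[str, str, Optional[str], Optional[int]]


def run_pipeline(client: Any, ops: Sequence[Op]) -> list:
    """Queue every op on one pipeline and send them in a single round-trip."""
    pipe = client.pipeline(transaction=False)  # plain pipelining, no MULTI/EXEC
    for command, key, value, ttl in ops:
        if command == "get":
            pipe.get(key)
        elif command == "set":
            pipe.set(key, value, ex=ttl)  # ex = expiry in seconds
        elif command == "del":
            pipe.delete(key)
        elif command == "exists":
            pipe.exists(key)
        elif command == "ttl":
            pipe.ttl(key)
        else:
            raise ValueError(f"unsupported batch command: {command}")
    return pipe.execute()  # one result per queued command, in order
```

With a real connection this would be called as `run_pipeline(redis.Redis.from_url(os.environ["CACHLY_URL"]), [("get", "user:1", None, None)])`. Nothing is sent until `execute()`, which is what keeps the whole batch to one round-trip.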

---

## Django / FastAPI Integration

```python
# FastAPI
import os

from fastapi import FastAPI
from cachly import CachlyClient

app = FastAPI()
cache = CachlyClient(url=os.environ["CACHLY_URL"])

@app.on_event("shutdown")
async def shutdown():
    cache.close()

@app.get("/data/{key}")
async def get_data(key: str):
    # `fetch_from_db` stands in for your own loader function
    return cache.get_or_set(key, lambda: fetch_from_db(key), ttl=60)
```

---

## AI Dev Brain — Persistent Memory for Your Coding Assistant

cachly ships a **30-tool MCP server** that gives Claude Code, Cursor, GitHub Copilot, and Windsurf a persistent memory across sessions — so they never forget your architecture, lessons learned, or last session context.

```bash
# One-time setup
npx @cachly-dev/init
```

Or configure manually in your editor (`~/.vscode/mcp.json` / `.cursor/mcp.json`):

```json
{
  "servers": {
    "cachly": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@cachly-dev/mcp-server"],
      "env": { "CACHLY_JWT": "your-jwt-token" }
    }
  }
}
```

Add to your AI assistant instructions (e.g. `.github/copilot-instructions.md`):

```markdown

## cachly AI Brain

At the START of every session:

session_start(instance_id = "your-instance-id", focus = "what you're working on today")

At the END of every session:

session_end(instance_id = "your-instance-id", summary = "...", files_changed = [...])

After any bug fix or deploy:

learn_from_attempts(instance_id = "your-instance-id", topic = "category:keyword",

  outcome = "success", what_worked = "...", what_failed = "...", severity = "major")

```

`session_start` returns a full briefing in **one call**: last session summary, relevant lessons, open failures, brain health. 60 % fewer file reads, instant context, zero re-discovery.

→ Full docs: [cachly.dev/docs/ai-memory](https://cachly.dev/docs/ai-memory)

---

## LLM Response Caching Proxy

Use cachly as a **drop-in caching proxy** for OpenAI or Anthropic — no SDK changes needed:

```bash
# Instead of https://api.openai.com, use your cachly proxy URL:
OPENAI_BASE_URL=https://api.cachly.dev/v1/llm-proxy/YOUR_TOKEN/openai

# Anthropic:
ANTHROPIC_BASE_URL=https://api.cachly.dev/v1/llm-proxy/YOUR_TOKEN/anthropic
```

Identical requests are served from cache with `X-Cachly-Cache: HIT`. Check savings via `GET /v1/llm-proxy/YOUR_TOKEN/stats`.
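
How the proxy decides that two requests are identical is not documented here. As a general illustration of exact-match response caching, the sketch below derives a cache key from a canonical JSON form of the request body, so key order does not matter but any change in content does. The function name and the `llm:` key prefix are hypothetical.

```python
import hashlib
import json


def request_cache_key(provider: str, body: dict) -> str:
    """Hash a canonical JSON encoding of the request body (illustration only)."""
    # sort_keys makes the encoding independent of dict insertion order
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"llm:{provider}:{digest}"


a = request_cache_key("openai", {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hi"}],
})
b = request_cache_key("openai", {
    "messages": [{"content": "Hi", "role": "user"}],  # same content, keys reordered
    "model": "gpt-4o",
})
c = request_cache_key("openai", {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hi!"}],  # different prompt
})
print(a == b, a == c)
```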

---

## Agent Workflow Persistence

Checkpoint agent workflow state so agents can resume from the last completed step on crash:

```python
import json

import httpx

token = "YOUR_TOKEN"  # placeholder: your cachly API token
base = f"https://api.cachly.dev/v1/workflow/{token}"

# Save a checkpoint after each workflow step
httpx.post(f"{base}/checkpoints", json={
    "run_id":     "my-run-123",
    "step_index": 0,
    "step_name":  "research",
    "agent_name": "researcher",
    "status":     "completed",
    "state":      json.dumps({"topic": "AI caching", "results": []}),
})

# Resume: get the latest checkpoint for a run
checkpoint = httpx.get(f"{base}/runs/my-run-123/latest").json()
# → {"step_index": 2, "step_name": "write", "state": "...", "status": "completed"}
```
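
Once the latest checkpoint is fetched, the caller still has to decide where to pick up. The sketch below shows one way to do that, assuming the checkpoint shape shown above. The `STEPS` list and helper names are hypothetical, and treating any status other than `"completed"` as "re-run this step" is an assumption, since other status values are not listed here.

```python
import json
from typing import Optional

STEPS = ["research", "outline", "write", "review"]  # hypothetical workflow


def next_step_index(checkpoint: Optional[dict]) -> int:
    """Return the index of the step to run next."""
    if not checkpoint:
        return 0  # no checkpoint yet: start from the beginning
    if checkpoint.get("status") == "completed":
        return checkpoint["step_index"] + 1
    return checkpoint["step_index"]  # assumed: unfinished steps are re-run


def restore_state(checkpoint: Optional[dict]) -> dict:
    """Decode the JSON-encoded state string stored with the checkpoint."""
    if not checkpoint:
        return {}
    return json.loads(checkpoint["state"])


latest = {
    "step_index": 2,
    "step_name": "write",
    "state": json.dumps({"topic": "AI caching"}),
    "status": "completed",
}
resume_at = next_step_index(latest)
remaining = STEPS[resume_at:]
state = restore_state(latest)
print(remaining, state)
```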

---

## Connection Pooling & Keep-Alive

```python
import os

from cachly import CachlyClient, CachlyConfig, PoolConfig

cache = CachlyClient(config=CachlyConfig(
    url=os.environ["CACHLY_URL"],
    pool=PoolConfig(
        keep_alive_s=30,          # PING every 30s (prevents firewall idle-disconnect)
        max_retries=10,           # reconnect retries with exponential backoff
        base_retry_delay_s=0.1,   # first retry delay
        max_retry_delay_s=10,     # retry delay cap
        idle_timeout_s=300,       # auto-disconnect after 5 min idle (0 = disabled)
        on_error=lambda e: print(f"cachly error: {e}"),
        on_reconnect=lambda: print("cachly reconnected"),
    ),
))
```

---

## Retry with Exponential Backoff

Every command is automatically retried on transient errors (ConnectionError, TimeoutError, BusyLoadingError, …) using AWS-style full-jitter backoff:

```python
import os

from cachly import CachlyClient, RetryConfig

cache = CachlyClient(
    url=os.environ["CACHLY_URL"],
    retry=RetryConfig(
        max_retries=3,       # retry up to 3× (default)
        base_delay_s=0.05,   # first retry after ~50ms
        max_delay_s=2.0,     # cap at 2s
    ),
)
```

Disable retries with `RetryConfig(max_retries=0)`.
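
For reference, full-jitter backoff picks each delay uniformly at random between zero and an exponentially growing ceiling. The sketch below shows the general technique with the parameter names from `RetryConfig`. The doubling factor and the zero-based attempt counter are assumptions, and the SDK's exact formula may differ.

```python
import random


def backoff_ceiling(attempt: int, base_delay_s: float = 0.05, max_delay_s: float = 2.0) -> float:
    """Upper bound for the delay before retry number `attempt` (0-based)."""
    return min(max_delay_s, base_delay_s * (2 ** attempt))


def full_jitter_delay(attempt: int, base_delay_s: float = 0.05, max_delay_s: float = 2.0) -> float:
    """Full jitter: sleep a random time between 0 and the ceiling."""
    return random.uniform(0.0, backoff_ceiling(attempt, base_delay_s, max_delay_s))


ceilings = [backoff_ceiling(n) for n in range(3)]   # bounds for three retries
delays = [full_jitter_delay(n) for n in range(3)]   # actual randomized sleeps
print(ceilings)
```

Randomizing over the whole interval spreads out retries from many clients that failed at the same moment, so they do not all hit the server again in lockstep.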

---

## OpenTelemetry Tracing

```python
import os

from cachly import CachlyClient
from opentelemetry import trace

cache = CachlyClient(
    url=os.environ["CACHLY_URL"],
    otel_tracer=trace.get_tracer("my-app"),
)

# Every get/set/delete/incr produces OTEL spans:
#   span: "cache.get"  attributes: { cache.key: "user:42" }
#   span: "cache.set"  attributes: { cache.key: "user:42", cache.ttl: 300 }
```

---

## API Reference

| Method | Description |
|---|---|
| `CachlyClient(url, batch_url=None, pool=None)` | Create client from Redis URL |
| `get(key)` | Get value (`None` if missing); auto-deserialises JSON |
| `set(key, value, ttl=None)` | Set value, optional TTL in seconds |
| `delete(*keys)` | Delete one or more keys |
| `exists(key) → bool` | Check existence |
| `expire(key, seconds)` | Update TTL |
| `incr(key) → int` | Atomic increment |
| `get_or_set(key, fn, ttl=None)` | Get-or-set pattern |
| `batch(ops) → BatchResult` | Bulk ops in one round-trip |
| `semantic` | `SemanticCache` for AI workloads |
| `raw` | Direct `redis.Redis` access |
| `close()` | Close connection pool and stop keep-alive |

---

## Environment Variables

```bash
CACHLY_URL=redis://:your-password@my-app.cachly.dev:30101
CACHLY_BATCH_URL=https://api.cachly.dev/v1/cache/YOUR_TOKEN   # optional
```

---

## Links

- 📖 [cachly.dev docs](https://cachly.dev/docs)

- 🧠 [AI Memory / MCP Server](https://cachly.dev/docs/ai-memory)

- 🐛 [Issues](https://github.com/cachly-dev/sdk-python/issues)

- 📦 [PyPI](https://pypi.org/project/cachly/)

---

MIT © [cachly.dev](https://cachly.dev)