https://github.com/cachly-dev/cachly-openclaw

Official Cachly adapter for OpenClaw – persistent sessions, semantic LLM cache, memory storage
Host: GitHub
URL: https://github.com/cachly-dev/cachly-openclaw
Owner: cachly-dev
Created: 2026-04-17T01:37:29.000Z (about 2 months ago)
Default Branch: main
Last Pushed: 2026-06-06T21:58:44.000Z (9 days ago)
Last Synced: 2026-06-06T23:19:35.242Z (9 days ago)
Language: TypeScript
Homepage: https://cachly.dev/docs/openclaw
Size: 47.9 KB
Stars: 1
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md

# @cachly-dev/openclaw

> **You paid $0.08 for that answer. The next 1,000 identical asks: $0.00.**  

> Semantic LLM cache + persistent sessions + AI memory. 3 lines. No embeddings required.

[![npm](https://img.shields.io/npm/v/@cachly-dev/openclaw?color=red&logo=npm)](https://www.npmjs.com/package/@cachly-dev/openclaw)

[![npm downloads](https://img.shields.io/npm/dm/@cachly-dev/openclaw?color=blue)](https://www.npmjs.com/package/@cachly-dev/openclaw)

[![Free tier](https://img.shields.io/badge/Free%20tier-€0%2Fmo-brightgreen)](https://cachly.dev)

[![License: Apache-2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](../../LICENSE)

---

## Before / After

```typescript

// ❌ Before: every user message calls OpenAI, every time.
const reply = await openai.chat.completions.create({ model: 'gpt-4o', messages })

// ✅ After: repeated questions are served from the cache, with no API call and no cost.
const cache = createSemanticLLMCache({ url: process.env.CACHLY_URL! })
const cachedReply = await cache.getOrSet(userMessage, () =>
  openai.chat.completions.create({ model: 'gpt-4o', messages })
)

```

"How do I reset my password?" → "How can I reset my pw?" → **cache hit**. $0.00.

---

## Setup — 60 seconds

```bash

npm install @cachly-dev/openclaw

```

```bash

# Get a free Redis instance at cachly.dev (no credit card):

CACHLY_URL=redis://:password@your-instance.cachly.dev:6379

```

---

## ⚡ Semantic LLM Cache — 3 lines

Every time a user asks the same question in different words, you pay OpenAI again. This stops that.

```typescript

import { createSemanticLLMCache } from '@cachly-dev/openclaw'

const cache = createSemanticLLMCache({ url: process.env.CACHLY_URL! })

// Wrap any LLM call — that's it

const answer = await cache.getOrSet(

  userPrompt,

  () => openai.chat.completions.create({ model: 'gpt-4o', messages: [...] })

)

```

**No embed function, no `vectorUrl`:** exact-match caching plus **local BM25+ fuzzy search** work immediately (20–50% savings) with no extra API calls. "how do I reset password?" matches "password reset help", entirely in-process.

**No embed function, with `vectorUrl`:** BM25 plus a hosted pgvector index, for higher hit rates across large caches.
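To give a feel for how term-based matching can pair "how do I reset password?" with "password reset help", here is a minimal sketch of classic Okapi BM25 scoring. It is an illustration only: `tokenize` and `bm25Scores` are hypothetical names, the parameters `k1 = 1.2` and `b = 0.75` are common textbook defaults, and the library's own BM25+ variant and fuzzy matching are not reproduced here.

```typescript
// Split text into lowercase alphanumeric tokens.
const tokenize = (text: string): string[] =>
  text.toLowerCase().match(/[a-z0-9]+/g) ?? [];

// Score every document against the query with classic Okapi BM25.
function bm25Scores(query: string, docs: string[], k1 = 1.2, b = 0.75): number[] {
  const tokenized = docs.map(tokenize);
  const avgLen =
    tokenized.reduce((sum, d) => sum + d.length, 0) / Math.max(docs.length, 1);
  const queryTerms = Array.from(new Set(tokenize(query)));

  return tokenized.map((doc) => {
    let score = 0;
    for (const term of queryTerms) {
      const tf = doc.filter((t) => t === term).length;
      if (tf === 0) continue;
      const df = tokenized.filter((d) => d.includes(term)).length;
      // The +1 inside the log keeps the idf non-negative.
      const idf = Math.log(1 + (docs.length - df + 0.5) / (df + 0.5));
      score += (idf * tf * (k1 + 1)) / (tf + k1 * (1 - b + (b * doc.length) / avgLen));
    }
    return score;
  });
}

// Hypothetical cached prompts; the first shares "reset" and "password" with the query.
const cachedPrompts = ['password reset help', 'export invoices as pdf'];
const scores = bm25Scores('how do I reset password?', cachedPrompts);
```

The first prompt gets a positive score because two query terms appear in it, while the second shares no terms and scores zero. A real cache would also need a cut-off score below which a match is treated as a miss.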

**Add semantic matching** for 60–90% savings (10 more lines):

```typescript

const cache = createSemanticLLMCache({

  url:       process.env.CACHLY_URL!,

  vectorUrl: process.env.CACHLY_VECTOR_URL,   // from cachly.dev dashboard

  embedFn:   (text) =>

    openai.embeddings.create({ model: 'text-embedding-3-small', input: text })

      .then(r => r.data[0].embedding),

  threshold: 0.92,   // cosine similarity (default)

  ttl:       3600,   // seconds

})

```
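The `threshold` option above is described as a cosine similarity. As a rough sketch of what that comparison means, the helpers below compute cosine similarity between two embedding vectors and apply the cut-off. `cosineSimilarity` and `isCacheHit` are hypothetical names, and treating a similarity equal to the threshold as a hit is an assumption, not documented behaviour.

```typescript
// Cosine similarity: dot product divided by the product of the vector norms.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('vectors must have the same length');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as dissimilar to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Assumption: a similarity at or above the threshold counts as a hit.
const isCacheHit = (similarity: number, threshold = 0.92): boolean =>
  similarity >= threshold;
```

Raising the threshold reduces the risk of returning an answer to a different question at the cost of fewer hits; lowering it does the opposite.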

"How do I reset my password?" → "How can I reset my pw?" → **same cache hit**. 💰

---

## 📊 What you save

| Questions/day | Cache hit rate | Monthly saving (GPT-4o) |
|---------------|----------------|-------------------------|
| 100           | 40% exact      | ~$8                     |
| 100           | 70% semantic   | ~$22                    |
| 1 000         | 70% semantic   | ~$220                   |
| 10 000        | 70% semantic   | ~$2 200                 |

After 10 cache hits, the console logs:

```

🎯 cachly: 12,340 tokens saved this session (10 hits)

   Full stats → cachly.dev/dashboard

```

(Cost breakdown available in the dashboard.)
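The savings table can be approximated with simple arithmetic: questions per day × hit rate × days per month × cost per call. The sketch below assumes a 30-day month and a cost of about $0.0105 per call, a figure back-derived from the 70% rows and not stated in this README. The 40% row implies a lower per-call cost, so the table is best read as approximate. `estimateMonthlySaving` is a hypothetical helper.

```typescript
// Estimated monthly saving in USD from calls that the cache answers.
function estimateMonthlySaving(
  questionsPerDay: number,
  hitRate: number,          // fraction between 0 and 1
  costPerCallUsd: number,   // assumption: average cost of one uncached LLM call
  daysPerMonth = 30,
): number {
  return questionsPerDay * hitRate * daysPerMonth * costPerCallUsd;
}

// 100 questions/day at a 70% hit rate: 100 * 0.7 * 30 * 0.0105 ≈ 22.05
const smallTeamSaving = estimateMonthlySaving(100, 0.7, 0.0105);
```

Plugging in your own average cost per call gives a more realistic figure than the table, since prompt length and model choice change that number considerably.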

---

## 🗄️ Session Store

Persist conversation history in Redis — no cold starts, no lost context:

```typescript

import { createCachlySessionStore } from '@cachly-dev/openclaw'

const sessions = createCachlySessionStore({

  url: process.env.CACHLY_URL!,

  ttl: 604800,  // 7 days

})

const history = await sessions.get(userId)

await sessions.set(userId, [...history, { role: 'user', content: message }])

```

Works with any LLM framework — OpenAI, LangChain, Vercel AI SDK, etc.
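Because the session example appends to the history on every turn, stored conversations grow without bound. One possible way to cap them, sketched under assumptions: `appendAndTrim` is a hypothetical helper, the message shape mirrors the `{ role, content }` objects used above, and keeping system messages plus the most recent turns is just one reasonable policy.

```typescript
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };

// Append a message, then keep all system messages plus the last `keepLast` others.
function appendAndTrim(
  history: ChatMessage[],
  next: ChatMessage,
  keepLast = 20,
): ChatMessage[] {
  const updated = [...history, next];
  const system = updated.filter((m) => m.role === 'system');
  const others = updated.filter((m) => m.role !== 'system');
  // Math.max guards against slice(-0), which would return the whole array.
  return [...system, ...others.slice(-Math.max(1, keepLast))];
}

// Usage with the session store shown above (the `?? []` fallback is a defensive
// assumption; this README does not say what `get` returns for an unknown user):
//   const history = (await sessions.get(userId)) ?? []
//   await sessions.set(userId, appendAndTrim(history, { role: 'user', content: message }))
```

The helper returns a new array and leaves its input untouched, so it can be dropped between `get` and `set` without other changes.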

---

## 🧠 Memory Adapter

Long-term semantic memory — store facts, recall by meaning:

```typescript

import { createCachlyMemoryAdapter } from '@cachly-dev/openclaw'

const memory = createCachlyMemoryAdapter({

  url:       process.env.CACHLY_URL!,

  vectorUrl: process.env.CACHLY_VECTOR_URL!,

  embedFn:   myEmbedFn,

  ttl:       7776000,  // 90 days

})

await memory.store({ id: 'pref-1', text: 'User prefers TypeScript over Python' })

const results = await memory.search('programming language preference', { topK: 5 })

// → [{ text: 'User prefers TypeScript over Python', score: 0.97 }]

```

---

## 🔍 Brain Search (BM25+)

Full-text search over cached data — no embeddings needed:

```typescript

import { brainSearch } from '@cachly-dev/openclaw'

const results = await brainSearch(process.env.CACHLY_VECTOR_URL!, 'deploy authentication error')

// → [{ key: 'lesson:fix:auth', score: 4.2, preview: '...' }]

```

---

## 🧩 Standalone — works with any LLM stack

No OpenClaw needed. Drop into LangChain, Vercel AI SDK, plain `fetch`, or any custom pipeline:

### LangChain

```typescript

import { createSemanticLLMCache } from '@cachly-dev/openclaw'

import { ChatOpenAI } from '@langchain/openai'

const cache = createSemanticLLMCache({ url: process.env.CACHLY_URL! })

const llm = new ChatOpenAI({ model: 'gpt-4o' })  // keep in sync with the `model` label returned below

async function cachedInvoke(prompt: string) {

  return cache.getOrSet(

    prompt,

    () => llm.invoke(prompt).then(r => ({ content: r.content as string, model: 'gpt-4o' }))

  )

}

```

### Vercel AI SDK

```typescript

import { createSemanticLLMCache } from '@cachly-dev/openclaw'

import { generateText } from 'ai'

import { openai } from '@ai-sdk/openai'

const cache = createSemanticLLMCache({ url: process.env.CACHLY_URL! })

export async function POST(req: Request) {

  const { prompt } = await req.json()

  const result = await cache.getOrSet(

    prompt,

    () => generateText({ model: openai('gpt-4o'), prompt }).then(r => ({ content: r.text, model: 'gpt-4o' }))

  )

  return Response.json(result)

}

```

### Plain fetch / any provider

```typescript

import { createSemanticLLMCache } from '@cachly-dev/openclaw'

const cache = createSemanticLLMCache({ url: process.env.CACHLY_URL! })

const answer = await cache.getOrSet(

  userMessage,

  async () => {

    const res = await fetch('https://api.anthropic.com/v1/messages', {
      method: 'POST',
      headers: {
        'x-api-key': process.env.ANTHROPIC_KEY!,
        'anthropic-version': '2023-06-01',   // required by the Anthropic Messages API
        'content-type': 'application/json',
      },
      body: JSON.stringify({ model: 'claude-opus-4-5', max_tokens: 1024, messages: [{ role: 'user', content: userMessage }] }),
    })
    // Throw on HTTP errors instead of returning an error body as the answer
    if (!res.ok) throw new Error(`Anthropic API error: ${res.status}`)
    const json = await res.json()
    return { content: json.content[0].text, model: 'claude-opus-4-5', inputTokens: json.usage.input_tokens, outputTokens: json.usage.output_tokens }

  }

)

```
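The standalone examples return objects shaped like `{ content, model, inputTokens, outputTokens }` from the wrapped call. A small pair of normalizers can produce that shape from different providers. This is a sketch under assumptions: `CachedCompletion`, `fromOpenAI`, and `fromAnthropic` are hypothetical names, the shape is inferred from the examples above and not from a documented type, and the response interfaces cover only the fields used here.

```typescript
// Shape inferred from the examples above; not a documented library type.
interface CachedCompletion {
  content: string;
  model: string;
  inputTokens?: number;
  outputTokens?: number;
}

// Only the fields this sketch reads from an OpenAI chat completion response.
interface OpenAIChatResponse {
  model: string;
  choices: { message: { content: string | null } }[];
  usage?: { prompt_tokens: number; completion_tokens: number };
}

// Only the fields this sketch reads from an Anthropic Messages response.
interface AnthropicMessageResponse {
  model: string;
  content: { type: string; text?: string }[];
  usage?: { input_tokens: number; output_tokens: number };
}

function fromOpenAI(res: OpenAIChatResponse): CachedCompletion {
  return {
    content: res.choices[0]?.message.content ?? '',
    model: res.model,
    inputTokens: res.usage?.prompt_tokens,
    outputTokens: res.usage?.completion_tokens,
  };
}

function fromAnthropic(res: AnthropicMessageResponse): CachedCompletion {
  // Join all text blocks; non-text blocks (e.g. tool use) are skipped.
  const text = res.content
    .filter((block) => block.type === 'text')
    .map((block) => block.text ?? '')
    .join('');
  return {
    content: text,
    model: res.model,
    inputTokens: res.usage?.input_tokens,
    outputTokens: res.usage?.output_tokens,
  };
}
```

With helpers like these, the callback passed to `cache.getOrSet` can end with `.then(fromOpenAI)` or `return fromAnthropic(json)`, keeping the provider-specific parsing in one place.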

---

## 🦾 OpenClaw adapter (bonus)

If you use [OpenClaw](https://openclaw.dev) (22-channel AI assistant), one function wires everything:

```typescript

import { createCachlyOpenClawConfig } from '@cachly-dev/openclaw'

import OpenAI from 'openai'

const openai = new OpenAI()

const cachlyConfig = await createCachlyOpenClawConfig({

  url:       process.env.CACHLY_URL!,

  vectorUrl: process.env.CACHLY_VECTOR_URL,

  embedFn:   (t) =>

    openai.embeddings.create({ model: 'text-embedding-3-small', input: t })

      .then(r => r.data[0].embedding),

})

// OpenClawApp comes from your OpenClaw installation; its import is not shown in this README
const app = new OpenClawApp({

  ...cachlyConfig,

  // ... rest of your config

})

```

This gives you: semantic cache + persistent sessions + Redis memory — all across WhatsApp, Telegram, Slack, Discord at once.

---

## 👥 Team Brain

One shared cachly instance → every team member gets smarter from each other's work:

```typescript

// `brain` is a Team Brain client; how it is created is not shown in this README.
// Alice fixes a deploy issue:

await brain.learnFromAttempts({ topic: 'deploy:k8s', outcome: 'success', whatWorked: '...' })

// Bob starts a session the next day:

await brain.sessionStart()

// → "💡 alice solved deploy:k8s 1d ago: ..."

```

Team plans from €99/mo (10 seats) at [cachly.dev/teams](https://cachly.dev/teams).

---

## 🚀 Upgrade path

| Level | What you get | Setup |
|-------|-------------|-------|
| **Free — Exact + BM25** | 20–50% reduction, in-process BM25+ fuzzy, zero config | `CACHLY_URL` only |
| **Free — Semantic cache** | 60–90% cost reduction | + `embedFn` |
| **Speed tier** | Hosted pgvector, higher hit rates at scale | Speed plan at cachly.dev |
| **Team Brain** | Shared knowledge, team lessons, analytics | cachly.dev/teams |

---

## 🤖 Use with Python AI Agents

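OpenClaw has a TypeScript SDK, but Python AI agents can call Cachly's REST API directly for persistent memory and semantic caching. In these examples `CACHLY_URL` is the HTTPS API base URL (`https://api.cachly.dev`), not the `redis://` connection string used in the TypeScript sections. The patterns below cover the most popular frameworks.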

### LangChain — Persistent Agent Memory

```python

import os, requests

from langchain.schema import BaseMemory

CACHLY_URL = os.environ["CACHLY_URL"]

CACHLY_JWT = os.environ["CACHLY_JWT"]

INSTANCE_ID = os.environ["CACHLY_BRAIN_INSTANCE_ID"]

HEADERS = {"Authorization": f"Bearer {CACHLY_JWT}", "Content-Type": "application/json"}

class CachlyBrainMemory(BaseMemory):

    """LangChain memory backed by Cachly Brain — survives restarts."""

    @property

    def memory_variables(self):

        return ["brain_context"]

    def load_memory_variables(self, inputs: dict) -> dict:

        query = inputs.get("input", "")

        r = requests.post(
            f"{CACHLY_URL}/api/v1/instances/{INSTANCE_ID}/brain/smart-recall",
            headers=HEADERS,
            json={"query": query, "topK": 3},
            timeout=10,  # avoid hanging the agent if the API is unreachable
        )
        r.raise_for_status()
        lessons = r.json().get("results", [])
        context = "\n".join(f"- {lesson['whatWorked']}" for lesson in lessons if lesson.get("whatWorked"))
        return {"brain_context": context or "No relevant memory found."}

    def save_context(self, inputs: dict, outputs: dict) -> None:

        # Learn from what the agent discovered

        output = outputs.get("output", "")

        if output:

            requests.post(
                f"{CACHLY_URL}/api/v1/instances/{INSTANCE_ID}/brain/learn",
                headers=HEADERS,
                json={"topic": "agent:langchain", "outcome": "success", "whatWorked": output[:500]},
                timeout=10,  # avoid hanging the agent if the API is unreachable
            )

    def clear(self):

        pass  # Brain is persistent — clear not supported by design

# Usage:

from langchain.agents import initialize_agent, AgentType

from langchain.chat_models import ChatOpenAI

memory = CachlyBrainMemory()

agent = initialize_agent(

    tools=[...],

    llm=ChatOpenAI(model="gpt-4o"),

    agent=AgentType.CONVERSATIONAL_REACT_DESCRIPTION,

    memory=memory,

    system_message="You have access to a persistent Brain. brain_context = {brain_context}",

)

```

### CrewAI — Shared Team Brain Tool

```python

import os, requests

from crewai_tools import BaseTool

from pydantic import BaseModel, Field

CACHLY_URL = os.environ["CACHLY_URL"]

CACHLY_JWT = os.environ["CACHLY_JWT"]

INSTANCE_ID = os.environ["CACHLY_BRAIN_INSTANCE_ID"]

HEADERS = {"Authorization": f"Bearer {CACHLY_JWT}"}

class RecallInput(BaseModel):

    query: str = Field(description="What to search for in the Brain")

class CachlyRecallTool(BaseTool):

    name: str = "cachly_brain_recall"

    description: str = "Search persistent memory for lessons, solutions, and context from past work"

    args_schema: type[BaseModel] = RecallInput

    def _run(self, query: str) -> str:

        r = requests.post(
            f"{CACHLY_URL}/api/v1/instances/{INSTANCE_ID}/brain/smart-recall",
            headers={**HEADERS, "Content-Type": "application/json"},
            json={"query": query, "topK": 5},
            timeout=10,
        )
        r.raise_for_status()
        results = r.json().get("results", [])
        if not results:
            return "No relevant memory found."
        # .get avoids a KeyError when a lesson lacks a field
        return "\n".join(f"[{lesson.get('topic', '?')}] {lesson.get('whatWorked', '')}" for lesson in results)

class LearnInput(BaseModel):

    topic: str = Field(description="Category slug like 'deploy:api' or 'fix:auth'")

    what_worked: str = Field(description="What solution worked")

    outcome: str = Field(default="success", description="success | failure | partial")

class CachlyLearnTool(BaseTool):

    name: str = "cachly_brain_learn"

    description: str = "Store a lesson in persistent memory so future agents can benefit from it"

    args_schema: type[BaseModel] = LearnInput

    def _run(self, topic: str, what_worked: str, outcome: str = "success") -> str:

        r = requests.post(
            f"{CACHLY_URL}/api/v1/instances/{INSTANCE_ID}/brain/learn",
            headers={**HEADERS, "Content-Type": "application/json"},
            json={"topic": topic, "outcome": outcome, "whatWorked": what_worked},
            timeout=10,
        )
        r.raise_for_status()  # only report success if the API accepted the lesson
        return f"✅ Stored lesson: {topic}"

# Usage with CrewAI:

from crewai import Agent, Task, Crew

researcher = Agent(

    role="Research Analyst",

    goal="Research topics and store findings for the team",

    tools=[CachlyRecallTool(), CachlyLearnTool()],

    backstory="You have a persistent memory that survives across sessions.",

)

```

### AutoGen (Microsoft)

```python

import os, requests

from autogen import AssistantAgent, UserProxyAgent

CACHLY_URL = os.environ["CACHLY_URL"]

CACHLY_JWT = os.environ["CACHLY_JWT"]

INSTANCE_ID = os.environ["CACHLY_BRAIN_INSTANCE_ID"]

HEADERS = {"Authorization": f"Bearer {CACHLY_JWT}", "Content-Type": "application/json"}

def recall_brain(query: str) -> str:

    """Search Cachly Brain for relevant memory."""

    r = requests.post(
        f"{CACHLY_URL}/api/v1/instances/{INSTANCE_ID}/brain/smart-recall",
        headers=HEADERS, json={"query": query, "topK": 5}, timeout=10,
    )
    r.raise_for_status()
    results = r.json().get("results", [])
    # .get avoids a KeyError when a lesson lacks a field
    return "\n".join(f"• [{lesson.get('topic', '?')}] {lesson.get('whatWorked', '')}" for lesson in results) or "No memory found."

def store_lesson(topic: str, what_worked: str, outcome: str = "success") -> str:

    """Store a lesson in Cachly Brain."""

    r = requests.post(
        f"{CACHLY_URL}/api/v1/instances/{INSTANCE_ID}/brain/learn",
        headers=HEADERS, json={"topic": topic, "outcome": outcome, "whatWorked": what_worked}, timeout=10,
    )
    r.raise_for_status()  # surface API errors instead of reporting a false success
    return f"Stored: {topic}"

assistant = AssistantAgent(

    name="CachlyAssistant",

    system_message="""You are a helpful AI with persistent memory via Cachly Brain.

ALWAYS start by calling recall_brain() with the user's query.

ALWAYS end by calling store_lesson() with what you discovered.""",

    llm_config={

        "functions": [

            {"name": "recall_brain", "description": "Search persistent memory", "parameters": {"type": "object", "properties": {"query": {"type": "string"}}, "required": ["query"]}},

            {"name": "store_lesson", "description": "Store a lesson", "parameters": {"type": "object", "properties": {"topic": {"type": "string"}, "what_worked": {"type": "string"}, "outcome": {"type": "string"}}, "required": ["topic", "what_worked"]}},

        ],

    },

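)

# The assistant only proposes function calls; an executor must run them.
# Assuming the pyautogen 0.2-style API, register the callables on a proxy agent:
user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    code_execution_config=False,
    function_map={"recall_brain": recall_brain, "store_lesson": store_lesson},
)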

```

### LlamaIndex — QueryEngine with Cachly Memory

```python

import os, requests

from llama_index.core.memory import BaseMemory

from llama_index.core.schema import TextNode

CACHLY_URL = os.environ["CACHLY_URL"]

CACHLY_JWT = os.environ["CACHLY_JWT"]

INSTANCE_ID = os.environ["CACHLY_BRAIN_INSTANCE_ID"]

HEADERS = {"Authorization": f"Bearer {CACHLY_JWT}", "Content-Type": "application/json"}

class CachlyMemory(BaseMemory):

    """LlamaIndex memory backed by Cachly Brain."""

    def get(self, input: str, **kwargs) -> list[TextNode]:

        r = requests.post(

            f"{CACHLY_URL}/api/v1/instances/{INSTANCE_ID}/brain/smart-recall",

            headers=HEADERS, json={"query": input, "topK": 5},

        )

        return [

            TextNode(text=f"[{l['topic']}] {l.get('whatWorked', '')}")

            for l in r.json().get("results", [])

        ]

    def put(self, messages) -> None:

        for msg in messages:

            if hasattr(msg, "content") and msg.content:

                requests.post(

                    f"{CACHLY_URL}/api/v1/instances/{INSTANCE_ID}/brain/learn",

                    headers=HEADERS,

                    json={"topic": "agent:llamaindex", "outcome": "success", "whatWorked": str(msg.content)[:500]},

                )

    def reset(self) -> None:

        pass  # Persistent by design

# Usage:

from llama_index.core.chat_engine import CondensePlusContextChatEngine

# `index` is an existing LlamaIndex index (e.g. a VectorStoreIndex) built elsewhere
chat_engine = CondensePlusContextChatEngine.from_defaults(
    index.as_retriever(),

    memory=CachlyMemory(),

    verbose=True,

)

response = chat_engine.chat("How did we fix the last deployment issue?")

```

### Semantic Cache for LLM API Calls (Python)

Skip expensive LLM calls for semantically similar prompts — no embeddings needed on your side:

```python

import os, hashlib, requests

CACHLY_URL = os.environ["CACHLY_URL"]

CACHLY_JWT = os.environ["CACHLY_JWT"]

INSTANCE_ID = os.environ["CACHLY_BRAIN_INSTANCE_ID"]

HEADERS = {"Authorization": f"Bearer {CACHLY_JWT}", "Content-Type": "application/json"}

def cached_llm_call(prompt: str, llm_fn, namespace: str = "cachly:sem:qa") -> str:

    """Call LLM with semantic caching via Cachly."""

    # 1. Check semantic cache

    r = requests.post(
        f"{CACHLY_URL}/api/v1/instances/{INSTANCE_ID}/cache/semantic-search",
        headers=HEADERS,
        json={"query": prompt, "namespace": namespace, "threshold": 0.85},
        timeout=10,
    )
    r.raise_for_status()
    hit = r.json().get("hit")
    if hit:
        return hit["value"]  # cache hit: no LLM call needed 🎉

    # 2. Cache miss — call LLM

    response = llm_fn(prompt)

    # 3. Store in semantic cache

    key = hashlib.sha256(prompt.encode()).hexdigest()[:16]

    requests.post(

        f"{CACHLY_URL}/api/v1/instances/{INSTANCE_ID}/cache/semantic",

        headers=HEADERS,

        json={"key": key, "value": response, "namespace": namespace, "prompt": prompt},

    )

    return response

# Usage with any LLM:

import openai

client = openai.OpenAI()

def call_gpt(prompt: str) -> str:

    return client.chat.completions.create(

        model="gpt-4o", messages=[{"role": "user", "content": prompt}]

    ).choices[0].message.content

answer = cached_llm_call("What is the capital of France?", call_gpt)

```

### Environment Setup for Python Agents

```bash

pip install requests python-dotenv

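# Contents of .env (not shell commands); load with python-dotenv's load_dotenv() before reading os.environ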

CACHLY_URL=https://api.cachly.dev

CACHLY_JWT=cky_live_...          # from cachly.dev → Dashboard → API Keys

CACHLY_BRAIN_INSTANCE_ID=...     # from cachly.dev → Dashboard → Brain

```

Get your free instance at **[cachly.dev/setup-ai](https://cachly.dev/setup-ai)** — no credit card required.

---

## Links

- 📖 [cachly.dev docs](https://cachly.dev/docs)

- 🧠 [AI Memory / MCP Server](https://cachly.dev/docs/ai-memory)

- 📦 [`@cachly-dev/mcp-server`](https://www.npmjs.com/package/@cachly-dev/mcp-server) — give your AI editor persistent memory (51 MCP tools)

- 🤖 [OpenClaw](https://openclaw.dev)

- 📦 [npm](https://www.npmjs.com/package/@cachly-dev/openclaw)

- 🐛 [Issues](https://github.com/cachly-dev/cachly/issues)

---

Apache-2.0 © [cachly.dev](https://cachly.dev)