https://github.com/sopaco/cortex-mem

🧠 The production-ready memory system for intelligent agents. A complete solution for memory management, from extraction and vector search to automated optimization, with a REST API, MCP, CLI, and insights dashboard out-of-the-box.

🧠 The AI-native memory framework for building intelligent, context-aware applications 🧠


Built with Rust, Cortex Memory is a high-performance, persistent, and intelligent long-term memory system that gives your AI agents (including OpenClaw) the ability to remember, learn, and personalize interactions across sessions.




# 👋 What is Cortex Memory?

**Cortex Memory** is a complete, production-ready framework for giving your AI applications a long-term memory. It moves beyond simple chat history, providing an intelligent memory system with a **hierarchical three-tier memory architecture** (L0 Abstract → L1 Overview → L2 Detail) that automatically extracts, organizes, and optimizes information to make your AI agents smarter and more personalized.

Cortex Memory uses a sophisticated pipeline to process and manage memories, centered around a **hybrid storage architecture** combining **virtual-filesystem** durability with vector-based **semantic search**.

| Blazing Fast **Layered Context Loading** | Context Organization as **Virtual Files** | **Precision** Memory Retrieval |
| :--- | :--- | :--- |
| ![Layered Context Loading](./assets/intro/highlight_style_modern.jpg) |![architecture_style_modern](./assets/intro/architecture_style_modern.jpg) | ![architecture_style_classic](./assets/benchmark/cortex_mem_vs_openclaw_2.png) |

**Cortex Memory** organizes data using a **virtual filesystem** approach with the `cortex://` URI scheme:

```
# Basic Structure
cortex://{dimension}/{path}

# Dimensions
session/ - Session memories (conversation history, timeline)
user/ - User memories (preferences, entities, events)
agent/ - Agent memories (cases, skills)
resources/ - Knowledge base resources

# Examples
cortex://session/{session_id}/timeline/{date}/{time}.md
cortex://user/preferences/{name}.md
cortex://agent/cases/{case_id}.md
cortex://resources/{resource_name}/
```
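
To make the addressing scheme above concrete, here is a minimal Python sketch that splits a `cortex://` URI into its dimension and path. It is an illustration only: `CortexUri` and `parse_cortex_uri` are hypothetical names, not part of the Cortex Memory API, and the validation rules are assumptions based on the four dimensions listed above.

```python
from dataclasses import dataclass

# Dimensions listed in the scheme description above.
DIMENSIONS = {"session", "user", "agent", "resources"}
SCHEME = "cortex://"


@dataclass(frozen=True)
class CortexUri:
    """Hypothetical container for the two parts of a cortex:// URI."""
    dimension: str
    path: str  # everything after the dimension, without a leading slash


def parse_cortex_uri(uri: str) -> CortexUri:
    """Split a cortex:// URI into its dimension and the remaining path."""
    if not uri.startswith(SCHEME):
        raise ValueError(f"not a cortex:// URI: {uri!r}")
    dimension, _, path = uri[len(SCHEME):].partition("/")
    if dimension not in DIMENSIONS:
        raise ValueError(f"unknown dimension: {dimension!r}")
    return CortexUri(dimension, path)


example = parse_cortex_uri("cortex://user/preferences/language.md")
print(example.dimension, example.path)  # user preferences/language.md
```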


# 😺 Why Use Cortex Memory?


Transform your stateless AI into an intelligent, context-aware partner.




| Before Cortex Memory: Stateless AI | After Cortex Memory: Intelligent AI with Cortex Memory |
| :--- | :--- |
| Forgets user details after every session | Remembers user preferences and history |
| Lacks personalization and context | Provides deeply personalized interactions |
| Repeats questions and suggestions | Learns and adapts over time |
| Limited to short-term conversation history | Maintains context across multiple conversations |
| Feels robotic and impersonal | Builds rapport and feels like a true assistant |




🌟 For:
- Developers building LLM-powered chatbots and agents.
- Teams creating personalized AI assistants.
- Open source projects that need a memory backbone.
- Anyone who wants to build truly intelligent AI applications!

❤️ Like Cortex Memory? Star it 🌟 or [Sponsor Me](https://github.com/sponsors/sopaco)! ❤️

# 🌠 Features & Capabilities

- File-System Based Storage: Memory content stored as markdown files using the `cortex://` virtual URI scheme, enabling version control compatibility and portability.
- Intelligent Memory Extraction: Automatically extracts structured memories (facts, decisions, entities) from conversations using LLM-powered analysis with confidence scoring.
- Vector-Based Semantic Search: High-performance similarity search via Qdrant with metadata filtering across dimensions (user/agent/session), using weighted scoring.
- Multi-Modal Access: Interact through REST API, CLI, MCP protocol, or direct Rust library integration.
- Three-Tier Memory Hierarchy: Progressive disclosure system (L0 Abstract → L1 Overview → L2 Detail) optimizes LLM context window usage with lazy generation.
- Session Management: Track conversation timelines, participants, and message history with automatic indexing and event-driven processing.
- Multi-Tenancy Support: Isolated memory spaces for different users and agents within a single deployment via tenant-aware collection naming.
- Event-Driven Automation: File watchers and auto-indexers for background processing, synchronization, and profile enrichment.
- LLM Result Caching: Intelligent caching with LRU eviction and TTL expiration reduces redundant LLM API calls by 50-75%, with cascade layer debouncing for 70-90% reduction in layer updates.
- Incremental Memory Updates: An event-driven incremental update system (`MemoryEventCoordinator`, `CascadeLayerUpdater`) keeps L0/L1 layers in sync automatically as memories change.
- Memory Forgetting Mechanism: `MemoryCleanupService`, based on the Ebbinghaus forgetting curve, automatically archives or deletes low-strength memories to control storage growth in long-running agents.
- Agent Framework Integration: Built-in support for Rig framework and Model Context Protocol (MCP).
- Web Dashboard: Svelte 5 SPA (Insights) for monitoring, tenant management, and semantic search visualization.
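
The forgetting mechanism in the list above is described only by name, so the following Python sketch is a guess at the general idea rather than the actual `MemoryCleanupService` logic. It uses the textbook Ebbinghaus form R = exp(-t / S); the function names, the `stability_days` parameter, and both thresholds are assumptions chosen for illustration.

```python
import math


def retention(elapsed_days: float, stability_days: float) -> float:
    """Ebbinghaus-style retention R = exp(-t / S), a value in (0, 1]."""
    if stability_days <= 0:
        raise ValueError("stability_days must be positive")
    return math.exp(-elapsed_days / stability_days)


def cleanup_action(strength: float,
                   archive_below: float = 0.3,
                   delete_below: float = 0.1) -> str:
    """Map a memory strength to keep / archive / delete (thresholds are assumed)."""
    if strength < delete_below:
        return "delete"
    if strength < archive_below:
        return "archive"
    return "keep"


# A memory with a one-week stability, checked after 0, 1, 2 and 3 weeks.
for days in (0, 7, 14, 21):
    strength = retention(days, stability_days=7)
    print(days, round(strength, 3), cleanup_action(strength))
```

With these assumed thresholds, the memory is kept for the first week, archived by week two, and deleted by week three. A real system would presumably also raise the stability each time a memory is accessed.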

# 🧠 How It Works

The diagram below traces how input moves through the core engine into the two storage backends: markdown files addressed by `cortex://` URIs and a Qdrant vector index.

```mermaid
flowchart TB
subgraph Input["Input Layer"]
User[User Message]
Agent[Agent Message]
CLI[CLI Commands]
API[REST API]
MCP[MCP Protocol]
end

subgraph Core["Core Engine (cortex-mem-core)"]
Session[Session Manager]
Extractor[Memory Extractor]
Indexer[Auto Indexer]
Search[Vector Search Engine]
end

subgraph Storage["Storage Layer"]
FS[("Filesystem<br/>cortex:// URI")]
Qdrant[("Qdrant<br/>Vector Index")]
end

subgraph External["External Services"]
LLM["LLM Provider<br/>Extraction & Analysis"]
Embed["Embedding API<br/>Vector Generation"]
end

User --> Session
Agent --> Session
CLI --> Core
API --> Core
MCP --> Core

Session -->|Store Messages| FS
Session -->|Trigger Extraction| Extractor

Extractor -->|Analyze Content| LLM
Extractor -->|Store Memories| FS

Indexer -->|Watch Changes| FS
Indexer -->|Generate Embeddings| Embed
Indexer -->|Index Vectors| Qdrant

Search -->|Query Embedding| Embed
Search -->|Vector Search| Qdrant
Search -->|Retrieve Content| FS
```

## Memory Architecture

Cortex Memory organizes data using a **virtual filesystem** approach with the `cortex://` URI scheme:

```
cortex://{dimension}/{scope}/{category}/{id}
```

- **Dimension**: `user`, `agent`, `session`, or `resources`
- **Scope**: Tenant or identifier
- **Category**: `memories`, `profiles`, `entities`, `sessions`, etc.
- **ID**: Unique memory identifier

## Three-Tier Memory Hierarchy

Cortex Memory implements a **progressive disclosure** system with three abstraction layers:

| Layer | Purpose | Token Usage | Use Case |
|-------|---------|-------------|----------|
| **L0 (Abstract)** | Fast positioning, coarse-grained candidate selection | ~100 tokens | Initial screening (20% weight) |
| **L1 (Overview)** | Structured summary with key points and entities | ~500-2000 tokens | Context refinement (30% weight) |
| **L2 (Detail)** | Full conversation content | Variable | Precise matching (50% weight) |

This tiered approach optimizes LLM context window usage by loading only the necessary detail level. The search engine uses **weighted scoring** combining all three layers `L0/L1/L2`.
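
As a rough illustration of the weighted scoring described above, the Python sketch below combines per-layer similarity scores with the 20% / 30% / 50% weights from the table. The linear sum, the function names, and the example URIs are assumptions; the README states the weights but not the exact combination formula.

```python
# Layer weights taken from the table above.
L0_WEIGHT, L1_WEIGHT, L2_WEIGHT = 0.2, 0.3, 0.5


def combined_score(l0: float, l1: float, l2: float) -> float:
    """Weighted sum of the three per-layer similarity scores (assumed linear)."""
    return L0_WEIGHT * l0 + L1_WEIGHT * l1 + L2_WEIGHT * l2


def rank(candidates: dict, min_score: float = 0.4) -> list:
    """Score every candidate, drop those below min_score, best first."""
    scored = [(uri, combined_score(*sims)) for uri, sims in candidates.items()]
    kept = [item for item in scored if item[1] >= min_score]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Hypothetical URIs with (L0, L1, L2) similarities for one query.
candidates = {
    "cortex://user/preferences/editor.md": (0.9, 0.8, 0.6),
    "cortex://agent/cases/case-1.md": (0.5, 0.4, 0.2),
}
for uri, score in rank(candidates):
    print(f"{score:.2f} {uri}")
```

The first candidate scores 0.2·0.9 + 0.3·0.8 + 0.5·0.6 = 0.72 and is kept; the second scores 0.32 and falls below the 0.4 cutoff, which mirrors the CLI's default `--min-score`.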

# 🌐 The Cortex Memory Ecosystem

Cortex Memory is a modular system composed of several crates, each with a specific purpose. This design provides flexibility and separation of concerns.

```mermaid
graph TD
subgraph "User Interfaces"
CLI["cortex-mem-cli<br/>Terminal Interface"]
Insights["cortex-mem-insights<br/>Web Dashboard"]
end

subgraph "APIs & Integrations"
Service["cortex-mem-service<br/>REST API Server"]
MCP["cortex-mem-mcp<br/>MCP Server"]
Rig["cortex-mem-rig<br/>Rig Framework"]
end

subgraph "Core Engine"
Core["cortex-mem-core<br/>Business Logic"]
Tools["cortex-mem-tools<br/>Agent Tools"]
end

subgraph "External Services"
VectorDB[("Qdrant<br/>Vector Database")]
LLM[("LLM Provider<br/>OpenAI/Azure/Local")]
end

%% Define Dependencies
Insights -->|REST API| Service

CLI --> Core
Service --> Core
MCP --> Tools
Rig --> Tools
Tools --> Core

Core --> VectorDB
Core --> LLM
```

- `cortex-mem-core`: The heart of the system. Contains business logic for filesystem abstraction (`cortex://` URI), LLM client wrappers, embedding generation, Qdrant integration, session management, layer generation (L0/L1/L2), extraction engine, search engine, automation orchestrator, and incremental update system (`MemoryEventCoordinator`, `CascadeLayerUpdater`, `LlmResultCache`, `IncrementalMemoryUpdater`) as well as forgetting mechanism (`MemoryCleanupService`).
- `cortex-mem-service`: High-performance REST API server (Axum-based) exposing all memory operations via `/api/v2/*` endpoints. Runs on port 8085 by default.
- `cortex-mem-cli`: Command-line tool (`cortex-mem` binary) for developers and administrators to interact with the memory store directly.
- `cortex-mem-insights`: Pure frontend Svelte 5 SPA for monitoring, analytics, and memory management through a web interface.
- `cortex-mem-mcp`: Model Context Protocol server for integration with AI assistants (Claude Desktop, Cursor, etc.).
- `cortex-mem-rig`: Integration layer with the rig-core agent framework for tool registration.
- `cortex-mem-tools`: MCP tool schemas and operation wrappers for agent integration.
- `cortex-mem-config`: Configuration management module handling TOML loading, environment variable resolution, and tenant-specific overrides.
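
The `LlmResultCache` listed under `cortex-mem-core` is described elsewhere in this README as using LRU eviction and TTL expiration. The Python sketch below shows that general technique in a few lines; it is not the Rust implementation, and the class name, constructor parameters, and injectable clock are assumptions made so the example is easy to test.

```python
import time
from collections import OrderedDict


class LruTtlCache:
    """Tiny cache with least-recently-used eviction and per-entry expiry."""

    def __init__(self, capacity: int, ttl_seconds: float, clock=time.monotonic):
        self._capacity = capacity
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = OrderedDict()  # key -> (stored_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at > self._ttl:
            del self._entries[key]  # entry outlived its TTL
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._entries[key] = (self._clock(), value)
        self._entries.move_to_end(key)
        while len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict least recently used


# Cache LLM output keyed by prompt so repeated prompts skip the API call.
cache = LruTtlCache(capacity=2, ttl_seconds=300)
cache.put("summarize: thread-123", "User prefers Rust.")
print(cache.get("summarize: thread-123"))  # User prefers Rust.
```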

# 🖼️ Observability Dashboard

Cortex Memory includes a powerful web-based dashboard (`cortex-mem-insights`) that provides real-time monitoring, analytics and management capabilities. The dashboard is a pure frontend Svelte 5 SPA that connects to the `cortex-mem-service` REST API.


Figure: Cortex Memory Dashboard, an interactive view of tenant overview, system health, and storage statistics at a glance.

### Key Features

- **Tenant Management**: View and switch between multiple tenants with isolated memory spaces
- **Memory Browser**: Navigate the `cortex://` filesystem to view and manage memory files
- **Semantic Search**: Perform natural language queries across the memory store
- **Health Monitoring**: Real-time service status and LLM availability checks

### Running the Dashboard

```bash
# Start the backend service first
cortex-mem-service --data-dir ./cortex-data --port 8085

# In another terminal, start the insights dashboard
cd cortex-mem-insights
bun install
bun run dev
```

The dashboard will be available at `http://localhost:5173` and will proxy API requests to the backend service.

# 🦞 Community Showcase: MemClaw

**MemClaw** is a deeply customized memory enhancement plugin for the OpenClaw ecosystem, powered by the locally-running Cortex Memory engine. It delivers superior memory capabilities compared to OpenClaw's built-in memory system, achieving **over 80% token savings** while maintaining exceptional memory accuracy, security, and performance.

## Why MemClaw?

| OpenClaw Native Memory | MemClaw |
|------------------------|---------|
| Basic memory storage | **Three-tier L0/L1/L2 architecture** for intelligent retrieval |
| Higher token consumption | **80%+ token savings** with layered context loading |
| Limited search precision | **Vector search + Agentic VFS exploration** for complex scenarios |

## Key Features

- **🎯 Low Token & Hardware Resource Usage**: Rust-powered high-performance memory components with progressive retrieval for optimal context loading
- **🔒 Complete Data Privacy**: All memories stored locally with zero cloud dependency
- **🚀 One-Click Migration**: Seamlessly migrate from OpenClaw native memory to MemClaw
- **⚙️ Easy Configuration**: Zero runtime dependencies, one-line installation, minimal config to get started

## Available Tools

| Tool | Purpose |
|------|---------|
| `cortex_search` | Semantic search across all memories with tiered retrieval |
| `cortex_recall` | Recall memories with extended context (snippet + full content) |
| `cortex_add_memory` | Store messages for future retrieval |
| `cortex_close_session` | Close session and trigger memory extraction pipeline |
| `cortex_migrate` | One-click migration from OpenClaw native memory |
| `cortex_maintenance` | Periodic maintenance (prune, reindex, layer generation) |

## Quick Start

```bash
# Install via OpenClaw
openclaw plugins install @memclaw/memclaw
```

> **Note**: Set `memorySearch.enabled: false` to disable OpenClaw's built-in memory and use MemClaw instead.

## Documentation

For detailed configuration, troubleshooting, and best practices, see the [MemClaw README](examples/@memclaw/plugin/README.md).

---

# 🌟 Community Showcase: Cortex TARS

Meet **Cortex TARS**, a production-ready, AI-native TUI (Terminal User Interface) application that demonstrates what Cortex Memory can do. Built as a "second brain" companion, Cortex TARS brings **auditory presence** to your AI experience: it can hear and remember your voice in the real world. It shows how persistent memory turns fleeting chats into lasting, intelligent partnerships.

## What Makes Cortex TARS Special?

Cortex TARS is more than just a chatbot. It is a comprehensive AI assistant platform that leverages Cortex Memory's advanced capabilities:

### 🎭 Multi-Agent Management
Create and manage multiple AI personas, each with distinct personalities, system prompts, and specialized knowledge areas. Whether you need a coding assistant, a creative writing partner, or a productivity coach, Cortex TARS lets you run them all simultaneously with complete separation.

### 💾 Persistent Role Memory
Every agent maintains its own long-term memory, learning from interactions over time. Your coding assistant remembers your coding style and preferences; your writing coach adapts to your voice and goals. No more repeating yourself: each agent grows smarter with every conversation.

### 🔒 Memory Isolation
Advanced memory architecture ensures complete isolation between agents and users. Each agent's knowledge base is separate, preventing cross-contamination while enabling personalized experiences across different contexts and use cases.

### 🎀 Real-Time Audio-to-Memory (The Game Changer)
**This is where Cortex TARS truly shines.** With real-time device audio capture, Cortex TARS can listen to your conversations, meetings, or lectures and automatically convert them into structured, searchable memories. Imagine attending a meeting while Cortex TARS silently captures key insights, decisions, and action items β€” all stored and ready for instant retrieval later. No more frantic note-taking or forgotten details!

## Why Cortex TARS Matters

Cortex TARS isn't just an example. It's a fully functional application that demonstrates:

- **Real-world production readiness**: Built with Rust, it's fast, reliable, and memory-safe
- **Seamless Cortex Memory integration**: Shows best practices for leveraging the memory framework
- **Practical AI workflows**: From multi-agent conversations to audio capture and memory extraction
- **User-centric design**: Beautiful TUI interface with intuitive controls and rich features

## Explore Cortex TARS

Ready to see Cortex Memory in action? Dive into the Cortex TARS project:

```bash
cd examples/cortex-mem-tars
cargo build --release
cargo run --release
```

Check out the [Cortex TARS README](examples/cortex-mem-tars/README.md) for detailed setup instructions, configuration guides, and usage examples.

**Cortex TARS proves that Cortex Memory isn't just a framework: it's the foundation for building intelligent, memory-aware applications that truly understand and remember.**

# 🏆 Benchmark

Cortex Memory has been evaluated on the **LoCoMo10 dataset** (conv-26, 152 questions, 19 conversation sessions spanning May–October 2023) using **LLM-as-a-Judge**, the same methodology used by the official OpenViking evaluation. Cortex Memory scores higher than every OpenViking and OpenClaw configuration listed below; note that those configurations were scored on 1,540 questions, while Cortex Memory was scored on 152.

## Performance Comparison


Figure: Cortex Memory vs. OpenViking and OpenClaw built-in memory benchmark. Overall score: Cortex Memory v5 achieves 68.42%, outperforming all OpenViking and OpenClaw configurations.

### Overall Scores

| System | Score | Questions |
|--------|:-----:|:---------:|
| **Cortex Memory v5 (Intent ON)** | **68.42%** | 152 |
| OpenViking + OpenClaw (−memory-core) | 52.08% | 1,540 |
| OpenViking + OpenClaw (+memory-core) | 51.23% | 1,540 |
| OpenClaw + LanceDB (−memory-core) | 44.55% | 1,540 |
| OpenClaw (built-in memory) | 35.65% | 1,540 |

### Category Breakdown (v5)

| Category | Description | Score |
|:--------:|-------------|:-----:|
| Cat 1 | Factual Recall | 37.50% (12/32) |
| Cat 2 | Temporal Reasoning | 62.16% (23/37) |
| Cat 3 | Commonsense Inference | 76.92% (10/13) |
| Cat 4 | Multi-hop Reasoning | **84.29%** (59/70) |
| **Total** | | **68.42%** (104/152) |
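
The totals in the category table can be re-derived from the per-category counts. This short Python check uses only the numbers shown above; the dictionary layout is just for illustration.

```python
# (correct, total) per category, copied from the table above.
categories = {
    "Cat 1": (12, 32),
    "Cat 2": (23, 37),
    "Cat 3": (10, 13),
    "Cat 4": (59, 70),
}

for name, (correct, total) in categories.items():
    print(f"{name}: {100 * correct / total:.2f}%")

total_correct = sum(correct for correct, _ in categories.values())
total_questions = sum(total for _, total in categories.values())
overall = 100 * total_correct / total_questions
print(f"Total: {overall:.2f}% ({total_correct}/{total_questions})")  # Total: 68.42% (104/152)
```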

### Token Efficiency

| System | Avg Tokens / Question | Score | Score per 1K Tokens |
|--------|:---------------------:|:-----:|:-------------------:|
| **Cortex Memory v5** | **~2,900** | **68.42%** | **23.6** |
| OpenViking + OpenClaw (−memory-core) | ~2,769 | 52.08% | 18.8 |
| OpenViking + OpenClaw (+memory-core) | ~1,363 | 51.23% | 37.6 |
| OpenClaw (built-in memory) | ~15,982 | 35.65% | 2.2 |
| OpenClaw + LanceDB (−memory-core) | ~33,490 | 44.55% | 1.3 |

> Cortex Memory uses about **11× fewer tokens** than OpenClaw + LanceDB and has a roughly **18× better score-per-token** ratio.
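
The "Score per 1K Tokens" column is the score divided by the average token count in thousands. The Python sketch below recomputes it from the table values; the token counts are approximate in the source, so the results are approximate too.

```python
# (average tokens per question, score in percent), copied from the table above.
RESULTS = {
    "Cortex Memory v5": (2900, 68.42),
    "OpenViking + OpenClaw (-memory-core)": (2769, 52.08),
    "OpenViking + OpenClaw (+memory-core)": (1363, 51.23),
    "OpenClaw (built-in memory)": (15982, 35.65),
    "OpenClaw + LanceDB (-memory-core)": (33490, 44.55),
}


def score_per_1k_tokens(tokens: float, score: float) -> float:
    """Score points obtained per 1,000 tokens of context."""
    return score / (tokens / 1000)


for name, (tokens, score) in RESULTS.items():
    print(f"{name}: {score_per_1k_tokens(tokens, score):.1f}")

cortex_tokens, cortex_score = RESULTS["Cortex Memory v5"]
lancedb_tokens, lancedb_score = RESULTS["OpenClaw + LanceDB (-memory-core)"]
token_ratio = lancedb_tokens / cortex_tokens
efficiency_ratio = (score_per_1k_tokens(cortex_tokens, cortex_score)
                    / score_per_1k_tokens(lancedb_tokens, lancedb_score))
print(f"{token_ratio:.1f}x fewer tokens, {efficiency_ratio:.1f}x score per token")
```

From the unrounded values the two ratios come out at about 11.5× and 17.7×. Dividing the rounded table entries (23.6 / 1.3) gives about 18×, which matches the figure quoted above.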

### Key Technical Advantages

- **Intent-Driven Retrieval**: Routing multi-hop queries to entity and relational memory scopes improves Cat 4 accuracy by +18.75pp
- **Hierarchical L0/L1/L2 Architecture**: Precision retrieval starting from ~100-token abstracts, so you only pay for context you actually need
- **Rust-based Implementation**: High-performance, memory-safe core backed by Qdrant vector database

### Evaluation Framework

The benchmark script is located in `examples/locomo-evaluation` and implements a three-phase pipeline:

1. **Ingest**: conversation sessions are ingested into Cortex Memory, with one tenant per sample
2. **QA**: 152 questions are answered via semantic retrieval + LLM generation
3. **Judge**: LLM-as-a-Judge scores each answer as CORRECT / WRONG (binary, identical to the OpenViking methodology)

For more details on running the evaluation, see the [locomo-evaluation README](examples/locomo-evaluation/README.md) and the full results in [`examples/locomo-evaluation/BENCHMARK.md`](examples/locomo-evaluation/BENCHMARK.md).

# 🖥 Getting Started

### Prerequisites
- [**Rust**](https://www.rust-lang.org) (version 1.86 or later)
- [**Qdrant**](https://qdrant.tech/) vector database (version 1.7+)
- An **OpenAI-compatible** LLM API endpoint (for memory extraction and analysis)
- An **OpenAI-compatible** Embedding API endpoint (for vector search)

### Installation
The simplest way to get started is to use the CLI and Service binaries, which can be installed via `cargo`.
```sh
# Install the CLI for command-line management
cargo install --path cortex-mem-cli

# Install the REST API Service for application integration
cargo install --path cortex-mem-service

# Install the MCP server for AI assistant integrations
cargo install --path cortex-mem-mcp
```

### Configuration
Cortex Memory applications (`cortex-mem-cli`, `cortex-mem-service`, `cortex-mem-mcp`) are configured via a `config.toml` file. The CLI will look for this file in the current directory by default, or you can pass a path using the `-c` or `--config` flag.

Here is a sample `config.toml` with explanations:

```toml
# -----------------------------------------------------------------------------
# Qdrant Vector Database Configuration
# -----------------------------------------------------------------------------
[qdrant]
url = "http://localhost:6334" # URL of your Qdrant instance (gRPC port)
http_url = "http://localhost:6333" # HTTP URL for REST API
collection_name = "cortex-memory" # Base name for collections (tenant suffix added)
timeout_secs = 5 # Timeout for Qdrant operations
embedding_dim = 1536 # Embedding dimension (e.g., 1536 for text-embedding-3-small)

# -----------------------------------------------------------------------------
# LLM (Large Language Model) Configuration (for reasoning, extraction)
# -----------------------------------------------------------------------------
[llm]
api_base_url = "https://api.openai.com/v1" # Base URL of your LLM provider
api_key = "${OPENAI_API_KEY}" # API key (supports env variable)
model_efficient = "gpt-5-mini" # Model for extraction and classification
model_reasoning = "o1-preview" # Model for complex reasoning (optional)
temperature = 0.7 # Sampling temperature for LLM responses
max_tokens = 8192 # Max tokens for LLM generation
timeout_secs = 60 # Timeout for LLM requests

# -----------------------------------------------------------------------------
# Embedding Service Configuration
# -----------------------------------------------------------------------------
[embedding]
api_base_url = "https://api.openai.com/v1" # Base URL of your embedding provider
api_key = "${OPENAI_API_KEY}" # API key (supports env variable)
model_name = "text-embedding-3-small" # Name of the embedding model to use
batch_size = 32 # Number of texts to embed in a single batch
timeout_secs = 30 # Timeout for embedding requests

# -----------------------------------------------------------------------------
# Cortex Data Directory Configuration
# -----------------------------------------------------------------------------
[cortex]
data_dir = "./cortex-data" # Directory for storing memory files and sessions
```

# 🚀 Usage

### CLI (`cortex-mem-cli`)

The CLI provides a powerful interface for direct interaction with the memory system. All commands require a `config.toml` file, which can be specified with the `--config` flag. The `--tenant` flag selects the tenant, providing multi-tenant isolation.

#### Add a Memory
Adds a new message to a session thread, automatically storing it in the memory system.

```sh
cortex-mem --config config.toml --tenant acme add --thread thread-123 --role user "The user is interested in Rust programming."
```
- `--thread`: (Required) The thread/session ID.
- `--role`: Message role (user/assistant/system). Default: "user"
- `content`: The text content of the message (positional argument).

#### Search for Memories
Performs a semantic vector search across the memory store with weighted L0/L1/L2 scoring.

```sh
cortex-mem --config config.toml --tenant acme search "what are the user's hobbies?" --thread thread-123 --limit 10
```
- `query`: The natural language query for the search.
- `--thread`: Filter memories by thread ID.
- `--limit` / `-n`: Maximum number of results. Default: 10
- `--min-score` / `-s`: Minimum relevance score (0.0-1.0). Default: 0.4
- `--scope`: Search scope: "session", "user", or "agent". Default: "session"

#### List Memories
Retrieves a list of memories from a specific URI path.

```sh
cortex-mem --config config.toml --tenant acme list --uri "cortex://session" --include-abstracts
```
- `--uri` / `-u`: URI path to list (e.g., "cortex://session" or "cortex://user/preferences"). Default: `cortex://session`
- `--include-abstracts`: Include L0 abstracts in results.

#### Get a Specific Memory
Retrieves a specific memory by its URI.

```sh
cortex-mem --config config.toml --tenant acme get "cortex://session/thread-123/memory-456.md"
```
- `uri`: The memory URI.
- `--abstract-only` / `-a`: Show L0 abstract instead of full content.
- `--overview` / `-o`: Show L1 overview instead of full content.

#### Delete a Memory
Removes a memory from the store by its URI.

```sh
cortex-mem --config config.toml --tenant acme delete "cortex://session/thread-123/memory-456.md"
```

#### Session Management
Manage conversation sessions.

```sh
# List all sessions
cortex-mem --config config.toml --tenant acme session list

# Create a new session
cortex-mem --config config.toml --tenant acme session create thread-456 --title "My Session"

# Close a session (triggers extraction, layer generation, and vector indexing)
cortex-mem --config config.toml --tenant acme session close thread-456
```

#### Layers and Stats
Manage layer files and display system statistics.

```sh
# Display system statistics
cortex-mem --config config.toml --tenant acme stats

# List available tenants
cortex-mem --config config.toml tenant list

# Show L0/L1 layer file coverage status
cortex-mem --config config.toml --tenant acme layers status

# Generate missing L0/L1 layer files
cortex-mem --config config.toml --tenant acme layers ensure-all

# Regenerate oversized L0 abstract files (> 2K characters)
cortex-mem --config config.toml --tenant acme layers regenerate-oversized
```

### REST API (`cortex-mem-service`)

The REST API allows you to integrate Cortex Memory into any application, regardless of the programming language. The service runs on port 8085 by default.

#### Starting the Service
```sh
# Start the API server with default settings (port 8085)
cortex-mem-service --config config.toml --host 127.0.0.1 --port 8085

# Enable verbose logging
cortex-mem-service --config config.toml -h 127.0.0.1 -p 8085 --verbose
```

#### API Endpoints

**Health Check**
- `GET /health`: Service liveness check
- `GET /health/ready`: Readiness check (Qdrant, LLM connectivity)

**Filesystem Operations**
- `GET /api/v2/filesystem/list?uri=`: List directory contents.
- `GET /api/v2/filesystem/read/`: Read file content.
- `POST /api/v2/filesystem/write`: Write content to a file.
- `GET /api/v2/filesystem/stats?uri=`: Get directory statistics.

**Session Management**
- `GET /api/v2/sessions`: List all sessions.
- `POST /api/v2/sessions`: Create a new session.
- `POST /api/v2/sessions/:thread_id/messages`: Add a message to a session.
- `POST /api/v2/sessions/:thread_id/close`: Close a session and trigger memory extraction.

**Semantic Search**
- `POST /api/v2/search`: Perform semantic search across memories with weighted L0/L1/L2 scoring.

**Automation**
- `POST /api/v2/automation/extract/:thread_id`: Trigger memory extraction for a thread.
- `POST /api/v2/automation/index/:thread_id`: Trigger vector indexing for a thread.
- `POST /api/v2/automation/index-all`: Index all threads.
- `POST /api/v2/automation/sync`: Manually trigger synchronization between filesystem and vector store.

**Tenant Management**
- `GET /api/v2/tenants/tenants`: List all available tenants.
- `POST /api/v2/tenants/tenants/switch`: Switch active tenant context.
- `GET /api/v2/tenants/{id}/stats`: Get per-tenant storage metrics.

#### Example: Create a Session and Add Message

```bash
# Create a new session
curl -X POST http://localhost:8085/api/v2/sessions \
-H "Content-Type: application/json" \
-d '{
"thread_id": "thread-123",
"title": "Support Conversation"
}'

# Add a message to the session
curl -X POST http://localhost:8085/api/v2/sessions/thread-123/messages \
-H "Content-Type: application/json" \
-d '{
"role": "user",
"content": "I just upgraded to the premium plan."
}'
```

#### Example: Semantic Search

```bash
curl -X POST http://localhost:8085/api/v2/search \
-H "Content-Type: application/json" \
-H "X-Tenant-ID: acme" \
-d '{
"query": "What is the user'\''s current subscription?",
"thread": "thread-123",
"scope": "session",
"limit": 5,
"min_score": 0.5
}'
```
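
The same search call can be made from application code. The Python sketch below builds the request with the standard library only; the endpoint, header, and JSON fields are taken from the curl example above, while the function names and the assumption that the response body is JSON are mine. Building the body with `json.dumps` also sidesteps the shell quoting needed for the apostrophe in the query.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8085"  # default service port from this README


def build_search_request(query, tenant, thread=None, scope="session",
                         limit=5, min_score=0.5):
    """Build (but do not send) a POST /api/v2/search request."""
    payload = {"query": query, "scope": scope, "limit": limit, "min_score": min_score}
    if thread is not None:
        payload["thread"] = thread
    return urllib.request.Request(
        f"{BASE_URL}/api/v2/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Tenant-ID": tenant},
        method="POST",
    )


def search(query, tenant, **options):
    """Send the request and return the decoded JSON response (shape not assumed)."""
    request = build_search_request(query, tenant, **options)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


request = build_search_request(
    "What is the user's current subscription?", tenant="acme", thread="thread-123"
)
print(request.get_method(), request.full_url)
# With the service running: results = search("...", tenant="acme", thread="thread-123")
```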

#### Example: Trigger Memory Extraction

```bash
# Extract memories from a session (typically called when session is closed)
curl -X POST http://localhost:8085/api/v2/automation/extract/thread-123 \
-H "Content-Type: application/json" \
-d '{ "auto_save": true }'
```

### Model Context Protocol (MCP) Server (`cortex-mem-mcp`)

Cortex Memory provides an MCP server for integration with AI assistants like Claude Desktop, Cursor, or GitHub Copilot. The MCP server exposes memory tools through the stdio transport.

```sh
# Run the MCP server with configuration
cortex-mem-mcp --config config.toml --tenant acme
```

The MCP server exposes the following tools:
- **store_memory**: Store new facts or conversation summaries
- **query_memory**: Search memory with natural language
- **list_memories**: Enumerate available memories by URI prefix
- **get_memory**: Retrieve a specific memory by URI
- **delete_memory**: Remove a memory by URI

Configure your AI assistant to use the MCP server by adding it to the assistant's MCP configuration.
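
As a hedged illustration, the Python snippet below prints a configuration entry in the `mcpServers` layout used by Claude Desktop's `claude_desktop_config.json`. The command and flags come from the invocation shown above; the server name `cortex-mem` and the helper function are hypothetical, and other assistants may expect a different file or format, so check your assistant's documentation.

```python
import json


def mcp_server_entry(config_path: str, tenant: str) -> dict:
    """Build an mcpServers entry that launches cortex-mem-mcp over stdio."""
    return {
        "mcpServers": {
            "cortex-mem": {  # hypothetical server name; any label works
                "command": "cortex-mem-mcp",
                "args": ["--config", config_path, "--tenant", tenant],
            }
        }
    }


entry = mcp_server_entry("config.toml", "acme")
print(json.dumps(entry, indent=2))
```

In practice an absolute path to `config.toml` is likely safer, because the assistant may start the server from a different working directory.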

# 🀝 Contribute
We welcome all forms of contributions! Report bugs or submit feature requests through [GitHub Issues](https://github.com/sopaco/cortex-mem/issues).

### Development Process
1. Fork this project
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add some amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Create a Pull Request

# 🪪 License
This project is licensed under the **MIT License**. See the [LICENSE](LICENSE) file for details.