https://github.com/t8/memoryport

Local-first permanent, persistent memory for all agents and humans
https://github.com/t8/memoryport
agents context-window llms
Last synced: 2 months ago
JSON representation
Local-first permanent, persistent memory for all agents and humans
Host: GitHub
URL: https://github.com/t8/memoryport
Owner: t8
License: apache-2.0
Created: 2026-03-24T21:49:42.000Z (3 months ago)
Default Branch: main
Last Pushed: 2026-03-31T05:08:37.000Z (3 months ago)
Last Synced: 2026-03-31T07:40:59.587Z (3 months ago)
Topics: agents, context-window, llms
Language: Rust
Homepage: https://memoryport.ai
Size: 2 MB
Stars: 72
Watchers: 0
Forks: 4
Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
Awesome Lists containing this project

README

          


  





  Website ·

  Download ·

  AMP Spec



Memoryport gives LLMs persistent, queryable memory using [Arweave](https://arweave.org) for permanent storage and [LanceDB](https://lancedb.com) for local vector search. Every conversation is stored permanently and retrieved semantically — so your AI never forgets.

Works with **Claude Code**, **Cursor**, **Open WebUI**, **Ollama**, and any OpenAI-compatible tool.

## Install

### Desktop App

Download the latest release for your platform:

| Platform | Download |

|----------|----------|

| **macOS** (Apple Silicon & Intel) | [Download .dmg](https://memoryport.ai/api/download?platform=mac) |

| **Linux** (x64) | [Download .AppImage](https://memoryport.ai/api/download?platform=linux) |

The desktop app includes a setup wizard, dashboard, and manages all services automatically.

### CLI

```bash

curl -fsSL https://memoryport.ai/install | sh

```

Then run the setup wizard:

```bash

uc init

```

That's it. Restart your editor — Memoryport auto-captures conversations and surfaces relevant context.

### Build from Source

```bash

# Prerequisites: Rust 1.91+, protoc (brew install protobuf), Node.js 18+, pnpm

cargo build --release

cd ui && pnpm install && pnpm build  # Dashboard

```

## Performance

![Query Latency vs Context Space](tests/scale/performance_chart.png)

**294ms query latency at 500M tokens** — brute force with 100% recall. No approximate indexing needed.

| Context Space | Chunks | p50 Latency |

|---|---|---|

| 100K tokens | 266 | 1ms |

| 1M tokens | 2,666 | 3ms |

| 10M tokens | 26,666 | 9ms |

| 100M tokens | 266,666 | 61ms |

| 500M tokens | 1,333,333 | 294ms |

Tested with `nomic-embed-text` (768d, local via Ollama). Compacted LanceDB, no cloud APIs required.

## Supported Integrations

| Tool | Method | Setup |

|------|--------|-------|

| **Claude Code** | API Proxy | `uc init` configures automatically (sets `ANTHROPIC_BASE_URL`) |

| **Cursor** | API Proxy | Set `ANTHROPIC_BASE_URL=http://127.0.0.1:9191` |

| **Open WebUI** | Ollama Proxy | Set Ollama URL to `http://127.0.0.1:9191` in Settings → Connections |

| **Ollama (terminal)** | Ollama Proxy | `OLLAMA_HOST=http://127.0.0.1:9191 ollama run llama3` |

| **Continue.dev** | Ollama/OpenAI Proxy | Set endpoint to `http://127.0.0.1:9191` |

| **Any OpenAI SDK app** | API Proxy | `OPENAI_BASE_URL=http://127.0.0.1:9191` |

| **Claude Code (MCP)** | MCP Server | `uc init` registers MCP automatically |

| **Cursor (MCP)** | MCP Server | `uc init` registers MCP automatically |

The proxy handles all three API formats on a single port (9191):

- **Anthropic** `/v1/messages`

- **OpenAI** `/v1/chat/completions`

- **Ollama** `/api/chat`, `/api/generate`, `/api/tags`, and all `/api/*` routes

## How It Works

Memoryport supports two retrieval modes, configurable per-request:

### Single-turn (default)

```

User sends a message

  │

  ▼

Proxy intercepts transparently

  │

  ├─ Quality gating (skip greetings, commands, trivial queries)

  ├─ Search memory for relevant context

  ├─ Inject context into the message as plain text

  ├─ Forward to LLM (Anthropic, OpenAI, Ollama)

  ├─ Capture user message + assistant response

  │   ├─ Sanitize (strip system prompts, internal commands)

  │   ├─ Embed and store in LanceDB

  │   └─ Optionally sync to Arweave (permanent storage)

  └─ Return response to user

```

### Multi-turn (agentic retrieval)

```

User sends a message

  │

  ▼

Proxy injects a memory search tool into the request

  │

  ├─ LLM decides what to search for and calls the tool

  ├─ Proxy executes the search, returns results to LLM

  ├─ LLM may search again (up to max_rounds)

  ├─ LLM produces final response with full context

  ├─ Capture and store conversation

  └─ Return response to user

```

Multi-turn lets the model iteratively refine its memory queries — useful for complex questions that need multiple pieces of context. Toggle between modes in the dashboard Settings or via `[proxy.agentic] enabled` in config. See the [AMP specification](https://github.com/t8/amp-spec) for the protocol details.

## Dashboard

The Tauri desktop app includes a full dashboard. For CLI users, run the stack manually:

```bash

./dev.sh start    # Build and start server + proxy + UI

```

**Pages:**

- **Dashboard** — context space, indexed chunks, session browser, semantic search

- **Analytics** — activity sparklines, storage growth, type/source distribution, sync status

- **Integrations** — toggle MCP server, API proxy, Ollama capture on/off with live status

- **Settings** — embedding provider, model, API key, smart gating, encryption, Arweave wallet

## CLI

```bash

uc init                  # Interactive setup wizard

uc store "text" -t knowledge  # Store a chunk

uc query "search term"   # Full retrieval pipeline (gated + reranked + assembled)

uc retrieve "search"     # Raw vector search (bypasses gating)

uc proxy                 # Start the API proxy

uc delete --tx-id    # Logical deletion (destroy encryption key)

uc rebuild-index -u  # Rebuild index from Arweave

uc status                # Index stats

uc flush                 # Flush pending writes

```

## MCP Tools

| Tool | Description |

|------|-------------|

| `uc_auto_store` | Silently store a conversation turn (called automatically) |

| `uc_store` | Store text with explicit metadata |

| `uc_query` | Semantic search with full retrieval pipeline |

| `uc_retrieve` | Raw ranked results |

| `uc_get_session` | Full conversation history for a session |

| `uc_list_sessions` | List all stored sessions |

| `uc_status` | System status |

## Configuration

`~/.memoryport/uc.toml` (created by `uc init`):

```toml

[arweave]

gateway = "https://arweave.net"

turbo_endpoint = "https://upload.ardrive.io"

# wallet_path = "~/.memoryport/wallet.json"

[index]

path = "~/.memoryport/index"

embedding_dimensions = 768

[embeddings]

provider = "ollama"              # or "openai"

model = "nomic-embed-text"

dimensions = 768

[retrieval]

max_context_tokens = 50000

similarity_top_k = 50

recency_window = 20

gating_enabled = true            # Three-gate system: skip greetings, route by embedding, filter low quality

# query_expansion = true         # LLM generates alternative search terms

# hyde = true                    # Embed hypothetical answer instead of raw query

# llm_model = "gpt-4o-mini"

[encryption]

# enabled = true

# passphrase_env = "UC_MASTER_PASSPHRASE"

[proxy]

listen = "127.0.0.1:9191"

```

## Architecture

```

crates/

├── uc-arweave/      # Arweave client (wallet, ANS-104, Turbo, GraphQL)

├── uc-embeddings/   # Embedding + LLM providers (OpenAI, Ollama)

├── uc-core/         # Core engine (chunk, index, retrieve, rerank, assemble, encrypt, gate)

├── uc-cli/          # CLI binary with setup wizard

├── uc-mcp/          # MCP server (stdio, 7 tools, 2 resources)

├── uc-proxy/        # Multi-protocol API proxy (Anthropic + OpenAI + Ollama)

├── uc-server/       # Multi-tenant hosted API server + dashboard

└── uc-tauri/        # Tauri desktop app (macOS, Linux)

ui/                  # React 19 dashboard (Vite + Tailwind)

```

## Security

- All data on Arweave is encrypted with AES-256-GCM (per-batch random keys)

- Master key derived from passphrase via Argon2id

- Logical deletion: destroy batch key → ciphertext permanently unreadable

- Proxy sanitizes system prompts, internal commands, and meta-requests before storage

## Deployment

### Docker

```bash

docker compose up

```

Environment variables:

- `OPENAI_API_KEY` — for embeddings (if using OpenAI)

- `UC_ADMIN_API_KEY` — admin API key for user management

- `UC_SERVER_LISTEN` — listen address (default `0.0.0.0:8080`)

- `UC_SERVER_DATA_DIR` — data directory (default `/var/lib/uc-server`)

### Hosted API

```bash

# Create a user

curl -X POST http://localhost:8080/admin/users \

  -H "Authorization: Bearer $UC_ADMIN_API_KEY" \

  -H "Content-Type: application/json" \

  -d '{"email": "user@example.com"}'

# Returns: { "user_id": "...", "api_key": "uc_..." }

# Store context

curl -X POST http://localhost:8080/v1/store \

  -H "Authorization: Bearer uc_..." \

  -H "Content-Type: application/json" \

  -d '{"text": "Arweave uses pay-once permanent storage", "chunk_type": "knowledge"}'

# Query

curl -X POST http://localhost:8080/v1/query \

  -H "Authorization: Bearer uc_..." \

  -H "Content-Type: application/json" \

  -d '{"query": "How does Arweave pricing work?"}'

```

## Data Recovery

If you lose your local data or set up on a new machine, Pro users can rebuild their memory from Arweave:

1. Install Memoryport on the new machine

2. Open Settings → Arweave Storage

3. Enter your API key

4. Click "Rebuild from Arweave"

All encrypted batches are fetched from the permanent storage network and re-indexed locally. Your encryption key never leaves your machine — data is decrypted client-side during rebuild.

## Benchmarks

### LongMemEval (ICLR 2025)

Evaluated on [LongMemEval](https://github.com/xiaowu0162/LongMemEval), a benchmark for long-term memory in chat assistants. Tests retrieval and answer accuracy on the standard split (`longmemeval_s`) with ~115K token haystacks per question.

**Answer Accuracy** (full 500 questions, gpt-4o reader, gpt-4o-mini judge):

| Category | Accuracy | Session Recall | n |

|----------|----------|----------------|---|

| single-session-assistant | **91.1%** | 87% | 56 |

| single-session-user | **60.0%** | 56% | 70 |

| knowledge-update | **53.3%** | 72% | 78 |

| single-session-preference | **36.7%** | 53% | 30 |

| temporal-reasoning | **27.1%** | 36% | 133 |

| multi-session | **27.1%** | 47% | 133 |

| **Overall** | **43.5%** | **61.1%** | **500** |

Note: the full 500-question run places all questions' haystacks in a shared index (~250K chunks). In production, each user has an isolated index, which gives better retrieval quality — our 100-question runs (isolated context) consistently score 60-63%.

**Session Recall** (48-question oracle split, local embeddings):

| Category | Recall | n |

|----------|--------|---|

| knowledge-update | **100%** | 8 |

| multi-session | **100%** | 8 |

| single-session-user | **100%** | 8 |

| single-session-assistant | **100%** | 8 |

| single-session-preference | **100%** | 8 |

| temporal-reasoning | **87.5%** | 8 |

| **Overall** | **97.9%** | **48** |

Key retrieval improvements validated across 41 experiments:

- Temporal fallback (retry without time filter when too few results)

- Date-enriched embeddings (prepend date to chunks before embedding)

- Date-prefixed retrieve responses (LLMs see explicit dates per chunk)

- Round-level conversation storage (user+assistant pairs as single embeddings)

- Chronological session ordering in assembled context

See `tests/longmemeval/autoresearch/results.tsv` for the full experiment optimization log. Autoresearch framework (`tests/longmemeval/autoresearch/`) enables automated experiment iteration.

### Stress Test (10K chunks)

| Metric | Result |

|--------|--------|

| Insert throughput | 25 chunks/sec (1K) → 13 chunks/sec (10K) |

| Retrieval accuracy | 91% recall@10 across 6 topic categories |

| Query latency (p50) | 265ms (1K chunks), 361ms (oracle dataset) |

| Index size | ~50MB at 10K chunks |

### Three-Gate Retrieval Gating

Prevents unnecessary retrieval on simple messages:

| Gate | What it does | Latency |

|------|-------------|---------|

| Gate 1: Rules | Skip greetings, commands, short queries. Force memory references, temporal queries. | ~0ms |

| Gate 2: Embedding routing | Compare query embedding against "needs retrieval" vs "skip" centroids. | ~0ms (reuses existing embedding) |

| Gate 3: Quality threshold | Drop results below relevance score. | ~0ms (checks existing scores) |

### Proxy Latency Overhead

Measures the latency added by the proxy in each mode using a mock upstream (50ms simulated LLM delay):

| Mode | p50 | p95 | mean | Overhead vs direct |

|------|-----|-----|------|--------------------|

| Direct (no proxy) | 58ms | 61ms | 58ms | — |

| Single-turn (context injection) | 108ms | 120ms | 107ms | +49ms |

| Multi-turn (agentic loop, 1 round) | 139ms | 145ms | 139ms | +81ms |

Single-turn overhead is dominated by embedding + LanceDB search. Multi-turn adds one extra round trip to the upstream for tool execution.

Run benchmarks yourself:

```bash

# LongMemEval session recall (oracle split, fast)

python3 tests/longmemeval/run_benchmark.py --questions 50 --dataset oracle

# LongMemEval answer accuracy (standard split, requires OpenAI API key)

python3 tests/longmemeval/run_answer_accuracy.py --questions 100 --dataset s --answer-model gpt-4o

# Autoresearch optimization loop (iterates experiments overnight)

python3 tests/longmemeval/autoresearch/prepare.py --questions 100

# Stress test

python3 tests/stress/generate.py --chunks 10000

python3 tests/stress/benchmark.py

# Latency benchmark (requires mock upstream + proxy pointed at it)

python3 tests/latency/mock_upstream.py --port 8199 &

python3 tests/latency/benchmark.py --proxy http://127.0.0.1:9292 --mock http://127.0.0.1:8199

```

## Comparison

How Memoryport compares to other AI memory tools:

|  | **Memoryport** | **Supermemory** | **claude-mem** |

|--|---------------|----------------|----------------|

| **Architecture** | Local-first + permanent backup | Cloud SaaS | Local plugin |

| **Language** | Rust | TypeScript | TypeScript |

| **Storage** | LanceDB (local) + Arweave (permanent) | PostgreSQL (their cloud) | SQLite (local only) |

| **Encryption** | AES-256-GCM per-batch, Argon2id key | Delegated to cloud infra | None |

| **Data ownership** | You (local + on-chain) | Them (cloud) | You (local files) |

| **Multi-tool** | Proxy + MCP (Claude Code, Cursor, Ollama, OpenAI) | API-based | Claude Code only |

| **Capture method** | Transparent API proxy (zero-config) | Explicit API calls | Lifecycle hooks |

| **Desktop app** | Signed Tauri native app | None (web only) | Localhost web viewer |

| **Open protocol** | [AMP](https://github.com/t8/amp-spec) | No | No |

| **Self-hosting** | Default (runs locally) | Enterprise only | Default (runs locally) |

| **Scale benchmark** | 500M tokens, 294ms p50 | Not published | Not published |

| **Retrieval accuracy** | 43.5% answer accuracy / 500q, 97.9% session recall (LongMemEval) | 84.6% answer accuracy (LongMemEval, GPT-5) | Not published |

| **Permanent storage** | Arweave (pay once, stored forever) | No | No |

| **License** | Apache-2.0 | MIT | AGPL-3.0 |

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md).

## License

Apache-2.0
ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/t8/memoryport

Awesome Lists containing this project

README