{"id":47918680,"url":"https://github.com/sqliteai/sqlite-memory","last_synced_at":"2026-04-22T07:01:00.907Z","repository":{"id":343415620,"uuid":"1145963944","full_name":"sqliteai/sqlite-memory","owner":"sqliteai","description":"Markdown based AI agent memory with semantic search, hybrid retrieval, and offline-first sync between agents.","archived":false,"fork":false,"pushed_at":"2026-04-20T08:08:43.000Z","size":4948,"stargazers_count":36,"open_issues_count":0,"forks_count":1,"subscribers_count":3,"default_branch":"main","last_synced_at":"2026-04-20T09:35:22.370Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sqliteai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-01-30T12:52:12.000Z","updated_at":"2026-04-19T10:05:04.000Z","dependencies_parsed_at":null,"dependency_job_id":"274ef519-0ab2-4043-a19e-00023cddf53a","html_url":"https://github.com/sqliteai/sqlite-memory","commit_stats":null,"previous_names":["sqliteai/sqlite-memory"],"tags_count":11,"template":false,"template_full_name":null,"purl":"pkg:github/sqliteai/sqlite-memory","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sqliteai%2Fsqlite-memory","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sqliteai%2Fsqlite-memory/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sqliteai%2Fsqlite-memory/releases","manife
sts_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sqliteai%2Fsqlite-memory/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sqliteai","download_url":"https://codeload.github.com/sqliteai/sqlite-memory/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sqliteai%2Fsqlite-memory/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32125094,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-22T00:31:26.853Z","status":"online","status_checked_at":"2026-04-22T02:00:05.693Z","response_time":58,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-04-04T05:49:46.414Z","updated_at":"2026-04-22T07:01:00.893Z","avatar_url":"https://github.com/sqliteai.png","language":"C","readme":"# SQLite Memory\n\nA SQLite extension that gives AI agents persistent, searchable memory, optimized for markdown content. Features hybrid semantic search (vector similarity + FTS5), markdown-aware chunking, and local embedding via llama.cpp.\n\nAgent memory databases can be synchronized between agents using **offline-first technology** via [sqlite-sync](https://github.com/sqliteai/sqlite-sync). 
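
The offline-first idea can be sketched in plain Python. This is a toy last-writer-wins merge over keyed entries, only an illustration of why independent agents can converge after exchanging changes; the real extension uses a block-level CRDT via sqlite-sync, and none of the names below are part of its API:

```python
import time

# Toy model: each agent keeps a local dict of entries, each stamped with a
# logical timestamp. Merging keeps the newer write per key (last-writer-wins).
# Illustrative only: sqlite-sync implements a block-level CRDT, not this scheme.

def write(store, key, value, ts):
    store[key] = (value, ts)

def merge(local, remote):
    # Pull remote entries, keeping whichever write is newer per key.
    for key, (value, ts) in remote.items():
        if key not in local or local[key][1] < ts:
            local[key] = (value, ts)

agent_a, agent_b = {}, {}
write(agent_a, 'telescope', 'JWST notes', ts=1)            # A works offline
write(agent_b, 'reef', 'Great Barrier Reef notes', ts=2)   # B works offline

# When connectivity returns, changes are exchanged in both directions:
merge(agent_a, agent_b)
merge(agent_b, agent_a)

assert agent_a.keys() == agent_b.keys() == {'telescope', 'reef'}
```

After both merges each agent holds the full corpus, including entries it never wrote itself, which is the property the sync test in `test/sync/` demonstrates with real databases.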
Each agent works independently and syncs when connected, making it ideal for distributed AI systems, edge deployments, and collaborative agent architectures.\n\n## The Future of AI Agent Memory\n\nModern AI agents need persistent, searchable memory to maintain context across conversations and tasks. Inspired by [OpenClaw's memory architecture](https://docs.openclaw.ai/concepts/memory), sqlite-memory implements what we believe will become the de facto standard for AI agent memory systems: **markdown files as the source of truth**.\n\nIn this paradigm:\n- **Markdown files** serve as human-readable, version-controllable knowledge bases\n- **Embeddings** enable semantic understanding and retrieval\n- **Hybrid search** combines the precision of full-text search with the intelligence of vector similarity\n\nsqlite-memory bridges these concepts, allowing any SQLite-powered application to ingest, store, and semantically search over knowledge bases.\n\n## Why sqlite-memory?\n\n### For AI Agent Developers\n\n- **Persistent Memory**: Give your agents long-term memory that survives restarts\n- **Semantic Recall**: Retrieve relevant context based on meaning, not just keywords\n- **Context Isolation**: Organize memories by context (projects, conversations, topics)\n- **Local-First**: Run entirely on-device with local embedding models - no API costs, no latency, no data leaving your system\n\n### For Application Developers\n\n- **Zero Infrastructure**: No vector database servers to deploy - it's just SQLite\n- **Single File**: Your entire knowledge base lives in one portable `.db` file\n- **SQL Interface**: Query your semantic memory using familiar SQL\n- **Embeddable**: Works anywhere SQLite works - mobile, desktop, edge, WASM\n\n### Technical Advantages\n\n- **Hybrid Search**: Combines vector similarity (cosine distance) with FTS5 full-text search for superior retrieval\n- **Smart Chunking**: Markdown-aware parsing preserves semantic boundaries\n- **Intelligent Sync**: 
Content-hash change detection skips unchanged files, atomically replaces modified ones, and cleans up deleted ones\n- **Transactional Safety**: Text/file ingests run inside SAVEPOINT transactions, and directory sync uses transactional cleanup plus per-file transactional updates so failed files do not leave partial rows behind\n- **Efficient Storage**: Binary embeddings with configurable dimensions\n- **Embedding Cache**: Automatically caches computed embeddings, so re-indexing the same text skips redundant API calls and computation\n- **Flexible Embedding**: Use local models (llama.cpp) or [vectors.space](https://vectors.space) remote API\n\n## Architecture\n\n```\n┌─────────────────────────────────────────────────────────────┐\n│                     Your Application                        │\n├─────────────────────────────────────────────────────────────┤\n│                      sqlite-memory                          │\n│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │\n│  │   Parser    │  │  Embedding  │  │   Hybrid Search     │  │\n│  │  (md4c)     │  │ (llama.cpp) │  │ (vector + FTS5)     │  │\n│  └─────────────┘  └─────────────┘  └─────────────────────┘  │\n├─────────────────────────────────────────────────────────────┤\n│                   sqlite-vector                             │\n├─────────────────────────────────────────────────────────────┤\n│                      SQLite                                 │\n└─────────────────────────────────────────────────────────────┘\n```\n\n## Getting Started\n\n\u003e [!IMPORTANT]\n\u003e Databases created with sqlite-memory versions earlier than `1.0.0` must be rebuilt before use with `1.0.0+`, because the internal schema changed.\n\n### Prerequisites\n\n- SQLite\n- [sqlite-vector](https://github.com/sqliteai/sqlite-vector) extension\n- [sqlite-sync](https://github.com/sqliteai/sqlite-sync) extension (optional, only needed for agent sync)\n- **For local embeddings**: A GGUF embedding model (e.g., 
[nomic-embed-text](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF))\n- **For remote embeddings**: A free API key from [vectors.space](https://vectors.space)\n\n### Quick Start\n\n```sql\n-- Load extensions (sync is optional)\n.load ./vector\n.load ./cloudsync\n.load ./memory\n\n-- Configure embedding model (choose one):\n\n-- Option 1: Local embedding with llama.cpp (no internet required)\nSELECT memory_set_model('local', '/path/to/nomic-embed-text-v1.5.Q8_0.gguf');\n\n-- Option 2: Remote embedding via vectors.space (requires free API key from https://vectors.space)\n-- The provider name 'openai' selects the vectors.space OpenAI-compatible endpoint.\n-- SELECT memory_set_apikey('your-vectorspace-api-key');\n-- SELECT memory_set_model('openai', 'text-embedding-3-small');\n\n-- Add some knowledge\nSELECT memory_add_text('SQLite is a C-language library that implements a small, fast,\nself-contained, high-reliability, full-featured, SQL database engine. SQLite is the\nmost used database engine in the world.', 'sqlite-docs');\n\nSELECT memory_add_text('Vector databases store data as high-dimensional vectors,\nenabling similarity search. They are essential for semantic search, recommendation\nsystems, and AI applications.', 'concepts');\n\n-- Add an entire documentation directory\nSELECT memory_add_directory('/path/to/docs', 'project-docs');\n\n-- Search your memory semantically\nSELECT path, snippet, ranking\nFROM memory_search\nWHERE query = 'how do databases store information efficiently';\n\n-- Results ranked by semantic similarity + keyword matching\n-- ┌──────────────┬─────────────────────────────────────┬─────────┐\n-- │     path     │               snippet               │ ranking │\n-- ├──────────────┼─────────────────────────────────────┼─────────┤\n-- │ (uuid)       │ SQLite is a C-language library...   │ 0.89    │\n-- │ (uuid)       │ Vector databases store data as...   
│ 0.82    │\n-- └──────────────┴─────────────────────────────────────┴─────────┘\n```\n\n### Example: Building an AI Agent with Memory\n\n```python\nimport sqlite3\n\n# Connect to your memory database\nconn = sqlite3.connect('agent_memory.db')\nconn.enable_load_extension(True)\nconn.load_extension('./vector')\nconn.load_extension('./memory')\n\n# One-time setup\nconn.execute(\"SELECT memory_set_model('local', './models/nomic-embed-text-v1.5.Q8_0.gguf')\")\n\n# Store conversation context\ndef remember(content, context=\"conversation\"):\n    conn.execute(\"SELECT memory_add_text(?, ?)\", (content, context))\n    conn.commit()\n\n# Retrieve relevant memories\ndef recall(query, min_score=0.7):\n    cursor = conn.execute(\"\"\"\n        SELECT snippet, ranking FROM memory_search\n        WHERE query = ? AND ranking \u003e ?\n        ORDER BY ranking DESC\n    \"\"\", (query, min_score))\n    return cursor.fetchall()\n\n# Use in your agent\nremember(\"User prefers concise responses and uses Python primarily.\")\nremember(\"Project deadline is March 15th, focusing on API integration.\")\n\n# Later, when the user asks about the project...\nmemories = recall(\"what's the project timeline\")\n# Returns relevant context about March 15th deadline\n```\n\n## Intelligent Sync\n\nAll `memory_add_*` functions use content-hash change detection to avoid redundant work:\n\n- **`memory_add_text`**: Computes a hash of the content. If the same content was already indexed, it is skipped entirely. No duplicate embeddings are ever created.\n- **`memory_add_file`**: Reads the file and hashes its content. If the file was previously indexed with different content, the old entry (chunks, embeddings, FTS) is atomically replaced. Unchanged files are skipped.\n- **`memory_add_directory`**: Performs a full two-phase sync:\n  1. **Cleanup**: Removes database entries for files that no longer exist on disk\n  2. 
**Scan**: Recursively processes all matching files - adding new ones, replacing modified ones, and skipping unchanged ones\n\n`memory_add_text()` and `memory_add_file()` each run inside a SQLite SAVEPOINT transaction. `memory_add_directory()` performs its cleanup pass transactionally and then processes each file in its own transaction. If one file fails, that file rolls back cleanly and previously-committed files remain valid; there are no partially-indexed rows or orphaned chunk/FTS entries for the failed file.\n\nThis makes all sync functions safe to call repeatedly - for example, on a cron schedule or at agent startup - with minimal overhead.\n\n## Agent Memory Sync\n\nMultiple agents can share and merge knowledge without any coordination. Each agent works independently with its own local SQLite database, syncing through a shared [SQLiteCloud](https://sqlitecloud.io/) managed database when connectivity is available.\n\nEnable sync on a database connection before ingesting content:\n\n```sql\n-- Load the sqlite-sync extension\nSELECT load_extension('./cloudsync');\n\n-- Enable CRDT sync (optionally scoped to a specific context)\nSELECT memory_enable_sync();               -- sync all memory\nSELECT memory_enable_sync('project-x');   -- sync only the 'project-x' context\n\n-- Connect to the shared cloud database\nSELECT cloudsync_network_init('your-managed-database-id');\nSELECT cloudsync_network_set_apikey('your-api-key');\n\n-- Ingest content normally — CRDT tracks every write\nSELECT memory_add_text('Agent A findings...', 'research');\n\n-- Push local changes and pull remote ones (call twice for full bidirectional exchange)\nSELECT cloudsync_network_sync(500, 3);\nSELECT cloudsync_network_sync(500, 3);\n\n-- Generate embeddings for any content received from other agents\nSELECT memory_reindex();\n```\n\nEach piece of text added to the database is parsed into chunks and tracked by a [block-level LWW CRDT 
algorithm](https://github.com/sqliteai/sqlite-sync?tab=readme-ov-file#block-level-lww), which merges line-level changes from concurrent agents without conflicts. Only the `dbmem_content` table is synced — embeddings are always generated locally after receiving new content.\n\n### Why This Matters for AI Systems\n\nThe combination of local-first memory and CRDT sync enables agent architectures that are not possible with centralized databases:\n\n- **No single point of failure** — each agent has a complete, queryable copy of shared memory\n- **Offline-capable** — agents ingest and search without network access; sync catches up when connectivity returns\n- **Selective sharing** — `memory_enable_sync('context')` limits sync to a named context, so agents can keep private memory separate from shared memory\n- **Scales to many agents** — agents running on different nodes accumulate knowledge in parallel and merge into a single consistent corpus without coordination\n\n### Working Example\n\n[`test/sync/`](test/sync/) contains a full integration test that walks through the entire flow:\n\n- Agent A indexes knowledge about the James Webb Space Telescope\n- Agent B indexes knowledge about the Great Barrier Reef\n- After sync, **both agents can answer questions about both topics** — knowledge each agent never directly indexed\n\nSee [`test/sync/README.md`](test/sync/README.md) for setup instructions, SQLiteCloud account configuration, and how to run the test.\n\n## Use Cases\n\n- **AI Assistants**: Maintain conversation history and user preferences\n- **Documentation Search**: Semantic search over markdown documentation\n- **Knowledge Bases**: Build searchable knowledge repositories\n- **Note-Taking Apps**: Find notes by meaning, not just keywords\n- **Code Understanding**: Index and search code documentation\n- **Personal Memory**: Store and retrieve personal knowledge\n\n## Configuration\n\nTune the memory system for your needs:\n\n```sql\n-- Chunking parameters\nSELECT 
memory_set_option('max_tokens', 512);      -- Tokens per chunk\nSELECT memory_set_option('overlay_tokens', 100);  -- Overlap between chunks\n\n-- Search behavior\nSELECT memory_set_option('max_results', 30);      -- Max search results\nSELECT memory_set_option('min_score', 0.75);      -- Score threshold\nSELECT memory_set_option('vector_weight', 0.6);   -- Vector vs FTS balance\nSELECT memory_set_option('text_weight', 0.4);\nSELECT memory_set_option('search_oversample', 4); -- Fetch 4x candidates before merging\n\n-- File processing\nSELECT memory_set_option('extensions', 'md,txt,rst');  -- File types to index\n\n-- Embedding cache (enabled by default)\nSELECT memory_set_option('embedding_cache', 0);        -- Disable cache\nSELECT memory_set_option('cache_max_entries', 10000);  -- Limit cache size (0 = no limit)\nSELECT memory_cache_clear();                           -- Clear cached embeddings\n```\n\n## Memory Management\n\n```sql\n-- View all memories\nSELECT hash, path, context, datetime(created_at, 'unixepoch', 'localtime') as created\nFROM dbmem_content;\n\n-- Delete by context\nSELECT memory_delete_context('old-project');\n\n-- Delete specific memory by hash\nSELECT memory_delete('9e3779b97f4a7c15');\n\n-- Clear all memories\nSELECT memory_clear();\n```\n\n## Documentation\n\nFor complete API documentation, including all functions and configuration options, see **[API.md](API.md)**.\n\n## Building\n\n```bash\n# Clone with submodules\ngit clone --recursive https://github.com/sqliteai/sqlite-memory.git\ncd sqlite-memory\n\n# Build (full build with local + remote engines)\nmake\n\n# Run parser/core unit tests + extension loading smoke test\nmake test\n\n# Run the full SQL extension unit suite\nmake test DEFINES=\"-DTEST_SQLITE_EXTENSION\"\n```\n\n### Build Configurations\n\n| Command | Local Engine | Remote Engine | File I/O |\n|---------|:------------:|:-------------:|:--------:|\n| `make` | ✓ | ✓ | ✓ |\n| `make local` | ✓ | ✗ | ✓ |\n| `make remote` | ✗ | ✓ | 
✓ |\n| `make wasm` | ✗ | ✓ | ✗ |\n\n- **Local Engine**: Built-in llama.cpp for on-device embeddings (requires GGUF model)\n- **Remote Engine**: [vectors.space](https://vectors.space) API for cloud embeddings (requires free API key)\n- **File I/O**: `memory_add_file` and `memory_add_directory` functions\n\nYou can also combine options manually:\n\n```bash\n# Custom build with specific options\nmake OMIT_LOCAL_ENGINE=1 OMIT_REMOTE_ENGINE=0 OMIT_IO=0\n```\n\n---\n\n## License\n\nMIT License - see [LICENSE](LICENSE) for details.\n\n---\n\n## Part of the SQLite AI Ecosystem\n\nThis project is part of the **SQLite AI** ecosystem, a collection of extensions that bring modern AI capabilities to the world's most widely deployed database. The goal is to make SQLite the default data and inference engine for Edge AI applications.\n\nOther projects in the ecosystem include:\n\n- **[SQLite-AI](https://github.com/sqliteai/sqlite-ai)** - On-device inference and embedding generation directly inside SQLite.\n- **[SQLite-Memory](https://github.com/sqliteai/sqlite-memory)** - Markdown-based AI agent memory with semantic search.\n- **[SQLite-Vector](https://github.com/sqliteai/sqlite-vector)** - Ultra-efficient vector search for embeddings stored as BLOBs in standard SQLite tables.\n- **[SQLite-Sync](https://github.com/sqliteai/sqlite-sync)** - Local-first CRDT-based synchronization for seamless, conflict-free data sync and real-time collaboration across devices.\n- **[SQLite-Agent](https://github.com/sqliteai/sqlite-agent)** - Run autonomous AI agents directly from within SQLite databases.\n- **[SQLite-MCP](https://github.com/sqliteai/sqlite-mcp)** - Connect SQLite databases to MCP servers and invoke their tools.\n- **[SQLite-JS](https://github.com/sqliteai/sqlite-js)** - Create custom SQLite functions using JavaScript.\n- **[Liteparser](https://github.com/sqliteai/liteparser)** - A highly efficient and fully compliant SQLite SQL parser.\n\nLearn more at **[SQLite 
AI](https://sqlite.ai)**.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsqliteai%2Fsqlite-memory","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsqliteai%2Fsqlite-memory","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsqliteai%2Fsqlite-memory/lists"}