https://github.com/0xrelogic/cognio
Persistent semantic memory server for MCP - Give your AI long-term memory that survives across conversations. Lightweight Python server with SQLite storage and semantic search.
ai-memory chatgpt claude copilot embeddings long-term-memory machine-learning mcp mcp-server model-context-protocol nlp semantic-search sentence-transformers vector-database
- Host: GitHub
- URL: https://github.com/0xrelogic/cognio
- Owner: 0xReLogic
- License: MIT
- Created: 2025-10-05T18:03:43.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2025-10-05T20:23:13.000Z (8 months ago)
- Last Synced: 2025-10-05T20:41:14.469Z (8 months ago)
- Topics: ai-memory, chatgpt, claude, copilot, embeddings, long-term-memory, machine-learning, mcp, mcp-server, model-context-protocol, nlp, semantic-search, sentence-transformers, vector-database
- Language: Python
- Homepage: https://github.com/0xReLogic/Cognio/blob/main/QUICKSTART.md
- Size: 72.3 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
# Cognio
> Persistent semantic memory server for AI assistants via Model Context Protocol (MCP)
Cognio is a Model Context Protocol (MCP) server that provides persistent semantic memory for AI assistants. Unlike ephemeral chat history, Cognio stores context permanently and enables semantic search across conversations.
**Built for:**
- Personal knowledge base that grows over time
- Multi-project context management
- Research notes and learning journal
- Conversation history with semantic retrieval
## Features
- **Semantic Search**: Find memories by meaning using sentence-transformers
- **LEANN Vector Search (Optional)**: Lazy-built index with on-demand recomputation to reduce startup memory
- **Multilingual Support**: Search in 100+ languages seamlessly
- **Persistent Storage**: SQLite-based storage that survives across sessions
- **Project Organization**: Organize memories by project and tags
- **Auto-Tagging**: Automatic tag generation via an LLM API (OpenAI, Groq, etc.)
- **Text Summarization**: Extractive and abstractive summarization for long texts
- **MCP Integration**: One-click setup for VS Code, Claude, Cursor, and more
- **RESTful API**: Standard HTTP API with OpenAPI documentation
- **Export Capabilities**: Export to JSON or Markdown format
- **Docker Support**: Simple deployment with docker-compose
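
To make the semantic search feature concrete, here is a minimal sketch of ranking by cosine similarity. It is not Cognio's implementation: the three-dimensional vectors are hand-made stand-ins for real sentence-transformers embeddings, and `cosine_similarity` and `search` are hypothetical helper names. The `limit` and `threshold` defaults mirror the example `DEFAULT_SEARCH_LIMIT` and `SIMILARITY_THRESHOLD` values from the Configuration section.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def search(query_vec, memories, limit=5, threshold=0.4):
    """Rank (text, vector) pairs by similarity to query_vec.

    Entries scoring below `threshold` are dropped; at most `limit` are returned.
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in memories]
    kept = [item for item in scored if item[0] >= threshold]
    kept.sort(key=lambda item: item[0], reverse=True)
    return kept[:limit]


# Toy vectors (assumption): a real deployment would embed the text instead.
memories = [
    ("Docker allows running apps in containers", [0.9, 0.1, 0.0]),
    ("FastAPI is a modern Python web framework", [0.1, 0.9, 0.1]),
    ("Kubernetes schedules containers", [0.8, 0.2, 0.1]),
]
results = search([1.0, 0.0, 0.0], memories)
for score, text in results:
    print(f"{score:.2f}  {text}")
```

The FastAPI entry falls below the threshold for this query and is filtered out, which is the effect `SIMILARITY_THRESHOLD` is meant to have on unrelated memories.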
## Quick Start
### 1. Start the Server
```bash
git clone https://github.com/0xReLogic/Cognio.git
cd Cognio
docker-compose up -d
```
The server runs at `http://localhost:8080`.
### 2. Auto-Configure AI Clients
The MCP server automatically configures supported AI clients on first start:
**Supported Clients:**
- Claude Desktop
- Claude Code (CLI)
- VS Code (GitHub Copilot)
- Cursor
- Continue.dev
- Cline
- Windsurf
- Kiro
- Gemini CLI
**Quick Setup:**
Run the auto-setup script to configure all clients at once:
```bash
cd mcp-server
npm run setup
```
This generates MCP configs for all 9 supported clients automatically.
**Manual Configuration:**
See [mcp-server/README.md](mcp-server/README.md) for client-specific MCP configuration examples.
On first run, Cognio auto-generates `cognio.md` in your workspace with a usage guide for AI tools.
### 3. Test It
```bash
# Save a memory
curl -X POST http://localhost:8080/memory/save \
  -H "Content-Type: application/json" \
  -d '{"text": "Docker allows running apps in containers", "project": "LEARNING"}'

# Search memories
curl "http://localhost:8080/memory/search?q=containers"
```
Or use it naturally in your AI client:
```
"Search my memories for Docker information"
"Remember this: FastAPI is a modern Python web framework"
```
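
The two curl commands above can also be issued from Python. The sketch below uses only the standard library and builds the same requests: `POST /memory/save` with a JSON body and `GET /memory/search` with a `q` parameter. The helper names (`build_save_request`, `build_search_request`, `send`) are hypothetical, and the assumption that responses are JSON is not confirmed by this README; check the interactive docs for the actual response schema.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8080"  # default address from the Quick Start


def build_save_request(text, project=None):
    """Build the POST /memory/save request (same JSON body as the curl call)."""
    payload = {"text": text}
    if project is not None:
        payload["project"] = project
    return urllib.request.Request(
        f"{BASE_URL}/memory/save",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_search_request(query):
    """Build the GET /memory/search request with a URL-encoded `q` parameter."""
    params = urllib.parse.urlencode({"q": query})
    return urllib.request.Request(f"{BASE_URL}/memory/search?{params}", method="GET")


def send(request, timeout=10):
    """Send a request and decode the body as JSON (assumes a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


save_req = build_save_request("Docker allows running apps in containers", "LEARNING")
search_req = build_search_request("containers")
print(save_req.get_method(), save_req.full_url)
print(search_req.get_method(), search_req.full_url)
# With the server running:
#   print(send(save_req))
#   print(send(search_req))
```

Building the request separately from sending it lets you inspect the URL and body before any network call is made.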
### 4. Web UI Dashboard
Access the interactive memory dashboard:
```
http://localhost:8080/ui
```
Features:
- Browse and search all memories
- Add/edit memories with markdown preview
- View statistics and insights
- Organize by project and tags
- Bulk operations (select, delete)
- Dark/light theme toggle
- Works locally and in Docker
The dashboard auto-detects the API server, so it works on localhost, Docker containers, and remote deployments.
## Documentation
- **[API Reference](docs/api.md)** - Complete endpoint documentation
- **[Examples](docs/examples.md)** - Usage patterns and integrations
- **[Quickstart](docs/quickstart.md)** - Installation and configuration
## MCP Tools
When using the MCP server, you have access to 11 specialized tools:
| Tool | Description |
|------|-------------|
| `save_memory` | Save text with optional project/tags (auto-tagging enabled) |
| `search_memory` | Semantic search with project filtering |
| `list_memories` | List memories with pagination and filters |
| `get_memory_stats` | Get storage statistics and insights |
| `archive_memory` | Soft delete a memory (recoverable) |
| `delete_memory` | Permanently delete a memory by ID |
| `export_memories` | Export memories to JSON or Markdown |
| `summarize_text` | Summarize long text (extractive or LLM-based) |
| **`set_active_project`** | **Set active project context (auto-applies to all operations)** |
| **`get_active_project`** | **View currently active project** |
| **`list_projects`** | **List all available projects from database** |
**Active Project Workflow:**
```
1. list_projects() → See: Helios-LoadBalancer (45), Cognio-Memory (23), ...
2. set_active_project("Helios-LoadBalancer")
3. save_memory("Cache TTL is 300s") → Auto-saves to Helios-LoadBalancer
4. search_memory("cache settings") → Auto-searches in Helios-LoadBalancer only
5. list_memories() → Lists only Helios-LoadBalancer memories
```
**Project Isolation:**
Always specify a `project` name OR use `set_active_project` to keep memories organized and prevent mixing contexts between different workspaces.
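
The active-project behaviour described above can be pictured as a default that is applied whenever an operation omits `project`. The toy in-memory class below is a hypothetical illustration of that idea, not Cognio's code. Two details are assumptions of this sketch: an explicit `project` argument takes precedence over the active one, and an error is raised when neither is available.

```python
class MemoryStore:
    """Toy in-memory store illustrating an active-project default."""

    def __init__(self):
        self._memories = []  # list of (project, text) tuples
        self._active_project = None

    def set_active_project(self, name):
        self._active_project = name

    def get_active_project(self):
        return self._active_project

    def _resolve(self, project):
        # An explicit project wins; otherwise fall back to the active one.
        resolved = project if project is not None else self._active_project
        if resolved is None:
            raise ValueError("specify a project or call set_active_project first")
        return resolved

    def save_memory(self, text, project=None):
        resolved = self._resolve(project)
        self._memories.append((resolved, text))
        return resolved

    def list_memories(self, project=None):
        resolved = self._resolve(project)
        return [text for proj, text in self._memories if proj == resolved]

    def list_projects(self):
        # Map each project name to its number of stored memories.
        counts = {}
        for proj, _ in self._memories:
            counts[proj] = counts.get(proj, 0) + 1
        return counts


store = MemoryStore()
store.set_active_project("Helios-LoadBalancer")
store.save_memory("Cache TTL is 300s")  # goes to the active project
store.save_memory("Uses sentence-transformers", project="Cognio-Memory")
print(store.list_memories())  # only the active project's memories
print(store.list_projects())
```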
## API Endpoints
| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/health` | Health check |
| POST | `/memory/save` | Save new memory |
| GET | `/memory/search` | Semantic/Hybrid search |
| GET | `/memory/list` | List memories with filters |
| DELETE | `/memory/{id}` | Delete memory by ID |
| POST | `/memory/bulk-delete` | Bulk delete by project |
| GET | `/memory/stats` | Get statistics |
| GET | `/memory/export` | Export memories |
| POST | `/memory/summarize` | Summarize long text |
Interactive docs: http://localhost:8080/docs
## Configuration
Cognio is configured through environment variables. Copy `.env.example` to `.env` and edit your local overrides:
```bash
cp .env.example .env
```
```bash
# Database
DB_PATH=./data/memory.db
# Embeddings
EMBED_MODEL=all-MiniLM-L6-v2
EMBED_DEVICE=cpu
EMBEDDING_CACHE_PATH=./data/embedding_cache.pkl
# API
API_HOST=0.0.0.0
API_PORT=8080
# Optional API key for auth
API_KEY=your-secret-key
# Search
DEFAULT_SEARCH_LIMIT=5
SIMILARITY_THRESHOLD=0.4
HYBRID_ENABLED=true
HYBRID_MODE=rerank # candidate | rerank
HYBRID_ALPHA=0.6 # 0..1, higher = more semantic
HYBRID_RERANK_TOPK=100 # rerank candidate pool size
# LEANN vector search (optional)
LEANN_ENABLED=false
LEANN_INDEX_PATH=./data/leann/memories.leann
LEANN_BACKEND=hnsw
LEANN_LAZY_BUILD=true
LEANN_RECOMPUTE_ON_SEARCH=true
LEANN_WARMUP_ON_START=false
# Summarization
SUMMARIZATION_ENABLED=true
SUMMARIZATION_METHOD=abstractive # extractive | abstractive
SUMMARIZATION_EMBED_MODEL=all-MiniLM-L6-v2
# Auto-tagging (Optional)
AUTOTAG_ENABLED=true
LLM_PROVIDER=groq
GROQ_API_KEY=your-groq-key
GROQ_MODEL=openai/gpt-oss-120b
# OPENAI_API_KEY=your-openai-api-key
# OPENAI_MODEL=gpt-4o-mini
# Performance
MAX_TEXT_LENGTH=10000
BATCH_SIZE=32
SUMMARIZE_THRESHOLD=50
# Logging
LOG_LEVEL=info
```
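
The README does not spell out how `HYBRID_ALPHA` and `HYBRID_RERANK_TOPK` are applied. One common reading, consistent with "higher = more semantic", is a weighted blend of a semantic score and a keyword score computed over a candidate pool. The sketch below illustrates that reading only. The function name, the tuple layout, and the assumption that both scores lie in [0, 1] are hypothetical, and Cognio's actual rerank logic may differ.

```python
def hybrid_rerank(candidates, alpha=0.6, top_k=100, limit=5):
    """Rerank candidates by alpha * semantic + (1 - alpha) * keyword.

    candidates: list of (text, semantic_score, keyword_score) tuples,
    with both scores assumed to be normalised to [0, 1].
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    # Keep the top_k candidates by semantic score, then rerank by the blend.
    pool = sorted(candidates, key=lambda c: c[1], reverse=True)[:top_k]
    blended = [(alpha * sem + (1.0 - alpha) * kw, text) for text, sem, kw in pool]
    blended.sort(key=lambda item: item[0], reverse=True)
    return blended[:limit]


# Made-up scores for illustration only.
candidates = [
    ("cache TTL notes", 0.70, 0.90),
    ("container basics", 0.80, 0.10),
    ("load balancer config", 0.50, 0.50),
]
ranked = hybrid_rerank(candidates)  # alpha=0.6 blends both signals
semantic_only = hybrid_rerank(candidates, alpha=1.0)  # ignores keyword score
print([text for _, text in ranked])
print([text for _, text in semantic_only])
```

With `alpha=0.6` the strong keyword match lifts "cache TTL notes" above "container basics"; with `alpha=1.0` the order follows the semantic score alone.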
**Auto-Tagging Models:**
- `openai/gpt-oss-120b` - Groq, high quality
- `gpt-4o-mini` - OpenAI, fast and cheap
- `llama-3.3-70b-versatile` - Groq, balanced
- `llama-3.1-8b-instant` - Groq, fastest
See `.env.example` for all available options and recommendations.
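
As an illustration of how a few of these variables could be read and validated, here is a small sketch using only the standard library. It is not the project's `src/config.py`: the `Settings` class and `load_settings` function are hypothetical, and the fallback values simply repeat the example values shown above, which may not match Cognio's built-in defaults.

```python
import os
from dataclasses import dataclass


def _env_bool(env, name, default):
    """Interpret common truthy strings ("1", "true", "yes", "on") as True."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    db_path: str
    api_port: int
    default_search_limit: int
    similarity_threshold: float
    hybrid_enabled: bool
    hybrid_alpha: float


def load_settings(env=None):
    """Build Settings from a mapping (defaults to the process environment)."""
    env = os.environ if env is None else env
    alpha = float(env.get("HYBRID_ALPHA", "0.6"))
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("HYBRID_ALPHA must be between 0 and 1")
    return Settings(
        db_path=env.get("DB_PATH", "./data/memory.db"),
        api_port=int(env.get("API_PORT", "8080")),
        default_search_limit=int(env.get("DEFAULT_SEARCH_LIMIT", "5")),
        similarity_threshold=float(env.get("SIMILARITY_THRESHOLD", "0.4")),
        hybrid_enabled=_env_bool(env, "HYBRID_ENABLED", True),
        hybrid_alpha=alpha,
    )


# Passing an explicit mapping keeps the example independent of your shell.
settings = load_settings({"API_PORT": "9090", "HYBRID_ENABLED": "false"})
print(settings)
```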
## Project Structure
```
cognio/
├── src/ # Core application
│ ├── main.py # FastAPI app
│ ├── config.py # Environment config
│ ├── models.py # Data schemas
│ ├── database.py # SQLite operations
│ ├── embeddings.py # Semantic search
│ ├── memory.py # Memory CRUD
│ ├── autotag.py # Auto-tagging
│ └── utils.py # Helpers
│
├── mcp-server/ # MCP integration
│ ├── index.js # MCP server
│ └── package.json # Dependencies
│
├── scripts/ # Utilities
│ ├── setup-clients.js # Auto-config AI clients
│ ├── backup.sh # Database backup
│ └── migrate.py # Schema migrations
│
├── tests/ # Test suite
├── docs/ # Documentation
└── examples/ # Usage examples
```
## Development
```bash
# Install dependencies
poetry install
# Run tests
pytest
# Start development server
uvicorn src.main:app --reload
```
## Tech Stack
- **Backend**: Python 3.11+, FastAPI, Uvicorn
- **Database**: SQLite with JSON support
- **Embeddings**: sentence-transformers (model set by `EMBED_MODEL`; e.g. paraphrase-multilingual-mpnet-base-v2, 768-dim)
- **MCP Server**: Node.js, @modelcontextprotocol/sdk
- **Auto-Tagging**: LLM API (Groq or OpenAI)
- **Testing**: pytest, pytest-asyncio, pytest-cov
- **Deployment**: Docker, docker-compose
## Performance
| Operation | Time | Notes |
|-----------|------|-------|
| Save memory | ~20ms | Including embedding |
| Search (1k memories) | ~15ms | Semantic similarity |
| Search (10k memories) | ~50ms | Semantic similarity |
| Model load | ~3s | One-time on startup |
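
Latency figures like these generally vary with hardware and the chosen embedding model, so it can be worth measuring on your own machine. The sketch below is a generic timing helper, not part of Cognio. `measure_ms` is a hypothetical name, and the workload is a placeholder to be replaced with a call that saves or searches a memory.

```python
import statistics
import time


def measure_ms(func, runs=20):
    """Return the median wall-clock time of func() in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


# Placeholder workload (assumption): swap in a real save or search call.
median_ms = measure_ms(lambda: sum(range(10_000)))
print(f"median: {median_ms:.3f} ms")
```

The median is used rather than the mean so that a single slow run, such as the first call after model load, does not dominate the result.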
## License
MIT License - see [LICENSE](LICENSE)
## Links
- **Documentation**: [docs/](docs/)
- **Issues**: [GitHub Issues](https://github.com/0xReLogic/Cognio/issues)
- **Releases**: [GitHub Releases](https://github.com/0xReLogic/Cognio/releases)
---
Built for better AI conversations