https://github.com/bjornmelin/enhanced-mem-vector-rag

⚡ Developer-friendly hybrid-RAG toolkit merging Graphiti, Qdrant, mem0, LlamaIndex, and LangChain into one powerful engine.
Host: GitHub
URL: https://github.com/bjornmelin/enhanced-mem-vector-rag
Owner: BjornMelin
License: mit
Created: 2025-05-06T05:48:28.000Z (5 months ago)
Default Branch: main
Last Pushed: 2025-05-28T16:10:09.000Z (5 months ago)
Last Synced: 2025-06-23T20:49:36.483Z (4 months ago)
Language: Python
Size: 461 KB
Stars: 6
Watchers: 1
Forks: 3
Open Issues: 1
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE

# Enhanced Memory Vector RAG

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)

[![Maintenance](https://img.shields.io/badge/Maintained%3F-yes-green.svg)](https://github.com/BjornMelin/enhanced-mem-vector-rag/graphs/commit-activity)

[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](https://makeapullrequest.com)

[![Documentation Status](https://img.shields.io/badge/docs-in%20progress-orange)](https://github.com/BjornMelin/enhanced-mem-vector-rag/wiki)

[![Claude Code](https://img.shields.io/badge/Claude%20Code-Ready-purple.svg)](https://github.com/BjornMelin/enhanced-mem-vector-rag/blob/main/CLAUDE.md)

⚡ Developer-friendly hybrid-RAG toolkit merging Graphiti, Qdrant, mem0, LlamaIndex, and LangChain into one powerful engine.

EMVR integrates Knowledge-Augmented Generation (KAG) with traditional Retrieval-Augmented Generation (RAG). It combines Graphiti's graph intelligence, Qdrant's vector search, and mem0's memory persistence behind LlamaIndex and LangChain interfaces, for applications that need both factual accuracy and contextual understanding.

## Table of Contents

- [Enhanced Memory Vector RAG](#enhanced-memory-vector-rag)

  - [Table of Contents](#table-of-contents)

  - [Overview](#overview)

  - [Features](#features)

  - [Architecture](#architecture)

    - [Layered Architecture](#layered-architecture)

    - [Comprehensive System Architecture](#comprehensive-system-architecture)

  - [Data Flow](#data-flow)

    - [MCP Interaction Flow](#mcp-interaction-flow)

  - [Getting Started](#getting-started)

    - [Prerequisites](#prerequisites)

    - [Installation](#installation)

      - [Local Development](#local-development)

      - [Docker Deployment](#docker-deployment)

    - [Quick Start](#quick-start)

  - [Components](#components)

    - [Memory System (mem0)](#memory-system-mem0)

    - [Graph Knowledge Base (Graphiti/Neo4j)](#graph-knowledge-base-graphitineo4j)

    - [Vector Storage (Qdrant)](#vector-storage-qdrant)

    - [Framework Integration (LlamaIndex \& LangChain)](#framework-integration-llamaindex--langchain)

  - [Usage Examples](#usage-examples)

  - [Configuration](#configuration)

  - [Benchmarks](#benchmarks)

  - [Roadmap](#roadmap)

  - [Contributing](#contributing)

  - [How to Cite](#how-to-cite)

  - [License](#license)

  - [Acknowledgements](#acknowledgements)

  - [Custom MCP Server Implementation](#custom-mcp-server-implementation)

    - [Key MCP Endpoints](#key-mcp-endpoints)

  - [Claude Code Development](#claude-code-development)

  - [Deployment](#deployment)

    - [Docker Components](#docker-components)

    - [Deployment Options](#deployment-options)

      - [Local Deployment](#local-deployment)

      - [Using Makefile](#using-makefile)

    - [Security](#security)

    - [Monitoring \& Observability](#monitoring--observability)

    - [Backup \& Restore](#backup--restore)

## Overview

Enhanced Memory Vector RAG (EMVR) is a framework that combines several retrieval methods to build a more robust, accurate, and context-aware knowledge system. By integrating graph-based Knowledge-Augmented Generation (KAG) with traditional vector-based Retrieval-Augmented Generation (RAG), EMVR aims to improve results on complex retrieval tasks that need both structured relationships and semantic similarity.

The system leverages:

- **Graphiti/Neo4j** for structured knowledge representation and graph traversal

- **Qdrant** for efficient vector similarity search

- **mem0** for persistent memory and context management

- **LlamaIndex & LangChain** for flexible orchestration and agent-based workflows

## Features

- 🔄 **Hybrid Retrieval System** - Combines vector similarity search with graph-based knowledge retrieval

- 🧠 **Persistent Memory** - Maintains context and relationships across sessions

- 🔍 **Multi-modal Search** - Query across different data types and structures

- 🔗 **Knowledge Graph Integration** - Leverages structured relationships for improved context

- 🚀 **Framework Flexibility** - Works with both LlamaIndex and LangChain

- 📊 **Extensible Architecture** - Easy to customize and extend for specific use cases

- 🛠️ **Developer-Friendly APIs** - Simple interfaces for complex retrieval operations

- 📈 **Performance Optimization** - Efficient retrieval strategies for reduced latency

- 🐳 **Docker Deployment** - Containerized architecture for easy deployment

## Architecture

EMVR implements a comprehensive layered architecture integrating multiple components for advanced retrieval:

### Layered Architecture

```mermaid

graph TD

    subgraph "Application Layer"

        QueryInterface("Query Interfaces")

        ResponseGen("Response Generation")

        AgentWorkflows("Custom Agent Workflows")

        MCP("Model Context Protocol (MCP)")

    end

    subgraph "Orchestration Layer"

        HybridManager("Hybrid Retrieval Manager")

        ContextFusion("Context Fusion Engine")

        GraphTraversal("Knowledge Graph Traversal")

        LangGraph("LangGraph Orchestration")

    end

    subgraph "Integration Layer"

        LlamaIndexConn("LlamaIndex Connectors")

        LangChainComp("LangChain Components")

        FastEmbed("FastEmbed Integration")

        FastMCP("FastMCP Framework")

    end

    subgraph "Storage Layer"

        Qdrant("Vector Database (Qdrant)")

        Neo4j("Graph Database (Neo4j/Graphiti)")

        Mem0("Memory System (mem0)")

        Supabase("Metadata Storage (Supabase)")

    end

    QueryInterface --> HybridManager

    ResponseGen --> ContextFusion

    AgentWorkflows --> GraphTraversal

    AgentWorkflows --> HybridManager

    MCP --> FastMCP

    LangGraph --> LangChainComp

    HybridManager --> LlamaIndexConn

    HybridManager --> LangChainComp

    ContextFusion --> LlamaIndexConn

    ContextFusion --> LangChainComp

    GraphTraversal --> LlamaIndexConn

    FastMCP --> LlamaIndexConn

    FastEmbed -.-> Qdrant

    LlamaIndexConn --> Qdrant

    LlamaIndexConn --> Neo4j

    LlamaIndexConn --> Mem0

    LlamaIndexConn --> Supabase

    LangChainComp --> Qdrant

    LangChainComp --> Neo4j

    LangChainComp --> Mem0

    LangChainComp --> Supabase

```

### Comprehensive System Architecture

```mermaid

graph TB

    User([User]) <--> ClaudeCode["Claude Code & MCP Tools"]

    ClaudeCode <--> CustomMCP["Custom 'memory' MCP Server\n(FastMCP Framework)"]

    ClaudeCode <--> ExternalMCP["External MCP Servers\n(tavily, firecrawl, context7, etc.)"]

    subgraph "Agent System"

        LangGraph["LangGraph\n(Agent Orchestration)"]

        LangChain["LangChain\n(Agent Tools & Planning)"]

        Agents["Specialized Agents\n(Supervisor-Worker Pattern)"]

        LangGraph --> LangChain

        LangGraph --> Agents

    end

    subgraph "RAG Framework"

        LlamaIndex["LlamaIndex\n(Core RAG Framework)"]

        QueryEngines["Query Engines\n(Vector, Graph, Hybrid)"]

        Retrievers["Specialized Retrievers"]

        DataLoaders["Data Loaders & Indexers"]

        LlamaIndex --> QueryEngines

        LlamaIndex --> Retrievers

        LlamaIndex --> DataLoaders

    end

    subgraph "Memory & Storage"

        Qdrant[(Qdrant\nVector Store)]

        Neo4j[(Neo4j\nGraph Database)]

        Mem0["Mem0\n(Memory Interface)"]

        Graphiti["Graphiti\n(Graph Interface)"]

        Supabase[(Supabase\nMetadata & Documents)]

        S3[(AWS S3\nOriginal Documents)]

        Mem0 -.-> Qdrant

        Graphiti -.-> Neo4j

    end

    subgraph "Embedding & Ingestion"

        FastEmbed["FastEmbed\nEmbedding Generation"]

        WebCrawlers["Web Crawlers\n(Crawl4AI, Firecrawl)"]

        ConnectorAPIs["Connector APIs\n(GitHub, Reddit, etc.)"]

        FastEmbed --> Qdrant

        WebCrawlers --> DataLoaders

        ConnectorAPIs --> DataLoaders

    end

    CustomMCP <--> LangGraph

    CustomMCP <--> LlamaIndex

    LangGraph <--> LlamaIndex

    LlamaIndex <--> Qdrant

    LlamaIndex <--> Neo4j

    LlamaIndex <--> Mem0

    LlamaIndex <--> Graphiti

    LlamaIndex <--> Supabase

    DataLoaders --> S3

    DataLoaders --> Supabase

    DataLoaders --> Qdrant

    DataLoaders --> Neo4j

    style CustomMCP fill:#f9d6ff,stroke:#9333ea,stroke-width:2px

    style LlamaIndex fill:#d1fae5,stroke:#059669,stroke-width:2px

    style LangGraph fill:#dbeafe,stroke:#3b82f6,stroke-width:2px

    style Qdrant fill:#fee2e2,stroke:#ef4444,stroke-width:2px

    style Neo4j fill:#ffedd5,stroke:#f97316,stroke-width:2px

```

## Data Flow

```mermaid

flowchart LR

    classDef userInteraction fill:#f9d6ff,stroke:#9333ea,stroke-width:2px

    classDef retrieval fill:#dbeafe,stroke:#3b82f6,stroke-width:2px

    classDef processing fill:#d1fae5,stroke:#059669,stroke-width:2px

    classDef storage fill:#fee2e2,stroke:#ef4444,stroke-width:2px

    classDef fusion fill:#ffedd5,stroke:#f97316,stroke-width:2px

    Input("User Query/Task") --> ClaudeCode("Claude Code\nMCP Interface")

    ClaudeCode --> MemoryMCP("Custom 'memory'\nMCP Server")

    MemoryMCP --> Agent("Agent System\n(LangChain/LangGraph)")

    Agent --> VR("Vector Retrieval\n(Qdrant via LlamaIndex)")

    Agent --> GR("Graph Retrieval\n(Neo4j/Graphiti via LlamaIndex)")

    Agent --> MR("Memory Retrieval\n(mem0)")

    Agent --> WS("Web Search\n(Tavily/Firecrawl)")

    VR --> CF("Context Fusion\n(LlamaIndex Orchestration)")

    GR --> CF

    MR --> CF

    WS --> CF

    CF --> QP("Query Planning\n(LangGraph)")

    QP --> RT("Response Templates")

    CF --> LLM("Large Language Model")

    RT --> LLM

    LLM --> Response("Enhanced Response")

    Response --> MemUpdate("Memory Update\n(mem0)")

    Response --> KGUpdate("Knowledge Graph Update\n(Neo4j)")

    Response --> MetaUpdate("Metadata Update\n(Supabase)")

    Response --> ClaudeCode

    ClaudeCode --> User([User])

    class Input,ClaudeCode,User userInteraction

    class VR,GR,MR,WS retrieval

    class QP,RT,LLM processing

    class MemUpdate,KGUpdate,MetaUpdate storage

    class CF fusion

```
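The diagram does not say how the Context Fusion step merges results from the four retrieval paths. One common option is reciprocal rank fusion (RRF). The sketch below assumes that technique; `fuse_rankings`, the constant `k=60`, and the document IDs are illustrative and not part of EMVR's API.

```python
from collections import defaultdict


def fuse_rankings(rankings: dict[str, list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Merge ranked ID lists from several retrievers with reciprocal rank fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1. Higher fused scores rank first.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranked_ids in rankings.values():
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending, then by ID so ties are deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from the four retrieval paths in the diagram.
results = {
    "vector": ["doc_a", "doc_b", "doc_c"],
    "graph": ["doc_b", "doc_d"],
    "memory": ["doc_b", "doc_a"],
    "web": ["doc_e"],
}
fused = fuse_rankings(results)
print([doc_id for doc_id, _ in fused])
```

RRF only needs ranks, not comparable scores, which is convenient when vector similarity, graph matches, and memory hits are scored on different scales.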

### MCP Interaction Flow

```mermaid

sequenceDiagram

    participant User

    participant Claude as Claude Code

    participant Memory as custom 'memory' MCP

    participant External as External MCP Servers

    participant LlamaIdx as LlamaIndex

    participant Storage as Storage Systems

    User->>Claude: Query or Task

    Claude->>Memory: memory.read_graph()

    Memory->>LlamaIdx: Query through LlamaIndex

    LlamaIdx->>Storage: Fetch from Qdrant/Neo4j/Supabase

    Storage-->>LlamaIdx: Return relevant data

    LlamaIdx-->>Memory: Process & return results

    Memory-->>Claude: Return graph state

    Claude->>External: context7.get_library_docs()

    External-->>Claude: Return documentation

    Note over Claude,Memory: Agent Planning & Execution

    Claude->>Memory: Execute retrieval/update

    Memory->>LlamaIdx: Orchestrate operations

    LlamaIdx->>Storage: Execute operations

    Storage-->>LlamaIdx: Return operation results

    LlamaIdx-->>Memory: Process & return results

    Memory-->>Claude: Return operation status/results

    Claude->>Memory: memory.add_observations()

    Memory->>Storage: Update memory state

    Claude-->>User: Deliver response/results

```
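The read-then-update pattern in this sequence can be tried without any backend. The sketch below is a toy in-memory stand-in: the method names `read_graph` and `add_observations` come from the diagram, while the class name, argument shapes, and return values are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryGraphMemory:
    """Toy stand-in for the 'memory' MCP server; no network or database involved."""

    entities: dict[str, list[str]] = field(default_factory=dict)

    def read_graph(self) -> dict[str, list[str]]:
        # Return a copy so callers cannot mutate the stored state by accident.
        return {name: list(obs) for name, obs in self.entities.items()}

    def add_observations(self, entity: str, observations: list[str]) -> int:
        # Store only observations not seen before; return how many were added.
        stored = self.entities.setdefault(entity, [])
        added = 0
        for observation in observations:
            if observation not in stored:
                stored.append(observation)
                added += 1
        return added


memory = InMemoryGraphMemory()
before = memory.read_graph()  # first call in the sequence: read current state
added = memory.add_observations("EMVR", ["uses Qdrant", "uses Neo4j"])  # final update
after = memory.read_graph()
print(before, added, after)
```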

## Getting Started

### Prerequisites

- Python 3.11+

- Docker (recommended for Neo4j, Qdrant, and Supabase)

- `uv` for Python package management

- Basic understanding of RAG systems

### Installation

#### Local Development

```bash

# Clone the repository

git clone https://github.com/BjornMelin/enhanced-mem-vector-rag.git

cd enhanced-mem-vector-rag

# Install dependencies using uv

uv pip install -r requirements.txt

```

#### Docker Deployment

```bash

# Navigate to deployment directory

cd emvr/deployment

# Setup environment

./setup_local.sh

# Start services

docker compose up -d

```

### Quick Start

```python
from emvr import EmvrSystem

# Initialize the system
system = EmvrSystem()

# Load data
system.load_documents("path/to/documents")
system.build_knowledge_graph()

# Query the system
response = system.query("What is the relationship between X and Y?")
print(response)
```

## Components

### Memory System (mem0)

The memory component leverages mem0 to maintain persistent context across queries and sessions. This allows the system to:

- Remember previous interactions

- Build cumulative knowledge

- Maintain entity relationships

- Support temporal reasoning

```mermaid

graph LR

    Query("User Query") --> Memory("mem0 Memory System")

    Memory --> Scoring("Relevance Scoring")

    Memory --> Personalization("Personalization Layer")

    Memory --> Context("Contextual History")

    Scoring --> Retrieval("Enhanced Retrieval")

    Personalization --> Retrieval

    Context --> Retrieval

    Retrieval --> LLM("Large Language Model")

    LLM --> Response("Enhanced Response")

    Response --> Memory

```
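As a rough illustration of the Relevance Scoring box, the sketch below ranks stored memories against a query using Jaccard overlap of word sets. This is a deliberately simple stand-in, not mem0's scoring; the function names and sample memories are hypothetical.

```python
def tokenize(text: str) -> set[str]:
    """Lowercase a string and split it into a set of whitespace-separated words."""
    return set(text.lower().split())


def score_memories(query: str, memories: list[str]) -> list[tuple[float, str]]:
    """Rank memories by Jaccard similarity (shared words / all words) to the query."""
    query_words = tokenize(query)
    scored = []
    for memory in memories:
        memory_words = tokenize(memory)
        union = query_words | memory_words
        score = len(query_words & memory_words) / len(union) if union else 0.0
        scored.append((score, memory))
    # sorted() is stable, so memories with equal scores keep their input order.
    return sorted(scored, key=lambda pair: -pair[0])


history = [
    "user prefers python examples",
    "user asked about neo4j indexes",
    "deployment uses docker compose",
]
ranked = score_memories("neo4j indexes for python", history)
print(ranked[0][1])
```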

### Graph Knowledge Base (Graphiti/Neo4j)

The graph component uses Graphiti with Neo4j to:

- Store structured relationships between entities

- Enable complex traversal queries

- Support reasoning about interconnected concepts

- Provide explicit knowledge paths

```mermaid

graph TD

    subgraph "Knowledge Graph (Neo4j/Graphiti)"

        Entity1("Entity A")

        Entity2("Entity B")

        Entity3("Entity C")

        Entity4("Entity D")

        Entity1 -- "relates_to" --> Entity2

        Entity2 -- "depends_on" --> Entity3

        Entity1 -- "creates" --> Entity4

        Entity3 -- "part_of" --> Entity4

    end

    Query("Knowledge Query") --> GraphTraversal("Graph Traversal (Graphiti)")

    GraphTraversal --> Neo4j("Neo4j Database")

    Neo4j --> Results("Structured Results")

    Results --> LLM("LLM for Reasoning")

```
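To make "explicit knowledge paths" concrete, the sketch below runs a breadth-first search over the four edges from the diagram and lists every path between two entities. It uses plain Python data structures rather than Graphiti or Neo4j; `find_paths` and `max_hops` are hypothetical names.

```python
from collections import deque

# Edges copied from the diagram above: (source, relation, target).
EDGES = [
    ("Entity A", "relates_to", "Entity B"),
    ("Entity B", "depends_on", "Entity C"),
    ("Entity A", "creates", "Entity D"),
    ("Entity C", "part_of", "Entity D"),
]


def find_paths(edges, start, goal, max_hops=3):
    """Return every directed path from start to goal with at most max_hops edges."""
    adjacency = {}
    for source, relation, target in edges:
        adjacency.setdefault(source, []).append((relation, target))

    paths = []
    queue = deque([(start, [])])  # (current node, edges walked so far)
    while queue:
        node, walked = queue.popleft()
        if node == goal and walked:
            paths.append(walked)
            continue
        if len(walked) >= max_hops:
            continue
        # Nodes already on this path are skipped to avoid cycles.
        visited = {start} | {target for _, _, target in walked}
        for relation, target in adjacency.get(node, []):
            if target not in visited:
                queue.append((target, walked + [(node, relation, target)]))
    return paths


paths = find_paths(EDGES, "Entity A", "Entity D")
for path in paths:
    print(" -> ".join(f"{s} [{r}] {t}" for s, r, t in path))
```

Breadth-first order returns the shortest path first, so the direct `creates` edge is listed before the three-hop route through Entity B and Entity C.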

### Vector Storage (Qdrant)

The vector component uses Qdrant to:

- Store and retrieve document embeddings

- Perform efficient similarity search

- Support semantic matching

- Handle large-scale vector operations

```mermaid

graph TD

    Documents["Input Documents"] --> TextChunker["Text Chunker"]

    TextChunker --> EmbeddingGen["Embedding Generation"]

    EmbeddingGen --> VectorDB["Qdrant Vector Database"]

    Query["User Query"] --> QueryEmbed["Query Embedding"]

    QueryEmbed --> SearchVec["Vector Search"]

    SearchVec --> VectorDB

    VectorDB --> TopMatches["Top K Matches"]

    TopMatches --> Reranker["Reranker"]

    Reranker --> ContextGen["Context Generation"]

```
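The chunk, embed, and top-k search stages can be sketched end to end with the standard library. The example below swaps in a toy word-count "embedding" and cosine similarity in place of a real embedding model and Qdrant, and it omits the reranker; all function names are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 7) -> list[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy embedding: a sparse word-count vector. A real system would use a model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query: str, chunks: list[str], k: int = 2) -> list[tuple[float, str]]:
    """Return the k chunks most similar to the query, best first."""
    query_vector = embed(query)
    scored = [(cosine(query_vector, embed(chunk)), chunk) for chunk in chunks]
    return sorted(scored, key=lambda pair: -pair[0])[:k]


document = (
    "Qdrant stores document embeddings for similarity search. "
    "Neo4j stores entities and relationships for graph traversal."
)
chunks = chunk_text(document)
matches = top_k("similarity search embeddings", chunks)
print(matches[0][1])
```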

### Framework Integration (LlamaIndex & LangChain)

EMVR integrates with both major RAG frameworks:

- **LlamaIndex** - For advanced indexing and retrieval operations

- **LangChain** - For agent-based workflows and tool integration

```mermaid

graph TD

    subgraph "LlamaIndex Integration"

        Docs[("Documents")] --> Loaders["Data Loaders"]

        Loaders --> Indexing["Indexing Pipelines"]

        Indexing --> QueryEngines["Query Engines"]

        QueryEngines --> RetFramework["Retrieval Framework"]

    end

    subgraph "LangChain Integration"

        Agents["Agent Framework"] --> Planning["Planning Modules"]

        Planning --> Tools["Tool Integration"]

        Tools --> Memory["Memory Components"]

        Memory --> Callbacks["Callback Handlers"]

    end

    RetFramework <--> Tools

    QueryEngines <--> Agents

```

## Usage Examples

Examples are coming soon. They will demonstrate:

- Basic RAG workflows

- Knowledge graph integration

- Multi-hop reasoning

- Custom retrieval strategies

- Agent-based applications

## Configuration

EMVR can be configured through:

- Configuration files

- Environment variables

- Programmatic settings

Detailed configuration options will be provided in the upcoming documentation.
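Until that documentation is available, the sketch below shows one way the three configuration paths could fit together: built-in defaults, environment variable overrides, and programmatic overrides. The class name `EmvrSettings`, its fields, and the `EMVR_*` variable names are hypothetical and not confirmed by this README; the default ports are the standard Qdrant HTTP and Neo4j Bolt ports.

```python
import os
from dataclasses import dataclass, replace
from typing import Mapping, Optional


@dataclass(frozen=True)
class EmvrSettings:
    """Hypothetical settings object; field names are assumptions for illustration."""

    qdrant_url: str = "http://localhost:6333"
    neo4j_uri: str = "bolt://localhost:7687"
    top_k: int = 5

    @classmethod
    def from_env(cls, env: Optional[Mapping[str, str]] = None) -> "EmvrSettings":
        # Read overrides from the given mapping, or from the process environment.
        env = os.environ if env is None else env
        defaults = cls()
        return cls(
            qdrant_url=env.get("EMVR_QDRANT_URL", defaults.qdrant_url),
            neo4j_uri=env.get("EMVR_NEO4J_URI", defaults.neo4j_uri),
            top_k=int(env.get("EMVR_TOP_K", defaults.top_k)),
        )


# Environment overrides first, then a programmatic override on top.
settings = EmvrSettings.from_env({"EMVR_TOP_K": "10"})
tuned = replace(settings, qdrant_url="http://qdrant:6333")
print(tuned)
```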

## Benchmarks

Performance benchmarks comparing EMVR to traditional RAG systems will be available soon.

## Roadmap

- [x] Initial release with core functionality

- [x] Basic documentation

- [x] Agent orchestration implementation

- [x] UI implementation with Chainlit

- [x] Docker containerization and deployment

- [ ] Comprehensive documentation

- [ ] Performance benchmarks

- [ ] Advanced examples

- [ ] Cloud deployment guides

- [ ] Additional vector database integrations

- [ ] Custom agent templates

## Contributing

Contributions are welcome! Please see the [CONTRIBUTING.md](CONTRIBUTING.md) file for guidelines.

## How to Cite

If you use EMVR in your research, please cite:

```bibtex

@software{emvr2025,

  author = {Melin, Bjorn},

  title = {Enhanced Memory Vector RAG: A Hybrid Retrieval Framework},

  year = {2025},

  url = {https://github.com/BjornMelin/enhanced-mem-vector-rag},

  version = {0.1.0}

}

```

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Acknowledgements

- [Graphiti](https://github.com/getzep/graphiti) for knowledge graph construction on Neo4j

- [Qdrant](https://github.com/qdrant/qdrant) for vector database capabilities

- [mem0](https://github.com/mem0ai/mem0) for memory systems

- [LlamaIndex](https://github.com/run-llama/llama_index) for indexing frameworks

- [LangChain](https://github.com/langchain-ai/langchain) for agent orchestration

## Custom MCP Server Implementation

This project implements a custom `memory` MCP server using the FastMCP framework that serves as the central interface between Claude Code and the system's backend components:

```mermaid

flowchart TD

    classDef mcp fill:#f9d6ff,stroke:#9333ea,stroke-width:2px

    classDef frameworks fill:#d1fae5,stroke:#059669,stroke-width:2px

    classDef storage fill:#fee2e2,stroke:#ef4444,stroke-width:2px

    Claude([Claude Code]) --> MCP["Custom 'memory' MCP Server\n(FastMCP Framework)"]

    subgraph "MCP Endpoints"

        SearchHybrid["/search.hybrid"]

        GraphQuery["/graph.query"]

        MemoryOps["/memory.*"]

        RulesValidate["/rules.validate"]

        IngestOps["/ingest.*"]

    end

    MCP --> SearchHybrid

    MCP --> GraphQuery

    MCP --> MemoryOps

    MCP --> RulesValidate

    MCP --> IngestOps

    SearchHybrid --> LlamaIndex["LlamaIndex\nRAG Orchestration"]

    GraphQuery --> LlamaIndex

    MemoryOps --> LlamaIndex

    RulesValidate --> APOC["Neo4j APOC\nRules Engine"]

    IngestOps --> LlamaIndex

    LlamaIndex --> Qdrant[(Qdrant)]

    LlamaIndex --> Neo4j[(Neo4j)]

    LlamaIndex --> Supabase[(Supabase)]

    APOC --> Neo4j

    Mem0["Mem0 SDK"] --> Qdrant

    Graphiti["Graphiti Client"] --> Neo4j

    class MCP,SearchHybrid,GraphQuery,MemoryOps,RulesValidate,IngestOps mcp

    class LlamaIndex,APOC,Mem0,Graphiti frameworks

    class Qdrant,Neo4j,Supabase storage

```

### Key MCP Endpoints

| Endpoint          | Description                                           | Implementation                                                                       |
| ----------------- | ----------------------------------------------------- | ------------------------------------------------------------------------------------ |
| `/search.hybrid`  | Performs hybrid search across vector and graph stores | Uses LlamaIndex for orchestrating hybrid search across Qdrant and Neo4j              |
| `/graph.query`    | Executes knowledge graph queries                      | Translates natural language to Cypher using LlamaIndex's `KnowledgeGraphQueryEngine` |
| `/memory.*`       | Operations for memory management                      | Includes CRUD operations for graph entities and observations                         |
| `/rules.validate` | Validates operations against defined rules            | Uses Neo4j APOC for rule enforcement                                                 |
| `/ingest.*`       | Handles data ingestion from various sources           | Utilizes LlamaIndex data loaders and FastEmbed for embedding generation              |
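The wildcard entries (`/memory.*`, `/ingest.*`) each cover a family of operations. As a minimal sketch of how a request path could be matched to one of these endpoint families, the example below uses shell-style pattern matching from the standard library. It is not the FastMCP routing mechanism; `resolve_endpoint` and concrete paths such as `/ingest.url` are assumptions for illustration.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Endpoint patterns and short descriptions taken from the table above.
ENDPOINT_PATTERNS = {
    "/search.hybrid": "hybrid search across vector and graph stores",
    "/graph.query": "knowledge graph queries",
    "/memory.*": "memory management operations",
    "/rules.validate": "rule validation",
    "/ingest.*": "data ingestion",
}


def resolve_endpoint(path: str) -> Optional[tuple[str, str]]:
    """Return the (pattern, description) whose pattern matches path, or None."""
    for pattern, description in ENDPOINT_PATTERNS.items():
        # fnmatchcase treats '*' as a wildcard and '.' as a literal character.
        if fnmatchcase(path, pattern):
            return pattern, description
    return None


print(resolve_endpoint("/memory.add_observations"))
print(resolve_endpoint("/ingest.url"))
print(resolve_endpoint("/unknown"))
```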

## Claude Code Development

This project provides a detailed development guide for Claude Code users. The guide includes:

- Project overview and technical architecture

- Development workflow and memory protocol

- Coding standards and practices

- Git workflow

- MCP server documentation and usage

- Key architectural components and their roles

For Claude Code development, please refer to [CLAUDE.md](CLAUDE.md) for comprehensive guidelines.

## Deployment

The project includes a complete deployment system using Docker Compose:

### Docker Components

- **MCP Server**: FastAPI server implementing the Model Context Protocol

- **Chainlit UI**: Web interface for user interaction

- **Qdrant**: Vector database for semantic search

- **Neo4j**: Graph database for knowledge graphs

- **Supabase**: PostgreSQL for structured data and metadata

- **Grafana/Prometheus**: Monitoring and observability

### Deployment Options

#### Local Deployment

```bash

# Navigate to deployment directory

cd emvr/deployment

# Set up environment

./setup_local.sh

# Start services with Docker Compose

docker compose up -d

```

#### Using Makefile

```bash

cd emvr/deployment

make setup    # Run setup script

make up       # Start all services

```

### Security

The deployment includes comprehensive security features:

- JWT-based authentication

- Role-Based Access Control (RBAC)

- Secure environment variable management

- Container-based isolation
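To show how JWT-based authentication and RBAC fit together, the sketch below signs and verifies an HS256 token with the standard library and then checks a role against a permission table. It is an illustration only, not the project's implementation: the role names, permissions, and function names are hypothetical, the token header is not validated, and a production deployment would normally rely on a vetted JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(claims: dict, secret: bytes) -> str:
    """Build a compact HS256 JWT: header.payload.signature."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url(signature)}"


def verify_token(token: str, secret: bytes, now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the signature is valid and the token is unexpired."""
    try:
        header, payload, signature = token.split(".")
    except ValueError:
        return None
    expected = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(_b64url(expected), signature):
        return None
    claims = json.loads(_b64url_decode(payload))
    current = time.time() if now is None else now
    if claims.get("exp", float("inf")) < current:
        return None
    return claims


# Hypothetical role-to-permission table for the RBAC check.
ROLE_PERMISSIONS = {"admin": {"search", "ingest"}, "reader": {"search"}}


def is_allowed(claims: dict, action: str) -> bool:
    return action in ROLE_PERMISSIONS.get(claims.get("role"), set())


token = sign_token({"sub": "alice", "role": "reader", "exp": 2000}, b"secret")
claims = verify_token(token, b"secret", now=1000)
print(claims, is_allowed(claims, "search"), is_allowed(claims, "ingest"))
```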

### Monitoring & Observability

Access system metrics and logs through:

- Grafana dashboard: 

- Prometheus metrics: 

### Backup & Restore

The system includes scripts for data backup and restoration:

```bash

# Create backup

./scripts/backup.sh

# Restore from backup

./scripts/restore.sh ./backups/emvr_backup_20250506_120000.tar.gz

```

For detailed deployment instructions, see the [deployment README](emvr/deployment/README.md).