https://github.com/celebr4tion/memory-engine

An open-source knowledge management system that converts unstructured text into a graph database with automatic relationship discovery and semantic search. Built with JanusGraph and Milvus, designed to work with both commercial and local AI models. Currently in alpha development.
experimental graph-database information-retrieval janusgraph knowledge-graph knowledge-management llm-integration machine-learning mcp milvus nlp python semantic-search vector-database

# Memory Engine

A semantic knowledge management system that combines graph-based knowledge representation with vector embeddings for information storage, retrieval, and synthesis.

## 🌟 Overview

Memory Engine is an experimental knowledge management system that transforms unstructured text into a structured, searchable knowledge graph. It combines graph databases with vector embeddings to create a foundation for applications that can understand, connect, and reason about information.

## ⚠️ Important Notice

**This is a personal open-source project developed for learning and research purposes. No guarantees are made regarding reliability, security, or suitability for production use. Use at your own risk.**

## 🚧 Project Status

**This project is currently in active development (v0.2.0) and should be considered experimental.**

### Vision

Our goal is to create a truly open and accessible knowledge management system that works with:
- **Any AI model**: Commercial APIs (OpenAI, Anthropic, Google) and local models (Ollama, Hugging Face)
- **Any deployment**: From laptop development to distributed production systems
- **Any data**: Text, documents, structured data, and multimedia content

We aim to eliminate dependency on paid APIs by providing full support for local model execution, making advanced knowledge management accessible to everyone.

## 🎯 What Memory Engine Does

**Input**: Unstructured text, documents, or data

**Output**: Structured knowledge with automatic relationships and semantic search capabilities

### Core Functions

1. **Knowledge Ingestion**: Feed text/documents → Engine extracts entities, facts, and relationships → Stores in graph database
2. **Knowledge Retrieval**: Query in natural language → Engine searches semantically → Returns relevant information with context
3. **Automatic Processing**: The engine handles complexity internally - relationship discovery, quality assessment, versioning, and optimization
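
The ingest-then-retrieve flow described above can be illustrated with a toy in-memory sketch. Everything here is hypothetical and is not Memory Engine's API: `ToyKnowledgeGraph` stands in for the graph store, a sentence splitter stands in for LLM-based extraction, and shared-term overlap stands in for both relationship discovery and semantic search.

```python
import re
from collections import defaultdict

STOPWORDS = {"a", "an", "the", "is", "of", "and", "to", "in", "are", "by"}


def tokenize(text):
    """Lowercase the text and return its non-stopword terms as a set."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


class ToyKnowledgeGraph:
    """In-memory stand-in for the graph store (hypothetical, for illustration only)."""

    def __init__(self):
        self.nodes = {}                # node_id -> sentence text
        self.edges = defaultdict(set)  # node_id -> ids of related nodes

    def ingest(self, text):
        """Store each sentence as a node and link nodes that share a term."""
        new_ids = []
        for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
            if not sentence:
                continue
            node_id = len(self.nodes)
            terms = tokenize(sentence)
            for other_id, other_text in self.nodes.items():
                if terms & tokenize(other_text):
                    self.edges[node_id].add(other_id)
                    self.edges[other_id].add(node_id)
            self.nodes[node_id] = sentence
            new_ids.append(node_id)
        return new_ids

    def query(self, question):
        """Return the best term-overlap match plus the text of its linked nodes."""
        q_terms = tokenize(question)
        best = max(
            self.nodes,
            key=lambda n: len(q_terms & tokenize(self.nodes[n])),
            default=None,
        )
        if best is None:
            return None
        return self.nodes[best], sorted(self.nodes[n] for n in self.edges[best])


graph = ToyKnowledgeGraph()
graph.ingest(
    "Machine learning is a subset of artificial intelligence. "
    "Deep learning is a subset of machine learning. "
    "Paris is the capital of France."
)
answer, context = graph.query("What is deep learning?")
print(answer)
print(context)
```

The query returns the matching sentence together with its linked neighbour, which is the "relevant information with context" idea in miniature.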

### Key Features

- 🧠 **Intelligent Knowledge Extraction**: Uses Google Gemini API to extract structured knowledge from raw text
- 🕸️ **Automatic Relationship Discovery**: Detects and creates relationships between knowledge entities
- 🔍 **Semantic Search**: Vector-based similarity search for contextual information retrieval
- 🔐 **Basic Security Features**: Authentication, RBAC, encryption, and audit logging (educational purposes)
- 🛡️ **Privacy Controls**: Fine-grained knowledge privacy levels and access control
- 📊 **Quality Enhancement**: Automated quality assessment and contradiction resolution
- 📚 **Version Control**: Complete change tracking and rollback capabilities
- 🔗 **Flexible Integration**: MCP (Module Communication Protocol) interface for external systems
- 🤖 **Agent Support**: Google ADK integration for conversational knowledge interactions
- ⚡ **Real-time Processing**: Concurrent processing of knowledge ingestion and retrieval
- 📈 **Monitoring**: Performance monitoring, health checks, and observability
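
To make the vector-based similarity search feature concrete, here is a minimal sketch of ranking by cosine similarity. It is an illustration under stated assumptions, not the project's implementation: `embed` is a toy bag-of-words vector standing in for a real embedding model, and the list scan stands in for a Milvus index.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': a sparse bag-of-words count vector (stand-in for a model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, documents, top_k=2):
    """Rank documents by similarity to the query, highest score first."""
    q = embed(query)
    scored = [(cosine_similarity(q, embed(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


docs = [
    "JanusGraph stores the knowledge graph",
    "Milvus stores vector embeddings for similarity search",
    "FastAPI serves the REST API",
]
results = search("vector similarity search", docs)
for score, doc in results:
    print(f"{score:.2f}  {doc}")
```

With learned embeddings the same ranking step would also match paraphrases that share no words with the query, which a bag-of-words vector cannot do.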

## 🚀 Quick Start

### Prerequisites

- Python 3.8+
- Docker & Docker Compose
- Google Gemini API key ([Get one here](https://makersuite.google.com/app/apikey))

### 1. Installation

```bash
# Clone the repository
git clone https://github.com/Celebr4tion/memory-engine.git
cd memory-engine

# Run automated setup
./scripts/setup.sh
```

The setup script will:
- Check Python version compatibility
- Create virtual environment
- Install dependencies
- Create configuration template
- Set up development tools

### 2. Environment Setup

```bash
# Edit the .env file created by setup
# Set your Gemini API key
GEMINI_API_KEY="your-gemini-api-key"

# Optional: Set environment (defaults to development)
ENVIRONMENT="development"
```
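
If you want to fail fast on a missing key in your own scripts, a small check like the following may help. This is a sketch, not part of Memory Engine: `load_settings` is a hypothetical helper, and it reads variables that are already in the process environment (it does not parse the `.env` file itself).

```python
import os


def load_settings(env=None):
    """Read the two settings shown above from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    api_key = env.get("GEMINI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("GEMINI_API_KEY is not set; add it to your .env file")
    return {
        "gemini_api_key": api_key,
        # Mirrors the documented default of "development".
        "environment": env.get("ENVIRONMENT", "development"),
    }


# Passing a dict instead of os.environ keeps the example self-contained.
settings = load_settings({"GEMINI_API_KEY": "your-gemini-api-key"})
print(settings["environment"])
```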

### 3. Start Infrastructure

```bash
# Start JanusGraph and Milvus
cd docker
docker-compose up -d

# Wait for services to initialize (2-3 minutes)
docker-compose logs -f
```
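
Instead of watching the logs, you can poll until both services accept TCP connections. The sketch below assumes the usual default ports (8182 for JanusGraph's Gremlin Server, 19530 for Milvus); check the compose file in `docker/` if your mapping differs. `wait_for_port` and `check_services` are hypothetical helpers, not part of the project.

```python
import socket
import time

# Assumed default ports; adjust to match your docker-compose port mappings.
DEFAULT_SERVICES = {
    "JanusGraph": ("localhost", 8182),
    "Milvus": ("localhost", 19530),
}


def wait_for_port(host, port, timeout=180.0, interval=2.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


def check_services(services=DEFAULT_SERVICES, timeout=180.0, interval=2.0):
    """Wait for each service in turn and report which ones became reachable."""
    return {
        name: wait_for_port(host, port, timeout, interval)
        for name, (host, port) in services.items()
    }
```

Calling `check_services()` returns a dict such as `{"JanusGraph": True, "Milvus": False}`. An open port only shows that the process is listening; a service may still need a little longer before it can answer queries.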

### 4. Basic Usage

```python
from memory_core.core.knowledge_engine import KnowledgeEngine
from memory_core.model.knowledge_node import KnowledgeNode

# Initialize the system
engine = KnowledgeEngine()
engine.connect()

# Create knowledge from text
node = KnowledgeNode(
    content="Machine learning is a subset of artificial intelligence",
    source="AI Textbook",
    rating_truthfulness=0.9
)

# Save to knowledge graph
node_id = engine.save_node(node)
print(f"Created knowledge node: {node_id}")

# Retrieve and explore
retrieved = engine.get_node(node_id)
print(f"Content: {retrieved.content}")
```

## 📖 Documentation

| Document | Description |
|----------|-------------|
| [📋 Setup Guide](docs/developer/setup_guide.md) | Complete installation and configuration instructions |
| [⚙️ Configuration](docs/user/configuration.md) | Basic configuration and environment setup |
| [🔧 Advanced Configuration](docs/developer/configuration_system.md) | Advanced configuration system |
| [🏗️ Architecture](docs/developer/architecture.md) | System architecture and component interactions |
| [🏗️ Project Structure](ARCHITECTURE.md) | Detailed project organization and structure |
| [📡 API Reference](docs/api/api_reference.md) | Complete API documentation including MCP interface |
| [🔐 Security Framework](docs/security/README.md) | Authentication, RBAC, encryption, and privacy controls |
| [🔧 Troubleshooting](docs/user/troubleshooting.md) | Common issues and solutions |

## 💻 Examples

Explore practical examples in the [`examples/`](examples/) directory:

- [**Basic Usage**](examples/basic_usage.py): Core operations and workflows
- [**Knowledge Extraction**](examples/knowledge_extraction.py): Text processing and knowledge extraction
- [**MCP Integration**](examples/mcp_client_example.py): Using the Module Communication Protocol
- [**Security Framework**](examples/security_example.py): Authentication, RBAC, encryption, and privacy controls
- [**Advanced Queries**](examples/advanced_query_example.py): Complex querying and analytics
- [**Knowledge Synthesis**](examples/synthesis_example.py): Question answering and insight discovery

### Run Examples

```bash
# Ensure infrastructure is running
cd docker && docker-compose up -d

# Run basic usage example
python examples/basic_usage.py

# Run knowledge extraction demo
python examples/knowledge_extraction.py

# Test MCP interface
python examples/mcp_client_example.py

# Try configuration system
python examples/config_example.py
```

## 🧪 Testing

Memory Engine's test suite is organized by type:

```bash
# Run all tests
./scripts/test.sh all

# Run only unit tests (fast, no external dependencies)
./scripts/test.sh unit

# Run integration tests (requires JanusGraph and Milvus)
./scripts/test.sh integration

# Run tests with coverage report
./scripts/test.sh coverage

# Run specific test file
./scripts/test.sh --file config_manager
```

Test organization:
- **Unit Tests** (`tests/unit/`): Fast, isolated tests
- **Integration Tests** (`tests/integration/`): Tests requiring external services
- **Component Tests** (`tests/`): End-to-end component testing

## 🏗️ Architecture

Memory Engine uses a layered architecture:

```
┌──────────────────────────────────────────────────────────────────┐
│                        Application Layer                         │
├─────────────────┬─────────────────┬─────────────────┬────────────┤
│ Python API      │ MCP Interface   │ Knowledge Agent │ REST API   │
├─────────────────┴─────────────────┴─────────────────┴────────────┤
│                      Knowledge Engine Core                       │
├─────────────────┬─────────────────┬─────────────────┬────────────┤
│ Knowledge       │ Relationship    │ Versioning      │ Rating     │
│ Processing      │ Extraction      │ Manager         │ System     │
├─────────────────┼─────────────────┼─────────────────┼────────────┤
│ Graph Store     │ Vector Store    │ Embedding       │ LLM API    │
│ (JanusGraph)    │ (Milvus)        │ Manager         │ (Gemini)   │
└─────────────────┴─────────────────┴─────────────────┴────────────┘
```

### Core Components

- **Modular Graph Storage**: Multiple backend options (JanusGraph, SQLite, JSON file)
- **Vector Database (Milvus)**: Enables semantic similarity search
- **Embedding System**: Generates and manages vector representations
- **Processing Pipeline**: Extracts and structures knowledge from text
- **Versioning System**: Tracks changes and enables rollbacks
- **MCP Interface**: Standardized API for external integration
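
As an illustration of the change tracking and rollback idea behind the versioning component, here is a minimal append-only history. It is a sketch under assumptions: `VersionedRecord` and its methods are hypothetical names, and the project's own versioning system may be designed quite differently.

```python
import copy


class VersionedRecord:
    """Keeps every saved state of one record so earlier versions can be restored."""

    def __init__(self, initial):
        self._history = [copy.deepcopy(initial)]

    @property
    def current(self):
        """Return a copy of the latest state."""
        return copy.deepcopy(self._history[-1])

    @property
    def version(self):
        """Version numbers start at 1 and grow with every change."""
        return len(self._history)

    def update(self, **changes):
        """Append a new version with the given fields changed."""
        new_state = {**self._history[-1], **changes}
        self._history.append(copy.deepcopy(new_state))
        return self.version

    def rollback(self, version):
        """Restore an earlier version by appending a copy of it (history is kept)."""
        if not 1 <= version <= len(self._history):
            raise ValueError(f"unknown version {version}")
        self._history.append(copy.deepcopy(self._history[version - 1]))
        return self.version


record = VersionedRecord({"content": "ML is a subset of AI", "rating_truthfulness": 0.9})
record.update(rating_truthfulness=0.4)  # becomes version 2
record.rollback(1)                      # becomes version 3, same state as version 1
print(record.version, record.current["rating_truthfulness"])
```

Recording a rollback as a new version, rather than truncating the history, keeps the audit trail intact.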

### Storage Backend Options

Choose the storage backend that fits your deployment needs:

- **🏒 JanusGraph**: Production-grade distributed graph database
- **πŸ’Ύ SQLite**: Single-user deployments with SQL capabilities
- **πŸ“„ JSON File**: Development and testing with human-readable storage
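
One common way to make backends interchangeable is a shared interface plus a small factory. The sketch below shows that pattern with an in-memory store and a JSON-file store. All names (`GraphStore`, `JsonFileStore`, `InMemoryStore`, `make_store`) are hypothetical and are not taken from the Memory Engine codebase.

```python
import json
import tempfile
import uuid
from abc import ABC, abstractmethod
from pathlib import Path


class GraphStore(ABC):
    """Minimal interface that every backend in this sketch implements."""

    @abstractmethod
    def save_node(self, data):
        """Persist a node dict and return its new id."""

    @abstractmethod
    def get_node(self, node_id):
        """Return the node dict stored under node_id."""


class InMemoryStore(GraphStore):
    """Keeps nodes in a dict; contents are lost when the process exits."""

    def __init__(self):
        self._nodes = {}

    def save_node(self, data):
        node_id = uuid.uuid4().hex
        self._nodes[node_id] = dict(data)
        return node_id

    def get_node(self, node_id):
        return self._nodes[node_id]


class JsonFileStore(GraphStore):
    """Keeps all nodes in one human-readable JSON file."""

    def __init__(self, path):
        self.path = Path(path)

    def _load(self):
        if self.path.exists():
            return json.loads(self.path.read_text(encoding="utf-8"))
        return {}

    def save_node(self, data):
        nodes = self._load()
        node_id = uuid.uuid4().hex
        nodes[node_id] = data
        self.path.write_text(json.dumps(nodes, indent=2), encoding="utf-8")
        return node_id

    def get_node(self, node_id):
        return self._load()[node_id]


def make_store(backend, **options):
    """Pick a backend by name so calling code does not depend on a concrete class."""
    if backend == "memory":
        return InMemoryStore()
    if backend == "json":
        return JsonFileStore(options["path"])
    raise ValueError(f"unknown backend: {backend}")


with tempfile.TemporaryDirectory() as tmp:
    store = make_store("json", path=Path(tmp) / "graph.json")
    node_id = store.save_node({"content": "Machine learning", "source": "AI Textbook"})
    print(store.get_node(node_id)["source"])
```

Rewriting the whole file on every save is fine for development-sized data but would not scale, which matches the stated role of the JSON backend.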

## 🔧 Technology Stack

| Component | Technology | Purpose |
|-----------|------------|---------|
| Graph Storage | JanusGraph / SQLite / JSON | Knowledge relationships |
| Vector Database | Milvus 2.2.11 | Similarity search |
| LLM API | Google Gemini | Knowledge extraction & embeddings |
| Agent Framework | Google ADK | Conversational interfaces |
| Web Framework | FastAPI | REST API endpoints |
| Language | Python 3.8+ | Core implementation |

## 🧪 Development

### Running Tests

```bash
# Unit tests only
pytest tests/ -k "not integration" -v

# All tests (requires infrastructure)
pytest tests/ -v

# With coverage
pytest tests/ --cov=memory_core --cov-report=html
```

### Development Setup

```bash
# Install development dependencies
pip install pytest pytest-cov black isort mypy

# Format code
black memory_core/ tests/
isort memory_core/ tests/

# Type checking
mypy memory_core/

# Pre-commit hooks
pip install pre-commit
pre-commit install
```

## 📊 Performance

Performance characteristics will vary depending on your hardware, data complexity, and configuration. We recommend testing with your specific use case and data to establish realistic benchmarks.
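
A small timing harness is often enough to get first numbers for your own workload. The sketch below is generic Python and makes no claim about Memory Engine's actual performance: `benchmark` is a hypothetical helper, and `fake_query` is a placeholder you would replace with a real call such as `engine.get_node(node_id)`.

```python
import statistics
import time


def benchmark(func, *args, repeats=50, warmup=5):
    """Time func(*args) and return latency statistics in milliseconds."""
    for _ in range(warmup):
        func(*args)  # warm caches and connections before measuring
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    # Approximate 95th percentile: the sample at 95% of the sorted list.
    p95_index = min(len(samples) - 1, int(0.95 * len(samples)))
    return {
        "runs": repeats,
        "mean_ms": statistics.fmean(samples),
        "median_ms": statistics.median(samples),
        "p95_ms": samples[p95_index],
        "max_ms": samples[-1],
    }


def fake_query(n):
    """Placeholder workload standing in for a real engine call."""
    return sum(i * i for i in range(n))


stats = benchmark(fake_query, 10_000)
print(f"median {stats['median_ms']:.3f} ms, p95 {stats['p95_ms']:.3f} ms")
```

Reporting the median and a high percentile alongside the mean makes occasional slow calls visible instead of averaging them away.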

## 🀝 Contributing

We welcome contributions! Please see our contributing guidelines:

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Make your changes with tests
4. Ensure all tests pass (`pytest`)
5. Format code (`black . && isort .`)
6. Commit changes (`git commit -m 'Add amazing feature'`)
7. Push to branch (`git push origin feature/amazing-feature`)
8. Open a Pull Request

### Development Standards

- **Code Quality**: All code must pass linting and type checking
- **Testing**: Maintain >90% test coverage
- **Documentation**: Update docs for any API changes
- **Performance**: Benchmark performance-critical changes

## 📝 License

This project is licensed under the [Hippocratic License 3.0](LICENSE.md) - an ethical source license that promotes responsible use of software while protecting human rights and environmental sustainability.

## 🆘 Support

### Getting Help

- 📖 **Documentation**: Check the [`docs/`](docs/) directory
- 🐛 **Issues**: Report bugs or request features via [GitHub Issues](https://github.com/Celebr4tion/memory-engine/issues)
- 💬 **Discussions**: Join conversations in [GitHub Discussions](https://github.com/Celebr4tion/memory-engine/discussions)
- 🔧 **Troubleshooting**: See the [troubleshooting guide](docs/user/troubleshooting.md)

### Community

- **Contributing**: See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines
- **Code of Conduct**: Please read our [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md)
- **Security**: Report security issues via [SECURITY.md](SECURITY.md)

### Status

- ⚠️ **Development Status**: Alpha version - breaking changes expected
- 📝 **Documentation**: Basic setup and usage guides available
- 🧪 **Testing**: Core functionality tested, expanding coverage
- 🔧 **Stability**: Experimental - not recommended for production use yet

---

**Memory Engine** - *Transforming information into intelligence* 🧠✨