# Local O1 Reasoning System (LORS)

## Abstract
The Local O1 Reasoning System (LORS) is an advanced distributed reasoning framework that implements a novel approach to prompt analysis and response generation using local Large Language Models (LLMs). Inspired by OpenAI's o1 architecture, LORS utilizes a multi-agent system with dynamic scaling capabilities to process complex queries through parallel processing pipelines of varying computational depths.

## System Architecture

### Core Components

```
LORS Architecture
├── Prompt Analysis Engine
│   ├── Complexity Analyzer
│   ├── Domain Classifier
│   └── Cognitive Load Estimator
├── Agent Management System
│   ├── Fast Reasoning Agents (llama3.2)
│   └── Deep Reasoning Agents (llama3.1)
├── Response Synthesis Pipeline
│   ├── Thought Aggregator
│   ├── Context Enhancer
│   └── Final Synthesizer
└── Response Management System
    ├── Intelligent Naming
    └── Structured Storage
```

### Technical Specifications

#### 1. Prompt Analysis Engine
The system employs a sophisticated prompt analysis mechanism that evaluates:

- **Linguistic Complexity Metrics** (see the extraction sketch after this list)
  - Sentence structure depth (dependency parsing)
  - Technical term density
  - Named entity recognition
  - Cognitive load estimation

- **Domain-Specific Analysis**
```python
domain_complexity = {
    'technical': ['algorithm', 'system', 'framework'],
    'scientific': ['hypothesis', 'analysis', 'theory'],
    'mathematical': ['equation', 'formula', 'calculation'],
    'business': ['strategy', 'market', 'optimization'],
}
```

- **Complexity Scoring Algorithm**
```mathematics
C = Σᵢ (wᵢ · fᵢ)

where:
    C  = total complexity score
    wᵢ = weight of feature i
    fᵢ = normalized value of feature i
```
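
For illustration, most of these metrics can be computed with the libraries listed in the prerequisites (spaCy for parsing and entities, TextBlob for subjectivity). The helper below is a hypothetical sketch of such an extractor, not the project's actual implementation; `extract_features` and `domain_terms` are invented names:

```python
import spacy
from textblob import TextBlob

nlp = spacy.load('en_core_web_sm')

def extract_features(prompt, domain_terms):
    """Sketch: compute the directly measurable features listed above.
    (domain_complexity and cognitive_complexity would come from the
    domain table and a separate estimator, and are omitted here.)"""
    doc = nlp(prompt)
    sentences = list(doc.sents)

    def depth(token):
        # Distance from a token to the root of its dependency tree
        d = 0
        while token.head is not token:
            token = token.head
            d += 1
        return d

    return {
        'sentence_count': len(sentences),
        'avg_sentence_length': sum(len(s) for s in sentences) / max(len(sentences), 1),
        'subjectivity': TextBlob(prompt).sentiment.subjectivity,
        'named_entities': len(doc.ents),
        'technical_term_count': sum(tok.text.lower() in domain_terms for tok in doc),
        'dependency_depth': max((depth(tok) for tok in doc), default=0),
    }

# Example: flatten the domain_complexity table above into one term set
# terms = {t for ts in domain_complexity.values() for t in ts}
# features = extract_features("Design a distributed caching system", terms)
```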

#### 2. Dynamic Agent Scaling

The system implements an adaptive scaling mechanism based on prompt complexity:

| Complexity Score | Fast Agents | Deep Agents | Use Case |
|-----------------|-------------|-------------|-----------|
| 80-100 | 5 | 3 | Complex technical analysis |
| 60-79 | 4 | 2 | Moderate complexity |
| 40-59 | 3 | 2 | Standard analysis |
| 0-39 | 2 | 1 | Simple queries |
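
A minimal sketch of this tier selection, with thresholds taken directly from the table (the function name is illustrative):

```python
def scale_agents(complexity_score):
    """Map a 0-100 complexity score to (fast_agents, deep_agents)."""
    if complexity_score >= 80:
        return 5, 3   # complex technical analysis
    elif complexity_score >= 60:
        return 4, 2   # moderate complexity
    elif complexity_score >= 40:
        return 3, 2   # standard analysis
    return 2, 1       # simple queries
```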

#### 3. Agent Types and Characteristics

**Fast Reasoning Agents (llama3.2)**
- Optimized for rapid initial analysis
- Lower token limit for quicker processing
- Focus on key concept identification
- Parameters:
```python
{
    'temperature': 0.7,
    'max_tokens': 150,
    'response_time_target': '< 2s'
}
```

**Deep Reasoning Agents (llama3.1)**
- Designed for thorough analysis
- Higher token limit for comprehensive responses
- Focus on relationships and implications
- Parameters:
```python
{
    'temperature': 0.9,
    'max_tokens': 500,
    'response_time_target': '< 5s'
}
```
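
For reference, a single agent invocation through the `ollama` Python client might look like the sketch below. The `run_agent` wrapper is hypothetical, and note that Ollama exposes the token limit as the `num_predict` option rather than `max_tokens`:

```python
from ollama import AsyncClient

async def run_agent(model, prompt, temperature, max_tokens):
    """Hypothetical wrapper: one reasoning pass through a local Ollama model."""
    response = await AsyncClient().chat(
        model=model,                     # 'llama3.2' (fast) or 'llama3.1' (deep)
        messages=[{'role': 'user', 'content': prompt}],
        options={
            'temperature': temperature,  # 0.7 fast / 0.9 deep
            'num_predict': max_tokens,   # 150 fast / 500 deep
        },
    )
    return response['message']['content']
```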

## Implementation Details

### 1. Asynchronous Processing Pipeline
```python
async def process_prompt(prompt):
    # Score the prompt; the score drives how many agents are spawned
    complexity_analysis = analyze_prompt_complexity(prompt)
    # Fast agents produce quick initial thoughts in parallel
    fast_thoughts = await process_fast_agents(prompt)
    # Their output is merged into an enriched context...
    enhanced_context = synthesize_initial_thoughts(fast_thoughts)
    # ...which the deep agents then reason over
    deep_thoughts = await process_deep_agents(enhanced_context)
    # Both passes are combined into the final answer
    return synthesize_final_response(fast_thoughts, deep_thoughts)
```
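
The parallelism comes from launching every agent in a tier concurrently. Assuming the hypothetical `run_agent` wrapper sketched earlier, `process_fast_agents` could be as simple as:

```python
import asyncio

async def process_fast_agents(prompt, n_agents=3):
    """Run all fast agents concurrently and collect their initial thoughts."""
    tasks = [
        run_agent('llama3.2', prompt, temperature=0.7, max_tokens=150)
        for _ in range(n_agents)
    ]
    # Completes when the slowest agent finishes; results keep task order
    return await asyncio.gather(*tasks)
```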

### 2. Complexity Analysis Implementation
The system uses a weighted feature analysis approach:

```python
def calculate_complexity_score(features):
    weights = {
        'sentence_count': 0.1,
        'avg_sentence_length': 0.15,
        'subjectivity': 0.1,
        'named_entities': 0.15,
        'technical_term_count': 0.2,
        'domain_complexity': 0.1,
        'cognitive_complexity': 0.1,
        'dependency_depth': 0.1
    }
    return weighted_sum(features, weights)
```
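
`weighted_sum` itself is not shown; the straightforward reading of the formula C = Σᵢ (wᵢ · fᵢ) would be:

```python
def weighted_sum(features, weights):
    # C = Σ (w_i * f_i) over the weighted features;
    # assumes feature values are already normalized to [0, 1]
    return sum(weights[name] * features.get(name, 0.0) for name in weights)
```

With the weights above summing to 1.0 and features normalized to [0, 1], C lands in [0, 1]; multiplying by 100 would put it on the 0-100 scale used by the agent scaling table.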

### 3. Response Synthesis
The system implements a three-phase synthesis approach:
1. Fast Analysis Aggregation
2. Context Enhancement
3. Deep Analysis Integration
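
As a rough guess at the final phase, `synthesize_final_response` from the pipeline above might integrate the two passes along these lines (the output template is invented for illustration):

```python
def synthesize_final_response(fast_thoughts, deep_thoughts):
    """Sketch: integrate deep analyses, with the fast pass as a preamble."""
    key_concepts = ' '.join(fast_thoughts)   # phase-1 aggregation, condensed
    integrated = '\n\n'.join(deep_thoughts)  # phase-3 integration
    return f'Key concepts: {key_concepts}\n\n{integrated}'
```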

## Performance Characteristics

### Benchmarks
- Average response time: 2-8 seconds
- Memory usage: 4-8 GB
- GPU utilization: 60-80%

## Response Storage
Responses are stored in JSON format:
```json
{
  "prompt": "original_prompt",
  "timestamp": "ISO-8601 timestamp",
  "complexity_analysis": {
    "score": 75.5,
    "features": {...}
  },
  "result": {
    "fast_analysis": [...],
    "deep_analysis": [...],
    "final_synthesis": "..."
  }
}
```
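
The "Intelligent Naming" and "Structured Storage" components from the architecture diagram plausibly reduce to something like the sketch below; the slug-plus-timestamp filename scheme is an assumption, not the project's documented behavior:

```python
import json
import re
from datetime import datetime, timezone
from pathlib import Path

def save_response(record, responses_dir='responses'):
    """Write a result record to responses/ with a descriptive, timestamped name."""
    Path(responses_dir).mkdir(exist_ok=True)
    # 'Intelligent naming': a slug of the prompt plus a UTC timestamp
    slug = re.sub(r'[^a-z0-9]+', '-', record['prompt'].lower()).strip('-')[:40]
    stamp = datetime.now(timezone.utc).strftime('%Y%m%dT%H%M%SZ')
    path = Path(responses_dir) / f'{stamp}_{slug}.json'
    path.write_text(json.dumps(record, indent=2))
    return path
```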

## Installation and Usage

### Prerequisites

1. **Install Ollama**
```bash
# For Linux
curl -L https://ollama.com/download/ollama-linux-amd64 -o ollama
chmod +x ollama
./ollama serve

# For Windows
# Download and install from https://ollama.com/download/windows
```

2. **Install Required Models**
```bash
# Install the fast reasoning model (3B Model - fast thought)
ollama pull llama3.2

# Install the deep reasoning model (8B Model - deep thought)
ollama pull llama3.1

# Verify installations
ollama list
```
Expected output:
```
NAME               ID              SIZE      MODIFIED
llama3.2:latest    6c2d00dcdb27    2.1 GB    4 seconds ago
llama3.1:latest    3c46ab11d5ec    4.9 GB    6 days ago
```

3. **Set Up Python Environment**
```bash
# Create virtual environment
python -m venv lors-env

# Activate environment
# On Windows
lors-env\Scripts\activate
# On Unix or MacOS
source lors-env/bin/activate

# Install requirements (ollama, rich, textblob, spacy, nltk)
pip install -r requirements.txt

# Install spaCy language model
python -m spacy download en_core_web_sm
```

### Basic Usage
```bash
# Simple query
python local-o1-reasoning.py -p "Explain the concept of quantum entanglement"

# Complex analysis
python local-o1-reasoning.py -p "Analyze the implications of quantum computing on modern cryptography systems and propose potential mitigation strategies"
```

### Troubleshooting

1. **Model Loading Issues**
```bash
# Verify model status
ollama list

# Restart the Ollama server if needed: stop the running
# `ollama serve` process (e.g. Ctrl+C or kill it), then relaunch
ollama serve
```

2. **GPU Memory Issues**
- Ensure no other GPU-intensive applications are running
- Monitor GPU usage:
```bash
nvidia-smi -l 1
```

3. **Common Error Solutions**
- If models fail to load: re-run `ollama pull [model_name]`
- If out of CUDA memory: Reduce concurrent agent count in configuration
- If response directory error: Check write permissions

### Directory Structure
```
LORS/
├── local-o1-reasoning.py
├── requirements.txt
├── responses/
│   └── [automated response files]
└── README.md
```

## License
MIT License

## Contributing
We welcome contributions! Please see our contributing guidelines for more information.