https://github.com/just-llms/justllms

Production-ready Python library for multi-provider LLM orchestration
https://github.com/just-llms/justllms
ai api-wrapper chat-completion conversation-management cost-optimization deepseek google-gemini grok intelligent-routing justllms langchain llm machine-learning multi-provider openai
Last synced: about 1 month ago
JSON representation
Production-ready Python library for multi-provider LLM orchestration
Host: GitHub
URL: https://github.com/just-llms/justllms
Owner: just-llms
License: mit
Created: 2025-08-08T04:18:48.000Z (2 months ago)
Default Branch: main
Last Pushed: 2025-08-24T09:16:02.000Z (about 1 month ago)
Last Synced: 2025-08-24T15:31:53.333Z (about 1 month ago)
Topics: ai, api-wrapper, chat-completion, conversation-management, cost-optimization, deepseek, google-gemini, grok, intelligent-routing, justllms, langchain, llm, machine-learning, multi-provider, openai
Language: Python
Homepage: https://www.just-llms.com/
Size: 358 KB
Stars: 14
Watchers: 1
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
Awesome Lists containing this project

README

          # JustLLMs

A production-ready Python library that simplifies working with multiple Large Language Model providers through intelligent routing, comprehensive analytics, and enterprise-grade features.

[![PyPI version](https://badge.fury.io/py/justllms.svg)](https://pypi.org/project/justllms/) [![Downloads](https://pepy.tech/badge/justllms)](https://pepy.tech/project/justllms)

## Why JustLLMs?

Managing multiple LLM providers is complex. You need to handle different APIs, optimize costs, monitor usage, and ensure reliability. JustLLMs solves these challenges by providing a unified interface that automatically routes requests to the best provider based on your criteria—whether that's cost, speed, or quality.

## Installation

```bash

# Basic installation

pip install justllms

# With PDF export capabilities

pip install "justllms[pdf]"

# All optional dependencies (PDF export, Redis caching, advanced analytics)

pip install "justllms[all]"

```

**Package size**: 1.1MB | **Lines of code**: ~11K | **Dependencies**: Minimal production requirements

## Quick Start

```python

from justllms import JustLLM

# Initialize with your API keys

client = JustLLM({

    "providers": {

        "openai": {"api_key": "your-openai-key"},

        "google": {"api_key": "your-google-key"},

        "anthropic": {"api_key": "your-anthropic-key"}

    }

})

# Simple completion - automatically routes to best provider

response = client.completion.create(

    messages=[{"role": "user", "content": "Explain quantum computing briefly"}]

)

print(response.content)

```

## Core Features

### Multi-Provider Support

Connect to all major LLM providers with a single, consistent interface:

- **OpenAI** (GPT-5, GPT-4, etc.) 

- **Google** (Gemini 2.5, Gemini 1.5 models)  

- **Anthropic** (Claude 3.5, Claude 3 models)

- **Azure OpenAI** (with deployment mapping)

- **xAI Grok**, **DeepSeek**, and more

```python

# Switch between providers seamlessly

client = JustLLM({

    "providers": {

        "openai": {"api_key": "your-key"},

        "google": {"api_key": "your-key"},

        "anthropic": {"api_key": "your-key"}

    }

})

# Same interface, different providers automatically chosen

response1 = client.completion.create(

    messages=[{"role": "user", "content": "Explain AI"}],

    provider="openai"  # Force specific provider

)

response2 = client.completion.create(

    messages=[{"role": "user", "content": "Explain AI"}]

    # Auto-routes to best provider based on your strategy

)

```

### Intelligent Routing

**The game-changing feature that sets JustLLMs apart.** Instead of manually choosing models, let our intelligent routing engine automatically select the optimal provider and model for each request based on your priorities.

#### Available Strategies

**🆕 Cluster-Based Routing** - *AI-Powered Query Analysis*

Our most advanced routing strategy uses machine learning to analyze query semantics and route to the optimal model based on similarity to training data. Achieves **+7% accuracy improvement** and **-27% cost reduction** compared to single-model approaches.

```python

# Cluster-based routing (recommended for production)

client = JustLLM({

    "providers": {...},

    "routing": {"strategy": "cluster"}

})

```

*Based on research from [Beyond GPT-5: Making LLMs Cheaper and Better via Performance–Efficiency Optimized Routing](https://arxiv.org/pdf/2508.12631) - AvengersPro framework*

**Traditional Routing Strategies**

```python

# Cost-optimized: Always picks the cheapest option

client = JustLLM({

    "providers": {...},

    "routing": {"strategy": "cost"}

})

# Speed-optimized: Prioritizes fastest response times

client = JustLLM({

    "providers": {...},

    "routing": {"strategy": "latency"}

})

# Quality-optimized: Uses the best models for complex tasks

client = JustLLM({

    "providers": {...},

    "routing": {"strategy": "quality"}

})

# Task-based: Automatically detects query type and routes accordingly

client = JustLLM({

    "providers": {...},

    "routing": {"strategy": "task"}

})

```

#### How Cluster Routing Works

1. **Query Analysis**: Your request is embedded using Qwen3-Embedding-0.6B

2. **Cluster Matching**: Finds the most similar cluster from pre-trained data

3. **Model Selection**: Routes to the best-performing model for that cluster

4. **Fallback**: Falls back to quality-based routing if needed

**Result**: Up to 60% cost reduction while improving accuracy, with automatic failover to backup providers.

### Real-time Streaming

Full streaming support with proper token handling across all providers:

```python

stream = client.completion.create(

    messages=[{"role": "user", "content": "Write a short story"}],

    stream=True

)

for chunk in stream:

    print(chunk.content, end="", flush=True)

```

### Conversation Management

Built-in conversation state management with context preservation:

```python

# Create client

conversation = Conversation(client=client)

# Set system message

conversation.add_system_message("You are a helpful math tutor. Keep answers concise.")

# Turn 1

response = conversation.send("What is 15 + 25?")

# Turn 2 - Context is automatically preserved

response = conversation.send("Now divide that by 8")

# Get conversation stats

history = conversation.get_history()

```

**Conversation Features:**

- **Auto-save**: Persist conversations automatically

- **Context management**: Smart context window handling

- **Export/Import**: JSON, Markdown, and TXT formats

- **Analytics**: Track usage, costs, and performance per conversation

- **Search**: Find conversations by content or metadata

### Smart Caching

Intelligent response caching that dramatically reduces costs and improves response times:

```python

client = JustLLM({

    "providers": {...},

    "caching": {

        "enabled": True,

        "ttl": 3600,  # 1 hour

        "max_size": 1000

    }

})

# First call - cache miss

response1 = client.completion.create(

    messages=[{"role": "user", "content": "What is AI?"}]

)  # ~2 seconds, full cost

# Second call - cache hit

response2 = client.completion.create(

    messages=[{"role": "user", "content": "What is AI?"}]

)  # ~50ms, no cost

```

### Enterprise Analytics

**Comprehensive usage tracking and cost analysis** that gives you complete visibility into your LLM operations. Unlike other solutions that require external tools, JustLLMs provides built-in analytics that finance and engineering teams actually need.

#### What You Get

- **Cross-provider metrics**: Compare performance across providers

- **Cost tracking**: Detailed cost analysis per model/provider

- **Performance insights**: Latency, throughput, success rates

- **Export capabilities**: CSV, PDF with charts

- **Time series analysis**: Usage patterns over time

- **Top models/providers**: Usage and cost rankings

```python

# Generate detailed reports

report = client.analytics.generate_report()

print(f"Total requests: {report.cross_provider_metrics.total_requests}")

print(f"Total cost: ${report.cross_provider_metrics.total_cost:.2f}")

print(f"Fastest provider: {report.cross_provider_metrics.fastest_provider}")

print(f"Cost per request: ${report.cross_provider_metrics.avg_cost_per_request:.4f}")

# Get granular insights

print(f"Cache hit rate: {report.performance_metrics.cache_hit_rate:.1f}%")

print(f"Token efficiency: {report.optimization_suggestions.token_savings:.1f}%")

# Export reports for finance teams

from justllms.analytics.reports import CSVExporter, PDFExporter

csv_exporter = CSVExporter()

csv_exporter.export(report, "monthly_llm_costs.csv")

pdf_exporter = PDFExporter(include_charts=True)

pdf_exporter.export(report, "executive_summary.pdf")

```

**Business Impact**: Teams typically save 40-70% on LLM costs within the first month by identifying usage patterns and optimizing model selection.

### Unified LLM Interface

**Streamlined access to multiple LLM providers** with intelligent routing, comprehensive analytics, and enterprise-grade features for production deployments.

### Business Rule Validation

**Enterprise-grade content filtering and compliance** built for regulated industries. Ensure your LLM applications meet security, privacy, and business requirements without custom development.

#### Compliance Features

- **PII Detection** - Automatically detect and handle social security numbers, credit cards, phone numbers

- **Content Filtering** - Block inappropriate content, profanity, or sensitive topics

- **Custom Business Rules** - Define your own validation logic with regex patterns or custom functions

- **Audit Trail** - Complete logging of all validation actions for compliance reporting

```python

from justllms.validation import ValidationConfig, BusinessRule, RuleType, ValidationAction

client = JustLLM({

    "providers": {...},

    "validation": ValidationConfig(

        enabled=True,

        business_rules=[

            # Block sensitive data patterns

            BusinessRule(

                name="no_ssn",

                type=RuleType.PATTERNS,

                pattern=r"\\b\\d{3}-\\d{2}-\\d{4}\\b",

                action=ValidationAction.BLOCK,

                message="SSN detected - request blocked for privacy"

            ),

            # Content filtering

            BusinessRule(

                name="professional_content",

                type=RuleType.CONTENT_FILTER,

                categories=["hate", "violence", "adult"],

                action=ValidationAction.SANITIZE

            ),

            # Custom business logic

            BusinessRule(

                name="company_policy",

                type=RuleType.CUSTOM,

                validator=lambda content: "competitor" not in content.lower(),

                action=ValidationAction.WARN

            )

        ],

        # Compliance presets

        compliance_mode="GDPR",  # or "HIPAA", "PCI_DSS"

        audit_logging=True

    )

})

# All requests are automatically validated

response = client.completion.create(

    messages=[{"role": "user", "content": "My SSN is 123-45-6789"}]

)

# This request would be blocked and logged for compliance

```

**Regulatory Compliance**: Built-in support for major compliance frameworks saves months of custom security development.

## Advanced Usage

### Async Operations

Full async/await support for high-performance applications:

```python

import asyncio

async def process_batch():

    tasks = []

    for prompt in prompts:

        task = client.completion.acreate(

            messages=[{"role": "user", "content": prompt}]

        )

        tasks.append(task)

    

    responses = await asyncio.gather(*tasks)

    return responses

```

### Error Handling & Reliability

Automatic retries and fallback providers ensure high availability:

```python

client = JustLLM({

    "providers": {...},

    "retry": {

        "max_attempts": 3,

        "backoff_factor": 2,

        "retry_on": ["timeout", "rate_limit", "server_error"]

    }

})

# Automatically retries on failures

try:

    response = client.completion.create(

        messages=[{"role": "user", "content": "Hello"}],

        provider="invalid-provider"  # Will fail and retry

    )

except Exception as e:

    print(f"All retries failed: {e}")

```

### Configuration Management

Flexible configuration with environment variable support:

```python

# Environment-based config

import os

client = JustLLM({

    "providers": {

        "openai": {"api_key": os.getenv("OPENAI_API_KEY")},

        "azure_openai": {

            "api_key": os.getenv("AZURE_OPENAI_KEY"),

            "resource_name": os.getenv("AZURE_RESOURCE_NAME"),

            "api_version": "2024-12-01-preview"

        }

    }

})

# File-based config

import yaml

with open("config.yaml") as f:

    config = yaml.safe_load(f)

client = JustLLM(config)

```

## 🏆 Comparison with Alternatives

| Feature | JustLLMs | LangChain | LiteLLM | OpenAI SDK | Haystack |

|---------|----------|-----------|---------|------------|----------|

| **Package Size** | 1.1MB | ~50MB | ~5MB | ~1MB | ~20MB |

| **Setup Complexity** | Simple config | Complex chains | Medium | Simple | Complex |

| **Multi-Provider** | ✅ 6+ providers | ✅ Many integrations | ✅ 100+ providers | ❌ OpenAI only | ✅ Limited LLMs |

| **Intelligent Routing** | ✅ Cost/speed/quality | ❌ Manual only | ⚠️ Basic routing | ❌ None | ❌ Pipeline-based |

| **Built-in Analytics** | ✅ Enterprise-grade | ❌ External tools needed | ⚠️ Basic metrics | ❌ None | ⚠️ Pipeline metrics |

| **Conversation Management** | ✅ Full lifecycle | ⚠️ Memory components | ❌ None | ❌ Manual handling | ✅ Dialog systems |

| **Business Rules** | ✅ Content validation | ❌ Custom implementation | ❌ None | ❌ None | ⚠️ Custom filters |

| **Cost Optimization** | ✅ Automatic routing | ❌ Manual optimization | ⚠️ Basic cost tracking | ❌ None | ❌ None |

| **Streaming Support** | ✅ All providers | ✅ Provider-dependent | ✅ Most providers | ✅ OpenAI only | ⚠️ Limited |

| **Production Ready** | ✅ Out of the box | ⚠️ Requires setup | ✅ Minimal setup | ⚠️ Basic features | ✅ Complex setup |

| **Caching** | ✅ Multi-backend | ⚠️ Custom implementation | ✅ Basic caching | ❌ None | ✅ Document stores |

## Enterprise Configuration

For production deployments with advanced features:

```python

enterprise_config = {

    "providers": {

        "azure_openai": {

            "api_key": os.getenv("AZURE_OPENAI_KEY"),

            "resource_name": "my-enterprise-resource",

            "deployment_mapping": {

                "gpt-4": "my-gpt4-deployment",

                "gpt-3.5-turbo": "my-gpt35-deployment"

            }

        },

        "anthropic": {"api_key": os.getenv("ANTHROPIC_KEY")},

        "google": {"api_key": os.getenv("GOOGLE_KEY")}

    },

    "routing": {

        "strategy": "cost",

        "fallback_provider": "azure_openai",

        "fallback_model": "gpt-3.5-turbo"

    },

    "validation": {

        "enabled": True,

        "business_rules": [

            # PII detection, content filtering, compliance rules

        ]

    },

    "analytics": {

        "enabled": True,

        "track_usage": True,

        "track_performance": True

    },

    "caching": {

        "enabled": True,

        "backend": "redis",

        "ttl": 3600

    },

    "conversations": {

        "backend": "disk",

        "auto_save": True,

        "auto_title": True,

        "max_context_tokens": 8000

    }

}

client = JustLLM(enterprise_config)

```

## License

MIT License - see [LICENSE](LICENSE) file for details.

[![Star History Chart](https://api.star-history.com/svg?repos=just-llms/justllms&type=Date)](https://www.star-history.com/#just-llms/justllms&Date)
ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/just-llms/justllms

Awesome Lists containing this project

README