https://github.com/contextlab/orchestrator
A convenient wrapper for LangGraph, MCP, model spec, and other AI agent control systems
- Host: GitHub
- URL: https://github.com/contextlab/orchestrator
- Owner: ContextLab
- License: MIT
- Created: 2025-07-10T14:49:35.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2025-07-18T16:37:05.000Z (8 months ago)
- Last Synced: 2025-07-18T19:47:22.599Z (8 months ago)
- Language: Python
- Size: 9.95 MB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 15
Metadata Files:
- Readme: README.md
- License: LICENSE
# Orchestrator Framework
[PyPI](https://pypi.org/project/py-orc/) [License](https://github.com/ContextLab/orchestrator/blob/main/LICENSE) [Tests](https://github.com/ContextLab/orchestrator/actions/workflows/tests.yml) [Coverage](https://github.com/ContextLab/orchestrator/actions/workflows/coverage.yml) [Documentation](https://orc.readthedocs.io/en/latest/?badge=latest)
## Overview
Orchestrator is a powerful, flexible AI pipeline orchestration framework that simplifies the creation and execution of complex AI workflows. By combining YAML-based configuration with intelligent model selection and automatic ambiguity resolution, Orchestrator makes it easy to build sophisticated AI applications without getting bogged down in implementation details.
### Key Features
- 🎯 **YAML-Based Pipelines**: Define complex workflows in simple, readable YAML with full template variable support
- 🤖 **Multi-Model Support**: Seamlessly work with OpenAI, Anthropic, Google, Ollama, and HuggingFace models
- 🧠 **Intelligent Model Selection**: Automatically choose the best model based on task requirements
- 🔄 **Automatic Ambiguity Resolution**: Use `<AUTO>` tags to let AI resolve configuration ambiguities
- 📦 **Modular Architecture**: Extend with custom models, tools, and control systems
- 🛡️ **Production Ready**: Built-in error handling, retries, checkpointing, and monitoring
- ⚡ **Parallel Execution**: Efficient resource management and parallel task execution
- 🐳 **Sandboxed Execution**: Secure code execution in isolated environments
- 💾 **Lazy Model Loading**: Models are downloaded only when needed, saving disk space
- 🔧 **Reliable Tool Execution**: Guaranteed execution of file operations with LangChain structured outputs
- 📝 **Advanced Templates**: Support for nested variables, filters, and Jinja2-style templates
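The `{{ ... }}` template variables used throughout the pipeline examples can be sketched with a tiny regex-based substituter. This is a stand-in for the real Jinja2-style engine; the `render` helper and the context layout are illustrative assumptions:

```python
import re

def render(template: str, context: dict) -> str:
    """Replace {{ dotted.path }} placeholders with values from a nested context."""
    def lookup(match):
        value = context
        for part in match.group(1).strip().split("."):
            value = value[part]
        return str(value)
    return re.sub(r"\{\{(.*?)\}\}", lookup, template)

# Resolve a prior step's result into the next step's prompt
context = {"greet": {"result": "Hello, wonderful world!"}}
print(render("Translate this greeting to Spanish: {{ greet.result }}", context))
# Translate this greeting to Spanish: Hello, wonderful world!
```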
## Quick Start
### Installation
```bash
pip install py-orc
```
For additional features:
```bash
pip install "py-orc[ollama]"  # Ollama model support
pip install "py-orc[cloud]"   # Cloud model providers
pip install "py-orc[dev]"     # Development tools
pip install "py-orc[all]"     # Everything
```
### API Key Configuration
Orchestrator supports multiple AI providers. Configure your API keys using the interactive setup:
```bash
# Interactive setup for all providers
orchestrator keys setup
# Or add individual keys
orchestrator keys add openai
orchestrator keys add anthropic
orchestrator keys add google
orchestrator keys add huggingface
# Check configured providers
orchestrator keys list
# Validate your configuration
orchestrator keys validate
```
API keys are stored securely in `~/.orchestrator/.env` with file permissions set to 600 (owner read/write only).
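The 600-permission behavior is easy to reproduce if you ever manage such a file yourself. A minimal sketch in plain Python; the `save_key` helper is hypothetical and not part of the orchestrator API:

```python
import os
import stat
from pathlib import Path

def save_key(env_path: Path, name: str, value: str) -> None:
    """Append a KEY=value line and restrict the file to owner read/write (0o600)."""
    env_path.parent.mkdir(parents=True, exist_ok=True)
    with open(env_path, "a") as f:
        f.write(f"{name}={value}\n")
    os.chmod(env_path, stat.S_IRUSR | stat.S_IWUSR)  # 0o600, owner read/write only

# The path orchestrator uses for its own key store
env_file = Path.home() / ".orchestrator" / ".env"
```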
#### Required Environment Variables
If you prefer to set environment variables manually:
- `OPENAI_API_KEY` - OpenAI API key (for GPT models)
- `ANTHROPIC_API_KEY` - Anthropic API key (for Claude models)
- `GOOGLE_AI_API_KEY` - Google AI API key (for Gemini models)
- `HF_TOKEN` - Hugging Face token (for HuggingFace models)
**Note**: Ollama models run locally and don't require API keys. They will be downloaded automatically on first use.
### Basic Usage
1. **Create a simple pipeline** (`hello_world.yaml`):
```yaml
id: hello_world
name: Hello World Pipeline
description: A simple example pipeline

steps:
  - id: greet
    action: generate_text
    parameters:
      prompt: "Say hello to the world in a creative way!"

  - id: translate
    action: generate_text
    parameters:
      prompt: "Translate this greeting to Spanish: {{ greet.result }}"
    dependencies: [greet]

outputs:
  greeting: "{{ greet.result }}"
  spanish: "{{ translate.result }}"
```
2. **Run the pipeline**:
```bash
# Using the CLI script
python scripts/run_pipeline.py hello_world.yaml

# With inputs
python scripts/run_pipeline.py hello_world.yaml -i name=World -i language=Spanish

# From a JSON file
python scripts/run_pipeline.py hello_world.yaml -f inputs.json -o output_dir/
```

Or programmatically:

```python
import orchestrator as orc

# Initialize models (auto-detects available models)
orc.init_models()

# Compile and run the pipeline
pipeline = orc.compile("hello_world.yaml")
result = pipeline.run()
print(result)
```
### Using AUTO Tags
Orchestrator's `<AUTO>` tags let AI decide configuration details:
```yaml
steps:
  - id: analyze_data
    action: analyze
    parameters:
      data: "{{ input_data }}"
      method: <AUTO>Choose the best analysis method for this data type</AUTO>
      visualization: <AUTO>Decide if we should create a chart</AUTO>
```
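Conceptually, an `<AUTO>` value is a placeholder that gets handed to a model for resolution before the step runs. A toy version of that resolution step; the `resolve_auto` helper and the `ask_model` callback are illustrative, not the framework's actual internals:

```python
import re

AUTO_RE = re.compile(r"<AUTO>(.*?)</AUTO>", re.DOTALL)

def resolve_auto(value, ask_model):
    """Replace <AUTO>...</AUTO> instructions in a parameter value with a model's answer."""
    if not isinstance(value, str):
        return value  # non-string parameters pass through untouched
    return AUTO_RE.sub(lambda m: ask_model(m.group(1)), value)

# A stub "model" that returns a fixed answer; a real one would call an LLM
answer = resolve_auto(
    "<AUTO>Choose the best analysis method for this data type</AUTO>",
    ask_model=lambda instruction: "exploratory-statistics",
)
print(answer)  # exploratory-statistics
```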
## Model Configuration
Configure available models in `models.yaml`:
```yaml
models:
  # Local models (via Ollama) - downloaded on first use
  - source: ollama
    name: llama3.1:8b
    expertise: [general, reasoning, multilingual]
    size: 8b

  - source: ollama
    name: qwen2.5-coder:7b
    expertise: [code, programming]
    size: 7b

  # Cloud models
  - source: openai
    name: gpt-4o
    expertise: [general, reasoning, code, analysis, vision]
    size: 1760b  # Estimated

defaults:
  expertise_preferences:
    code: qwen2.5-coder:7b
    reasoning: deepseek-r1:8b
    fast: llama3.2:1b
```
Models are downloaded only when first used, saving disk space and initialization time.
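Selection against a registry like this reduces to filtering on expertise and minimum size. A minimal sketch; the `pick_model` helper and the size parsing are assumptions about how such matching could work, not the library's code:

```python
def parse_size(size: str) -> float:
    """Convert a size string like '7b' or '1760b' to billions of parameters."""
    return float(size.rstrip("bB"))

def pick_model(models, expertise, min_size="0b"):
    """Return the smallest model covering the requested expertise, honoring min_size."""
    candidates = [
        m for m in models
        if expertise in m["expertise"] and parse_size(m["size"]) >= parse_size(min_size)
    ]
    return min(candidates, key=lambda m: parse_size(m["size"]), default=None)

registry = [
    {"source": "ollama", "name": "llama3.1:8b", "expertise": ["general", "reasoning"], "size": "8b"},
    {"source": "ollama", "name": "qwen2.5-coder:7b", "expertise": ["code"], "size": "7b"},
    {"source": "openai", "name": "gpt-4o", "expertise": ["general", "reasoning", "code"], "size": "1760b"},
]
print(pick_model(registry, "code")["name"])              # qwen2.5-coder:7b
print(pick_model(registry, "reasoning", "20b")["name"])  # gpt-4o
```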
## Advanced Example
Here's a more complex example showing model requirements and parallel execution:
```yaml
id: research_pipeline
name: AI Research Pipeline
description: Research a topic and create a comprehensive report

inputs:
  - name: topic
    type: string
    description: Research topic
  - name: depth
    type: string
    default: <AUTO>Determine appropriate research depth</AUTO>

steps:
  # Parallel research from multiple sources
  - id: web_search
    action: search_web
    parameters:
      query: "{{ topic }} latest research 2025"
      count: <AUTO>Decide how many results to fetch</AUTO>
    requires_model:
      expertise: [research, web]

  - id: academic_search
    action: search_academic
    parameters:
      query: "{{ topic }}"
      filters: <AUTO>Set appropriate academic filters</AUTO>
    requires_model:
      expertise: [research, academic]

  # Analyze findings with specialized model
  - id: analyze_findings
    action: analyze
    parameters:
      web_results: "{{ web_search.results }}"
      academic_results: "{{ academic_search.results }}"
      analysis_focus: <AUTO>Determine key aspects to analyze</AUTO>
    dependencies: [web_search, academic_search]
    requires_model:
      expertise: [analysis, reasoning]
      min_size: 20b  # Require large model for complex analysis

  # Generate report
  - id: write_report
    action: generate_document
    parameters:
      topic: "{{ topic }}"
      analysis: "{{ analyze_findings.result }}"
      style: <AUTO>Choose appropriate writing style</AUTO>
      length: <AUTO>Determine optimal report length</AUTO>
    dependencies: [analyze_findings]
    requires_model:
      expertise: [writing, general]

outputs:
  report: "{{ write_report.document }}"
  summary: "{{ analyze_findings.summary }}"
```
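The `dependencies` fields above induce the execution order: steps whose dependencies are all satisfied can run in parallel. That grouping into parallel "waves" can be sketched as follows (the `waves` helper is illustrative; the framework's actual scheduler also handles retries, checkpointing, and resource limits):

```python
def waves(deps):
    """Group step ids into waves; each wave depends only on earlier waves."""
    done, order = set(), []
    pending = dict(deps)
    while pending:
        ready = sorted(s for s, d in pending.items() if set(d) <= done)
        if not ready:
            raise ValueError("dependency cycle detected")
        order.append(ready)
        done.update(ready)
        for s in ready:
            del pending[s]
    return order

# The research pipeline's dependency graph
pipeline = {
    "web_search": [],
    "academic_search": [],
    "analyze_findings": ["web_search", "academic_search"],
    "write_report": ["analyze_findings"],
}
print(waves(pipeline))
# [['academic_search', 'web_search'], ['analyze_findings'], ['write_report']]
```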
## Complete Example: Research Report Generator
Here's a fully functional pipeline that generates research reports:
```yaml
# research_report.yaml
id: research_report
name: Research Report Generator
description: Generate comprehensive research reports with citations

inputs:
  - name: topic
    type: string
    description: Research topic
  - name: instructions
    type: string
    description: Additional instructions for the report

outputs:
  - pdf: <AUTO>Generate appropriate filename for the research report PDF</AUTO>

steps:
  - id: search
    name: Web Search
    action: search_web
    parameters:
      query: <AUTO>Create effective search query for {topic} with {instructions}</AUTO>
      max_results: 10
    requires_model:
      expertise: fast

  - id: compile_notes
    name: Compile Research Notes
    action: generate_text
    parameters:
      prompt: |
        Compile comprehensive research notes from these search results:
        {{ search.results }}

        Topic: {{ topic }}
        Instructions: {{ instructions }}

        Create detailed notes with:
        - Key findings
        - Important quotes
        - Source citations
        - Relevant statistics
    dependencies: [search]
    requires_model:
      expertise: [analysis, reasoning]
      min_size: 7b

  - id: write_report
    name: Write Report
    action: generate_document
    parameters:
      content: |
        Write a comprehensive research report on "{{ topic }}"

        Research notes:
        {{ compile_notes.result }}

        Requirements:
        - Professional academic style
        - Include introduction, body sections, and conclusion
        - Cite sources properly
        - {{ instructions }}
      format: markdown
    dependencies: [compile_notes]
    requires_model:
      expertise: [writing, general]
      min_size: 20b

  - id: create_pdf
    name: Create PDF
    action: convert_to_pdf
    parameters:
      markdown: "{{ write_report.document }}"
      filename: "{{ outputs.pdf }}"
    dependencies: [write_report]
```
Run it with:
```python
import orchestrator as orc
# Initialize models
orc.init_models()
# Compile pipeline
pipeline = orc.compile("research_report.yaml")
# Run with inputs
result = pipeline.run(
topic="quantum computing applications in medicine",
instructions="Focus on recent breakthroughs and future potential"
)
print(f"Report saved to: {result}")
```
## Documentation
Comprehensive documentation is available at [orc.readthedocs.io](https://orc.readthedocs.io/), including:
- [Getting Started Guide](https://orc.readthedocs.io/en/latest/getting_started/quickstart.html)
- [YAML Configuration Reference](https://orc.readthedocs.io/en/latest/user_guide/yaml_configuration.html)
- [Model Configuration](https://orc.readthedocs.io/en/latest/user_guide/model_configuration.html)
- [API Reference](https://orc.readthedocs.io/en/latest/api/core.html)
- [Examples and Tutorials](https://orc.readthedocs.io/en/latest/tutorials/examples.html)
## Available Models
Orchestrator supports a wide range of models:
### Local Models (via Ollama)
- **Gemma3 27B**: Google's powerful general-purpose model
- **Llama 3.x**: General purpose, multilingual support
- **DeepSeek-R1**: Advanced reasoning and coding
- **Qwen2.5-Coder**: Specialized for code generation
- **Mistral**: Fast and efficient general purpose
### Cloud Models
- **OpenAI**: GPT-4.1 (latest)
- **Anthropic**: Claude Sonnet 4 (claude-sonnet-4-20250514)
- **Google**: Gemini 2.5 Flash (gemini-2.5-flash)
### HuggingFace Models
- **Mistral 7B Instruct v0.3**: High-quality instruction-following model
- Llama, Qwen, Phi, and many more
- Automatically downloaded on first use
## Requirements
- Python 3.8+
- Optional: Ollama for local model execution
- Optional: API keys for cloud providers (OpenAI, Anthropic, Google)
## Contributing
We welcome contributions! Please see our [Contributing Guide](https://github.com/ContextLab/orchestrator/blob/main/CONTRIBUTING.md) for details.
## Support
- 📚 [Documentation](https://orc.readthedocs.io/)
- 🐛 [Issue Tracker](https://github.com/ContextLab/orchestrator/issues)
- 💬 [Discussions](https://github.com/ContextLab/orchestrator/discussions)
- 📧 Email: contextualdynamics@gmail.com
## License
This project is licensed under the MIT License - see the [LICENSE](https://github.com/ContextLab/orchestrator/blob/main/LICENSE) file for details.
## Citation
If you use Orchestrator in your research, please cite:
```bibtex
@software{orchestrator2025,
  title = {Orchestrator: AI Pipeline Orchestration Framework},
  author = {Manning, Jeremy R. and {Contextual Dynamics Lab}},
  year = {2025},
  url = {https://github.com/ContextLab/orchestrator},
  organization = {Dartmouth College}
}
```
## Acknowledgments
Orchestrator is developed and maintained by the [Contextual Dynamics Lab](https://www.context-lab.com/) at Dartmouth College.
---
*Built with ❤️ by the Contextual Dynamics Lab*