https://github.com/dmitriz/llm-selection
Strategic selection, evaluation, and routing of large language models
- Host: GitHub
- URL: https://github.com/dmitriz/llm-selection
- Owner: dmitriz
- Created: 2025-05-22T07:51:34.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-06-04T09:27:50.000Z (about 1 year ago)
- Last Synced: 2025-06-04T16:54:27.897Z (about 1 year ago)
- Size: 34.2 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# LLM Selection Project
A system for selecting, routing, and evaluating large language models (LLMs) based on task type, reasoning ability, access methods, costs, and reliability.
## 🚀 Getting Started
This project is designed to help you make strategic model selection decisions for your LLM tasks. The repository will grow to include:
- Model catalogs with capability tracking
- Routing logic for task-to-model mapping
- Evaluation tools and reviewer prompts
- Integration patterns for various LLM access methods
## 📂 Repository Structure
- `models.md` - Catalog of available LLMs and their capabilities (planned)
- `routing-strategy.md` - Logic for matching tasks to models (planned)
- `reviewer-prompts.md` - Evaluation criteria and prompts (planned)
- `PROJECT_BACKLOG.md` - Complete project backlog with detailed goals and plans
## 🏁 Initial Tasks
- [ ] Create `models.md` for cataloging available LLMs
- [ ] Add `routing-strategy.md` to outline logic by task type
- [ ] Prepare reviewer prompt set in `reviewer-prompts.md`
- [ ] Design and test a selection prompt (e.g., "Which model should I use for X?")
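As a starting point for the last task, a selection prompt could be assembled from a task description and a list of candidate models. This is only a sketch: the function name `build_selection_prompt`, the prompt wording beyond the quoted question, and the placeholder model names are assumptions, not part of the repository.

```python
def build_selection_prompt(task: str, candidates: list[str]) -> str:
    """Build a 'Which model should I use for X?' prompt (hypothetical helper)."""
    if not candidates:
        raise ValueError("at least one candidate model is required")
    options = "\n".join(f"- {name}" for name in candidates)
    return (
        "Which model should I use for the following task?\n\n"
        f"Task: {task}\n\n"
        f"Candidate models:\n{options}\n\n"
        "Answer with exactly one model name from the list "
        "and a one-sentence reason."
    )


# Placeholder model names; a real list would come from models.md once it exists.
prompt = build_selection_prompt("summarize a contract", ["model-a", "model-b"])
```

Constraining the answer to one name from the list makes the response easy to parse and to compare against a labeled set of tasks.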
## 📊 Roadmap & Prioritization
### Phase 1: Model Cataloging (June 2025)
- **Focus**: Comprehensive model inventory and capability assessment
- **Deliverable**: `models.md` with detailed capability matrix
- **Key milestones**:
  - Complete API documentation for all major providers
  - Document token limits and rate restrictions
  - Create standardized capability scoring system
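
One possible shape for a catalog entry and a standardized capability score is sketched below. The field names, the 0-5 scale, and the weighted-average formula are all assumptions made for illustration; the project has not yet defined its scoring system.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ModelEntry:
    """Hypothetical catalog record; every field name here is an assumption."""
    name: str
    provider: str
    context_tokens: int        # maximum context window, in tokens
    requests_per_minute: int   # documented rate limit
    capabilities: dict[str, int] = field(default_factory=dict)  # 0-5 per capability


def capability_score(entry: ModelEntry, weights: dict[str, float]) -> float:
    """Weighted average of 0-5 capability scores; missing capabilities count as 0."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted = sum(entry.capabilities.get(cap, 0) * w for cap, w in weights.items())
    return weighted / total_weight


# Placeholder entry; the numbers are made up for the example.
entry = ModelEntry(
    name="model-a",
    provider="example-provider",
    context_tokens=128_000,
    requests_per_minute=60,
    capabilities={"reasoning": 4, "coding": 5},
)
score = capability_score(entry, {"reasoning": 2, "coding": 1, "summarization": 1})
```

Keeping the weights outside the entry lets the same catalog be scored differently per task type.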
### Phase 2: Routing Strategy (July 2025)
- **Focus**: Decision framework for model selection
- **Deliverable**: `routing-strategy.md` with decision trees
- **Key milestones**:
  - Map task types to optimal model profiles
  - Create fallback pathways for rate limits and failures
  - Design cost optimization strategies
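
The task-to-model mapping and fallback pathway could look roughly like the sketch below. The route table, the `ModelUnavailable` exception, and the `call_model` callback are hypothetical; no provider API is assumed, and the caller supplies the function that actually contacts a model.

```python
from typing import Callable


class ModelUnavailable(Exception):
    """Raised by the caller's client when a model is rate-limited or fails."""


# Hypothetical routes: each task type maps to models in order of preference.
ROUTES: dict[str, list[str]] = {
    "code": ["model-a", "model-b"],
    "summarize": ["model-c", "model-b"],
}
DEFAULT_ROUTE = ["model-b"]


def route(task_type: str) -> list[str]:
    """Return the preference-ordered models for a task type."""
    return ROUTES.get(task_type, DEFAULT_ROUTE)


def call_with_fallback(
    task_type: str,
    prompt: str,
    call_model: Callable[[str, str], str],
) -> tuple[str, str]:
    """Try each routed model in order; return (model, output) from the first success."""
    errors = []
    for model in route(task_type):
        try:
            return model, call_model(model, prompt)
        except ModelUnavailable as exc:
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all routed models failed: " + "; ".join(errors))
```

Ordering each route from cheapest acceptable model to most expensive is one simple way to fold cost optimization into the same table.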
### Phase 3: Evaluation Framework (August 2025)
- **Focus**: Output quality assessment
- **Deliverable**: `reviewer-prompts.md` with scoring metrics
- **Key milestones**:
  - Develop standardized evaluation criteria
  - Create reviewer prompts for different output types
  - Build comparison methodology for cross-model evaluation
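
A minimal sketch of a cross-model comparison, assuming each reviewer prompt yields integer scores from 1 to 5 on a fixed set of criteria. The criteria names and the plain averaging are placeholders; the project's actual evaluation criteria are still to be developed.

```python
from statistics import mean

# Hypothetical criteria; the real set would come from reviewer-prompts.md.
CRITERIA = ("accuracy", "clarity", "completeness")


def compare_models(
    scores: dict[str, list[dict[str, int]]],
) -> list[tuple[str, float]]:
    """Rank models by mean review score, best first.

    `scores` maps a model name to its reviews; each review maps every
    criterion in CRITERIA to a 1-5 score. Each model needs at least one review.
    """
    ranking = []
    for model, reviews in scores.items():
        per_review = [mean(review[c] for c in CRITERIA) for review in reviews]
        ranking.append((model, round(float(mean(per_review)), 2)))
    return sorted(ranking, key=lambda item: item[1], reverse=True)


# Made-up review scores for two placeholder models.
ranking = compare_models({
    "model-a": [{"accuracy": 4, "clarity": 4, "completeness": 4}],
    "model-b": [
        {"accuracy": 5, "clarity": 3, "completeness": 4},
        {"accuracy": 5, "clarity": 5, "completeness": 5},
    ],
})
```

Averaging per review first, then across reviews, keeps a model with many reviews from being weighted differently per criterion.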
### Success Metrics
- **Latency targets**: <500ms for routing decisions, <5s end-to-end for simple tasks
- **Hallucination thresholds**: <2% factual errors in final outputs
- **Cost efficiency**: 30% reduction in token usage vs. single-model approach
- **Selection accuracy**: >95% appropriate model selection for given task types
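
The targets above can be turned into an automated check, sketched below. The `RunStats` fields are hypothetical, and reading "30% reduction" as "at least 30% fewer tokens than the single-model baseline" is an assumption about the intended comparison.

```python
from dataclasses import dataclass


@dataclass
class RunStats:
    """Hypothetical aggregate measurements from an evaluation run."""
    routing_latency_ms: float
    end_to_end_s: float
    factual_error_rate: float   # fraction of final outputs with factual errors
    baseline_tokens: int        # tokens used by the single-model approach
    routed_tokens: int          # tokens used with routing enabled
    selection_accuracy: float   # fraction of tasks routed to an appropriate model


def check_targets(stats: RunStats) -> dict[str, bool]:
    """Compare measurements with the README's success metrics."""
    token_reduction = 1 - stats.routed_tokens / stats.baseline_tokens
    return {
        "routing_latency": stats.routing_latency_ms < 500,
        "end_to_end_latency": stats.end_to_end_s < 5,
        "hallucination": stats.factual_error_rate < 0.02,
        "cost_efficiency": token_reduction >= 0.30,  # assumed reading of "30% reduction"
        "selection_accuracy": stats.selection_accuracy > 0.95,
    }


# Made-up numbers, for illustration only.
report = check_targets(RunStats(
    routing_latency_ms=320,
    end_to_end_s=4.1,
    factual_error_rate=0.03,
    baseline_tokens=1000,
    routed_tokens=650,
    selection_accuracy=0.97,
))
```

Returning one boolean per metric, rather than a single pass/fail, shows which target a run missed.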
## 📖 Detailed Documentation
See [PROJECT_BACKLOG.md](PROJECT_BACKLOG.md) for complete project details, including:
- Core components
- Key insights and strategy notes
- Planned experiments
- Roles and agents
## 🤝 Contributing
Contributions are welcome! Check `PROJECT_BACKLOG.md` for areas where you can help, or suggest new features and improvements.
---
> Note: This project is in active development. Features and documentation will evolve over time.