https://github.com/sanand0/promptevals
Automatically improve system prompts using a data-driven LLM approach
https://github.com/sanand0/promptevals
app llm tool
Last synced: 10 months ago
JSON representation
Automatically improve system prompts using a data-driven LLM approach
- Host: GitHub
- URL: https://github.com/sanand0/promptevals
- Owner: sanand0
- License: mit
- Created: 2025-05-16T06:10:28.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-07-25T04:04:20.000Z (11 months ago)
- Last Synced: 2025-08-30T13:54:56.698Z (10 months ago)
- Topics: app, llm, tool
- Language: JavaScript
- Homepage: https://sanand0.github.io/promptevals/
- Size: 14.6 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Prompt Evals

[](https://developer.mozilla.org/en-US/docs/Web/JavaScript)
[](https://getbootstrap.com/)
[](https://opensource.org/licenses/MIT)
A [system-prompt learning tool](https://x.com/karpathy/status/1921368644069765486) for evaluating, optimizing, and iterating on AI prompts through a systematic, data-driven approach.
**Links:**
- [Live Demo](https://sanand0.github.io/promptevals)
- [GitHub Repository](https://github.com/sanand0/promptevals)
## Overview
Prompt Evals helps you systematically evaluate and improve AI prompts. By using a data-driven approach, it allows you to:
1. Generate optimized prompts based on input-output examples
2. Test prompts against your dataset
3. Evaluate the quality of generated outputs against expected results
4. Analyze performance using embedding similarity and custom criteria
5. Iteratively refine prompts based on performance data
This is ideal for AI engineers, prompt engineers, and researchers who want to optimize their interactions with large language models.
## Features
- **Prompt Generation**: Automatically generate effective prompts based on example input-output pairs
- **Batch Output Generation**: Test your prompts on multiple inputs in one go
- **Embedding Similarity Analysis**: Measure how closely generated outputs match expected results
- **Custom Evaluation Criteria**: Define and assess specific criteria for what makes a good output
- **Prompt Revision**: Get AI-assisted suggestions to improve underperforming prompts
- **Experiment History**: Track prompt performance across multiple iterations
- **Dark/Light Mode**: Comfortable viewing experience in any environment
- **Responsive Design**: Works seamlessly on desktop and mobile devices
## Usage
- **Preparing Your Dataset**. Create a dataset with inputs in the first column and expected outputs in the second column. You can paste data directly or load a CSV file. Example format:
```
Input textExpected output text
Another inputAnother expected output
```
- **Generating a Prompt**: Select the number of samples to use. Choose a prompt generation model. Click "Generate prompt"
- **Testing Your Prompt**: Select an output model. Click "Generate output". Review the results in the table
- **Evaluating Performance**: Adjust the embedding similarity threshold. Define your evaluation criteria. Click "Evaluate prompt". Review scores and performance metrics
- **Revising Your Prompt**: Set the number of examples to use. Choose a revision model. Click "Revise prompt". Review and apply the suggested improvements
This process creates a virtuous cycle of continuous improvement for your prompts.
## Demo
The application includes a sample dataset for [clinical trial protocol explanations](clinical-trial-protocol-explanation.csv). This demonstrates how to optimize prompts that explain complex medical protocols in patient-friendly language.
## Installation
### Web Application
No installation required. Access the [live demo](https://sanand0.github.io/promptevals) through any modern web browser.
### Local Setup
1. Clone the repository:
```bash
git clone https://github.com/sanand0/promptevals.git
cd promptevals
```
2. Serve the files using any web server:
```bash
# Using Python's built-in server
python -m http.server 8000
# Or using Node.js with http-server
npx http-server
```
3. Open `http://localhost:8000` in your browser
4. Log in with your LLM Foundry credentials
## License
[MIT License](LICENSE)