{"id":30770489,"url":"https://github.com/sanand0/promptevals","last_synced_at":"2025-09-04T23:08:35.031Z","repository":{"id":306366120,"uuid":"984559818","full_name":"sanand0/promptevals","owner":"sanand0","description":"Automatically improve system prompts using a data-driven LLM approach","archived":false,"fork":false,"pushed_at":"2025-07-25T04:04:20.000Z","size":15,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-08-30T13:54:56.698Z","etag":null,"topics":["app","llm","tool"],"latest_commit_sha":null,"homepage":"https://sanand0.github.io/promptevals/","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sanand0.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-05-16T06:10:28.000Z","updated_at":"2025-07-25T04:04:23.000Z","dependencies_parsed_at":"2025-07-25T09:20:39.376Z","dependency_job_id":"3d195a78-da95-4e49-80ed-a9e434bd93c4","html_url":"https://github.com/sanand0/promptevals","commit_stats":null,"previous_names":["sanand0/promptevals"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/sanand0/promptevals","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sanand0%2Fpromptevals","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sanand0%2Fpromptevals/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sanand0%2Fpromptevals/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sanand
0%2Fpromptevals/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sanand0","download_url":"https://codeload.github.com/sanand0/promptevals/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sanand0%2Fpromptevals/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":273685604,"owners_count":25149722,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-04T02:00:08.968Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["app","llm","tool"],"created_at":"2025-09-04T23:08:29.908Z","updated_at":"2025-09-04T23:08:35.018Z","avatar_url":"https://github.com/sanand0.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Prompt Evals\n\n![Prompt Evals Logo](https://img.shields.io/badge/Prompt%20Evals-Optimize%20Your%20Prompts-2563eb)\n[![JavaScript](https://img.shields.io/badge/Language-JavaScript-f7df1e)](https://developer.mozilla.org/en-US/docs/Web/JavaScript)\n[![Bootstrap](https://img.shields.io/badge/Framework-Bootstrap%205-7952b3)](https://getbootstrap.com/)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n\nA [system-prompt learning tool](https://x.com/karpathy/status/1921368644069765486) for evaluating, optimizing, and iterating on AI prompts through a 
systematic, data-driven approach.\n\n**Links:**\n\n- [Live Demo](https://sanand0.github.io/promptevals)\n- [GitHub Repository](https://github.com/sanand0/promptevals)\n\n## Overview\n\nPrompt Evals helps you systematically evaluate and improve AI prompts. By using a data-driven approach, it allows you to:\n\n1. Generate optimized prompts based on input-output examples\n2. Test prompts against your dataset\n3. Evaluate the quality of generated outputs against expected results\n4. Analyze performance using embedding similarity and custom criteria\n5. Iteratively refine prompts based on performance data\n\nThis is ideal for AI engineers, prompt engineers, and researchers who want to optimize their interactions with large language models.\n\n## Features\n\n- **Prompt Generation**: Automatically generate effective prompts based on example input-output pairs\n- **Batch Output Generation**: Test your prompts on multiple inputs in one go\n- **Embedding Similarity Analysis**: Measure how closely generated outputs match expected results\n- **Custom Evaluation Criteria**: Define and assess specific criteria for what makes a good output\n- **Prompt Revision**: Get AI-assisted suggestions to improve underperforming prompts\n- **Experiment History**: Track prompt performance across multiple iterations\n- **Dark/Light Mode**: Comfortable viewing experience in any environment\n- **Responsive Design**: Works seamlessly on desktop and mobile devices\n\n## Usage\n\n- **Preparing Your Dataset**. Create a dataset with inputs in the first column and expected outputs in the second column. You can paste data directly or load a CSV file. Example format:\n  ```\n  Input text\u003cTAB\u003eExpected output text\n  Another input\u003cTAB\u003eAnother expected output\n  ```\n- **Generating a Prompt**: Select the number of samples to use. Choose a prompt generation model. Click \"Generate prompt\"\n- **Testing Your Prompt**: Select an output model. Click \"Generate output\". 
Review the results in the table.\n- **Evaluating Performance**: Adjust the embedding similarity threshold. Define your evaluation criteria. Click \"Evaluate prompt\". Review scores and performance metrics.\n- **Revising Your Prompt**: Set the number of examples to use. Choose a revision model. Click \"Revise prompt\". Review and apply the suggested improvements.\n\nRepeating these steps lets each evaluation inform the next revision of your prompt.\n\n## Demo\n\nThe application includes a sample dataset for [clinical trial protocol explanations](clinical-trial-protocol-explanation.csv). This demonstrates how to optimize prompts that explain complex medical protocols in patient-friendly language.\n\n## Installation\n\n### Web Application\n\nNo installation required. Access the [live demo](https://sanand0.github.io/promptevals) through any modern web browser.\n\n### Local Setup\n\n1. Clone the repository:\n   ```bash\n   git clone https://github.com/sanand0/promptevals.git\n   cd promptevals\n   ```\n2. Serve the files using any web server:\n\n   ```bash\n   # Using Python's built-in server\n   python -m http.server 8000\n\n   # Or using Node.js with http-server (it defaults to port 8080, so set the port explicitly)\n   npx http-server -p 8000\n   ```\n\n3. Open `http://localhost:8000` in your browser\n4. Log in with your LLM Foundry credentials\n\n## License\n\n[MIT License](LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsanand0%2Fpromptevals","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsanand0%2Fpromptevals","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsanand0%2Fpromptevals/lists"}