https://github.com/promptfoo/promptfoo
Test your prompts, agents, and RAGs. Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.
- Host: GitHub
- URL: https://github.com/promptfoo/promptfoo
- Owner: promptfoo
- License: MIT
- Created: 2023-04-28T15:48:49.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2025-03-08T00:53:33.000Z (about 1 month ago)
- Last Synced: 2025-03-08T01:24:43.599Z (about 1 month ago)
- Topics: ci, ci-cd, cicd, evaluation, evaluation-framework, llm, llm-eval, llm-evaluation, llm-evaluation-framework, llmops, pentesting, prompt-engineering, prompt-testing, prompts, rag, red-teaming, testing, vulnerability-scanners
- Language: TypeScript
- Homepage: https://promptfoo.dev
- Size: 308 MB
- Stars: 5,756
- Watchers: 21
- Forks: 475
- Open Issues: 188
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- Funding: .github/FUNDING.yml
- License: LICENSE
- Citation: CITATION.cff
Awesome Lists containing this project
- awesome-langchain-zh - Promptfoo
- awesome-ChatGPT-repositories - promptfoo - Test your prompts, models, RAGs. Evaluate and compare LLM outputs, catch regressions, and improve prompt quality. LLM evals for OpenAI/Azure GPT, Anthropic Claude, VertexAI Gemini, Ollama, Local & private models like Mistral/Mixtral/Llama with CI/CD (Prompts)
- awesome-langchain - Promptfoo
- StarryDivineSky - promptfoo/promptfoo
- awesome - promptfoo/promptfoo - Test your prompts, agents, and RAGs. Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration. (TypeScript)
- jimsghstars - promptfoo/promptfoo - Test your prompts, agents, and RAGs. Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command (TypeScript)
- awesome-safety-critical-ai - `promptfoo/promptfoo` - friendly local tool for testing LLM applications (🛠️ Tools / Bleeding Edge ⚗️)
README
# Promptfoo: LLM evals & red teaming
`promptfoo` is a developer-friendly local tool for testing LLM applications. Stop the trial-and-error approach - start shipping secure, reliable AI apps.
## Quick Start
```sh
# Install and initialize project
npx promptfoo@latest init

# Run your first evaluation
npx promptfoo eval
```
See [Getting Started](https://www.promptfoo.dev/docs/getting-started/) (evals) or [Red Teaming](https://www.promptfoo.dev/docs/red-team/) (vulnerability scanning) for more.
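Evaluations are driven by a declarative config file. As a rough sketch of what such a file can look like, here is a minimal `promptfooconfig.yaml`; the prompt text, the `topic` variable, the provider IDs, and the assertion values are illustrative assumptions, not something this README prescribes.

```yaml
# promptfooconfig.yaml (illustrative sketch; all values below are assumptions)
description: Tweet generator smoke test

# Prompt templates; {{topic}} is filled in from each test's vars
prompts:
  - "Write a short tweet about {{topic}}"

# Models to compare side by side; each needs its API key in the environment
providers:
  - openai:gpt-4o-mini
  - anthropic:messages:claude-3-5-sonnet-20241022

tests:
  - vars:
      topic: bananas
    assert:
      # Case-insensitive substring check on the model output
      - type: icontains
        value: bananas
  - vars:
      topic: new york city
    assert:
      # Inline JavaScript expression; `output` is the model's response text
      - type: javascript
        value: output.length <= 280
```

With a file like this in the working directory, `npx promptfoo eval` should run each prompt against each provider for every test case and report which assertions passed.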
## What can you do with Promptfoo?
- **Test your prompts and models** with [automated evaluations](https://www.promptfoo.dev/docs/getting-started/)
- **Secure your LLM apps** with [red teaming](https://www.promptfoo.dev/docs/red-team/) and vulnerability scanning
- **Compare models** side-by-side (OpenAI, Anthropic, Azure, Bedrock, Ollama, and [more](https://www.promptfoo.dev/docs/providers/))
- **Automate checks** in [CI/CD](https://www.promptfoo.dev/docs/integrations/ci-cd/)
- **Share results** with your team

Promptfoo works on the command line and can also generate [security vulnerability reports](https://www.promptfoo.dev/docs/red-team/).
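To make the CI/CD item in the list above more concrete, the following is a minimal sketch of a GitHub Actions workflow that runs an evaluation on pull requests. It assumes GitHub Actions as the CI system, a config file named `promptfooconfig.yaml` at the repository root, prompts stored under a hypothetical `prompts/` directory, and a repository secret named `OPENAI_API_KEY`; adjust all of these to your setup.

```yaml
# .github/workflows/prompt-eval.yml (hypothetical file name)
name: Prompt evaluation

on:
  pull_request:
    # Only run when prompts or the eval config change (paths are assumptions)
    paths:
      - "prompts/**"
      - "promptfooconfig.yaml"

jobs:
  eval:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      # Same eval command as in the Quick Start; -c points at the config file
      - name: Run promptfoo eval
        run: npx promptfoo@latest eval -c promptfooconfig.yaml
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
```

The step relies on the eval command returning a non-zero exit status when tests fail, which is what would mark the job as failed. See the linked CI/CD documentation for the officially supported integrations.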

## Why promptfoo?
- 🚀 **Developer-first**: Fast, with features like live reload and caching
- 🔒 **Private**: Runs 100% locally - your prompts never leave your machine
- 🔧 **Flexible**: Works with any LLM API or programming language
- 💪 **Battle-tested**: Powers LLM apps serving 10M+ users in production
- 📊 **Data-driven**: Make decisions based on metrics, not gut feel
- 🤝 **Open source**: MIT licensed, with an active community
## Learn More
- 📚 [Full Documentation](https://www.promptfoo.dev/docs/intro/)
- 🔐 [Red Teaming Guide](https://www.promptfoo.dev/docs/red-team/)
- 🎯 [Getting Started](https://www.promptfoo.dev/docs/getting-started/)
- 💻 [CLI Usage](https://www.promptfoo.dev/docs/usage/command-line/)
- 📦 [Node.js Package](https://www.promptfoo.dev/docs/usage/node-package/)
- 🤖 [Supported Models](https://www.promptfoo.dev/docs/providers/)
## Contributing
We welcome contributions! Check out our [contributing guide](https://www.promptfoo.dev/docs/contributing/) to get started.
Join our [Discord community](https://discord.gg/promptfoo) for help and discussion.