https://github.com/UKGovernmentBEIS/inspect_ai

Inspect: A framework for large language model evaluations

Host: GitHub
URL: https://github.com/UKGovernmentBEIS/inspect_ai
Owner: UKGovernmentBEIS
License: MIT
Created: 2023-11-14T14:53:11.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2025-03-31T17:56:06.000Z (over 1 year ago)
Last Synced: 2025-03-31T17:59:02.947Z (over 1 year ago)
Language: Python
Homepage: https://inspect.aisi.org.uk/
Size: 91.6 MB
Stars: 850
Watchers: 9
Forks: 203
Open Issues: 55
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
- Citation: CITATION.cff

Awesome Lists containing this project

awesome-ai-testing - Inspect AI - LLM evaluation framework from the UK AI Safety Institute. (LLM-as-Judge Evaluation)
awesome-ai-agent-benchmarks - Inspect AI - use, multi-turn workflows, OWASP Top 10 for | `pip install inspect-ai` | (How to run these benchmarks / Start with a unified harness)
awesome-production-machine-learning - Inspect - Inspect is a framework for large language model evaluations. (Evaluation and Monitoring)
awesome-x-ops - Inspect AI - licensed framework for building, running, and analyzing reproducible evaluations of large language models. (LLM and Agent Observability)
awesome-harness-engineering - Inspect AI - box targets, plus built-in bash/python/web browsing tools. Built for safety-grade rigor; the right foundation for harness-level eval infrastructure. (Evals & Verification / Adjacent Collections)
awesome-ml-python-packages - inspect-ai
awesome-evals - Inspect AI - eval framework. **(MUST)** (5 · Evaluation infrastructure (the eval stack: datasets, scorers, online/offline, tracing, CI) / 5a · Eval frameworks & harnesses (code-first test-runners))
awesome-ai-eval - **Inspect AI** - UK AI Safety Institute framework for scripted eval plans, tool calls, and model-graded rubrics. (Tools / Evaluators and Test Harnesses)
Awesome-Prompt-Engineering - GitHub
awesome-agent-cortex - Inspect AI - Open-source framework for reproducible LLM and agent evaluations. (Agent Harnessing and Evaluation / Benchmark Reality Check (real-world tool use))
awesome-opensource-ai - Inspect AI - Framework for large language model evaluations from the UK AI Security Institute. (9. Evaluation, Benchmarks & Datasets)
awesome-eu-ai-act - Inspect AI
awesome-ai-security - Inspect - _Framework for large language model evaluations by the UK AI Security Institute. 200+ pre-built evaluations covering prompt engineering, tool usage, multi-turn dialog, and model-graded scoring._ (Benchmarks & Evaluations / AI-Assisted Offensive Security)
awesome-gpt-security - inspect_ai - Inspect: A framework for large language model evaluations (GPT Security / Standard)
awesome-ai-coding-agent-tools - Inspect AI - UK AI Safety Institute's framework with 200+ pre-built evals for agents and reasoning. (Supporting Infrastructure / Evaluation)
awesome-harness-engineering - UKGovernmentBEIS/inspect_ai
