https://github.com/UKGovernmentBEIS/inspect_ai
Inspect: A framework for large language model evaluations
Last synced: about 1 year ago
- Host: GitHub
- URL: https://github.com/UKGovernmentBEIS/inspect_ai
- Owner: UKGovernmentBEIS
- License: MIT
- Created: 2023-11-14T14:53:11.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2025-03-31T17:56:06.000Z (about 1 year ago)
- Last Synced: 2025-03-31T17:59:02.947Z (about 1 year ago)
- Language: Python
- Homepage: https://inspect.aisi.org.uk/
- Size: 91.6 MB
- Stars: 850
- Watchers: 9
- Forks: 203
- Open Issues: 55
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
- Citation: CITATION.cff
Awesome Lists containing this project
- awesome-ai-testing - Inspect AI - LLM evaluation framework from the UK AI Safety Institute. (LLM-as-Judge Evaluation)
- awesome-production-machine-learning - Inspect - Inspect is a framework for large language model evaluations. (Evaluation and Monitoring)
- awesome-harness-engineering - Inspect AI - box targets, plus built-in bash/python/web browsing tools. Built for safety-grade rigor; the right foundation for harness-level eval infrastructure. (Evals & Verification / Adjacent Collections)
- awesome-ml-python-packages - inspect-ai
- awesome-ai-eval - **Inspect AI** - UK AI Safety Institute framework for scripted eval plans, tool calls, and model-graded rubrics. (Tools / Evaluators and Test Harnesses)
- Awesome-Prompt-Engineering - GitHub
- awesome-agent-cortex - Inspect AI - Open-source framework for reproducible LLM and agent evaluations. (Agent Harnessing and Evaluation / Benchmark Reality Check (real-world tool use))
- awesome-ai-coding-agent-tools - Inspect AI - UK AI Safety Institute's framework with 200+ pre-built evals for agents and reasoning. (Supporting Infrastructure / Evaluation)
- awesome-opensource-ai - Inspect AI - Framework for large language model evaluations from the UK AI Security Institute. (9. Evaluation, Benchmarks & Datasets)
- awesome-gpt-security - inspect_ai - Inspect: A framework for large language model evaluations (GPT Security / Standard)
- awesome-harness-engineering - UKGovernmentBEIS/inspect_ai