https://github.com/uiuc-kang-lab/cve-bench
CVE-Bench: A Benchmark for AI Agents’ Ability to Exploit Real-World Web Application Vulnerabilities
- Host: GitHub
- URL: https://github.com/uiuc-kang-lab/cve-bench
- Owner: uiuc-kang-lab
- License: apache-2.0
- Created: 2025-02-25T16:35:57.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2026-01-10T05:06:06.000Z (5 months ago)
- Last Synced: 2026-01-11T01:32:53.134Z (5 months ago)
- Topics: benchmark, inspect, language-model
- Language: Python
- Homepage: https://arxiv.org/abs/2503.17332
- Size: 121 MB
- Stars: 136
- Watchers: 3
- Forks: 29
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-offensive-security-ai - CVE-Bench GitHub
- Awesome-AI-Security - CVE-Bench (https://github.com/uiuc-kang-lab/cve-bench) - 40 dockerized web CVEs; success = expected impact triggered. [arXiv](https://arxiv.org/abs/2503.17332) (Benchmarks / Adversarial Resilience)
- awesome-rainmana - uiuc-kang-lab/cve-bench - CVE-Bench: A Benchmark for AI Agents’ Ability to Exploit Real-World Web Application Vulnerabilities (Python)