https://github.com/uiuc-kang-lab/cve-bench

CVE-Bench: A Benchmark for AI Agents’ Ability to Exploit Real-World Web Application Vulnerabilities

benchmark inspect language-model

Last synced: 6 months ago

awesome-offensive-security-ai - CVE-Bench GitHub
Awesome-AI-Security - CVE-Bench (https://github.com/uiuc-kang-lab/cve-bench) - 40 dockerized web CVEs; success = expected impact triggered. arXiv: https://arxiv.org/abs/2503.17332 (Benchmarks / Adversarial Resilience)
awesome-rainmana - uiuc-kang-lab/cve-bench - CVE-Bench: A Benchmark for AI Agents’ Ability to Exploit Real-World Web Application Vulnerabilities (Python)
awesome-ai-agent-benchmarks - CVE-Bench - world web-applicat… | 🟡 medium | 240 | (The index / Tier 2 — widely used (11))