{"id":50751419,"url":"https://github.com/alexandre-leng/ai-security-code-validator","last_synced_at":"2026-06-11T01:30:27.990Z","repository":{"id":363577543,"uuid":"1234665790","full_name":"alexandre-leng/ai-security-code-validator","owner":"alexandre-leng","description":"MVP of an Open-source AI code security scanner for LLM-generated code. Detect prompt injection, secrets, vulnerable dependencies, and OWASP risks in CI/CD and DevSecOps workflows.","archived":false,"fork":false,"pushed_at":"2026-06-09T12:52:09.000Z","size":417,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-09T14:18:39.315Z","etag":null,"topics":["ai-security","application-security","code-security","devsecops","llm-security","owasp","prompt-injection","sarif","sbom","secure-coding","semgrep","supply-chain-security","vulnerability-scanner"],"latest_commit_sha":null,"homepage":"https://github.com/alexandre-leng/ai-code-validator","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/alexandre-leng.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"docs/CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":"NOTICE","maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-10T13:38:42.000Z","updated_at":"2026-06-09T12:52:33.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/alexandre-leng/ai-security-code-validator","commit_stats":null,"previous_names":["alexandre-leng/ai-security-code-validator"],"tags_c
ount":null,"template":false,"template_full_name":null,"purl":"pkg:github/alexandre-leng/ai-security-code-validator","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alexandre-leng%2Fai-security-code-validator","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alexandre-leng%2Fai-security-code-validator/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alexandre-leng%2Fai-security-code-validator/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alexandre-leng%2Fai-security-code-validator/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/alexandre-leng","download_url":"https://codeload.github.com/alexandre-leng/ai-security-code-validator/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alexandre-leng%2Fai-security-code-validator/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34178819,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-10T02:00:07.152Z","response_time":89,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-security","application-security","code-security","devsecops","llm-security","owasp","prompt-injection","sarif","sbom","secure-coding","semgrep","supply-chain-security","vulnerability-scanner"],"created_at":"2026
-06-11T01:30:27.187Z","updated_at":"2026-06-11T01:30:27.975Z","avatar_url":"https://github.com/alexandre-leng.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🤖 AI Code Validator\n\n\u003e **AI Code Validator is an open-source AI code security scanner for LLM-generated code, AI-assisted development, and modern AppSec workflows.**\n\u003e Catch prompt injection, insecure code patterns, secrets, vulnerable dependencies, and OWASP risks before AI-written code reaches production.\n\n🔍 **1000+ security rules** for LLM-generated code\n🛡️ **Prompt injection detection** — integrated with Whitney (transilienceai/whitney)\n⚡ **Scan in seconds** — before the code reaches production\n🤖 **Two-Level Workflow** — auto-approve safe code, escalate risks\n\n**Keywords:** AI code security, AI-generated code scanner, LLM security scanner, prompt injection detection, secure code review, SAST for AI code, DevSecOps, Semgrep security automation, software supply chain security\n\n[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)\n[![Python 3.10+](https://img.shields.io/badge/Python-3.10%2B-green.svg)](https://python.org)\n[![Docker](https://img.shields.io/badge/Docker-Ready-blue.svg)](Dockerfile)\n\n---\n\n## 🚨 The Problem: AI Writes Insecure Code\n\n**45% of code generated by LLMs contains OWASP Top 10 vulnerabilities.**\n*(Veracode, 2025)*\n\nYou asked ChatGPT for a Python script. It gave you something with `os.system()` and a hardcoded API key.\n\nYou used Copilot for a login form. It suggested `eval()` on user input.\n\nYou prompted Claude for a data pipeline. It generated `pickle.loads()` on untrusted data.\n\n**The issue:** Developers trust AI-generated code. They don't read every line. 
They don't review what Copilot autocomplete inserted between two valid lines.\n\n**Result:** Vulnerabilities slip into production at machine speed.\n\n---\n\n## ✅ The Solution: Security Guardrails for AI-Generated Code\n\nAI Code Validator is the first security scanner **designed specifically for code written by machines** — not humans.\n\nWe don't just find SQL injection. We find:\n\n🔴 **LLM-specific vulnerabilities**\n• Prompt injection in AI apps\n• Unsafe deserialization (pickle, yaml) that LLMs love to generate\n• `eval()` / `exec()` patterns injected by autocomplete\n• Hardcoded API keys generated by Copilot\n• SQL queries built with f-strings (Copilot's favorite pattern)\n\n🟠 **Traditional vulnerabilities** — 1000+ rules across 14 languages\n• SQL injection, XSS, command injection\n• Secrets leakage (AWS, GitHub, Stripe, OpenAI keys)\n• Vulnerable dependencies (SCA with OSV real-time database)\n• License compliance (GPL detection)\n\n🟢 **Workflow that doesn't slow you down**\n• **Level 0:** Auto-approve safe code (zero friction)\n• **Level 1:** Escalate real risks to human review\n• **Auto-fix:** Generate corrected code with one command\n\n---\n\n## ⚡ Quick Start\n\n```bash\n# Install\npip install -e .\npip install semgrep bandit\n\n# Scan a file (instant)\npython3 -m aicv.cli scan app.py\n\n# Scan your Copilot-generated project\npython3 -m aicv.cli scan my-ai-project/\n\n# Two-level workflow (recommended for teams)\npython3 -m aicv.cli level0 my-project/   # Auto-approve safe code\npython3 -m aicv.cli level1 my-project/   # Review escalated risks\n\n# Auto-fix everything fixable\npython3 -m aicv.cli fix --file app.py\n\n# Generate SBOM for compliance\npython3 -m aicv.cli sbom --target . 
--format cyclonedx\n\n# Check dependencies for known CVEs (OSV real-time)\npython3 -m aicv.cli sca --target .\n\n# Start API + dashboard\nAICV_API_KEY=\"demo\" python3 -m aicv.cli serve\n# → http://localhost:5001/review.html\n```\n\n---\n\n## How It Works\n\nAI Code Validator combines multiple application security testing engines into a single workflow built for **AI-generated code, Copilot-assisted commits, CI/CD pipelines, and security review at scale**.\n\n### 1. Parse the target\n\nThe scanner detects the project language, framework, and file types, then chooses the right analyzers for source code, dependencies, infrastructure-as-code, and AI-specific attack surfaces.\n\n### 2. Run multi-engine analysis\n\nIt orchestrates several engines in parallel:\n\n• **Semgrep rules** for code vulnerabilities, insecure patterns, API misuse, IaC issues, and framework-specific risks  \n• **Prompt injection detection** for LLM apps, RAG pipelines, agent workflows, and dangerous sink chains  \n• **Secrets detection** for API keys, tokens, and leaked credentials  \n• **SCA** against the live OSV database for vulnerable dependencies  \n• **SBOM and license analysis** for compliance, governance, and supply chain visibility\n\n### 3. Correlate and score findings\n\nFindings are normalized, deduplicated, scored by severity and confidence, and converted into a developer-friendly risk report with a global score and actionable remediation hints.\n\n### 4. Route through the two-level workflow\n\nSafe or low-risk findings can be auto-approved in **Level 0**, while high-confidence or critical issues are escalated to **Level 1** for human validation, making the workflow usable in real engineering teams instead of producing alert fatigue.\n\n### 5. 
Export or integrate\n\nResults can be consumed through the CLI, REST API, SARIF output, dashboard, GitHub PR comments, and editor integration, so teams can use AI Code Validator in local development, pre-commit checks, pull requests, or production release gates.\n\n---\n\n## 🔥 What Makes Us Different\n\n### 1. Prompt Injection Detection\n\nDetects **prompt injection vulnerabilities** in AI applications across 16+ source types:\n\n```python\n# AI Code Validator detects this:\nopenai.ChatCompletion.create(\n    messages=[{\"role\": \"user\", \"content\": user_input}]  # 🔴 PROMPT INJECTION\n)\n\n# And this:\npickle.loads(ai_generated_data)  # 🔴 UNSAFE DESERIALIZATION\n\n# And this:\neval(llm_output)  # 🔴 CODE INJECTION\n```\n\n### 2. Built for the Speed of AI\n\n• **Scan in \u003c 10 seconds** on a typical file\n• **1000+ rules** covering all major vulnerability classes\n• **14 languages** — Python, JS/TS, Java, Go, Rust, C/C++, PHP, Ruby, C#, Swift, Kotlin, Shell\n• **Real-time SCA** — queries OSV database live (not static CVE lists)\n\n### 3. Two-Level Security Workflow\n\n**Level 0 — Automated (Zero Friction)**\n\n```bash\npython3 -m aicv.cli level0 my-project/\n```\n\nWhat happens:\n• LOW/MEDIUM findings with high confidence → **Auto-approved**\n• Known fixable patterns → **Auto-resolved**\n• CRITICAL/HIGH findings → **Escalated to Level 1**\n\nOutput:\n```\n🤖 Level 0 — Automated Security Scan\n\n📊 Results:\n   Auto-approved: 12\n   Auto-resolved: 3\n   Escalated to Level 1: 5\n```\n\n**Level 1 — Human Review**\n\n```bash\npython3 -m aicv.cli level1 my-project/\n```\n\nInteractive session with security checklist per finding type:\n```\n👤 Level 1 — Human Security Review\n\n[CRITICAL] app.py:15\nRule: python-sqli-format-string\nCode: cursor.execute(f\"SELECT * FROM users WHERE id = {user_id}\")\n\n📝 Security Checklist:\n   1. Is the user input properly parameterized?\n   2. 
Are ORM methods used instead of raw SQL?\n\nDecision [approve/reject/skip/quit]: reject\nNotes: Confirmed SQLi, must use parameterized queries\n```\n\n### 4. VS Code Extension (Real-Time)\n\nScan while you code. Get instant highlights on AI-generated vulnerabilities directly in your editor.\n\nCommands:\n• `Ctrl+Shift+S` — Scan current file\n• Status bar — Live vulnerability count\n• Dashboard — Severity breakdown webview\n\n### 5. Open Source, Self-Hosted, Air-Gap Ready\n\n• **MIT License** — no vendor lock-in\n• **Self-hosted** — your code never leaves your infrastructure\n• **Air-gap compatible** — works offline (except SCA which needs OSV API)\n• **SARIF export** — integrates with GitHub Advanced Security\n\n---\n\n## 🛡️ Detection Engines (13)\n\n**Semgrep** — 1000+ rules, 30+ categories (SQLi, XSS, injection, IaC, containers, CI/CD, AI/ML)\n\n**Prompt Injection (Whitney)** — Semgrep rules for 16+ prompt injection source types (direct HTTP/CLI/Voice, indirect RAG/Web/File Upload, critical sinks: PAL chains, SQL chains, PythonREPL)\n\n**Bandit** — Python AST analysis (hardcoded passwords, shell injection)\n\n**Secrets Detector** — 14 patterns + validators (AWS, GitHub, Stripe, OpenAI keys)\n\n**SCA (Software Composition Analysis)** — Real-time OSV database queries. 
7 ecosystems: PyPI, npm, Maven, Go, Cargo, RubyGems, Packagist\n\n**SBOM Generator** — CycloneDX 1.5 + SPDX 2.3 export\n\n**License Scanner** — Copyleft, restrictive, unknown license detection\n\n**SARIF Exporter** — GitHub/CodeQL integration standard\n\n**Dead Code Detector** — Python AST (unused imports, functions, classes)\n\n**Framework Rules** — Django, Flask, Express, Laravel, FastAPI specifics\n\n**Language Checkers** — 14 languages with dedicated analyzers\n\n**Scoring Engine** — 0-100 score + A-F grade\n\n**Auto-Fix** — Regex + AST-based code correction\n\n---\n\n## 🌍 Languages \u0026 Frameworks\n\n**Languages:** Python, JavaScript, TypeScript, Java, Go, PHP, Ruby, C#, Rust, C, C++, Swift, Kotlin, Shell/Bash\n\n**Infrastructure:** Dockerfile, Kubernetes YAML, Terraform, CloudFormation\n\n**AI/ML Specific:** PyTorch, TensorFlow, OpenAI API, LangChain patterns\n\n**Blockchain:** Solidity smart contracts\n\n**Mobile:** Android (Kotlin/Java), iOS (Swift)\n\n---\n\n## 📋 CLI Commands\n\n```\nscan      🔍  Full security scan (all 12 engines)\nlevel0    🤖  Automated workflow — auto-approve safe code\nlevel1    👤  Human review — security checklist per finding\nfix       🔧  Auto-fix vulnerabilities\ndelta     📊  Compare two scans (regression detection)\nnotify    📢  Slack / Discord / Email / Webhook alerts\npr        🐙  Post results as GitHub PR comment\nsbom      📦  Generate Software Bill of Materials\nsca       📦  Dependency vulnerability scan (OSV real-time)\nlicense   📜  License compliance scan\nsarif     📊  SARIF export for GitHub/CodeQL\nchecklist 📝  Security checklists for reviewers\nserve     🚀  REST API + Web dashboard\nversion   📌  Show version\ntools     📦  List available tools\n```\n\n---\n\n## 🚀 Deployment Options\n\n### Docker (Production-Ready)\n```bash\ndocker-compose up -d\n# API: http://localhost:5001\n# Dashboard: http://localhost:5001/review.html\n```\n\nMulti-stage Dockerfile with:\n• Non-root user\n• Read-only filesystem\n• Health checks\n• 
Nginx reverse proxy with rate limiting\n\n### GitHub Actions\n```yaml\n# Auto-scan every PR\n# Block merge if CRITICAL vulnerabilities found\n```\n\n### API Usage\n```bash\n# The payload is single-quoted for the shell, so the inner string uses escaped double quotes\ncurl -X POST http://localhost:5001/api/v1/scan \\\n  -H \"X-API-Key: demo\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"code\": \"import os\\nos.system(\\\"id\\\")\", \"language\": \"python\"}'\n```\n\n---\n\n## 📊 Example Output\n\n```\n╔══════════════════════════════════╗\n║   🤖 AI CODE VALIDATOR v3.3    ║\n╚══════════════════════════════════╝\n📁 app.py\n📊 Score: 0/100 💀 Grade F\n🔍 23 vuln(s) | C:12 H:7 M:0 L:1\n\n🔴 [CRITICAL] app.py:10\n   [semgrep] LLM output passed to os.system(). Extreme risk.\n\n🔴 [CRITICAL] app.py:13\n   [semgrep] Unsafe pickle.loads() on AI-generated data\n\n🟠 [HIGH] app.py:7\n   [semgrep] LLM prompt constructed with dynamic input.\n              Risk of prompt injection.\n\n🔴 [CRITICAL] app.py:20\n   [secrets] 🔑 Stripe Secret Key detected: sk_l…7890\n\n⚙️  Two-Level Workflow:\n   Level 0 (Auto): 0/23 handled\n   Level 1 (Human): 23 need review\n```\n\n---\n\n## 🏗️ Architecture\n\n```\naicv/\n├── scanner.py          # Multi-engine orchestrator (12 engines)\n├── scoring.py          # 0-100 + A-F scoring\n├── secrets_detector.py # API key patterns + entropy analysis\n├── sca_scanner.py      # Real-time OSV dependency scanning\n├── sbom_generator.py   # CycloneDX + SPDX generation\n├── sarif_exporter.py   # SARIF v2.1.0 export\n├── license_scanner.py  # License compliance\n├── workflow.py         # Two-level security workflow\n├── auto_fix.py         # Code correction engine\n├── report_exporter.py  # JSON / MD / HTML export\n├── notifier.py         # Slack / Discord / Email / Webhook\n├── github_pr.py        # GitHub PR integration\n├── checkers/           # 11 language-specific analyzers\n│   ├── rust_checker.py\n│   ├── cpp_checker.py\n│   ├── typescript_checker.py\n│   ├── swift_checker.py\n│   ├── kotlin_checker.py\n│   └── shell_checker.py\n\napi/               
     # REST API (Flask + Gunicorn)\nstatic/                 # Web dashboard (review.html)\nrules/semgrep/          # 1000+ security rules (50+ files)\nvscode-extension/       # VS Code extension (TypeScript)\n```\n\n---\n\n## 📈 What's New in v3.3\n\n**1000+ Semgrep rules** — 50+ rule files across 15+ categories\n\n**Real-time SCA** — OSV API integration (replaces static CVE lists)\n\n**VS Code Extension** — Real-time scanning in your IDE\n\n**Docker + SaaS MVP** — Multi-stage Dockerfile, docker-compose, nginx\n\n**Community Infrastructure** — Issue templates, CI/CD, CoC, PR templates\n\n**AI/ML Security Rules** — Prompt injection, unsafe pickle, LLM eval detection\n\n---\n\n## 📚 Documentation\n\n• [Architecture](docs/ARCHITECTURE.md) — System design \u0026 data flow\n• [API Reference](docs/API.md) — Complete REST API documentation\n• [Workflow Guide](docs/WORKFLOW.md) — Two-level security workflow\n• [Contributing](docs/CONTRIBUTING.md) — Code standards \u0026 PR process\n• [Security](docs/SECURITY.md) — Supported versions \u0026 disclosure policy\n• [Strategy](docs/STRATEGY.md) — Competitive analysis \u0026 roadmap\n\n---\n\n## GitHub SEO Suggestions\n\nIf you want the repository to be easier to discover on GitHub search, the best lift usually comes from the **About description**, **topics**, and the first 20 lines of the README.\n\n**Suggested GitHub description**\n\n`Open-source AI code security scanner for LLM-generated code. 
Detect prompt injection, secrets, vulnerable dependencies, and OWASP risks in CI/CD and DevSecOps workflows.`\n\n**Suggested GitHub topics**\n\n`ai-security`, `llm-security`, `application-security`, `code-security`, `secure-coding`, `prompt-injection`, `sast`, `semgrep`, `devsecops`, `owasp`, `supply-chain-security`, `sbom`, `sarif`, `vulnerability-scanner`, `github-actions`\n\n**Suggested website**\n\nUse the repository URL for now, or a future landing page such as `https://www.formalibre.org` with a dedicated product section for AI Code Validator.\n\n---\n\n## 📄 License\n\nMIT\n\n---\n\n## 🙏 Acknowledgments\n\nThis project includes or was inspired by the following open-source projects:\n\n- **[transilienceai/whitney](https://github.com/transilienceai/whitney)** (Apache 2.0) — Prompt injection detection rules covering 16+ source types\n- **[Shiboof/Code-Vulnerability-Scanner](https://github.com/Shiboof/Code-Vulnerability-Scanner)** (MIT) — Security scoring system, language detection patterns\n- **[duriantaco/skylos](https://github.com/duriantaco/skylos)** (Apache 2.0) — Dead code detection patterns, framework-aware rules\n\nSee [LICENSE](LICENSE) for full third-party attribution details.\n\n---\n\n**[⭐ Star on GitHub](https://github.com/alexandre-leng/ai-code-validator)**\n**[🐛 Report Issue](https://github.com/alexandre-leng/ai-code-validator/issues)**\n**[🌐 formalibre.org](https://www.formalibre.org)**\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falexandre-leng%2Fai-security-code-validator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Falexandre-leng%2Fai-security-code-validator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falexandre-leng%2Fai-security-code-validator/lists"}