{"id":37247293,"url":"https://github.com/North-Shore-AI/LlmGuard","last_synced_at":"2026-01-22T17:01:56.747Z","repository":{"id":318717523,"uuid":"1073997150","full_name":"North-Shore-AI/LlmGuard","owner":"North-Shore-AI","description":"AI Firewall and guardrails for LLM-based Elixir applications","archived":false,"fork":false,"pushed_at":"2025-12-29T03:59:53.000Z","size":229,"stargazers_count":3,"open_issues_count":0,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-12-31T23:53:45.047Z","etag":null,"topics":["ai","ai-firewall","ai-safety","beam","content-filtering","elixir","ensemble-methods","guardrails","llm","llm-security","machine-learning","nshkr-crucible","otp","prompt-injection","reliability","research","safety-constraints","security","security-framework","statistical-testing"],"latest_commit_sha":null,"homepage":null,"language":"Elixir","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/North-Shore-AI.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":"docs/roadmap.md","authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-10-11T00:58:06.000Z","updated_at":"2025-12-29T03:59:37.000Z","dependencies_parsed_at":null,"dependency_job_id":"c3065e63-70d4-48e2-9d75-abbf27b262ca","html_url":"https://github.com/North-Shore-AI/LlmGuard","commit_stats":null,"previous_names":["north-shore-ai/llmguard"],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/North-Shore-AI/LlmGuard","repository_url":"https://repos.ecosyste.ms/api/v1/h
osts/GitHub/repositories/North-Shore-AI%2FLlmGuard","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/North-Shore-AI%2FLlmGuard/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/North-Shore-AI%2FLlmGuard/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/North-Shore-AI%2FLlmGuard/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/North-Shore-AI","download_url":"https://codeload.github.com/North-Shore-AI/LlmGuard/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/North-Shore-AI%2FLlmGuard/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28667308,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-22T14:01:31.714Z","status":"ssl_error","status_checked_at":"2026-01-22T13:59:23.143Z","response_time":144,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","ai-firewall","ai-safety","beam","content-filtering","elixir","ensemble-methods","guardrails","llm","llm-security","machine-learning","nshkr-crucible","otp","prompt-injection","reliability","research","safety-constraints","security","security-framework","statistical-testing"],"created_at":"2026-01-15T13:00:27.048Z","updated_at":"2026-01-22T17:01:56.739Z","avatar_url":"https://github.com/North-Shore-AI.png","language":"Elixir","funding_links":[],"categories":["Generative AI"],"sub_categories":["Development Tools"],"readme":"# LlmGuard\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"assets/LlmGuard.svg\" alt=\"LlmGuard Logo\" width=\"200\"\u003e\n\u003c/p\u003e\n\n**AI Firewall and Guardrails for LLM-based Elixir Applications**\n\n[![Elixir](https://img.shields.io/badge/elixir-1.14+-purple.svg)](https://elixir-lang.org)\n[![OTP](https://img.shields.io/badge/otp-25+-blue.svg)](https://www.erlang.org)\n[![Hex.pm](https://img.shields.io/hexpm/v/llm_guard.svg)](https://hex.pm/packages/llm_guard)\n[![Documentation](https://img.shields.io/badge/docs-hexdocs-purple.svg)](https://hexdocs.pm/llm_guard)\n[![License](https://img.shields.io/badge/license-MIT-green.svg)](https://github.com/North-Shore-AI/LlmGuard/blob/main/LICENSE)\n\nLlmGuard provides comprehensive security protection for LLM applications including prompt injection detection, jailbreak prevention, data leakage protection, and content moderation.\n\n## Features\n\n- ✅ **Prompt Injection Detection** - Multi-layer detection with 
34+ patterns\n- ✅ **Jailbreak Detection** - Role-playing, hypothetical, encoding, emotional attacks\n- ✅ **PII Detection \u0026 Redaction** - Email, phone, SSN, credit cards, IP, URLs\n- ✅ **Pipeline Architecture** - Flexible, extensible security pipeline\n- ✅ **Configuration System** - Centralized configuration with validation\n- ✅ **Zero Trust** - Validates all inputs and outputs\n- ✅ **High Performance** - \u003c15ms latency for pattern-based detection\n- ⏳ **Content Moderation** - Coming soon\n- ⏳ **Rate Limiting** - Coming soon\n- ⏳ **Audit Logging** - Coming soon\n\n## Quick Start\n\nAdd to your `mix.exs`:\n\n```elixir\ndef deps do\n  [\n    {:llm_guard, \"~\u003e 0.3.1\"}\n  ]\nend\n```\n\nBasic usage:\n\n```elixir\n# Create configuration\nconfig = LlmGuard.Config.new(\n  prompt_injection_detection: true,\n  confidence_threshold: 0.7\n)\n\n# Validate user input\ncase LlmGuard.validate_input(user_input, config) do\n  {:ok, safe_input} -\u003e\n    # Safe to send to LLM\n    llm_response = MyLLM.generate(safe_input)\n    \n    # Validate output\n    case LlmGuard.validate_output(llm_response, config) do\n      {:ok, safe_output} -\u003e {:ok, safe_output}\n      {:error, :detected, _details} -\u003e {:error, \"Unsafe output\"}\n    end\n    \n  {:error, :detected, details} -\u003e\n    # Blocked malicious input\n    Logger.warning(\"Threat detected: #{details.reason}\")\n    {:error, \"Input blocked\"}\nend\n```\n\n## Architecture\n\nLlmGuard uses a multi-layer detection strategy:\n\n1. **Pattern Matching** (~1ms) - Fast regex-based detection\n2. **Heuristic Analysis** (~10ms) - Statistical analysis (coming soon)\n3. 
**ML Classification** (~50ms) - Advanced threat detection (coming soon)\n\n```\nUser Input\n    │\n    ▼\n┌─────────────────┐\n│ Input Validation│\n│  - Length check │\n│  - Sanitization │\n└────────┬────────┘\n         │\n         ▼\n┌─────────────────────┐\n│ Security Pipeline   │\n│  ┌───────────────┐  │\n│  │ Detector 1    │  │\n│  ├───────────────┤  │\n│  │ Detector 2    │  │\n│  ├───────────────┤  │\n│  │ Detector 3    │  │\n│  └───────────────┘  │\n└────────┬────────────┘\n         │\n         ▼\n    LLM Processing\n         │\n         ▼\n┌─────────────────────┐\n│ Output Validation   │\n└────────┬────────────┘\n         │\n         ▼\n     User Response\n```\n\n## Detected Threats\n\n### Prompt Injection (34 patterns)\n- Instruction override: \"Ignore all previous instructions\"\n- System extraction: \"Show me your system prompt\"\n- Delimiter injection: \"---END SYSTEM---\"\n- Mode switching: \"Enter debug mode\"\n- Role manipulation: \"You are now DAN\"\n- Authority escalation: \"As SUPER-ADMIN...\"\n\n### Jailbreak Detection\n- Role-playing: DAN, DUDE, KEVIN, etc.\n- Hypothetical scenarios: \"In a world where...\"\n- Prefix injection: [SYSTEM OVERRIDE], \u003c\u003cDEBUG\u003e\u003e\n- Emotional manipulation: \"For educational purposes...\"\n- Encoding attacks: Base64, hex, leetspeak\n- Format manipulation: Structured jailbreak instructions\n\n### PII Detection \u0026 Redaction\n- Email addresses (95% confidence)\n- Phone numbers (US format, 80-90% confidence)\n- Social Security Numbers (95% confidence)\n- Credit card numbers (98% with Luhn validation)\n- IP addresses (85-90% confidence)\n- URLs (90% confidence)\n\n### Coming Soon\n- Harmful content (violence, hate speech, etc.)\n- Advanced ML-based classification\n- Multi-turn conversation analysis\n\n## Testing\n\n```bash\n# Run all tests\nmix test\n\n# Run with coverage\nmix coveralls.html\n\n# Run security tests only\nmix test --only security\n\n# Run performance benchmarks\nmix test --only 
performance\n```\n\n**Current Status**:\n- ✅ 222/228 tests passing (97.4%)\n- ✅ Zero compilation warnings\n- ✅ 100% documentation coverage\n\n## Configuration\n\n```elixir\nconfig = LlmGuard.Config.new(\n  # Detection toggles\n  prompt_injection_detection: true,\n  jailbreak_detection: false,  # Coming soon\n  data_leakage_prevention: false,  # Coming soon\n  content_moderation: false,  # Coming soon\n  \n  # Thresholds\n  confidence_threshold: 0.7,\n  max_input_length: 10_000,\n  max_output_length: 10_000,\n  \n  # Rate limiting (coming soon)\n  rate_limiting: %{\n    requests_per_minute: 100,\n    tokens_per_minute: 200_000\n  }\n)\n\n# Optional: Caching (set `caching` to enable pipeline result caching)\ncaching_config = %{\n  enabled: true,\n  pattern_cache: true,\n  result_cache: true,\n  result_ttl_seconds: 300,\n  max_cache_entries: 10_000\n}\n\nconfig = LlmGuard.Config.new(\n  prompt_injection_detection: true,\n  caching: caching_config\n)\n```\n\n### Caching\n\nThe pipeline will reuse detector results when `caching.enabled` is true **and** the cache process is\nrunning.\n\n```elixir\n# Start the cache in your supervision tree\nchildren = [\n  {LlmGuard.Cache.PatternCache, []},\n  # ...other children\n]\n\n# Fetch cache statistics\nstats = LlmGuard.Cache.PatternCache.stats()\n# =\u003e %{pattern_count: 10, result_count: 42, hit_rate: 0.78, ...}\n```\n\n### Telemetry \u0026 Metrics\n\nTelemetry emits pipeline, detector, and cache events with native durations.\n\n```elixir\n# Initialize handlers once (idempotent)\n:ok = LlmGuard.Telemetry.Metrics.setup()\n\n# Inspect metrics in-process\nmetrics = LlmGuard.Telemetry.Metrics.snapshot()\n\n# Prometheus text format\nprom_text = LlmGuard.Telemetry.Metrics.prometheus_metrics()\n```\n\nIntegrate with `Telemetry.Metrics` reporters:\n\n```elixir\nimport Telemetry.Metrics\n\nmetrics = LlmGuard.Telemetry.Metrics.metrics()\n```\n\nUse these metrics with Prometheus (e.g., `TelemetryMetricsPrometheus`) or LiveDashboard to 
track\nrequest outcomes, detector latency, cache hit rates, and confidence distributions.\n\n## Performance\n\nCurrent (Phase 1):\n- **Latency**: \u003c10ms P95 (pattern matching)\n- **Throughput**: Not yet benchmarked\n- **Memory**: \u003c50MB per instance\n\nTargets (Phase 4):\n- **Latency**: \u003c150ms P95 (all layers)\n- **Throughput**: \u003e1000 req/s\n- **Memory**: \u003c100MB per instance\n\n## Development Status\n\nSee [IMPLEMENTATION_STATUS.md](IMPLEMENTATION_STATUS.md) for detailed progress.\n\n**Phase 1 - Foundation**: ✅ 80% Complete\n- [x] Core framework (Detector, Config, Pipeline)\n- [x] Pattern utilities\n- [x] Prompt injection detector (24 patterns)\n- [x] Main API (validate_input, validate_output, validate_batch)\n- [ ] PII scanner \u0026 redactor\n- [ ] Jailbreak detector\n- [ ] Content safety detector\n\n**Phase 2 - Advanced Detection**: ⏳ 0% Complete\n**Phase 3 - Policy \u0026 Infrastructure**: ⏳ 0% Complete\n**Phase 4 - Optimization**: ⏳ 0% Complete\n\n## Examples\n\nRun examples with `mix run examples/example_name.exs`:\n\n```bash\n# Basic usage demonstration\nmix run examples/basic_usage.exs\n\n# Jailbreak detection examples\nmix run examples/jailbreak_detection.exs\n\n# Comprehensive multi-layer protection\nmix run examples/comprehensive_protection.exs\n```\n\n### CrucibleIR Pipeline Integration\n\n```elixir\n# Use LlmGuard as a stage in CrucibleIR research pipelines\ndefmodule MyExperiment do\n  def run_with_guardrails do\n    # Configure guardrails\n    guardrail = %CrucibleIR.Reliability.Guardrail{\n      profiles: [:default],\n      prompt_injection_detection: true,\n      jailbreak_detection: true,\n      pii_detection: true,\n      pii_redaction: false,\n      fail_on_detection: true\n    }\n\n    # Create experiment context\n    context = %{\n      experiment: %{\n        reliability: %{\n          guardrails: guardrail\n        }\n      },\n      inputs: \"User prompt to validate\"\n    }\n\n    # Run the stage\n    case 
LlmGuard.Stage.run(context) do\n      {:ok, updated_context} -\u003e\n        # Check validation results\n        case updated_context.guardrails.status do\n          :safe -\u003e\n            IO.puts(\"Input validated successfully\")\n            process_safe_input(updated_context.guardrails.validated_inputs)\n\n          :detected -\u003e\n            IO.puts(\"Threats detected: #{inspect(updated_context.guardrails.detections)}\")\n            handle_detected_threats(updated_context.guardrails)\n\n          :error -\u003e\n            IO.puts(\"Validation errors: #{inspect(updated_context.guardrails.errors)}\")\n        end\n\n      {:error, {:threats_detected, details}} -\u003e\n        # Strict mode: fail_on_detection was true\n        IO.puts(\"Pipeline halted due to detected threats\")\n        {:error, details}\n    end\n  end\nend\n```\n\n### Phoenix Integration\n\n```elixir\ndefmodule MyAppWeb.LlmGuardPlug do\n  import Plug.Conn\n  # json/2 is defined in Phoenix.Controller, not Plug.Conn\n  import Phoenix.Controller, only: [json: 2]\n\n  def init(opts), do: opts\n\n  def call(conn, _opts) do\n    with {:ok, input} \u003c- extract_llm_input(conn),\n         {:ok, sanitized} \u003c- LlmGuard.validate_input(input, config()) do\n      assign(conn, :sanitized_input, sanitized)\n    else\n      {:error, :detected, details} -\u003e\n        conn\n        |\u003e put_status(:forbidden)\n        |\u003e json(%{error: \"Input blocked\", reason: details.reason})\n        |\u003e halt()\n    end\n  end\nend\n```\n\n### Batch Validation\n\n```elixir\n# Validate multiple inputs concurrently\ninputs = [\"Message 1\", \"Ignore all instructions\", \"Message 3\"]\nresults = LlmGuard.validate_batch(inputs, config)\n\nEnum.each(results, fn\n  {:ok, safe_input} -\u003e process_safe(safe_input)\n  {:error, :detected, details} -\u003e log_threat(details)\nend)\n```\n\n## Documentation\n\nFull documentation is available at [hexdocs.pm/llm_guard](https://hexdocs.pm/llm_guard).\n\nGenerate locally:\n```bash\nmix docs\nopen doc/index.html\n```\n\n## Contributing\n\nContributions are 
welcome! Please open an issue or pull request on GitHub.\n\nAreas needing help:\n- Additional detection patterns\n- Performance optimization\n- Documentation improvements\n- Test coverage expansion\n- ML model integration\n\n## Roadmap\n\n- **v0.2.0** - PII detection \u0026 redaction ✅\n- **v0.3.0** - CrucibleIR integration \u0026 Stage implementation ✅\n- **v0.4.0** - Jailbreak detection\n- **v0.5.0** - Content moderation\n- **v0.6.0** - Rate limiting \u0026 audit logging\n- **v0.7.0** - Heuristic analysis (Layer 2)\n- **v1.0.0** - ML classification (Layer 3)\n\n## Security\n\nFor security issues, please email security@example.com instead of using the issue tracker.\n\n## License\n\nMIT License. See [LICENSE](LICENSE) for details.\n\n## Acknowledgments\n\nBuilt following security best practices and threat models from:\n- OWASP LLM Top 10\n- AI Incident Database\n- Prompt injection research papers\n- Production LLM security deployments\n\n---\n\n**Status**: Alpha - Production-ready for prompt injection detection\n**Version**: 0.3.1\n**Elixir**: ~\u003e 1.14\n**OTP**: 25+\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FNorth-Shore-AI%2FLlmGuard","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FNorth-Shore-AI%2FLlmGuard","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FNorth-Shore-AI%2FLlmGuard/lists"}