{"id":50681277,"url":"https://github.com/sochaty/llm-governance-engine","last_synced_at":"2026-06-08T19:04:14.331Z","repository":{"id":339817567,"uuid":"1163488015","full_name":"sochaty/llm-governance-engine","owner":"sochaty","description":"A robust LLM Governance \u0026 ROI Evaluation platform designed to benchmark Frontier models against local open-source models. Built with an enterprise microservices architecture and cloud-ready for Kubernetes, this tool helps organizations optimize AI spend by calculating the accuracy-vs-cost tradeoff of local vs. cloud inference","archived":false,"fork":false,"pushed_at":"2026-03-03T18:47:20.000Z","size":324,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-03-03T21:56:42.683Z","etag":null,"topics":["angular","benchmark","fastapi","finops","generative-ai","kubernetes","llm-engineering","microservices","mlops","ollama"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sochaty.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-02-21T17:54:52.000Z","updated_at":"2026-03-03T18:20:18.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/sochaty/llm-governance-engine","commit_stats":null,"previous_names":["sochaty/llm-governance-engine"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/sochaty/llm-go
vernance-engine","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sochaty%2Fllm-governance-engine","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sochaty%2Fllm-governance-engine/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sochaty%2Fllm-governance-engine/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sochaty%2Fllm-governance-engine/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sochaty","download_url":"https://codeload.github.com/sochaty/llm-governance-engine/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sochaty%2Fllm-governance-engine/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34076021,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-08T02:00:07.615Z","response_time":111,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["angular","benchmark","fastapi","finops","generative-ai","kubernetes","llm-engineering","microservices","mlops","ollama"],"created_at":"2026-06-08T19:04:12.751Z","updated_at":"2026-06-08T19:04:14.315Z","avatar_url":"https://github.com/sochaty.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🛡️ LLM Governance Engine (2026)\n### **The Multi-Model 
Insight Bridge: Enterprise-Grade Observability \u0026 PII Guardrails**\n\n[![FastAPI](https://img.shields.io/badge/API-FastAPI-009688?style=flat-square\u0026logo=fastapi\u0026logoColor=white)](https://fastapi.tiangolo.com/)\n[![Angular](https://img.shields.io/badge/Frontend-Angular_18-DD0031?style=flat-square\u0026logo=angular\u0026logoColor=white)](https://angular.io/)\n[![Security](https://img.shields.io/badge/Security-PII_Scanning-blueviolet?style=flat-square)](https://github.com/microsoft/presidio)\n[![Ollama](https://img.shields.io/badge/Local_AI-Ollama-white?style=flat-square\u0026logo=ollama)](https://ollama.com/)\n\nThe **LLM Governance Engine** is an advanced evaluation and monitoring framework designed to bridge the gap between **Cloud-based Frontier Models** (e.g., GPT-4o) and **Private Local Inference** (e.g., Llama 3.2). \n\nThis platform serves as a \"Security Gateway,\" ensuring that every interaction is audited for PII (Personally Identifiable Information) leaks, hardware efficiency, and cost-effectiveness—providing a unified dashboard for AI governance.\n\n\n\n---\n\n## 🏗️ System Architecture\n\nThe project follows a modern, decoupled architecture centered around an **Asynchronous Orchestration** pattern.\n\n* **Frontend (Angular 18+):** Reactive UI utilizing RxJS and Signals for real-time token streaming and dynamic Radar Chart visualizations.\n* **Backend (FastAPI):** High-performance Python engine managing the \"Multi-Model Insight Bridge,\" orchestrating requests to both OpenAI and local Ollama instances.\n* **Audit Layer (Microsoft Presidio):** In-flight scanning of prompts and responses to detect and score sensitive data leaks.\n* **Telemetry (NVML/PSUtil):** Direct hardware integration to monitor VRAM and GPU utilization for local models, with graceful CPU fallback.\n* **Persistence (PostgreSQL):** Robust storage for all benchmark metadata, safety scores, and historical audit logs.\n\n\n\n---\n\n## 🚀 Quick Start\n\n### **1. 
Prerequisites**\n* **Docker Desktop** (with Compose)\n* **NVIDIA Drivers** (Optional, for GPU monitoring features)\n* **OpenAI API Key** (for Cloud benchmarking)\n\n### **2. Setup Environment**\nCreate a `.env` file in the root directory:\n```env\nOPENAI_API_KEY=your_actual_key_here\nCLOUD_MODEL_NAME=gpt-4o\nLOCAL_MODEL_NAME=llama3.2:latest\nOLLAMA_BASE_URL=http://ollama-service:11434/v1\nDATABASE_URL=postgresql+asyncpg://admin:password123@db:5432/llm_governance\n```\n### **3. Run the Stack**\n\n```bash\n# Build and launch all services in detached mode\ndocker-compose up --build -d\n```\n* **Dashboard UI:** http://localhost:4200\n\n* **Interactive API Docs:** http://localhost:8000/docs\n\n* **Ollama API:** http://localhost:11434\n\n---\n\n## 💎 Key Governance Features\n\n### 📊 Real-Time Benchmark Radar\n\nCompare models across 5 critical dimensions: Latency, Cost, Security, Faithfulness, and Hardware Intensity. The chart updates dynamically as tokens stream in, allowing for immediate ROI visualization.\n\n### 🛡️ PII Guardrails\n\nAutomatic detection of 15+ sensitive entities (Emails, SSNs, Credit Cards) using Microsoft Presidio. The engine provides a normalized **Safety Score** based on detection confidence, effectively acting as a data-loss prevention (DLP) layer for LLMs.\n\n### ⚡ Telemetry \u0026 ROI Tracking\n\n* **GPU VRAM Monitoring:** Tracks real-time memory pressure for local models to optimize infrastructure allocation.\n\n* **Cost Projection:** Calculates exact Cloud API expenses vs. the $0/token economy of local hardware.\n\n* **Energy Metrics:** Estimates power consumption (Watts) to help meet corporate sustainability goals.\n\n### 📄 One-Click Audit Reports\n\nExport any benchmark session to a professional PDF report. 
These reports include full prompt/response pairs, security flags, and performance metadata, ready for compliance review.\n\n---\n\n## 📂 Project Structure\n\n```\n├── backend/\n│   ├── app/\n│   │   ├── api/\n│   │   ├── core/           # Database setup and lifespan logic\n│   │   ├── models/         # SQLAlchemy benchmark schemas\n│   │   ├── services/       \n│   │   │   ├── audit_service.py     # PII \u0026 Safety logic\n│   │   │   └── llm_orchestrator.py  # Provider logic \u0026 streaming\n│   │   └── main.py         # FastAPI routes\n├── frontend/\n│   ├── src/app/\n│   │   ├── core/services       # API and PDF generation services\n│   │   └── features/       # Dashboard \u0026 History components\n└── docker-compose.yml       # Full stack containerization\n```\n---\n\n## 📈 2026 Roadmap\n\n- [x] **Phase 1: Foundation** - Multi-Model Streaming Orchestration (FastAPI + Angular).\n- [x] **Phase 2: Security** - Microsoft Presidio PII Integration \u0026 Real-time Guardrails.\n- [x] **Phase 3: Telemetry** - GPU/VRAM Hardware Monitoring \u0026 NVML Integration.\n- [ ] **Phase 4: Advanced Eval** - RAG Faithfulness \u0026 Hallucination Scoring (RAGAS Integration).\n- [ ] **Phase 5: Sustainability** - CO2 Emission tracking per inference session.\n- [ ] **Phase 6: Enterprise Auth** - Multi-User Role-Based Access Control (RBAC) \u0026 OAuth2.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsochaty%2Fllm-governance-engine","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsochaty%2Fllm-governance-engine","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsochaty%2Fllm-governance-engine/lists"}