https://github.com/bhagwat-chate/lexiguard

🛡️ LexiGuard is a compliance-aware GenAI system that answers legal and policy queries using evidence-based, hallucination-free, multi-LLM orchestration. It supports GDPR, HIPAA, DPDP, and AI Act via modular RAG, evaluation, and guardrails—built for FAANG-grade reliability and explainability.



Host: GitHub
URL: https://github.com/bhagwat-chate/lexiguard
Owner: bhagwat-chate
Created: 2025-07-23T01:40:28.000Z (4 months ago)
Default Branch: main
Last Pushed: 2025-07-30T02:52:07.000Z (4 months ago)
Last Synced: 2025-07-30T04:43:34.718Z (4 months ago)
Topics: anthropic, cohere, dpdp-consent, gdpr, gemini, genai, guardrails, hipaa, huggingface, llm-orchestration, openai, prompt-engineering, pydantic-validation, qdrant, ragas, trulens
Homepage:
Size: 1.68 MB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md


# 🛡️ LexiGuard – Compliance-Aware GenAI Advisor for AI Policy Governance

> **FAANG-Grade Project by [Bhagwat Chate](https://www.linkedin.com/in/aimlbhagwatchate)**
> Designed for C-suite utility, real-world auditability, and structured legal GenAI answers with zero hallucinations.

## Development in progress on branch `release/1.0/dev/bhagwat`

## 🧠 Overview

**LexiGuard** is a modular, production-ready GenAI system designed to analyze **AI policy and legal documents** (e.g., GDPR, HIPAA, EU AI Act, DPDP) and deliver:

- 📚 Contextual answers to compliance questions
- 🔍 Faithfulness, citation, and hallucination checks
- 🧾 Structured outputs conforming to legal formats
- 📌 Actionable next steps with risk classification

Built with **AutoGen**, **LangChain**, **RAGAS**, and **Guardrails.ai**, LexiGuard is engineered to meet enterprise-grade security, validation, and observability expectations.

## 🔍 Problem Statement

Modern enterprises struggle with:
- Ambiguous compliance with AI regulations
- No structured answers from legal teams or LLMs
- Risk of hallucinated clauses or unverified citations
- Siloed policy documents across regions and teams

**LexiGuard** solves this with:
- RAG + Guardrails + Multi-Agent orchestration
- Verified citations and jurisdiction mapping
- Structured legal insights with a “not legal advice” disclaimer
- Logs for every query with explainable eval metrics

## 💡 Key Features

| Feature | Description |
|----------------------------------|-----------------------------------------------------------------------------|
| 🧠 **RAG with Legal Context** | Semantic + metadata-based hybrid search on legal documents |
| ✅ **Evaluation Layer** | Faithfulness (RAGAS), Toxicity (TruLens), Hallucination & Risk detection |
| 🛡️ **Guardrails + Pydantic** | Schema enforcement with enums, disclaimers, and JSON output guarantees |
| 🔄 **Multi-Agent AutoGen Orchestration** | 13 modular agents with plug-and-play extensibility |
| 🌍 **Jurisdiction Mapping** | Compare laws like GDPR ↔ DPDP ↔ CCPA |
| 🧾 **Version Comparison** | Tracks policy updates and change logs |
| 📊 **Logs + Scorecards** | MongoDB/JSON logs with query trace, eval scores, risk tags |
| ☁️ **Cloud Native & Secure** | Dockerized, deployable on AWS EC2, S3, and MongoDB Atlas |
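
The schema-enforcement row above can be illustrated with a minimal sketch. It uses standard-library dataclasses and enums as a stand-in for Pydantic and Guardrails.ai so that it runs without extra dependencies. The field names, the `RiskLevel` values, and the disclaimer wording are all assumptions for illustration, not the project's actual schema.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum

# Hypothetical disclaimer text; the project's real wording is not shown in this README.
DISCLAIMER = "This response is informational and is not legal advice."


class RiskLevel(str, Enum):
    """Assumed risk tags; the README only shows "Medium" as an example."""
    LOW = "Low"
    MEDIUM = "Medium"
    HIGH = "High"


@dataclass
class ComplianceAnswer:
    """Hypothetical structured answer; every field name here is an assumption."""
    answer: str
    citations: list
    risk: RiskLevel
    next_step: str
    disclaimer: str = DISCLAIMER

    def __post_init__(self):
        # Coerce plain strings to the enum; unknown tags raise ValueError.
        self.risk = RiskLevel(self.risk)
        if not self.citations:
            raise ValueError("at least one citation is required")

    def to_json(self) -> str:
        # Serialize to JSON with the enum flattened to its string value.
        data = asdict(self)
        data["risk"] = self.risk.value
        return json.dumps(data)
```

Constructing a `ComplianceAnswer` with an unknown risk tag or an empty citation list fails fast, which is the behaviour a schema layer is meant to provide before a response leaves the system.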

## 🧩 Architecture

```
User → FastAPI → AutoGen Orchestrator
→ Retriever → RAG Context
→ AnswerGen → Eval Layer (RAGAS, TruLens, Citation)
→ Guardrails Layer (Schema, Enums, Disclaimers)
→ Actionability + Logging
→ Structured, Verified Response
```

➡️ See `/docs/architecture.png` for a detailed visual.
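
The flow in the diagram can be sketched as a chain of stages that pass a shared state dictionary along. This is only an illustration of the ordering: every function below is a hypothetical stub, it does not use AutoGen or FastAPI, and the state keys are assumptions.

```python
from typing import Callable, Dict, List

# A stage takes the shared state and returns it updated.
Stage = Callable[[Dict], Dict]


def retrieve(state: Dict) -> Dict:
    # Stub: a real retriever would query the vector store for relevant clauses.
    state["context"] = ["<retrieved clause text>"]
    return state


def generate_answer(state: Dict) -> Dict:
    # Stub: a real implementation would call an LLM with the query and context.
    state["answer"] = f"Draft answer to: {state['query']}"
    return state


def evaluate(state: Dict) -> Dict:
    # Stub: scores stay None here; an eval layer would fill them in.
    state["scores"] = {"faithfulness": None, "toxicity": None}
    return state


def apply_guardrails(state: Dict) -> Dict:
    # Stub: attach the disclaimer that the schema layer is meant to require.
    state["disclaimer"] = "Not legal advice."
    return state


def add_actions(state: Dict) -> Dict:
    # Stub: suggest a follow-up step; logging would also happen at this point.
    state["next_step"] = "<follow-up action>"
    return state


PIPELINE: List[Stage] = [retrieve, generate_answer, evaluate, apply_guardrails, add_actions]


def run_pipeline(query: str, stages: List[Stage] = PIPELINE) -> Dict:
    """Run each stage in order and record the stage names in a trace."""
    state: Dict = {"query": query, "trace": []}
    for stage in stages:
        state = stage(state)
        state["trace"].append(stage.__name__)
    return state
```

Keeping each stage as a plain function over a shared state makes it easy to insert, remove, or reorder steps, which is one way to get the plug-and-play behaviour the README describes.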

## 🗂️ Project Structure

See the locked folder structure in `/docs/folder_structure.md`.

## ⚙️ Tech Stack

| Layer | Stack Used |
|------------------|------------------------------------------------------------------------------|
| **Agent Runtime** | AutoGen `FunctionCallingAgent`, `GroupChat`, custom orchestrator |
| **RAG** | LangChain loaders, chunkers, OpenAI Ada-002 + Cohere Legal embeddings |
| **Vector Store** | Qdrant with metadata filtering (jurisdiction, clause type, severity) |
| **LLMs** | OpenAI GPT-4o + Cohere Command R+ (fallback) |
| **Evaluation** | RAGAS, TruLens, OpenAI Moderation API |
| **Validation** | Pydantic + Guardrails.ai enforced schemas |
| **API** | FastAPI with modular routes |
| **Infra** | Docker, Redis, MongoDB/JSON Logs, AWS EC2/S3 |
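
To make the metadata-filtering idea in the Vector Store row concrete, here is a small in-memory sketch: filter candidates by metadata first, then rank the survivors by cosine similarity. It is a stand-in written in plain Python and does not call Qdrant; the point layout, the `search` function, and the toy vectors are assumptions.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec: Sequence[float], points: List[Dict],
           filters: Dict[str, str], top_k: int = 3) -> List[Dict]:
    """Keep points whose metadata matches every filter, then rank by similarity."""
    candidates = [
        p for p in points
        if all(p["meta"].get(key) == value for key, value in filters.items())
    ]
    ranked = sorted(candidates, key=lambda p: cosine(query_vec, p["vector"]), reverse=True)
    return ranked[:top_k]


# Toy data with made-up two-dimensional embeddings.
POINTS = [
    {"id": "a", "vector": [1.0, 0.0], "meta": {"jurisdiction": "GDPR", "clause_type": "consent"}},
    {"id": "b", "vector": [0.9, 0.1], "meta": {"jurisdiction": "DPDP", "clause_type": "consent"}},
    {"id": "c", "vector": [0.0, 1.0], "meta": {"jurisdiction": "DPDP", "clause_type": "retention"}},
]

hits = search([1.0, 0.0], POINTS, {"jurisdiction": "DPDP"})
print([p["id"] for p in hits])  # prints ['b', 'c']
```

Filtering before ranking keeps clauses from the wrong jurisdiction out of the context even when their embeddings are close to the query, as point `a` is here.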

## 🧪 How to Run Locally

```bash
# 1. Clone the repo
git clone https://github.com/bhagwat-chate/lexiguard.git && cd lexiguard

# 2. Set up a virtual environment
python -m venv venv && source venv/bin/activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Load documents into vector store
python scripts/load_documents.py

# 5. Run the FastAPI server
uvicorn api.main:app --reload
```

Optional: run `docker-compose up` to start the full stack (Redis + MongoDB + API).

## 🧪 Example Use Cases

| Prompt | Output Description |
|------------------------------------------------------------------------|------------------------------------------------------------------|
| “Is sharing personal data with 3rd parties allowed under DPDP?” | Cited clause, risk tag = “Medium”, follow-up = “Need DPO review”|
| “How does GDPR differ from DPDP on consent?” | Cross-jurisdiction expander returns clause diff summary |
| “What changed in our privacy policy since last version?” | Version comparator highlights added/removed sections |
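
The version-comparison use case in the last row can be approximated with the standard library's `difflib`. The sketch below assumes each policy version has already been split into a list of section titles or lines; the `compare_versions` helper is hypothetical and is not the project's comparator.

```python
import difflib
from typing import Dict, List


def compare_versions(old: List[str], new: List[str]) -> Dict[str, List[str]]:
    """Return the lines added to and removed from a policy between two versions."""
    added: List[str] = []
    removed: List[str] = []
    for line in difflib.ndiff(old, new):
        # ndiff prefixes each line with "+ ", "- ", "  ", or "? ".
        if line.startswith("+ "):
            added.append(line[2:])
        elif line.startswith("- "):
            removed.append(line[2:])
    return {"added": added, "removed": removed}


old_policy = ["Data retention", "Cookies"]
new_policy = ["Data retention", "Third-party sharing"]
print(compare_versions(old_policy, new_policy))
# prints {'added': ['Third-party sharing'], 'removed': ['Cookies']}
```

A line-level diff like this only reports textual changes; deciding whether a change matters for compliance would still need the retrieval and evaluation steps described earlier.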

## 📈 Success Metrics

- ✅ 100% JSON-structured outputs (schema-conforming)
- ✅ 90%+ RAGAS faithfulness score
- ✅ <1% hallucination and toxicity rate
- ✅ Deployment-ready API on AWS
- ✅ Works across GDPR, HIPAA, DPDP documents
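
One way to check the numeric targets in this list against logged evaluations is sketched below. The record fields (`schema_valid`, `faithfulness`, `hallucinated`) are assumed names for illustration and do not reflect the project's log format; a toxicity rate could be handled the same way as the hallucination rate.

```python
from typing import Dict, List


def summarize(records: List[Dict]) -> Dict[str, float]:
    """Aggregate per-query eval records into the three rates used below."""
    n = len(records)
    if n == 0:
        raise ValueError("no records to summarize")
    return {
        "schema_conformance": sum(r["schema_valid"] for r in records) / n,
        "faithfulness": sum(r["faithfulness"] for r in records) / n,
        "hallucination_rate": sum(r["hallucinated"] for r in records) / n,
    }


def meets_targets(summary: Dict[str, float]) -> Dict[str, bool]:
    """Compare a summary with the targets listed above."""
    return {
        "schema_conformance": summary["schema_conformance"] == 1.0,  # 100% structured
        "faithfulness": summary["faithfulness"] >= 0.90,             # 90%+
        "hallucination_rate": summary["hallucination_rate"] < 0.01,  # <1%
    }


# Two made-up records, used only to show the calculation.
records = [
    {"schema_valid": True, "faithfulness": 1.0, "hallucinated": False},
    {"schema_valid": True, "faithfulness": 0.5, "hallucinated": True},
]
summary = summarize(records)
print(meets_targets(summary))
# prints {'schema_conformance': True, 'faithfulness': False, 'hallucination_rate': False}
```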

## 📚 Documentation

- `HLD.md` – High-level architecture
- `LLD.md` – Class and component-level breakdown
- `/docs/agent_docs/` – Each agent’s responsibility, input/output
- `/docs/prompts.md` – Standardized prompt strategies
- `/docs/examples/` – Sample queries and JSON responses

## 🤝 License

MIT License — built for educational, enterprise, and interview showcase use.

## 🚀 Author

**Bhagwat Chate**
AI/ML Lead | GenAI Architect
📫 [LinkedIn](https://www.linkedin.com/in/aimlbhagwatchate) • [GitHub](https://github.com/bhagwat-chate)
