https://github.com/anulum/director-ai

Real-time LLM hallucination guardrail — NLI + RAG fact-checking with token-level streaming halt. Drop-in for any LLM backend.
https://github.com/anulum/director-ai

ai-safety coherence deberta factual-consistency hallucination-detection llm llm-guardrail machine-learning nli python rag

Last synced: 3 months ago
JSON representation

Real-time LLM hallucination guardrail — NLI + RAG fact-checking with token-level streaming halt. Drop-in for any LLM backend.

Host: GitHub
URL: https://github.com/anulum/director-ai
Owner: anulum
License: agpl-3.0
Created: 2026-02-15T16:53:34.000Z (5 months ago)
Default Branch: main
Last Pushed: 2026-03-25T18:52:35.000Z (4 months ago)
Last Synced: 2026-03-25T19:10:12.046Z (4 months ago)
Topics: ai-safety, coherence, deberta, factual-consistency, hallucination-detection, llm, llm-guardrail, machine-learning, nli, python, rag
Language: Python
Homepage: https://anulum.github.io/director-ai/
Size: 26.6 MB
Stars: 2
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- Funding: .github/FUNDING.yml
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Audit: AUDIT_2026-03-11.md
- Citation: CITATION.cff
- Codeowners: .github/CODEOWNERS
- Security: SECURITY.md
- Support: SUPPORT.md
- Governance: GOVERNANCE.md
- Roadmap: ROADMAP.md
- Zenodo: .zenodo.json
- Notice: NOTICE.md

Awesome Lists containing this project

README

          


  



Director-AI




  Real-time LLM hallucination guardrail — NLI + RAG fact-checking with token-level streaming halt





  

  

  

  

  

  

  

  

  

  

  

  

  

  

  

  

  

  



---

## What It Does

Director-AI sits between your LLM and the user. It scores every output for

hallucination before it reaches anyone — and can halt generation mid-stream if

coherence drops below threshold.

```mermaid

graph LR

    LLM["LLM
(any provider)"] --> D["Director-AI"]

    D --> S["Scorer
NLI + RAG"]

    D --> K["StreamingKernel
token-level halt"]

    S --> V{Approved?}

    K --> V

    V -->|Yes| U["User"]

    V -->|No| H["HALT + evidence"]

```

**Ten things make it different:**

1. **Token-level streaming halt** — not post-hoc review. Severs output the moment coherence degrades.

2. **Dual-entropy scoring** — NLI contradiction detection (DeBERTa) + RAG fact-checking against your knowledge base.

3. **Meta-confidence** — the guardrail tells you how confident it is in its own verdict. Route low-confidence results to human review.

4. **Structured output verification** — JSON schema validation, tool call fabrication detection, code hallucinated API detection. Zero dependencies (stdlib only).

5. **Online calibration** — collects human feedback, automatically adjusts thresholds for your deployment. The longer you use it, the better it gets.

6. **Contradiction tracking** — detects when an AI contradicts itself across conversation turns.

7. **EU AI Act compliance** — automated Article 15 documentation. Accuracy metrics, drift detection, feedback loop detection, audit trails, per-model breakdown with confidence intervals. Ready for August 2026 enforcement.

8. **Verification gems** — numeric consistency checks, reasoning chain verification, temporal freshness scoring, cross-model consensus, conformal prediction intervals. All stdlib-only, zero dependencies.

9. **Agentic loop monitor** — detects circular tool calls, goal drift, and budget exhaustion in AI agent loops. The first guardrail that monitors agent execution, not just individual calls.

10. **Adversarial self-test** — 25-pattern robustness suite tests your guardrail against zero-width chars, homoglyphs, encoding tricks, and prompt injection.

### Scope

Pure Python core — no compiled extensions required. Optional Rust kernel (`pip install director-ai[rust]`) for SIMD-accelerated scoring. Works on any platform with Python 3.11+.

| Layer | Packages | Install |

|-------|----------|---------|

| **Core** (zero heavy deps) | `CoherenceScorer`, `StreamingKernel`, `GroundTruthStore`, `HaltMonitor` | `pip install director-ai` |

| **NLI models** | DeBERTa, FactCG, MiniCheck, ONNX Runtime | `pip install director-ai[nli]` |

| **Vector DBs** | ChromaDB (`[vector]`), Pinecone (`[pinecone]`), Weaviate (`[weaviate]`), Qdrant (`[qdrant]`) | `pip install director-ai[vector]` |

| **LLM judge** | OpenAI, Anthropic escalation | `pip install director-ai[openai]` |

| **Observability** | OpenTelemetry spans | `pip install director-ai[otel]` |

| **Server** | FastAPI + Uvicorn | `pip install director-ai[server]` |

## Four Ways to Add Guardrails

### A: Wrap your SDK (6 lines)

Duck-type detection for five SDK shapes: OpenAI-compatible (OpenAI, vLLM, Groq,

LiteLLM, Ollama), Anthropic, AWS Bedrock, Google Gemini, and Cohere.

```python

from director_ai import guard

from openai import OpenAI

client = guard(

    OpenAI(),

    facts={"refund_policy": "Refunds within 30 days only"},

)

response = client.chat.completions.create(

    model="gpt-4o-mini",

    messages=[{"role": "user", "content": "What is the refund policy?"}],

)

```

### B: One-shot check (4 lines)

Score a single prompt/response pair without an SDK client:

```python

from director_ai import score

cs = score("What is the refund policy?", response_text,

           facts={"refund": "Refunds within 30 days only"},

           threshold=0.3)

print(f"Coherence: {cs.score:.3f}  Approved: {cs.approved}")

```

### C: Zero code changes (2 lines)

Point any OpenAI-compatible client at the proxy:

```bash

pip install director-ai[server]

director-ai proxy --port 8080 --facts kb.txt --threshold 0.3

```

Then set `OPENAI_BASE_URL=http://localhost:8080/v1` in your app. Every response

gets scored; hallucinations are rejected (or flagged with `--on-fail warn`).

### D: FastAPI middleware (3 lines)

Guard your own API endpoints:

```python

from director_ai.integrations.fastapi_guard import DirectorGuard

app.add_middleware(DirectorGuard,

    facts={"policy": "Refunds within 30 days only"},

    on_fail="reject",

)

```

Responses on POST endpoints get `X-Director-Score` and `X-Director-Approved`

headers. Set `paths=["/api/chat"]` to limit which endpoints are scored.

## Installation

```bash

pip install "director-ai[nli]"                    # recommended — NLI model scoring

pip install "director-ai[nli,vector,server]"       # production stack with RAG + REST API

pip install "director-ai[nli,voice]"               # voice AI with TTS adapters

pip install director-ai                            # heuristic-only (limited accuracy)

```

Extras: `[vector]` (ChromaDB), `[voice]` (ElevenLabs, OpenAI TTS, Deepgram), `[finetune]` (domain adaptation), `[ingestion]` (PDF/DOCX parsing), `[colbert]` (late-interaction retrieval).

Framework integrations: `[langchain]`, `[llamaindex]`, `[langgraph]`, `[haystack]`, `[crewai]`, Semantic Kernel, DSPy/Instructor.

Kubernetes: [Helm chart](deploy/helm/director-ai/) with GPU toggle, HPA, Sigstore-signed releases.

Voice AI: `VoiceGuard` (sync) and `AsyncVoiceGuard` + `voice_pipeline()` (async) — real-time token filter for TTS pipelines with ElevenLabs, OpenAI TTS, and Deepgram adapters ([guide](https://anulum.github.io/director-ai/guide/voice-ai/)).

Full installation guide: [docs](https://anulum.github.io/director-ai/installation/).

## Docker

Dockerfile included for self-hosted builds. Pre-built images not yet published to a registry.

```bash

docker build -t director-ai .                                      # build locally

docker run -p 8080:8080 director-ai                                # CPU

docker build -f Dockerfile.gpu -t director-ai:gpu .                # GPU build

docker run --gpus all -p 8080:8080 director-ai:gpu                 # GPU

```

## Benchmarks

### Accuracy — LLM-AggreFact (29,320 samples)

Scoring model: [`yaxili96/FactCG-DeBERTa-v3-Large`](https://huggingface.co/yaxili96/FactCG-DeBERTa-v3-Large) (0.4B params, MIT license).

| Model | Balanced Acc | Params | Latency | Streaming |

|-------|-------------|--------|---------|-----------|

| Bespoke-MiniCheck-7B | **77.4%** | 7B | ~100 ms | No |

| **Director-AI (FactCG)** | **75.8%** | 0.4B | **14.6 ms** | **Yes** |

| MiniCheck-Flan-T5-L | 75.0% | 0.8B | ~120 ms | No |

| MiniCheck-DeBERTa-L | 72.6% | 0.4B | ~120 ms | No |

75.8% balanced accuracy comes from the FactCG-DeBERTa-v3-Large model (77.2% in

the [NAACL 2025 paper](https://arxiv.org/abs/2501.17144); our eval yields 75.86%

due to threshold tuning and data split version). Latency: 14.6 ms/pair measured

on GTX 1060 6GB with ONNX GPU batching (16-pair batch, 30 iterations, 5 warmup).

Director-AI's unique value is the *system*: NLI + KB + streaming halt.

Full results: [`benchmarks/comparison/COMPETITOR_COMPARISON.md`](benchmarks/comparison/COMPETITOR_COMPARISON.md).

Performance trade-offs and E2E pipeline metrics: [docs](https://anulum.github.io/director-ai/guide/streaming/).

## Domain Presets

10 built-in profiles with preset thresholds (starting points — adjust for your data):

```bash

director-ai config --profile medical   # threshold=0.30, NLI on, reranker on

director-ai config --profile finance   # threshold=0.30, w_fact=0.6

director-ai config --profile legal     # threshold=0.30, w_logic=0.6

director-ai config --profile creative  # threshold=0.40, permissive

```

Domain-specific benchmark scripts exist but have not yet been validated with measured results.

Run them yourself (requires GPU + HuggingFace datasets):

```bash

python -m benchmarks.medical_eval   # MedNLI + PubMedQA

python -m benchmarks.legal_eval     # ContractNLI + CUAD (RAGBench)

python -m benchmarks.finance_eval   # FinanceBench + Financial PhraseBank

```

Known Limitations & When Not to Use

#### Accuracy

- **Heuristic fallback is weak**: Without `[nli]`, scoring uses word-overlap heuristics (~55% accuracy). Use `strict_mode=True` to reject (0.9) instead of guessing.

- **Summarisation FPR at 10.5%**: Reduced from 95% via bidirectional NLI + baseline calibration (v3.5). AggreFact-CNN: 68.8%, ExpertQA: 59.1% (structurally expected at 0.4B params).

- **NLI-only scoring needs KB grounding**: Without a knowledge base, PubMedQA F1=62.1%, FinanceBench 80%+ FPR. Load your domain facts into the vector store.

#### Performance

- **ONNX CPU is slow**: 383 ms/pair without GPU. Use `onnxruntime-gpu` for production.

- **Long documents need ≥16GB VRAM**: Legal contracts and SEC filings exceed 6GB during chunked NLI inference.

#### Configuration

- **Weights are domain-dependent**: Default `w_logic=0.6, w_fact=0.4` suits general QA. Adjust for your domain or use a built-in profile.

- **Threshold defaults differ by API surface**: `guard()`/`score()` default to `threshold=0.3` (permissive). `DirectorConfig` defaults to `coherence_threshold=0.6` (conservative). Always set the threshold explicitly.

#### Privacy

- **LLM-as-judge sends data externally**: When `llm_judge_enabled=True`, truncated prompt+response (500 chars) are sent to the configured provider. Do not enable in privacy-sensitive deployments without user consent. The default NLI-only mode runs entirely locally with no external calls.

## Citation

```bibtex

@software{sotek2026director,

  author    = {Sotek, Miroslav},

  title     = {Director-AI: Real-time LLM Hallucination Guardrail},

  year      = {2026},

  url       = {https://github.com/anulum/director-ai},

  version   = {3.11.1},

  license   = {AGPL-3.0-or-later}

}

```

## License

Dual-licensed:

1. **Open-Source**: [GNU AGPL v3.0](LICENSE) — research, personal use, open-source projects.

2. **Commercial**: [Proprietary license](https://www.anulum.li/licensing) — removes copyleft for closed-source and SaaS.

See [Licensing](docs-site/licensing.md) for pricing tiers and FAQ.

Contact: [anulum.li](https://www.anulum.li) | [director.class.ai@anulum.li](mailto:director.class.ai@anulum.li)

## Community

Join the [Director-AI Discord](https://discord.gg/JvMdKv49) for CI notifications, release announcements, and support. The Discord bot also provides `/version`, `/docs`, `/install`, `/status`, and `/quickstart` slash commands.

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md). By contributing, you agree to AGPL v3 terms.

---



  

    

  

      

  

    

  

  


  Developed by ANULUM / Fortis Studio

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/anulum/director-ai

Awesome Lists containing this project

README

Director-AI