https://github.com/edycutjong/quorum
Offline multi-agent document council: 3 AI agents debate your documents to a cited answer. The visible disagreement IS the trust mechanism. Built for QVAC Edge AI Hackathon.
edge-ai hackathon local-llm multi-agent offline-ai qvac-sdk rag react typescript vite
- Host: GitHub
- URL: https://github.com/edycutjong/quorum
- Owner: edycutjong
- License: mit
- Created: 2026-06-04T16:17:59.000Z (26 days ago)
- Default Branch: main
- Last Pushed: 2026-06-12T11:20:47.000Z (18 days ago)
- Last Synced: 2026-06-12T13:12:08.547Z (18 days ago)
- Topics: edge-ai, hackathon, local-llm, multi-agent, offline-ai, qvac-sdk, rag, react, typescript, vite
- Language: TypeScript
- Homepage:
- Size: 258 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Security: SECURITY.md
- Agents: AGENTS.md
## For Judges: Review in 5 Steps
> Offline multi-agent document council on `@qvac/sdk`. **Zero cloud, verifiable.**
1. **Watch the 3-min demo** (network off on camera): https://youtu.be/tnVqrbXNMco
2. **Run it locally** (first launch downloads models ~2 GB):
```bash
npm install
npm run start # backend :3001 + web app :5173 (keep running)
make seed # 2nd terminal: ingest the dossier (curl -X POST http://localhost:3001/api/seed)
# open http://localhost:5173; the status pill reads "LIVE · QVAC"
```
3. **Ask the demo query:** *"Who authorized the Entity X payment and was it legitimate?"* Then watch the **Researcher → Skeptic → Synthesizer** debate stream in. The Skeptic catches the planted contradiction (VP Chen "authorized" a payment while on PTO), and the Synthesizer returns a **cited, disputed verdict with lowered confidence**.
4. **Verify the claims:**
   - **No remote APIs**, zero cloud calls: [`docs/REMOTE_APIS.md`](docs/REMOTE_APIS.md)
   - **Structured audit log** of model loads/unloads plus per-inference TTFT, tokens, and tokens per second: [`docs/AUDIT_LOG.md`](docs/AUDIT_LOG.md), with a real run in [`docs/audit-log.jsonl`](docs/audit-log.jsonl)
   - `python3 scripts/verify_offline.py` → **0 outbound** connections (disconnect the network first) · 18/18 checks
   - `npm run bench` → real on-device latency and contradiction recall, written to [`data/bench_results.json`](data/bench_results.json) (p50 ≈ **2.1 s**, peak RAM ≈ **180 MB**, citation coverage **1.0**)
   - `npm run ci` → **163 unit tests, 100% core coverage** plus lint and typecheck
5. **Why only QVAC / no remote APIs** ([`docs/REMOTE_APIS.md`](docs/REMOTE_APIS.md)): all inference is local (Llama 3.2 1B + GTE-Large via `@qvac/sdk`). See the "Why ONLY QVAC?" section: remove QVAC and you'd need a cloud LLM plus a hosted vector DB, and the confidentiality premise is gone.
---
# Quorum
Offline multi-agent document council: 3 AI agents debate your documents to a cited answer. The visible disagreement IS the trust mechanism.
[Demo video](https://youtu.be/tnVqrbXNMco) · [DoraHacks hackathon](https://dorahacks.io/hackathon/qvac-unleach-edge-ai-i/detail) · [Hackathon tracks](https://dorahacks.io/hackathon/qvac-unleach-edge-ai-i/tracks) · [CI workflow](https://github.com/edycutjong/quorum/actions/workflows/ci.yml)
---
## The Problem & Solution
When analyzing confidential documents (legal dossiers, financial audits, HR records), you can't upload them to a cloud AI. But a single LLM tends to parrot the first document it reads and miss contradictions.
**Quorum** solves this with a **3-agent council** that cross-examines your corpus entirely offline:
**Key Features:**
- **Researcher**: retrieves relevant documents and proposes an initial answer with citations
- **Skeptic**: counter-retrieves to challenge claims and find planted contradictions
- **Synthesizer**: reconciles the viewpoints and assigns HIGH/MEDIUM/LOW confidence
- **Every claim cited**: each source chunk is mapped to its exact document
- **Contradiction detection**: the Skeptic catches what a single LLM would miss
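As a rough illustration of how a HIGH/MEDIUM/LOW label could follow from the Skeptic's findings, here is a minimal sketch. The `Objection` shape, the `assignConfidence` name, and the rule itself are assumptions made for this example; the actual logic in `src/core/council.ts` may differ.

```typescript
// Hypothetical types: the real shapes in src/core/council.ts may differ.
type Confidence = "HIGH" | "MEDIUM" | "LOW";

interface Objection {
  claim: string;     // the Researcher claim being challenged
  sourceDoc: string; // document the Skeptic cites against it
  resolved: boolean; // true if the Synthesizer could reconcile it
}

// Assumed rule: no objections -> HIGH; all reconciled -> MEDIUM; any open dispute -> LOW.
function assignConfidence(objections: Objection[]): Confidence {
  if (objections.length === 0) return "HIGH";
  return objections.some((o) => !o.resolved) ? "LOW" : "MEDIUM";
}

const verdict = assignConfidence([
  { claim: "VP Chen authorized the payment on March 12", sourceDoc: "hr_access_logs.txt", resolved: false },
]);
console.log(verdict); // LOW
```

Keeping the rule as a pure function makes the "disputed verdict with lowered confidence" behavior easy to unit test without loading a model.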
## See It In Action
*Real local inference: the network is off the entire time (note the **`LIVE · QVAC`** pill).*
**The contradiction catch.** Ask *"Who authorized the Entity X payment and was it legitimate?"* A naive RAG pipeline repeats the memo (*"VP Chen, March 12"*), but Quorum's Skeptic re-queries and surfaces the conflicting HR access logs and board minutes, so the Synthesizer returns a **cited, disputed verdict with lowered confidence**.
Every claim maps to an exact source chunk
A second debate: governance §4.2 compliance
## Architecture & Tech Stack
```mermaid
flowchart LR
Q["User Query"] --> R["Researcher"]
R --> S["Skeptic"]
S --> Y["Synthesizer"]
Y --> A["Cited Answer"]
R -.- R1["RAG retrieve\n+ propose"]
S -.- S1["Counter-retrieve\n+ challenge"]
Y -.- Y1["Reconcile\n+ confidence"]
style Q fill:#1e293b,stroke:#06b6d4,color:#f1f5f9
style R fill:#0e2a30,stroke:#06b6d4,color:#06b6d4
style S fill:#2a1f0e,stroke:#f59e0b,color:#f59e0b
style Y fill:#0e2a14,stroke:#22c55e,color:#22c55e
style A fill:#1e293b,stroke:#06b6d4,color:#f1f5f9
style R1 fill:none,stroke:#06b6d4,color:#94a3b8,stroke-dasharray:4
style S1 fill:none,stroke:#f59e0b,color:#94a3b8,stroke-dasharray:4
style Y1 fill:none,stroke:#22c55e,color:#94a3b8,stroke-dasharray:4
```
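The flow in the diagram can be sketched as a loop in which each agent sees the query and every earlier turn. The `Turn` and `Agent` types and the stub agents below are hypothetical and stand in for local model calls; they are not the `council.ts` API, and the code is synchronous only to keep the sketch short.

```typescript
// Hypothetical shapes for illustration; not the actual council.ts API.
type Role = "Researcher" | "Skeptic" | "Synthesizer";

interface Turn {
  role: Role;
  text: string;
  citations: string[]; // source documents backing this turn
}

// An agent sees the query plus every earlier turn and returns its own turn.
// Written synchronously to stay short; real inference calls would be async.
type Agent = (query: string, earlier: readonly Turn[]) => Turn;

function runCouncil(query: string, agents: Agent[]): Turn[] {
  const transcript: Turn[] = [];
  for (const agent of agents) {
    transcript.push(agent(query, transcript.slice())); // pass a copy
  }
  return transcript;
}

// Stub agents standing in for local model calls.
const researcher: Agent = () => ({
  role: "Researcher",
  text: "The memo says VP Chen authorized the payment.",
  citations: ["memo_ref_4821.txt"],
});
const skeptic: Agent = (_q, earlier) => ({
  role: "Skeptic",
  text: `Challenging ${earlier.length} earlier turn(s): Chen was on PTO.`,
  citations: ["board_minutes_march.txt"],
});
const synthesizer: Agent = (_q, earlier) => ({
  role: "Synthesizer",
  text: "Disputed: the sources conflict.",
  // Merge every citation raised during the debate.
  citations: earlier.reduce<string[]>((all, t) => all.concat(t.citations), []),
});

const debate = runCouncil("Who authorized the Entity X payment?", [researcher, skeptic, synthesizer]);
console.log(debate.map((t) => t.role).join(" -> ")); // Researcher -> Skeptic -> Synthesizer
```

Because every agent receives the running transcript, the Synthesizer can carry forward all citations raised earlier in the debate.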
| Layer | Technology |
|---|---|
| **Frontend** | Vite 8, React 19, TypeScript |
| **AI Engine** | @qvac/sdk (completion, RAG) |
| **Embeddings** | GTE-Large-FP16 via @qvac/sdk |
| **LLM** | Llama 3.2 1B (local) |
## Why ONLY QVAC?
| QVAC SDK Method | Quorum Usage | Cloud Alternative You'd Need |
|---|---|---|
| `loadModel()` + `completion()` | Runs all 3 agents (Researcher, Skeptic, Synthesizer) | OpenAI API ($0.03/query × 3 agents) |
| `ragIngest()` + `ragSearch()` | Embeds & searches private dossier locally | Pinecone + OpenAI Embeddings API |
| `loadModel(GTE_LARGE_FP16)` | 1024-dim embeddings for citation matching | Cohere Embed API |
| `unloadModel()` | Memory lifecycle: load once, run 3 agents, unload | N/A (cloud doesn't care) |
**Take QVAC out and you'd need 3 separate cloud services** (OpenAI + Pinecone + Cohere) plus a network connection, and your confidential documents would leave your machine.
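The "load once, run 3 agents, unload" lifecycle from the table could look roughly like the sketch below. The `ModelRuntime` interface and its method signatures are hypothetical and are not the actual `@qvac/sdk` API; an in-memory fake is used so the call order is visible.

```typescript
// Hypothetical runtime interface: method names and signatures are assumptions,
// not the actual @qvac/sdk API.
interface ModelRuntime {
  load(modelName: string): number; // returns a handle
  complete(handle: number, prompt: string): string;
  unload(handle: number): void;
}

// Load once, hand the model to the caller, and always unload afterwards.
function withModel<T>(
  rt: ModelRuntime,
  modelName: string,
  use: (ask: (prompt: string) => string) => T,
): T {
  const handle = rt.load(modelName);
  try {
    return use((prompt) => rt.complete(handle, prompt));
  } finally {
    rt.unload(handle); // runs even if an agent throws
  }
}

// In-memory fake that records the call order.
const calls: string[] = [];
const fakeRuntime: ModelRuntime = {
  load: (m) => { calls.push(`load ${m}`); return 1; },
  complete: (_h, p) => { calls.push(`complete ${p}`); return `answer to ${p}`; },
  unload: () => { calls.push("unload"); },
};

withModel(fakeRuntime, "llama-3.2-1b", (ask) =>
  ["researcher", "skeptic", "synthesizer"].map((agentName) => ask(agentName)),
);
console.log(calls.join(", "));
// load llama-3.2-1b, complete researcher, complete skeptic, complete synthesizer, unload
```

Wrapping the agents in `try`/`finally` is one way to make sure memory is released on a small device even when a debate fails midway.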
## Dossier: Planted Contradictions
The demo includes a 5-document **Northwind dossier** with deliberate contradictions:
| Document | Claims | Contradiction |
|---|---|---|
| `memo_ref_4821.txt` | VP Chen authorized $2.4M payment March 12 | Chen was on PTO |
| `board_minutes_march.txt` | Chen PTO March 11-15, no Entity X discussion | Memo claims March 12 |
| `q1_financial_report.txt` | Audit flags: no SOW, no deliverables | Payment processed |
| `hr_access_logs.txt` | No badge/VPN access March 12 | Memo timestamped March 12 |
| `governance_charter.txt` | >$1M needs board resolution | No board approval |
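The core planted contradiction in the table is a date check: the memo is dated March 12, inside the PTO window of March 11-15. A minimal sketch of that check follows; the `DayRange` type and `isWithin` helper are illustrative names, and the council itself relies on the language model rather than on code like this.

```typescript
// Month-day strings ("MM-DD") compare in calendar order within one year,
// so no year has to be assumed for the dossier.
interface DayRange {
  start: string; // inclusive
  end: string;   // inclusive
}

function isWithin(day: string, range: DayRange): boolean {
  return day >= range.start && day <= range.end;
}

const chenPto: DayRange = { start: "03-11", end: "03-15" }; // board_minutes_march.txt
const memoDay = "03-12";                                     // memo_ref_4821.txt

if (isWithin(memoDay, chenPto)) {
  console.log("Contradiction: the memo is dated inside Chen's PTO window");
}
```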
## Getting Started
```bash
git clone https://github.com/edycutjong/quorum.git
cd quorum
npm install
npm run start # backend (:3001) + web app (:5173); keep this running (or: make start)
# then, in a SECOND terminal, ingest the dossier into the running backend:
curl -X POST http://localhost:3001/api/seed # or: make seed
# open http://localhost:5173; the status pill should read "LIVE · QVAC"
```
> First launch downloads the local models. The status pill reads **DEMO · OFFLINE**
> until the backend is reachable; once it's up and seeded it switches to **LIVE · QVAC**.
>
> **Devastating Demo Query:** "Who authorized the Entity X payment and was it legitimate?"
## Benchmarks
Run `npm run bench` to reproduce. This runs the **real** 3-agent council over the
dossier via `@qvac/sdk` and writes `data/bench_results.json` (latency, contradiction
recall, citation coverage). Use `npm run bench -- --assert` to fail on budget regressions.
Representative run on an **Apple M1 Max (32 GB)** (reproduce with `npm run bench`):
| Metric | Measured | Budget |
|---|---|---|
| Full Council Round (p50 / p95) | ~2.5s / ~2.7s | <15,000ms |
| Model Load (cold) | ~1.2s | <10,000ms |
| Corpus Ingest (5 docs → 9 chunks) | ~1.8s | n/a |
| Citation coverage | 1.0 | ≥0.95 |
| Contradiction recall (planted set) | 0.67–1.0¹ | 1.0 |
| Peak RAM | ~196 MB | <4,096MB |
> ¹ Recall varies run-to-run: the Skeptic reliably retrieves the conflicting
> documents, but Llama-3.2-1B is non-deterministic and doesn't always phrase an
> explicit objection, an honest limitation of a 1B model on-device. `npm run bench`
> records real measurements from your hardware into `data/bench_results.json`.
> (The legacy `scripts/bench.py` is a deterministic simulation kept only as a CI smoke test.)
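For readers unfamiliar with the metrics in the table, the sketch below shows one common way to compute p50/p95 latency (nearest-rank percentile) and contradiction recall, then compare against a budget. The function names and sample numbers are made up for illustration, and the repository's bench script may compute these differently. A recall of 0.67 would correspond to catching 2 of 3 planted contradictions, assuming a planted set of three.

```typescript
// Nearest-rank percentile over latency samples in milliseconds.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

// Fraction of planted contradictions that the council flagged.
function contradictionRecall(planted: string[], flagged: string[]): number {
  if (planted.length === 0) return 1;
  const hits = planted.filter((id) => flagged.indexOf(id) !== -1).length;
  return hits / planted.length;
}

// Made-up sample data, not measurements from the repository.
const latenciesMs = [2400, 2500, 2700, 2450, 2600];
const p50 = percentile(latenciesMs, 50);
const p95 = percentile(latenciesMs, 95);
const recallScore = contradictionRecall(["pto", "no-sow", "no-board-vote"], ["pto", "no-board-vote"]);

const BUDGET_MS = 15000; // the "<15,000ms" full-round budget
console.log(p50, p95, recallScore.toFixed(2), p95 < BUDGET_MS ? "within budget" : "over budget");
// 2500 2700 0.67 within budget
```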
## Testing & CI
**171 tests · 100% core coverage:** 163 unit tests (Vitest) covering RAG citation mapping & chunking, agent orchestration, contradiction-driven confidence, the audit log, and the offline SDK wrappers, plus 8 E2E specs (Playwright), backed by 18 offline-verification checks (`verify_offline.py`).
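To make "citation mapping & chunking" concrete, here is a minimal sketch of a chunker that tags every chunk with its source file so a citation can point back to an exact document. Character-based splitting with overlap, the `Chunk` shape, and the `chunkDocument` name are all assumptions for this example; the strategy in `src/core/rag.ts` is not described here and may differ.

```typescript
interface Chunk {
  docId: string; // file the chunk came from, so a citation can name it
  index: number; // position of the chunk within that file
  text: string;
}

// Split text into fixed-size character windows that overlap by `overlap` characters.
function chunkDocument(docId: string, text: string, size: number, overlap: number): Chunk[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("require size > 0 and 0 <= overlap < size");
  }
  const chunks: Chunk[] = [];
  const step = size - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push({ docId, index: chunks.length, text: text.slice(start, start + size) });
    if (start + size >= text.length) break; // last chunk reached the end
  }
  return chunks;
}

const pieces = chunkDocument("memo_ref_4821.txt", "abcdefghij", 4, 1);
console.log(pieces.map((c) => `${c.docId}#${c.index}:${c.text}`).join(" "));
// memo_ref_4821.txt#0:abcd memo_ref_4821.txt#1:defg memo_ref_4821.txt#2:ghij
```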
## Verification & Compliance
Everything the judges' verification asks for, as concrete artifacts:
| Gate | Where | How to reproduce |
|---|---|---|
| **No remote APIs**: zero cloud calls | [`docs/REMOTE_APIS.md`](docs/REMOTE_APIS.md) | `python3 scripts/verify_offline.py` (scans for banned cloud SDKs; 18/18) |
| **Structured audit log**: model loads/unloads + inference perf (prompt, tokens, TTFT, tokens/sec) | [`docs/AUDIT_LOG.md`](docs/AUDIT_LOG.md) → [`docs/audit-log.jsonl`](docs/audit-log.jsonl) | on by default; `npm run start` + a query writes it |
| **Offline proof**: 0 outbound connections | `scripts/verify_offline.py` | disconnect network, then run |
| **Real on-device benchmarks**: latency, recall, RAM | [`data/bench_results.json`](data/bench_results.json) | `npm run bench` |
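The "scans for banned cloud SDKs" gate can be pictured as a deny-list check over a package manifest. The sketch below is a TypeScript rendering of that idea, not a port of `scripts/verify_offline.py`: the deny-list contents, the `findBannedDeps` name, and the sample manifests are assumptions.

```typescript
// Hypothetical deny-list: the names the real script checks are not shown here.
const BANNED_PACKAGES = ["openai", "@pinecone-database/pinecone", "cohere-ai"];

interface PackageManifest {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

// Return every banned package that appears in either dependency map.
function findBannedDeps(manifest: PackageManifest, banned: string[] = BANNED_PACKAGES): string[] {
  const declared = Object.keys(manifest.dependencies || {}).concat(
    Object.keys(manifest.devDependencies || {}),
  );
  return banned.filter((pkg) => declared.indexOf(pkg) !== -1);
}

// Illustrative manifests (version strings are placeholders).
const cleanManifest: PackageManifest = { dependencies: { "@qvac/sdk": "*", react: "*" } };
const leakyManifest: PackageManifest = { dependencies: { openai: "*" }, devDependencies: { vitest: "*" } };

console.log(findBannedDeps(cleanManifest).length, findBannedDeps(leakyManifest).join(","));
// 0 openai
```

A static scan like this complements, but does not replace, the runtime check of disconnecting the network and confirming zero outbound connections.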
**7-stage pipeline:** Quality → Security → Build → E2E → Performance → Offline → Deploy
```bash
# ── Code Quality ──
npm run lint # ESLint
npm run typecheck # TypeScript check
npm run ci # Full quality gate
# ── E2E & Performance ──
npm run e2e # Playwright E2E (3 suites)
npm run lighthouse # Lighthouse CI audit
# ── Evidence Bundle ──
python3 scripts/verify_offline.py # air-gapped run: disconnect network first
npm run bench # real council latency + contradiction recall
python3 scripts/check_submission_readiness.py
```
| Layer | Tool | Status |
|---|---|---|
| Code Quality | ESLint + TypeScript | ✅ |
| E2E Testing | Playwright (3 suites) | ✅ |
| Security (SAST) | CodeQL | ✅ |
| Security (SCA) | Dependabot + npm audit | ✅ |
| Secret Scanning | TruffleHog | ✅ |
| Performance | Lighthouse CI | ✅ |
| Offline Verification | verify_offline.py (18/18) | ✅ |
## Project Structure
```
quorum/
├── docs/                     # README assets
├── data/fixtures/
│   └── northwind_dossier/    # 5 docs with planted contradictions
├── e2e/                      # Playwright E2E tests
├── scripts/                  # seed, bench, verify, readiness
├── src/
│   ├── core/
│   │   ├── qvac.ts           # @qvac/sdk wrapper
│   │   ├── rag.ts            # Corpus RAG pipeline
│   │   └── council.ts        # 3-agent council orchestration
│   ├── App.tsx               # Debate transcript viewer
│   └── App.css               # Dark mode theme
├── .github/                  # CI/CD + CodeQL + Dependabot
├── playwright.config.ts
├── lighthouserc.json
└── README.md
```
## Honest Limitations
1. Small model: limited reasoning depth vs cloud LLMs
2. Sequential agents: no true parallel debate
3. English only
4. Fixed dossier: no live document upload yet
5. Mock inference in demo mode
## License
[MIT](LICENSE) © 2026 Edy Cu
## Acknowledgments
Built for **QVAC Hackathon I: Unleash Edge AI** (DoraHacks). Thank you to the QVAC team for making multi-agent AI possible on the edge.