https://github.com/rob-otix-ai/lexius
AI regulatory compliance tool with provenance-tracked verbatim law. 20 regulations, 2 jurisdictions (EU + Cayman Islands CIMA). 1,456 articles, 8,123 deterministic extracts. No hallucinated numbers. Works with Claude, GPT-4, Ollama, or any model via OpenRouter.
ai cayman-islands cima compliance dora eu-ai-act gdpr legal-tech mcp offshore openrouter pdf-parsing provenance regtech regulation
- Host: GitHub
- URL: https://github.com/rob-otix-ai/lexius
- Owner: rob-otix-ai
- Created: 2026-04-15T15:50:42.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2026-05-03T14:14:03.000Z (about 2 months ago)
- Last Synced: 2026-05-03T16:14:56.552Z (about 2 months ago)
- Topics: ai, cayman-islands, cima, compliance, dora, eu-ai-act, gdpr, legal-tech, mcp, offshore, openrouter, pdf-parsing, provenance, regtech, regulation
- Language: TypeScript
- Homepage: https://www.npmjs.com/package/@robotixai/lexius-mcp
- Size: 823 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
README
# Lexius
Legislation-agnostic AI compliance platform with provenance-tracked, deterministic regulatory analysis. Verbatim regulation text from EUR-Lex and offshore PDF sources, deterministic fact extraction, parallel hivemind assessment, and honest provenance labelling on every claim.
20 regulations across 2 jurisdictions — 10 EU (via EUR-Lex CELLAR) and 10 Cayman Islands CIMA (via PDF). Adding a new regulation is two commands — the fetcher, extractor, and swarm handle the rest.
> Lexius provides general regulatory guidance and does not constitute legal advice. For implementation support, consult qualified legal counsel.
## Install
```bash
# MCP server — connect Claude Desktop to the compliance database
npx @robotixai/lexius-mcp
# CLI — query from your terminal
npx @robotixai/lexius-cli
# Interactive agent — Claude-powered compliance consultant
npx @robotixai/lexius-agent
# Database — schema-ready Postgres with pgvector
docker pull robotixai/lexius-db
```
### From Source
```bash
git clone https://github.com/rob-otix-ai/lexius.git
cd lexius
cp .env.example .env # set OPENAI_API_KEY + ANTHROPIC_API_KEY
pnpm setup # install → build → DB → migrate → seed → fetch → extract
```
## Architecture
```
┌──────────────────────────────────────────────────────────────┐
│ Consumers │
│ ┌─────┐ ┌─────┐ ┌─────┐ ┌─────────┐ ┌────────┐ ┌───────┐ │
│ │ API │ │ MCP │ │ CLI │ │ Agent │ │ Skills │ │ Swarm │ │
│ └──┬──┘ └──┬──┘ └──┬──┘ └────┬────┘ └───┬────┘ └───┬───┘ │
│ └───────┴───────┴─────────┴───────────┴──────────┘ │
│ │ │
│ ┌───────┴───────┐ │
│ │ Core │ use cases + domain │
│ │ + Plugins │ legislation-agnostic │
│ └───────┬───────┘ │
│ │ │
│ ┌────────────────┼────────────────┐ │
│ │ │ │ │
│ ┌───┴───┐ ┌───────┴───────┐ ┌────┴─────┐ │
│ │Fetcher│ │ DB │ │Extractor │ │
│ │CELLAR │ │Postgres+pgvec │ │regex/det │ │
│ └───────┘ └───────────────┘ └──────────┘ │
└──────────────────────────────────────────────────────────────┘
```
## Published Packages
### npm
| Package | Version | Description |
|---------|---------|-------------|
| [@robotixai/lexius-api](https://www.npmjs.com/package/@robotixai/lexius-api) | [latest](https://www.npmjs.com/package/@robotixai/lexius-api) | REST API + SSE MCP server (15 endpoints) |
| [@robotixai/lexius-mcp](https://www.npmjs.com/package/@robotixai/lexius-mcp) | [latest](https://www.npmjs.com/package/@robotixai/lexius-mcp) | MCP server for Claude Desktop / Claude Code (13 tools, stdio + SSE) |
| [@robotixai/lexius-cli](https://www.npmjs.com/package/@robotixai/lexius-cli) | [latest](https://www.npmjs.com/package/@robotixai/lexius-cli) | Command-line interface (9 commands) |
| [@robotixai/lexius-agent](https://www.npmjs.com/package/@robotixai/lexius-agent) | [latest](https://www.npmjs.com/package/@robotixai/lexius-agent) | Interactive Claude compliance consultant + hivemind swarm |
| [@robotixai/lexius-fetcher](https://www.npmjs.com/package/@robotixai/lexius-fetcher) | [latest](https://www.npmjs.com/package/@robotixai/lexius-fetcher) | EUR-Lex CELLAR fetcher + deterministic extractor |
### Docker
| Image | Tag | Description |
|-------|-----|-------------|
| [robotixai/lexius-db](https://hub.docker.com/r/robotixai/lexius-db) | 0.1.0 | PostgreSQL 16 + pgvector, schema auto-applied (13 tables, 5 migrations) |
| [robotixai/lexius-api](https://hub.docker.com/r/robotixai/lexius-api) | 0.3.0 | Express REST API + SSE MCP + swarm (full platform server) |
| [robotixai/lexius-mcp](https://hub.docker.com/r/robotixai/lexius-mcp) | 0.1.3 | MCP server (node:20-slim, 202MB, self-contained bundle) |
### Workspace Packages (from source)
| Package | Description |
|---------|-------------|
| `@lexius/core` | Domain entities, ports, 14 use cases, legislation plugin system |
| `@lexius/db` | Drizzle schema, 5 migrations (0000-0004), seeds for 10 legislations |
| `@lexius/infra` | Drizzle repositories + OpenAI embedding service |
| `@lexius/logger` | Pino logger factory |
## Quick Start
### Zero to running — all via npm + Docker
```bash
# 1. Start the database (schema auto-applied, no migration needed)
#    Assumes POSTGRES_USER, POSTGRES_PASSWORD and POSTGRES_DB are already exported
docker run -d -p 5432:5432 \
  -e POSTGRES_PASSWORD="$POSTGRES_PASSWORD" \
  -e POSTGRES_DB="$POSTGRES_DB" \
  -e POSTGRES_USER="$POSTGRES_USER" \
  robotixai/lexius-db
export DATABASE_URL="postgresql://$POSTGRES_USER:$POSTGRES_PASSWORD@localhost:5432/$POSTGRES_DB"
# 2. Fetch verbatim regulation text from EUR-Lex + run extractor
npx @robotixai/lexius-fetcher ingest --celex 32024R1689 --legislation eu-ai-act
npx @robotixai/lexius-fetcher ingest --celex 32022R2554 --legislation dora
# 3. Query from the CLI
npx @robotixai/lexius-cli legislations
npx @robotixai/lexius-cli article 99 --legislation eu-ai-act
# 4. Or run the interactive agent
export ANTHROPIC_API_KEY=sk-ant-...
npx @robotixai/lexius-agent
```
No git clone needed. The Docker image provides the schema; the fetcher populates it from EUR-Lex; the CLI/agent/MCP server query it.
### Docker Compose (full stack)
```bash
# Set your passwords
export POSTGRES_PASSWORD=secret
export OPENAI_API_KEY=sk-...
# Start DB + API
docker compose -f docker-compose.production.yml up -d
# The API is now at http://localhost:3000
# MCP proxy mode can point at it:
LEXIUS_API_URL=http://localhost:3000 LEXIUS_API_KEY=lx_... npx @robotixai/lexius-mcp
```
### Claude Desktop Integration
Add to `claude_desktop_config.json`:
```json
{
"mcpServers": {
"lexius": {
"command": "npx",
"args": ["@robotixai/lexius-mcp"],
"env": {
"DATABASE_URL": "postgresql://$POSTGRES_USER:$POSTGRES_PASSWORD@localhost:5432/$POSTGRES_DB"
}
}
}
}
```
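The `$POSTGRES_*` names in `DATABASE_URL` are placeholders: a JSON config file is not processed by a shell, so substitute your literal values.
Or use proxy mode (no local DB needed):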
```json
{
"mcpServers": {
"lexius": {
"command": "npx",
"args": ["@robotixai/lexius-mcp"],
"env": {
"LEXIUS_API_URL": "https://your-lexius-instance.example.com",
"LEXIUS_API_KEY": "lx_your_key_here"
}
}
}
}
```
### Docker MCP
```bash
# Proxy mode
docker run -e LEXIUS_API_URL=https://your-lexius-instance.example.com \
-e LEXIUS_API_KEY=lx_... \
robotixai/lexius-mcp
# Direct mode
docker run -e DATABASE_URL=postgresql://$POSTGRES_USER:$POSTGRES_PASSWORD@host:5432/$POSTGRES_DB \
robotixai/lexius-mcp
```
## Provenance
Every fact Lexius returns is labelled with its trust level:
| Tier | Meaning | Enforced by |
|------|---------|-------------|
| **AUTHORITATIVE** | Verbatim from official source (EUR-Lex CELLAR). SHA-256 hash-verified. | DB CHECK constraint + Specflow PROV-001 |
| **CURATED** | Written or reviewed by a domain expert. `curated_by` + `reviewed_at` recorded. | DB CHECK constraint + PROV-003 |
| **AI_GENERATED** | Model output, not expert-reviewed. `generated_by_model` recorded. | DB CHECK constraint |
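The tier rules in the table can be modelled as a discriminated union. The sketch below is illustrative only: the type and field names (`ProvenanceRecord`, `sourceHash`, `curatedBy`, `reviewedAt`, `generatedByModel`) are assumptions loosely based on the column names mentioned above, not the actual Lexius schema, and the real enforcement lives in DB CHECK constraints rather than application code.

```typescript
// Hypothetical types; field names are assumptions, not the real Lexius schema.
type ProvenanceRecord =
  | { tier: "AUTHORITATIVE"; sourceHash: string }
  | { tier: "CURATED"; curatedBy: string; reviewedAt: Date }
  | { tier: "AI_GENERATED"; generatedByModel: string };

// Returns a list of problems; an empty list means the record satisfies its tier's rule.
function validateProvenance(p: ProvenanceRecord): string[] {
  const problems: string[] = [];
  switch (p.tier) {
    case "AUTHORITATIVE":
      // A SHA-256 digest is 64 lowercase hex characters.
      if (!/^[0-9a-f]{64}$/.test(p.sourceHash)) {
        problems.push("sourceHash must be a SHA-256 hex digest");
      }
      break;
    case "CURATED":
      if (p.curatedBy.trim() === "") problems.push("curatedBy is required");
      if (Number.isNaN(p.reviewedAt.getTime())) problems.push("reviewedAt must be a valid date");
      break;
    case "AI_GENERATED":
      if (p.generatedByModel.trim() === "") problems.push("generatedByModel is required");
      break;
  }
  return problems;
}
```

Modelling the tier as the discriminant means the compiler rejects, for example, an AUTHORITATIVE record that carries no hash, which mirrors the intent of the CHECK constraints.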
## Data Pipeline
```
EUR-Lex CELLAR (XHTML)
↓ fetcher (lexius-fetch ingest --celex 32024R1689 or --source cima)
Articles table — 1,456 AUTHORITATIVE sections across 20 regulations, hash-verified
↓ extractor (lexius-fetch extract --legislation eu-ai-act)
Article Extracts — 8,123+ typed facts (fines EUR/KYD, %, dates, cross-refs, shall-clauses, imprisonment terms)
↓ cross-check (pnpm crosscheck)
CI fails if curated penalty amounts ≠ extracted values from verbatim law
↓ swarm (POST /api/v1/swarm/run)
Compliance workspace — 1,882 findings in 2.6s with 4 parallel agents
↓ synthesis
ComplianceReport with relianceByTier breakdown
```
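The first three pipeline stages (hash, extract, cross-check) can be sketched in a few lines. This is a minimal illustration, not the Lexius extractor: the `Extract` shape, the function names, and the two regular expressions are assumptions, and the real extractor covers many more fact types (dates, cross-references, shall-clauses, KYD amounts).

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape for an extracted fact; not the real article_extracts schema.
interface Extract {
  kind: "fine_eur" | "percentage";
  value: number;
}

// SHA-256 of the verbatim text as lowercase hex, for hash verification on re-fetch.
function sha256Hex(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// Pure, synchronous, regex-only extraction: the same text always yields the same facts.
function extractFacts(text: string): Extract[] {
  const facts: Extract[] = [];
  // Amounts such as "EUR 35 000 000" or "EUR 15,000,000".
  const amount = /EUR\s?(\d{1,3}(?:[ ,.]\d{3})+|\d+)/g;
  // Percentages such as "7 %" or "3%".
  const percent = /(\d+(?:\.\d+)?)\s?%/g;
  let m: RegExpExecArray | null;
  while ((m = amount.exec(text)) !== null) {
    facts.push({ kind: "fine_eur", value: Number(m[1].replace(/\D/g, "")) });
  }
  while ((m = percent.exec(text)) !== null) {
    facts.push({ kind: "percentage", value: Number(m[1]) });
  }
  return facts;
}

// Cross-check: a curated amount passes only if the verbatim text yields the same figure.
function crossCheck(curatedAmountEur: number, extracts: Extract[]): boolean {
  return extracts.some((e) => e.kind === "fine_eur" && e.value === curatedAmountEur);
}
```

Because `extractFacts` has no I/O and no model call, a CI job can rerun it on every commit and fail when `crossCheck` returns `false` for any curated penalty.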
## Legislations
### EU Regulations (via EUR-Lex CELLAR — XHTML)
| Legislation | CELEX | Articles | Extracts |
|-------------|-------|----------|----------|
| GDPR | 32016R0679 | 99 | 637 |
| EU AI Act | 32024R1689 | 126 | 1,181 |
| DORA | 32022R2554 | 64 | 554 |
| Digital Services Act | 32022R2065 | 93 | 699 |
| Digital Markets Act | 32022R1925 | 54 | 475 |
| Data Act | 32023R2854 | 50 | 397 |
| Data Governance Act | 32022R0868 | 38 | 273 |
| Cyber Resilience Act | 32024R2847 | 79 | 700 |
| MiCA | 32023R1114 | 155 | 1,672 |
| eIDAS 2.0 | 32024R1183 | 48 | 335 |
### Cayman Islands CIMA (via PDF)
| Legislation | Sections | Extracts |
|-------------|----------|----------|
| Monetary Authority Act (2020 Rev.) | 63 | ~180 |
| Banks and Trust Companies Act (2025 Rev.) | 29 | ~60 |
| Mutual Funds Act (2025 Rev.) | 52 | ~140 |
| Private Funds Act (2025 Rev.) | 33 | ~80 |
| Securities Investment Business Act (2020 Rev.) | 44 | ~100 |
| Insurance Act (2010) | 41 | ~80 |
| Anti-Money Laundering Regulations (2025 Rev.) | 107 | ~250 |
| Virtual Asset (Service Providers) Act (2024 Rev.) | 41 | ~110 |
| Proceeds of Crime Act (2024 Rev.) | 205 | ~350 |
| Beneficial Ownership Transparency Act (2023) | 35 | ~50 |
### Totals
| | Legislations | Sections/Articles | Extracts |
|---|---|---|---|
| EU | 10 | 806 | 6,923 |
| Cayman Islands | 10 | 650 | ~1,200 |
| **Total** | **20** | **1,456** | **~8,123** |
### Adding a New Regulation
```bash
# EU regulation (via EUR-Lex CELLAR)
lexius-fetch ingest --celex <celex-number> --legislation <legislation-id>
# Offshore regulation (via PDF)
lexius-fetch ingest --source pdf --url <pdf-url> --legislation <legislation-id>
# All CIMA acts (via registry)
lexius-fetch ingest --source cima
# Auto-populate derivedFrom on curated rows from cross-references
lexius-fetch backfill-derivation --legislation <legislation-id> --apply
```
## Hivemind Swarm
Parallel compliance analysis via autonomous agents sharing a Postgres workspace (stigmergic communication):
- **Deterministic** — no LLM in the agent loop; same data = same findings
- **Fast** — 1,882 findings in ~2.6s (4 agents, 126 EU AI Act articles)
- **Gap detection** — discovers obligations in the law that the curated set misses (701 candidates)
- **Provenance** — every finding inherits AUTHORITATIVE / CURATED / AI_GENERATED from its source
```bash
# Via API
curl -X POST localhost:3000/api/v1/swarm/run \
  -H "Authorization: Bearer lx_..." \
  -H "Content-Type: application/json" \
  -d '{"legislationId":"eu-ai-act","concurrency":4}'
# Via MCP
# Tool: legalai_run_swarm_assessment({ legislationId: "eu-ai-act" })
```
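The shared-workspace idea can be illustrated with an in-memory stand-in. Everything here is a sketch under stated assumptions: the `Workspace` class replaces the Postgres tables, the agent logic is a placeholder rule, and all names are hypothetical. In Postgres, one common way to get the same atomic-claim behaviour is `SELECT ... FOR UPDATE SKIP LOCKED`; whether Lexius uses that is not stated here.

```typescript
// Hypothetical finding shape; not the real swarm schema.
interface Finding {
  articleId: number;
  agent: string;
  note: string;
}

// In-memory stand-in for the shared workspace; the real system uses Postgres.
class Workspace {
  private pending: number[];
  readonly findings: Finding[] = [];

  constructor(articleIds: number[]) {
    this.pending = [...articleIds];
  }

  // Atomic claim: synchronous, so no two agents can receive the same article.
  claim(): number | undefined {
    return this.pending.shift();
  }

  publish(finding: Finding): void {
    this.findings.push(finding);
  }
}

// One agent: claim work until none is left. No model call, only a fixed rule.
async function runAgent(name: string, ws: Workspace): Promise<void> {
  for (let id = ws.claim(); id !== undefined; id = ws.claim()) {
    await Promise.resolve(); // yield so the agents interleave
    ws.publish({ articleId: id, agent: name, note: `article ${id} reviewed` });
  }
}

async function runSwarm(articleIds: number[], concurrency: number): Promise<Finding[]> {
  const ws = new Workspace(articleIds);
  const agents = Array.from({ length: concurrency }, (_, i) => runAgent(`agent-${i + 1}`, ws));
  await Promise.all(agents);
  // Sort so the output order does not depend on scheduling.
  return ws.findings.sort((a, b) => a.articleId - b.articleId);
}
```

Agents never message each other directly; they coordinate only through what they claim from and publish to the workspace, which is the stigmergic pattern described above.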
## API Endpoints
```
POST /api/v1/classify Classify an AI system
GET /api/v1/obligations Obligations by role/risk level
POST /api/v1/penalties/calculate Calculate penalty exposure
GET /api/v1/articles/:number Verbatim article text
GET /api/v1/articles/:id/history Article revision history
GET /api/v1/articles/:id/extracts Extracted facts (fines, dates, cross-refs)
GET /api/v1/deadlines Compliance deadlines
GET /api/v1/obligations/:id/derivation Trace obligation to source articles
POST /api/v1/knowledge/search Semantic search
POST /api/v1/faq/search FAQ search
GET /api/v1/legislations List legislations
POST /api/v1/audit Full compliance report
POST /api/v1/swarm/run Start hivemind swarm
GET /api/v1/swarm/:sessionId/findings Query swarm workspace
POST /api/v1/swarm/:sessionId/synthesise Report from swarm findings
GET /health DB stats + uptime
GET /mcp/sse Remote MCP via SSE
GET /integration-manifest.json Claude.ai integration manifest
```
Auth: `Authorization: Bearer lx_...` on all `/api/v1/*` routes. Health + manifest are unauthenticated.
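A small request builder shows how the auth rule might be applied from a TypeScript client. The helper and its `RequestSpec` type are hypothetical; only the `Authorization: Bearer lx_...` header format and the `/api/v1/*` prefix come from the text above.

```typescript
// Hypothetical request description; its fields match what fetch() accepts as options.
interface RequestSpec {
  url: string;
  method: "GET" | "POST";
  headers: Record<string, string>;
  body?: string;
}

function buildRequest(
  baseUrl: string,
  apiKey: string,
  method: "GET" | "POST",
  path: string,
  payload?: unknown,
): RequestSpec {
  const headers: Record<string, string> = {};
  // Only /api/v1/* routes take the bearer key; /health and the manifest do not.
  if (path.startsWith("/api/v1/")) {
    headers["Authorization"] = `Bearer ${apiKey}`;
  }
  const spec: RequestSpec = {
    url: baseUrl.replace(/\/+$/, "") + path, // avoid a double slash when joining
    method,
    headers,
  };
  if (payload !== undefined) {
    headers["Content-Type"] = "application/json";
    spec.body = JSON.stringify(payload);
  }
  return spec;
}
```

On Node.js 18 or later, the result can be passed to the built-in `fetch`, for example `const r = buildRequest(base, key, "GET", "/api/v1/legislations"); await fetch(r.url, r);`.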
## MCP Tools (13)
| Tool | Description |
|------|-------------|
| `legalai_classify_system` | Risk classification (signals + keywords + semantic) |
| `legalai_get_obligations` | Obligations filtered by role and risk level |
| `legalai_calculate_penalty` | Penalty calculation with SME rules |
| `legalai_get_article` | Verbatim article text (AUTHORITATIVE) |
| `legalai_get_deadlines` | Compliance deadlines with days remaining |
| `legalai_search_knowledge` | Semantic search across all content |
| `legalai_answer_question` | FAQ lookup |
| `legalai_run_assessment` | Structured assessments (Art. 6(3), GPAI) |
| `legalai_list_legislations` | Available legislations |
| `legalai_get_article_history` | Article revision history |
| `legalai_get_derivation_chain` | Obligation → source article trace |
| `legalai_get_article_extracts` | Deterministically extracted facts |
| `legalai_run_swarm_assessment` | Parallel hivemind assessment |
## CLI Commands
```bash
npx @robotixai/lexius-cli legislations
npx @robotixai/lexius-cli classify --legislation eu-ai-act --description "recruitment AI" --role provider
npx @robotixai/lexius-cli obligations --legislation eu-ai-act --role provider --risk-level high-risk
npx @robotixai/lexius-cli penalty --legislation eu-ai-act --violation high-risk-non-compliance --turnover 500000000
npx @robotixai/lexius-cli article 99 --legislation eu-ai-act
npx @robotixai/lexius-cli deadlines --legislation eu-ai-act
npx @robotixai/lexius-cli search "risk management" --legislation eu-ai-act --type obligation
npx @robotixai/lexius-cli audit --legislation eu-ai-act --description "recruitment AI" --role provider
```
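To make the `penalty` command concrete, here is a worked sketch of the kind of calculation involved. It assumes that `high-risk-non-compliance` maps to the Article 99(4) tier of the EU AI Act (up to EUR 15,000,000 or 3% of total worldwide annual turnover, whichever is higher) and that the SME rule is the one in Article 99(6) (whichever is lower). The mapping, the type, and the function name are assumptions for illustration; the CLI's actual logic may differ.

```typescript
// Hypothetical tier description; values are the Art. 99(4) caps.
interface PenaltyTier {
  fixedCapEur: number;
  turnoverPct: number;
}

const HIGH_RISK_TIER: PenaltyTier = { fixedCapEur: 15_000_000, turnoverPct: 3 };

// Maximum fine exposure: the higher of the two caps, or the lower one for SMEs.
function maxFineEur(tier: PenaltyTier, annualTurnoverEur: number, isSme: boolean): number {
  const pctCap = (annualTurnoverEur * tier.turnoverPct) / 100;
  return isSme ? Math.min(tier.fixedCapEur, pctCap) : Math.max(tier.fixedCapEur, pctCap);
}
```

With the turnover from the command above (EUR 500,000,000), 3% is EUR 15,000,000, so both caps coincide. At EUR 1,000,000,000 turnover the percentage cap (EUR 30,000,000) becomes the binding one.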
## Contract Enforcement
22 contracts, 64 rules enforced by [Specflow](https://www.npmjs.com/package/@robotixai/specflow-cli):
```bash
npx @robotixai/specflow-cli enforce .
```
| Category | Contracts | Key Rules |
|----------|-----------|-----------|
| **Architecture** | `arch_clean_layers`, `arch_package_boundaries`, `arch_legislation_plugins` | Domain has no infra imports; consumers don't cross-import |
| **Provenance** | `provenance_tiers` | Every entity has `provenance_tier`; fetcher = AUTHORITATIVE only; seeds use helpers |
| **Extractor** | `extractor_determinism` | Pure/sync modules; no LLM; cross-check exits non-zero on mismatch |
| **Integration** | `integration_security` | No key hashes in responses; SSE uses auth |
| **Swarm** | `hivemind_swarm` | No LLM in agent loop; atomic claims; cleanup complete |
| **Offshore** | `offshore_adapters` | No LLM in PDF parsing; source_format=pdf; section merge; dynamic header detection |
| **Model Harness** | `model_harness` | No direct SDK imports in agent code; providers don't import domain |
| **Fetcher** | `fetcher_verbatim` | Records sourceHash + fetchedAt |
| **Curator** | `curator_audit`, `curator_auth`, `curator_integrity` | Audit atomicity, role-gated routes, AUTHORITATIVE immutable, tier transitions, `If-Match` concurrency, `derivedFrom` anchoring (C-INT-007) |
| **Audit** | `audit_report_integrity`, `audit_enhancement_layer`, `audit_agent_layer` | GenerateAuditReport is deterministic; enhancement via port |
| **Security** | `security_secrets`, `security_sql_safety`, `security_input_validation`, `security_no_eval` | No hardcoded creds; parameterised queries; Zod validation |
| **Quality** | `qa_domain_types` | No `any` in domain layer |
## Testing
```bash
pnpm test # All tests
pnpm --filter @lexius/core test # 183 unit tests
pnpm --filter @lexius/api test # 36 functional tests
pnpm --filter @lexius/fetcher test # 78 extractor + parser tests
pnpm crosscheck # Penalty cross-check vs verbatim law
npx @robotixai/specflow-cli enforce . # Specflow contract enforcement
```
## Documentation
Full spec documents in `docs/`:
| Type | Count | Index |
|------|-------|-------|
| **PRD** (Product Requirements) | 12 | [docs/prd/INDEX.md](docs/prd/INDEX.md) |
| **ARD** (Architecture Decisions) | 16 | [docs/ard/INDEX.md](docs/ard/INDEX.md) |
| **DDD** (Domain Design) | 15 | [docs/ddd/INDEX.md](docs/ddd/INDEX.md) |
## Environment Variables
| Variable | Required | Description |
|----------|----------|-------------|
| `DATABASE_URL` | Yes | PostgreSQL connection string |
| `OPENAI_API_KEY` | Embeddings | OpenAI API key for semantic search |
| `ANTHROPIC_API_KEY` | Agent only | Anthropic API key for Claude agent |
| `DB_PASSWORD` | Docker | PostgreSQL password for docker-compose |
| `LEXIUS_API_URL` | MCP proxy | Hosted API URL (proxy mode) |
| `LEXIUS_API_KEY` | MCP proxy | API key for hosted API |
| `LEXIUS_PROFILE` | Curator | Profile name to load from credentials file (default: `default`) |
| `LEXIUS_CREDENTIALS_FILE` | Curator | Override path to the credentials file (default: `~/.config/lexius/credentials`) |
| `LEXIUS_CURATOR_ID` | Curator | Override the curator identity stamped on edits |
| `LEXIUS_ROLE` | MCP | Force the MCP server role (`reader` or `curator`), bypassing credentials file |
## Curator Workflow
Named domain experts can edit CURATED-tier facts (obligations in v1) via CLI without a code deploy. Every edit is audited, concurrency-safe, and re-embedded.
```bash
# 1. Admin generates a curator key
pnpm create-api-key --role curator --owner expert@example.com
# 2. Expert logs in (paste-a-key flow)
npx @robotixai/lexius-curate login --key lx_curator_... --url https://lexius.example.com
npx @robotixai/lexius-curate whoami
# 3. Expert works: dry-run by default, --apply to commit
npx @robotixai/lexius-curate obligations list --stale
npx @robotixai/lexius-curate obligations edit eu-ai-act-art-9-provider \
--row-version 3 \
--changes '{"obligation":"Establish and maintain a risk management system"}' \
--reason "clarifying per Art. 9(2)" \
--apply
# 4. View history + revert
npx @robotixai/lexius-curate obligations history eu-ai-act-art-9-provider
npx @robotixai/lexius-curate revert --reason "too aggressive" --apply
```
**Guarantees (PRD-013 / ARD-017):**
- **Transactional:** row update, audit insert, re-embed all land in one DB transaction.
- **Concurrency-safe:** `If-Match: <row_version>` required; mismatch returns 409.
- **Auditable:** every edit writes an append-only `curator_edits` row with editor, source (`cli`/`api`/`mcp`/etc.), reason, old/new values, and row_version before/after.
- **Anchored:** every CURATED row carries non-empty `derivedFrom` resolving to AUTHORITATIVE articles. No orphan interpretations.
- **Honest:** curators cannot override mined facts. If an `article_extracts` value is wrong, fix the extractor — don't paper over it with a curator edit.
- **Staleness-aware:** when the fetcher re-ingests an article with a changed `source_hash`, every CURATED obligation citing that article is flagged `needs_review`. Curators triage via `lexius-curate obligations list --stale`.
See [PRD-013](docs/prd/PRD-013-curator-workflow.md), [ARD-017](docs/ard/ARD-017-curator-workflow.md), and [DDD-016](docs/ddd/DDD-016-curator-workflow.md) for design detail.
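The concurrency and audit guarantees can be sketched with an in-memory store. This is an illustration under assumptions, not the Lexius implementation: `ObligationStore`, `AuditRow`, and the field names are hypothetical, and a `Map` stands in for the database transaction that would make the row update and audit insert atomic.

```typescript
interface Obligation {
  id: string;
  text: string;
  rowVersion: number;
}

// Hypothetical audit record, loosely modelled on the curator_edits description above.
interface AuditRow {
  obligationId: string;
  editor: string;
  reason: string;
  oldText: string;
  newText: string;
  versionBefore: number;
  versionAfter: number;
}

type EditResult =
  | { status: 200; row: Obligation }
  | { status: 409; currentVersion: number };

class ObligationStore {
  private rows = new Map<string, Obligation>();
  readonly audit: AuditRow[] = []; // append-only: rows are pushed, never changed

  insert(id: string, text: string): void {
    this.rows.set(id, { id, text, rowVersion: 1 });
  }

  // ifMatch is the row version the caller last saw (the If-Match value).
  edit(id: string, ifMatch: number, newText: string, editor: string, reason: string): EditResult {
    const current = this.rows.get(id);
    if (current === undefined) throw new Error(`unknown obligation: ${id}`);
    if (current.rowVersion !== ifMatch) {
      // Someone else edited first; report a conflict instead of overwriting.
      return { status: 409, currentVersion: current.rowVersion };
    }
    const updated: Obligation = { ...current, text: newText, rowVersion: current.rowVersion + 1 };
    this.rows.set(id, updated);
    this.audit.push({
      obligationId: id,
      editor,
      reason,
      oldText: current.text,
      newText,
      versionBefore: current.rowVersion,
      versionAfter: updated.rowVersion,
    });
    return { status: 200, row: updated };
  }
}
```

A second editor holding a stale version gets a 409 and must re-read the row before retrying, so no edit is silently lost and every successful edit leaves exactly one audit row.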
## Tech Stack
- **Runtime:** Node.js 20+ (ESM)
- **Language:** TypeScript (strict)
- **Database:** PostgreSQL 16 + pgvector
- **ORM:** Drizzle
- **Embeddings:** OpenAI text-embedding-3-small (1536 dimensions)
- **API:** Express 5
- **MCP:** @modelcontextprotocol/sdk (stdio + SSE)
- **CLI:** Commander
- **Agent:** @anthropic-ai/sdk (Claude, temperature 0)
- **Bundler:** esbuild
- **Monorepo:** Turborepo + pnpm workspaces
- **PDF Parsing:** pdfjs-dist (offshore legislation)
- **Contracts:** Specflow (see Contract Enforcement for current counts)
- **Testing:** Vitest + Supertest (297 tests)
## License
MIT