{"id":50974274,"url":"https://github.com/president-xd/raptor","last_synced_at":"2026-06-19T06:01:25.865Z","repository":{"id":353924569,"uuid":"1220920591","full_name":"president-xd/raptor","owner":"president-xd","description":"Retrieval-Augmented Persistent Threat Orchestration and Reasoning","archived":false,"fork":false,"pushed_at":"2026-05-28T22:08:32.000Z","size":16657,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-28T23:17:10.618Z","etag":null,"topics":["advanced-persistent-threat","apt","cyber-threat-intelligence","cybersecurity","threat-detection","threat-hunting","threat-intelligence"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/president-xd.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-25T14:12:02.000Z","updated_at":"2026-05-28T22:08:36.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/president-xd/raptor","commit_stats":null,"previous_names":["president-xd/raptor"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/president-xd/raptor","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/president-xd%2Fraptor","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/president-xd%2Fraptor/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposit
ories/president-xd%2Fraptor/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/president-xd%2Fraptor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/president-xd","download_url":"https://codeload.github.com/president-xd/raptor/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/president-xd%2Fraptor/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34519051,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-19T02:00:06.005Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["advanced-persistent-threat","apt","cyber-threat-intelligence","cybersecurity","threat-detection","threat-hunting","threat-intelligence"],"created_at":"2026-06-19T06:01:22.680Z","updated_at":"2026-06-19T06:01:25.858Z","avatar_url":"https://github.com/president-xd.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\n# 🦅 RAPTOR\n\n**Retrieval-Augmented Persistent Threat Orchestration and Reasoning**\n\nA self-hostable APT investigation platform for security analysts. 
Ingest raw\ntelemetry, correlate it against MITRE ATT\u0026CK, score APT attribution, predict\nlikely next adversary steps, and produce an analyst-ready forensic report — in a\nsingle workflow, running entirely on your own infrastructure.\n\n\u003c/div\u003e\n\n---\n\n## Contents\n\n- [What RAPTOR Does](#what-raptor-does)\n- [Honest Status](#honest-status)\n- [Architecture](#architecture)\n- [Investigation Pipeline](#investigation-pipeline)\n- [Security Model](#security-model)\n- [Quick Start (Local)](#quick-start-local)\n- [Production Deployment](#production-deployment)\n- [Configuration Reference](#configuration-reference)\n- [API Reference](#api-reference)\n- [Roles and Access Control](#roles-and-access-control)\n- [Evidence Handling](#evidence-handling)\n- [Audit Trail](#audit-trail)\n- [Observability](#observability)\n- [Operational Runbook](#operational-runbook)\n- [Development](#development)\n- [Testing](#testing)\n- [Quality Gates](#quality-gates)\n- [Documentation](#documentation)\n- [Project Layout](#project-layout)\n- [License](#license)\n\n---\n\n## What RAPTOR Does\n\n1. **Ingest** raw logs (JSON, CEF, XML, or plaintext), or pull events directly\n   from Elasticsearch, and normalise them into a common event model.\n   Persistent raw evidence storage retains the original upload, encrypted, with the case.\n2. **Detect** technique activity using Sigma-style rules and, when enabled, a\n   hybrid retrieval (RAG) step over MITRE ATT\u0026CK and APT reference material.\n3. **Graph** the attack: hosts, users, processes, and techniques are persisted\n   into Neo4j (with an in-memory fallback when Neo4j is unavailable).\n4. **Attribute** activity to known APT groups using Jaccard TTP overlap with\n   explicit confidence gating (HIGH / MEDIUM / LOW / UNKNOWN).\n5. **Simulate** likely next adversary steps from ATT\u0026CK playbook context, with\n   confidence-aware weighting that de-prioritises LOW / UNKNOWN attribution.\n6. 
**Report** findings as a structured forensic narrative in the browser, with\n   Markdown and printable PDF export.\n\nThe analyst console also includes a natural-language graph query interface, an\nAPT profile library, and a MITRE ATT\u0026CK matrix overlay. It surfaces the\nCISA Known Exploited Vulnerabilities connector with an in-app feed view.\n\n---\n\n## Honest Status\n\nThis section states plainly what is and is not in the repository, so the rest of\nthe document can be read at face value.\n\n| Area | State |\n|---|---|\n| Backend API, worker, pipeline, RBAC, evidence encryption, audit chain | Implemented |\n| React analyst console (dashboard, investigations, reports, MITRE, APT, NLQ) | Implemented |\n| Docker Compose (local) and hardened production overlay | Implemented |\n| SQLite (default) and PostgreSQL runtime metadata adapter | Implemented |\n| External LLM calls | Optional, **disabled by default** |\n| Local quality gates (`make validate`, `make security-scan`) | Implemented via `Makefile` |\n| GitHub Actions CI/CD (tests, e2e, audits, Gitleaks, Trivy, Cosign signing) | Implemented (`.github/workflows/ci.yml`, `release.yml`) |\n| License file | Implemented |\n| Database migration tooling | Not included — schema is managed at startup |\n\nBy default RAPTOR runs on SQLite and keeps evidence on the local filesystem,\nwhich is appropriate for single-node and controlled deployments. 
PostgreSQL,\nRedis-backed rate limiting, a separate worker process, and S3-compatible\nevidence storage are available for hardened production use through the\nproduction overlay.\n\n---\n\n## Architecture\n\n```\n+---------------------------------------------------------+\n|  Browser (Analyst Console)                              |\n|  React 18 - Vite - Nginx (rate-limited reverse proxy)   |\n+--------------------+------------------------------------+\n                     | HTTP(S); TLS terminated upstream\n+--------------------v------------------------------------+\n|  RAPTOR API  (FastAPI - Uvicorn - Python 3.11)          |\n|  +----------+ +----------+ +----------+ +----------+    |\n|  |  Auth    | | Evidence | |  Jobs    | | Metrics  |    |\n|  |  RBAC    | | Encrypt  | |  Queue   | | Audit    |    |\n|  +----------+ +----------+ +----------+ +----------+    |\n+------+---------------+---------------+------------------+\n       |               |               |\n+------v------+  +-----v------+  +-----v------------------+\n|  Neo4j 5.15 |  | Weaviate   |  |  RAPTOR Worker          |\n|  Attack     |  | 1.27.6     |  |  (prod: separate proc)  |\n|  Graph      |  | RAG / Vec  |  |  Investigation pipeline |\n+-------------+  +------------+  +-----------+------------+\n                                             |\n+--------------------------------------------v------------+\n|  SQLite (default) / PostgreSQL 16 (prod overlay)        |\n|  Elasticsearch 8.11 (log search)  Redis 7.2 (limits)    |\n+---------------------------------------------------------+\n```\n\nPostgreSQL 16 and the standalone worker container are provisioned by the\nproduction overlay (`docker-compose.prod.yml`). 
The base `docker-compose.yml`\nruns the backend in a combined role on SQLite.\n\n### Backend Modules\n\nThe backend is a modular FastAPI application with a one-directional import chain\n(no circular dependencies):\n\n```\nconfig\n  -\u003e database          (all DB/Redis I/O, SQLite\u003c-\u003ePostgreSQL adapter)\n       -\u003e metrics_store (Prometheus or in-memory counters)\n            -\u003e auth_core (sessions, RBAC, rate limiting, SSO, CSRF)\n                 -\u003e evidence_crypto (AES-256-GCM envelope encryption)\n                      -\u003e pipeline_runner (investigation worker loop)\n                           -\u003e routers/  (FastAPI endpoint modules)\n                                -\u003e main.py (app factory + middleware)\n```\n\n| Module | Responsibility |\n|---|---|\n| `config.py` | Load and validate environment variables; fail loudly on unsafe production config |\n| `database.py` | Every DB read/write; SQLite\u003c-\u003ePostgreSQL adapter; Redis helpers |\n| `metrics_store.py` | Prometheus counters with in-process fallback |\n| `auth_core.py` | Sessions, PBKDF2 passwords, RBAC, rate limiting, SSO proxy trust, CSRF |\n| `evidence_crypto.py` | DEK/KEK envelope encryption; legacy decrypt; key derivation |\n| `llm_redactor.py` | Strip PII/secrets from prompts before any external LLM call |\n| `pipeline_runner.py` | Six-phase investigation pipeline and worker loop |\n| `storage.py` | Pluggable evidence backend: local filesystem or S3-compatible |\n| `worker.py` | Standalone worker entry point (`RAPTOR_PROCESS_ROLE=worker`) |\n| `main.py` | FastAPI app factory, middleware stack, lifespan hooks |\n| `routers/` | `auth`, `investigations`, `analysis`, `intelligence`, `admin`, `health` |\n| `attribution/` | APT profiles, ATT\u0026CK catalog, Jaccard scoring, confidence, STIX validation |\n| `graph/` | Neo4j client, graph builder, provenance, queries |\n| `ingestion/` | Log parser, normaliser, Sigma matcher |\n| `rag/` | Embeddings, indexer, 
retriever, reranker, pipeline |\n| `report/` | Forensic report generator |\n| `nlq/` | Natural-language graph query engine |\n| `simulation/` | Next-step predictor |\n\n---\n\n## Investigation Pipeline\n\nEach investigation runs six sequential phases in the worker\n(`backend/pipeline_runner.py`). Progress is reported through the status endpoint.\n\n| Phase | Description |\n|---|---|\n| 1. Parse | Parse and normalise raw logs into `RaptorEvent` records |\n| 2. RAG analysis | Hybrid retrieval + LLM reasoning; deterministic Sigma fallback when the LLM is disabled or returns nothing |\n| 3. STIX validation | Validate and normalise technique IDs against the bundled ATT\u0026CK catalog |\n| 4. Attack graph | Persist hosts/users/techniques into Neo4j; in-memory fallback when Neo4j is down |\n| 5. Attribution | Jaccard TTP scoring across APT profiles with confidence gating |\n| 6. Report | Generate the analyst-facing forensic narrative |\n\nSimulation is a separate analyst action. Predictions are confidence-aware:\ncandidate next techniques are down-weighted when attribution confidence is\nLOW / UNKNOWN.\n\n---\n\n## Security Model\n\n### Authentication\n\n| Method | Header / Cookie | Use case |\n|---|---|---|\n| API Key | `X-RAPTOR-API-Key` or `Authorization: Bearer` | Service-to-service, scripts |\n| Session Cookie | `raptor_session` (HttpOnly, SameSite=Lax) | Browser sessions |\n| SSO / OIDC proxy | `X-Forwarded-User` (configurable) | Enterprise IdP via trusted proxy |\n\nSessions are server-side and revocable via `POST /api/v1/auth/logout`. Lifetime\nis set by `RAPTOR_SESSION_TTL_SECONDS` (default 8 hours). 
Passwords use\nPBKDF2-SHA256 with a per-user salt; accounts lock after\n`RAPTOR_AUTH_MAX_FAILURES` failed attempts.\n\n### Rate Limiting\n\n| Bucket | Limit | Window |\n|---|---|---|\n| `auth` | 10 req | 60 s |\n| `upload` | 20 req | 300 s |\n| `query` | 60 req | 300 s |\n| `connector` | 30 req | 300 s |\n\nIn production, `RAPTOR_RATE_LIMIT_BACKEND=redis` coordinates limits across API\nworkers. The Nginx frontend adds per-IP rate zones as defense in depth.\n\n### Evidence Encryption\n\nEvidence files are encrypted at rest with **AES-256-GCM envelope encryption**:\n\n1. A random per-file Data Encryption Key (DEK) is generated per upload.\n2. File content is encrypted with the DEK.\n3. The DEK is wrapped by the static Key Encryption Key (KEK) from\n   `EVIDENCE_ENCRYPTION_KEY`.\n4. The wrapped DEK is stored in the file header, so key rotation only re-wraps\n   DEKs rather than re-encrypting content.\n\nA legacy direct-KEK format remains decryptable for backward compatibility. With\n`RAPTOR_STORAGE_BACKEND=s3`, the ciphertext additionally sits behind S3\nserver-side encryption.\n\n### LLM Privacy\n\nExternal LLM calls are **disabled by default** (`RAPTOR_ALLOW_EXTERNAL_LLM=false`).\nWhen enabled, every prompt is scrubbed by `llm_redactor.py` before transmission.\n\n| Pattern | Replacement |\n|---|---|\n| Bearer / Authorization tokens | `Bearer [REDACTED_TOKEN]` |\n| Credential key-value pairs (`password=`, `secret=`, `api_key=`) | `key=[REDACTED]` |\n| US Social Security Numbers | `[SSN_REDACTED]` |\n| Luhn-plausible card numbers | `[CC_REDACTED]` |\n| Email addresses | `[EMAIL_REDACTED]` |\n| US phone numbers | `[PHONE_REDACTED]` |\n| Private IPv4 ranges | `[PRIVATE_IP]` |\n| Sensitive Unix paths (`/etc/shadow`, `.ssh/id_rsa`) | `[SENSITIVE_PATH]` |\n| Windows drive paths (`C:\\...`) | `[WINDOWS_PATH]` |\n| UNC network paths (`\\\\server\\share`) | `[UNC_PATH]` |\n\n### CSRF Protection\n\nBrowser-session mutations (POST/PUT/PATCH/DELETE) require a trusted `Origin` 
or\n`Referer`. API-key requests are stateless and exempt; the login endpoint is\nexplicitly exempt.\n\n### Audit Trail\n\nAppend-only SQLite audit logging records every security-relevant action in the\n`audit_log` table:\n\n- A database trigger blocks `UPDATE` and `DELETE` on the table.\n- Each entry carries a SHA-256 hash of prior content (hash chain), making\n  tampering detectable.\n- Verified with `scripts/ops/verify_audit_chain.py`.\n- Exportable via `scripts/ops/export_audit_log.py`, with an optional cron\n  wrapper (`scripts/ops/export_audit_cron.sh`).\n\n---\n\n## Quick Start (Local)\n\n### Prerequisites\n\n- Docker Desktop \u003e= 4.28 with Compose V2\n- ~8 GB RAM available for containers\n\n```bash\n# 1. Clone\ngit clone \u003cyour-fork-url\u003e raptor\ncd raptor\n\n# 2. Copy and edit configuration\ncp .env.example .env\n# Replace every change_me_* value before exposing the service.\n\n# 3. Start the full stack\ndocker compose up -d --build\n\n# 4. Wait for health checks (~60 s)\ndocker compose ps\n\n# 5. Open the console\n#    Frontend: http://localhost:3100\n#    API:      http://localhost:8000/api/v1/health\n```\n\nBootstrap admin credentials come from `RAPTOR_BOOTSTRAP_ADMIN_USERNAME` /\n`RAPTOR_BOOTSTRAP_ADMIN_PASSWORD` in `.env`. After creating a permanent admin,\nset `RAPTOR_BOOTSTRAP_ADMIN_DISABLED=true` and restart.\n\nThe base compose stack runs on SQLite. There is no separate PostgreSQL or worker\ncontainer in local mode — the backend runs the pipeline in-process.\n\n---\n\n## Production Deployment\n\n### 1. Generate Secrets\n\n```bash\nopenssl rand -hex 32                                   # RAPTOR_API_KEY\nopenssl rand -base64 24                                # bootstrap admin password\npython3 -c \"import secrets,base64; print('base64:'+base64.b64encode(secrets.token_bytes(32)).decode())\"  # EVIDENCE_ENCRYPTION_KEY\nopenssl rand -hex 24                                   # service passwords\n```\n\nStore secrets in a secrets manager. 
Never commit them.\n\n### 2. Required Environment Variables\n\n| Variable | Requirement |\n|---|---|\n| `RAPTOR_ENV` | `production` |\n| `RAPTOR_API_KEY` | Service API key (\u003e= 32 chars) |\n| `RAPTOR_BOOTSTRAP_ADMIN_PASSWORD` | Initial admin password |\n| `RAPTOR_DB_ENGINE` | `postgresql` |\n| `RAPTOR_DATABASE_URL` | `postgresql://user:pass@host:5432/raptor` |\n| `RAPTOR_RATE_LIMIT_BACKEND` | `redis` |\n| `RAPTOR_SESSION_COOKIE_SECURE` | `true` (behind TLS) |\n| `RAPTOR_ALLOW_AUTH_DISABLED` | `false` |\n| `EVIDENCE_ENCRYPTION_KEY` | 32-byte base64 KEK |\n| `NEO4J_PASSWORD`, `WEAVIATE_API_KEY`, `ELASTIC_PASSWORD`, `REDIS_PASSWORD`, `POSTGRES_PASSWORD` | Non-placeholder values |\n| `CORS_ALLOW_ORIGINS`, `CSRF_TRUSTED_ORIGINS` | Your frontend origin |\n\nThe backend refuses to start in `RAPTOR_ENV=production` when lab defaults remain,\nwhen PostgreSQL is selected without a DSN, or when SQLite limits are not\nexplicitly acknowledged. Fix the reported variable rather than disabling the guard.\n\n### 3. Deploy\n\n```bash\ndocker compose \\\n  -f docker-compose.yml \\\n  -f docker-compose.prod.yml \\\n  up -d --build\n\ncurl -sf http://localhost:8000/api/v1/health/detailed | python3 -m json.tool\n```\n\nThe production overlay provisions a `postgres` service and a separate `worker`\ncontainer, resets all internal port bindings, and enforces hardened service\nsettings.\n\n### 4. TLS\n\nRAPTOR does not terminate TLS. Deploy behind an Nginx/Caddy host, a cloud HTTPS\nload balancer, or a Kubernetes ingress with cert-manager. HSTS is set by the\nNginx config and when `RAPTOR_PRODUCTION=true`.\n\n### 5. SSO / OIDC\n\n```bash\nRAPTOR_TRUSTED_SSO_ENABLED=true\nRAPTOR_TRUSTED_PROXY_CIDRS=10.0.0.5/32\nRAPTOR_SSO_USER_HEADER=x-forwarded-user\nRAPTOR_SSO_ROLES_HEADER=x-forwarded-roles   # comma-separated\nRAPTOR_SSO_TENANT_HEADER=x-forwarded-tenant\n```\n\nIdentity headers must be stripped from client requests at the ingress; they are\nhonoured only from trusted CIDRs.\n\n### 6. 
S3 Evidence Storage\n\n```bash\nRAPTOR_STORAGE_BACKEND=s3\nS3_BUCKET=my-raptor-evidence\nS3_PREFIX=evidence/\nS3_REGION=us-east-1\n# S3_ENDPOINT_URL=https://minio.internal:9000   # MinIO / GCS\n```\n\n---\n\n## Configuration Reference\n\n### Core\n\n| Variable | Default | Description |\n|---|---|---|\n| `RAPTOR_ENV` | `development` | `development` or `production` |\n| `RAPTOR_PROCESS_ROLE` | `all` | `api`, `worker`, or `all` (dev only) |\n| `API_HOST` | `0.0.0.0` | Bind address |\n| `API_PORT` | `8000` | Listen port |\n| `MAX_UPLOAD_BYTES` | `10485760` | Max upload size (10 MiB) |\n\n### Auth\n\n| Variable | Default | Description |\n|---|---|---|\n| `RAPTOR_API_KEY` | — | Service API key |\n| `RAPTOR_ALLOW_AUTH_DISABLED` | `false` | Local dev only |\n| `RAPTOR_SESSION_COOKIE_SECURE` | `false` | Set `true` behind TLS |\n| `RAPTOR_SESSION_TTL_SECONDS` | `28800` | Session lifetime |\n| `RAPTOR_REQUIRE_RBAC` | `true` | Enforce role checks |\n| `RAPTOR_AUTH_MAX_FAILURES` | `5` | Lockout threshold |\n| `RAPTOR_AUTH_LOCK_SECONDS` | `900` | Lockout duration |\n| `RAPTOR_BOOTSTRAP_ADMIN_USERNAME` | `admin` | Bootstrap admin name |\n| `RAPTOR_BOOTSTRAP_ADMIN_PASSWORD` | — | Bootstrap admin password |\n| `RAPTOR_BOOTSTRAP_ADMIN_DISABLED` | `false` | Disable bootstrap account |\n| `RAPTOR_RATE_LIMIT_BACKEND` | `memory` | `memory` or `redis` |\n\n### Evidence\n\n| Variable | Default | Description |\n|---|---|---|\n| `EVIDENCE_ENCRYPTION_KEY` | — | 32-byte KEK for envelope encryption |\n| `EVIDENCE_RETENTION_DAYS` | `180` | Retention period |\n| `RAPTOR_STORAGE_BACKEND` | `local` | `local` or `s3` |\n| `S3_BUCKET` / `S3_PREFIX` / `S3_REGION` / `S3_ENDPOINT_URL` | — | S3 storage settings |\n\n### Database\n\n| Variable | Default | Description |\n|---|---|---|\n| `RAPTOR_DB_ENGINE` | `sqlite` | `sqlite` or `postgresql` |\n| `RAPTOR_DATABASE_URL` | — | PostgreSQL DSN |\n| `RAPTOR_ACKNOWLEDGE_SQLITE_LIMITS` | `false` | Required for deliberate single-node SQLite prod |\n\n### 
LLM\n\n| Variable | Default | Description |\n|---|---|---|\n| `RAPTOR_ALLOW_EXTERNAL_LLM` | `false` | Enable external LLM calls |\n| `LLM_PROVIDER` | `nvidia` | `nvidia` or `openrouter` |\n| `LLM_MODEL` | `z-ai/glm-5.1` | Model identifier |\n| `LLM_TIMEOUT_SECONDS` | `30` | Per-request timeout |\n\nSee `.env.example` for the complete, annotated list.\n\n---\n\n## API Reference\n\nAll endpoints are under `/api/v1` and require authentication unless noted.\n\n### Health\n\n| Method | Path | Auth | Description |\n|---|---|---|---|\n| `GET` | `/health` | Optional | Liveness check |\n| `GET` | `/health/detailed` | Optional | Full subsystem status |\n\n### Auth\n\n| Method | Path | Auth | Description |\n|---|---|---|---|\n| `POST` | `/auth/session` | None | Create session from credentials |\n| `GET` | `/auth/me` | Required | Current principal |\n| `POST` | `/auth/logout` | Required | Revoke session |\n\n### Investigations\n\n| Method | Path | Role | Description |\n|---|---|---|---|\n| `POST` | `/investigate` | analyst | Upload log file |\n| `POST` | `/investigate/text` | analyst | Paste logs or an Elasticsearch query |\n| `GET` | `/investigations` | viewer | List (tenant-scoped) |\n| `GET` | `/investigate/{id}/status` | viewer | Poll progress |\n| `GET` | `/investigate/{id}/report` | viewer | Full report |\n| `GET` | `/investigate/{id}/graph` | viewer | Attack graph |\n| `GET` | `/investigate/{id}/evidence` | viewer | Evidence metadata |\n\n### Analysis\n\n| Method | Path | Role | Description |\n|---|---|---|---|\n| `POST` | `/simulate` | analyst | Predict next adversary steps |\n| `POST` | `/query` | viewer | Natural-language graph query |\n| `GET` | `/mitre/matrix` | viewer | ATT\u0026CK matrix with overlay |\n| `GET` | `/apt/profiles` | viewer | APT group library |\n| `GET` | `/apt/profiles/{name}` | viewer | Single APT profile |\n\n### Intelligence\n\n| Method | Path | Role | Description |\n|---|---|---|---|\n| `GET` | `/threat-feeds/cisa-kev` | viewer | CISA KEV 
catalog |\n| `POST` | `/ingest/elasticsearch` | analyst | Pull and investigate events |\n| `GET` | `/ingest/elasticsearch/status` | viewer | Poller state |\n| `PUT` | `/ingest/elasticsearch/config` | admin | Configure poller |\n\n### Admin\n\n| Method | Path | Role | Description |\n|---|---|---|---|\n| `GET` | `/audit` | admin | Audit log entries |\n| `GET` | `/metrics` | admin | Prometheus metrics |\n| `GET` | `/admin/schema/status` | admin | Schema status |\n| `POST` `GET` `PATCH` `DELETE` | `/users` | admin | User management |\n\n---\n\n## Roles and Access Control\n\n| Role | Permissions |\n|---|---|\n| `viewer` | Read investigations, reports, graphs, audit log, threat feeds |\n| `analyst` | viewer + create investigations, run simulations, query graph |\n| `admin` | analyst + user management, system configuration |\n| `service` | Full access (API-key principal) |\n\nRBAC is enforced at every endpoint. Investigations are scoped by `tenant_id`;\nanalysts see only their own tenant's cases. 
User management is tenant-scoped for\nordinary `admin` accounts — only the `service` principal administers users across\nall tenants.\n\n---\n\n## Evidence Handling\n\nEvidence is written under `data/evidence/{investigation_id}/` and encrypted with\nAES-256-GCM envelope encryption.\n\n```bash\n# Expired-evidence cleanup\npython3 scripts/ops/cleanup_expired_evidence.py --db data/raptor.db            # dry run\npython3 scripts/ops/cleanup_expired_evidence.py --db data/raptor.db --execute  # after approval\n\n# Key rotation (re-wraps DEKs; no content re-encryption)\npython3 scripts/ops/rotate_evidence_key.py --db data/raptor.db --execute\n```\n\n---\n\n## Audit Trail\n\n```bash\npython3 scripts/ops/verify_audit_chain.py --db data/raptor.db\npython3 scripts/ops/export_audit_log.py --db data/raptor.db --out exports/audit-log.jsonl\n\n# Scheduled export with optional S3 upload\nAUDIT_S3_BUCKET=my-audit-bucket scripts/ops/export_audit_cron.sh\n```\n\nThe cron script writes a timestamped JSONL export, runs chain verification,\noptionally uploads to S3, and prunes local copies older than 90 days.\n\n---\n\n## Observability\n\n```bash\ncurl http://localhost:8000/api/v1/health/detailed\ncurl -H \"X-RAPTOR-API-Key: $KEY\" http://localhost:8000/api/v1/metrics\n```\n\nKey metrics: `raptor_requests_total`, `raptor_auth_failures_total`,\n`raptor_investigations_created_total`, `raptor_investigations_completed_total`,\n`raptor_investigations_failed_total`, `raptor_parser_errors_total`,\n`raptor_request_latency_seconds_avg`.\n\nReference alert rules live in `observability/prometheus-rules.yml` and a starter\nGrafana dashboard in `observability/grafana-dashboard.json`.\n\n| Alert | Condition |\n|---|---|\n| `RaptorHighErrorRate` | 5xx rate \u003e 5% for 10 min |\n| `RaptorAuthFailureSpike` | \u003e 25 auth failures in 10 min |\n| `RaptorInvestigationFailure` | Any failure in 15 min |\n| `RaptorParserErrorSpike` | \u003e 50 parser errors in 15 min |\n| `RaptorHighLatency` | Average 
latency \u003e 2 s |\n\n---\n\n## Operational Runbook\n\n```bash\n# Backup / restore\nscripts/ops/backup.sh backups/$(date -u +%Y%m%dT%H%M%SZ)\nscripts/ops/restore.sh backups/\u003cbackup-id\u003e\n\n# Bootstrap lockdown after creating a real admin\n#   set RAPTOR_BOOTSTRAP_ADMIN_DISABLED=true, then:\ndocker compose restart backend\n```\n\nFull procedures — deployment posture, identity, worker operations, backup order,\naudit integrity, and incident response — are in\n[`docs/production-runbook.md`](docs/production-runbook.md).\n\n---\n\n## Development\n\n### Backend\n\n```bash\ncd backend\npython -m venv .venv \u0026\u0026 source .venv/bin/activate\npip install -r requirements.lock pytest\nRAPTOR_ENV=development uvicorn main:app --reload --port 8000\n```\n\n### Frontend\n\n```bash\ncd frontend\nnpm ci\nnpm run dev          # HMR dev server\nnpm run build        # Production build\nnpm run e2e          # Playwright tests\n```\n\n### Full Stack (code-reloading)\n\n```bash\ndocker compose up -d neo4j weaviate elasticsearch redis\ncd backend \u0026\u0026 uvicorn main:app --reload --port 8000 \u0026\ncd frontend \u0026\u0026 npm run dev\n```\n\nAny change to frontend CSS/JSX in a containerised setup requires a rebuild:\n\n```bash\ndocker compose up -d --build frontend\n```\n\n---\n\n## Testing\n\nBackend tests live in `tests/` at the repository root. 
`conftest.py` adds both\n`tests/` and `backend/` to `sys.path`, so no install step is required.\n\n```bash\npytest -q tests/\n# or\npython -m unittest discover -s tests\n```\n\n| Test file | Coverage |\n|---|---|\n| `tests/test_ingestion_pipeline.py` | Log parsing, Sigma matching, normalisation |\n| `tests/test_parser_graph_nlq.py` | Parser, graph builder, NLQ query engine |\n| `tests/test_graph_query.py` | Graph query construction and guards |\n| `tests/test_analysis_attribution.py` | Attribution scoring and confidence |\n| `tests/test_attack_catalog_matrix.py` | ATT\u0026CK catalog / matrix |\n| `tests/test_rag_fallbacks.py` | Deterministic fallbacks when the LLM is off |\n| `tests/test_persistence_connectors.py` | Evidence store, audit tamper-proofing, ES poll state, API-key auth |\n| `tests/test_api_persistence_connectors.py` | Auth sessions, CSRF guard, ES pull endpoint |\n| `tests/test_postgres_adapter.py` | SQLite -\u003e PostgreSQL SQL translation |\n| `tests/test_postgres_runtime_integration.py` | Full PostgreSQL integration (needs `RAPTOR_DB_ENGINE=postgresql`) |\n| `tests/test_repo_contracts.py` | Repository contract checks |\n\nFrontend end-to-end tests are in `frontend/e2e/` (`dashboard.spec.js`,\n`security.spec.js`) and run with Playwright via `npm run e2e`.\n\n---\n\n## Quality Gates\n\n### Local\n\nRun the gates locally through the `Makefile` before pushing:\n\n```bash\nmake validate         # pytest + frontend build + prod compose config validation\nmake security-scan    # pip-audit (backend) + npm audit --audit-level=high (frontend)\nmake compose-config   # validate the production compose overlay renders\n```\n\n### Continuous Integration (`.github/workflows/ci.yml`)\n\nRuns on every pull request and on pushes to `main`:\n\n| Job | What it verifies |\n|---|---|\n| `backend-tests` | Offline regression suite (`pytest tests/`) on Python 3.11 |\n| `postgres-integration` | `test_postgres_runtime_integration.py` against a live `postgres:16` service |\n| 
`frontend-build` | Production Vite build |\n| `frontend-e2e` | Playwright functional + security e2e (API-mocked, Vite dev server) |\n| `compose-validate` | Production compose overlay renders with all required variables |\n| `dependency-audit` | `pip-audit` (backend) and `npm audit --audit-level=high` (frontend) |\n| `secret-scan` | Gitleaks secret scan over full history |\n| `filesystem-scan` | Trivy filesystem scan (fails on fixable CRITICALs) |\n| `container-scan` | Builds backend and frontend images and scans them with Trivy |\n\n### Release and Signing (`.github/workflows/release.yml`)\n\nRuns on pushes to `main` and on `v*.*.*` tags:\n\n- Builds the backend and frontend images and pushes them to the GitHub\n  Container Registry (`ghcr.io`).\n- Scans the published image digests with Trivy.\n- Signs each pushed digest with **Cosign keyless** (Sigstore / OIDC) — no\n  long-lived signing keys. Verify a published image with:\n\n```bash\ncosign verify \\\n  --certificate-identity-regexp \"https://github.com/\u003cowner\u003e/\u003crepo\u003e/.github/workflows/release.yml@.*\" \\\n  --certificate-oidc-issuer https://token.actions.githubusercontent.com \\\n  ghcr.io/\u003cowner\u003e/\u003crepo\u003e/backend:\u003ctag\u003e\n```\n\n---\n\n## Documentation\n\n| Document | Purpose |\n|---|---|\n| [`docs/production-runbook.md`](docs/production-runbook.md) | Deployment posture, identity, worker ops, backup/restore, incident response |\n| [`docs/threat-model.md`](docs/threat-model.md) | Assets, trust boundaries, threats and controls, residual risk |\n| [`docs/data-governance.md`](docs/data-governance.md) | Data classes, LLM policy, retention, evidence encryption, audit |\n| [`docs/observability.md`](docs/observability.md) | Metrics, logs, operational checks |\n| [`docs/scaling-limits.md`](docs/scaling-limits.md) | Operating envelope, bottlenecks, expansion path |\n| [`CHANGELOG.md`](CHANGELOG.md) | Release history |\n\n---\n\n## Project Layout\n\n```\nbackend/        
FastAPI app, pipeline, attribution, graph, rag, ingestion, report\nfrontend/       React 18 + Vite analyst console (Nginx-served container)\ndata/           Mock telemetry, bundled STIX/ATT\u0026CK, runtime evidence (gitignored)\ndocs/           Operational and governance documentation\nobservability/  Prometheus alert rules and Grafana dashboard\nscripts/        Docker installers, hybrid installers, ops tooling\ntests/          Backend test suite (run from repo root)\ndocker-compose.yml          Local stack (SQLite)\ndocker-compose.prod.yml     Hardened production overlay (PostgreSQL + worker)\nMakefile                    Local quality gates and ops shortcuts\n```\n\n---\n\n## License\n\nThis repository is licensed under the Apache License.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpresident-xd%2Fraptor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpresident-xd%2Fraptor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpresident-xd%2Fraptor/lists"}