{"id":50762502,"url":"https://github.com/devlinduldulao/citizenship-application","last_synced_at":"2026-06-11T11:02:13.422Z","repository":{"id":338890290,"uuid":"1159166272","full_name":"devlinduldulao/citizenship-application","owner":"devlinduldulao","description":null,"archived":false,"fork":false,"pushed_at":"2026-04-13T11:13:47.000Z","size":4778,"stargazers_count":0,"open_issues_count":6,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-01T19:07:24.065Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/devlinduldulao.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":".github/FUNDING.yml","license":null,"code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":"SUPPORT.md","governance":"GOVERNANCE.md","roadmap":"ROADMAP.md","authors":"AUTHORS.md","dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null},"funding":{"github":["tiangolo"]}},"created_at":"2026-02-16T11:57:23.000Z","updated_at":"2026-04-13T11:13:53.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/devlinduldulao/citizenship-application","commit_stats":null,"previous_names":["devlinduldulao/citizenship-application"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/devlinduldulao/citizenship-application","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devlinduldulao%2Fcitizenship-application","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devlinduldulao%2Fcitizenship-application/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devlinduldulao%2Fcitizenship-application/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devlinduldulao%2Fcitizenship-application/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/devlinduldulao","download_url":"https://codeload.github.com/devlinduldulao/citizenship-application/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devlinduldulao%2Fcitizenship-application/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34195117,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-11T02:00:06.485Z","response_time":57,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-06-11T11:02:13.321Z","updated_at":"2026-06-11T11:02:13.409Z","avatar_url":"https://github.com/devlinduldulao.png","language":"TypeScript","funding_links":["https://github.com/sponsors/tiangolo"],"categories":[],"sub_categories":[],"readme":"# Norwegian Citizenship Automation MVP\n\n## What this project is about\n\nThis project is a **monolithic MVP** that accelerates the manual review queue in Norway's citizenship application pipeline.\n\n### The problem\n\nUDI and Politi already have automated systems that handle straightforward citizenship applications — cases with complete documentation and no flags pass through without manual intervention. The bottleneck is the **manual triage queue**: applications that are flagged, incomplete, or require human judgment. This queue grows faster than reviewers can process it (limited staffing, rising application volumes), resulting in wait times that can exceed **2 years**.\n\n### What this system does\n\nThis MVP sits **on top of** UDI/Politi's existing automated pipeline. It targets specifically the cases that land in the manual review pile:\n\n- **Structured intake and document upload** — standardizes how flagged cases enter the review queue.\n- **OCR/NLP-assisted extraction** — pre-parses uploaded documents so reviewers don't start from scratch.\n- **Explainable rule-based pre-screening** — scores each case with transparent, weighted rules. Reviewers see exactly why a case scored the way it did.\n- **Priority and SLA management** — ranks the backlog by urgency and risk so the most critical cases are reviewed first.\n- **Decision support, not decision replacement** — the system recommends; a human reviewer always makes the final call.\n- **Immutable audit trail** — every action (system and human) is logged for supervision, accountability, and legal compliance.\n\n### Who it's for\n\nThe intended audience is **UDI/Politi operations teams** — specifically the reviewers, team leads, and managers who handle the manual backlog. The goal is not to replace their existing automation, but to give them tools to clear the growing pile of flagged cases faster and more consistently.\n\n## Maintainer\n\n- **Devlin Duldulao** — Senior Software Engineer, Crayon Consulting AS (Oslo, Norway)\n\n## Documentation Map\n\n- [Root README](./README.md): architecture overview, setup, and runtime workflow\n- [Backend README](./backend/README.md): API domains, backend workflow, AI/OCR/NLP services\n- [Frontend README](./frontend/README.md): UI workflow, generated client usage, API call guard\n- [Development Guide](./development.md): environment and local development details\n- [Deployment Guide](./deployment.md): deployment and operations guidance\n- [Roadmap](./ROADMAP.md): phased delivery plan for product + AI evolution\n- [Immigrant User Guide](./IMMIGRANT_USER_GUIDE.md): step-by-step applicant journey\n- [Reviewer Admin Guide](./REVIEWER_ADMIN_GUIDE.md): step-by-step UDI/Politi/admin workflow\n- [Contributing Guide](./CONTRIBUTING.md): contributor workflow and review expectations\n- [Code of Conduct](./CODE_OF_CONDUCT.md): community standards and behavior expectations\n- [Support](./SUPPORT.md): where to ask questions and report non-security issues\n- [Security Policy](./SECURITY.md): vulnerability reporting and disclosure flow\n- [Governance](./GOVERNANCE.md): maintainer roles and decision process\n- [Changelog](./CHANGELOG.md): release tracking entry point\n- [Authors](./AUTHORS.md): maintainers and contributors\n\n## MVP scope (implemented)\n\n### Phase 1 — Intake and processing pipeline\n\n- Citizenship application creation and management\n- Requirement document upload (PDF/image)\n- Background processing pipeline with real OCR and NLP extraction\n\n### Phase 2 — Explainable eligibility engine\n\n- Deterministic weighted rule engine\n- Confidence score and risk level per application\n- Rule-by-rule rationale and evidence breakdown API\n- Frontend explainability panel for reviewers\n\n### Phase 3 — Human-in-the-loop decisioning\n\n- Superuser caseworker decision actions (`approve`, `reject`, `request_more_info`)\n- Mandatory decision reason capture\n- Immutable audit trail endpoint and UI timeline\n- End-to-end tested runtime flow (intake → processing → review decision)\n\n### Phase 4 — Reviewer queue and SLA management\n\n- Priority scoring to rank manual-review workload\n- SLA due-date assignment for pending manual decisions\n- Admin reviewer queue endpoint with overdue indicators\n- Queue metrics endpoint for pending/overdue visibility\n\n## Queue \u0026 SLA operations\n\nThese APIs support reviewer workload balancing and operational monitoring.\n\n- `GET /api/v1/applications/queue/review` (superuser): returns prioritized manual-review queue items with fields including `priority_score`, `sla_due_at`, and `is_overdue`.\n- `GET /api/v1/applications/queue/metrics` (superuser): returns aggregate workload metrics.\n- `GET /api/v1/applications/{application_id}/case-explainer`: returns AI-assisted case memo (`summary`, `recommended_action`, `key_risks`, `missing_evidence`, `next_steps`) with optional LLM generation and rules-based fallback.\n- `GET /api/v1/applications/{application_id}/evidence-recommendations`: returns AI-guided missing-document recommendations and next actions.\n\nMetric interpretation:\n\n- `pending_manual_count`: number of applications currently waiting for reviewer action.\n- `overdue_count`: number of pending-manual applications that passed `sla_due_at`.\n- `high_priority_count`: number of pending-manual applications above the configured high-priority threshold.\n\n## Reviewer Ops Playbook\n\nRecommended daily triage sequence for review teams:\n\n1. Open queue metrics and check `overdue_count` first.\n2. Process overdue applications in descending `priority_score`.\n3. Process remaining high-priority applications.\n4. Use decision breakdown and document evidence before final action.\n5. Submit review decision with a clear mandatory reason.\n6. Audit trail automatically records action history for supervision and handoff.\n\n## AI / ML Architecture\n\nThe system uses a three-stage intelligent pipeline for document analysis:\n\n### Stage 1 — Document Intelligence (OCR)\n\n| Technology | Purpose | When used |\n|---|---|---|\n| **PyMuPDF (fitz)** | Text-layer extraction from digital PDFs | Primary — handles most uploads |\n| **Pillow + pytesseract** | Image preprocessing and optical character recognition | Fallback — for scanned documents and image uploads |\n\nExtraction returns structured metadata: method used, confidence score, character count, page count, and any warnings (e.g. `ocr_unavailable` if Tesseract is not installed).\n\n### Stage 2 — Entity Extraction (NLP)\n\nHybrid NLP pipeline combining a trained Norwegian spaCy model and domain-tuned regex patterns. Extracts:\n\n| Entity type | Examples |\n|---|---|\n| **Dates** | `15.03.1990`, `2023-01-15`, `15 mars 1990`, `20 oktober 2024` |\n| **Passport numbers** | `NO1234567`, fødselsnummer patterns (`ddmmyyXXXXX`) |\n| **Nationalities** | 50+ nationalities in English and Norwegian |\n| **Names** | Surname/given name fields, title-case sequences, spaCy `PER` entities |\n| **Locations/addresses** | spaCy `GPE/LOC` entities + Norwegian postal/street patterns |\n| **Citizenship keywords** | `statsborgerskap`, `permanent residence`, `oppholdstillatelse` |\n| **Language indicators** | `norskprøve`, `B1`, `B2`, `bestått`, `kompetanse norge` |\n| **Residency indicators** | `bodd i Norge`, `folkeregisteret`, `years in Norway` |\n| **Addresses** | Norwegian postal code patterns, street addresses |\n\nEach document receives an NLP score (0–1) based on entity richness across categories.\n\n### Stage 3 — Explainable Rule Engine\n\n7 weighted rules combine document-type signals with NLP-extracted evidence:\n\n| Rule | Weight | NLP enhancement |\n|---|---|---|\n| Identity document present | 0.20 | Passport number in text boosts score even without matching doc type |\n| Residency evidence present | 0.18 | NLP residency keywords can partially satisfy the rule |\n| Document OCR/NLP quality | 0.17 | Includes avg NLP entity richness score |\n| Language/integration evidence | 0.15 | Language proficiency indicators found in text |\n| Security screening evidence | 0.15 | Police clearance document detection |\n| NLP entity richness | 0.10 | Total entities extracted across all documents |\n| Residency duration signal | 0.05 | Case notes + NLP residency signals combined |\n\nEvery rule includes a human-readable rationale and full evidence payload so reviewers can verify the system's reasoning.\n\n## Where AI is applied now\n\n- **Document understanding:** OCR + NLP extraction for uploaded evidence.\n- **Case narrative generation:** `case-explainer` endpoint produces a reviewer-ready case memo with fallback behavior if LLM is unavailable.\n- **Evidence gap recommendations:** `evidence-recommendations` endpoint suggests high-impact missing document types and next actions.\n- **Human-in-the-loop controls:** AI outputs are advisory; final decisions remain caseworker-owned and auditable.\n\n## AI expansion opportunities\n\n- Reviewer copilot Q\u0026A grounded in rules, extracted evidence, and audit trail citations.\n- Backlog risk forecasting (`likely_more_info_required`, SLA breach prediction) for queue prioritization.\n- Cross-document anomaly detection for identity/residency inconsistencies and missing evidence patterns.\n- Multilingual summarization and translation assistance for reviewer notes and applicant-facing communication.\n\n## Testing with sample documents\n\nThe system performs **real OCR text extraction and NLP entity recognition** on uploaded files. Upload actual PDFs with text content to get meaningful extraction results.\n\nUpload files with these `document_type` values to trigger different eligibility rules:\n\n| `document_type` value | Rule it satisfies |\n|---|---|\n| `passport` or `id_card` | Identity document present |\n| `residence_permit`, `residence_proof`, or `tax_statement` | Residency evidence present |\n| `language_certificate`, `norwegian_test`, or `education_certificate` | Language/integration evidence |\n| `police_clearance` | Security screening evidence |\n\nFor the **highest confidence score**, upload one document per category (e.g. a passport PDF, a residence_permit PDF, a language_certificate image, and a police_clearance PDF). Add case notes mentioning \"long-term residence\" or \"years\" to trigger the bonus residency-duration rule.\n\nFor a **low confidence / high risk** case, upload only a single passport — the missing categories will pull the score down and flag the case for priority manual review.\n\n### Live smoke test\n\nAn end-to-end smoke test creates realistic passport and language certificate PDFs (using PyMuPDF), uploads them, triggers processing, and verifies extraction results:\n\n```bash\ncd backend\nuv run python scripts/smoke_ocr_nlp.py\n```\n\nExpected output includes extracted passport numbers, nationalities, language indicators, dates, and NLP-enhanced rule scoring.\n\n## Why this approach\n\n- **Targets the real bottleneck:** works on the manual review pile, not the already-automated happy path\n- **Fast to build and demo:** monolith architecture for MVP speed\n- **Safer than black-box AI:** explainable scoring and explicit rules\n- **Operationally credible:** supports human oversight and auditability\n- **Real AI pipeline:** PyMuPDF document intelligence, regex NLP entity extraction, and NLP-enhanced scoring\n- **Extensible:** can incrementally add stronger OCR/ML models (spaCy, transformer NER) and policy rules\n\n## Next planned phases\n\n- Exportable decision/audit report for case handoff\n- Stronger policy rule coverage aligned to legal requirements\n- Production hardening (security, observability, governance)\n\n## Technology Stack\n\n- Backend: FastAPI, SQLModel, Pydantic, PostgreSQL\n- AI/ML: PyMuPDF (document intelligence), Pillow (image processing), pytesseract + Tesseract OCR (scanned documents), spaCy `nb_core_news_sm` (Norwegian NER), regex NLP (domain-specific entity extraction)\n- Frontend: React, TypeScript, TanStack Router/Query, Tailwind CSS, shadcn/ui\n- Infrastructure: Docker Compose, Traefik, JWT authentication\n- Quality: Pytest backend tests (including OCR/NLP unit tests) and Vitest frontend unit tests\n\n## Quick Start (Docker)\n\nFrom the project root:\n\n```bash\ndocker compose up -d --wait\n```\n\nThen open:\n\n- Frontend: `http://localhost`\n- API docs: `http://localhost/api/v1/docs`\n\n## Local Development\n\n\u003e **Full cross-platform setup guide (macOS, Ubuntu, Windows), `.env` reference, AI/ML configuration, Docker Compose details, and testing instructions are in [development.md](./development.md).**\n\n### Quick setup (all platforms)\n\n```bash\ndocker compose up -d --wait        # start everything in Docker\n```\n\nFrontend → http://localhost:5173 · API → http://localhost:8000 · Swagger UI → http://localhost:8000/docs\n\n### Native development (faster iteration, DB-only in Docker)\n\n```bash\n# 1. Start only the database\ndocker compose up -d db --wait\n\n# 2. Backend\ncd backend\nuv sync                             # install deps (first time)\nuv run alembic upgrade head         # apply migrations\nuv run python -m app.initial_data   # seed superuser (first time)\nuv run fastapi dev app/main.py      # → http://localhost:8000\n\n# 3. Frontend (separate terminal)\ncd frontend\nbun install                         # install deps (first time)\nbun run dev                         # → http://localhost:5173\n```\n\n### AI/ML dependencies (Tesseract + spaCy)\n\n| Platform | Tesseract install | Auto-detected? |\n|---|---|---|\n| macOS | `brew install tesseract tesseract-lang` | ✅ yes |\n| Ubuntu | `sudo apt install tesseract-ocr tesseract-ocr-nor` | ✅ yes |\n| Windows | `winget install UB-Mannheim.TesseractOCR` | ✅ yes |\n\n```bash\n# spaCy Norwegian model (all platforms)\ncd backend\nuv pip install https://github.com/explosion/spacy-models/releases/download/nb_core_news_sm-3.8.0/nb_core_news_sm-3.8.0-py3-none-any.whl\n```\n\nBoth are optional — the system degrades gracefully without them. See [development.md](./development.md#aiml-setup-tesseract--spacy) for details.\n\n### Run tests\n\n```bash\n# Backend (unit tests — no DB needed)\ncd backend \u0026\u0026 uv run pytest tests/unit tests/services -v\n\n# Backend (full suite — requires DB)\ndocker compose up -d db --wait\ncd backend \u0026\u0026 uv run pytest\n\n# Frontend\ncd frontend \u0026\u0026 bun run test\n```\n\n### Development URLs\n\n| Service | URL |\n|---|---|\n| Frontend | http://localhost:5173 |\n| Backend API | http://localhost:8000 |\n| Swagger UI | http://localhost:8000/docs |\n| ReDoc | http://localhost:8000/redoc |\n| Adminer (DB) | http://localhost:8080 |\n| Mailcatcher | http://localhost:1080 |\n\n### Default credentials\n\n| User | Email | Password |\n|---|---|---|\n| Superuser/Admin | admin@example.com | changethis |\n\nFor demo UX, the login page prefills these credentials automatically.\n\n\u003e **Warning:** Rotate all `changethis` defaults in `.env` before any shared or production deployment. See [development.md](./development.md#environment-file-env-reference) for the full `.env` reference.\n\n### Login troubleshooting (spinner / CORS / devtools noise)\n\nIf login keeps spinning, run these checks in order:\n\n```bash\n# 1) Ensure DB is healthy and reachable\ndocker compose up -d db --wait\n\n# 2) Ensure admin user exists\ncd backend\nuv run python -m app.initial_data\n\n# 3) Start backend from project root with explicit project path\ncd ..\nuv run --project backend fastapi dev backend/app/main.py --port 8000\n\n# 4) Start frontend (configured to use strict port 5173)\ncd frontend\nbun run dev\n```\n\nThen hard-refresh the browser (`Ctrl+Shift+R`) and retry login.\n\nIf Chrome console shows `chrome-extension://...` errors (message port / frame errors),\nthose are from browser extensions (often password managers), not from this app. Test\nin Incognito mode (with extensions disabled) to verify app behavior.\n\n### More details\n\n- Full cross-platform setup: [development.md](./development.md)\n- Backend setup and workflow: [backend/README.md](./backend/README.md)\n- Frontend setup and workflow: [frontend/README.md](./frontend/README.md)\n\n## Security Configuration\n\nBefore any shared or production deployment, rotate all default `changethis` credentials and secrets in `.env`.\n\nAt minimum, update:\n\n- `SECRET_KEY`\n- `FIRST_SUPERUSER_PASSWORD`\n- `POSTGRES_PASSWORD`\n\nDeployment guidance: [deployment.md](./deployment.md)\n\n## Further Reading\n\n- [Release Notes](./release-notes.md)\n- [Contributing](./CONTRIBUTING.md)\n- [Security Policy](./SECURITY.md)\n- [Roadmap](./ROADMAP.md)\n- [Code of Conduct](./CODE_OF_CONDUCT.md)\n- [Support](./SUPPORT.md)\n- [Governance](./GOVERNANCE.md)\n- [Changelog](./CHANGELOG.md)\n\n## License\n\nThis project is licensed under the terms of the MIT license.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdevlinduldulao%2Fcitizenship-application","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdevlinduldulao%2Fcitizenship-application","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdevlinduldulao%2Fcitizenship-application/lists"}