https://github.com/paultendo/namespace-guard

Check slug/handle uniqueness across multiple database tables with reserved name protection.
Host: GitHub
URL: https://github.com/paultendo/namespace-guard
Owner: paultendo
License: mit
Created: 2026-02-19T23:55:32.000Z (5 months ago)
Default Branch: main
Last Pushed: 2026-02-26T23:12:59.000Z (4 months ago)
Last Synced: 2026-02-27T04:46:39.632Z (4 months ago)
Topics: anti-spoofing, confusable, drizzle, homoglyph, knex, kysely, multi-tenant, namespace, nodejs, prisma, profanity-filter, security, sequelize, slug, tr39, typeorm, typescript, unicode, username, validation
Language: TypeScript
Homepage: https://paultendo.github.io/namespace-guard/
Size: 931 KB
Stars: 5
Watchers: 0
Forks: 0
Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- Funding: .github/FUNDING.yml
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md

# namespace-guard

[![npm version](https://img.shields.io/npm/v/namespace-guard.svg)](https://www.npmjs.com/package/namespace-guard)

[![bundle size](https://img.shields.io/bundlephobia/minzip/namespace-guard)](https://bundlephobia.com/package/namespace-guard)

[![TypeScript](https://img.shields.io/badge/TypeScript-5.0+-blue.svg)](https://www.typescriptlang.org/)

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

**The world's first library that detects confusable characters across non-Latin scripts.** Slug claimability, Unicode anti-spoofing, and LLM [Denial of Spend](https://paultendo.github.io/posts/confusable-vision-llm-attack-tests/) defence in one zero-dependency package.

- Live demo: https://paultendo.github.io/namespace-guard/

- Blog post: https://paultendo.github.io/posts/namespace-guard-launch/

## Cross-script confusable detection

Existing confusable standards (TR39, IDNA) map non-Latin characters to Latin equivalents. They have zero coverage for confusable pairs *between* two non-Latin scripts.

namespace-guard ships 3,525 cross-script pairs from [confusable-vision](https://github.com/paultendo/confusable-vision) (measured across 245 system fonts using vector-outline raycasting — [RaySpace](https://paultendo.github.io/posts/rayspace-methodology/)). This catches attacks that no other library detects:

```typescript
import { areConfusable, detectCrossScriptRisk } from "namespace-guard";
import { CONFUSABLE_WEIGHTS } from "namespace-guard/confusable-weights";

// Hangul ᅵ and Han 丨 are visually identical (ray distance 0.004, Arial Unicode MS)
areConfusable("\u1175", "\u4E28", { weights: CONFUSABLE_WEIGHTS }); // true

// Greek Τ and Han 丅 are near-identical (multiple fonts)
areConfusable("\u03A4", "\u4E05", { weights: CONFUSABLE_WEIGHTS }); // true

// Cyrillic І and Greek Ι are identical outlines (62 fonts)
areConfusable("\u0406", "\u0399", { weights: CONFUSABLE_WEIGHTS }); // true

// Without weights, only skeleton-based detection (TR39 coverage)
areConfusable("\u1175", "\u4E28"); // false

// Analyze an identifier for cross-script risk
const risk = detectCrossScriptRisk("\u1175\u4E28", { weights: CONFUSABLE_WEIGHTS });
// { riskLevel: "high", scripts: ["han", "hangul"], crossScriptPairs: [...] }
```

4,174 total confusable pairs scored by visual measurement (3,111 TR39-confirmed, 1,063 novel). Each pair carries a `danger` score (0–1) representing geometric similarity across fonts; the shipped dataset uses a 0.5 floor. For higher precision, filter at `danger > 0.7` (574 pairs). Cross-script data licensed CC-BY-4.0.
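The recall/precision trade-off between the 0.5 floor and a 0.7 cut-off can be sketched without the library. The `ScoredPair` shape, the placeholder pair names, and the scores below are assumptions made up for illustration; the real structure of `CONFUSABLE_WEIGHTS` may differ.

```typescript
// Hypothetical pair shape, NOT the library's actual data structure.
interface ScoredPair {
  a: string;      // first character of the pair (placeholder here)
  b: string;      // second character of the pair (placeholder here)
  danger: number; // 0..1 similarity score
}

// Made-up scores, chosen only to show the effect of the threshold.
const samplePairs: ScoredPair[] = [
  { a: "A1", b: "B1", danger: 0.93 },
  { a: "A2", b: "B2", danger: 0.71 },
  { a: "A3", b: "B3", danger: 0.58 },
];

// Keep only pairs whose score is strictly above the threshold.
function filterByDanger(pairs: readonly ScoredPair[], minDanger: number): ScoredPair[] {
  return pairs.filter((pair) => pair.danger > minDanger);
}

console.log(filterByDanger(samplePairs, 0.5).length); // 3
console.log(filterByDanger(samplePairs, 0.7).length); // 2
```

A lower threshold keeps more pairs (fewer missed spoofs, more false alarms); a higher one keeps only the closest lookalikes.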

## Installation

```bash
npm install namespace-guard
```

## Quick Start (60 seconds)

```typescript
import { createNamespaceGuardWithProfile } from "namespace-guard";
import { createPrismaAdapter } from "namespace-guard/adapters/prisma";
import { PrismaClient } from "@prisma/client";

const prisma = new PrismaClient();

const guard = createNamespaceGuardWithProfile(
  "consumer-handle",
  {
    reserved: ["admin", "api", "settings", "dashboard", "login", "signup"],
    sources: [
      { name: "user", column: "handleCanonical", scopeKey: "id" },
      { name: "organization", column: "slugCanonical", scopeKey: "id" },
    ],
  },
  createPrismaAdapter(prisma)
);

await guard.assertClaimable("acme-corp");
```

For race-safe writes, use `claim()`:

```typescript
// Assumes `guard` and `prisma` from the Quick Start above,
// and `input` from your own request handler.
const result = await guard.claim(input.handle, async (canonical) => {
  return prisma.user.create({
    data: {
      handle: input.handle,
      handleCanonical: canonical,
    },
  });
});

if (!result.claimed) {
  return { error: result.message };
}
```
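To see why a check followed by a separate write is not race-safe, it can help to look at the general pattern of letting a unique constraint decide the winner. The sketch below is an in-memory illustration of that pattern only; `HandleTable`, `claimSketch`, and the NFKC-plus-lowercase canonical form are hypothetical and do not describe how `claim()` is implemented.

```typescript
// In-memory stand-in for a table with a UNIQUE constraint on the canonical column.
class HandleTable {
  private readonly rows = new Map<string, string>(); // canonical -> display handle

  // Returns false instead of inserting when the canonical value already exists,
  // similar in spirit to an insert that reports a unique-constraint conflict.
  insertIfAbsent(display: string, canonical: string): boolean {
    if (this.rows.has(canonical)) return false;
    this.rows.set(canonical, display);
    return true;
  }
}

type ClaimResult = { claimed: true } | { claimed: false; message: string };

// Hypothetical helper: attempt the write directly and report the outcome,
// rather than checking availability first and writing later.
function claimSketch(table: HandleTable, display: string): ClaimResult {
  const canonical = display.normalize("NFKC").toLowerCase(); // simplified canonical form
  return table.insertIfAbsent(display, canonical)
    ? { claimed: true }
    : { claimed: false, message: `"${display}" is already taken` };
}

const table = new HandleTable();
const first = claimSketch(table, "Acme");
const second = claimSketch(table, "ACME");
console.log(first.claimed, second.claimed); // true false
```

Because the decision happens at the write itself, two requests for the same canonical name cannot both succeed.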

## What You Get

- **Cross-script confusable detection** with 3,525 measured pairs between non-Latin scripts

- Cross-table collision checks (users, orgs, teams, etc.)

- Reserved-name blocking with category-aware messages

- Unicode anti-spoofing (NFKC + confusable detection + mixed-script/risk controls)

- Invisible character detection (zero-width joiners, direction overrides, and other hidden code points)

- Optional profanity/evasion validation

- Suggestion strategies for taken names

- CLI for red-team generation, calibration, drift, and CI gates
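Two of the building blocks in the list above, NFKC normalisation and invisible-character detection, can be illustrated with standard JavaScript alone. This is a minimal sketch: the `INVISIBLE` pattern and `hasInvisible` helper are hypothetical and cover only a handful of well-known code points, not the library's own list.

```typescript
// NFKC folds compatibility forms such as fullwidth letters into their plain forms.
const fullwidth = "\uFF50\uFF41\uFF59\uFF50\uFF41\uFF4C"; // fullwidth "paypal"
console.log(fullwidth.normalize("NFKC") === "paypal"); // true

// NFKC does NOT fold cross-script lookalikes: Cyrillic U+0430 stays Cyrillic,
// which is why a separate confusable check is needed on top of normalisation.
const withCyrillicA = "p\u0430ypal";
console.log(withCyrillicA.normalize("NFKC") === "paypal"); // false

// Illustrative (incomplete) set of invisible code points:
// zero-width characters and marks, bidi embeddings/overrides/isolates,
// word joiner and invisible operators, and the byte order mark.
const INVISIBLE = /[\u200B-\u200F\u202A-\u202E\u2060-\u2064\u2066-\u2069\uFEFF]/;

function hasInvisible(value: string): boolean {
  return INVISIBLE.test(value);
}

console.log(hasInvisible("ad\u200Dmin")); // true (zero-width joiner inside "admin")
console.log(hasInvisible("admin"));       // false
```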

## LLM Pipeline Preprocessing

Confusable characters look identical to Latin letters on screen, but each one takes two or more bytes in UTF-8, and byte-level BPE tokenisers split them into more tokens. A 95-line contract that costs 881 tokens in clean ASCII costs 4,567 tokens when flooded with confusables: **5.2x the API bill**. The model reads it correctly. The invoice does not care.

We tested this across 4 frontier models, 8 attack types, and 130+ API calls. Zero meaning flips. Every substituted clause was correctly interpreted. But the billing attack succeeds. We call it **Denial of Spend**: the confusable analogue of DDoS, where the attacker cannot degrade the service but can inflate the cost of running it.

`canonicalise()` recovered every substituted term across all 12 attack variants, collapsing the 5.2x inflation to 1.0x. Processing a 10,000-character document takes under 1ms.

```typescript
import { canonicalise, scan, isClean } from "namespace-guard";

const raw = "The seller аssumes аll liаbility.";

const report = scan(raw);        // detailed findings + risk level
const clean = canonicalise(raw); // "The seller assumes all liability."
const ok = isClean(raw);         // false (mixed-script confusable detected)

// For known-Latin documents (e.g. English contracts), use strategy: "all"
// to also catch words where every character was substituted:
canonicalise("поп-refundable", { strategy: "all" }); // "non-refundable"
```
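The byte-level inflation behind Denial of Spend can be seen without any library. The sketch below counts UTF-8 bytes by hand; `utf8Length` is a hypothetical helper, and the example measures bytes only. Actual token counts depend on the tokeniser and are not computed here.

```typescript
// Count the UTF-8 bytes needed to encode a string, code point by code point.
function utf8Length(text: string): number {
  let bytes = 0;
  for (const ch of Array.from(text)) {
    const codePoint = ch.codePointAt(0)!;
    if (codePoint < 0x80) bytes += 1;         // ASCII
    else if (codePoint < 0x800) bytes += 2;   // includes Cyrillic and Greek
    else if (codePoint < 0x10000) bytes += 3; // rest of the Basic Multilingual Plane
    else bytes += 4;                          // supplementary planes
  }
  return bytes;
}

const cleanWord = "liability";
const spoofedWord = "li\u0430bility"; // Cyrillic U+0430 replaces the Latin "a"

console.log(cleanWord.length, spoofedWord.length);           // 9 9 (same visible length)
console.log(utf8Length(cleanWord), utf8Length(spoofedWord)); // 9 10

// A fully substituted word doubles in size: Cyrillic "\u043F\u043E\u043F" vs Latin "non".
console.log(utf8Length("\u043F\u043E\u043F") / utf8Length("non")); // 2
```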

Research:

- Denial of Spend: https://paultendo.github.io/posts/confusable-vision-llm-attack-tests/

- Launch: https://paultendo.github.io/posts/namespace-guard-launch/

- NFKC/TR39 composability: https://paultendo.github.io/posts/unicode-confusables-nfkc-conflict/

## Advanced Security Primitives

Low-level helpers for custom scoring, pairwise checks, and cross-script risk analysis:

```typescript
import { skeleton, areConfusable, confusableDistance } from "namespace-guard";

skeleton("pa\u0443pal"); // "paypal" skeleton form
areConfusable("paypal", "pa\u0443pal"); // true
confusableDistance("paypal", "pa\u0443pal"); // graded similarity + chainDepth + explainable steps
```
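For readers new to the skeleton idea, here is a toy version of the TR39-style approach: normalise to NFD, replace each character with its prototype, normalise again, then compare. The six-entry `PROTOTYPES` map and the `toySkeleton` and `toyAreConfusable` helpers are hypothetical and hand-picked for illustration; the real confusables data is far larger, and this is not the library's implementation.

```typescript
// Tiny hand-picked prototype map (Cyrillic lookalikes of Latin letters).
const PROTOTYPES = new Map<string, string>([
  ["\u0430", "a"], // Cyrillic a
  ["\u0435", "e"], // Cyrillic ie
  ["\u043E", "o"], // Cyrillic o
  ["\u0440", "p"], // Cyrillic er
  ["\u0441", "c"], // Cyrillic es
  ["\u0443", "y"], // Cyrillic u
]);

// NFD, map every character to its prototype if one exists, then NFD again.
function toySkeleton(input: string): string {
  const mapped = Array.from(input.normalize("NFD"))
    .map((ch) => PROTOTYPES.get(ch) ?? ch)
    .join("");
  return mapped.normalize("NFD");
}

// Two strings are treated as confusable when their skeletons match.
function toyAreConfusable(a: string, b: string): boolean {
  return toySkeleton(a) === toySkeleton(b);
}

console.log(toySkeleton("pa\u0443pal"));                // "paypal"
console.log(toyAreConfusable("paypal", "pa\u0443pal")); // true
console.log(toyAreConfusable("paypal", "paypa1"));      // false (digit 1 is not in the toy map)
```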

For measured visual scoring, pass the optional weights from confusable-vision (4,174 pairs scored across 245 fonts using vector-outline raycasting, including 3,525 cross-script pairs). Each pair has a `danger` score (0–1); the default 0.5 floor favours recall, while `danger > 0.7` favours precision. The `context` filter restricts to identifier-valid, domain-valid, or all pairs.

```typescript
import { confusableDistance } from "namespace-guard";
import { CONFUSABLE_WEIGHTS } from "namespace-guard/confusable-weights";

const result = confusableDistance("paypal", "pa\u0443pal", {
  weights: CONFUSABLE_WEIGHTS,
  context: "identifier",
});
// result.similarity, result.steps (including "visual-weight" reason for novel pairs)
```

### Realistic Domain Spoof Detection

For domain name validation, `isDomainSpoof()` only flags threats that could produce registrable domain names. ICANN registrars enforce single-script labels, so mixed-script spoofs (e.g., one Cyrillic letter in a Latin domain) are excluded — they can't actually be registered.

```typescript
import { isDomainSpoof } from "namespace-guard";
import { CONFUSABLE_WEIGHTS } from "namespace-guard/confusable-weights";

// Full-Cyrillic lookalike: registrable and deceptive
isDomainSpoof("\u0440\u0430\u0443\u0440\u0430\u04CF", "paypal", { weights: CONFUSABLE_WEIGHTS });
// { spoof: true, script: "cyrillic", danger: 0.91, substitutions: [...] }

// Mixed-script: not registrable, not flagged
isDomainSpoof("\u0440aypal", "paypal", { weights: CONFUSABLE_WEIGHTS });
// { spoof: false }

// Known-legitimate non-Latin domain: skip via allowlist
isDomainSpoof("\u0430\u0441\u0435", "ace", {
  weights: CONFUSABLE_WEIGHTS,
  allowlist: ["\u0430\u0441\u0435"],
});
// { spoof: false }
```

The `danger` score (0–1) is always returned when a script match is found, even if below the `minDanger` threshold (default 0.5). Set `minDanger: 0.7` for higher precision.
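The single-script versus mixed-script distinction described above can be approximated with Unicode property escapes in a regular expression. This is a rough sketch under stated assumptions: `scriptsInLabel` and `isMixedScript` are hypothetical helpers, only three scripts are checked, and characters in the Common script (digits, hyphens) are ignored. It is not how `isDomainSpoof()` decides registrability.

```typescript
// Scripts checked by this sketch; a real check would cover many more.
const SCRIPT_TESTS: ReadonlyArray<[string, RegExp]> = [
  ["latin", /\p{Script=Latin}/u],
  ["cyrillic", /\p{Script=Cyrillic}/u],
  ["greek", /\p{Script=Greek}/u],
];

// Collect the names of all checked scripts that appear in a label.
function scriptsInLabel(label: string): string[] {
  const found = new Set<string>();
  for (const ch of Array.from(label)) {
    for (const [name, pattern] of SCRIPT_TESTS) {
      if (pattern.test(ch)) found.add(name);
    }
  }
  return Array.from(found).sort();
}

// A label is mixed-script when more than one of the checked scripts appears.
function isMixedScript(label: string): boolean {
  return scriptsInLabel(label).length > 1;
}

console.log(scriptsInLabel("\u0440aypal"));                            // ["cyrillic", "latin"]
console.log(isMixedScript("\u0440aypal"));                             // true
console.log(isMixedScript("\u0440\u0430\u0443\u0440\u0430\u04CF"));    // false (all Cyrillic)
```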

## Research

Two research tracks feed the library:

**Visual measurement.** 4,174 confusable pairs measured across 245 system fonts using vector-outline raycasting ([RaySpace](https://paultendo.github.io/posts/rayspace-methodology/)). 3,525 of these are cross-script pairs between non-Latin scripts (Hangul/Han, Cyrillic/Greek, Cyrillic/Arabic, and more) with zero coverage in any existing standard. Each pair carries a `danger` score (0–1) representing geometric similarity; the shipped floor is 0.5 (for higher precision, try 0.7). Full dataset published as [confusable-vision](https://github.com/paultendo/confusable-vision) (CC-BY-4.0).

**Normalisation composability.** 31 characters where Unicode's confusables.txt and NFKC normalisation disagree. Two production maps (`CONFUSABLE_MAP` for NFKC-first, `CONFUSABLE_MAP_FULL` for raw-input pipelines), a benchmark corpus, and composability vectors wired into CLI drift baselines. Submitted to [Unicode public review (PRI #540)](https://www.unicode.org/review/pri540/) and published in [accumulated feedback](https://www.unicode.org/review/pri540/feedback.html).

- Technical reference: [docs/reference.md#how-the-anti-spoofing-pipeline-works](docs/reference.md#how-the-anti-spoofing-pipeline-works)

- Launch write-up: https://paultendo.github.io/posts/namespace-guard-launch/

- Denial of Spend: https://paultendo.github.io/posts/confusable-vision-llm-attack-tests/

## Built-in Profiles

Use `createNamespaceGuardWithProfile(profile, overrides, adapter)`:

- `consumer-handle`: strict defaults for public handles

- `org-slug`: workspace/org slugs

- `developer-id`: technical IDs with looser numeric rules

Profiles are defaults, not lock-in. Override only what you need.

## Zero-Dependency Moderation Integration

Core stays zero-dependency. You can use built-ins or plug in any external library.

```typescript
import {
  createNamespaceGuard,
  createPredicateValidator,
} from "namespace-guard";
import { createEnglishProfanityValidator } from "namespace-guard/profanity-en";

// `thirdPartyFilter` stands in for any external blocklist you already use.
// `adapter` is any supported database adapter, e.g. createPrismaAdapter(prisma).
const guard = createNamespaceGuard(
  {
    sources: [
      { name: "user", column: "handleCanonical", scopeKey: "id" },
      { name: "organization", column: "slugCanonical", scopeKey: "id" },
    ],
    validators: [
      createEnglishProfanityValidator({ mode: "evasion" }),
      createPredicateValidator((identifier) => thirdPartyFilter.has(identifier)),
    ],
  },
  adapter
);
```

## CLI Workflow

```bash
# 1) Generate realistic attack variants
npx namespace-guard attack-gen paypal --json

# 2) Calibrate thresholds and CI gate suggestions from your dataset
npx namespace-guard recommend ./risk-dataset.json

# 3) Preflight canonical collisions before adding DB unique constraints
npx namespace-guard audit-canonical ./users-export.json --json

# 4) Compare TR39-full vs NFKC-filtered behaviour
npx namespace-guard drift --json
```

## Adapter Support

- Prisma

- Drizzle

- Kysely

- Knex

- TypeORM

- MikroORM

- Sequelize

- Mongoose

- Raw SQL

Adapter setup examples and migration guidance: [docs/reference.md#adapters](docs/reference.md#adapters)

## Production Recommendation: Canonical Uniqueness

For full protection against Unicode/canonicalisation edge cases, enforce uniqueness on canonical columns (for example `handleCanonical`, `slugCanonical`) and point `sources[*].column` there.

Migration guides per adapter: [docs/reference.md#canonical-uniqueness-migration-per-adapter](docs/reference.md#canonical-uniqueness-migration-per-adapter)
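Before adding a unique constraint on a canonical column, existing rows that collapse to the same canonical value need to be found and resolved. The sketch below shows the idea with a simplified canonical form (NFKC plus lowercase); `Row`, `toyCanonical`, and `findCanonicalCollisions` are hypothetical names, and the library's own canonical form may do more than this.

```typescript
interface Row {
  id: number;
  handle: string;
}

// Simplified canonical form for illustration only.
function toyCanonical(handle: string): string {
  return handle.normalize("NFKC").toLowerCase();
}

// Group rows by canonical value and keep only groups with more than one row.
function findCanonicalCollisions(rows: readonly Row[]): Map<string, Row[]> {
  const groups = new Map<string, Row[]>();
  for (const row of rows) {
    const key = toyCanonical(row.handle);
    const group = groups.get(key) ?? [];
    group.push(row);
    groups.set(key, group);
  }
  const collisions = new Map<string, Row[]>();
  groups.forEach((group, key) => {
    if (group.length > 1) collisions.set(key, group);
  });
  return collisions;
}

const existingRows: Row[] = [
  { id: 1, handle: "Acme" },
  { id: 2, handle: "\uFF21cme" }, // fullwidth "A" followed by "cme"
  { id: 3, handle: "zeta" },
];

const collisions = findCanonicalCollisions(existingRows);
console.log(collisions.size);                           // 1
console.log(collisions.get("acme")?.map((r) => r.id));  // [1, 2]
```

Rows 1 and 2 are distinct as stored strings but identical once canonicalised, so a unique index on the raw column would not have caught them.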

## Documentation Map

- Full reference: [docs/reference.md](docs/reference.md)

- Config reference: [docs/reference.md#configuration](docs/reference.md#configuration)

- Validators (profanity, homoglyph, invisible): [docs/reference.md#async-validators](docs/reference.md#async-validators)

- Canonical preflight audit (`audit-canonical`): [docs/reference.md#audit-canonical-command](docs/reference.md#audit-canonical-command)

- Anti-spoofing pipeline and composability vectors: [docs/reference.md#how-the-anti-spoofing-pipeline-works](docs/reference.md#how-the-anti-spoofing-pipeline-works)

- LLM preprocessing (`canonicalise`, `scan`, `isClean`): [docs/reference.md#llm-pipeline-preprocessing](docs/reference.md#llm-pipeline-preprocessing)

- Benchmark corpus (`confusable-bench.v1`): [docs/reference.md#confusable-benchmark-corpus-artifact](docs/reference.md#confusable-benchmark-corpus-artifact)

- Advanced primitives (`skeleton`, `areConfusable`, `confusableDistance`): [docs/reference.md#advanced-security-primitives](docs/reference.md#advanced-security-primitives)

- Confusable weights (scored pairs, including cross-script): [docs/reference.md#confusable-weights-subpath](docs/reference.md#confusable-weights-subpath)

- Cross-script detection: [docs/reference.md#cross-script-detection](docs/reference.md#cross-script-detection)

- CLI reference: [docs/reference.md#cli](docs/reference.md#cli)

- API reference: [docs/reference.md#api-reference](docs/reference.md#api-reference)

- Framework integration (Next.js/Express/tRPC): [docs/reference.md#framework-integration](docs/reference.md#framework-integration)

## Support

If `namespace-guard` helped you, please star the repo. It helps the project a lot.

- GitHub Sponsors: https://github.com/sponsors/paultendo

- Buy me a coffee: https://buymeacoffee.com/paultendo

## Contributing

Contributions welcome. Please open an issue first to discuss larger changes.

## License

MIT © [Paul Wood FRSA (@paultendo)](https://github.com/paultendo)