{"id":50199754,"url":"https://github.com/chad-loder/yarlpattern","last_synced_at":"2026-05-25T21:06:43.861Z","repository":{"id":357454825,"uuid":"1237008276","full_name":"chad-loder/yarlpattern","owner":"chad-loder","description":"WHATWG URLPattern for Python. 100% specification strict, pure Python, optimized and yarl-compatible.","archived":false,"fork":false,"pushed_at":"2026-05-12T21:07:28.000Z","size":280,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-12T22:22:55.799Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/chad-loder.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":"NOTICE","maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null},"funding":{"github":"chad-loder"}},"created_at":"2026-05-12T19:36:53.000Z","updated_at":"2026-05-12T21:07:34.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/chad-loder/yarlpattern","commit_stats":null,"previous_names":["chad-loder/yarlpattern"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/chad-loder/yarlpattern","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chad-loder%2Fyarlpattern","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chad-loder%2Fyarlpattern/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chad-loder%2Fyarlpattern/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chad-loder%2Fyarlpattern/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/chad-loder","download_url":"https://codeload.github.com/chad-loder/yarlpattern/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chad-loder%2Fyarlpattern/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33493030,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-25T14:31:05.219Z","status":"ssl_error","status_checked_at":"2026-05-25T14:31:02.878Z","response_time":57,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-05-25T21:06:43.138Z","updated_at":"2026-05-25T21:06:43.854Z","avatar_url":"https://github.com/chad-loder.png","language":"Python","funding_links":["https://github.com/sponsors/chad-loder"],"categories":[],"sub_categories":[],"readme":"# yarlpattern\n\n[![WPT conformance](https://img.shields.io/badge/WPT%20data%20corpus-100%25%20(366%2F366)-2ea043?labelColor=24292f)](https://github.com/web-platform-tests/wpt/tree/master/urlpattern)\n[![WPT auxiliary suites](https://img.shields.io/badge/auxiliary%20suites-103%2F103-2ea043?labelColor=24292f)](https://github.com/web-platform-tests/wpt/tree/master/urlpattern)\n[![Stable spec API](https://img.shields.io/badge/stable%20API-implemented-2ea043?labelColor=24292f)](https://urlpattern.spec.whatwg.org/)\n[![Tentative spec API](https://img.shields.io/badge/tentative%20API-implemented-2ea043?labelColor=24292f)](https://urlpattern.spec.whatwg.org/)\n[![Python](https://img.shields.io/badge/python-3.12%2B-3776ab?labelColor=24292f\u0026logo=python\u0026logoColor=white)](https://www.python.org/)\n[![PyPI](https://img.shields.io/pypi/v/yarlpattern.svg?labelColor=24292f\u0026color=3775a9)](https://pypi.org/project/yarlpattern/)\n[![License](https://img.shields.io/badge/license-Apache--2.0-6e7681?labelColor=24292f)](LICENSE)\n\n**WHATWG URLPattern for Python — 100% conformance** to the upstream\n[WPT corpus](https://github.com/web-platform-tests/wpt/tree/master/urlpattern):\n**469 / 469** cases passing across all five test suites, the same files Chromium,\nSafari, and Firefox validate against.\n\nPure Python on top of [`yarl`](https://github.com/aio-libs/yarl) — immutable\npattern objects, component properties named after their URL counterparts, zero\nnon-Python dependencies. The pattern *is* the API: compile once, then ask\n`.test(url)` or `.exec(url)` from anywhere a `yarl.URL` lives.\n\n```python\nfrom yarlpattern import URLPattern\n\n# Multi-tenant API: the subdomain identifies the tenant, the path\n# captures the API version and the resource tail — all extracted in\n# one match call.\npat = URLPattern({\n    \"hostname\": \":tenant.myapp.com\",\n    \"pathname\": \"/api/v:version/*\",\n})\n\nresult = pat.exec(\"https://acme.myapp.com/api/v2/users/42\")\nresult.hostname[\"groups\"][\"tenant\"]    # 'acme'\nresult.pathname[\"groups\"][\"version\"]   # '2'\nresult.pathname[\"groups\"][\"0\"]         # 'users/42'\n\npat.test(\"https://foo.example.com/api/v2/users\")  # False — wrong host\npat.test(\"https://acme.myapp.com/api/users\")      # False — no version\n```\n\nThat's the differentiator. Flask-style `:id` routers match the path component\nin isolation; URLPattern matches *across* protocol, hostname, port, path, and\nsearch at once, returning structured named groups per component.\n\n## Conformance\n\n**469 / 469** upstream Web Platform Tests pass (100%) across every WHATWG URLPattern test suite\n— the same files Chromium, Safari, Firefox, Ada, and rust-urlpattern validate against. The\nWPT corpus is SHA-pinned by [`scripts/fetch_references.sh`](scripts/fetch_references.sh)\nto commit [`dd54691`](https://github.com/web-platform-tests/wpt/commit/dd54691426c23a08c6f4a0972b2c40965307e5ce)\n(2026-05-11) so the pass count is reproducible at any future date.\n\n| Suite | Source | Status |\n|---|---|:---|\n| `urlpattern.any.js` | WPT \u0026nbsp;·\u0026nbsp; `urlpatterntestdata.json` | ✅ \u0026nbsp; 366 / 366 |\n| `urlpattern-constructor.any.js` | WPT *(inline)* | ✅ \u0026nbsp; 4 / 4 |\n| `urlpattern-hasregexpgroups.any.js` | WPT \u0026nbsp;·\u0026nbsp; `urlpattern-hasregexpgroups-tests.js` | ✅ \u0026nbsp; 55 / 55 |\n| `urlpattern-compare.tentative.any.js` | WPT \u0026nbsp;·\u0026nbsp; `urlpattern-compare-test-data.json` | ✅ \u0026nbsp; 25 / 25 |\n| `urlpattern-generate.tentative.any.js` | WPT \u0026nbsp;·\u0026nbsp; `urlpattern-generate-test-data.json` | ✅ \u0026nbsp; 19 / 19 |\n| yarlpattern unit tests | this repo \u0026nbsp;·\u0026nbsp; tokenizer / parser / parts / regex / engine / pattern | ✅ \u0026nbsp; 130 / 130 |\n| **Total** | | ✅ \u0026nbsp; **599 / 599** |\n\n→ [**Full per-case conformance report**](docs/wpt-compliance.md) (regenerate via `just compliance-report`)\n\u0026nbsp;·\u0026nbsp; [**Documented deviations and stricter-than-yarl rules**](SPEC_DEVIATIONS.md)\n\n### What we get right that's easy to miss\n\nThe 100% number is the headline. Equally load-bearing — and easy to skip past — are the\nper-component canonicalisation rules the WHATWG URLPattern spec quietly requires. yarlpattern\nenforces all of them; a stdlib-only port that goes through `urllib.parse` cannot:\n\n- **WHATWG URL parsing end-to-end** via [`yarl`](https://github.com/aio-libs/yarl), not\n  `urllib.parse` (which is not WHATWG-conformant).\n- **IDNA2008 / UTS46 hostname canonicalization** via the third-party\n  [`idna`](https://pypi.org/project/idna/) package, not Python's stdlib `idna` codec\n  (which is IDNA2003 and not spec-compliant for modern IDN labels).\n- **Strict port parsing** — `\"8080xyz\"` is rejected as the WHATWG URL parser's port-state\n  requires; webhook-validation patterns that constrain on exact ports stay robust against\n  junk suffixes.\n- **Case-preserving `%XX` passthrough** in pattern literals — `caf%c3%a9` round-trips as\n  itself, where yarl would normalise to uppercase (WPT cases 146 / 148 pin this).\n- **U+FFFD substitution for unpaired surrogates** before UTF-8 percent-encoding, where yarl\n  silently drops them (WPT case 157).\n- **Hostname-pattern truncation at `?` / `#` / `/` / `\\`**, matching browser engine\n  behaviour for hostnames that were pasted from full URLs.\n\n\u003e **Stdlib-only mode.** Under stdlib `re` without the `[regex]` extra, conformance on\n\u003e `urlpattern.any.js` is **364 / 366 (99.5%)**. The two outlier patterns — `[a\u0026\u0026b]`\n\u003e (intersection) and `[a--b]` (difference) from the JS `v`-flag — require Matthew\n\u003e Barnett's [`regex`](https://pypi.org/project/regex/) package; they're marked `xfail`\n\u003e with an install hint when it's absent. `pip install yarlpattern[regex]` activates them.\n\n### API surface\n\nEvery stable and tentative method in the WHATWG URLPattern IDL is implemented:\n`URLPattern(input | string, baseURL?, options?)`, `test`, `exec`, all eight component\nproperties, `has_regexp_groups`, `URLPattern.compare_component`, and the tentative\n`generate(component, groups)`. The IDL camelCase spellings (`hasRegExpGroups`,\n`compareComponent`) are kept as aliases so code ported verbatim from the spec or\nbrowser JS reads identically. See [SPEC_DEVIATIONS.md](SPEC_DEVIATIONS.md) for the\nintentional Python-flavour choices.\n\n## How this differs from `aiohttp.web.UrlDispatcher`\n\n[`aiohttp.web.UrlDispatcher`](https://docs.aiohttp.org/en/stable/web_reference.html) is a\nmature path-router shaped around web-request dispatch. yarlpattern is a *predicate*: it\nmatches across all eight URL components (not just the path), works standalone (no server\ncontext required), and uses the same WHATWG pattern syntax browsers, Deno, Bun, and\nCloudflare Workers all implement.\n\nUse `UrlDispatcher` if you're building an aiohttp service. Use yarlpattern if you're matching\nURLs outside a server context, need to constrain on hostname / port / scheme alongside path,\nor want patterns that match what browsers do.\n\n→ [Full comparison](https://chad-loder.github.io/yarlpattern/comparisons/aiohttp/)\n\n## How this differs from yarl\n\n[yarl](https://github.com/aio-libs/yarl) is a URL parser / builder; yarlpattern is a URLPattern\nmatcher. They're complementary — yarlpattern depends on yarl for URL parsing and IDNA hostname\nencoding, accepts `yarl.URL` directly in `.test(...)` and `.exec(...)` calls (no `str()`\nround-trip), and uses WHATWG component names (`protocol` / `hostname` / `pathname` / `search` /\n`hash`) rather than yarl's (`scheme` / `host` / `path` / `query` / `fragment`).\n\nWhere the WHATWG URLPattern spec is stricter than yarl, yarlpattern enforces the spec — see the\n[Conformance](#conformance) section above and [SPEC_DEVIATIONS.md](SPEC_DEVIATIONS.md).\n\nComponent-name mapping for muscle-memory porting:\n\n| yarl | yarlpattern | WHATWG / browser JS |\n|---|---|---|\n| `scheme` | `protocol` | `protocol` |\n| `user` | `username` | `username` |\n| `host` | `hostname` | `hostname` |\n| `path` | `pathname` | `pathname` |\n| `query` (MultiDict) | `search` (str) | `search` |\n| `fragment` | `hash` | `hash` |\n\n→ [Full comparison](https://chad-loder.github.io/yarlpattern/comparisons/yarl/), including the\nWPT cases that pin down each strictness rule, the `with_*` ergonomics, and the encoding\nphilosophy yarlpattern shares with the rest of aio-libs.\n\n## Install\n\nInstall [`yarlpattern`](https://pypi.org/project/yarlpattern/) from PyPI:\n\n```bash\npip install yarlpattern            # stdlib re backend\npip install 'yarlpattern[regex]'   # full 100% conformance — see Conformance § above\n```\n\n## Bring your own regex engine\n\nThe matcher's regex backend is pluggable behind a `@runtime_checkable Protocol`. Two adapters\nship in-tree — stdlib `re` (always available; default fallback) and\n[`regex`](https://pypi.org/project/regex/) (auto-detected when `yarlpattern[regex]` is\ninstalled; closes the `[a\u0026\u0026b]` / `[a--b]` gap).\n\nSelection priority: explicit `engine=` argument \u0026rsaquo; `URLPATTERN_REGEX_ENGINE` env var\n\u0026rsaquo; auto-probe (prefers `regex` when importable, falls back to `re`).\nSee [`src/yarlpattern/_regex_engine/protocols.py`](src/yarlpattern/_regex_engine/protocols.py)\nfor the Protocol definitions; a future PyO3-backed engine slots in as one new adapter module.\n\n## Quick start\n\n```bash\nuv sync --all-groups\nuv run pytest                  # full test suite\njust check                     # lint + types + tests (requires `just`)\n```\n\n```python\nfrom yarlpattern import URLPattern\n\n# Dict form, fully wildcarded except path\napi = URLPattern({\"pathname\": \"/api/v:version/users/:id(\\\\d+)\"})\napi.test({\"pathname\": \"/api/v2/users/42\"})              # True\napi.exec({\"pathname\": \"/api/v2/users/42\"}).pathname     # {'input': '...', 'groups': {'version': '2', 'id': '42'}}\n\n# String form with base URL\nroute = URLPattern(\"/posts/:slug\", \"https://blog.example.com\")\nroute.test(\"https://blog.example.com/posts/hello\")      # True\n\n# Match a full URL against the constructed pattern\npat = URLPattern(\"https://*.shop.example/products/:sku\")\npat.test(\"https://eu.shop.example/products/SKU-991\")    # True\n```\n\n\u003c!-- pypi-end --\u003e\n\n## Architecture\n\nLayout, the matching pipeline, the engine seam used by the optional `regex` package, and the\ndeliberate-divergence notes (yarl fast path, `with_*` derivers, the three WHATWG-strictness\nrules) live on the docs site.\n\n→ [Architecture](https://chad-loder.github.io/yarlpattern/explanation/architecture/)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchad-loder%2Fyarlpattern","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fchad-loder%2Fyarlpattern","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchad-loder%2Fyarlpattern/lists"}