{"id":50909395,"url":"https://github.com/antonio-orionus/url-sanitize","last_synced_at":"2026-06-16T08:32:16.426Z","repository":{"id":360470454,"uuid":"1250304963","full_name":"antonio-orionus/url-sanitize","owner":"antonio-orionus","description":"Remove tracking parameters and unwrap tracking redirects from URLs. ClearURLs-compatible library and CLI for JS, Rust, Python, and CI.","archived":false,"fork":false,"pushed_at":"2026-06-11T10:12:57.000Z","size":565,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-11T11:23:20.798Z","etag":null,"topics":["cleanurls","clearurls","cli","crates-io","github-actions","monorepo","npm-package","privacy","pypi","rust","tracking-protection","typescript","url-cleaner","url-sanitizer"],"latest_commit_sha":null,"homepage":"https://github.com/antonio-orionus/url-sanitize#readme","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/antonio-orionus.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":"docs/roadmap.md","authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-05-26T13:56:10.000Z","updated_at":"2026-06-11T10:11:41.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/antonio-orionus/url-sanitize","commit_stats":null,"previous_names":["antonio-orionus/url-sanitize"],"tags_count":9,"template":false,"template_full_name":null,"purl":"pkg:github/antonio-orionus/url-sanitize","repository_url":"
https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/antonio-orionus%2Furl-sanitize","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/antonio-orionus%2Furl-sanitize/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/antonio-orionus%2Furl-sanitize/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/antonio-orionus%2Furl-sanitize/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/antonio-orionus","download_url":"https://codeload.github.com/antonio-orionus/url-sanitize/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/antonio-orionus%2Furl-sanitize/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34398406,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-16T02:00:06.860Z","response_time":126,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cleanurls","clearurls","cli","crates-io","github-actions","monorepo","npm-package","privacy","pypi","rust","tracking-protection","typescript","url-cleaner","url-sanitizer"],"created_at":"2026-06-16T08:32:15.536Z","updated_at":"2026-06-16T08:32:16.420Z","avatar_url":"https://github.com/antonio-orionus.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 
url-sanitize\n\n[![ci](https://github.com/antonio-orionus/url-sanitize/actions/workflows/ci.yml/badge.svg)](https://github.com/antonio-orionus/url-sanitize/actions/workflows/ci.yml)\n[![npm](https://img.shields.io/npm/v/%40url-sanitize%2Fmerged)](https://www.npmjs.com/package/@url-sanitize/merged)\n[![crates.io](https://img.shields.io/crates/v/url-sanitize)](https://crates.io/crates/url-sanitize)\n[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)\n\n\u003e Remove tracking parameters and unwrap tracking redirects from URLs using ClearURLs, AdGuard, Brave, and Firefox rules.\n\n**Looking for ClearURLs behavior as a library or CLI?** `url-sanitize` removes tracking noise like `utm_*`, `fbclid`, and redirect wrappers using a merged, daily-synced catalog of four upstream rule sources.\n\nAvailable from npm, crates.io, native release binaries, and Python, for use in CI environments, workers, browsers, edge runtimes, Node.js, Bun, and Deno.\n\n- **One behavior contract across languages.** TypeScript and Rust implementations are checked against the same JSONL conformance corpus.\n- **Explainable results.** Stripped params, redirect provider, or block rule are included — no opaque string replacement.\n- **Multi-source without AGPL lock-in.** Engine and CLI are MIT; upstream rule data keeps its source license.\n- **Automation-friendly.** The Rust CLI is deterministic, prompt-free, supports `--json`, and embeds a pinned catalog.\n- **Fresh rules.** GitHub Actions syncs ClearURLs, AdGuard, Brave, and Firefox catalogs daily; releases publish npm packages, crates, Python wheels, and native binaries automatically.\n\n## Contents\n\n- [Install](#install)\n- [TypeScript Quick Start](#typescript-quick-start)\n- [CLI Quick Start](#cli-quick-start)\n- [Rust Quick Start](#rust-quick-start)\n- [Packages](#packages)\n- [GitHub Automation](#github-automation)\n- [Docs](#docs)\n- [Roadmap](#roadmap)\n- [Development](#development)\n- [Contributing](#contributing)\n- 
[License](#license)\n\n## Install\n\n**Fastest path:**\n\n```sh\nnpx @url-sanitize/cli \"https://example.com/?utm_source=x\"\n```\n\n**Native binary, Linux/macOS:**\n\n```sh\ncurl --proto '=https' --tlsv1.2 -LsSf \\\n  https://github.com/antonio-orionus/url-sanitize/releases/latest/download/url-sanitize-installer.sh | sh\n```\n\n**Native binary, Windows x64 PowerShell:**\n\n```powershell\nirm https://github.com/antonio-orionus/url-sanitize/releases/latest/download/url-sanitize-installer.ps1 | iex\n```\n\n**Package managers and libraries:**\n\n```sh\nnpm install -g @url-sanitize/cli\nnpm install @url-sanitize/merged\nnpm install @url-sanitize/core @url-sanitize/clearurls @url-sanitize/adguard @url-sanitize/brave @url-sanitize/firefox\nnpm install @url-sanitize/fetch\ncargo install url-sanitize\ncargo add url-sanitize-core\npip install url-sanitize\n```\n\nThe Python package shells out to the native CLI binary, so install `url-sanitize` with one of the native paths above.\n\n### Install Matrix\n\n| Platform | Command | Notes |\n| --- | --- | --- |\n| Any OS with Node.js | `npx @url-sanitize/cli \"...\"` | No native binary required |\n| Any OS with Rust | `cargo install url-sanitize` | Builds from crates.io |\n| Linux x64 / ARM64 | Shell installer | Installs native binary and verifies `SHA256SUMS` |\n| macOS Apple Silicon / Intel | Shell installer | Installs native binary and verifies `SHA256SUMS` |\n| Windows x64 | PowerShell installer | Installs native binary and verifies `SHA256SUMS` |\n| Windows ARM64 | `npx @url-sanitize/cli \"...\"` | Native release archives not yet published |\n| Python | `pip install url-sanitize` + native CLI | Python shells out to `url-sanitize` on `PATH`, or `URL_SANITIZE_BIN` |\n\n### Homebrew and Scoop\n\n```sh\nbrew install antonio-orionus/url-sanitize/url-sanitize\n```\n\n```powershell\nscoop bucket add url-sanitize https://github.com/antonio-orionus/scoop-url-sanitize\nscoop install url-sanitize\n```\n\nHomebrew supports macOS Apple 
Silicon/Intel and Linux x64/ARM64. Scoop supports Windows x64. Release automation renders Homebrew and Scoop metadata from the published `SHA256SUMS`; validation fixtures are kept at [`Formula/url-sanitize.rb`](Formula/url-sanitize.rb) and [`bucket/url-sanitize.json`](bucket/url-sanitize.json).\n\n### CI and Containers\n\nFor CI, pin a version instead of using `latest`:\n\n```sh\nversion=\"v2.0.1\"\ntarget=\"x86_64-unknown-linux-gnu\"\nasset=\"url-sanitize-${target}.tar.gz\"\n\ncurl --proto '=https' --tlsv1.2 -fsSLO \"https://github.com/antonio-orionus/url-sanitize/releases/download/${version}/${asset}\"\ncurl --proto '=https' --tlsv1.2 -fsSLO \"https://github.com/antonio-orionus/url-sanitize/releases/download/${version}/SHA256SUMS\"\ngrep \"  ${asset}$\" SHA256SUMS | sha256sum -c -\ntar -xzf \"${asset}\"\n./url-sanitize --version\n```\n\nGitHub Actions:\n\n```yaml\njobs:\n  url-sanitize:\n    runs-on: ubuntu-latest\n    steps:\n      - name: Install url-sanitize\n        run: |\n          set -euo pipefail\n          version=\"v2.0.1\"\n          target=\"x86_64-unknown-linux-gnu\"\n          asset=\"url-sanitize-${target}.tar.gz\"\n\n          curl --proto '=https' --tlsv1.2 -fsSLO \"https://github.com/antonio-orionus/url-sanitize/releases/download/${version}/${asset}\"\n          curl --proto '=https' --tlsv1.2 -fsSLO \"https://github.com/antonio-orionus/url-sanitize/releases/download/${version}/SHA256SUMS\"\n          grep \"  ${asset}$\" SHA256SUMS | sha256sum -c -\n          tar -xzf \"${asset}\"\n          sudo install -m 0755 url-sanitize /usr/local/bin/url-sanitize\n\n      - name: Smoke test\n        run: |\n          url-sanitize --version\n          url-sanitize --json \"https://example.com/article?utm_source=newsletter\u0026id=123\"\n          printf '%s\\n' \"https://example.com/article?utm_source=newsletter\u0026id=123\" | url-sanitize -\n```\n\nGitLab CI:\n\n```yaml\nurl-sanitize:\n  image: ubuntu:24.04\n  before_script:\n    - apt-get update\n    - 
apt-get install -y --no-install-recommends ca-certificates curl coreutils tar\n  script:\n    - |\n      set -eu\n      version=\"v2.0.1\"\n      target=\"x86_64-unknown-linux-gnu\"\n      asset=\"url-sanitize-${target}.tar.gz\"\n\n      curl --proto '=https' --tlsv1.2 -fsSLO \"https://github.com/antonio-orionus/url-sanitize/releases/download/${version}/${asset}\"\n      curl --proto '=https' --tlsv1.2 -fsSLO \"https://github.com/antonio-orionus/url-sanitize/releases/download/${version}/SHA256SUMS\"\n      grep \"  ${asset}$\" SHA256SUMS | sha256sum -c -\n      tar -xzf \"${asset}\"\n      install -m 0755 url-sanitize /usr/local/bin/url-sanitize\n    - url-sanitize --version\n    - url-sanitize --json \"https://example.com/article?utm_source=newsletter\u0026id=123\"\n    - printf '%s\\n' \"https://example.com/article?utm_source=newsletter\u0026id=123\" | url-sanitize -\n```\n\nDockerfile:\n\n```dockerfile\nFROM ubuntu:24.04\n\nARG URL_SANITIZE_VERSION=v2.0.1\nARG URL_SANITIZE_TARGET=x86_64-unknown-linux-gnu\n\nRUN apt-get update \\\n  \u0026\u0026 apt-get install -y --no-install-recommends ca-certificates curl coreutils tar \\\n  \u0026\u0026 rm -rf /var/lib/apt/lists/*\n\nRUN set -eux; \\\n  asset=\"url-sanitize-${URL_SANITIZE_TARGET}.tar.gz\"; \\\n  curl --proto '=https' --tlsv1.2 -fsSLO \"https://github.com/antonio-orionus/url-sanitize/releases/download/${URL_SANITIZE_VERSION}/${asset}\"; \\\n  curl --proto '=https' --tlsv1.2 -fsSLO \"https://github.com/antonio-orionus/url-sanitize/releases/download/${URL_SANITIZE_VERSION}/SHA256SUMS\"; \\\n  grep \"  ${asset}$\" SHA256SUMS | sha256sum -c -; \\\n  tar -xzf \"${asset}\"; \\\n  install -m 0755 url-sanitize /usr/local/bin/url-sanitize; \\\n  rm -f \"${asset}\" SHA256SUMS url-sanitize; \\\n  url-sanitize --version\n```\n\n## TypeScript Quick Start\n\n```ts\nimport { sanitize } from '@url-sanitize/merged';\n\nconst result = 
sanitize('https://example.com/article?utm_source=newsletter\u0026id=123');\n\nconsole.log(result);\n// {\n//   kind: 'cleaned',\n//   original: 'https://example.com/article?utm_source=newsletter\u0026id=123',\n//   url: 'https://example.com/article?id=123',\n//   strippedParams: ['utm_source'],\n//   matchedRules: [{ provider: 'globalRules', kind: 'strip-param', pattern: 'utm_.*' }]\n// }\n```\n\n**Custom catalog or options:**\n\n```ts\nimport { compileSanitizer } from '@url-sanitize/core';\nimport { mergedCatalog } from '@url-sanitize/merged';\n\nconst sanitize = compileSanitizer(mergedCatalog, { stripReferralMarketing: true });\n```\n\n**ClearURLs-only behavior:**\n\n```ts\nimport { sanitize } from '@url-sanitize/clearurls';\n```\n\n## CLI Quick Start\n\n```sh\nurl-sanitize \"https://example.com/article?utm_source=newsletter\u0026id=123\"\n# https://example.com/article?id=123\n\nurl-sanitize --json \"https://www.google.com/url?q=https%3A%2F%2Fexample.org\"\n# {\"kind\":\"redirected\",\"original\":\"...\",\"url\":\"https://example.org/\",\"via\":{...}}\n```\n\n## Rust Quick Start\n\n```rust\nuse url_sanitize_core::{Catalog, SanitizerOptions};\n\n// Wrapped in main so `?` compiles; assumes the crate's error types implement std::error::Error.\nfn main() -\u003e Result\u003c(), Box\u003cdyn std::error::Error\u003e\u003e {\n    let json = std::fs::read_to_string(\"catalog/catalog.json\")?;\n    let catalog = Catalog::from_json(\u0026json)?;\n    let sanitizer = catalog.compile(SanitizerOptions::default());\n    let result = sanitizer.sanitize(\"https://example.com/?utm_source=x\");\n\n    println!(\"{}\", serde_json::to_string(\u0026result)?);\n    Ok(())\n}\n```\n\n## Packages\n\n| Package | Description | License |\n| --- | --- | --- |\n| [`@url-sanitize/core`](packages/core) | Pure TypeScript sanitization engine. Zero runtime deps. | MIT |\n| [`@url-sanitize/merged`](packages/merged) | Default merged multi-source catalog. | MIT (code) + upstream data licenses |\n| [`@url-sanitize/clearurls`](packages/clearurls) | ClearURLs-compatible catalog + adapter. 
| MIT (code) + LGPL-3.0-only (data) |\n| [`@url-sanitize/adguard`](packages/adguard) | AdGuard URL Tracking Protection catalog + adapter. | LGPL-3.0-only |\n| [`@url-sanitize/brave`](packages/brave) | Brave Debouncer catalog + adapter. | MPL-2.0 |\n| [`@url-sanitize/firefox`](packages/firefox) | Firefox Query Stripping catalog + adapter. | MPL-2.0 |\n| [`@url-sanitize/cli`](packages/cli) | npm CLI for removing tracking parameters and redirect wrappers. | MIT |\n| [`@url-sanitize/fetch`](packages/fetch) | Runtime ClearURLs catalog fetch with SHA256 and pinned-hash verification. | MIT |\n| [`url-sanitize-core`](crates/url-sanitize-core) | Pure-Rust implementation. | MIT |\n| [`url-sanitize`](crates/url-sanitize) | Native Rust CLI with embedded merged catalog. | MIT |\n| [`url-sanitize`](python) | Python wrapper around the native CLI. | MIT |\n| `@url-sanitize/action` | GitHub Action for URL hygiene in PRs and docs. (Planned — not yet published.) | MIT |\n\n## Compared to Existing Options\n\n| Option | Tradeoffs |\n| --- | --- |\n| ClearURLs browser extension | End-user product, not a library |\n| `@quik-fe/clear-urls` | AGPL-3.0-only — adoption-blocker for SaaS and commercial use |\n| Hand-rolled per-project regexes | Stale within months; no upstream rule sync |\n| **url-sanitize** | MIT engine, daily-synced multi-source rules, explainable results |\n\n## GitHub Automation\n\n- `ci.yml` — builds, typechecks, lints, tests, checks generated catalog and conformance freshness, runs Rust fmt/clippy/tests/package checks, validates release binary size, and runs npm/Python/installer/Homebrew/Scoop smoke tests.\n- `sync-clearurls.yml` — syncs upstream rule sources daily and opens a version-bump PR when rules change.\n- `release-dry-run.yml` — builds the release matrix on PRs, assembles archives, renders Homebrew/Scoop metadata, and validates installer/package-manager syntax before merge.\n- `auto-tag.yml` — verifies release metadata, creates annotated tags after version bumps 
land on `main`, and dispatches `release.yml`.\n- `release.yml` — publishes npm packages, Rust crates, the PyPI package, native GitHub Release assets, and Homebrew/Scoop metadata, then runs public smoke tests; triggered by `v*` tags.\n- `post-release-smoke.yml` — available for manual public smoke reruns against an already-published version.\n\nPublishing to the Homebrew tap and Scoop bucket repositories requires a `PACKAGING_REPO_TOKEN` secret. The optional `HOMEBREW_TAP_REPO` and `SCOOP_BUCKET_REPO` repository variables override the defaults (`antonio-orionus/homebrew-url-sanitize` and `antonio-orionus/scoop-url-sanitize`). If the token is absent, release automation skips external package-manager publication.\n\n## Docs\n\n- [Roadmap](docs/roadmap.md) — milestone detail, deferred surfaces, and strategic context\n- [Behavioral spec](docs/spec.md) — result schema and implementation contract\n- [Benchmarks](docs/benchmarks.md) — current sanitizer throughput numbers\n- [Threat model](docs/threat-model.md) — what hash verification proves and what it doesn't\n- [License model](docs/license-model.md) — why the engine is MIT and rule data keeps its upstream license (LGPL-3.0-only or MPL-2.0)\n- [ClearURLs compatibility](docs/clearurls-compat.md) — migrating from ClearURLs or `@quik-fe/clear-urls`\n- [Non-goals](docs/non-goals.md) — what this project will never do\n- [Security policy](SECURITY.md) — responsible disclosure and supported versions\n\n## Roadmap\n\n- **v0.1** — TypeScript engine, ClearURLs adapter, npm CLI, Rust engine, Rust CLI, shared conformance, daily sync workflow ✓\n- **v0.2** — broader native archive coverage, installer refinements, Homebrew/Scoop, CI install examples ✓\n- **v0.3** — runtime catalog fetching, custom user-defined catalogs, schema validation ✓\n- **v1.0** — stable public API, result types, benchmarks, security policy ✓\n- **v2.0** — multi-source packages: AdGuard, Brave, Firefox, merged catalog ✓\n- **Deferred** — GitHub Action, MCP, AUR/Winget/distro packages, native npm packages, WASM, in-process Python 
bindings\n\n## Development\n\nRequires Node.js ≥ 22 and pnpm. Rust toolchain required for crate targets (MSRV 1.75).\n\n```sh\ngit clone https://github.com/antonio-orionus/url-sanitize.git\ncd url-sanitize\npnpm install\npnpm build       # tsup build all packages\npnpm test        # vitest\npnpm typecheck\npnpm lint\ncargo test --workspace\n```\n\nUpstream rule catalogs sync automatically via `sync-clearurls.yml`. To pull them manually:\n\n```sh\npnpm sync:sources\n```\n\nPre-push hook runs: `pnpm build`, `pnpm lint`, `pnpm typecheck`, `pnpm test`, `cargo fmt --all --check`, `cargo clippy --workspace --all-targets -- -D warnings`, `cargo test --workspace`, and `cargo package -p url-sanitize-core --allow-dirty`.\n\n## Contributing\n\nPRs welcome. See [CONTRIBUTING.md](CONTRIBUTING.md).\n\n## License\n\nMIT for engine, CLI, and tooling. Bundled upstream rule data keeps its source license: ClearURLs and AdGuard data are LGPL-3.0-only; Brave and Firefox data are MPL-2.0. See [LICENSE](LICENSE) and [docs/license-model.md](docs/license-model.md).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fantonio-orionus%2Furl-sanitize","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fantonio-orionus%2Furl-sanitize","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fantonio-orionus%2Furl-sanitize/lists"}