{"id":49566462,"url":"https://github.com/ferderer/shipweight","last_synced_at":"2026-05-03T11:48:30.476Z","repository":{"id":344192649,"uuid":"1178904932","full_name":"ferderer/shipweight","owner":"ferderer","description":"Fast, cache-first package size analysis for every ecosystem. Self-hosted Bundlephobia alternative with zero rate limits.","archived":false,"fork":false,"pushed_at":"2026-03-15T19:46:32.000Z","size":187,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-03T11:48:18.937Z","etag":null,"topics":["api","axum","badges","bundle-size","bundlephobia","caching","developer-tools","esbuild","npm","package-analysis","package-size","rust","typescript"],"latest_commit_sha":null,"homepage":"https://shipweight.dev/","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ferderer.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-11T13:42:23.000Z","updated_at":"2026-03-20T13:17:44.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/ferderer/shipweight","commit_stats":null,"previous_names":["ferderer/shipweight"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ferderer/shipweight","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ferderer%2Fshipweight","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ferderer%2Fshipweight/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ferderer%2Fshipweight/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ferderer%2Fshipweight/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ferderer","download_url":"https://codeload.github.com/ferderer/shipweight/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ferderer%2Fshipweight/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32568036,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-03T06:36:36.687Z","status":"ssl_error","status_checked_at":"2026-05-03T06:36:09.306Z","response_time":103,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api","axum","badges","bundle-size","bundlephobia","caching","developer-tools","esbuild","npm","package-analysis","package-size","rust","typescript"],"created_at":"2026-05-03T11:48:28.183Z","updated_at":"2026-05-03T11:48:30.462Z","avatar_url":"https://github.com/ferderer.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# shipweight\n\nFast, cache-first package size analysis. A Rust-based replacement for [Bundlephobia](https://bundlephobia.com).\n\n## Why\n\nBundlephobia serves ~300K API requests/day on a single server and is constantly rate-limited. The bottleneck is architectural: Node/Webpack-based bundling with no aggressive caching. Package sizes are immutable — once `react@19.0.0` is published, its bundle size never changes. A cache-first design turns this into a solved problem.\n\n## Architecture\n\nTwo-process architecture with Postgres as message broker:\n\n![Architecture](architecture.png)\n\n- **Server** (Rust/Axum): Cache reads, job enqueuing, response serving. Stays up even if the worker crashes.\n- **Worker** (TypeScript/esbuild): Claims jobs from Postgres, downloads tarballs, bundles with esbuild, compresses, writes results back.\n- **Changes Feed** (Rust): Follows the npm registry changes feed, enqueues new versions of known packages into an idle queue for proactive caching.\n\nCache misses return **202 Accepted** with `Retry-After: 2`. Clients poll until the result is ready (200) or failed (422). After warmup, 95%+ of requests are served from the in-memory cache with zero I/O.\n\n## What You Get Back\n\n```\nGET /v1/npm/react/19.0.0\n```\n\n```json\n{\n  \"name\": \"react\",\n  \"version\": \"19.0.0\",\n  \"description\": \"React is a JavaScript library for building user interfaces.\",\n  \"keywords\": [\"react\"],\n  \"ecosystem\": \"npm\",\n  \"size\": 6420,\n  \"gzip\": 2580,\n  \"brotli\": 2190,\n  \"totalSize\": 8300,\n  \"totalGzip\": 3200,\n  \"totalBrotli\": 2750,\n  \"dependencyCount\": 1,\n  \"dependencyNames\": [\"loose-envify\"],\n  \"treeshakeable\": true,\n  \"sideEffects\": false,\n  \"moduleFormat\": \"cjs\",\n  \"repositoryUrl\": \"https://github.com/facebook/react\",\n  \"homepage\": \"https://react.dev\",\n  \"license\": \"MIT\",\n  \"unpackedSize\": 318938,\n  \"hasTypes\": true,\n  \"monthlyDownloads\": 130000000,\n  \"nodeEngine\": \"\u003e=0.10.0\",\n  \"maintainers\": [\"gnoff\", \"sebmarkbage\"],\n  \"cachedAt\": \"2026-03-11T14:30:00Z\"\n}\n```\n\nOwn sizes, total sizes (including transitive deps), treeshakeability, module format, types, license, downloads, maintainers — all extracted at zero extra cost.\n\n## API\n\n### Package Analysis\n\n| Endpoint | Response | Description |\n|---|---|---|\n| `GET /v1/{eco}/{package}/{version}` | 200 / 202 / 422 | Analyze a specific version |\n| `GET /v1/{eco}/@{scope}/{package}/{version}` | 200 / 202 / 422 | Scoped packages |\n| `GET /v1/{eco}/{package}` | 200 / 404 | Latest cached version |\n| `GET /v1/{eco}/@{scope}/{package}` | 200 / 404 | Latest cached (scoped) |\n\n- **200**: Result ready — `{ \"status\": \"ready\", \"result\": { ... } }`\n- **202**: Processing — `{ \"status\": \"processing\", \"retryAfter\": 2 }` (poll again)\n- **422**: Failed — `{ \"status\": \"failed\", \"error\": \"no entry point found\" }`\n\n### Discovery\n\n| Endpoint | Description |\n|---|---|\n| `GET /v1/{eco}/search?q=router\u0026keyword=react\u0026sort=gzip\u0026limit=20` | Fuzzy search with filters |\n| `GET /v1/{eco}/top?sort=downloads\u0026limit=20` | Top packages leaderboard |\n| `GET /v1/{eco}/{package}/alternatives?limit=10` | Similar packages by keyword overlap |\n\n**Search parameters:** `q` (fuzzy name), `keyword` (exact), `treeshakeable` (bool), `has_types` (bool), `sort` (gzip / total_gzip / size / total_size / name), `order` (asc / desc), `limit` (max 100), `offset`\n\n### Compatibility\n\n| Endpoint | Description |\n|---|---|\n| `GET /api/size?package=react@19.0.0` | Bundlephobia-compatible (sync, polls up to 30s) |\n\nReturns `{ \"name\", \"version\", \"size\", \"gzip\" }` — drop-in replacement for existing VS Code extensions and CI tools.\n\n### Badges\n\n```\nGET /badge/{ecosystem}/{package}.svg\nGET /badge/{ecosystem}/@{scope}/{package}.svg\n```\n\n**Query parameters:** `metric`, `style` (flat / flat-square), `color` (hex or name), `label` (custom text)\n\n**Supported metrics:**\n\n| Metric | Example |\n|---|---|\n| `gzip` (default) | ![gzip](https://img.shields.io/badge/gzip-2.5_kB-brightgreen) |\n| `size` / `minified` | Minified bundle size |\n| `brotli` | Brotli-compressed size |\n| `treeshakeable` / `tree-shaking` | ✓ or ✗ |\n| `side-effects` / `sideEffects` | none or yes |\n| `module` / `moduleFormat` | esm / cjs / dual |\n| `dependencies` / `deps` | Dependency count |\n| `types` | included or missing |\n| `license` | MIT, Apache-2.0, etc. |\n| `downloads` | Monthly npm downloads |\n| `version` | Latest cached version |\n\nColors auto-scale: green (≤5 kB) → yellow (≤25 kB) → orange (≤100 kB) → red (\u003e100 kB).\n\n```markdown\n![gzip size](https://shipweight.dev/badge/npm/react.svg)\n![treeshakeable](https://shipweight.dev/badge/npm/react.svg?metric=treeshakeable)\n![module](https://shipweight.dev/badge/npm/react.svg?metric=module\u0026style=flat-square)\n```\n\n### Health\n\n```\nGET /health → { \"status\": \"ok\", \"version\": \"0.1.0\", \"cache_entries\": 12345 }\n```\n\n## Caching\n\n**Two-layer cache with thundering-herd protection:**\n\n- **L1 — moka** (in-memory LRU, 100K entries, ~50 MB): Handles 95%+ of requests. Built-in deduplication via `get_with()`.\n- **L2 — PostgreSQL** (persistent): Immutable entries — npm doesn't allow republishing the same version. Denormalized for fast queries with `pg_trgm` GIN index on names and GIN index on keywords.\n- **Negative cache** (L1-only, 1h TTL): Caches failures to prevent re-bundling broken packages.\n\n**Request metrics:** Atomic `DashMap` counters per package, flushed to Postgres every 5 minutes. Used to prioritize cache warming.\n\n## Spam Protection\n\nMulti-layer defense against SEO spam packages:\n\n- **Name validation:** npm naming rules + spam keyword detection (discount, coupon, casino, keygen, etc.) + SEO slug heuristic (≥5 hyphens and \u003e40 chars)\n- **API gate:** Invalid names rejected with 400 before enqueue\n- **Changes feed filter:** Shared validation + maintainer blacklist loaded from `blocked_maintainers` table\n- **Seed filter:** Minimum 100 monthly downloads + name validation\n\n## Worker Pipeline\n\nFor each job:\n\n1. Download tarball from npm registry (max 50 MB)\n2. Find entry point (`exports` → `module` → `browser` → `main` → `index.js`)\n3. Bundle with esbuild (`--bundle --minify --format=esm`, dependencies as `--external`)\n4. Compress: gzip + brotli (quality 6)\n5. Extract metadata: treeshakeable, module format, types, license, repository, homepage\n6. Write to `size_cache`\n\n**Resilience:** 3 retries (auto-adds unresolved imports as externals), 60s timeout per bundle, static blacklist (`@types/*`, `@babel/runtime`, `core-js`), permanent failure tracking.\n\n**Two queues (priority):**\n1. **Hot queue** (`job_queue`): User-requested packages — processed first\n2. **Idle queue** (`idle_queue`): New versions of known packages — proactive cache warming via changes feed\n\n## Database Schema\n\n| Table | Purpose |\n|---|---|\n| `size_cache` | Denormalized package analysis results (PK: ecosystem, name, version) |\n| `request_stats` | Per-package request counts for cache warming priority |\n| `job_queue` | User-enqueued analysis jobs (pending → processing → done/failed) |\n| `idle_queue` | Proactive caching jobs from changes feed |\n| `failed_packages` | Permanent failures (unbuildable packages) |\n| `blocked_maintainers` | Spam author blacklist |\n| `metadata` | KV store (e.g., npm changes feed sequence number) |\n\n## Multi-Ecosystem Design\n\nRoutes and queries are ecosystem-generic (`/v1/{ecosystem}/...`). npm is the first implementation. Adding an ecosystem means implementing the worker pipeline for that registry.\n\n| Ecosystem | Metric | Bundling |\n|---|---|---|\n| npm | Bundle size (esbuild) | Yes |\n| Maven | JAR size + transitive deps | No |\n| Cargo | Crate download size + dep count | No |\n| Composer | Package install size | No |\n| PyPI | Wheel size per platform | No |\n\n## Running\n\n### Docker Compose (recommended)\n\n```bash\n# Development (includes local Postgres)\ndocker compose up -d\n\n# Production (uses external Postgres, Traefik, VictoriaMetrics)\ndocker compose -f compose.yaml -f compose.prod.yaml up -d\n```\n\n**Services:** `shipweight` (API, 256 MB), `shipweight-worker` (bundler, 2 GB), `shipweight-changes-feed` (npm feed, 256 MB), `postgres` (dev only)\n\n### Manual\n\n```bash\n# Build\ncd server \u0026\u0026 cargo build --release\n\n# Database\npsql -d shipweight -f seed/init.sql\n\n# Configure\nexport DATABASE_URL=postgresql://user:pass@localhost/shipweight\nexport RUST_LOG=shipweight=info\nexport PORT=3000\n\n# API server\n./target/release/shipweight\n\n# Changes feed (separate process)\n./target/release/shipweight --changes-feed\n\n# Seed popular packages\n./target/release/shipweight --seed\n\n# Worker\ncd worker \u0026\u0026 npm start\n```\n\n### Environment Variables\n\n| Variable | Default | Description |\n|---|---|---|\n| `DATABASE_URL` | — | Postgres connection string |\n| `PORT` | `3000` | API server port |\n| `RUST_LOG` | — | Log level (`shipweight=info`) |\n| `STATIC_DIR` | `./static` | SPA static file directory |\n| `POLL_INTERVAL_MS` | `1000` | Worker poll interval |\n| `MAX_DEPTH` | `10` | Max dependency recursion depth |\n| `BROTLI_QUALITY` | `6` | Brotli compression level (must match across services) |\n\n## Tech Stack\n\n- **Server:** Rust, axum, tokio, sqlx, moka, reqwest, tower-http, serde, tracing\n- **Worker:** TypeScript, esbuild, Node.js\n- **Database:** PostgreSQL (pg_trgm, GIN indexes)\n- **Infra:** Docker, Traefik, Cloudflare, VictoriaMetrics\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fferderer%2Fshipweight","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fferderer%2Fshipweight","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fferderer%2Fshipweight/lists"}