{"id":51202413,"url":"https://github.com/anycable/nodejs-websocket-bench","last_synced_at":"2026-06-28T01:04:09.149Z","repository":{"id":365155867,"uuid":"1217777098","full_name":"anycable/nodejs-websocket-bench","owner":"anycable","description":"Benchmarking WebSockets for Node.js: Socket.io, uWS and AnyCable","archived":false,"fork":false,"pushed_at":"2026-06-23T22:06:50.000Z","size":914,"stargazers_count":2,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-24T00:09:10.229Z","etag":null,"topics":["benchmark","websockets"],"latest_commit_sha":null,"homepage":"https://anycable.io/compare/nodejs-websocket","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/anycable.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-22T07:55:43.000Z","updated_at":"2026-06-18T03:01:38.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/anycable/nodejs-websocket-bench","commit_stats":null,"previous_names":["anycable/nodejs-websocket-bench"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/anycable/nodejs-websocket-bench","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anycable%2Fnodejs-websocket-bench","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anycable%2Fnodejs-websocket-bench/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/re
positories/anycable%2Fnodejs-websocket-bench/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anycable%2Fnodejs-websocket-bench/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/anycable","download_url":"https://codeload.github.com/anycable/nodejs-websocket-bench/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anycable%2Fnodejs-websocket-bench/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34873667,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-27T02:00:06.362Z","response_time":126,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["benchmark","websockets"],"created_at":"2026-06-28T01:04:07.887Z","updated_at":"2026-06-28T01:04:09.139Z","avatar_url":"https://github.com/anycable.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Node.js WebSocket Bench\n\nThe repo behind [anycable.io/compare/nodejs-websocket](https://anycable.io/compare/nodejs-websocket). 
Five WebSocket setups, three questions, one rule: same Railway hardware for every row.\n\n- Does it deliver messages when clients drop and reconnect?\n- Does it survive a deploy?\n- How many idle connections will a single instance hold?\n\nSetups under test: default Socket.io, Socket.io + Connection State Recovery, uWebSockets.js, AnyCable OSS, AnyCable Pro.\n\nMethodology, traps, and the bugs we caught in our own setup: [`docs/methodology.md`](./docs/methodology.md). Below: the numbers and how to rerun them.\n\n## Headlines\n\nAll numbers from one Railway region, Pro tier, 32 vCPU / 32 GB. Bench-runner shards run alongside the targets on the internal network so the driver is never the bottleneck.\n\n### Delivery under jitter: 10K clients, 120 broadcasts at 2/sec\n\nEvery client's TCP socket gets force-closed every ~15 s and stays offline ~2 s before reconnecting. ~8 jitter events per client over the 160 s test.\n\n|                          | Default Socket.io | Socket.io + CSR | uWS         | AnyCable OSS | AnyCable Pro |\n| ------------------------ | ----------------- | --------------- | ----------- | ------------ | ------------ |\n| Deliveries lost          | **184,449**       | **0**           | **153,805** | **0**        | **0**        |\n| Delivery rate            | **84.55%**        | **100%**        | **87.03%**  | **100%**     | **100%**     |\n| CSR resume rate          | n/a               | 99.7%           | n/a         | n/a          | n/a          |\n| Replay p50 (raw)         | 106 ms            | 148 ms          | 92 ms       | 250 ms       | 261 ms       |\n| Replay p95               | 394 ms            | 1.97 s          | 0.72 s      | 4.10 s       | 4.10 s       |\n| Replay p99               | 1.07 s            | 4.58 s          | 1.72 s      | 6.14 s       | 6.15 s       |\n| Replay max               | 1.75 s            | 9.71 s          | 2.95 s      | 9.23 s       | 9.36 s       |\n\nAt-most-once protocols drop whatever landed during the 
offline window. Replay protocols deliver everything; CSR's tail is slightly shorter at p99 in this in-memory comparison. AnyCable wins elsewhere: separate Go process so app deploys don't sever connections, horizontal scaling via NATS or Redis with replay intact, native client-to-client whispers, Ruby + Rails + Node + Bun + Deno backends.\n\n### Reconnection avalanche: 5,000 clients, one deploy\n\n|                                | Socket.io    | AnyCable |\n| ------------------------------ | ------------ | -------- |\n| Connections dropped            | 5,000 (100%) | 0        |\n| Recovery p50                   | 4,967 ms     | 0 ms     |\n| Recovery p95                   | 5,992 ms     | 0 ms     |\n| Clients that never came back   | 189 (3.8%)   | 0        |\n| Total downtime                 | **~6.8 s**   | **0 s**  |\n\nIn-memory CSR can't save you here. Server state is lost on restart. Redis Streams CSR keeps the state, but the connections themselves still all sever. The avalanche is architectural.\n\n### Idle capacity: how many connections one instance holds\n\n![Fifty bench-runner shards plus three target services on Railway](docs/railway-services.png)\n*Fifty bench-runner shards, each in its own Railway container with its own source IP and ~64K outbound-port pool. The 1M idle test fans out across all of them. The per-IP port ceiling is the reason single-machine 1M is hard; this is how we get past it without kernel tuning.*\n\nSingle-instance target, 32 vCPU / 32 GB, one stream subscription per connection. 
Headline run: 1,000,000 idle clients via 50 shards × 20,000 connections each.\n\n| Server                      | Held         | Peak memory               | Peak CPU            | Wall                         |\n| --------------------------- | ------------ | ------------------------- | ------------------- | ---------------------------- |\n| Socket.io 4.x (Node 22)     | 119,826      | 6.3 GB                    | 1.34% (1 core)      | single Node event loop       |\n| `anycable-go` (OSS)         | 993,994      | **32 GB** (box ceiling)   | 12.22% (~3.9 vCPU)  | physical RAM                 |\n| `anycable-go-pro` v1.6.13   | **999,954**  | 19.34 GB                  | 9.37% (~3.0 vCPU)   | nothing, 13 GB still free    |\n\n**33 KB per connection OSS, 19 KB Pro** at 1M. Pro is ~1.7× more efficient at this scale and ~2.4× at 200K. Socket.io's wall is architectural: handshakes serialize through one event loop regardless of memory. To reach 1M with Socket.io you run many Node processes behind a Redis adapter (~10K–30K per process).\n\nOSS scaling line on the same box:\n\n| Idle conns | OSS memory  | OSS CPU             |\n| ---------- | ----------- | ------------------- |\n| 1K         | 280 MB      | 0%                  |\n| 10K        | 280 MB      | 0%                  |\n| 20K        | 751 MB      | 1.08% (~0.3 vCPU)   |\n| 50K        | 1.98 GB     | 1.08% (~0.3 vCPU)   |\n| 100K       | 4.18 GB     | 1.62% (~0.5 vCPU)   |\n| 200K       | 8.35 GB     | 2.63% (~0.8 vCPU)   |\n| 994K       | 32 GB       | 12.22% (~3.9 vCPU)  |\n\n## Methodology calls to flag\n\nThree knobs that shape the numbers. Full reasoning in [`docs/methodology.md`](./docs/methodology.md).\n\n- **Default Socket.io's offline window is floored to ~2 s** (`MIN_OFFLINE_MS` in `lib/core/timing.ts`). Otherwise the manual reconnect path stays offline only ~1 s and underreports the loss a real `socket.io-client` user sees with default `reconnection: true`. 
CSR, AnyCable, and uWS land near 2 s on their own because their client-library backoffs dominate. Same disruption shape across the board.\n- **CSR runs with the in-memory adapter.** Simplest opt-in path. Redis Streams or MongoDB shift the tail by adding network RTT; structural picture holds. CSR is documented as incompatible with the Redis pub/sub adapter, so the \"Redis adapter\" most teams reach for first is the one CSR can't use.\n- **AnyCable's jitter-row RAM is the tradeoff for parallel replay.** Its history buffer is per-stream so `history` parallelises across streams; that costs more RAM during jittery runs. Page-level RAM-per-connection comes from the idle test, where the per-connection footprint is what's measured.\n\n## Repository layout\n\n```\nbenchmark/\n├── docker-compose.yml             # Local Socket.io + anycable-go\n├── railway.toml                   # Railway deploy config\n├── docs/\n│   ├── methodology.md             # How we built it and why\n│   ├── railway-ops.md             # Resize, redeploy, fleet, pause\n│   └── env.md                     # Full env-var reference\n└── backend/\n    ├── Dockerfile                 # One image; SERVICE_ENTRY picks the entry point\n    ├── package.json\n    ├── results/                   # CSV/JSON output (gitignored)\n    └── src/\n        ├── publisher.ts                       # Standalone HTTP publisher (legacy)\n        ├── socketio/server.ts                 # /_broadcast + /publish-local\n        ├── uws/server.ts                      # uWebSockets.js comparison server\n        ├── standalone-publisher/server.ts     # Publisher as a separate Railway service\n        ├── bench-runner/server.ts             # Bearer-auth HTTP wrapper, deployed N times\n        ├── bench/                             # Driver scripts; each is `npm run bench:\u003cname\u003e`\n        │   ├── jitter-*.ts, avalanche-*.ts, throughput-*.ts, deploy-impact-*.ts\n        │   ├── whispers.ts, whispers-multi.ts, idle-multi.ts\n       
 │   ├── latency-trace-anycable.ts, jitter-anycable-trace.ts\n        │   ├── fetch-jitter-metrics.ts, railway-metrics.ts\n        │   ├── tests-manifest.ts              # Canonical list of every rebaseline test\n        │   ├── rebaseline.ts                  # Walk manifest, regress vs baselines\n        │   └── rebaseline-history.ts          # Per-metric trend across runs\n        └── lib/\n            ├── jitter-runners.ts, jitter-uws.ts, jitter-anycable-traced.ts\n            ├── avalanche-runner.ts, avalanche-uws.ts\n            ├── deploy-impact-runner.ts\n            ├── standalone-deploy-impact-runner.ts\n            ├── standalone-deploy-impact-anycable-runner.ts\n            ├── whispers-runner.ts, throughput.ts, idle-runner.ts\n            ├── anycable-trace.ts\n            └── core/                          # Every runner imports from here\n                ├── params, stats, timing, peak-rss, log, results-dir, chart\n                ├── bench-runner-client        # Driver-side bearer-token fetch\n                ├── job-queue                  # bench-runner async job state\n                ├── shard-coordinator          # Multi-shard fan-out + merge\n                └── railway-api                # Railway GraphQL\n```\n\nLocal CLI scripts and the Railway-hosted HTTP endpoints call the same `runJitter*` / `runThroughput*` functions. Numbers are produced by one code path; only the trigger differs.\n\n## Local quick-start\n\nRequires Node.js 22+ and either Docker or a local `anycable-go` binary (`brew install anycable-go`).\n\n```bash\ncd backend\nnpm install\n```\n\nTwo server terminals:\n\n```bash\n# Terminal 1: Socket.io (no CSR by default; set SOCKETIO_CSR=1 for CSR)\nnpm run dev:socketio                  # :3000\n\n# Terminal 2: anycable-go\nanycable-go --port 8080 --broker=memory --presets=broker --public\n```\n\nThird terminal, run any variant. 
Each script publishes its own messages.\n\n```bash\n# Default Socket.io\nSOCKETIO_URL=http://localhost:3000 NUM_CLIENTS=50 DURATION=60 \\\n  TOTAL_MESSAGES=60 INTERVAL_MS=500 \\\n  npm run bench:jitter:socketio\n\n# Socket.io + CSR (server must run with SOCKETIO_CSR=1)\nSOCKETIO_URL=http://localhost:3000 NUM_CLIENTS=50 DURATION=60 \\\n  TOTAL_MESSAGES=60 INTERVAL_MS=500 \\\n  npm run bench:jitter:socketio-csr\n\n# AnyCable\nANYCABLE_URL=ws://localhost:8080/cable BROADCAST_URL=http://localhost:8090/_broadcast \\\n  NUM_CLIENTS=50 DURATION=60 TOTAL_MESSAGES=60 INTERVAL_MS=500 \\\n  npm run bench:jitter:anycable\n```\n\nOutput: delivery rate, jitter event count, latency percentiles (raw + min-normalized), runner peak RSS.\n\nLocal avalanche (Socket.io spawns, gets killed, restarts):\n\n```bash\nnpm run build\nNUM_CLIENTS=1000 PORT=4000 npm run bench:avalanche:socketio\n```\n\nFor AnyCable the test confirms nothing happens (separate process, no disruption):\n\n```bash\nanycable-go --port 8080 --broker=memory --presets=broker --public \u0026\nNUM_CLIENTS=1000 ANYCABLE_URL=ws://localhost:8080/cable \\\n  BROADCAST_URL=http://localhost:8090/_broadcast \\\n  npm run bench:avalanche:anycable\n```\n\n## Railway: 10K-client jitter\n\nDeploy three services in one project:\n\n- `socketio-server` from this repo (`backend/Dockerfile`, `SERVICE_ENTRY=socketio/server`). Serves `/_broadcast` for per-message HTTP publishing and `/publish-local` for the diagnostic in-process emit path.\n- `anycable-go` from the official image (`anycable/anycable-go:latest`) with `ANYCABLE_BROKER=memory`, `ANYCABLE_PRESETS=broker`, `ANYCABLE_PUBLIC=true`, `ANYCABLE_HTTP_BROADCAST_SECRET=\u003cyour-secret\u003e`.\n- `bench-runner` from this repo with `SERVICE_ENTRY=bench-runner/server`, `ANYCABLE_BROADCAST_SECRET=\u003csame secret\u003e`, `BENCH_RUNNER_TOKEN=\u003crandom 32+ chars\u003e`.\n\nGive `bench-runner` a public domain. 
With `BENCH_RUNNER_TOKEN` set, every `/bench-*` and `/jobs/*` request needs `Authorization: Bearer \u003ctoken\u003e`; `/health` stays open for Railway probes.\n\n```bash\n# AnyCable @ 10K\ncurl --max-time 320 -X POST \\\n  -H \"Authorization: Bearer $BENCH_RUNNER_TOKEN\" \\\n  \"https://bench-runner-production.up.railway.app/bench-jitter-anycable?n=10000\u0026duration=200\u0026msgs=120\u0026interval=500\u0026jitter=15\u0026jitterMs=1000\u0026ramp=300\u0026stream=run-ac\"\n\n# Default Socket.io @ 10K\ncurl --max-time 320 -X POST \\\n  -H \"Authorization: Bearer $BENCH_RUNNER_TOKEN\" \\\n  \"https://bench-runner-production.up.railway.app/bench-jitter-socketio?n=10000\u0026duration=200\u0026msgs=120\u0026interval=500\u0026jitter=15\u0026jitterMs=1000\u0026ramp=300\u0026stream=run-d\"\n\n# Socket.io + CSR @ 10K\ncurl --max-time 320 -X POST \\\n  -H \"Authorization: Bearer $BENCH_RUNNER_TOKEN\" \\\n  \"https://bench-runner-production.up.railway.app/bench-jitter-socketio-csr?n=10000\u0026duration=200\u0026msgs=120\u0026interval=500\u0026jitter=15\u0026jitterMs=1000\u0026ramp=300\u0026stream=run-csr\"\n```\n\nDefault mode is sync (blocks until the run finishes, typically 3 to 5 minutes at 10K). 
Add `?async=1` to get back `{jobId}` and poll `/jobs/:id` instead.\n\nResponse shape: `deliveryRatePct`, `lostDeliveries`, `expectedDeliveries`, `receivedDeliveries`, `jitterEvents`, `csrResumes`, `csrResumeRatePct`, `connectFailures`, `latencyRawMs` and `latencyOverMinMs` (`{avg,p50,p95,p99,max}` plus `skewFloor`), `runnerPeakRssMb`.\n\n## Railway: avalanche\n\n```bash\nSOCKETIO_URL=https://your-socketio.up.railway.app NUM_CLIENTS=5000 \\\n  npm run bench:avalanche:railway\n```\n\nWhen the script reports \"All clients connected\", from a second terminal:\n\n```bash\nrailway restart -s socketio-server --yes\n```\n\n## Connection capacity\n\n`/bench-idle-anycable`, `/bench-idle-socketio`, `/bench-idle-uws` open N raw WebSockets to the target via internal network, hold for `holdSec`, return final counts.\n\nSingle shard (up to ~50K):\n\n```bash\ncurl --max-time 600 -X POST -H \"Authorization: Bearer $BENCH_RUNNER_TOKEN\" \\\n  \"https://bench-runner.up.railway.app/bench-idle-anycable?n=50000\u0026hold=120\u0026ramp=300\"\n```\n\nEach Linux container has a ~64K per-source-IP outbound port pool, capping any single shard around 50K useful connections.\n\nMulti-shard (100K to 1M) fans out across the 50-shard fleet. Deploying the fleet is in [`docs/railway-ops.md`](./docs/railway-ops.md#deploy-code-to-all-50-shards-in-parallel). Then:\n\n```bash\nSHARDS=$(printf 'https://bench-runner-production.up.railway.app'\n         for i in $(seq 2 50); do\n           printf ',https://bench-runner-%s-production.up.railway.app' \"$i\"\n         done)\n\n# 1M idle against anycable-go on 32 vCPU / 32 GB\nSHARDS=\"$SHARDS\" \\\n  PER_SHARD_N=20000 HOLD_SEC=120 RAMP_PER_SEC=200 \\\n  PROJECT_ID=\u003crailway-project-uuid\u003e SERVICE_ID=\u003canycable-go-service-uuid\u003e \\\n  SERVICE_NAME=anycable-go \\\n  npm run bench:idle:multi\n\n# Other targets: TARGET=socketio + SERVER_URL=...\n#                TARGET=uws + UWS_WS_URL=...\n```\n\n`PROJECT_ID` and `SERVICE_ID` are optional. 
Without them the script reports aggregate counts; with them it pulls memory and CPU from Railway and writes a CSV.\n\n## Re-baselining\n\nA tests manifest lives at `backend/src/bench/tests-manifest.ts`. One command walks it:\n\n```bash\ncd backend\nBENCH_RUNNER_URL=https://bench-runner-production.up.railway.app \\\nBENCH_RUNNER_TOKEN=\u003cyour-token\u003e \\\n  npm run bench:rebaseline\n```\n\nHits each bench-runner endpoint, writes results to `tmp/v1.6.14-bench-results/{id}.json`, prints a delta-vs-baseline report. Drift past the threshold goes yellow (`drift`); a delivery drop or a key-metric breach goes red (`regress`) and exits non-zero.\n\n```\nFILTER=jitter             # only jitter tests\nFILTER=latency-anycable   # only AnyCable latency\nFILTER=jitter,whispers    # comma-separated categories or substrings\nDRY_RUN=1                 # print the plan\nINCLUDE_IDLE=1            # add 4 idle tests (multi-shard, ~16 min)\nINCLUDE_AVALANCHE=1       # add 5 avalanche tests (auto-redeploys server)\n```\n\nFull sweep: ~90 minutes wall-clock.\n\n**Baselines vs the page numbers.** The page was captured during a noisy Railway shared-tenant window. The `baseline` field in `tests-manifest.ts` is what the same tests deliver on a quieter window: latencies ~50% better, everything else the same. So the page is the cautious \"worst seen under shared-infra load\" view, and the rebaseline tests against today's quieter floor. A green rebaseline says \"we still beat today's floor\", which is stricter than the page promises. When we refresh the page, baselines and page numbers move together.\n\nPer-run history lives at `tmp/v1.6.14-bench-results/runs/{ISO-ts}/`. 
To watch each headline number move across runs:\n\n```bash\nnpm run bench:rebaseline:history\nLAST=10 FILTER=jitter npm run bench:rebaseline:history\n```\n\n## Environment variables\n\nMost-used: `BENCH_RUNNER_URL` and `BENCH_RUNNER_TOKEN` for any Railway driver; `NUM_CLIENTS`, `DURATION`, `JITTER_INTERVAL`, `JITTER_DURATION`, `TOTAL_MESSAGES`, `INTERVAL_MS`, `RAMP_RATE` for the jitter scripts; `SOCKETIO_URL` / `ANYCABLE_URL` / `BROADCAST_URL` to retarget local scripts; `RESULTS_DIR` to redirect CSV/JSON output. Full reference: [`docs/env.md`](./docs/env.md).\n\nRailway infrastructure recipes (resize, redeploy, deploy the 50-shard fleet, pause-after-tests cost control) live in [`docs/railway-ops.md`](./docs/railway-ops.md).\n\n## Caveats worth knowing\n\n- WebSocket-only transport on both sides. No long-polling fallback for Socket.io.\n- AnyCable runs with the in-memory broker here. Production typically wants NATS or Redis for restart durability and multi-node fan-out.\n- Publisher and subscribers are different processes by design (real-world latency model). Latency is reported both raw and min-normalized; the compare page uses the min-normalized view, with `skewFloor` exposed so you can spot a bad clock.\n- Multi-shard min-normalization assumes each shard sees one fast sample. Reliable at 3 to 5 minute test windows; below ~60 s, sanity-check `skewFloor` per shard before trusting the merged p99.\n- One bench-runner saturates around 50K subscribers (Node event-loop work, not memory). Above that, fan out via `bench/idle-multi.ts` or `bench/jitter-multi.ts`.\n- Math.random isn't seeded. Runs reproduce statistically, not bit-for-bit. A few-ms p99 drift between runs is expected.\n- `deliveryRatePct` denominator is `totalMessages × clients`. The JSON also carries `deliveryRateOfConnectedPct` so a run with connect failures isn't silently capped below 100%. 
In healthy runs (`connectFailures: 0`) the two are identical.\n\n## About\n\nBuilt by [AnyCable](https://anycable.io) alongside the compare page at https://anycable.io/compare/nodejs-websocket. Open issues or PRs if you find a methodological flaw; we'd rather fix it than leave a wrong number standing.\n\nMIT, [LICENSE](./LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fanycable%2Fnodejs-websocket-bench","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fanycable%2Fnodejs-websocket-bench","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fanycable%2Fnodejs-websocket-bench/lists"}