https://github.com/souviks22/decentralized-rate-limiter
High performance API rate limiter optimized for availability
api-security libp2p rate-limiting
- Host: GitHub
- URL: https://github.com/souviks22/decentralized-rate-limiter
- Owner: souviks22
- License: mit
- Created: 2025-05-25T17:31:25.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2026-01-26T11:20:14.000Z (5 months ago)
- Last Synced: 2026-01-27T01:01:46.648Z (5 months ago)
- Topics: api-security, libp2p, rate-limiting
- Language: Go
- Homepage:
- Size: 1.8 MB
- Stars: 3
- Watchers: 0
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Decentralized Rate Limiter
> **A decentralized rate limiter that continues to function under network partitions, built to study why centralized Redis-based designs break at scale.**
Most production rate limiters assume a **central truth** — a single Redis instance, a leader, or a globally consistent counter.
This project assumes the opposite:
**coordination is expensive, failures are normal, and partitions happen.**
Each node enforces rate limits **locally**, never blocking on global state, and reconciles usage **eventually** using CRDTs and peer-to-peer gossip.
**Design goal**
> *Never block locally, never trust globally — and still converge.*
This design targets infra teams operating across regions where availability and latency matter more than strict global enforcement.
---
## Why This Exists (and What Breaks in Practice)
Centralized rate limiting (Redis, API gateways, single coordinators) works well — **until it doesn’t**:
* cross-region latency leaks into the hot path,
* a single dependency becomes a blast radius,
* failure handling turns into policy ambiguity (“should we allow or deny?”).
In large distributed systems, **availability often matters more than perfect precision**.
This project explores a different trade-off:
* allow **temporary divergence**,
* guarantee **eventual convergence**,
* keep the request path **fast and local**.
---
## The Core Idea (High-Level)
Every node is a full authority for rate limiting.
```
Client → Any Node → Local Decision → Async Reconciliation
```
* No global coordinator
* No synchronous cross-node calls
* No blocking on consensus
Synchronization happens **off the hot path**.
If the node is alive, it answers.
---
## How It Works (End-to-End)
```
                         ┌──────────────────────────┐
                         │      Client Request      │
                         └────────────┬─────────────┘
                                      │
                                      ▼
                         ┌────────────────────────────────┐
                         │         Peer Node (A)          │
                         │ ────────────────────────────── │
                         │ 1. Receive userID request      │
                         │ 2. Check in-memory LRU         │
                         │ 3. If miss, load from disk     │
                         │ 4. TokenBucket.consume()       │
                         │ 5. Record CRDT delta           │
                         └────────────┬───────────────────┘
                                      │
        ┌─────────────────────────────┼────────────────────────────┐
        ▼                             ▼                            ▼
┌────────────────┐         ┌──────────────────────┐     ┌────────────────────┐
│ In-Memory LRU  │◄───────▶│ Disk Storage (/data) │     │ CRDT Delta Cache   │
│ Token Buckets  │         └──────────────────────┘     └────────▲───────────┘
└────▲───────────┘                                        ┌──────┴────────┐
     │                                                    │ libp2p Gossip │
     │                                                    └──────┬────────┘
     │                                                           │
     │       Broadcast deltas (every 100ms or 100 entries)       │
     └───────────────────────────────────────────────────────────┘
                                   │
                ┌──────────────────┴────────────────────┐
                ▼                                       ▼
     ┌─────────────────────┐                 ┌─────────────────────┐
     │     Peer Node B     │◄───────────────▶│     Peer Node C     │
     │ (Same architecture) │    P2P Sync     │ (Same architecture) │
     └─────────────────────┘                 └─────────────────────┘
```
### Critical invariant
> **The request path never waits for gossip.**
Local decisions are final *for that node*.
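One way to keep the request path from ever waiting on gossip is a non-blocking hand-off into a background batcher. The following is a minimal sketch under stated assumptions, not the project's code: `Batcher`, `Record`, `Run`, the string payload, and the buffer size of 1024 are all hypothetical. Only the "100 ms or 100 entries" thresholds come from the diagram above.

```go
package main

import "time"

const (
	maxBatch   = 100                    // flush after this many deltas
	flushEvery = 100 * time.Millisecond // or after this much time
)

// Batcher is a hypothetical type that moves deltas off the request path.
type Batcher struct {
	in    chan string          // buffered hand-off from request handlers
	flush func(batch []string) // supplied by the caller, e.g. a gossip publish
}

func NewBatcher(flush func(batch []string)) *Batcher {
	return &Batcher{in: make(chan string, 1024), flush: flush}
}

// Record never blocks. If the buffer is full, this sketch drops the delta
// and reports false instead of stalling the request.
func (b *Batcher) Record(delta string) bool {
	select {
	case b.in <- delta:
		return true
	default:
		return false
	}
}

// Run flushes when the batch is full or the ticker fires. When stop is
// closed it drains what is already queued, flushes once more, and returns.
func (b *Batcher) Run(stop <-chan struct{}) {
	ticker := time.NewTicker(flushEvery)
	defer ticker.Stop()

	batch := make([]string, 0, maxBatch)
	emit := func() {
		if len(batch) > 0 {
			b.flush(batch)
			batch = make([]string, 0, maxBatch)
		}
	}
	add := func(d string) {
		batch = append(batch, d)
		if len(batch) >= maxBatch {
			emit()
		}
	}

	for {
		select {
		case d := <-b.in:
			add(d)
		case <-ticker.C:
			emit()
		case <-stop:
			for {
				select {
				case d := <-b.in:
					add(d)
				default:
					emit()
					return
				}
			}
		}
	}
}
```

A request handler would call `Record` after its local decision and ignore the result, while `Run` executes in its own goroutine for the lifetime of the node.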
---
## Design Components & Trade-offs
### Token Bucket (Local Authority)
Each user is governed by a standard token bucket:
* burst capacity,
* steady refill rate.
Why keep this simple:
* constant-time decisions,
* predictable latency,
* easy reasoning under load.
No hidden concurrency tricks. Thread safety is explicit.
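For reference, a standard token bucket with an explicit mutex can be sketched as follows. This is an illustrative sketch, not the project's implementation: the `TokenBucket` type, its field names, and the lazy refill on each call are assumptions.

```go
package main

import (
	"sync"
	"time"
)

// TokenBucket is a hypothetical per-user bucket.
type TokenBucket struct {
	mu       sync.Mutex
	capacity float64   // maximum burst size
	rate     float64   // tokens added per second
	tokens   float64   // tokens currently available
	last     time.Time // time of the last refill
}

// NewTokenBucket returns a bucket that starts full.
func NewTokenBucket(capacity, ratePerSec float64) *TokenBucket {
	return &TokenBucket{
		capacity: capacity,
		rate:     ratePerSec,
		tokens:   capacity,
		last:     time.Now(),
	}
}

// Consume refills from the elapsed time, caps at capacity, and then takes
// one token if at least one is available. It runs in constant time.
func (b *TokenBucket) Consume() bool {
	b.mu.Lock()
	defer b.mu.Unlock()

	now := time.Now()
	b.tokens += now.Sub(b.last).Seconds() * b.rate
	if b.tokens > b.capacity {
		b.tokens = b.capacity
	}
	b.last = now

	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}
```

Refilling lazily inside `Consume` avoids a background timer per user, which matters when most buckets are idle.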
---
### CRDT Deltas (Eventual Global Convergence)
Nodes exchange **deltas**, not full state.
Why deltas:
* small payloads,
* less merge ambiguity,
* faster convergence.
CRDT properties:
* commutative
* idempotent
* monotonic
Once later deltas are eventually delivered, these properties guarantee convergence even under:
* message loss,
* duplication,
* reordering.
**Exact precision is not promised.
Bounded divergence is.**
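One common way to get a merge with these three properties is a grow-only counter per node, merged by taking the maximum. The sketch below assumes that model. The `Delta` and `Usage` types are hypothetical, and the project's actual delta format may differ.

```go
package main

// Delta is a hypothetical gossip record: the cumulative number of tokens
// that node NodeID has consumed on behalf of UserID.
type Delta struct {
	UserID string
	NodeID string
	Count  uint64
}

// Usage maps userID -> nodeID -> highest cumulative count seen so far.
type Usage map[string]map[string]uint64

// Merge keeps the per-node maximum. Taking a maximum is commutative and
// idempotent, and the stored value never decreases (monotonic).
func (u Usage) Merge(d Delta) {
	nodes, ok := u[d.UserID]
	if !ok {
		nodes = make(map[string]uint64)
		u[d.UserID] = nodes
	}
	if d.Count > nodes[d.NodeID] {
		nodes[d.NodeID] = d.Count
	}
}

// Total returns the user's consumption summed over every known node.
func (u Usage) Total(userID string) uint64 {
	var sum uint64
	for _, c := range u[userID] {
		sum += c
	}
	return sum
}
```

Because each count is cumulative, a lost delta is superseded by the next one from the same node, and a duplicated or reordered delta leaves the result unchanged.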
---
### libp2p Gossip (Coordination Without Leaders)
There is:
* no leader,
* no coordinator,
* no central broker.
Nodes discover peers and exchange deltas via libp2p gossip.
Failures are treated as routine:
* if a node disappears, others continue,
* when it returns, state reconverges.
---
### LRU + Disk (Scaling Beyond Memory)
Keeping all users in memory doesn’t scale.
So the system:
* keeps hot buckets in an **in-memory LRU**,
* evicts cold buckets to **disk**,
* reloads lazily on access.
This keeps:
* memory bounded,
* hot paths fast,
* cold users cheap.
Durability is pragmatic, not transactional.
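The hot/cold split can be sketched with `container/list` from the Go standard library. Everything named here is an assumption made for illustration: `BucketCache`, the `Store` interface, and `MapStore`, an in-memory stand-in for disk that keeps the example runnable. Buckets are reduced to a single token count.

```go
package main

import "container/list"

// Store abstracts the cold tier (disk in the design above).
type Store interface {
	Save(userID string, tokens float64)
	Load(userID string) (tokens float64, ok bool)
}

// MapStore is an in-memory stand-in for disk, used only for this sketch.
type MapStore map[string]float64

func (m MapStore) Save(userID string, tokens float64) { m[userID] = tokens }
func (m MapStore) Load(userID string) (float64, bool) {
	t, ok := m[userID]
	return t, ok
}

type entry struct {
	userID string
	tokens float64
}

// BucketCache holds at most max hot entries and spills the least recently
// used one to the cold store. It is not safe for concurrent use.
type BucketCache struct {
	max   int
	order *list.List               // front = most recently used
	items map[string]*list.Element // userID -> element in order
	cold  Store
}

func NewBucketCache(max int, cold Store) *BucketCache {
	return &BucketCache{max: max, order: list.New(), items: make(map[string]*list.Element), cold: cold}
}

// Get returns the user's tokens, reloading lazily from the cold store on a
// miss. fresh is the value used for a user that has never been seen.
func (c *BucketCache) Get(userID string, fresh float64) float64 {
	if el, ok := c.items[userID]; ok {
		c.order.MoveToFront(el)
		return el.Value.(*entry).tokens
	}
	tokens, ok := c.cold.Load(userID)
	if !ok {
		tokens = fresh
	}
	c.Put(userID, tokens)
	return tokens
}

// Put marks the user as most recently used and evicts the oldest entry to
// the cold store if the cache has grown past max.
func (c *BucketCache) Put(userID string, tokens float64) {
	if el, ok := c.items[userID]; ok {
		el.Value.(*entry).tokens = tokens
		c.order.MoveToFront(el)
		return
	}
	c.items[userID] = c.order.PushFront(&entry{userID, tokens})
	if c.order.Len() > c.max {
		e := c.order.Remove(c.order.Back()).(*entry)
		delete(c.items, e.userID)
		c.cold.Save(e.userID, e.tokens)
	}
}
```

Memory stays bounded by `max` regardless of how many distinct users have been seen, and a cold user costs one store read on its next request.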
---
## Performance Snapshot (3-node mesh)
| Metric | Observation |
| ---------------------- | -------------- |
| Throughput per node | ~3,000 req/sec |
| p99 request latency | ~2 ms |
| p99 gossip convergence | ~2 ms |
| Gossip payload size | ~3 KB |
Interpretation:
* request latency is dominated by **local execution**,
* coordination cost is **asynchronous and amortized**.
---
## Failure Semantics (Explicit by Design)
This system chooses **availability over strict correctness**.
* **Network partitions** → nodes operate independently
* **Node crashes** → local state lost, global state reconverges
* **Delayed gossip** → temporary over-allowing possible
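A rough worst-case bound makes the over-allowing concrete. Assume (this is a modelling assumption, not a measured result) that each of $N$ nodes enforces the full per-user bucket with capacity $C$ and refill rate $r$, and that no gossip is delivered for $T$ seconds. A client that reaches every node during that window can be admitted at most

$$
A_{\text{partitioned}} \le N\,(C + rT)
\qquad\text{versus}\qquad
A_{\text{single}} \le C + rT
$$

requests, so the ceiling scales with the number of isolated nodes. Gossip narrows the gap between the two as soon as deltas flow again.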
Observed behavior:
* ~15% bounded over-acceptance at 3 nodes
* grows roughly linearly with node count
This is acceptable for:
* abuse mitigation,
* fairness control,
* soft enforcement.
It is **not acceptable** for strict accounting.
This complexity is the cost paid to remove a global coordinator from the hot path.
---
## Design Walkthrough (Optional Deep Dive)
For a longer-form architectural walkthrough and design rationale, see:
[**High-Level Design of a Decentralized Rate Limiter**](https://medium.com/@souviksarkar2k3/high-level-design-of-a-decentralized-rate-limiter-1bcc33154ce9)
---
## When *Not* to Use This
This design is **not** a good fit if:
* every request must respect a single global counter,
* over-allowing is unacceptable (e.g., billing),
* centralized infrastructure is cheap and reliable for you.
In those cases, a Redis-backed or coordinator-based design is simpler and safer.
---
## Example Usage
```go
limiter := drl.NewRateLimiter(10, 1) // capacity, refill rate
if limiter.AllowRequest("user-123") {
	// request proceeds
} else {
	// rate limited locally
}
```
The API stays intentionally boring.
The complexity lives inside.
---
## What This Project Is (and Isn’t)
This project is **not** about replacing Redis.
It’s about answering a harder question:
> *What does rate limiting look like when the system itself refuses to be centralized?*
If that question matters in your environment, this design might be useful.