Token Bucket Allows Exploitable Bursts Across Windows

Token bucket algorithms refill at a fixed rate (e.g., 100 requests per minute) with a burst allowance (e.g., 20). This works for steady traffic but breaks at window boundaries when the bucket is topped up per interval rather than continuously. A client can then fire 20 requests at 11:59:59 and another 20 at 12:00:00, totaling 40 in under a second, double the intended burst, because each side of the boundary sees a full bucket.

In production, the limiter was built with Go's golang.org/x/time/rate package:

// One token every 600 ms (100 per minute), with a burst of up to 20.
limiter := rate.NewLimiter(rate.Every(time.Minute/100), 20)

A faulty retry loop in a downstream service triggered simultaneous bursts across clients, which cascaded into API-wide timeouts at 4 AM. The configuration had looked conservative and gave good latency under normal use, but it did not account for real-world traffic spikes that exploit boundaries.
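To put a number on the simultaneous-burst scenario, here is a minimal sketch. It assumes a hand-rolled bucket with continuous refill (not the golang.org/x/time/rate implementation), a simulated clock in seconds, and a hypothetical fleet of 50 clients that each retry 30 times in the same instant. The names bucket, newBucket, and admittedAt are made up for illustration.

```go
package main

import "fmt"

// bucket is a hand-rolled token bucket for illustration only; it is not
// the golang.org/x/time/rate implementation.
type bucket struct {
	tokens   float64 // tokens currently available
	capacity float64 // burst size
	perSec   float64 // refill rate in tokens per second
	last     float64 // simulated time of the last refill, in seconds
}

// newBucket starts with a full bucket, as a freshly created limiter would.
func newBucket(perMinute, burst float64) *bucket {
	return &bucket{tokens: burst, capacity: burst, perSec: perMinute / 60}
}

// allow refills for the time elapsed since the last call, then spends one
// token if one is available. now is a simulated clock in seconds.
func (b *bucket) allow(now float64) bool {
	b.tokens += (now - b.last) * b.perSec
	if b.tokens > b.capacity {
		b.tokens = b.capacity
	}
	b.last = now
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// admittedAt counts how many requests get through when each of `clients`
// independent buckets (100/min, burst 20) receives `perClient` requests
// at the same instant.
func admittedAt(clients, perClient int, now float64) int {
	admitted := 0
	for c := 0; c < clients; c++ {
		b := newBucket(100, 20)
		for i := 0; i < perClient; i++ {
			if b.allow(now) {
				admitted++
			}
		}
	}
	return admitted
}

func main() {
	// Hypothetical fleet: 50 clients, 30 retries each, all at t=0.
	fmt.Println(admittedAt(50, 30, 0)) // prints 1000 (50 clients * burst of 20)
}
```

Each client stays within its own limit, yet the backend absorbs every client's full burst at once. Per-client limits say nothing about the aggregate.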

Sliding Window Enforces True Rate Limits

Switch to a sliding window counter, which tallies requests in the last N seconds from the current time—no fixed boundaries to game. For multi-instance scaling, store per-client counts in Redis.
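As a rough illustration of the idea, here is a sketch of a sliding window that keeps a per-client log of timestamps. It is an in-memory stand-in, not the Redis-backed version: the slidingLog type and its methods are hypothetical names, time is passed in as Unix seconds so the behavior is easy to follow, calls are assumed to arrive in time order, and there is no locking.

```go
package main

import "fmt"

// slidingLog keeps the timestamps of admitted requests per client.
// It is an in-memory stand-in for a shared store and is not safe for
// concurrent use.
type slidingLog struct {
	limit      int
	windowSecs int64
	hits       map[string][]int64 // clientID -> admitted timestamps, oldest first
}

func newSlidingLog(limit int, windowSecs int64) *slidingLog {
	return &slidingLog{limit: limit, windowSecs: windowSecs, hits: make(map[string][]int64)}
}

// allow reports whether clientID may make a request at time now (Unix
// seconds). Calls are assumed to arrive with non-decreasing now.
func (s *slidingLog) allow(clientID string, now int64) bool {
	cutoff := now - s.windowSecs
	old := s.hits[clientID]
	// Drop timestamps that have fallen out of the window (cutoff, now].
	i := 0
	for i < len(old) && old[i] <= cutoff {
		i++
	}
	kept := old[i:]
	if len(kept) >= s.limit {
		s.hits[clientID] = kept
		return false
	}
	s.hits[clientID] = append(kept, now)
	return true
}

func main() {
	// 20 requests per 60 seconds; bursts one second apart, as in the
	// 11:59:59 / 12:00:00 example.
	s := newSlidingLog(20, 60)
	first, second := 0, 0
	for i := 0; i < 20; i++ {
		if s.allow("client-a", 59) {
			first++
		}
	}
	for i := 0; i < 20; i++ {
		if s.allow("client-a", 60) {
			second++
		}
	}
	fmt.Println(first, second) // prints 20 0
}
```

The second burst is rejected in full because the window is measured back from the moment of each request, so the first burst is still inside it.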

A weighted sliding window reduces overhead by blending current and previous windows:

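// isAllowed estimates the request count over the trailing window by weighting
// the previous fixed window by the share of it that still overlaps.
// getCount (not shown) returns the stored count for a client and window index.
func isAllowed(clientID string, limit int, windowSecs int64) bool {
    now := time.Now().Unix()
    currentWindow := now / windowSecs
    prevWindow := currentWindow - 1
    // Fraction of the current window that has elapsed, in [0, 1).
    elapsed := float64(now%windowSecs) / float64(windowSecs)
    prev := float64(getCount(clientID, prevWindow))
    current := float64(getCount(clientID, currentWindow))
    // Only the part of the previous window still inside the sliding window counts.
    estimated := prev*(1-elapsed) + current
    return estimated < float64(limit)
}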

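This approximates usage without storing per-request timestamps, on the assumption that requests in the previous window were spread evenly. A burst followed by another 3 seconds later is correctly denied if the combined total is over budget. The approach has run in production for 6 months and has eliminated the 4 AM incidents.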
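To see the weighted estimate at work on the burst-then-burst case, here is a self-contained sketch. It assumes an in-memory map in place of the shared counter store, takes the time as Unix seconds instead of reading the clock, and also records each admitted request. The weightedWindow type and the estimate helper are hypothetical names introduced for the example.

```go
package main

import "fmt"

// weightedWindow is an in-memory sketch of the weighted sliding window.
// counts stands in for the per-window counters a shared store would hold.
// It is not safe for concurrent use.
type weightedWindow struct {
	limit      int
	windowSecs int64
	counts     map[string]map[int64]int // clientID -> window index -> count
}

func newWeightedWindow(limit int, windowSecs int64) *weightedWindow {
	return &weightedWindow{limit: limit, windowSecs: windowSecs, counts: make(map[string]map[int64]int)}
}

// estimate blends the previous and current window counts: the previous
// window is weighted by the fraction of it still inside the sliding window.
func estimate(prev, current int, now, windowSecs int64) float64 {
	elapsed := float64(now%windowSecs) / float64(windowSecs)
	return float64(prev)*(1-elapsed) + float64(current)
}

// allow checks the estimate at time now (Unix seconds) and, if it is under
// the limit, records the request in the current window.
func (w *weightedWindow) allow(clientID string, now int64) bool {
	window := now / w.windowSecs
	perClient := w.counts[clientID]
	if perClient == nil {
		perClient = make(map[int64]int)
		w.counts[clientID] = perClient
	}
	if estimate(perClient[window-1], perClient[window], now, w.windowSecs) >= float64(w.limit) {
		return false
	}
	perClient[window]++
	// Windows older than the previous one no longer affect the estimate.
	delete(perClient, window-2)
	return true
}

func main() {
	// 20 requests per 60 seconds; a burst at t=57 and another 3 seconds later.
	w := newWeightedWindow(20, 60)
	first, second := 0, 0
	for i := 0; i < 20; i++ {
		if w.allow("client-a", 57) {
			first++
		}
	}
	for i := 0; i < 20; i++ {
		if w.allow("client-a", 60) {
			second++
		}
	}
	fmt.Println(first, second) // prints 20 0
}
```

At t=60 the new window has just started, so the previous window still counts at full weight and the second burst is denied. By t=90 its weight has decayed to one half, which frees up roughly half the budget.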

Prioritize Desired Behavior Over Implementation Ease

Token bucket seemed simple and well documented, but a rate limiter defines the contract for acceptable traffic. Ask "What traffic patterns do we want to allow?", not "What is easiest to code?"

For even distribution without boundary exploits, sliding window wins despite the added Redis dependency. Token bucket suits burst-tolerant scenarios. Pay the design tax upfront to avoid outages: every API hits its limits eventually.