How does a platform manage sessions for 2B+ users across multiple devices, supporting instant revocation (password change invalidates all sessions), sliding expiry, and device-specific session limits without checking a central store on every request?
Core challenge: 2B users × 3 devices each = 6B active sessions. Checking a central session store on every single request would require millions of DB lookups/sec. But you also need instant revocation (password change = all sessions dead immediately). How do you balance stateless verification with revocation?
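As a rough check on the "millions of lookups/sec" figure, the arithmetic can be sketched as below. Only the user and device counts come from the text; the requests-per-session rate is an assumed, illustrative number, and the function names are hypothetical.

```python
# Back-of-envelope load estimate for a central session store.
# REQUESTS_PER_SESSION_PER_DAY is an assumption chosen for illustration only.
USERS = 2_000_000_000
DEVICES_PER_USER = 3
REQUESTS_PER_SESSION_PER_DAY = 100  # assumed, not a measured value
SECONDS_PER_DAY = 86_400


def active_sessions(users: int, devices_per_user: int) -> int:
    """Total sessions if every user is signed in on every device."""
    return users * devices_per_user


def average_lookups_per_second(sessions: int, requests_per_day: int) -> float:
    """Average store lookups/sec if every request hits the central store."""
    return sessions * requests_per_day / SECONDS_PER_DAY


sessions = active_sessions(USERS, DEVICES_PER_USER)
lookups = average_lookups_per_second(sessions, REQUESTS_PER_SESSION_PER_DAY)
print(f"{sessions:,} sessions, roughly {lookups:,.0f} lookups/sec on average")
```

Under that assumption the average lands in the millions per second, before accounting for peak traffic, which is what motivates verifying tokens locally instead.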
2B+ active users: ~6B sessions
Instant revocation: password change → all sessions dead
Multi-device: per-device limits (phone, laptop, tablet)
No central check per request: stateless verification
Architecture: Short-Lived JWT + Refresh Token + Revocation List
| Component | Mechanism | Purpose |
| --- | --- | --- |
| Access Token (JWT) | Short-lived (15 min), signed, stateless | Verified locally by any service (no DB call). Contains user_id, roles, device_id, exp. |
| Refresh Token | Long-lived (30 days), opaque, stored server-side | Used to get a new access token. Stored in the session DB. Rotated on each use. |
| Session DB | Redis cluster (user_id → [sessions]) | Tracks active refresh tokens per user per device. Enables revocation. |
| Revocation List | Bloom filter / short-lived blocklist | On password change: add user_id to the blocklist. Services check the blocklist (cached, <1ms). |
| Device Binding | device_fingerprint in token + session | Token is only valid from the same device. Prevents token theft across devices. |
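How instant revocation works without a per-request DB check: access tokens are valid for only 15 minutes. On password change:
1. Invalidate all of the user's refresh tokens in the session DB.
2. Add user_id to the revocation Bloom filter, which is propagated to all services within seconds via pub/sub.
3. Services check the Bloom filter (in-memory, <1ms) before accepting a JWT. A Bloom filter can return false positives, so a hit should be confirmed against an exact blocklist or the session DB rather than treated as final.
4. Within 15 minutes, all old access tokens expire naturally, so blocklist entries older than that can be dropped.
Worst case: a 15-minute window, which applies to any service that has not yet received the update or does not check the blocklist. For critical actions (such as transferring money), always check the session DB.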
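The revocation check can be sketched as a small in-memory Bloom filter in front of an exact lookup. This is a sketch under assumptions: the class names are hypothetical, the exact `revoked_at` map stands in for whatever authoritative store confirms a hit, and the token is assumed to carry an `iat` (issued-at) claim so that tokens minted after the password change are not rejected.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits=8192, num_hashes=4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


class RevocationCache:
    """Per-service view of revocations, fed by pub/sub events (assumed)."""

    def __init__(self):
        self.bloom = BloomFilter()
        self.revoked_at = {}  # exact map; stands in for the authoritative check

    def on_revocation_event(self, user_id, revoked_at):
        self.bloom.add(user_id)
        self.revoked_at[user_id] = revoked_at

    def is_revoked(self, claims):
        user_id = claims["user_id"]
        if not self.bloom.might_contain(user_id):
            return False  # definitely not revoked: the fast, common path
        # Possible hit: confirm exactly, since the filter may be wrong.
        cutoff = self.revoked_at.get(user_id)
        return cutoff is not None and claims["iat"] <= cutoff
```

Comparing `iat` against the revocation time is one way to let the user sign back in immediately after changing the password; keying the blocklist on a session or token-version identifier would be another.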
Refresh token rotation: Each refresh token is single-use. On use: issue a new access token and a new refresh token, and invalidate the old refresh token. If an old refresh token is reused (stolen-token replay), invalidate the entire session family (all tokens for that device). This detects token theft.
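A minimal sketch of rotation with replay detection follows. The in-memory dictionaries stand in for the session DB, and the class and method names are hypothetical; only the refresh-token half of the exchange is shown.

```python
import secrets


class RefreshTokenStore:
    """Single-use refresh tokens grouped into per-device families."""

    def __init__(self):
        self.tokens = {}               # token -> {"family": str, "used": bool}
        self.revoked_families = set()  # families killed after a detected replay

    def _mint(self, family):
        token = secrets.token_urlsafe(32)  # opaque, unguessable value
        self.tokens[token] = {"family": family, "used": False}
        return token

    def start_session(self, user_id, device_id):
        """Create a new family for this login and return its first token."""
        family = f"{user_id}:{device_id}:{secrets.token_hex(4)}"
        return self._mint(family)

    def rotate(self, token):
        """Exchange a refresh token for a new one, or return None if rejected."""
        record = self.tokens.get(token)
        if record is None or record["family"] in self.revoked_families:
            return None
        if record["used"]:
            # A spent token came back: assume theft and kill the whole family.
            self.revoked_families.add(record["family"])
            return None
        record["used"] = True
        return self._mint(record["family"])
```

After a replay, both the attacker's copy and the legitimate client's newest token stop working, so the device has to sign in again; that is the intended cost of detecting theft.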
Anti-patterns:
- Long-lived JWT (24h+): can't revoke for hours.
- Session in cookie only: no server-side revocation.
- No device binding: a stolen token works from any device.
- Checking the DB on every request: doesn't scale to billions of requests.
Real-world:
- Google: short-lived access tokens + refresh via OAuth.
- Auth0: rotating refresh tokens with theft detection.
- GitHub: fine-grained PATs with expiry.
- Netflix: device-bound sessions with concurrent device limits (4 screens).
Interview Cheat Sheet
The 7 things to say for session management design
1. Short-lived JWT (15 min) + long-lived refresh token (30 days): balance stateless verification with revocation
2. Refresh token rotation: single-use, issue a new one on each refresh, detect replay (theft)
3. Revocation Bloom filter: propagated via pub/sub, checked in-memory (<1ms), catches revoked users
4. Device binding: token only valid from the same device fingerprint (prevents cross-device theft)
5. Session DB in Redis: user_id → [active sessions per device], enables "sign out all devices"
6. Critical actions always check the DB: money transfer, password change → verify session server-side
7. Worst-case revocation window = 15 min: access token expiry is the upper bound for stale sessions