System Design Concepts

No fluff: visual, concise, interview-ready

🔌 4 · APIs & COMMUNICATION

REST API

Stateless, HTTP-based, universal, cacheable. The default choice for public APIs

▸ Anatomy of a REST URL
GET https://api.example.com/v1/users?age=25&gender=male&page=2&limit=10

METHOD: GET/POST/PUT/DELETE
PROTOCOL: always HTTPS
SUBDOMAIN: api.example.com
VERSION: /v1 (backward compatibility)
ENDPOINT: /users (nouns, not verbs)
FILTERING: age=25&gender=male (narrow results)
PAGINATION: page=2&limit=10 (page + limit)

✓ Best Practices:
▸ Use nouns for resources (/users, not /getUsers)
▸ Use plural names (/users, not /user)
▸ Prefer cursor pagination over offset for large datasets
▸ Use idempotency keys for POST (prevent dupes)
▸ Put the version in the URL path

GET    /api/v1/products/123     → Fetch (cacheable, idempotent)
POST   /api/v1/orders           → Create (use idempotency key to prevent dupes)
PUT    /api/v1/orders/456       → Replace (idempotent)
DELETE /api/v1/orders/456       → Cancel (idempotent)

Pagination: ?cursor=abc123 (preferred) or ?page=2&limit=10
Caching: Cache-Control: max-age=3600 · ETag: "abc123"
Rate Limit: X-RateLimit-Limit: 1000 · X-RateLimit-Remaining: 847
Guarantees: Statelessness (no server-side session, so any instance handles any request). Idempotency of GET/PUT/DELETE (safe to retry on failure). Cacheability (HTTP caching via CDN and browser reduces load).
Real-world: Stripe API is the gold standard (idempotency keys, versioning, pagination). GitHub API v3 is REST. Twilio uses REST for SMS/voice.
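The cursor pagination recommended above can be sketched in a few lines. This is a minimal illustration, not any particular API's implementation: the in-memory `USERS` list, the `list_users` handler, and the base64-encoded JSON cursor format are all assumptions made for the example.

```python
import base64
import json
from typing import Optional

# Hypothetical in-memory "table", sorted by id (stand-in for a database).
USERS = [{"id": i, "name": f"user{i}"} for i in range(1, 26)]


def encode_cursor(last_id: int) -> str:
    """Wrap the last-seen id in an opaque, URL-safe token."""
    raw = json.dumps({"last_id": last_id}).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor: str) -> int:
    """Recover the last-seen id from a cursor token."""
    return json.loads(base64.urlsafe_b64decode(cursor.encode()))["last_id"]


def list_users(cursor: Optional[str] = None, limit: int = 10) -> dict:
    """Return one page plus the cursor for the next page (None when done)."""
    last_id = decode_cursor(cursor) if cursor else 0
    # Keyset query, roughly: WHERE id > last_id ORDER BY id LIMIT limit + 1.
    # Fetching one extra row tells us whether another page exists.
    rows = [u for u in USERS if u["id"] > last_id][: limit + 1]
    page, has_more = rows[:limit], len(rows) > limit
    next_cursor = encode_cursor(page[-1]["id"]) if has_more else None
    return {"data": page, "next_cursor": next_cursor}
```

Unlike `?page=2&limit=10`, the cursor pins the position to a row rather than an offset, so inserts and deletes between requests do not shift or duplicate results.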

gRPC

HTTP/2 + Protobuf. Often several times faster than JSON over HTTP/1.1 (figures up to 10x are commonly quoted; the gain depends on payload and workload). 4 call types: unary, server-stream, client-stream, bidirectional

▸ 4 gRPC Streaming Modes
Unary: 1 request → 1 response (simple RPC), e.g. GetUser()
Server Stream: 1 request → N responses (server pushes N msgs), e.g. ListPrices()
Client Stream: N requests → 1 response (client sends N msgs), e.g. UploadFile()
Bidirectional: N requests ↔ N responses (both send N msgs), e.g. Chat()

Mode | Use Case | Example
Unary | Simple request/response | GetUser, CreateOrder
Server Stream | Server pushes multiple results | Stock ticker, log tailing
Client Stream | Client sends batch | File upload, telemetry
Bidirectional | Real-time two-way | Chat, multiplayer game

Guarantees: Type safety (.proto schema + codegen catch incompatibilities at compile time). Deadline propagation (the timeout flows through the entire call chain). Multiplexing (multiple concurrent calls on a single HTTP/2 connection).
Real-world: Google internal comms. Netflix/Uber microservice-to-microservice. Best for: internal APIs, 10K+ RPS, bidirectional streaming. Not for browsers (use gRPC-Web proxy).

GraphQL

Client specifies exactly which fields it wants: single endpoint, strongly typed schema, usually no URL versioning (the schema evolves by adding and deprecating fields)

Guarantees: No over-fetching (the client gets only the requested fields). Schema contract (the server validates queries against the schema before execution). Introspection (clients can discover available types/fields).
Risks: N+1 problem (fix with DataLoader batching). Deep query DoS (fix with depth limiting + cost analysis). Caching is hard (each query is unique).
Real-world: GitHub API v4 and the Shopify Storefront API use GraphQL.
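The batching fix for the N+1 problem can be sketched without any GraphQL library. This is a simplified, synchronous stand-in for the DataLoader idea (the real DataLoader is promise-based); `BatchLoader`, `fetch_authors`, and the sample data are hypothetical names introduced for illustration.

```python
from typing import Callable, Dict, List

QUERY_LOG: List[tuple] = []  # records each simulated database round trip
AUTHORS = {1: {"name": "Ada"}, 2: {"name": "Linus"}}
POSTS = [
    {"id": 10, "author_id": 1},
    {"id": 11, "author_id": 2},
    {"id": 12, "author_id": 1},
]


def fetch_authors(ids: List[int]) -> Dict[int, dict]:
    """One batched lookup, roughly: SELECT ... WHERE id IN (ids)."""
    QUERY_LOG.append(tuple(ids))
    return {i: AUTHORS[i] for i in ids}


class BatchLoader:
    """Collects keys first, then resolves them all with one batched fetch."""

    def __init__(self, batch_fn: Callable[[List[int]], Dict[int, dict]]):
        self.batch_fn = batch_fn
        self.pending: List[int] = []
        self.cache: Dict[int, dict] = {}

    def load(self, key: int) -> None:
        # Deduplicate: each key is requested at most once per batch.
        if key not in self.cache and key not in self.pending:
            self.pending.append(key)

    def dispatch(self) -> None:
        if self.pending:
            self.cache.update(self.batch_fn(self.pending))
            self.pending = []

    def get(self, key: int) -> dict:
        return self.cache[key]


def resolve_posts_with_authors(posts: List[dict]) -> List[dict]:
    loader = BatchLoader(fetch_authors)
    for post in posts:
        loader.load(post["author_id"])  # queue, do not fetch yet
    loader.dispatch()                   # one query instead of len(posts)
    return [{**p, "author": loader.get(p["author_id"])} for p in posts]
```

Resolving the three posts issues a single author lookup for ids 1 and 2, where a naive resolver would issue one lookup per post.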

Async APIs

For long-running tasks: accept immediately, process in the background, and let the client poll for the result

When to use: Image/video processing, report generation, ML inference, bulk imports, or any operation that takes seconds to minutes. Don't make the client wait. Accept the request, queue the work, return a status URL.
▸ Async Request Lifecycle (Image Processing Example)
Actors: 💻 Client · 🌐 API · 📋 Queue · ⚙️ Worker · 🗄️ DB

Phase 1: Submit Request
▸ Client → API: POST /api/images
▸ API → DB: save original image + create job record
▸ API → Queue: queue processing job
▸ API → Client: 202 Accepted, Location: /api/images/{id}/status

Phase 2: Client Polls (loop)
▸ GET /api/images/{id}/status → 200 OK {status: "processing"}
▸ GET /api/images/{id}/status → 200 OK {status: "complete", url: "..."}

Phase 3: Background Processing
▸ Worker dequeues the job and sets status → "processing"
▸ ⚙️ Worker processes the image (resize, compress, etc.)
▸ Worker sets status → "complete" and saves the processed image URL

Alternatives to Polling
▸ 🔔 Webhook: server sends POST /your-webhook when done (client answers 200 OK) · HMAC signed. Best for: server-to-server (Stripe, GitHub)
▸ ⚡ WebSocket: 🔗 persistent conn, server pushes {status: "complete"}. Bidirectional, so the client can cancel or get progress %. Best for: real-time UIs, live progress bars

Submit (202) → get result via: Poll (200) · Webhook (POST) · WebSocket (push)

Polling

▸ Client calls GET /status periodically
▸ Simple, no extra infra needed
▸ Add a Retry-After: 5 header
▸ Wasteful for long jobs
Use: Short jobs, browser apps
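The submit-then-poll lifecycle can be sketched with in-memory stand-ins for the queue and database. Everything here is illustrative: the function names, the tuple-shaped responses, the initial "queued" status, and the `/processed/...` URL are assumptions, and the worker is called manually instead of running in the background.

```python
import uuid
from collections import deque
from typing import Dict, Tuple

JOBS: Dict[str, dict] = {}   # job_id -> job record (stand-in for the DB)
QUEUE: deque = deque()       # stand-in for a message queue

Response = Tuple[int, dict, dict]  # (status code, headers, JSON body)


def submit_image(image_bytes: bytes) -> Response:
    """POST /api/images: save a job record, enqueue it, answer 202 at once."""
    job_id = str(uuid.uuid4())
    JOBS[job_id] = {"status": "queued", "url": None, "input": image_bytes}
    QUEUE.append(job_id)
    return 202, {"Location": f"/api/images/{job_id}/status"}, {"id": job_id}


def get_status(job_id: str) -> Response:
    """GET /api/images/{id}/status: report progress without blocking."""
    job = JOBS.get(job_id)
    if job is None:
        return 404, {}, {"error": "unknown job"}
    if job["status"] == "complete":
        return 200, {}, {"status": "complete", "url": job["url"]}
    # Not done yet: hint when the client should poll again.
    return 200, {"Retry-After": "5"}, {"status": job["status"]}


def run_worker_once() -> bool:
    """Dequeue one job and process it; returns False if the queue is empty."""
    if not QUEUE:
        return False
    job_id = QUEUE.popleft()
    job = JOBS[job_id]
    job["status"] = "processing"
    # Placeholder for the real work (resize, compress, etc.).
    job["url"] = f"/processed/{job_id}.jpg"
    job["status"] = "complete"
    return True
```

The key property is that `submit_image` never waits on the work itself; it only records and enqueues, so the request returns quickly regardless of how long processing takes.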

Webhook

▸ Server POSTs result to client URL
▸ No wasted requests (push)
▸ HMAC signature for security
▸ Client needs a public endpoint
Use: Server-to-server, Stripe
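The HMAC signature check can be sketched with Python's standard `hmac` module. This assumes a plain hex-encoded HMAC-SHA256 over the raw request body with a shared secret; real providers each define their own header name and signing scheme (some also sign a timestamp to limit replay), so treat the function names and format here as illustrative.

```python
import hashlib
import hmac


def sign(secret: bytes, payload: bytes) -> str:
    """Sender side: hex HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify_webhook(secret: bytes, payload: bytes, signature_header: str) -> bool:
    """Receiver side: recompute the signature and compare in constant time."""
    expected = sign(secret, payload)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode(), signature_header.encode())
```

Verification must run over the exact bytes received, before any JSON parsing or re-serialization, because whitespace or key-order changes would alter the digest.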

WebSocket

▸ Persistent conn, server pushes
▸ Instant notification, no polling delay
▸ Bidirectional (cancel jobs too)
▸ Connection management overhead
Use: Real-time UIs, live progress
Real-world: Stripe payment intents (submit → webhook on completion). AWS S3 multipart upload (initiate → upload parts → complete). GitHub Actions (trigger workflow → poll or webhook for result). Vercel (deploy → poll build status).

Idempotent APIs

Same request N times = same effect as once. Critical for payments, orders, any operation that must not duplicate

▸ Why Retries Are Dangerous
User transfers $100. Network glitch → client retries automatically. Without idempotency, the server deducts $100 twice, so the user loses $200. This is the duplicate processing problem, and it happens in production.
▸ 3 Failure Scenarios During an API Call
① Request fails before reaching the server: ✓ safe to retry, because the server never saw it and has not started processing.
② Request reaches the server, processing is interrupted (partial): ⚠ UNSAFE. Did the $100 get deducted or not?
③ Server processes fully, response is lost: ⚠ UNSAFE. A retry means a double charge!
Scenarios ② and ③ need idempotency keys to make retries safe.
▸ Solution: Idempotency Key Flow (Stripe Pattern)
Actors: Client (App) · Payment Server · Redis (Key Store)
① Client generates an idempotency key (UUID), e.g. "abc-123-def"
② Client sends POST /transfer {$100, to: Bob} with header Idempotency-Key: abc-123-def
③ Server checks: does key "abc-123-def" exist? NO, first time
④ Server processes the $100 and stores key + result (TTL: 24h). ✗ Response lost!
⑤ Client retries with the same key: POST /transfer {$100, to: Bob}, key: abc-123-def
⑥ Server checks: key exists? YES, already processed!
⑦ Server returns the cached result (no re-processing)
✓ $100 deducted exactly once, user safe

Failure Scenario | Retry Safe? | With Idempotency Key
Request fails before reaching server | ✓ Safe | Key not consumed, so the retry works normally
Server processing interrupted | ⚠ Unsafe | ✓ Safe: key marks the request as partial; server resumes or rejects
Response lost in transit | ⚠ Unsafe | ✓ Safe: key already processed; returns cached result

Implementation: Store keys in Redis with a TTL (24h). Key = UUID, Value = {status, result}. On request: check key → if it exists, return the cached result → if not, process + store. Make the check-and-reserve step atomic (e.g. Redis SET with NX) so two concurrent retries cannot both process. Keys expire after the TTL. Stripe accepts an Idempotency-Key header on POST requests.
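The check-then-process flow can be sketched with an in-memory dictionary standing in for Redis. `IdempotencyStore`, `transfer`, and the `BALANCES` ledger are hypothetical names for illustration, and the sketch is single-threaded: a production version would need an atomic set-if-absent and a real datastore.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class IdempotencyStore:
    """In-memory stand-in for Redis: key -> (expires_at, cached result)."""

    def __init__(self, ttl_seconds: float = 24 * 3600,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._data: Dict[str, Tuple[float, dict]] = {}

    def get(self, key: str) -> Optional[dict]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, result = entry
        if self.clock() >= expires_at:  # emulate TTL expiry
            del self._data[key]
            return None
        return result

    def put(self, key: str, result: dict) -> None:
        self._data[key] = (self.clock() + self.ttl, result)


BALANCES = {"alice": 500, "bob": 0}  # hypothetical ledger


def transfer(store: IdempotencyStore, idempotency_key: str,
             sender: str, receiver: str, amount: int) -> dict:
    """POST /transfer: apply the transfer at most once per idempotency key."""
    cached = store.get(idempotency_key)
    if cached is not None:
        return cached  # retry: replay the stored result, move no money
    BALANCES[sender] -= amount
    BALANCES[receiver] += amount
    result = {"status": "ok", "sender_balance": BALANCES[sender]}
    store.put(idempotency_key, result)
    return result
```

Calling `transfer` twice with the key "abc-123-def" moves the money once and returns the same response both times, which is exactly the behavior a client needs when it retries after a lost response.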

SOAP

XML over HTTP: the enterprise, legacy, contract-first style

Aspect | SOAP
Format | XML envelope (header + body)
Contract | WSDL (strict, machine-readable)
Security | WS-Security (signed/encrypted parts)
Use today | Banking, telco, government, legacy ERP

<Envelope>
  Header: auth, routing, WS-Security
  Body: operation + parameters

SOAP

Protocol: Strict XML envelope (Header + Body)
Contract: WSDL (machine-generated clients)
Transport: HTTP, SMTP, JMS (transport-agnostic)
Security: WS-Security (message-level encryption)
State: Can be stateful (WS-ReliableMessaging)
Verbose: up to 10-100× larger payloads than REST/JSON

REST (comparison)

Protocol: HTTP methods (GET/POST/PUT/DELETE)
Contract: OpenAPI (optional, human-friendly)
Transport: HTTP only
Security: TLS + OAuth2 (transport-level)
State: Stateless by design
Lightweight: JSON, minimal overhead
When SOAP still wins: Banking/finance (WS-Security for signed transactions), Government (strict contracts, audit trails), Legacy integration (SAP, Oracle ERP). If you're building new: use REST or gRPC. If integrating with enterprise: expect SOAP.
Why heavier than REST: XML parsing, verbose envelope, mandatory schemas, stateful sessions.

CORS

Browser-enforced cross-origin policy

Flow: Browser (app.com) ↔ Server (api.other.com)
1. Browser sends an OPTIONS preflight
2. Server answers with Access-Control-Allow-Origin (plus allowed methods/headers)
3. Browser sends the real GET / POST

Header | Direction | Purpose | Example
Origin | Request → | Browser sends the requesting origin | Origin: https://app.com
Access-Control-Allow-Origin | ← Response | Server declares which origins are allowed | * or https://app.com
Access-Control-Allow-Methods | ← Response | Allowed HTTP methods | GET, POST, PUT, DELETE
Access-Control-Allow-Headers | ← Response | Allowed custom headers | Authorization, Content-Type
Access-Control-Max-Age | ← Response | Cache preflight result (seconds) | 86400 (24 hours)
Access-Control-Allow-Credentials | ← Response | Allow cookies/auth headers | true (cannot use with * origin)

Simple vs Preflight: Simple requests (GET/HEAD/POST with only CORS-safelisted headers and a form or text/plain Content-Type) go directly: the browser adds Origin and checks the response. Anything else (PUT/DELETE, custom headers, a Content-Type such as application/json) triggers an OPTIONS preflight first. The server must respond with the allowed methods/headers before the browser sends the real request.
Common CORS mistakes: Using * with credentials (browsers reject this). Forgetting the OPTIONS handler (preflight fails → request blocked). Not caching preflight (Max-Age=0 → OPTIONS on every request = 2× latency). Reflecting Origin without validation (a security vulnerability, because it allows any site).

Quick reference:
Header | Purpose
Access-Control-Allow-Origin | Which origins may read the response
Access-Control-Allow-Methods | Allowed verbs (GET, POST, …)
Access-Control-Allow-Headers | Custom headers permitted
Access-Control-Max-Age | Preflight cache TTL (sec)
Simple requests (GET/POST with safe headers) skip preflight. Anything else → OPTIONS first.
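A server-side sketch of the header logic above, avoiding the "reflect any Origin" mistake by checking an allowlist. The `cors_headers` function, the allowlist contents, and the choice to always allow credentials are assumptions for illustration; a real service would wire this into its web framework's middleware.

```python
from typing import Dict, Optional

ALLOWED_ORIGINS = {"https://app.com"}  # hypothetical allowlist
ALLOWED_METHODS = "GET, POST, PUT, DELETE"
ALLOWED_HEADERS = "Authorization, Content-Type"


def cors_headers(origin: Optional[str], method: str) -> Dict[str, str]:
    """Return the CORS response headers for a request, or {} if not allowed."""
    if origin not in ALLOWED_ORIGINS:
        # Unknown origin: send no CORS headers, so the browser blocks the read.
        return {}
    headers = {
        # Echo the validated origin; "*" is not permitted with credentials.
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Credentials": "true",
        # Tell caches that the response varies by requesting origin.
        "Vary": "Origin",
    }
    if method == "OPTIONS":
        # Preflight: advertise what the real request may use, and cache it.
        headers["Access-Control-Allow-Methods"] = ALLOWED_METHODS
        headers["Access-Control-Allow-Headers"] = ALLOWED_HEADERS
        headers["Access-Control-Max-Age"] = "86400"
    return headers
```

Echoing only allowlisted origins keeps credentialed requests working while still refusing sites the API does not trust.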

OpenAPI / Swagger

Machine-readable spec for REST APIs

Get | From the spec
Interactive docs | Swagger UI, Redoc
Client SDKs | openapi-generator (Java/Go/TS…)
Mock server | Prism, Stoplight
Contract tests | Dredd, Schemathesis
Gateway config | Kong, AWS API Gateway import

paths:
  /users/{id}:
    get:
      parameters:
        - in: path
          name: id
          required: true          # path parameters must be required
          schema: { type: string }
      responses:
        '200':
          description: The requested user
          content:
            application/json:
              schema: { $ref: '#/components/schemas/User' }
▸ OpenAPI Development Workflow
📝 Design (write YAML spec) → 🔍 Lint (Spectral rules) → ⚙️ Generate (SDKs + docs + mocks) → ✓ Validate (runtime checks)
Single source of truth: the spec drives docs, SDKs, mocks, and runtime validation.
Workflow: design → lint (Spectral) → commit YAML → CI generates SDKs + docs → server validates against same spec.

API Versioning

Three places you can put a version

Style | Example | Pros | Cons
URL path | /v1/users | Cacheable, obvious, browseable | URL churns on bumps
Header | Accept: application/vnd.api.v2+json | Clean URLs, content-negotiation native | Hidden, harder to test in browser
Query param | ?version=2 | Quick to try | Pollutes cache keys
▸ Versioning Styles at a Glance: URL path (★ recommended) · Header (content negotiation) · Query param (quick to try in a browser)
Default to URL path (Stripe uses /v1 in the path, with dated versions selected by a Stripe-Version header; GitHub's REST API selects versions by header). Bump the major version only on breaking changes; add fields backward-compatibly otherwise.

gRPC Streaming Modes

Four interaction patterns over one HTTP/2 connection

▸ Unary (1 → 1)
▸ Server stream (1 → N): stock ticker, log tail
▸ Client stream (N → 1): file upload, batched metrics
▸ Bidirectional: chat, collab editor, RPC sessions
Each call runs on its own HTTP/2 stream, and many calls are multiplexed over one connection: header-compressed, binary framed.

Real-time Communication

Technologies for pushing data from server to client. Choose based on direction + latency needs

Short Polling

Timeline (Browser → Server):
▸ GET → ∅ (empty), ⏱ wait 5s
▸ GET → ∅ (empty), ⏱ wait 5s
▸ GET → ✓ data! 200 OK
Client asks every N sec and most responses are empty (wasteful). Here: 2 wasted requests before getting data.

Long Polling

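Timeline (Browser → Server):
▸ Browser sends a request → server waits (holds the connection) → data!
▸ Browser sends the next request → server waits → data!
Server holds each request until data is ready.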

WebSocket

Phase 1: HTTP Upgrade Handshake
▸ Browser → Server: GET /chat HTTP/1.1 + Upgrade: websocket, Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
▸ Server → Browser: 101 Switching Protocols, Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
▸ ✓ Full-duplex connection established
Phase 2: Bidirectional Real-Time Communication
▸ Server: price update
▸ Client: user action
▸ Server: instant response
Phase 3: Keep-Alive Heartbeat (every 30s)
▸ Ping → Pong
✦ Binary Frames: FIN(1b) | RSV(3b) | Opcode(4b) | Mask(1b) | Length(7-64b) | Payload
~2-14 bytes overhead per frame vs 400+ for an HTTP request · Text & Binary support
One persistent connection · 100K+ per server · low-millisecond latency
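The server derives Sec-WebSocket-Accept from the client's Sec-WebSocket-Key: append the fixed GUID defined in RFC 6455, take the SHA-1 hash, and base64-encode it. A minimal sketch follows; the function name `websocket_accept` is our own.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a given client key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")
```

For the sample key in RFC 6455, `dGhlIHNhbXBsZSBub25jZQ==`, this yields `s3pPLMBiTxaQ9kYGzzhZRbK+xOo=`. The check only proves the server understood the WebSocket handshake; it is not an authentication mechanism.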

SSE (Server-Sent Events)

Phase 1: Initial Connection
▸ Browser → Server: GET /events (Accept: text/event-stream)
▸ Server → Browser: HTTP 200 OK, connection stays open
Phase 2: Server Streams Events
▸ data: Price update $150.25
▸ data: Price update $150.30
▸ data: Price update $150.28
▸ Minutes pass...
▸ data: Price update $151.00
Connection drops (network glitch)
▸ Auto-reconnect: GET /events (Last-Event-ID: 42)
▸ Resumes from event 42 (no data loss!)
Server-only push · built-in browser auto-reconnect · replay via event IDs

WebRTC (Peer-to-Peer)

👤 Peer A ↔ 👤 Peer B
▸ Signal server: setup only (SDP/ICE)
▸ 🔒 Direct P2P: encrypted media (audio / video / screen share)
▸ ✓ No server bandwidth for media when peers connect directly (a TURN relay is needed when they cannot)
Examples: Zoom · Google Meet · Discord voice

Webhook (Server → Your Server)

Flow: Provider (Stripe, GitHub) → Your Server (/webhooks endpoint) → Your App
1. ⚡ Event fires at the provider
2. Provider sends HTTP POST with JSON payload + HMAC signature header
3. Your server validates the signature and replies 200 OK (within ~5s)
4. Your server queues the event → your app processes it async
If timeout / 5xx → provider retries (1s → 2s → 4s → backoff)
🔒 Validate HMAC · Use event_id for idempotency · Allowlist provider IPs
Tech | Direction | Latency | Best For | Guarantee
Short Polling | Client → Server (repeated) | N sec | Dashboard refresh, legacy status checks, simple health monitors | Simple, but most requests come back empty (wasteful)
Long Polling | Server holds connection | ~sec | Chat (pre-WS era), low-frequency notifications, JIRA-style updates | Near real-time, but 1 conn/client held open
WebSocket | Full duplex | ~ms | Live stock prices (Robinhood, Binance), chat (Slack), collaborative editing (Figma), gaming (Chess.com) | Persistent and bidirectional; server pushes instantly. ~100K conn/server.
SSE | Server → Client | ~ms | AI token streaming (ChatGPT, GitHub Copilot), live news tickers, CI/CD build logs | Auto-reconnect built into browser. Event ID for resuming. Text-only.
WebRTC | Peer-to-peer | Ultra-low | Video/audio calls (Zoom, Google Meet), screen sharing, Discord voice | Direct P2P with no server bandwidth for media. Encryption is mandatory (enforced by the browser).
Webhook | Server → Your Server | ~sec | Payment events (Stripe), CI/CD triggers (GitHub Actions), order updates (Shopify) | Event-driven HTTP POST. Retries on failure. No persistent connection.
Real-world: Slack (WebSocket for messaging). Figma (WebSocket for collab editing). Zoom (WebRTC for video). Robinhood (WebSocket for live stock prices). Socket.IO (auto-fallback to polling). ChatGPT (SSE for token streaming). Stripe (Webhook for payment events).
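The 1s → 2s → 4s retry schedule used for webhook redelivery (and for client reconnects) is exponential backoff. A minimal sketch of a delay generator follows; the parameter names, the 60-second cap, and the jitter scheme are assumptions for illustration, not any provider's documented policy.

```python
import random
from typing import Callable, Iterator


def backoff_delays(base: float = 1.0, factor: float = 2.0, cap: float = 60.0,
                   max_attempts: int = 6, jitter: float = 0.0,
                   rng: Callable[[], float] = random.random) -> Iterator[float]:
    """Yield the wait (in seconds) before each retry attempt.

    The delay grows as base * factor**attempt, is clamped to `cap`, and is
    then shortened by a random fraction of up to `jitter` (0.0 disables it)
    so that many clients do not all retry at the same instant.
    """
    for attempt in range(max_attempts):
        delay = min(cap, base * factor ** attempt)
        yield delay * (1.0 - jitter * rng())
```

With the defaults, `list(backoff_delays(max_attempts=3))` gives `[1.0, 2.0, 4.0]`. Adding jitter matters most after a shared outage, when every sender or client would otherwise retry in lockstep.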

WebSocket

Persistent, bidirectional communication. Perfect for real-time apps that need instant two-way data flow.

▸ Quick Summary
✓ Strengths: Bidirectional and instant (~ms) · 100K+ connections/server · Binary + text · Persistent connection
✗ Challenges: Stateful (track clients) · Needs reconnect logic · Load-balancing complexity · No auto-replay on disconnect
⚙ Best Practices: Use wss:// (TLS) · Exponential backoff · Redis Pub/Sub for scaling · Validate messages server-side
▸ Scaling with Redis Pub/Sub
Problem: Multiple servers → events isolated per server → clients on Server B miss updates from Server A.
Solution: All servers subscribe to Redis channels → event fans out to ALL servers → all clients see updates instantly.
Topology: WS server cluster (Servers 1, 2, 3, each with its own clients) ↔ Redis Pub/Sub (event broker) ← Event producer (service / worker). The broker broadcasts each event to all servers, and each server pushes to its local clients.
✦ Multi-Server Flow
1. Client on Server A sends a message
2. Server A publishes to a Redis channel
3. Redis fans out to ALL subscribed servers
4. Servers B & C push to their clients instantly
Result: All clients see the message regardless of which server they're connected to.
Load Balancing: Sticky sessions (pin client to server) vs connection migration (client reconnects) vs shared Redis store (state survives server change).
Real-world: Slack (millions of connections), Figma (collaborative editing), Binance (market data). Typical: 100K–500K connections/server, low-millisecond latency.
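The multi-server fan-out can be sketched with an in-memory broker standing in for Redis Pub/Sub. `Broker` and `WsServer` are hypothetical classes written for this illustration, and each client "socket" is just a list that collects delivered messages; no real Redis or WebSocket library is involved.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List


class Broker:
    """In-memory stand-in for Redis Pub/Sub: channel -> subscriber callbacks."""

    def __init__(self) -> None:
        self._subs: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, callback: Callable[[str], None]) -> None:
        self._subs[channel].append(callback)

    def publish(self, channel: str, message: str) -> int:
        """Deliver to every subscriber; return how many received it."""
        for callback in self._subs[channel]:
            callback(message)
        return len(self._subs[channel])


class WsServer:
    """One WebSocket server instance holding its own local clients."""

    def __init__(self, name: str, broker: Broker, channel: str = "chat") -> None:
        self.name = name
        self.broker = broker
        self.channel = channel
        self.clients: Dict[str, List[str]] = {}  # client_id -> delivered messages
        broker.subscribe(channel, self._on_broker_message)

    def connect(self, client_id: str) -> None:
        self.clients[client_id] = []

    def handle_client_message(self, message: str) -> None:
        # Do not push to local clients directly: publish, and let the broker
        # fan the event out to every server (including this one).
        self.broker.publish(self.channel, message)

    def _on_broker_message(self, message: str) -> None:
        for inbox in self.clients.values():
            inbox.append(message)
```

Because every server both publishes to and subscribes on the same channel, a message sent through Server A reaches clients on Servers B and C without the servers knowing about each other.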

Server-Sent Events (SSE): Deep Dive

One-way server-to-client push over HTTP. Built-in auto-reconnect, event IDs for replay, automatic browser handling.

Middle ground: Polling is wasteful (mostly empty requests) → WebSockets are overkill if traffic is unidirectional → SSE fits server-only push, with auto-reconnect built in.
▸ SSE Event Stream Format
✦ SSE Stream Format (Text-based)
event: priceUpdate                            ← Custom event type
id: 42                                        ← Unique ID for replay
retry: 5000                                   ← Reconnect delay (ms)
data: {"symbol": "AAPL", "price": 150.25}     ← Actual payload (JSON)
(blank line ends event)
event: notification
id: 43
data: {"message": "Market closing in 5 minutes"}
🔗 A single HTTP/1.1 connection stays open: just HTTP headers, no protocol upgrade needed
▸ Auto-Reconnection with Event Replay
① Connected, receiving events (event id=1,2,3...)
② Connection drops (network error) ❌ events sent in the meantime are missed
③ Browser auto-reconnects (about 3s by default): GET /events + Last-Event-ID: 3
④ Server replays missed events (event 4,5,6...)
✓ No data loss, provided the server retains events: the browser handles reconnection and the server replays missed events using Last-Event-ID
▸ Quick Summary
✓ Strengths: Auto-reconnect built-in · Event replay (Last-Event-ID) · Standard HTTP · 10K+ connections/server
✗ Limitations: Server-only (unidirectional) · Text-only (no binary) · 6 conn/domain (HTTP/1.1) · Proxy buffering issues · No IE support
⚙ Common Fixes: Proxy buffering: proxy_buffering off (nginx) · Connection limits: use HTTP/2 · Idle timeout: heartbeat every 30s · Reconnect storms: retry: 5000 (ms)
▸ Use Cases
Use | Example
Live prices / market data | Robinhood, Finnhub, Binance
AI token streaming | ChatGPT, GitHub Copilot
Build logs, CI/CD output | GitHub Actions, Jenkins, CircleCI
Live notifications | Gmail, Slack, email
Performance: 10K–100K connections/server, 2–5KB memory/connection, ~10 bytes overhead per event vs 400+ for an HTTP request.
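The stream format and Last-Event-ID replay described above can be sketched as two small functions. `format_sse`, `replay_after`, and the in-memory `EVENT_LOG` are hypothetical names for illustration; a real server would write these strings to an open HTTP response with Content-Type: text/event-stream.

```python
from typing import List, Optional, Tuple


def format_sse(data: str, event: Optional[str] = None,
               event_id: Optional[int] = None,
               retry_ms: Optional[int] = None) -> str:
    """Serialize one event in the text/event-stream wire format."""
    lines: List[str] = []
    if event is not None:
        lines.append(f"event: {event}")
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if retry_ms is not None:
        lines.append(f"retry: {retry_ms}")
    # Multi-line payloads are sent as one "data:" line per payload line.
    for chunk in data.split("\n"):
        lines.append(f"data: {chunk}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


# Hypothetical retained history: (event id, payload), oldest first.
EVENT_LOG: List[Tuple[int, str]] = [(i, f"price {150 + i}") for i in range(1, 7)]


def replay_after(last_event_id: Optional[str]) -> str:
    """Return every retained event newer than the client's Last-Event-ID."""
    last = int(last_event_id) if last_event_id else 0
    return "".join(
        format_sse(payload, event="priceUpdate", event_id=event_id)
        for event_id, payload in EVENT_LOG
        if event_id > last
    )
```

On reconnect the browser sends the Last-Event-ID header automatically; replay only works if the server keeps enough history to cover the gap, which is a retention decision the sketch leaves open.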

WebSocket vs SSE: Design Choices

When to pick each technology based on application requirements

▸ Feature Comparison Matrix
Feature | WebSocket | SSE | Long Polling
Communication | Bidirectional ✓ | Server only | Client asks repeatedly
Protocol Overhead | 2-14 bytes/msg | 10-50 bytes/msg | 400+ bytes/msg
Browser Support | All modern (IE10+) | All modern (no IE) | Universal
Binary Support | ✓ Yes | ✗ Text only | ✓ Yes
Auto-Reconnect | Manual required | Built-in browser | Built-in (polling loop)
Message Replay | Manual required | Built-in (Last-Event-ID) | No standard
HTTP/2 Multiplexing | No (separate connection) | ✓ Yes (single connection) | ✓ Yes (HTTP requests)
Stateful | Very (per-client state) | Mostly (stream state) | Stateless
Proxy Friendly | Sometimes blocked | ✓ Standard HTTP | ✓ Standard HTTP
Connections/Server | 100K–500K | 10K–100K | 1K–10K
Latency | ~1-50ms | ~100-200ms | ~0.5-5s
Memory/Connection | 5-20KB | 2-5KB | Minimal
▸ Decision Matrix: Which to Use?

✓ Use WebSocket When:

▸ Client ↔ Server messaging needed
▸ High frequency updates (100s/sec)
▸ Low latency critical (<10ms)
▸ Binary data needed
▸ Multiplayer games, trading apps
▸ Real-time collaboration (Figma)
▸ Chat apps (Slack, Discord)
▸ Live stock/crypto prices

✓ Use SSE When:

▸ Server → Client only (no client send)
▸ Auto-reconnect needed (free feature)
▸ Event replay on disconnect
▸ Simple browser API (EventSource)
▸ AI token streaming (ChatGPT)
▸ Build logs (GitHub Actions)
▸ Live notifications / dashboards
▸ Text/JSON data only

✓ Use Long Polling When:

▸ Serverless environment (timeouts)
▸ IE support required
▸ WebSocket blocked by proxy/firewall
▸ Simple infrequent updates OK
▸ Existing polling infrastructure
▸ Cost sensitive (minimal server state)
▸ Doesn't need real-time urgency
▸ Stateless is a hard requirement
▸ Hybrid Approaches
SSE + HTTP POST: Use SSE for server-to-client push, regular POST for client commands (e.g., Twitch chat, YouTube comments)
WebSocket + REST fallback: Try WS first, fall back to long polling if blocked (Socket.IO does this)
WebSocket + Redis: For scale. WS per client, Redis Pub/Sub for multi-server broadcast (Slack, Figma pattern)
WebSocket + Kafka: For event sourcing. All events stored in Kafka, clients subscribe via WS (high-scale trading systems)
▸ Common Failure Scenarios
Scenario | WebSocket Impact | SSE Impact
Network disconnection | Connection drops, client must reconnect + resync state | Browser auto-reconnects, replays events via Last-Event-ID ✓
Server restart | All clients lose connection, must reconnect | Clients reconnect, get missed events if stored ✓
Proxy timeouts (>60s idle) | Connection dies, must detect + reconnect | Heartbeat prevents timeout ✓
High load spike | 100K+ connections: high memory, CPU consumed | Fewer connections, easier to scale with multi-server ✓
Message ordering | Not guaranteed across reconnects | Event IDs allow ordering verification ✓
Browser refresh | Connection lost, full state resync needed | Can optionally restore via session storage + server replay ✓
Key Insight: SSE excels at resilience (auto-reconnect, event replay), WebSocket excels at latency & bidirectionality. Most real-time apps benefit from a hybrid approach: SSE for notifications, WebSocket for interactive features.