System Design Case Study

How does Google Ads select the best ad from 10M+ candidates in <100ms?

Design an ad serving system: 10M candidates, <100ms selection, 10B req/day, real-time budget pacing

Problem Statement

How does an ad serving platform select the optimal ad from 10M+ candidates in under 100ms, ranking by predicted CTR × bid × relevance, while serving 10B requests/day and enforcing real-time budget pacing?

Core challenge: Scoring all 10M ads per request within 100ms is infeasible, so the system uses a funnel: fast retrieval (10M → 10K) → lightweight scoring (10K → 100) → heavy ML ranking (100 → 10) → auction. Throughout, budgets must be tracked in real time across billions of requests.

10M+ candidate ads
<100ms end-to-end selection
10B requests / day
Real-time budget pacing

Architecture

Layer 1: Retrieval and pre-scoring (10M → 10K → 100)

Ad request: user + context (page, geo, device, user interests); 10B req/day.
Retrieval (~10ms): 10M → 10K candidates. Inverted index by keyword; targeting on geo, demographics, and interest; active campaigns only; O(1) lookup per keyword.
Pre-scoring (~20ms): 10K → 100 candidates. Lightweight ML (logistic regression) on cheap features only; budget eligibility filter; frequency cap check.
Serve ad: render creative, track impression, click redirect URL, viewability beacon.

Layer 2: Ranking and auction (100 → 10 → winner)

Deep ranking (neural net, ~50ms): 100 → top 10 candidates. Score = pCTR × bid × quality. GPU inference with batched requests; user embeddings + ad embeddings; cross-features and attention layers.
Auction engine (~5ms): second-price / VCG mechanism; winner pays second bid + $0.01; slot allocation (top N positions); reserve price enforcement; ad quality threshold gate.
Ad Rank formula: pCTR × bid × quality_score.
Winner selection: highest Ad Rank wins slot 1; price = minimum needed to beat #2.

Layer 3: Feedback (budget pacing, CTR retraining, impression/click tracking)

Budget pacing controller: spend the daily budget evenly across 24h. Overspending → reduce bid multiplier; underspending → boost participation. Distributed counters (Redis/Spanner); throttle/boost updated every 15 min; prevents budget exhaustion by noon; the budget filter feeds pre-scoring.
CTR model retraining: click/conversion data → feature pipeline; impression logs + click logs → join; retrain model hourly; A/B test new model vs. current; canary deploy → full rollout; continuous improvement loop.
Impression/click tracking: impression beacon → Kafka; click redirect → Kafka; real-time analytics pipeline; conversion attribution (30-day); fraud detection (invalid clicks); feeds the budget and CTR systems.
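The retrieval stage described above (inverted index by keyword, targeting filters, active campaigns only) can be illustrated with a small sketch. This is a toy in-memory version under stated assumptions: the `AdIndex` class, its method names, and the `("kw", ...)` / `("geo", ...)` key scheme are hypothetical, and a production index would be sharded and far more compact.

```python
from collections import defaultdict


class AdIndex:
    """Toy inverted index: targeting key -> set of ad ids (hypothetical design)."""

    def __init__(self):
        self.postings = defaultdict(set)  # e.g. ("kw", "shoes") -> {"a1", "a2"}
        self.active = set()               # ads whose campaigns are currently active

    def add(self, ad_id, keywords, geos, active=True):
        for kw in keywords:
            self.postings[("kw", kw)].add(ad_id)
        for geo in geos:
            self.postings[("geo", geo)].add(ad_id)
        if active:
            self.active.add(ad_id)

    def retrieve(self, keywords, geo, limit=10_000):
        # One hash lookup per request keyword, unioned together.
        by_keyword = set()
        for kw in keywords:
            by_keyword |= self.postings.get(("kw", kw), set())
        # Keep only ads that also match the geo target and are active.
        matched = by_keyword & self.postings.get(("geo", geo), set()) & self.active
        # Sorted only to make the output deterministic in this sketch.
        return sorted(matched)[:limit]
```

Each keyword costs one dictionary lookup, which is where the O(1)-per-keyword figure comes from; the set intersections then apply the targeting and active-campaign filters.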
Funnel architecture: Retrieval (inverted index, 10M → 10K in ~10ms) → Pre-scoring (lightweight model, 10K → 100 in ~20ms) → Deep ranking (neural net, 100 → 10 in ~50ms) → Auction (top 10 → winner in ~5ms). The stage budgets total ~85ms, which stays under the 100ms end-to-end target.
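As a rough illustration of the funnel, the sketch below chains the narrowing stages and records the per-stage latency budgets quoted above. The function name `select_ad_slate`, its parameters, and the `STAGE_BUDGET_MS` table are assumptions made for this example; the scoring callables stand in for the lightweight and deep models.

```python
import heapq
from typing import Callable, Iterable, List, TypeVar

Ad = TypeVar("Ad")

# Per-stage latency budgets from the text, in milliseconds.
STAGE_BUDGET_MS = {"retrieval": 10, "pre_scoring": 20, "deep_ranking": 50, "auction": 5}


def select_ad_slate(
    retrieve: Callable[[], Iterable[Ad]],
    cheap_score: Callable[[Ad], float],
    deep_score: Callable[[Ad], float],
    prescore_k: int = 100,
    rank_k: int = 10,
) -> List[Ad]:
    """Narrow candidates stage by stage; the auction then picks from the result."""
    candidates = retrieve()  # index lookup: 10M -> ~10K
    # Cheap model scores every retrieved candidate, keeps the best prescore_k.
    shortlist = heapq.nlargest(prescore_k, candidates, key=cheap_score)
    # Expensive model only ever sees the shortlist.
    return heapq.nlargest(rank_k, shortlist, key=deep_score)
```

The point of the design is that the expensive scorer runs on at most `prescore_k` items per request, regardless of how large the ad corpus is.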
Ad Rank formula: pCTR × bid × quality_score. The predicted CTR comes from a deep learning model (user features + ad features + context). The quality score penalizes low-relevance ads. The advertiser pays a second price: just enough to beat the next ad.
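A minimal sketch of Ad Rank ordering with second-price-style pricing follows. It assumes that "just enough to beat the next ad" means solving pCTR × price × quality_score = runner-up Ad Rank for the winner's price and then adding a $0.01 increment; the `Candidate` and `run_auction` names, the single-slot simplification, and the reserve handling are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Candidate:
    ad_id: str
    bid: float      # max cost per click the advertiser will pay
    pctr: float     # predicted click-through rate
    quality: float  # quality score

    @property
    def ad_rank(self) -> float:
        return self.pctr * self.bid * self.quality


def run_auction(
    candidates: List[Candidate], reserve_price: float = 0.0, increment: float = 0.01
) -> Optional[Tuple[Candidate, float]]:
    """Single-slot auction: highest Ad Rank wins, pays the minimum to beat #2."""
    eligible = [c for c in candidates if c.bid >= reserve_price and c.ad_rank > 0]
    if not eligible:
        return None
    ranked = sorted(eligible, key=lambda c: c.ad_rank, reverse=True)
    winner = ranked[0]
    if len(ranked) > 1:
        # Smallest bid that would still match the runner-up's Ad Rank.
        price = ranked[1].ad_rank / (winner.pctr * winner.quality) + increment
    else:
        price = reserve_price
    # Never charge below the reserve or above the winner's own bid.
    price = min(max(price, reserve_price), winner.bid)
    return winner, round(price, 2)
```

Under these assumptions, an ad with a lower bid can still win if its predicted CTR and quality score are high enough, and it pays less than its own bid.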
Anti-patterns: Scoring all 10M ads with the heavy model is impossible in 100ms. Without budget pacing, a budget can be exhausted by noon, leaving no evening impressions. A first-price auction encourages advertisers to game (shade) their bids, which can reduce revenue. Stale CTR predictions lose revenue through poor ranking.
Budget pacing: Each campaign has a daily budget. A pacing controller adjusts the campaign's participation rate throughout the day so that spend stays even. If 50% of the budget is spent by noon with 50% of the day remaining, the controller maintains the rate; if the campaign is overspending, it throttles; if underspending, it boosts participation. Distributed counters track spend in near real time.
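One simple way to realize this kind of controller is a proportional update that compares actual spend with the even-pacing target. The sketch below is an assumption-laden illustration, not a description of any production pacing algorithm: `update_participation`, the `gain` and `floor` parameters, and the proportional rule itself are all hypothetical choices.

```python
def update_participation(
    rate: float,
    daily_budget: float,
    spent: float,
    elapsed_fraction: float,
    gain: float = 0.5,
    floor: float = 0.01,
) -> float:
    """Return the new participation rate (fraction of eligible auctions entered).

    elapsed_fraction is the share of the day that has passed, in [0, 1].
    """
    if spent >= daily_budget:
        return 0.0  # budget exhausted: stop participating
    target = daily_budget * elapsed_fraction  # spend expected under even pacing
    if target <= 0:
        return rate  # start of day: nothing to compare against yet
    # Positive when underspending, negative when overspending.
    error = (target - spent) / target
    new_rate = rate * (1 + gain * error)
    # Keep a small floor so the campaign can recover, and cap at full participation.
    return max(floor, min(1.0, new_rate))
```

Called on a fixed schedule with the latest counter value, this keeps the rate unchanged when spend is on target, lowers it when the campaign is ahead of pace, and raises it when behind.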

Interview Cheat Sheet

1. Funnel approach: retrieval → pre-scoring → deep ranking → auction (10M → 10K → 100 → 10 → 1)
2. Ad Rank: pCTR × bid × quality_score, with a second-price auction for pricing
3. Budget pacing: spend evenly across the day; throttle/boost participation rate
4. CTR prediction: deep neural net with user/ad/context features, retrained hourly
5. Latency budget: retrieval 10ms + pre-score 20ms + deep rank 50ms + auction 5ms = 85ms (<100ms)
6. Feedback loop: click/conversion data feeds back to retrain the CTR model continuously
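The feedback loop in item 6 depends on joining impression logs with click logs to produce labeled training rows. A minimal sketch of that join is shown below, assuming each event is a dict carrying a shared `impression_id`; the field names and the `build_training_rows` / `observed_ctr` helpers are hypothetical, and a real pipeline would run this as a streaming or batch job with an attribution window.

```python
from typing import Dict, Iterable, List


def build_training_rows(
    impressions: Iterable[Dict], clicks: Iterable[Dict]
) -> List[Dict]:
    """Label each impression with clicked=1 if a matching click event exists."""
    clicked_ids = {click["impression_id"] for click in clicks}
    return [
        {**imp, "clicked": int(imp["impression_id"] in clicked_ids)}
        for imp in impressions
    ]


def observed_ctr(rows: List[Dict]) -> float:
    """Empirical CTR over labeled rows; 0.0 when there are no impressions."""
    if not rows:
        return 0.0
    return sum(row["clicked"] for row in rows) / len(rows)
```

The labeled rows would then feed the feature pipeline for the next retraining run.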