How does a web search engine index 100B+ pages and return ranked results in <200ms, combining authority scoring with relevance ranking across a tiered index serving architecture with early termination?
Core challenge: 100B+ pages. User types a query. You must find the most relevant 10 results from 100 billion candidates in under 200ms. Brute-force scan is impossible. The answer: inverted index + tiered serving + early termination.
100B+
indexed pages
petabytes of index
<200ms
query latency
P99 target
100K+
queries / second
8.5B queries/day
1000s
index shards
scatter-gather query
Architecture · Inverted Index + Tiered Serving
Layer
Role
How
Crawl
Discover and fetch pages
Priority queue by PageRank + change frequency. Politeness (robots.txt). 1000s of crawlers.
Index Build
Create inverted index
word ? [(doc_id, position, tf-idf score), ...]. MapReduce pipeline. Partitioned by doc_id range.
Tier 1 (Hot)
Top 1B most important pages
In-memory index on SSDs. Serves 90% of queries. Fast path.
Tier 2 (Warm)
Next 10B pages
SSD-based index. Queried only if Tier 1 has insufficient results.
Tier 3 (Cold)
Long-tail 90B+ pages
Disk-based. Rarely queried. Niche/old content.
Ranking
Score and order results
BM25 (text relevance) · PageRank (authority) · freshness · user signals. ML re-ranking (BERT).
Serving
Scatter-gather across shards
Query ? fan-out to 1000s of shards ? each returns top-K ? merge ? re-rank ? return top 10.
Early termination: Each shard scores documents in decreasing static rank order (PageRank). Once top-K candidates are "good enough" (score gap threshold), stop scanning. This means most queries only scan ~0.01% of the shard · not all documents containing the query terms.
Ranking signals:BM25 (term frequency, document length normalization). PageRank (link authority). Freshness (recent content boosted for news queries). Click-through rate (user engagement). BERT/LLM (semantic understanding for re-ranking top candidates).
Challenges:Index freshness · breaking news must be searchable in minutes (real-time indexing pipeline). Spam · adversarial SEO (ML classifiers). Tail queries · rare queries with few matching docs (need deep index). Personalization · same query, different intent per user.
Real-world:Google · Caffeine (real-time indexing), Hummingbird (semantic), BERT (understanding). Bing · similar tiered architecture. Elasticsearch · inverted index for enterprise search. Algolia · typo-tolerant, instant search (<50ms).
Scale Estimation
Step
Derivation
Result
Design Impact
1
Queries: 8.5B/day · 86400
~100K queries/sec
Massive parallelism needed · scatter-gather across shards
2
Index size: 100B pages · 10KB avg compressed
~1 PB index
Distributed across 10K+ machines, tiered by importance
3
Shards: 1PB · 100GB per shard
~10K shards
Each query fans out to all shards in its tier
4
Latency budget: 200ms total
~50ms per shard + 50ms merge + 100ms network
Early termination critical · can't scan full posting list
5
Crawl rate: 10B pages · 30 day refresh
~3,800 pages/sec crawled
Continuous crawl + real-time pipeline for breaking news
Resilience & Edge Cases
Failure
Impact
Recovery
Shard unavailable
Partial results (some docs missing)
Serve degraded results from available shards. Replicas take over. User rarely notices.
Spike query (trending topic)
Same query 1M times/sec
Query result cache (popular queries cached for 60s). Serve from cache without hitting index.
Index corruption
Wrong results for affected shard
Checksums on index segments. Rebuild from source documents. Serve from replica during rebuild.
Spam/SEO manipulation
Low-quality results ranked high
ML spam classifiers, link graph analysis, manual penalties. Continuous quality evaluation.
Interview Cheat Sheet
The 7 things to say for search system design
1.Inverted index · word ? list of (doc_id, position, score). Core data structure. 2.Tiered serving · Tier 1 (top 1B pages, in-memory) serves 90% of queries. Tier 2/3 for long-tail. 3.Scatter-gather · query fans out to 1000s of shards, each returns top-K, merge + re-rank 4.Early termination · scan docs in decreasing PageRank order, stop when top-K is "good enough" 5.BM25 + PageRank + freshness + ML re-ranking · multi-signal scoring pipeline 6.Real-time indexing for freshness · breaking news searchable in minutes (separate pipeline) 7.Caching at query level · popular queries cached (80% of queries are repeats)