How does Dropbox sync files across millions of devices?

🎯 Design a file sync engine: millions of devices, block-level dedup, delta sync, conflict resolution

Concepts Involved

Conflict Resolution · CDC · Message Queues · Consistency · Blob Storage

Problem Statement

How does a cloud storage platform sync file changes across millions of devices within seconds, deduplicating content at the block level and tracking per-file block maps for efficient delta sync?

Core challenge: A user edits a 1GB file on a laptop, but only 4KB changed. How do you sync just the changed blocks (not the whole file) to phone, tablet, and web within seconds, while deduplicating identical blocks across 700M+ users?

700M+ registered users · billions of files
4KB block size · content-defined chunking
<5s sync latency · change → all devices
~60% dedup savings · block-level deduplication

Architecture · Block-Level Sync Engine

Content-defined chunking → hash blocks → upload only new blocks → notify other devices

Step	What Happens	Key Technology
1. Detect change	File system watcher detects modified file	inotify (Linux), FSEvents (macOS), ReadDirectoryChanges (Windows)
2. Chunk file	Split file into variable-size blocks (~4KB on average) using content-defined chunking (Rabin fingerprint)	Rolling hash · boundaries shift with content, not fixed offsets
3. Hash blocks	SHA-256 each block → check which are new (not in server's block store)	Client sends hash list → server returns "need" list
4. Upload delta	Upload only NEW blocks (not already on server)	Typically 1-5% of file for small edits → massive bandwidth savings
5. Update metadata	Update file's block map: [block_hash_1, block_hash_2, ...]	Metadata DB (file → ordered list of block hashes)
6. Notify devices	Push notification to all user's other devices: "file X changed"	Long-poll / WebSocket notification channel
7. Pull delta	Other devices fetch new block map, download only missing blocks, reconstruct file	Same dedup: if block already cached locally, skip download
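
The upload and pull steps in the table above can be sketched as follows. This is a minimal in-memory illustration, not Dropbox's actual protocol: `Server`, `need_list`, `put_blocks`, `commit`, `sync_up`, and `sync_down` are hypothetical names, and the file is split at fixed 4KB offsets only to keep the example short (a real client would use content-defined chunking).

```python
import hashlib

BLOCK_SIZE = 4 * 1024  # assumption: fixed-size split for brevity, not content-defined


def split_blocks(data: bytes) -> list[bytes]:
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


def block_hash(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()


class Server:
    """Hypothetical in-memory stand-in for the block store and metadata DB."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}          # hash -> block content
        self.block_maps: dict[str, list[str]] = {}  # path -> ordered block hashes

    def need_list(self, hashes: list[str]) -> list[str]:
        """Step 3: return the hashes the server does not have yet."""
        need: list[str] = []
        for h in hashes:
            if h not in self.blocks and h not in need:
                need.append(h)
        return need

    def put_blocks(self, blocks: dict[str, bytes]) -> None:
        """Step 4: store uploaded blocks after verifying their hashes."""
        for h, content in blocks.items():
            if block_hash(content) != h:
                raise ValueError(f"hash mismatch for block {h}")
            self.blocks[h] = content

    def commit(self, path: str, block_map: list[str]) -> None:
        """Step 5: update the file's block map once every block is present."""
        if any(h not in self.blocks for h in block_map):
            raise ValueError("block map references a missing block")
        self.block_maps[path] = list(block_map)


def sync_up(server: Server, path: str, data: bytes) -> int:
    """Upload only the blocks the server needs; return how many were sent."""
    blocks = split_blocks(data)
    hashes = [block_hash(b) for b in blocks]
    by_hash = dict(zip(hashes, blocks))
    need = server.need_list(hashes)
    server.put_blocks({h: by_hash[h] for h in need})
    server.commit(path, hashes)
    return len(need)


def sync_down(server: Server, path: str, cache: dict[str, bytes]) -> tuple[bytes, int]:
    """Step 7: rebuild the file, fetching only blocks missing from the local cache."""
    fetched = 0
    parts: list[bytes] = []
    for h in server.block_maps[path]:
        if h not in cache:
            cache[h] = server.blocks[h]
            fetched += 1
        parts.append(cache[h])
    return b"".join(parts), fetched
```

In this sketch, changing one byte of an already-synced file makes `sync_up` send a single block, and a second device with a warm cache fetches only that block in `sync_down`.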

Content-defined chunking: Unlike fixed-size blocks, a Rabin fingerprint (rolling hash) places boundaries based on content. Inserting 1 byte at the start of a file changes only the block that contains the insertion; later boundaries land on the same content, so the remaining blocks keep their hashes. Small edits therefore mean small uploads regardless of edit position.
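
A rough sketch of the idea, under stated assumptions: it uses a simple polynomial rolling hash over a 48-byte window rather than a true Rabin fingerprint (which works with polynomials over GF(2)), and the window size, mask, and min/max chunk limits are illustrative values, not figures from any production system.

```python
import hashlib

WINDOW = 48                    # bytes covered by the rolling hash (assumed value)
MASK = (1 << 12) - 1           # cut when the low 12 bits are zero: ~4KB between cuts
MIN_CHUNK = 1024               # avoid tiny chunks (assumed value)
MAX_CHUNK = 16 * 1024          # force a cut eventually (assumed value)
BASE = 257
MOD = (1 << 61) - 1
BASE_POW = pow(BASE, WINDOW - 1, MOD)  # weight of the oldest byte in the window


def chunk(data: bytes) -> list[bytes]:
    """Split data at positions chosen by content, not by fixed offsets."""
    chunks: list[bytes] = []
    start = 0  # index where the current chunk begins
    h = 0      # rolling hash of the last WINDOW bytes of the current chunk
    for i, byte in enumerate(data):
        if i - start >= WINDOW:
            # Slide the window: drop the byte that falls out of it.
            h = (h - data[i - WINDOW] * BASE_POW) % MOD
        h = (h * BASE + byte) % MOD
        length = i - start + 1
        at_boundary = length >= MIN_CHUNK and (h & MASK) == 0
        if at_boundary or length >= MAX_CHUNK:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def new_chunk_count(old: bytes, new: bytes) -> int:
    """Count chunks of `new` whose SHA-256 does not appear among chunks of `old`."""
    known = {hashlib.sha256(c).digest() for c in chunk(old)}
    return sum(1 for c in chunk(new) if hashlib.sha256(c).digest() not in known)
```

Because the cut decision depends only on the last `WINDOW` bytes, prepending a byte to a file usually changes just the first chunk; `new_chunk_count(data, b"X" + data)` stays small, whereas with fixed-offset blocks every block would change.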

Deduplication: Same block content across ANY user = stored once. A popular PDF shared by 1M users = stored once, referenced 1M times. Reference counting tracks when blocks can be garbage-collected. Saves ~60% storage globally.
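
A minimal sketch of content-addressed storage with reference counting, assuming a single-process in-memory store. `BlockStore`, `put_file`, and `delete_file` are hypothetical names; a real system would keep the counts in a durable metadata DB and garbage-collect asynchronously.

```python
import hashlib


class BlockStore:
    """Hypothetical content-addressed store: identical blocks are kept once."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}     # hash -> content (stored once)
        self.refcount: dict[str, int] = {}     # hash -> number of references
        self.files: dict[str, list[str]] = {}  # file id -> ordered block hashes

    def put_file(self, file_id: str, blocks: list[bytes]) -> None:
        new_map: list[str] = []
        for content in blocks:
            h = hashlib.sha256(content).hexdigest()
            if h not in self.blocks:
                self.blocks[h] = content  # first time this content is seen
            self.refcount[h] = self.refcount.get(h, 0) + 1
            new_map.append(h)
        # Release the old version only after the new references are counted,
        # so blocks shared by both versions never drop to zero in between.
        old_map = self.files.get(file_id)
        self.files[file_id] = new_map
        if old_map is not None:
            self._release(old_map)

    def delete_file(self, file_id: str) -> None:
        self._release(self.files.pop(file_id))

    def _release(self, hashes: list[str]) -> None:
        for h in hashes:
            self.refcount[h] -= 1
            if self.refcount[h] == 0:
                # No file references this block any more: safe to collect.
                del self.refcount[h]
                del self.blocks[h]
```

Two users storing the same PDF add references but no new blocks; the blocks disappear only when the last referencing file is deleted.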

Conflict resolution: Two devices edit same file offline → both upload → conflict detected (divergent block maps). Resolution: keep both versions as "file.txt" and "file (conflicted copy).txt". User manually merges. For collaborative docs: use OT/CRDT instead.
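
One common way to detect divergent block maps is for each upload to carry the revision it was based on. The sketch below assumes that scheme; `MetadataStore`, `commit`, and `conflicted_name` are hypothetical names, and the exact naming format of the copy is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass
class FileEntry:
    revision: int
    block_map: list[str]


def conflicted_name(path: str, device: str, date: str) -> str:
    """Build a sibling name such as "file (laptop's conflicted copy 2024-01-01).txt"."""
    stem, ext = os.path.splitext(path)
    return f"{stem} ({device}'s conflicted copy {date}){ext}"


class MetadataStore:
    """Hypothetical metadata DB: path -> current revision and block map."""

    def __init__(self) -> None:
        self.files: dict[str, FileEntry] = {}

    def commit(self, path: str, base_revision: int, block_map: list[str],
               device: str, date: str) -> str:
        """Apply an upload; return the path the content was actually saved under."""
        current = self.files.get(path)
        current_revision = current.revision if current else 0
        if base_revision == current_revision:
            # The client edited the latest version: fast-forward.
            self.files[path] = FileEntry(current_revision + 1, list(block_map))
            return path
        if current is not None and current.block_map == block_map:
            # Both devices ended up with identical content: nothing to resolve.
            return path
        # Divergent edits: keep the server version and save this one alongside it.
        copy_path = conflicted_name(path, device, date)
        self.files[copy_path] = FileEntry(1, list(block_map))
        return copy_path
```

The first upload based on the current revision wins the original name; a later upload based on the same, now stale, revision is stored as a conflicted copy for the user to merge.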

Real-world: Dropbox · Magic Pocket (custom block store, exabytes). Google Drive · similar chunking for large files. OneDrive · differential sync with BITS protocol. rsync · rolling checksum algorithm (an inspiration for many sync engines).

Resilience & Edge Cases

Failure	Impact	Recovery
Upload interrupted mid-file	Partial blocks uploaded	Resumable upload: track last successful block. Resume from there on reconnect.
Conflict (offline edits)	Two versions of same file	Create "conflicted copy" with device name + timestamp. User resolves.
Block store corruption	File can't be reconstructed	Checksum verification on every read. Replicate blocks across AZs (3× redundancy).
Notification service down	Other devices don't know about changes	Periodic full-sync poll (every 5 min) as fallback. Catch up on reconnect.
Large file (10GB+)	Chunking takes time, upload slow	Background chunking. Prioritize small files. Parallel block uploads (4 concurrent).
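
The resumable-upload recovery in the table can be sketched as below. It is a simplified, sequential illustration: `send` is a hypothetical callable standing in for the network call and is assumed to raise `ConnectionError` on failure, and the set of acknowledged block indices is assumed to survive across reconnects.

```python
from typing import Callable


def resumable_upload(blocks: list[bytes],
                     send: Callable[[int, bytes], None],
                     acked: set[int]) -> None:
    """Send blocks in order, skipping any index already acknowledged."""
    for index, block in enumerate(blocks):
        if index in acked:
            continue
        send(index, block)  # assumed to raise ConnectionError if the link drops
        acked.add(index)    # record progress only after a successful send


def upload_with_retries(blocks: list[bytes],
                        send: Callable[[int, bytes], None],
                        max_attempts: int = 5) -> set[int]:
    """Retry after connection drops, resuming from the recorded progress."""
    acked: set[int] = set()
    for _ in range(max_attempts):
        try:
            resumable_upload(blocks, send, acked)
            return acked
        except ConnectionError:
            continue  # reconnect and resume; acknowledged blocks are not resent
    raise ConnectionError(f"upload failed after {max_attempts} attempts")
```

Only the block that was in flight when the connection dropped is sent again; blocks acknowledged before the failure are skipped on resume.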

Interview Cheat Sheet

The 7 things to say for file sync design

1. Content-defined chunking (Rabin fingerprint) · boundaries based on content, not fixed offsets
2. Block-level dedup · SHA-256 per block, upload only new blocks (~60% storage savings)
3. Delta sync · edit 4KB in 1GB file → upload only the changed block(s), not entire file
4. File = ordered list of block hashes · metadata is tiny, blocks are content-addressed
5. Notification channel (long-poll/WS) · push "file changed" to other devices instantly
6. Conflict = divergent block maps · create "conflicted copy", user resolves manually
7. Reference counting for GC · block deleted only when zero files reference it
