System Design Lesson Plan
How to think about building systems that handle real traffic — from one server to many, and all the tradeoffs along the way.
Lesson 1: Single-Server Foundations
Section titled “Lesson 1: Single-Server Foundations”Goal: Understand what one machine can handle and learn to measure its limits before reaching for distributed solutions.
Concepts
Section titled “Concepts”Every system starts on one server. Before you distribute anything, know what one box can do — CPU cores, memory, disk I/O, network bandwidth, and file descriptor limits. Back-of-envelope estimation is the most important system design skill: if you need 100 requests/sec and one server handles 1,000, you don’t need a load balancer yet. Measure first, scale second.
Exercises
Section titled “Exercises”-
Check your machine’s limits
Terminal window # File descriptor limit (max open connections)ulimit -n# CPU cores availablesysctl -n hw.ncpu# Total memorysysctl -n hw.memsize | awk '{print $1/1024/1024/1024 " GB"}'# Current resource usagetop -l 1 -n 0 | head -10 -
Build a minimal HTTP server and load test it
server.py from http.server import HTTPServer, BaseHTTPRequestHandlerimport json, timeclass Handler(BaseHTTPRequestHandler):def do_GET(self):# Simulate a small computationdata = {"status": "ok", "timestamp": time.time()}body = json.dumps(data).encode()self.send_response(200)self.send_header("Content-Type", "application/json")self.send_header("Content-Length", str(len(body)))self.end_headers()self.wfile.write(body)def log_message(self, format, *args):pass # Suppress per-request loggingif __name__ == "__main__":server = HTTPServer(("127.0.0.1", 8080), Handler)print("Serving on http://127.0.0.1:8080")server.serve_forever()Terminal window python3 server.py &# Load test with ab (Apache Bench, pre-installed on macOS)ab -n 1000 -c 10 http://127.0.0.1:8080/# Note: Requests per second, Time per request, Failed requestskill %1 -
Back-of-envelope estimation
Scenario: Photo sharing app, 10M users, 10% daily activeDaily active users: 1,000,000Uploads per user per day: 2Total uploads per day: 2,000,000Uploads per second: 2,000,000 / 86,400 ≈ 23/secAverage photo size: 2 MBDaily storage: 2,000,000 × 2 MB = 4 TB/dayAnnual storage: 4 TB × 365 ≈ 1.5 PBRead:write ratio: 10:1 (people view more than they post)Reads per second: 230/secQuestion: Can one server handle 23 writes/sec and 230 reads/sec?With SSDs at ~10,000 IOPS -- yes, if each operation is one I/O. -
Compare threaded vs single-process capacity
threaded_server.py from http.server import HTTPServer, BaseHTTPRequestHandlerfrom socketserver import ThreadingMixInimport json, timeclass Handler(BaseHTTPRequestHandler):def do_GET(self):time.sleep(0.01) # Simulate 10ms of workdata = {"status": "ok"}body = json.dumps(data).encode()self.send_response(200)self.send_header("Content-Type", "application/json")self.send_header("Content-Length", str(len(body)))self.end_headers()self.wfile.write(body)def log_message(self, format, *args):passclass ThreadedHTTPServer(ThreadingMixIn, HTTPServer):passif __name__ == "__main__":server = ThreadedHTTPServer(("127.0.0.1", 8080), Handler)print("Threaded server on http://127.0.0.1:8080")server.serve_forever()Terminal window # Test the threaded versionpython3 threaded_server.py &ab -n 1000 -c 50 http://127.0.0.1:8080/# Compare Requests/sec with the single-threaded versionkill %1
Checkpoint
Section titled “Checkpoint”Run the single-threaded and threaded servers. Load test each with ab at
concurrency levels of 1, 10, 50, and 100. Record the requests/sec and mean
latency for each. Explain the inflection point where the single-threaded server
falls behind.
Lesson 2: Caching
Section titled “Lesson 2: Caching”Goal: Understand why caching works, implement it at the application layer with Redis, and reason about invalidation strategies.
Concepts
Section titled “Concepts”Caching exploits locality of reference — a small fraction of data serves a large fraction of requests (the Pareto distribution). Caches exist at every level: CPU L1/L2/L3, application memory, Redis or Memcached, CDN edge servers. The two hardest problems in computer science are cache invalidation and naming things. Common strategies include TTL (expire after N seconds), write-through (update cache on every write), and write-behind (batch writes asynchronously). Choose based on how stale your reads can tolerate being.
Exercises
Section titled “Exercises”-
Install Redis and explore it
Terminal window brew install redisbrew services start redis# Basic operationsredis-cli SET greeting "hello world"redis-cli GET greeting# Set a key with TTL (expires in 10 seconds)redis-cli SET temp_key "gone soon" EX 10redis-cli TTL temp_keysleep 11redis-cli GET temp_key # Returns (nil) -
Build a Python app with and without caching
cache_demo.py import time, json, redis, hashlibr = redis.Redis(host="localhost", port=6379, decode_responses=True)def expensive_query(user_id):"""Simulate a slow database query."""time.sleep(0.1) # 100ms "database" callreturn {"user_id": user_id, "name": f"User {user_id}", "score": user_id * 42}def get_user_no_cache(user_id):return expensive_query(user_id)def get_user_with_cache(user_id, ttl=30):cache_key = f"user:{user_id}"cached = r.get(cache_key)if cached:return json.loads(cached)result = expensive_query(user_id)r.setex(cache_key, ttl, json.dumps(result))return result# Benchmark: 100 lookups for 10 users (high repeat rate)user_ids = [i % 10 for i in range(100)]start = time.time()for uid in user_ids:get_user_no_cache(uid)no_cache_time = time.time() - startstart = time.time()for uid in user_ids:get_user_with_cache(uid)cache_time = time.time() - startprint(f"Without cache: {no_cache_time:.2f}s")print(f"With cache: {cache_time:.2f}s")print(f"Speedup: {no_cache_time / cache_time:.1f}x")Terminal window pip3 install redispython3 cache_demo.py -
Observe cache hit rate
cache_stats.py import redis, json, random, timer = redis.Redis(host="localhost", port=6379, decode_responses=True)r.flushdb() # Start cleanhits, misses = 0, 0def cached_lookup(key):global hits, missescached = r.get(key)if cached:hits += 1return json.loads(cached)misses += 1# Simulate DB lookupresult = {"key": key, "value": random.randint(1, 100)}r.setex(key, 60, json.dumps(result))return result# Zipf-like distribution: some keys accessed far more than otherspopular_keys = [f"item:{i}" for i in range(5)]rare_keys = [f"item:{i}" for i in range(5, 100)]for _ in range(1000):if random.random() < 0.8: # 80% of requests hit popular itemskey = random.choice(popular_keys)else:key = random.choice(rare_keys)cached_lookup(key)total = hits + missesprint(f"Hits: {hits}, Misses: {misses}")print(f"Hit rate: {hits/total*100:.1f}%")Terminal window python3 cache_stats.py# Expect ~90%+ hit rate due to the skewed access pattern -
Implement cache invalidation
invalidation.py import redis, jsonr = redis.Redis(host="localhost", port=6379, decode_responses=True)def write_through(user_id, data):"""Update DB and cache atomically."""# write to "database" (simulated)print(f"DB write: user:{user_id} -> {data}")# update cache immediatelyr.setex(f"user:{user_id}", 300, json.dumps(data))def invalidate_on_write(user_id, data):"""Update DB, delete from cache. Next read repopulates."""print(f"DB write: user:{user_id} -> {data}")r.delete(f"user:{user_id}")# Write-through: cache always has fresh datawrite_through(1, {"name": "Alice", "score": 100})print("Cache after write-through:", r.get("user:1"))# Invalidation: cache is empty until next readinvalidate_on_write(2, {"name": "Bob", "score": 200})print("Cache after invalidation:", r.get("user:2")) # NoneTerminal window python3 invalidation.py
Checkpoint
Section titled “Checkpoint”Run redis-cli MONITOR in one terminal while running cache_demo.py in
another. Watch the SET and GET commands flow through Redis. Then change the TTL
to 2 seconds, re-run the benchmark, and explain how the hit rate and performance
change.
Lesson 3: Load Balancing and Reverse Proxies
Section titled “Lesson 3: Load Balancing and Reverse Proxies”Goal: Distribute traffic across multiple server instances and understand the algorithms that decide where each request goes.
Concepts
Section titled “Concepts”A load balancer sits between clients and servers, forwarding each request to one of several backends. Round-robin distributes evenly. Least-connections sends to the server with the fewest active requests. Consistent hashing maps requests to servers based on a key, preserving affinity when servers join or leave. Reverse proxies also handle TLS termination, compression, and static file serving. Health checks remove failing backends from the pool automatically.
Exercises
Section titled “Exercises”-
Run multiple Python workers
worker.py import sys, json, timefrom http.server import HTTPServer, BaseHTTPRequestHandlerPORT = int(sys.argv[1])WORKER_ID = sys.argv[1]class Handler(BaseHTTPRequestHandler):def do_GET(self):data = {"worker": WORKER_ID, "time": time.time()}body = json.dumps(data).encode()self.send_response(200)self.send_header("Content-Type", "application/json")self.send_header("Content-Length", str(len(body)))self.end_headers()self.wfile.write(body)def log_message(self, format, *args):passHTTPServer(("127.0.0.1", PORT), Handler).serve_forever()Terminal window python3 worker.py 8081 &python3 worker.py 8082 &python3 worker.py 8083 &# Verify each respondscurl -s http://127.0.0.1:8081 | python3 -m json.toolcurl -s http://127.0.0.1:8082 | python3 -m json.toolcurl -s http://127.0.0.1:8083 | python3 -m json.tool -
Set up nginx as a load balancer
Terminal window brew install nginx# Save as /tmp/lb-nginx.confevents { worker_connections 128; }http {upstream backend {server 127.0.0.1:8081;server 127.0.0.1:8082;server 127.0.0.1:8083;}server {listen 8080;location / {proxy_pass http://backend;proxy_set_header X-Real-IP $remote_addr;}location /health {return 200 'ok';add_header Content-Type text/plain;}}}Terminal window nginx -c /tmp/lb-nginx.conf# Hit the load balancer repeatedly -- observe round-robinfor i in $(seq 1 9); docurl -s http://127.0.0.1:8080 | python3 -c "import sys,json; print(json.load(sys.stdin)['worker'])"done# Output: 8081, 8082, 8083, 8081, 8082, 8083, ...nginx -s stop -
Implement consistent hashing in Python
consistent_hash.py import hashlibclass ConsistentHashRing:def __init__(self, nodes, replicas=100):self.ring = {}self.sorted_keys = []for node in nodes:for i in range(replicas):key = self._hash(f"{node}:{i}")self.ring[key] = nodeself.sorted_keys.append(key)self.sorted_keys.sort()def _hash(self, key):return int(hashlib.md5(key.encode()).hexdigest(), 16)def get_node(self, key):h = self._hash(key)for ring_key in self.sorted_keys:if h <= ring_key:return self.ring[ring_key]return self.ring[self.sorted_keys[0]]ring = ConsistentHashRing(["server-1", "server-2", "server-3"])# Same key always maps to the same serverfor key in ["user:100", "user:200", "user:300", "user:100"]:print(f"{key} -> {ring.get_node(key)}")# Remove a server -- only some keys movering2 = ConsistentHashRing(["server-1", "server-3"])moved = 0for i in range(1000):key = f"item:{i}"if ring.get_node(key) != ring2.get_node(key):moved += 1print(f"Removed server-2: {moved}/1000 keys moved ({moved/10:.1f}%)")Terminal window python3 consistent_hash.py# Expect ~33% of keys to move (only keys assigned to server-2) -
Observe graceful degradation
Terminal window # Start workers and nginx (from exercises 1 and 2)python3 worker.py 8081 &python3 worker.py 8082 &python3 worker.py 8083 &nginx -c /tmp/lb-nginx.conf# Kill one workerkill %2 # kills port 8082# nginx detects the failure and routes to remaining workersfor i in $(seq 1 6); docurl -s http://127.0.0.1:8080 | python3 -c "import sys,json; print(json.load(sys.stdin)['worker'])"done# Output: only 8081 and 8083# Clean upnginx -s stopkill %1 %3 2>/dev/null
Checkpoint
Section titled “Checkpoint”Set up three Python workers behind nginx. Use ab -n 1000 -c 20 to load test
through the balancer. Kill one worker mid-test and observe how nginx handles it.
Check the nginx error log to see the health check failure.
Lesson 4: Databases at Scale
Section titled “Lesson 4: Databases at Scale”Goal: Understand how databases handle growing read/write loads through indexing, replication, and partitioning.
Concepts
Section titled “Concepts”The first scaling lever is indexing — a B-tree index turns an O(n) table scan into an O(log n) lookup. The second is read replicas — direct writes to a primary, reads to copies. The third is sharding — split data across multiple databases by a partition key (user_id, region). Vertical partitioning separates columns; horizontal partitioning separates rows. Connection pooling prevents thousands of clients from each holding a database connection open.
Exercises
Section titled “Exercises”-
Measure the impact of indexes
Terminal window sqlite3 /tmp/scale_test.db <<'SQL'-- Create a table with 100k rowsCREATE TABLE orders (id INTEGER PRIMARY KEY,user_id INTEGER NOT NULL,product TEXT NOT NULL,amount REAL NOT NULL,created_at TEXT NOT NULL);-- Insert 100k rowsWITH RECURSIVE seq(i) AS (VALUES(1) UNION ALL SELECT i+1 FROM seq WHERE i < 100000)INSERT INTO orders (user_id, product, amount, created_at)SELECTabs(random()) % 1000,'product-' || (abs(random()) % 50),(abs(random()) % 10000) / 100.0,datetime('2024-01-01', '+' || (abs(random()) % 365) || ' days')FROM seq;-- Query WITHOUT index.timer onSELECT count(*), sum(amount) FROM orders WHERE user_id = 42;-- Add indexCREATE INDEX idx_orders_user ON orders(user_id);-- Query WITH indexSELECT count(*), sum(amount) FROM orders WHERE user_id = 42;-- Compare query plans.timer offEXPLAIN QUERY PLAN SELECT * FROM orders WHERE user_id = 42;EXPLAIN QUERY PLAN SELECT * FROM orders WHERE product = 'product-7';SQL -
Analyze query plans
Terminal window sqlite3 /tmp/scale_test.db <<'SQL'-- Full table scan (no index on product)EXPLAIN QUERY PLANSELECT * FROM orders WHERE product = 'product-7' AND amount > 50;-- Covering index (all columns in the index)CREATE INDEX idx_orders_user_amount ON orders(user_id, amount);EXPLAIN QUERY PLANSELECT sum(amount) FROM orders WHERE user_id = 42;-- Composite index order mattersEXPLAIN QUERY PLANSELECT * FROM orders WHERE user_id = 42 AND amount > 50;-- This won't use the composite index efficientlyEXPLAIN QUERY PLANSELECT * FROM orders WHERE amount > 50;SQL -
Simulate horizontal partitioning (sharding)
sharding.py import sqlite3, time, randomdef create_shard(shard_id):db = sqlite3.connect(f"/tmp/shard_{shard_id}.db")db.execute("""CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY,name TEXT NOT NULL,email TEXT NOT NULL)""")db.commit()return dbdef get_shard(user_id, num_shards):return user_id % num_shards# Create 3 shardsshards = {i: create_shard(i) for i in range(3)}# Insert 10,000 users across shardsfor uid in range(10000):shard_id = get_shard(uid, 3)shards[shard_id].execute("INSERT OR REPLACE INTO users VALUES (?, ?, ?)",(uid, f"User {uid}", f"user{uid}@example.com"))for db in shards.values():db.commit()# Count per shardfor shard_id, db in shards.items():count = db.execute("SELECT count(*) FROM users").fetchone()[0]print(f"Shard {shard_id}: {count} users")# Lookup: route to correct shardlookup_id = 4242shard_id = get_shard(lookup_id, 3)row = shards[shard_id].execute("SELECT * FROM users WHERE id = ?", (lookup_id,)).fetchone()print(f"User {lookup_id} found on shard {shard_id}: {row}")# Cross-shard query (expensive -- must query all shards)start = time.time()total = sum(db.execute("SELECT count(*) FROM users").fetchone()[0]for db in shards.values())print(f"Cross-shard count: {total} ({time.time()-start:.4f}s)")for db in shards.values():db.close()Terminal window python3 sharding.py -
Denormalization tradeoff
Terminal window sqlite3 /tmp/scale_test.db <<'SQL'-- Normalized: requires a JOINCREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);CREATE TABLE user_orders ASSELECT o.id, o.user_id, o.product, o.amount, o.created_atFROM orders o;INSERT INTO users (id, name)SELECT DISTINCT user_id, 'User ' || user_id FROM orders;.timer on-- Normalized query (JOIN)SELECT u.name, count(*), sum(o.amount)FROM user_orders o JOIN users u ON o.user_id = u.idGROUP BY u.name ORDER BY sum(o.amount) DESC LIMIT 5;-- Denormalized: add user_name to ordersALTER TABLE user_orders ADD COLUMN user_name TEXT;UPDATE user_orders SET user_name = (SELECT name FROM users WHERE users.id = user_orders.user_id);-- Denormalized query (no JOIN)SELECT user_name, count(*), sum(amount)FROM user_ordersGROUP BY user_name ORDER BY sum(amount) DESC LIMIT 5;.timer offSQL
Checkpoint
Section titled “Checkpoint”Create a SQLite database with 100k rows. Write three queries: one that benefits
from an index, one that requires a composite index, and one that does a full
table scan. Run EXPLAIN QUERY PLAN on each and explain why the optimizer chose
each strategy.
Lesson 5: Asynchronous Processing
Section titled “Lesson 5: Asynchronous Processing”Goal: Decouple producers from consumers using queues, and understand when async processing beats synchronous request-response.
Concepts
Section titled “Concepts”Synchronous processing blocks the caller until work completes. When the work is slow (sending email, resizing images, generating reports), the caller waits unnecessarily. Message queues let producers enqueue work and move on; consumers process it independently. This decoupling absorbs traffic spikes, enables retries, and lets you scale consumers independently. Dead letter queues capture messages that repeatedly fail, preventing poison messages from blocking the pipeline.
Exercises
Section titled “Exercises”-
Build a task queue with Redis lists
producer.py import redis, json, timer = redis.Redis(host="localhost", port=6379, decode_responses=True)tasks = [{"type": "send_email", "to": "alice@example.com", "subject": "Welcome"},{"type": "resize_image", "path": "/uploads/photo.jpg", "size": "800x600"},{"type": "generate_report", "user_id": 42, "format": "pdf"},{"type": "send_email", "to": "bob@example.com", "subject": "Invoice"},{"type": "resize_image", "path": "/uploads/avatar.png", "size": "200x200"},]for task in tasks:task["queued_at"] = time.time()r.lpush("task_queue", json.dumps(task))print(f"Enqueued: {task['type']}")print(f"Queue length: {r.llen('task_queue')}")consumer.py import redis, json, time, randomr = redis.Redis(host="localhost", port=6379, decode_responses=True)def process_task(task):"""Simulate processing with random delays and failures."""delay = random.uniform(0.1, 0.5)time.sleep(delay)if random.random() < 0.1: # 10% failure rateraise Exception(f"Failed to process {task['type']}")return f"Completed {task['type']} in {delay:.2f}s"print("Consumer waiting for tasks...")while True:# BRPOP blocks until a task is available (timeout 5s)result = r.brpop("task_queue", timeout=5)if result is None:print("No more tasks. Exiting.")break_, raw = resulttask = json.loads(raw)latency = time.time() - task["queued_at"]print(f"Processing: {task['type']} (waited {latency:.2f}s)")try:msg = process_task(task)print(f" -> {msg}")except Exception as e:print(f" -> ERROR: {e}")# Move to dead letter queuer.lpush("dead_letter_queue", raw)print(f" -> Moved to dead letter queue")dlq_len = r.llen("dead_letter_queue")if dlq_len:print(f"Dead letter queue: {dlq_len} failed tasks")Terminal window pip3 install redispython3 producer.pypython3 consumer.py -
Run multiple consumers in parallel
Terminal window # Enqueue 20 taskspython3 -c "import redis, json, timer = redis.Redis(host='localhost', port=6379, decode_responses=True)for i in range(20):task = {'type': f'task_{i}', 'queued_at': time.time()}r.lpush('task_queue', json.dumps(task))print(f'Enqueued 20 tasks. Queue length: {r.llen(\"task_queue\")}')"# Run 3 consumers in parallelpython3 consumer.py &python3 consumer.py &python3 consumer.py &wait# Each consumer picks up different tasks -- no duplicates -
Compare sync vs async throughput
sync_vs_async.py import time, threading, queuedef slow_operation(item):time.sleep(0.1) # Simulate I/Oreturn item * 2items = list(range(50))# Synchronousstart = time.time()sync_results = [slow_operation(i) for i in items]sync_time = time.time() - start# Async with worker poolq = queue.Queue()results = []lock = threading.Lock()def worker():while True:try:item = q.get(timeout=1)result = slow_operation(item)with lock:results.append(result)q.task_done()except queue.Empty:breakstart = time.time()for item in items:q.put(item)workers = [threading.Thread(target=worker) for _ in range(10)]for w in workers:w.start()q.join()async_time = time.time() - startprint(f"Synchronous: {sync_time:.2f}s ({len(sync_results)} results)")print(f"10 workers: {async_time:.2f}s ({len(results)} results)")print(f"Speedup: {sync_time / async_time:.1f}x")Terminal window python3 sync_vs_async.py -
Implement retry with backoff
retry.py import time, randomdef unreliable_operation():if random.random() < 0.7:raise Exception("Transient failure")return "success"def retry_with_backoff(fn, max_retries=5, base_delay=0.1):for attempt in range(max_retries):try:result = fn()print(f" Attempt {attempt + 1}: succeeded")return resultexcept Exception as e:delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)print(f" Attempt {attempt + 1}: {e} (retry in {delay:.2f}s)")time.sleep(delay)raise Exception(f"Failed after {max_retries} attempts")print("Retry with exponential backoff:")try:result = retry_with_backoff(unreliable_operation)print(f"Final result: {result}")except Exception as e:print(f"Gave up: {e}")Terminal window python3 retry.py
Checkpoint
Section titled “Checkpoint”Run the producer to enqueue 20 tasks, then start three consumers simultaneously. Watch them divide the work. Check the dead letter queue for failed tasks. Explain why BRPOP prevents two consumers from processing the same task.
Lesson 6: Consistency and Availability
Section titled “Lesson 6: Consistency and Availability”Goal: Understand the CAP theorem through concrete examples and reason about when eventual consistency is acceptable.
Concepts
Section titled “Concepts”The CAP theorem states that a distributed system can guarantee at most two of three properties: Consistency (every read sees the latest write), Availability (every request gets a response), and Partition tolerance (the system works despite network failures). Since network partitions are inevitable, the real choice is between consistency and availability during a partition. Strong consistency means all nodes agree before responding; eventual consistency means nodes converge over time, allowing temporary disagreement.
Exercises
Section titled “Exercises”-
Simulate two nodes with a network partition
cap_sim.py import time, threadingclass Node:def __init__(self, name):self.name = nameself.data = {}self.peers = []self.partitioned = Falsedef write(self, key, value):self.data[key] = {"value": value, "ts": time.time()}print(f" [{self.name}] WRITE {key}={value}")self._replicate(key)def read(self, key):entry = self.data.get(key, {"value": None})print(f" [{self.name}] READ {key}={entry['value']}")return entry["value"]def _replicate(self, key):for peer in self.peers:if not self.partitioned and not peer.partitioned:peer.data[key] = self.data[key].copy()print(f" [{self.name}] -> replicated {key} to {peer.name}")else:print(f" [{self.name}] -> PARTITION: cannot reach {peer.name}")# Setup: two nodes that replicate to each othernode_a = Node("A")node_b = Node("B")node_a.peers = [node_b]node_b.peers = [node_a]print("=== Normal operation (consistent) ===")node_a.write("balance", 100)node_b.read("balance")print("\n=== Network partition ===")node_a.partitioned = Truenode_b.partitioned = Truenode_a.write("balance", 80) # A sees 80node_b.write("balance", 120) # B sees 120print(f"\nConflict! A={node_a.read('balance')}, B={node_b.read('balance')}")print("\n=== Partition heals ===")node_a.partitioned = Falsenode_b.partitioned = False# Last-write-wins resolutionif node_a.data["balance"]["ts"] > node_b.data["balance"]["ts"]:node_b.data["balance"] = node_a.data["balance"].copy()winner = "A"else:node_a.data["balance"] = node_b.data["balance"].copy()winner = "B"print(f" Resolved with last-write-wins: {winner}'s value")print(f" A={node_a.read('balance')}, B={node_b.read('balance')}")Terminal window python3 cap_sim.py -
Demonstrate eventual consistency with Redis pub/sub
eventual.py import redis, json, time, threadingr = redis.Redis(host="localhost", port=6379, decode_responses=True)local_cache = {}def subscriber():"""Background thread that listens for updates."""pubsub = r.pubsub()pubsub.subscribe("updates")for message in pubsub.listen():if message["type"] == "message":update = json.loads(message["data"])local_cache[update["key"]] = update["value"]print(f" [subscriber] synced: {update['key']}={update['value']}")# Start subscriber in backgroundt = threading.Thread(target=subscriber, daemon=True)t.start()time.sleep(0.5) # Let subscriber connect# Write to "primary"r.set("counter", 1)r.publish("updates", json.dumps({"key": "counter", "value": 1}))print("Published counter=1")time.sleep(0.1)print(f"Local cache: {local_cache}")r.set("counter", 2)r.publish("updates", json.dumps({"key": "counter", "value": 2}))print("Published counter=2")time.sleep(0.1)print(f"Local cache: {local_cache}")# Simulate replication lagprint("\nSimulating lag: write happens but subscriber is slow...")r.set("counter", 3)print(f" Primary: counter=3")print(f" Local cache: counter={local_cache.get('counter')} (stale!)")r.publish("updates", json.dumps({"key": "counter", "value": 3}))time.sleep(0.2)print(f" Local cache: counter={local_cache.get('counter')} (eventually consistent)")Terminal window python3 eventual.py -
When consistency matters vs when it doesn’t
Eventual consistency is FINE for:- Social media likes/counts (off by one is harmless)- Shopping cart contents (user sees their own writes)- Search index updates (slight delay is acceptable)- Analytics dashboards (minutes-old data is fine)- User profile caches (stale name for a few seconds is OK)Strong consistency is REQUIRED for:- Bank account balances (double-spend is catastrophic)- Inventory decrements (overselling is expensive)- Distributed locks (two holders = data corruption)- Unique constraint enforcement (duplicate usernames)- Leader election (two leaders = split brain)Ask: "What is the cost of a stale read?"If the answer is "minor annoyance" -> eventual consistencyIf the answer is "data loss or money" -> strong consistency -
Implement a conflict resolution strategy
conflict.py import timeclass VersionedValue:def __init__(self, value, version=0, timestamp=None):self.value = valueself.version = versionself.timestamp = timestamp or time.time()def last_write_wins(a, b):winner = a if a.timestamp > b.timestamp else bprint(f" LWW: picked '{winner.value}' (newer timestamp)")return winnerdef highest_version_wins(a, b):winner = a if a.version > b.version else bprint(f" Version: picked '{winner.value}' (version {winner.version})")return winnerdef merge_sets(a_set, b_set):merged = a_set | b_setprint(f" Merge: {a_set} + {b_set} = {merged}")return merged# Scenario: two nodes edit a username concurrentlyprint("=== Last-write-wins ===")v1 = VersionedValue("alice_old", version=1, timestamp=100)v2 = VersionedValue("alice_new", version=1, timestamp=101)last_write_wins(v1, v2)print("\n=== Version vector ===")v1 = VersionedValue("draft", version=3)v2 = VersionedValue("published", version=5)highest_version_wins(v1, v2)print("\n=== Set merge (CRDTs) ===")cart_a = {"apple", "banana"}cart_b = {"banana", "cherry"}merge_sets(cart_a, cart_b)Terminal window python3 conflict.py
Checkpoint
Section titled “Checkpoint”Run the CAP simulation and modify it so both nodes accept writes during the partition. Introduce a conflict by writing different values to the same key. Implement two resolution strategies (last-write-wins and merge) and explain when each is appropriate.
Lesson 7: Observability
Section titled “Lesson 7: Observability”Goal: Build the instrumentation that lets you understand what a running system is doing — metrics, logs, and traces.
Concepts
Section titled “Concepts”You cannot fix what you cannot see. Observability rests on three pillars: metrics (counters, gauges, histograms that track system health), logs (structured events that record what happened), and traces (request flows across services). Structured logging with JSON and correlation IDs lets you follow a single request through multiple services. SLIs (Service Level Indicators) measure what users experience; SLOs (Service Level Objectives) set targets; error budgets define how much failure is acceptable before you stop shipping features.
Exercises
Section titled “Exercises”-
Build structured logging
structured_log.py import json, time, uuid, sysdef log(level, message, **kwargs):entry = {"timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),"level": level,"message": message,**kwargs,}print(json.dumps(entry))def with_request_id(fn):"""Decorator that adds a correlation ID to every log."""def wrapper(*args, **kwargs):kwargs["request_id"] = str(uuid.uuid4())[:8]return fn(*args, **kwargs)return wrapper@with_request_iddef handle_request(path, request_id=None):log("info", "request started", path=path, request_id=request_id)start = time.time()# Simulate worktime.sleep(0.05)log("info", "database query", table="users", rows=42, request_id=request_id)time.sleep(0.02)duration_ms = (time.time() - start) * 1000log("info", "request completed", path=path,duration_ms=round(duration_ms, 1), status=200, request_id=request_id)handle_request("/api/users")handle_request("/api/orders")Terminal window python3 structured_log.py | python3 -m json.tool# Each log line is valid JSON with a shared request_id -
Build a metrics collector
metrics.py import time, threading, jsonfrom collections import defaultdictclass Metrics:def __init__(self):self.counters = defaultdict(int)self.gauges = {}self.histograms = defaultdict(list)self._lock = threading.Lock()def increment(self, name, amount=1):with self._lock:self.counters[name] += amountdef gauge(self, name, value):with self._lock:self.gauges[name] = valuedef observe(self, name, value):with self._lock:self.histograms[name].append(value)def report(self):with self._lock:report = {"counters": dict(self.counters), "gauges": dict(self.gauges)}for name, values in self.histograms.items():sorted_v = sorted(values)n = len(sorted_v)report[f"histogram_{name}"] = {"count": n,"p50": sorted_v[n // 2] if n else 0,"p95": sorted_v[int(n * 0.95)] if n else 0,"p99": sorted_v[int(n * 0.99)] if n else 0,}return reportm = Metrics()# Simulate trafficimport randomfor _ in range(1000):m.increment("http_requests_total")latency = random.expovariate(1 / 50) # Mean 50msm.observe("request_duration_ms", round(latency, 1))if random.random() < 0.02:m.increment("http_errors_total")m.gauge("active_connections", 47)m.gauge("queue_depth", 12)print(json.dumps(m.report(), indent=2))Terminal window python3 metrics.py -
Implement request tracing across services
tracing.py import json, time, uuiddef trace_span(service, operation, parent_id=None, trace_id=None):span_id = str(uuid.uuid4())[:8]trace_id = trace_id or str(uuid.uuid4())[:8]start = time.time()def finish(status="ok", **attrs):duration_ms = (time.time() - start) * 1000span = {"trace_id": trace_id,"span_id": span_id,"parent_id": parent_id,"service": service,"operation": operation,"duration_ms": round(duration_ms, 1),"status": status,**attrs,}print(json.dumps(span))return span_id, trace_idreturn span_id, trace_id, finish# Simulate a request flowing through three services# API Gateway -> User Service -> Databasesid1, tid, finish1 = trace_span("api-gateway", "GET /api/users/42")time.sleep(0.01)sid2, _, finish2 = trace_span("user-service", "get_user", parent_id=sid1, trace_id=tid)time.sleep(0.005)sid3, _, finish3 = trace_span("database", "SELECT * FROM users", parent_id=sid2, trace_id=tid)time.sleep(0.02)finish3(rows=1)finish2(cache_hit=False)finish1(status_code=200)Terminal window python3 tracing.py | python3 -m json.tool# All three spans share the same trace_id# parent_id links child spans to their parent -
Build a health check endpoint
healthcheck.py import json, time, sqlite3, redisfrom http.server import HTTPServer, BaseHTTPRequestHandlerclass HealthHandler(BaseHTTPRequestHandler):def do_GET(self):checks = {}# Check Redistry:r = redis.Redis(host="localhost", port=6379)r.ping()checks["redis"] = {"status": "healthy", "latency_ms": 0}except Exception as e:checks["redis"] = {"status": "unhealthy", "error": str(e)}# Check SQLite (stand-in for a real database)try:start = time.time()db = sqlite3.connect(":memory:")db.execute("SELECT 1")db.close()checks["database"] = {"status": "healthy","latency_ms": round((time.time() - start) * 1000, 1),}except Exception as e:checks["database"] = {"status": "unhealthy", "error": str(e)}overall = "healthy" if all(c["status"] == "healthy" for c in checks.values()) else "unhealthy"body = json.dumps({"status": overall, "checks": checks}, indent=2).encode()status = 200 if overall == "healthy" else 503self.send_response(status)self.send_header("Content-Type", "application/json")self.end_headers()self.wfile.write(body)def log_message(self, format, *args):passserver = HTTPServer(("127.0.0.1", 8080), HealthHandler)print("Health check on http://127.0.0.1:8080")server.serve_forever()Terminal window python3 healthcheck.py &curl -s http://127.0.0.1:8080 | python3 -m json.tool# Returns {"status": "healthy", "checks": {...}}kill %1
Checkpoint
Section titled “Checkpoint”Run the structured logging example and pipe the output into a file. Use jq to
filter for only error-level logs, extract all unique request IDs, and calculate
the average request duration. This is the workflow for investigating production
incidents from log files.
Lesson 8: Putting It Together
Section titled “Lesson 8: Putting It Together”Goal: Apply every concept from the previous lessons to design complete systems end-to-end, making tradeoffs explicit at each layer.
Concepts
Section titled “Concepts”System design is the art of making tradeoffs — every choice forecloses others. The process follows a repeatable pattern: estimate scale, define the API, choose a data model, add caching where reads dominate, introduce queues where writes spike, decide on consistency guarantees, and instrument everything. Walking through a design end-to-end reveals hidden dependencies and forces you to justify every component. If you cannot explain why a component exists, remove it.
Exercises
Section titled “Exercises”-
Design a URL shortener
=== Requirements ===- Shorten long URLs to 7-character codes- Redirect short URLs to the original- Track click analytics- 100M URLs stored, 1000 redirects/sec=== Back-of-envelope ===Storage: 100M × (7 + 200 bytes avg URL) ≈ 20 GB (fits one machine)Reads: 1000/sec (heavy read, light write)Writes: ~10/sec (new URLs)Ratio: 100:1 read-heavy → cache aggressively=== Schema ===urls(code TEXT PK, original_url TEXT NOT NULL, created_at TEXT)clicks(id INTEGER PK, code TEXT FK, clicked_at TEXT, referrer TEXT, ip TEXT)=== Components ===1. API server: POST /shorten, GET /:code (302 redirect)2. Redis cache: code -> original_url (TTL 1 hour)3. SQLite/Postgres: persistent storage4. Analytics queue: log clicks asynchronouslyurl_shortener.py import json, time, hashlib, sqlite3, redisfrom http.server import HTTPServer, BaseHTTPRequestHandlerdb = sqlite3.connect("/tmp/shortener.db")db.execute("""CREATE TABLE IF NOT EXISTS urls (code TEXT PRIMARY KEY,original_url TEXT NOT NULL,created_at TEXT DEFAULT (datetime('now')))""")db.execute("""CREATE TABLE IF NOT EXISTS clicks (id INTEGER PRIMARY KEY,code TEXT REFERENCES urls(code),clicked_at TEXT DEFAULT (datetime('now')))""")db.commit()cache = redis.Redis(host="localhost", port=6379, decode_responses=True)def shorten(url):code = hashlib.md5(url.encode()).hexdigest()[:7]db.execute("INSERT OR IGNORE INTO urls (code, original_url) VALUES (?, ?)",(code, url),)db.commit()cache.setex(f"url:{code}", 3600, url)return codedef resolve(code):# Check cache firstcached = cache.get(f"url:{code}")if cached:return cached# Fall back to databaserow = db.execute("SELECT original_url FROM urls WHERE code = ?", (code,)).fetchone()if row:cache.setex(f"url:{code}", 3600, row[0])return row[0]return Noneclass Handler(BaseHTTPRequestHandler):def do_POST(self):length = int(self.headers.get("Content-Length", 0))body = json.loads(self.rfile.read(length))code = shorten(body["url"])resp = json.dumps({"short_url": f"http://localhost:8080/{code}"}).encode()self.send_response(201)self.send_header("Content-Type", "application/json")self.end_headers()self.wfile.write(resp)def do_GET(self):code = self.path.strip("/")if code == "stats":count = db.execute("SELECT count(*) FROM urls").fetchone()[0]clicks = db.execute("SELECT count(*) FROM clicks").fetchone()[0]resp = json.dumps({"urls": count, "clicks": clicks}).encode()self.send_response(200)self.send_header("Content-Type", "application/json")self.end_headers()self.wfile.write(resp)returnurl = resolve(code)if url:db.execute("INSERT INTO clicks (code) VALUES (?)", (code,))db.commit()self.send_response(302)self.send_header("Location", url)self.end_headers()else:self.send_response(404)self.end_headers()self.wfile.write(b"Not found")def log_message(self, format, *args):passprint("URL shortener on http://127.0.0.1:8080")HTTPServer(("127.0.0.1", 8080), Handler).serve_forever()Terminal window python3 url_shortener.py &# Create a short URLcurl -s -X POST http://127.0.0.1:8080 \-H "Content-Type: application/json" \-d '{"url": "https://en.wikipedia.org/wiki/Systems_design"}' | python3 -m json.tool# Follow the redirectcurl -v http://127.0.0.1:8080/$(curl -s -X POST http://127.0.0.1:8080 \-H "Content-Type: application/json" \-d '{"url": "https://example.com"}' | python3 -c "import sys,json; print(json.load(sys.stdin)['short_url'].split('/')[-1])") 2>&1 | grep Location# Check statscurl -s http://127.0.0.1:8080/stats | python3 -m json.toolkill %1 -
Design a rate limiter
rate_limiter.py import redis, time, jsonfrom http.server import HTTPServer, BaseHTTPRequestHandlerr = redis.Redis(host="localhost", port=6379, decode_responses=True)def is_rate_limited(client_id, max_requests=10, window_seconds=60):"""Sliding window rate limiter using Redis sorted sets."""key = f"ratelimit:{client_id}"now = time.time()window_start = now - window_secondspipe = r.pipeline()pipe.zremrangebyscore(key, 0, window_start) # Remove old entriespipe.zadd(key, {str(now): now}) # Add current requestpipe.zcard(key) # Count requests in windowpipe.expire(key, window_seconds) # Auto-cleanupresults = pipe.execute()request_count = results[2]remaining = max(0, max_requests - request_count)return request_count > max_requests, request_count, remainingclass Handler(BaseHTTPRequestHandler):def do_GET(self):client_id = self.client_address[0]limited, count, remaining = is_rate_limited(client_id)self.send_header = self._original_send_headerif limited:self.send_response(429)self.send_header("Retry-After", "60")self.send_header("X-RateLimit-Remaining", str(remaining))self.end_headers()self.wfile.write(b"Rate limit exceeded")else:body = json.dumps({"message": "ok", "requests_used": count}).encode()self.send_response(200)self.send_header("Content-Type", "application/json")self.send_header("X-RateLimit-Remaining", str(remaining))self.end_headers()self.wfile.write(body)def log_message(self, format, *args):passprint("Rate limiter on http://127.0.0.1:8080 (10 req/min)")HTTPServer(("127.0.0.1", 8080), Handler).serve_forever()Terminal window python3 rate_limiter.py &# Send 12 requests quickly -- last ones should be 429for i in $(seq 1 12); doecho "Request $i: $(curl -s -o /dev/null -w '%{http_code}' http://127.0.0.1:8080)"donekill %1 -
System design decision checklist
For any system, walk through these questions:SCALE[ ] How many users? DAU? Peak QPS?[ ] Read:write ratio?[ ] How much data? Growth rate?[ ] Can one server handle it?DATA[ ] What are the entities and relationships?[ ] SQL or NoSQL? Why?[ ] What indexes are needed?[ ] Sharding key (if needed)?CACHING[ ] What data is read-heavy and rarely changes?[ ] Cache invalidation strategy?[ ] What is the acceptable staleness?ASYNC[ ] Which operations can be deferred?[ ] What happens if a queued task fails?[ ] Do consumers need exactly-once semantics?CONSISTENCY[ ] What needs strong consistency?[ ] Where is eventual consistency acceptable?[ ] What is the cost of a stale or conflicting read?OBSERVABILITY[ ] What are the SLIs? (latency, error rate, throughput)[ ] What SLO targets?[ ] How will you detect and diagnose failures? -
Walk through tradeoffs for the URL shortener
TRADEOFF 1: Code generation- Hash (MD5/SHA) vs counter vs random- Hash: deterministic, same URL = same code (dedup)- Counter: sequential, predictable (security concern)- Random: no collisions with DB check, not guessableTRADEOFF 2: Cache TTL- Long TTL (1 hour): fewer DB reads, stale if URL deleted- Short TTL (1 min): fresher data, more DB load- No TTL: fastest reads, cache grows unboundedTRADEOFF 3: Analytics- Sync: every redirect waits for INSERT (adds latency)- Async queue: redirect is fast, analytics may lag- Batch: buffer clicks in memory, flush periodicallyTRADEOFF 4: Consistency- Single DB: strongly consistent, single point of failure- Replicas: available under failure, reads may be stale- The shortener is read-heavy → replicas make sense
Checkpoint
Section titled “Checkpoint”Pick one of these systems: a paste bin, a notification service, or a leaderboard. Walk through the full design checklist: estimate scale, define the API, choose a data model, decide on caching and queuing, set consistency requirements, and define SLIs. Write it up in the same format as the URL shortener exercise above. Justify every component — if you cannot explain why it exists, remove it.
Practice Projects
Section titled “Practice Projects”Project 1: URL Shortener with Analytics Dashboard
Section titled “Project 1: URL Shortener with Analytics Dashboard”Extend the URL shortener from Lesson 8 with a full analytics pipeline. Track
referrers, geographic data (via IP), and click timestamps. Use Redis for
real-time counters and SQLite for historical data. Build a /stats/:code
endpoint that returns click-over-time data as JSON. Add rate limiting to the
create endpoint.
Project 2: Job Queue System
Section titled “Project 2: Job Queue System”Build a production-quality job queue with Redis. Support priority levels (high, normal, low), retry with exponential backoff, dead letter queues, and a dashboard endpoint showing queue depth, processing rate, and failure rate. Add multiple worker types that handle different job types.
Project 3: Mini Load Tester
Section titled “Project 3: Mini Load Tester”Write a Python tool that sends concurrent HTTP requests to a target URL and reports statistics: requests/sec, latency percentiles (p50, p95, p99), error rate, and throughput in bytes/sec. Support configurable concurrency, duration, and request count. Output results as both a summary and a JSON report.
Quick Reference
Section titled “Quick Reference”| Topic | Key Concepts |
|---|---|
| Capacity | ulimit, ab, back-of-envelope estimation, one-server limits |
| Caching | Redis, TTL, write-through, invalidation, hit rate |
| Load balancing | nginx, round-robin, least-connections, consistent hashing |
| Databases | Indexes, EXPLAIN, read replicas, sharding, denormalization |
| Async | Redis queues, BRPOP, worker pools, retry, dead letter queues |
| Consistency | CAP theorem, strong vs eventual, conflict resolution |
| Observability | Structured logs, metrics, traces, correlation IDs, SLIs/SLOs |
| Design process | Estimate, schema, API, cache, queue, consistency, instrument |
See Also
Section titled “See Also”- Data Models Lesson Plan — How you store data shapes everything
- Networking Lesson Plan — The network layer underneath
- Cryptography Lesson Plan — TLS and auth at scale
- PostgreSQL Cheatsheet — Database commands and optimization
- Docker Cheatsheet — Containerization for multi-service setups
- Concurrency
- Information Architecture
- System Design
- API Design