System Design Appendix

1) Load Balancing

What to choose & why

Strategy How it works Pros Cons When to use
Round Robin Rotate across backends Simple Ignores load differences Homogeneous stateless apps
Weighted RR Like RR but with weights Skews toward stronger nodes Manual tuning Mixed instance sizes
Least Connections Pick server with fewest active conns Adapts to variable request time Needs connection tracking Long-lived/variable requests
IP Hash Hash(client IP) → server Sticky by client Poorly balances NAT’d clients Session affinity without cookies
Header/Cookie Hash Hash on header/cookie Fine-grained stickiness Requires header/cookie Stateful session shards
Consistent Hash Hash(key) on ring Minimal reshuffle on scale Requires key choice Caches, sharded stores
Anycast + LB Route to nearest POP Low latency Network expertise Global edges (CDN/API edge)

Health checks & routing


2) Caching Tiers (quick recap + design tips)


3) Consistent Hashing (for caches/shards)

Idea: Place servers on a hash ring; node = first_server_clockwise(hash(key)). Why: Adding/removing a server only remaps a small slice of keys. Tips: Use virtual nodes (many tokens per server) for smoother balance.

Pseudocode

# ring: sorted list of (token, server)
# tokens: hash(server_id + vnode_index)

function lookup(key):
    # hash key onto ring
    h = hash(key)
    # find first token >= h (wrap if needed)
    s = ring.lower_bound(h) or ring[0]
    return s.server

4) Queues & Pub/Sub

Concepts

Delivery semantics

Semantics What it means Typical systems Design requirement
At-most-once May drop messages, no dupes UDP-like, rare for business Accept loss
At-least-once No loss, possible duplicates SQS std, Kafka Idempotent consumers
Exactly-once* No loss, no duplicates Kafka EOS (within scopes) Careful txn boundaries

*Exactly-once is contextual; end-to-end often reduces to at-least-once + idempotency.

Idempotency pattern

Retry/backoff (pseudocode)

retries = 0
delay = base
while retries < max_retries:
    ok = call()
    if ok: return OK
    sleep(delay)
    delay = min(delay * 2, max_delay)  # exponential backoff + cap
    retries += 1
return FAIL

5) Eventual Consistency


6) SLI / SLO / SLA

Error budget

Burn-rate alerts (example)

Useful SLIs


7) Reliability Guards

Circuit breaker (pseudocode)

state = CLOSED
failures = 0
last_open = -inf

function call():
    if state == OPEN and now < last_open + cool_down:
        return FAIL_FAST
    if state == HALF_OPEN:
        allow_only_small_probe_rate()

    ok = downstream()
    if ok:
        if state == HALF_OPEN: state = CLOSED; failures = 0
        else: failures = 0
        return OK
    else:
        failures += 1
        if failures >= threshold:
            state = OPEN
            last_open = now
        return FAIL

Bulkheads: isolate pools; one noisy neighbor can’t drain all threads/connections. Timeouts: every remote call needs a timeout + retry policy. Dead-letter queues: capture poisoned messages.


Final Cheat Sheet — 1 Page

A) Cross-Structure Operations (Python)

Operation list dict set deque heapq  
len(x) ✓ (len of list)  
Membership in O(n) O(1) avg O(1) avg O(n)  
Add .append(x) / .insert(i,x) d[k]=v .add(x) .append(x) / .appendleft(x) heappush(h,x)  
Remove .remove(x) / .pop() / del a[i] del d[k] / .pop(k) .remove(x) / .discard(x) .popleft() / .pop() heappop(h)  
Iterate for v in a for k,v in d.items() for v in s for v in q for v in h (unsorted)  
Sort .sort() / sorted(a) sorted(d.items()) sorted(s) convert → list heaps are partially ordered  
Special slicing, .reverse() .keys() .values() .items() .get() set ops ` | & - ^` O(1) ends min at h[0]  

B) Big-O Quick Table

Structure Access Search Insert Delete
Array/list O(1) O(n) O(n) O(n)
Dict O(1) avg O(1) avg O(1) avg
Set O(1) avg O(1) avg O(1) avg
Heap O(n) O(log n) O(log n)
BST (balanced) O(log n) O(log n) O(log n) O(log n)

C) Algorithm Skeletons (minimal)

Recursion template

def solve(x):
    # base case(s)
    # combine results from smaller subproblems
    return result

DFS (graph)

def dfs(u):
    if u in seen: return
    seen.add(u)
    for v in G[u]:
        dfs(v)

BFS (graph)

from collections import deque
def bfs(s):
    q, seen = deque([s]), {s}
    while q:
        u = q.popleft()
        for v in G[u]:
            if v not in seen:
                seen.add(v); q.append(v)

Binary search (index or insert position)

def bsearch(a, t):
    l, r = 0, len(a)-1
    while l <= r:
        m = (l+r)//2
        if a[m]==t: return m
        if a[m]<t: l = m+1
        else: r = m-1
    return l  # insert position

Sliding window (no repeats)

def longest(s):
    pos, L, best = {}, 0, 0
    for R,ch in enumerate(s):
        if ch in pos and pos[ch] >= L: L = pos[ch]+1
        pos[ch] = R
        best = max(best, R-L+1)
    return best

Dijkstra (min distances)

import heapq
def dijkstra(G, s):
    dist = {s:0}; pq = [(0,s)]
    while pq:
        d,u = heapq.heappop(pq)
        if d != dist.get(u, float('inf')): continue
        for v,w in G[u]:
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd; heapq.heappush(pq, (nd,v))
    return dist

DP template

# define state dp[...] and base cases
# fill in dependency order using transitions
return answer_state

D) Python Interview Reminders