Chapter 5: Probabilistic Analysis and Randomized Algorithms

Overview

This chapter introduces two complementary ideas: probabilistic analysis, which analyzes algorithm performance by assuming a probability distribution over inputs, and randomized algorithms, which use random choices within the algorithm itself to achieve good expected performance regardless of input. The hiring problem serves as the central motivating example, and the chapter develops indicator random variables as a key analytical tool.

Key Concepts

Probabilistic Analysis vs. Randomized Algorithms

Probabilistic analysis: Assumes a distribution over inputs (e.g., uniform random permutation) to compute an average-case running time. The algorithm itself is deterministic—the randomness is in the input.
Randomized algorithm: Introduces randomness into the algorithm (e.g., randomly permuting input before processing). The running time is an expected running time over the algorithm's random choices, with no assumptions about the input.
Key distinction: In probabilistic analysis, bad inputs can cause bad performance. In a randomized algorithm, no input can consistently cause bad performance—only "unlucky" random choices can.

Indicator Random Variables (Section 5.2)

For a sample space S and event A, the indicator random variable is: I{A} = 1 if A occurs, 0 otherwise.
Lemma 5.1: E[I{A}] = Pr{A}. This simple relationship is the foundation of many analyses.
Power of linearity: Even when random variables are dependent, linearity of expectation (E[X₁ + X₂ + ... + Xₙ] = E[X₁] + E[X₂] + ... + E[Xₙ]) applies, making indicator random variables extremely useful for counting expected occurrences.

The Hiring Problem (Section 5.1)

Model: Interview n candidates sequentially; hire a candidate if they're better than all previous hires. Cost = cᵢn (interviews) + cₕm (hires), where m is the number of hires.
Worst case: Candidates arrive in increasing order of quality → hire every candidate → O(cₕn).
Analysis with indicator variables: Let Xᵢ = I{candidate i is hired}. Since candidate i is hired iff they're the best of the first i, Pr{Xᵢ = 1} = 1/i.
E[X] = Σᵢ₌₁ⁿ (1/i) = ln n + O(1) (harmonic series).
Lemma 5.2: Average-case hiring cost is O(cₕ ln n) assuming random input order.
Lemma 5.3: Randomized version (permute first) has expected hiring cost O(cₕ ln n) for any input.

Random Permutation Algorithms (Section 5.3)

PERMUTE-BY-SORTING: Assign each element a random priority from [1, n³], then sort by priority. Runs in Θ(n lg n). Lemma 5.4 proves this produces a uniform random permutation (each of n! permutations equally likely).