Overview
This chapter tackles the selection problem: given a set of n distinct numbers, find the i-th smallest element (the i-th order statistic). While sorting solves this in O(n lg n) time, the chapter demonstrates that selection can be performed in Θ(n) time — both in expectation (via a randomized algorithm) and in the worst case (via the deterministic median-of-medians strategy). These results are significant because they show that selection is fundamentally easier than sorting, bypassing the Ω(n lg n) comparison-sort lower bound without making any assumptions about the input.
Key Concepts
- Order statistic: The i-th order statistic of a set is its i-th smallest element. The minimum is the 1st order statistic; the maximum is the n-th.
- Median: The "halfway point" of a set. For odd n the median is unique at position (n + 1)/2. For even n there are two medians; the text adopts the lower median at position ⌊(n + 1)/2⌋ by convention.
- Selection problem (formal): Given a set A of n distinct numbers and an integer i (1 ≤ i ≤ n), return the element in A that is larger than exactly i − 1 other elements.
- Comparison lower bound for minimum: Any comparison-based algorithm needs at least n − 1 comparisons to find the minimum, since every non-minimum element must "lose" at least one comparison (tournament argument).
- Simultaneous min and max: Both the minimum and maximum can be found together using at most 3⌊n/2⌋ comparisons, rather than the naïve 2n − 2, by processing elements in pairs.
- Selection vs. sorting: Selection algorithms determine relative order information only as needed, allowing them to run in linear time — they are not subject to the Ω(n lg n) sorting lower bound.
Algorithms and Techniques
9.1 — MINIMUM (and MAXIMUM)
Idea: Scan the array once, tracking the smallest element seen so far.
MINIMUM(A):
min = A[1]
for i = 2 to A.length:
if A[i] < min:
min = A[i]
return min
- Uses exactly n − 1 comparisons, which is optimal by the tournament lower-bound argument.
- MAXIMUM is symmetric, also using n − 1 comparisons.
Simultaneous min and max (pairwise technique):
Rather than finding min and max independently (2n − 2 comparisons), process elements in pairs:
-
Compare the two elements of each pair with each other (1 comparison).
-
Compare the smaller of the pair against the running minimum (1 comparison).
-
Compare the larger of the pair against the running maximum (1 comparison).