My Faults My Own

…willing to sacrifice something we don't have

for something we won't have, so somebody will someday.

CS 223 (Randomized Algorithms)

These are my lecture notes from CS 223: Randomized Algorithms (Piazza), taught by Michael Mitzenmacher at Harvard. I am an enrolled student, and these notes are in no way official; they're reproduced by me from Prof. Mitzenmacher's lectures in Spring 2015, and are licensed Creative Commons Attribution-NonCommercial-ShareAlike 4.0, which supercedes any other site-wide licenses. Creative Commons License


\(R_{01}\): Introductions, Some Probability Theory

Two Problems

Problem 1: Dice

Alice suggests the following Game: Given three fair but nonstandard dice:

  • A: 1,1,6,6,8,8
  • B: 2,2,4,4,9,9
  • C: 3,3,5,5,7,7

Bob will choose a die, then Alice will choose another die. Both will roll, the highest roller will win. What should Bob choose?

Quote: "Well, the meta-level reason that you don't want to play this game [as Bob] is that, if a professor of probability proposes a dice game to you, you probably don't want to play."

Solution: With 5/9 probability each, A beats C beats B beats A. Bob's choice doesn't matter, but he's 5:4 to lose no matter his choice. Read more elsewhere at Wiki | Nontransitive dice.

High-Order Lesson: You have to be very careful with probability; lots of things, like "strength of dice" are unintuitive (e.g. in being nontransitive).

Problem 2: Coins

Bob suggests a different Game: Alice will choose a sequence of three {heads, tails}, then Bob will choose a different sequence. They'll toss a fair coin until one of their sequences comes up as a consecutive subsequence of toss results. What should Alice choose?

Solution: Consider the situation if Alice picks \(\sf HHH\); then Bob can pick \(\sf THH\). Then the game is the same as if Alice had picked \(\sf HHH\) and Bob had picked \(\sf T\); the game will be over after three turns, so it's easy to see that Bob is 7:1 to win.

We can extrapolate the full strategy, involving a similar prefixing trick, whereby Bob prepends a single letter to the first two of Alice's -- specifically, the opposite of Alice's second letter. Then \(\sf HTH\) is an optimal choice for Alice, \(\sf HHT\) is an optimal response by Bob, giving Bob 2:1 to win. Read more elsewhere at Wiki | Penney's game.

Discrete Probability Spaces

  • Set-theoretically, a probability space consists of:
    • a sample space \(\Oa\) (a set of all possible outcomes)
    • an event space \(\F\sbq\Oa\)
    • a function \(\Pr:\F\to\R[0,1]\).
  • Axioms:
    • \(\Pr(\Oa)=1\).
    • \(\fa x\in\F,0\leq\Pr(x)\leq1\).
    • \(\fa x,y\in\F,\Pr(x\un y)=\Pr(x)+\Pr(y)-\Pr(x\cap y)\)

Polynomial Identity Testing

  • Problem: \(\ds{f(x)\dq\prod_{i=1}^n(x-a_i)}=^?\sum_{i=0}^nc_ix^i\qd g(x)\).
  • Deterministic Algorithm: Expand the LHS, compare symbolically.
    • Problem: It might take a long time (exponential blowup).
  • Randomized Algorithm: Plug in the same (randomly chosen) \(x'\) on both sides:
    • If \(\text{LHS}\lr{x'}=\text{RHS}\lr{x'}\), declare equal.
    • If \(\text{LHS}\lr{x'}\neq\text{RHS}\lr{x'}\), declare inequal.
  • However, there's a chance of mistakenly declaring them equal, if you picked a bad \(x'\) by mistake.
    • More precisely, you'll fail if you picked a root of \(f-g\).
    • But since we know that there are at most \(n\) such roots (in the case \(f\neq g\)), we can limit the probability of failure to 1% if we choose from a pool of \(100n\) distinct numbers.
    • Gets even better (\(0.01^k\)) if we choose repeatedly; gets even better without replacement; in the limit, we can obtain a (nonrandomized, exact) \(O\lr{n^2}\) solution by simply choosing \(k=n+1\).
    • But here, we're interested in a ~linear-time solution, so let's

More Probability Theory

  • Events \(E,F\) are independent iff \(\Pr(E\cap F)=\Pr(E)\cdot\Pr(F)\).
  • Events \(E_1,\ldots,E_k\) are mutually independent iff \(\fa \mathscr E\in P(\{E_1,\ldots,E_k\}),\Pr\lr{\IN\mathscr E}=\prod\Pr(\mathscr E)\).
    Caution: Mutual independence is not the same thing as pairwise independence; consider three events consisting of two coin flips and their xor.

Another Problem: MatMult Checking (mod 2)

  • Problem: \(AB=^?C\).
  • Deterministic Algorithm: Do the computation yourself.
    • Problem: it's slow (\(O\lr{n^3}\), or slightly faster).
  • Randomized Algorithm: Use matrix-vector calculations.
    • Pick \(\vec r\) randomly. (Flip a coin for each coordinate: \(2^n\) possible outcomes, all equally likely.)
    • Check \(A(B\vec r)=^?C\vec r\). (Takes \(O\lr{n^2}\).)
    • Ask: If \(AB\neq C\), what is \(\Pr_{\vec r}(AB\vec r=C\vec r)\)?

\(T_2\):

\(AB=^?C\)

  • Claim: Random sampling will catch \(A\lr{B\vec r}\neq C\vec r\) w.p. \(\geq\nf12\) (and 0 error in the other direction).
    • Proof: There is at least one nonzero entry of \(D=AB-C\) if it's nonzero, wlog pick its input last. Then, to give a false-equals, its input value must match the rowsum of the rest -- which it will do w.p. only \(\nf12\).
    • It follows that the tight bound on the p (in the modulo 2 case) is \(1-2^{-k})\, where \(k\) is the rank of \(AB-C\)

Min-Cut

Reminder: Conditional probability: \(\Prr YX\dq\f{\Pr(Y\cap X)}{\Pr(X)}\), requiring \(\Pr(X)>0\).

  • Problem: Given \(G=(V,E)\) and \(u,v\in V\), return a partition \(U\ni u,V\ni v\) minimizing the number (for unweighted case) of edges crossing \(U\leftrightarrow V\).
  • Algorithm: (Contraction algorithm)
    • Pick a random edge (not \((u,v)\)); "contract" those vertices -- i.e. ensure that they're on the same side of the cut.
    • Claim: \(\Pr(\text{got min cut})\geq\f2{n(n-1)}\) -- so boost by running many times and taking min.
    • So after \(k\) runs, \(\Pr(\text{fail})\leq\lr{1-\f2{n(n-1)}}^k\), or \(n^2\ln n\) trials.
  • Proof: Let \(E_i\) be the r.v. that the \(i\)th edge contracted was OK to contract -- i.e. not crossing the mincut. Let \(F_i\) be cumulative success: \[F_i\dq\IN_{j=1}^iE_i=\]
    • Then \(\Pr{\text{succeed}}=\prod_{i=1}^{n-2}\Prr{E_i}{F_{i-1}}\).
    • Claim: Min cut size \(=k\im\abs E\geq\nf{kn}2\).
    • So \(\Pr\lr{E_1}=\Prr{E_1}{F_0}\geq1-\f k{kn/2}=1-2/n=\f{n-2}n\).
    • Then, since there are still \(k\) bad edges, \(\Prr{E_2}{F_1}\geq\f{n-3}{n-1}\) -- and the progressive-fraction-product gives \(\f2{n(n-1)}\)
    • Optimization: Since the risk of failure is much larger at the end, you should spend more work re-trying choices there.
    • Result: This algorithm has been generalized and derandomized.

Expectation

  • \(E:(\Oa\to\R)\to\R\), defined \(EX\dq\sum_ii\Pr(X=i)\).
    • Linearity: \(E\lr{\sum_iX_i}=\sum_iEX_i\) and \(E(cX_i)=cE(X_i)\)

\(T_4\):

Chernoff (-Hoeffding) Bounds

Extend the idea of Chebyshev to \(n\)th moment:
\[\Pr((X-\mu)^n)\geq a^n)\leq\f{E(X-\mu)^n}{a^n}\].

Useful w/ MGFs: \(\Pr(X\geq a)\e\Pr\lr{e^{tx}\geq e^{ta}}\leq\f{Ee^{tx}}{e^{ta}}\) for \(t>0\).

Moment-generating function: \(Ee^{tx}\); the \(n\)th derivative at \(t=0\) is the \(n\)th moment of \(X\).

Also: \(Ee^{t(X+Y)}=E\lr{e^{tX}e^{tY}}=Ee^{tX}Ee^{tY}\)

Chernoff Example

\(X_i\dq\) coin flip 1 w/ prob \(p_i\), 0 else. \(X\dq\sum X_i\); \(\mu\dq EX\).

"Multiplicative" bound: \[\Pr\lr{X\geq(a+\da)EX}\leq\lr{\f{e^\da}{(1+\da)^{1+\da}}}\] \[\Pr(X\geq2EX)\leq(\nf e4)^n\]

(rest of class ignored; was doing pset)


\(T_5\): Bloom Filters, Coupon Collector Revisited -- Probabilistic Method

Bloom Filters

  • Set membership queries -- Is \(x\in S\)? How little space do we need?
  • Trading off time/space vs. accuracy
  • Basic idea: for each \(x\in S\), for each \(i\in\s{1,\hh,k}\), set \(B[h_i(x)]\).
    • Then \(x\in S\mim\fa i,B[h_i(x)]=1\) -- one-sided (false-pos) error
  • Alternative idea: Use perfect hash function \(h:S\bij[0,n-1]\), store (orthogonal) hash fingerprints (\(b\) bits each) -- then false-positive w.p. \(2^{-b}\)
    • Finding/representing/computing \(h\) is potentially nontrivial
  • BF false-pos w.p. \(\lr{1-e^{-nk/m}}^k\)
    • optimal: \(k=\f mn\ln2\)
  • Chernoff bounds concentration: real-life false-pos \(=(\text{frac. empty bins})^k=\lr{1-\f{\text{#empty bins}}m}^k\)
    • But the Poisson model \(\approx\) balls+bins model (\(\leq2e\sqrt m\) times more likely)
    • Thm: The f.p.p. of a BF is \(\lr{1-e^{-nk/m}}^k+o(1)\)

Coupon Collectors

Pr takes \(>n\ln n+cn\) coupons \(\leq\)
\[\lim_{n\to\infty}(\Pr cc>n\ln n+cn)=1-e^{-e^{-c}}\] We throw \(n\ln n+cn\) calls into \(n\) bins -- what's Pr \(\ex\) empty bin?

  • Poisson approx: \(i\)th bin is empty \(\mu=\ln n+c\)
    • \(i\)th bin is empty w.p. \(e^{-\mu}=e^{-(\ln n+c)}=\f{e^{-c}}n\)

Proof: Poisson model yields \(n\ln n+cn+O\lr{\rt{n\log n}}\) balls w.p. \(O(1)\); union the difference term is tiny, irrelevant.

Any time you get a balls-and-bins problem, first ask: What does the Poisson approximation tell me?


\(R_5\): Probabilistic Method

Ramsey Theory

Given: Complete graph \(K_n\) on \(n\) vertices, two-colored. Does there exist a graph with all edges colored, with no mono-colored \(K_k\) subgraph?


\(R_7\): Entropy

Lemma: \(\f{2^{nH(q)}}{n+1}\leq\binom n{nq}\lesssim2^{nH(q)}\)