CS 225 (Pseudorandomness)
\def\al{\alpha}
\def\Al{\mathrm{A}}
\def\bt{\beta}
\def\Bt{\mathrm{B}}
\def\da{\delta}
\def\Da{\Delta}
\def\ep{\epsilon}
\def\Ep{\mathrm{E}}
\def\gm{\gamma}
\def\Gm{\Gamma}
\def\la{\lambda}
\def\La{\Lambda}
\def\sg{\sigma}
\def\Sg{\Sigma}
\def\te{\theta}
\def\Te{\Theta}
\def\oa{\omega}
\def\Oa{\Omega}
\def\-#1{^{-#1}}
\def\2{\s{0,1}}
\def\abs#1{\left|#1\right|}
\def\argmin{\text{argmin}}
\DeclareMathOperator*{\argmin}{arg\,min}
\def\aryc#1{\lr{\begin{array}{c}#1\end{array}}}
\def\aryd#1{\lr{\begin{array}{cc}#1\end{array}}}
\def\arye#1{\lr{\begin{array}{ccc}#1\end{array}}}
\def\aryf#1{\lr{\begin{array}{cccc}#1\end{array}}}
\def\aryg#1{\lr{\begin{array}{ccccc}#1\end{array}}}
\def\blr#1{\big(#1\big)}
\def\co#1{\cx{co-#1}}
\def\cx#1{{\sf #1}}
\def\dd{\ddots}
\def\Dec{\text{Dec}}
\def\del{\partial}
\def\dq{:=}
\def\ds#1{\displaystyle{#1}}
\def\e{\equiv}
\def\Enc{\text{Enc}}
\def\ex{\exists}
\def\Exp#1{\exp\!\!\left[#1\right]}
\def\f{\frac}
\def\fa{\forall}
\def\grad{\nabla}
\def\hh{\ldots}
\def\im{\Longrightarrow}
\def\IN{\bigcap}
\def\ixl#1{\xleftarrow[#1]{}}
\def\ixr#1{\xrightarrow[#1]{}}
\def\lgr#1#2{\left(#1\;\middle|\;#2\right)}
\def\lr#1{\left(#1\right)}
\def\mi{\Longleftarrow}
\def\mim{\Longleftrightarrow}
\def\n#1{\left|#1\right|}
\def\nf#1#2{{}^{#1}/_{#2}}
\def\nin{\notin}
\def\otr{\overset R\leftarrow}
\def\ox{\otimes}
\def\pcwz#1{\left\{\begin{array}{ll}#1\end{array}\right.}
\def\pelse{&\text{else}}
\def\pif{&\text{if }}
\def\poly{\text{poly}}
\def\pr#1{\cx{pr#1}}
\def\Pr{\text{Pr}}
\def\Prr#1#2{\Pr\left(#1\;\middle|\;#2\right)}
\def\px#1{[\text{#1}]}
\def\qd{=:}
\def\s#1{\left\{#1\right\}}
\def\sbn{\subsetneq}
\def\sbq{\subseteq}
\def\sbs{\subset}
\def\spn{\supsetneq}
\def\spq{\supseteq}
\def\sps{\supset}
\def\st#1#2{\left\{#1\;\middle|\;#2\right\}}
\def\tis#1{\text{ is #1}}
\def\tr{^{\sf t}}
\def\un{\cup}
\def\UN{\bigcup}
\def\vtr{^{\;\sf t}}
\def\x{\times}
\def\xl#1{\xleftarrow{#1}}
\def\xr#1{\xrightarrow{#1}}
\def\xxl#1#2{\xleftarrow[#2]{#1}}
\def\xxr#1#2{\xrightarrow[#2]{#1}}
These are my lecture notes from CS 225: Pseudorandomness (Canvas, Piazza), taught by Salil Vadhan at Harvard. I am an enrolled student, and these notes are in no way official; they're reproduced by me from Prof. Vadhan's lectures in Spring 2015, and are licensed Creative Commons Attribution-NonCommercial-ShareAlike 4.0, which supersedes any other site-wide licenses.
\(R_1\): Administrivia, Introductions
"Given" two "polynomials" \(f(x_1,\ldots,x_n)\) and \(g(x_1,\ldots,x_n)\), is \(f=g\)?
\[f(x_1,\ldots,x_n)=\sum_{i_1,\ldots,i_n}^{\text{finite}}\lr{c_{i_1,\ldots,i_n}}x_1^{i_1}x_2^{i_2}\ldots x_n^{i_n}\]
- This problem isn't very interesting if we're given a list of coefficients.
- Just check equality over all \(c\).
- More interestingly, maybe we're given an oracle evaluating \(f\) and \(g\).
- Or maybe an arithmetic formulation or circuit (e.g. \((7(x_1+2x_2^3)(x_3-x_4)+x_6)\)).
- This isn't conceptually hard, but the expanded form can blow up exponentially, and we don't know a better way to do it than just FOILing it out.
- This topic was covered better in CS223, an hour later.
\(T_2\): Randomized Complexity Classes
Polynomial Identity Testing
- Problem: Given polynomials \(f,g\) over a field \(\mathbb F\), is \(f=g\)?
- Algorithm:
- Pick \(\al_1,\hh,\al_n\otr S\sbq\mathbb F\).
- Accept if \(f\lr{\al_1,\hh,\al_n}=g\lr{\al_1,\hh,\al_n}\); reject otherwise.
- Schwartz-Zippel: if \(f-g\) has total degree \(d\), then \(f\neq g\im\Pr_{\al_1,\hh,\al_n}\lr{f\lr{\al_1,\hh,\al_n}=g\lr{\al_1,\hh,\al_n}}\leq\f d{\abs S}\)
- Application: Bipartite Perfect Matching \(\in\cx{RNC}\)
- \(\cx{NC}\): polylog time on poly-many parallel processors.
- \(\det M\neq0\mim(V,E)\text{ has a perfect matching}\), where \(M\) is the matrix with \(M_{ij}=\pcwz{x_{ij}\pif(i,j)\in E\\0\pelse}\)
- Application: \(\px{AFIT}_{\mathbb Z},\px{ACIT}_{\mathbb Z}\in\co{RP}\)
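The randomized PIT test above is a two-liner in code; here is an illustrative toy version over \(\mathbb Z_p\) (the prime, the black-box polynomials, and the trial count are my choices, not from lecture):

```python
import random

# Schwartz-Zippel identity test: evaluate f and g at a random point of S^n,
# here S = Z_P. If f != g (total degree <= d), one trial errs w.p. <= d/P.
P = 2**31 - 1  # a Mersenne prime, so Z_P is a field

def pit(f, g, n, trials=20):
    for _ in range(trials):
        xs = [random.randrange(P) for _ in range(n)]
        if f(*xs) % P != g(*xs) % P:
            return False  # a disagreement is a proof that f != g
    return True  # "probably equal"

# (x+y)^2 == x^2 + 2xy + y^2, but != x^2 + y^2
f = lambda x, y: (x + y) ** 2
g = lambda x, y: x * x + 2 * x * y + y * y
h = lambda x, y: x * x + y * y
```

Each agreeing trial multiplies the false-positive probability by at most \(d/p\), so a handful of independent trials already gives astronomically small error.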
Randomized Poly Time (classes)
| \(x\in L\im\Pr\blr{A(x)=1}\geq\) | \(\nf12\) | \(\nf14\) | \(1\) | \(1-2^{-n}\) | \(2^{-n}\) | \(\nf1{\poly(n)}\) | \(\nf34\) | \(1-2^{-n}\) | \(1\) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| \(x\nin L\im\Pr\blr{A(x)=1}\leq\) | \(0\) | \(0\) | \(\nf12\) | \(0\) | \(0\) | \(0\) | \(\nf14\) | \(2^{-n}\) | \(0\) |
| Class | \(\cx{RP}\) | \(\cx{RP}\) | \(\co{RP}\) | \((\sbq\px{SAT})\) | \(\cx{RP}\) | \(\cx{BPP}\) | \(\cx{BPP}\) | \(\cx{BPP}\) | \(\cx{ZPP}\) (only exp. polytime) |
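For intuition on why the various one-sided-error thresholds all give \(\cx{RP}\): with one-sided error, independent repetition drives the error down exponentially. A toy sketch (the test function is invented for illustration):

```python
import random

# Error reduction for one-sided (RP-style) error: if `test` accepts members
# with probability >= 1/2 and never accepts non-members, then OR-ing k
# independent runs accepts members with probability >= 1 - 2^-k and still
# never errs on non-members.
def amplify(test, k):
    return lambda x: any(test(x) for _ in range(k))

# toy one-sided test: detects membership only half the time
toy = lambda x: x and random.random() < 0.5
```

The same trick with majority vote (instead of OR) handles two-sided \(\cx{BPP}\)-style error, via a Chernoff bound.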
Boolean Circuits
- Gates: \(\land,\lor,\lnot\).
- "uniformity" enforced
- NC: boolean circuits of polylog depth and poly size
- Problem: estimate average value to within \(\pm\ep\) w.p. \(\geq1-\da\).
- Random sampling: \(t=O\lr{\f{-\log\da}{\ep^2}}\).
- In the oracle-function setting, randomization is necessary for efficiency: any deterministic algorithm must query nearly the entire domain.
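The sampling bound in use, as a sketch (the Chernoff constant, example function, and parameters are illustrative choices of mine):

```python
import math
import random

# Monte Carlo estimate of mu(f), the average of f over {0,1}^m.
# By a Chernoff-Hoeffding bound, t = ceil(2 * ln(2/delta) / eps^2) samples
# give |estimate - mu(f)| <= eps except with probability <= delta.
def estimate_mu(f, m, eps, delta):
    t = math.ceil(2 * math.log(2 / delta) / eps ** 2)
    return sum(f(random.getrandbits(m)) for _ in range(t)) / t

# example: f = least significant bit, so mu(f) = 1/2
lsb = lambda x: x & 1
```

Note the sample count depends only on \(\ep\) and \(\da\), not on \(m\) -- the whole point of the contrast with the deterministic \(\Oa(2^m)\) lower bound.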
\(R_2\): Sampling and Approximation
Circuit Approximation
Recap: Given oracle access to \(f:\2^m\to[0,1]\), we can estimate \(\mu(f)\) to within \(\pm\ep\) w.p. \(\geq1-\da\) using \(O\lr{\f{-\log\da}{\ep^2}}\) queries and time \(\poly(m,-\log\da,1/\ep)\); a deterministic algorithm needs \(\Oa\lr{2^m}\) queries -- in the worst case, you sample only 0s while every unqueried point is 1.
- Can turn estimation into promise-decision problem:
- \(\px{CA}_Y^\ep=\s{(C,p):\mu(C)>p+\ep}\)
- \(\px{CA}_N^\ep=\s{(C,p):\mu(C)\lt p}\)
- The \(\ep\) is important; in the case \(\ep=0\), the problem is \(\cx{PP}\)-complete.
- Claim: \(\px{CA}^\ep\) is \(\pr{BPP}\)-complete.
- In \(\pr{BPP}\).
- Is \(\pr{BPP}\)-hard: construct a circuit which runs algorithm \(A\) on input \(x\), taking the random string as the circuit's input; then ask with \(p=1/2\).
- Open: "natural" \(\pr{BPP}\)-complete problems
- \(p=0\im\pr{RP}\)-complete
- \(p=\ep=0\im\pr{NP}\)-complete
Relative Circuit Approximation
- Estimate \(\mu(C)\) to within a \(1+\ep\) multiplicative factor?
- \(\cx{NP}\)-hard -- a multiplicative approximation can differentiate \(\mu(C)=0\) vs. nonzero, i.e. decide satisfiability.
Approximate Counting
- By the above: since even determining nonzeroness is \(\cx{NP}\)-hard, we don't expect relative approximation to be easy.
- DNF counting: (I kinda blanked on this proof.)
- Q: Can every CSP (constraint satisfaction problem) in \(\cx{P}\) be approximately counted? A: No, there are some that are NP-hard.
S-T Connectivity
| | Space | Time |
| --- | --- | --- |
| DFS/BFS | \(O(n)\) | \(O(n)\) |
| Recursive (Savitch) | \(O\lr{\log^2n}\) | \(O(n)^{\log n}\) |
| Random walk (undirected) | \(O(\log n)\) | \(\poly(n)\) |
Theorem: If \(s,t\) are in the same connected component of an undirected graph, then w.h.p. a random walk of length \(\poly(n)\) from \(s\) visits \(t\) (proof by bounding eigenvalues).
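The random-walk algorithm needs only the current vertex as state; a minimal sketch (the graph and walk length are illustrative, not tuned to the theorem's bound):

```python
import random

# Undirected s-t connectivity by random walk: store only the current vertex
# (O(log n) bits) and walk poly(n) steps, checking whether t is ever hit.
def rw_reaches(adj, s, t, steps):
    v = s
    for _ in range(steps):
        if v == t:
            return True
        v = random.choice(adj[v])
    return v == t

# path 0-1-2-3, plus a separate component 4-5
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2], 4: [5], 5: [4]}
```

A "no" answer can be wrong (the walk may just be unlucky), but only with small probability for \(\poly(n)\)-length walks -- the one-sided error shape of \(\cx{RP}\)-style algorithms.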
\(T_4\): \(\cx P=\cx{NP}\im\cx{BPP}=\cx P\), Approx. Max. \(k\text-\px{Sat}\), Pairwise Independence
Derandomization (given \(P=NP\))
Theorem: \(\cx P=\cx{NP}\im\cx{BPP}=\cx P\).
Pf: Given \(L\in\cx{BPP}\), exhibit a polytime predicate \(Q\) s.t. \(x\in L\mim(\ex y\,\fa z)(Q(x,y,z))\); applying \(\cx P=\cx{NP}\) twice (once for the inner \(\fa\), once for the outer \(\ex\)) then decides \(L\) in polytime.
\(\cx{BPP}\) algorithm \(A(x;r)\) for \(L\), amplified to error prob. \(<2^{-m}\) where \(m=\abs r\); write \(A_x\dq\s{r\in\2^m:A(x;r)=1}\):
- \(x\in L\im\ex s_1,\hh,s_m\in\2^m:\)\(\UN_{i=1}^m(A_x+s_i)=\2^m\) Pf: Prob. method.
- \(x\nin L\im\fa s_1,\hh,s_m\in\2^m,\)\(\UN_{i=1}^m(A_x+s_i)\neq\2^m\) Pf: sum (exp. small) areas of the \(A\) accept-sets
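Combining the two bullets gives the standard Sipser-Gács-Lautemann characterization (a sketch; \(\oplus\) is the bitwise XOR implementing the shifts \(+s_i\) above):
\[x\in L\;\mim\;\ex s_1,\hh,s_m\in\2^m\;\fa r\in\2^m:\bigvee_{i=1}^mA(x;r\oplus s_i)=1\]
Given \(\cx P=\cx{NP}\), the inner \(\fa r\) predicate is \(\co{NP}\) and hence polytime-decidable, so the right-hand side becomes an \(\cx{NP}\) predicate, which is again polytime.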
Approx. Max. \(k\text-\px{Sat}\)
Given: \(\phi(x_1,\hh,x_n)=\phi_1\land\hh\land\phi_m\), where \(\phi_i\) are disjunctions of exactly \(k\) literals. Goal: Find \(\al\in\2^n\) satisfying as many clauses as you can.
Claim: \(\ex\) a randomized algorithm satisfying \(\geq(1-2^{-k})m\) clauses in expectation (assign each variable uniformly at random).
Construction: Method of conditional expectations. Assign \(r_1,\hh\) in sequence: given \(r_1,\hh,r_{i-1}\), select \(r_i\) to maximize \[e(r_1,\hh,r_i)=E\left[\#\text{clauses satisfied}\middle|(R_j=r_j)_{j=1}^i\right]\]\[=\#\s{j:\phi_j\text{ already satisfied}}+\sum_{j:\phi_j\text{ undetermined}}\lr{1-2^{-\#\text{unassigned literals in }\phi_j}}\]
Alternatively, use \(k\)-wise independent distributions on \(R_1,\hh,R_n\) generated from \(d\ll n\) random bits; then derandomize by enumerating all \(2^d\) seeds, a \(2^d\) slowdown.
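The conditional-expectations greedy above, sketched in code (the clause encoding and helper names are mine, not the lecture's):

```python
# Method of conditional expectations for Max-k-SAT. A clause is a list of
# nonzero ints: literal +i means x_i, literal -i means NOT x_i.

def cond_exp(clauses, fixed):
    """E[#satisfied clauses] when variables in `fixed` are set, rest uniform."""
    total = 0.0
    for clause in clauses:
        satisfied, unassigned = False, 0
        for lit in clause:
            if abs(lit) in fixed:
                satisfied |= fixed[abs(lit)] == (lit > 0)
            else:
                unassigned += 1
        total += 1.0 if satisfied else 1.0 - 2.0 ** -unassigned
    return total

def greedy_assign(clauses, n):
    fixed = {}
    for i in range(1, n + 1):
        # choose the value of x_i that keeps the conditional expectation highest
        fixed[i] = max((False, True),
                       key=lambda b: cond_exp(clauses, {**fixed, i: b}))
    return fixed
```

Since the initial expectation is \((1-2^{-k})m\) and each greedy choice never decreases it, the fully assigned result satisfies at least that many clauses.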
(missed \(R_4\) -- expander graphs)
\(T_5\): More Expanders
Mixing property: the endpoint of a walk of length \(O(\log n)\) is ~Uniform. If the starting distribution was Uniform, every vertex of the walk is marginally Uniform (though the steps are not i.i.d.).
Hitting property: Let \(V_1,\hh,V_t\) be a random walk on an expander with spectral expansion \(1-\la\). If \(B\) is a set of vertices of density \(\mu\), then \(\Pr(\fa i,V_i\in B)\leq\blr{\mu+\la(1-\mu)}^t\)
- Application: cheaply reducing the error of \(\cx{RP}\) algorithms:
| | Iterations | Random bits |
| --- | --- | --- |
| Naive | \(O(k)\) | \(km\) |
| Pairwise | \(2^{O(k)}\) | \(m\) |
| Expander | \(O(k)\) | \(m+O(k)\) |
Claim 1 (Hitting Theorem): \(G\) regular with spectral expansion \(1-\la\) and random-walk matrix \(\Gm\)\(\im\)\(\Gm=(1-\la)J+\la E\), where \(J\) is the uniform (all-\(\nf1N\)) matrix and \(\fa v,\abs{vE}_2\leq\abs v_2\) (think: jump to completely Uniform w.p. \(1-\la\), don't get worse w.p. \(\la\))
Proof 1 (partial):
\[E\dq\f1\la\blr{\Gm-(1-\la)J}\]
by orthogonality, writing \(v=v^\parallel+v^\perp\) (components along and orthogonal to uniform, so \(v^\parallel E=v^\parallel\) and \(v^\perp J=0\)):
\[\abs{vE}_2^2=\abs{v^\parallel E+v^\perp E}_2^2=\abs{v^\parallel E}_2^2+\abs{v^\perp E}_2^2\]
\[=\abs{v^\parallel}_2^2+\abs{\f1\la\lr{v^\perp\Gm-(1-\la)v^\perp J}}_2^2=\abs{v^\parallel}_2^2+\f1{\la^2}\abs{v^\perp\Gm}_2^2\]
\[\leq\abs{v^\parallel}_2^2+\abs{v^\perp}_2^2=\abs v_2^2\]
using \(\abs{v^\perp\Gm}_2\leq\la\abs{v^\perp}_2\) (spectral expansion) in the last step.
To LinAlg-ize the hitting property, let \(P\) be the projector
\[P_{i,i}=\pcwz{1\pif i\in B\\0\pelse};\]
then Claim 2: \(\Pr(\fa i,V_i\in B)=\abs{vP(\Gm P)^{t-1}}_1=\abs{vP(P\Gm P)^{t-1}}_1\), where \(v\) is the uniform distribution. (definitionally true: each \(\Gm P\) takes a step and then discards the mass outside \(B\))
Claim 3: \(\fa v,\abs{vP\Gm P}_2\leq\blr{\mu+\la(1-\mu)}\abs v_2\)
Proof 1 (given 2,3):
\[\abs{vP(P\Gm P)^{t-1}}_1\leq\sqrt{\mu N}\abs{vP(P\Gm P)^{t-1}}_2\quad\text{(Cauchy-Schwarz; support size }\mu N)\]
\[\leq\sqrt{\mu N}\abs{vP}_2\blr{\mu+\la(1-\mu)}^{t-1}\quad\text{(Claim 3, applied }t-1\text{ times)}\]
\[\abs{vP}_2=\sqrt{\sum_{i\in B}v_i^2}=\sqrt{\mu N\cdot\nf1{N^2}}=\sqrt{\nf\mu N}\]
\[\text{(prev.)}\leq\sqrt{\mu N}\sqrt{\nf\mu N}\blr{\mu+\la(1-\mu)}^{t-1}=\mu\blr{\mu+\la(1-\mu)}^{t-1}\leq\blr{\mu+\la(1-\mu)}^t\]
Proof 3:
\[\Gm=(1-\la)J+\la E\]
\[\abs{vP\Gm P}_2=\abs{vP\blr{(1-\la)J+\la E}P}_2\]
\[=\abs{(1-\la)vPJP+\la vPEP}_2\]
\[\leq(1-\la)\abs{vPJP}_2+\la\abs{vPEP}_2\]
\[\leq(1-\la)\mu\abs v_2+\la\abs v_2=\blr{\mu+\la(1-\mu)}\abs v_2\]
Theorem: Given a graph \(G\) with second eigenvalue at most \(\la\), an indicator function \(f:[N]\to\2\), and a random walk \(v_1,\hh,v_t\): \[\Pr\lr{\abs{\f1t\sum_if(v_i)-\mu(f)}\geq\ep+\la}\leq2^{-\Oa(t\ep^2)}.\]
(did not pay attention to proof)
\(R_5\): Graph Operations, Explicit Expander Constructions, Undirected S-T Connectivity \(\in\cx L\)
GraphOps
- Squaring -- two-hop all edges -- \((N,D,\gm)^2\to(N,D^2,1-(1-\gm)^2=(2-\gm)\gm)\) -- better expansion, worse degree
- Tensoring -- tensoring adjacency matrices -- \((G_1\ox G_2)\to(N_1N_2,D_1D_2,\min\s{\gm_1,\gm_2})\)
- Zig-Zag -- \((N_1,D_1,\gm_1)(Z)(D_1,D_2,\gm_2)\to(N_1D_1,D_2^2,\gm_1\gm_2^2)\) -- slightly bigger graph, much smaller degree (now independent of \(D_1\)), somewhat worse expansion
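Squaring compounds the gap quickly, since the deficiency \(1-\gm\) itself squares each time; a two-line numeric check (the starting gap is an illustrative number of mine):

```python
# Each squaring maps spectral gap g to 1 - (1 - g)^2 = (2 - g)g, so k
# squarings give gap 1 - (1 - g)^(2^k): even g = 0.1 passes 1/2 after three.
g = 0.1
for _ in range(3):
    g = 1 - (1 - g) ** 2
```

This is the engine of the iterated square-then-zig-zag expander construction: squaring repairs the expansion that zig-zagging degrades, while zig-zagging repairs the degree that squaring degrades.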
\(T_6\): Codes
Basic Defs
- Encoding: \(\Enc:\2^n\to\Sg^{\hat n}\)
- \(e\dq\Enc(\2^n)\)
- Parameters:
- minimize (maybe to 2?) alphabet size \(q=\abs\Sg\)
- maximize decoding distance \(\da\)
- maximize rate \(\rho=\f{\log\abs e}{\hat n\log\abs\Sg}\leq1\)
- minimize list size \(L\) -- ideally \(1,\;O(1),\;\poly(n)\)
- computational complexity
- Would like: \(\fa m,\fa r\in B(\Enc(m),\da),\Dec(r)=m\) -- a \(\da\)-decoding algorithm
- \(\da\leq\f{\text{min-dist}(e)}2\leq\f12\)
- List decoding \(\Dec:\Sg^{\hat n}\to[N]^L\) -- a \((\da,L)\) list-decoder; possible iff \(\fa r,\abs{B(r,\da)\cap e}\leq L\).
- Often, in practice,
- Also can just take the nearest codeword (minimum-distance decoding, i.e. maximum-likelihood decoding for symmetric noise)
- Interactive coding -- codes that work well for an interactive protocol of communication
- Adversarial corruption of a constant \(\da\)-fraction of the total bits sent
- Usually involves communicating some state about earlier messages later in the conversation
- see: Brakerski-Kalai-Naor, Haeupler
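Unique decoding within half the minimum distance, on the smallest possible example (3-fold repetition code: distance 3, so any single error per block is corrected; a toy sketch):

```python
# 3-fold repetition code over {0,1}: Enc repeats each bit three times, and
# Dec takes a majority vote in each block, correcting up to 1 flipped symbol
# per block (delta-decoding with delta = min-dist/2).
def enc(msg):
    return [b for b in msg for _ in range(3)]

def dec(word):
    return [int(sum(word[3 * i:3 * i + 3]) >= 2) for i in range(len(word) // 3)]
```

Of course, the rate \(\rho=1/3\) here is terrible; the whole subject is about doing better on all the parameters at once.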
Existential Bounds
- Unique decoding: to achieve min-dist \(\da\), greedily pick codewords s.t. \(c_{i+1}\nin\UN_{j=1}^iB(c_j,\da)\) -- gives \(N\geq\f{q^{\hat n}}{\abs{B(x,\da)}}\qd q^{\hat n(1-H_q(\da,\hat n))}\)
- \(\rho=1-H_q(\da,\hat n)\)
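The entropy bound is easy to evaluate numerically; a quick sketch for the binary case \(q=2\) (the \(\da=0.11\) example is my choice):

```python
import math

# Gilbert-Varshamov: the greedy packing above gives rate rho >= 1 - H2(delta)
# for binary codes, where H2 is the binary entropy function.
def H2(p):
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

rho = 1 - H2(0.11)  # relative min-distance ~0.11 still allows rate ~1/2
```

So there exist binary codes of rate about \(1/2\) with relative distance \(0.11\), hence uniquely decodable from a \(0.055\)-fraction of errors.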