CS 225 (Pseudorandomness)

These are my lecture notes from CS 225: Pseudorandomness (Canvas, Piazza), taught by Salil Vadhan at Harvard. I am an enrolled student, and these notes are in no way official; they're reproduced by me from Prof. Vadhan's lectures in Spring 2015, and are licensed Creative Commons Attribution-NonCommercial-ShareAlike 4.0, which supersedes any other site-wide licenses.

$R_1$: Administrivia, Introductions

"Given" two "polynomials" $f(x_1,\ldots,x_n)$ and $g(x_1,\ldots,x_n)$, is $f=g$?

$f(x_1,\ldots,x_n)=\sum_{i_1,\ldots,i_n}^{\text{finite}}\lr{c_{i_1,\ldots,i_n}}x_1^{i_1}x_2^{i_2}\ldots x_n^{i_n}$

• This problem isn't very interesting if we're given a list of coefficients.
• Just check equality over all $c$.
• More interestingly, maybe we're given an oracle evaluating $f$ and $g$.
• Or maybe an arithmetic formulation or circuit (e.g. $(7(x_1+2x_2^3)(x_3-x_4)+x_6)$).
• This isn't conceptually hard, but blows up exponentially, and we don't know a better way to do it than just FOILing it out.
• This topic was covered better in CS223, an hour later.

$T_2$: Randomized Complexity Classes

Polynomial Identity Testing

• Problem: Given polynomials $f,g$ over a field $\lF$, is $f=g$?
• Algorithm:
• Pick $\al_1,\hh,\al_n\otr S\sbq\lF$.
• Accept if $f\lr{\al_1,\hh,\al_n}=g\lr{\al_1,\hh,\al_n}$; reject otherwise.
• Schwartz-Zippel: $f\neq g\im\Pr_{\al_1,\hh,\al_n}\lr{f\lr{\al_1,\hh,\al_n}=g\lr{\al_1,\hh,\al_n}}\leq\f d{\abs S}$, where $d$ is the total degree of $f-g$
• Application: Bipartite Perfect Matching $\in\cx{RNC}$
$\cx{NC}$: polylog time on polynomially many parallel processors.
• $\det\arye{\dd&&\hh\\&\pcwz{x_{ij}\pif(i,j)\in E\\0\pelse}&\\\hh&&\dd}\neq0\mim(V,E)\text{ has a perfect matching}$
• Application: $\px{AFIT}_\Z,\px{ACIT}_\Z\in\co{RP}$
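The PIT algorithm above is easy to sketch in code. Here is a toy Python version (function names and parameters are my own), assuming $f$ and $g$ are given as black-box evaluators and taking $S=\Z_p$ for a large prime $p$, so the Schwartz-Zippel failure probability per trial is at most $d/p$:

```python
import random

random.seed(225)

def pit(f, g, n, p=10**9 + 7, trials=20):
    """Randomized polynomial identity test: evaluate the black boxes f, g
    at random points of Z_p^n.  If f != g as polynomials of total degree
    <= d, each trial detects the difference w.p. >= 1 - d/p (Schwartz-Zippel)."""
    for _ in range(trials):
        alphas = [random.randrange(p) for _ in range(n)]
        if f(*alphas) % p != g(*alphas) % p:
            return False  # a witness point: definitely not equal
    return True  # equal with high probability

# (x+y)^2 vs x^2 + 2xy + y^2 -- identical polynomials
f = lambda x, y: (x + y) ** 2
g = lambda x, y: x * x + 2 * x * y + y * y
assert pit(f, g, 2)
# (x+y)^2 vs x^2 + y^2 -- differ whenever 2xy != 0 mod p
h = lambda x, y: x * x + y * y
assert not pit(f, h, 2)
```

Note that the test never expands $(x+y)^2$ into coefficients; this is exactly what makes the method useful for arithmetic circuits, where expansion blows up exponentially.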

Randomized Poly Time (classes)

| $x\in L\im\Pr\blr{A(x)=1}\geq$ | $x\nin L\im\Pr\blr{A(x)=1}\leq$ | Class |
|---|---|---|
| $\nf12$ | $0$ | $\cx{RP}$ |
| $\nf14$ | $0$ | $\cx{RP}$ |
| $1$ | $\nf12$ | $\co{RP}$ |
| $1-2^{-n}$ | $0$ | $\cx{RP}$ |
| $2^{-n}$ | $0$ | $(\sbq\px{SAT})$ |
| $\nf1{\poly(n)}$ | $0$ | $\cx{RP}$ |
| $\nf34$ | $\nf14$ | $\cx{BPP}$ |
| $1-2^{-n}$ | $2^{-n}$ | $\cx{BPP}$ |
| $1$ | $0$ | $\cx{ZPP}$ (zero error; only *expected* polytime) |

(The different thresholds within each class give equivalent definitions, by amplification.)
• Chernoff bound: for $X$ the average of $t$ i.i.d. samples with mean $\mu$, $\Pr\blr{\abs{X-\mu}>\ep}\leq2\Exp{-\Oa\lr{t\ep^2}}$.
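The amplification behind the table can be simulated: repeat a two-sided-error decider $t$ times and take the majority, and the Chernoff bound drives the error to $2^{-\Oa(t)}$. A hedged Python toy (the decider and all names are mine, not from lecture):

```python
import random

def amplify(noisy_decider, x, t):
    """Majority vote over t independent runs of a two-sided-error decider;
    by the Chernoff bound the error probability drops to 2^(-Omega(t))."""
    votes = sum(noisy_decider(x) for _ in range(t))
    return votes > t / 2

def noisy_decider(x):
    """Hypothetical BPP-style decider for 'x is even', correct w.p. 3/4."""
    truth = (x % 2 == 0)
    return truth if random.random() < 0.75 else not truth

random.seed(0)
# A single run errs ~25% of the time; a 101-vote majority errs ~never.
answers = [amplify(noisy_decider, 10, 101) for _ in range(200)]
assert all(answers)  # 10 is even, so the amplified decider always says yes
```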

Boolean Circuits

• Gates: $\land,\lor,\lnot$.
• "uniformity" enforced
• NC: boolean circuits of polylog depth and poly size
• Problem: estimate average value to within $\pm\ep$ w.p. $\geq1-\da$.
• Random sampling: $t=O\lr{\f{-\log\da}{\ep^2}}$.
• In the oracle-function sense, randomization is necessary for efficiency: a deterministic algorithm must query nearly all of the domain, since an adversary can set every unqueried point against it.

$R_2$: Sampling and Approximation

Circuit Approximation

Recap: Given oracle access to $f:\2^m\to[0,1]$, we can estimate $\mu(f)$ to within $\pm\ep$ w.p. $\geq1-\da$ using $O\lr{\f{-\log\da}{\ep^2}}$ queries and time $\poly(m,-\log\da,1/\ep)$; a deterministic algorithm needs $\Oa\lr{2^m}$ queries -- in the worst case, every point it queries is 0 and every point it skips is 1.
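The sample complexity in the recap is concrete enough to run. A minimal Python sketch (names mine; the constant in $t$ is one valid choice from the Chernoff bound):

```python
import math
import random

def estimate_mean(f, m, eps, delta):
    """Estimate mu(f), the average of f over {0,1}^m, to within +-eps
    w.p. >= 1 - delta, using t = O(log(1/delta)/eps^2) random queries."""
    t = math.ceil(2 * math.log(2 / delta) / eps**2)
    total = sum(f(tuple(random.randrange(2) for _ in range(m)))
                for _ in range(t))
    return total / t

random.seed(225)
# Majority on 15 bits has mu exactly 1/2 (odd fan-in, so no ties).
maj = lambda x: int(sum(x) > len(x) / 2)
est = estimate_mean(maj, 15, 0.05, 0.01)
assert abs(est - 0.5) <= 0.05
```

Note the query count is independent of $2^m$, which is the whole point of the randomized estimator.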

• Can turn estimation into promise-decision problem:
• $\px{CA}_Y^\ep=\s{(C,p):\mu(C)>p+\ep}$
• $\px{CA}_N^\ep=\s{(C,p):\mu(C)\lt p}$
• The $\ep$ is important; in the case $\ep=0$, the problem is $\cx{PP}$-complete.
• Claim: $\px{CA}^\ep$ is $\pr{BPP}$-complete.
• In $\pr{BPP}$.
• Is $\pr{BPP}$-hard: Construct a circuit which runs algorithm $A$ (from a particular random string); then sample with $p=1/2$.
• Open: "natural" $\pr{BPP}$-complete problems
• $p=0\im\pr{RP}$-complete
• $p=\ep=0\im\pr{NP}$-complete

Relative Circuit Approximation

• Estimate $\mu(C)$ to within a $1+\ep$ multiplicative factor?
• This is $\cx{NP}$-hard -- a relative approximation distinguishes $\mu(C)=0$ from $\mu(C)>0$, i.e. decides satisfiability.

Approximate Counting

• The above means that when determining nonzeroness is $\cx{NP}$-hard, we shouldn't expect relative approximation to be easy.
• #DNF: (I kinda blanked on this proof.)
• Q: Can every CSP (constraint satisfaction problem) in $\cx{P}$ be approximately counted? A: No, there are some that are NP-hard.

S-T Connectivity

| | Space | Time |
|---|---|---|
| DFS/BFS | $O(n)$ | $O(n)$ |
| Recursive (Savitch) | $O\lr{\log^2n}$ | $n^{O(\log n)}$ |
| Random walk (undirected) | $O(\log n)$ | $\poly(n)$ |

Theorem: If $s,t$ are in the same connected component of an undirected graph, then w.h.p. a random walk of length $\poly(n)$ from $s$ will visit $t$ (proof by bounding eigenvalues).
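The space advantage is visible even in a toy implementation: the walk only remembers the current vertex and a step counter, $O(\log n)$ bits total. A Python sketch (names and the generous step budget are mine):

```python
import random

def rw_connected(adj, s, t, steps):
    """Randomized undirected s-t connectivity: walk from s for `steps`
    steps; state is just the current vertex (O(log n) bits).  A poly(n)
    walk length suffices w.h.p. within a connected component."""
    v = s
    for _ in range(steps):
        if v == t:
            return True
        v = random.choice(adj[v])
    return v == t

random.seed(0)
# Path 0-1-2-3-4, plus a separate component {5, 6}.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3], 5: [6], 6: [5]}
assert rw_connected(adj, 0, 4, steps=10_000)      # same component
assert not rw_connected(adj, 0, 5, steps=10_000)  # never reachable
```

One-sided error, as expected for an $\cx{RP}$-style algorithm: "connected" answers are always correct; "not connected" may be wrong, but only with tiny probability at this walk length.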

$T_4$: $\cx P=\cx{NP}\im\cx{BPP}=\cx P$, Approx. Max. $k\text-\px{Sat}$, Pairwise Independence

Derandomization (given $P=NP$)

Theorem: $\cx P=\cx{NP}\im\cx{BPP}=\cx P$.

Pf: Given $L\in\cx{BPP}$, $\ex$ polytime predicate $P$ s.t. $x\in L\mim(\ex y)(\fa z)\,P(x,y,z)$ (i.e. $\cx{BPP}\sbq\Sg_2$); if $\cx P=\cx{NP}$, the inner $\fa$ quantifier is decidable in $\cx P$, and then so is the outer $\ex$, giving $L\in\cx P$.

$\cx{BPP}$ algorithm $A(x;r)$, $r\in\2^m$, for $L$, amplified to err. prob. $<2^{-m}$ (write $A_x\dq\s{r:A(x;r)=1}$):

• $x\in L\im\ex s_1,\hh,s_m\in\2^m:\UN_{i=1}^m(A_x+s_i)=\2^m$. Pf: probabilistic method over random shifts.
• $x\nin L\im\fa s_1,\hh,s_m\in\2^m,\UN_{i=1}^m(A_x+s_i)\neq\2^m$. Pf: union bound over the (exp. small) shifted accept-sets $A_x+s_i$.
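The shifting trick can be checked by brute force in a tiny universe. A Python toy (all names mine), with a dense set standing in for $A_x$ when $x\in L$ and an exponentially small one for $x\nin L$:

```python
import random

m = 10
U = range(2 ** m)

def shifts_cover(A, shifts):
    """Does the union of the XOR-shifts A + s_i cover all of {0,1}^m?"""
    return all(any(x ^ s in A for s in shifts) for x in U)

random.seed(0)
# x in L: accept-set of density 1 - 2^-m; m random shifts cover w.h.p.
big = set(U) - {3}
assert shifts_cover(big, [random.randrange(2 ** m) for _ in range(m)])
# x not in L: accept-set of density 2^-m; m shifts cover at most
# m * |A| << 2^m points, so NO choice of shifts can work.
small = {3}
assert not shifts_cover(small, [random.randrange(2 ** m) for _ in range(m)])
```

This mirrors the proof exactly: the $x\in L$ direction is probabilistic (it could fail, with vanishing probability), while the $x\nin L$ direction is a counting certainty.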

Approx. Max. $k\text-\px{Sat}$

Given: $\phi(x_1,\hh,x_n)=\phi_1\land\hh\land\phi_m$, where $\phi_i$ are disjunctions of exactly $k$ literals. Goal: Find $\al\in\2^n$ satisfying as many clauses as you can.

Claim: $\ex$ a randomized algorithm satisfying $\geq(1-2^{-k})m$ clauses in expectation (set each variable uniformly at random).

Construction: Method of conditional expectations. Assign $r_1,\hh$ in sequence: given $r_1,\hh,r_{i-1}$, select $r_i$ to maximize $e(r_1,\hh,r_i)=E\left[\#\text{clauses satisfied}\middle|(R_j=r_j)_{j=1}^i\right]$, where each already-satisfied clause contributes $1$ and each not-yet-falsified clause contributes $1-2^{-\#\text{unassigned literals remaining}}$.
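The greedy derandomization is short enough to write out. A Python sketch (representation and names are mine: a clause is a list of `(variable, is_positive)` literals):

```python
def expected_sat(clauses, assignment):
    """E[#satisfied clauses] with the variables in `assignment` fixed and
    the rest uniformly random: a satisfied clause counts 1, an unfalsified
    clause with u unassigned literals counts 1 - 2^-u."""
    total = 0.0
    for clause in clauses:
        satisfied = any(v in assignment and assignment[v] == pos
                        for v, pos in clause)
        unassigned = sum(v not in assignment for v, _ in clause)
        total += 1 if satisfied else 1 - 2.0 ** -unassigned
    return total

def derandomized_max_sat(clauses, n):
    """Method of conditional expectations: fix x_0, x_1, ... one at a time,
    never letting the conditional expectation decrease."""
    assignment = {}
    for v in range(n):
        assignment[v] = True
        e_true = expected_sat(clauses, assignment)
        assignment[v] = False
        e_false = expected_sat(clauses, assignment)
        assignment[v] = e_true >= e_false
    return assignment

clauses = [[(0, True), (1, True), (2, True)],
           [(0, False), (1, True), (3, False)],
           [(1, False), (2, False), (3, True)]]
a = derandomized_max_sat(clauses, 4)
num_sat = sum(any(a[v] == pos for v, pos in c) for c in clauses)
assert num_sat >= (1 - 2 ** -3) * len(clauses)  # >= (7/8) m, so all 3 here
```

Since the initial expectation is $(1-2^{-k})m$ and each greedy step preserves it, the final (deterministic) assignment meets the randomized guarantee.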

Alternatively, use $k$-wise independent distributions on $R_1,\hh,R_n$ generated from $d\ll n$ truly random bits; we can then derandomize by enumerating all $2^d$ seeds, a $2^d$-factor slowdown.
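For intuition, here is the classic pairwise-independent ($k=2$) generator, sketched in Python (my own toy): from $d$ truly random seed bits, output one bit per nonempty subset $S$, namely $r_S=\bigoplus_{i\in S}s_i$, giving $2^d-1$ unbiased pairwise-independent bits.

```python
import itertools

def pairwise_bits(seed_bits):
    """Stretch d truly random bits into 2^d - 1 pairwise-independent bits:
    r_S = XOR of the seed bits indexed by each nonempty subset S."""
    d = len(seed_bits)
    out = []
    for mask in range(1, 2 ** d):
        bit = 0
        for i in range(d):
            if mask >> i & 1:
                bit ^= seed_bits[i]
        out.append(bit)
    return out

# Exhaustive check of pairwise independence: over all 2^d seeds, every
# pair (r_i, r_j), i != j, takes each of the 4 values exactly 2^d / 4 times.
d = 3
samples = [pairwise_bits([s >> i & 1 for i in range(d)]) for s in range(2 ** d)]
for i, j in itertools.combinations(range(2 ** d - 1), 2):
    counts = {}
    for r in samples:
        counts[(r[i], r[j])] = counts.get((r[i], r[j]), 0) + 1
    assert all(c == 2 ** d // 4 for c in counts.values())
```

Enumerating all $2^d$ seeds, as the loop above does, is exactly the derandomization step: $d=O(\log n)$ seed bits make the enumeration polynomial.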

(missed $R_4$ -- expander graphs)

$T_5$: More Expanders

Mixing property: the endpoint of a walk of length $O(\log n)$ is approximately Uniform. If the starting distribution is Uniform, every step of the walk is Uniform (though the steps are not i.i.d.).

Hitting property: Let $V_1,\hh,V_t$ be a random walk on an expander with spectral expansion $1-\la$. If $B$ is a set of vertices of density $\mu$, then $\Pr(\fa i,V_i\in B)\leq\blr{\mu+\la(1-\mu)}^t$.
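The bound can be sanity-checked empirically. A Python toy (graph, parameters, and names all mine): estimate the probability that a walk on the 50-cycle stays inside a half-density set, and compare against $(\mu+\la(1-\mu))^t$ with $\la=\cos(2\pi/50)\approx0.992$, the cycle's second eigenvalue.

```python
import random

def walk_in_B_prob(adj, B, t, trials=20000):
    """Empirical Pr[a whole t-step walk stays inside B], starting Uniform."""
    verts = list(adj)
    stuck = 0
    for _ in range(trials):
        v = random.choice(verts)
        ok = v in B
        for _ in range(t - 1):
            if not ok:
                break
            v = random.choice(adj[v])
            ok = v in B
        stuck += ok
    return stuck / trials

random.seed(0)
n = 50
cycle = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
B = set(range(25))   # density mu = 1/2
mu, lam, t = 0.5, 0.992, 10
assert walk_in_B_prob(cycle, B, t) <= (mu + lam * (1 - mu)) ** t
```

The cycle is a poor expander ($\la$ close to 1), so the bound here is weak but still valid; on a genuine expander ($\la$ bounded away from 1) the right-hand side decays exponentially in $t$.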

• Application: cheaply reducing the error of $\cx{RP}$ algorithms:
| | Iterations | Random bits |
|---|---|---|
| Naive | $O(k)$ | $km$ |
| Pairwise | $2^{O(k)}$ | $m$ |
| Expander | $O(k)$ | $m+O(k)$ |

...when trying to boost a $m$-bit algorithm to error probability $2^{-k}$.

Claim 1 (Hitting Theorem): $G$ regular with spectral expansion $1-\la$ and random-walk matrix $\Gm$$\im$$\Gm=(1-\la)J+\la E$, where $J$ is the uniform matrix ($J_{ij}=\nf1N$) and $\fa v,\abs{vE}_2\leq\abs v_2$ (think: completely Uniform w.p. $1-\la$, not getting worse w.p. $\la$)

Proof 1 (partial):
$E\dq\f1\la\blr{\Gm-(1-\la)J}$. Decompose $v=v^\parallel+v^\perp$ into components along and orthogonal to uniform; note $v^\parallel E=v^\parallel$ (since $v^\parallel\Gm=v^\parallel J=v^\parallel$). By orthogonality:
$\abs{vE}_2^2=\abs{v^\parallel E}^2+\abs{v^\perp E}^2$
$=\abs{v^\parallel}^2+\abs{\f1\la\lr{v^\perp\Gm-(1-\la)v^\perp J}}^2$
$=\abs{v^\parallel}^2+\abs{\f1\la v^\perp\Gm}^2$ (since $v^\perp J=0$)
$\leq\abs{v^\parallel}^2+\abs{v^\perp}^2=\abs v^2$ (spectral expansion: $\abs{v^\perp\Gm}_2\leq\la\abs{v^\perp}_2$)

To LinAlg-ize the hitting property, let $P$ be the projector
$P_{i,i}=\pcwz{1\pif i\in B\\0\pelse};$ then Claim 2: $\Pr(\fa i,V_i\in B)=\abs{vP(\Gm P)^{t-1}}_1=\abs{vP(P\Gm P)^{t-1}}_1$ for $v$ the uniform distribution (definitionally true, using $P^2=P$).

Claim 3: $\fa v,\abs{vP\Gm P}_2\leq\blr{\mu+\la(1-\mu)}\abs v_2$

Proof 1 (given 2,3):
$\abs{vP(P\Gm P)^{t-1}}_1\leq\sqrt{\mu N}\,\abs{vP(P\Gm P)^{t-1}}_2$ (Cauchy-Schwarz; the support has size $\mu N$)
$\leq\sqrt{\mu N}\,\abs{vP}_2\blr{\mu+\la(1-\mu)}^{t-1}$ (Claim 3, applied $t-1$ times)
$\abs{vP}_2=\sqrt{\textstyle\sum_{i\in B}v_i^2}=\sqrt{\mu N\cdot\nf1{N^2}}=\sqrt{\nf\mu N}$ (for $v$ uniform)
$\im\Pr\leq\sqrt{\mu N}\sqrt{\nf\mu N}\blr{\mu+\la(1-\mu)}^{t-1}=\mu\blr{\mu+\la(1-\mu)}^{t-1}\leq\blr{\mu+\la(1-\mu)}^t$

Proof 3:
$\abs{vP\Gm P}_2=\abs{vP\blr{(1-\la)J+\la E}P}_2$
$\leq(1-\la)\abs{vPJP}_2+\la\abs{vPEP}_2$ (triangle inequality)
$\leq(1-\la)\mu\abs v_2+\la\abs v_2$ (using $\abs{vPJP}_2\leq\mu\abs v_2$ and $\abs{vPEP}_2\leq\abs v_2$)
$=\blr{\mu+\la(1-\mu)}\abs v_2$

Theorem: Given: graph $G$ a $\la$-spectral expander, $f:[N]\to\2$ an indicator function, $v_1,\hh,v_t$ a random walk: \[\Pr\lr{\abs{\f1t\sum_if(v_i)-\mu(f)}\geq\ep+\la}\leq2^{-\Oa(t\ep^2)}.\]

(did not pay attention to proof)

$R_5$: Graph Operations, Explicit Expander Constructions, Undirected S-T Connectivity $\in\cx L$

GraphOps

• Squaring -- two-hop all edges -- $(N,D,\gm)^2\to(N,D^2,1-(1-\gm)^2=(2-\gm)\gm)$ -- better expansion, worse degree
• Tensoring -- tensoring adjacency matrices -- $(G_1\ox G_2)\to(N_1N_2,D_1D_2,\min\s{\gm_1,\gm_2})$
• Zig-Zag -- $(N_1,D_1,\gm_1)(Z)(D_1,D_2,\gm_2)\to(N_1D_1,D_2^2,\gm_1\gm_2)$ -- slightly bigger graph, much smaller degree ($D_2^2$, independent of $D_1$), slightly worse expansion
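The squaring row can be verified numerically. A pure-Python power-iteration toy (my own, not from section): estimate $\la=1-\gm$, the second-largest eigenvalue in absolute value of the random-walk matrix, on the 5-cycle, and check that squaring the graph squares $\la$.

```python
def mat_vec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def second_eig(M, iters=500):
    """Estimate lambda = second-largest |eigenvalue| of a random-walk matrix
    M by power iteration on the space orthogonal to the uniform vector."""
    n = len(M)
    def project(u):                    # remove the uniform component
        mean = sum(u) / n
        return [x - mean for x in u]
    def norm(u):
        return sum(x * x for x in u) ** 0.5
    v = project([(-1.0) ** i + 0.1 * i for i in range(n)])
    nv = norm(v)
    v = [x / nv for x in v]
    lam = 0.0
    for _ in range(iters):
        w = project(mat_vec(M, v))
        lam = norm(w)                  # |v| = 1, so this estimates |lambda|
        v = [x / lam for x in w]
    return lam

# 5-cycle random-walk matrix; lambda = cos(pi/5) ~ 0.809.
n = 5
M = [[0.5 if (j - i) % n in (1, n - 1) else 0.0 for j in range(n)]
     for i in range(n)]
lam = second_eig(M)
M2 = [[sum(M[i][k] * M[k][j] for k in range(n)) for j in range(n)]
      for i in range(n)]
assert abs(second_eig(M2) - lam ** 2) < 1e-6  # squaring squares lambda
```

In gap terms this is exactly the list's $\gm\mapsto1-(1-\gm)^2$: better expansion, at the price of degree $D^2$.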

$T_6$: Codes

Basic Defs

• Encoding: $\Enc:\2^n\to\Sg^{\hat n}$
• $e\dq\Enc(\2^n)$
• Parameters:
• minimize (maybe to 2?) alphabet size $q=\abs\Sg$
• maximize decoding distance $\da$
• maximize rate $\rho=\f{\log\abs e}{\hat n\log\abs\Sg}\leq1$
• minimize list size $L$ -- ideally $1,\;O(1),\;\poly(n)$
• computational complexity
• Would like: $\fa m,\fa r\in B(\Enc,\da),\Dec(r)=m$ -- a $\da$-decoding algorithm
• $\da\leq\f{\text{min-dist}(e)}2\leq\f12$
• List decoding: $\Dec:\Sg^{\hat n}\to[N]^L$ -- a $(\da,L)$ list-decoder; possible iff $\fa r,\abs{B(r,\da)\cap e}\leq L$.
• Often, in practice,
• Also can just take the closest codeword (least-distance decoding, maximum-likelihood decoding)
• Interactive coding -- codes that work well for an interactive protocol of communication
• Adversarial corruption of a constant $\da$-fraction of the total bits sent
• Usually involves communicating some state about earlier messages later in the conversation
• see: Brakerski-Kalai-Naor, Haeupler

Existential Bounds

• Unique decoding: to achieve min-dist $\da$, greedily pick codewords s.t. $c_{i+1}\nin\Un_{j=1}^iB(c_j,\da)$ -- gives $N\geq\f{q^{\hat n}}{\abs{B(x,\da)}}\qd q^{\hat n(1-H_q(\da,\hat n))}$
• $\rho\geq1-H_q(\da,\hat n)$
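The greedy existential argument runs as stated in a small binary case. A Python toy (names mine; here distance is absolute, $d=3$ on $\hat n=7$ bits, rather than the relative $\da$ of the notes):

```python
import itertools
import math

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def greedy_code(n_hat, d):
    """Greedily pick binary codewords with pairwise Hamming distance >= d.
    Volume bound: when greedy stops, balls of radius d-1 around the chosen
    words cover {0,1}^n_hat, so |code| >= 2^n_hat / |Ball(d-1)|."""
    code = []
    for w in itertools.product((0, 1), repeat=n_hat):
        if all(hamming(w, c) >= d for c in code):
            code.append(w)
    return code

code = greedy_code(7, 3)
ball = sum(math.comb(7, i) for i in range(3))   # |Ball(2)| = 1 + 7 + 21 = 29
assert len(code) >= 2 ** 7 / ball               # >= 128/29, i.e. >= 5 words
assert all(hamming(a, b) >= 3 for a, b in itertools.combinations(code, 2))
```

Replacing $\abs{B(x,\da)}$ by its entropy estimate $q^{\hat nH_q(\da,\hat n)}$ turns this counting bound into the stated rate $\rho\geq1-H_q(\da,\hat n)$.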