CS 225 (Pseudorandomness)

29 Jan 2015

These are my lecture notes from CS 225: Pseudorandomness (Canvas, Piazza), taught by Salil Vadhan at Harvard. I am an enrolled student, and these notes are in no way official; they're reproduced by me from Prof. Vadhan's lectures in Spring 2015, and are licensed Creative Commons Attribution-NonCommercial-ShareAlike 4.0, which supercedes any other site-wide licenses.

\(R_1\): Administrivia, Introductions

"Given" two "polynomials" \(f(x_1,\ldots,x_n)\) and \(g(x_1,\ldots,x_n)\), is \(f=g\)?

\[f(x_1,\ldots,x_n)=\sum_{i_1,\ldots,i_n}^{\text{finite}}\lr{c_{i_1,\ldots,i_n}}x_1^{i_1}x_2^{i_2}\ldots x_n^{i_n}\]

This problem isn't very interesting if we're given a list of coefficients.
Just check equality over all \(c\).
More interestingly, maybe we're given an oracle evaluating \(f\) and \(g\).
Or maybe an arithmetic formulation or circuit (e.g. \((7(x_1+2x_2^3)(x_3-x_4)+x_6)\)).
This isn't conceptually hard, but blows up exponentially, and we don't know a better way to do it than just FOILing it out.
This topic was covered better in CS223, an hour later.

\(T_2\): Randomized Complexity Classes

Polynomial Identity Testing

Problem: Given polynomials \(f,g\) over a field \(\lF\), is \(f=g\)?
Algorithm:
Pick \(\al_1,\hh,\al_n\otr S\sbq\lF\).
Accept if \(f\lr{\al_1,\hh,\al_n}=g\lr{\al_1,\hh,\al_n}\); reject otherwise.
Schwartz-Zippel: \\(f\neq g\im\Pr_{\al\_1,\hh,\al\_n}\lr{f\lr{\al\_1,\hh,\al\_n}=g\lr{\al\_1,\hh,\al\_n}}\leq\f d{\abs S}\\)
Application: Bipartite Perfect Matching \(\in\cx{RNC}\)

\\(\cx{NC}\\): Polylog time in poly# parallel processors.

* \\(\det\arye{\dd&&\hh\\\\&\pcwz{x\_{ij}\pif(i,j)\in E\\\\0\pelse}&\\\\\hh&&\dd}\neq0\mim(V,E)\text{ has a perfect matching}\\) * **Application:** \\(\px{AFIT}\_\Z,\px{ACIT}\_\Z\in\co{RP}\\)

Randomized Poly Time (classes)

\(x\in L\im\Pr\blr{A(x)=1}\geq\)	\(\nf12\)	\(\nf14\)	\(1\)	\(1-2^{-n}\)	\(2^{-n}\)	\(\nf1{\poly(n)}\)	\(\nf34\)	\(1-2^{-n}\)	\(1\)
\(x\nin L\im\Pr\blr{A(x)=1}\leq\)	\(0\)	\(0\)	\(\nf12\)	\(0\)	\(0\)	\(0\)	\(\nf14\)	\(2^{-n}\)	\(0\)
	\(\cx{RP}\)	\(\cx{RP}\)	\(\co{RP}\)	\((\sbq\px{SAT})\)	\(\cx{RP}\)	\(\cx{BPP}\)	\(\cx{BPP}\)	\(\cx{BPP}\)	\(\cx{ZPP}\) (only exp. polytime)

* Chernoff bound: \\(\Pr\blr{\abs{X-\mu}>\ep}\leq2\Exp{-\Oa\lr{t\ep^2}}\\).

Boolean Circuits

Gates: \(\land,\lor,\lnot\).
"uniformity" enforced
NC: boolean circuits of polylog depth and poly size
Problem: estimate average value to within \(\pm\ep\) w.p. \(\geq1-\da\).
Random sampling: \(t=O\lr{\f{-\log\da}{\ep^2}}\).
In the oracle-function sense, we need randomization for an efficient (any better than linear-less-than-total) algorithm.

\(R_2\): Sampling and Approximation

Circuit Approximation

Recap: Given oracle access to \(f:\2^m\to[0,1]\), can estimate \(\mu(f)\) to within \(\pm\ep\) w.p. \(\geq1-\da\) using \(O\lr{\f{-\log\da}{\ep^2}}\) queries and time \(\poly(m,-\log\da,1/\ep))\); deterministic alg. will need \(\Oa\lr{2^m}\) queries -- suppose the worst case where you only sample 0s and everything else is 1s.

Can turn estimation into promise-decision problem:
\(\px{CA}_Y^\ep=\s{(C,p):\mu(C)>p+\ep}\)
\(\px{CA}_N^\ep=\s{(C,p):\mu(C)\lt p}\)
The \(\ep\) is important; in the case \(\ep=0\), the problem is \(\cx{PP}\)-complete.
Claim: \(\px{CA}^\ep\) is \(\pr{BPP}\)-complete.
In \(\pr{BPP}\).
Is \(\pr{BPP}\)-hard: Construct a circuit which runs algorithm \(A\) (from a particular random string); then sample with \(p=1/2\).
Open: "natural" \(\pr{BPP}\)-complete problems
\(p=0\im\pr{RP}\)-complete
\(p=\ep=0\im\pr{NP}\)-complete

Relative Circuit Approximation

Estimate \(\mu(C)\) to within a \(1+\ep\) multiplicative factor?
\(\cx{NP}\) -- can differentiate 0 vs. nonzero.

Approximate Counting

Above means when determining nonzeroness is \(\cx{NP}\)-hard, we don't expect approximation to be easy.
DNF: (I kinda blanked on this proof.)
Q: Can every CSP (constraint satisfaction problem) in \(\cx{P}\) be approximately counted? A: No, there are some that are NP-hard.

S-T Connectivity

	Space	Time
DFS/BFS	\(O(n)\)	\(O(n)\)
Recursive (Savitch)	\(O\lr{\log^2n}\)	\(O(n)^{\log n}\)
Random Walk (in undirected)	\(O(\log n\)	\(\poly(n)\)

Theorem: If \(s,t\) are in the same connected component of an undirected graph, whp r.w of length \(\poly(n)\) from \(s\) will visit \(t\) (pf. by bounding eigenvalues)

\(T_4\): \(\cx P=\cx{NP}\im\cx{BPP}=\cx P\), Approx. Max. \(k\text-\px{Sat}\), Pairwise Independence

Derandomization (given \(P=NP\))

Theorem: \(\cx P=\cx{NP}\im\cx{BPP}=\cx P\).

Pf: Given \(L\in\cx{BPP}\), \(\ex\) polytime algorithm \(P\) s.t. \(x\in L\mim(\ex y:\fa z)(P(x,y,z))\).

\(\cx{BPP}\) algorithm \(A(x;r)\) for \(L\) w/ err. prob. \(<2^{-m}\):

\(x\in L\im\ex s_1,\hh,s_m\in\2^m:\)\(\UN_{i=1}^m(A_x+s_i)=\2^m\) Pf: Prob. method.
\(x\nin L\im\fa s_1,\hh,s_m\in\2^m,\)\(\UN_{i=1}^m(A_x+s_i)\neq\2^m\) Pf: sum (exp. small) areas of the \(A\) accept-sets

Approx. Max. \(k\text-\px{Sat}\)

Given: \(\phi(x_1,\hh,x_n)=\phi_1\land\hh\land\phi_m\), where \(\phi_i\) are disjunctions of exactly \(k\) literals. Goal: Find \(\al\in\2^n\) satisfying as many clauses as you can.

Claim: \(\ex\) a randomized algorithm satisfying \(\geq(1-2^{-k})m clauses\).

Construction: Method of conditional expectations. Assign \(r_1,\hh\) in sequence: Given \(r_1,\hh,r_{i-1}\), select \(r_i\) to maximize \[e(r_1,\hh,r_i)=E\left[\#\text{clauses satisfied}\middle|(R_j=r_j)_{j=1}^i\right]\]\[=\sum_{j=1}^m1-2^{-\#\text{unassigned lit. rem.}}\]

Alternatively, use \(k\)-wise independent distributions on \(R_1,\hh,R_m\) using \(d\ll n\) random bits, which we can simply randomize with \(2^d\) slowdown.

(missed \(R_4\) -- expander graphs)

\(T_5\): More Expanders

Mixing property: the endpoint of a walk of \(O(\log n)\) is ~Uniform. If the starting distribution was Uniform, the whole walk is Uniform (though not quite i.i.d.)

Hitting property: If \(V_1,\hh,V_t\) is a walk on an expander with spectral expansion \(1-\la\). If \(B\) is a set of vertices of density \(\mu\), then \(\Pr(\fa i,V_i\in B)\leq\blr{\mu+\la(1-\mu)}^t\)

Application: cheaply reducing the error of \(\cx{RP}\) algorithms:

Naive	\(O(k)\) iterations	\(km\) random bits
Pairwise	\(2^{O(k)}\) iterations	\(m\) random bits
Expander	\(O(k)\) iterations	\(m+O(k)\) random bits

...when trying to boost a \\(m\\)-bit algorithm to error probability \\(2^{-k}\\).

Claim 1 (Hitting Theorem): \(G\) regular has spectral expansion \(1-\la\), adjacency matrix \(\Gm\)\(\im\)\(\Gm=(1-\la K+\la E),\fa v,\abs{vE}_2\leq\abs v_2\) (think: completely Uniform w.p. \(1-\la\), not getting worse w.p. \(\la\))

Proof 1 (partial):
\[E\dq\f1\la\blr{\Gm-(1-\la)J}\]
by orthogonality:
\[\abs{vE}_2^2=\abs{v^\parallel E+v^\perp E}^2\]
\[v=v^\parallel+v^\perp=\abs{v^\parallel E}^2+\abs{v^\perp E}^2\]
\[=\abs{v^\parallel}^2+\abs{\f12(v^\perp\Gm-(1-\la)v^\perp J)}^2\]
\[\leq\abs{v^\parallel}^2+\abs{v^\perp}^2\]
\[=\abs v^2\]

To LinAlg-ize the hitting property, let \(P\) be the projector
\[P_{i,i}=\pcwz{1\pif i\in B\\0\pelse};\]
then Claim 2: hitting \(\mim\Pr(\fa i,V_i\in B)=\abs{vP(\Gm P)^{t-1}}_1=\abs{vP(P\Gm P)^{t-1}}_1\). (definitionally true)

Claim 3: \(\fa v,\abs{vP\Gm P}_2\leq\blr{\mu+\la(1-\mu)}\abs V_2\)

Proof 1 (given 2,3):
\[\abs{vP(P\Gm P)^{t-1}\leq\sqrt{\mu N}\abs{vP(P\Gm P)^{t-1}}}\]
\[\leq\sqrt{\mu N}\abs{vP}_2\blr{\mu+\la(1-\mu)}^{t-1}\]
\[\abs v_2=\sqrt{\sum_iv_i^2}\sqrt{\mu N\nf1{N^2}}=\sqrt{\nf\mu N}\]
\[\text{(prev.)}=\mu\blr{\mu+\la(1-\mu)}^{t-1}\]

Proof 3:
\[\Gm=(1-\la)J+\la E\]
\[\abs{VP\Gm P}_2=\abs{vP\blr{(1-\la)J+\la E}P}_2\]
\[=\abs{(1-\la)vPJP+\la vPEP}_2\]
\[\leq\abs{(1-\la)vPJP}_2\abs{\la vPEP}_2\]
\[(1-\la)\abs{vPJP}_2+\la\abs v_2\]
\[=(1-\la)\mu\abs v_2+\la\abs v_2\]

Theorem: Given: graph \(G\) a \(\la\)-spectral expander, \(f:[N]\to\2\) an indicator function, \(v_1,\hh,v_t\) a random walk:[\Pr\lr{\abs{\f1t\sum_if(v_i)-\mu(f)}\geq\ep+\la}\leq2^{-\Oa(t\ep^2)}.\]

(did not pay attention to proof)

\(R_5\): Graph Operations, Explicit Expander Constructions, Undirected S-T Connectivity \(\in\cx L\)

GraphOps

Squaring -- two-hop all edges -- \((N,D,\gm)^2\to(N,D^2,1-(1-\gm)^2=(2-\gm)\gm)\) -- better expansion, worse degree
Tensoring -- tensoring adjacency matrices -- \((G_1\ox G_2)\to(N_1N_2,D_1D_2\min\s{\gm_1,\gm_2})\)
Zig-Zag -- \((N_1,D_1,\gm_1)(Z)(D_1,D_2,\gm_2)\to(N_1 D_1,D_2^2,\gm_1\gm_2)\) -- slightly bigger graph, degree slightly smaller, slightly worse expansion

\(T_6\): Codes

Basic Defs

Encoding: \(\Enc:\2^n\to\Sg^{\hat n}\)
\(e\dq\Enc(\2^n)\)
Parameters:
minimize (maybe to 2?) alphabet size \(q=\abs\Sg\)
maximize decoding distance \(\da\)
minimize rate \(\rho=\f{\log\abs e}{\hat n\log\Sg}\leq1\)
minimize list size \(L\) -- ideally \(1,;O(1),;\poly(n)\)
computational complexity
Would like: \(\fa m,\fa r\in B(\Enc,\da),\Dec(r)=m\) -- a \(\da\)-decoding algorithm
\(\da\leq\f{\min_{\text{dist}}(3)}2\leq\f12\)
List decoding \Dec\Sg^{\hat n}\to[N]^L -- a \((\da,L)\) list-decoder; is possible iff \(\fa r,B(r,\da)\cap e\leq L\).
Often, in practice,
Also can just take least-distance codeword (lead-distance coding, maximum-likelihood coding)
Interactive coding -- codes that work well for an interactive protocol of communication
Adversarial corruption a constant \(\da\)-fraction of the total bits sent
Usually involves communicating some state about earlier messages later in the conversation
see: Brakerski-Kalai-Naor, Haveeler

Existential Bounds

Unique decoding: to achieve min-dist \(\da\), greedily pick codewords s.t. \(c_{i+1}\nin\Un_{j=1}^iB(c_i,\da)\) -- gives \(N\geq\f{2^{\hat n}}{\abs{B(x,\da)}}\qd q^{\hat n(1-H_q(\da,\hat n))}\)
\(\rho=1-H_q(\da,\hat n)\)

Icosian Reflections

…a tendency to systematize and a keen sense

that we live in a broken world.