CS506 outline:
First day: September 8, 2008
First day: questionnaire and describe course: NC vs P vs RP vs BPP vs NP,
First two parts: use old notes. Some new stuff and differences:
Want to stop fussing with machine models, to introduce other reasonable
models: RAM, variants of RAM, PRAM, etc.
Main difference between RAM and TM: indirect addressing
Question: How to get homework handed in on time with reasonable flexibility?
Question: Grading: (1) Homework, (2) Project/Essay
Explain grading: C = fail; B = pass but should NOT do anything related to
theory;
A- = does not recommend research in this area
A = might recommend research in this area
A+ = strongly recommend research in this area
Question: Outline differences in TM vs RAM, e.g.
simulate 2-tape TM with 1-tape TM--tricky
simulate RAM with another RAM--not so tricky
TIME(n) != NTIME(n)
TIME(n^2) != RAM_TIME(n^2)
Question: What is a good text for this course?
Partial Answer: Hopcroft and Ullman (still the best), Papadimitriou (but not
advanced enough), Wegener (gives the modern spirit)
One Goal: Reduce silly technicalities, and get to heart of matter more
quickly.
E.g., Hopcroft and Ullman: S_1(n)/S_2(n) -> 0
(resp. T_1(n) \log T_1(n) / T_2(n) -> 0 ) implies
DSPACE( S_1(n) ) properly contained in DSPACE( S_2(n) )
(resp. replace S with T and DSPACE with DTIME) provided that
S_1(n) >= log_2 n (resp. no condition on T_i) and
S_1,S_2 fully space-constructible
(resp. T_1,T_2 fully time-constructible)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Added for Second Day: September 10, 2008
Joint work: Identify sources, collaborators. The write-up should be yours.
Makes grading more nebulous.
Today: TMs, RAMs, PRAMs. Classes: (D)TIME,(D)SPACE,P,EXP, relation
with circuits: P/poly
Universal TM and simulation.
Problem: Prove Palindrome requires c n^2 time on 1-tape.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
September 15, 2008
Problem: Show that Palindrome can be solved in time (n^2/100)+C for
some absolute constant, C.
In class: outline proof of this problem or give hints.
Remarks: Palindrome in DTIME_{2-tape}(n).
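The 2-tape algorithm copies the input to the second tape and then scans the two tapes in opposite directions; a minimal Python sketch of that comparison pattern (function name mine):

```python
def is_palindrome(w: str) -> bool:
    # Mirrors the 2-tape TM algorithm: copy w to a second tape, then
    # scan the tapes in opposite directions, comparing symbols.
    # Each position is visited O(1) times, hence linear time.
    i, j = 0, len(w) - 1
    while i < j:
        if w[i] != w[j]:
            return False
        i += 1
        j -= 1
    return True
```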
Define and compare: DTIME_{PRAM}, P, RP, NP, ...
Define P/poly and explain relation.
Sample algorithms: SAT: in DEXP, NP
Miller-Rabin: RTIME(small poly)
RP in P/poly, at least if defined appropriately
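A standard Miller-Rabin sketch (round count and helper names are my own choices); a composite survives each round with probability less than 1/4, which is the one-sided error behind the RTIME(small poly) claim above:

```python
import random

def miller_rabin(n: int, rounds: int = 20) -> bool:
    """Probabilistic primality test; True means 'probably prime'.

    The error is one-sided: primes always pass, and a composite
    passes all rounds with probability < 4**(-rounds).
    """
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 = d * 2^s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True
```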
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
September 17, 2008 and beyond:
Explain BPP (P in RP in BPP), NP,
Explain P in P/poly, with P/poly defined via advice or circuits
Explain SAT NP-complete under logspace reductions
Explain? Circuit_Eval P-complete under logspace reductions
Cite Jin-Yi Cai's notes;
Prove P/poly cannot lie in any countable class (P,NP,BPP,RP,...)
Prove Hopcroft and Ullman book theorems mentioned above.
Mention goal: P versus NP cannot be resolved by relativising argument
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
September 22, 2008 and beyond:
Hennie and Stearns type argument for f(n)log f(n) simulation:
sim(n): if n=1, just simulate; otherwise: centre(n), sim(n/2), centre(n), sim(n/2), restore(n).
Can avoid this with random access TM.
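The cost of the sim(n) recursion above can be checked directly: two half-size calls plus linear work for the centre/restore passes give the f(n) log f(n) bound. A small sketch (cost model mine, centre/restore lumped as c*n):

```python
def sim_cost(n: int, c: int = 1) -> int:
    # Cost recurrence for the Hennie-Stearns style simulation:
    # sim(n) makes two recursive calls of size n/2, plus O(n) work
    # for the centre(n)/restore(n) passes, modelled here as c*n.
    if n <= 1:
        return c
    return 2 * sim_cost(n // 2, c) + c * n

# For n a power of two this solves exactly to c * n * (log2(n) + 1),
# i.e. Theta(n log n) -- the f(n) log f(n) overhead mentioned above.
```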
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
September 24, 2008:
Review: showed DSPACE(f_1(n)) properly contained in DSPACE(f_2(n)) when
f_2(n)/f_1(n) -> infty as n -> infty, assuming f_1(n) is computable in
space f_2(n) and f_2(n) >= n.
Compare P, P/poly and poly_size circuits, and NP; show SAT is NP-complete
under log space; show circuit_eval P-complete under logspace
Define: NC,NC^i.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
October 29, 2008:
Best reference for last few weeks: Cai's notes. Covered poly time
hierarchy and PSPACE, PSPACE complete problems under poly
time reductions. Saw "real world" complete problems and
artificial complete problems. Another example of artificial:
A_P = {(M,w,1^t) s.t. on input w, M accepts in time t};
claim: this is P-complete under logspace reductions.
We have NP^A \subset PSPACE^A (for any oracle A), PSPACE^A \subset PSPACE
if A \in PSPACE, and finally PSPACE \subset P^A if A is PSPACE
complete (under poly time reductions). So for such A we have
P^A = NP^A .
We have NP^B != P^B for B constructed as follows (see Cai's notes--this
is by far the best exposition I've seen, modulo a few typos in
the letters): Given B a language over {0,1}, let
T_B = { 1^n s.t. B contains at least one word of length n}. The
idea is that we construct an oracle, B, to fool all TMs; we
take B to contain either 0 or 1 words of length n for each n,
and B answers 0 on all "undetermined" queries. We enumerate
TMs, M_1, M_2, ..., so that each TM appears an infinite number
of times, and bound M_k to run in time n^k. Etc.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
November 3, 2008:
Circuit complexity: P/poly = PolyCirc contains P, many think it does
not contain NP.
PolyForm = functions described by poly size formulas.
formula depth = circuit depth; formula depth is roughly
log(formula size) (one direction is immediate, the other
requires cutting the tree at the 1/3 - 2/3 point and
modifying the formula)
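The 1/3 - 2/3 cutting step rests on the fact that every binary formula tree has a subtree holding between one third and two thirds of the leaves; a small sketch finding one (tree encoding mine):

```python
def leaves(t):
    # A tree is either a leaf (any non-tuple label) or a pair (left, right).
    return 1 if not isinstance(t, tuple) else leaves(t[0]) + leaves(t[1])

def balanced_cut(t):
    # Walk from the root into the larger child; the leaf count at most
    # halves at each step, so the first subtree with at most 2/3 of the
    # leaves still has at least 1/3 of them.
    total = leaves(t)
    node = t
    while isinstance(node, tuple) and leaves(node) > 2 * total / 3:
        left, right = node
        node = left if leaves(left) >= leaves(right) else right
    return node
```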
Similar: if L is in NC^i, then for each n there is a circuit of depth
O((log n)^i) or so to test if w with |w|=n is in L. So formula
depth lower bound could separate NC^1 from NC^2, etc.
Current best formula size bound is n^(3-o(1)), whose log
(to convert to formula depth) is only (3-o(1)) log n.
Counting formulas and circuits (with gross overcounting):
formulas(i) \le 2 \sum_j formulas(j) formulas(i-j-1), or
#formulas with n leaves \le # matched left/right parens = binom(2n,n)/(n+1),
times leaf choices (2n)^n.
circuits(i) \le circuits(i-1) * 2(2n+i)^2, the 2 for and,or.
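The Catalan count and the overcount can be tabulated directly; a throwaway check (function names mine) of the counting argument that most functions need large formulas:

```python
from math import comb

def catalan(n: int) -> int:
    # Matched left/right parenthesis count: binom(2n, n)/(n + 1).
    return comb(2 * n, n) // (n + 1)

def formula_bound(leaves: int, nvars: int) -> int:
    # Gross overcount of formulas with a given number of leaves:
    # tree shapes, times (2*nvars) literal choices per leaf,
    # times 2 (and/or) per internal gate.
    return catalan(leaves) * (2 * nvars) ** leaves * 2 ** max(leaves - 1, 0)

# There are 2^(2^n) boolean functions on n variables, so whenever
# formula_bound(s, n) < 2**(2**n), some function needs more than s leaves.
```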
Note adding xor, nand, etc, can change things.
A lot of literature on monotone circuit complexity.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
November 17, 2008:
Hastad's article: Idea: Take formula, let variable remain unchanged
with probability p; otherwise set to 0 or 1 with prob (1-p)/2 each.
Shrinkage exponent: a formula of size L goes to one of expected size
on the order of p^Gamma L + 1. The highest possible Gamma is called
the shrinkage exponent.
Subbotovskaya: Gamma is bigger than 1 (she proved 1.5).
Andre'ev: Gives an explicit function requiring formula size n^(1+Gamma-o(1)).
Hastad: Proves Gamma is at least 2 (after [NI], [PZ]).
Khrapchenko: Gamma is at most 2 (Exercise? Consider parity.)
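The Gamma = 2 behaviour on parity can be seen numerically: a restriction of parity is still a parity on the surviving variables, and by Khrapchenko parity on k bits needs formula size about k^2, so the expected restricted size is about (pn)^2 = p^2 * n^2. A quick Monte Carlo sketch (all names mine):

```python
import random

def restrict(nvars: int, p: float):
    # Random restriction: each variable stays free with probability p,
    # otherwise it is fixed to 0 or 1 with probability (1-p)/2 each.
    rho = {}
    for i in range(nvars):
        rho[i] = None if random.random() < p else random.randint(0, 1)
    return rho

def surviving_parity_size(nvars: int, p: float, trials: int = 2000) -> float:
    # Restricted parity is a parity (up to complement) on the free
    # variables; with formula size ~ k^2 for parity on k bits, the
    # average restricted size is ~ (p * nvars)^2: the Gamma = 2 bound.
    total = 0
    for _ in range(trials):
        rho = restrict(nvars, p)
        k = sum(1 for v in rho.values() if v is None)
        total += k * k
    return total / trials
```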
Proof highlights:
Andre'ev: Assume 2n inputs, log(n) divides n. Define a function: the
first n bits describe a boolean function on log(n) bits; the second
half divided into groups of size n/log(n); in each group we take
the parity and then feed the log(n) parities as inputs to the
function described in the first half. Restrict inputs in first
half to give a function of complexity at least n/loglog(n), and
randomly restrict second half with p = 2 log(n)loglog(n)/n (so
that we expect at least one variable in each parity to survive).
What's left has complexity at least n/loglog(n).
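The construction above is concrete enough to evaluate in a few lines; a sketch evaluator for Andre'ev's function (names and bit conventions mine):

```python
def andreev(bits, n, logn):
    """Evaluate Andre'ev's function on 2n input bits (illustrative sketch).

    The first n bits are the truth table of a boolean function on
    log(n) bits; the second n bits are split into log(n) blocks, and
    each block's parity supplies one input bit to that function.
    Assumes n = 2**logn and that logn divides n.
    """
    assert len(bits) == 2 * n and n == 2 ** logn
    table, rest = bits[:n], bits[n:]
    block = n // logn
    # The logn block parities form an index into the truth table.
    index = 0
    for j in range(logn):
        parity = sum(rest[j * block:(j + 1) * block]) % 2
        index = (index << 1) | parity
    return table[index]
```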
Simplification: From "bottom up" on the formula tree, we apply
the following simplification rules under random restriction:
if f = (g and h) in the tree (with similar rules for (g or h)),
we: set f to 0 if g or h are 0, discard g if g restricts to 1
and same for h (these are considered "obvious" rules); the last
rule is a bit subtle: if g is reduced to a literal x_i, we discard
g and substitute x_i=1 everywhere in h (similarly we substitute
x_i=0 if g reduces to negation(x_i), and similarly for h instead
of g) (this is called the "one-variable simplification rule").
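These rules can be written down directly; a sketch (formula encoding mine: a formula is a constant 0/1, a literal ('lit', i, neg) with neg in {0,1} for x_i or its negation, or ('and'/'or', g, h)):

```python
def substitute(f, i, val):
    # Set x_i = val everywhere in f (used by the one-variable rule).
    if isinstance(f, int):
        return f
    if f[0] == 'lit':
        _, j, neg = f
        return f if j != i else val ^ neg  # becomes a 0/1 constant
    return (f[0], substitute(f[1], i, val), substitute(f[2], i, val))

def simplify(f, rho):
    # Bottom-up simplification under a restriction rho (var -> 0/1;
    # free variables are simply absent from rho).
    if isinstance(f, int):
        return f
    if f[0] == 'lit':
        _, j, neg = f
        return rho[j] ^ neg if j in rho else f
    op = f[0]
    g, h = simplify(f[1], rho), simplify(f[2], rho)
    zero, one = (0, 1) if op == 'and' else (1, 0)  # absorbing, identity
    # "Obvious" rules: an absorbing constant kills f; an identity
    # constant is discarded.
    if g == zero or h == zero:
        return zero
    if g == one:
        return h
    if h == one:
        return g
    # One-variable rule: if a child is a literal, substitute into its
    # sibling the value the literal takes whenever the sibling matters.
    for lit, sib in ((g, h), (h, g)):
        if lit[0] == 'lit':
            _, j, neg = lit
            sib2 = simplify(substitute(sib, j, one ^ neg), {})
            if sib2 == one:
                return lit
            if sib2 == zero:
                return zero
            return (op, lit, sib2)
    return (op, g, h)
```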
Lemma 4.1: Prob that a formula phi of size L simplifies to a formula of
size one (i.e., a variable or its negation) is bounded by
q sqrt( L Prob(phi simplifies to 1) Prob(phi simplifies to 0) ) under
a random restriction with prob p wrt any filter (where q=2p/(1-p)).
(And a filter is any subset of random restriction values closed
under specialization, and "prob wrt a filter" means conditional
probability.)
Proof: Use communication complexity: we wish to break up C, the set
of restrictions where phi simplifies to a formula of size 1, into a
union C_1,...,C_r, such that for each i there is a variable x_{j_i}
such that each element of C_i reduces to x_{j_i} or its negation,
and sets A_i,B_i, where the function takes the value 0 or 1
after restricting x_{j_i}, which satisfy the property that
A_1 x B_1,...,A_r x B_r form a disjoint partition of A x B, where A,B
are the sets where "phi simplifies to 0" and "... to 1". Then we use
the formulas at the end of the proof to finish (using that Prob[C_i]
is bounded by q times either Prob[A_i] or Prob[B_i]).
The communication complexity argument is as follows: consider the
sets, A and B, where phi simplifies to 0 and 1 respectively. For
each a in A and b in B, consider the variable settings of a and b
and the top gate; if this gate is an "and" gate, then for b, both
children must have to value 1, whereas for a there is at least one
child with the value 0; so let A' be where A takes the value 0 on
the left child, and A'' be the complement of A' in A. Then A',A''
form a disjoint union of A, and on the left child we have A' x B
take the value 0 and 1, and on the right child A'' x B take the
value 0 and 1. Now we repeat on each child (doing a similar thing
for "or" gates). We finish at the leaves with dijoint, giving
the above partition, where C_i is set of restrictions, rho, such
that restricting rho's "free literal" one way gives a, and the
other gives b, and (a,b) reaches leaf i. We get a partition
C = C_1 union C_2 union ... union C_r, where r = # leaves in tree.