CPSC 506, Sept 17, 2012, Joel Friedman

Here are some brief notes on what we will do Sept 17 and 19.

Recall that last week we proved the "polynomial time hierarchy theorem," as some call it. The basic idea is that there is a language computable in time, say, O(n^4), that cannot be computed in, say, time O(n^(3.99)). The language is the set of strings, s, such that if s represents a Turing machine, then that machine, on input s, halts within |s|^(3.995) steps and rejects s. This is the same idea you saw in a previous course to argue that the acceptance or halting problem is undecidable. This idea is often referred to as diagonalization, and works in much the same way that Kirk and Spock brought down an android army with the self-referential phrase, "I am lying."

Weird stuff: since the above hierarchy theorem used only the ability to simulate n steps of any Turing machine by O(n log(n)) steps of a universal Turing machine, you might conjecture that DTIME( f(n) ) is strictly contained in DTIME( g(n) ) for any f, g such that f(n) log f(n) << g(n) (where << means asymptotically less than). But the argument for f(n) = n^3.99 above used the fact that we can compute n^3.99 quickly, so that we know when to stop the simulation. So this may not be true if it is very difficult or impossible to compute (or estimate) f(n). Indeed we have:

Gap Theorem (Trakhtenbrot 1964, Borodin 1972, thanks to the cold war...): If g(n) is a function from the integers to themselves with g(n) >= n for all n, then there is a function T(n) from the integers to the integers with DTIME( g(T(n)) ) = DTIME( T(n) ).

---------------

We mentioned that, similarly, DTIME^A (n^(3.99)) is a strict subset of DTIME^A (n^4) for any oracle, A. In fact, the proof relativizes, i.e., works completely the same for Turing machines with the additional resource of oracle calls to an oracle A. We move on to Baker-Gill-Solovay (1975): There is an oracle, A, for which P^A = NP^A, and an oracle, B, for which P^B ≠ NP^B.
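The flavor of diagonalization above can be sketched concretely. Below is a toy illustration (everything here is a hypothetical model of ours, not a real enumeration of Turing machines): "machines" are plain Python predicates, and the diagonal language is built to disagree with each machine on that machine's own distinguished input.

```python
# Toy diagonalization: against any finite list of "deciders" (here,
# plain Python predicates on strings), the diagonal language differs
# from every one of them on at least one input.

machines = [
    lambda s: True,               # M_0: accepts everything
    lambda s: len(s) % 2 == 0,    # M_1: accepts even-length strings
    lambda s: s.startswith("1"),  # M_2: accepts strings starting with "1"
]

# One distinguished input per machine; here the i-th input is 1^i.
inputs = ["1" * i for i in range(len(machines))]

def diagonal(s):
    """Accept s iff s = 1^i for some listed M_i and M_i rejects s."""
    if s in inputs:
        i = inputs.index(s)
        return not machines[i](s)
    return False

# diagonal() cannot equal any M_i: they disagree on the input 1^i.
for i, m in enumerate(machines):
    assert diagonal(inputs[i]) != m(inputs[i])
```

The real proof does the same thing against an (infinite) enumeration of all clocked Turing machines, with the clock |s|^(3.995) playing the role of the finite list.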
Hence any proof that relativizes, i.e., works completely the same if the Turing machine has an arbitrary oracle (such as diagonalization and universal simulation), cannot distinguish P from NP.

-------------------

Philosophical Remark: The B-G-S theorem tells us which arguments are not powerful enough to resolve P vs. NP. But if your goal is just to solve P vs. NP, you may not need to know the proof (below), which is fairly involved. Yet this proof illustrates a number of important ideas in complexity theory (e.g., diagonalization and fooling), and is not that tricky if one introduces precise notation and gives the intuition. Hence we will cover the proof in detail in class. Again, the proof in the text is a bit brief.

-------------------

If EXPCOM = { (M,x,1^n) | M accepts x within 2^n steps }, and EXP = the union over all polynomials p of DTIME( 2^(p(n)) ), then we claim

EXP <= P^EXPCOM <= NP^EXPCOM <= EXP .

The first containment follows by calls on the padded input (M,x,1^(p(n))), with the padding taking at most polynomial time; the second containment is immediate; the third follows because you can write a machine that runs through all certificates, and it runs within DTIME( 2^(p(n)) ) for some polynomial p. (The proof in the book is a bit brief.) This gives an example of an oracle A for the B-G-S theorem.

The tricky part is to construct the oracle, B. For every language, B, it is easy to see that U_B lies in NP^B, where U_B = { 1^n | some string of length n lies in B }. So it suffices to find a B such that U_B does not lie in P^B.

We shall determine B as a type of limit. More precisely, we shall form two increasing sequences of sets of strings over the alphabet {0,1}: B_1 subset B_2 subset ... and C_1 subset C_2 subset ... (where "subset" means "is a subset of," not necessarily a proper subset). Intuitively, we will have stages 1, 2, 3, ..., and B_i will represent those strings that by stage i have been determined to be in B, and C_i those determined not to be in B.
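The padding step in EXP <= P^EXPCOM can be sketched in code. This is a minimal sketch under toy assumptions of ours: the machine model (start/step methods), the Doubler example, and all the names below are our own illustration, not standard notation.

```python
# Sketch of the padding idea behind EXP <= P^EXPCOM, using a toy
# machine model: a "machine" has start(x) -> state and
# step(state) -> (state, done, accept).

def expcom(machine, x, n):
    """Simulated EXPCOM oracle: does `machine` accept x within 2^n steps?"""
    state = machine.start(x)
    for _ in range(2 ** n):
        state, done, accept = machine.step(state)
        if done:
            return accept
    return False  # did not halt within 2^n steps

class Doubler:
    """Toy exponential-time 'machine': accepts x after counting to 2^|x|."""
    def start(self, x):
        return (0, 2 ** len(x))
    def step(self, state):
        count, target = state
        count += 1
        return ((count, target), count >= target, count >= target)

def decide_with_oracle(x):
    """The P^EXPCOM decider: do only polynomial work itself -- building
    the padded query with n = p(|x|), here p(n) = n -- and make a
    single oracle call that absorbs the exponential-time computation."""
    return expcom(Doubler(), x, len(x))
```

The point of the padding 1^(p(n)) is exactly what the last function shows: the poly-time machine never simulates the exponential computation itself; it only writes down a query of polynomial length and lets the oracle do the heavy lifting.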
In particular, B_i and C_i will have no intersection, and any string lies in some B_i or C_i for sufficiently large i. Then B will be the union over all the B_i.

We enumerate the Turing machines with oracle calls as M_1, M_2, ... . We think of the oracle given to an M_i as a variable, which affects the computation; however, we can predict how M_i will behave on a given input without knowing the oracle in its entirety, provided that we know how all the oracle calls in a given computation are answered. Two different oracles appear identical (or can "fool" the computation) provided that they give the same answers on the membership of the strings that are queried in the computation. This is an essential idea (or at least my view of things).

Stage i: Choose n larger than the length of any element of B_i or C_i. Run the i-th oracle Turing machine, M_i, on the input 1^n for time 2^n - 1 with the following oracle responses: answer "yes" to a string in B_i, and "no" otherwise; let R be the set of all the strings to which we answered "no". Clearly |R| <= 2^n - 1, since each oracle call requires one time step. Recall that all strings in B_i or C_i are of length less than n, so we may pick a string, w, of length n that is not in R. Now let B_{i+1} and C_{i+1} be given as follows:

if M_i accepts 1^n within time 2^n - 1 on the above oracle responses: B_{i+1} = B_i; if not: B_{i+1} = B_i union {w};

and in both cases: C_{i+1} = {all strings of length at most n} setminus B_{i+1}.

It is easy to see that {B_i} and {C_i} are non-decreasing sequences of sets, and that any string over {0,1} lies in some B_i or C_i. We set B = union_i B_i (and we note that any string not in B lies in C_i for i sufficiently large). We easily see that if M_i accepts 1^n within time 2^n - 1, then U_B does not contain 1^n, and otherwise it does. Hence any oracle Turing machine, M_i, that decides each input 1^n in time at most 2^n - 1 with oracle calls to B decides a language that is different from U_B.
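One stage of the construction can be sketched as code. Here oracle machines are modeled as Python functions machine(query, x), where query(s) -> bool plays the role of the oracle; this model, run_stage, and the example machine are toy assumptions of ours, not the lecture's formal definitions.

```python
import itertools

def run_stage(machine, n, B_i):
    """Run `machine` on 1^n, answering oracle queries from B_i, and
    return B_{i+1} as in the construction."""
    queried = set()

    def query(s):
        queried.add(s)
        return s in B_i        # "yes" iff s is already known to be in B

    accepted = machine(query, "1" * n)
    # A machine running for 2^n - 1 steps queries fewer than 2^n strings,
    # so some length-n string w is never queried ("fooling" the machine).
    w = next(
        "".join(bits)
        for bits in itertools.product("01", repeat=n)
        if "".join(bits) not in queried
    )
    # If the machine accepted, keep 1^n out of U_B (add nothing);
    # if it rejected, put w into B so that 1^n lands in U_B.
    # Either way, the machine's answer about U_B at 1^n is wrong.
    return B_i if accepted else B_i | {w}

# Example machine: answers according to whether 0^|x| is in the oracle.
m = lambda query, x: query("0" * len(x))
B_next = run_stage(m, 3, set())  # m rejects, so one new length-3 string enters B
```

Note how the code mirrors the fooling idea: the machine's behavior is fixed by the answers it actually received, so adding the unqueried string w afterward changes U_B without changing what the machine did.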
In particular, U_B does not lie in P^B (assuming, as is standard, that each machine appears infinitely often in the enumeration, so any polynomial-time M_i is considered at some stage where n is large enough that its running time on 1^n is below 2^n - 1), but U_B does lie in NP^B. So P^B < NP^B (strict).

--------------

Note: The value of n could be rather large as a function of i.

Homework: Consider the value, n, that one reaches in the i-th stage of the above argument. Give a bound on how large n can be in the i-th stage.

Homework: We have argued that EXP <= P^EXPCOM <= NP^EXPCOM <= EXP. (1) Write down a precise proof of this fact, using the definitions of P and NP. (Both the textbook and the discussion above give outlines of the proof, but do not spell out the details.) The proof should have phrases like: "Let L be a language in the class BLAH. Then there is a Turing machine, M, and a constant, c, for which blah blah blah blah blah." (2) Explain why, if A is a language and A^comp = Sigma^* setminus A is its complement, then for any oracle Turing machine, M, with oracle calls to A, there is an almost identical machine, M', that accepts the same language with calls to A^comp. (3) Imagine we are trying to find oracles, A, for which we can prove that P^A = NP^A. Why can't we prove this (as of September 2012) for A = P? What about A = NP? Give some reasons to justify that this might be hard to do, if you can't give an ironclad argument.