Doing my oracle duty

I promised myself I’d stop blogging about controversial issues whose mere mention could instigate a flamewar and permanently get me in trouble.  Well, today I’m going to violate that rule, by blogging about the difference relativized and unrelativized complexity classes.

Recently a colleague of mine, who works in the foundations of quantum mechanics, sent me a long list of questions about the seminal 1993 paper of Bernstein and Vazirani that introduced the complexity class BQP (Bounded-Error Quantum Polynomial-Time).  It was clear to me that all of his questions boiled down to a single point: the distinction between the relativized and unrelativized worlds.  This is an absolutely crucial distinction that trips up just about everyone when they’re first learning quantum computing.

So I fired off a response, which my colleague said he found extremely helpful.  It then occurred to me that what one person found helpful, another might as well—and that which makes 30% of my readers’ eyes glaze over with its thoroughgoing duh-obviousness, might be very thing that another 30% of my readers most want to see.  So without further ado, the two worlds of quantum complexity theory…

In the relativized world, we let our algorithms access potentially-powerful oracles, whose internal structure we don’t examine (think of Simon’s algorithm for concreteness).  In that world, we can indeed prove unconditionally that BPP≠BQP—that is, quantum computers can solve certain problems exponentially faster than classical computers, when both computers are given access to the same oracle.

In general, almost every “natural” complexity class has a relativized version associated with it, and the relativized versions tend to be much easier to separate than the unrelativized versions (it’s basically the difference between a masters or PhD thesis and a Fields Medal!)  So for example, within the relativized world, we can separate not only BPP from BQP, but also P from NP, NP from PSPACE, NP from BQP, etc.

By contrast, in the unrelativized world (where there are no oracles), we can’t separate any complexity classes between P and PSPACE.  Doing so is universally recognized as one of the biggest open problems in mathematics (in my opinion, it’s far-and-away the biggest problem).

Now, Bernstein and Vazirani proved that BQP is “sandwiched” between P and PSPACE.  For that reason, as they write in their paper, one can’t hope to prove P≠BQP in the unrelativized world without also proving P≠PSPACE.

Let’s move on to another major result from Bernstein and Vazirani’s paper, namely their oracle separation between BPP and BQP.  You might wonder: what’s the point of proving such a thing?  Well, the Bernstein-Vazirani oracle separation gave the first formal evidence that BQP “might” be larger than BPP.  For if BPP equaled BQP relative to every oracle, then in particular, they’d have to be equal relative to the empty oracle—that is, in the unrelativized world!

(The converse need not hold: it could be the case that BPP=BQP, despite the existence of an oracle that separates them.  So, again, separating complexity classes relative to an oracle can be thought of as a “baby step” toward separating them in the real world.)

But an even more important motivation for Bernstein and Vazirani’s oracle separation is that it led shortly afterward to a better oracle separation by Simon, and that, in turn, led to Shor’s factoring algorithm.

In a sense, what Shor did was to “remove the oracle” from Simon’s problem.  In other words, Shor found a concrete problem in the unrelativized world (namely factoring integers), which has a natural function associated with it (namely the modular exponentiation function, f(r) = xr mod N) that one can usefully treat as an oracle.  Treating f as an oracle, one can then use a quantum algorithm related to Simon’s algorithm to find the period of f, and that in turn lets you factor integers in polynomial time.

Of course, Shor’s algorithm became much more famous than Simon’s algorithm, since the implications for computer science, cryptography, etc. were so much more concrete and dramatic than with an abstract oracle separation.  However, the downside is that the speedup of Shor’s algorithm is no longer unconditional: for all anyone knows today, there might also a fast classical algorithm to factor integers.  By contrast, the speedup of Simon’s algorithm (and of Bernstein-Vazirani before it) is an unconditional one.

37 Responses to “Doing my oracle duty”

  1. Zack Says:

    Can you explain, please, why it’s easier to prove separations in the presence of an oracle? Naively, it seems like oracles should only ever make it harder to prove a separation, because if an oracle doesn’t speed up a problem, the optimal algorithm for that problem will just ignore the oracle.

  2. rrtucci Says:

    “I promised myself I’d stop blogging…Recently a colleague of mine, …. sent me a long list of questions..”

    Come on Scott. Face up to it. You have a blogging problem.
    Just one random email from a colleague, and you fall off the bandwagon.

  3. Dave Says:

    Naively, it seems like oracles should only ever make it harder to prove a separation, because if an oracle doesn’t speed up a problem, the optimal algorithm for that problem will just ignore the oracle.

    You design the oracle so that it “speeds up” one computational model but not another. For instance, NP is defined by nondeterministic Turing machines and P by deterministic Turing machines, so to get an oracle separating P from NP you design the oracle so that it is easy for a NTM to use it to solve some problem but not for a DTM. For instance, here’s how to store a bit so that a NTM can access it but not a DTM: choose a string of length n randomly, and if the bit is 1, put the string in the oracle language; if the bit is 0, don’t. A NTM can use its nondeterministic magic to ask “does the oracle contain any string of length n?” to compute this bit quickly, but a DTM can do no better than to search all of {0,1}n.

    The key is that, due to the total lack of structure in our choice about where to put the string in {0,1}n (we just picked it at random), the necessity of the DTM’s exponential search is easier to prove than our “unrelativized” intuition about why DTM’s can do no better with exponential-size search spaces than to search all of them. With an NP problem, {0,1}n might represent “the set of all subsets of n nodes in a graph”, and the one interesting element that may or may not be in it could be “a subset of n nodes that is a clique”. But this string is hardly a “randomly chosen” element of {0,1}n, so it is not as easy to prove that a DTM must search all of {0,1}n to find it.

  4. Scott Says:

    Zack: Basically, you can use an oracle to “bring out the latent strengths” of one class over another class—creating a “staged situation” where one class is obviously (or if not obviously, then at least provably) better than the other. It’s easiest to illustrate this with an example: let’s say you want an oracle A that separates P from NP. Then for each input length n, you can set A(x)=1 for some “secret string” x∈{0,1}n, and A(x)=0 for all other strings of length n. Then because it’s nondeterministic, an NPA machine can just “magically guess” the secret string x, whereas a PA machine is stuck using trial-and-error. (The P machine can only learn about A by picking various inputs x1, x2, … and querying them—and unless it happens to get lucky and guess the x such that A(x)=1, it will clearly need exponential time to find it.)

    Notice how the use of an oracle has let us sidestep the entire difficulty of the P vs. NP question: relative to this oracle, of course brute-force search is unavoidable and P≠NP, because we designed the oracle specifically to make it that way!

    Oracle separations can get much more complicated than the above, but I hope that gives you the general idea of how a well-designed oracle can make separations easier rather than harder.

  5. Henry Y Says:

    This is very much pie-in-the-sky, but could it be possible that there’s some useful metric amongst the space of all oracles, in which we can measure the “distance” from some complexity class C and C^A for some given oracle A, and this “distance” reflects how different C^A is from C?

    Then we could conceivably analyze a limiting sequence of oracle worlds, getting closer and closer to the unrelativized world, and perhaps see that in the limit, P != NP.

    Just for the sake of throwing it out there.

  6. Sid Says:

    Are there any two classes which are unequal in a relativized world but equal in a non-relativized world?

  7. Henry Y Says:

    Sid, I believe IP = PSPACE is a non-relativizing fact, so there are oracles to which they are unequal. I’m not sure which oracle that is, however.

  8. arnab Says:

    Henry Y, actually even more is known to be true. IP doesn’t equal PSPACE, relativized to almost all oracles. Whereas we know that IP equals PSPACE in the unrelativized world.

  9. Job Says:

    That’s really weird, about IP and PSPACE, given that they’re equivalent.

    I guess it’s like having two functions f(x) and g(x), which both output the primes, but in different orders. They define the same sets, but are not comparable at a given interval of x, or function of x.

  10. Job Says:

    If you have a better analogy, do share. 🙂

  11. arnab Says:

    Here is an easy toy example that may be helpful.

    Define PROJECT to be the languages accepted by machines that can only project their input to some specific bit and output that (i.e., it can only implement dictator functions). Define ONETIME to be the languages accepted by machines that operate in one unit of time, and suppose your model of computation is such that the only thing you can do in one unit of time is read one bit of the input. Then, clearly in an unrelativized world, PROJECT and ONETIME are the same.

    But suppose now, machines are given access to some nontrivial set A, and the model of computation is changed such that machines can also check for membership of the input string in A in one unit of time. PROJECT^A remains the same no matter what A is. But ONETIME^A can be different because in addition to the languages in PROJECT, it also contains A.

  12. Dániel Says:

    If you have a better analogy, do share. 🙂

    All right. You have a Yugo and a Ferrari, and you don’t know which one is the faster. So you equip both with an extra jet engine. Now the picture becomes clear: the Ferrari will be very fast, and the Yugo will explode.

  13. Scott Says:

    Job: I have to admit, your analogy didn’t make sense to me.

    The fact that IP!=PSPACE relative to almost all oracles (even though IP=PSPACE in the unrelativized world) really isn’t that surprising, once you know what’s behind it. A rough analogy is how ALMOST ALL real numbers are uncomputable, yet specific numbers that we care about (e, pi, sqrt(2), 42, etc.) keep turning out to be computable.

  14. John Sidles Says:

    Scott says: “ALMOST ALL real numbers are uncomputable, yet specific numbers that we care about (e, pi, sqrt(2), 42, etc.) keep turning out to be computable.”

    That was a thought-provoking example, because it happens so commonly, as to be perhaps the single most reliable heuristic of complexity theory.

    As Dick Lipton’s blog often points out (in effect): “ALMOST ALL real-world problems are in class NP, yet specific instances that we care about (protein folds, ground states, control algorithms, image reconstruction, and most especially, finding proofs) keep turning out to be in P.”

    The same heuristic strikingly applies (with the help of 17 years of hindsight) even to the 1993 paper of Bernstein and Vazirani that Scott cited, which asserts: “The issue of the extra computational power of quantum mechanics over probabilistic computers was first raised by Feynman in 1982. In that paper Feynman pointed out a very curious problem: the natural simulation of a quantum physical system on a probabilistic Turing Machine requires an exponential slowdown. Moreover, it is unclear how to carry out the simulation more efficiently.”

    The preceding was the consensus during the 1980s and 1990s, but nowadays we appreciate that the same Aaronson/Lipton heuristic applies: “ALMOST ALL real-world quantum simulation problems are in class BQP, yet specific instances that we care about (protein folds, ground states, control algorithms, quantum state reconstruction, etc.) keep turning out to be in P.”

    To the extent that that complexity theory provides us with any understanding of the mysteriously broad real-world efficacy of the Aaronson/Lipton heuristic, perhaps it is that the state-spaces associated to practical problems are not formal languages or formal Hilbert spaces, but rather a class of simpler state-spaces, having some (as yet) dimly-glimpsed informatic structure.

    About the only thing we know about this real-world class, is that useful (meaning, physically realizable) oracles seemingly aren’t in it.

    Alternatively—and at the very least—physical devices having oracle-class capabilities are proving to be exceedingly tough to build and operate, per Scott’s Perimeter Institute seminar “The Computational Complexity of Linear Optics.”

    And this difficulty in building oracle-class devices is a clue—perhaps an important clue—as to the natural structure of the state-space that supports both physics experiments and human cognition.

    That’s what oracles look like from a device-oriented engineering perspective, anyway.

  15. Job Says:

    It’s like two bouncers that, when left alone, happen to let in the same people, but when bribed react completely differently.

  16. Dániel Says:

    It’s like two bouncers that, when left alone, happen to let in the same people, but when bribed react completely differently.

    This is not a good analogy, there are no cars in it.

  17. Vadim P Says:

    To me, it (obviously incorrectly) seems like if an oracle is more powerful than (contains) the class it’s being paired with (i.e., PNP), that the base machine becomes irrelevant and the oracle would be used for all computations (so I naively would conclude that LNP=PNP=NP, and so on). I know I’m wrong, but I can’t understand why.

    Scott, I also don’t understand why your example in #4 isn’t a circular argument. If it turns out that P=NP in an unrelativized world, then wouldn’t the PA machine be able to solve the problem just as well as the NPA machine?

  18. Vadim P Says:

    Weird, the HTML looked fine until I submitted it.

    PNP = P^NP
    LNP = L^NP
    PA = P^A
    NPA = N^PA

  19. Scott Says:

    Vadim: Sorry about the formatting!

    To answer your first question: no, the base class is still extremely relevant, since a more powerful machine can extract more information from the same oracle—that’s the whole point! As one example, we believe NPNP is larger than PNP—since in NPNP, you can check statements involving two quantifiers, like “Does there exist an n-bit string x such that for every n-bit string y, the statement A(x,y) holds?” Whereas it’s not clear how to do the same thing even in PNP. As another example, an extension of the usual Time Hierarchy Theorem shows that EXPA is strictly larger than PA for every oracle A. (Checking that is a good exercise for you!)

    For your second question, even if P=NP in the real world, it’s still trivially the case that PA≠NPA for the oracle A that I described. Since A can only be accessed as a black box, even a hypothetical poly-time algorithm for SAT would be completely useless for finding an x such that A(x)=1, faster than brute-force search. Yes, if we could write out A(x) as a poly-sized Boolean formula involving the bits of x, then we could use our fast SAT algorithm to find an x such that A(x)=1 in polynomial time. The problem is that we can’t write out such a Boolean formula for A—by assumption, all we can do is call A on various inputs and see what the outputs are!

    Understand the above, and you will have achieved enlightenment about what an oracle is.

  20. Vadim P Says:

    Thanks, those were extremely helpful explanations!

  21. blk Says:

    What puzzles me about oracles is that they are sometimes being used as (exponentially sized) *input* for query complexity considerations (e.g. the oracle in Simon’s problem) and sometimes they are are used like *subroutines*, where membership of some explicitly given poly-sized input string x in some language is decided using the additional computational power that an oracle provides. Aren’t these two uses quite incomparable and addressing different issues?

  22. Greg Kuperberg Says:

    blk: You’re right and you shouldn’t necessarily be puzzled. You’ve stated two different interpretations of oracles that are both interesting. Oracular input can even be viewed as a realistic interpretation, since it is commonplace these days to pass a function as part of the input to another function. Or you might write a program that uses a library of “private” functions whose code is not available to you; that is also similar to oracular input.

    On the other hand, these two uses are entirely comparable, since a notable algorithm with oracular input can be equally useful as a construction to prove an oracle separation between two complexity classes. For instance Simon’s algorithm gives you an oracle separation between BQP and BPP.

  23. Scott Says:

    blk: Yes, it’s a good point, and the only thing I’ll add to Greg’s comment is that often it’s useful to switch between the two perspectives within the same oracle construction! So for example, consider an oracle A that puts NEXP inside P/poly. Just like an oracle separating P from NP, you can think of such an A as an exponentially long input to a problem that P/poly machines can solve but NEXP machines can’t. But in this case, to construct such an A, you need a not-entirely-obvious diagonalization process over all NEXP machines—which means that you can also think of A as solving a hard computational problem, and then even study the meta-question of how hard a problem it needs to be.

  24. blk Says:

    Greg, Scott: thanks for your answers!

  25. John Sidles Says:

    As an engineer, I have been searching the oracle literature for results relating to the physical construction of non-trivial oracle devices, that is, devices whose output cannot be simulated with PTIME resources. Pointers to the literature in this regard would be very welcome.

    It seems obvious (but is there a proof in the literature?) that such devices necessarily would be quantum mechanical. So is “any oracle device not in P” included in the definition of what we mean by “quantum computer?”

    Heck, what *is* a quantum computer?

    A recent preprint by Perez-Delgado and Pieter Kok titled “What is a quantum computer, and how do we build one?” (arXiv:0906.4344) reviews various criteria, which are based largely on the seminal criteria proposed by DiVincenzo.

    But viewed in light of Scott’s oracle discussion—which was great!—the DiVincenzo-type criteria seem informatically bulky and (perhaps) unnaturally restrictive.

    So does oracle theory (in addition to its versatility as a proof technology) suggest broader, more natural definitions for “quantum computer”?

  26. Raoul Ohio Says:

    “Quantum Darwinism”, a new (result/theory/breakthrough/deadend?) on “Bridge to the Quantum World” in ScienceDaily:

    http://www.sciencedaily.com/releases/2010/07/100702092200.htm

    Check it out.

  27. Raoul Ohio Says:

    Following up a concept Scott, John, etc., raised above, about how how in some loose sense, many or most problems that (are important, prove interesting, can be solved, …) are somehow much “simpler” than the average problem in its class.

    My favorite example: A nontrivial understanding of pretty much anything in the physical sciences requires calculus, most uses of which involve taking derivatives. Almost all of the functions that turn up have infinitely many derivatives, except perhaps at a finite number of points. Yet, the probability that an arbitrary continuous function has at least one derivative at any point is zero!

    You can study and debate this issue from a lot of directions, such as “maybe measure theory on function spaces is not a good way to assign a probability”, etc.

    My take is that function spaces and other topological objects have nice definitions that allow you to prove stuff, but the connection to reality, usefulness, etc., is not so simple. The exact same remark goes for complexity classes.

  28. Alex Says:

    “By contrast, in the unrelativized world (where there are no oracles), we can’t separate any complexity classes between P and PSPACE. Doing so is universally recognized as one of the biggest open problems in mathematics (in my opinion, it’s far-and-away the biggest problem).”

    I keep reading this fact in many papers, that there would need to be another tool besides the relativization technique to separate complexity classes. Are there any other techniques that are known, or is the entire concept completely open?

  29. Scott Says:

    My take is that function spaces and other topological objects have nice definitions that allow you to prove stuff, but the connection to reality, usefulness, etc., is not so simple. The exact same remark goes for complexity classes.

    Right, so then people propose definitions that have better connections to reality, usefulness, etc., and people prove stuff about those definitions, and the world moves forward. That’s how science works!

    You can see complexity classes themselves as a vast improvement over the ways of modeling computation that preceded them (recursion theory, the Chomsky Hierarchy, switching networks, Karnaugh maps, cybernetics…). Do you have a better alternative to complexity classes? If so, then let’s hear it! 🙂

  30. Scott Says:

    Are there any other techniques that are known, or is the entire concept completely open?

    Yes, we have interactive proof techniques that evade relativization (like those used to prove IP=PSPACE), but almost all of those succumb to a generalization of relativization called algebrization. See my paper with Avi Wigderson about this.

  31. John Sidles Says:

    Do you have a better alternative to complexity classes? If so, then let’s hear it!

    Oh boy … yet another Aaronson challenge! This one seems similarly fun and interesting to “Sure/Shor separators—which has been a considerable inspiration to me—and so I’m willing to take a swing at this one.

    Since a lot of what we know about complexity classes derives from Turing reductions, let’s hypothesize that there’s some aspect of the definition of Turing machine that is particularly well-suited to proving some classes of theorems—in particular, theorems that relativize—but is poorly-suited to solving real-world problems like P=?=NP.

    What could that aspect be? Well, one respect in which Turing Machines depart very conspicuously from real-world computing devices is that there is no requirement that informatic content of the input tape be documented in any way whatsoever.

    How can we impose a requirement “document the Turing input tape” in a well-posed way? We have a still-foggy goal of proving theorems of the form Pd &ne NPd, where “d” denotes the restricted class of problems that can be solved by documented Turing machines.

    At first sight, one needs to know a tremendous amount of mathematics to document a code … but in the real world that’s not true, because we can always document a code by … referring to another code.

    So let’s amalgamate all documented codes into one universal Turing input tape “D”, which (by definition) serves as its own documentation, and which we further require to run in PTIME.

    `Cuz heck, why are we all doing math/science/engineering if we don’t believe we can solve pretty much all real-world problems with PTIME resources? The “D” requirement merely formalizes this notion.

    Now we have the (slightly) less vague idea that we know we can prove PA =?= NPA either way, and therefore, we know that relativisation obstructs P =?= NP, but maybe PD ≠ NPD might be feasible to prove … for some suitable definition of “D”.

    The tricky part is allowing the input “D” to have just the right amount of elasticity in its definition. We want to show that no matter how (polynomially) many improvements we make to the master tape “D”, there are still inputs that separate PD from NPD.

    One nice thing about D-restricted complexity classes is that (by definition) they don’t relativise.

    So if someone from the future (say) showed us a yottabyte code that was a super-duper hybrid of Petkovsek/Wilf/Zeilberger’s A=B and Mathematica, and claimed that this code was a pretty good real-world approximation to “D”, how much credence would we give that claim?

  32. John Sidles Says:

    Hmmm … as Vadim P noted, HTML superscripts and subscripts preview OK, but then are filtered-out … thus the intended notation is:

    PA = P^A, that is, a Turing machine augmented with an oracle “A”, versus PD = P_D, that is, a Turing machine that is restricted to documented input “D”.

  33. asdf Says:

    That there is an oracle separating P and NP is not completely obvious, or at least it hasn’t always been completely obvious, since that’s why it’s a Big Important Theorem by Baker, Gill, and Solovay that was considered groundbreaking when developed in the 1970’s or thereabouts.

    I remember Scott’s comments in an earlier post about “natural” complexity classes, where a class C is natural if C^C=C. Clearly P is a natural class, and I guess PH and #P are natural(?), but it looks like NP is probably unnatural. So I wonder why we obsess so much about NP, and in particular about turning everything into a decision problem? If we look at most of the standard examples of NP-complete problems, most of them are phrased as decision problems only through contortions. For example, someone who poses a SAT instance probably wants to know the actual satisfying assignment if one exists, not just a yes/no answer of -whether- a a satisfying assignment exists. Should we care more about #P than NP?

  34. Scott Says:

    That there is an oracle separating P and NP is not completely obvious, or at least it hasn’t always been completely obvious, since that’s why it’s a Big Important Theorem by Baker, Gill, and Solovay that was considered groundbreaking when developed in the 1970’s or thereabouts.

    In my opinion, the groundbreaking contribution of Baker-Gill-Solovay was a conceptual one: recognizing that complexity theory was going to be fundamentally different from computability theory, for the reason that most of the questions have contradictory relativizations.

    I’m guessing BGS would readily agree that their oracle construction itself was pretty obvious, once the right questions had been asked. (By contrast, many of the oracle constructions that followed them, like the ones making PH infinite, putting PNP outside PP, etc. were not obvious in the same sense.)

  35. Zack Says:

    Ages later, I want to say thank you for answering my question; I think I have a better understanding of complexity theory in general now!

    There is, though, something bothering me about oracles of the sort Dave described in comment 3. The only thing you can do with Dave’s oracle is try to reverse-engineer it. This seems qualitatively different from, say, a 3SAT oracle, which is a subroutine of general utility, and it offends my sense of mathematical aesthetics. I wish I could put that in less subjective terms. I almost want to say that using an oracle that isn’t a subroutine of general utility in a proof is cheating.

  36. Scott Says:

    Zack, I’m wondering if the reason the oracle Dave described feels like cheating is simply that it’s too trivial. I.e., we tied the P machine’s arms and legs together, then beat it in an unfair fight in which it never had a chance!

    If so, then the way to update your sense of mathematical aesthetics is to see some nontrivial examples of oracle separations—ones where we beat some complexity class, but only after a real fight! Here are a few examples to get you started:

    Beigel’s oracle relative to which PNP is not in PP
    Yao’s and Hastad’s oracles that make PH infinite (Hastad’s PhD thesis is the best online reference I could find)
    The collision lower bound, which yields an oracle relative to which SZK is not in BQP

  37. Vladimir Says:

    Hi,

    Sorry for posting this trivial question here, but I just started studying some topics of computational complexity now, and there is one thing that seems hard to understand regarding relativization. Hope nobody minds; otherwise, please delete the comment.

    So, here is my question: take, for instance, the class C=P^NP, and a language B. How are the oracle TM machines from C^B working?
    – are they deterministic polytime machines that ask queries from NP^B?
    – are they deterministic polytime machines that ask queries to both B and NP?
    Can you please explain the answer?

    Thanks!