What Alan T. did for his PhD

We’ve all been there before: by the time you start graduate school in Princeton, you’ve already invented the Turing machine, pioneered the concept of computational universality, and proved the unsolvability of Hilbert’s Entscheidungsproblem.  A few years from now, you’re going to return to England to make decisive contributions to the breaking of the Enigma and the winning of World War II.  Your problem is, what do you do for the couple years in between?  (Keep in mind that you have a PhD thesis to submit, and the Turing machine is already old hat by now!)

The answer, apparently, is to tackle a neat problem in logic, one version of which was asked three weeks ago by a Shtetl-Optimized commenter named Schulz.  Not knowing the answer, I posted Schulz’s problem to MathOverflow.  There, François Dorais and Philip Welch quickly informed me that Turing had already studied the problem in 1939, and Timothy Chow pointed me to Torkel Franzen’s book Inexhaustability: A Non-Exhaustive Treatment, which explains Turing’s basic observation and the background leading up to it in a crystal-clear way.

The problem is this: given any formal system F that we might want to take as a foundation for mathematics (for example, Peano Arithmetic or Zermelo-Fraenkel set theory), Gödel tells us that there are Turing machines that run forever, but that can’t be proved to run forever in F.  An example is a Turing machine M that enumerates all the proofs in F one by one, and that halts if it ever encounters a proof of 0=1.  The claim that M doesn’t halt is equivalent to the claim that F is consistent—but if F is indeed consistent, then the Second Incompleteness Theorem says that it can’t prove its own consistency.

On the other hand, if we just add the reasonable axiom Con(F) (which asserts that F is consistent), then our new theory, F+Con(F), can prove that M runs forever.  Of course, we can then construct a new Turing machine M’, which runs forever if and only if F+Con(F) is consistent.  Then by the same argument, F+Con(F) won’t be able to prove that M’ runs forever: to prove that, we’ll need a yet stronger theory, F+Con(F)+Con(F+Con(F)).  This leads inevitably to considering an infinite tower of theories F0, F1, F2, …, where each theory asserts the consistency of the ones before it:

F0 = F

Fi = Fi-1 + Con(Fi-1) for all i≥1

But there’s no reason not to go further, and define another theory that asserts the consistency of every theory in the above list, and then another theory that asserts the consistency of that theory, and so on.  We can formalize this using ordinals:

Fω = F + Con(F0) + Con(F1) + Con(F2) + …

Fω+i = Fω+i-1 + Con(Fω+i-1) for all i≥1

F = Fω + Con(Fω) + Con(Fω+1) + Con(Fω+2) + …

and so on, for every ordinal α that we can define in the language of F.  For every such ordinal α, we can easily construct a Turing machine Mα that runs forever, but that can’t be proved to run forever in Fα (only in the later theories).  The interesting question is, what happens if we reverse the quantifiers? In other words:

Given a Turing machine M that runs forever, is there always an ordinal α such that Fα proves that M runs forever?

This is the question Turing studied, but I should warn you that his answer is disappointing.  It turns out that the theories Fα are not as well-defined as they look.  The trouble is that, even to define a theory with infinitely many axioms (like Fω or F), you need to encode the axioms in some systematic way: for example, by giving a Turing machine that spits out the axioms one by one.  But Turing observes that the power of Fα can depend strongly on which Turing machine you use to spit out its axioms!  Indeed, he proves the following theorem:

Given any Turing machine M that runs forever, there is some “version” of Fω+1 (i.e., some way of encoding its axioms) such that Fω+1 proves that M runs forever.

The proof is simple.  Assume for simplicity that F itself has only finitely many axioms (removing that assumption is straightforward).  Then consider the following Turing machine P for outputting the axioms of Fω, which gives rise to a “version” of Fω that we’ll call FP:

Output the axioms of F

For t=0,1,2,…

If M halts in t steps or fewer, then output “Con(FP)”; otherwise output “Con(Ft)”

Next t

You might notice that our description of P involves the very theory FP that we’re defining!  What lets us get away with this circularity is the Recursion Theorem, which says (informally) that when writing a program, we can always assume that the program has access to its own code.

Notice that, if P ever output the axiom “Con(FP)”, then FP would assert its own consistency, and would therefore be inconsistent, by the Second Incompleteness Theorem.  But by construction, P outputs “Con(FP)” if and only if M halts.  Therefore, if we assume FP‘s consistency as an axiom, then we can easily deduce that M doesn’t halt.  It follows that the theory Fω+1 := FP + Con(FP) proves that M runs forever.

One question that the above argument leaves open is whether there’s a Turing machine M that runs forever, as well as a system S of ordinal notations “extending as far as possible”, such that if we use S to define the theories Fα, then none of the Fα‘s prove that M runs forever.  If so, then there would be a clear sense in which iterated consistency axioms, by themselves, do not suffice to solve the halting problem.  Alas, I fear the answer might depend on exactly how we interpret the phrase “extending as far as possible” … elucidation welcome!

Update (June 29, 2011): In a comment, François Dorais comes to the rescue once again:

In connection with your last paragraph, Feferman has shown that there are paths through O such that the resulting theory proves all true ∏01 statements. [JSL 27 (1962), 259-316] Immediately after Feferman and Spector showed that not all paths through O do this. [JSL 27 (1962), 383-390] In particular, they show that any good path must be more complicated than O itself: the path cannot be ∏11. In other words, there is no simple way to form a wellordered iterated consistency extension that captures all true ∏01 statements.

39 Responses to “What Alan T. did for his PhD”

  1. Vadim P Says:

    Pardon the ignorant question (I know of no other kind) but I’m having trouble understanding what it means for a formal system to assert its consistency as an axiom. If F contains both “Con(f)” and “0=1” as axioms, wouldn’t that system be provably inconsistent notwithstanding its assertion of consistency? I interpret consistency, or lack of, as a property that shakes out of the set of axioms of F (even if we can’t prove the consistency given a powerful enough F) but calling it an axiom seems meaningless.

  2. Scott Says:

    Vadim: The fact that a system asserts its consistency doesn’t mean it’s right! So for example, a system F that includes 0=1 as an axiom proves everything, including Con(F) (you don’t even need to add Con(F) as a separate axiom), but of course F is inconsistent.

    However, in this post we’re operating under the background assumption that standard formal systems like Peano Arithmetic and ZF set theory are not only consistent but sound, or at least arithmetically sound (i.e., every statement that they prove about the positive integers is true).

    If you don’t believe that PA is sound (for example), then thinking about PA+Con(PA) won’t and shouldn’t convince you otherwise! But if you do believe PA is sound, then you also have to believe Con(PA), Con(PA+Con(PA)), and so on, and it becomes interesting to study which new theorems become provable (possibly theorems having nothing to do with consistency!) as you take on board these new axioms.

  3. Vadim P Says:

    Thanks, that was a helpful explanation.

  4. Juan Chicoy Says:

    I’m not sure Turing deserved a pHD for this work (as opposed to his other contributions). It does not appear significant enough to merit the degree. Perhaps a masters thesis. But then, I’m convinced most doctorates awarded these days are unearned.

  5. Scott Says:

    Juan: 1939 is “these days”? When were doctorates earned? 🙂

  6. Alon R Says:

    By “extending as far as possible” perhaps you mean to all ordinals up to the Church-Kleene ordinal? At that point and beyond the ordinals have no recursive well-ordering relation.

  7. Joshua Zelinsky Says:

    Juan, the trend of how many results are treated- First they are considered major results, then interesting results, then they become exercises in yellow books.

  8. fagagr4bh Says:

    Problems always seem easier in hindsight though… I remember taking Bernard Chazelle’s computational geometry course at Princeton and looking up the homework problems online after they were turned in. A number of them were problems (fairly famous-) people solved for their dissertations. They were probably hard at the time, but now they are recycled as homework problems. This “phenomenon” occurs in any field…

    Or maybe they weren’t that hard and I should have earned 5 doctorates for my performance.

  9. Scott Says:

    Alon R #6:

    By “extending as far as possible” perhaps you mean to all ordinals up to the Church-Kleene ordinal? At that point and beyond the ordinals have no recursive well-ordering relation.

    Yes, that’s probably what I meant! I was thinking of taking some maximal branch in the Church-Kleene tree of ordinal notations that Franzen denotes O. There are only countably many ordinals in such a branch, but they can’t be recursively enumerable (since otherwise one could extend the branch further). One can then ask: does there exist such a branch in which none of the theories let you prove some specific true Π1-sentence? If so, then iterating consistency statements can’t be enough by itself to render every true Π1-sentence provable: you might also need to play a cheap trick like Turing’s, and smuggle the sentence’s truth into the way the consistency statements are encoded.

  10. Mitch Says:

    Scott said:

    “…Peano Arithmetic and ZF set theory are not only consistent but sound, or at least arithmetically sound (i.e., every statement that they prove about the positive integers is true).”

    I am not seeing how PA could be consistent but not sound. It would have to involve PA proving a false statement and being unable to prove it’s (true) negation? All the examples I can think of lead to it being inconsistent as well. Like if PA proved Fermat’s Last Theorem but FLT is actually false, a counterexample would exist and PA would obv be able verify that the counter example is correct.

  11. Mitch Says:

    Err… Thinking about it more, perhaps if FLT is true, but PA proves it is false but can’t construct the exception…. In that case, i would wonder how we would know it is true though. Perhaps there is a lot simplier example I’m missing.

  12. johnstricker Says:

    Scott: I would hereby like to formally commend you on your ability to condense difficult matters into readable language. As a layman, I am hugely profiting from your writings. So, thanks a bundle! 😉

  13. Scott Says:

    johnstricker: Thanks! 🙂

    Mitch: The standard example of a theory that’s consistent but not sound is the “self-hating theory” PA+Not(Con(PA)), or PA plus the assertion of its own inconsistency. If PA is consistent, then PA+Not(Con(PA)) must be consistent as well—since otherwise, PA could prove its own consistency by contradiction, and would therefore be inconsistent by the Second Incompleteness Theorem. On the other hand, PA+Not(Con(PA)) is clearly unsound (in fact, it’s unsound whether or not PA is consistent). Intuitively, PA+Not(Con(PA)) “believes” there’s a proof of its own inconsistency, but can never produce an explicit example of such a proof.

  14. Mitch Says:

    Thanks Scott!

  15. Peter Says:

    Would it be possible to tag this post with PlanetMO? That way, people could find it on http://www.mathblogging.org/planetmo .

  16. John Sidles Says:

    The Wikipedia page Undecidable problem takes care to distinguishes statements that are deductively undecideable (in Gödel’s sense) from statements that are computably undecidable (in Turing’s sense). Is Wikipedia’s distinction sensible?

    On TCS StackExchange, there is a just-created community wiki titled “Do the undecidable attributes of P pose an obstruction to deciding P versus NP?” for which Wikipedia’s Gödel-vs-Turing distinction is significant. To date, the best contributions to this wiki are proof sketches (provided by students) that are mainly associated to the Turing variety of undecidability.

    To the extent that we are guided by the history of complexity theory (and Alan T’s role in particular) perhaps we should assess the creative contribution of Turing-style proof sketches by students as a hopeful sign that these avenues of research will prove productive. 🙂

  17. Bram Cohen Says:

    Scott, according to the wikipedia page on the continuum hypothesis “So far, CH appears to be independent of all known large cardinal axioms in the context of ZFC.” so as a very partial answer to your last question a TM looking for a disproof of CH is conjectured to not be provable to halt. As to whether there’s TM for which such a thing can be proven, I have no idea.

  18. Bram Cohen Says:

    What would be wrong with the machine for outputting axioms saying “If M halts in t steps or fewer, then output False; otherwise output “Con(Ft)””? That would seem to make the trick even cheaper, but appears to work.

  19. Scott Says:

    Peter #15: How do I do that? Do I just add “PlanetMO” as a WordPress category?

  20. Scott Says:

    John #16: There are undecidable statements in Gödel’s sense (relative to a specific formal system F), and then there are undecidable languages in Turing’s sense (which is an absolute notion). And yes, the distinction between the two is sensible, since they’re two completely different (albeit related) concepts that happen to share the same word!

  21. Scott Says:

    Bram #17: No, that’s not right. A TM looking for a proof or disproof of CH can be proved in ZF never to halt (assuming, of course, ZF’s consistency)—that’s what Gödel and Cohen showed! CH itself is undecidable, but CH’s provability in ZF is not undecidable. That means the relevant TM would have to be one that “looked directly for infinities of intermediate cardinality,” which is almost certainly as absurd as it sounds. 🙂

  22. Scott Says:

    Bram #18: Yes, it’s an excellent point, and one that occurred to me also. It would work fine. It looks like Turing had imposed on himself the further requirement that the machine P can be proved in F to only ever output axioms that assert the consistency of some theory in the sequence.

  23. Peter Says:

    @Scott Yes, just as a category or as a wordpress tag — both will be picked up, I think (as in else it’s a bug we will fix). Thanks!

  24. John Sidles Says:

    Thank you Scott … it’s clear that the distinction between Gödel-type undecidability (of statements) and Turing-type undecidability (of languages) is essential to an appreciation of conundrums like Joel David Hamkins’ answer to the MOF question Can a problem be simultaneously polynomial time and undecidable?. Hamkin’s answer amounts to (AFAICT):

    Yes, there exist Turing-decidable languages (in P?) whose words are true ZFC-statements that are Gödel-undecidable.

    Who woulda’ thunk it!? 🙂

  25. Scott Says:

    Peter #23: Thanks, done!

  26. François G. Dorais Says:

    In connection with your last paragraph, Feferman has shown that there are paths through O such that the resulting theory proves all true ∏^0_1 statements. [JSL 27 (1962), 259-316] Immediately after Feferman and Spector showed that not all paths through O do this. [JSL 27 (1962), 383-390] In particular, they show that any good path must be more complicated than O itself: the path cannot be ∏^1_1. In other words, there is no simple way to form a wellordered iterated consistency extension that captures all true ∏^0_1 statements.

  27. Scott Says:

    Thanks so much, François!

  28. jonas Says:

    I don’t understand. Just because the Turing machine was an old hat, if Turing had provably published results about it first, could’t he have gotten a PhD just for writing a dissertation summing his results about it?

  29. Scott Says:

    jonas: Even if he could’ve gotten away with repeating himself, he might not have wanted to! Also keep in mind that Turing’s advisor at Princeton, Alonzo Church, had independently proved Turing’s “main result” (the unsolvability of the Entscheidungsproblem) using lambda-calculus, and that no one (not even Turing) really understood at that time how important the Turing machine would become.

  30. Tracy Hall Says:

    How sneaky of Alan Turing to have hidden the answer to that question in his dissertation.

    Paragraph 3: I think you mean “can prove that M does not halt.”

  31. Scott Says:

    Tracy: Thanks! Fixed.

  32. Peter Says:

    Scott, sorry for bothering you again. It would be great if you could add a “PlanetMO” category for this post. So far, the only category for this post (showing up in your rss-feed) is “Nerd Interest” (which is awesome in it’s own respect).

  33. inquistor Says:

    i note, with some disppointment, that Kurt Godel is not mentioned in the article nor the comments thus far.

    he figered out that there is NO consistent system.

    ‘Nuf said. 🙂

  34. Peter Says:

    Thanks, Scott! It’s fixed now.

  35. Mitch Says:

    Scott said ” Even if he could’ve gotten away with repeating himself, he might not have wanted to!”

    If wiki is to be trused it seemed likely he didn’t have a choice:

    “In his memoirs Turing wrote that he was disappointed about the reception of this 1936 paper and that only two people had reacted – these being Heinrich Scholz and Richard Bevan Braithwaite.”

    http://en.wikipedia.org/wiki/Alan_Turing

    So at the time Turing was working on his thesis, he probably was just a bright grad student who had done some competent work before he got to grad school that nobody really read. Probably not that different from alot of his Princetion classmates.

  36. Scott Says:
      i note, with some disppointment, that Kurt Godel is not mentioned in the article nor the comments thus far.

      he figered out that there is NO consistent system.

      ‘Nuf said. 🙂

    inquisitor #33: I wish it didn’t affect me so much, but the fact that someone as contentedly ignorant as you are exists is going to depress me for the rest of the day.

    Why don’t you go read Wikipedia if you want to understand what it is that Gödel “figered out.” (I’ll give you two hints: (1) it involves the completeness of formal systems, not just their consistency; (2) it’s been the implicit starting point, known to everyone here, for the entire conversation on this thread.)

  37. Tonio Says:

    I’m a bit perplexed by this statement: “and it becomes interesting to study which new theorems become provable (possibly theorems having nothing to do with consistency!) as you take on board these new axioms.”

    Surely an axiom stating that the previous axioms don’t lead to an internal contradiction isn’t going to make new theorems true; in fact it will either make your axioms inconsistent or have no effect. Presumably 0=1 directly contradicts some other axiom (e.g. with PA, 1 = S(0) and by definition is non-zero).

    Indeed, I can prove that axioms of this type won’t lead to new theorems becoming available because you need never make use of an axiom y that states “axiom x is not violated” in a proof since you already have axiom x to call upon. It follows that all axioms of the form “axiom x is not violated” need never be used in a proof.

    This is very different from making an unprovable but true thing into an axiom. (Consider the axiom of choice, which can be assumed to be true, assumed to be false, or avoided altogether, and which leads to different theorems being provable.)

    Perhaps this entire line of reasoning is spurious. A set of axioms should be irreducible before proceeding, and an axiom stating that the preceding axioms are consistent is by its nature eliminated.

  38. Scott Says:

    Tonio #37: Let me explain why your (plausible) intuition fails. Adding a consistency axiom indeed can’t make new theorems TRUE (truth and falsehood are absolute notions, independent of the formal system), but the whole point of Godel’s Theorem is that it can make new theorems PROVABLE. As an example, consider a Turing machine M that enumerates all proofs in your system F and halts iff it finds a contradiction. Then “M runs forever” is a theorem, on its face purely about the behavior of some Turing machine, that we know CAN’T be proved in F (assuming F is consistent) but CAN be proved in F+Con(F). The fact that we, “looking in from the outside”, see nothing in asserting Con(F) that wasn’t already implicit in asserting F itself is irrelevant here, since we’re only talking about what’s provable.

  39. John Says:

    Juan #4 commented that Turing didn’t deserve a Phd for his dissertation given it’s content based on the standards of 2011. Let’s see, if he’d done his work on computability in 1976 instead of 1936 he’d have very likely gotten the Field Medal and the >>>Turing Award<<< before almost all of us started working. And oh by the way in the 5 years following his Phd (1938) he pretty nearly literally saved the world for democracy through his work on code breaking and Enigma. If this guy didn't deserve his Phd from Princeton, supervised by Alonzo Church for crying out loud, nobody ever has. In this case, I'd say more honor falls on the institution for having granted it to him rather than the other way round. All the more shame on the UK government for not giving him higher honors (a KBE?) and a cover-up if that was really necessary in lieu of persecuting him for homosexuality.