## What Alan T. did for his PhD

We’ve all been there before: by the time you start graduate school in Princeton, you’ve already invented the Turing machine, pioneered the concept of computational universality, and proved the unsolvability of Hilbert’s *Entscheidungsproblem*. A few years from now, you’re going to return to England to make decisive contributions to the breaking of the Enigma and the winning of World War II. Your problem is, what do you do for the couple years in between? (Keep in mind that you have a *PhD thesis* to submit, and the Turing machine is already old hat by now!)

The answer, apparently, is to tackle a neat problem in logic, one version of which was asked three weeks ago by a *Shtetl-Optimized* commenter named Schulz. Not knowing the answer, I posted Schulz’s problem to MathOverflow. There, François Dorais and Philip Welch quickly informed me that Turing had already studied the problem in 1939, and Timothy Chow pointed me to Torkel Franzen’s book Inexhaustability: A Non-Exhaustive Treatment, which explains Turing’s basic observation and the background leading up to it in a crystal-clear way.

The problem is this: given any formal system F that we might want to take as a foundation for mathematics (for example, Peano Arithmetic or Zermelo-Fraenkel set theory), Gödel tells us that there are Turing machines that run forever, but that can’t be *proved* to run forever in F. An example is a Turing machine M that enumerates all the proofs in F one by one, and that halts if it ever encounters a proof of 0=1. The claim that M doesn’t halt is equivalent to the claim that F is consistent—but if F is indeed consistent, then the Second Incompleteness Theorem says that it can’t prove its own consistency.

On the other hand, if we just add the reasonable axiom Con(F) (which *asserts* that F is consistent), then our new theory, F+Con(F), *can* prove that M runs forever. Of course, we can then construct a new Turing machine M’, which runs forever if and only if *F+Con(F)* is consistent. Then by the same argument, F+Con(F) won’t be able to prove that M’ runs forever: to prove that, we’ll need a yet stronger theory, F+Con(F)+Con(F+Con(F)). This leads inevitably to considering an infinite tower of theories F_{0}, F_{1}, F_{2}, …, where each theory asserts the consistency of the ones before it:

F_{0} = F

F_{i} = F_{i-1} + Con(F_{i-1}) for all i≥1

But there’s no reason not to go further, and define another theory that asserts the consistency of *every* theory in the above list, and then another theory that asserts the consistency of *that* theory, and so on. We can formalize this using ordinals:

F_{ω} = F + Con(F_{0}) + Con(F_{1}) + Con(F_{2}) + …

F_{ω+i} = F_{ω+i-1} + Con(F_{ω+i-1}) for all i≥1

F_{2ω} = F_{ω} + Con(F_{ω}) + Con(F_{ω+1}) + Con(F_{ω+2}) + …

and so on, for every ordinal α that we can define in the language of F. For every such ordinal α, we can easily construct a Turing machine M_{α} that runs forever, but that can’t be *proved* to run forever in F_{α} (only in the later theories). The interesting question is, *what happens if we reverse the quantifiers?* In other words:

**Given a Turing machine M that runs forever, is there always an ordinal α such that F _{α} proves that M runs forever?**

This is the question Turing studied, but I should warn you that his answer is disappointing. It turns out that the theories F_{α} are not as well-defined as they look. The trouble is that, even to *define* a theory with infinitely many axioms (like F_{ω} or F_{2ω}), you need to encode the axioms in some systematic way: for example, by giving a Turing machine that spits out the axioms one by one. But Turing observes that the power of F_{α} can depend strongly on *which* Turing machine you use to spit out its axioms! Indeed, he proves the following theorem:

**Given any Turing machine M that runs forever, there is some “version” of F _{ω+1} (i.e., some way of encoding its axioms) such that F_{ω+1} proves that M runs forever.**

The proof is simple. Assume for simplicity that F itself has only finitely many axioms (removing that assumption is straightforward). Then consider the following Turing machine P for outputting the axioms of F_{ω}, which gives rise to a “version” of F_{ω} that we’ll call F_{P}:

Output the axioms of F

For t=0,1,2,…

If M halts in t steps or fewer, then output “Con(F_{P})”; otherwise output “Con(F_{t})”

Next t

You might notice that our description of P involves the very theory F_{P} that we’re defining! What lets us get away with this circularity is the Recursion Theorem, which says (informally) that when writing a program, we can always assume that the program has access to its own code.

Notice that, if P ever output the axiom “Con(F_{P})”, then F_{P} would assert its own consistency, and would therefore be *in*consistent, by the Second Incompleteness Theorem. But by construction, P outputs “Con(F_{P})” if and only if M halts. Therefore, if we assume F_{P}‘s consistency as an axiom, then we can easily deduce that M *doesn’t* halt. It follows that the theory F_{ω+1} := F_{P} + Con(F_{P}) proves that M runs forever.

One question that the above argument leaves open is whether there’s a Turing machine M that runs forever, as well as a system S of ordinal notations “extending as far as possible”, such that *if* we use S to define the theories F_{α}, *then* none of the F_{α}‘s prove that M runs forever. If so, then there would be a clear sense in which iterated consistency axioms, by themselves, do *not* suffice to solve the halting problem. Alas, I fear the answer might depend on exactly how we interpret the phrase “extending as far as possible” … elucidation welcome!

**Update (June 29, 2011):** In a comment, François Dorais comes to the rescue once again:

In connection with your last paragraph, Feferman has shown that there are paths through O such that the resulting theory proves all true ∏^{0}_{1} statements. [JSL 27 (1962), 259-316] Immediately after Feferman and Spector showed that not all paths through O do this. [JSL 27 (1962), 383-390] In particular, they show that any good path must be more complicated than O itself: the path cannot be ∏^{1}_{1}. In other words, there is no simple way to form a wellordered iterated consistency extension that captures all true ∏^{0}_{1} statements.

Comment #1 June 28th, 2011 at 8:25 am

Pardon the ignorant question (I know of no other kind) but I’m having trouble understanding what it means for a formal system to assert its consistency as an axiom. If F contains both “Con(f)” and “0=1” as axioms, wouldn’t that system be provably inconsistent notwithstanding its assertion of consistency? I interpret consistency, or lack of, as a property that shakes out of the set of axioms of F (even if we can’t prove the consistency given a powerful enough F) but calling it an axiom seems meaningless.

Comment #2 June 28th, 2011 at 9:44 am

Vadim: The fact that a system

assertsits consistency doesn’t mean it’sright! So for example, a system F that includes 0=1 as an axiom proveseverything, including Con(F) (you don’t even need to add Con(F) as a separate axiom), but of course F is inconsistent.However, in this post we’re operating under the background assumption that standard formal systems like Peano Arithmetic and ZF set theory are not only consistent but

sound, or at least arithmetically sound (i.e., every statement that they prove about the positive integers is true).If you don’t believe that PA is sound (for example), then thinking about PA+Con(PA) won’t and shouldn’t convince you otherwise! But if you

dobelieve PA is sound, then you also have to believe Con(PA), Con(PA+Con(PA)), and so on, and it becomes interesting to study which new theorems become provable (possibly theorems having nothing to do with consistency!) as you take on board these new axioms.Comment #3 June 28th, 2011 at 10:09 am

Thanks, that was a helpful explanation.

Comment #4 June 28th, 2011 at 11:41 am

I’m not sure Turing deserved a pHD for this work (as opposed to his other contributions). It does not appear significant enough to merit the degree. Perhaps a masters thesis. But then, I’m convinced most doctorates awarded these days are unearned.

Comment #5 June 28th, 2011 at 11:52 am

Juan: 1939 is “these days”? When

weredoctorates earned? 🙂Comment #6 June 28th, 2011 at 12:58 pm

By “extending as far as possible” perhaps you mean to all ordinals up to the Church-Kleene ordinal? At that point and beyond the ordinals have no recursive well-ordering relation.

Comment #7 June 28th, 2011 at 1:22 pm

Juan, the trend of how many results are treated- First they are considered major results, then interesting results, then they become exercises in yellow books.

Comment #8 June 28th, 2011 at 1:30 pm

Problems always seem easier in hindsight though… I remember taking Bernard Chazelle’s computational geometry course at Princeton and looking up the homework problems online after they were turned in. A number of them were problems (fairly famous-) people solved for their dissertations. They were probably hard at the time, but now they are recycled as homework problems. This “phenomenon” occurs in any field…

Or maybe they weren’t that hard and I should have earned 5 doctorates for my performance.

Comment #9 June 28th, 2011 at 3:56 pm

Alon R #6:

By “extending as far as possible” perhaps you mean to all ordinals up to the Church-Kleene ordinal? At that point and beyond the ordinals have no recursive well-ordering relation.Yes, that’s probably what I meant! I was thinking of taking

somemaximal branch in the Church-Kleene tree of ordinal notations that Franzen denotes O. There are only countably many ordinals in such a branch, but they can’t be recursively enumerable (since otherwise one could extend the branch further). One can then ask: does there exist such a branch in whichnoneof the theories let you prove some specific true Π_{1}-sentence? If so, then iterating consistency statements can’t be enough by itself to render every true Π_{1}-sentence provable: you mightalsoneed to play a cheap trick like Turing’s, and smuggle the sentence’s truth into the way the consistency statements are encoded.Comment #10 June 28th, 2011 at 4:54 pm

Scott said:

“…Peano Arithmetic and ZF set theory are not only consistent but sound, or at least arithmetically sound (i.e., every statement that they prove about the positive integers is true).”

I am not seeing how PA could be consistent but not sound. It would have to involve PA proving a false statement and being unable to prove it’s (true) negation? All the examples I can think of lead to it being inconsistent as well. Like if PA proved Fermat’s Last Theorem but FLT is actually false, a counterexample would exist and PA would obv be able verify that the counter example is correct.

Comment #11 June 28th, 2011 at 5:06 pm

Err… Thinking about it more, perhaps if FLT is true, but PA proves it is false but can’t construct the exception…. In that case, i would wonder how we would know it is true though. Perhaps there is a lot simplier example I’m missing.

Comment #12 June 28th, 2011 at 5:28 pm

Scott: I would hereby like to formally commend you on your ability to condense difficult matters into readable language. As a layman, I am hugely profiting from your writings. So, thanks a bundle! 😉

Comment #13 June 28th, 2011 at 6:02 pm

johnstricker: Thanks! 🙂

Mitch: The standard example of a theory that’s consistent but not sound is the “self-hating theory” PA+Not(Con(PA)), or PA plus the assertion of its own inconsistency. If PA is consistent, then PA+Not(Con(PA)) must be consistent as well—since otherwise, PA could prove its own consistency by contradiction, and would therefore be

inconsistent by the Second Incompleteness Theorem. On the other hand, PA+Not(Con(PA)) is clearly unsound (in fact, it’s unsound whether or not PA is consistent). Intuitively, PA+Not(Con(PA)) “believes” there’s a proof of its own inconsistency, but can never produce an explicit example of such a proof.Comment #14 June 28th, 2011 at 6:14 pm

Thanks Scott!

Comment #15 June 28th, 2011 at 10:00 pm

Would it be possible to tag this post with PlanetMO? That way, people could find it on http://www.mathblogging.org/planetmo .

Comment #16 June 29th, 2011 at 7:22 am

The Wikipedia page Undecidable problem takes care to distinguishes statements that are

deductivelyundecideable (in Gödel’s sense) from statements that arecomputablyundecidable (in Turing’s sense). Is Wikipedia’s distinction sensible?On

TCS StackExchange, there is a just-created community wiki titled “Do the undecidable attributes of P pose an obstruction to deciding P versus NP?” for which Wikipedia’s Gödel-vs-Turing distinction is significant. To date, the best contributions to this wiki are proof sketches (provided by students) that are mainly associated to the Turing variety of undecidability.To the extent that we are guided by the history of complexity theory (and Alan T’s role in particular) perhaps we should assess the creative contribution of Turing-style proof sketches by students as a hopeful sign that these avenues of research will prove productive. 🙂

Comment #17 June 29th, 2011 at 8:33 am

Scott, according to the wikipedia page on the continuum hypothesis “So far, CH appears to be independent of all known large cardinal axioms in the context of ZFC.” so as a very partial answer to your last question a TM looking for a disproof of CH is conjectured to not be provable to halt. As to whether there’s TM for which such a thing can be proven, I have no idea.

Comment #18 June 29th, 2011 at 8:34 am

What would be wrong with the machine for outputting axioms saying “If M halts in t steps or fewer, then output False; otherwise output “Con(Ft)””? That would seem to make the trick even cheaper, but appears to work.

Comment #19 June 29th, 2011 at 8:58 am

Peter #15: How do I do that? Do I just add “PlanetMO” as a WordPress category?

Comment #20 June 29th, 2011 at 9:01 am

John #16: There are undecidable

statementsin Gödel’s sense (relative to a specific formal system F), and then there are undecidablelanguagesin Turing’s sense (which is an absolute notion). And yes, the distinction between the two is sensible, since they’re two completely different (albeit related) concepts that happen to share the same word!Comment #21 June 29th, 2011 at 9:09 am

Bram #17: No, that’s not right. A TM looking for a proof or disproof of CH can be

provedin ZF never to halt (assuming, of course, ZF’s consistency)—that’s what Gödel and Cohen showed! CH itself is undecidable, but CH’sprovabilityin ZF is not undecidable. That means the relevant TM would have to be one that “looked directly for infinities of intermediate cardinality,” which is almost certainly as absurd as it sounds. 🙂Comment #22 June 29th, 2011 at 9:11 am

Bram #18: Yes, it’s an excellent point, and one that occurred to me also. It would work fine. It

lookslike Turing had imposed on himself the further requirement that the machine P can be proved in F to only ever output axioms that assert the consistency of some theory in the sequence.Comment #23 June 29th, 2011 at 9:33 am

@Scott Yes, just as a category or as a wordpress tag — both will be picked up, I think (as in else it’s a bug we will fix). Thanks!

Comment #24 June 29th, 2011 at 9:50 am

Thank you Scott … it’s clear that the distinction between Gödel-type undecidability (of statements) and Turing-type undecidability (of languages) is essential to an appreciation of conundrums like Joel David Hamkins’ answer to the MOF question Can a problem be simultaneously polynomial time and undecidable?. Hamkin’s answer amounts to (AFAICT):

Who woulda’ thunk it!? 🙂

Comment #25 June 29th, 2011 at 1:58 pm

Peter #23: Thanks, done!

Comment #26 June 29th, 2011 at 3:05 pm

In connection with your last paragraph, Feferman has shown that there are paths through O such that the resulting theory proves all true ∏^0_1 statements. [JSL 27 (1962), 259-316] Immediately after Feferman and Spector showed that not all paths through O do this. [JSL 27 (1962), 383-390] In particular, they show that any good path must be more complicated than O itself: the path cannot be ∏^1_1. In other words, there is no simple way to form a wellordered iterated consistency extension that captures all true ∏^0_1 statements.

Comment #27 June 29th, 2011 at 5:14 pm

Thanks so much, François!

Comment #28 June 30th, 2011 at 4:02 am

I don’t understand. Just because the Turing machine was an old hat, if Turing had provably published results about it first, could’t he have gotten a PhD just for writing a dissertation summing his results about it?

Comment #29 June 30th, 2011 at 6:28 am

jonas: Even if he could’ve gotten away with repeating himself, he might not have

wantedto! Also keep in mind that Turing’s advisor at Princeton, Alonzo Church, had independently proved Turing’s “main result” (the unsolvability of the Entscheidungsproblem) using lambda-calculus, and that no one (not even Turing) really understood at that time how important the Turing machine would become.Comment #30 July 1st, 2011 at 1:29 pm

How sneaky of Alan Turing to have hidden the answer to that question in his dissertation.

Paragraph 3: I think you mean “can prove that M does not halt.”

Comment #31 July 1st, 2011 at 1:52 pm

Tracy: Thanks! Fixed.

Comment #32 July 4th, 2011 at 4:05 pm

Scott, sorry for bothering you again. It would be great if you could add a “PlanetMO” category for this post. So far, the only category for this post (showing up in your rss-feed) is “Nerd Interest” (which is awesome in it’s own respect).

Comment #33 July 5th, 2011 at 10:51 am

i note, with some disppointment, that Kurt Godel is not mentioned in the article nor the comments thus far.

he figered out that there is NO consistent system.

‘Nuf said. 🙂

Comment #34 July 5th, 2011 at 11:07 am

Thanks, Scott! It’s fixed now.

Comment #35 July 5th, 2011 at 12:06 pm

Scott said ” Even if he could’ve gotten away with repeating himself, he might not have wanted to!”

If wiki is to be trused it seemed likely he didn’t have a choice:

“In his memoirs Turing wrote that he was disappointed about the reception of this 1936 paper and that only two people had reacted – these being Heinrich Scholz and Richard Bevan Braithwaite.”

http://en.wikipedia.org/wiki/Alan_Turing

So at the time Turing was working on his thesis, he probably was just a bright grad student who had done some competent work before he got to grad school that nobody really read. Probably not that different from alot of his Princetion classmates.

Comment #36 July 5th, 2011 at 12:40 pm

i note, with some disppointment, that Kurt Godel is not mentioned in the article nor the comments thus far.

he figered out that there is NO consistent system.

‘Nuf said. 🙂

inquisitor #33: I wish it didn’t affect me so much, but the fact that someone as

contentedlyignorant as you are exists is going to depress me for the rest of the day.Why don’t you go read Wikipedia if you want to understand what it is that Gödel “figered out.” (I’ll give you two hints: (1) it involves the

completenessof formal systems, not just their consistency; (2) it’s been the implicit starting point, known to everyone here, for the entire conversation on this thread.)Comment #37 July 5th, 2011 at 2:01 pm

I’m a bit perplexed by this statement: “and it becomes interesting to study which new theorems become provable (possibly theorems having nothing to do with consistency!) as you take on board these new axioms.”

Surely an axiom stating that the previous axioms don’t lead to an internal contradiction isn’t going to make new theorems true; in fact it will either make your axioms inconsistent or have no effect. Presumably 0=1 directly contradicts some other axiom (e.g. with PA, 1 = S(0) and by definition is non-zero).

Indeed, I can prove that axioms of this type won’t lead to new theorems becoming available because you need never make use of an axiom y that states “axiom x is not violated” in a proof since you already have axiom x to call upon. It follows that all axioms of the form “axiom x is not violated” need never be used in a proof.

This is very different from making an unprovable but true thing into an axiom. (Consider the axiom of choice, which can be assumed to be true, assumed to be false, or avoided altogether, and which leads to different theorems being provable.)

Perhaps this entire line of reasoning is spurious. A set of axioms should be irreducible before proceeding, and an axiom stating that the preceding axioms are consistent is by its nature eliminated.

Comment #38 July 5th, 2011 at 2:49 pm

Tonio #37: Let me explain why your (plausible) intuition fails. Adding a consistency axiom indeed can’t make new theorems TRUE (truth and falsehood are absolute notions, independent of the formal system), but the whole point of Godel’s Theorem is that it can make new theorems PROVABLE. As an example, consider a Turing machine M that enumerates all proofs in your system F and halts iff it finds a contradiction. Then “M runs forever” is a theorem, on its face purely about the behavior of some Turing machine, that we know CAN’T be proved in F (assuming F is consistent) but CAN be proved in F+Con(F). The fact that we, “looking in from the outside”, see nothing in asserting Con(F) that wasn’t already implicit in asserting F itself is irrelevant here, since we’re only talking about what’s provable.

Comment #39 July 13th, 2011 at 5:30 pm

Juan #4 commented that Turing didn’t deserve a Phd for his dissertation given it’s content based on the standards of 2011. Let’s see, if he’d done his work on computability in 1976 instead of 1936 he’d have very likely gotten the Field Medal and the >>>Turing Award<<< before almost all of us started working. And oh by the way in the 5 years following his Phd (1938) he pretty nearly literally saved the world for democracy through his work on code breaking and Enigma. If this guy didn't deserve his Phd from Princeton, supervised by Alonzo Church for crying out loud, nobody ever has. In this case, I'd say more honor falls on the institution for having granted it to him rather than the other way round. All the more shame on the UK government for not giving him higher honors (a KBE?) and a cover-up if that was really necessary in lieu of persecuting him for homosexuality.