Explanation-Gödel and Plausibility-Gödel

Here’s an observation that’s mathematically trivial but might not be widely appreciated. In kindergarten, we all learned Gödel’s First Incompleteness Theorem, which given a formal system F, constructs an arithmetical encoding of

G(F) = “This sentence is not provable in F.”

If G(F) is true, then it’s an example of a true arithmetical sentence that’s unprovable in F. If, on the other hand, G(F) is false, then it’s provable, which means that F isn’t arithmetically sound. Therefore F is either incomplete or unsound.

Many have objected: “but despite Gödel’s Theorem, it’s still easy to explain why G(F) is true. In fact, the argument above basically already did it!”

[Note: Please stop leaving comments explaining to me that G(F) follows from F’s consistency. I understand that: the “heuristic” part of the argument is F’s consistency! I made a pedagogical choice to elide that, which nerd-sniping has now rendered untenable.]

You might make a more general point: there are many, many mathematical statements for which we currently lack a proof, but we do seem to have a fully convincing heuristic explanation: one that “proves the statement to physics standards of rigor.” For example:

  • The Twin Primes Conjecture (there are infinitely many primes p for which p+2 is also prime).
  • The Collatz Conjecture (the iterative process that maps each positive integer n to n/2 if n is even, or to 3n+1 if n is odd, eventually reaches 1 regardless of which n you start at).
  • π is a normal number (or even just: the digits 0-9 all occur with equal limiting frequencies in the decimal expansion of π).
  • π+e is irrational.

And so on. No one has any idea how to prove any of the above statements—and yet, just on statistical grounds, it seems clear that it would require a ludicrous conspiracy to make any of them false.

Conversely, one could argue that there are statements for which we do have a proof, even though we lack a “convincing explanation” for the statements’ truth. Maybe the Four-Color Theorem or Hales’s Theorem, for which every known proof requires a massive computer enumeration of cases, belong to this class. Other people might argue that, given a proof, an explanation could always be extracted with enough time and effort, though resolving this dispute won’t matter for what follows.

You might hope that, even if some true mathematical statements can’t be proved, every true statement might nevertheless have a convincing heuristic explanation. Alas, a trivial adaptation of Gödel’s Theorem shows that, if (1) heuristic explanations are to be checkable by computer, and (2) only true statements are to have convincing heuristic explanations, then this isn’t possible either. I mean, let E be a program that accepts or rejects proposed heuristic explanations, for statements like the Twin Prime Conjecture or the Collatz Conjecture. Then construct the sentence

S(E) = “This sentence has no convincing heuristic explanation accepted by E.”

If S(E) is true, then it’s an example of a true arithmetical statement without even a convincing heuristic explanation for its truth (!). If, on the other hand, S(E) is false, then there’s a convincing heuristic explanation of its truth, which means that something has gone wrong.

What’s happening, of course, is that given the two conditions we imposed, our “heuristic explanation system” was a proof system, even though we didn’t call it one. This is my point, though: when we use the word “proof,” it normally invokes a specific image, of a sequence of statements that marches from axioms to a theorem, with each statement following from the preceding ones by rigid inference rules like those of first-order logic. None of that, however, plays any direct role in the proof of the Incompleteness Theorem, which cares only about soundness (inability to prove falsehoods) and checkability by a computer (what, with hindsight, Gödel’s “arithmetization of syntax” was all about). The logic works for “heuristic explanations” too.

Now we come to something that I picked up from my former student (and now AI alignment leader) Paul Christiano, on a recent trip to the Bay Area, and which I share with Paul’s kind permission. Having learned that there’s no way to mechanize even heuristic explanations for all the true statements of arithmetic, we could set our sights lower still, and ask about mere plausibility arguments—arguments that might be overturned on further reflection. Is there some sense in which every true mathematical statement at least has a good plausibility argument?

Maybe you see where this is going. Letting P be a program that accepts or rejects proposed plausibility arguments, we can construct

S(P) = “This sentence has no argument for its plausibility accepted by P.”

If S(P) is true, then it’s an example of a true arithmetical statement without even a plausibility argument for its truth (!). If, on the other hand, S(P) is false, then there is a plausibility argument for it. By itself, this is not at all a fatal problem: all sorts of false statements (IP≠PSPACE, switching doors doesn’t matter in Monty Hall, Trump couldn’t possibly become president…) have had decent plausibility arguments. Having said that, it’s pretty strange that you can have a plausibility argument that’s immediately contradicted by its own existence! This rules out some properties that you might want your “plausibility system” to have, although maybe a plausibility system exists that’s still nontrivial and that has weaker properties.

Anyway, I don’t know where I’m going with this, or even why I posted it, but I hope you enjoyed it! And maybe there’s something to be discovered in this direction.

83 Responses to “Explanation-Gödel and Plausibility-Gödel”

  1. Ernest Davis Says:

    I’m not buying either side. The idea that there are true mathematical statements that have no arguments in favor of their plausibility seems to me not at all surprising (or implausible). And the idea that there is a self-referential mathematical statement for which there are plausibility arguments even though the sentence itself denies their existence seems to me not very surprising.

    Also I’m not sure we have an “heuristic explanation” that pi is normal. We have a heuristic argument — almost every real number is, and we have no reason to think that pi isn’t, and it is out to however many digits we’ve currently calculated — but that’s not exactly an explanation.

  2. fred Says:

    G(F) = “This sentence is not provable in F.”
    S(E) = “This sentence has no convincing heuristic explanation accepted by E.”
    S(P) = “This sentence has no argument for its plausibility accepted by P.”

    are all self-referential (Russel’s paradox, Barber paradox, “What is the smallest whole number with no interesting properties?”, etc).

  3. Woett Says:

    I know it’s not directly on topic, but still can’t help myself to mention a small pet peeve of mine: the suggestion that the proof of the Four Color Theorem does not give us any insight into why it’s true. Of course it’s computer assisted, but it’s in fact quite easy -even to a relative layperson- to explain why the proof works and why FCT should be true! All you have to do is explain the notions of an unavoidable set and a reducible configuration. Once these notions are in place, it is immediate that the existence of an unavoidable set of reducible configuratons proves FCT. The fact that the only such sets we currently know need to be checked by a computer, does not really change that. The famous flawed proof of Kempe almost showed that such a set exists with only a few elements. “All” that had to be done from there was increase the size 😉

    On topic: cool blog post and certainly enjoyed this new insight into Plausibility-Gödel! Thanks for writing it.

  4. Ryan Alweiss Says:

    Nice observation! It’s disturbing but not surprising that some things with no convincing heuristic explanations are still true anyway; that’s life.

  5. Scott Says:

    Ernest Davis #1: I never claimed this was especially shocking! But I do want to defend the idea that the normality of π has been satisfactorily “explained” at some level, even though it hasn’t been proven. Often, an “explanation” simply consists of going through all the known reasons why a statement might be false, and checking that none of them apply to the situation at hand. For example, the best “explanation” for many quantum speedups is simply that, of all the ways you could imagine efficiently simulating such-and-such with a classical computer, none of them work. π is normal because the statistical tendencies of pseudorandom processes will tend to make every real number normal unless it has a good reason not to be (like being rational, or having something special about its base-10 representation).

  6. Scott Says:

    Woett #3: Fair point. Can we agree that Appel and Haken’s proof satisfactorily explains why the four-color theorem is true, conditional on the couple thousand cases that need to be checked by computer coming out the right way?

  7. Dan Riley Says:

    You went to a weird kindergarten. Ours taught us about the symmetries of the QM Hamiltonian long before we got to Gödel incompleteness (or at least that’s what our QMII prof told us).

    Sadly, while composing this I found that said QMII prof, Kurt Gottfried, passed away in August. He was great, in more dimensions than I could ever hope to occupy.

  8. Peter Says:

    Essentially math is order.
    Understanding order by rules, is creating a Math formula describing the order.
    The proof is the verification that the framework of Math still holds.
    Though there is no formula for predicting the nth prime number, still primes exist.
    So it’s our understanding (and acceptance) of order, beyond the nature of things that frightens us. It’s though still the best we have to describe the ‘scary’ nature.
    I guess our minds are not built to handle order,

  9. Joshua Zelinsky Says:

    I’m not completely convinced of your two observations. My general attitude is that unless a self-referential statement has been very carefully written out, trying to interpret it, or even knowing if it makes sense can be difficult. And I’m not sure how to use the Diagonal lemma to successfully get versions of your last two statements.

    But this may be a failure of my own highly concrete thinking. So let’s try the following variant and make things even more concrete. There are a lot of heuristics about when Diophantine equations have finitely many or infinitely many solutions. The standard sort of heuristic approach tries to get a rough notion of the expected number of solutions and declares there to be probably infinitely many if this expectation is infinity. This heuristic can be done with other contexts also, such as the primes, where people have likely seen it before. (See for example discussion here https://math.stackexchange.com/questions/3475496/heuristics-of-counting-twin-primes .)

    When one applies this to a famous Diophantine equation one gets that x^n + y^n = z^n should only have finitely many solutions when n>=4, and have infinitely many when n=2. But somewhat annoyingly, it also suggests that there should be infinitely many when n=3! (That’s an exclamation mark not a factorial.) People have proposed fixes for this sort of heuristic for why n=3, but none of them are that satisfactory. In that sense, the difference between what we can prove and what is heuristically true is a massive gap. one of the easiest of of the cases of Fermat’s Last Theorem is the case which in some sense has no heuristic right to be true, and the easiest case to prove (n=4) is one which just manages to get the heuristic to work right (as one goes to larger n, the relevant series converges faster so we should be more confident in the claim).

    So, can we make a heuristic which at least is always correct about which Diophantine equations have only finitely many solutions? This is about as concrete a problem as one could hope for. But we can’t. You can use the same machinery that solves Hilbert’s Tenth Problem to show that if you had such a foolproof heuristic then you could use the Halting Problem. And the same goes if your heuristic instead of asking if there are finitely many or infinitely many gave good answers to is there at least one solution.

    So what’s the correct question for plausibility for Diophantine equations? How about the following: Can there be an argument such that every Diophantine equation which has at least one solution has a plausible argument that it has a solution? Here there’s a trivial answer of yes: Just declare everything to be accepted as your plausibility argument. So it seems like we need a more careful notion of what we mean by a plausibility argument.

  10. Scott Says:

    Joshua Zelinsky #9: I’m glad that you’re unconvinced, since while I’m certain about correctness, your skepticism makes me feel like this is less trivial! 🙂

    Try using the alternate proof of Gödel where you enumerate all possible proofs and disproofs in order to solve the halting problem. Is it clear then that the same logic generalizes to what I called “heuristic explanations,” but that effectively function as proofs?

  11. Hyman Rosen Says:

    Once you have such a G(F), you can adjoin either G(F) or ¬G(F) to F, creating two new systems which are both as consistent as F itself, so having a heuristic explanation of why G(F) is true doesn’t help – you have a heuristic explanation that ¬G(F) is false, but it isn’t.

  12. Bolton Says:

    Reading the comments about what exactly is meant by a plausibility argument, I thought of this paper https://arxiv.org/pdf/1609.03543.pdf I skimmed a while ago about “logical inductors” – programs that run indefinitely on logical statements and update their assessment of their truth over time, but which can learn heuristics that give it an accurate assessment of classes of statements even before it finds a rigorous proof for them. Perhaps you could prove something about how some classes of statements will “trick” these inductors in some way or for some period of time.

  13. Kevin Says:

    Personally, I prefer the proof of the (first) incompleteness theorem by reduction to the halting problem (proofs are enumerable and machine-checkable, so you can enumerate and check the proofs to determine whether machine M halts on input I, and this can only fail to work if the proof of that fact sometimes fails to exist) over the Gödel numbering proof. This is for multiple reasons:

    1. Explaining Gödel numbering is hard (and while you can elide it at introductory levels, it really is an essential element of the proof). You barely have to explain Coq (or any proof assistant), since you can just fire it up and show it to someone directly.
    2. The halting problem is useful in numerous other contexts, so it must be explained anyway.
    3. The version based on the halting problem doesn’t just prove that independent statements exist. It proves that *interesting* independent statements exist – whether or not machine M halts on input I is a relatively concrete fact, compared to the rather nebulous self-reference that you get out of the traditional proof. Indeed, you can translate M into any Turing-complete model of computation, like a programming language, and try running it. It’s very counterintuitive that something you can literally see executing right in front of you could be independent in this way. (Yes, yes, the proof is nonconstructive, so you don’t automatically know what M and I are, and yes, there are specific interesting statements that are already known to be independent, like GCH. But try explaining the independence of GCH to an undergraduate.)
    4. Those statements get even more interesting if you bring Rice’s theorem into the game. Then, for every (nontrivial) behavioral predicate P, there exists at least one machine M for which P(M) is independent (again using proof enumeration to reduce to Rice’s theorem).
    5. More generally, this suggests that independence and undecidability are intimately related concepts, a very useful intuition to keep in your back pocket.

    The main disadvantage is that this proof is technically weaker insofar as it specifically assumes that the formal system we started with is sound (only proves statements that are “really true”), whereas Gödel’s original proof merely assumes that the system is consistent (does not prove a contradiction). That’s an important technical distinction, as it implies we can’t just rearrange the deck chairs on the SS Principia Mathematica to somehow “fix” independence, but IMHO it is outweighed by the above advantages, at least for expository purposes.

  14. Mateus Araújo Says:

    I don’t think the Gödel sentence is in the same category of the twin primes conjecture, or the other conjectures you have listed. After all we do have a perfectly rigorous proof of G(F) in F+con(F). And the consistency of F is precisely what we are assuming in the informal argument for the truth of G(F), so it’s unsurprising that we need to assume it formally to get an actual proof.

    Nothing of the sort holds for the other conjectures you mention.

  15. Danylo Yakymenko Says:

    The definition implies that the set of true statements with heuristic explanations is computably enumerable. But the set of all true statements is not. Because the set of non-halting Turing machines is not either.

    The same argument proves the first incompleteness theorem. It doesn’t matter if we are talking about computably verifiable proofs, heuristics, plausibility arguments or other predicates. We have the same problem.

    I like the computable perspective more because we don’t need the construction of self-referential statements. It’s also philosophically less painful to accept that we can’t enumerate all true statements in math via a finite algorithm.

  16. Luca Says:

    It’s worth noting that, for most reasonable F, like ZFC, the sentence “If F is consistent then G(F) is true” is provable in F, and in fact this is the reason for the second incompleteness theorem, that if F can prove its consistency then F is inconsistent.

    So the heuristic reasoning that convinces us that G(F) is true is more like a fully formal reasoning with the convenient forgetting of one key assumption (the consistency of F)

  17. MaxM Says:

    Can there be something like PAC-provable conjectures?

    “… Conjecture is PAC-provable if there exists a polynomial algorithm A that picks samples from an infinite set of proofs X with probability of at least 1 − δ, that are correct with a probability at least 1 – ϵ … ” or something like that.

  18. Denis Kup Says:

    I was expecting probabilities/Bayesian inference to show up at some point. I’m surprised to see some discussion about “plausibility” without formalizing it by probabilities as it is the usual go-to. Is there a specific reason for that ? Does the formulation of Gödel with probabilities assigned to statements immediately lead to a fixed point computation that makes it trivial and erases the quantitative aspect?

  19. maline Says:

    A problem with formalizing “plausibility” is that it isn’t only a property of the argument itself; it’s highly dependent on what else we know. Once we see the argument leading to a contradiction, it no longer seems so plausible!

    Also, even if we could somehow define a concept of judging plausibility of an argument “on its own terms” without considering the implications, we still need to avoid having “plausible arguments” for the obviously false implications themselves. Meaning we can’t let plausibility flow through modus ponens.

  20. Daniel Reeves Says:

    Ah, thank you for posting this! It reminds me of something I was confused by in this Numberphile video:

    Namely, the claim that we could conceivably prove the Riemann Hypothesis or Goldbach’s conjecture unprovable and conclude that it must then be *true*. The argument is that if it were false there would have to be a counterexample and we could find that counterexample by brute force. For something like Riemann or Goldbach, which claim some property for every element of some infinite set, being false means being provably false. So it can’t be both unprovable and false. So now we’ve proved it true by proving it unprovable? How is that not proving it?? What’s going on?

    My current understanding of the answer: a proposition P being *unprovable* really means it’s independent of the standard axioms of math (AKA ZFC) which means P is both true and false depending on how you interpret the axioms.

    Like Euclid’s parallel postulate in geometry (there’s only one line through a given point parallel to a given line) is true if you interpret geometry as being on a flat plane. But you can interpret it as all happening on a hyperbolic surface and then it’s false. The parallel postulate is not implied by the other axioms. You just have to decide whether to include it as an axiom.

    So for, say, Goldbach’s conjecture (every even integer >2 is the sum of 2 primes) to be unprovable it means there’ll be some exotic kind of numbers that satisfy all the axioms but for which Goldbach is true despite being false for normal numbers, or vice versa.

    We can’t prove it true by proving it unprovable. Rather, we could prove it true for normal numbers but also show that there exist weirdo numbers that satisfy all the axioms and for which there’s a counterexample to Goldbach. Meaning that to say whether Goldbach is true we have to include more axioms because otherwise the actual answer is “it depends”.

  21. Ilio Says:

    What if you’d interpret adversary examples as (im)plausibility arguments for ann’s soundness?

    Let A be some program that automatically rejects every neural networks that can be fooled by adversary examples. Could we then construct:

    S(A) = « this sentence is adversary for A »

    So that it provides a proof that there’s no A to keep terminators aligned?

  22. fred Says:

    How do you classify a statement like
    “This sentence is false”?

    It’s neither true nor false, right?
    But in a way it can be interpreted as an oscillator between true and false.

  23. matt Says:

    The thing it seems you are missing in this is that heuristic proofs come with some level of reliability. If a program enumerates all plausible proofs of something, that program only itself only gives a slightly less plausible proof, because one must account for the possibility that the program has a bug, or that cosmic rays happened to flip a bit in the computer’s memory while it was running, or whatever.

    I know that “we” (i.e., the community of mathematicians, computer scientists, and physicists that prove theorems) like to pretend that a step-by-step logical argument wins out over every heuristic. But in practice, none of us acts like that. If you read a 100+ page paper that relied on many very delicate statements and which claimed to prove some result that contradicted simple heuristics, of course your first reaction would be that the paper had a bug in it that you had not found yet.

  24. Scott Says:

    matt #23: But here we’re specifically talking about proofs (and heuristic explanations and plausibility arguments, for that matter) that can be checked by a computer.

  25. Nick Drozd Says:
    • Is it possible for T to give a heuristic explanation of P and also a heuristic explanation of not-P?
    • If T gives a heuristic explanation of P, does it also give a heuristic explanation that it gives a heuristic explanation of P?
    • Does T give a heuristic explanation of the previous item?
    • Does T give a heuristic explanation that if it gives a heuristic explanation that P implies Q and a heuristic explanation of P, it also gives a heuristic explanation of Q?
    • Can T give a heuristic explanation of its own proof-consistency?
  26. Scott Says:

    fred #22: “This sentence is false” cannot be assigned a truth-value. Indeed, the concept of “truth” can’t even be mathematically formalized, since if it could, that sentence could be assigned a truth-value—an observation famously due to Tarski, although Gödel also realized the same thing without publishing it.

  27. Scott Says:

    Daniel Reeves #20: It’s true that if you proved Goldbach or Riemann to be independent of ZFC, you’d simultaneously prove them to be true! But the catch is that the proof would presumably take place in a more powerful system than ZFC—and in the context of that more powerful system, it would be straightforward to see that you’d proved Goldbach or Riemann themselves.

  28. Scott Says:

    maline #19: Agreed about the difficulties with formalizing “plausibility.” These are exactly what Paul Christiano and his friends are now thinking about!

  29. Scott Says:

    Denis Kup #18: Formalizing plausibility arguments probably would involve bringing in probabilistic considerations at some point (even though we’re talking about mathematical statements whose “probabilities” are all strictly either 0 or 1). But notice that, e.g.,

    “This sentence has a 1/2 probability of being true”

    leads to no obvious paradox!

  30. Scott Says:

    Mateus #14 and Luca #16: Sorry if the post was unclear. The “argument” for G(F) can be separated into two parts:

    (1) The implication Con(F)→G(F), which as you say is just a simple theorem with no heuristic elements.
    (2) The heuristic arguments for Con(F) (or even more strongly, for F’s arithmetical soundness).

    It’s the latter that we’re interested in, if we want to see the parallels between this situation and (e.g.) Collatz or Riemann.

  31. Scott Says:

    Kevin #13 and Danylo #15: I also prefer the proof based on the halting problem for (most) expository purposes! But really you need both. One gives you the intuition that links provability with computability, the other gives you the explicit example of an unprovable statement (plus the Second Incompleteness Theorem more easily and so on).

  32. matt Says:

    Scott 24: probably I am missing your point, but let me ask: what do you think the probability is that the proof of the 4-color theorem is wrong? Of course, there are several computer verified proofs now. But, if you were offered a bet where you lose 1 penny if the proof is right and you win 100 trillion dollars and get world peace if the proof is wrong, would you take that bet?

  33. Scott Says:

    matt #32: Obviously I’d need to consider the probability of a bug in the program, a bug in the human part of the proof, a misunderstanding on my part of the statement of the Four-Color Theorem, etc etc. But while this is what interests you, I confess I don’t see how it’s germane to the points I was making in this post. An explanation should never explain anything false. And I already agreed that a plausibility argument can argue for something false!

  34. dubious Says:

    Ah, if only we could formulate truth as a complex value, such that the truth-value assigned to a “paradox” statement (or series) was an “imaginary truth”…

  35. fred Says:

    Scott #26, #29,

    So “this sentence is false” can’t be true/false.

    But, although “this sentence is true” or “This sentence has a 1/2 probability of being true” seems to be non paradoxical, it’s not actually super clear what’s “true” about it, in the sense that’s it’s purely self-referential, without any reference to anything external, like creating a truth out of a void.

    “this sentence is false” is the equivalent to
    boolean x = !x;

    “this sentence is true”
    is the equivalent to
    boolean x = x;

  36. Scott Says:

    fred #35: The right way to think about this is in terms of fixed-points.

    Let p = Pr[“This sentence is true”]. Then we know only the trivial p=p, so every p∈[0,1] is a fixed-point.

    Let q = Pr[“This sentence is false.”] Then q=1-q, which has q=1/2 as a unique fixed-point.

    Lastly, let r = Pr[“This sentence has 1/2 probability of being true”]. Then r = r/2 + (1-r)/2 = 1/2, which again has r=1/2 as a unique fixed-point.

  37. Shmi Says:

    A stupid question… Could something like this be provable: for the Collatz sequence the total stopping time is not computable? Or maybe that it is non-computable if starting with a non-computable number? Or are these questions meaningless?

  38. fred Says:

    Scott #36
    very interesting, thanks!

  39. Scott Says:

    Shmi #37: The concepts of “computable” and “uncomputable” apply to real numbers or integer functions, but not to individual integers. (We might say: every integer is trivially computable; just print it out!)

    But no, no one has even proven that the problem “given as input an integer N, does it eventually reach 1 under Collatz iteration?” is computable. If the Collatz Conjecture holds then clearly the answer is yes. 🙂

  40. asdf Says:

    The “heuristic” explanation that G (the Gödel sentence) is true depends on the assumption that the underlying theory of arithmetic (say it’s PA) is consistent. That is, PA by itself doesn’t prove G, but PA+CON(PA) does prove G.

    As for plausibility arguments, any lawyer will tell you that for any proposition P, both P and its negation will have plausibility arguments ;).

  41. asdf Says:

    Shen #37, if the stopping time (in the cases of non-divergence) was assumed to be bounded above by some known computable function F, then the Collatz conjecture would be a Pi-0-1 problem, since you could refute it by finding an N such that it didn’t stop after F(N) iterations. It may be possible to get the same observation where only F’s existence is asserted, and F itself is not known, by enumerating Turing machines a la Levin search. But maybe I missed something.

    I think the above basically says: F must be uncomputable for the same reason the busy beaver function is uncomputable.

    A generalization of the Collatz conjecture is known to be Pi-0-2 complete which I think means (in that general case), the existence of a computable F can’t itself be proved.

  42. asdf Says:

    Scott #27 and others: there is a plausibility argument (cough) that GC is not provably independent here:


    I don’t understand it enough to know if it makes sense, but it is interesting ;).

  43. Tez Says:

    Anyone: Is unsoundness always devastating? Or could it be in some cases like living in a world where you are sure there are only ever cults and no world-religions – ie false beliefs are “small and localized in impact”. (Very vague I know!)

  44. Scott Says:

    asdf #40: Yes, please see my comment #30.

  45. Scott Says:

    asdf #42: I mean, I agree as far as it goes. I don’t expect Goldbach or similar number-theoretic statements to be independent of PA, but supposing they were, I wouldn’t expect it to be provable.

  46. Scott Says:

    Tez #43: The fundamental problem with inconsistency is the “principle of explosion,” that a single contradiction implies anything. And this is not just a technical issue to be defined away, since as many of us have experienced firsthand, you really can prove anything by a proof with a single unsound step!

    Even so, there’s indeed an entire field called paraconsistent logic that tries to make sense of inconsistent systems even in the teeth of the above point. I can’t say I understand it, but anyone who does is welcome to explain.

  47. Antoine Deleforge Says:

    Is there any “quantitative version” of Gödel’s incompleteness theorem? Can we say anything in the direction of “out of all the true statements of length N, the ratio of them that is unprovable is big/small or goes to 1 or 0 as N goes to infinity” ?

  48. asdf Says:

    Scott #46

    Tez #43: The fundamental problem with unsoundness is the “principle of explosion,”

    I think you meant “inconsistency” aka 1-inconsistency, rather than unsoundness. 1-inconstency means the theory T proves some proposition P and its negation, i.e. the Pi-0-1 sentence CON(T) is false. If the theory is 1-consistent I don’t think you get the principle of explosion, even though it might be 2-inconsistent.

    Example: suppose the twin prime conjecture (TPC) is true. We could still be in a situation where PA is 1-consistent, but it erroneously proves that the TPC (a Pi-0-2 statement) is false. So it proves that there is a largest twin prime pair p,p+2, but if you search, you will never find it. We would say PA is 2-inconsistent. A 2-inconsistent theory asserts that some Turing machine halts, when it actually doesn’t, but the non-halting can’t be proved. I don’t think you can get a contradiction in the theory from that. There is similarly 3-consistency and so on.

    Soundness means k-consistent for every k. It would be nice to call this ω-consistency, but that term is already in use, as an old-fashioned name for 1-consistency.

  49. Scott Says:

    asdf #48: Yes, sorry, edited my comment. I was using Tez’s term, but from the context, he meant inconsistency as well.

  50. Jon Awbrey Says:

    Dear Scott,

    A general heuristic in problem solving suggests priming the pump with a stronger hypothesis. Applying that strategy here would have us broaden the grounds of validity, our notion of validation, from purely deductive proofs to more general forms of inference. Along that line, and following a lead from Aristotle, C.S. Peirce recognized three distinct modes of inference, called abductive, deductive, and inductive reasoning, and that way of thinking has even had some traction in AI from the days of Warren S. McCulloch on. At any rate I think it helps to view our questions in that ballpark. There’s a budget of resources and running thoughts on the matter I keep on the following page.

    Survey of Abduction, Deduction, Induction, Analogy, Inquiry

  51. William Gasarch Says:

    (In my kindergarden the teacher, a women who got a PhD from Hungary, but could not get a job because she was a women, had has try to 2-color 1,..,9 without getting a monochromatic arithmetic sequence of length 3.)

    Your post made me think about getting a list of Theorems that are true but for which there are not plausible reasons for them. If I had a blog I might make a blog post about it. But for now I’ll just list two:

    a) PERM is Sharp-P Complete. BEFORE Scott’s proof using quantum, the proofs were ad hoc, and both Scott and Oded commented that it was true by gadgets that exist `by accident’

    b) There is a set A that is strictly BETWEEN decidable and HALT by Turing reductions. The proof is a construction of the set A, which is not natural. But aside from the proof there is no real good reason why there should be such a set.

  52. Scott Says:

    William Gasarch #51: Two comments on your examples.

    – My proof that the permanent is #P-hard deduces it as an extremely natural consequence of the KLM Theorem in optical quantum computing—but the latter still requires somewhat ad-hoc gadgets!

    – The following, not nearly as well-known as it deserves to be, is a natural problem intermediate between computable and the halting problem, and for which it’s natural that it would be intermediate.

    Given as input a Turing machine M, if M accepts on a blank input then accept, if M rejects then reject, and if M runs forever then either accept or reject.

    The “only” problem is that this doesn’t define a language—it’s more like a promise problem!

  53. Christopher Says:

    > Many have objected: “but despite Gödel’s Theorem, it’s still easy to explain why G(F) is true. In fact, the argument above basically already did it!” You might make a more general point: there are many, many mathematical statements for which we currently lack a proof, but we do seem to have a fully convincing heuristic explanation

    I think you might have fallen into a small misunderstanding. Let’s say we are working with theory T. The proof that T is true is not just a “heuristic”, it has a proof, which was found by Gödel, and is even valid in T itself.

    The catch: the theorem has an important premise: *T is consistent*. The fact that the statement is true and that the proof works in T is how you get the second incompleteness theorem.

    The statement that T is incomplete *unconditionally* is much tricker. Take ZFC for instance. Although I strongly suspect that ZFC is consistent, I don’t feel that I have some special human only “proof” of this; it’s basically a guess. So although I believe in the proof “Con(ZFC) => ZFC incomplete” and so does ZFC, neither of ZFC nor I have proven “ZFC incomplete” even informally.

  54. Scott Says:

    Christopher #53: I understand that perfectly well—please see comment #30 above.

    I guess I’ll edit the original post to try to stem the tide of commenters re-explaining this to me.

  55. Christopher Says:

    Scott #54:

    Oh, sorry for assuming (and not reading) 😅

    Here’s a fun tid bit that I think kind of goes with the heuristic thing. ZFC in a certain sense *strongly suspects/almost proves* itself consistent. See

    > Doing this produces several closely related “reflection theorems” of ZFC all of which state that we can find a set that is almost a model of ZFC.

    PA has similar properties. This isn’t typical though; PA and ZFC are kind of special.

    I’m not sure how exactly this relates to your post, but I think it does somehow!

  56. asdf Says:

    Gasarch #51, “If I had a blog I might make a blog post about it…”

    Something wrong with https://blog.computationalcomplexity.org/ ? 😉

  57. Peter Gerdes Says:

    I believe the semi-standard (but kinda folklore) way to understand what’s going on is to think about what happens when you try to iteratively add CON sentences to your language. Finitely many times is trivial and then you start needing to work your way up the recursive ordinal notations.

    And the key thing that happens when you do that is that it’s no longer trivial to produce a program extending the sequence. What code enumerates a list of the axioms for w^w many added iterations of CON?

    Ok, I’m sure you know all that but the point that explains the confusion here is that once you start keeping track of the actual indexes you realize that at some point you can no longer figure out how to go from the informal (now take the limit of adding those sentences) to an explicit set of instructions for enumeration of the result.

    Ugh, explaining it badly. Point is that in normal speech we usually ignore the difference between different descriptions of the same thing and that’s what makes this situation feel weird.

    The way you described it was a bit different but, similarly, it strikes us as strange because we simplify things in our mind by overlooking the difference between different ways of picking out a given object so it only really pops out at you when you start grinding through just what you said really means in psuedo-formal speak.

    Damn forgot key part…30s

  58. Peter Gerdes Says:

    Meant to add that your statement that’s true but has no plausibility argument only seems kinda weird because we can so easily refer to it under another description just one we can’t translate into it’s formal mathematical statement.

    After all, no one finds it weird that there are statements in a language with uncountably many constant symbols (eg the reals) that are true that not only can’t be argued for but can’t even be finitely expressed.

    Nor would it seem weird noting thst there will be wrong plausibility args (all things considered) so there will be a sentence where the plausibility arg goes in the wrong direction. But start giving that sentence a fairly clear sounding description and it feels weird.

  59. Jan S Says:

    I think an important insight from the Incompleteness theorem is that it applies to all thought we will ever produce – in principle it is possible to formalise the thoughts produced by a human being locked in a cave in a formal system. (You might deny this based on “free will” or “divine inspiration” but let’s leave this quagmire aside).

    Accepting this, the theorem applies to us. As Scott points out, no matter what level of “proof” we are interested in. In fact we can leave the question of veracity at all and simply ask “given this initial state what future thoughts will this mind produce?” (Which is just another way to say brains are deterministic. Adding an element of chance to it through QM or otherwise doesn’t change that in a meaningful way, it only means we have to consider a myriad of branches now).

    Given that the formal system fully capturing human thought is incomplete, there are ideas about which we will never know if they are true, false or “unknowable” (for ANY standard of knowableness we might be interested in). The arithmetisation in Gödels proof also implies that this is not only true for cute statements that are obviously self-referential. But that there are many statements that are isomorphic to the incompleteness theorem although we might not suspect that.

    I suspect that there are also many unknowable statements that are quite simple, but who knows? If they truly are unknowable, we will never find out, and if they are undecidable on the level of heuristics as well we won’t even get that.

  60. Shion Arita Says:

    I think all of the statements you made are true, but for other reasons than you (I’m not sure whether I’d say that your statements are wrong, but I arrived at the same conclusion in a different way). The reason I’m unclear is that I’m not sure if something having an “argument for its plausibility” is well-defined.

    Nonetheless, it seems like it’s pretty clearly true, and I think it’s kind of as bad as it possibly can get, in that I don’t think there will be much hope for finding weaker statements of the type you make that will hold. Here’s why:

    There are true statements for which even *describing the statement* would require doing some kind of hypercomputation like solving a halting problem. Thus, there couldn’t really be any provable ‘plausibility arguments’ since we can’t even Obviously I can’t explicitly construct such a statement, but it’s clear to me that such things exist, since a statement can be arbitrarily large.

    A general idea in that direction would be something like a statement about a busy beaver number where the number itself also happens to be the source code to some other kind of program (the one the statement is about), and where the input relies on certain irreducible specifics of the BB number that constitutes the source code. I’m not sure I’ve gone pathological enough, but I think this might do it, and I could always go farther if need be.

  61. Shion Arita Says:

    Or, to try to put what I was essentially getting at in terms of the original post:

    Let W be a program that writes down statements.

    S(W) = “This statement cannot expressed by W.”

    If it’s true, then you can’t say it. And if you can say it, it isn’t true.

  62. JPC Says:

    Possibly of interest to this conversation….

    “The Riemann Hypothesis” by J.E. Littlewood

    (from The Scientist Speculates, edited by Good, Mayne and Maynard Smith, 1962)

    I believe this to be false. There is no evidence whatever for it (unless one counts that it is always nice when any function has only real roots). One should not believe things for which there is no evidence. In the spirit of this anthology I should also record my feeling that there is no imaginable reason why it should be true.

    Titchmarsh [1] devised a method, of considerable theoretical interest, for calculating the zeros. The method reveals that for a zero to be off the critical line a remarkable number of ‘coincidences’ have to happen. I have discussed the matter with several people who know the problem in relation to electronic calculation; they are all agreed that the chance of finding a zero off the line in a lifetime’s calculation is millions to one against. It looks then as if we may never know.

    It is true that the existence of an infinity of L-functions raising the same problems creates a remarkable situation. Nonetheless life would be more comfortable if one could believe firmly that the hypothesis is false.

    Partly in response to the above, one of the anthology’s editors, I.J. Good, in collaboration with R.F. Churchhouse, published the brief article “The Riemann hypothesis and pseudorandom features of the Möbius sequence” [2], the stated aim of which is “to suggest a “reason” for believing Riemann’s hypothesis”.

    The authors make use of probabilistic reasoning and a Chilton Atlas computer (presumably one of the fastest available in 1968).

    Abstract. A study of the cumulative sums of the Möbius function on the Atlas Computer of the Science Research Council has revealed certain statistical properties which lead the authors to make a number of conjectures. One of these is that any conjecture of the Mertens type, viz.

    |M(N)| = | μ(1) + … + μ(N)| < k(N1/2)

    where k is any positive constant, is false, and indeed the authors conjecture that

    lim sup{M(x)(x log log x)–1/2} = (121/2)/π

    [1] E.C. Titchmarsh, The Riemann Zeta-Function, Oxford, 1951.

    [2] I.J. Good and R.F. Churchhouse, "The Riemann hypothesis and pseudorandom features of the Möbius sequence", Mathematics of Computation 22 (1968) 857-864.

    Note from Andrew Odlyzko (February 1, 2000):

    "Concerning Littlewood, I am working on a monograph on computations of zeros of the zeta function , which will feature an extended discussion of various conjectures and the numerical evidence for them. Perhaps when I am done I might be willing to write a commentary on Littlewood's opinions. Personally I am a bit of an agnostic about the R.H., but would definitely not agree with Littlewood to the fullest. (BTW, I am familiar with the Good–Churchhouse argument, and have pointed out in one of my papers that while it is appealing, it definitely fails some important tests.)"

  63. asdf Says:

    Peter Gerdes #57, iterating CON hits snags with ordinal notations as soon as level ω. There’s a very good article about this, https://xorshammer.com/2009/03/23/what-happens-when-you-iterate-godels-theorem/

  64. William Gasarch Says:

    Scott 1: So– is PERM #P complete a good example of a theorem for which we have a proof but do not have a plausible explanation?

    Scott 2: How to you prove that your promise problem cannot be used as an oracle to solve HALT (I tried and was unable to do it.)

    asdf: Saying `if I had a blog’ was an attempt at humor, and also a follow up to when Scott made a similar comment about me when he (at my request) posted a link to my book. Given your question, it was probably a failed attempt. And seriously- I might post on the topic at some later time.

  65. David Speyer Says:

    I thought that a number of friends of Paul Christiano already wrote a paper about this? https://arxiv.org/abs/1609.03543 What does the logical induction paper say about this?

    Also, while googling to find the reference to the logical induction, I came across Catrin Campbell-Moore’s thesis https://d-nb.info/1106854586/34 , which looks like it should be interesting to the people who are interested in this topic.

  66. Ben Standeven Says:

    Shion Arita, #60, 61:

    The statement you want is “This statement cannot be expressed by W in less than N steps”; of course N is chosen to be large, but easily described by W.

  67. fred Says:

    Not related to the post, but related to (quantum) computation


  68. Karen Morenz Korol Says:

    What kindergarten did you go to? I must have missed the Godel lesson D:

  69. Ajith Says:

    Is the insolvability of the quintic by radicals a concrete example of a constructive proof of insolvability/undecidability?

    The only prior reference to the quintic I’ve found in this blog is in this post from 2005!

  70. Scott Says:

    Ajith #69: Yes, you can see unsolvability of the quintic by radicals as an early example of an “uncomputability” theorem, where the “model of computation” allows standard arithmetic operations plus the extraction of nth roots for any n. Nick Pippenger even has a book where he develops that viewpoint. Once you understand the analogy, though, it’s unclear how much more insight can be wrung out of it. Certainly the proofs are “radically” different (har, har): one uses group theory, the other uses diagonalization and self-reference.

  71. Joshua Zelinsky Says:

    @Scott #70,

    This raises implicitly a question: is there a way to prove unsolvability of the quintic via some sort of diagonalization argument? My guess is probably not, but I wouldn’t be surprised if there’s some way to do a diagonalization argument to show that solvability breaks down for general n degree polynomials.

  72. Ajith Says:

    Scott, thank you for sharing your thoughts and the reference to Nick Pippenger’s book.

    I looked at a preview of the book online and he talks about the quintic right on pages 4 & 5, where I realized that my question about the quintic unsolvability goes the wrong way. As per his discussion, a better analogy for Godel’s incompleteness is the transcendental numbers lying outside the group of algebraic numbers. In that case we do have both: a proof using a diagonal argument, as well as concrete examples of e and Pi.

    In the finite case, the question of unsolvability can be stretched to point where it is almost silly:
    – impossibility of trisecting an angle with straight edge and compass
    – impossibility of bisecting a line segment with only a straight edge
    – impossibility of tiling a truncated chessboard (one pair of diagonal squares removed) with 1×2 dominoes
    – unsolvability of x^2 = -1 in integers

    As you remarked, none of these examples provide any general insights about other problems, whereas Godel and Turing do.

    PS. As an aside, I noticed that my Chrome browser is underlining every instance of the word unsolvability above and it suggests replacing it with “insolvability”.. I am not sure what the correct grammatical usage is.

  73. Gerald Says:

    The idea of heuristic explanations is interesting, but you should’t have mentioned the Incompleteness Theorems. Nowadays mentioning Gödel is often considered impolite among Number Theorists (that are not also logicians) you know :-).
    We are talking about finite mathematics here, i.e. arithmetical statements about natural numbers (and finite structures by Gödel numbering). We have a huge hierachy of theories available that we intuitively consider to be “true”: Set theory + Large Cardinal assumptions. The only thing the Incompleteness Theorems tell you is that there is no strongest theory among them. If it happens that you can’t prove a particular theorem with a certain set of assumptions, just pick a stronger theory out of the myriad of available theories, nothing more. Yet in practice this is almost never necessary. There a very few natural theorems in graph theory that require something stronger than PA, and then even plain ZF Set Theory is most often overkill. For Number Theory Proper (natural (!) statements about primes, diophantic equations …) there is Harvey Friedman’s Grand Conjecture: Every number theoretic statement proven so far could be formalized in EFA (Elementary Function Arithmetic), a ridiculously weak theory with proof theoretic ordinal just ω³. We have not even started to use the logical power of the available mathematical tools!
    In practice there is no observable correaltion between the difficulty of proving a given finitistic mathematical theorem and the strengh of the theory needed to proof it. Math is hard. Many proofs are complicated and lengthy. That’s the point here, not Gödel. Most likely we have just not been clever enough to find the proofs of the theorems mentioned above. Not our theories are too weak, we are!
    Why is it that so many popular texts about Goldbach, Riemann Hypothesis or (before Wiles) FLT contain the word “Gödel”? That’s nonsense. The proof of FLT can likely be formalized in a theory much weaker than PA, likewise the known proof of a weak version of Goldbach. For the Twin Prime Conjecture significant progress has been made in recent years (James Maynard Theorems). The proofs are long and difficult, but do they reach for ever stronger set theoretic assumptions? No, not at all, the arguments are likely to go through in very weak theories.

  74. Ajith Says:

    A correction to a typo in my comment #72 that’s been bugging me: “… can be stretched to *a* point where it is almost silly …”

    Joshua’s #71 comment was interesting to ponder. I don’t think a diagonalization argument will work to prove quintic unsolvability, because we are comparing two countable infinities – 1) the set of polynomials that are solvable by radicals and 2) the set of algebraic numbers. If I am not mistaken, diagonalization arguments only work if we are comparing two different types of infinities.

    Along the same lines, here’s a simpler problem to chew on: is there a diagonalization argument to prove that sqrt 2 cannot be expressed as a rational number?

  75. Scott (Not Aaronson, or Alexander, but also in the Rationalist Sphere :) ) Says:


    Are we going to discuss incel/nerds-and-sex issues on this blog ever again, or did that asshole kid ruin it for the rest of us?

    Also, what’s the latest on that—did you get the incel, incels, wokes, or whatever the hell they were to stop trolling the Shtetl?

    Scott B.

  76. James Cross Says:

    It would seem if we are making a plausibility argument then some statements are more plausible than others and some statements are just noise or nonsense that we cannot judge about their plausibility.

  77. Scott Says:

    Scott #75: Maybe at some point we’ll discuss those issues again. I hope you understand why I wanted to take a break.

    The Shtetl-Optimized Committee of Guardians (SOCG) has been doing a wonderful job at talking me down from responding to things that I really shouldn’t! 🙂

  78. Joshua Zelinsky Says:

    Ajith #74, Diagonalization can be used in contexts where everything is countable. For example, diagonalization is used in the Time Hierarchy Theorem, and in that case is used to show a certain countable set is contained in another countable set. Similarly, one can use diagonalization to show that there’s a recursive function which is not primitive recursive, and the set of both is countable. Being the same cardinality is not a barrier.

  79. mls Says:

    Gerald #73

    Thank you for your observations. Thirty-seven years ago I began studying set theory using the expression,

    $$\forall x \forall y (( x \subset y \leftrightarrow ( \forall z ( y \subset z \rightarrow x \subset z ) \wedge \exists z ( x \subset z \wedge \neg y \subset z ) ) )$$

    for “subset” with membership given with

    $$\forall x \forall y (( x \in y \leftrightarrow ( \forall z ( y \subset z \rightarrow x \in z ) \wedge \exists z ( x \in z \wedge \neg y \subset z ) ) )$$

    I use the square subset now because this is better understood as an intensional part relation (tex not accepting sqsubset, not working for me). There is a lot of work involved with developing an extensional subset from this.

    In these investigations, I realized that the second sentence can be changed slightly to accommodate “essential divisors” relative to “proper divisors.”

    $$\forall x \forall y (( x \in y \leftrightarrow ( \forall z ( y \subset z \rightarrow x \in z ) \wedge \forall z ( z \subset x \rightarrow \forall w ( z \in w ) ) ) )$$

    An essential divisor is either the unit or a prime. The membership relation is re-interpreted as essential division.

    Number theorists study divisibility. And, divisibility relates to foundational debates in various ways. Around 2012 Leinster wrote blogposts on set theory and order. In so far as these sentences relate to ideas from both numbers and sets, they are candidates for investigation.

    As you observed, math is hard. These sentences have made me question every claim coming from foundational mathematics. It is nice to see that others recognize the subtle problems within the hyperbole.

  80. OhMyGoodness Says:

    I did enjoy this. In Physics now it seems any two sigma deviation attracts numerous theoretical papers explaining the case if the deviation persists and eventually reaches the discovery threshold. 🙂

    Really all that it is reasonable to expect in life for the general case is plausible truth without counterexample. Truths that are mathematically certain are pearls to be treasured.

  81. Oranges and Apples Says:

    Scott — apropos of AGI — you’re in a small minority of people who can confidently say they’ve discovered or invented something hitherto unknown to mankind. When you think about this work, can you relate a true-to-life story of how it came about with the language we use to describe our current approaches to AI, or is something missing?

    Was it that you knew more facts than other people? Or that you were somehow able to squeeze more juice out of the same stone? Did you saw things differently than other people, or were you faster, for whatever reason, at deriving an inevitable conclusion?

  82. Scott Says:

    Oranges and Apples #81: I don’t know that there’s really a fundamental difference between making an original scientific discovery and, I dunno, solving a Sudoku, or debugging a piece of code, or figuring out who or what is rooting through your garbage cans at night, or anything else that orders of magnitude more people have experience with. It’s just that the scientific discoveries involve searching a, y’know, vastly larger search space, and there’s no prior assurance whatsoever of success, and in fact “success” often involves creatively redefining what the problem was. And the timescale is typically much longer, weeks or months or years rather than hours. But it’s mostly just the difference between a sprint and a marathon.

    Certainly there was no point in my scientific development where I started “communing with gods” or whatever. 🙂 Rather, first (as a teenager) I rediscovered stuff that was well-known since the 1700s or earlier to whatever extent it was correct, then I rediscovered stuff that was known since the 60s, then I rediscovered stuff that was known since a year or two previously, and finally I discovered stuff that turned out not to be known and to be publishable. Pretty minor stuff at first.

  83. Ajith Says:

    Joshua #78, I was not aware of diagonalization arguments within countable sets. I’ll have to explore and ponder that a bit. Thank you.

Leave a Reply

You can use rich HTML in comments! You can also use basic TeX, by enclosing it within $$ $$ for displayed equations or \( \) for inline equations.

Comment Policies:

  1. All comments are placed in moderation and reviewed prior to appearing.
  2. You'll also be sent a verification email to the email address you provided.
  3. Comments can be left in moderation for any reason, but in particular, for ad-hominem attacks, hatred of groups of people, snide and patronizing tone, trollishness, disingenuousness, or presumptuousness (e.g., linking to a long paper or article and challenging me to respond to it).
  4. Even when no individual comment violates policy, when there are dozens of comments from a single commenter hammering home the same few themes, and the commenter shows no interest in changing their views or learning from anyone else, the commenter will receive a warning followed by a 3-month ban.
  5. Whenever I'm in doubt, I'll forward comments to Shtetl-Optimized Committee of Guardians, and respect SOCG's judgments on whether those comments should appear.
  6. I sometimes accidentally miss perfectly reasonable comments in the moderation queue, or they get caught in the spam filter. If you feel this may have been the case with your comment, shoot me an email.