My Big Numbers talk at Festivaletteratura

Last weekend, I gave a talk on big numbers, as well as a Q&A about quantum computing, at Festivaletteratura: one of the main European literary festivals, held every year in beautiful and historic Mantua, Italy.  (For those who didn’t know, as I didn’t: this is the city where Virgil was born, and where Romeo gets banished in Romeo and Juliet.  Its layout hasn’t substantially changed since the Middle Ages.)

I don’t know how much big numbers or quantum computing have to do with literature, but I relished the challenge of explaining these things to an audience that was not merely “popular” but humanisitically rather than scientifically inclined.  In this case, there was not only a math barrier, but also a language barrier, as the festival was mostly in Italian and only some of the attendees knew English, to varying degrees.  The quantum computing session was live-translated into Italian (the challenge faced by the translator in not mangling this material provided a lot of free humor), but the big numbers talk wasn’t.  What’s more, the talk was held outdoors, on the steps of a cathedral, with tons of background noise, including a bell that loudly chimed halfway through the talk.  So if my own words weren’t simple and clear, forget it.

Anyway, in the rest of this post, I’ll share a writeup of my big numbers talk.  The talk has substantial overlap with my “classic” Who Can Name The Bigger Number? essay from 1999.  While I don’t mean to supersede or displace that essay, the truth is that I think and write somewhat differently than I did as a teenager (whuda thunk?), and I wanted to give Scott2017 a crack at material that Scott1999 has been over already.  If nothing else, the new version is more up-to-date and less self-indulgent, and it includes points (for example, the relation between ordinal generalizations of the Busy Beaver function and the axioms of set theory) that I didn’t understand back in 1999.

For regular readers of this blog, I don’t know how much will be new here.  But if you’re one of those people who keeps introducing themselves at social events by saying “I really love your blog, Scott, even though I don’t understand anything that’s in it”—something that’s always a bit awkward for me, because, uh, thanks, I guess, but what am I supposed to say next?—then this lecture is for you.  I hope you’ll read it and understand it.

Thanks so much to Festivaletteratura organizer Matteo Polettini for inviting me, and to Fabrizio Illuminati for moderating the Q&A.  I had a wonderful time in Mantua, although I confess there’s something about being Italian that I don’t understand.  Namely: how do you derive any pleasure from international travel, if anywhere you go, the pizza, pasta, bread, cheese, ice cream, coffee, architecture, scenery, historical sights, and pretty much everything else all fall short of what you’re used to?

Big Numbers

by Scott Aaronson
Sept. 9, 2017

My four-year-old daughter sometimes comes to me and says something like: “daddy, I think I finally figured out what the biggest number is!  Is it a million million million million million million million million thousand thousand thousand hundred hundred hundred hundred twenty eighty ninety eighty thirty a million?”

So I reply, “I’m not even sure exactly what number you named—but whatever it is, why not that number plus one?”

“Oh yeah,” she says.  “So is that the biggest number?”

Of course there’s no biggest number, but it’s natural to wonder what are the biggest numbers we can name in a reasonable amount of time.  Can I have two volunteers from the audience—ideally, two kids who like math?

[Two kids eventually come up.  I draw a line down the middle of the blackboard, and place one kid on each side of it, each with a piece of chalk.]

So the game is, you each have ten seconds to write down the biggest number you can.  You can’t write anything like “the other person’s number plus 1,” and you also can’t write infinity—it has to be finite.  But other than that, you can write basically anything you want, as long as I’m able to understand exactly what number you’ve named.  [These instructions are translated into Italian for the kids.]

Are you ready?  On your mark, get set, GO!

[The kid on the left writes something like: 999999999

While the kid on the right writes something like: 11111111111111111

Looking at these, I comment:]

9 is bigger than 1, but 1 is a bit faster to write, and as you can see that makes the difference here!  OK, let’s give our volunteers a round of applause.

[I didn’t plant the kids, but if I had, I couldn’t have designed a better jumping-off point.]

I’ve been fascinated by how to name huge numbers since I was a kid myself.  When I was a teenager, I even wrote an essay on the subject, called Who Can Name the Bigger Number?  That essay might still get more views than any of the research I’ve done in all the years since!  I don’t know whether to be happy or sad about that.

I think the reason the essay remains so popular, is that it shows up on Google whenever someone types something like “what is the biggest number?”  Some of you might know that Google itself was named after the huge number called a googol: 10100, or 1 followed by a hundred zeroes.

Of course, a googol isn’t even close to the biggest number we can name.  For starters, there’s a googolplex, which is 1 followed by a googol zeroes.  Then there’s a googolplexplex, which is 1 followed by a googolplex zeroes, and a googolplexplexplex, and so on.  But one of the most basic lessons you’ll learn in this talk is that, when it comes to naming big numbers, whenever you find yourself just repeating the same operation over and over and over, it’s time to step back, and look for something new to do that transcends everything you were doing previously.  (Applications to everyday life left as exercises for the listener.)

One of the first people to think about systems for naming huge numbers was Archimedes, who was Greek but lived in what’s now Italy (specifically Syracuse, Sicily) in the 200s BC.  Archimedes wrote a sort of pop-science article—possibly history’s first pop-science article—called The Sand-Reckoner.  In this remarkable piece, which was addressed to the King of Syracuse, Archimedes sets out to calculate an upper bound on the number of grains of sand needed to fill the entire universe, or at least the universe as known in antiquity.  He thereby seeks to refute people who use “the number of sand grains” as a shorthand for uncountability and unknowability.

Of course, Archimedes was just guessing about the size of the universe, though he did use the best astronomy available in his time—namely, the work of Aristarchus, who anticipated Copernicus.  Besides estimates for the size of the universe and of a sand grain, the other thing Archimedes needed was a way to name arbitrarily large numbers.  Since he didn’t have Arabic numerals or scientific notation, his system was basically just to compose the word “myriad” (which means 10,000) into bigger and bigger chunks: a “myriad myriad” gets its own name, a “myriad myriad myriad” gets another, and so on.  Using this system, Archimedes estimated that ~1063 sand grains would suffice to fill the universe.  Ancient Hindu mathematicians were able to name similarly large numbers using similar notations.  In some sense, the next really fundamental advances in naming big numbers wouldn’t occur until the 20th century.

We’ll come to those advances, but before we do, I’d like to discuss another question that motivated Archimedes’ essay: namely, what are the biggest numbers relevant to the physical world?

For starters, how many atoms are in a human body?  Anyone have a guess?  About 1028.  (If you remember from high-school chemistry that a “mole” is 6×1023, this is not hard to ballpark.)

How many stars are in our galaxy?  Estimates vary, but let’s say a few hundred billion.

How many stars are in the entire observable universe?  Something like 1023.

How many subatomic particles are in the observable universe?  No one knows for sure—for one thing, because we don’t know what the dark matter is made of—but 1090 is a reasonable estimate.

Some of you might be wondering: but for all anyone knows, couldn’t the universe be infinite?  Couldn’t it have infinitely many stars and particles?  The answer to that is interesting: indeed, no one knows whether space goes on forever or curves back on itself, like the surface of the earth.  But because of the dark energy, discovered in 1998, it seems likely that even if space is infinite, we can only ever see a finite part of it.  The dark energy is a force that pushes the galaxies apart.  The further away they are from us, the faster they’re receding—with galaxies far enough away from us receding faster than light.

Right now, we can see the light from galaxies that are up to about 45 billion light-years away.  (Why 45 billion light-years, you ask, if the universe itself is “only” 13.6 billion years old?  Well, when the galaxies emitted the light, they were a lot closer to us than they are now!  The universe expanded in the meantime.)  If, as seems likely, the dark energy has the form of a cosmological constant, then there’s a somewhat further horizon, such that it’s not just that the galaxies beyond that can’t be seen by us right now—it’s that they can never be seen.

In practice, many big numbers come from the phenomenon of exponential growth.  Here’s a graph showing the three functions n, n2, and 2n:

The difference is, n and even n2 grow in a more-or-less manageable way, but 2n just shoots up off the screen.  The shooting-up has real-life consequences—indeed, more important consequences than just about any other mathematical fact one can think of.

The current human population is about 7.5 billion (when I was a kid, it was more like 5 billion).  Right now, the population is doubling about once every 64 years.  If it continues to double at that rate, and humans don’t colonize other worlds, then you can calculate that, less than 3000 years from now, the entire earth, all the way down to the core, will be made of human flesh.  I hope the people use deodorant!

Nuclear chain reactions are a second example of exponential growth: one uranium or plutonium nucleus fissions and emits neutrons that cause, let’s say, two other nuclei to fission, which then cause four nuclei to fission, then 8, 16, 32, and so on, until boom, you’ve got your nuclear weapon (or your nuclear reactor, if you do something to slow the process down).  A third example is compound interest, as with your bank account, or for that matter an entire country’s GDP.  A fourth example is Moore’s Law, which is the thing that said that the number of components in a microprocessor doubled every 18 months (with other metrics, like memory, processing speed, etc., on similar exponential trajectories).  Here at Festivaletteratura, there’s a “Hack Space,” where you can see state-of-the-art Olivetti personal computers from around 1980: huge desk-sized machines with maybe 16K of usable RAM.  Moore’s Law is the thing that took us from those (and the even bigger, weaker computers before them) to the smartphone that’s in your pocket.

However, a general rule is that any time we encounter exponential growth in our observed universe, it can’t last for long.  It will stop, if not before then when it runs out of whatever resource it needs to continue: for example, food or land in the case of people, fuel in the case of a nuclear reaction.  OK, but what about Moore’s Law: what physical constraint will stop it?

By some definitions, Moore’s Law has already stopped: computers aren’t getting that much faster in terms of clock speed; they’re mostly just getting more and more parallel, with more and more cores on a chip.  And it’s easy to see why: the speed of light is finite, which means the speed of a computer will always be limited by the size of its components.  And transistors are now just 15 nanometers across; a couple orders of magnitude smaller and you’ll be dealing with individual atoms.  And unless we leap really far into science fiction, it’s hard to imagine building a transistor smaller than one atom across!

OK, but what if we do leap really far into science fiction?  Forget about engineering difficulties: is there any fundamental principle of physics that prevents us from making components smaller and smaller, and thereby making our computers faster and faster, without limit?

While no one has tested this directly, it appears from current physics that there is a fundamental limit to speed, and that it’s about 1043 operations per second, or one operation per Planck time.  Likewise, it appears that there’s a fundamental limit to the density with which information can be stored, and that it’s about 1069 bits per square meter, or one bit per Planck area. (Surprisingly, the latter limit scales only with the surface area of a region, not with its volume.)

What would happen if you tried to build a faster computer than that, or a denser hard drive?  The answer is: cycling through that many different states per second, or storing that many bits, would involve concentrating so much energy in so small a region, that the region would exceed what’s called its Schwarzschild radius.  If you don’t know what that means, it’s just a fancy way of saying that your computer would collapse to a black hole.  I’ve always liked that as Nature’s way of telling you not to do something!

Note that, on the modern view, a black hole itself is not only the densest possible object allowed by physics, but also the most efficient possible hard drive, storing ~1069 bits per square meter of its event horizon—though the bits are not so easy to retrieve! It’s also, in a certain sense, the fastest possible computer, since it really does cycle through 1043 states per second—though it might not be computing anything that anyone would care about.

We can also combine these fundamental limits on computer speed and storage capacity, with the limits that I mentioned earlier on the size of the observable universe, which come from the cosmological constant.  If we do so, we get an upper bound of ~10122 on the number of bits that can ever be involved in any computation in our world, no matter how large: if we tried to do a bigger computation than that, the far parts of it would be receding away from us faster than the speed of light.  In some sense, this 10122 is the most fundamental number that sets the scale of our universe: on the current conception of physics, everything you’ve ever seen or done, or will see or will do, can be represented by a sequence of at most 10122 ones and zeroes.

Having said that, in math, computer science, and many other fields (including physics itself), many of us meet bigger numbers than 10122 dozens of times before breakfast! How so? Mostly because we choose to ask, not about the number of things that are, but about the number of possible ways they could be—not about the size of ordinary 3-dimensional space, but the sizes of abstract spaces of possible configurations. And the latter are subject to exponential growth, continuing way beyond 10122.

As an example, let’s ask: how many different novels could possibly be written (say, at most 400 pages long, with a normal-size font, yadda yadda)? Well, we could get a lower bound on the number just by walking around here at Festivaletteratura, but the number that could be written certainly far exceeds the number that have been written or ever will be. This was the subject of Jorge Luis Borges’ famous story The Library of Babel, which imagined an immense library containing every book that could possibly be written up to a certain length. Of course, the vast majority of the books are filled with meaningless nonsense, but among their number one can find all the great works of literature, books predicting the future of humanity in perfect detail, books predicting the future except with a single error, etc. etc. etc.

To get more quantitative, let’s simply ask: how many different ways are there to fill the first page of a novel?  Let’s go ahead and assume that the page is filled with intelligible (or at least grammatical) English text, rather than arbitrary sequences of symbols, at a standard font size and page size.  In that case, using standard estimates for the entropy (i.e., compressibility) of English, I estimated this morning that there are maybe ~10700 possibilities.  So, forget about the rest of the novel: there are astronomically more possible first pages than could fit in the observable universe!

We could likewise ask: how many chess games could be played?  I’ve seen estimates from 1040 up to 10120, depending on whether we count only “sensible” games or also “absurd” ones (though in all cases, with a limit on the length of the game as might occur in a real competition). For Go, by contrast, which is played on a larger board (19×19 rather than 8×8) the estimates for the number of possible games seem to start at 10800 and only increase from there. This difference in magnitudes has something to do with why Go is a “harder” game than chess, why computers were able to beat the world chess champion already in 1997, but the world Go champion not until last year.

Or we could ask: given a thousand cities, how many routes are there for a salesman that visit each city exactly once? We write the answer as 1000!, pronounced “1000 factorial,” which just means 1000×999×998×…×2×1: there are 1000 choices for the first city, then 999 for the second city, 998 for the third, and so on.  This number is about 4×102567.  So again, more possible routes than atoms in the visible universe, yadda yadda.

But suppose the salesman is interested only in the shortest route that visits each city, given the distance between every city and every other.  We could then ask: to find that shortest route, would a computer need to search exhaustively through all 1000! possibilities—or, maybe not all 1000!, maybe it could be a bit more clever than that, but at any rate, a number that grew exponentially with the number of cities n?  Or could there be an algorithm that zeroed in on the shortest route dramatically faster: say, using a number of steps that grew only linearly or quadratically with the number of cities?

This, modulo a few details, is one of the most famous unsolved problems in all of math and science.  You may have heard of it; it’s called P versus NP.  P (Polynomial-Time) is the class of problems that an ordinary digital computer can solve in a “reasonable” amount of time, where we define “reasonable” to mean, growing at most like the size of the problem (for example, the number of cities) raised to some fixed power.  NP (Nondeterministic Polynomial-Time) is the class for which a computer can at least recognize a solution in polynomial-time.  If P=NP, it would mean that for every combinatorial problem of this sort, for which a computer could recognize a valid solution—Sudoku puzzles, scheduling airline flights, fitting boxes into the trunk of a car, etc. etc.—there would be an algorithm that cut through the combinatorial explosion of possible solutions, and zeroed in on the best one.  If P≠NP, it would mean that at least some problems of this kind required astronomical time, regardless of how cleverly we programmed our computers.

Most of us believe that P≠NP—indeed, I like to say that if we were physicists, we would’ve simply declared P≠NP a “law of nature,” and given ourselves Nobel Prizes for the discovery of the law!  And if it turned out that P=NP, we’d just give ourselves more Nobel Prizes for the law’s overthrow.  But because we’re mathematicians and computer scientists, we call it a “conjecture.”

Another famous example of an NP problem is: I give you (say) a 2000-digit number, and I ask you to find its prime factors.  Multiplying two thousand-digit numbers is easy, at least for a computer, but factoring the product back into primes seems astronomically hard—at least, with our present-day computers running any known algorithm.  Why does anyone care?  Well, you might know that, any time you order something online—in fact, every time you see a little padlock icon in your web browser—your personal information, like (say) your credit card number, is being protected by a cryptographic code that depends on the belief that factoring huge numbers is hard, or a few closely-related beliefs.  If P=NP, then those beliefs would be false, and indeed all cryptography that depends on hard math problems would be breakable in “reasonable” amounts of time.

In the special case of factoring, though—and of the other number theory problems that underlie modern cryptography—it wouldn’t even take anything as shocking as P=NP for them to fall.  Actually, that provides a good segue into another case where exponentials, and numbers vastly larger than 10122, regularly arise in the real world: quantum mechanics.

Some of you might have heard that quantum mechanics is complicated or hard.  But I can let you in on a secret, which is that it’s incredibly simple once you take the physics out of it!  Indeed, I think of quantum mechanics as not exactly even “physics,” but more like an operating system that the rest of physics runs on as application programs.  It’s a certain generalization of the rules of probability.  In one sentence, the central thing quantum mechanics says is that, to fully describe a physical system, you have to assign a number called an “amplitude” to every possible configuration that the system could be found in.  These amplitudes are used to calculate the probabilities that the system will be found in one configuration or another if you look at it.  But the amplitudes aren’t themselves probabilities: rather than just going from 0 to 1, they can be positive or negative or even complex numbers.

For us, the key point is that, if we have a system with (say) a thousand interacting particles, then the rules of quantum mechanics say we need at least 21000 amplitudes to describe it—which is way more than we could write down on pieces of paper filling the entire observable universe!  In some sense, chemists and physicists knew about this immensity since 1926.  But they knew it mainly as a practical problem: if you’re trying to simulate quantum mechanics on a conventional computer, then as far as we know, the resources needed to do so increase exponentially with the number of particles being simulated.  Only in the 1980s did a few physicists, such as Richard Feynman and David Deutsch, suggest “turning the lemon into lemonade,” and building computers that themselves would exploit the exponential growth of amplitudes.  Supposing we built such a computer, what would it be good for?  At the time, the only obvious application was simulating quantum mechanics itself!  And that’s probably still the most important application today.

In 1994, though, a guy named Peter Shor made a discovery that dramatically increased the level of interest in quantum computers.  That discovery was that a quantum computer, if built, could factor an n-digit number using a number of steps that grows only like about n2, rather than exponentially with n.  The upshot is that, if and when practical quantum computers are built, they’ll be able to break almost all the cryptography that’s currently used to secure the Internet.

(Right now, only small quantum computers have been built; the record for using Shor’s algorithm is still to factor 21 into 3×7 with high statistical confidence!  But Google is planning within the next year or so to build a chip with 49 quantum bits, or qubits, and other groups around the world are pursuing parallel efforts.  Almost certainly, 49 qubits still won’t be enough to do anything useful, including codebreaking, but it might be enough to do something classically hard, in the sense of taking at least ~249 or 563 trillion steps to simulate classically.)

I should stress, though, that for other NP problems—including breaking various other cryptographic codes, and solving the Traveling Salesman Problem, Sudoku, and the other combinatorial problems mentioned earlier—we don’t know any quantum algorithm analogous to Shor’s factoring algorithm.  For these problems, we generally think that a quantum computer could solve them in roughly the square root of the number of steps that would be needed classically, because of another famous quantum algorithm called Grover’s algorithm.  But getting an exponential quantum speedup for these problems would, at the least, require an additional breakthrough.  No one has proved that such a breakthrough in quantum algorithms is impossible: indeed, no one has proved that it’s impossible even for classical algorithms; that’s the P vs. NP question!  But most of us regard it as unlikely.

If we’re right, then the upshot is that quantum computers are not magic bullets: they might yield dramatic speedups for certain special problems (like factoring), but they won’t tame the curse of exponentiality, cut through to the optimal solution, every time we encounter a Library-of-Babel-like profusion of possibilities.  For (say) the Traveling Salesman Problem with a thousand cities, even a quantum computer—which is the most powerful kind of computer rooted in known laws of physics—might, for all we know, take longer than the age of the universe to find the shortest route.

The truth is, though, the biggest numbers that show up in math are way bigger than anything we’ve discussed until now: bigger than 10122, or even

$$ 2^{10^{122}}, $$

which is a rough estimate for the number of quantum-mechanical amplitudes needed to describe our observable universe.

For starters, there’s Skewes’ number, which the mathematician G. H. Hardy once called “the largest number which has ever served any definite purpose in mathematics.”  Let π(x) be the number of prime numbers up to x: for example, π(10)=4, since we have 2, 3, 5, and 7.  Then there’s a certain estimate for π(x) called li(x).  It’s known that li(x) overestimates π(x) for an enormous range of x’s (up to trillions and beyond)—but then at some point, it crosses over and starts underestimating π(x) (then overestimates again, then underestimates, and so on).  Skewes’ number is an upper bound on the location of the first such crossover point.  In 1955, Skewes proved that the first crossover must happen before

$$ x = 10^{10^{10^{964}}}. $$

Note that this bound has since been substantially improved, to 1.4×10316.  But no matter: there are numbers vastly bigger even than Skewes’ original estimate, which have since shown up in Ramsey theory and other parts of logic and combinatorics to take Skewes’ number’s place.

Alas, I won’t have time here to delve into specific (beautiful) examples of such numbers, such as Graham’s number.  So in lieu of that, let me just tell you about the sorts of processes, going far beyond exponentiation, that tend to yield such numbers.

The starting point is to remember a sequence of operations we all learn about in elementary school, and then ask why the sequence suddenly and inexplicably stops.

As long as we’re only talking about positive integers, “multiplication” just means “repeated addition.”  For example, 5×3 means 5 added to itself 3 times, or 5+5+5.

Likewise, “exponentiation” just means “repeated multiplication.”  For example, 53 means 5×5×5.

But what’s repeated exponentiation?  For that we introduce a new operation, which we call tetration, and write like so: 35 means 5 raised to itself 3 times, or

$$ ^{3} 5 = 5^{5^5} = 5^{3125} \approx 1.9 \times 10^{2184}. $$

But we can keep going. Let x pentated to the y, or xPy, mean x tetrated to itself y times.  Let x sextated to the y, or xSy, mean x pentated to itself y times, and so on.

Then we can define the Ackermann function, invented by the mathematician Wilhelm Ackermann in 1928, which cuts across all these operations to get more rapid growth than we could with any one of them alone.  In terms of the operations above, we can give a slightly nonstandard, but perfectly serviceable, definition of the Ackermann function as follows:

A(1) is 1+1=2.

A(2) is 2×2=4.

A(3) is 3 to the 3rd power, or 33=27.

Not very impressive so far!  But wait…

A(4) is 4 tetrated to the 4, or

$$ ^{4}4 = 4^{4^{4^4}} = 4^{4^{256}} = BIG $$

A(5) is 5 pentated to the 5, which I won’t even try to simplify.  A(6) is 6 sextated to the 6.  And so on.

More than just a curiosity, the Ackermann function actually shows up sometimes in math and theoretical computer science.  For example, the inverse Ackermann function—a function α such that α(A(n))=n, which therefore grows as slowly as the Ackermann function grows quickly, and which is at most 4 for any n that would ever arise in the physical universe—sometimes appears in the running times of real-world algorithms.

In the meantime, though, the Ackermann function also has a more immediate application.  Next time you find yourself in a biggest-number contest, like the one with which we opened this talk, you can just write A(1000), or even A(A(1000)) (after specifying that A means the Ackermann function above).  You’ll win—period—unless your opponent has also heard of something Ackermann-like or beyond.

OK, but Ackermann is very far from the end of the story.  If we want to go incomprehensibly beyond it, the starting point is the so-called “Berry Paradox”, which was first described by Bertrand Russell, though he said he learned it from a librarian named Berry.  The Berry Paradox asks us to imagine leaping past exponentials, the Ackermann function, and every other particular system for naming huge numbers.  Instead, why not just go straight for a single gambit that seems to beat everything else:

The biggest number that can be specified using a hundred English words or fewer

Why is this called a paradox?  Well, do any of you see the problem here?

Right: if the above made sense, then we could just as well have written

Twice the biggest number that can be specified using a hundred English words or fewer

But we just specified that number—one that, by definition, takes more than a hundred words to specify—using far fewer than a hundred words!  Whoa.  What gives?

Most logicians would say the resolution of this paradox is simply that the concept of “specifying a number with English words” isn’t precisely defined, so phrases like the ones above don’t actually name definite numbers.  And how do we know that the concept isn’t precisely defined?  Why, because if it was, then it would lead to paradoxes like the Berry Paradox!

So if we want to escape the jaws of logical contradiction, then in this gambit, we ought to replace English by a clear, logical language: one that can be used to specify numbers in a completely unambiguous way.  Like … oh, I know!  Why not write:

The biggest number that can be specified using a computer program that’s at most 1000 bytes long

To make this work, there are just two issues we need to get out of the way.  First, what does it mean to “specify” a number using a computer program?  There are different things it could mean, but for concreteness, let’s say a computer program specifies a number N if, when you run it (with no input), the program runs for exactly N steps and then stops.  A program that runs forever doesn’t specify any number.

The second issue is, which programming language do we have in mind: BASIC? C? Python?  The answer is that it won’t much matter!  The Church-Turing Thesis, one of the foundational ideas of computer science, implies that every “reasonable” programming language can emulate every other one.  So the story here can be repeated with just about any programming language of your choice.  For concreteness, though, we’ll pick one of the first and simplest programming languages, namely “Turing machine”—the language invented by Alan Turing all the way back in 1936!

In the Turing machine language, we imagine a one-dimensional tape divided into squares, extending infinitely in both directions, and with all squares initially containing a “0.”  There’s also a tape head with n “internal states,” moving back and forth on the tape.  Each internal state contains an instruction, and the only allowed instructions are: write a “0” in the current square, write a “1” in the current square, move one square left on the tape, move one square right on the tape, jump to a different internal state, halt, and do any of the previous conditional on whether the current square contains a “0” or a “1.”

Using Turing machines, in 1962 the mathematician Tibor Radó invented the so-called Busy Beaver function, or BB(n), which allowed naming by far the largest numbers anyone had yet named.  BB(n) is defined as follows: consider all Turing machines with n internal states.  Some of those machines run forever, when started on an all-0 input tape.  Discard them.  Among the ones that eventually halt, there must be some machine that runs for a maximum number of steps before halting.  However many steps that is, that’s what we call BB(n), the nth Busy Beaver number.

The first few values of the Busy Beaver function have actually been calculated, so let’s see them.

BB(1) is 1.  For a 1-state Turing machine on an all-0 tape, the choices are limited: either you halt in the very first step, or else you run forever.

BB(2) is 6, as isn’t too hard to verify by trying things out with pen and paper.

BB(3) is 21: that determination was already a research paper.

BB(4) is 107 (another research paper).

Much like with the Ackermann function, not very impressive yet!  But wait:

BB(5) is not yet known, but it’s known to be at least 47,176,870.

BB(6) is at least 7.4×1036,534.

BB(7) is at least

$$ 10^{10^{10^{10^{18,000,000}}}}. $$

Clearly we’re dealing with a monster here, but can we understand just how terrifying of a monster?  Well, call a sequence f(1), f(2), … computable, if there’s some computer program that takes n as input, runs for a finite time, then halts with f(n) as its output.  To illustrate, f(n)=n2, f(n)=2n, and even the Ackermann function that we saw before are all computable.

But I claim that the Busy Beaver function grows faster than any computable function.  Since this talk should have at least some math in it, let’s see a proof of that claim.

Maybe the nicest way to see it is this: suppose, to the contrary, that there were a computable function f that grew at least as fast as the Busy Beaver function.  Then by using that f, we could take the Berry Paradox from before, and turn it into an actual contradiction in mathematics!  So for example, suppose the program to compute f were a thousand bytes long.  Then we could write another program, not much longer than a thousand bytes, to run for (say) 2×f(1000000) steps: that program would just need to include a subroutine for f, plus a little extra code to feed that subroutine the input 1000000, and then to run for 2×f(1000000) steps.  But by assumption, f(1000000) is at least the maximum number of steps that any program up to a million bytes long can run for—even though we just wrote a program, less than a million bytes long, that ran for more steps!  This gives us our contradiction.  The only possible conclusion is that the function f, and the program to compute it, couldn’t have existed in the first place.

(As an alternative, rather than arguing by contradiction, one could simply start with any computable function f, and then build programs that compute f(n) for various “hardwired” values of n, in order to show that BB(n) must grow at least as rapidly as f(n).  Or, for yet a third proof, one can argue that, if any upper bound on the BB function were computable, then one could use that to solve the halting problem, which Turing famously showed to be uncomputable in 1936.)

In some sense, it’s not so surprising that the BB function should grow uncomputably quickly—because if it were computable, then huge swathes of mathematical truth would be laid bare to us.  For example, suppose we wanted to know the truth or falsehood of the Goldbach Conjecture, which says that every even number 4 or greater can be written as a sum of two prime numbers.  Then we’d just need to write a program that checked each even number one by one, and halted if and only if it found one that wasn’t a sum of two primes.  Suppose that program corresponded to a Turing machine with N states.  Then by definition, if it halted at all, it would have to halt after at most BB(N) steps.  But that means that, if we knew BB(N)—or even any upper bound on BB(N)—then we could find out whether our program halts, by simply running it for the requisite number of steps and seeing.  In that way we’d learn the truth or falsehood of Goldbach’s Conjecture—and similarly for the Riemann Hypothesis, and every other famous unproved mathematical conjecture (there are a lot of them) that can be phrased in terms of a computer program never halting.

(Here, admittedly, I’m using “we could find” in an extremely theoretical sense.  Even if someone handed you an N-state Turing machine that ran for BB(N) steps, the number BB(N) would be so hyper-mega-astronomical that, in practice, you could probably never distinguish the machine from one that simply ran forever.  So the aforementioned “strategy” for proving Goldbach’s Conjecture, or the Riemann Hypothesis would probably never yield fruit before the heat death of the universe, even though in principle it would reduce the task to a “mere finite calculation.”)

OK, you wanna know something else wild about the Busy Beaver function?  In 2015, my former student Adam Yedidia and I wrote a paper where we proved that BB(8000)—i.e., the 8000th Busy Beaver number—can’t be determined using the usual axioms for mathematics, which are called Zermelo-Fraenkel (ZF) set theory.  Nor can B(8001) or any larger Busy Beaver number.

To be sure, BB(8000) has some definite value: there are finitely many 8000-state Turing machines, and each one either halts or runs forever, and among the ones that halt, there’s some maximum number of steps that any of them runs for.  What we showed is that math, if it limits itself to the currently-accepted axioms, can never prove the value of BB(8000), even in principle.

The way we did that was by explicitly constructing an 8000-state Turing machine, which (in effect) enumerates all the consequences of the ZF axioms one after the next, and halts if and only if it ever finds a contradiction—that is, a proof of 0=1.  Presumably set theory is actually consistent, and therefore our program runs forever.  But if you proved the program ran forever, you’d also be proving the consistency of set theory.  And has anyone heard of any obstacle to doing that?  Of course, Gödel’s Incompleteness Theorem!  Because of Gödel, if set theory is consistent (well, technically, also arithmetically sound), then it can’t prove our program either halts or runs forever.  But that means set theory can’t determine BB(8000) either—because if it could do that, then it could also determine the behavior of our program.

To be clear, it was long understood that there’s some computer program that halts if and only if set theory is inconsistent—and therefore, that the axioms of set theory can determine at most k values of the Busy Beaver function, for some positive integer k.  “All” Adam and I did was to prove the first explicit upper bound, k≤8000, which required a lot of optimizations and software engineering to get the number of states down to something reasonable (our initial estimate was more like k≤1,000,000).  More recently, Stefan O’Rear has improved our bound—most recently, he says, to k≤1000, meaning that, at least by the lights of ZF set theory, fewer than a thousand values of the BB function can ever be known.

Meanwhile, let me remind you that, at present, only four values of the function are known!  Could the value of BB(100) already be independent of set theory?  What about BB(10)?  BB(5)?  Just how early in the sequence do you leap off into Platonic hyperspace?  I don’t know the answer to that question but would love to.

Ah, you ask, but is there any number sequence that grows so fast, it blows even the Busy Beavers out of the water?  There is!

Imagine a magic box into which you could feed in any positive integer n, and it would instantly spit out BB(n), the nth Busy Beaver number.  Computer scientists call such a box an “oracle.”  Even though the BB function is uncomputable, it still makes mathematical sense to imagine a Turing machine that’s enhanced by the magical ability to access a BB oracle any time it wants: call this a “super Turing machine.”  Then let SBB(n), or the nth super Busy Beaver number, be the maximum number of steps that any n-state super Turing machine makes before halting, if given no input.

By simply repeating the reasoning for the ordinary BB function, one can show that, not only does SBB(n) grow faster than any computable function, it grows faster than any function computable by super Turing machines (for example, BB(n), BB(BB(n)), etc).

Let a super duper Turing machine be a Turing machine with access to an oracle for the super Busy Beaver numbers.  Then you can use super duper Turing machines to define a super duper Busy Beaver function, which you can use in turn to define super duper pooper Turing machines, and so on!

Let “level-1 BB” be the ordinary BB function, let “level-2 BB” be the super BB function, let “level 3 BB” be the super duper BB function, and so on.  Then clearly we can go to “level-k BB,” for any positive integer k.

But we need not stop even there!  We can then go to level-ω BB.  What’s ω?  Mathematicians would say it’s the “first infinite ordinal”—the ordinals being a system where you can pass from any set of numbers you can possibly name (even an infinite set), to the next number larger than all of them.  More concretely, the level-ω Busy Beaver function is simply the Busy Beaver function for Turing machines that are able, whenever they want, to call an oracle to compute the level-k Busy Beaver function, for any positive integer k of their choice.

But why stop there?  We can then go to level-(ω+1) BB, which is just the Busy Beaver function for Turing machines that are able to call the level-ω Busy Beaver function as an oracle.  And thence to level-(ω+2) BB, level-(ω+3) BB, etc., defined analogously.  But then we can transcend that entire sequence and go to level-2ω BB, which involves Turing machines that can call level-(ω+k) BB as an oracle for any positive integer k.  In the same way, we can pass to level-3ω BB, level-4ω BB, etc., until we transcend that entire sequence and pass to level-ω2 BB, which can call any of the previous ones as oracles.  Then we have level-ω3 BB, level-ω4 BB, etc., until we transcend that whole sequence with level-ωω BB.  But we’re still not done!  For why not pass to level

$$ \omega^{\omega^{\omega}} $$,


$$ \omega^{\omega^{\omega^{\omega}}} $$,

etc., until we reach level

$$ \left. \omega^{\omega^{\omega^{.^{.^{.}}}}}\right\} _{\omega\text{ times}} $$?

(This last ordinal is also called ε0.)  And mathematicians know how to keep going even to way, way bigger ordinals than ε0, which give rise to ever more rapidly-growing Busy Beaver sequences.  Ordinals achieve something that on its face seems paradoxical, which is to systematize the concept of transcendence.

So then just how far can you push this?  Alas, ultimately the answer depends on which axioms you assume for mathematics.  The issue is this: once you get to sufficiently enormous ordinals, you need some systematic way to specify them, say by using computer programs.  But then the question becomes which ordinals you can “prove to exist,” by giving a computer program together with a proof that the program does what it’s supposed to do.  The more powerful the axiom system, the bigger the ordinals you can prove to exist in this way—but every axiom system will run out of gas at some point, only to be transcended, in Gödelian fashion, by a yet more powerful system that can name yet larger ordinals.

So for example, if we use Peano arithmetic—invented by the Italian mathematician Giuseppe Peano—then Gentzen proved in the 1930s that we can name any ordinals below ε0, but not ε0 itself or anything beyond it.  If we use ZF set theory, then we can name vastly bigger ordinals, but once again we’ll eventually run out of steam.

(Technical remark: some people have claimed that we can transcend this entire process by passing from first-order to second-order logic.  But I fundamentally disagree, because with second-order logic, which number you’ve named could depend on the model of set theory, and therefore be impossible to pin down.  With the ordinal Busy Beaver numbers, by contrast, the number you’ve named might be breathtakingly hopeless ever to compute—but provided the notations have been fixed, and the ordinals you refer to actually exist, at least we know there is a unique positive integer that you’re talking about.)

Anyway, the upshot of all of this is that, if you try to hold a name-the-biggest-number contest between two actual professionals who are trying to win, it will (alas) degenerate into an argument about the axioms of set theory.  For the stronger the set theory you’re allowed to assume consistent, the bigger the ordinals you can name, therefore the faster-growing the BB functions you can define, therefore the bigger the actual numbers.

So, yes, in the end the biggest-number contest just becomes another Gödelian morass, but one can get surprisingly far before that happens.

In the meantime, our universe seems to limit us to at most 10122 choices that could ever be made, or experiences that could ever be had, by any one observer.  Or fewer, if you believe that you won’t live until the heat death of the universe in some post-Singularity computer cloud, but for at most about 102 years.  In the meantime, the survival of the human race might hinge on people’s ability to understand much smaller numbers than 10122: for example, a billion, a trillion, and other numbers that characterize the exponential growth of our civilization and the limits that we’re now running up against.

On a happier note, though, if our goal is to make math engaging to young people, or to build bridges between the quantitative and literary worlds, the way this festival is doing, it seems to me that it wouldn’t hurt to let people know about the vastness that’s out there.  Thanks for your attention.

197 Responses to “My Big Numbers talk at Festivaletteratura”

  1. sf Says:

    Haven’t managed to read the post yet, sorry, but just noticed that Nature has a supplement this week with 5 or 6 articles on quantum computing, including one on quantum supremacy.

    The cover photo shows a quantum computing laptop in some sort of state of superposition I think.

  2. Michael Says:

    I think there is a typo in the discussion of A(·): 5 pentated to 5 should be followed by 6 secstated (?) to 6, not 6 pentated to 6.

  3. Scott Says:

    Michael #2: Thanks! Fixed.

  4. Scott Says:

    sf #1: Yes, thanks, I saw, though I haven’t yet read carefully! The quantum supremacy piece, by Harrow and Montanaro, looked quite nice.

  5. fred Says:

    I’m guessing that being able to read their name out loud four times in a row is one of the requirements to make a speech at Festivaletteratura?

  6. Scott Says:

    fred #5: I just tried and it wasn’t that hard, so I guess my speech is retroactively OK?

  7. Joshua Zelinsky Says:

    “I estimated this morning that there are maybe ~10700 ”

    This should be 10^700.

  8. Craig Says:

    Quick typos: I assume there are $10^700$ first pages of novels, not 10700; and I think an exponentiation is missing from the value of BB(6).

    I recently encountered another connection between literature and large numbers: a journal article about Walt Whitman’s fascination with big numbers ( Of course, what counts as “big” in this context is minuscule given the content of your talk (Whitman went up as high as decillions or so), but I figured I’d mention it for the benefit of Scott_2035.

  9. fred Says:

    The number of possible first pages in your post “~10700 possibilities”, I guess it should be 10^700?

  10. Scott Says:

    Joshua #7 and Craig #8: Thanks, fixed! (The issue is that the WordPress editor doesn’t include subscripts or superscripts, so I have to go through and insert them by hand, and not surprisingly in a post of this length I miss some.)

  11. fred Says:

    I guess the proportion of halting (BB(n)) vs non-halting programs for a n state Turing machine has to be uncomputable as well?

  12. Scott Says:

    fred #11: Yes. If that were computable, then we could dovetail over all n-state TMs until we knew which ones halted, and thereby solve the halting problem.

    In fact, the proportion of n-state TMs that halt is closely related to Chaitin’s Omega, the famous uncomputable number.

  13. fred Says:

    Any parallel between the halting problem and our universe eventually contracting back into a singularity (= halting) or expending forever (= never halting)?

    (yes, we’re just here because some nerdy God is trying to compute BB(…))

  14. Scott Says:

    fred #13: I don’t think so, because the latter is computable just by looking at a couple simple parameters like the mass density and the cosmological constant.

  15. BLANDCorporatio Says:

    irt. fred #13:

    Somewhat off-topic–

    I have a half-joking rebuttal of the Simulation Argument. There will be a post-Singularity society, it won’t have special provisions against simulation, nevertheless most individuals will not be simulated (so exactly the eventuality the argument claims won’t happen, will happen 😉 ) …

    … because that society’s prodigious computing power will be sunk in updating block chains. Ba-dum-tss.


  16. Flumajond Says:

    So, basically, to get a big number you need to fix a language. But apparently, even if say you work in the language of set theory, the definition might depend on axioms too since there is no absolute truth in mathematics – you could have two sentences in set theory, defining numbers but to prove which one is bigger is independent of set theory. This relates to the Penrose ideas about absolute truth. But in fact, mathematical truth appears to be inductive i.e. it is an illusion on some (deep) level – we can never know what inaccessible cardinal axioms are consistent with set theory or even what the definitions mean.

    The Penrose-Goedel type intuition is also related to ordinals, their definition and representation. In the end, it boils down to the same thing – to transcend the intuition, you need to go beyond the obvious – say, use large cardinal numbers whose existence, for all we know, might contradict NBG/ZFC set theories, or even if it doesn’t, the intuition might be displaced or delusional – and hence you are on square one, since there is no real justification other than empirical. I used to think a lot about these things while growing up and even had an essay on it for some class at mit in 1999; these kind of things seem to be a trapping topic for nerdy youths.

    So what is truth and does absolute truth exist in mathematics even? It might be that some axiom, which is contradictory to NBG/ZFC in fact has so long a proof of contradiction, that it is in fact useful. Useful in the sense that all the consequences we can derive from it are true (at the very least, do not contradict set theory as that would yield a short proof of contradiction). So, by the same evolutionary or pragmatic pressures we might accept the fake axiom as coming from some sort of intuition.

    And even more, if our physical world is indeed finite (bounded by 2^10^122, as you claim but which seems to be a bit too free an interpretation of black hole entropy etc, and especially given that we don’t even know the ultimate theory and even that what is known is not really strictly mathematically proven to exist as a well defined object, as for instance no-one has proven the existence of any 3+1 dimensional nontrivial (let alone gauge) QFT within the largely ignored framework of Axiomatic quantum field theory, despite Jaffes’ putting it on the Clay’s Millenium list as his pet problem), then what sense do these things even make, if they are IN PRINCIPLE unknowable. We would then be compelled to abandon the widespread though admittedly naive notion of absolute mathematical truth, and accept more modest formalist or even quite questionable pragmatist approach, as indeed has been done in the last century. Truth seems to be a strange thing which is most clearly understood in mathematics and logic. It has always been strange to me how these abstract concepts reflect on the real life with which they apparently have only superficial resemblance (say philosophical materialism vs materialistic morality, as an oversimplified example, but such jumps, that philosophers seem to make all too often, were always puzzling), but it is the case that many Americans are nowadays experiencing “post truth” world they live in. Unfortunately, fictions and narratives, media and otherwise, have real world consequences that many have felt in the past, but while it is not surprising that lies, bias and propaganda are abundant in the realm of politics (politics seems to be a popular subject here too), it is somewhat surprising that truth can, on a much, much deeper level, be slippery even in mathematics.

  17. Joshua Zelinsky Says:

    “But apparently, even if say you work in the language of set theory, the definition might depend on axioms too since there is no absolute truth in mathematics”

    That’s not what this means. There’s lots of absolute truth. The problem is only that sufficiently powerful axiomatic systems cannot prove their own consistency.

  18. Scott Says:

    Flumajond #16: Your comment has too many threads to respond to all of them, but just to take what you use as your jumping-off point, I don’t agree with the statement “there is no absolute truth in mathematics.” I take it as self-evident, for example, that a given Turing machine either halts or doesn’t halt (when started on a blank tape)—that there’s some truth of the matter, regardless of whether we can prove it in the case that the machine runs forever. (Often, of course, we can prove it.)

    So in particular, Goldbach’s Conjecture, the Riemann Hypothesis, etc. etc. all have definite truth-values, and the Busy Beaver function is perfectly well-defined. And so is what I called the “super BB function,” and the “super duper BB function,” and far vaster BB functions still, involving all the computable ordinals that I mentioned in the talk, and vaster ones that I didn’t mention.

    It’s only when we get to even more enormous ordinals—e.g., ordinals defined by dovetailing over all Turing machines and ZF proofs that they define well-orderings on the input strings—that we start having to worry about the axioms of set theory. You might find that strange, but in some sense it’s no stranger than Gödel’s theorem itself: in fact it’s just an example of the Gödelian progression of theories, the fact that there’s no single usable theory to prove everything we’d like.

    That statement is constantly misunderstood to mean that all truth is relative, that 2+2 can be 4 in one theory and 5 in another one, etc. etc., but that’s not what it means at all (and is certainly not what Gödel himself believed!).

    On the contrary, all “non-pathological” theories (like PA, ZF, etc., assuming their consistency) are going to be arithmetically sound, which means in particular that if one proves a given arithmetical statement, then the other one will never disprove that statement. It’s just that

    (1) some theories might be able to prove arithmetical statements that others aren’t strong enough to prove (like, say, Con(PA), which is provable in ZF but not in PA), and

    (2) different theories could genuinely disagree about non- arithmetical statements (involving transfinite sets), like the Continuum Hypothesis or the Axiom of Choice.

    In the case at hand, it’s true that, the greater the ordinal strength of a theory, the larger the numbers we can succinctly specify using that theory. But by definition, it’s never going to happen that there are two numbers X and Y with clear arithmetical definitions, such that X>Y in one arithmetically sound theory and X<Y in a different one.

  19. Flumajond Says:

    “The further away they are from us, the faster they’re receding—with galaxies far enough away from us receding faster than light, meaning that any signals they emit can never reach us.”

    This is apparently not true, as the Hubble sphere expands

    Btw, your blog post where you had some discussion with Roger Penrose,
    that I recently had a pleasure to has a very nice idea (is it yours – i’ve never seen that before) relating to decoupling and quantum consistency. Namely, if I got that right, the decoherence of macroscopic objects cannot be reversed, because of FUNDAMENTAL reasons that states get coupled with particles that are pass the Hubble horizon (quantum collapse has been mine also obsession since teen years, Schrodinger cats and “rising to the level of consciousness”, and various interpretations, another problem dear to nerds which too few working physicists seemed to care about, despite its obvious fundamental importance, apparently at least until the dawn of quantum information era, when the subject has become much more widely popular); I don’t know how this objection about expanding Hubble’s horizon relates to that, but anyway the idea seems intriguing. Has anybody developed that further?

    My objection related to that is that the apparent “collapse” (or whatever your favorite interpretation is) can hardly be guaranteed to have these runaway degrees of freedom escaping to the Hubble horizon; one could imagine a containment of a system (of much smaller size), say some sort of mirrors etc, that, in some complicated way, reverse the decoherence. In other words, what would guarantee that such escaping would always happen? Presumably, the things would not depend on existence of a relatively distant (yet contained well inside the Hubble radius, or – given the objection explained by Veritasium, possibly even beyond it) reversing mechanism.

  20. fred Says:


    “Why 45 billion light-years, you ask, if the universe itself is “only” 13.6 billion years old? Well, when the galaxies emitted the light, they were a lot closer to us than they are now! The universe expanded in the meantime.”

    But this would mean that the expansion has to happen faster than the speed of light, no? I.e. space is stretching (or being created) faster than light can travel through it?
    So eventually such distant galaxies will “fade away” from view?
    Meaning that the night sky is getting darker and darker?

  21. Scott Says:

    Flumajond #19: I didn’t watch the video, but, yes, it’s a fundamental feature of de Sitter space that a given observer can only receive signals from a finite region, whose radius depends on the cosmological constant. (With no cosmological constant, it’s true that our visible patch would grow without bound and eventually encompass anything we’d like.)

    As for mathematical truth, I see that Joshua #17 was more concise than I was! 🙂

  22. Scott Says:

    fred #20: Yes.

  23. Flumajond Says:

    ” don’t agree with the statement “there is no absolute truth in mathematics.”

    This is meant to mean, that not ALL of the statements of mathematics, have some absolute truth – for instance, continuum hypothesis etc. But if you use set theory in your definitions (as you did in extended beaver functions) some of the statements might depend on such cases.

    The question for things like Riemann conjecture, which is an existential (or universal depending what is true) statement, which can indeed be interpreted in terms of halting problem, more deceptively seems to have some absolute truth, and indeed we all seem to believe that. But, if your idea about the finite world is correct, not even that would be clear – you cant plug in the counterexample that is too big. For universal formulas, even for number theory, it might be possible that the truth of such a statement might NEVER be known, not in a set theory, nor in any extension that we can work out. So, even if we accept that all true purely existential statements are true in absolute sense, for universal statements truth might always escape us, though it is reasonable to assume that truth exists nevertheless. But then, once we have multiple alternating quantifiers, even for number theory language, we don’t really have justification for absolute truth, as the Skolem functions are infinite objects so even existential intuition fails. Of course, most mathematicians, may I say all, do believe in some sense in the absolute truth of such elementary statements (and among scientists most mathematicians believe in God too apparently), but isn’t it all just a very useful delusion?

  24. Scott Says:

    Flumajond #23: Again I feel like you’re mixing together too many different concepts. As an example, I’d find it perfectly clear and comprehensible if somebody claimed, “the Riemann hypothesis is false, but the counterexample is too big to be checked in the observable universe.” And that would strike me as an interestingly different state of affairs from “the Riemann hypothesis is true.” And crucially, I don’t need to possess a means to distinguish those two states of affairs, in order to recognize them as being different.

    Instead, we can simply say: the same conceptual apparatus that lets us discuss these matters in our world, would clearly still let us discuss them in a hypothetical world where astronomical observations had revealed a much smaller value for the cosmological constant. (Indeed, before 1998, that’s the world almost everyone assumed we lived in!) And in that other, perfectly sensible world, there would be an experiment to distinguish the two states of affairs.

  25. Flumajond Says:

    ” I didn’t watch the video”

    The video is short (this part is only minute or so) and from Veritasium, who is a physicist and popularizes science and he specifically addresses this point. He claims that it is a misconception. What does he refer to – I doubt that he would put this at random, though it is possible that he is wrong (just as you might be wrong too) – perhaps he is talking about ACCELERATING expansion, not static expansion like in de Sitter space; your willingness to dismiss people like this is I must say quite disappointing, but that’s up to you. It is not becoming to underestimate people, and for that to be your first instinct is never a good thing…

  26. Flumajond Says:

    Scott #23 I understand perfectly what you mean, but there is a problem. Mathematicians (Penrose wrote about that in one of his two books about black holes etc) assume existence of an ideal world, and in such a world, what you say makes perfect sense. But what if such a world really is an illusion. As imaginary as God is to atheists (though much more precise). In that way, it wouldn’t make sense to talk about objects which have no bearing (direct OR INDIRECT) to our world. And on the other side, if you for instance agree that Continuum hypothesis does not have some absolute truth value (as many mathematicians indeed do, though by no means all), then what distinguishes it from say properties of some Skolem functions, which also might not “exist” even if physical world is countably infinite. And if you assume physical world is finite (which is perhaps not so widely accepted), then the same objection goes for large numerical counterexamples whose existence we can not know, either directly or prove indirectly. If you assume existence of ideal world of mathematics (as indeed I do too, on a less a cynical day), then all is clear. But there is a possibility that all this is only an illusion, however powerful. And then you seem no different (bare precision and formal system) then a religious guy seems to an atheist, or a terrorist who hopes for 72 virgins in heaven (which, for all we know, might be as real as unreachable cardinals). Granted, mathematics is useful, probably much more so than many sorts or religious or political delusions, though the later occasionally have a bit more impact at least in the short run.

  27. DavidC Says:

    Scott #18

    > a given Turing machine either halts or doesn’t halt

    Naively, it seems plausible to me that that statement is getting close to completely characterizing which mathematical statements have definite truth values.

    Maybe anything that depends only on whether finitely many Turing machines halt? But then I’m not sure what you’re allowed to depend on in establishing that dependence. And why not infinitely many? If I describe a sequence of Turing machines, then if there’s a fact of the matter about whether each one halts, surely there’s also a fact of the matter about whether all of them halt.

    The above kind of all sounds like nonsense, though. I’d enjoy it if there’s a way to make this clearer.

  28. Scott Says:

    Flumajond #25: OK, I finally watched the video—I really don’t like being linked to videos, especially if they require sound, but whatever. The narrator is talking about the Hubble horizon, which is different from the cosmological event horizon that I was talking about; see here for the distinction. I hope that clears this up.

  29. Scott Says:

    DavidC #27: The way to make it more precise is the arithmetical hierarchy. As an example, I would also ascribe a definite truth-value to any Π2 sentence, which is a sentence that asserts that a given “super Turing machine” (that is, Turing machine with an oracle for the halting problem) runs forever. As examples, the P≠NP conjecture and twin prime conjecture can both be put into that form.

    But if you accept that, then you might as well go further, and ascribe definite truth-values to Π3-sentences (about the halting of Turing machines with oracles for the super halting problem), Π4-sentences, and so on. Indeed, you might as well go all the way up to Πα-sentences, where α is any computable ordinal that you understand. My intuition starts to falter only at the computable ordinals that I don’t understand (for example, ones that require the consistency of large cardinal axioms to construct).

  30. Douglas Knight Says:

    Surely you don’t mean to attribute the best available Hellenistic astronomy to Eratosthenes, who measured the size of the Earth, but to Aristarchus, who measured the size of the moon and the sun, proposed heliocentrism, and suggested that the fixed stars were very far away.

  31. Scott Says:

    Douglas #30: Oops, thanks, fixed!

  32. Flumajond Says:

    Scott #18:
    “(2) different theories could genuinely disagree about non- arithmetical statements (involving transfinite sets), like the Continuum Hypothesis or the Axiom of Choice.”

    This is, I believe, patently wrong, and is not even a philosophical point. Of course, what is meant by “theories” is questionable (since Con(ZFC) and its negation are both consistent with ZFC, but only one extension is plausibly “theory” in your sense). Perhaps you subscribe to Penrose’s misguided intuition (that you had a blog about) that we have some mysterious insight into transcending one theory by another. That is quite dubious. The modern set theory postulates some large cardinals, axioms say LC, and it is not at all clear that those theories are consistent. But ZFC+Con(ZFC+LC) proves one arithmetical statement, and ZFC+NOT Con(ZFC+LC) proves another. Of course, if LC is not consistent with ZFC only second one is consistent theory, but otherwise both are. Both can be plausibly TRUE (in your absolute sense), as we have no way to determine if LC is sound. But one claims existence of some natural number which might or might not exist, and the other claims the opposite. If you go to more complicated (say universal existential formulas of arithmetic), such connections might also exist. In any case, there is no reason to assume that these are the ONLY arithmetical formulas that are influenced by transfinite objects. As Hilbert’s failed program to finitize mathematics showed, there is NO WAY to eliminate influence of large, infinite objects on proof of truth about finite ones. Infinite mathematics is fundamental in proving at least some statements purely arithmetic, and many are shown to be provable in ZFC, but not in the Peano arithmetic; there is absolutely no reason to assume that the same does not happen when we go to higher cardinals or whatever transfinite generalization of set theory, though such statements are difficult to find or prove; the Goedel type ones are just the easiest ones and are tip of an iceberg, more than a rule, I would suspect.

  33. Scott Says:

    Flumajond #32: Firstly, the sentence of mine that you quoted doesn’t appear to be the sentence that you disagree with.

    Secondly, I was careful in what I said. I was talking only about arithmetically sound theories, which are theories that can prove arithmetical statements only if those statements are metatheoretically true. (So, PA and ZF are presumably both examples, but PA+Not(Con(PA)) is not.) Some such theories can prove more arithmetical statements than others—and different theories might even prove incomparable sets of arithmetical statements—but by definition, one such theory will never prove an arithmetical statement to be true that another one proves to be false. If they did, then one or the other wouldn’t be arithmetically sound.

  34. Flumajond Says:

    Flumajond #32:

    What I meant in the last post is that emphasize on NON arithmetical is wrong, if it means “only” non arithmetical, as the emphasis (that doesn’t show in my repost) suggests.

    Scott #28:
    Yes, that clarifies it. You didn’t explicitly say which horizon by name, though what you said does correspond to the Hubble horizon, not the particle horizon. However, this seems to be a subtle point and Veritasium and refers to the paper
    Which exactly explains the distinction, which is perhaps sometimes ignored for the sake of simplicity (as in your text).

  35. Flumajond Says:

    OK that clarifies it, I missed “arithmetically sound”, though
    “arithmetically sound”
    is rather restrictive. We have no way of knowing if a theory is “arithmetically sound”, which is exactly my point. The only exception is claiming consistency over ordinals (what Penrose thinks separates us from robots), theories which are all arithmetically sound by construction, GIVEN that we know what is the ordinal, but anyway that is not how large cardinal axioms are constructed. And one large cardinal axiom might transcede many nested consistency-extensions (as it gives model to them all). But on the other hand, it might be inconsistent. Hence my original post (in which perhaps I was not clear too).

  36. DavidC Says:

    Scott #29: makes sense, thanks!

  37. fred Says:

    Flumajong #35

    How does this
    relate with this notion of “arithmetically sound”?

  38. Scott Says:

    fred #37: The Banach-Tarski paradox is not an arithmetical phenomenon; it crucially depends on the Axiom of Choice.

  39. Flumajond Says:

    fred #37:

    I don’t think it is related. ZFC is arithmetically sound, if I understand it correctly, and Banach Tarski is a theorem of ZFC, which is about infinite objects, not arithmetic, which depends on the Axiom of Choice. So, not really related.

  40. fred Says:

    “if we have a system with (say) a thousand interacting particles, then the rules of quantum mechanics say we need at least 2^1000 amplitudes to describe it”

    Since 2^1000 amplitudes are needed to describe the system mathematically, I’m still not understanding how can “nature” have so much hidden expressive power.
    I get that this is all amounting to a bunch of plain/boring probabilities in the end, but it’s still the case that those amplitudes aren’t reducible to anything simpler/cheaper.
    This is unlike anything in the classic models where at most interactions are 1000^2 (one on one).

  41. William Hird Says:

    I’m still mystified by the fact that Claude Shannon showed that most Boolean functions require exponential size circuits to compute but the lowest bound we have for any explicit circuit is only 5n? Scott, could you give us your take on why this is so?

  42. Scott Says:

    William #41: There’s no real mystery about it! If you just want to show that most Boolean functions are hard, then it’s enough to point out that there are lots of Boolean functions, and not enough small circuits to cover all of them. By contrast, if you want to show that your favorite Boolean function is hard, then you need to rule out every possible special reason why your function might be one of the easy ones (and given the fact that you succinctly described your function somehow, in order to name it, it’s clearly not a random one). You’ve therefore called upon your head all the difficulties of P-vs.-NP-like questions.

    But, having said that, whether it’s actually true that we have no superlinear lower bound for an “explicit” Boolean function, depends entirely on what you mean by the word “explicit.” If you read my P vs. NP survey, you’ll learn how it’s possible to construct Boolean functions in complexity classes like EXPSPACE that require exponential-size circuits, or scaling down, in SPACE(n2) that require superlinear-size circuits. So then the problem is “merely” that we don’t know how to do the same for explicit Boolean functions in the complexity classes we care about more, like NP.

  43. William Hird Says:

    @Scott #42
    (and given the fact that you succinctly described your function somehow, in order to name it, it’s clearly not a random one)

    Well I don’t think that is necessarily true because it’s trivial to design a psuedorandom generator where each output bit is the result of a different random Boolean function.

  44. atreat Says:

    Scott, is “arithmetically sound” well defined in the way you are using it?

    I guess what you are saying is that there exist a set of mathematical theories of comparable power to ZFC that all agree on the truth value for any finite arithmetical statement and this is a form of absolute mathematical truth. Did I get that right? Any theory which did not agree simply would not belong to this set.

    But that is a rather meager form of absolute mathematical truth, no? Especially if their exist theories that don’t belong to that set are otherwise consistent and interesting.

    Btw, I love this talk you gave and envious of the folks in the audience. However, I feel like I have read or watched you give a very similar talk fairly recently. What am I remembering? It certainly was not the blog post you made as a teenager…

  45. Scott Says:

    William #43: No. If each output bit is the result of a different random Boolean function, then you’re just talking about a random Boolean function. A pseudorandom function is a perfect example of what I was talking about, something that might look random but actually has a short description.

  46. Scott Says:

    atreat #44: Aha, this is the single most important point I’d like to get across about the philosophy of mathematics.

    Namely: even to talk about math in the first place, already presupposes that we understand what’s meant by arithmetical soundness.

    For example: for a theory to be sound about Π1-sentences just means that, if the theory says that a Turing machine halts, then that Turing machine indeed halts. (The opposite case—if it says the machine runs forever, then it runs forever—already follows from the theory’s being consistent.)

    OK, you object, but why am I allowed to talk so unproblematically about whether the Turing machine “indeed halts,” independently of any theory? Here’s the key: because if I’m not allowed, then I’m also not allowed to talk about the theories themselves, and what they prove or don’t prove, which are likewise questions about the behaviors of various computer programs. But I was already talking unproblematically about the latter, so to do it in the one case but not the other would be a double standard.

    Note that this argument doesn’t generalize to questions about transfinite set theory, like the Axiom of Choice or the Continuum Hypothesis. But I’d say that it generalizes to all arithmetical questions—showing that, if we’re unwilling to talk about arithmetical soundness, then we shouldn’t be willing to talk about consistency or provability either.

  47. pupmki Says:

    Flumajond #35: There is another point, “arithmetically sound” depends on the absolute truth of arithmetical statements. If you reject that such absolute truth exists, then to talk about “arithmetically sound” theories is in fact meaningless.

    Apparently, Scott does not consider the possibility that there might be two natural extensions of ZFC, that have contradicting arithmetical theorems. But, it is for instance known that Reinhardt cardinals are consistent with ZF but not with the axiom of choice, on the other hand we have measurable and supercompact cardinals etc. So, such a possibility cannot be excluded in principle, although it is not clear if there are arithmetical discrepancies between the currently used theories.

    For the “realist” side there is

    Penelope Maddy. Defending the axioms: on the philosophical foundations of set theory. Oxford University Press, Oxford, 2011.

    with excellent review of large cardinals

    While anti-Platonist stance has also its proponents:

    Hartry H. Field. Realism, Mathematics, and Modality.
    Blackwell Publishing, Oxford 1991.


    Hartry H. Field. Science Without Numbers: The Defence of Nominalism, Princeton Legacy Library, Princeton University Press, 2015.

    This is essentially a longstanding philosophical dispute, problem of universals, dating back to Plato and Aristotle.

  48. William Hird Says:

    @Scott #45

    Manindra Agrawal has a paper that talks about using psuedorandom generators being used to separate P from NP. In the paper he says that if you have a cryptographically secure psuedorandom generator , that this implies strong lower bounds, but I am not sure what this means.

  49. Scott Says:

    William #48: Yes, he’s right. Indeed, it’s fairly obvious: it’s just another way of saying that if P=NP, then cryptographically secure PRGs would be impossible, since guessing the seed is an NP problem.

  50. Scott Says:

    pupmki #47:

      Apparently, Scott does not consider the possibility that there might be two natural extensions of ZFC, that have contradicting arithmetical theorems.

    No, it’s not that I don’t consider the possibility, it’s that I consider it and reject it! 😀 For why, see my comment #46.

    (Have any examples of ZFC extensions with conflicting arithmetical theorems seriously been proposed by anyone? As you yourself acknowledge, Reinhardt cardinals etc. won’t do it. Whenever forcing is used, we even have a result, the Shoenfield absoluteness theorem, to rule out the possibility that the different set theories we construct will disagree on anything arithmetical.)

  51. William Hird Says:

    Scott # 49

    So to prove that the seed of a PRG can’t be guessed , would it suffice to have a PRG that” looks like” it has a string of infinite length as its seed, or a seed that is at least as long as some multiple of the number of the output bits of the generator? Like say that the first million bits of output is the result of a key 128 million bits long?

  52. pupmki Says:

    Scott #50,#46:

    Your argument that we already accept some sort of “arithmetic soundness”, if we are talking about theories themselves, and hence you extend that to “arithmetic soundness” in general, is problematic.

    This properly can be applied only to Π1 sentences, as we could not have a consistent theory that claims an “untrue” Π1 sentence. However, once we come higher in the arithmetical hierarchy, things become much less apparent. What does a Π2,2 sentence actually say? It can be interpreted as a 2 step game, but the set of possible strategies is uncountably infinite, which are objects of different kind entirely than the natural numbers. Hence, you might believe in the “absolute truth” of claims about Turing machine haltings and proofs and theorems, yet in this, countably infinite, world, you can fully reject that higher in the arithmetical hierarchy you have absolute truth, just as you can imagine to reject the absolute truth about transfinite objects.

    I’m not sure forcing covers all the cases of possible natural extensions of ZFC, and you are wrong if you think that Shoenfield covers whole of the arithmetical hierarchy. Here is a concrete example of Π1,3 arithmetical sentence that is not absolute for forcing models:

    As you can see from the overflow post, such questions have indeed been considered, and while Shoenfield absoluteness theorem is limited (low levels of arithmetical hierarchy, forcing aside) there are some similar but stronger results. But it is a fact that this IS a concern, and that while for some limited sorts of extensions and some levels of arithmetical hierarchy there are absoluteness results, there are also counterexamples and there is no unique way to proceed and justify absolute “arithmetic soundness” on purely mathematical grounds.

  53. Scott Says:

    William #51: In order to have a useful PRG at all, the output has to be longer than the seed—so that’s always the assumption. Otherwise, you might as well just dispense with the PRG and use true random bits.

    There are functions that efficiently map n random bits to n+1, or 2n, or n2, or even 2n pseudorandom bits, in such a way that we believe no polynomial-time algorithm can distinguish the output from random with non-negligible bias. But this requires making cryptographic assumptions (and would be impossible in a world where P=NP).

  54. mjgeddes Says:


    I’ve got an (admittedly vague fuzzy intuitive) idea for proving that the continuum hypothesis is true.

    I think there’s a way of classifying all numbers into a finite hierarchy such that each new level of number ’emerges’ from the level below it, analogous to the way levels of organization can emerge in the physical world
    (for example cells>tissues>organs).

    Not sure exactly how to do this, but intuitively, it seems to vaguely make sense.

    My conjecture is that you can classify all numbers into only 3 different types (levels). That is to say, I think there is (in some well-defined sense), only 3 types of numbers!

    Each level of number is ‘cleanly separated’ from the layer beneath, in the sense that you could define the higher levels without worrying about the details of how numbers on lower levels are defined (again, analogous to substrate independence for layers of organization in the physical world).

    Proposed 3 levels:

    1 All numbers with cardinality of reals and above?

    2 Algebraic?

    3 ?

    The idea is that the reals are on the base or foundational level, integers are ‘built on top of’ the reals on the 2nd level of abstraction (subset of algebraic) and there’s higher level of abstraction still on a 3rd level.

    So if we can add plausible axioms implying that there’s really only 3 types of numbers that can be cleanly separated into different levels of abstraction like this, this seems to strongly imply continuum hypothesis?

    (No level between 1 and 2)

  55. pupmki Says:

    pupmki #52:

    There should be “analytical hierarchy” instead of “arithmetical” in my post above. Meaning, quantification is over SETS of integers. Then there are these counterexamples of Σ1,3 sentences. If we restrict to numbers alone, i.e. “arithmetical”, I’m not sure that such examples are known for natural extensions, and Shoenfield theorem says if the ordinals are the same as in forcing, then so are arithmetical statements (but not analytical i.e. second order arithmetical). But if we are using new cardinals, the set of ordinal number of course possibly changes, hence there is nothing about absoluteness of arithmetical soundness that is guaranteed. It is unclear weather arithmetical soundness holds in the natural extensions of ZFC that are used (beyond forcing), but in principle there is no reason not to have a counterexample in the possible future extensions if not in those currently studied.

  56. Mateus Araújo Says:

    It feels a bit like cheating to allow the computer program to specify a number by its runtime, instead of having to actually write down the number in the tape (in the solution to Berry’s paradox).

    If one changes this, do we get something interesting, like the largest computable number that can be computed by a Turing machine with n symbols?

  57. pupmki Says:

    I’ve found this paper, from the abstract (link below):

    We prove that the satisfaction relation N⊨φ[a] of first-order logic is not absolute between models of set theory having the structure N and the formulas φ all in common. Two models of set theory can have the same natural numbers, for example, and the same standard model of arithmetic ⟨N,+,⋅,0,1,<⟩, yet disagree on their theories of arithmetic truth

    This is a rather elementary paper, and does not consider large cardinal constructions so these models of set theory might not be "natural", but it illustrates the point that (even when you have Π1 agreement), higher order arithmetical statements are not determined. The two set theories (or the parts used for proofs) might both seem natural by different kind of transfinite intuitions, in principle (this is not a mathematical statement).

  58. Scott Says:

    Mateus #56: Good question! I invite you to prove that that doesn’t change anything significant. (Hint: If there’s a program that runs for T steps, then one can write another, slightly longer program that prints T as its output.)

  59. gentzen Says:

    If you encode a natural number as a finite string, then you already put in the finiteness implicitly via the finiteness of the string. Kolmogorov complexity also does this, by ignoring the information required for encoding the length of the string. Chaitin’s Omega can only be defined if you first fix this omission, by only allowing prefix encodings (where the length is given implicitly). But even here, the finiteness might still be put in implicitly, because being a prefix encoding doesn’t necessarily imply that the question ‘whether a given string is a valid encoding’ is decidable. (Does it really matter at that point? Probably not, since you could just use a decidable prefix encoding. So I stop here.)

  60. Mateus Araújo Says:

    Scott #58: Ah. That shows it’s not a good idea for me to write comments before 0500 (on the other hand, apparently you can do it between 0500 and 0600).

    Give me a double-taped Turing Machine, and it’s trivial to write down the running time in unary in the second tape. With a bit better encoding one can do it in binary in a single tape, but it’s pointless to do it explicitly.

    Let me try again: can one get a sensible definition of a fastest-growing computable function? Say by requiring that the Turing machine corresponding to a “computable busy beaver” of size n be generated by a polynomial-time uniform circuit family?

  61. Scott Says:

    Mateus #60: I’m on sabbatical in Tel Aviv this year. It’s afternoon here.

    Alas, I don’t know of any function that could answer to the name of “fastest-growing computable function.” Indeed, if f(n) is computable, then so is Ackermann(f(n)) and so on. Yes, you could require the “Busy Beaver” Turing machines to be uniformly computable, and if you did, that would limit you to computable functions only—but it still won’t mean there’s any fastest-growing computable function among them.

    Incidentally, it’s true that there’s “plenty of room between Ackermann and Busy Beaver” (e.g., computable functions that grow way faster than Ackermann, and even uncomputable functions that grow way slower than Busy Beaver, which you can construct using priority arguments).

  62. fred Says:

    Since actual computers have finite memory, would it be interesting to explore “theoretical” math on finite sets of integers (modulo math on naturals)? Or would this be just too limiting?

  63. Mateus Araújo Says:

    Scott #61: Please ignore my previous post, it was just an error. I wanted to write instead:

    Aha! I knew you didn’t actually have superpowers ?

    But I don’t see how that’s a problem. One needs more bits to write down Ackermann(f(n)) than f(n), and even more bits to write down Ackermann(Ackermann(f(n))), so that is taken care of by requiring the Turing machine to be finite.

    I’m just afraid that the “fastest-growing” sequence is not well-defined. Let ‘n’ be the size of your program and f(n) its runtime. Now require your sequence of programs of size ‘n’ to be uniformly computable. So each program to define the sequence of programs of size ‘n’ defines a different function f(n). But now how do we choose what is “fastest-growing” f(n)?

  64. Scott Says:

    fred #62: The shelves of the math library groan under the weight of what you can do with finite fields and other finite algebraic structures. Of course, the big limitation is that you then don’t have a natural notion of “greater than” or “less than.”

  65. Scott Says:

    Mateus #63: Alas, my sole superpower is clarity of thought. I’d surely pick a different one if I got to live over again.

    Yes, it’s possible to have two growth rates, neither of which asymptotically dominates the other: for example, f(n)=n versus a function that’s usually flat, but occasionally jumps from way less than n to way more than n at extremely widely spaced values of n. But as far as I know, essentially all the growth rates that “naturally arise in practice” can be totally ordered from slowest to fastest.

    Yes, you could consider, e.g., the fastest-growing computable function definable by a program at most n bits long, for any fixed n of your choice. Of course, as you increase n, these functions will increase without bound (even as Busy Beaver still dominates all of them).

  66. Joshua Zelinsky Says:

    “Yes, it’s possible to have two growth rates, neither of which asymptotically dominates the other: for example, f(n)=n versus a function that’s usually flat, but occasionally jumps from way less than n to way more than n at extremely widely spaced values of n. But as far as I know, essentially all the growth rates that “naturally arise in practice” can be totally ordered from slowest to fastest.”

    I agree that such functions don’t arise in practice. Marginally related tidbit: One can write down explicitly (using just a few logs and a little playing with trig functions) a function
    f(x) such that:
    1) f(x)>0 for all x.
    2) f(x) is infinitely differentiable for all x> 0.
    3) f'(x) >0 for all x>0.
    4) f(x) is not O(x) and x is not O(f(x)).

    Finding such a function is a cute problem.

    I don’t know if one can make an elementary function that does this and has f^(n)(x) >0 for all n.

  67. Joshua Zelinsky Says:

    Actually, disregard last sentence; you obviously can’t since f”(x) being positive would mean that once you speed up again to get large than x by a bit you won’t be able to slow down again.

  68. Sniffnoy Says:

    Scott #46: I just want to say, that is an excellent point and I’m surprised I haven’t seen it made before. Well, not in that form, anyway — I guess I’ve seen people such as myself making the contrapositive point in the form of “aaargh, logic makes no sense because what metatheory are we even using to judge statements about what proves what in the first place??” But you take that contrapositive here and it becomes an interestingly different point of view! 🙂

  69. John Sidles Says:

    Scott observes (circa #65)  “As far as I know, essentially all the growth rates that “naturally arise in practice” can be totally ordered from slowest to fastest.”

    This observation has parallels in computability theory and complexity theory. In computability theory there is …

    Viola’s theorem  Given an integer \(k\) and Turing machine \(M\) promised to be in \(P\), the question “Is the runtime of \(M\) of \(\mathcal{O}(n^k)\) with respect to input length \(n\)?” is undecidable.

    Supposing that, in complexity theory as in function theory, the decidability of “all the growth rates that naturally arise in practice” motivates the definition of \(P’\) as the restriction of the complexity class \(P\) to algorithms that “naturally arise” in this sense.

    In this light, the question \(P’ \overset{\scriptstyle ?}{\subset} NP\) is both theoretically natural and practically interesting (and has been surveyed on TCS StackExchange). In particular, it is not known (AFAICT) whether the well-known obstructions to proving \(P \overset{\scriptstyle ?}{\subset} NP\) also obstruct proofs of its natural restriction \(P’ \overset{\scriptstyle ?}{\subset} NP\).

  70. Atreat Says:

    Scott #46,

    I want to make sure I understand your argument. Is it that we shouldn’t consider pathological theories, whatever they are, *math* theories? That when we talk about *math* theories we’re implicitly confining ourselves to those theories that agree with ZFC for all arithmetical truth value statements?

    And this is what we should venerate ^^^^^^ as absolute mathematical truth? That still seems meager to me. It also seems to suggest that whatever we are thinking about when it comes to transfinite numbers, it isn’t *math*.

  71. Scott Says:

    Atreat #70: I think the main thing you haven’t yet understood is that I don’t give ZFC any sort of exalted status.

    At the end of the day, I’m happy to talk unproblematically about “arithmetical truth,” independently of ZFC or any other system. That is, I draw a line in the sand, and say: there might not be a fact of the matter about the Continuum Hypothesis, but there’s certainly a fact of the matter about the Goldbach Conjecture, and more generally, about whether a given Turing machine halts or doesn’t halt, and (I would say) about higher-level arithmetical statements as well. If we’re unwilling to stick our necks out that far (i.e., not very), then I don’t understand why we’re even bothering to use language to talk about math in the first place.

    For present purposes, a formal system could be called “non-pathological,” if all the arithmetical statements that it proves are actually true ones. E.g., if the system proved Goldbach’s Conjecture was false, you could write a program to search for counterexamples, and with a big enough computer in a big enough universe, sure enough the program would eventually halt with the predicted counterexample. (For higher-level arithmetical statements, as pupmki correctly pointed out, one would need to tell a slightly more involved story, for example involving interaction with a prover, but I’m fine with that as well.)

    I’m almost certain that ZFC is a “non-pathological” theory in the above sense. But—here’s a key point—if we somehow learned that wasn’t the case, then so much the worse for ZFC. We’d then simply want to discard ZFC—which was only ever a tool to begin with, for capturing truths that made sense prior to ZFC or any other system—and find a better foundation for math.

    To address your other point, I think transfinite set theory clearly counts as math (it proves nontrivial theorems, is done in math departments … how much more do you want? 🙂 ). But you’re right that it’s an unusual kind of math, which in some sense is less about the objects themselves (e.g., the transfinite sets), than it is about the formal axiomatic systems that we use to discuss those objects. I wouldn’t say the same about most areas of math, which for the most part, deal with questions that could ultimately be “cashed out” into questions of arithmetic, for example about whether various computer programs halt (or halt given access to a halt oracle, etc.). Again, the latter sort of questions I regard as having a definite meaning, independent of whatever formal systems we use to discuss the questions.

  72. Eric Habegger Says:

    Scott, this a very fine exposition of concepts that I did not understand before. One big advantage of someone like myself (who is not formally trained on this stuff) is that when you learn about it things just pop out to me that don’t seem to pop out to others formally trained in it. I’m thinking about the culture of “shut up and compute” that exists in education.

    What pops out to me is that you are saying there is a high probability that a quantum computer’s only real problem solving ability may be to speed up quantum simulations. That is, it’s only major accomplishment in the real world may be to render encoding of information obsolete.

    Doesn’t this bring up some big issues! Is the upside of creating a quantum computer (assuming this probability is valid) worth the downside? It reminds me of the invention of the thermonuclear bomb. So far it’s only task is to kill people, or to threaten to kill people. Is it possible that a quantum computer will have an even more limited use? I can’t see it even being used as a deterrent as a thermonuclear bomb can be, knowing what I know about the lack of imagination of most people.

  73. Dan Staley Says:

    Okay, I’ve got what I assume is a really naive question – how can the amount of information you can store in a region scale only with surface area, and not with volume? I mean, we can look at a big sphere – it has some amount of surface area, so we can store some amount of information in it. Then we draw a ribbony, fractal-looking region contained in the sphere it with a much larger surface area than the sphere itself. It has more surface area (by an arbitrarily large factor) – does that mean we can store more in it? Can we really store more information by restricting to a sub-region?

  74. Atreat Says:

    Scott #71

    I think I get it. Your relationship and veneration to ZFC is analogous say to your view of Newton’s theory of gravitation. The moment that ZFC contradicted in some fundamental way your intuitive sense of arithmetical soundness (some one was able to show the equivalent of 1+1=3) is the moment you’d be searching for a new set of axioms. Similarly, you feel confident about saying there is a fact of the matter about whether particular Turing machines halt and would only trust formal systems that agreed.

    Transfinite numbers and the challenges that they introduce to our intuition (Banach-Tarski) are not problematic in this same way because the very existence of transfinite numbers are counter to our intuition.

    Couple further questions:

    Doesn’t it bother you at all the fact that the same formal system that so successfully upholds your intuition about arithmetical soundness also betray it so completely when it comes to transfinite numbers?

    Would you be willing to throw out ZFC and its implications for transfinite numbers and Banach-Tarski for a formal system that similarly upheld your intuition for arithmetical soundness, but did not allow such transfinite silliness? Even eager?

    For me, I’m happy to talk about math and all of these things and still believe there is no such thing as absolute mathematical truth. I have no place for a platonic wizard behind the curtain. And the void of absolute mathematical truth also leaves me no choice, but to conclude that absolute truth of any type is an illusion.

    The existence of many-valued logic, fuzzy logic and systems that do not use the law of the excluded middle are further evidence for me of the fallacy of believing in absolute truth. If there is absolute truth or absolute knowing I think it must be ineffable and outside of any language system or rule system.

  75. Flumajоnd Says:

    pupmki #57:
    That paper pretty much destroys Scott #46 comment, and shows how naive this sort of intuition is. Paper is by famous logician/philosopher Joel Hamkins, with input from Hugh Woodin.

    It shows precisely that even if the sets of natural numbers are the same, the universal existential etc. formulas of number theory are not, because of difference in Skolem functions. So, if metatheory is M1, one is true, and if it is M2, the other.

    This can be interpreted in terms of formal systems, so there are two systems of axioms of set theory extending ZFC (or any consistent extension of it), so that they AGREE on all purely existential and universal formulas of number theory (hence, on halting of ordinary Turing machines, provability of Theorems etc. – this is not your Con(ZFC) vs NOT Con(ZFC) difference), but disagree for some formulas when quantifiers are altered (staying in first order logic i.e. talking ONLY about numbers).

    So which one is “true”? There are truth predicates defined in that paper, so that Tr1 and Tr2 differ. This is perhaps the clearest demonstration of logical relativity of truth and illusions of existence of some “absolute truth” and futility of notion of “arithmetic soundness” that Scott is fanatically attached to, but which is justified by non-mathematical arguments ONLY. Purely mathematical arguments of the paper however show that even with having the same natural numbers, some first order properties of them might be different depending on axioms, and that objection from #46 applies only for agreement of universal and existential formulas (which one might ask for as Scott does), but does not extend higher in the hierarchy of first order quantification.

  76. wolfgang Says:


    >> you can write basically anything you want, as long as I’m able to understand exactly what number you’ve named.

    This rule seems ill-defined to me …
    What if I define L as the largest number one can name in 10 seconds following the Scott A. rules ?
    Would this number be the winner or would L+1 beat it?

  77. Scott Says:

    Eric #72: Even supposing that a QC’s only application was to quantum simulation, that seems likely to have lots of positive applications for humanity. Helping to design high-temperature superconductors, better-efficiency solar cells, drugs that bind to desired receptors … pretty much anything involving many-body quantum mechanics, and for which DFT and other current methods don’t yield a good enough result. Why are you so quick to write that off?

    As for breaking exist crypto, hopefully (and in principle) switching to different forms of crypto would only need to be a one-time thing. And in any case, as the Equifax break most recently reminds us, there are like Ackermann(1000) ways to steal people’s personal data that are vastly easier than building a scalable quantum computer. 🙂

    Incidentally, I also think thermonuclear bombs could’ve had positive applications for humanity: for example, bombs exploded underground could be used to drive turbines (yielding fusion energy that could be practical with existing technology), and Project Orion seems like the way to travel through space. Alas, understandable geopolitical concerns seem likely to postpone all such ideas for many generations, if not forever. But, as I once heard it put, the Teller-Ulam design was such a triumph of engineering that it would be a shame for it to only ever be used in murderous weapons.

  78. Scott Says:

    Atreat #74: Let’s be more concrete about it. 2+3=5. Not an absolute truth? What about Modus Ponens?

    For me, taking inspiration from the completeness theorem for first-order logic, a thing not being an “absolute truth” simply means there’s some possible universe where that thing is false. So for example, the 4-dimensionality of our spacetime manifold seems like clearly a contingent fact about our world: something that could have been false, and that for that very reason—given the limitations of our senses, measuring instruments, etc. etc.—we can never be 100% sure is true.

    Without changing the terms involved to mean anything other than their standard meanings, what’s a possible universe where 2+3 is not 5, or where Modus Ponens doesn’t hold? Or at least: what can you say to persuade me that such a universe might exist?

  79. Scott Says:

    wolfgang #76:

      What if I define L as the largest number one can name in 10 seconds following the Scott A. rules ?

    No, that’s just clearly, unambiguously a number that I don’t understand, not being able to do diagonalization over my own brain to see which number-definitions I would or wouldn’t accept. Disqualified.

  80. Sniffnoy Says:

    Scott #71:

    To address your other point, I think transfinite set theory clearly counts as math (it proves nontrivial theorems, is done in math departments … how much more do you want? ? ). But you’re right that it’s an unusual kind of math, which in some sense is less about the objects themselves (e.g., the transfinite sets), than it is about the formal axiomatic systems that we use to discuss those objects.

    Hey now, some of us still do perfectly good work about those transfinite sets themselves! 🙂

    (I have no idea how possible it is to cash out statements about ordinals and well-quasi-orders to statements about natural numbers, being someone who only cares about the order theory itself and not the logic that goes along with it. But I get the impression such a thing may be possible?)

  81. wolfgang Says:


    Come on … just write down the biggest number you can name in 10sec. Then we ask all mathematicians on this planet to do the same … finally we use the fastest supercomputer limited only by Planck time …
    L is the maximum of those generated numbers.

    This is a well defined and *finite* procedure IF your rules are well-defined.

    What is it that you “clearly don’t understand” about it?

  82. mjgeddes Says:

    Quanta magazine just posted an article about recent progress on the continuum hypothesis, with experts extremely surprised to see a proof that two types of infinity (called p and t) were the same cardinality. This supports CH.

  83. Scott Says:

    wolfgang #81: No, because what would actually happen if we tried what you propose, is that it would degenerate into arguments about which number-definitions are valid and which not (and my lecture explains exactly why).

    In fact my rules are not well-defined in general—there are number-definitions about which my own intuition vacillates, so that I can’t tell you in advance exactly where I’d draw the line.

    But what the rules are, is perfectly well-defined enough to adjudicate a biggest-number contest between any two children who I’ve ever met, and all but a few of the adults. 🙂

  84. Scott Says:

    Sniffnoy #80: As it happens, I was talking to Harvey Friedman about exactly this a few months ago. Well-quasi-order theory is an interesting intermediate case, not nearly as ethereal as (say) the Axiom of Choice or the Continuum Hypothesis, but also not quite as “obviously definite” as arithmetical statements. So yes, I suppose I do admit statements about WQOs as having definite truth-values, although I’m more comfortable with the purely arithmetical consequences of those statements. (E.g., that for every excluded-minor family of graphs, there exists a polynomial-time algorithm to test for membership in that family.)

    What Harvey and I realized was that there’s something interestingly open regarding this distinction. Namely, Harvey’s own work from the 1980s, together with Robertson and Seymour, established that some of the key statements of WQO theory are independent of Peano arithmetic, requiring at least some ordinal reasoning to prove. But right now, it seems that no independence from PA is similarly known for the arithmetical consequences of those statements, such as the “algorithmic” version of the Robertson-Seymour theorem that I mentioned above.

  85. Scott Says:

    mjgeddes #82: Yes, I had read that article with interest (or actually, mostly the underlying paper by Malliaris and Shelah, for which I half-understood the introduction, but still got more out than from the popular article).

    But I don’t understand the argument that this breakthrough result “supports CH”; maybe you or someone else could explain it.

    As I understand it, there were two infinite cardinalities, p and t, which were known to satisfy

    1 ≤ p ≤ t ≤ 2ℵ0

    in all models of ZFC. So, if someone had proven p<t, that would’ve implied not(CH). However, after Gödel and Cohen’s work, we knew that you could build models of ZFC where ℵ1=2ℵ0 and therefore p=t, and other models where ℵ1<2ℵ0 so the p vs. t question could a-priori have either answer. What the breakthrough by Malliaris and Shelah shows is that, in all of the latter cases as well, we actually have p=t. So then, what’s your argument for why this renders CH “more likely”?

  86. Joshua Zelinsky Says:

    On the topic of Busy Beaver and its generalizations, a question about another generalization; this is closely related to the earlier questions and conversations about quantifying the small-scale growth of Busy Beaver.

    We can also talk about BB(n,k) where n is the same as before and k is the number of tapes. Obviously, this growth is much much slower than the sort of super-oracled machines you discussed above. However, there’s another closely related generalization: Instead of adding extra tapes, add extra stacks. So consider BB(n,k,s) where s is the number of stacks added. Now, it is well known that two stacks can simulate an extra tape with small overhead (essentially you pretend that each stack is half of your tape and to move the head you pop one stack and push on the other). However, one would expect that in general having s stacks should give one a lot more flexibility than s/2 tapes, since having two stacks allows other operations like just pushing one stack or just popping one stack (which essentially then amounts to something close to being able to add or delete a spot on your tape). Can we make this rigorous in terms of the Busy Beaver function?
    In particular, aside from very small n, if we assume that s is at least 2, is BB(n,k,s) always much bigger than BB(n,k+1, s-2)? My guess would be that for any computable function f(x), as long as n or s is sufficiently large f(BB(n,k+1,s-2)) is less than BB(n,k,s), but this looks tough.

  87. Scott Says:

    Flumajond #75: I regret that I haven’t yet had time to read that paper. In the meantime, though, maybe you could help me by explaining: what is the actual example of a Π2-sentence, or other arithmetical sentence, that’s claimed to lack a definite truth-value? Not any of the philosophical or mathematical arguments around it, just the statement itself.

    For me, a canonical example of a Π2-sentence has always been the Twin Prime Conjecture. So I’m genuinely curious: would you say that, for all we know, the Twin Prime Conjecture might be “neither true nor false,” that its truth might be “relative”? If so, what exactly would such a claim mean? Let’s say that I write a computer program to spit out the list of twin primes. It starts out like so:


    Then the question is: assuming unlimited computer time and memory, will this program ever print out a final pair, or not?

    What would it mean for the answer to this question to be relative: that the program actually does print a last pair under one extension of ZFC, and actually doesn’t under a different ZFC extension? If so, then by what means do the axioms of set theory reach down from Platonic heaven to our world, in order to influence the behavior of this deterministic computer program (as opposed to the very different question of what we can prove from those axioms about the program’s behavior)?

  88. mjgeddes Says:


    Well the experts seemed surprised. From Quanta:

    “It was certainly my opinion, and the general opinion, that p should be less than t,” Shelah said.

    Also, as you pointed out:

    “if p is less than t, then p would be an intermediate infinity — something between the size of the natural numbers and the size of the real numbers. The continuum hypothesis would be false.”

    But with p=t, CH passes the test.

    So to me the hint is that there’s fewer types of inifinity than thought, leading us towards CH….

    Of course whether CH actually has a definite truth value depends on how strong a mathematical realist one is. A formalist or a weak realist would argue there’s no fact of the matter. A stronger realist or a Platonist still insists there’s a definite answer.

    The famous Geddes super-clicker intuition says..CH is true 😀

    Ultra-finite recursion man.

  89. Flumajоnd Says:

    “what is the actual example of a Π2-sentence, or other arithmetical sentence, that’s claimed to lack a definite truth-value”

    In the paper natural numbers are the same, so all purely universal (or purely existential) formulas will have the same truth value in both models constructed – which is exactly the point, while at the same time not all first order formulas. Therefore, there are no examples of such formulas as you ask (which is, universal with two variables – which essentially does not matter, as you could get a pair from one number; the alternations of quantifiers matter, to the nature of Skolem functions), but example that follows (not given explicitly) must have alternations in the type of quantifiers.

    In fact, that is how this paper ruins your argument from #46. Namely, if you ASK for purely universal or purely existential formulas to have definite truth value (i.e. proof existence and halting of Turing machines would all be included), then you can get that, but it does not extent to all higher levels (which have nontrivial Skolem functions) of formulas about number theory (which is what “arithmetic soundness” would require).

    The proof is given in terms of models of ZFC (but presumably to its finite extensions too), that contain the same natural numbers, but have different truth values of formulas with alternating quantifies (first order, but just about numbers). Which simply means, that the models contain different Skolem functions (which is where the difference comes from).

    In the end of the paper, there is nice overview about philosophical positions held by various mathematicians/logicians. From strict Platonists, to middle of the road Feferman, to Hamkins who rejects arithmetic determinancy to ultrafinitists (relevant in the case the world is finite). But the point is, these are PHILOSOPHICAL positions.
    Mathematics can only suggest (even prove as in this paper) that one kind of argument sometimes can not be extended beyond its most narrow scope, as you did in #46 if you wanted to use intuition about Turing machine halting to ALL first order arithmetic and arithmetic soundness and determinancy.

  90. Scott Says:

    Flumajond #89: No, a Π2 sentence does not mean “universal with a pair of variables.” It means a sentence with a universal quantifier followed by an existential one. So I repeat my question: what is the example of that kind of sentence whose truth is “relative to the model of set theory”?

    And I also repeat my question about the program that searches for twin primes. How does anything you’ve talked about reach across the conceptual chasm, and cause that program either to find a final twin prime pair or not find one? I still haven’t gotten an answer.

    Yes, of course this is a philosophical dispute, not something that can be resolved by a theorem. That’s why I gave you a philosophical argument for why the place where I draw the line between definite and indefinite, seems like the only place that makes any sort of sense. 🙂

    I’ve debated Hamkins before about this on MathOverflow—his rejection of arithmetic definiteness never made sense to me—but now I’m asking you. Again, not philosophical schools of thought, just what it is that reaches down from set theory and causes a computer program to either find a last twin prime pair or not find one.

  91. Scott Says:

    mjgeddes #88: Thanks, but without a further argument I don’t buy it. After all: the models of ZFC that we already knew about where CH is false, are all still there now! It’s just that we now have an extra piece of information about those models, namely that p=t in all of them. And more than that, the models where CH is false are the only models where the new result is even interesting, since we already knew that p=t whenever CH holds.

  92. Atreat Says:

    Scott #78: Absolutely! I’d love to get down to brass tacks, but I’m still not sure what kind of argument you’d find convincing. Let’s take 2+3=5 and see if we can figure it out. Precisely who/what do you think should adjudicate? Conscious observers in some universe? I gather you would not find that convincing since consciousness is not well defined. Presumably, you envision some form of computing device in some universe with physical/mathematical law that would foreclose it from ever outputting 5 in response to a well formed program performing the operation that we conceive as 2+3. Would that be convincing?

  93. Walter Franzini Says:

    Hi Scott,

    I’m one of the people that like your blog even if it’s beyond my understanding, I’ve been in Mantua for the 10th time this year and I was surprised to see your name in the authors’ list. While I’ve not attended your speeches because of the language barrier, I was disappointed not to find your book in the temporary bookstore.

    Festivaletteratura is more about readers and authors than about literature. One of the bestselling authors in Italy in 2017 is Carlo Rovelli, a quantum physicist, with a little book about time and he held an event in one of the main location of Mantua (Palazzo Ducale). So the festival is more about what people read than about literature.

    I hope you come again in Mantua with a stock of your book so I can try to understand something more about this blog 🙂

  94. Scott Says:

    Atreat #92: I just want you to describe to me what some world is like where 2+3 is not 5—in the same way that I could describe to you what a world was like that had only 2 spatial dimensions (I could just say: “read Flatland“), or that was classical rather than quantum, or maybe even that had multiple dimensions of time, or no time. That’s all, no more and no less!

  95. Flumajоnd Says:

    Ok, so yes, the twin prime conjecture is universal-existential claim. Your posing it as a list made what you were talking about seem like a halting problem, which it is not – so that is why I jumped to discussing double universal without carefully reading what you wrote, my mistake.

    As for the paper, if you insist for explanation (though you could have done that on your own, I didn’t know about that paper before – and just got to see what it is about and what is claimed, I am as much of an outsider to the model theory as you are, but the proof uses only elementary tools like compactness theorem and is short), ok I can try break it down for you.

    The more explicit of the two given constructions/proofs of Theorem 1, due to Woodin, is on the page 11. I doubt you could say that the formula they construct is universal-existential, as only its Godel number is constructed and it is SOME formula of first order language of arithmetic. So in proof of Theorem 1, there is no explicit expression for formula in that paper, and it is certainly not said to be universal existential, but it is of first order number theoretic and cannot be purely universal or purely existential by its properties. The construction is indirect.

    It goes something like this: One knows that truth predicate on arithmetic cannot be defined as a first order formula of arithmetic (old result of Tarski, Goedelian diagonalization of sorts) – that is, list of Goedel numbers of formulas that are “true” cannot be defined by a first order formula of arithmetic. But it CAN be defined in meta theory (predicate he calls TA). Now he considers 2-type consisting of all formulas (in two variables s and t) “f(s) equivalent f(t)”, and also “TA(s)” and “NOT TA(t)”; each finite subset of such formulas is realizable by some s and t, this set of formulas is recursive and the model is recursively saturated meaning that we can find s and t which satisfy all the formulas in this set.
    Then he will take a formula whose Goedel number is s and construct another model of methateory M2 so that this formula is not satisfied in it (as t corresponds to s in that theory).

    Ok, so thats a sketch of the proof of Theorem 1. I must say that it bothers me that both s and t might not be standard natural numbers, as there are no two standard natural numbers that satisfy the same set of formulas and are different. It appears that Theorem 1 is in fact of interest only to model theory as it shows something about the truth predicate – a predicate on Godel numbers of formulas, which in this case are from some extension of natural numbers. But also, there is Theorem 5 and its Corollary 8 on page 15, that in fact seems to claim that you can have agreement on Tr_k predicate (which inductively applies as on page 6 Tarski definition of truth at most k times) and not on T_(k+1), but even that does not seem to exclude nonstandard Godel numbers.

    Apparently all this is different from the formal interpretation that I claimed, and oriented to model theoretic stuff, which allow non standard Godel numbers which do not come from any “real” formulas. My mistake was to interpret these claims as claims about standard formulas of first order language, as I did not consider the possibility that the models will give NONSTANDARD Godel numbers, i.e. he uses “formulas” which are not formulas in our standard sense (though look like they are to the model).

    Nevertheless one can independently ask the following question (about formalism):

    Can we have two extensions of ZFC, such that they disagree on some universal-existential formulas about numbers, but yet agree on all purely universal and existential formulas about numbers? This claim might have some relatively cheap proof, easier than what was done in that paper, which was model theoretic oriented, but it does not seem to be what was done in that paper, at least not directly. Anyway, that is a mathematical claim, which I believe to be true, and it would be a clear reason why your philosophical intuition about one level does not extend further up the hierarchy. However, this paper indeed does NOT talk about that, on closer inspection, so an independent proof might be needed. Perhaps this is a known result?

  96. Scott Says:

    Walter #93: Thanks for the kind remarks; I’m sorry to have missed you in Mantua.

    Yes, I was also disappointed not to see Quantum Computing Since Democritus in the festival bookstore! But all but a few of the books that were for sale were in Italian (a language into which QCSD hasn’t been translated), so that may have had something to do with it. In any case you can get my book from Amazon, and I’ll sign it when and if we meet!

  97. gentzen Says:

    Scott #78, Atreat #92: As you surely both know, 2+3=5 (or rather S(S(0))+S(S(S(0)))=S(S(S(S(S(0))))) after removing syntactic sugar) directly follows from the axioms (of PA, or Q, or EFA), so it should always be true.

    But the actual point of disagreement is whether the ontological commitment (to the existence of Turing machines and) definite truth (or falsehood) of Π1 sentences (which I and many others want to make in order to be able to talk about the consistency of axiom systems) forces one to also accept definite truth (or falsehood) of Π2 sentences (and by iteration also to accept the definitive truth of Πn sentences).

    Personally, I am not convinced of the absolute truth of arithmetic statements. But I know that it would be hard to convince Scott of my position, and Scott probably knows that it would be hard to convince me of his position. I think:

    In a certain sense, naïve set theory is inconsistent, but naïve arithmetic is not. Hence independence of set theoretic statements from set theory is seen as a sort of fault of the statement itself, while independence of arithmetic statements from a given formal system is seen as a fault of the formal system.

  98. gentzen Says:

    pupmki #57: I have now read section “1. Introduction”, section 2 “2. Indefinite arithmetic truth”, and section “6. Conclusions” (so I skipped pages 12-24) of Hamkins’ paper (with 34 pages). Saying “this is a rather elementary paper” is an interesting assertion.

    I wondered before whether I should bring up that paper, but decided against it. Even if Scott would read it, it wouldn’t necessarily convince him (because the paper intentionally uses non-standard models of the natural numbers). And I should have read the paper first (even if I don’t fully understand it) if I bring it up, independent of anybody else in the discussion would also read the paper.

    Flumajоnd #75, #89, #95: I am happy to see that you have read Hamkins’ paper, understood it significantly better than I could have understood it even if I had invested more effort, and are willing to explain to us what we can learn from it for our discussion. I just wondered if you would have brought up that paper by yourself, if pupmkl hadn’t done it first. After all, it is not really clear whether it benefits a discussion, to go into such technical details.

  99. Scott Says:

    Flumajond #95: I should say that I really, genuinely admire your intellectual honesty in clearly admitting that the Hamkins paper doesn’t do what you think it did—when you could’ve easily obfuscated the point and I would likely have been none the wiser! 🙂

    For me, nonstandard integers have a fundamentally different status from standard ones. They are artifacts of a formal system—in effect, placeholders that say “an integer with such-and-such properties doesn’t actually exist, or not in the usual sense, but this formal system can’t prove that it doesn’t.” Indeed the Completeness Theorem can be interpreted as saying precisely that the nonstandard integers are “nothing more than that,” spandrels that have to exist by the mere fact of ZF+Not(Con(ZF)) and so forth being consistent.

    So, if there’s a “Π2-sentence of arithmetic” with indefinite truth-value, but it needs to have a nonstandard Gödel number—well, that’s more-or-less what my worldview would’ve led you to expect, isn’t it? 😉

  100. Flumajоnd Says:

    To answer my question above, if we strengthen the requirement so that the natural numbers are the same (but Skolem functions might differ), and want them to be standard natural numbers, it seems that universal-existential formulas wont do, because the negation is existential-universal, so in one of the models, we will have a concrete universal formula, that has to agree with the other model and we have a witness there too.

    So, we can extend this logic further up the hierarchy, to conclude that if natural numbers are the same in the two models AND are standard natural numbers, then the first order formulas of arithmetic hold the same in the two models.

    However, if we allow theories to be not complete, situation might be different.

    Suppose we have an extension of ZFC (it will be non recursive) as a formal system, that has ALL purely universal true (in ordinary sense) formulas about arithmetic as axioms. Of course, it will also prove all purely existential such formulas that are true. Lets call this theory M. It might not be complete, but it is complete as far as these lowest level quantified formulas of arithmetic go.

    Now, the question is: is this theory necessarily complete as far as first order arithmetic formulas go?

    Lets see… I don’t think so, but if we have any model of this set theory (M) which has the same natural numbers as standard ones, it has to have the same set of true real first order formulas about numbers.

    So that paper really seems to depend on “nonstandard” formulas and natural numbers.

    And we cant seem to get arithmetic indeterminancy by different Skolem functions alone. One needs extended set of natural numbers. Or yet in another way – if we have omega-consistent and complete theory of arithmetic, it has to be the “standard” one.

    So while I still believe what I asked above is true, another question might be asked: can we have two models of ZFC, with the same but nonstandard sets of natural numbers (i.e. structure with same +, *), but such that not for all REAL formulas of first order on natural numbers they have the same truth value (a variant of the theorem proved in the paper but where formula sigma is “real”, i.e. has a standard Goedel number). The paper does not seem to answer that either.

  101. Flumajоnd Says:

    “doesn’t do what you think it did”

    wrong. doesn’t do what I THOUGHT it did. Before I took a closer look. I hope it was a misprint on your part.

  102. Scott Says:

    Flumajond #101: I stand corrected.

  103. Aula Says:

    There is something odd going on with this blog: when I view the content of this post at the math displays are rendered correctly, but when I look at the displays are not rendered at all, but instead shown as the source code. Anyway, I mosty wanted to respond to comment #77:

    Incidentally, I also think thermonuclear bombs could’ve had positive applications for humanity: for example, bombs exploded underground could be used to drive turbines

    I don’t see how that could be practical. A thermonuclear explosion releases a minimum of 200 terajoules of energy in less than a second. Even if you ignore problems with things like radiation, converting and storing that much energy would be virtually impossible.

    (yielding fusion energy that could be practical with existing technology),

    Not really, as there’s very little fusion energy involved. Thermonuclear bombs work in four stages: chemical explosions generate a shockwave, the shockwave triggers the primary fission explosion, the energy from the primary fission explosion triggers both the fusion reaction and the secondary fission explosion, the neutrons (not energy!) from the fusion reaction enhance the secondary fission. The energy released by the fusion is only a fraction of the total energy; the purpose of the fusion is to produce sufficient amount of neutrons to completely consume all of the fission fuel.

    and Project Orion seems like the way to travel through space.

    No, not through space. Nuclear explosions could propel spacecraft efficiently in the Earth’s atmosphere (and even more so in denser atmospheres of some other planets) but would be very inefficient in vacuum.

  104. Atreat Says:

    Scott #94, gentzen #97, Flumajond #95: First, I want to say thank you all for this conversation! I’ve always found this subject fascinating.

    Personally, I find myself closer to the side of gentzen and Flumajond (whose intellectual honesty I too genuinely admire!), but I suspect I’m willing to go even further.

    gentzen, I think you frame the real disagreement well. When Gödel introduced his famous work it shook the philosophical worldview of a great many mathematicians. I believe Scott thinks that many mathematicians went too far in throwing the platonic baby out with the bathwater.

    Scott, I’m going to bite that bullet (and add mustard to it as you’d say!) and get back to you on that description when I can find a break from my papa duties 😉 I was thinking of something like the lambda calculus or SKI calculus that can embed PA, but with some particularly fine-tuned alterations that would preclude 2+3=5. Would that suffice?

  105. Scott Says:

    Atreat #104:

      I was thinking of something like the lambda calculus or SKI calculus that can embed PA, but with some particularly fine-tuned alterations that would preclude 2+3=5. Would that suffice?

    Hard for me to say whether that would suffice, since I can’t even begin to imagine what it would look like! Remember that I’m not looking for an abstract formalism, but for a description of what the world would be like.

  106. Atreat Says:

    Scott #105: Essentially, it would be a Turing-complete language that would embed PA, but have a “bug” if you will that would render 2+3!=5 or as gentzen would put it: S(S(0))+S(S(S(0)))!=S(S(S(S(S(0)))))

    I’ve written actual interpreters for SKI calculus so I could place the “bug” in the interpreter or if that is too contrived I can think about an alteration to the semantics of the language which will embed the “bug”.

    The real question is if you will accept this “bug” as sufficient reason to abandon your ontological commitment to the absolute mathematical truth of 2+3=5 🙂

  107. Ashley Says:


    Another great post.

    Just a curiosity (regarding (your 🙂 ) mathematical psychology). When you stated the three proofs for the uncomputability of the Busy Beaver function, you stated the one involving solving the halting problem only after the other two. Is it because you like the other proofs ‘better’, or was it maybe because the last one had to be given a link to the halting problem. Because the last one looked the straightforwardest to me and I would not have bothered for another proof.

  108. Marvy Says:

    Atreat 106: This entire discussion is mostly way over my head, but I think Scott’s reaction would be something like “This is just a buggy interpreter, the world has no shortage of them, no big deal. Or more formally, this interpreter implements something other than what people mean by integer arithmetic.”

    I think he wants a description of world where people say things like “This basket has room for 7 apples, so if I put my 2 apples in it, there is not enough room for you to put your 3 apples, since 2+3=9.” (Or maybe 2+3=4, so the basket is way bigger than needed.)

    Or a world where Winston from 1984 says “Freedom is the right to say 2+3 do NOT make 5.” Or something like that.

  109. Sniffnoy Says:

    There is something odd going on with this blog: when I view the content of this post at the math displays are rendered correctly, but when I look at the displays are not rendered at all, but instead shown as the source code.

    I’m also having this problem, btw, just to confirm that it’s not just Aula.

  110. jonas Says:

    Re #53, technically that’s not true. In the real world, people do use PRNGs that give fewer bits of output than input, because they have random sources that do have enough entropy but aren’t independent 1/2 probability random bits. This doesn’t help in this context, that is, creating functions that really don’t have a small circuit.

    Scott re #65:
    > Alas, my sole superpower is clarity of thought. I?d surely pick a different one if I got to live over again.

    In Paul Graham’s writeup “Why Nerds are Unpopular”
    (“”), which you’ve mentioned
    before, Paul Graham appears to say that he would pick clarity of
    thought again, and that other nerds (such as you) would choose the

    > If someone had offered me the chance to be the most popular kid in school, but only at the price of being of average intelligence (humor me here), I wouldn’t have taken it.

    Are you contradicting what Paul Graham said there? Or am I just
    misunderstanding your comment?

  111. Flumajоnd Says:

    “So, if there’s a “Π2-sentence of arithmetic” with indefinite truth-value, but it needs to have a nonstandard Gödel number—well, that’s more-or-less what my worldview would’ve led you to expect, isn’t it”

    Well, sentences with nonstandard Goedel numbers are not really sentences; it just shows that this result (from the paper) is not really relevant to that point.

    What WOULD be convincing for me is if, assuming that we add all universal “true” formulas of arithmetic (or even a bit more, essentially universal, i.e. universal over some simple predicates that are guaranteed to be decidable even in PA, allowing for instance primitive recursive terms) as our axioms, that we will get an arithmetically complete first order theory. But I think that is not true.

    In fact, I can give a reason why I think this is true. The sets can be classified according to computability, to decidable sets, recursively enumerable sets, etc. so that we start from decidable sets, and then use complement and projection like in other hierarchies. So the set of all axioms that are “true” universal formulas will be low in that hierarchy (set of all “true” existential formulas in enumerable, and this is sort of its complement), and set of all theorems from that is then also low (one quantifier above or so). But the set of all “true” formulas should be high, so that in this way “recursive” hierarchy would collapse. Which is either un-plausible, or likely even known (the analogue of P!=NP statement is just that enumerable sets are not decidable, which is easy to prove and likely there might be separation results for this on higher levels – its some hierarchy of sets, starting from decidable and up).

    So there you go, this is a cheap proof of my claim that there are universal-existential or at least first order formulas, that would be independent from the system in which truth of all purely universal (as well as purely existential which we already have of course) is known. A proof, modulo non collapsing of hierarchy of sets starting from decidable (and using projections and complements), which should be known (it must be way easier than noncolapsing of PH).

    Of course, life would seem much easier if my claim were false, and it is only an argument of sorts against extending #46 to higher level. In fact, as I said before, long time ago (after learning about Goedel incompleteness as a teen), I became a bit obsessed with incompleteness and ways to define things and truth. My default position was the same as that Scott describes (arithmetic truths of first order at least are absolute), and the mysterious logic of adding Con(T) to T, and repeating this process over and over, never getting the full thing, seemed like a way to get something – but whatever you do (passing to limits etc), always gives a recursive theory, and misses the mark. There is NO WAY to formalize this yet we seem to have a grasp on whats going on (which is basis of Penrose’s argument). Later, I’ve concluded that this must be an illusion, and that key is in defining limit ordinals, where our intuition in fact might be faulty (as might be adding large cardinal axioms, with no guarantee they are consistent as in fact some are even shown to be inconsistent), and hence, all we can have is “empirical” evidence for some axiom not to cause contradictions – a rather disappointing state of affairs, but it seems that there is little we can do about it. Of course, there is a philosophical difference about us being able to know something, and something being true or not on its own. But while useful, such slippery notion of “truth” of things which are unknowable – and in principle so – is why I remain agnostic as to existence of such a thing as “absolute truth” in the first place. It is not only continuum hypothesis but perhaps even arithmetic statements which are unknowable that are called into question.

    So why do we have intuition that arithmetic absolute truths exist? It seems to me that it is always by virtue of some imagined mechanism of KNOWING what the truth value is. For instance, we might imagine some quasi-physical system, with some “superpowers”, that computes the generalized BB function with ordinals (we can stick to omega for start, I suppose that should also do for computing the truth of first order arithmetic), or determines truth of some statement. But if these aren’t based in reality, then there is plenty of room for skepticism.

  112. Greg McLellan Says:

    Scott #71:

    I’m almost certain that ZFC is a “non-pathological” theory in the above sense. But—here’s a key point—if we somehow learned that wasn’t the case, then so much the worse for ZFC. We’d then simply want to discard ZFC—which was only ever a tool to begin with, for capturing truths that made sense prior to ZFC or any other system—and find a better foundation for math.

    I can’t help but feel like you’re being a little too nonchalant about this possibility. I’ve seen plenty of pure mathematicians wince at the idea of ejecting Choice from functional analysis (it’s kinda doable, but it makes things a hell of a lot harder and less elegant), let alone having to rebuild the modern edifices of topology and algebra if the appeals to Infinity, Powerset and Separation which they are built from turned out to be inconsistent. To me it seems like this isn’t just a matter of knowing a system of seemingly contradictory manipulations is “right” and finding a formal framing for it later, a la the situation with quantum field theories (which get plenty of vindication from experiment in the meantime). It seems more like the foundation crises reached somewhat of a conclusion in early 1930s and then, maybe starting with Bourbaki et al in the mid ’30s, a brand of mathematics exploded onto the scene which made extensive appeals to Z, not only as an organizing principle, but as an ontological mediator of very deep and essential technical claims which come together to make up the powerful explanatory fulcrums of much of modern pure mathematics.

    I guess the counter to this is that a lot of pure mathematics does eventually render an empirically testable claim, and so far those claims have tended to check out, so at least *those* branches of pure math would probably be okay. But then, I take all of that as evidence of ZFC’s consistency, and you’re proposing the hyperastronomically unlikely world where all that stuff worked and yet we’re still not allowed to consistently apply powersets and separations to infinite sets. I don’t really have the imagination to figure out what such a world might look like. I guess I’d expect the riots in the streets to drown out any calm-headed attempts to salvage pure math. :/

  113. Sniffnoy Says:

    Scott #84:

    Fascinating! Like I said, I know very little of the actual logic that this stuff touches on.

    I’m wondering what sorts of statements about well partial orders your thinking about here. Like, it sounds like you’re talking about statements that certain orders are WPOs. Whereas what I’ve done is taking known WPOs and computing their maximum extending ordinals aka their type. I was under the impression these corresponded to growth rates of certain functions? I dunno, I’m not too clear on that aspect, like I said. 😛 Do you know anything about what that actually corresponds to?

    (It occurs to me that the theorem that every WPO has a maximum extending ordinal relies on AC, as does the equivalence of the various definitions of WPO in the first place; but I think if we just pick an appropriate definition for WPO and use one of the other equivalent definitions for type then we still keep basically the whole theory with no AC dependence (at least I hope so!). And of course if you’re talking about particular WPOs like I always am then there should truly be no need for AC…)

    I personally haven’t had much reason to deal with ordinals that are very large. Or I guess I should say, I haven’t had much reason to deal with functions of ordinals that are very large! You can certainly get big ones out if you put big ones in. And like, it does seem like everything you’re talking about above is countable? It seems a bit surprising, at least, that computing types of uncountable WPOs should have any effect on arithmetic.

    Honestly the reason I brought up WPO stuff — or we can leave the partial ordering aspect out of it and just say ordinal stuff, there’s a few ordinal computations I’ve done that best I can tell nobody else has done before — was not because of any connection to arithmetic but rather just because it just seems so, well, definite. Yes, it’s “transfinite”, but because everything’s well-ordered (or well-quasi-ordered), you can use induction to get at it and get a very definite answer. It’s a bit hard to imagine how the answer could possibly be other than it is, you know? It’s not like with cardinals which are famously hard to compare and where if you pick two to compare there’s a good chance the answer is independent of set theory. I’m having a hard time imagining a similar situation with ordinals (other than the trivial case of picking initial ordinals for given cardinals, which relies on AC anyway).

    So like I said, if I prove that a certain WPO has type, say, ω^(ω_1^2) (because in reality I proved a more general formula and then plugged some ω_1’s into it) — and again we can use a definition of type that doesn’t require AC, and my techniques certainly don’t require it either — it’s a bit hard for me to imagine that has any effect on arithmetic, but it’s also hard for me to imagine that there’s anything indefinite or questionable about it.

    …on the other hand, asserting a pre-theoretic notion of truth about something as out there and infinite as ordinals also seems pretty questionable! So I dunno. Maybe the seeming definiteness is just because when we consider all ordinals rather than just natural numbers, we (or at least I 😛 ) tend to ask simpler questions? That could be the case… like, I can imagine a sort of analogue of the arithmetic hierarchy but for ordinals, and (just assuming for now that such an analogy actually works and makes sense) thinking in those terms I’m not sure I’ve done anything that goes beyond Π_1… maybe that’s why everything about that area seems so definite to me… except Π_1 statements about natural numbers can still be undecidable… but then if we’re talking about whether we have a pre-theoretic notion of truth for that domain, that’s irrelevant anyway. So that doesn’t seem like a hepful line of thought after all and maybe you should just disregard much of this paragraph.

    Yeah, I have no idea. You see why I’m confused? 😛

  114. Greg McLellan Says:

    Scott, #99:

    For me, nonstandard integers have a fundamentally different status from standard ones.

    But Scott, don’t you ever lie awake at night, fraught with distress over the possibility that your own internal Platonic model of the Naturals might be infested with Z-chains? What if there is some Turing machine M which diligently runs for all eternity if left unchecked by physical limitations, but your microtubules insist that M halts at time t, where t is writhing around up there on one of those icky, bidirectionally-infinite worms? To what could you possibly appeal to establish that M indeed never halts?

    Not that I have a better answer about what constitutes a “concrete truth” — I tend to be with you when you write off questions like CH as being decidedly unconcrete, and I think your “interactive proof” argument for AH statements having definitive truth values is pretty compelling — but I’ve always felt that talk of a “standard” model of the Naturals in absolute terms, as though we can know we have access to it even as we toy with models of PA and ZF which don’t, seemed to be laced with a kind of hubris.

  115. Chuqui Says:

    Tarski elementary geometry is math w/o arithmetic. So PA not required for proper math. Could presumably extended to yield PA, but, more presumably, be extended to yield NON-PA? Thus would have proper math where 2+3 != 5. Arithmetic is boring imho 😉

  116. leopold Says:

    Atreat #106

    Why should the possibility of a formal system which is not sound (indeed, so unsound as to be able to prove 2+3!=5) make anyone doubt the absolute truth that two plus three equals five? Any more than the fact that sometimes human beings make mistakes when they do arithmetic?

    Is the idea that because mistakes are possible, virtually everyone for thousands (maybe hundreds of thousands) of years might have been making a mistake in supposing that 2+3=5? (Presumably not, since in that case, for this to BE a mistake, it would have to be true that 2+3!=5, and this would be a truth as ‘absolute’ as the one everyone actually now believes viz 2+3=5; in which case there are absolute arithmetical truths after all)

    But then what IS the idea running in the background here?

  117. Scott Says:

    Ashley #107: Good question! In my original “Who Can Name the Bigger Number?” essay, the only proof I gave for the rapid growth of BB was that the computability of any upper bound on it would let you solve the halting problem.

    But in the intervening years, I realized two things:

    First, if you’re talking to a popular audience, it’s faster and simpler to give a direct proof—a proof that’s intrinsic to the BB function—rather than a proof that relies on the uncomputability of the halting problem, which your audience might never have heard of and which therefore requires a separate discussion.

    And second, the other proofs actually yield a stronger conclusion. Namely, those arguments pretty readily imply that BB dominates every computable function—whereas without more work, the halting problem reduction yields only the weaker statement that no computable function dominates BB.

  118. Scott Says:

    Marvy #108:

      I think he wants a description of world where people say things like “This basket has room for 7 apples, so if I put my 2 apples in it, there is not enough room for you to put your 3 apples, since 2+3=9.” (Or maybe 2+3=4, so the basket is way bigger than needed.)

      Or a world where Winston from 1984 says “Freedom is the right to say 2+3 do NOT make 5.” Or something like that.

    Yup. 🙂

  119. Scott Says:

    jonas #110:

      Are you contradicting what Paul Graham said there?

    Well, for one thing, I don’t see any contradiction if Paul Graham would make a different hypothetical choice than I would!

    For a second, “clarity of thought” isn’t quite the same thing as intelligence.

    And for a third, there are lots of imaginable superpowers other than being the most popular kid in school—e.g., the ability to walk through walls and so forth. 🙂

  120. Scott Says:

    Flumajond #111: Indeed, AFAIK throwing in all true Π1-sentences as axioms still doesn’t give you the ability to decide all Πk-sentences for arbitrary k. But of course this isn’t particular relevant for my view, which is prior to all formal axiomatic systems—objects that, for me, exist only to formalize and mechanize mathematical intuitions that were already present beforehand, and succeed only to the extent that they do that.

  121. Scott Says:

    Greg McLellan #112: Don’t get me wrong, an inconsistency in ZFC would be at least as big a deal as a proof of P=NP, perhaps more so. It’s just that this massively-unlikely event wouldn’t, by any stretch of imagination, be “the end of mathematics.” Restaurant tips would still be calculated the same way. The Pythagorean theorem and prime number theorem would still hold (keep in mind, almost everything that’s ever been proved can be formalized in tiny fragments of ZFC). The halting problem would still be unsolvable. There would be lots of new research opportunities in fixing the foundations, but the building would still stand.

    (Admittedly, I’m biased by the fact that my own mathematical interests—in theoretical computer science and the fields relevant to it—are generally extremely far from the kinds of topology, algebra, etc. you mention, the ones that would be most affected by such an event.)

  122. Science @ Festivaletteratura 2017 – out of equilibrium Says:

    […] the major experts on quantum computation, delivering a fun blackboard lecture on very big numbers (here he blogs about it), and physicist Fabrizio Illuminati, expert on quantum information and quantum […]

  123. Scott Says:

    Sniffnoy #113: While I found your comment extremely interesting, if I try to reply in any detail, I’ll very quickly exceed the limits of what I understand. I would want to spend a lot more time working with statements about WQO, before forming a strong opinion about which ones to regard as mathematically definite even if they turned out (let’s say) to be independent of ZFC, and which ones, if any, might have an AC- or CH-like status.

  124. Scott Says:

    Greg McLellan #114: No, I don’t lie awake at night worrying that my internal model of the natural numbers might be infested with Z-chains—any more than I worry that my brain might be infested with buffer overflows or other C programming security vulnerabilities!

    For me, these are both problems that are extremely specific to particular ways of regimenting the world—and the idea of treating them as problems of thought in general, rather than of the formalisms that give rise to them, is merely funny.

    Incidentally, I think Penrose is completely right when he says that our intuition for “the natural numbers” comes prior to any first-order theories that attempt to capture that set. He errs only when he insists that a computer couldn’t, in principle, engage in the same sort of pre-theoretical mathematical reasoning that we do.

  125. Chuqui Says:

    @Marvy #108 This parallels a discussion a while ago, whether natural sciences basically have “one root” aka “particle physics in space” or whether there could be another, independent theory that neither contradicts it nor follows from it. It certainly would be hard to prove that it can in principle not be derived from fundamental physics (the more so since that is (used ot be, at least) a moving target).

    So basically the question is whether 2+3 = 5 is a necessary condition for proper math (in its realm, Arithmetic, it presumably is) or whether there could be other “maths”.

    Maybe first order logic notation is such that 2+3 = 7 would break the notation, making it not sound. Geometry, however, can be “done” by other means than addition and multiplication. I don’t know if geometry is open to other “logics” than standard maths logic.

  126. Atreat Says:

    Marvy #108, leopold #116: Good questions! Let me sketch out in a bit more detail and then I’ll answer.

    I feel confident I can create a variant of the SKI calculus (a proven Turing-complete language) that implements Church numerals to embed basic arithmetic. I can modify this SKI calculus to, let’s say, make 2+3 a null calculation. In other words, 2+3 would not be calculable *in principle* on this variant.

    Now, if you subscribe to the physical Church-Turing thesis (which I believe Scott does), then you regard our universe as ultimately calculable. IOW, there is no physical manifestation of uncomputable behavior in our universe. Our universe can be in principle modeled by a Turing-machine.

    So the question I put to consider the absolute mathematical truth of 2+3=5 is to consider a universe modeled by my variant. I think it possible to create quite a rich world with this variant. To be clear, the abstract notion of the number two and the number three would exist in this universe. Computing machines would exist in this universe. Since we don’t know how consciousness arises I can’t say definitely that conscious beings could exist, but I see no reason to rule it out. What would not exist is any mechanism for computing the answer to 2+3. That would be uncomputable. I suppose conscious beings would not be precluded from asking this question in the abstract and arriving at the answer 5, but boy would they have a hard time taking hints from their surrounding universe.

    I guess, in the end, this is a form of argument against platonic realism that points out the necessity of the observer. What I’ve done is taken out the observer. Neither computing machines, conscious observers, nor the laws of the universe itself would admit that 2+3=5. If they did admit this (in the case of conscious observers), it wouldn’t be in the form of absolute mathematical truth.

    I’m skeptical that Scott and others will find this argument convincing, but I think that’s because of a failure of imagination. To be sure, I also suffer from this failure of imagination! It is incredibly hard for a 2D person living in flatland to conceive of a 3D world. That we’ve managed to realize the contingent nature of so many facts about our universe is quite a triumph. But I ask that you take my contrived thought experiment further. Imagine a world where not just 2+3=5 is foreclosed, but all manner of basic arithmetic is foreclosed. Worlds of incredibly complexity with utterly bizarre rules. Worlds where arithmetic laws are contingent. Again, it’s hard for me to imagine too, but I think the difference (if I can be so humble haha) is I can imagine imagining them 🙂

    In the end, absolute mathematical truth is dependent upon something observing it. It must arise somehow. Does it arise from itself? Does it arise from something else? Does it arise from both? Does it arise without a cause?

    None of the above. It does not inherently exist 😉

  127. Chuqui Says:

    Apropos apples: I buy half a basket of apples in the supermarket, yesterday 12, today just 10 but no cheating.

    In fact, or rather in intuition, geometry works just like arithmetic works: if you split a square along the diagonal, you’ll get two right angled triangles. No need for formal languages, axioms etc.

    Now we have non-euclidean geometry, so many statments turned out to be “can but needn’t”. That’s what usually happens.

  128. leopold Says:

    Atreat #126 “What would not exist is any mechanism for computing the answer to 2+3”

    Perhaps because I know nothing about the SKI calculus, I don’t understand this. What would block the usual computation in PA 2+3=2+SSS0 = S(2+SS0) = S(S(2+S0)) = S(S(S(2+0))) = S(S(S(2))) = 5? And whatever it is that blocks it, why should we interpret that blocking to indicate that 2+3=5 is not an absolute truth, rather than that your strange system simply fails to capture the sense of any or all of ‘2’, ‘3’, ‘5’, and addition, as we actually use these terms?

  129. Flumajоnd Says:

    I’ve found Scotts question related to this blogpost and discussion with Hamkins from a couple of years ago
    -is this the discussion referred to in #90?

    Anyway, the way Scott posed the definiteness there of fast functions (“so that even formalist might agree that it makes sense”) – is good enough for me, but it was immediately apparent that “true in all models of ZFC” just means “provable in ZFC”, so, as was the conclusion there too, z is way less powerful than generalized BB of some small rank (omega would kill it). This seems pretty uncontroversial to me (I guess I could be branded “formalist”).

    However, what was strange to me was to read what Hamkins wrote. He insisted that that was not really the right answer, and that, somehow, “true in all models of ZFC” does not equal “provable in ZFC” or something to that effect (I didn’t quite get it), and it seems that Hamkins takes nonstandard models of METATHEORY as seriously as they were real. This explains why in his paper he considers proofs and formulas with nonstandard Goedel numbers as “real” – I guess he is right if the metatheory is given certain interpretation, but it seems to be a rather peculiar point of view. Perhaps that is related to Scotts objection in #46 (and in that sense, Hamkins is consistent, as he treats metatheory the same way as theory, but then he probably has some meta-metatheory which is realistic, so it gets rather confusing). But that is not the nature of my skepticism – I guess I am mostly worried about dealing with something unknowable, and assuming that there is some “absolute truth” even if we can never reach it (which is the same reason for skepticism about continuum hypothesis). This is a rather different point than what Hamkins is doing (treating nonstandard models seriously and rejecting some intuition about the standard model as preferred). To me it seems like gibberish, makes as much sense as af45dacd69 c2cfcf0269da 49a69ea 7a39dda 73c54389 3b3515 c8e5b491 c1a2d5, but a computer might have different intuition, as does Hamkins apparently.

    So, I would propose the following “axiom” that tries to capture standard model. First, note that in nonstandard models of ZFC (by that I mean with different “truth” of first order arithmetic, for “real” formulas which are the only ones I care about), the real set of natural numbers is NOT a member of that model. It comes with extra elements. Now we want to be able to get rid of them.

    So, suppose we have a model M of some set theory. Suppose even that it is a transitive model. Now, we can take some subset X of M, and then form M[X] – a model, that is extension of M, but contains X that has those elements, AND ONLY those elements (this could work with non-transitive models too if made precise). Now, if for any such X, some theory ZFC+ allows such extensions, than it is OK, but if it does not – then we will reject ZFC+ as non-standard. The good thing is that this can be formalized as it seems to me (maybe starting from finite axiom NBG system if necessary), in the following way:

    We add to NBG/ZFC a set of axioms, claiming, for any closed formula T
    “there is a model M of NBG/ZFC+T, and a set X is a subset of M, such that there is no model of NBG/ZFC+T containing M and set X” implies “not T”.

    Now note that this seems to prove Con(ZFC). I’m not quite sure about it, but this seems interesting. Has anyone considered this?

    The intuition behind this axiom scheme is simple. Transitive models of “true” set theory can always be extended by adding a new subset. But if we have say T equal to “not Con(ZFC)”, then any its model will be ruined by adding a real set of natural numbers. We can extend it as we will, but we can never get back the “fake” proof of contradiction. Therefore, Con(ZFC). This seems like a rather nice idea to me (if I’m not wrong) but anyway most likely its either wrong or no-one will pay attention or I would hardly find time to develop it further, but who knows; of course it will not get one far as truth can never be attained, so any such intuition, however useful, is either dead wrong or limited.

    Note that my axiom scheme might even have something to say about continuum hypothesis (though this is highly speculative).

  130. Lucent Says:

    Trying to understand this a different way, as it reminds me of Typographical Number Theory in GEB and its ability to represent itself once it reaches a certain complexity.

    Is the lowest undeterminable Busy Beaver number also the size of the smallest Turing machine which can emulate itself in software, recreating the halting problem, since using an emulated Turing as input into a “real” one recreates the indeterminacy of the halting problem?

  131. Scott Says:

    Lucent #130: I’m not sure that a “Turing machine which can emulate itself in software, recreating the halting problem” is a well-defined notion. Certainly there are universal Turing machines, and there’s a longstanding game to make them as small as possible, but a crucial there is that universal machines require a description of another Turing machine as input—and for that reason, one can “shoehorn” a huge amount of the complexity into the input encoding scheme. For more see my paper with Yedidia.

    With our task there’s no input to the machine, but there is a theory-dependence: for example, the smallest machine whose behavior is independent of Peano arithmetic, is probably smaller than the smallest machine whose behavior is independent of ZFC. Any attempt to understand our task in terms of something else would have to take this theory-dependence into account.

  132. Sniffnoy Says:

    Scott #123:

    Ah, oh well. But what if we remove the “partial” or “quasi” aspect and just talk about well orders? Once again, ordinal computations seem pretty fricking definite — unlike with cardinals you can get actual answers! — but it seems like it’d be pretty difficult to cash out statements involving uncountable ones. And yes as best I can tell there are interesting questions just of the form “What’s the order type of this well-ordered set?” / “How do you actually compute this function on ordinals?” that nobody’s bothered to answer before, or at least not to write up anywhere I can find. 🙂

  133. Greg McLellan Says:

    Scott #124:

    Incidentally, I think Penrose is completely right when he says that our intuition for “the natural numbers” comes prior to any first-order theories that attempt to capture that set. He errs only when he insists that a computer couldn’t, in principle, engage in the same sort of pre-theoretical mathematical reasoning that we do.

    Okay. But to the extent to which your brain (or a computer capable of the same sort of pre-theoretical mathematical reasoning as you, since you’ve already given us that) purports to leverage its intuition for the Naturals to say correct things about them (as you hopefully do in any journal article you submit) with any kind of rough or eventual consistency, we could then distill that thought process into a specific r.e. set of sentences which agree with some model of the Naturals which is infested with Z-chains, right? Or would you claim that you’re wrong enough of the time that the “rough or eventual consistency” thing doesn’t hold up? (Which makes it difficult to rely upon your advice, surely?)

    Are you maybe expressing a hard rejection of “Hilbert’s thesis”, that informal mathematical reasoning is just a slackening of first-order deductions from ZFC? (Not that this seems to pass muster when it comes time to publish…) Do you maybe think that one day we’ll understand the mojo that makes informal human mathematical thought tick, and its power to get things done in practice will somehow skirt around even the loosest of ways in which me measure the correctness of computer programs today?

  134. Scott Says:

    Greg #133: I’m a big fan of the Popperian idea that the crucial thing, in every part of science—I would even say pure math—is not how to be certain that everything you say is right the first time; rather, it’s how to detect and fix errors. (I’m deliberately stating the idea in a way that’s oversimplified enough that people might actually remember it and apply it to their lives.)

    Some small constant fraction of all the proofs I’ve given in published papers have been wrong—see my mea-culpa blog posts for more!—but it’s okay. If anyone cares about those theorems, then the errors get found and (in most cases) fixed. And the same with my colleagues. It might be that the overall error rate in CS theory is higher than the error rate in other parts of math—I’m not sure—but if so, it just goes to show that it’s possible to make rapid, cumulative progress even in the presence of that higher error rate. And in any case, our error rate is still lower than that of our wonderful friends in physics! 😀

    I agree that, after I’m dead, it might be possible in principle for someone to come along and formalize “the standard of accuracy toward which I was striving while I lived” as a collection of first-order axioms for arithmetic—and if so, those axioms would of course admit models that were infested with Z-chains. But even then, that’s still not the way I’d think about it, were I lying awake at night wondering about such matters (something that no longer happens much—with two kids, now I mostly just crash and start snoring whenever the opportunity arises). Instead I’d simply think: “I’m just a guy who has a pretty clear idea of what a ‘positive integer’ is, but who’s not able to prove all the theorems about them that he’d like.”

  135. Atreat Says:

    leopold #128,

    The SKI calculus was invented by Moses Schönfinkel in 1924 and came before Alonzo Church’s lambda calculus. It was later shown to be equivalent. He was a contemporary of Church and Turing. Here is a good introduction:
    The SKI calclus has been proven to be Turing-complete and is arguably the simplest Turing-complete language ever invented. Other languages have been built upon it that are even simpler (iota, jot and zot for instance), but they are also based upon Schönfinkel’s combinatory logic. Jot is particularly interesting in that every program written in Jot is its own Gödel number.

    Here is language I wrote that is based upon the SKI calculus complete with interpreter:

    This is the church encoding of basic arithmetic in this language:

    // church numerals
    #define INC(X) “AASAASAKSK” X
    #define ADD(M, N) N INC(M)
    #define ZERO FALSE
    #define ONE I
    #define TWO INC(ONE)
    #define THREE INC(TWO)
    #define FOUR INC(THREE)
    #define FIVE INC(FOUR)

    You can also find combinators for booleen logic, p-numerals, church comparison, church pairs and lists, basic recursion, y-combinators, and a while loop all implemented along with the interpreter at the above link.

    Anyway, the proposal for the sake of this argument would be to introduce a new combinator that would effectively nullify the computation of 2+3 in that the combinator would be a noop. This would be done by extending the formal definition of the language and then to make a change to the interpreter to reflect it. To be clear, the language would still have the concept of ‘2’ and ‘3’ and ‘5’ and increment, decrement and addition. The one thing that would not be possible in the language is evaluating 2+3 and outputting 5. 2+3 would not be computable.

    However, I can not stress enough that I don’t think my argument is a contrivance of the SKI calculus. It is possible to transcompile any Turing-complete language into this language and then in principle to do the same with any of these languages. The only caveat is that whatever program you use to model your universe that it use these church encodings when doing arithmetic.

    The argument I’m making is the mathematical equivalent of the ancient zen koan about a tree falling in the forest. I’m trying to demonstrate that it is in principle possible to construct a universe where no observer … not conscious, not computing device, not the laws of the universe itself … will have access to the computation of 2+3. So, in this universe, how can you say that 2+3 is an absolute mathematical truth??

    Isn’t that exactly what Scott is asking for? A universe in which 2+3 is contingent? I think that is exactly what I’m providing. This argument is orthogonal to the debates above about formal systems of mathematics and non-standard-integers as fascinating as that discussion is. This argument relies upon the argument that all facts are contingent upon the observers who conceive them. But, because ‘observers’ is such a ill-defined term I’m trying to show a universe where *any* well-defined observer would not have access to this supposed absolute mathematical truth.

    Anyway, Scott what do you think?

  136. Luke G Says:

    Sniffnoy #132:

    Uncountable ordinals actually do play a role when studying large countable ordinals, as they help with naming very large countable ordinals (as is needed for proof theory). So there may be some possibility to “cash out” theorems on uncountable ordinals to get arithmetic statements.

  137. Atreat Says:

    Also, given the above, who is to say that we have access to all arithmetic computations in our universe? What if there are computations involving the standard integers that are foreclosed by the program running this universe? We would be completely cut-off from them even if our intuition insists that it is complete when it comes to the standard integers?

  138. Jon K. Says:


    Great blog post, Scott. Can you expand on your response?…

    ” Of course, the big limitation is that you then don’t have a natural notion of “greater than” or “less than.” ”


  139. asdf Says:

    Oh look Billy! Another QC breakthrough!

  140. Scott Says:

    Jon K. #138: In a finite field, if you start with 0 and keep adding 1, you eventually loop back to 0. So it doesn’t really make sense to ask whether one field element is “greater” or “less” than another one, any more than it makes sense to ask whether 11pm is earlier or later than 3am — you can reach 3am from 11pm by going to the past or going to the future!

  141. Scott Says:

    Atreat: My feeling is that, if I were a scientist living in a universe with a software bug that caused 2+3 to return 9 or whatever, Occam’s Razor would inevitably lead me to a theory where first I’d do addition in the normal way, and then I’d add in the “friction” or “correction” term that changes the result of 2+3 from 5 to 9. In other words, normal addition would still have the overwhelming advantage of coherence and simplicity, and weird-addition would have only the advantage of matching observations. It’s like if aliens observed humans playing chess: they would never logically deduce castling from the other rules, but could add it in as a weird observed exception to the other rules.

  142. leopold Says:

    Atreat #135

    Thanks very much for the brief intro to SKI. I’m still struggling to understand though why the possibility of such a system is supposed to tell us anything about the natural numbers we are actually familiar with, and the contingency or necessity of the relations they enter into.

    Also, you said “This argument relies upon the argument that all facts are contingent upon the observers who conceive them.”

    I assume this means that a fact has to be conceived in order to be real, so that if no-one can conceive of something, that thing is not true/not real.

    But if that is an assumption of the argument, can’t you much more quickly get to the radical nonabsoluteness of arithmetic just by hypothesising a universe with no life in it at all (or even a universe with nothing in it at all)? There would be no observers, and no computations. So (given your assumption) nothing at all would be true, either about arithmetic or anything else.

    This seems to make the excursion into the SKI calculus unnecessary; for my own part though, it also looks like a reductio ad absurdum of this subjectivist assumption. Why should there not be facts that no-one has ever conceived or ever could conceive (say about galaxies beyond the limits of the observable universe)?

  143. Joshua Zelinsky Says:

    Scott #141,

    If they observe a lot of other human games then they may also notice that many games have special exception rules that have the function of speeding up early game play, and castling fits into that larger framework.

    This isn’t just a nitpick, there’s a serious point here also. If we had world where we had to do the sort of “correction” for 2+3, I’d expect any such correction to fall into some larger framework.

  144. John Tromp Says:

    Thanks for another fascinating blog entry, Scott. You covered two concepts that my co-author Matthieu Walraet and me managed to combine in the paper “A Googleplex of Go games”, available at

  145. Big Numbers – The Square Root Says:

    […] “Oh yeah,” she says. “So is that the biggest number?” … (Shtetl-Optimized) […]

  146. Atreat Says:

    leopold #142: I’d go even further than that. If a fact is not realized by any observer anywhere or embodied in any way, then in what sense do we say it is “real”? Like, what does that word mean for a fact with this context?

    A universe with nothing in it at all? In what way could such a universe be said to “exist”?

    What I am saying is that all things are contingent. All facts or truths are contingent. Absolute facts or truths do not exist because that is an impossible mode of being. We’re talking about absolute truth when what Scott and others mean by “absolute” is some things that exist in and of themselves. Things whose existence are utterly independent of anything else and not contingent in any way. I believe that is provably impossible mode of being.

    Scott, my example doesn’t give 2+3=9, but renders 2+3 uncomputable. The analogy I have in mind is not castling in chess, but rather trying to successfully visualize living in 4 spacial dimensions or even better visualizing a solution for the halting problem or even better being able to “see” the Kolmogorov complexity of some arbitrary string. I am imagining a universe where 2+3 is an impenetrable blind spot.

    This would not contradict the truth that 2+3 = 5! Emphatically, that would still be true in our universe. But it shows that it is contingent 🙂

  147. mjgeddes Says:


    Well Scott, I think we’ve got to go with the guy that’s spent his life studying CH, Hugh Woodin.

    It seems that prior to around 2010, Hugh thought CH was false, but around that time he suddenly changed his mind and started talking about:

    V=Ultimate L

    and now he thinks CH is true

    V (Von Neumann Universe) = Ultimate L (Construcible Universe)

    I’m sympathetic to your view that math has to be tied to physics (specifically computation) to qualify as having definite truth-values. So that would include the math in theory of computation, algebra, geometry, topology and analysis.

    It’s less clear what we should make of the math in the fields of epistemology (probability&stats, set theory?, proof theory, model theory). It does seem that a lot of mathematical logic might be more about our own internal reasoning processes rather than something that is objectivity ‘out there’. So I agree, it’s not clear that CH has a definite truth-value.

    What sorts of infinity are ‘real’ then? well, the infinities tied to physics only perhaps (sets of natural and real numbers). It’s a lot less clear that the additional endless towers of infinities in logic have any sort of objective existence.

  148. leopold Says:

    Atreat #146

    i What I am saying is that all things are contingent. All facts or truths are contingent. […] Things whose existence are utterly independent of anything else and not contingent in any way. I believe that is provably impossible mode of being.

    Provably impossible? So it is a necessary truth that there are no necessary truths?

    (That _does_ sound impossible!)

  149. leopold Says:

    Atreat #146

    “What I am saying is that all things are contingent. All facts or truths are contingent. […] Things whose existence are utterly independent of anything else and not contingent in any way. I believe that is provably impossible mode of being.”

    Provably impossible? So it is a necessary truth that there are no necessary truths?

    (That _does_ sound impossible!)

  150. jonas Says:

    @Atreat re #137 The main reason why we think that is that we have various different simple computation models that we can all prove equivalent. While Gödel’s is based on arithmetic on natural numbers, that’s not the only simple one, and other models don’t have a primitive concept of natural numbers. Some of the models are based on stacks or tapes or strings of symbols from a fixed finite alphabet, some are based on (tree-like or recursive) algebraic structures, some on combinators or lambda expression terms. We can define arithmetic based on any of these. We have used the first-order logic deduction system to prove the well-known properties of arithmetic on natural numbers from the ZFC axioms, and that deduction system and axioms also don’t have addition and multiplication as primitives. We could even avoid natural numbers when building a representation of the logic deduction system in itself and proving theorems like Gödel’s completeness about it, although I’m not aware of anyone having written down such a version of the proofs.

  151. Atreat Says:

    leopold #148, No! I never said that the truth that all truths are contingent is absolute and you won’t hear me saying it. It simply is a truth and it is itself contingent of course. In other words… All truths depend upon context. <– including that one. How is that contradictory?

    And I do believe it is provable (indeed, was proved hundreds of years ago). But think what Scott is asking here: to prove to him that his intuition and belief in absolute truths is wrong when he freely admits that if a formal system actually did that, then so much the worse for the formal system! Any such proof he would consider not evidence that his intuition was wrong, but rather that the formal system was wrong.

  152. Atreat Says:

    BTW, I’m really hesitant to believe that there is anything I’m correct about that Scott is incorrect about 🙂 He’s one of my intellectual heroes (and I like to think a friend) and I have serious appreciation for both his intellect and intellectual honesty. When I think I’m right about something and he’s wrong… something smells. So I’d *love* (if a little scared) to be shown the error of my ways here.

  153. fred Says:

    Scott #94
    I just want you to describe to me what some world is like where 2+3 is not 5

    The problem is that one needs to define precisely the meaning of the symbols 2, 3, 5, and =.
    I don’t think it’s as obvious as it seems.
    Is math nothing more than the power of writing symbols on paper?
    Or is math supposed to simply map itself to real world constructs?
    One could imagine a world where 2+3 = 5 just isn’t ‘useful’.
    Imagine a world where the *only* objects are “blobs” that can only exist as a singleton (1), in “pair state” (2), or in “triplet state” (3), “quadruplet” (4)
    1+1 = 2
    1+2 = 3
    1+3 = 4
    2+3 != 5
    1+4 != 5

    so there’s no such thing as “bringing together” or “observing” a pair and a triplet together. It’s an ill-defined situation.

    Maybe because quintuplets just don’t exist – bringing 2 or 3 together would create some sort of annihilation of the entire universe.

    Or a bit like how people can only visualize distinct groups of 2 or 3 objects in their “mind’s eye”, but are incapable to visualize a group of 5 objects “at once”.

  154. Anonymous Says:


    Indeed the Completeness Theorem can be interpreted as saying precisely that the nonstandard integers are “nothing more than that,” spandrels that have to exist by the mere fact of ZF+Not(Con(ZF)) and so forth being consistent.

    You don’t need the incompleteness theorems to get nonstandard models. You can get a nonstandard model of ℕ simply by taking the set of sequences of elements of ℕ and dividing it by the equivalence relation which holds between two sequences if the set of indices on which they agree is a member of some fixed non-principal ultrafilter. In such a model, Con(ZF) will have the same truth-value as in the model from which you’ve started, and yet it will have members like the equivalence class containing (0, 1, 2, 3, 4, 5, 6, 7, …), which doesn’t correspond to any ‘ordinary’ integer.

    Nonstandard models aren’t there merely because we can’t prove that certain formal systems are consistent; they exist because we can’t even define what it means for something to be finite.

  155. fred Says:

    Scott #140
    “In a finite field, […] it doesn’t really make sense to ask whether one field element is “greater” or “less” than another one, any more than it makes sense to ask whether 11pm is earlier or later than 3am — you can reach 3am from 11pm by going to the past or going to the future!.”

    But it does make sense once you start considering “objects” living in the fields.

    E.g. spans of time.
    You can certainly imagine working for 5 hours, from 6pm till 11pm.
    Or working 5 hours from 11pm till 4am, and then it makes sense to say that “4am is later than 11pm”.
    And it’s not the same as working from 4am till 11pm (a span of 7 hours), in which case 11pm is “later” than 4am.

    Or take a boat sailing the earth (a finite field), you need to be able to distinguish its bow from its rear (i.e. distinguish a boat that’s 100m long from one that has a length of a great circle – 100m… or a boat having a length of 3 great circles – 100m, i.e. wrapping around three times).

  156. asdf Says:

    I still haven’t managed to carefully read this whole interesting post, but a few comments:

    1) Giuseppe Peano’s axioms introduced in the 19th century were not the same thing as what we now call PA. Peano’s axioms used a single induction axiom that quantified over formulas, so we’d now consider it to be given in second-order logic. I don’t know when the first-order theory PA was introduced but my guess is no earlier than the 1920s. It would be interesting to look up the history and I might try to do that.

    2) Even if one says the nth-level BB(k) is a definite number for all finite n and k, it gets messy if n is a transfinite ordinal, because of different possible codings of them. You might like the article:

    if you haven’t seen it. It explains the natural progression from PA through 2nd order arithmetic, ending up with set theory.

    IMHO it’s reasonable to say that every Turing machine either halts or doesn’t (i.e. every Pi-0-1 sentence has a truth value) while not saying the same of sentences with 100s of alternating quantifiers (what can that mean?). Or that PA is unconvincing because the induction axioms say things like “if phi(0) and (phi(n)=>phi(n+1)) then forall n. phi(n)” which presumes that all those n’s actually exist. Of course 0 exists (an axiom says so), as does S0, SS0, etc., but many formulas denoting integers can never be reached that way. That viewpoint is called “predicative arithmetic” and was mostly done by Edward Nelson, who wrote a book and a bunch of articles about it. His predicative system was weaker than PRA but stronger than Robinson’s Q, and he was still able to encode most interesting math in it.

    4. Shachaf Ben-Kiki once joked that ultrafinitists are mathematicians who believe that Peano arithmetic goes from 1 to 88 ;-).

  157. jonas Says:

    Re fred #153:

    > Is math nothing more than the power of writing symbols on paper?

    No, but writing symbols on paper is a rather accurate model that most of us humans can handle easily. Writing symbols with chalk on a blackboard is closer to real math, but has some practical difficulties for humans. Real math in the original of the Book is written on a medium that is entirely unusable by humans, and that may even be unimaginable for them. Humans can only ever read imperfect copies of parts of the Book in various forms more convenient for them.

  158. Atreat Says:

    jonas #157, Thou shall not take any other Book before me, eh?

  159. fred Says:

    jonas #157

    I was thinking about Formal Grammars, i.e. symbols + replacement rules (
    Doesn’t that cover (most?) of mathematics?

  160. tomate Says:

    @ Walter #93 and Scott #96

    It’s Matteo, the organizer of the Festival. Thanks again Scott for your great contributions. You shouldn’t be worried about the language barrier or the difficulty of the talk: the point we try to make at Festivaletteratura is that we shouldn’t downplay the audience. We want them to confront with scientific thinking in all of its complexity rather than with the sort of easily-digestible metaphors that infest scientific popularisation. And thanks Walter for being a Festival supporter. About the books, a short apology: unfortunately the Festival’s library is not used to dealing with academic publishers, and apparently there are complications with orders involving this kind of products.

  161. William Hird Says:

    @Jonas #110
    I think I should clarify what I meant in comment #51. The goal of making PRG is to have a deterministic algorithm that takes a small random seed and output a string that “looks random” to an adversary. One possible way to do this is to build a generator that “behaves like” it has a huge seed even though the actual seed is small, say like 128 bits. So the intuition here is to show that it is infeasible to invert the generator due to an algorithmic information theoretic argument: the program size complexity of the generator is way larger than the string of output bits of the generator by a constant factor, so no algorithm can succeed in describing the internal state of the generator using the output bits.

  162. Scott Says:

    William #161: The trouble is, you always assume the inversion algorithm knows the code of the generator. Anything that’s not known to the inverter is part of the seed, by definition. And the seed is shorter than the output string.

  163. asdf Says:

    Scott#50, “Have any examples of ZFC extensions with conflicting arithmetical theorems seriously been proposed by anyone?”

    Sure, depending on what you mean by serious. ZFC+Not(CON(ZFC)) conflicts with ZFC+”there exists an inaccessible cardinal”, since the existence of the inaccessible cardinal means ZFC has a model and is therefore consistent.

  164. William Hird Says:

    @Scott #162:
    The generator is self-keying, in the same vein as an autokey cipher. It re-seeds itself every clock cycle , the actual generation of output bits is a fractional (tiny) subroutine of the main algorithm. Yes the actual seed is smaller than the output but the “virtual seed” is always larger, there is a phase transition that takes place , I don’t want to give any details of how this would be done here, maybe in the future I will give a link on your blog to the actual design, after it’s well vetted, maybe, we’ll see 😉

  165. William Hird Says:

    As a humorous side note(hopefully) has anyone noticed that Trump’s new chief of staff Kelley looks like the guy who played the head of the Praetorian Guard in the movie “Gladiator” ? At the end of the movie he kinda lets the emperor be killed by Russell Crowe and Rome kinda sorta becomes a democracy again. Is life about to imitate art ?

  166. Scott Says:

    asdf #163: I’m obviously well aware that ZF+Not(Con(ZF)) proves false arithmetical theorems! I meant two extensions that prove conflicting arithmetical statements and that are both “natural,” in the sense that both have mathematicians who advocate them as likely to be arithmetically sound. And let’s exclude cases like ZF+(NEXP=coNEXP) and ZF+(NEXP≠coNEXP), where pretty much everyone hopes that mathematical advances will ultimately reveal one extension or the other to be inconsistent: I only want cases where experts generally agree that the two extensions are both consistent, and disagree only on which is arithmetically sound.

  167. Scott Says:

    William #164: A pseudorandom generator that’s secure for “trivial counting reasons” that have nothing to do with computational intractability, is just as impossible as a lossless compression program that works on random input (i.e., whose success has nothing to do with exploiting regularities in the data). There’s no amount of playing around with the definitions of words that can change this.

    But let’s end this particular exchange here, since no convergence seems to be happening…

  168. jonas Says:

    Re #82 and #85, see John Baez’s blog entry and Gowers’s blog entry about that new result.

  169. Jay Says:

    Atreat #135+

    Maybe that was the best comment ever on this blog! (or, the best I could appreciate) However yes, I do think there are a few loopholes that remain and maybe destruct the whole argument.

    First, you said that using SKI you are confident you can describe a universe with a specific blindspot, for exemple a universe where you can do ordinary arithmetics except that 2+3, and 2+3 specifically, is not decidable in this universe. That sounds 100% fair and directly answering Scott’s request. But *being confident* is hardly enough! 🙂 Can you actually do it and check whether the blindspots you can plant seem to stay still and quiet, or wether they immediatly contaminate most or all arithmetics? (e.g. produce obvious contradictions or make everything non computable)

    Second, and harder, is there any way to *prove* that the resulting system would be sound and consistent, or at least as consistent as PA? Finally, and maybe impossible to reach (but maybe not!), is there any systematic method to construct a sound and consistent mathematical framework with a specific blindspot for *any* (well defined) mathematical truth?

    Anyway thanks for this great line of thought!

  170. Scott Says:

    jonas #168: Thanks so much for that link!!! In effect, you get to watch Tim Gowers and John Baez as they collaborately write the popular article about p=t that Quanta magazine “meant” to have published but didn’t: the one that takes a person who’s never encountered the relevant definitions, and brings them up to the point of actually understanding the question (if not, of course, the answer). While the whole time Tim, John, and the others also engage in a sort of meta-discussion about how one should go about popularizing such an advance, and whether it even should be popularized. (It’s striking how, wherever and whenever something this abstruse is being made comprehensible, the same two names Baez and Gowers keep showing up.)

    I had read enough to understand that

    (1) p=t has no obvious bearing on the Continuum Hypothesis (only p<t would have refuted CH);

    (2) the Quanta article spends almost all its time talking about Cantor and Gödel and Cohen rather than the new advance, which could lead readers to the misconception that Maliaris and Shelah were solving CH or something like that (apparently this happened); and finally,

    (3) the technical statement of p vs. t involves “almost inclusion” of sets and “infinite pseudointersections” and “towers.”

    But I didn’t invest the time to develop the intuition that one can get from that wonderful Google Hangouts exchange.

  171. mjgeddes Says:

    Well , Baez isn’t correct when he says that Godel proved CH doesn’t have a definite answer. In fact, Godel was a Platonist, meaning he definitely thought CH was either true or false.

    All Godel proved was that CH is undecidable *by the axioms of standard ZF set theory*. But of course, add more axioms, and it could in fact be decided.

    I’m suspecting that Yudkowsky’s principle of ultra-finite recursion is somehow related to CH, and could be used to generate a proof. But how?

    If my suspicion is correct, the axiom we should add is that there are only 3 types of infinity (transfinite cardinals), with logical recursion stopping at 3 levels.

    Well natural numbers and real numbers only amount to 2 types of transfinite cardinals, so where’s the third? Is it above the real numbers, below the natural numbers, or in-between the reals and naturals (which would falsify CH)?

    Computable numbers don’t give us a new type of transfinite cardinal (computable numbers are countable), so where’s that third type of infinity?

  172. mjgeddes Says:

    Scott, what would happen if rather than adding axioms to standard set theory, axioms were *weakened* instead… would that help resolve CH?

    Specifically, I’m wondering about ‘Axiom of Powerset’

  173. Scott Says:

    mjgeddes: Yes, of course, Baez was editorializing about what independence means. Though even if Gödel were right, and there were a “Platonic truth” about CH, it’s very hard to understand how that truth could become universally accepted by mathematicians, given that the CH models and not(CH) models are both perfectly good, internally-consistent playgrounds in which to do almost all of mathematics.

    If you wanted only three orders of infinity, then you’d certainly need to give up on the powerset axiom, which (together with the axiom of infinity) gives you infinitely many orders of infinity. But why 3? Why not 17?

    In any case, removing axioms can only ever yield more models of your axiom system, never fewer. So removing the powerset axiom—or any other ZF axiom—certainly can’t suffice to decide CH. I.e., you’ll still have all the models where CH is true (and more than that, I think models where there are only 3 infinite cardinalities or whatever, like you suggested). But you’ll also still have all the models where CH is false.

  174. jonas Says:

    > Though even if Gödel were right, and there were a “Platonic truth” about CH, it’s very hard to understand how that truth could become universally accepted by mathematicians, given that the CH models and not(CH) models are both perfectly good, internally-consistent playgrounds in which to do almost all of mathematics.

    How did the axiom of regularity get accepted by mathematicians?

  175. gentzen Says:

    Scott #166: I don’t have an example for: “I only want cases where experts generally agree that the two extensions are both consistent, and disagree only on which is arithmetically sound.” You might be right that no such example exists.

    But some time ago, I seriously asked the question: “Does a Π2 sentence becomes equivalent to a Π1 sentence after it has been proven?” It was not well received, but it was a serious question from my part. However, my main motivation was better understanding of the connection to ordinal numbers, not to question the consistency (or “definite truth property”/”law of excluded middle”) of Πk sentences.

    After my question was badly received, I tried to explain my motivation for asking that question:

    This is a serious question about provable Π2 sentences. The rather random examples of Π2 sentences (P vs. NP, Goodstein’s theorem, and the strengthened finite Ramsey theorem) given in the question are enormous statements, and I feel the only way how such enormous statements can be true is by being provable. But being provable of course means being provable in some specific formal theory, and the consistency of such a theory is equivalent to a Π1 sentence. But the examples also show a connection to ordinal numbers, and ordinal numbers are used to measure the proof strength of certain formal theories. I had hoped that somebody would be able to explain whether there is a connection between ordinal numbers and Πk sentences.

    At least user21820 tried to help me cleanup my misconceptions in a followup chat to that question. I no longer fully understand all my reasonings from back then, but at some point I made the following remark, defending once again my initial question:

    (2) Even so I thought that ω-consistency showed that the premise of my question was flawed, I wonder whether Gödel’s dialectica interpretation doesn’t show exactly what I was asking for in the case of arithmethic sentences provable in PA (or rather Heyting arithmetic). After all, the “exist term” is constructed explicitly, so all that is left is the “forall sentence”, which is provable in System T (an extension of primitive recursive arithmetic).

    Even so I stopped thinking about that specific question, I did continue trying to understand the material from mathematical logic in terms of Turing machines. At least for myself, I have the impression that I understand that material better now, and on a more intuitive level. (I will probably write a blog post about it at some point, and then be happy with what I achieved.) I could try to explain it, but I am unsure whether you (or anybody else) would really want that. Should I try to explain it?

  176. Scott Says:

    jonas #174:

      How did the axiom of regularity get accepted by mathematicians?

    Technical convenience?

    In any case, I see this as very different from CH, because it’s obvious from the beginning that Regularity can neither be proven nor disproven from the other axioms: it’s simply a question of how “pathological” are the sets that you’d like to consider.

  177. fred Says:

    Are there other general metrics besides “number of states” that could split non-halting programs into different categories?
    (is never halting the same as halting at infinity? :P)

    A bit like separating infinite sums into the ones that have a finite number of non-zero terms and the ones that have an infinite number of non-zero terms, then splitting the latter category into the ones that converge and the ones that diverge?

  178. Atreat Says:

    Jay #169: Thanks! Yes, being confident is hardly a proof. I could try and actually implement this, but it would take some time and it seems clear that Scott doesn’t buy the argument even if I could implement it. Further, I don’t have the skills necessary to provide anything like a formal proof that the result would be sound or consistent or have greater mathematical ramifications.

    Right now, I’m spending my quiet time thinking about what a formal logic would look like where all truth values are contingent ie, where all propositions are dependent. This has some relation to paraconsistent or multi-valued logics, but I’m wondering what a computable calculus would look like with this kind of framework. Cheers!

  179. Antedeluvian Says:

    Out of interest, why do you write BC, not BCE? Is it for political/religious reasons?

  180. Scott Says:

    Antedeluvian #179: Uhh, just because that’s the standard/traditional abbreviation? I have no political or religious objections to writing BCE.

  181. Antediluvian Says:

    Scott #180

    I am not sure it has been standard for a few years now, at least within academic or non-Trumpish circles. Having said that, the change to BCE did cause a stir a few years ago in Australia (see e.g. ) and in the US more than a decade ago (see e.g ).

  182. Dandan Says:

    I have an idea how to name a bigger number 🙂
    But I’m not sure if it’s correct.

    Consider ZF for a start. Now add new axiom to it – statement about its consistency. You’ll get some new theory ZF + Con(ZF). You can repeat to obtain ZF + Con(ZF) + Con(ZF + Con(ZF)). You can continue (just like with +1) up to some ordinal and get some new very strong theory. This theory, probably, can prove the existance of much larger ordinal. So, this is some kind of function F from ordinals to much larger ordinals. Now we can define ordinal which is the union of ordinals F(w), F(F((w)), … , F^n(w), … . And name a very big number using it.

  183. Scott Says:

    Dandan #182: Yeah, that’s pretty much exactly what I was talking about in the last part of the lecture.

  184. Gabriel Says:

    There’s an article at Ars Technica that says that Microsoft is integrating quantum programming into Visual Studio.

    The article *doesn’t* say that quantum computing can break NP complete problems by trying all possibilities at once — which is a good sign, I guess.

  185. Raoul Ohio Says:

    Microsoft announces Q# / P++:

    The quantum programming language will be part of, and fully integrated into, Visual Studios. MS hasn’t thought of a name for the language yet, but my personal quantum prediction machine predicts Q#, or possibly P++.

  186. Raoul Ohio Says:

    Microsoft’s rap:

    A lot of this article is about Michael Freedman, but it also has lots of Gee Whiz stuff like: “A quantum computer is able to model Nature”.

  187. mjgeddes Says:


    “If you wanted only three orders of infinity, then you’d certainly need to give up on the powerset axiom, which (together with the axiom of infinity) gives you infinitely many orders of infinity. But why 3? Why not 17?”

    Scott, I will now state the axioms of my ‘Reality Theory’. My lemmas will finally provide the answer to life, the universe and everything 😉

    *Axiom of Knowledge-Existence Equivalence:

    Reality and knowledge are equivalent at the fundamental level! Whilst it’s *usually* true that the map and the territory are separate, this distinction breaks down at the fundamental level, and knowledge and existence become equivalent!

    *Axiom of Ultra-Finite Recursion:

    There is absolutely nothing to be gained from a logical language that has more than 3 levels of recursion. That is to say, full logical power is reached at the 3rd level of recursion, and additional levels are redundant because they can always be reduced to (collapse to) the 3rd level.

    This is generalization of Yudkowsky’s law of ultra-finite recursion:

    Yudkowsky’s Law of Ultrafinite Recursion states that “in practice, infinite recursions are at most three levels deep.”

    My generalization is to take out the phrase ‘in practice’ and replace with ‘in principle’.

    *Axiom of Self-Similarity of Knowledge

    The structure of knowledge is a fractal. That is to say, there’s a self-similarity property to the structure of knowledge such that we can find a method of knowledge representation that applies across *all* levels of abstraction

    The Geddes axioms are sufficient to construct a theory of everything.

    Lemma 1: Reality is a *language*

    From Axiom of Existence-Knowledge Equivalence:

    All of reality is a language that is reflecting upon itself.

    Lemma 2: From (1) and Axiom of Self-Similarity

    The language of reality is *recursive*. The universe is a fractal reflecting the structure of knowledge

    Lemma 3: From (1) and (2) and Axiom of Ultra-Finite Recursion

    The fundamental language of reality (call it L1) undergoes recursion and splits into 3 levels, representing 3 fundamental levels of abstraction.

    But this logical system as a whole represents a new language (call it L2) and *that* undergoes recursion, again splitting into 3 levels of abstraction.

    The new logical system resulting from the previous steps (L3) does not recurse.

    We are left with 3^3 = 27 fundamental levels of abstraction.


    There are 27 core knowledge domains that ‘cleanly’ (precisely) categorize all knowledge. These are based on 27 a-priori categories/sets/types that are necessary and sufficient for a mind to represent any aspect of reality.

    The 27 core knowledge domains are shown here:

    The domains are arranged according to level of abstraction. So as you look at the knowledge domains on the main page, you can see the whole fractal structure of knowledge across all the levels of abstraction – top-to-bottom and left-to-right. There are only 2 dimensions on the screen – to indicate the 3rd dimension of abstraction I split the domains into groups of 3 using white space.

    Clicking on links takes you through to my wikibooks with A-Z lists of wikipedia articles explaining the central concepts for each domain (on average, each domain has about 100 core concepts).

    The key conjecture I want to get across here is that these 27 categories are NOT inventions. They are *fundamental* a-prior categories of existence itself! They *cleanly* (with 100% precision) carve reality at the joints. This follows from Axiom of Knowledge-Existence Equivalence. The map-territory distinction breaks!

  188. anon Says:

    Some of the math markup isn’t rendering, as of 2017-09-26 at 1:45pm CDT. Each of the items I refer to is demarcated by double dollar signs.

  189. Māris Ozols Says:

    Quite an epic story! You should find an artist who can illustrate it and turn it into a book. It could become a computer science equivalent of “A Brief History of Time”!

    On a different note, when you say “That essay might still get more views than any of the research I’ve done in all the years since!”, it makes me wonder what do you think your legacy will be? I mean, what is the most important thing (by whatever measure) you have done in your career as a scientist? It doesn’t have to be a result or anything like that, I mean it more in the “something for the greater good of the humanity” sense.

  190. Scott Says:

    Maris #189: For god’s sakes, I’m “only” 36! I hope I can still do one or two more things before it’s time for me to ponder my “legacy” (although I wouldn’t bank on it…)

  191. Anders Says:

    Is it easy to show that level-omega BB exists?
    It seems the ability to call any level-k BB (with k a positive integer) could plausibly allow us to create a family of n state halting Turing machines without an upper bound on the number of steps before halting. The finite set argument does not hold if we have infinitely many different operations (oracles) to choose from.

  192. Scott Says:

    Anders #191: I believe that level-ω BB can be constructed in Peano arithmetic or even weaker theories, so yes.

    In more detail, you of course need to define the oracle access mechanism in a suitable way, so that (for example) the integer k such that you want to call the level-k BB oracle gets written on an oracle tape. But if you do, then as usual, there are only finitely many n-state machines, so among those that halt, there’s one that runs for a maximal number of steps.

  193. Scott Says:

    To everyone who pointed out the problem with rendering equations: thanks so much, and I believe the problem is finally fixed!

  194. Lucas Says:

    Couldn’t help but think of this video (jump to 1:15):

  195. Sniffnoy Says:

    Since comments are still open on this post — I was browsing MathOverflow and came across this answer:

    Apparently it’s undecideable in ZFC whether ω_1→(ω_1,ω+2).

    That, uh, seems like serious evidence against my claim that ordinals are on pretty solid groud, huh?

  196. Sniffnoy Says:

    Thinking about it a bit more, maybe it doesn’t — the proposition in question is after all fundamentally a claim about the power set of ω_1 (or rather of (ω_1 choose 2)), rather than about ω_1. So I take it back, ordinals still seem to be on solid ground after all!

  197. Shtetl-Optimized » Blog Archive » The Busy Beaver Frontier Says:

    […] Number? essay that I wrote way back when I was 18, in Quantum Computing Since Democritus, in my public lecture at Festivaletteratura, and in my 2016 paper with Adam Yedidia that showed that the values of all Busy Beaver numbers […]