Striking new Beeping Busy Beaver champion

For the past few days, I was bummed about the sooner-than-expected loss of Steven Weinberg. Even after putting up my post, I spent hours just watching old interviews with Steve on YouTube and reading his old essays for gems of insight that I’d missed. (Someday, I’ll tackle Steve’s celebrated quantum field theory and general relativity textbooks … but that day is not today.)

Looking for something to cheer me up, I was delighted when Shtetl-Optimized reader Nick Drozd reported a significant new discovery in BusyBeaverology—one that, I’m proud to say, was directly inspired by my Busy Beaver survey article from last summer (see here for blog post).

Recall that BB(n), the nth Busy Beaver number (technically, “Busy Beaver shift number”), is defined as the maximum number of steps that an n-state Turing machine, with 1 tape and 2 symbols, can make on an initially all-0 tape before it invokes a Halt transition. Famously, BB(n) is not only uncomputable, it grows faster than any computable function of n—indeed, computing anything that grows as quickly as Busy Beaver is equivalent to solving the halting problem.

As of 2021, here is the extent of human knowledge about concrete values of this function:

  • BB(1) = 1 (trivial)
  • BB(2) = 6 (Lin 1963)
  • BB(3) = 21 (Lin 1963)
  • BB(4) = 107 (Brady 1983)
  • BB(5) ≥ 47,176,870 (Marxen and Buntrock 1990)
  • BB(6) > 7.4 × 1036,534 (Kropitz 2010)
  • BB(7) > 102×10^10^10^18,705,352 (“Wythagoras” 2014)

As you can see, the function is reasonably under control for n≤4, then “achieves liftoff” at n=5.

In my survey, inspired by a suggestion of Harvey Friedman, I defined a variant called Beeping Busy Beaver, or BBB. Define a beeping Turing machine to be a TM that has a single designated state where it emits a “beep.” The beeping number of such a machine M, denoted b(M), is the largest t such that M beeps on step t, or ∞ if there’s no finite maximum. Then BBB(n) is the largest finite value of b(M), among all n-state machines M.

I noted that the BBB function grows uncomputably even given an oracle for the ordinary BB function. In fact, computing anything that grows as quickly as BBB is equivalent to solving any problem in the second level of the arithmetical hierarchy (where the computable functions are in the zeroth level, and the halting problem is in the first level). Which means that pinning down the first few values of BBB should be even more breathtakingly fun than doing the same for BB!

In my survey, I noted the following four concrete results:

  • BBB(1) = 1 = BB(1)
  • BBB(2) = 6 = BB(2)
  • BBB(3) ≥ 55 > 21 = BB(3)
  • BBB(4) ≥ 2,819 > 107 = BB(4)

The first three of these, I managed to get on my own, with the help of a little program I wrote. The fourth one was communicated to me by Nick Drozd even before I finished my survey.

So as of last summer, we knew that BBB coincides with the ordinary Busy Beaver function for n=1 and n=2, then breaks away starting at n=3. We didn’t know how quickly BBB “achieves liftoff.”

But Nick continued plugging away at the problem all year, and he now claims to have resolved the question. More concretely, he claims the following two results:

  • BBB(3) = 55 (via exhaustive enumeration of cases)
  • BBB(4) ≥ 32,779,478 (via a newly-discovered machine)

For more, see Nick’s announcement on the Foundations of Mathematics email list, or his own blog post.

Nick actually writes in terms of yet another Busy Beaver variant, which he calls BLB, or “Blanking Beaver.” He defines BLB(n) to be the maximum finite number of steps that an n-state Turing machine can take before it first “wipes its tape clean”—that is, sets all the tape squares to 0, as they were at the very beginning of the computation, but as they were not at intermediate times. Nick has discovered a 4-state machine that takes 32,779,477 steps to blank out its tape, thereby proving that

  • BLB(4) ≥ 32,779,477.

Nick’s construction, when investigated, turns out to be based on a “Collatz-like” iterative process—exactly like the BB(5) champion and most of the other strong Busy Beaver contenders currently known. A simple modification of his construction yields the lower bound on BBB.

Note that the Blanking Beaver function does not have the same sort of super-uncomputable growth that Beeping Busy Beaver has: it merely grows “normally” uncomputably fast, like the original BB function did. Yet we see that BLB, just like BBB, already “achieves liftoff” by n=4, rather than n=5. So the real lesson here is that 4-state Turing machines can already do fantastically complicated things on blank tapes. It’s just that the usual definitions of the BB function artificially prevent us from seeing that; they hide the uncomputable insanity until we get to 5 states.

99 Responses to “Striking new Beeping Busy Beaver champion”

  1. Bram Cohen Says:

    Those two numbers are suspiciously off by one. Are they the same machine?

  2. Charlie Harrison Says:

    I was confused about how both BBB(n) and BLB(n) were not bounded above by BB(n) until I clicked through to Nick’s announcement and realized none of these machines will halt 🙂

  3. Scott Says:

    Bram #1: I believe the same machine with a trivial modification (designating one of the states as a beep state). If Nick is reading this, hopefully he can explain how the two bounds come to differ by 1.

  4. Nick Drozd Says:

    Bram #1, Scott #3

    It’s the same machine, and it exhibits both behaviors at different times. The tape is blank after 32,779,477 steps, and then it takes one more step and then gets stuck in one state forever. So the number for measuring quasihalting is one more than the one for measuring tape-blanking. (“Quasihalting” is the word I’ve been using instead of “designated-beep-state-stops-beeping”.)

    This is not always the case. The BLB(3) champion quasihalts after 6 steps, but leaves the tape blank after 34 steps. (If anyone is looking for a Turing machine challenge, try writing that program by hand. Hint: unlike other Busy Beaver champions, this one has a clear modular construction.)

    If a program exhibits more than one termination condition, which one is the right one to use to describe its behavior? If this new champion program could talk, how would it describe itself? My feeling is that it would say “I verify that a certain strange and uninteresting Collatz-like function will, when applied repeatedly to 2, eventually reaches 0, at which point the tape will be blank.” The program doesn’t have a plan for what happens after that, or even an understanding that there will be an “after that”. The fact that it quasihalts after wiping the tape is like rigor mortis setting in.

    One reason for thinking this is that the blank tape was how I actually found the thing. Searching for quasihalting programs is unpleasant. The obvious way to do it is to run a list of programs for some number of steps and record the last time each state was hit, and then check the results to see if it looks like any of them are quasihalters. But it’s difficult to distinguish quasihalting programs from programs that just haven’t reached some period again, and this leads to a lot of false positives.

    In contrast, the blank tape can be detected with perfect reliability and for practically free. If you were a program and you wanted to be found, erasing the tape would be a good strategy. That sends out a strong clear signal that could actually be received.

    (You can tell I’ve been at this too long because I keep ascribing thoughts and desires to programs. And not just any programs, but programs that weren’t even written by anybody!)

  5. zuminator Says:

    Don’t be fooled by the lower bound, BBB(4) and BLB(4) likely differ by quite a bit but the ability to calculate the exact values is limited. It’s as if some ancient scientist concluded that the number of stars in the sky was at least 1000 and the number of grains of sand in a handful was also at least 1000, because his abacus only went to 1000. His estimates would have technically not been “wrong”, but in actuality the numbers are neither close to 1000 nor to each other.

  6. TonyK Says:

    Scott, I think you have a misprint: is BLB(4) 32,779,447 or 32,779,477?

  7. Noah Stephens-Davidowitz Says:

    Buzzy Beaver

  8. Scott Says:

    TonyK #6: Duhhhh, thanks!! Fixed.

  9. Scott Says:

    zuminator #5: No, in the case at hand we really, genuinely don’t yet know whether we’re seeing the limits of our abacuses or (close to) the actual limits of what 4-state Turing machines can do.

    It’s like: imagine if, after seeing the 107-step champion for BB(4), someone had sagely said, “don’t be fooled by the lower bound! The real value of BB(4) is surely incomprehensibly greater than that.” But then no, Allan Brady proved in 1983 that BB(4) simply is 107. Likewise, Nick tells me that he’s enumerated cases enough to confirm that BBB(3) simply is 55, the “naïve” lower bound that I managed to get with QBasic.

    As for BLB vs. BBB, we know on abstract grounds that they eventually need to pull away from each other, but there’s no good reason why they need to do so by n=4—again, they might or might not. Unlike most of us, Nick is actually putting in the work to find out! 🙂

  10. fred Says:

    Scott, what do you think of the notion of “computational irreducibility” that S. Wolfram is always throwing around

    I guess you’ll probably say that it’s really obvious 😛

    But I think Wolfram’s real claim is that most of the interesting processes (i.e. the emergence of complexity) happening in our physical world are actually like the BB, there’s no simpler algorithm that can bypass running the actual thing, step by step.
    I do wonder: for a given size of automata/programs, what proportion do have that property. But that’s probably impossible to answer as well, it’s just like asking “what strings of 1 and 0 are random?”.

  11. Scott Says:

    fred #10: Well yes, it is pretty obvious that for a huge number of physical systems, even if we know the dynamical laws and the initial state to suitable precision (which usually we don’t), there might be no computational shortcut to predicting the system’s behavior, other than just recapitulating its time evolution. A limited amount is gained by giving that observation a name and wrongly claiming to be the first to discover it. 😀

    Even that, though, is totally different from saying that most physical systems are “like Busy Beaver computations.” I don’t think that’s true at all. Busy Beavers are painstakingly optimized for doing the greatest possible number of things with the least possible number of states. They’re wildly atypical among Turing machines, or among physical processes more generally.

    Indeed, if you choose an n-state Turing machine at random, the probability that it will exhibit “BB(n)-like behavior” is negligibly small, most likely 1/exp(n) for some reasonable formalization of what we mean by “BB(n)-like behavior” (although I don’t know how to prove such a statement). That’s part of what makes these machines so interesting!

  12. Luke W Says:

    I have a question about the original Yedida-Aaronson BB(7918) eludes ZFC+SRP paper. I’ve never quite understood what makes the argument work. It goes:

    Statement 1: There must be some k such that BB(k) is unknowable in any axiomatic system A.

    Proof: Assume S1 is false. Therefore, all BB(n) are knowable. There also exists a Turing Machine M such that M halts iff there is a contradiction in A. This machine M has some finite number of states k. But A can’t determine the value of BB(k). If it could, A would need to prove M ran forever, which would prove the consistency of A, which violates Godel’s Theorem. QED, S1 is true.

    Why does A determining the value of BB(k) imply A would need to prove M ran forever?

  13. Scott Says:

    Luke W #12: A needs to prove that M runs forever because, if M halted after some number of steps that was much larger than the supposed Busy Beaver champion, then that champion wouldn’t be the champion—M would beat it. In other words, proving the value of BB(n), by its very nature, entails proving that every non-halting n-state Turing machine actually doesn’t halt. That includes any machines that enumerate the theorems of some axiomatic system, halting iff they find a contradiction.

    Incidentally, one should be careful about the quantifiers: the right statement is that for every axiomatic system A, there exists a k such that A can’t prove the value of BB(k). Note on the other hand that for every k, there exists an A such that A does prove the value of BB(k).

  14. Alex Says:

    10^(2×10^10^10^18,705,352) is a very funny-looking lower bound, specifically because of the factor of 2 in the exponent. 10^10^10^10^18,705,352.0001 is far larger than that, so if the bound had been given as 10^10^10^10^18,705,352, I would assume that of course that’s a loose lower bound that almost certainly could be refined to something far larger than 10^(2×10^10^10^18,705,352) if you bothered to take more space to express it. This makes me curious where that 2 came from, and if it’s a typo.

  15. Scott Says:

    Alex #14: No, it’s not a typo, that’s just how Wythagoras stated the bound, surely as a byproduct of how it was obtained. I thought of omitting the “2×” when I quoted the bound, since it’s so comically irrelevant for numbers of that magnitude, but on further reflection decided just to leave it as it was.

  16. LK2 Says:

    ABSOLUTELY fascinating. Even from my point of view which is that of a physicist working on “fundamental” physics. I feel the need to write in C(++) something similar to what you did with Qbasic (wonderful language if my youth..).

  17. fred Says:

    Scott #11

    Yea, sorry, I think irreducibility is indeed a much more general (and trivial) property than BB-likeness.

    I think the conjecture is that most processes leading to the rise of complexity (in nature) just aren’t irreducible in the sense that there’s no shortcut to compute them, i.e. one has to exert the same effort as nature in order to simulate them, basically a step by step process (precision limited by the dt, often in a chaotic manner).

    E.g. even if Newtonian equations of motion and gravity look like a potent “shortcut” describing how nature works, in practice there’s hardly any win except for the most trivial of cases or with very limiting simplifications. In other cases, like for the already very simple three body problem, we pretty much have to do a numerical simulation (step by step). But maybe there’s a clear theoretical understanding on the limitations of applicability of analytical solutions vs numerical solutions.

    Another case I have in mind is for Mandelbrot. In practice noone’s found a way to bypass brute force iteration (without doing approximations), which gets more costly the deeper we go (the more we zoom in), because we have to do math on larger and larger digits. That’s suprising because, given that Mandelbrot exhibits so much self-similarity, one could have hoped that the computational cost could be made independent of zoom level.

  18. Nick Drozd Says:

    It’s a great honor to be featured on this blog. Naturally I will take the opportunity to lay down some constructivist ideology.

    Like many readers, I was first exposed to the Busy Beaver function from Scott’s essay “Who Can Name the Bigger Number?”. The rules to the Bigger Number Game say this:

    Using standard math notation, English words, or both, name a single whole number… Be precise enough for any reasonable modern mathematician to determine exactly what number you’ve named…

    A number must be “named”. Here’s a name for a number: “7”. “7” designates a particular number. What about “2 + 5”? Is that a “name”? Probably, since it also designates a particular number. It might seem like less of a name, since it takes a little work to get from “2 + 5” to 7, but it’s clear how to get there.

    In the same way, “Ackermann(100)” designates some particular number. It would be a comical understatement to say that it takes a little work to get from that name to number that it designates, but some particular number is designated nonetheless. There is a straight line going directly from “Ackermann(100)” to some number, and in principle that number could be obtained.

    What about “BB(100)”? Is that a name? I’m not so sure. Certainly it’s a description of some number, but there is no way to get from the description to the number. All that can be said is that our best theories predict the existence of a number matching that description.

    Nevertheless, halting programs are recursively enumerable, and so in principle the witnessing program for BB(100) could eventually be obtained, and with it the actual number that is BB(100). It would certainly be impossible to prove that it really was BB(100), but it’s within the realm of conceivability that it could be gotten.

    But now think about BBB(100). Halting programs are RE, but quasihalting programs are not: there are programs that quasihalt but cannot be proved to quasihalt. That means that you could be holding the BBB(100) champion program right in your hand and not even know it. We really, really don’t know much about these quasihalting programs. The one discussed in this post is not too hard to understand, but it is entirely possible (maybe even likely) that the BBB(5) champion already cannot be proved to quasihalt. How can a number be named if we don’t know anything about it at all?

    So you say, naming numbers is easy — just make one up! Our best theories predict the existence of a number matching the description “BBB(100)”, so call that number “X”. Now it has a name! But this just means that some number or other has been associated with some made-up name. That number still hasn’t been meaningfully identified, in the sense of “naming names”.

    All this is to say that “BBB(100)” may not qualify for the Bigger Number Game. No reasonable modern mathematician could determine exactly what number that is even in principle. We don’t know shit about BBB(100), and couldn’t even begin to imagine knowing anything about it.

  19. fred Says:

    Scott, sorry for being off-topic…
    the other day, after you posted again on QC hype, I just saw an actual IBM ad for their quantum computer:

    and I couldn’t help noticing how similar it is to the machine in Devs (a series you enjoyed)

    What the hell?!

  20. Scott Says:

    fred #17: It seems to me that almost every single achievement of science over the centuries—from predicting eclipses and hurricanes to landing people on the Moon to inventing lasers and transistors and vaccines for smallpox and covid—provides another counterexample to Wolfram’s “principle of computational irreducibility.” I.e., in every one of these cases, people did manage to predict (or even control) a complex dynamical system well enough to do something useful, rather than just helplessly watching the thing unfold.

    You, or Wolfram, could reply that this is all as nothing compared to the vast majority of complex systems that we still can’t predict or control—but personally, I’m much more impressed by the doing of all these things than by the downplaying of them! 🙂

  21. Scott Says:

    Nick #18: Of course I disagree with you about constructivism … but we’ve already had this philosophical debate several times.

    For now, let me make a much more down-to-earth point: descriptions of a number can be more or less useful, depending on which questions about the number they let you easily answer.

    To take an example, writing “BBB(1000)” doesn’t let us say—at least, without a heroic effort that no one on this planet knows how to perform—whether we’ve named an odd or even number, prime or composite, etc. For those purposes, the description is of little use.

    On the other hand, it would be easy to prove, if we needed to, that BBB(1000) exceeds Ackermann(1000) and BB(1000), probably even that it exceeds BB(BB(1000)). Which tells us that “BBB(1000)” can be a perfectly serviceable description for the purpose of a bigger-number contest—even if not for other purposes!

    And besides, we don’t actually know that the oddness, evenness, primality, etc. of BB(1000) or BBB(1000) will be forever unknown to us. Maybe we’re just missing a mathematical shortcut that would settle such questions—if so, would those numbers seem more “real” even to you?

    For comparison, consider the following “definition”:

    “The largest integer n such that xn+yn=zn has positive integer solutions.”

    Someone could equally well argue that this doesn’t name a definite number, since no effective criterion is given, nor even a proof that the number exists at all. And yet, thanks to Andrew Wiles, we know that it does name a definite number, and the number is 2. So, did this suddenly become an OK definition the day Fermat’s Last Theorem was proved?

  22. fred Says:

    Nick #18

    Does this require a better refinement of the concept of uncomputability?

    Take a computation that’s irreducible and may require having to sit through a potentially unbounded number of steps in order to fully qualify it.
    This is much much weaker than what you describe, right?

    E.g. the Mandelbrot set definition is this way: to figure if a point belongs to it, potentially you’d have to do an unbounded number of steps. But I don’t think there’s any metric associated with Mandelbrot which is as uncomputable as this BBB(100). For example, the area of the Mandebrot set can always be bounded to some arbitrary precision (it just requires more and more steps), but in the case of BBB(100), there isn’t even a known procedure to get us started?

  23. Scott Says:

    fred #19: The machine in Devs was modeled on actual superconducting QCs, and they all look pretty similar to each other, so no surprises there 🙂

  24. fred Says:

    Scott #20

    I don’t think Wolfram would disagree with you there – otherwise he wouldn’t have become so successful from developing Mathematica! 😛

    But that’s not to say that more and more tools that rely on doing step by step simulations of very complex systems are being developed and relied on. E.g. predicting the weather, predicting how two galaxies collide, automated proof solving, montecarlo simulations, etc.

  25. Bruce Smith Says:

    Scott #11:

    … Indeed, if you choose an n-state Turing machine at random, the probability that it will exhibit “BB(n)-like behavior” is negligibly small, most likely 1/exp(n) for some reasonable formalization of what we mean by “BB(n)-like behavior” (although I don’t know how to prove such a statement). That’s part of what makes these machines so interesting!

    That sounds likely to me, but there is also a theorem in the other direction, which says that any reasonable formalization of “BB(n)-like behavior” can’t be too rare, either — otherwise that fact (plus an Introspective Encoder) could be used to compress a BB machine, giving a contradiction. I find this somewhat counterintuitive, thus interesting.

    It goes like this:

    Suppose A(M) is an algorithm from machine descriptions (which are promised to halt) to booleans, true when M is a BB machine, and not true too often. I’d say that is sort of the minimum requirement for calling A “a formalization of BB(n)-like behavior”. (You might even argue that such an A should be computable for all M, not just for promised-to-halt M, but the following theorem doesn’t require that.)

    Then suppose (given n) there are exactly k halting machines M which satisfy A(M).

    Given k, you can compute a BB(n) machine, like this: enumerate halting machines of size n, in order of runtime, until you find the kth such machine, then output it.

    So if k is small enough, you can use k and an IE to make a machine which finds and then simulates a BB(n) machine — contradiction. (If it finds it as described above, it doesn’t even have to simulate it, since it’s already run longer than each machine it found!)

    The fact that this works even if A has to be promised that M halts makes it especially interesting, to me — it means A is allowed to examine the output tape of M, or any other property of M’s complete history, without needing any estimate of M’s runtime in advance. And it *still* can’t use an excessively rare property of all machines, to filter out BB-candidates from all machines.

  26. Bruce Smith Says:

    Another thing I like about that, is that any improvement in IE encoding efficiency strengthens its result. So a mere “technical improvement” has a consequence for that seemingly more fundamental theorem!

  27. Vitor Says:

    Scott #11:

    Assuming there’s only two halting states, isn’t it already vanishingly unlikely that a random n-state Turing machine halts at all (let alone being BB-like)?

    My intuition for this is that there are very simple local constructions (e.g. 2 states pointing to each other), that immediately cause non-halting if they’re ever reached. It doesn’t matter how many fancy, beautiful patterns the TM draws on the tape beforehand, if it hits one of those “trap” constructions, it’s relegated to the non-halting category.

    Now, consider an n-state TM in some reasonable random-graph-like model. As n increases, the expected number of traps is likely to increase pretty fast: the probability of a collection of k states being a trap is some constant independent of n, and the number of subsets of k states is n choose k ~ Omega(n^k). Thus the expected number of traps grows faster than n^k for any k, i.e., superpolynomial.

    Now of course these traps can overlap each other, and I haven’t dealt with going from expectations to an actual distribution, etc, but it seems to me that the basic intuition should hold.

  28. Bruce Smith Says:

    (To clarify my #25: in “until you find the kth such machine”, I meant the kth halting machine M which satisfies A(M). The algorithm is to simulate all machines in parallel, and apply A to each one that halts. Note that this algorithm halts, since by assumption k is equal to (thus no larger than) the number of halting machines that satisfy A.)

  29. Bruce Smith Says:

    Vitor #27: the probability that k states form a trap does depend on n, since the chance that each transition among them points to the correct state to help form the trap is 1/n.

  30. gentzen Says:

    Nick Drozd #18: Scott is right that this is a philosophical debate. So instead of right or wrong, the question is more which ontological commitments are implied by different positions. I would say that your current position corresponds to accepting computation in the limit and its ontological commitments. My impression is that Scott’s current position corresponds to at least accepting the arithmetical hierarchy and its ontological commitments (i.e. that every arithmetical sentence is either true or false).

    If Scott would claim that accepting ontological commitements sufficient for computation in the limit would imply to also accepting ontological commitements sufficient for the arithmetical hierarchy, then I would say he is actually wrong. But he does not claim this.

    Scott wrote:

    To take an example, writing “BBB(1000)” doesn’t let us say—at least, without a heroic effort that no one on this planet knows how to perform—whether we’ve named an odd or even number, prime or composite, etc. For those purposes, the description is of little use.

    I really appreciate these references to concrete properties that might be unknowable for BBB(1000). To be more precise, those might be unknowable with respect to computation in the limit, they are still either true or false with respect to the arithmetical hierarchy. The reason is that there are concrete arithmetical sentences for each of those properties expressing whether BBB(1000) satisfies it.

    I recently chatted with vzn about a construction making the enormous ontological commitments implied by accepting the arithmetical hierarchy more concrete:

    Assume D-Wine now has a new internet offering, which they call “book of arithmetic”. It actually seems quite cheap, only 1 cent per gigabyte. And apparently the information is highly compressed, using state of the art algorithmic information theory. Their sales representative said that it would enable you to decide arbitrary arithmetic sentences, if you had sufficient computational resources to decode the compression.

    The crucial point is that deciding arithmetical sentences at the second level of the hierarchy would require BB(lengthOfSentence) many of those bits (provided as an internet offering), so even at “1 cent per gigabyte” nobody can afford it. (And for the third level you would even need BBB(lengthOfSentence) many bits.)

    Now you might think that I reject those ontological commitments. But actually I would love to go beyond to hyperarithmetic relations and have similarly concrete constructions for it as we have for computation in the limit and the arithmetical hierarchy.

  31. Scott Says:

    Vitor #27: In the usual definition of the BB function, we don’t have halt states but halt transitions, which could wildly change the typical behavior if we look at random TMs—but these are all fairly uninteresting encoding questions.

    A better way to state my conjecture would be as follows:

    Among all halting n-state Turing machines, at most a 1/exp(n) fraction of them run for at least, say, log(BB(n)) steps.

    Importantly, I don’t see that this conjecture is ruled out by the very interesting observation of Bruce Smith. For I’ve given a criterion for “being BB(n)-like” that requires knowledge of BB(n) to apply—but specifying BB(n), of course, requires about as many bits as it takes to specify an n-state Turing machine.

  32. Nick Drozd Says:

    gentzen #30

    There’s philosophy involved, but really this is a legal debate dealing with the rules of the Bigger Number Game. Those rules plainly state that a number must be “named”. Maybe this corresponds to some class like computable functions or arithmetic functions, or maybe there is some other property that cuts across. I think it’s an idea worth taking seriously, even if it leads into Frege territory.

  33. Vitor Says:

    Bruce #29: yes, brainfart on my part. I think my general point stands though: bad things can happen locally, and thus the density of bad things should increase a lot as n increases.

    Scott #31: Would you also accept a stronger version of the conjecture along the lines of:

    “Among all halting n-state Turing machines, at most a 1/exp(n) fraction of them run for more than BB(n-1) steps”?

    Intuitively, there should be a lot of such machines (as you can add a state in many different places without changing a machine’s semantics). OTOH, if a random machine “wastes” even a single state without doing useful computations, that immediately excludes it from the BB(n)-like class. It should be possible to prove that this happens more and more often as n increases.

  34. fred Says:

    So, for a given n, there’s a Turing machine that prints the biggest finite number of 1s. And those are Busy Beavers.

    But, for a given n (and n being large enough), we must start to see Turing machines that never halt but print an unending string of 1s that has no repeating patterns, i.e. machines that output irrational numbers, like the binary sequence of pi (as opposed to machines that don’t halt but print an infinite amount of the same repeating pattern, i.e. rational numbers)?
    If that’s the case, such machines can’t be qualified, right? Because they never halt. Unless we prove that a particular one happens to implement a known algorithm for generating a known irrational number (like e or pi).
    Are such machines as rare as Busy Beavers?

  35. fred Says:

    Above I meant “an unending string of 0s and 1s that has no repeating patterns”

  36. Nick Drozd Says:

    It can be proved that the classic Busy Beaver function grows monotonically in the number of states. For example, the survey showed that BB(n + 1) >= BB(n) + 3. This and other similar results all exploit the fact that Busy Beaver programs have an explicit halt state, and this halt state can be used in a modular fashion to hook in some well-behaved extra logic.

    In contrast, there is no obvious way to prove anything similar for Beeping Busy Beaver or Blanking Beaver. These programs don’t have a halt state, and there’s apparently no way to tell statically where they “end”. Probably there is some nonconstructive way to prove that BBB(n + 1) > BBB(n), but what about for BLB?

    Open Problem: Prove that BLB(n + 1) > BLB(n).

    This is obviously true, even if it isn’t provable. It seems related to Conjecture 13 from the survey, that for sufficiently large n, BB(n + 1) > 2 ^ BB(n). Like Scott said: “But try proving it!”

  37. Scott Says:

    fred #34: No, programs that print irrational numbers shouldn’t be anywhere near as rare as Busy Beavers, since a constant-sized prefix is enough to print an irrational number, whereas a Busy Beaver needs all n states.

  38. Scott Says:

    Vitor #33: Absolutely, I endorse your stronger conjecture as well.

  39. fred Says:

    (sorry for the multiple posts)

    I was wondering about the different types of Turing machines for a given n:

    1) Machines that halt eventually. I.e. they print some natural numbers (if we read the output based on some convention, like starting at the leftmost 1 on the tape). BBs belong here.

    2) Machines that never halt, and:
    21) Eventually drift to one preferred direction, and print repeating strings, i.e. print rational numbers.
    22) Eventually drift to one preferred direction, and print non-repeating strings, i.e. print irrational numbers.
    23) Never drift to one preferred direction (so keep going back left and right, always overwriting what they previously wrote):
    231) Some of those must eventually always settle into some repeating loop (i.e. go back to some prior state of the machine + tape), i.e. they only reach to some finite distance on either direction of the tape (if the reach is bounded, then the number of possible states is finite). They’re like stable oscillators.
    232) While some machines can keep going further and further on either side of the tape, generating an infinitely changing unbounded output pattern (like machines that output an infinite sequence of natural numbers). Those are like unstable oscillators.

  40. fred Says:

    Vitor #33

    “Intuitively, there should be a lot of such machines (as you can add a state in many different places without changing a machine’s semantics). OTOH, if a random machine “wastes” even a single state without doing useful computations, that immediately excludes it from the BB(n)-like class. It should be possible to prove that this happens more and more often as n increases.”

    I’m wondering.
    BB(n) considers the final number of 1s (when the machine halted), right?
    So can two separate machines generate BB(n), but one is just running for more steps than the other, i.e. at some point its tape had more 1s than BB(n), but then it erased those extra 1s before halting?

    Also, if BB(n) is the machine printing the largest number of 1s. What if we start considering the machine that prints the second largest number of 1s? And then the machine that prints the third largest number of 1s, etc… until we get to outputting just a single 1.
    We know that for n, there exists (4n +4)^2n machines, which is a much smaller number than BB(n), so the size of the sequence of such machines must be quite sparse. Maybe that 1/exp(n) from Scott.

  41. Will Sawin Says:

    I think we can prove Vitor #33’s conjecture by only considering the graph of state transitions. Each state transitions to 2 other states. If these are chosen uniformly at random, what’s the probability that every state is transitioned to at least once?

    The number of possible graphs is (2n)! times the coefficient of x^(2n) in (e^x-1)^n, which plugging in x=2 is at most (2n!) (e^2-1)^n / 2^(2n) which is around (2n)^(2n) (e^2-1)^n/ (2e)^(2n) which is indeed exponentially smaller than n^(2n).

  42. Scott Says:

    Will Sawin #41: Nice argument! Note that, even once we condition on the TM halting, your argument should still work, since if we treat “Halt” as just another transition to be randomly chosen, then clearly a Ω(1/n) fraction of n-state TMs halt. (Indeed, I expect that a Ω(1) fraction halt, although I don’t see how to prove that off the top of my head.)

    Of course we could next ask: among all halting TMs that have strongly connected transition graphs, what fraction run for at least BB(n-1) steps? I conjecture that the answer is still 1/exp(n)…

  43. Yaqub Says:

    How does the quantum phase algorithm help you solve factorization? I heard that the quantum phase algorithm is an alternative formulation to Shor’s algorithm in solving factorization.

  44. Bruce Smith Says:

    Regarding these conjectures:

    Scott #31:

    Among all halting n-state Turing machines, at most a 1/exp(n) fraction of them run for at least, say, log(BB(n)) steps.

    Vitor #33:

    Among all halting n-state Turing machines, at most a 1/exp(n) fraction of them run for more than BB(n-1) steps.

    First I need to point out that #33 is only *conjecturally* stronger than #31, since no one has proved BB(n+1) – BB(n) > 3 for more than about half of all n.

    Next I want to give what I think is a proof of #31, but for which the same argument does not (I think) also prove #33. (As I wrapped up writing this proof, I noticed the later likely proof of #33 by Will Sawin #41, but I want to post this before taking the time to study that, since my initial guess is that it would affect all such conjectures in a similar manner.)

    So — suppose n-state machines which run for at least log(BB(n)) states are more common than 1/exp(n) of all halting n-state machines. Halting n-state machines can be encoded in about n log n bits (see the BB survey and its references; the true number of bits is more like 6n + n log n, but I think we can ignore that difference here). So the exact number of Halting n-state machines (call it H(n)) can also be encoded in at most that many bits.

    Fixing n, approximate H(n) as H'(n) by setting the last half of its bits to 0. This lower bound on H(n) can be encoded into an IE (Introspective Encoder) of far fewer than n states, leaving room (in an n-state machine M we’ll construct here) for some constant number of states containing “code”. We’ll have that code run all n-state machines in parallel, to look for halting machines (in order of runtime), and count off H'(n) of those (this terminates, since H'(n) ≤ H(n)). By assumption, the last of those halting machines must run for more than log(BB(n)) states. This lets M compute a number larger than BB(n), and then run that long — contradiction.

    (Clearly this works for any computable function in place of log, and can be quantitatively tightened as well.)

  45. Bruce Smith Says:

    Nick Drozd #36: These are fascinating conjectures and a compelling Open Problem, on which I can’t make any headway in my initial attempts. (I do think we could easily show BLB(n+2) &gt BLB(n), which you probably considered too obvious to point out.)

  46. Bruce Smith Says:

    (Oops, I think we have all been forgetting to add “for sufficiently large n” to all these conjectures.)

  47. Nick Drozd Says:

    Bruce Smith #45

    Please share any proofs you have, because currently there is nothing.

    Thinking it over a little more, BBB(n + 1) > BBB(n) for sufficiently large n should follow from the monotonicity of BB and the fact that BBB dominates BB. Is that right?

    If that’s right, then would the same argument work for BLB, assuming that BLB dominates BB? This assumption again is obviously true but difficult to prove.

  48. fred Says:

    “Nick’s construction, when investigated, turns out to be based on a “Collatz-like” iterative process”

    I’ve always wondered if there’s a deep connection between BB and Collatz.

    Some great new video on Collatz:

  49. Vitor Says:

    Bruce #46: yes, I was assuming these were asymptotic statements. The reason conjecture #33 is stronger than #31 is because BB grows asymptotically faster than any computable function (in particular faster than the tetration 2^^n implied by #31)

  50. Tobias Maassen Says:

    It should not be suprising if Turing Machines without halting state are as complex as halting machines with one more state.

    After all the one halting state is the least complex state imaginable. It occurs at most once, it has no outgoing transitions, and even I perfectly understand it.
    If you remove it, you have a TM with one less state that does exactly the same steps before continuing on.

    This easily explains differing liftoffs.

  51. Scott Says:

    Tobias #50: You’re absolutely right that the beeping and blanking machines don’t need to “waste” one of their precious transitions on halting, and that that’s a huge part of why BBB(4) and BLB(4) are able to be so much larger than BB(4). Again, though, the way the BB game sets things up, halting is not a state, it’s just a transition from a state. So I think it is at least slightly surprising that increasing the number of non-halt transitions from 7 to 8 is enough to unlock such an explosive growth in runtime.

  52. Niels Says:

    Scott #51: Maybe it’s interesting to consider a variant of BB which we might call bb. It is defined exactly as BB but as a function of the number of non-halt transitions rather than the number of states:

    bb(n) is the maximum runtime of a halting Turing Machine with binary alphabet and n non-halt transitions.

    It’s easy to see that bb(n-1) ≤ BB(n) ≤ bb(2n-1). I could verify so far:

    bb(0) = 1
    bb(1) = 2
    bb(2) = 4
    bb(3) = 6
    bb(4) ≥ 17
    bb(5) ≥ 21
    bb(6) ≥ 38
    bb(7) ≥ 107
    bb(8) ≥ 583
    Per Marxen and Buntrock’s BB(5) bound we have:
    bb(9) ≥ 47176870
    I’m trying to learn more about bb(8).

  53. Useless Physicist Says:

    Thinking about whether “most turning machine like Busy Beavers” got me curious about the following

    Define the Mediocre Beaver function \(MB(n)\) as the smallest number such that at least 50% of halting (essentially different) n-state Turing machines halt in \(\le MB(n)\) steps

    There are different versions of the Mediocre Beaver function depending how we count Turing machines that are “essentially the same” as define in “The Busy Beaver Frontier”. I have mostly though about the version where we count essentially different Turing machines but everything in this comment probably applies to both mutatis mutandis.

    Can MB(n) be dominated by a compatible function?

    I’m a useless physicist so after much thought couldn’t answer this. Also I’m not sure I have any intuition either way. But if so it would be fascinating to have an explicit example.

    I made some progress on the a similar question

    Is MB(n) computable?

    I suspect the answer is no but I couldn’t quite prove it.
    My general strategy was like this:
    If we can compute MB(n) then we can simulate all n-state Turing machines and count how many essentially different Turing machines halt in MB(n) or fewer, steps call this N. Now the obvious thing to do is to try and compute BB(n) by simulating n-state Turing machines until 2N of them halt. The problem with that is that multiple essentially different Turing machines might take MB(n) steps to halt and so more the 50% of nstate Turing machines half in N or fewer steps. And so 2N might over estimate the number of essentially different Turing machines that halt.

    Let DH(n) be the number of essentially different Turing machines that halt. If 2N only over estimated DH(n) by a “small” amount (say \(2N-DH(n)=O(n^k)\) for some k) then I think this kind of argument could be saved. But I’m not sure how to show that or even if it is reasonable.

  54. Nick Drozd Says:

    Tobias Maassen #50

    That explains why this one takes off earlier, but it doesn’t explain why it takes off so much earlier.

    The first few of these programs that turned up quasihalted between 2,000 and 3,000 steps. That wasn’t exactly shocking, but it was like, alright alright, not bad, that extra transition really gives it some juice. Then a few months later I found one that quasihalted in 66,349 steps. That number was already pretty shocking, and I wasn’t expecting to find something like that.

    But still, it wasn’t even close to 47,176,870. Why not? I reasoned: a program with n + 1 states doesn’t just have more transitions than a program with n states, it also has more expressive transitions. The 5-state transitions have more choices available to them than their 4-state counterparts, and that lets them do more.

    Imagine you worked at a software company where arbitrary goto statements were permitted, but only to four specific targets. Now imagine that company policy was changed to allow jumps to five specific targets. The Busy Beaver function tells us that pandemonium would ensue.

    Well, the Beeping Busy Beaver function apparently tells us that the company policy of four jump targets would already cause pandemonium, and that company policy should therefore cap the jump targets at three. That’s what’s new here. Is there an explanation for why it should be this early? (Other than, it has to be somewhere, and here it is.)

  55. Nick Drozd Says:

    On the one hand we have BB, which has a computable termination predicate (halting) and uncomputable values; on the other hand we have BBB, which has an undecidable termination predicate (quasihalting) and super-uncomputable values. Every halting program halts provably, but only some quasihalting programs quasihalt provably.

    So how about this for a sequence? Computable Beeping Busy Beaver (CBBB): the longest a program can run (started on the blank tape) before provably quasihalting. CBBB puts an upper bound on the values of BBB that we could ever in our wildest fever dreams possibly hope to obtain.

    What is the complexity of this sequence? Delta_2?

  56. Lou Scheffer Says:

    Scott #20: You state “It seems to me that almost every single achievement of science over the centuries—from predicting eclipses and hurricanes to landing people on the Moon to inventing lasers and transistors and vaccines for smallpox and covid—provides another counterexample to Wolfram’s “principle of computational irreducibility.” I.e., in every one of these cases, people did manage to predict (or even control) a complex dynamical system well enough to do something useful”
    This is true, but I think this is a symptom of looking under the lamp post because that’s where the light is. These simplifications work, not because many things are simplifiable, but because our limited mental machinery can only deal with things that are simplifiable, and then we keep those where the simplifications work.

    I think this is true in many fields. Only a infinitesimal fraction of numbers are integers, or rational, or algebraic. But almost all the numbers we use are, since these are the ones we can think about. Almost all functions are exponentially hard to compute, but we only understand the ones that are O(N^some-low-power). Many of these do something useful, but they are not representative.

    This leads to the only argument I’ve seen that argues why it is possible that P=NP. The analogy is polynomials with integer solutions. For small number of terms, and small powers, we understand how these work, and intuitively expect that higher powers are more complicated, but fundamentally similar. But as the solution to Hilbert’s 10th problem shows, starting at some higher power (I think the proof used 13th powers), solving some of these equations is equivalent to solving the halting problem. So in some sense 13th power polynomials can express an entirely different class of problems compare to lower order polynomials. The argument that P might equal NP then holds that something similar might happen with algorithms. We have lots of understanding and tools for low order algorithms, such as O(N^2) or O(N^3). But perhaps, similar to polynomials, there is some much larger (but finite) exponent where the system becomes capable of addressing harder problems. But we can’t see this since our experience and intuition is all in easier-to-understand cases.

  57. Bruce Smith Says:

    Nick Drozd #47:

    To prove BLB(n+2) > BLB(n), use the proof in the survey of BB(n+2) ≥ (n+2)/n BB(n), which works by picking one state (the most popular) and introducing a new 2-state 2-step time waster before it (move left, move right).

    In this case we can’t necessarily pick the most popular state, since it might be the starting state, and that would mean leaving the tape blank for a couple steps at the start — not allowed (if I understand BLB correctly). So we pick any other state which is transitioned to at least once when the tape is not yet blank again. Thus we weaken the conclusion (probably not as much as what I stated, but I’m too lazy to figure out how much stronger it can still be).

    Maybe we actually *could* pick the starting state, provided the starting state itself was not changed — only the explicit transitions to the starting state would go to the new time-waster instead. Maybe we could rescue the original conclusion that way? I’m too lazy to check that, too (perhaps because it’s late).

    I don’t agree with this:

    BBB(n + 1) > BBB(n) for sufficiently large n should follow from the monotonicity of BB and the fact that BBB dominates BB.

    For example, define X(n) such that X(2n) = BB(2n) + 1 and X(2n-1) = X(2n). Then X dominates BB but X does not always grow.

  58. Bruce Smith Says:

    Vitor #49:

    I don’t follow your argument — I don’t see how it will cover *every* sufficiently high n, not just infinitely many n, given that BB’s growth might come in “spurts”.

    On the other hand, I now see that my specific worry (what happens if for infinitely many n, BB(n-1) is only 3 less than BB(n)) makes things *easier* for your conjecture #33, not harder as I had for some reason been assuming. Therefore, it’s now plausible to me that if I thought about it a bit more, I’d agree with you that it’s clearly stronger. Still, if you can turn your argument into a proof (of #31 from #33, interpreting both as applying to “every sufficiently high n”), then I’d understand much better why it’s stronger.

  59. Sniffnoy Says:

    Nick Drozd #55: The problem with CBBB is that it’s substantially more artificial because the definition has to include what axioms you’re allowing proofs from… like CBBB_ZFC would have to exceed CBBB_PA, right? I think that “C” is misleading given that you really mean “provable” and not “computable”. Things that defined just in terms of computability don’t need a theory specified!

  60. venky Says:

    Scott #13. Given A, the minimum k for which BB(k) is uncomputable <= size of M. Can we say anything more about the minimum k?

  61. Niels Says:

    As has been discussed, BLB(4) >> BB(4) can partly be explained by the extra non-halt transition of the BLB(4)-champion. But maybe something else is going on and blanking is per se prone to produce more outstanding champions than halting? To find out, I’d suggest to compare blanking and halting champions that have the same number of non-halt transitions. What is, to begin with, the best blanking machine that has 7 non-halt transitions like the BB(4)-champion? The best one I could find so far is

    1RB H 0RC 0LA 1LC 1LD 0RB 0RD

    It blanks the tape at step 169. That’s more than BB(4)=107 but not much. Does anyone know a better one?

  62. Nick Drozd Says:

    Bruce Smith #57

    I worked it out explicitly. Here’s a 6-state program that reaches the blank tape after 65,538,549 steps:

    • 1RB 1LE 1RD 1RB 0RD 0RE 1LD 1LA 0RF 1RF 0LC 1LC

    That’s truly a piddling result in the grand scheme of things, but there it is. I think it quasihalts too, so it’s a candidate for BBB.

    Something to note about the two-extra-dummy-state construction is that it foils basic attempts at runtime acceleration. I think it was Marxen and Buntrock who first observed that if the machine is approaching a block of squares of the same color and it will remain in the same state, then it can just skip ahead to the end of the block (possibly changing it). I have a simulator that can run the BLB(4) champion in half a second on my crappy laptop. But even though it only takes about three times as many steps, the 6-state version with the extra dummy states runs for fifteen seconds! A “sufficiently smart” simulator could figure that out, and my simulator isn’t that.

  63. Bruce Smith Says:

    Nick Drozd #47:

    Now I read your posts, and Shawn’s analysis — very interesting, as Scott promised. You asked for proofs whenever we have them, so here is another one (stated and proved informally, but I think it’s reliable and could be easily formalized). This generalizes a theorem from Scott’s survey.

    Definition: Let IEO(n) be the “introspective encoder overhead” needed to encode any n-state TM (that is, an IE which encodes it needs at most n + IEO(n) states — see the survey for an upper bound on IEO(n)). Technical warning: some encoding methods assume the TM halts; we can’t make that assumption here, but this doesn’t change the value by very much.

    Theorem: Let X be either BB or BLB. Let Y be either BB or BLB. Let f be any computable function. Then there is a constant C_f such that

    X(n + IEO(n) + C_f) ≥ f( Y(n) )

    for all n.

    Proof: exactly like the proof in the survey for the case where X = Y = BB.

    This works because:

    – while simulating a TM, you can easily detect the “blank tape” halting condition needed for BLB (this allows Y to be BLB);

    – any “programmed TM” (such as a simulator) can easily arrange to keep its tape non-blank while it “runs” and to blank it on purpose when it’s “done” (just as it can arrange to not halt while it runs and then halt when it’s done) (this lets X be BLB).

    To reiterate the proof from the survey, the X-machine demonstrating the lower bound just encodes and simulates the Y-machine, detects the appropriate halt condition, measures the Y-machine’s runtime r to that point, computes f(r), runs for that long, then blanks its tape and halts (thus showing off by achieving both “halt conditions” at once, though just the appropriate one by itself would have been sufficient).

    This could be further generalized to cover any other “exotic halt condition” which can be both detected and artificially generated in the same ways. It could also cover any other measure besides “runtime” which is computable from the complete history of a run, and is also artificially generatable (such as number of ones left on the tape). That is, besides BB (“shift”) and BLB, we can include the other BB-related functions like “ones” and “num” from Radó’s original paper.

    In plain English: all these functions are “comparable at a large scale”, i.e. if you ignore “small overhead differences” like adding IEO(n) + C_f to the input (n), and applying any computable f to the output.

    I guess Scott was implying this when his post said:

    … Note that the Blanking Beaver function … merely grows “normally” uncomputably fast, like the original BB function did. …

    Even so, BLB evidently does exceed BB in interesting ways, and its “small scale” behavior is very interestingly different from BB (at least regarding what we can presently prove about it), as Nick has said.

  64. Bruce Smith Says:

    Nick Drozd #62:


    Your remark about BBB makes we wonder if this same proof can be adapted to prove BBB(n+2) > BBB(n). (Which if I’m not mistaken is not implied by known results, unless I missed some, which is quite possible.) I guess “almost certainly yes — it seems obvious — probably everyone else already knows this”, but I didn’t yet try it.

    For a simulator to optimize this construction, it would have to notice certain loops of states… which I think would be very rarely occurring in “natural” TMs.

    About BLB vs BB, I appreciate the evidence that for tiny n it seems like BLB(n) exceeds BB(n)… but I can’t think of a proof that for all n, it even always matches it! (That is, a proof of BLB(n) ≥ BB(n) for all n.)

    I can even imagine an intuitive explanation for the hypothesis that for most n, BLB(n) < BB(n)! Namely, BB(n) can only worry about running a long time, not about making its ending tape easy to erase. And if it ends up inside a “pseudorandom” sequence on tape, and not extremely near either end, there is no few-state way to erase the tape from that point (like there would be if it was entirely outside the written part, with just one extra state to write 0s forever in one direction). The overhead of making that possible means fewer states are left for the “initial BB-like part”, or (alternative description of same thing?) that only some BB-like algorithms permit easy tape erasure. The single extra instruction (replacing the explicit Halt required for BB) might not always make up for this (though for tiny n it evidently does).

  65. Bruce Smith Says:

    So yes, BBB(n+2) > BBB(n) is obvious from the same proof, since the transformation doesn’t affect quasihalting (it just spreads out the same beeps by adding states between them in time), and you can always find some state which runs at least once to do it to. Maybe the quantitative version even works the same as for BB — I didn’t check, but it looks likely.

    More notably, BBB(n+1) > BBB(n), also from a proof that works for BB. For BB this has two proofs (I forget which one was in the survey) — add a new state at the beginning, or at the end. For BBB it only works to add it at the beginning. (For BLB, neither way works in general, as you know, so the result is open, and I have no idea how to attack it.)

    All this seems likely to be obvious to anyone who asks themselves whether those BB-proofs work for BBB, so I’m not trying to take any credit here (especially since I didn’t invent the BB versions either, except one of them “independently but later”).

    Finally, I think Niels’ remarks about bb(), blb(), etc, are really interesting, and we should be studying these same questions for them too.

  66. Bruce Smith Says:

    I think I can prove BLB(5) > 32,779,477 (which is Nick’s new lower bound on BLB(4)).

    (Note that this does *not* establish BLB(5) > BLB(4), unless someone proves that bound on BLB(4) is optimal.)

    The method is a specialized tweak to Nick’s BBB(4) candidate, which adds one new state and increases the runtime. It is “specialized” because it only works because state B always has a 1 to its left (since the two transitions to B are both 1RB).

    I found this while trying to prove BLB(n+1) > BLB(n), but the transformation it uses only works when some state knows what’s on tape on one side of it, like in that case (or anytime a non-start state has only one transition to it) — but I have no reason to guess that’s always true in a BLB champion.

    I’ll describe the transformation later if anyone’s interested. In fact it might take less work to just apply it to Nick’s machine, so I’ll do that for sure (after dinner). It’s reminiscent of how I transformed that Wythagoras machine in the survey (see some comment on the blog post about the survey). I’m not claiming the result itself is interesting; I just think transformations like this are a bit interesting if one is searching for possible proofs of X(n+c) > X(n) (or related theorems) for various BB-like functions X and small constants c.

  67. Bruce Smith Says:

    Here is the promised BLB(5) machine (actually a pair of them) based on Nick’s machine for BLB(4), which (from his post) is:

    A0 -> 1RB

    A1 -> 1LC

    B0 -> 1RD

    B1 -> 1RB

    C0 -> 0RD

    C1 -> 0RC

    D0 -> 1LD

    D1 -> 1LA

    We transform state B and add a related last state E.

    There are two ways to do it. One of them will take about 2 count(B0) more steps, and the other will take about 2 count(B1) more steps, than the original machine, to re-blank the tape, where by count(B0) I mean the number of times the original machine is in state B over a 0 (before re-blanking its tape), and similarly for count(B1).

    I say “about” because this added number can vary by about 2 depending on details of what happens at the end.

    All this is just predicted — I have not tested either of these machines (or any other instance of this transform).

    (No one has ever told me how to imitate an HTML “pre” or “code” tag in this blog, so I’ll use lots of extra blank lines instead, and hope it is not too unreadable.)

    The machine which adds two steps for each B0:

    Change B0 to write 0 and move left to new state E. (B0 -> 0LE)

    Make E1 move right to E. (E1 -> 1RE)

    Make E0 imitate what B0 used to do. (E0 -> 1RD)

    A0 -> 1RB

    A1 -> 1LC

    B0 -> 0LE

    B1 -> 1RB

    C0 -> 0RD

    C1 -> 0RC

    D0 -> 1LD

    D1 -> 1LA

    E0 -> 1RD

    E1 -> 1RE

    The machine which adds two steps for each B1:

    Change B1 to write 0 and move left to new state E. (B1 -> 0LE)

    Make E1 move right to E. (E1 -> 1RE)

    Make E0 imitate what B1 used to do. (E0 -> 1RB)

    A0 -> 1RB

    A1 -> 1LC

    B0 -> 1RD

    B1 -> 0LE

    C0 -> 0RD

    C1 -> 0RC

    D0 -> 1LD

    D1 -> 1LA

    E0 -> 1RB

    E1 -> 1RE

    (I guess the last one might run longest, since B1 was a loop going back to state B.)

  68. Bruce Smith Says:

    (I should add that I have no expectations that these “BLB(5) candidates” will remain champions for any length of time once anyone makes a serious search for “natural” champions.)

  69. Nick Drozd Says:

    Bruce Smith #64

    That’s an interesting argument. For another piece of evidence, consider the 2-state 4-color champion discovered by Shawn and Terry Ligocki in 2005:

    • 1RB 2LA 1RA 1RA 1LB 1LA 3RB 1RH

    That program halts in 3,932,964 steps, so BB(2, 4) > BB(4, 2) by a pretty good margin.

    An interesting property of this program is that it is blank-free; that is, it never writes any blanks. This makes sense as a general strategy. From the perspective of the machine’s read/write head, a blank square is meaningless. It could be a blank that was laid down in the course of program execution, or it could be one of the infinitely many blanks that are just sitting around on the tape already. So why would you ever write blank unless you absolutely had to? Even if you needed a blank, you could always just go and grab one, since they are always available.

    A tape-blanking program cannot be blank-free. It has to print some marks and then it has to print some blanks over those marks. So maybe it could be that BB(2, k) > BLB(2, k) for sufficiently large k, since BLB champions can’t make use of the always-color approach.

    At least that’s what I thought before finding this new one. Now that argument just sounds like a rationalization for failing to find a 2-state 4-color BLB champ to beat the BB champ. It’s hard to say. It’s not obvious which way it goes.

    But it is obvious to me that BLB(5, 2) > BB(5, 2), and I even consider it likely that somebody could find a tape-blanking program to witness that. Given that these sequences are uncomputable or worse, Busy Beaver research is essentially an empirical matter. The fact that is can’t be proved a priori even that BLB(n) >= BB(n) doesn’t mean that it isn’t true.

    (For anyone who is looking for a Busy Beaver project, I believe there is still some LOW-HANGING FRUIT to be had here.)

  70. Nick Drozd Says:

    Bruce Smith #63

    Another termination condition I’ve been looking at is what I’m calling “Lin recurrence”, after Shen Lin. This means that the machine will either repeat a previous configuration (state and tape contents) or will get into a repeating pattern moving to the left or right, like the “puffers” and “spaceships” from Game of Life. In 1963 Lin used this property to prove that BB(3) = 21, since recurrence implies non-halting, but it can be used as a termination condition on its own. (The recurrence is also “lin”ear).

    Call the sequence induce by this predicate BBLR (or else, come up with a better name). Here are results for 3-state 2-color:

    • BB(3, 2) = 21
    • BLB(3, 2) = 34
    • BBB(3, 2) = 55
    • BBLR(3, 2) = 101

    The relationships are similar for the 2-2 and 2-3 cases as well. So it looks like Lin recurrence gives a lot of expressive power. This new champion is ALSO Lin recurrent, so we get BBLR(4, 2) >= 32,779,478 for free.

    After recurrence starts, the pattern repeats every so often, and the length of the pattern is the period of the recurrence. The BLB(4) champion eventually enters into Lin recurrence with a period of just 1. But how long can the periods be? Call that sequence BBP. It’s not clear at all how this sequence should relate to BB or even to BBLR.

    Well, Boyd Johnson discovered a 4-state 2-color program that enters into Lin recurrence after 7,170 steps with a period of 29,117 steps, so BBP >= 29,117. That program is:

    • 1RB 0RA 1RC 0RB 1LD 1LC 1RA 0LC

    Why should Lin recurrence be so powerful? BB and BLB are both cases of unbounded search, equivalent to the halting problem. But BBLR requires two unbounded searches: one to find when recurrence starts, and one to find the length of the period. So there is every reason to believe that BBLR dominates BB. Of course, BBLR is still just “regular” uncomputable and is therefore eventually dominated by BBB, which is equivalent to unboundedly many unbounded searches.

    There are other predicates that could be used to define sequences, like the so-called “Christmas tree” recurrence pattern, where the machine sweeps back and forth across the working tape span. I haven’t managed to figure that one out, so that would be another OPEN PROBLEM and RESEARCH PROJECT.

  71. Nick Drozd Says:

    Bruce Smith #67

    I can confirm that your second program reaches the blank tape in 32,810,047 steps, proving in unspectacular fashion that BLB(5) > BLB(4). (As you predicted, the second program runs longer than the first.)

    Although the result isn’t impressive, the construction technique is pretty cool. It’s related to the fact that this BLB(4) champion employs a surprisingly simple control flow. In the survey Scott proposed that BB programs ought to be “spaghetti code”. That is probably true in general, but in this case it is not. The program has just one true branch, in state A, and just one unbounded loop, in state D. States B and C have loops, but they are on the 1-branch, and that’s always guaranteed to terminate since the tape is blank to begin with. You can actually look at the program and kinda vaguely see how it operates.

    In contrast, here’s another program discovered by Boyd Johnson. It enters into Lin recurrence after 158,491 steps with a period of 17,620:

    • 1RB 0RC 1LB 1LD 0RA 0LD 1LA 1RC

    That control flow is totally inscrutable, and it’s really hard to say what that program “does”, although it obviously does something interesting. Is it “universal”? Another good project would be to figure out what that thing does.

    I wrote a few blog posts about Busy Beaver control flow. Anyone who has read this far may find them interesting:

  72. Niels Says:

    For the record, I’ve slightly improved the bb(8)-bound as mentioned in #52. The following 8-non-H-transitions machine halts at step 794:

    1RB H 1LC 1RD 0LE 1RC H 0RC 1RA 0RB

  73. Bruce Smith Says:

    Nick Drozd #69-71,

    Thanks for confirming that prediction, and for your highly interesting thoughts.

    One thing I like about this subject is the interplay between theory and experiment, and another is the rich variety of accessible open problems. You have just added a few more of those! I don’t have any “new leads” at the moment… we’ll see how it goes. (I wish I had more time to work on them than I do these days.)

  74. Bruce Smith Says:

    Hey, I have a new theorem about BB, inspired by thinking about Niels’ variant function bb!

    Theorem: no BB machine has an unused 0-instruction.

    Proof: if it did, say in instruction slot X0, change that to 0RA (with A being the original start state), and change the start state from A to X. This adds no states and one step of runtime. (Note that X can’t have been the start state already, or X0 would have been used.)

    Somehow I never noticed this back when I was trying to find limits on how many unused instruction slots a BB machine could have. (I did notice it couldn’t have unused instruction slots of both types, 0 and 1, or you could merge some of them into fewer states. I still know of no limit on how many 1-instructions could be unused, short of the one implied by a BB machine not being compressible enough to be encoded into an IE much smaller than n.)

  75. Vitor Says:

    I accidentally ended up obsessed with the uniform growth question (Bruce’s comment #58 was on point, as I was just assuming that growth was uniform since the intuition for that is so strong, but it’s still only a conjecture). I’m a bit rusty in TCS, so bear with me.

    I got a result that gets rid of the n/log n overhead from introspective encoding, but it requires a new beaver variant with better composability. Both BB and BLB are hard to work with. BB nicely signals when it halts, but it leaves an arbitrary mess behind on the tape. BLB cleans up the tape, but refuses to release control of the machine afterwards. So I introduce BLBB(n) (“blanking and busy beaver”, better names welcome) which is the largest number of steps that an n-state TM can take before halting with a blank tape.

    This is obviously dominated by both BB and BLB, but the nice thing is that we can chain any two TMs that blank and halt, by connecting the halt transition of one machine to the starting state of the other, and achieve the sum of their runtimes with the sum of their states. Thus, for all n, k:

    $$BLBB(n + k) \geq BLBB(n) + BLBB(k).$$

    Lemma: Let f be any computable function. Then, there exists a constant c s.t. for all n:

    $$BLBB(n + A^{-1}(n) + c) > BLBB(n) + f(n),$$

    where A^-1 is the inverse Ackermann function.

    Proof: For every n, we construct a TM M_n with A^-1(n) + c states that runs for at least f(n) steps. M works as follows:
    1) write A^-1(n) onto the tape (hardcoded with A^-1(n) states)
    2) blow this up to (at least) n by running the Ackermann function
    3) run the machine for f to convert n into f(n) (assume this requires f(n) steps, e.g. unary encodings?)
    4) blank the tape and halt.

    The existence of M_n implies

    $$BLBB(A^{-1}(n) + c) > f(n),$$

    and thus

    $$BLBB(n + A^{-1}(n) + c) \geq BLBB(n) + BLBB(A^{-1}(n) + c) > BLBB(n) + f(n),$$


    Note: the Ackermann stuff is a bit wonky. The idea is that n can be encoded arbitrarily compact, since we only need to recover an upper bound for it. I am lacking the mathematical finesse to get rid of in entirely.

  76. fethi Says:

    I have a question about how good a strategy using the BB numbers is in the “Who Can Name the Bigger Number?” game. BB eventually dominates any computable series, but doesn’t the “eventually” part make it a potentially bad choice in the game? One can easily conjure up a computable function, myCF(n), that beats BB for low enough values, e.g. myCF(4)>BB(4). This might be extended to something like myCF(99)>BB(99) in the future, however unlikely that is.

    Doesn’t it get even stranger when we try to beat myCF by going to BB(9999)? Now we are in the undecidable territory, in the Luke W #12 and Scott #13 sense. Let’s say referees ban such attempts in the game, only decidable numbers are allowed. Using something like BB(999) right now would be betting, one can lose because your bound of 7918 might come below 999 in the future. there is also the chance that myCF(999)>BB(999), but that is tiny. Is there a winning strategy in “Who Can Name the Bigger Number?” by using an uncomputable function which eventually becomes undecidable? are there uncomputable functions that are known to be decidable for all argument values? is there a winning strategy with them?

    This was in my mind for a long time, and I have been following the blog for a good while, but my apologies if this has been discussed before or I am missing something trivial. Thanks for all the curiosity you inspired!

  77. Bruce Smith Says:

    Vitor #75:

    … So I introduce BLBB(n) (“blanking and busy beaver”, better names welcome) which is the largest number of steps that an n-state TM can take before halting with a blank tape.

    I called this “BB_clean” (in a paper in preparation) — I wanted it for the same reason you did. (I’ll credit you for the independent invention… would you rather be anonymous, “Vitor”, or a real name?)

    BLBB(n+k) ≥ BLBB(n) + BLBB(k)

    Also of interest: BB(n+k) ≥ BB(n) + BLBB(k)

    (proof: only the first chained-together machine needs to blank its tape). (But as a set of new theorems about BB, those are mostly superceded by BB(n+2) ≥ (n+2)/n BB(n). Or so I thought…)

    I like your Lemma. As far as I can tell at first glance, your proof is correct. I think it also proves (renaming your \(c\) to \(c_f\))

    $$ BB(n + A^{-1}(n) + c_f) ≥ BB(n) + f(n) $$

    and I am not yet quite sure whether that is implied by things already known, or is more powerful in some cases.


    But — if you really want a BB-related function that composes nicely, try Radó’s num(n) function, which he defines as the largest k such that an n-state machine can halt next to a run of k ones on the tape. (IIRC he doesn’t clarify whether other junk is also allowed on the tape… but for the following discussion, it doesn’t matter.)

    This lets one machine use the value left by the prior one as an input, so we have

    $$ num(n + c_f) ≥ f(num(n)) $$

    with no “composition overhead” at all.

  78. Nick Drozd Says:

    Vitor #75, Bruce Smith #77

    The idea of halting on the blank tape was first discussed by Ludewig et al in 1983 (page 37 of They called it the “Scientific Beaver” because it works hard but produces nothing and sees little career advancement.

    It seems to me that the amount of static analysis that can be done with a class of programs is inversely proportional to its growth rate. The Scientific Beaver sequence is nice to work with because of its composability, but it is dominated by both BB and BLB. BLB seems to dominate BB, and there are basic results about the monotonicity of BB that cannot be proved (as far as anybody knows) about BLB because BLB programs lack that handy dandy halt instruction.

    Here’s another point that suggests BLB dominates BB. Marxen and Buntrock ( discuss two methods for proving that a program does not halt, “forward reasoning” and “backward reasoning”. Forward reasoning means determining that a program will stay in some particular behavior like Lin recurrence, and that applies just as well to halt-free programs. With backward reasoning, we start with the halt instruction and ask, what conditions must obtain for that instruction to be reached? And then, what conditions must obtain for those conditions to obtain? And so on, until it is shown that some condition is impossible and therefore the halt instruction cannot be reached.

    But of course, this backward reasoning cannot be applied to halt-free programs. So at the same time as there are more programs to deal with and longer programs to deal with, there are fewer methods for dealing with them. Less static analysis means more reliance on runtime analysis, and runtime analysis is slow and expensive and a pain in the ass.

  79. Scott Says:

    Nick Drozd #55: Sorry for the delay in answering you!

    As Sniffnoy pointed out, your Computable Beeping Busy Beaver (CBBB) sequence is ill-defined until you specify which formal system the proofs are allowed to use: for example, ZF set theory? (Gödel’s Theorem, of course, tells us that no single proof system can capture all the true quasihalting statements.)

    Once you do specify a proof system, I claim that your CBBB sequence will have BB-like growth. First, it clearly grows at least as rapidly as BB, since halting is a special case of provable quasihalting. But second, it can’t grow faster than BB-like, since using an oracle for the halting problem, we could enumerate all possible proofs of quasihalting and thereby compute CBBB(n).

  80. Nick Drozd Says:

    Bruce Smith #64

    Considering this argument further, I believe it’s a mistake to think in terms of program composition — first do a bunch of stuff, and then clean up the tape. That approach is indeed how the BLB(3, 2) champion works (see my comment #4), but the BLB(4, 2) champion does not work like that. It implements a certain function that happens to get the tape wiped, but there is no specific piece of logic in the program that deals with tape-blanking. Even the BLB(2, 3) program doesn’t have obvious wiping logic. The logic that ends up doing the wiping is used for other purposes earlier on.

    For reference, here are those programs:

    • BLB(3, 2) = 34 : 1RB 1LB 1LA 1LC 1RC 0LC
    • BLB(2, 3) = 77 : 1RB 2LA 0RB 1LA 0LB 1RA

    Unlike the (3, 2) and (4, 2) champions, the BLB(2, 3) champion hits all of its states (all two of them) forever, and therefore does not quasihalt. In fact, it ends up in a state where it prints a few marks and then blanks the tape over and over forever. This suggests a “beeping” variant of BLB: how long can a program run before it never wipes the tape again? This doesn’t just measure steps until the tape is wiped, it measures steps through any finite number of tape-wipings. The BLB(4, 2) champion wipes the tape just once, but it’s conceivable that a program could wipe the tape several times before never wiping it again.

    Will this sequence grow like BBB?

  81. Bruce Smith Says:

    Nick Drozd #78, #80:

    Thanks for the reference! (They don’t seem to mention the composition property, so I’ll still want to reference Vitor too, so my question about “preferred author name in that reference” still stands.)

    Your points about blanking the tape perhaps being integrated with other logic are well taken. It’s easy to imagine that, if you’re trying to blank it in as few extra states as possible, that’s a good method to try for. Even so, a version of my heuristic argument might still be applied — the fewer elements to your “specification”, the fewer extra states it might take to achieve them, whether or not those extra states come at the end, or are just dual-purposed “normal-use states” which are perhaps slightly less state-efficient due to having to accomplish an additional purpose.

    OTOH, if blanking the tape *helps the machine’s logic* (for example, helps it know when to halt), then maybe those “dual purposes” are correlated enough to nullify that argument.

  82. Vitor Says:

    Bruce #77:

    I’m not anonymous, you can credit me as Vitor Bosshard. I’m actually an academic working in a somewhat TCS-adjacent field. This problem is a wonderful distraction from my “real” work, reminding me exactly why I got into TCS in the first place (reading Scott’s famous big numbers essay was a substantial part of it!).

    $$BB(n + A^{-1}(n) + c_f) > BB(n) + f(n).$$

    Duh! Yes, that’s exactly the statement I set out to prove before I hit on the chaining idea that made me switch models. And for some reason I didn’t think of backporting my result to the original statement. Stopped just 1 step shy of success I guess.

    Here’s some intuition why I think that my lemma is essentially the “correct” formulation if we’re only interested in uniformity of growth: in Scott’s survey, he states a form of the uniform growth hypothesis (e.g. conjecture 13) in the \(BB(n + …) > f(BB(n))\) style, as opposed to the weaker \(BB(n + …) > BB(n) + f(n)\) style that I used. My lemma already proves that BB grows uniformly if you look at it “zoomed out” only by (slightly more than) a constant amount. So there’s a very strong limit to how long a “dry spell” without growth can be. The result is also deliciously self-referential, because the length of a dry spell of growth less than f(n) is bounded basically by the implementation complexity of f [0]. I agree that the \(BB(n+2) > (n+2)/n BB(n)\) result eventually comes to dominate this for any computable function, but that’s only a “for all n > n_0” type of statement, with n_0 that might be arbitrarily large for arbitrary f (???). In contrast, my way of formulating it is more “transparent”, “introspective”, etc and holds for the entire BB(n) sequence.

    Also, the first style of statement gives you extra strength on top of the second style which doesn’t tell us more about uniformity of growth per se, but rather about rate of growth. Namely, that BB’s growth doesn’t just dominate all computable functions, but comes to dominate even BB itself! But that’s a way stronger claim. Since f only adds trivial growth on top of the underlying BB growth, we should expect this to be as hard to prove for a single f as for all computable f. Therefore, I wouldn’t be surprised conjecture 13 doesn’t hold. I know this is all very vague and hand-wavy, would love to hear from someone with opposing intuitions.

    [0]: This naturally brings up an interesting sequence (I don’t know if it already has a name). I’ll call it the sequence of fastest computable growth rates, G_k. First, define the set of functions of a certain implementation complexity:

    $$F_k = \{f | \text{there exists a } k\text{-state TM implementing } f\}.$$

    Then, G_k is given as:

    $$G_k = \Theta\left(f(n) = \sum_{f’ \in F_k} f'(n)\right).$$

    Since F_k has constant size, the Theta captures the highest growth rate among this set. I find it plausible that G_k is itself uncomputable, but can’t think of a proof off the top of my head.

  83. Nick Drozd Says:

    To answer my own question from #80, no, that sequence does not grow like BBB. A n-state program that does not blank the tape infinitely often can blank the tape no more than n-1 times. Say it blanks the tape once. What state is it at that point? If it’s in state A (the initial state), then it will repeat what it has just done infinitely. So suppose it’s in stat B, and it blanks the tape again. What state is it in now? If states A or B, then it will be in a loop, so suppose it’s state C. And so on, until there is just one state left, and the tape can’t be left blank.

    Accounting for these extra few blankings could be considered as an extension of BLB, but it won’t be super-uncomputable.

  84. Nick Drozd Says:

    Okay, last one. Another way for a program to signal that its computation is done is to leave the tape forevermore undisturbed. The machine may continue operation, but no tape cells will be overwritten with different marks.

    An example of this is 1RB 1RC 1LC 1RD 1RA 1LD 0RD 0LB, which runs for 2819 steps before silently traveling off forever. (It also quasihalts at that time, and indeed it was the BBB champion for about a month in 2020.)

    Here’s a question: is this condition computable? That is, is there a reliable method for detecting whether or not the tape will get modified again in the future? Certainly there are some circumstances in which this can be detected, but can it always be detected? With halting and blank tape the answer is yes, and with quasihalting the answer is no, so what about this one?

    If it’s undecidable, then the sequence to which it gives rise should grow like BBB.

  85. Vitor Says:

    Nick #84: it’s uncomputable because the halting problem can be reduced to it.

    We have a machine M and would like to know if it halts. Then we create a machine M’ as follows:

    Whenever M halts, M’ goes into a 2-state loop that moves back and forth between two cells forever, without writing anything.

    Between any two states of M, M’ has a series of extra states that move the head all the way to the left of the tape and write some growing pile of junk (maybe just a huge string of ones, with proper delimiters). Then, the head moves back and continues the computation of M for another step. If the “real” computation that M is doing ever overlaps the junk, we just move the junk a few cells further to the left.

    Claim: M’ will stop disturbing the tape iff M halts.

  86. Bruce Smith Says:

    Nick Drozd #84: can you clarify what you mean by (paraphrasing) “halting can be detected and is therefore computable”? I must misunderstand you, since it sounds like you are saying “whether the machine will eventually halt (from the present finite configuration) is computable”.

  87. Bruce Smith Says:

    Nick Drozd #84:

    I think I understand you now. You are contrasting whether “having presently already reached the halt condition” is computable from the present full machine state (for a machine being simulated). If the halt condition is “first running Halt” or “first blanking the tape”, the answer is yes. If it is “you will never again run the Beep State”, the answer is no. If it is “you will never again write on the tape”, you are not yet sure.

    My guess is “no”, since “writing on the tape” is equivalent to “running one of the instructions whose written-bit differs from their read-bit”, so it seems analogous to “running one of the instructions in the Beep State”. But that is not yet a proof. Interesting question.

  88. Bruce Smith Says:

    Nick Drozd #84:

    Now I think it’s computable. If you are never going to modify the tape, there are a finite number of states, provided “off to the left of the written part by k tape cells” is only interesting to you for k mod n!, and same for “off to the right”. That is because behavior on a blank tape region which you can’t write on has to be periodic with some period at most n, and some fixed tape motion at most n during that period. So exploring that finite graph should tell you whether at some point you write on the tape, go infinitely off to left or right, or enter a forever periodic loop without writing anything.

  89. Bruce Smith Says:

    Vitor #82:

    I am not yet convinced that the inequality

    $$ BB(n + A^{-1}(n) + c_f) ≥ BB(n) + f(n) $$

    is interesting. It took me awhile to put my finger on why.

    Consider an instance of it — that is, fix n and f. (Ignore the \( A^{-1}(n) \) part, for ease of this discussion.)

    You would almost certainly do better (that is, you would make the left hand side larger) by reducing n and increasing c_f (by making f more complicated) so as to leave their sum fixed.

    This is because a few states added to c_f go a very long way in allowing f to be more complicated — they are much more valuable that way, than if they just increase n in the separate fixed BB(n) machine.

    And *that* is basically because (as I mentioned earlier) (note: I’m using this for a different f and c_f):

    $$ num(n + c_f) ≥ f(num(n)) $$

    since the insides of the “compute an upper bound on f(n)” part of the right-hand side are well-expressed by \( num(c_f + A^{-1}(n)) \) .

    In other words, the inequality we’re discussing is basically weaker than just pointing out

    $$ BB(n + c_f) ≥ num(n + c_f) ≥ f(num(n)) $$

  90. Bruce Smith Says:

    Vitor #82:

    In case my objection seems too vague or seems to depend on conjectures, let me try to focus it.

    I am interested in whether BB(n+2)/(n+2) ≥ BB(n)/n is the limit of our knowledge about necessary increase of BB(n+c) compared to BB(n) for arbitrarily large n and fixed c. That is, letting n grow without limit, is it possible there are arbitrarily long stretches (large c) within which BB(n) only grows as much as that requires?

    (I will ignore a small technical improvement I discovered in that inequality, and didn’t yet publish, which would replace that BB(n)/n by (BB(n)-1)/(2n-1) as the “non-decreasing quantity for c = 2”. Exercise for the reader, or ask if interested (and I guess someone will ask, so I’ll post it sooner or later). But I don’t think this changes things much.)

    So — of course most of us conjecture “no, there are not even any stretches of length 1 where that is all of BB’s growth, let alone arbitrarily long stretches”. But this is unproven.

    So I would consider some inequality to go beyond that one, if it proved there were not arbitrarily long stretches of growth in BB(n) only proportional to n.

    I don’t think the inequality we’ve been discussing can do that, because it’s additive rather than multiplicative.

    Of course there are lower bounds on BB(n) which provably grow faster than that. But what if BB(n) greatly exceeds those bounds for n = n_0, and then exceeds them less and less as n increases? This possibility means growth rate of any lower bound is not relevant to this question.

  91. Nick Drozd Says:

    Bruce Smith #86, 87

    Right, I’m talking about the predicates themselves. On every machine cycle the Turing machine needs to check to see if various conditions obtain to see if it should stop or not. It’s easy to see if the machine is currently in the halt state or not, and it’s easy to see if the tape is currently blank or not. Those are decidable predicates, and unbounded search over a decidable predicate gives a recursively enumerable (sigma 1) problem.

    Quasihalting is itself a semidecidable predicate; even just determining if the machine right now is quasihalted is already equivalent to the halting problem. Unbounded search over a semidecidable predicate gives a super-undecidable sigma 2 problem.

    Lin recurrence is a decidable predicate, but it isn’t cheap to compute. Lin’s algorithm for detecting recurrence in general runs O(n^2) in the number of steps taken. This makes it difficult to search over, whereas halting and the blank tape are “easy” to search for. (Searching for recurrence with no more than some fixed period would be easier.)

    So yes, I’m asking whether it’s possible to determine once and for all if the machine right now at its current step will ever modify the tape again. If it is possible, then we are in halting territory; otherwise we are in quasihalting territory.

    Vitor #85, your construction shows that a solution to undisturbed tape problem implies a solution to the halting problem. But for what I’m asking, we need the converse, that a solution to the halting problem implies a solution to the undisturbed tape problem.

    The way I look at it is, what kind of checks does the Turing machine make on every step? The basic logic looks like this:

    while (!IS_HALTED)

    IS_HALTED can be implemented, and TAPE_IS_BLANK can be implemented too. IS_QUASIHALTED cannot be implemented, at least not reliably. Another way to ask the question is: can TAPE_IS_FIXED be implemented reliably? I’m leaning towards yes, although I’m not sure how.

    If it is computable, what is its computational complexity? Is it expensive or cheap?

    (BTW Bruce, you can use the HTML <code> tag in blog comments here.)

  92. Bruce Smith Says:

    Nick #91:

    … Another way to ask the question is: can TAPE_IS_FIXED be implemented reliably? I’m leaning towards yes, although I’m not sure how.

    Maybe you missed my #88, which claims to prove “yes”? (Perhaps missed due to a renumbering of comments, sometime after it initially came out as #87.)

    Thanks for your subsequent comments on this; they add a lot of clarity.

    Unrelated: I also like your proof of the “multiple-blanking-BLB” being limited to “blanking at most n-1 times or forever” — which FYI I discovered independently last year (while trying to understand whether a BB machine can leave its tape blank, which I remain unable to figure out). Probably I mentioned it in some comment on that “BB survey post”. But I’d forgotten it, and was happy to be reminded of it.

    I will try the code tag again sometime. I thought I tried it before… maybe it doesn’t work in preview and that scared me off…

  93. Nick Drozd Says:

    Bruce #92

    Is it the case that an undisturbed tape always implies (1) traveling off to the left or right forever, or (2) scanning some portion of the already-traversed tape back and forth forever? If so, then the undisturbed-tape programs are a strict subset of Lin recurrent programs, making them computable.

    Incidentally, it’s also the case that the quasihaltingness of a Lin recurrent program is decidable. Just look at what happens over the course of a recurrence period: it quasihalts iff some of the states aren’t reached.

  94. Nick Drozd Says:

    Back in the survey thread, Joshua Zelinsky conjectured that the transition graph for a Busy Beaver program ought to be strongly connected. I now believe that this conjecture should be regarded as obviously true, for the following reason.

    There is a technique for Busy Beaver searching known as the tree generation procedure. Start with a program whose only defined instruction is A0 -> 1RB. Then run it until an undefined instruction is reached. Generate all the possible instructions that could be used (ignoring isomorphic duplicates), and then recursively run the procedure on all the programs that result from extending the original with the generated instructions. (All along, stop if certain termination conditions are reached.)

    This technique was devised by Allen Brady in 1964, and it has been used as far as I know for just about every Busy Beaver champion discovered, and that includes this new BBB(4) champion that I found. It is dramatically more powerful than naive program generation in the sense that it yields far fewer programs, but with a much higher average program quality. In the 4-state 2-color case, the method produces a list of about 150,000 programs, and that list includes every single interesting program that I am personally aware of.

    Well, it turns out that out of that list, all but only a few hundred of have strongly connected state transition graphs. The method with an empirically proven track record of identifying champions seems to prefer strongly connected programs. That to me is extremely compelling evidence.

    But actually it suggests that the conjecture is fairly weak. Initially I thought the conjecture could be used to improve naive program search by weeding out non-connected duds, and I did manage to use that idea to discover the previous BBB(4) champion of 66,349 steps. But even still that was nowhere near as effective as tree generation. So from a search perspective, the conjecture doesn’t really tell us anything new.

    As I said above, I now consider the conjecture to be obviously true. It is so obviously true that it might even be provable! Not by me, but by someone (OPEN PROBLEM, RESEACH PROJECT). If it really is provable, I don’t think the proof should be too hard. Of course, it might not be provable at all. That fact itself might be provable though, so look into that too.

    Suppose that the conjecture, which is obviously true, is not provable or provably unprovable. What should we make of it? What kind of fact is it? Here again it seems to have an empirical quality.

  95. Bruce Smith Says:

    Nick #93:

    Is it the case that an undisturbed tape always implies (1) traveling off to the left or right forever, or (2) scanning some portion of the already-traversed tape back and forth forever? If so, then the undisturbed-tape programs are a strict subset of Lin recurrent programs, making them computable.

    Yes, by the same proof as in #88. (Is that proof clear enough as it stands? My presentation was highly informal, and in hindsight it has some description errors, which on rereading I felt were obvious, but of course I’m the worst possible judge of that.)

    Nice observation.

    Let’s see if I can restate that proof for this result directly, and also more clearly:

    Suppose (starting from some point in time) the tape remains fixed forever.

    There is a “central portion” of the tape, which we define as every tape cell within n cells of any non-0 cell, or between any two such cells (so it’s a single interval).

    Suppose we stay within that region forever. Then we’re in case (2), since there’s a finite number of possible full-machine-states while we’re within that region. (Its size is n times the width of that region. “full machine state” means entire contents of tape plus one of the n “TM states”. Too bad the word “state” has to be used in these two different senses.)

    OTOH suppose we don’t stay within that region forever. Then motion outside of it can be analyzed “locally” (that is, within n steps or n tape cells) as if it occurred on a completely blank tape, on which writing is not allowed. This means it’s just wandering among the graph of 0-instructions. That has a period of n or less, and a net motion per period of n or less. That either moves us back into the central region (so ultimately there is still periodic motion, perhaps bouncing in and out in ultimately the same way each time — I hope that’s obvious), or it stays in a fixed-within-n place with the same period, or it moves away forever. QED.

    And to augment that proof, to complete the proof of the claim in #88 (that whether we ever write again is computable): suppose we do end up writing on the tape. We can find out (ie compute whether we do that) by analyzing the finite graph of possible full states implied by the preceding argument, resulting either in periodic motion or moving away forever, as above, or in an explicit Halt or a write on the tape. This is computable, since (given the finite central region) that graph of full machine states is finite. QED of augmented theorem.

  96. Bruce Smith Says:

    I said:

    … perhaps bouncing in and out in ultimately the same way each time — I hope that’s obvious …

    Technically that is not precisely right, since (it leaves open the possibility that) it could bounce out and back in in way1, then in way2, then in way1, etc (repeating any number of up to n ways, forever). It was right “in spirit”, since the result is still periodic (and such “bounces out” are not actually possible, given how I defined the central region).

    If I wanted to write this up formally, I’d say something like — if we start outside, wait til we move away (done) or stay fixed-within-n (done) or move towards central region; in that case, wait til we’re inside central region. Now if we ever again leave central region, it means we’re moving away forever (since it implies we’re outside it, with net motion directed away). And if not, we get case (2).

  97. Bruce Smith Says:

    Sorry, I should have said, *repeated* bounces out are not possible. Because (as I did say), if you bounce out once, you’re moving away forever.

    If any of this is still unclear, go back to the formulation in #88 which classifies tape positions mod n! if they’re outside the central region. This lets you make the finite graph of full states (some of those states must be considered “kinds of periodic motion in blank regions”, which might have net motion of at most n cells per period).

  98. Nick Drozd Says:

    Bruce #95

    Okay, assuming that proof is correct and all fixed-tape programs really are Lin recurrent, I can report the following results for “BBFT” (Busy Beaver Fixed Tape) vs “BBH” (Busy Beaver Halting):

    2/2: 6 (vs 6)
    3/2: 22 (vs 21)
    2/3: 39 (vs 38)
    4/2: 2819 (vs 107)

    That 4-state 2-color value may not be the real value, but I’m going to go ahead and say that the other ones are.

    Don’t be fooled by the similar values for 2/2, 3/2, and 2/3; the fixed-tape programs that attain them are totally different from their halting counterparts. I guess it’s just a coincidence that the numbers are so close?

  99. True randomness and ontological commitments | Gentzen translated Says:

    […] And theorem 5.2 in that paper says that if an arbitrary computably axiomatizable theory is consistent, then it has an arithmetical model that is -definable. For context, we are talking about sets of natural numbers here, the -definable sets are the computable sets, the -definable sets are the arithmetical sets (), and the -definable sets are the hyperarithmetical sets. The limit computable sets are “slightly more complex” than the computable sets, but less “complex” than the arithmetical sets. I recently wrote: […]

Leave a Reply

You can use rich HTML in comments! You can also use basic TeX, by enclosing it within $$ $$ for displayed equations or \( \) for inline equations.

Comment Policies:

  1. All comments are placed in moderation and reviewed prior to appearing.
  2. You'll also be sent a verification email to the email address you provided.
  3. This comment section is not a free speech zone. It's my, Scott Aaronson's, virtual living room. Commenters are expected not to say anything they wouldn't say in my actual living room. This means: No trolling. No ad-hominems against me or others. No presumptuous requests (e.g. to respond to a long paper or article). No conspiracy theories. No patronizing me. Comments violating these policies may be left in moderation with no explanation or apology.
  4. Whenever I'm in doubt, I'll forward comments to Shtetl-Optimized Committee of Guardians, and respect SOCG's judgments on whether those comments should appear.
  5. I sometimes accidentally miss perfectly reasonable comments in the moderation queue, or they get caught in the spam filter. If you feel this may have been the case with your comment, shoot me an email.