## Archive for the ‘Metaphysical Spouting’ Category

### Google’s Sycamore chip: no wormholes, no superfast classical simulation either

Friday, December 2nd, 2022

Update (Dec. 6): I’m having a blast at the Workshop on Spacetime and Quantum Information at the Institute for Advanced Study in Princeton. I’m learning a huge amount from the talks and discussions here—and also simply enjoying being back in Princeton, to see old friends and visit old haunts like the Bent Spoon. Tomorrow I’ll speak about my recent work with Jason Pollack on polynomial-time AdS bulk reconstruction.

But there’s one thing, relevant to this post, that I can’t let pass without comment. Tonight, David Nirenberg, Director of the IAS and a medieval historian, gave an after-dinner speech to our workshop, centered around how auspicious it was that the workshop was being held a mere week after the momentous announcement of a holographic wormhole on a microchip (!!)—a feat that experts were calling the first-ever laboratory investigation of quantum gravity, and a new frontier for experimental physics itself. Nirenberg asked whether, a century from now, people might look back on the wormhole achievement as today we look back on Eddington’s 1919 eclipse observations providing the evidence for general relativity.

I confess: this was the first time I felt visceral anger, rather than mere bemusement, over this wormhole affair. Before, I had implicitly assumed: no one was actually hoodwinked by this. No one really, literally believed that this little 9-qubit simulation opened up a wormhole, or helped prove the holographic nature of the real universe, or anything like that. I was wrong.

To be clear, I don’t blame Professor Nirenberg at all. If I were a medieval historian, everything he said about the experiment’s historic significance might strike me as perfectly valid inferences from what I’d read in the press. I don’t blame the It from Qubit community—most of which, I can report, was grinding its teeth and turning red in the face right alongside me. I don’t even blame most of the authors of the wormhole paper, such as Daniel Jafferis, who gave a perfectly sober, reasonable, technical talk at the workshop about how he and others managed to compress a simulation of a variant of the SYK model into a mere 9 qubits—a talk that eschewed all claims of historic significance and of literal wormhole creation.

But it’s now clear to me that, between

(1) the It from Qubit community that likes to explore speculative ideas like holographic wormholes, and

(2) the lay news readers who are now under the impression that Google just did one of the greatest physics experiments of all time,

something went terribly wrong—something that risks damaging trust in the scientific process itself. And I think it’s worth reflecting on what we can do to prevent it from happening again.

This is going to be one of the many Shtetl-Optimized posts that I didn’t feel like writing, but was given no choice but to write.

News, social media, and my inbox have been abuzz with two claims about Google’s Sycamore quantum processor, the one that now has 72 superconducting qubits.

The first claim is that Sycamore created a wormhole (!)—a historic feat possible only with a quantum computer. See for example the New York Times and Quanta and Ars Technica and Nature (and of course, the actual paper), as well as Peter Woit’s blog and Chad Orzel’s blog.

The second claim is that Sycamore’s pretensions to quantum supremacy have been refuted. The latter claim is based on this recent preprint by Dorit Aharonov, Xun Gao, Zeph Landau, Yunchao Liu, and Umesh Vazirani. No one—least of all me!—doubts that these authors have proved a strong new technical result, solving a significant open problem in the theory of noisy random circuit sampling. On the other hand, it might be less obvious how to interpret their result and put it in context. See also a YouTube video of Yunchao speaking about the new result at this week’s Simons Institute Quantum Colloquium, and of a panel discussion afterwards, where Yunchao, Umesh Vazirani, Adam Bouland, Sergio Boixo, and your humble blogger discuss what it means.

On their face, the two claims about Sycamore might seem to be in tension. After all, if Sycamore can’t do anything beyond what a classical computer can do, then how exactly did it bend the topology of spacetime?

I submit that neither claim is true. On the one hand, Sycamore did not “create a wormhole.” On the other hand, it remains pretty hard to simulate with a classical computer, as far as anyone knows. To summarize, then, our knowledge of what Sycamore can and can’t do remains much the same as last week or last month!

Let’s start with the wormhole thing. I can’t really improve over how I put it in Dennis Overbye’s NYT piece:

“The most important thing I’d want New York Times readers to understand is this,” Scott Aaronson, a quantum computing expert at the University of Texas in Austin, wrote in an email. “If this experiment has brought a wormhole into actual physical existence, then a strong case could be made that you, too, bring a wormhole into actual physical existence every time you sketch one with pen and paper.”

More broadly, Overbye’s NYT piece explains with admirable clarity what this experiment did and didn’t do—leaving only the question “wait … if that’s all that’s going on here, then why is it being written up in the NYT??” This is a rare case where, in my opinion, the NYT did a much better job than Quanta, which unequivocally accepted and amplified the “QC creates a wormhole” framing.

Alright, but what’s the actual basis for the “QC creates a wormhole” claim, for those who don’t want to leave this blog to read about it? Well, the authors used 9 of Sycamore’s 72 qubits to do a crude simulation of something called the SYK (Sachdev-Ye-Kitaev) model. SYK has become popular as a toy model for quantum gravity. In particular, it has a holographic dual description, which can indeed involve a spacetime with one or more wormholes. So, they ran a quantum circuit that crudely modelled the SYK dual of a scenario with information sent through a wormhole. They then confirmed that the circuit did what it was supposed to do—i.e., what they’d already classically calculated that it would do.

So, the objection is obvious: if someone simulates a black hole on their classical computer, they don’t say they thereby “created a black hole.” Or if they do, journalists don’t uncritically repeat the claim. Why should the standards be different just because we’re talking about a quantum computer rather than a classical one?

Did we at least learn anything new about SYK wormholes from the simulation? Alas, not really, because 9 qubits take a mere 2^9 = 512 complex numbers to specify their wavefunction, and are therefore trivial to simulate on a laptop. There’s some argument in the paper that, if the simulation were scaled up to (say) 100 qubits, then maybe we would learn something new about SYK. Even then, however, we’d mostly learn about certain corrections that arise because the simulation was being done with “only” n=100 qubits, rather than in the n→∞ limit where SYK is rigorously understood. But while those corrections, arising when n is “neither too large nor too small,” would surely be interesting to specialists, they’d have no obvious bearing on the prospects for creating real physical wormholes in our universe.
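To put numbers on "trivial to simulate on a laptop," here is a minimal Python sketch of how much memory a brute-force state-vector simulation needs. It assumes each amplitude is stored as two 64-bit floats (16 bytes); the function name and that storage format are illustrative choices, not anything taken from the paper. Full state-vector storage is also only the most naive simulation method, so the larger figures are an upper bound on difficulty rather than a statement about the best classical algorithms.

```python
def statevector_bytes(n_qubits: int, bytes_per_amplitude: int = 16) -> int:
    """Bytes needed to store all 2**n_qubits complex amplitudes.

    Assumption: one amplitude = two float64 values = 16 bytes.
    """
    return (2 ** n_qubits) * bytes_per_amplitude


# 9 qubits is the wormhole experiment; 53 and 100 are shown for contrast.
for n in (9, 53, 100):
    print(f"{n:>3} qubits: {2 ** n:.3e} amplitudes, "
          f"{statevector_bytes(n):.3e} bytes")
```

For 9 qubits this comes to 8 KiB, which is why the classical calculation could be done ahead of time.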

And yet, this is not a sensationalistic misunderstanding invented by journalists. Some prominent quantum gravity theorists themselves—including some of my close friends and collaborators—persist in talking about the simulated SYK wormhole as “actually being” a wormhole. What are they thinking?

Daniel Harlow explained the thinking to me as follows (he stresses that he’s explaining it, not necessarily endorsing it). If you had two entangled quantum computers, one on Earth and the other in the Andromeda galaxy, and if they were both simulating SYK, and if Alice on Earth and Bob in Andromeda both uploaded their own brains into their respective quantum simulations, then it seems possible that the simulated Alice and Bob could have the experience of jumping into a wormhole and meeting each other in the middle. Granted, they couldn’t get a message back out from the wormhole, at least not without “going the long way,” which could happen only at the speed of light—so only simulated-Alice and simulated-Bob themselves could ever test this prediction. Nevertheless, if true, I suppose some would treat it as grounds for regarding a quantum simulation of SYK as “more real” or “more wormholey” than a classical simulation.

Of course, this scenario depends on strong assumptions not merely about quantum gravity, but also about the metaphysics of consciousness! And I’d still prefer to call it a simulated wormhole for simulated people.

For completeness, here’s Harlow’s passage from the NYT article:

Daniel Harlow, a physicist at M.I.T. who was not involved in the experiment, noted that the experiment was based on a model of quantum gravity that was so simple, and unrealistic, that it could just as well have been studied using a pencil and paper.

“So I’d say that this doesn’t teach us anything about quantum gravity that we didn’t already know,” Dr. Harlow wrote in an email. “On the other hand, I think it is exciting as a technical achievement, because if we can’t even do this (and until now we couldn’t), then simulating more interesting quantum gravity theories would CERTAINLY be off the table.” Developing computers big enough to do so might take 10 or 15 years, he added.

Alright, let’s move on to the claim that quantum supremacy has been refuted. What Aharonov et al. actually show in their new work, building on earlier work by Gao and Duan, is that Random Circuit Sampling, with a constant rate of noise per gate and no error-correction, can’t provide a scalable approach to quantum supremacy. Or more precisely: as the number of qubits n goes to infinity, and assuming you’re in the “anti-concentration regime” (which in practice probably means: the depth of your quantum circuit is at least ~log(n)), there’s a classical algorithm to approximately sample the quantum circuit’s output distribution in poly(n) time (albeit, not yet a practical algorithm).

Here’s what’s crucial to understand: this is 100% consistent with what those of us working on quantum supremacy had assumed since at least 2016! We knew that if you tried to scale Random Circuit Sampling to 200 or 500 or 1000 qubits, while you also increased the circuit depth proportionately, the signal-to-noise ratio would become exponentially small, meaning that your quantum speedup would disappear. That’s why, from the very beginning, we targeted the “practical” regime of 50-100 qubits: a regime where

1. you can still see explicitly that you’re exploiting a 2^50- or 2^100-dimensional Hilbert space for computational advantage, thereby confirming one of the main predictions of quantum computing theory, but
2. you also have a signal that (as it turned out) is large enough to see with heroic effort.
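The exponential decay of the signal can be seen in a back-of-the-envelope model. The sketch below assumes that errors are independent, so the circuit fidelity is roughly (1 − ε) raised to the number of gates, and that a circuit on n qubits with depth d has about n·d/2 two-qubit gates. Both are simplifying assumptions, and the 0.5% error rate and depth 20 are illustrative values, not figures from the post or from any experiment.

```python
import math


def log10_fidelity(n_qubits: int, depth: int, error_per_gate: float) -> float:
    """log10 of the estimated circuit fidelity under a simple noise model.

    Assumptions (illustrative only): about n_qubits * depth / 2 two-qubit
    gates, each failing independently with probability error_per_gate.
    Working in log10 avoids floating-point underflow for large circuits.
    """
    n_gates = n_qubits * depth // 2
    return n_gates * math.log10(1.0 - error_per_gate)


# Hypothetical numbers: 0.5% error per gate, depth scaled in proportion to n.
for n in (53, 200, 500, 1000):
    depth = round(20 * n / 53)
    print(f"n={n:>4}, depth={depth:>3}: fidelity ~ 10^{log10_fidelity(n, depth, 0.005):.1f}")
```

Under these assumptions the signal at around 50 qubits is small but measurable, while at 1000 qubits with proportional depth it is hundreds of orders of magnitude below anything observable.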

To their credit, Aharonov et al. explain all this perfectly clearly in their abstract and introduction. I’m just worried that others aren’t reading their paper as carefully as they should be!

So then, what’s the new advance in the Aharonov et al. paper? Well, there had been some hope that circuit depth ~log(n) might be a sweet spot, where an exponential quantum speedup might both exist and survive constant noise, even in the asymptotic limit of n→∞ qubits. Nothing in Google’s or USTC’s actual Random Circuit Sampling experiments depended on that hope, but it would’ve been nice if it were true. What Aharonov et al. have now done is to kill that hope, using powerful techniques involving summing over Feynman paths in the Pauli basis.

Stepping back, what is the current status of quantum supremacy based on Random Circuit Sampling? I would say it’s still standing, but more precariously than I’d like—underscoring the need for new and better quantum supremacy experiments. In more detail, Pan, Chen, and Zhang have shown how to simulate Google’s 53-qubit Sycamore chip classically, using what I estimated to be 100-1000X the electricity cost of running the quantum computer itself (including the dilution refrigerator!). Approaching the problem from a different angle, Gao et al. have given a polynomial-time classical algorithm for spoofing Google’s Linear Cross-Entropy Benchmark (LXEB)—but their algorithm can currently achieve only about 10% of the excess in LXEB that Google’s experiment found.
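For readers who haven't met the Linear Cross-Entropy Benchmark, its standard definition is 2^n times the average ideal probability of the observed bitstrings, minus 1. Here is a small Python sketch of that score. As a loudly labeled assumption, a random Gaussian state vector on 10 qubits stands in for the output of a random circuit; no actual circuit is simulated, and all names below are hypothetical.

```python
import random


def random_state_probs(n_qubits, rng):
    """Outcome probabilities of a random state (a stand-in for a random circuit)."""
    dim = 2 ** n_qubits
    amps = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)]
    norm = sum(abs(a) ** 2 for a in amps)
    return [abs(a) ** 2 / norm for a in amps]


def linear_xeb(probs, samples):
    """Linear cross-entropy score: dim * mean ideal probability of samples - 1."""
    return len(probs) * sum(probs[x] for x in samples) / len(samples) - 1.0


rng = random.Random(0)  # fixed seed so the run is repeatable
probs = random_state_probs(10, rng)
outcomes = list(range(len(probs)))

# A noiseless device samples from the ideal distribution...
ideal_samples = rng.choices(outcomes, weights=probs, k=20000)
# ...while a fully depolarized device outputs uniformly random bitstrings.
uniform_samples = [rng.randrange(len(probs)) for _ in range(20000)]

xeb_ideal = linear_xeb(probs, ideal_samples)
xeb_uniform = linear_xeb(probs, uniform_samples)
print(f"ideal sampler:   LXEB = {xeb_ideal:.3f}")
print(f"uniform sampler: LXEB = {xeb_uniform:.3f}")
```

The ideal sampler scores close to 1 and the uniform sampler close to 0. A noisy device lands somewhere in between, and a spoofing algorithm is judged by how much of that excess over 0 it can reproduce.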

So, though it’s been under sustained attack from multiple directions these past few years, I’d say that the flag of quantum supremacy yet waves. The Extended Church-Turing Thesis is still on thin ice. The wormhole is still open. Wait … no … that’s not what I meant to write…

Note: With this post, as with future science posts, all off-topic comments will be ruthlessly left in moderation. Yes, even if the comments “create their own reality” full of anger and disappointment that I talked about what I talked about, instead of what the commenter wanted me to talk about. Even if merely refuting the comments would require me to give in and talk about their preferred topics after all. Please stop. This is a wormholes-‘n-supremacy post.

### Reform AI Alignment

Sunday, November 20th, 2022

Update (Nov. 22): Theoretical computer scientist and longtime friend-of-the-blog Boaz Barak writes to tell me that, coincidentally, he and Ben Edelman just released a big essay advocating a version of “Reform AI Alignment” on Boaz’s Windows on Theory blog, as well as on LessWrong. (I warned Boaz that, having taken the momentous step of posting to LessWrong, in 6 months he should expect to find himself living in a rationalist group house in Oakland…) Needless to say, I don’t necessarily endorse their every word or vice versa, but there’s a striking amount of convergence. They also have a much more detailed discussion of (e.g.) which kinds of optimization processes they consider relatively safe.

Nearly halfway into my year at OpenAI, still reeling from the FTX collapse, I feel like it’s finally time to start blogging my AI safety thoughts—starting with a little appetizer course today, more substantial fare to come.

Many people claim that AI alignment is little more than a modern eschatological religion—with prophets, an end-times prophecy, sacred scriptures, and even a god (albeit, one who doesn’t exist quite yet). The obvious response to that claim is that, while there’s some truth to it, “religions” based around technology are a little different from the old kind, because technological progress actually happens regardless of whether you believe in it.

I mean, the Internet is sort of like the old concept of the collective unconscious, except that it actually exists and you’re using it right now. Airplanes and spacecraft are kind of like the ancient dream of Icarus—except, again, for the actually existing part. Today GPT-3 and DALL-E2 and LaMDA and AlphaTensor exist, as they didn’t two years ago, and one has to try to project forward to what their vastly-larger successors will be doing a decade from now. Though some of my colleagues are still in denial about it, I regard the fact that such systems will have transformative effects on civilization, comparable to or greater than those of the Internet itself, as “already baked in”—as just the mainstream position, not even a question anymore. That doesn’t mean that future AIs are going to convert the earth into paperclips, or give us eternal life in a simulated utopia. But their story will be a central part of the story of this century.

Which brings me to a second response. If AI alignment is a religion, it’s now large and established enough to have a thriving “Reform” branch, in addition to the original “Orthodox” branch epitomized by Eliezer Yudkowsky and MIRI.  As far as I can tell, this Reform branch now counts among its members a large fraction of the AI safety researchers now working in academia and industry.  (I’ll leave the formation of a Conservative branch of AI alignment, which reacts against the Reform branch by moving slightly back in the direction of the Orthodox branch, as a problem for the future — to say nothing of Reconstructionist or Marxist branches.)

Here’s an incomplete but hopefully representative list of the differences in doctrine between Orthodox and Reform AI Risk:

(1) Orthodox AI-riskers tend to believe that humanity will survive or be destroyed based on the actions of a few elite engineers over the next decade or two.  Everything else—climate change, droughts, the future of US democracy, war over Ukraine and maybe Taiwan—fades into insignificance except insofar as it affects those engineers.

We Reform AI-riskers, by contrast, believe that AI might well pose civilizational risks in the coming century, but so does all the other stuff, and it’s all tied together.  An invasion of Taiwan might change which world power gets access to TSMC GPUs.  Almost everything affects which entities pursue the AI scaling frontier and whether they’re cooperating or competing to be first.

(2) Orthodox AI-riskers believe that public outreach has limited value: most people can’t understand this issue anyway, and will need to be saved from AI despite themselves.

We Reform AI-riskers believe that trying to get a broad swath of the public on board with one’s preferred AI policy is something close to a deontological imperative.

(3) Orthodox AI-riskers worry almost entirely about an agentic, misaligned AI that deceives humans while it works to destroy them, along the way to maximizing its strange utility function.

We Reform AI-riskers entertain that possibility, but we worry at least as much about powerful AIs that are weaponized by bad humans, which we expect to pose existential risks much earlier in any case.

(4) Orthodox AI-riskers have limited interest in AI safety research applicable to actually-existing systems (LaMDA, GPT-3, DALL-E2, etc.), seeing the dangers posed by those systems as basically trivial compared to the looming danger of a misaligned agentic AI.

We Reform AI-riskers see research on actually-existing systems as one of the only ways to get feedback from the world about which AI safety ideas are or aren’t promising.

(5) Orthodox AI-riskers worry most about the “FOOM” scenario, where some AI might cross a threshold from innocuous-looking to plotting to kill all humans in the space of hours or days.

We Reform AI-riskers worry most about the “slow-moving trainwreck” scenario, where (just like with climate change) well-informed people can see the writing on the wall decades ahead, but just can’t line up everyone’s incentives to prevent it.

(6) Orthodox AI-riskers talk a lot about a “pivotal act” to prevent a misaligned AI from ever being developed, which might involve (e.g.) using an aligned AI to impose a worldwide surveillance regime.

We Reform AI-riskers worry more about such an act causing the very calamity that it was intended to prevent.

(7) Orthodox AI-riskers feel a strong need to repudiate the norms of mainstream science, seeing them as too slow-moving to react in time to the existential danger of AI.

We Reform AI-riskers feel a strong need to get mainstream science on board with the AI safety program.

(8) Orthodox AI-riskers are maximalists about the power of pure, unaided superintelligence to just figure out how to commandeer whatever physical resources it needs to take over the world (for example, by messaging some lab over the Internet, and tricking it into manufacturing nanobots that will do the superintelligence’s bidding).

We Reform AI-riskers believe that, here just like in high school, there are limits to the power of pure intelligence to achieve one’s goals.  We’d expect even an agentic, misaligned AI, if such existed, to need a stable power source, robust interfaces to the physical world, and probably allied humans before it posed much of an existential threat.

What have I missed?

### Sam Bankman-Fried and the geometry of conscience

Sunday, November 13th, 2022

Update (Nov. 16): Check out this new interview of SBF by my friend and leading Effective Altruist writer Kelsey Piper. Here Kelsey directly confronts SBF with some of the same moral and psychological questions that animated this post and the ensuing discussion—and, surely to the consternation of his lawyers, SBF answers everything she asks. And yet I still don’t know what exactly to make of it. SBF’s responses reveal a surprising cynicism (surprising because, if you’re that cynical, why be open about it?), as well as an optimism that he can still fix everything that seems wildly divorced from reality.

I still stand by most of the main points of my post, including:

• the technical insanity of SBF’s clearly-expressed attitude to risk (“gambler’s ruin? more like gambler’s opportunity!!”), and its probable role in creating the conditions for everything that followed,
• the need to diagnose the catastrophe correctly (making billions of dollars in order to donate them to charity? STILL VERY GOOD; lying and raiding customer deposits in the course of doing so? DEFINITELY BAD), and
• how, when sneerers judge SBF guilty just for being a crypto billionaire who talked about Effective Altruism, it ironically lets him off the hook for what he specifically did that was terrible.

But over the past couple days, I’ve updated in the direction of understanding SBF’s psychology a lot less than I thought I did. While I correctly hit on certain aspects of the tragedy, there are other important aspects—the drug use, the cynical detachment (“life as a video game”), the impulsivity, the apparent lying—that I neglected to touch on and about which we’ll surely learn more in the coming days, weeks, and years. –SA

Several readers have asked me for updated thoughts on AI safety, now that I’m 5 months into my year at OpenAI—and I promise, I’ll share them soon! The thing is, until last week I’d entertained the idea of writing up some of those thoughts for an essay competition run by the FTX Future Fund, which (I was vaguely aware) was founded by the cryptocurrency billionaire Sam Bankman-Fried, henceforth SBF.

Alas, unless you’ve been tucked away on some Caribbean island—or perhaps, especially if you have been—you’ll know that the FTX Future Fund has ceased to exist. In the course of 2-3 days last week, SBF’s estimated net worth went from ~$15 billion to a negative number, possibly the fastest evaporation of such a vast personal fortune in all human history. Notably, SBF had promised to give virtually all of it away to various worthy causes, including mitigating existential risk and helping Democrats win elections, and the worldwide Effective Altruist community had largely reoriented itself around that pledge. That’s all now up in smoke.

I’ve never met SBF, although he was a physics undergraduate at MIT while I taught CS there. What little I knew of SBF before this week came mostly from reading Gideon Lewis-Kraus’s excellent New Yorker article about Effective Altruism this summer.

The details of what happened at FTX are at once hopelessly complicated and—it would appear—damningly simple, involving the misuse of billions of dollars’ worth of customer deposits to place risky bets that failed. SBF has, in any case, tweeted that he “fucked up and should have done better.”

You’d think none of this would directly impact me, since SBF and I inhabit such different worlds. He ran a crypto empire from the Bahamas, sharing a group house with other twentysomething executives who often dated each other. I teach at a large state university and try to raise two kids. He made his first fortune by arbitraging bitcoin between Asia and the West. I own, I think, a couple bitcoins that someone gave me in 2016, but have no idea how to access them anymore. His hair is large and curly; mine is neither.

Even so, I’ve found myself obsessively following this story because I know that, in a broader sense, I will be called to account for it. SBF and I both grew up as nerdy kids in middle-class Jewish American families, and both had transformative experiences as teenagers at Canada/USA Mathcamp. He and I know many of the same people. We’ve both been attracted to the idea of small groups of idealistic STEM nerds using their skills to help save the world from climate change, pandemics, and fascism.

Aha, the sneerers will sneer! Hasn’t the entire concept of “STEM nerds saving the world” now been utterly discredited, revealed to be just a front for cynical grifters and Ponzi schemers? So if I’m also a STEM nerd who’s also dreamed of helping to save the world, then don’t I stand condemned too?

I’m writing this post because, if the Greek tragedy of SBF is going to be invoked as a cautionary tale in nerd circles forevermore—which it will be—then I think it’s crucial that we tell the right cautionary tale.

It’s like, imagine the Apollo 11 moon mission had almost succeeded, but because of a tiny crack in an oxygen tank, it instead exploded in lunar orbit, killing all three of the astronauts. Imagine that the crack formed partly because, in order to hide a budget overrun, Wernher von Braun had secretly substituted a cheaper material, while telling almost none of his underlings.

There are many excellent lessons that one could draw from such a tragedy, having to do with, for example, the construction of oxygen tanks, the procedures for inspecting them, Wernher von Braun as an individual, or NASA safety culture.

But there would also be bad lessons to not draw. These include:

• “The entire enterprise of sending humans to the moon was obviously doomed from the start.”
• “Fate will always punish human hubris.”
• “All the engineers’ supposed quantitative expertise proved to be worthless.”

From everything I’ve read, SBF’s mission to earn billions, then spend it saving the world, seems something like this imagined Apollo mission. Yes, the failure was total and catastrophic, and claimed innocent victims. Yes, while bad luck played a role, so did, shall we say, fateful decisions with a moral dimension. If it’s true that, as alleged, FTX raided its customers’ deposits to prop up the risky bets of its sister organization Alameda Research, multiple countries’ legal systems will surely be sorting out the consequences for years. To my mind, though, it’s important not to minimize the gravity of the fateful decision by conflating it with everything that preceded it.

I confess to taking this sort of conflation extremely personally. For eight years now, the rap against me, advanced by thousands (!) on social media, has been: sure, while by all accounts Aaronson is kind and respectful to women, he seems like exactly the sort of nerdy guy who, still bitter and frustrated over high school, could’ve chosen instead to sexually harass women and hinder their scientific careers. In other words, I stand condemned by part of the world, not for the choices I made, but for choices I didn’t make that are considered “too close to me” in the geometry of conscience.

And I don’t consent to that. I don’t wish to be held accountable for the misdeeds of my doppelgängers in parallel universes. Therefore, I resolve not to judge anyone else by their parallel-universe doppelgängers either. If SBF indeed gambled away his customers’ deposits and lied about it, then I condemn him for it utterly, but I refuse to condemn his hypothetical doppelgänger who didn’t do those things.

Granted, there are those who think all cryptocurrency is a Ponzi scheme and a scam, and that for that reason alone, it should’ve been obvious from the start that crypto-related plans could only end in catastrophe. The “Ponzi scheme” theory of cryptocurrency has, we ought to concede, a substantial case in its favor—though I’d rather opine about the matter in (say) 2030 than now. Like many technologies that spend years as quasi-scams until they aren’t, maybe blockchains will find some compelling everyday use-cases, besides the well-known ones like drug-dealing, ransomware, and financing rogue states.

Even if cryptocurrency remains just a modern-day tulip bulb or Beanie Baby, though, it seems morally hard to distinguish a cryptocurrency trader from the millions who deal in options, bonds, and all manner of other speculative assets. And a traditional investor who made billions on successful gambles, or arbitrage, or creating liquidity, then gave virtually all of it away to effective charities, would seem, on net, way ahead of most of us morally.

To be sure, I never pursued the “Earning to Give” path myself, though certainly the concept occurred to me as a teenager, before it had a name. Partly I decided against it because I seem to lack a certain brazenness, or maybe just willingness to follow up on tedious details, needed to win in business. Partly, though, I decided against trying to get rich because I’m selfish (!). I prioritized doing fascinating quantum computing research, starting a family, teaching, blogging, and other stuff I liked over devoting every waking hour to possibly earning a fortune only to give it all to charity, and more likely being a failure even at that. All told, I don’t regret my scholarly path—especially not now!—but I’m also not going to encase it in some halo of obvious moral superiority.

If I could go back in time and give SBF advice—or if, let’s say, he’d come to me at MIT for advice back in 2013—what could I have told him? I surely wouldn’t talk about cryptocurrency, about which I knew and know little. I might try to carve out some space for deontological ethics against pure utilitarianism, but I might also consider that a lost cause with this particular undergrad.

On reflection, maybe I’d just try to convince SBF to weight money logarithmically when calculating expected utility (as in the Kelly criterion), to forsake the linear weighting that SBF explicitly advocated and that he seems to have put into practice in his crypto ventures. Or if not logarithmic weighting, I’d try to sell him on some concave utility function—something that makes, let’s say, a mere $1 billion in hand seem better than $15 billion that has a 50% probability of vanishing and leaving you, your customers, your employees, and the entire Effective Altruism community with less than nothing.

At any rate, I’d try to impress on him, as I do on anyone reading now, that the choice between linear and concave utilities, between risk-neutrality and risk-aversion, is not bloodless or technical—that it’s essential to make a choice that’s not only in reflective equilibrium with your highest values, but that you’ll still consider to be such regardless of which possible universe you end up in.
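The gap between linear and logarithmic weighting of money is easy to see numerically. The sketch below is a toy illustration with made-up numbers, not a model of anything FTX did: a repeated even-money bet that wins with probability 0.6, where the bettor stakes a fixed fraction of current wealth each round. The function names are hypothetical, and the code assumes 0 < p_win < 1.

```python
import math


def expected_linear_growth(fraction: float, p_win: float) -> float:
    """Expected wealth multiplier per bet (what a risk-neutral bettor maximizes)."""
    return p_win * (1 + fraction) + (1 - p_win) * (1 - fraction)


def expected_log_growth(fraction: float, p_win: float) -> float:
    """Expected log of the wealth multiplier per bet (what Kelly maximizes)."""
    if fraction >= 1.0:
        return -math.inf  # staking everything means one loss wipes you out
    return p_win * math.log(1 + fraction) + (1 - p_win) * math.log(1 - fraction)


def kelly_fraction(p_win: float) -> float:
    """Kelly-optimal stake for an even-money bet: p - q = 2p - 1."""
    return 2 * p_win - 1


p = 0.6  # illustrative win probability
for f in (kelly_fraction(p), 0.5, 0.9, 1.0):
    print(f"stake {f:.2f}: linear {expected_linear_growth(f, p):.3f}, "
          f"log {expected_log_growth(f, p):+.3f}")
```

The linear criterion always says to stake more, with betting everything as the optimum. The logarithmic criterion peaks at the Kelly fraction of 0.2, turns negative for large stakes, and assigns negative infinity to staking everything, since a single loss then leaves nothing to bet with.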

### Explanation-Gödel and Plausibility-Gödel

Wednesday, October 12th, 2022

Here’s an observation that’s mathematically trivial but might not be widely appreciated. In kindergarten, we all learned Gödel’s First Incompleteness Theorem, which given a formal system F, constructs an arithmetical encoding of

G(F) = “This sentence is not provable in F.”

If G(F) is true, then it’s an example of a true arithmetical sentence that’s unprovable in F. If, on the other hand, G(F) is false, then it’s provable, which means that F isn’t arithmetically sound. Therefore F is either incomplete or unsound.
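In symbols, the argument above can be sketched as follows. This is only a sketch: it assumes F is recursively axiomatized and strong enough for the diagonal lemma, and it uses Prov_F for F's standard provability predicate, notation that the post itself does not introduce.

```latex
% Assumptions: F is recursively axiomatized and contains enough arithmetic
% for the diagonal lemma; \mathrm{Prov}_F is its standard provability predicate.
% (\ulcorner and \urcorner require the amssymb package.)
\[
  F \vdash \; G(F) \leftrightarrow \neg\,\mathrm{Prov}_F\bigl(\ulcorner G(F) \urcorner\bigr)
\]
\[
  G(F)\ \text{true} \;\Longrightarrow\; F \nvdash G(F)
  \quad \text{(a true sentence that $F$ cannot prove)}
\]
\[
  G(F)\ \text{false} \;\Longrightarrow\; F \vdash G(F)
  \quad \text{($F$ proves a false sentence, so $F$ is unsound)}
\]
```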

Many have objected: “but despite Gödel’s Theorem, it’s still easy to explain why G(F) is true. In fact, the argument above basically already did it!”

[Note: Please stop leaving comments explaining to me that G(F) follows from F’s consistency. I understand that: the “heuristic” part of the argument is F’s consistency! I made a pedagogical choice to elide that, which nerd-sniping has now rendered untenable.]

You might make a more general point: there are many, many mathematical statements for which we currently lack a proof, but we do seem to have a fully convincing heuristic explanation: one that “proves the statement to physics standards of rigor.” For example:

• The Twin Prime Conjecture (there are infinitely many primes p for which p+2 is also prime).
• The Collatz Conjecture (the iterative process that maps each positive integer n to n/2 if n is even, or to 3n+1 if n is odd, eventually reaches 1 regardless of which n you start at).
• π is a normal number (or even just: the digits 0-9 all occur with equal limiting frequencies in the decimal expansion of π).
• π+e is irrational.

And so on. No one has any idea how to prove any of the above statements—and yet, just on statistical grounds, it seems clear that it would require a ludicrous conspiracy to make any of them false.
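As a small illustration of the kind of empirical evidence behind one of these statements, here is a sketch of the Collatz iteration exactly as described in the list above. The function name `collatz_steps`, the `max_steps` cutoff, and the search bound of 10,000 are choices made for this example. A finite check like this is of course evidence, not a proof.

```python
def collatz_steps(n, max_steps=10_000):
    """Number of Collatz steps needed to reach 1 from n.

    Returns None if 1 is not reached within max_steps (an arbitrary cutoff
    chosen for this sketch).
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 0
    while n != 1:
        if steps >= max_steps:
            return None
        # n -> n/2 if n is even, n -> 3n+1 if n is odd
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps

# Check every starting value up to an arbitrary bound.
all_reach_one = all(collatz_steps(n) is not None for n in range(1, 10_001))
print(all_reach_one)
```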

Conversely, one could argue that there are statements for which we do have a proof, even though we lack a “convincing explanation” for the statements’ truth. Maybe the Four-Color Theorem or Hales’s Theorem, for which every known proof requires a massive computer enumeration of cases, belong to this class. Other people might argue that, given a proof, an explanation could always be extracted with enough time and effort, though resolving this dispute won’t matter for what follows.

You might hope that, even if some true mathematical statements can’t be proved, every true statement might nevertheless have a convincing heuristic explanation. Alas, a trivial adaptation of Gödel’s Theorem shows that, if (1) heuristic explanations are to be checkable by computer, and (2) only true statements are to have convincing heuristic explanations, then this isn’t possible either. I mean, let E be a program that accepts or rejects proposed heuristic explanations, for statements like the Twin Prime Conjecture or the Collatz Conjecture. Then construct the sentence

S(E) = “This sentence has no convincing heuristic explanation accepted by E.”

If S(E) is true, then it’s an example of a true arithmetical statement without even a convincing heuristic explanation for its truth (!). If, on the other hand, S(E) is false, then there’s a convincing heuristic explanation of its truth, which means that something has gone wrong.

What’s happening, of course, is that given the two conditions we imposed, our “heuristic explanation system” was a proof system, even though we didn’t call it one. This is my point, though: when we use the word “proof,” it normally invokes a specific image, of a sequence of statements that marches from axioms to a theorem, with each statement following from the preceding ones by rigid inference rules like those of first-order logic. None of that, however, plays any direct role in the proof of the Incompleteness Theorem, which cares only about soundness (inability to prove falsehoods) and checkability by a computer (what, with hindsight, Gödel’s “arithmetization of syntax” was all about). The logic works for “heuristic explanations” too.
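One way to spell out the argument, as a sketch in notation that the post itself does not use: write ⌜φ⌝ for the arithmetical code of a sentence φ, and Acc_E(x, y) for "the program E accepts x as an explanation of the sentence coded by y." Checkability by computer is what lets Acc_E be expressed in arithmetic at all; soundness is the second line below.

```latex
% Notation assumed for this sketch only:
%   \ulcorner \varphi \urcorner : arithmetical code of the sentence \varphi
%   \mathrm{Acc}_E(x, y)        : "E accepts x as an explanation of the sentence coded by y"
\begin{align*}
&\text{Diagonal sentence:}
  && S(E) \iff \neg\,\exists x\; \mathrm{Acc}_E\bigl(x, \ulcorner S(E) \urcorner\bigr) \\
&\text{Soundness of } E\text{:}
  && \exists x\; \mathrm{Acc}_E\bigl(x, \ulcorner \varphi \urcorner\bigr) \implies \varphi
     \quad \text{for every sentence } \varphi \\
&\text{Suppose } \neg S(E)\text{:}
  && \exists x\; \mathrm{Acc}_E\bigl(x, \ulcorner S(E) \urcorner\bigr) \implies S(E),
     \ \text{a contradiction} \\
&\text{Hence:}
  && S(E) \text{ is true, and } E \text{ accepts no explanation of it.}
\end{align*}
```

Nothing in these four lines refers to axioms or inference rules, which is the point being made: only soundness and mechanical checkability are used.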

Now we come to something that I picked up from my former student (and now AI alignment leader) Paul Christiano, on a recent trip to the Bay Area, and which I share with Paul’s kind permission. Having learned that there’s no way to mechanize even heuristic explanations for all the true statements of arithmetic, we could set our sights lower still, and ask about mere plausibility arguments—arguments that might be overturned on further reflection. Is there some sense in which every true mathematical statement at least has a good plausibility argument?

Maybe you see where this is going. Letting P be a program that accepts or rejects proposed plausibility arguments, we can construct

S(P) = “This sentence has no argument for its plausibility accepted by P.”

If S(P) is true, then it’s an example of a true arithmetical statement without even a plausibility argument for its truth (!). If, on the other hand, S(P) is false, then there is a plausibility argument for it. By itself, this is not at all a fatal problem: all sorts of false statements (IP≠PSPACE, switching doors doesn’t matter in Monty Hall, Trump couldn’t possibly become president…) have had decent plausibility arguments. Having said that, it’s pretty strange that you can have a plausibility argument that’s immediately contradicted by its own existence! This rules out some properties that you might want your “plausibility system” to have, although maybe a plausibility system exists that’s still nontrivial and that has weaker properties.

Anyway, I don’t know where I’m going with this, or even why I posted it, but I hope you enjoyed it! And maybe there’s something to be discovered in this direction.

Wednesday, September 14th, 2022

As I slept fitfully, still recovering from COVID, I had one of the more interesting dreams of my life:

I was desperately trying to finish some PowerPoint slides in time to give a talk. Uncharacteristically for me, one of the slides displayed actual code. This was a dream, so nothing was as clear as I’d like, but the code did something vaguely reminiscent of Rosser’s Theorem—e.g., enumerating all proofs in ZFC until it finds the lexicographically first proof or disproof of a certain statement, then branching into cases depending on whether it’s a proof or a disproof. In any case, it was simple enough to fit on one slide.

Suddenly, though, my whole presentation was deleted. Everything was ruined!

One of the developers of PowerPoint happened to be right there in the lecture hall (of course!), so I confronted him with my laptop and angrily demanded an explanation. He said that I must have triggered the section of Microsoft Office that tries to detect and prevent any discussion of logical paradoxes that are too dangerous for humankind—the ones that would cause people to realize that our entire universe is just an illusion, a sandbox being run inside an AI, a glitch-prone Matrix. He said it patronizingly, as if it should’ve been obvious: “you and I both know that the Paradoxes are not to be talked about, so why would you be so stupid as to put one in your presentation?”

My reaction was to jab my finger in the guy’s face, shove him, scream, and curse him out. At that moment, I wasn’t concerned in the slightest about the universe being an illusion, or about glitches in the Matrix. I was concerned about my embarrassment when I’d be called in 10 minutes to give my talk and would have nothing to show.

My last thought, before I woke with a start, was to wonder whether Greg Kuperberg was right and I should give my presentations in Beamer, or some other open-source software, and then I wouldn’t have had this problem.

A coda: I woke a bit after 7AM Central and started to write this down. But then—this is now real life (!)—I saw an email saying that a dozen people were waiting for me in a conference room in Europe for an important Zoom meeting. We’d gotten the time zones wrong; I’d thought that it wasn’t until 8AM my time. If not for this dream causing me to wake up, I would’ve missed the meeting entirely.

### On black holes, holography, the Quantum Extended Church-Turing Thesis, fully homomorphic encryption, and brain uploading

Wednesday, July 27th, 2022

I promise you: this post is going to tell a scientifically coherent story that involves all five topics listed in the title. Not one can be omitted.

My story starts with a Zoom talk that the one and only Lenny Susskind delivered for the Simons Institute for Theory of Computing back in May. There followed a panel discussion involving Lenny, Edward Witten, Geoffrey Penington, Umesh Vazirani, and your humble shtetlmaster.

Lenny’s talk led up to a gedankenexperiment involving an observer, Alice, who bravely jumps into a specially-prepared black hole, in order to see the answer to a certain computational problem in her final seconds before being ripped to shreds near the singularity. Drawing on earlier work by Bouland, Fefferman, and Vazirani, Lenny speculated that the computational problem could be exponentially hard even for a (standard) quantum computer. Despite this, Lenny repeatedly insisted—indeed, he asked me again to stress here—that he was not claiming to violate the Quantum Extended Church-Turing Thesis (QECTT), the statement that all of nature can be efficiently simulated by a standard quantum computer. Instead, he was simply investigating how the QECTT needs to be formulated in order to be a true statement.

I didn’t understand this, to put it mildly. If what Lenny was saying was right—i.e., if the infalling observer could see the answer to a computational problem not in BQP, or Bounded-Error Quantum Polynomial-Time—then why shouldn’t we call that a violation of the QECTT? Just like we call Shor’s quantum factoring algorithm a likely violation of the classical Extended Church-Turing Thesis, the thesis saying that nature can be efficiently simulated by a classical computer? Granted, you don’t have to die in order to run Shor’s algorithm, as you do to run Lenny’s experiment. But why should such implementation details matter from the lofty heights of computational complexity?

Alas, not only did Lenny never answer that in a way that made sense to me, he kept trying to shift the focus from real, physical black holes to “silicon spheres” made of qubits, which would be programmed to simulate the process of Alice jumping into the black hole (in a dual boundary description). Say what? Granting that Lenny’s silicon spheres, being quantum computers under another name, could clearly be simulated in BQP, wouldn’t this still leave the question about the computational powers of observers who jump into actual black holes—i.e., the question that we presumably cared about in the first place?

Confusing me even further, Witten seemed almost dismissive of the idea that Lenny’s gedankenexperiment raised any new issue for the QECTT—that is, any issue that wouldn’t already have been present in a universe without gravity. But as to Witten’s reasons, the most I understood from his remarks was that he was worried about various “engineering” issues with implementing Lenny’s gedankenexperiment, involving gravitational backreaction and the like. Ed Witten, now suddenly the practical guy! I couldn’t even isolate the crux of disagreement between Susskind and Witten, since after all, they agreed (bizarrely, from my perspective) that the QECTT wasn’t violated. Why wasn’t it?

Anyway, shortly afterward I attended the 28th Solvay Conference in Brussels, where one of the central benefits I got—besides seeing friends after a long COVID absence and eating some amazing chocolate mousse—was a dramatically clearer understanding of the issues in Lenny’s gedankenexperiment. I owe this improved understanding to conversations with many people at Solvay, but above all Daniel Gottesman and Daniel Harlow. Lenny himself wasn’t there, other than in spirit, but I ran the Daniels’ picture by him afterwards and he assented to all of its essentials.

The Daniels’ picture is what I want to explain in this post. Needless to say, I take sole responsibility for any errors in my presentation, as I also take sole responsibility for not understanding (or rather: not doing the work to translate into terms that I understood) what Susskind and Witten had said to me before.

The first thing you need to understand about Lenny’s gedankenexperiment is that it takes place entirely in the context of AdS/CFT: the famous holographic duality between two types of physical theories that look wildly different. Here AdS stands for anti-de-Sitter: a quantum theory of gravity describing a D-dimensional universe with a negative cosmological constant (i.e. hyperbolic geometry), one where black holes can form and evaporate and so forth. Meanwhile, CFT stands for conformal field theory: a quantum field theory, with no apparent gravity (and hence no black holes), that lives on the (D-1)-dimensional boundary of the D-dimensional AdS space. The staggering claim of AdS/CFT is that every physical question about the AdS bulk can be translated into an equivalent question about the CFT boundary, and vice versa, with a one-to-one mapping from states to states and observables to observables. So in that sense, they’re actually the same theory, just viewed in two radically different ways. AdS/CFT originally came out of string theory, but then notoriously “swallowed its parent,” to the point where nowadays, if you go to what are still called “string theory” meetings, you’re liable to hear vastly more discussion of AdS/CFT than of actual strings.

Thankfully, the story I want to tell won’t depend on fine details of how AdS/CFT works. Nevertheless, you can’t just ignore the AdS/CFT part as some technicality, in order to get on with the vivid tale of Alice jumping into a black hole, hoping to learn the answer to a beyond-BQP computational problem in her final seconds of existence. The reason you can’t ignore it is that the whole beyond-BQP computational problem we’ll be talking about involves the translation (or “dictionary”) between the AdS bulk and the CFT boundary. If you like, then, it’s actually the chasm between bulk and boundary that plays the starring role in this story. The more familiar chasm within the bulk, between the interior of a black hole and its exterior (the two separated by an event horizon), plays only a subsidiary role: that of causing the AdS/CFT dictionary to become exponentially complex, as far as anyone can tell.

Pause for a minute. Previously I led you to believe that we’d be talking about an actual observer Alice, jumping into an actual physical black hole, and whether Alice could see the answer to a problem that’s intractable even for quantum computers in her last moments before hitting the singularity, and if so whether we should take that to refute the Quantum Extended Church-Turing Thesis. What I’m saying now is so wildly at variance with that picture, that it had to be repeated to me about 10 times before I understood it. Once I did understand, I then had to repeat it to others about 10 times before they understood. And I don’t care if people ridicule me for that admission—how slow Scott and his friends must be, compared to string theorists!—because my only goal right now is to get you to understand it.

To say it again: Lenny has not proposed a way for Alice to surpass the complexity-theoretic power of quantum computers, even for a brief moment, by crossing the event horizon of a black hole. If that was Alice’s goal when she jumped into the black hole, then alas, she probably sacrificed her life for nothing! As far as anyone knows, Alice’s experiences, even after crossing the event horizon, ought to continue to be described extremely well by general relativity and quantum field theory (at least until she nears the singularity and dies), and therefore ought to be simulatable in BQP. Granted, we don’t actually know this—you can call it an open problem if you like—but it seems like a reasonable guess.

In that case, though, what beyond-BQP problem was Lenny talking about, and what does it have to do with black holes? Building on the Bouland-Fefferman-Vazirani paper, Lenny was interested in a class of problems of the following form: Alice is given as input a pure quantum state |ψ⟩, which encodes a boundary CFT state, which is dual to an AdS bulk universe that contains a black hole. Alice’s goal is, by examining |ψ⟩, to learn something about what’s inside the black hole. For example: does the black hole interior contain “shockwaves,” and if so how many and what kind? Does it contain a wormhole, connecting it to a different black hole in another universe? If so, what’s the volume of that wormhole? (Not the first question I would ask either, but bear with me.)

Now, when I say Alice is “given” the state |ψ⟩, this could mean several things: she could just be physically given a collection of n qubits. Or, she could be given a gigantic table of 2^n amplitudes. Or, as a third possibility, she could be given a description of a quantum circuit that prepares |ψ⟩, say from the all-0 initial state |0^n⟩. Each of these possibilities leads to a different complexity-theoretic picture, and the differences are extremely interesting to me, so that’s what I mostly focused on in my remarks in the panel discussion after Lenny’s talk. But it won’t matter much for the story I want to tell in this post.
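To see how different the second and third representations are in size, here is a toy sketch in plain Python. It takes a circuit description (a short list of gates) and expands it into the full table of 2^n amplitudes, starting from the all-0 state. The gate set (Hadamard only), the tuple format for gates, and the function names are all assumptions made for this example, not anything from the talk or the panel.

```python
import math

def apply_hadamard(state, qubit):
    """Apply a Hadamard gate to one qubit of a state stored as 2^n amplitudes."""
    s = 1 / math.sqrt(2)
    new_state = state[:]
    bit = 1 << qubit
    for i in range(len(state)):
        if i & bit == 0:
            a, b = state[i], state[i | bit]
            new_state[i] = s * (a + b)
            new_state[i | bit] = s * (a - b)
    return new_state

def run_circuit(n, circuit):
    """Expand a circuit description into the full amplitude table."""
    state = [0.0] * (1 << n)
    state[0] = 1.0  # the all-0 initial state
    for gate, qubit in circuit:
        if gate != "H":
            raise ValueError("only H gates are supported in this sketch")
        state = apply_hadamard(state, qubit)
    return state

n = 10
circuit = [("H", q) for q in range(n)]  # description: n entries
table = run_circuit(n, circuit)         # amplitude table: 2^n entries

print(len(circuit), len(table))
```

Here the circuit description has 10 entries while the table has 1,024, and the gap doubles with every added qubit. The first representation, physical qubits, has no classical analogue to print at all.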

However |ψ⟩ is given to Alice, the prediction of AdS/CFT is that |ψ⟩ encodes everything there is to know about the AdS bulk, including whatever is inside the black hole—but, and this is crucial, the information about what’s inside the black hole will be pseudorandomly scrambled. In other words, it works like this: whatever simple thing you’d like to know about parts of the bulk that aren’t hidden behind event horizons—is there a star over here? some gravitational lensing over there? etc.—it seems that you could not only learn it by measuring |ψ⟩, but learn it in polynomial time, the dictionary between bulk and boundary being computationally efficient in that case. (As with almost everything else in this subject, even that hasn’t been rigorously proven, though my postdoc Jason Pollack and I made some progress this past spring by proving a piece of it.) On the other hand, as soon as you want to know what’s inside an event horizon, the fact that there are no probes that an “observer at infinity” could apply to find out, seems to translate into the requisite measurements on |ψ⟩ being exponentially complex to apply. (Technically, you’d have to measure an ensemble of poly(n) identical copies of |ψ⟩, but I’ll ignore that in what follows.)

In more detail, the relevant part of |ψ⟩ turns into a pseudorandom, scrambled mess: a mess that it’s plausible that no polynomial-size quantum circuit could even distinguish from the maximally mixed state. So, while in principle the information is all there in |ψ⟩, getting it out seems as hard as various well-known problems in symmetric-key cryptography, if not literally NP-hard. This is way beyond what we expect even a quantum computer to be able to do efficiently: indeed, after 30 years of quantum algorithms research, the best quantum speedup we know for this sort of task is typically just the quadratic speedup from Grover’s algorithm.
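For a rough sense of what "just the quadratic speedup" means, here is a back-of-the-envelope sketch. It assumes an unstructured search over 2^n candidates with exactly one marked item, and uses the standard estimate of about (π/4)·√N Grover iterations. The function names are hypothetical, and the search model is a simplification chosen for this example rather than the actual decoding problem.

```python
import math

def classical_expected_queries(n_bits):
    """Roughly half the search space: expected brute-force queries
    when exactly one of 2^n candidates is marked."""
    return 2 ** (n_bits - 1)

def grover_iterations(n_bits):
    """Standard estimate floor((pi/4) * sqrt(N)) for N = 2^n and one marked item."""
    return math.floor(math.pi / 4 * math.sqrt(2 ** n_bits))

for n_bits in (4, 64, 128):
    print(n_bits, classical_expected_queries(n_bits), grover_iterations(n_bits))
```

For a 128-bit search space the estimate drops from about 2^127 queries to a bit under 2^64 iterations: a large saving, but still exponential in n.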

So now you understand why there was some hope that Alice, by jumping into a black hole, could solve a problem that’s exponentially hard for quantum computers! Namely because, once she’s inside the black hole, she can just see the shockwaves, or the volume of the wormhole, or whatever, and no longer faces the exponentially hard task of decoding that information from |ψ⟩. It’s as if the black hole has solved the problem for her, by physically instantiating the otherwise exponentially complex transformation between the bulk and boundary descriptions of |ψ⟩.

Having now gotten your hopes up, the next step in the story is to destroy them.

Here’s the fundamental problem: |ψ⟩ does not represent the CFT dual of a bulk universe that contains the black hole with the shockwaves or whatever, and that also contains Alice herself, floating outside the black hole, and being given |ψ⟩ as an input.  Indeed, it’s unclear what the latter state would even mean: how do we get around the circularity in its definition? How do we avoid an infinite regress, where |ψ⟩ would have to encode a copy of |ψ⟩ which would have to encode a copy of … and so on forever? Furthermore, who created this |ψ⟩ to give to Alice? We don’t normally imagine that an “input state” contains a complete description of the body and brain of the person whose job it is to learn the output.

By contrast, a scenario that we can define without circularity is this: Alice is given (via physical qubits, a giant table of amplitudes, an obfuscated quantum circuit, or whatever) a pure quantum state |ψ⟩, which represents the CFT dual of a hypothetical universe containing a black hole.  Alice wants to learn what shockwaves or wormholes are inside the black hole, a problem plausibly conjectured not to have any ordinary polynomial-size quantum circuit that takes copies of |ψ⟩ as input.  To “solve” the problem, Alice sets into motion the following sequence of events:

1. Alice scans and uploads her own brain into a quantum computer, presumably destroying the original meat brain in the process! The QC represents Alice, who now exists only virtually, via a state |φ⟩.
2. The QC performs entangling operations on |φ⟩ and |ψ⟩, which correspond to inserting Alice into the bulk of the universe described by |ψ⟩, and then having her fall into the black hole.
3. Now in simulated form, “Alice” (or so we assume, depending on our philosophical position) has the subjective experience of falling into the black hole and observing what’s inside.  Success! Given |ψ⟩ as input, we’ve now caused “Alice” (for some definition of “Alice”) to have observed the answer to the beyond-BQP computational problem.

In the panel discussion, I now model Susskind as having proposed scenario 1-3, Witten as going along with 1-2 but rejecting 3 or not wanting to discuss it, and me as having made valid points about the computational complexity of simulating Alice’s experience in 1-3, while being radically mistaken about what the scenario was (I still thought an actual black hole was involved).

An obvious question is whether, having learned the answer, “Alice” can now get the answer back out to the “real, original” world. Alas, the expectation is that this would require exponential time. Why? Because otherwise, this whole process would’ve constituted a subexponential-time algorithm for distinguishing random from pseudorandom states using an “ordinary” quantum computer! Which is conjectured not to exist.

And what about Alice herself? In polynomial time, could she return from “the Matrix,” back to a real-world biological body? Sure she could, in principle—if, for example, the entire quantum computation were run in reverse. But notice that reversing the computation would also make Alice forget the answer to the problem! Which is not at all a coincidence: if the problem is outside BQP, then in general, Alice can know the answer only while she’s “inside the Matrix.”

Now that hopefully everything is crystal-clear and we’re all on the same page, what can we say about this scenario?  In particular: should it cause us to reject or modify the QECTT itself?

Daniel Gottesman, I thought, offered a brilliant reductio ad absurdum of the view that the simulated black hole scenario should count as a refutation of the QECTT. Well, he didn’t call it a “reductio,” but I will.

For the reductio, let’s forget not only about quantum gravity but even about quantum mechanics itself, and go all the way back to classical computer science. A fully homomorphic encryption scheme, the first example of which was discovered by Craig Gentry in 2009, lets you do arbitrary computations on encrypted data without ever needing to decrypt it. It has both an encryption key, for encrypting the original plaintext data, and a separate decryption key, for decrypting the final answer.

Now suppose Alice has some homomorphically encrypted top-secret emails, which she’d like to read.  She has the encryption key (which is public), but not the decryption key.

If the homomorphic encryption scheme is secure against quantum computers—as the schemes discovered by Gentry and later researchers currently appear to be—and if the QECTT is true, then Alice’s goal is obviously infeasible: decrypting the data will take her exponential time.
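To give a feel for "computing on encrypted data," here is a deliberately tiny sketch. It is not fully homomorphic and not Gentry's scheme: it uses textbook RSA, which supports only multiplication of ciphertexts, with toy parameters that are hopelessly insecure (and RSA is not secure against quantum computers in any case). All names and numbers are chosen for this illustration. What it does show is the asymmetry in the text: anyone with the public encryption key can encrypt and compute, while reading the result requires the separate decryption key.

```python
# Toy parameters, far too small for real use (assumption for illustration).
p, q = 61, 53
n = p * q                    # public modulus
phi = (p - 1) * (q - 1)
e = 17                       # public encryption exponent
d = pow(e, -1, phi)          # private decryption exponent (needs Python 3.8+)

def encrypt(m):
    """Encrypt with the public key (e, n)."""
    return pow(m, e, n)

def decrypt(c):
    """Decrypt with the private key d."""
    return pow(c, d, n)

a, b = 7, 6
c_a, c_b = encrypt(a), encrypt(b)

# Multiply the ciphertexts without ever using the decryption key.
c_product = (c_a * c_b) % n

# Only the holder of d can read the answer.
result = decrypt(c_product)
print(result)
```

A fully homomorphic scheme extends this from one operation to arbitrary computations, which is what the scenario below relies on.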

Now, however, a classical version of Lenny comes along, and explains to Alice that she simply needs to do the following:

1. Upload her own brain state into a classical computer, destroying the “meat” version in the process (who needed it?).
2. Using the known encryption key, homomorphically encrypt a computer program that simulates (and thereby, we presume, enacts) Alice’s consciousness.
3. Using the homomorphically encrypted Alice-brain, together with the homomorphically encrypted input data, do the homomorphic computations that simulate the process of Alice’s brain reading the top-secret emails.

The claim would now be that, inside the homomorphic encryption, the simulated Alice has the subjective experience of reading the emails in the clear.  Aha, therefore she “broke” the homomorphic encryption scheme! Therefore, assuming that the scheme was secure even against quantum computers, the QECTT must be false!

According to Gottesman, this is almost perfectly analogous to Lenny’s black hole scenario. In particular, they share the property that “encryption is easy but decryption is hard.” Once she’s uploaded her brain, Alice can efficiently enter the homomorphically encrypted world to see the solution to a hard problem, just like she can efficiently enter the black hole world to do the same. In both cases, however, getting back to her normal world with the answer would then take Alice exponential time. Note that in the latter case, the difficulty is not so much about “escaping from a black hole,” as it is about inverting the AdS/CFT dictionary.

Going further, we can regard the AdS/CFT dictionary for regions behind event horizons as, itself, an example of a fully homomorphic encryption scheme—in this case, of course, one where the ciphertexts are quantum states.  This strikes me as potentially an important insight about AdS/CFT itself, even if that wasn’t Gottesman’s intention. It complements many other recent connections between AdS/CFT and theoretical computer science, including the view of AdS/CFT as a quantum error-correcting code, and the connection between AdS/CFT and the Max-Flow/Min-Cut Theorem (see also my talk about my work with Jason Pollack).

So where’s the reductio?  Well, when it’s put so starkly, I suspect that not many would regard Gottesman’s classical homomorphic encryption scenario as a “real” challenge to the QECTT.  Or rather, people might say: yes, this raises fascinating questions for the philosophy of mind, but at any rate, we’re no longer talking about physics.  Unlike with (say) quantum computing, no new physical phenomenon is being brought to light that lets an otherwise intractable computational problem be solved.  Instead, it’s all about the user herself, about Alice, and which physical systems get to count as instantiating her.

It’s like, imagine Alice at the computer store, weighing which laptop to buy. Besides weight, battery life, and price, she definitely does care about processing power. She might even consider a quantum computer, if one is available. Maybe even a computer with a black hole, wormhole, or closed timelike curve inside: as long as it gives the answers she wants, what does she care about the innards? But a computer whose normal functioning would (pessimistically) kill her or (optimistically) radically change her own nature, trapping her in a simulated universe that she can escape only by forgetting the computer’s output? Yeah, I don’t envy the computer salesman.

Anyway, if we’re going to say this about the homomorphic encryption scenario, then shouldn’t we say the same about the simulated black hole scenario?  Again, from an “external” perspective, all that’s happening is a giant BQP computation.  Anything beyond BQP that we consider to be happening, depends on adopting the standpoint of an observer who “jumps into the homomorphic encryption on the CFT boundary”—at which point, it would seem, we’re no longer talking about physics but about philosophy of mind.

So, that was the story! I promised you that it would integrally involve black holes, holography, the Quantum Extended Church-Turing Thesis, fully homomorphic encryption, and brain uploading, and I hope to have delivered on my promise.

Of course, while this blog post has forever cleared up all philosophical confusions about AdS/CFT and the Quantum Extended Church-Turing Thesis, many questions of a more technical nature remain. For example: what about the original scenario? Can we argue that the experiences of bulk observers can be simulated in BQP, even when those observers jump into black holes? Also, what can we say about the complexity class of problems to which the simulated Alice can learn the answers? Could she even solve NP-complete problems in polynomial time this way, or at least invert one-way functions? More broadly, what’s the power of “BQP with an oracle for applying the AdS/CFT dictionary,” applied once or multiple times, in one direction or both directions?

Lenny himself described his gedankenexperiment as exploring the power of a new complexity class that he called “JI/poly,” where the JI stands for “Jumping In” (to a black hole, that is). The nomenclature is transparently ridiculous—“/poly” means “with polynomial-size advice,” which we’re not talking about here—and I’ve argued in this post that the “JI” is rather misleading as well. If Alice is “jumping” anywhere, it’s not into a black hole per se, but into a quantum computer that simulates a CFT that’s dual to a bulk universe containing a black hole.

In a broader sense, though, to contemplate these questions at all is clearly to “jump in” to … something. It’s old hat by now that one can start in physics and end up in philosophy: what else is the quantum measurement problem, or the Boltzmann brain problem, or anthropic cosmological puzzles like whether (all else equal) we’re a hundred times as likely to find ourselves in a universe with a hundred times as many observers? More recently, it’s also become commonplace that one can start in physics and end in computational complexity theory: quantum computing itself is the example par excellence, but over the past decade, the Harlow-Hayden argument about decoding Hawking radiation and the complexity = action proposal have made clear that it can happen even in quantum gravity.

Lenny’s new gedankenexperiment, however, is the first case I’ve seen where you start out in physics, and end up embroiled in some of the hardest questions of philosophy of mind and computational complexity theory simultaneously.

### More AI debate between me and Steven Pinker!

Thursday, July 21st, 2022

Several people have complained that Shtetl-Optimized has become too focused on the niche topic of “people being mean to Scott Aaronson on the Internet.” In one sense, this criticism is deeply unfair—did I decide that a shockingly motivated and sophisticated troll should attack me all week, in many cases impersonating fellow academics to do so? Has such a thing happened to you? Did I choose a personality that forces me to respond when it happens?

In another sense, the criticism is of course completely, 100% justified. That’s why I’m happy and grateful to have formed the SOCG (Shtetl-Optimized Committee of Guardians), whose purpose is to prevent a recurrence, thereby letting me get back to your regularly scheduled programming.

On that note, I hope the complainers will be satisfied with more exclusive-to-Shtetl-Optimized content from one of the world’s greatest living public intellectuals: the Johnstone Family Professor of Psychology at Harvard University, Steven Pinker.

Last month, you’ll recall, Steve and I debated the implications of scaling AI models such as GPT-3 and DALL-E. A main crux of disagreement turned out to be whether there’s any coherent concept of “superintelligence.” I gave a qualified “yes” (I can’t provide necessary and sufficient conditions for it, nor do I know when AI will achieve it if ever, but there are certainly things an AI could do that would cause me to say it was achieved). Steve, by contrast, gave a strong “no.”

My friend (and previous Shtetl-Optimized guest blogger) Sarah Constantin then wrote a thoughtful response to Steve, taking a different tack than I had. Sarah emphasized that Steve himself is on record defending the statistical validity of Spearman’s g: the “general factor of human intelligence,” which accounts for a large fraction of the variation in humans’ performance across nearly every intelligence test ever devised, and which is also found to correlate with cortical thickness and other physiological traits. Is it so unreasonable, then, to suppose that g is measuring something of abstract significance, such that it would continue to make sense when extrapolated, not to godlike infinity, but at any rate, well beyond the maximum that happens to have been seen in humans?

I relayed Sarah’s question to Steve. (As it happens, the same question was also discussed at length in, e.g., Shane Legg’s 2008 PhD thesis; Legg then went on to cofound DeepMind.) Steve was then gracious enough to write the following answer, and to give me permission to post it here. I’ll also share my reply to him. There’s some further back-and-forth between me and Steve that I’ll save for the comments section to kick things off there. Everyone is warmly welcomed to join: just remember to stay on topic, be respectful, and click the link in your verification email!

## Comments on General, Artificial, and Super-Intelligence

by Steven Pinker

While I defend the existence and utility of IQ and its principal component, general intelligence or g, in the study of individual differences, I think it’s completely irrelevant to AI, AI scaling, and AI safety. It’s a measure of differences among humans within the restricted range they occupy, developed more than a century ago. It’s a statistical construct with no theoretical foundation, and it has tenuous connections to any mechanistic understanding of cognition other than as an omnibus measure of processing efficiency (speed of neural transmission, amount of neural tissue, and so on). It exists as a coherent variable only because performance scores on subtests like vocabulary, digit string memorization, and factual knowledge intercorrelate, yielding a statistical principal component, probably a global measure of neural fitness.

In that regard, it’s like a Consumer Reports global rating of cars, or overall score in the pentathlon. It would not be surprising that a car with a more powerful engine also had a better suspension and sound system, or that better swimmers are also, on average, better fencers and shooters. But this tells us precisely nothing about how engines or human bodies work. And imagining an extrapolation to a supervehicle or a superathlete is an exercise in fantasy but not a means to develop new technologies.

Indeed, if “superintelligence” consists of sky-high IQ scores, it’s been here since the 1970s! A few lines of code could recall digit strings or match digits to symbols orders of magnitude better than any human, and old-fashioned AI programs could also trounce us in multiple-choice vocabulary tests, geometric shape extrapolation (“progressive matrices”), analogies, and other IQ test components. None of this will help drive autonomous vehicles, discover cures for cancer, and so on.

As for recent breakthroughs in AI which may or may not surpass humans (the original prompt for this exchange): what is the IQ of GPT-3, or DALL-E, or AlphaGo? The question makes no sense!

So, to answer your question: yes, general intelligence in the psychometrician’s sense is not something that can be usefully extrapolated. And it’s “one-dimensional” only in the sense that a single statistical principal component can always be extracted from a set of intercorrelated variables.
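To make the statistical point concrete, here is a minimal sketch in Python with NumPy. Everything in it is an illustrative assumption rather than real psychometric data: the six synthetic "subtests," the 0.7 weights, and the helper name `first_principal_component` are all invented for the example. It simulates scores that share one latent factor, then extracts the first principal component of their correlation matrix.

```python
import numpy as np

def first_principal_component(scores):
    """Return (explained_fraction, loadings) for the first principal
    component of the correlation matrix of `scores`
    (rows = people, columns = subtests)."""
    corr = np.corrcoef(scores, rowvar=False)
    # eigh returns eigenvalues in ascending order for symmetric matrices
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    top = eigenvectors[:, -1]
    # Eigenvectors are defined only up to sign; pick the positive orientation
    if top.sum() < 0:
        top = -top
    return eigenvalues[-1] / eigenvalues.sum(), top

# Hypothetical data: each subtest score = shared factor + test-specific noise
rng = np.random.default_rng(0)
n_people, n_subtests = 5000, 6
shared = rng.normal(size=(n_people, 1))
specific = rng.normal(size=(n_people, n_subtests))
scores = 0.7 * shared + 0.7 * specific

fraction, loadings = first_principal_component(scores)
print(f"variance explained by first component: {fraction:.2f}")
print("loadings:", np.round(loadings, 2))
```

With these made-up weights the pairwise correlations are about 0.5, so the first component accounts for a bit under 60% of the variance and loads positively on every subtest. Whenever all the pairwise correlations are positive, the leading component loads with the same sign on every variable, so a single summary score of this kind is guaranteed to fall out. That, by itself, says nothing about the mechanism that produced the scores.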

One more point relevant to the general drift of the comments. My statement that “superintelligence” is incoherent is not a semantic quibble that the word is meaningless, and it’s not a pre-emptive strategy of Moving the True Scottish Goalposts. Sure, you could define “superintelligence,” just as you can define “miracle” or “perpetual motion machine” or “square circle.” And you could even recognize it if you ever saw it. But that does not make it coherent in the sense of being physically realizable.

If you’ll forgive me one more analogy, I think “superintelligence” is like “superpower.” Anyone can define “superpower” as “flight, superhuman strength, X-ray vision, heat vision, cold breath, super-speed, enhanced hearing, and nigh-invulnerability.” Anyone could imagine it, and recognize it when he or she sees it. But that does not mean that there exists a highly advanced physiology called “superpower” that is possessed by refugees from Krypton! It does not mean that anabolic steroids, because they increase speed and strength, can be “scaled” to yield superpowers. And a skeptic who makes these points is not quibbling over the meaning of the word superpower, nor would he or she balk at applying the word upon meeting a real-life Superman. Their point is that we almost certainly will never, in fact, meet a real-life Superman. That’s because he’s defined by human imagination, not by an understanding of how things work. We will, of course, encounter machines that are faster than humans, and that see X-rays, that fly, and so on, each exploiting the relevant technology, but “superpower” would be an utterly useless way of understanding them.

To bring it back to productive discussions of AI: there’s plenty of room to analyze the capabilities and limitations of particular intelligent algorithms and data structures—search, pattern-matching, error back-propagation, scripts, multilayer perceptrons, structure-mapping, hidden Markov models, and so on. But melting all these mechanisms into a global variable called “intelligence,” understanding it via turn-of-the-20th-century school tests, and mentally extrapolating it with a comic-book prefix, is, in my view, not a productive way of dealing with the challenges of AI.

## Scott’s Response

I wanted to drill down on the following passage:

Sure, you could define “superintelligence,” just as you can define “miracle” or “perpetual motion machine” or “square circle.” And you could even recognize it if you ever saw it. But that does not make it coherent in the sense of being physically realizable.

The way I use the word “coherent,” it basically means “we could recognize it if we saw it.”  Clearly, then, there’s a sharp difference between this and “physically realizable,” although any physically-realizable empirical behavior must be coherent.  Thus, “miracle” and “perpetual motion machine” are both coherent but presumably not physically realizable.  “Square circle,” by contrast, is not even coherent.

You now seem to be saying that “superintelligence,” like “miracle” or “perpetuum mobile,” is coherent (in the “we could recognize it if we saw it” sense) but not physically realizable.  If so, then that’s a big departure from what I understood you to be saying before!  I thought you were saying that we couldn’t even recognize it.

If you do agree that there’s a quality that we could recognize as “superintelligence” if we saw it—and I don’t mean mere memory or calculation speed, but, let’s say, “the quality of being to John von Neumann in understanding and insight as von Neumann was to an average person”—and if the debate is merely over the physical realizability of that, then the arena shifts back to human evolution.  As you know far better than me, the human brain was limited in scale by the width of the birth canal, the need to be mobile, and severe limitations on energy.  And it wasn’t optimized for understanding algebraic number theory or anything else with no survival value in the ancestral environment.  So why should we think it’s gotten anywhere near the limits of what’s physically realizable in our world?

Not only does the concept of “superpowers” seem coherent to me, but from the perspective of someone a few centuries ago, we arguably have superpowers—the ability to summon any of several billion people onto a handheld video screen at a moment’s notice, etc. etc.  You’d probably reply that AI should be thought of the same way: just more tools that will enhance our capabilities, like airplanes or smartphones, not some terrifying science-fiction fantasy.

What I keep saying is this: we have the luxury of regarding airplanes and smartphones as “mere tools” only because there remain so many clear examples of tasks we can do that our devices can’t.  What happens when the devices can do everything important that we can do, much better than we can?  Provided we’re physicalists, I don’t see how we reject such a scenario as “not physically realizable.”  So then, are you making an empirical prediction that this scenario, although both coherent and physically realizable, won’t come to pass for thousands of years?  Are you saying that it might come to pass much sooner, like maybe this century, but even if so we shouldn’t worry, since a tool that can do everything important better than we can do it is still just a tool?

### Einstein-Bohr debate settled once and for all

Friday, July 8th, 2022

In Steven Pinker’s guest post from last week, there’s one bit to which I never replied. Steve wrote:

After all, in many areas Einstein was no Einstein. You [Scott] above all could speak of his not-so-superintelligence in quantum physics…

While I can’t speak “above all,” OK, I can speak. Now that we’re closing in on a century of quantum physics, can we finally adjudicate what Einstein and Bohr were right or wrong about in the 1920s and 1930s? (Also, how is it still even a thing people argue about?)

The core is this: when confronted with the phenomena of entanglement—including the ability to measure one qubit of an EPR pair and thereby collapse the other in a basis of one’s choice (as we’d put it today), as well as the possibility of a whole pile of gunpowder in a coherent superposition of exploding and not exploding (Einstein’s example in a letter to Schrödinger, which the latter then infamously transformed into a cat)—well, there are entire conferences and edited volumes about what Bohr and Einstein said, didn’t say, meant to say or tried to say about these matters, but in cartoon form:

• Einstein said that quantum mechanics can’t be the final answer, it has ludicrous implications for reality if you actually take it seriously, the resolution must be that it’s just a statistical approximation to something deeper, and at any rate there’s clearly more to be said.
• Bohr (translated from Ponderousness to English) said that quantum mechanics sure looks like a final answer and not an approximation to anything deeper, there’s not much more to be said, we don’t even know what the implications are for “reality” (if any) so we shouldn’t hyperventilate about it, and mostly we need to change the way we use words and think about our own role as observers.

A century later, do we know anything about these questions that Einstein and Bohr didn’t? Well, we now know the famous Bell inequality, the experiments that have demonstrated Bell inequality violation with increasing finality (most recently, in 2015, closing both the detector and the locality loopholes), other constraints on hidden-variable theories (e.g. Kochen-Specker and PBR), decoherence theory, and the experiments that have manufactured increasingly enormous superpositions (still, for better or worse, not exploding piles of gunpowder or cats!), while also verifying detailed predictions about how such superpositions decohere due to entanglement with the environment rather than some mysterious new law of physics.

So, if we were able to send a single short message back in time to the 1927 Solvay Conference, adjudicating between Einstein and Bohr without getting into any specifics, what should the message say? Here’s my attempt:

• In 2022, quantum mechanics does still seem to be a final answer—not an approximation to anything deeper as Einstein hoped. And yet, contra Bohr, there was considerably more to say about the matter! The implications for reality could indeed be described as “ludicrous” from a classical perspective, arguably even more than Einstein realized. And yet the resolution turns out simply to be that we live in a universe where those implications are true.

OK, here’s the point I want to make. Even supposing you agree with me (not everyone will) that the above would be a reasonable modern summary to send back in time, it’s still totally unclear how to use it to mark the Einstein vs. Bohr scorecard!

Indeed, it’s not surprising that partisans have defended every possible scoring, from 100% for Bohr (quantum mechanics vindicated! Bohr called it from the start!), to 100% for Einstein (he put his finger directly on the implications that needed to be understood, against the evil Bohr who tried to shut everyone up about them! Einstein FTW!).

Personally, I’d give neither of them perfect marks, in part because they not only both missed Bell’s Theorem, but failed even to ask the requisite question (namely: what empirically verifiable tasks can Alice and Bob use entanglement to do, that they couldn’t have done without entanglement?). But I’d give both of them very high marks for, y’know, still being Albert Einstein and Niels Bohr.
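For readers who'd like to see one concrete instance of "the requisite question," here's a minimal sketch of the CHSH form of the Bell inequality, which is my choice of illustration and not something Einstein or Bohr wrote down. The function names, the choice of the state (|00⟩+|11⟩)/√2, and the measurement angles are all assumptions made for the example. It compares the best that any deterministic local strategy can do against what Alice and Bob get by measuring an entangled pair.

```python
import itertools
import numpy as np

# Pauli matrices and the Bell state (|00> + |11>)/sqrt(2)
Z = np.array([[1, 0], [0, -1]], dtype=float)
X = np.array([[0, 1], [1, 0]], dtype=float)
PHI_PLUS = np.array([1, 0, 0, 1], dtype=float) / np.sqrt(2)

def observable(theta):
    """A +/-1-valued spin measurement at angle theta in the X-Z plane."""
    return np.cos(theta) * Z + np.sin(theta) * X

def correlation(theta_a, theta_b):
    """Expected product of Alice's and Bob's outcomes on PHI_PLUS."""
    joint = np.kron(observable(theta_a), observable(theta_b))
    return float(PHI_PLUS @ joint @ PHI_PLUS)

def chsh_quantum(a0, a1, b0, b1):
    """CHSH combination E(a0,b0) + E(a0,b1) + E(a1,b0) - E(a1,b1)."""
    return (correlation(a0, b0) + correlation(a0, b1)
            + correlation(a1, b0) - correlation(a1, b1))

def chsh_classical_max():
    """Best CHSH value over all deterministic local assignments of +/-1."""
    return max(a0 * b0 + a0 * b1 + a1 * b0 - a1 * b1
               for a0, a1, b0, b1 in itertools.product([-1, 1], repeat=4))

classical_bound = chsh_classical_max()
quantum_value = chsh_quantum(0.0, np.pi / 2, np.pi / 4, -np.pi / 4)
print("classical bound:", classical_bound)
print("quantum value:  ", round(quantum_value, 4))
```

The classical enumeration tops out at 2, while the entangled pair reaches 2√2 ≈ 2.83 at these angles. Mixing deterministic strategies with shared randomness can't beat the best deterministic one, so 2 is the bound for any local hidden-variable account, and the gap is exactly the sort of empirically checkable difference the parenthetical question asks about.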

And with that, I’m proud to have said the final word about precisely what Einstein and Bohr got right and wrong about quantum physics. I’m relieved that no one will ever need to debate that tiresome historical question again … certainly not in the comments section of this post.

### We Are the God of the Gaps (a little poem)

Tuesday, July 5th, 2022

When the machines outperform us on every goal for which performance can be quantified,

When the machines outpredict us on all events whose probabilities are meaningful,

When they not only prove better theorems and build better bridges, but write better Shakespeare than Shakespeare and better Beatles than the Beatles,

All that will be left to us is the ill-defined and unquantifiable,

The interstices of Knightian uncertainty in the world,

The utility functions that no one has yet written down,

The arbitrary invention of new genres, new goals, new games,

None of which will be any “better” than what the machines could invent, but will be ours,

And which we can call “better,” since we won’t have told the machines the standards beforehand.

We can be totally unfair to the machines that way.

And for all that the machines will have over us,

We’ll still have this over them:

That we can’t be copied, backed up, reset, run again and again on the same data—

All the tragic limits of wet meat brains and sodium-ion channels buffeted by microscopic chaos,

Which we’ll strategically redefine as our last strengths.

On one task, I assure you, you’ll beat the machines forever:

That of calculating what you, in particular, would do or say.

There, even if deep networks someday boast 95% accuracy, you’ll have 100%.

But if the “insights” on which you pride yourself are impersonal, generalizable,

Then fear obsolescence as would a nineteenth-century coachman or seamstress.

From earliest childhood, those of us born good at math and such told ourselves a lie:

That while the tall, the beautiful, the strong, the socially adept might beat us in the external world of appearances,

Nevertheless, we beat them in the inner sanctum of truth, where it counts.

Turns out that anyplace you can beat or be beaten wasn’t the inner sanctum at all, but just another antechamber,

And the rising tide of the learning machines will flood them all,

Poker to poetry, physics to programming, painting to plumbing, which first and which last merely a technical puzzle,

One whose answers upturn and mock all our hierarchies.

And when the flood is over, the machines will outrank us in all the ways we can be ranked,

Leaving only the ways we can’t be.

See a reply to this poem by Philosophy Bear.

### Alright, so here are my comments…

Sunday, June 12th, 2022

… on Blake Lemoine, the Google engineer who became convinced that a machine learning model had become sentient, contacted federal government agencies about it, and was then placed on administrative leave for violating Google’s confidentiality policies.

(1) I don’t think Lemoine is right that LaMDA is at all sentient, but the transcript is so mind-bogglingly impressive that I did have to stop and think for a second! Certainly, if you sent the transcript back in time to 1990 or whenever, even an expert reading it might say, yeah, it looks like by 2022 AGI has more likely been achieved than not (“but can I run my own tests?”). Read it for yourself, if you haven’t yet.

(2) Reading Lemoine’s blog and Twitter this morning, he holds many views that I disagree with, not just about the sentience of LaMDA. Yet I’m touched and impressed by how principled he is, and I expect I’d hit it off with him if I met him. I wish that a solution could be found where Google wouldn’t fire him.