Archive for October, 2019

My New York Times op-ed on quantum supremacy

Wednesday, October 30th, 2019

Here it is.

I’d like to offer special thanks to the editor in charge, Eleanor Barkhorn, who commissioned this piece and then went way, way beyond the call of duty to get it right—including relaxing the usual length limit to let me squeeze in amplitudes and interference, and working late into the night to fix last-minute problems. Obviously I take sole responsibility for whatever errors remain.

Of course a lot of material still ended up on the cutting room floor, including a little riff about Andrew Yang’s tweet that because of quantum supremacy, now “no code is uncrackable,” as well as Ivanka Trump’s tweet giving credit for Google’s experiment (one that Google was working toward since 2015) partly to her father’s administration.

While I’m posting: those of a more technical bent might want to check out my new short preprint with UT undergraduate Sam Gunn, where we directly study the complexity-theoretic hardness of spoofing Google’s linear cross-entropy benchmark using a classical computer. Enjoy!

Quantum supremacy: the gloves are off

Wednesday, October 23rd, 2019

Links:
Google paper in Nature
New York Times article
IBM paper and blog post responding to Google’s announcement
Boaz Barak’s new post: “Boaz’s inferior classical inferiority FAQ”
Lipton and Regan’s post
My quantum supremacy interview with the BBC (featuring some of my fewest “uhms” and “ahs” ever!)
NEW: My preprint with Sam Gunn, On the Classical Hardness of Spoofing Linear Cross-Entropy Benchmarking
My interview on NPR affiliate WOSU (starts around 16:30)

When Google’s quantum supremacy paper leaked a month ago—not through Google’s error, but through NASA’s—I had a hard time figuring out how to cover the news here. I had to say something; on the other hand, I wanted to avoid any detailed technical analysis of the leaked paper, because I was acutely aware that my colleagues at Google were still barred by Nature‘s embargo rules from publicly responding to anything I or others said. (I was also one of the reviewers for the Nature paper, which put additional obligations on me.)

I ended up with Scott’s Supreme Quantum Supremacy FAQ, which tried to toe this impossible line by “answering general questions about quantum supremacy, and the consequences of its still-hypothetical achievement, in light of the leak.” It wasn’t an ideal solution—for one thing, because while I still regard Google’s sampling experiment as a historic milestone for our whole field, there are some technical issues, aspects that subsequent experiments (hopefully coming soon) will need to improve. Alas, the ground rules of my FAQ forced me to avoid such issues, which caused some readers to conclude mistakenly that I didn’t think there were any.

Now, though, the Google paper has come out as Nature‘s cover story, at the same time as there have been new technical developments—most obviously, the paper from IBM (see also their blog post) saying that they could simulate the Google experiment in 2.5 days, rather than the 10,000 years that Google had estimated.

(Yesterday I was deluged by emails asking me “whether I’d seen” IBM’s paper. As a science blogger, I try to respond to stuff pretty quickly when necessary, but I don’t—can’t—respond in Twitter time.)

So now the gloves are off. No more embargo. Time to address the technical stuff under the hood—which is the purpose of this post.

I’m going to assume, from this point on, that you already understand the basics of sampling-based quantum supremacy experiments, and that I don’t need to correct beginner-level misconceptions about what the term “quantum supremacy” does and doesn’t mean (no, it doesn’t mean scalability, fault-tolerance, useful applications, breaking public-key crypto, etc. etc.). If this is not the case, you could start (e.g.) with my FAQ, or with John Preskill’s excellent Quanta commentary.

Without further ado:

(1) So what about that IBM thing? Are random quantum circuits easy to simulate classically?

OK, so let’s carefully spell out what the IBM paper says. They argue that, by commandeering the full attention of Summit at Oak Ridge National Lab, the most powerful supercomputer that currently exists on Earth—one that fills the area of two basketball courts, and that (crucially) has 250 petabytes of hard disk space—one could just barely store the entire quantum state vector of Google’s 53-qubit Sycamore chip in hard disk.  And once one had done that, one could simulate the chip in ~2.5 days, more-or-less just by updating the entire state vector by brute force, rather than the 10,000 years that Google had estimated on the basis of my and Lijie Chen’s “Schrödinger-Feynman algorithm” (which can get by with less memory).

The IBM group understandably hasn’t actually done this yet—even though IBM set it up, the world’s #1 supercomputer isn’t just sitting around waiting for jobs! But I see little reason to doubt that their analysis is basically right. I don’t know why the Google team didn’t consider how such near-astronomical hard disk space would change their calculations; probably they wish they had.

I find this to be much, much better than IBM’s initial reaction to the Google leak, which was simply to dismiss the importance of quantum supremacy as a milestone. Designing better classical simulations is precisely how IBM and others should respond to Google’s announcement, and how I said a month ago that I hoped they would respond. If we set aside the pass-the-popcorn PR war (or even if we don’t), this is how science progresses.

But does IBM’s analysis mean that “quantum supremacy” hasn’t been achieved? No, it doesn’t—at least, not under any definition of “quantum supremacy” that I’ve ever used. The Sycamore chip took about 3 minutes to generate the ~5 million samples that were needed to pass the “linear cross-entropy benchmark”—the statistical test that Google applies to the outputs of its device.

(Technical note added: Google’s samples are extremely noisy—the actual distribution being sampled from is something like 0.998U+0.002D, where U is the uniform distribution and D is the hard distribution that you want. What this means, in practice, is that you need to take a number of samples that’s large compared to 1/0.0022, in order to extract a signal corresponding to D. But the good news is that Google can take that many samples in just a few minutes, since once the circuit has been loaded onto the chip, generating each sample takes only about 40 microseconds. And once you’ve done this, what hardness results we have for passing the linear cross-entropy test—to be discussed later in this post—apply basically just as well as if you’d taken a single noiseless sample.)

Anyway, you might notice that three minutes versus 2.5 days is still a quantum speedup by a factor of 1200. But even more relevant, I think, is to compare the number of “elementary operations.” Let’s generously count a FLOP (floating-point operation) as the equivalent of a quantum gate. Then by my estimate, we’re comparing ~5×109 quantum gates against ~2×1020 FLOPs—a quantum speedup by a factor of ~40 billion.

For me, though, the broader point is that neither party here—certainly not IBM—denies that the top-supercomputers-on-the-planet-level difficulty of classically simulating Google’s 53-qubit programmable chip really is coming from the exponential character of the quantum states in that chip, and nothing else. That’s what makes this back-and-forth fundamentally different from the previous one between D-Wave and the people who sought to simulate its devices classically. The skeptics, like me, didn’t much care what speedup over classical benchmarks there was or wasn’t today: we cared about the increase in the speedup as D-Wave upgraded its hardware, and the trouble was that we never saw a convincing case that there would be one. I’m a theoretical computer scientist, and this is what I believe: that after the constant factors have come and gone, what remains are asymptotic growth rates.

In the present case, while increasing the circuit depth won’t evade IBM’s “store everything to hard disk” strategy, increasing the number of qubits will. If Google, or someone else, upgraded from 53 to 55 qubits, that would apparently already be enough to exceed Summit’s 250-petabyte storage capacity. At 60 qubits, you’d need 33 Summits. At 70 qubits, enough Summits to fill a city … you get the idea.

From the beginning, it was clear that quantum supremacy would not be a milestone like the moon landing—something that’s achieved in a moment, and is then clear to everyone for all time. It would be more like eradicating measles: it could be achieved, then temporarily unachieved, then re-achieved. For by definition, quantum supremacy all about beating something—namely, classical computation—and the latter can, at least for a while, fight back.

As Boaz Barak put it to me, the current contest between IBM and Google is analogous to Kasparov versus Deep Blueexcept with the world-historic irony that IBM is playing the role of Kasparov! In other words, Kasparov can put up a heroic struggle, during a “transitional period” that lasts a year or two, but the fundamentals of the situation are that he’s toast. If Kasparov had narrowly beaten Deep Blue in 1997, rather than narrowly losing, the whole public narrative would likely have been different (“humanity triumphs over computers after all!”). Yet as Kasparov himself well knew, the very fact that the contest was close meant that, either way, human dominance would soon end for good.

Let me leave the last word on this to friend-of-the-blog Greg Kuperberg, who graciously gave me permission to quote his comments about the IBM paper.

I’m not entirely sure how embarrassed Google should feel that they overlooked this.   I’m sure that they would have been happier to anticipate it, and happier still if they had put more qubits on their chip to defeat it.   However, it doesn’t change their real achievement.

I respect the IBM paper, even if the press along with it seems more grouchy than necessary.   I tend to believe them that the Google team did not explore all avenues when they said that their 53 qubits aren’t classically simulable.   But if this is the best rebuttal, then you should still consider how much Google and IBM still agree on this as a proof-of-concept of QC.   This is still quantum David vs classical Goliath, in the extreme.   53 qubits is in some ways still just 53 bits, only enhanced with quantum randomness.  To answer those 53 qubits, IBM would still need entire days of computer time with the world’s fastest supercomputer, a 200-petaflop machine with hundreds of thousands of processing cores and trillions of high-speed transistors.   If we can confirm that the Google chip actually meets spec, but we need this much computer power to do it, then to me that’s about as convincing as a larger quantum supremacy demonstration that humanity can no longer confirm at all.

Honestly, I’m happy to give both Google and IBM credit for helping the field of QC, even if it is the result of a strange dispute.

I should mention that, even before IBM’s announcement, Johnnie Gray, a postdoc at Imperial College, gave a talk (abstract here) at Caltech’s Institute for Quantum Information with a proposal for a different faster way to classically simulate quantum circuits like Google’s—in this case, by doing tensor network contraction more cleverly. Unlike both IBM’s proposed brute-force simulation, and the Schrödinger-Feynman algorithm that Google implemented, Gray’s algorithm (as far as we know now) would need to be repeated k times if you wanted k independent samples from the hard distribution. Partly because of this issue, Gray’s approach doesn’t currently look competitive for simulating thousands or millions of samples, but we’ll need to watch it and see what happens.

(2) Direct versus indirect verification.

The discussion of IBM’s proposed simulation brings us to a curious aspect of the Google paper—one that was already apparent when Nature sent me the paper for review back in August. Namely, Google took its supremacy experiments well past the point where even they themselves knew how to verify the results, by any classical computation that they knew how to perform feasibly (say, in less than 10,000 years).

So you might reasonably ask: if they couldn’t even verify the results, then how did they get to claim quantum speedups from those experiments? Well, they resorted to various gambits, which basically involved estimating the fidelity on quantum circuits that looked almost the same as the hard circuits, but happened to be easier to simulate classically, and then making the (totally plausible) assumption that that fidelity would be maintained on the hard circuits. Interestingly, they also cached their outputs and put them online (as part of the supplementary material to their Nature paper), in case it became feasible to verify them in the future.

Maybe you can now see where this is going. From Google’s perspective, IBM’s rainstorm comes with a big silver lining. Namely, by using Summit, hopefully it will now be possible to verify Google’s hardest (53-qubit and depth-20) sampling computations directly! This should provide an excellent test, since not even the Google group themselves would’ve known how to cheat and bias the results had they wanted to.

This whole episode has demonstrated the importance, when doing a sampling-based quantum supremacy experiment, of going deep into the regime where you can no longer classically verify the outputs, as weird as that sounds. Namely, you need to leave yourself a margin, in the likely event that the classical algorithms improve!

Having said that, I don’t mind revealing at this point that the lack of direct verification of the outputs, for the largest reported speedups, was my single biggest complaint when I reviewed Google’s Nature submission. It was because of my review that they added a paragraph explicitly pointing out that they did do direct verification for a smaller quantum speedup:

The largest circuits for which the fidelity can still be directly verified have 53 qubits and a simplified gate arrangement. Performing random circuit sampling on these at 0.8% fidelity takes one million cores 130 seconds, corresponding to a million-fold speedup of the quantum processor relative to a single core.

(An earlier version of this post misstated the numbers involved.)

(3) The asymptotic hardness of spoofing Google’s benchmark.

OK, but if Google thought that spoofing its test would take 10,000 years, using the best known classical algorithms running on the world’s top supercomputers, and it turns out instead that it could probably be done in more like 2.5 days, then how much else could’ve been missed? Will we find out next that Google’s benchmark can be classically spoofed in mere milliseconds?

Well, no one can rule that out, but we do have some reasons to think that it’s unlikely—and crucially, that even if it turned out to be true, one would just have to add 10 or 20 or 30 more qubits to make it no longer true. (We can’t be more definitive than that? Aye, such are the perils of life at a technological inflection point—and of computational complexity itself.)

The key point to understand here is that we really are talking about simulating a random quantum circuit, with no particular structure whatsoever. While such problems might have a theoretically efficient classical algorithm—i.e., one that runs in time polynomial in the number of qubits—I’d personally be much less surprised if you told me there was a polynomial-time classical algorithm for factoring. In the universe where amplitudes of random quantum circuits turn out to be efficiently computable—well, you might as well just tell me that P=PSPACE and be done with it.

Crucially, if you look at IBM’s approach to simulating quantum circuits classically, and Johnnie Gray’s approach, and Google’s approach, they could all be described as different flavors of “brute force.” That is, they all use extremely clever tricks to parallelize, shave off constant factors, make the best use of available memory, etc., but none involves any deep new mathematical insight that could roust BPP and BQP and the other complexity gods from their heavenly slumber. More concretely, none of these approaches seem to have any hope of “breaching the 2n barrier,” where n is the number of qubits in the quantum circuit to be simulated (assuming that the circuit depth is reasonably large). Mostly, they’re just trying to get down to that barrier, while taking the maximum advantage of whatever storage and connectivity and parallelism are there.

Ah, but at the end of the day, we only believe that Google’s Sycamore chip is solving a classically hard problem because of the statistical test that Google applies to its outputs: the so-called “Linear Cross-Entropy Benchmark,” which I described in Q3 of my FAQ. And even if we grant that calculating the output probabilities for a random quantum circuit is almost certainly classically hard, and sampling the output distribution of a random quantum circuit is almost certainly classically hard—still, couldn’t spoofing Google’s benchmark be classically easy?

This last question is where complexity theory can contribute something to the story. A couple weeks ago, UT undergraduate Sam Gunn and I adapted the hardness analysis from my and Lijie Chen’s 2017 paper “Complexity-Theoretic Foundations of Quantum Supremacy Experiments,” to talk directly about the classical hardness of spoofing the Linear Cross-Entropy benchmark. Our short paper about this should be on the arXiv later this week (or early next week, given that there are no arXiv updates on Friday or Saturday nights) here it is.

Briefly, Sam and I show that if you had a sub-2n classical algorithm to spoof the Linear Cross-Entropy benchmark, then you’d also have a sub-2n classical algorithm that, given as input a random quantum circuit, could estimate a specific output probability (for example, that of the all-0 string) with variance at least slightly (say, Ω(2-3n)) better than that of the trivial estimator that just always guesses 2-n. Or in other words: we show that spoofing Google’s benchmark is no easier than the general problem of nontrivially estimating amplitudes in random quantum circuits. Furthermore, this result automatically generalizes to the case of noisy circuits: all that the noise affects is the threshold for the Linear Cross-Entropy benchmark, and thus (indirectly) the number of samples one needs to take with the QC. Our result helps to explain why, indeed, neither IBM nor Johnnie Gray nor anyone else suggested any attack that’s specific to Google’s Linear Cross-Entropy benchmark: they all simply attack the general problem of calculating the final amplitudes.

(4) Why use Linear Cross-Entropy at all?

In the comments of my FAQ, some people wondered why Google chose the Linear Cross-Entropy benchmark specifically—especially since they’d used a different benchmark (multiplicative cross-entropy, which unlike the linear version actually is a cross-entropy) in their earlier papers. I asked John Martinis this question, and his answer was simply that linear cross-entropy had the lowest variance of any estimator they tried. Since I also like linear cross-entropy—it turns out, for example, to be convenient for the analysis of my certified randomness protocol—I’m 100% happy with their choice. Having said that, there are many other choices of benchmark that would’ve also worked fine, and with roughly the same level of theoretical justification.

(5) Controlled-Z versus iSWAP gates.

Another interesting detail from the Google paper is that, in their previous hardware, they could implement a particular 2-qubit gate called the Controlled-Z. For their quantum supremacy demonstration, on the other hand, they modified their hardware to implement a different 2-qubit gate called the iSWAP some weird combination of iSWAP and Controlled-Z; see the comments section for more. Now, this other gate has no known advantages over the Controlled-Z, for any applications like quantum simulation or Shor’s algorithm or Grover search. Why then did Google make the switch? Simply because, with certain classical simulation methods that they’d been considering, the simulation’s running time grows like 4 to the power of the number of these other gates, but only like 2 to the power of the number of Controlled-Z gates! In other words, they made this engineering choice purely and entirely to make a classical simulation of their device sweat more. This seems totally fine and entirely within the rules to me. (Alas, this choice has no effect on a proposed simulation method like IBM’s.)

(6) Gil Kalai’s objections.

Over the past month, Shtetl-Optimized regular and noted quantum computing skeptic Gil Kalai has been posting one objection to the Google experiment after another on his blog. Unlike the IBM group and many of Google’s other critics, Gil completely accepts the centrality of quantum supremacy as a goal. Indeed, he’s firmly predicted for years that quantum supremacy could never be achieved for fundamental reasons—and he agrees that the Google result, if upheld, would refute his worldview. Gil also has no dispute with the exponential classical hardness of the problem that Google is solving.

Instead, Gil—if we’re talking not about “steelmanning” his beliefs, but about what he himself actually said—has taken the position that the Google experiment must’ve been done wrong and will need to be retracted. He’s offered varying grounds for this. First he said that Google never computed the full histogram of probabilities with a smaller number of qubits (for which such an experiment is feasible), which would be an important sanity check. Except, it turns out they did do that, and it’s in their 2018 Science paper. Next he said that the experiment is invalid because the qubits have to be calibrated in a way that depends on the specific circuit to be applied. Except, this too turns out to be false: John Martinis explicitly confirmed for me that once the qubits are calibrated, you can run any circuit on them that you want. In summary, unlike the objections of the IBM group, so far I’ve found Gil’s objections to be devoid of scientific interest or merit.

Update #1: Alas, I’ll have limited availability today for answering comments, since we’ll be grading the midterm exam for my Intro to Quantum Information Science course! I’ll try to handle the backlog tomorrow (Thursday).

Update #2: Aaannd … timed to coincide with the Google paper, last night the group of Jianwei Pan and Chaoyang Lu put up a preprint on the arXiv reporting a BosonSampling experiment with 20 photons 14 photons observed out of 20 generated (the previous record had been 6 photons). At this stage of the quantum supremacy race, many had of course written off BosonSampling—or said that its importance was mostly historical, in that it inspired Google’s random circuit sampling effort.  I’m thrilled to see BosonSampling itself take such a leap; hopefully, this will eventually lead to a demonstration that BosonSampling was (is) a viable pathway to quantum supremacy as well.  And right now, with fault-tolerance still having been demonstrated in zero platforms, we need all the viable pathways we can get.  What an exciting day for the field.

Book Review: ‘The AI Does Not Hate You’ by Tom Chivers

Sunday, October 6th, 2019

A couple weeks ago I read The AI Does Not Hate You: Superintelligence, Rationality, and the Race to Save the World, the first-ever book-length examination of the modern rationalist community, by British journalist Tom Chivers. I was planning to review it here, before it got preempted by the news of quantum supremacy (and subsequent news of classical non-supremacy). Now I can get back to rationalists.

Briefly, I think the book is a triumph. It’s based around in-person conversations with many of the notable figures in and around the rationalist community, in its Bay Area epicenter and beyond (although apparently Eliezer Yudkowsky only agreed to answer technical questions by Skype), together of course with the voluminous material available online. There’s a good deal about the 1990s origins of the community that I hadn’t previously known.

The title is taken from Eliezer’s aphorism, “The AI does not hate you, nor does it love you, but you are made of atoms which it can use for something else.” In other words: as soon as anyone succeeds in building a superhuman AI, if we don’t take extreme care that the AI’s values are “aligned” with human ones, the AI might be expected to obliterate humans almost instantly as a byproduct of pursuing whatever it does value, more-or-less as we humans did with woolly mammoths, moas, and now gorillas, rhinos, and thousands of other species.

Much of the book relates Chivers’s personal quest to figure out how seriously he should take this scenario. Are the rationalists just an unusually nerdy doomsday cult? Is there some non-negligible chance that they’re actually right about the AI thing? If so, how much more time do we have—and is there even anything meaningful that can be done today? Do the dramatic advances in machine learning over the past decade change the outlook? Should Chivers be worried about his own two children? How does this risk compare to the more “prosaic” civilizational risks, like climate change or nuclear war? I suspect that Chivers’s exploration will be most interesting to readers who, like me, regard the answers to none of these questions as obvious.

While it sounds extremely basic, what makes The AI Does Not Hate You so valuable to my mind is that, as far as I know, it’s nearly the only examination of the rationalists ever written by an outsider that tries to assess the ideas on a scale from true to false, rather than from quirky to offensive. Chivers’s own training in academic philosophy seems to have been crucial here. He’s not put off by people who act weirdly around him, even needlessly cold or aloof, nor by utilitarian thought experiments involving death or torture or weighing the value of human lives. He just cares, relentlessly, about the ideas—and about remaining a basically grounded and decent person while engaging them. Most strikingly, Chivers clearly feels a need—anachronistic though it seems in 2019—actually to understand complicated arguments, be able to repeat them back correctly, before he attacks them.

Indeed, far from failing to understand the rationalists, it occurs to me that the central criticism of Chivers’s book is likely to be just the opposite: he understands the rationalists so well, extends them so much sympathy, and ends up endorsing so many aspects of their worldview, that he must simply be a closet rationalist himself, and therefore can’t write about them with any pretense of journalistic or anthropological detachment. For my part, I’d say: it’s true that The AI Does Not Hate You is what you get if you treat rationalists as extremely smart (if unusual) people from whom you might learn something of consequence, rather than as monkeys in a zoo. On the other hand, Chivers does perform the journalist’s task of constantly challenging the rationalists he meets, often with points that (if upheld) would be fatal to their worldview. One of the rationalists’ best features—and this precisely matches my own experience—is that, far from clamming up or storming off when faced with such challenges (“lo! the visitor is not one of us!”), the rationalists positively relish them.

It occurred to me the other day that we’ll never know how the rationalists’ ideas would’ve developed, had they continued to do so in a cultural background like that of the late 20th century. As Chivers points out, the rationalists today are effectively caught in the crossfire of a much larger cultural war—between, to their right, the recrudescent know-nothing authoritarians, and to their left, what one could variously describe as woke culture, call-out culture, or sneer culture. On its face, it might seem laughable to conflate the rationalists with today’s resurgent fascists: many rationalists are driven by their utilitarianism to advocate open borders and massive aid to the Third World; the rationalist community is about as welcoming of alternative genders and sexualities as it’s humanly possible to be; and leading rationalists like Scott Alexander and Eliezer Yudkowsky strongly condemned Trump for the obvious reasons.

Chivers, however, explains how the problem started. On rationalist Internet forums, many misogynists and white nationalists and so forth encountered nerds willing to debate their ideas politely, rather than immediately banning them as more mainstream venues would. As a result, many of those forces of darkness (and they probably don’t mind being called that) predictably congregated on the rationalist forums, and their stench predictably wore off on the rationalists themselves. Furthermore, this isn’t an easy-to-fix problem, because debating ideas on their merits, extending charity to ideological opponents, etc. is sort of the rationalists’ entire shtick, whereas denouncing and no-platforming anyone who can be connected to an ideological enemy (in the modern parlance, “punching Nazis”) is the entire shtick of those condemning the rationalists.

Compounding the problem is that, as anyone who’s ever hung out with STEM nerds might’ve guessed, the rationalist community tends to skew WASP, Asian, or Jewish, non-impoverished, and male. Worse yet, while many rationalists live their lives in progressive enclaves and strongly support progressive values, they’ll also undergo extreme anguish if they feel forced to subordinate truth to those values.

Chivers writes that all of these issues “blew up in spectacular style at the end of 2014,” right here on this blog. Oh, what the hell, I’ll just quote him:

Scott Aaronson is, I think it’s fair to say, a member of the Rationalist community. He’s a prominent theoretical computer scientist at the University of Texas at Austin, and writes a very interesting, maths-heavy blog called Shtetl-Optimised.

People in the comments under his blog were discussing feminism and sexual harassment. And Aaronson, in a comment in which he described himself as a fan of Andrea Dworkin, described having been terrified of speaking to women as a teenager and young man. This fear was, he said, partly that of being thought of as a sexual abuser or creep if any woman ever became aware that he sexually desired them, a fear that he picked up from sexual-harassment-prevention workshops at his university and from reading feminist literature. This fear became so overwhelming, he said in the comment that came to be known as Comment #171, that he had ‘constant suicidal thoughts’ and at one point ‘actually begged a psychiatrist to prescribe drugs that would chemically castrate me (I had researched which ones), because a life of mathematical asceticism was the only future that I could imagine for myself.’ So when he read feminist articles talking about the ‘male privilege’ of nerds like him, he didn’t recognise the description, and so felt himself able to declare himself ‘only’ 97 per cent on board with the programme of feminism.

It struck me as a thoughtful and rather sweet remark, in the midst of a long and courteous discussion with a female commenter. But it got picked up, weirdly, by some feminist bloggers, including one who described it as ‘a yalp of entitlement combined with an aggressive unwillingness to accept that women are human beings just like men’ and that Aaronson was complaining that ‘having to explain my suffering to women when they should already be there, mopping my brow and offering me beers and blow jobs, is so tiresome.’

Scott Alexander (not Scott Aaronson) then wrote a furious 10,000-word defence of his friend… (p. 214-215)

And then Chivers goes on to explain Scott Alexander’s central thesis, in Untitled, that privilege is not a one-dimensional axis, so that (to take one example) society can make many women in STEM miserable while also making shy male nerds miserable in different ways.

For nerds, perhaps an alternative title for Chivers’s book could be “The Normal People Do Not Hate You (Not All of Them, Anyway).” It’s as though Chivers is demonstrating, through understated example, that taking delight in nerds’ suffering, wanting them to be miserable and alone, mocking their weird ideas, is not simply the default, well-adjusted human reaction, with any other reaction being ‘creepy’ and ‘problematic.’ Some might even go so far as to apply the latter adjectives to the sneerers’ attitude, the one that dresses up schoolyard bullying in a social-justice wig.

Reading Chivers’s book prompted me to reflect on my own relationship to the rationalist community. For years, I interacted often with the community—I’ve known Robin Hanson since ~2004 and Eliezer Yudkowsky since ~2006, and our blogs bounced off each other—but I never considered myself a member.  I never ranked paperclip-maximizing AIs among humanity’s more urgent threats—indeed, I saw them as a distraction from an all-too-likely climate catastrophe that will leave its survivors lucky to have stone tools, let alone AIs. I was also repelled by what I saw as the rationalists’ cultier aspects.  I even once toyed with the idea of changing the name of this blog to “More Wrong” or “Wallowing in Bias,” as a play on the rationalists’ LessWrong and OvercomingBias.

But I’ve drawn much closer to the community over the last few years, because of a combination of factors:

  1. The comment-171 affair. This was not the sort of thing that could provide any new information about the likelihood of a dangerous AI being built, but was (to put it mildly) the sort of thing that can tell you who your friends are. I learned that empathy works a lot like intelligence, in that those who boast of it most loudly are often the ones who lack it.
  2. The astounding progress in deep learning and reinforcement learning and GANs, which caused me (like everyone else, perhaps) to update in the direction of human-level AI in our lifetimes being an actual live possibility,
  3. The rise of Scott Alexander. To the charge that the rationalists are a cult, there’s now the reply that Scott, with his constant equivocations and doubts, his deep dives into data, his clarity and self-deprecating humor, is perhaps the least culty cult leader in human history. Likewise, to the charge that the rationalists are basement-dwelling kibitzers who accomplish nothing of note in the real world, there’s now the reply that Scott has attracted a huge mainstream following (Steven Pinker, Paul Graham, presidential candidate Andrew Yang…), purely by offering up what’s self-evidently some of the best writing of our time.
  4. Research. The AI-risk folks started publishing some research papers that I found interesting—some with relatively approachable problems that I could see myself trying to think about if quantum computing ever got boring. This shift seems to have happened at roughly around the same time my former student, Paul Christiano, “defected” from quantum computing to AI-risk research.

Anyway, if you’ve spent years steeped in the rationalist blogosphere, read Eliezer’s “Sequences,” and so on, The AI Does Not Hate You will probably have little that’s new, although it might still be interesting to revisit ideas and episodes that you know through a newcomer’s eyes. To anyone else … well, reading the book would be a lot faster than spending all those years reading blogs! I’ve heard of some rationalists now giving out copies of the book to their relatives, by way of explaining how they’ve chosen to spend their lives.

I still don’t know whether there’s a risk worth worrying about that a misaligned AI will threaten human civilization in my lifetime, or my children’s lifetimes, or even 500 years—or whether everyone will look back and laugh at how silly some people once were to think that (except, silly in which way?). But I do feel fairly confident that The AI Does Not Hate You will make a positive difference—possibly for the world, but at any rate for a little well-meaning community of sneered-at nerds obsessed with the future and with following ideas wherever they lead.

From quantum supremacy to classical fallacy

Wednesday, October 2nd, 2019

Retrospective Comment (Dec. 26, 2019): While I basically stand by what I wrote in this post, I wanted to call attention to the fact that, in its aftermath, one of the authors of the p-bit paper—Kerem Camsari—displayed a striking degree of intellectual courage and honesty. He showed up in the comments section to defend the motivation for the p-bit model, but also to concede the points I’d raised about scaling. Notably, on some matters, he explicitly broke with his own coauthors. He treated having his paper harshly criticized on Shtetl-Optimized not as a personal attack, but as an opportunity to learn and grow. I’m not sure that I would’ve been able to do the same in his shoes, and I regard it as one of the happier outcomes in this blog’s history. –SA

Maybe I should hope that people never learn to distinguish for themselves which claimed breakthroughs in building new forms of computation are obviously serious, and which ones are obviously silly. For as long as they don’t, this blog will always serve at least one purpose. People will cite it, tweet it, invoke its “authority,” even while from my point of view, I’m offering nothing more intellectually special than my toddler does when he calls out “moo-moo cow! baa-baa sheep!” as we pass them on the road.

But that’s too pessimistic. Sure, most readers must more-or-less already know what I’ll say about each thing: that Google’s quantum supremacy claim is serious, that memcomputing to solve NP-complete problems is not, etc. Even so, I’ve heard from many readers that this blog was at least helpful for double-checking their initial impressions, and for making common knowledge what before had merely been known to many. I’m fine for it to continue serving those roles.

Last week, even as I dealt with fallout from Google’s quantum supremacy leak, I also got several people asking me to comment on a Nature paper entitled Integer factorization using stochastic magnetic tunnel junctions (warning: paywalled). See also here for a university press release.

The authors report building a new kind of computer based on asynchronously updated “p-bits” (probabilistic bits). A p-bit is “a robust, classical entity fluctuating in time between 0 and 1, which interacts with other p-bits … using principles inspired by neural networks.” They build a device with 8 p-bits, and use it to factor integers up to 945. They present this as another “unconventional computation scheme” alongside quantum computing, and as a “potentially scalable hardware approach to the difficult problems of optimization and sampling.”

A commentary accompanying the Nature paper goes much further still—claiming that the new factoring approach, “if improved, could threaten data encryption,” and that resources should now be diverted from quantum computing to this promising new idea, one with the advantages of requiring no refrigeration or maintenance of delicate entangled states. (It should’ve added: and how big a number has Shor’s algorithm factored anyway, 21? Compared to 945, that’s peanuts!)

Since I couldn’t figure out a gentler way to say this, here goes: it’s astounding that this paper and commentary made it into Nature in the form that they did. Juxtaposing Google’s sampling achievement with p-bits, as several of my Facebook friends did last week, is juxtaposing the Wright brothers with some guy bouncing around on a pogo stick.

If you were looking forward to watching me dismantle the p-bit claims, I’m afraid you might be disappointed: the task is over almost the moment it begins. “p-bit” devices can’t scalably outperform classical computers, for the simple reason that they are classical computers. A little unusual in their architecture, but still well-covered by the classical Extended Church-Turing Thesis. Just like with the quantum adiabatic algorithm, an energy penalty is applied to coax the p-bits into running a local optimization algorithm: that is, making random local moves that preferentially decrease the number of violated constraints. Except here, because the whole evolution is classical, there doesn’t seem to be even the pretense that anything is happening that a laptop with a random-number generator couldn’t straightforwardly simulate.

Even so, I wouldn’t be writing this post if you opened the paper and it immediately said, in effect, “look, we know. You’re thinking that this is just yet another stochastic local optimization method, which could clearly be simulated efficiently on a conventional computer, thereby putting it into a different conceptual universe from quantum computing. You’re thinking that factoring an n-bit integer will self-evidently take exp(n) time by this method, as compared to exp(n1/3) for the Number Field Sieve, and that no crypto is in even remote danger from this. But here’s why you should still be interested in our p-bit model: because of other advantages X, Y, and Z.” Alas, in vain one searches the whole paper, and the lengthy supplementary material, and the commentary, for any acknowledgment of the pachyderm in the pagoda. Not an asymptotic runtime scaling in sight. Quantum computing is there, but stripped of the theoretical framework that gives it its purpose.

That silence, in the pages of Naturethat’s the part that convinced me that, while on the negative side this blog seems to have accomplished nothing for the world in 14 years of existence, on the positive side it will likely have a role for decades to come.

Update: See a response in the comments, which I appreciated, from Kerem Cansari (one of the authors of the paper), and my response to the response.

(Partly) Unrelated Announcement #1: My new postdoc, Andrea Rocchetto, had the neat idea of compiling a Quantum Computing Fact Sheet: a quick “Cliffs Notes” for journalists, policymakers, and others looking to get the basics right. The fact sheet might grow in the future, but in the meantime, check it out! Or at a more popular level, try the Quantum Atlas made by folks at the University of Maryland.

Unrelated Announcement #2: Daniel Wichs asked me to give a shout-out to a new Conference on Information-Theoretic Cryptography, to be held June 17-19 in Boston.

Third Announcement: Several friends asked me to share that Prof. Peter Wittek, quantum computing researcher at the University of Toronto, has gone missing in the Himalayas. Needless to say we hope for his safe return.