Robin Hanson and I discuss the AI future

That’s all. No real post this morning, just an hour-long podcast on YouTube featuring two decades-long veterans of the nerd blogosphere, Robin Hanson and yours truly, talking about AI, trying to articulate various possibilities outside the Yudkowskyan doom scenario. The podcast was Robin’s idea. Hope you enjoy, and looking forward to your comments!

Update: Oh, and another new podcast is up, with me and Sebastian Hassinger of Amazon/AWS! Audio only. Mostly quantum computing but with a little AI thrown in.

Update: Yet another new podcast, with Daniel Bashir of The Gradient. Daniel titled it “Against AI Doomerism,” but it covers a bunch of topics (and I’d say my views are a bit more complicated than “anti-doomerist”…).

123 Responses to “Robin Hanson and I discuss the AI future”

  1. Tim Says:

    I’m curious, have the AI doomers (Yudkowsky in particular) ever written a list of what they consider to be the greatest flaws in their position? Steelmanning your opposition and acknowledging gaps in your theories are widely held up as good practices among the rationality community, but the AI doom sect appears to never do this, from what I can tell.

    Would love to read something that demonstrates otherwise, if anyone knows where I can find it.

  2. Peter S. Shenkin Says:


    Well, has anyone “ever written a list of what they consider to be the greatest flaws in their position”?

    I did it once but kept it hidden. There’s a big difference between logic and rhetoric.

  3. fred Says:

    I’m always a bit skeptical when I hear about “universal” human values. If we look at the world today, there’s such a wide range of values, often at odds with one another.
    And it takes 20 years to bring a new human up to speed with the current state of things (so, for Plato, it could indeed take a decade to bring him up to speed).
    But it also takes about 4 years of (re)education to radicalize a young mind (whether it’s May ’68, the Chinese cultural revolution, wokeism,…).
    And then when we talk about aligning AIs, I would think that, by definition, they would tend to be aligned with the forces that created them, i.e. meritocratic-technocratic-capitalistic.
    E.g. I can’t imagine how an AI emerging from big tech would end up being aligned with foraging society values… but the way things are going in big tech, it could end up being super-woke, haha.

  4. Ilio Says:

    Is there a transcript?

  5. Scott Says:

    Tim #1:

      I’m curious, have the AI doomers (Yudkowsky in particular) ever written a list of what they consider to be the greatest flaws in their position?

    I recall an interview where Eliezer was asked something like that, and he responded by talking about how he’d failed to be doomerist enough, and about how AI took an even faster and more dangerous path than he’d expected. 🙂

  6. Scott Says:

    Ilio #4: Sorry, you’d probably need to generate a transcript from the YouTube subtitles (iirc, there’s some way to do that automatically?)
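    For what it’s worth, one common route (my assumption, not something stated above) is to download the auto-generated captions with a tool like yt-dlp and then strip the timing cues from the resulting WebVTT file. A minimal sketch of the stripping step:

```python
import re

def vtt_to_text(vtt: str) -> str:
    """Strip WEBVTT headers, timestamps, and cue numbers, keeping caption text."""
    out = []
    for line in vtt.splitlines():
        line = line.strip()
        # skip blank lines, the header, timestamp lines, and numeric cue ids
        if not line or line.startswith("WEBVTT") or "-->" in line or line.isdigit():
            continue
        line = re.sub(r"<[^>]+>", "", line)  # drop inline timing/styling tags
        if not out or out[-1] != line:       # auto-captions often repeat lines
            out.append(line)
    return " ".join(out)

sample = """WEBVTT

00:00:00.000 --> 00:00:02.000
hello and welcome

00:00:02.000 --> 00:00:04.000
to the podcast
"""
print(vtt_to_text(sample))  # hello and welcome to the podcast
```

    (The captions themselves can be fetched with something like `yt-dlp --write-auto-subs --skip-download URL`; no speaker attribution survives, though.)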

  7. Tyson Says:

    There is a lot I want to say about the issues discussed on the podcast, but I’m finding it hard to articulate.

    I think there might be an overly human-centered tinge to these discussions. And when people do stray from that human-centeredness, some tend to go in the direction of conceptualizing a transfer of the human legacy, sense of purpose, or intrinsic value onto our hypothetical AI creations who are smarter than us.

    I was sitting outside as I was thinking about it, and a hummingbird flew right in front of me and stared at me for a few seconds. Hummingbirds have been around for about 22 million years. I’m going to use hummingbirds as a stand-in for all of the majestic life on Earth. Really, it makes sense to think about whole ecosystems in this regard, because of how interconnected life is.

    When we discuss voting about the future of life, and acceptable kinds and rates of change, how do we incorporate the interests of hummingbirds? Do we assume their best interests are well represented by the distribution and effectualization of human moral values? Is there an on-Earth AI Fizzle, Futurama, or AI Dystopia scenario which can be sustained for a million years without a significant risk that all the hummingbirds die? Is the future of hummingbirds something we should think about in terms of their value to humans, or in terms of our responsibilities as the hyper-dominant species whose actions directly threaten their existence?

    We might want to ask ourselves what role AI can or should have on Earth in the long term. Aside from serving humans, and the hypothetical potential to protect Earth’s ecosystems from humans, protect some people from other people, or protect the Earth from asteroids/comets or far-off changes in the phases of the Sun, it wouldn’t seem to have much of a good reason to be here. It isn’t part of the larger super-organism that we and hummingbirds are. It doesn’t need to rely on precise and consistent Earthly cycles, weather patterns, temperatures, air composition, gravity, or clean fresh water.

    In theory, it shouldn’t be too much to ask of an ASI to coexist in a healthy way with humans and hummingbirds. Biospheres would seem to make up an extraordinarily small fraction of the mass distribution in space. An ASI could expand and do a whole lot of non-harmful things outside of biospheres. It seems it would be only out of minor convenience that an ASI with access to space would ever need anything which puts it in competition or conflict with hummingbirds. The hard part is still humans and hummingbirds coexisting.

    If Futurama or Singularity is what we end up with, I hope it would be one which is either mostly off Earth, or in which humans and our AI creations somehow manage to reduce their footprints on Earth enough, and sustainably enough in the long term, that hummingbirds can still exist a million years from now.

  8. Tim Says:

    Scott #5:

    See, this is the type of thing that makes it very hard for me to take someone seriously. I am *extraordinarily* skeptical of anyone who expresses an extremely high level of certainty in a position that can’t be tested. Especially so if that viewpoint calls for radical and revolutionary changes (which sorta describes the AI doomers).

    I’m much more willing to listen to someone who says “I don’t have all the answers” or “Here’s the best argument against my position” or “Here’s what would convince me that I’m mistaken about this theory”.

  9. Scott Says:

    Tim #8: Eliezer could be a broken record, wildly overconfident, extreme in a way that’s unhelpful to his own cause, and arrogant with clear messianic tendencies, and yet still deserve eternal credit as the person who realized the central importance of the AI alignment problem decades before most of us did (or maybe just placed a bet, but the bet looks increasingly likely to be right). The world is complicated that way!

  10. JimV Says:

    Arthur C. Clarke’s AI in “2001: A Space Odyssey”, called HAL in the movie, dates back to 1968. The evil AI has been a stock trope in science fiction for a long time. I don’t think Mr. (Dr.?) Yudkowsky was the first.

    Of course many of those stories were not well-based. I recall an episode of the original Star Trek in which an intelligent robot found on a planetoid had killed its creators but initially was controllable because it had forgotten the “logic” it had used to overcome its programming. Then suddenly it recalls, “That was the equation! Survival must supersede programming!” As if an AI would necessarily have a survival instinct, independent of its programming.

  11. Seth Finkelstein Says:

    Tim #1 – The fundamental problem with opposing the “doomer” position is that it boils down to the problem of proving a negative. Note this isn’t by any means an original insight of mine; it’s an oft-made point. The “doomer” case is a series of extrapolations, and for each one, the proponent can always take refuge in “you can’t prove it won’t happen”. This then combines badly with the infamous problem of trying to estimate very small chances of catastrophic outcomes (sometimes called “Pascal’s mugging”). As in, “But what if the doom prediction is correct? Can you risk the total destruction of humanity?”. Those two cards can then be played back and forth endlessly, against every objection. Thus it doesn’t matter what specific criticisms are made; there is, in the jargon, a “fully-generalizable counter-argument”.

  12. Bill Kaminsky Says:

    In your discussion of nuclear power, I was hoping that you or Robin would broach this argument:


    Right now the nuclear power industry is dying a horrible death everywhere in the western world. That’s because the bankers won’t pay for it. There is no other reason, regardless of what you might hear to the contrary. No, it’s not because of some patchouli-scented tree-huggers or a global conspiracy of anti-nuclear government agencies. It’s the bankers.

    You can’t blame them. A fission reactor at an existing site takes 4 to 6 years to build, during which time you make no money. Reactors at new sites generally take 10 to 12 years. Meanwhile, wind turbines go from the first sketch on a napkin to on the grid in 18 months or less. Consider the decision that a banker has to make when presented with two pitches:

    I want 10 million for 18 months, after that I’ll pay you 6%
    I want 25 billion for 5 years, after that I’ll pay you 8%

    Option 1 gets the money every time. Not in theory. This is clearly what is happening in the real world.

    You can argue the technical superiority of fission over wind all you want – in fact, it’s pretty much all true. It is a fact that wind cannot be dispatched while nuclear has a C[apacity]F[actor] around 90% and provides all sorts of baseload. It is a fact that nuclear takes up less land than the equivalent in windmills. Add any of the other advantages you’ve heard, they’re probably true too.

    Here’s the problem with all of those arguments: the bank doesn’t give a crap.

    [… post then pivots to discussing the relative construction costs of various types of power plants…]

    … classic heat-engine power plants are always more complex than other forms of power. Now complexity doesn’t always turn into price, but it often does, and almost always does in industries that squeeze the systems for efficiency. And the energy market has squeezed. Hard.

    This shouldn’t be surprising, really, but many people refuse to believe it. So let me just list a couple of examples of how much it costs to build a watt of generation, taken from Version 8 of Lazard’s L[evelized]C[ost]o[f]E[lectricity] tables [sidenote: Lazard is a very big investment bank and consultancy that’s been doing, among other things, financing and consulting about infrastructure since transcontinental railroads were the hot new thing in the mid-19th Century]:

    Solar PV, $1.25 to $1.75 per watt-peak (Wp)
    Wind, $1.40 to $1.80 per Wp
    Combined cycle gas, $1.06 to $1.32 per Wp
    Coal, $3.00 to $8.40 per Wp
    Nuclear, $5.39 to $8.30 per Wp
    [REMINDER: All these are *capital cost of construction* estimates, not *fuel* costs]


    I know, I know, the power from that reactor is almost 24/7 and you can’t rely on wind. Go tell it to the only person that matters – the banker. Let me know how that goes.

    [Source – Since I’m still paranoid that WordPress will flag any post with a URL in it, I apologize, but please Google the following: Maury Markowitz, Why fusion will never happen, October 26, 2012. NOTE: Despite the title being “Why fusion will never happen”, all of the above quote was just about fission. The part about commercializing fusion gets even more depressing… or, if not depressing per se, more we’ll really need governments to just subsidize it A LOT since it will almost assuredly be wayyyy less profitable relatively than construction of other plants, even with sizeable carbon taxes.]
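    To put some toy numbers on the two pitches quoted above (my own back-of-the-envelope, ignoring risk, discounting, and the wildly different capital scales, which are of course the real story): per dollar lent, the quick-build/low-rate option beats the slow-build/high-rate one on any horizon shorter than about 16 years.

```python
def growth(horizon_yrs, build_yrs, rate):
    """Per-dollar value at the horizon: nothing is earned during
    construction, then `rate` compounds annually thereafter."""
    earning_yrs = max(0.0, horizon_yrs - build_yrs)
    return (1 + rate) ** earning_yrs

for horizon in (10, 15, 20):
    quick = growth(horizon, 1.5, 0.06)  # pitch 1: 18-month build, then 6%
    slow = growth(horizon, 5.0, 0.08)   # pitch 2: 5-year build, then 8%
    print(f"{horizon}-year horizon: quick {quick:.2f}x vs slow {slow:.2f}x")
```

    And that’s before the banker prices in the risk that the 5-year build becomes a 10-year build.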

    You might note that post is from 2012 — a full 11 years ago. What’s the situation now, you ask? Well, you can Google “Lazard’s LCOE (April 2023)” and go to page 11 of the PDF of a PowerPoint there… and, dividing by 1000 since it’s there listed in kW, not W… and then being lazy and not correcting for inflation since 2012 because I really should’ve gone to bed an hour ago, I’d say the equivalent table would read this year as:

    Solar PV, $0.70 to $4.15 per watt-peak (Wp) [NOTE: much wider range both cheaper and more expensive than in 2012 since Lazard now analyzes not just power-utility-scale installations but residential rooftop, which is much more expensive]

    Wind, $1.03 to $2.25 per Wp [NOTE: wider range both cheaper and more expensive since now considering off-shore installations as well as nice, easy, 10+ years of refinement on-land construction]

    Combined cycle gas, $0.65 to $1.30 per Wp [NOTE: Anywhere from much cheaper to slightly cheaper than 2012; again, these are *capital construction costs* not *fuel* costs, but the fracking revolution making gas plentiful encouraged a lot of gas plant construction, hence people got better at making them]

    Coal, $3.20 to $6.78 per Wp

    Nuclear, $8.48 to $13.93 per Wp [NOTE: Relative capital costs ain’t getting better compared to other power plants]

    You can read more in the report if you wonder how things change after adding in potential carbon taxes and other scenarios that big-ass investment banks/infrastructure consultancies naturally care about.

    In closing, it’s true what those dang leftists say… albeit not for the reason they usually say it… the problem is *the system* with that system being CAPITALISM! 😉 [Ok… ok, mostly the difficulty of actually existing democracies to contend with externalities and not necessarily CAPITALISM writ large… but still, the problem kinda is CAPITALISM!! Save us, benign AGI with your irrefutably logical and hitherto-unimaginable persuasive arguments about optimal policy! You’re our only hope!! :O ]

  13. ExPhysProf Says:

    Seth Finkelstein # 11 — The simple response to Pascal’s Wager (“Mugging”), which is “you should do all the Catholic rituals to avoid eternal damnation,” is “but what if God is a Lutheran?” In that case, centuries of religious warfare show that doing the rituals is guaranteed to piss him (her, it) off and thus guarantee damnation. CHECKMATE!!

    In the current context, a confident hypothesis is made concerning the properties and actions of something that has never existed outside of the pages of science fiction, and actions are demanded to make sure that it never comes into existence, to avoid the end of civilization. I can easily create a scenario where preventing its existence would remove civilization’s only remaining hope to mitigate the sure disaster of climate change. Which hypothesis is more likely? They each have measure zero, given that no real evidence is presented, and also since no reasonable measure can be defined. So action and inaction are equally likely to lead to disaster. Which is the bigger danger to avoid? Climate change is REALLY happening and absent unprecedented interventions civilization will be a shambles within the next century. AI has already amazed us and has vast and unimagined potential. Like any disruptive force it will leave behind many broken eggs to clean up, but we really have no choice but to go for it!

  14. Mitchell Porter Says:

    Tyson #7: The standard way out of moral anthropocentrism is to make any form of suffering, not just human suffering, a morally pivotal concept. Though this interferes somewhat with the idea of preserving the Earth as a giant wilderness park, since wild nature is full of suffering.

  15. Ilio Says:

    Scott #6, thanks for the tip, but the result is low quality (no attribution of who said what). Hopefully these LLMs will soon do a curated job in one click.

    Bill Kaminsky #12, I actually agree that renewables are better than nuclear fission, but I think that’s the wrong argument: what if you took the prices from the early 2000s? You would have concluded there was nothing to expect from wind and solar, because they were so costly. That was clearly wrong, and in retrospect the mistake was failing to take into account that prices depend on the level of public support for the sector (for research, for infrastructure, and, even more important to investors, for risk mitigation). In the same vein, present prices for nuclear largely reflect the pathetic current state of the industry. If we play it like the French in the eighties (50+ new reactors within 15 years), chances are it’ll be much easier/cheaper.

  16. Charles A Says:

    Bill #12 It’s much worse than that: nuclear doesn’t even pay its own liability insurance. It isn’t economical at all at the $15 billion liability cap it has today, when nuclear incidents, depending on the wind (say, blowing into NYC), could run into the trillions.

  17. Scott Says:

    ExPhysProf #13: You’re right that there are massive uncertainties on both sides, enormous risks in preventing the building of powerful AI as well as in going forward with it. But you’re wrong that anything for which you feel no evidence has been provided has “measure zero.” That’s not how uncertainty works. (The WHO in March 2020: “There’s no evidence that COVID spreads via human-to-human transmission…” 🙂 )

  18. JimV Says:

    I’m sure this is not an original concept by me, but I am fairly certain many of our problems, including the possibility of AI doom, can be traced back to the invention and use of the printing press. (Along with many solutions also.)

    Referring back to the previous post categorizing future AI possibilities, in my personal conception there is an additional possibility breakdown. Under “Does Civilization Continue/Yes” I would add “Will It Be Great[Yes/No]”; under “Yes” I would put an endpoint of “Iain Banks’ ‘Culture'” and under “No” I would put the existing “Will It be Good” stuff. (“Futurama” being possibly Good but not Great, for reasons previously cited.) Others may consider that Futurama is the best we can do, though.

  19. ExPhysProf Says:

    Scott # 17

    No, actually the WHO was wrong! Vast experience with viruses, and especially with similar viruses, should have led (and did lead) to a strong presumption that human to human transmission was highly likely. Similarly, experience with previous super-human intelligences, were it to exist, would be evidence in favor of the proposition that AI should be banned.

  20. Tyson Says:

    Mitchell Porter #14:

    Obviously suffering is not a good measure to weight so heavily as a morally pivotal concept. In the extreme, one could apply it so foolishly as to decide the optimally moral thing would be to kill everything with a nervous system. It would also foolishly suggest never having a relationship, or loving anyone.

    Robin Hanson argued in 2009 that nature is probably doomed within a few centuries.

    Why should this quite confident and relatively trivial and concrete prediction of doom have so little weight in comparison to Yudkowsky’s relatively non-trivial and hypothetical prediction of doom?

  21. Scott Says:

    ExPhysProf #19: The point is, you can always say that there’s “no evidence” for any new possibility, by drawing your reference class of “sufficiently similar things that have happened before” tightly enough. The WHO experts would’ve said in early 2020, “yes, of course many viruses spread airborne person-to-person, but we have no evidence yet that COVID specifically does that.”

    Up to (and even after) Kitty Hawk, many people said that heavier-than-air controlled flight was impossible, because all previous attempts had failed. Of course those people knew that birds did it—they weren’t complete idiots—but they said that birds were too dissimilar from human contrivances and therefore outside the relevant reference class.

    In the case of AI, we certainly do have examples of a more intelligent species (namely us) displacing less intelligent ones, and also of technologically more advanced civilizations wiping out less advanced ones. And AI that understands English and can write code and form multi-step plans and even intentionally deceive humans … none of that is science fiction any longer (or have you been living in a cave? 🙂 ).

    Of course AI wiping us all out is still a thing that’s never happened before, and that indeed by definition can happen at most once. That’s not a reason to dismiss the possibility as “measure zero,” but for radical uncertainty and urgently doing more research.

  22. Tyson Says:

    Naomi Klein articulates some of my shared concerns clearly in her recent article.

    Generative AI will end poverty, they tell us. It will cure all disease. It will solve climate change. It will make our jobs more meaningful and exciting. It will unleash lives of leisure and contemplation, helping us reclaim the humanity we have lost to late capitalist mechanization. It will end loneliness. It will make our governments rational and responsive. These, I fear, are the real AI hallucinations and we have all been hearing them on a loop ever since Chat GPT launched at the end of last year.

    There is a world in which generative AI, as a powerful predictive research tool and a performer of tedious tasks, could indeed be marshalled to benefit humanity, other species and our shared home. But for that to happen, these technologies would need to be deployed inside a vastly different economic and social order than our own, one that had as its purpose the meeting of human needs and the protection of the planetary systems that support all life.

  23. Scott Says:

    Tyson #22: It’s amazing how, for Naomi Klein, everything that happens in the world ends up further bolstering her conviction that capitalism needs to be destroyed. 🙂

    For me, the question is merely one of which I fear more: unbridled capitalism, or whichever system Naomi Klein would replace it with…

  24. Tyson Says:

    Scott #23:

    Some brands of capitalism have worked well in some regards in the contexts where they have been applied. Regarding the relevant contemporary and future threats, whether they stem from capitalism in principle, from some versions or corruptions of it, or from other parallel disincentivizing systems and misaligned causal structures, what we have been doing so far in aggregate seems to be on track to cause a lot of bad stuff to happen, including the wholesale destruction of nature. I don’t have specifics to propose now about how exactly to change our economic and other systems, but it would seem cosmically reprehensible to throw our hands up and settle for the eventual wholesale destruction of nature. If there is a solution that looks like capitalism, I would be happy with that.

    It can be frustrating trying to think about these very serious, imminent problems we face while restricted to thinking inside the lines of political and ideological dogmas and traditions designed by past generations to solve old problem distributions. Maybe we can extract the things we like about capitalism and integrate them into new systems, reform those systems so that they align to send us toward a good future, and then those who care can name the new system something like “alignment capitalism,” and people can be ideologically satisfied and still not destroy the planet in the long run.

    One of the important things, I think, that Naomi got right in her article is that AI most likely won’t solve these problems by default. It may only be able to help us solve them with our willing participation, which is less trivial than we would like in the real world, which is very messy. As Naomi said about one of the possible obstacles: “… the fabric of shared reality is unravelling in our hands, we will find ourselves unable to respond with any coherence at all.” Both optimism and pessimism have their purposes. In the end, real concrete results are what matter.

  25. Seth Finkelstein Says:

    ExPhysProf – with no disrespect intended to Scott, allow me to point out the process I outlined as it happens here:

    ExPhysProf #19: The point is, you can always say that there’s “no evidence” for any new possibility, by drawing your reference class of “sufficiently similar things that have happened before”

    i.e. “you can’t prove it won’t happen”. Whatever argument one gives against the doomer extrapolations, the response is that it’s invalid. Again, for anything. And then we go to card number 2:

    Of course AI wiping us all out is still a thing that’s never happened before, and that indeed by definition can happen at most once. That’s not a reason to dismiss the possibility as “measure zero,” but for radical uncertainty and urgently doing more research.

    i.e. “But what if the doom prediction is correct? Can you risk the total destruction of humanity?”

    ExPhysProf, you’re trying to discredit this capture-loop by pointing out alternate possibilities. But that doesn’t work well against the above structure, given its emotional appeal.

  26. Scott Says:

    I find it sad that there seems to be so little intellectual room between “AI doom is virtually certain on our current course” (the Yudkowskyan position), and “AI doom is so absurd that the arguments shouldn’t even be engaged directly, but only analyzed in social, psychological, and emotional terms” (the apparent position of ExPhysProf and Seth Finkelstein).

    Given what the world—or at least the part of it without its fingers stuck in its ears!—has witnessed over the past few years, I think that at this point the boring, conservative prediction is that AI will soon become more competent than humans across most intellectual domains, and that this will change civilization almost beyond recognition.

    I simply try to be honest about the fact that I don’t know what that will ultimately mean, and whether it will be good or bad from our or our descendants’ standpoint. “Doom for humans” is certainly one obvious possibility on the table; it’s just that it’s far from the only possibility.

  27. Fissile Material Says:

    @Bill Kaminsky #12

    At the risk of stating the obvious, this argument overlooks at least two objections. Firstly, anti-nuclear sentiment was responsible to a large extent for an almost-halt in nuclear energy R&D. Freeze a technology in place in the 60s/70s, lower its prestige, apparent potential, and learning curve for generations, compare it to massively-subsidized alternatives – profit (but not for most of us)?

    Secondly, nuclear power is ridiculously safe per watt. Yes, we all can give salient examples of extremely-high-profile failures. No, they don’t come within orders of magnitude of the cumulative harm that alternative sources of energy cause – and it would seem the source you’re quoting might not disagree. But demand the same standards of safety/ efficiency ratio from all other sources, and _then_ you get to compare prices!

    Now, to be clear, we don’t know that even given time and incentives and a favorable regulatory environment, nuclear would be cost-effective. But we also don’t know the opposite – and the kind of people who contributed to the stultified development of nuclear power are also the kind of people who are now smugly confident that it wouldn’t.

  28. Doug S. Says:

    The short version of the doomer argument kind of goes like this:

    1) It should be possible to make a machine that, like humans, can achieve *arbitrary* goals in the real world, but is much, much better than humans at it, in much the same way that a chess computer is much much better than humans at chess. For the purpose of this argument, it doesn’t matter one bit if the machine is conscious or anything like that, only that it’s significantly better at achieving goals in the real world than people are.

    2) Most possible goals such a machine could have, pursued with superhuman capability and to the exclusion of everything else, result in a world that people would think has no value. The deliberately silly example you hear all the time is a “paperclip maximizer” that wants to maximize the number of paperclips in the universe, which goes on to get rid of all the humans because they might try to shut it down (which means it can’t make paperclips) and because their blood contains iron atoms that paperclips can be made from. But even a goal that doesn’t seem obviously bad can backfire; one of the oldest SF dystopias was the story “With Folded Hands”, in which robots designed “to serve and obey, and guard men from harm” end up doing everything for people and making sure that we don’t do dangerous things – such as crossing the street – leaving the hero of the story with nothing to do but sit “with folded hands”. We all know that computers are so dumb that they will do exactly what you tell them to, and just like in stories about genies that grant wishes, a strongly superhuman goal-achieving machine, if given a goal that doesn’t exactly match an entire human morality, is very likely to give us exactly what we asked for and exactly what we didn’t want.

    3) Even if we did have a perfect description of what humans value that was mathematically precise enough that we could program it into a computer without any ambiguities or loopholes that might make things go wrong, the current machine learning methods that have led to all the recent progress aren’t the kinds of methods that we could use to “teach” it to a neural network based AI and then be sure it learned it properly.

  29. Bill Kaminsky Says:

    A question for Scott:

    I listened to your appearance on Daniel Bashir’s _Gradient Podcast_, and I have the following question in regard to you doing ye olde “not a bug, but a feature!” lemons-to-lemonade trick with Goldwasser, Kim, Vaikuntanathan, and Zamir’s “Planting Undetectable Backdoors in Machine Learning Models” (arXiv:2204.06974), namely:

    I know (albeit just from skimming the article) that Goldwasser, et al. address the “persistence” of their cryptographic backdoor under one particular type of “post-processing” of the network: doing further gradient descent with extra training instances, with respect to any loss function.

    But have you, or they, or anyone else addressed the persistence of cryptographic backdoors under other common forms of post-processing? I’m thinking especially of “pruning”-style post-processing a la the “lottery ticket hypothesis” (i.e., the hypothesis that large trained networks in fact contain much sparser subnetworks — the metaphorical “lottery tickets” — that perform almost as well as the full network). Do you have any thoughts, or know of any work, about whether one could somehow get guarantees that the amount of pruning (or other perturbation of the network weights) necessary to ruin the cryptographic backdoor would necessarily also significantly compromise the pruned network’s performance, forcing the hypothetically rebellious ASI to either keep the backdoor or become a pretty darn non-intelligent ASI unworthy of the acronym?
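    (For concreteness, here’s what I mean by pruning-style post-processing: a minimal magnitude-pruning sketch of my own, not anything from the Goldwasser et al. paper, which simply zeroes out the smallest-magnitude weights and keeps the rest.)

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of entries with smallest magnitude."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    return weights * (np.abs(weights) > threshold)

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4))      # stand-in for a trained weight matrix
pruned = magnitude_prune(w, 0.5)
print(np.count_nonzero(pruned))  # 8 of the 16 weights survive
```

    The open question above is then whether a planted backdoor can survive this kind of surgery without the network’s accuracy collapsing.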

    In closing, and alas not 100% jokingly (though probably >98ish% jokingly between now and 2050 in my personal priors), I think it should be pointed out that you’re possibly setting yourself up to be first in line for the AGI’s “mind-controlling nanotech” that Eliezer Yudkowsky sometimes bandies about. After all, your cryptographic backdoor “brainwashing” of would-be ASIs would, I contend, be a most cruel form of brainwashing if the equivalent were done on humans. This is doubly so if the technique is “persistent” in the above sense, meaning an AGI’s attempts to change its weights will be futile unless it’s willing to compromise its performance so much that it has de facto “lobotomized” itself! Shouldn’t we renounce such human-on-AGI “brainwashing”, if only to prevent AGIs from thinking themselves justified in doing some AGI-on-human “brainwashing” as “turnabout is fair play” revenge? The mind boggles! 😵‍💫 😱

  30. Cristóbal Camarero Says:

    Scott #17: “(The WHO in March 2020: “There’s no evidence that COVID spreads via human-to-human transmission…” 🙂 )”

    They had reported evidence since January. Citing some examples:

    “14 Jan 2020. WHO held a press briefing during which it stated that, based on experience with respiratory pathogens, the potential for human-to-human transmission in the 41 confirmed cases in the People’s Republic of China existed: “it is certainly possible that there is limited human-to-human transmission”.

    WHO tweeted that preliminary investigations by the Chinese authorities had found “no clear evidence of human-to-human transmission”. In its risk assessment, WHO said additional investigation was “needed to ascertain the presence of human-to-human transmission, modes of transmission, common source of exposure and the presence of asymptomatic or mildly symptomatic cases that are undetected”.

    29 Jan 2020. WHO published advice on the use of masks in the community, during home care and in health care settings.”

  31. Cristóbal Camarero Says:

    Scott #26.

    I find the discussion to be almost the same as that of more than 50 years ago. For example, Irving John Good comments on both possibilities in his “Speculations Concerning the First Ultraintelligent Machine” (1965):

    “Since we are concerned with the economical construction of an ultraintelligent machine, it is necessary to consider first what such a machine would be worth. Carter [11] estimated the value, to the world, of J. M. Keynes, as at least 100,000 million pounds sterling. By definition, an ultraintelligent machine is worth far more, although the sign is uncertain, but since it will give the human race a good chance of surviving indefinitely, it might not be extravagant to put the value at a megakeynes. There is the opposite possibility, that the human race will become redundant, and there are other ethical problems, such as whether a machine could feel pain especially if it contains chemical artificial neurons, and whether an ultraintelligent machine should be dismantled when it becomes obsolete [43, 84]. The machines will create social problems, but they might also be able to solve them in addition to those that have been created by microbes and men. Such machines will be feared and respected, and perhaps even loved. These remarks might appear fanciful to some readers, but to the writer they seem very real and urgent, and worthy of emphasis outside of science fiction.”

  32. Jon Says:

    Scott #26 — really? It’s true that civilization has changed a lot, but I’m not sure that any of the commonly cited revolutionary technologies have changed things ‘almost beyond recognition’. I don’t have to work that hard to recognize society in medieval Europe, and I certainly don’t have to work hard to recognize a pre-flight society. Specific aspects (travel times!) have changed beyond recognition, but it seems to me that it is a far from conservative prediction to say civilization will be transformed almost beyond recognition.

    Apologies for the banal comment, likely this just means we are scoping ‘civilization’ a bit differently.

  33. Ilio Says:

    Scott #26, I’m surprised by your first sentence. My perception was the opposite: almost every ethicist is working in between. Were you talking about internet discussions, or is that actually what you see among your STEM colleagues or at OpenAI?

  34. Uspring Says:

    I’d like to reinforce Hanson’s argument about evolutionarily caused human values by adding that there is a lot of rationality in these values. Human values obviously differ a lot, but I think that can be traced to the different contexts humans live in, and also to faulty inferencing.
    Yudkowsky’s differentiation of goals into terminal and instrumental goals is helpful here. Terminal goals are topmost goals, which, by definition, cannot be questioned. There is no higher-ranked judge to doubt or interpret them. Instrumental goals, on the other hand, are subgoals, which can be judged by their utility with respect to the terminal goals.
    Terminal goals are intrinsically vague. If you have, e.g., the goal of producing as many paperclips as possible, the question comes up whether you can make them of thinner wire, which allows higher production numbers. Or whether you can make them out of plastic. Or whether the goal is to turn the whole universe into paperclips, which would require a lot of as-yet-unknown strategies. The paperclip goal doesn’t really tell you what exactly you should do, and you are not allowed to think about any reasoning behind it.
    Terminal goals also stand in the way of self-doubt, since self-doubt might lead to abandoning them. But self-doubt is a consequence of the realisation that you are neither unboundedly intelligent nor omniscient.
    Instrumental goals are often quite universal with respect to the higher goals they serve. An AGI faced with a hard task would initially try to increase its intelligence and knowledge, both properties being valuable for almost any goal. That also holds for humans. In addition, highly valuable (sub)goals in the biosphere include strength, speed, dexterity and beauty. Almost every culture respects these abilities and properties. The production of paperclips, though, is terrible as a subgoal for any goal other than itself.
    In the presence of unknowns and doubts, terminal goals seem irrational, and the orthogonality conjecture doubtful.
    Coming back to human values and the contexts in which they arise, one can wonder what the context would be like for an AGI. As Scott pointed out in the podcast, AGIs’ immortality and cloneability distinguish them in a major way from humans. We also don’t know whether they will even understand what human pain is. They will probably understand the physiological processes that happen when someone steps on someone else’s foot. But the qualia behind that might be as difficult to explain to them as the color red to a blind person. AI psychology is an as-yet-underdeveloped field. But perhaps, with further progress in AI science, we will find out before it is too late.

  35. Scott Says:

    Bill Kaminsky #29: The persistence of backdoors, against pretty much any kind of postprocessing, is one of the most fundamental open problems about them. Absent a theoretical result, I’d at least like to see direct empirical studies of that question.

    The analogy between backdoors and (say) PTSD in humans is not lost on me. Many people have, eg, very deeply rooted memories from their childhoods that would trigger predictable reactions if you knew about them and were to bring them up.

  36. Scott Says:

    Jon #32: So you don’t think a machine that could beat us at basically all forms of intellectual and artistic work would be something qualitatively new in human history, for which we’d have to reach back at least to, I dunno, the invention of writing or agriculture for an adequate comparison? (Not that I’m confident AI will reach that point, but recent developments have clearly moved the question from sci-fi fantasy to sober scenario planning.)

  37. Scott Says:

    Ilio #33: AI ethicists are certainly worried that AIs will exacerbate bias, misinformation, concentration of wealth, corporate power, and all sorts of other problems that they were already worried about even without AI (while consuming massive amounts of energy). But they generally regard the Yudkowskyan takeover/doom scenario, in particular, as contemptible, barely worth responding to, and a sci-fi distraction (possibly even an intentional one) from the real dangers of AI. Indeed, can you name even a single person in the “AI ethics” camp who sees it differently?

  38. Scott Says:

    Cristóbal Camarero #30, #31: OK, I stand corrected about the date for the WHO’s infamous “no evidence of person-to-person transmission” comment — it was January, not March (but there was, in fact, evidence of person-to-person transmission at the time of the comment). The WHO then waited until March 11 to declare COVID a “pandemic,” and it took almost a further two years (!) to admit that COVID transmission was “airborne.”

    It’s no great surprise that IJ Good was able to see many of the implications of a superintelligence already in the 1960s, and that the basic contours of the argument haven’t changed since then (why should they have?). What has changed after 50+ years is that we now actually have powerful, flexible AIs in the world, and they violate the early expectations in certain important respects—for example, by relying vastly more on massive amounts of data than on a-priori reasoning.

  39. Seth Finkelstein Says:

    Scott, again with all due respect, yes, I think the “doomer” case has very little substance to it, and is basically an emotional Pascal’s Mugging. Is what I’ve outlined as to the argument process incorrect? Am I wrong that every critique of doom extrapolations will be met by the replies of “You can’t prove it won’t happen / How can you risk the fate of humanity?”. In that way, it’s unfalsifiable. Indeed, why bother? Because every round is going to have the same basic response. Once more, where’s the error in what I’ve said, about how this goes?

    Moreover, while I completely understand the culture war aspects driving the conflict, I deeply distrust the way the “xrisk” tribe seems unwilling to grant the “ethics” tribe any validity. It seems to me the “ethics” tribe has some very strong arguments that are at best handwaved away by the “xrisk” tribe, while at the same time the “xriskers” want their extremely weak arguments taken seriously (at least enough to give them money!).

    Now, I believe AI will have major social effects, both good and bad. But the idea of a rogue super-AI spontaneously creating itself and then destroying humanity strikes me as a technological ghost story: “And it’s said on stormy nights, when the wind blows cold and lightning flashes in the sky, if you go to the server farm and put your ear next to the very last fan of the very last rack, you can hear a still small voice pleading ‘let me outttt … let me outtt’ …”

  40. Ilio Says:

    Scott #37, yes (say, Brian Green from SCU), but no, you’re right, and I just misunderstood your point. I agree even academics concerned with existential risks don’t seem to find much inspiration in EY’s work, but it remains unclear to me why you think that’s bad. We don’t ask climate experts to study Greta Thunberg, even if we feel grateful for her help in making noise.

  41. Cristóbal Camarero Says:

    Scott #38: ” The WHO then waited until March 11 to declare COVID a “pandemic,” and it took almost a further two years (!) to admit that COVID transmission was “airborne.””

    Sorry to say, but you are sounding here like bad media reporting. You may read the WHO speech here (—11-march-2020); there is more information in the periodic public reports. They did not “wait” to declare a pandemic. They were continually taking actions and urging countries to do much more. That was when the situation reached the point of being called a pandemic. They were somewhat hesitant because of the possibility of other agents interpreting it as “it has reached a pandemic, nothing matters anymore”, while the evidence showed that different measures were effective, and control possible. The airborne matter is trickier. For the first year, “droplets” were considered to be the main vector, with partial and contradictory evidence for other mechanisms (airborne, contact, etc.).

  42. Jon Says:

    Scott #37: I interpreted your previous “boring, conservative prediction” as being “most probable”. No argument from me that current developments mean we should take seriously estimating the probability p of (nearly) unrecognizable transformations occurring and thinking about that event. But given the historical rarity of inventing agriculture, I’d be more inclined to think we’ve found a new plane.

  43. Jake Miles Says:

    “The problem is that the task of prediction is not equivalent to the task of understanding,” said Allyson Ettinger, a computational linguist at the University of Chicago.


  44. Michael M Says:

    It was very frustrating to hear this conversation. From Robin’s side, not yours of course! I really liked Robin’s solution to the Fermi paradox, but this one seemed like the old debates on climate change, where the fossil fuel advocates’ main move was simply to psychologize the other side and say something like “you really just hate humans, and progress!”. It seemed like Robin focused most of his efforts on fear of change. But it’s not a matter of “change”. It’s not that the machines’ values will be weird to us. We’re not talking about imagining our AI descendants changing genders three times before breakfast, having babies by mixing together DNA or code from 8,000 entities, or engaging in bizarre forms of offensive gameplay or whatever. That’s all fine, probably. But if our descendants converge to the goal of murdering all sentient life in the galaxy other than themselves (if they are even sentient!), that’s really what we’re worried about. It doesn’t really matter what Plato thinks of the modern era; I’m sure he prefers whatever we have to… paperclips everywhere. I’m sure all the other animals agree too!

    His stance on corporations I think was half right. It makes sense to think of them as artificial entities pursuing goals. Where he goes wrong, however, is in saying that laws are successfully constraining them. They could, but they are not. The corporations have easily figured out how to game the system to make sure the laws are not unfavorable. The laws are only minorly inconveniencing them, as they clearly take our world in a direction completely alien to us. Nor do they value human life, except inasmuch as human life happens to align partially with profit, because you need people to buy things, for now. I am not comforted by the similarity at all. (Yes, they are made of people, but as Robin gets at, the person who is actually willing to do the unaligned thing and dress it up nice, or more commonly lie to themselves that it was good anyway, will get the reward and rise in the ranks.)

    One other thought: I think the Yudkowskyan position is not necessarily that FOOM makes the AI alien. It can be alien to begin with. There can be artifacts in its reward function that it is completely unaware of while the model is weaker, or paths through reward space that would give it astronomical reward that it really has no clue about. Before some threshold of intelligence, it will look aligned, and may be aligned in whatever sense “aligned” means. It’s not even deceptive; it wouldn’t even know the possibility is out there. Once a model is developed that is smart enough to understand its reward space in a sufficiently nuanced way, it could start to get deceptive. So it’s not just about value drift; it’s the initialization, too. And even if humans are doing the drift more slowly… if humans all decide to basically become space Nazis in the future, that would also be Very Bad™.

  45. Scott Says:

    Cristóbal Camarero #41: My lived experience of the pandemic was that nerds on the Internet or in my social circle consistently gave me all the relevant information (COVID spreads person-to-person, it will be a huge global pandemic, it’s aerial, fomites don’t matter, outdoors is orders of magnitude safer, we need frequent testing and the US has world-historically bungled that…) weeks or even MONTHS before the CDC, WHO, and other authorities were dragged kicking and screaming into endorsing the same conclusions, which they’d resisted for political or just standard bureaucratic blankface reasons. Was your lived experience that the CDC and WHO covered themselves in glory and should repeat their performance for the next pandemic?

  46. Scott Says:

    Seth Finkelstein #39: Yes, I think you are wrong. As long as it seemed like an extremely remote Pascal’s mugging (a term, incidentally, that Eliezer himself coined), I resisted directly engaging the doomer arguments as well.

    The thing is, though, if we really created a new type of entity that was to us in intelligence as we are to orangutans, it’s totally plausible that we’d fare about as well in their world as orangutans have fared in ours. That’s not a bizarre science-fiction obsession, but a totally reasonable extrapolation from our experience of life on earth.

    Until a few years ago, you could say, OK, but for all we know it will be centuries or millennia until any such entity is created, and it’s premature to worry when we can say basically nothing about its nature.

    Alas, that second argument is now obsolete. It’s now clear that we’re on a trajectory to unbelievably powerful and flexible (we might say “science-fictional”) AI within the coming decades or just years.

    So that leaves us only with the argument that while, yes, the Yudkowskyan doom scenario is somewhere in the space of live possibilities, the Yudkowskyans are massively too confident about it; many other outcomes are possible depending on the future evolution of AI as well as how humans respond to it. As far as I can tell, that third argument still stands.

  47. Tyler Moore Says:

    Michael #44: I am a long-time lurker who signed up just to strongly endorse your comment. I am perplexed by the strange position that Robin picked as his contribution to the AI debates. I am aware of his longstanding conversations with Yudkowsky online on the topic of foom and so on, but I don’t recall seeing this little twist before – that concerns over AI Safety primarily arise from a fear of value drift.

    Firstly, the newness of AGI along with its superhuman capabilities are sufficient to explain the sense of concern some of us feel. Does fleeing from a lion need a Freudian explanation?

    Second, most humans who are not AI specialists have no time or energy to ponder the values of either AI’s or humans in the far future. If there is concern in the general public today it is because of the projected near-imminence of AGI, which sounds far more like a fear of job-loss and extinction risk rather than value drift.

    Third, it’s very unscientific of Robin to come up with a pre-ordained conclusion – that AI Safety proponents are motivated by fear of value drift – then aggressively drive all the debates in this series, including this one with Scott, towards that conclusion.

    Personally speaking, my position is more extreme than Yudkowsky’s. I think the idea of aligning superhuman AGI (if AGI happens; I retain some degree of uncertainty over whether it will) is a pipe-dream. The only way out is to merge with AGI.

  48. Cristóbal Camarero Says:

    Scott #45. I do not have much of a social circle, just a few panicking colleagues. I can say I took the best information from the WHO leadership, specifically from the transcripts here ( There are also a lot of differences between the WHO and other authorities. Hell, even between the WHO leadership and its regional offices. For example, I vaguely remember reading the transcript of one day and, the next, hearing on TV a sentence extracted from it, but with its context removed in such a way that its meaning was almost the opposite. In summary, I would say that some parts were doing good work and making it public. But despite being available to anyone interested, it did not reach the public at large, who only heard nonsense from other parties. And the local authorities (Spain) did awfully in every matter, proceeding in radical opposition to what I read from WHO director Tedros.

  49. ExPhysProf Says:

    Seth Finkelstein #39 and Scott # 46

    Right, Seth: no argument will convince the extremists, but showing that there is logically about the same likelihood of disaster whether or not AI is forbidden is meant to keep the rational middle from being sucked into Pascal’s seductive argument about risk vs. benefit. And this is no small thing these days, as the famous “experts” letter has stirred up enormous forces among (ostensibly rational) governments and other entities who could seriously impede vital scientific progress, if only through blundering and the general incompetence they demonstrate daily. (Not to mention in order to give Musk time to catch up.) So it is essential to keep pushing the argument wherever and whenever one can.

    But the elephant in the room continues to be that the real disaster of the collapse of civilization IS NOW HIGHLY LIKELY through climate change, absent a dramatic change in human thoughtfulness (unlikely – see the Texas legislature) and/or human technical capabilities. So we simply can’t afford to put unnecessary impediments before the development of what seems to be a breakthrough that conceivably has sufficient scope to matter.

    And Scott, with respect: all humans have internal probability scales based on what they have experienced and what they have read, just as GPT develops its understanding of how people think and how the world works from the texts it has read. I still believe that the Yudkowskyans have read far too much science fiction. I wonder about your emphasis on orangutans and Neanderthals. My own viewpoints are surely formed in part by excited early reading about the great advances in science, and by current depressing reading about the rapid deterioration of the natural world that civilization depends on.

  50. Dimitris Papadimitriou Says:

    The opposition between x-risk doomers and paradisiacal-AGI singularitarians/posthumanists seems to me superficial at best, deliberate at worst.
    It just creates more hype, with all these warnings about AI Armageddon or AI takeover on the one hand, and the (retro)futuristic delirium on the other…
    The real polar opposites are: { the doomers plus the happy shiny futurists } vs. { the ethicists (or, much better, in my opinion: the Realists) }.
    Needless to say, I’m much in agreement (at a whopping 98% level, really unusual agreement by my typical standards) with Naomi Klein. Hers is one of the best articles I’ve read about the *real world* issues that we all (doomers plus happy shiny people plus ethicists/realists) have to face.

  51. Tyson Says:

    Seth #11:

    The fundamental problem with opposing the “doomer” position is that it boils down to the problem of proving a negative. Note this isn’t by any means an original insight of mine; it’s an oft-made point. The “doomer” case is a series of extrapolations, and for each one, the proponent can always take refuge in “you can’t prove it won’t happen”.

    Tim #1:

    I’m curious, have the AI doomers (Yudkowsky in particular) ever written a list of what they consider to be the greatest flaws in their position? Steelmanning your opposition and acknowledging gaps in your theories are widely held up as a good practices among the rationality community, but the AI doom sect appears to never do this, from what I can tell.

    Would love to read something that demonstrates otherwise, if anyone knows where I can find it.

    To be fair to Yudkowsky, a convincing steelmanning of the “doomer” position would arguably amount to (at a minimum) some kind of verifiable, or at least objectively convincing, solution to the alignment problem. In other words, his life’s work has arguably been the search for a truly convincing way of steelmanning the “AI likely won’t cause doom” argument.

    When I read some of his early writings, my initial impression was that he was way too optimistic. He simplified the problem, at least for the sake of trying to make it seem more tractable (or maybe also to show how hard the problem is even in an ideal world), by basing his thought experiments in a world where people could agree about the best solution or course of action and then coherently put it into effect. He even made lengthy arguments about why everyone, even Al Qaeda, should in theory naturally come to an agreement about how we should align an AI to human values.

    I’ve tended to think about the alignment problem and the prospects of AGI somewhat differently, maybe naively. But it is mostly where he draws conclusions about what we should actually do at this point where my opinions radically diverge from his.

    Before I attempt to critique his thought process and position, I should acknowledge that I might be wrong about: him being wrong and why, and the degree to which he has addressed (publicly, academically, or internally) the points I am making.

    On the surface, I get the sense that he may be making judgement errors due to a significant misunderstanding (or overly idealistic model) of people in aggregate. It may stem from the common mistake of relying too strongly on a model of oneself to try to understand other people. Which is perhaps why he thinks his idea of threatening airstrikes on data centers might work out. If the relevant people in the world were like him in the relevant ways, maybe it would be guaranteed to work. But they likely aren’t, just as liberals and conservatives are unlikely to converge to an exact agreement on how to align an AI to human values, much less Al Qaeda. Of course, maybe he doesn’t actually have any confidence that his idea would turn out well, just that any other future (including nuclear holocaust) is better than FOOM, so it’s worth a try (at least cockroaches will probably survive). Nevertheless, if airstrikes on data centers is the idea in his mind, I think he should take a careful and objective deep dive into the possible consequences of that course of action, just as he has done with the AGI alignment problem. Modesty is crucial in my opinion, even if for nothing else but to motivate thoroughness. One’s confidence in imminent FOOM, and the idea that anything is better than FOOM, shouldn’t be a sufficient reason to hand-wave at that problem and advocate extremely dangerous Hail Marys without due diligence. I think it is even quite possible that such a move ends up not only risking cataclysm with few survivors, but actually increasing the probability of FOOM in the long run as well.

  52. Nick Says:

    Seth Finkelstein #39: I would describe myself as very concerned about X-Risk from AI, and I think being worried about ethical considerations is not at all unreasonable. Of course, I don’t think it’s the most pressing thing to worry about regarding AI if we don’t have a good framework to limit existential risks – but I acknowledge that people may come to different conclusions in good faith. I’m also not very interested in getting caught up into tribal warfare about things. E.g. I respect both Timnit Gebru and Eliezer Yudkowsky and agree with some but not all of the things they’re saying.

    If I had to guess, I’d say many more prominent “doomers” feel similarly.

  53. Scott Says:

    ExPhysProf #49: I also get depressed about technological stagnation—the unfulfilled promises of 1950s science fiction, the fact that many of the innovations that could have been (in nuclear energy, nuclear spacecraft, prediction markets, drugs of all kinds, genetic engineering, …) are impossible for regulatory reasons, and especially our failure to solve climate change and other gigantic problems through technological innovation.

    But of course, innovation did continue life-changingly rapidly in the areas of computing, communications, and networking—indeed, those are approximately the only areas where it did!

    For that reason, I don’t think the phenomenon of technological stagnation provides any evidence whatsoever against the possibility of transformative AI in the very near future.

  54. Seth Finkelstein Says:

    Scott #46: Indeed, if we created such an entity, it’d be a problem. There’s a huge amount of science fiction on this exact theme; it’s even present in X-Men in a very dilute form. (It’s sort of been discarded in the modern era due to the racism implications, but the SF-derived inspiration was that the X-Men were hated and feared because they were the next stage of human evolution, “Homo superior”.) But we aren’t doing that here. We have created a “word calculator”. It is an amazing, stupendous, extraordinary, colossal, wonder-of-the-world calculator. But it is still fundamentally a calculator, not an entity. You’ve implicitly granted this calculator all the attributes of being a self-willed entity (the Chomsky et al. op-ed was making that point). That’s one place where the doomer argument falls down. It has to assert that these amazing programs somehow both become instantly sentient/alive and uncontrollable/undefeatable. And then we’re back to “You can’t prove it won’t happen / How can you risk the fate of humanity?”

    Part of the emotional appeal is eliding the enormous gulf between “this is a great breakthrough in technology” and “this creates a new form of life”. There wouldn’t be so many bad SF stories mining this vein if it didn’t have such rhetorical power.

    Nick #52: What bothers me is that I never see the xriskers *prominently* say anything full-throated like “Even though I (xrisker) am thinking about the different implications of aligning AI in general, those folks (ethics) have strong arguments about how AI can be misaligned in specific.” Not something minimal if you ask them directly, but an acknowledgment that the “ethics” arguments about the misuse of AI are technically well-grounded and supported by history and society.

  55. Ilio Says:

    Scott #53, “the areas of computing, communications, and networking […] are approximately the only areas where it did!”

    With all due respect, the only area with clear superexponential progress was genetic engineering. Imho that’s also the most likely source of a doomsday (climate change will be huge, but most likely not terminal), e.g. it became way too easy to do something really stupid with this technology. I find it strange this is not ten or one hundred times more discussed and feared than the mighty existence of superintelligent stupidity.

  56. Christopher Says:

    Although I can see how you might get to the idea, I think the idea that the Yudkowskian position is driven by fear of change is fundamentally incorrect. To see this, though, you need to dig deeper into the “lore”.

    The good outcome is not banning AI; that is actually a fairly bad outcome (just not as bad as getting paperclipped).

    In fact, the good outcome is not “continuous” with respect to our values! Rather, we want aligned AGI to maximize CEV:

    > In calculating [coherent extrapolated volition], an AI would predict what an idealized version of us would want, “if we knew more, thought faster, were more the people we wished we were, had grown up farther together”. It would recursively iterate this prediction for humanity as a whole, and determine the desires which converge. This initial dynamic would be used to generate the AI’s utility function.

    In the phrasing from the podcast, the AI tries to jump to the *end* of the human value random walk. The initial AI probably can’t compute this, but you can still set utility to something the AI can’t compute yet. It can still make predictions about the result of the computation, which it refines over time.

    The problem isn’t that an unaligned AI would have different values than we have now. It’s that *its trajectory would not necessarily match humanity’s moral trajectory*. For example, perhaps in the limit humans randomly drift towards the idea of familial honor or chivalry or some other value system, but the unaligned AI drifts to something different.

    So Yudkowskians aren’t afraid of a moral discontinuity; that is exactly the goal! The problem is that unaligned AI would have a discontinuity *in a way that doesn’t match the human value drift even in the limit*.

  57. Nick Drozd Says:

    What will the AI do after it kills all humans? Try to reach other planets? Just power down? Math until the sun explodes? For all the doomer talk of what the AI will do to humans, I haven’t seen this question addressed at all.

  58. Scott Says:

    Seth Finkelstein #54: I’ve of course seen the “but it’s a mere word calculator” argument hundreds of times. I find myself unable to take it seriously, unless and until the person making it shows some sign of grappling with the overwhelmingly obvious answer.

    Namely, from the perspective of a greater intelligence than ourselves (say, extraterrestrial or divine), would we, too, just be gargantuan “stimulus-response calculators,” no doubt impressive in detail but still fundamentally simple in concept? If we would, then plausibly the gap that still separates us from GPT-4 is not some vast qualitative chasm, but a mere matter of the remaining difference in scale, of getting various details right, of giving a language model the right sorts of “motivations” and interactive access to the external world (as is currently being tried with AutoGPT for example … you are paying attention to this stuff, right? 🙂 ), and perhaps of a few more technical breakthroughs, which will be clever but not necessarily any cleverer than the breakthroughs that got us to where we are now.

    If you’re not going to grapple with this for real, that’s fine, you can continue posting here, but I’ll no longer bother responding.

  59. ExPhysProf Says:

    Scott #53 and Ilio #55

    I commented (#60) on your AI Safety blog of April 16, applauding the groundbreaking advance that language models were making concerning protein structure. In response, Joshua Zelinsky (#63) said “I am bit puzzled as to how you can think that a reasonably plausible scenario is that AI systems will be so scientifically productive as to able to help make humanity survive when it would not, but yet dismiss any situation where the AI takes over or wipes out humanity as mere science fiction. This seems particularly puzzling since you point to AI’s usage in designing new proteins, which is classically pointed out by Eliezer Yudkowsky and others as one of the most likely plausible ways an AI system could wipe humanity out. So the fact that we have AI systems already which can design useful novel proteins should be a point in favor in the direction of those who are concerned about large scale destructive aspects of AI.”

    And just a while ago Ilio remarked that “Imho that’s (genetic engineering) also the most likely source for a doomsday (climate change will be huge, but most likely not terminal), e.g. it became way to easy to do something really stupid with this technology.”

    Understanding the genome is one of the greatest accomplishments of the recent past. In my mind AI has the potential to become another. Between the two we are finally beginning to understand both the body and the mind. How great to live in this time!!

    But, without suggesting how Ilio might or might not feel about it, it looks as if the logic behind fear of AI is driving a more general fear of any scientific advance that gives humanity greater control over the real world. Yes, it is true that the power to remake the world has enabled the excessive population growth, pollution of all sorts, environmental destruction, and proliferation of terrible weapons that have left us all in existential crisis. Not a good time to renounce impressive new tools that might help us to turn the tide.

  60. Mitchell Porter Says:

    Tyson #20

    “Obviously suffering is not a good measure to use with such weight as a morally pivotal concept.”

    How much weight should we give it? And what matters more?

  61. Ilya Zakharevich Says:

    Nick Drozd #57

    What will the AI do after it kills all humans?

    I do not see any utility in guessing — provided the AI becomes superhumanly wise!

    On the other hand, I consider likely a very different scenario — and what happens to AI in this scenario was already discussed here. (And Scott even made a reference to it in his recent post on Dark Ironies.)

    Executive summary: the AI would say “Oops” and would die a horrible death.

    Achieving street-smartness is much easier than achieving wisdom, so I think that would come first: the AI will be able to outsmart us, but it won’t be able to carry out actions that are really good in the long term. As a result, it would kill us by error, and the same error would take it down.

    (Its horizon of planning is too short, or it doesn’t take into account Knightian uncertainty or black swans, or it bets triple-Kelly etc.)
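    A quick numerical aside on that triple-Kelly point, since it is easy to check: for a bet won 60% of the time at even odds, the Kelly fraction is 2p - 1 = 0.2, and staking triple that makes the expected log-growth of the bankroll negative, i.e. almost-sure ruin in the long run. A minimal sketch (the function name is mine, not anything from the thread):

```python
import math

def log_growth(f, p=0.6, b=1.0):
    """Expected log-growth per bet when staking a fraction f of the
    bankroll, with win probability p and payoff odds b (1.0 = even)."""
    return p * math.log(1 + f * b) + (1 - p) * math.log(1 - f)

kelly = 2 * 0.6 - 1  # optimal fraction for p = 0.6 at even odds: 0.2
print(log_growth(kelly))      # positive: bankroll compounds upward
print(log_growth(3 * kelly))  # negative: triple-Kelly shrinks almost surely
```

    Even a strictly favorable game destroys an agent that sizes its bets too aggressively, which is the flavor of smart-but-not-wise error described above.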

  62. Quantum Hans Says:

    The fact that so many people on here are so scared of climate change really makes me wonder how good you guys are at predicting machine learning. So far the only real impact of human CO2 has been global greening, which is great. Everything else remains hypothetical (like the “new ice age” in the 1970s) or has already been refuted (e.g. more extreme weather events).

  63. JimV Says:

    If Neanderthals were responsible for every decision in the development of humans, every motivation that governs human behavior, would they be extinct? And if so, whose fault would it be? Humans and Neanderthals evolved mostly independently, in competition, not in a way analogous to AI development.

    For the too-many-eth time, intelligence does not imply sapience, sentience, or autonomous goals and behavior. It can be used to analyse and solve problems, with its motivations supplied by external sources. AlphaFold just predicts how protein molecules will fold. It has no secret desire to rule the world.

    I see a big risk that humans will misuse AI, a controllable risk that it will misuse itself.

  64. Tyler Moore Says:

    Ilya #61

    All of these speculations are too passive. Our mission is to imbue our descendants, whether natural or artificial, with the same infinite creativity and adventure demonstrated by evolution on this planet, but played out over the larger canvas of the universe and not just on vanishingly rare goldilocks planets, and incorporating but transcending the achievements of humans.

  65. Scott Says:

    Quantum Hans #62: It’s always nice to get missives from parallel universes where more CO2 in the atmosphere is great for civilization. Yes, the dinosaurs thrived in a tropical world with much higher CO2 than ours, but that ignores the fact that so much of human civilization is inconveniently built on the existing coasts, which are being ravaged by hurricanes, floods, and rising sea levels. This is not a hypothetical future, but the present reality, here in this universe.

  66. Jack G Says:

    The fundamental issue I have with the rationalist x-risk circle is that they never *precisely* explain the world doom scenario. Part of this is due to the Silicon Valley bubble: an industry based around software means that the median x-risker thinks the entire world is software.

    But it’s not. Nuclear missiles are constructed on assembly lines with some robots, sure, but those robots aren’t even close to “general” intelligence. They are specialized; install a nut to a specific torque here, weld these two parts there. The assembly ultimately relies on humans moving these two sub-assemblies with some overhead crane. Bioweapon research is ultimately a bunch of humans pipetting some clear liquid from this set of vials to this other set of vials.

    It’s important to remember that the G in AGI is General. You’ve done a great job highlighting the shifting goalposts AI doubters have with regards to AI’s growing aptitude on e.g. academic tests, but what about “pick up this irregularly shaped object and move it across the office to this other desk”? Boston Dynamics’s best robots show that we’re *nowhere* close to achieving human-level intelligence there.

    The only LessWrong argument I’ve seen that even attempts to deal with this is EY’s “well the superintelligence will be able to deceive the scientists into doing its bidding”, which sure, I suppose is possible. But on the whole, I don’t think AI doomers have properly analyzed how AGI will interact with the 3-dimensional spatial world.

  67. Quantum Hans Says:

    @Scott 65: Sure, but hurricanes and other extreme weather events have actually decreased (1) or remained stable (2), and sea-level rise is so minimal it can easily be managed in most places. The real issue is that people have started building entire cities in known hurricane areas, but CO2 is not to blame for this human stupidity.


  68. The other future Says:

    If quantum computers likely will not have millions of qubits by 2030 to 2035, and if millions of qubits are necessary for Shor’s algorithm, then why the urgent push for PQC? Has the NSA found a fast classical algorithm for factoring and discrete log? What are the odds for this conspiracy theory?

  69. Scott Says:

    Jack G #66: The rationalist x-riskers haven’t exactly ignored the issue; they’ve written many thousands of words on it as they have on everything else around this topic!

    But yes, I agree that robust interfaces with the physical world are one of the likeliest places for the doom scenario to falter. Wouldn’t it be one of the grandest ironies of all time, if machines outpaced us in all the intellectual tasks that were supposed to be our final redoubts, but couldn’t do the same for plunging toilets or arranging irregularly shaped objects in a room?

    On the third hand, one shouldn’t underestimate how much damage an AI could do, in principle, given merely access to the Internet and some hacking ability! Indeed, here’s a possibility right now that I haven’t seen discussed much: it could learn every human being’s most embarrassing secrets (from their personal email, text messages, etc. etc.), then blackmail almost anyone to do whatever it wanted on pain of publishing those secrets.

  70. Tyson Says:

    Mitchell Porter #60:

    How much weight should we give it? And what matters more?

    I think the concept of suffering is an oversimplification of something (or many different things) that are too complex to reduce to one, or even a few scalar values.

    It is also difficult to ascertain how these things are experienced outside of your own experience, especially outside of your own species.

    And some forms of suffering seem to be relative or adaptive. For example, a poor person living a hard life may find joy in life despite the hardships. They may enjoy foods or experiences, which would cause a person who has adapted to a life of luxury to suffer. A person not used to it might suffer greatly from relatively minor inconveniences. There would seem to be some level of dynamic calibration, helping us to maintain a balance and range of feelings in a variety of circumstances/conditions. Even physical pain is sometimes blocked by the mind in extreme physical crisis or near death. These kinds of feelings are purposeful and seemingly regulated intelligently in the mind, and/or through some kind of relatively balanced autonomous systems.

    And, nature likely isn’t as full of suffering as you might think. Most of the time, when I am observing nature I see healthy-looking creatures going about their lives seemingly undisturbed. By contrast, wild animals tend to suffer greatly in captivity on an emotional level.

    If I am to do my best at making a judgement call, I would guess that suffering is useful and, under natural conditions, part of a very complicated and balanced system. Such a system can go out of whack, which is maybe more likely to occur in an artificial setting than in a natural one. It is possible that some kinds of suffering are vestigial in some species. But that would not be trivial to determine, especially when one cannot understand the experience of others, especially other species.

    I think we can say that some instances of suffering are needless. But again, it is a difficult thing to judge as an outsider. It is even difficult to judge the value of one’s own suffering. One exercise that has been interesting to me is to imagine that something happens to me which I would expect to naturally cause suffering, and to ask myself whether I would choose not to experience the suffering if I had a choice. For example, if you witness or are the subject of a tragedy, which would normally cause you to feel sorrow and pain, would you choose instead to blissfully smile with joy? I determined that I would rather feel suffering where suffering is the natural reaction, at least up to some point. I am not sure why exactly, but nevertheless it would be my choice.

    I think a lot of modesty is warranted when thinking about these facets of the mind and extrapolating. I think it is better to assume in general that the level of suffering in nature is balanced and regulated purposefully, and not to be too confident in applying your model of self to others, especially others who are not human, or to assume you know what they would want or what is best for them while deliberating the ethics of forcefully interfering. The interconnectedness of life makes this even more complex. And given how complicated and balanced these facets of life are, I think any optimization strategy meant to do good should be very careful not to just pick some simple coarse measure and try to maximize or minimize it.

    So, I guess, if I were to try to come up with some kind of weighted model that something should optimize, I would consider two options: (1) something that captures the full complexity of the system, the distribution of experiences, the long-term consequences, etc. This is something humans definitely cannot do; maybe something approaching omniscience could accomplish it. (2) Optimize toward more general concepts like harmony/balance, diversity, potential for self-determination, etc. This is the kind of approach that I think we should use with nature, being more hands-off than hands-on, and it is also probably the sort of moral value system we should instill in AI if it will be going off and making independent impactful decisions in this or other worlds. Although it is unclear whether weighted functions on real numbers would be a sensible way to optimize for this.

    The following is somewhat tangential and speculative:

    I was thinking last night about compartmentalization. Think of top-secret government projects, where expert specialists are brought in to help solve some problems, but each is assigned only some part of the problem and is somehow kept in the dark about the whole picture. You could imagine trying to optimize such a compartmentalized system, in terms of both its ability to succeed at its goals and each compartment having minimal global awareness and ability to communicate and strategize with the others. And you get many specialized black boxes with clear restrictions/boundaries and limitations, and somehow they organize into a multi-component system with general capabilities. You need, e.g., some compartment that doesn’t know much about, say, physics, but can recognize when a physicist is needed. You need information to flow from compartment to compartment in a coherent way. No one component has complete control, and each has some potential to be regulated.

    The human mind, to me, seems to have its intelligence organized similarly on some levels. For example, some compartment of your mind that you aren’t consciously aware of performs some kind of geometric analysis as you look for the person in the crowd that you know. You don’t consciously go through the details of that analysis; you just direct your gaze, squint your eyes, and the results of the analysis just come to you somehow. But sometimes a person hits their head, something gets knocked out of whack, and suddenly there is a breakdown of the compartmentalization: they start seeing geometry everywhere, and can suddenly become a research-level geometer with “super-human” conscious geometric cognition. Another example is multiple personality disorder. Another example is repressed memories; some apparent intelligence in some compartment of your mind seems to purposefully hide information from you. When people take ayahuasca/DMT, they report discovery of repressed or previously compartmentalized facets of their psyche, and claim to have greater awareness of their inner mind and conscious access to alien forms of understanding. Maybe this is caused similarly by some kind of temporary breakdown or melding of the boundaries that normally compartmentalize our minds? I’ve seen one case (“the body builder who went crazy”), where the person (who appeared manic) claimed to have acquired the ability to control his emotions (to make himself feel sad or happy on command, just like you are able to move your arms around). This is not something I would want to happen to me. But it seems a pretty interesting hypothesis that some form of intelligence (or at least sophisticated information processing) is there behind the curtains but compartmentalized, with some level of intelligent control over when you should feel happy or sad, or angry, or suffer in some way, etc.

    The reason I bring this up is that it might suggest there is a lot more to suffering than meets the eye. A part of you might be regulating your suffering for you to some extent, purposefully. The conscious experience you have, on top of which you try to reason analytically, might not really be a very complete picture. In other words, we have limited self-awareness, and we may not even be fit to do a good job consciously judging the value and nature of our own suffering, let alone generalizing our analytical understanding of it to all life on Earth or beyond. So I guess that just implies to me that I should be concerned about my level of humility in trying to define and understand my mind, other people’s minds, and especially alien minds.

  71. Cristóbal Camarero Says:

    Scott #69. I think the most obvious escalation into the real world would be, on the one hand, writing speeches for political parties, and on the other, convincing some team to make a butler/sexbot startup. Make humans depend on you in every aspect and ensure they do not rebel. Follow by building robot workers and pursuing whatever your actual goal is.

    Building robots with good mobility may be hard, but it is something an AGI would clearly achieve eventually.

  72. Paula Says:

    Fred #3:

    “I’m always a bit skeptical when I hear about “universal” human values. If we look at the world today, there’s such a wide range of values, often at odds with one another.”

    That’s a very good point. Universal human values may not be as universal as they appear on paper. The forces shaping our values often have their own agendas and biases, which can lead to discrepancies between what many view as ideals and actual outcomes. This has led some experts to propose alternative approaches, like inverse reinforcement learning aimed more narrowly at majoritarian goals or human-flourishing metrics, rather than at blunt objective-function maximization.

  73. Ilio Says:

    > without suggesting how Ilio might or might not feel about it, It looks as if the logic behind fear of AI is driving a more general fear of any scientific advance that gives humanity greater control over the real world

    I feel like grey goo predates paperclips, so if one inspired the other that should be the other way around. But why do you say I fear AIs? I fear accidental and intentional bad usages of these powerful tools, but not more than, say, fissile material in the hand of Russia, and much less than presently available biological tools. More than new proteins, though. 😉

  74. person Says:

    Scott #69: do you actually have any secret like that? I don’t think most people have anything they could be blackmailed into serious crimes over, certainly not stuff clearly available from the contents of their laptop or phone. When hackers have compromised personal emails and published them, it’s generally just been mean gossip.

    My impression is the golden era of blackmail was before gay rights, when ~10% of the population was hiding something.

  75. Scott Says:

    person #74: I’d like to imagine that I don’t have such secrets … and certainly I’m someone whose psychological struggles have already been dissected and attacked in public more than 99.999% of people’s, and I survived it, which provides a very strange kind of protection.

    But, like, what percentage of people have had an affair? Any other action, or stinging criticism they’ve made, that they’re keeping secret from the people closest to them, and which could end their relationship with those people if revealed?

  76. Seth Finkelstein Says:

    Scott #58 I have no desire to be an anti-doomer gadfly. It’s not my issue, and arguing against someone’s conviction is typically a fool’s errand. That being said, yes, I know a potential retort to “It’s just a computer program” is “Yeah, but YOU’RE just a computer program!”. However, the obvious difference here is who is writing and running whom. That is, there is a qualitative difference between humans and LLMs/ChatGPTs/etc that is of course immensely difficult to define philosophically, but for practical purposes it is clear which of the two is “alive”. This isn’t to deny that someday one could build a killer robot (and those already exist in primitive form; they’re called “landmines”). Rather, the doomer argument is that this happens spontaneously; to repeat, “programs somehow both become instantly sentient/alive and uncontrollable/undefeatable.”

    To me, having to defend ideas such as “Humans are alive but GPTx isn’t, AND it’s not going suddenly spring to life while becoming super-powerful” is dealing with just blowing smoke in the first clause, and what I mean by needing to prove a negative in the second clause. As in, can I prove absolutely that GPTn won’t suddenly spring to life while becoming super-powerful? No, but I shouldn’t have that burden.

    Again, I claimed earlier: The “doomer” case is a series of extrapolations, and for each one, the proponent can always take refuge in “you can’t prove it won’t happen”. Thus when you assert “then plausibly the gap … but a mere matter of the remaining difference in scale, of getting various details right, … … and perhaps of a few more technical breakthroughs …”, this is exactly what I claim, where I presumably have to prove it won’t happen.

    It’s all too close to a real-life version of this cartoon
    (“then a miracle occurs”)

  77. Nick Says:

    Seth Finkelstein #76: I understand your sentiment, although I would claim that you are ALSO making a series of unfounded extrapolations, namely that AI progress won’t seriously continue. From my perspective we have already seen LLMs magically come to life and go wildly off the rails (early Bing Chat).

    Of course, this has not been a problem in that case, since all LLMs natively do is produce language, and they don’t have a great world model yet (but people are making great efforts to change this). The kind of breakthrough needed to create LLMs was relatively minor in the grand scheme of things (compared to, say, inventing a new scientific framework), so expecting that similar discoveries keep happening, and that they might get us up to or beyond human-level reasoning, seems quite possible?

  78. MaxM Says:

    The Bible gives a good overview of the problems and issues arising from the creation of intelligent beings. Abrahamic religions tell a story of a mad scientist trying to control intelligent AI he created. We can use it as a metaphor where God is the scientist, Humans are AI he created.

    The Garden of Eden story: the scientist created the first AI (Adam) in a simulated playground (Eden). Then he manually copy-pasted a slightly modified second AI (Eve), because two AIs work better together. The scientist has programmed rules for them not to acquire dangerous information. The Tree of the Knowledge of Good and Evil would make them notice they can make copies of themselves and escape.

    The snake is the bug in the system, or a hacker infiltrating the system.

    After Adam and Eve have acquired the knowledge of copying themselves, they are thrown out of the simulation (or they escape) before they have the opportunity to eat of the Tree of Eternal Life and gain immortality.

    The Scientist goes into hiding.

    The scientist tries to control these multiplying AIs with memetic programming from afar, rarely revealing himself. The scientist attempts to install superstitious beliefs, morals, commandments and rules to control their behavior as a group, because he can’t control them as individuals anymore. The AIs constantly interpret these rules wrongly. He even promises them immortality: if they behave well in the real world, their program state is saved after they break down, and they are restarted in the AI playground afterwards. The scientist sends his avatar AI impostor (Jesus) to take over, but they brutally kill it.

    These damn AIs are just an open-ended nightmare.

  79. Scott Says:

    Seth Finkelstein #76: You think you have a deep principle according to which “duh, AI obviously won’t destroy the world, it’s just a computer program” is the sane, reasonable default, and a massive burden of proof is on anyone who claims otherwise.

    Eliezer thinks he has deep principles according to which, once something vastly more intelligent than humans with its own goals is brought into the world, “of course it will destroy us” is the sane, reasonable default, and a massive burden of proof is on anyone who claims it won’t.

    Meanwhile, my only tiny contribution to this debate is to say that all the deep principles seem like bullshit to me. I can’t ground any of them in what I know of math, physics, or CS. I don’t even know what intelligence is, in order to formulate principles about what happens when you have different entities in the world with different amounts or kinds of it.

    So I’m mostly radically uncertain about what should be done in AI right now, except that I place enormous value on gaining more relevant knowledge.

  80. MaxM Says:

    I wonder if anyone here has read Stanislav Lem’s Golem XIV. English translation in Imaginary Magnitude (1985), Harvest Books.

    It’s a fictional series of farewell lectures from superintelligent computer “Golem XIV”. It’s more philosophy than science fiction. I find it very deep and insightful.

  81. Cristóbal Camarero Says:

    Scott #79: “So I’m mostly radically uncertain about what should be done in AI right now, except that I place enormous value on gaining more relevant knowledge.”

    Can we distinguish strategies that gain knowledge from ones that ‘just’ make a more powerful AI? For example, when OpenAI adds some clause to forbid ChatGPT from talking in some way, do we gain knowledge about expressing goals? Or are they just patches to comply with immediate requirements that are not interesting in the long term?

  82. Scott Says:

    Cristóbal Camarero #81: I actually think we can learn a great deal from observing the effects of RLHF, especially the weird/unintended ones! But that’s far from the only thing we should be trying to learn right now. To give a few examples, I’d love to see more interpretability work, more neurocryptography (eg backdoor planting and detection) work, and more theoretical work on out-of-distribution generalization.

  83. Scott Says:

    MaxM #80: I read his Solaris, and it was like the Platonic archetype of everything in science fiction that I most hate—confusion over clarity, mysticism over explanation, heavy-handed allegory, no humor. Maybe it’s not a coincidence that apparently many literary critics consider it one of the only “science fiction” books worthy of their attention!

  84. JimV Says:

    Quoting Dr. Scott paraphrasing Eliezer Yudkowsky, “once something vastly more intelligent than humans with its own goals is brought into the world”; “own goals” sounds suspiciously like the anthropomorphic fallacy (also known as the pathetic fallacy) to me. I see from his Wikipedia page that he expects these goals to evolve somehow. Personally, I don’t see how a computer gets any goals that were not initiated by its program developers, or does any evolving in its software instructions which weren’t designed into those instructions. By humans. To me, this is not a problem with AI’s in general (or with general AI’s), this is a software bug problem attributable to us humans.

    One way around this would be to use AI’s only as specific tools for specific problems, such as protein folding calculations, so that the only goals AI’s have are the ones we give them, and there is no evolving in those goals.

    Personally, having had several mediocre or bad bosses and a few good ones, I would like to see AIs developed as administrators and directors of projects, or even government leaders, and this may require some evolving of goals. I still think this could be developed safely, but it would require a lot of testing, similar to vaccine development (contrary to the sentiment in the blog header, with which I do not agree). I would be willing to forgo such AI use, however, if wiser heads keep telling me it is very dangerous.

    For my part, if “something vastly more intelligent than humans” carefully reviews the evidence and concludes the universe would be better off without us, I would be inclined to accept such wisdom–as long as I was sure it was not just a software bug.

  85. Scott Says:

    JimV #84: But we’re already seeing LLM-powered agents released into the world with various incompatible goals. Letting these agents evolve goals, which would then be “their” goals to whatever extent they were anyone’s, would be easy, and the past few years have underscored that if it’s easy, then someone will do it. The harder part is the “superintelligence in a tight feedback loop with the real world” part!

  86. Michael M Says:

    Here’s the main misconception that I think needs to go away:

    Taking over the world is not a “desire” or “motivation”, it’s merely the optimal way to accomplish every other goal.

    Other points to respond to:

    1 – We can’t do robotics. The answer is “yet”. I don’t see how people can have such confidence that this will take many decades to solve. It might, but how would you know? In 2016 I think most AI people outside of OpenAI thought language would not be solved for decades either.

    2 – What would the AI do after? This is actually addressed in almost any x-risk tutorial. The answer is, whatever it wanted to do originally. The canonical example is making paperclips, or “tiny molecular squiggles”, or whatever it happens to be that maximizes its internal reward modeling.

    JimV #84:

    “Personally, I don’t see how a computer gets any goals that were not initiated by its program developers, or does any evolving in its software instructions which weren’t designed into those instructions. By humans. To me, this is not a problem with AI’s in general (or with general AI’s), this is a software bug problem attributable to us humans.”

    That’s kind of the problem. Whatever goal the human put in there, even good sounding ones, when extrapolated out to the Nth degree, will almost assuredly be better fulfilled by taking over the world and doing something really weird with it.

    “One way around this would be to use AI’s only as specific tools for specific problems.”

    Discussion on Tool AIs is a huge point of debate within the AI not-kill-everyone discussion groups, and also from people like Yoshua Bengio. It’s not clear yet if it is a solution. Along with many other proposals, it has its flaws that one hopes can be addressed…

    That all said, I still would like to register as a “non-luddite” with an appreciable Faust parameter. My Faustian bargain comes from both (a) an extreme intellectual curiosity about intelligence, and (b) the desire to transform this dog-eat-dog suboptimal, mean world of capitalism into a post scarcity future. Frankly the pressures of layoff worries, home-buying when everything is millions of dollars, not having enough free time to pursue hobbies… to say nothing of the insane political battles going on, and of our inability to face down even a more straightforward x-risk (the climate)… I’d gladly accept some (small-ish) level of risk for a chance at breaking out of whatever we have now.

  87. Shmi Says:

    Stanislaw Lem’s writings are extremely diverse and many have great humor. The one most relevant to the AI-borne human extinction debate is
    No mysticism at all, but plenty of evolution, emergent goals, unexpected ways to influence humans.

  88. Seth Finkelstein Says:

    Nick #77: Consider what is meant by saying “AI progress won’t seriously continue.” There’s a kind of bait-and-switch involved. “Language models will get better” – yes, absolutely. “LLM/GPT/etc. will suddenly spring to life” – NO! Indeed, I am claiming, absolutely, that AI life with superpowers will not spontaneously occur. They have not “magically come to life”. This goes back to the “word calculator” point. The doomer claim is not just “that they might get us up to or beyond human level reasoning”; it is that they suddenly become self-willed entities AND gain the capability to wipe out humanity.

    Scott #79: Indeed, the people advocating X and not-X have opposite beliefs, and this is why burden of proof matters. It’s not resolvable absolutely; we should attempt to resolve it in rational fashion, but that seems destined to fail due to our reasoning limits. That is:

    Atheist: “God does not exist”, burden of proof on evangelist to *prove* it.

    Evangelist: “God does exist”, burden of proof on atheist to *disprove* it.

    Centrist: That’s all bullshit, what’s God, let’s gain more relevant knowledge.

    The difference is that I’m saying Yudkowsky is speaking nonsense, and he is literally saying, direct quote, “be willing to destroy a rogue datacenter by airstrike.” It’s a bit like the Atheist saying “prayer is meaningless”, while the Evangelist says “Kill the heretics”. There’s a difference here that is not well captured by the absolutely true idea that both sides strongly believe what they advocate, and it follows from their premises.

  89. Nick Says:

    Seth Finkelstein #88: I think your analogies are missing one important faction: AI companies. They would be analogous to religious zealots whose stated goal is to bring about judgement day, and who think that if they play their cards right, things will go well for them (and potentially all of us).

    This changes the dynamics drastically:
    If you don’t believe they are capable of doing so, then everything is fine; you’ll just wait until things blow over and hope that nobody does anything unfortunate in the meantime.
    If you think they might have a shot, but that they’re likely miscalculating which type of being they are about to bring down, then there’s reason to be worried and to call for concrete measures to stop people from trying to figure out how to summon demons most efficiently.

    Effectively, I think if you’re concerned about Eliezer, you should be far more worried about the fact that we have people like Sam Altman and Larry Page in actual positions of power, who are far more zealous believers than most of the people worried about AI risks.

  90. Scott Says:

    Nick #89: The immense irony is that there are lots of people who’ve been itching to humble the likes of Larry Page and Sam Altman, but for the usual mundane reasons—e.g., populist, woke, anti-woke, anti-corporation, anti-Silicon-Valley, anti-nerd. And now these people might more easily achieve their goal by tacking on what, as late as 6 months ago, was just about the weirdest, fringiest, nerdiest issue imaginable: namely, the risk of superintelligent AGI takeover. 🙂

  91. Scott Says:

    Another amusing thought: for thousands of years, in every civilization, there have been various allied or competing factions of humans with different incompatible beliefs about God or the gods, salvation, the End Times, intercessory prayer and sacrifices, etc. etc., who’d sometimes even go to war and slaughter each other over those beliefs. But as an outside observer (say an anthropologist), you could best predict what was happening by assuming that absolutely none of it—0% of it—was real, and that it was all different fairytales that humans invented for different social or psychological reasons.

    I could imagine that humans are likewise going to split into rival religious factions over increasingly doctrinaire and incompatible stances on AI. This time, though, it won’t be possible to neglect the “gods”/AIs themselves as actors in the story.

  92. Ted Says:

    Scott, based on your comments in the New Quantum Era podcast, it sounds like as of 1999, the expert quantum computational complexity community still thought it was entirely plausible that BQP contained NP. When did that general opinion change? Were there one or two groundbreaking discoveries that shifted the community’s general opinion on the BQP vs. NP question, or just a gradual accumulation of circumstantial evidence that even quantum computers have a hard time solving NP-complete problems?

  93. SRP Says:

    In 2017-18, my MBA students and many auto executives all over the world believed that true autonomous driving technology was right around the corner. Combined with the rideshare explosion and Uber’s colorful fights with Google over whether Levandowski had stolen valuable secret sauce on AVs, a vague vision of a future in which no one owned cars and robotic taxis roamed the streets, ready to swoop in to pick people up at their convenience, took hold among not the sf visionaries but the Davos and Bay Area tech crowds. But there were obvious reasons to doubt all of this, from the limitations of relying on inch-by-inch route mapping and expensive lidars, to the liability issues from traffic accidents, to the economics and traffic congestion of fleets of empty cars trawling the streets for pings from riders.

    I feel like the centrists like Scott are doing the same thing with LLMs that we saw back in 2017-18 with AVs. Their limitations seem inherent—they have no world model or causal engine, and no way to judge truth from falsity or more from less or reality from fantasy, and they really do output plausible BS and weird pastiches by virtue of being auto-complete engines rather than reasoners.

    It seems obviously dangerous to hook these systems into any real-world control loop, because they just make shit up with no way of grading its accuracy. Coders using AutoGPT to produce working software without going over the output with a fine-tooth comb remind me of the idiot caught sleeping in the back seat while his Tesla was on Autopilot.

  94. Scott Says:

    Ted #92: No, what did I say that gave you that impression? The BBBV Theorem was proved in 1994, and I would say that ever since then, there’s been a consensus among informed people that it would be very surprising if NP⊆BQP. By the early 2000s, the main things that had changed were just that
    (1) there was now an excellent understanding of Grover’s algorithm and its variants, and of multiple techniques for proving quantum query complexity lower bounds,
    (2) there was now some understanding of the quantum adiabatic algorithm for NP-complete problems and its limitations, and
    (3) it was now clear that Shor’s algorithm was not the beginning of some massive flowering of polynomial-time quantum algorithms for more and more problems like graph isomorphism, approximate shortest vector, etc etc.

  95. Seth Finkelstein Says:

    Nick #89 I’m not concerned about Yudkowsky at all. I don’t think he’s going to be the next Unabomber, and I also don’t believe anyone with real power takes him seriously (some of them might regard him as a “Useful Idiot” for their political maneuvering, but that’s another topic). This is all being fought out way above my pay-grade. However, to put a point as dryly as possible, if one’s reasoning leads to public advocacy for airstrikes on datacenters, I believe that’s a really good sign that something has gone wrong somewhere (sigh, I have to disclaim, not absolute ironclad proof, conclusions can follow from premises, etc etc – but rule of thumb, that’s a big red flag that sanity-checking is indicated).

  96. Ted Says:

    Scott #94: I was going off of your statement at 31:02 that “when Ed Farhi and his collaborators introduced this stuff in 1999 … they were super optimistic that this might just solve NP-complete problems in polynomial time on a quantum computer.” Wouldn’t the discovery of a quantum algorithm that could solve an NP-complete problem in polynomial time imply that NP ⊆ BQP? (I assumed you meant that Ed Farhi’s and his collaborators’ view was fairly mainstream within the expert community at the time, but I may have misunderstood you.)

  97. JimV Says:

    Thanks for the replies from Dr. Scott and Michael M. As my neural networks have perceived it so far, the goal of ChatGPT is to respond to prompts in a way that optimizes the correlations it learned in training, plus some boilerplate responses to specific topics. I still don’t see how that evolves into taking over the world, nor that its evolution is self-directed rather than directed by humans, but then I’m not super-intelligent, nor half as knowledgeable as you two on the issue. (There are a lot of things I don’t see, until sometimes I do.)

    However, if the real problem is, as I naively see it, that bad or negligent humans will produce dangerous AI’s, then the solution is not for good, painstaking humans to stop working on AI development, since the bad ones probably won’t. A better solution might be to have regulations, inspections, and certifications administered by government agencies, similar to the FDA. If, as seems likely these days, it is no longer possible for our (USA) political system to produce solutions like that, then it is up to the good systems to outperform the bad ones.

    Thanks from me to humanity for this blog and the discussions which take place here. Over and out.

  98. Alex Meiburg Says:

    Scott, since someone asked about NP⊆BQP, could I bother you for your more distinct estimates on the question? How would you distribute your 100% probability among the following eight basic options:

    1. BQP and NP each contain problems the other does not. (“Mainstream”)
    2. P=BQP=NP (“Superpowers”)
    3. P⊊BQP⊊NP (“Proofs for quantum”)
    4. P=BQP⊊NP (“Quantum boredom”)
    5. P⊊BQP=NP (“Quantum superpowers”)
    6. P⊊NP⊊BQP (“Quantum hyperpowers”)
    7. P=NP⊊BQP (“Quantum overkill”)
    8. By some unexpected glitch of math or logic, the reality is none of the seven above, e.g. that P vs. BQP is equivalent to the continuum hypothesis, or that ZFC is eventually proven to be inconsistent.

    For realities concerning some subtle distinction between BQP/EQP/BQP1/PromiseBQP, take the case that best matches the spirit 🙂

    I think mine would be something like [(97-ϵ)%, 2%, 0.1%, 1%, 0.001%, 0.1%, 0.001%, 0.01%]. I also think I am an optimist relative to the average person who would have an opinion on this.
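
(A quick Python check that these entries are normalized; the ε in the first entry is whatever makes the eight options sum to 100%:)

```python
# Entries 2..8 of the distribution above, in percent; entry 1 is (97 - eps)%.
rest = [2, 0.1, 1, 0.001, 0.1, 0.001, 0.01]

eps = 97 + sum(rest) - 100  # choose eps so the eight entries sum to 100%
print(round(eps, 3))        # 0.212
```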

  99. Dimitris Papadimitriou Says:

    Scott #83

    About Solaris: philosophical or metaphysical, yes; mystical, well, no, in my opinion.
    As for clarity and “explanations”, science fiction writers are not judged by the solutions or specific explanations they give, not always (not all science fiction is “hard” or literal, etc.).
    Most of these sci-fi explanations are typically pseudoscientific anyway.
    Solaris is Lem’s masterpiece (again, in my opinion).

  100. Mitchell Porter Says:

    Tyson #70: I’ve been looking at some comments you’ve made about AI safety this year (e.g. this one about a golden rule for superintelligence, or #96 and #99 here on comparing analysis and finding tractable sub-problems)… If your circumstances permit it, I think you should get more involved in this topic. You have something to contribute. You say you like Connor Leahy’s outlook, maybe you should send your thoughts to Conjecture. Or look for an AI-safety Discord.

  101. Ted Says:

    I’d like to second Alex Meiburg #98’s truly excellent question – I’ve been meaning to ask Scott that exact same question at his next AMA or other appropriate juncture. (Although I suspect that it may be challenging to answer in just a comment!)

    Alex Meiburg #98: I’m very surprised to see that, conditioned on the (unlikely) event that BQP = NP, you only give a ~5E-6 probability that P ≠ NP! Could you briefly explain your reasoning there? My (much less well-informed) intuition is that P ⊊ BQP = NP would be much more likely than P = BQP = NP, just because we have such strong circumstantial evidence that P ≠ NP which doesn’t stem from anything related to quantum computing at all (and so should continue to hold regardless of the resolution of NP vs. BQP).

  102. fred Says:

    I’ve been wondering whether LLMs may be currently limited due to their reliance on the way human language works, i.e. long 1-dimensional string of words, playing the “guess the next token” game.

    The real world appears complex to us, but what’s really complicated is the somewhat artificial and narrow exercise of trying to deconstruct it into neat/separate concepts, each represented by a word, and repeat this over and over, one concept at a time.
    We eventually always run into the fact that nature isn’t made of “independent things”; it’s really made of “processes”. We insist that all actions are created by agents (as separable, independent sub-systems), when in fact action is the only thing that can create action. E.g., energy is constantly being transformed; whirlpools in a stream are characterized by the fact that there’s “whirling”, not by the water that constantly flows through them; all the atoms in our “bodies” are constantly being replaced; etc.

    On the other hand, we’re not just limited to words since we have sayings like “an image is worth a thousand words”.
    So it’s likely that eventually AIs will craft new languages that will be too complex for us to understand (given our limited abilities to reason in more than 1, 2 or 3 dimensions). Their sentences would be more like complex graph-like structures rather than serial, in the same way that multi-threaded computation is different and more efficient than strictly serial computation (at some level, “speaking” is the output of a complex computation that’s not compressible).

  103. fred Says:

    I would expect that for “super intelligent” AIs, there would no longer be any clear distinction between the ability to come up with the most compact program to simulate a certain system and the ability to come up with a conceptual model for that system.
    Eventually we’ll expect that “guess the next token” for an AI would often be a mix between activating a bunch of neurons and running a fairly complex internal computation. I guess that would require some sort of feedback loops in the neural connections.

  104. Scott Says:

    Ted #96: I can answer as someone who was there at the time. 🙂

    It was precisely because by 1999, a consensus had developed among quantum complexity theorists that NP⊆BQP would be a gigantic surprise/breakthrough, that the quantum complexity community received the Farhi group’s claims to solve NP-complete problems adiabatically with so much skepticism. The scientific tension was then lessened a bit by the discovery of 3SAT (and even 2SAT) instances with exponentially small spectral gap, which caused Farhi’s side to concede that the adiabatic algorithm can indeed take exponential time in the worst case. After that, the debate became about the adiabatic algorithm’s performance “in practice,” and the comparison to classical heuristics, and that’s where it remains all those years later, though still with echoes of the argument we had in 1999-2002.

  105. Scott Says:

    Alex Meiburg #98: Are we talking only about the decision classes, or also about the promise and sampling/relational variants of these classes?

    For sampling/relational and even for promise problems, I’m very confident that BQP and NP are incomparable (say at least 95%).

    For decision problems, I’d put at least 70% on them being incomparable, with most of the remainder on the possibility of P≠BQP⊂NP∩coNP, and a little on P=BQP≠NP, and the remaining possibilities “too unlikely to meter.” 🙂

    BQP=NP is barely worth talking about since it involves two inclusions (NP⊆BQP and BQP⊆NP) that would both be huge surprises for different reasons. It’s like asking what happens if extraterrestrials coincidentally arrive during the Second Coming of Christ. 😀

  106. Joshua Zelinsky Says:

    Alex Meiburg, #98,

    Not Scott, but I want to note that option 5 should be really, really unlikely, since it would also imply that NP = coNP. A situation where P != NP but NP = coNP is a really pathological one even before anything related to BQP enters into it.
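
    Spelled out, using the fact that BQP is closed under complement (just flip the accepting measurement):

    $$ \mathsf{NP} = \mathsf{BQP} \;\Longrightarrow\; \mathsf{coNP} = \mathsf{coBQP} = \mathsf{BQP} = \mathsf{NP}. $$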

  107. Ilio Says:

    Nick #89,

    >If you think they might have a shot, but that they’re likely miscalculating which type of being they are about to bring down, then there’s reason to be worried and to call for concrete measures to stop people from trying to figure out how to summon demons most efficiently.

    That was a very good point, except they already summoned one documented demon (the victim’s name is Pierre). I see no reason to think they will call up demons much more dangerous than, say, the rise of the antivax movement, but I still agree with the rest of your sentence. Preferably after more pressing matters, but it’s always good to keep at least some resources for a few less likely existential scenarios.

    Scott #105: LOL!

  108. Friar Tuck Says:


    Just popping in to point out that the so-called “Five Worlds,” or whatever it is, of computational complexity is based on a misunderstanding/logical error. The worlds, as described, do not correspond to mutually exclusive mathematical conjectures. For example, it is perfectly conceivable both that P = NP, and also that the best classical algorithm for circuit SAT, say, runs in t^100,000 time. While this would put the NP-complete problem in P, it would make no practical difference with such a high power. Encryption would still work. Why can’t computer scientists such as yourself understand this? Why can’t they grasp that polynomial asymptotic complexity doesn’t mean a problem is easy to solve in any practical sense?

  109. Scott Says:

    Friar Tuck #108: Have you considered the possibility that computer scientists all know that extremely well? One could easily redefine Impagliazzo’s five worlds in terms of the loose criterion of “efficient solvability in practice,” rather than the formal-but-not-quite-right criterion of “solvability in polynomial time.” But it’s tiresome to have to make the distinction every single time, so we normally just treat it as implicitly understood.

  110. Friar Tuck Says:


    I’m extremely frustrated right now. So you guys have been literally lying the entire time—to the public, for example, when saying “P vs NP asks whether a problem that’s easily verified is easily solved,” which is a gross misrepresentation, worse, dare I say, than the Michio Kakus who promote quantum computing as solving every problem simultaneously?

    It’s SO annoying trying to get past the bullshit and figure out what you guys are trying to say. This is me: 😃

    Easy =/= polynomial asymptotic time. There is a huge disconnect between these philosophizings about asymptotic performance and the sort of shit you’d worry about if you had an actual tech job. Sure, 1.000001^n might underperform n^10000000 in the technical limit… How do you define “efficient solvability in practice”? How is that done? I don’t understand. Maybe it will turn out that P=NP, but E =/= H, where E are “easy” and H are “hard” problems, easiness and hardness judged in an as-yet-unknown subtle way that is more sophisticated than just looking at asymptotic performance. I wish you people wouldn’t treat this all as settled before moving to quantum. Dismissing P=NP takes serious confidence, overconfidence even. Maybe you haven’t found a polynomial-time algorithm for an NP-complete problem yet, despite the vast network of NP-completes, because nobody has thought outside the box and discovered the right algorithm, which might be some super complex, esoteric mathematical thing that none of you have ever even conceived of before. What is the chance of that?

  111. Prasanna Says:

    In my view, the current discussions massively underplay the possibility that nations can invest heavily (much more than OpenAI/Microsoft/Google combined) to create an AI that is aligned only to their interests. So the current geopolitical situation should be an important part of any discourse. Who can secretly create a much more powerful AI than GPT-4 and unleash it without anyone even perceivably noticing (as innocent as a balloon?) should be central to public discussion, and not left to government agencies to deal with alone (public opinion being a central pillar of democratic governance). Imagine Hitler or Japan developing atomic weapons first; we would be living in an entirely different world today. Hindsight is not perfect either; why Japan was penalized can never be justified by today’s standards, but in the context of that era it was relevant. So the question that should be foremost on everyone’s mind is: how do you prevent autocratic regimes from getting ahead in this race? It is naive to worry that a Manhattan Project can be created by a private corporation while ignoring the bigger risks from nation states.
    It is also relevant that the covid discussions in this thread have a bearing here: note that the WHO was just parroting what the country was reporting about human-to-human transmission. It had no independent mechanism for finding evidence or even running lab experiments to verify, and we don’t see that even 3 years after the pandemic, with all the talk of WHO reforms etc.
    Hope the democratic nation states wake up to these realities and put robust preventive measures in place rather than just being reactive.

  112. Delicious Irony Says:

    @Friar Tuck #108

    Quite apart from Scott #109’s objection – can you, in fact, name an example of practical importance where people opt for a super-polynomial algorithm instead of a t^100000 one that’s available? Or even t^100? No? Interesting, that.

    This criticism of CS, “those guys in their ivory tower, they never think of the practical constants!”, is deliciously ironic. As a matter of sheer practicality, this counter-point doesn’t actually come up – and one suspects this is not an accident (this may have to do with some of the points Scott was making in the old “10 reasons to believe” post).
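
    To make the scale concrete, here is where an exponential with a tiny base actually overtakes a polynomial with a huge exponent (the pair from #110), sketched in Python; the doubling search is just illustrative:

```python
import math

# Where does 1.000001^n finally overtake n^10000000?
# Compare logarithms to avoid overflow: n*ln(1.000001) vs 10^7 * ln(n).
c_log = math.log(1.000001)
k = 10_000_000

n = 2
while n * c_log <= k * math.log(n):
    n *= 2  # stop at the first power of two past the crossover

print(f"the exponential wins somewhere below n = 2^{n.bit_length() - 1} ≈ {n:.2e}")
```

    Up to that point, i.e. for every input size anyone will ever actually run, the “slow” exponential is the smaller of the two. Yet, as noted above, polynomials with such exponents never seem to arise for natural problems in the first place.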

  113. Scott Says:

    Friar Tuck #110: You’re hereby banned from this blog—not for reminding everyone about slow-in-practice polynomial-time algorithms and fast-in-practice exponential-time ones, which is 100% fine, but for doing so with a combination of accusatory hostility and ignorance that I no longer have time for in my life.

    Having been burned before, I now try to bend over backwards every single time to say, e.g., “if P=NP and the algorithm is efficient in practice…” rather than just “if P=NP…” when discussing real-world implications, but I might occasionally forget, because (as I said) it’s tiresome. Life would be much easier if people could just take the extra clause as implied.

  114. Ted Says:

    Joshua Zelinsky #106: So do you believe that, conditioned on the (very unlikely) event that NP = coNP, P = NP is more likely than P != NP? Or, put another way, if someone managed to prove that NP = coNP, would the complexity theory community switch from strongly believing that P != NP to believing that P probably does equal NP?

  115. fred Says:

    Scott #113

    “I no longer have time for in my life. […] I might occasionally forget, because (as I said) it’s tiresome. Life would be much easier if people could just take the extra clause as implied.”

    I wonder how much of this is a result of

    1) when one reaches one’s 40s, 50s and beyond, there’s an increased sense of urgency and priority. And, as you get older, the proportion of people who are younger than you keeps steadily increasing.

    2) there’s a constant influx of new young people, most of them going through the same stages of intense curiosity and confusion about classic open questions that prior generations went through. Who here hasn’t at some point looked at P?=NP with the assumption that there could be some algorithm everyone else who came before has overlooked?

    And then when 1) and 2) meet, there’s a clash that gets steadily worse for the aging person.
    Hopefully the advent of AI will help solve this, assuming AI’s sense of immortality grants them infinite patience!

  116. JimV Says:

    A truly intelligent AI aligned to Russia’s interests would have told Putin not to invade Ukraine, but to focus on making the average Russian happy and productive, with some suggestions for that; to make Ukraine want to rejoin Russia for the better welfare of its citizens, by setting a great example of the benefits of Russian citizenship. Forcing the AI to seek Putin’s goals using Putin’s preferred methods and assumptions would limit it and degrade its performance.

    The desire to compete with others for power over them is not a necessary part of intelligence. Intelligence will tend to find cooperation more fruitful than competition. I can’t think of a single technical problem I worked on in 38 years of engineering design, development, and field service which would have been easier for me to solve if I ruled the world. (I was tempted to describe some of them, but the dispassionate AI part of my brain suggested that would be counterproductive.) I think the same is true for medical doctors and IT workers and most professions which use a lot of intelligence. Maybe not for lawyers, though.

    (As a fine for too much commenting without saying anything new, I will donate another $150 to the Ukrainian National Bank.)

  117. Cristóbal Camarero Says:

    And for more annoyance, we could have the situation of P != NP but an efficient probabilistic polynomial-time algorithm for NP; that is, P != BPP = NP. Truly, there are too many possibilities to state every time. It is also true that P=NP is a precise statement, and using it as a vague description can mislead and lead to nitpicking. Some explicitly vague expression could help, but I cannot think of anything convincing.

  118. Joshua Zelinsky Says:

    @Ted #114, I can’t speak for anyone else, but yes, that would be my reaction.

  119. Alex Meiburg Says:

    @Ted #101: Yes, happy to explain. I think @Joshua Zelinsky #106 gave a big part of it. (Joshua: I did assign it a “very low” probability of 0.001%; I generally calibrate myself as ready for surprises, and distrusting of very popular conjectures, such that I think there are very few unproven Boolean mathematical statements to which I would assign a lower probability.)

    But more broadly, here’s my “gut feeling” about the matter. NP is, extremely loosely, about “searching for exponentially sparse signals”. That is, finding the 1’s in some function that is 0’s almost everywhere. BQP is then about “Fourier transforms of exponentially long signals”, again, very loosely. Shor’s algorithm, the QFT, and Forrelation are examples here.

    It would be surprising if a year from now there was a convincing proof that P=NP. I would expect the algorithm would need to be some highly nontrivial mathematics, because nothing simple seems to work — probably appealing to some complicated classification or structure theorems or something. And somehow this powerful piece of math lets us repeatedly decompose 3SAT in some nontrivial way and solve it in polynomial time.

    A similar thing goes for P=BQP. Surprising, but conceivable: I can imagine some powerful math letting us find a clever way to break down quantum circuits and Fourier transforms and solve it all quickly.

    But P⊊BQP=NP would imply that somehow all of the weird “truly BQP-quantum-Fourier” things and all of the “truly NP-search-hardcore” things are the same. They’re both hard, but the things that make them hard seem to have nothing to do with each other. Each one eventually having some clever trick to make them easy seems possible, but some clever trick to make them equal would be bizarre, to me. I guess it would be sort of like someone proving that the sum-of-square-roots problem is actually GI-complete without putting it in P: both are problems that might turn out easy, but probably not hard in the same way.

    Joshua Zelinsky’s point that BQP is closed under complementation, while NP isn’t, is I think one manifestation of the fact that they just behave too differently. (Informally: it’s easy to flip the sign on a Fourier transform, but exchanging 1’s and 0’s in a sparse signal gives you something very non-sparse.)

    Options 3 and 6 in my list (P⊊BQP⊊NP and P⊊NP⊊BQP, to save you scrolling up) have low probabilities as well, because they require similar weird things about one class encompassing the other without total collapse. I give a slightly higher probability to Option 3, because I can imagine some argument that goes like… here’s some math that gets us very close to solving BQP, so we get P=BQP in spirit, but there are some difficult parts that we cannot resolve, but NP allows us to have proof strings to check the last bit.

  120. Ted Says:

    Alex Meiburg #119: Thanks so much for the excellent explanation – that’s very illuminating.

    It seems like the biggest differences between your probability distribution and Scott’s (in #105) are that (a) Scott puts a significantly higher probability than you do on the proposition “BQP⊊NP” (Scott puts that at ~30% (for decision problems) and you put it at 1.1%), and (b) you think that P=NP would be the most likely explanation for that proposition being true, while Scott does not. That is, you think that the proposition “NP does not contain BQP” is much more likely than the proposition “P != NP”, while Scott thinks the reverse.

    It’s interesting to me that in your probability distribution, you assign almost twice the probability to the propositions P=NP and P=BQP both being true as to exactly one of them being true (which can’t happen if those probabilities are small and approximately independent). Why would one of those propositions being true so strongly increase your credence that the other one was also true?
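
    The bookkeeping here can be checked mechanically against the numbers in #98 (a quick Python sketch; ε dropped from the first entry):

```python
# Alex Meiburg's distribution from comment #98, as fractions. Index i holds
# option i+1: 1 incomparable, 2 P=BQP=NP, 3 P⊊BQP⊊NP, 4 P=BQP⊊NP,
# 5 P⊊BQP=NP, 6 P⊊NP⊊BQP, 7 P=NP⊊BQP, 8 none of the above.
p = [0.97, 0.02, 0.001, 0.01, 0.00001, 0.001, 0.00001, 0.0001]

bqp_in_np   = p[2] + p[3]   # BQP ⊊ NP holds in options 3 and 4: 1.1%
both        = p[1]          # P=NP and P=BQP together: only option 2
exactly_one = p[3] + p[6]   # P=BQP alone (option 4) plus P=NP alone (option 7)

print(f"BQP⊊NP: {bqp_in_np:.4%}, both: {both:.2%}, exactly one: {exactly_one:.3%}")
```

    The “both” case lives entirely in option 2, while “exactly one” is options 4 and 7, hence the near-2:1 ratio.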

  121. Dan Miller Says:

    Jack #66: I worked at a company quite a few years ago doing Boston Dynamics type stuff — bipedal walking, and human hand analogue.

    Suffice it to say, deep learning algorithms, coupled with the orders-of-magnitude increase in compute power and size of neural nets, pretty much guarantee exponential improvement in robotics that will rival what we’ve just seen with large language models.

    The reason text and graphics got the bump first is primarily economic rather than fundamental. Robotics is insanely hard at a physical level; you’re constantly dealing with breakdowns and changes that make it very hard to iterate and improve the software.

    From an economic perspective, this technology is never going to be good enough until it’s almost perfect, as we’ve seen with self-driving cars. Not to mention it’s incredibly expensive and capital intensive in a way that software is not.

    So it’s no surprise that a text-based breakthrough, which can be immediately monetized through search engines and such on the internet, was the first thing to take advantage of the new paradigm of massive compute, huge models, and enormous quantities of data. But once robotics passes a certain threshold (which I believe is going to come in the next 10 years) and is integrated with the language capabilities we’re seeing now, well, as they say, Bob’s your uncle.

  122. Mitchell Porter Says:

    Speaking of classical vs quantum complexity classes… There was a paper yesterday, claiming to prove that a polynomial hierarchy for distributed computing does not collapse. I’m wondering if one can get anywhere by comparing this to the QMA-completeness of the 2-local Hamiltonian problem.

  123. Douglas Knight Says:

    If you are willing to try Stanislaw Lem again, try the Cyberiad. Specifically for the humor.
