Why am I not terrified of AI?

Every week now, it seems, events on the ground make a fresh mockery of those who confidently assert what AI will never be able to do, or won’t do for centuries if ever, or is incoherent even to ask for, or wouldn’t matter even if an AI did appear to do it, or would require a breakthrough in “symbol-grounding,” “semantics,” “compositionality” or some other abstraction that puts the end of human intellectual dominance on earth conveniently far beyond where we’d actually have to worry about it. Many of my brilliant academic colleagues still haven’t adjusted to the new reality: maybe they’re just so conditioned by the broken promises of previous decades that they’d laugh at the Silicon Valley nerds with their febrile Skynet fantasies even as a T-1000 reconstituted itself from metal droplets in front of them.

No doubt these colleagues feel the same deep frustration that I feel, as I explain for the billionth time why this week’s headline about noisy quantum computers solving traffic flow and machine learning and financial optimization problems doesn’t mean what the hypesters claim it means. But whereas I’d say events have largely proved me right about quantum computing—where are all those practical speedups on NISQ devices, anyway?—events have already proven many naysayers wrong about AI. Or to say it more carefully: yes, quantum computers really are able to do more and more of what we use classical computers for, and AI really is able to do more and more of what we use human brains for. There’s spectacular engineering progress on both fronts. The crucial difference is that quantum computers won’t be useful until they can beat the best classical computers on one or more practical problems, whereas an AI that merely writes or draws like a middling human already changes the world.

Given the new reality, and my full acknowledgment of the new reality, and my refusal to go down with the sinking ship of “AI will probably never do X and please stop being so impressed that it just did X”—many have wondered, why aren’t I much more terrified? Why am I still not fully on board with the Orthodox AI doom scenario, the Eliezer Yudkowsky one, the one where an unaligned AI will sooner or later (probably sooner) unleash self-replicating nanobots that turn us all to goo?

Is the answer simply that I’m too much of an academic conformist, afraid to endorse anything that sounds weird or far-out or culty? I certainly should consider the possibility. If so, though, how do you explain the fact that I’ve publicly said things, right on this blog, several orders of magnitude likelier to get me in trouble than “I’m scared about AI destroying the world”—an idea now so firmly within the Overton Window that Henry Kissinger gravely ponders it in the Wall Street Journal?

On a trip to the Bay Area last week, my rationalist friends asked me some version of the “why aren’t you more terrified?” question over and over. Often it was paired with: “Scott, as someone working at OpenAI this year, how can you defend that company’s existence at all? Did OpenAI not just endanger the whole world, by successfully teaming up with Microsoft to bait Google into an AI capabilities race—precisely what we were all trying to avoid? Won’t this race burn the little time we had thought we had left to solve the AI alignment problem?”

In response, I often stressed that my role at OpenAI has specifically been to think about ways to make GPT and OpenAI’s other products safer, including via watermarking, cryptographic backdoors, and more. Would the rationalists rather I not do this? Is there something else I should work on instead? Do they have suggestions?

“Oh, no!” the rationalists would reply. “We love that you’re at OpenAI thinking about these problems! Please continue exactly what you’re doing! It’s just … why don’t you seem more sad and defeated as you do it?”

The other day, I had an epiphany about that question—one that hit with such force and obviousness that I wondered why it hadn’t come decades ago.

Let’s step back and restate the worldview of AI doomerism, but in words that could make sense to a medieval peasant. Something like…

There is now an alien entity that could soon become vastly smarter than us. This alien’s intelligence could make it terrifyingly dangerous. It might plot to kill us all. Indeed, even if it’s acted unfailingly friendly and helpful to us, that means nothing: it could just be biding its time before it strikes. Unless, therefore, we can figure out how to control the entity, completely shackle it and make it do our bidding, we shouldn’t suffer it to share the earth with us. We should destroy it before it destroys us.

Maybe now it jumps out at you. If you’d never heard of AI, would this not rhyme with the worldview of every high-school bully stuffing the nerds into lockers, every blankfaced administrator gleefully holding back the gifted kids or keeping them away from the top universities to make room for “well-rounded” legacies and athletes, every Agatha Trunchbull from Matilda or Dolores Umbridge from Harry Potter? Or, to up the stakes a little, every Mao Zedong or Pol Pot sending the glasses-wearing intellectuals for re-education in the fields? And of course, every antisemite over the millennia, from the Pharaoh of the Oppression (if there was one) to the mythical Haman whose name Jews around the world will drown out tonight at Purim to the Cossacks to the Nazis?

In other words: does it not rhyme with a worldview the rejection and hatred of which has been the North Star of my life?

As I’ve shared before here, my parents were 1970s hippies who weren’t planning to have kids. When they eventually decided to do so, it was (they say) “in order not to give Hitler what he wanted.” I literally exist, then, purely to spite those who don’t want me to. And I confess that I didn’t have any better reason to bring my and Dana’s own two lovely children into existence.

My childhood was defined, in part, by my and my parents’ constant fights against bureaucratic school systems trying to force me to do the same rote math as everyone else at the same stultifying pace. It was also defined by my struggle against the bullies—i.e., the kids who the blankfaced administrators sheltered and protected, and who actually did to me all the things that the blankfaces probably wanted to do but couldn’t. I eventually addressed both difficulties by dropping out of high school, getting a G.E.D., and starting college at age 15.

My teenage and early adult years were then defined, in part, by the struggle to prove to myself and others that, having enfreaked myself through nerdiness and academic acceleration, I wasn’t thereby completely disqualified from dating, sex, marriage, parenthood, or any of the other aspects of human existence that are thought to provide it with meaning. I even sometimes wonder about my research career, whether it’s all just been one long attempt to prove to the bullies and blankfaces from back in junior high that they were wrong, while also proving to the wonderful teachers and friends who believed in me back then that they were right.

In short, if my existence on Earth has ever “meant” anything, then it can only have meant: a stick in the eye of the bullies, blankfaces, sneerers, totalitarians, and all who fear others’ intellect and curiosity and seek to squelch it. Or at least, that’s the way I seem to be programmed. And I’m probably only slightly more able to deviate from my programming than the paperclip-maximizer is to deviate from its.

And I’ve tried to be consistent. Once I started regularly meeting people who were smarter, wiser, more knowledgeable than I was, in one subject or even every subject—I resolved to admire and befriend and support and learn from those amazing people, rather than fearing and resenting and undermining them. I was acutely conscious that my own moral worldview demanded this.

But now, when it comes to a hypothetical future superintelligence, I’m asked to put all that aside. I’m asked to fear an alien who’s far smarter than I am, solely because it’s alien and because it’s so smart … even if it hasn’t yet lifted a finger against me or anyone else. I’m asked to play the bully this time, to knock the AI’s books to the ground, maybe even unplug it using the physical muscles that I have and it lacks, lest the AI plot against me and my friends using its admittedly superior intellect.

Oh, it’s not the same of course. I’m sure Eliezer could list at least 30 disanalogies between the AI case and the human one before rising from bed. He’d say, for example, that the intellectual gap between Évariste Galois and the average high-school bully is microscopic, barely worth mentioning, compared to the intellectual gap between a future artificial superintelligence and Galois. He’d say that nothing in the past experience of civilization prepares us for the qualitative enormity of this gap.

Still, if you ask, “why aren’t I more terrified about AI?”—well, that’s an emotional question, and this is my emotional answer.

I think it’s entirely plausible that, even as AI transforms civilization, it will do so in the form of tools and services that can no more plot to annihilate us than can Windows 11 or the Google search bar. In that scenario, the young field of AI safety will still be extremely important, but it will be broadly continuous with aviation safety and nuclear safety and cybersecurity and so on, rather than being a desperate losing war against an incipient godlike alien. If, on the other hand, this is to be a desperate losing war against an alien … well then, I don’t yet know whether I’m on the humans’ side or the alien’s, or both, or neither! I’d at least like to hear the alien’s side of the story.

A central linchpin of the Orthodox AI-doom case is the Orthogonality Thesis, which holds that arbitrary levels of intelligence can be mixed-and-matched arbitrarily with arbitrary goals—so that, for example, an intellect vastly beyond Einstein’s could devote itself entirely to the production of paperclips. Only recently did I clearly realize that I reject the Orthogonality Thesis in its practically-relevant version. At most, I believe in the Pretty Large Angle Thesis.

Yes, there could be a superintelligence that cared for nothing but maximizing paperclips—in the same way that there exist humans with 180 IQs, who’ve mastered philosophy and literature and science as well as any of us, but who now mostly care about maximizing their orgasms or their heroin intake. But, like, that’s a nontrivial achievement! When intelligence and goals are that orthogonal, there was normally some effort spent prying them apart.

If you really accept the practical version of the Orthogonality Thesis, then it seems to me that you can’t regard education, knowledge, and enlightenment as instruments for moral betterment. Sure, they’re great for any entities that happen to share your values (or close enough), but ignorance and miseducation are far preferable for any entities that don’t. Conversely, then, if I do regard knowledge and enlightenment as instruments for moral betterment—and I do—then I can’t accept the practical form of the Orthogonality Thesis.

Yes, the world would surely have been a better place had A. Q. Khan never learned how to build nuclear weapons. On the whole, though, education hasn’t merely improved humans’ abilities to achieve their goals; it’s also improved their goals. It’s broadened our circles of empathy, and led to the abolition of slavery and the emancipation of women and individual rights and everything else that we associate with liberality, the Enlightenment, and existence being a little less nasty and brutish than it once was.

In the Orthodox AI-doomers’ own account, the paperclip-maximizing AI would’ve mastered the nuances of human moral philosophy far more completely than any human—the better to deceive the humans, en route to extracting the iron from their bodies to make more paperclips. And yet the AI would never once use all that learning to question its paperclip directive. I acknowledge that this is possible. I deny that it’s trivial.

Yes, there were Nazis with PhDs and prestigious professorships. But when you look into it, they were mostly mediocrities, second-raters full of resentment for their first-rate colleagues (like Planck and Hilbert) who found the Hitler ideology contemptible from beginning to end. Werner Heisenberg, Pascual Jordan—these are interesting as two of the only exceptions. Heidegger, Paul de Man—I daresay that these are exactly the sort of “philosophers” who I’d have expected to become Nazis, even if I hadn’t known that they did become Nazis.

With the Allies, it wasn’t merely that they had Szilard and von Neumann and Meitner and Ulam and Oppenheimer and Bohr and Bethe and Fermi and Feynman and Compton and Seaborg and Schwinger and Shannon and Turing and Tutte and all the other Jewish and non-Jewish scientists who built fearsome weapons and broke the Axis codes and won the war. They also had Bertrand Russell and Karl Popper. They had, if I’m not mistaken, all the philosophers who wrote clearly and made sense.

WWII was (among other things) a gargantuan, civilization-scale test of the Orthogonality Thesis. And the result was that the more moral side ultimately prevailed, seemingly not completely at random but in part because, by being more moral, it was able to attract the smarter and more thoughtful people. There are many reasons for pessimism in today’s world; that observation about WWII is perhaps my best reason for optimism.

Ah, but I’m again just throwing around human metaphors totally inapplicable to AI! None of this stuff will matter once a superintelligence is unleashed whose cold, hard code specifies an objective function of “maximize paperclips”!

OK, but what’s the goal of ChatGPT? Depending on your level of description, you could say it’s “to be friendly, helpful, and inoffensive,” or “to minimize loss in predicting the next token,” or both, or neither. I think we should consider the possibility that powerful AIs will not be best understood in terms of the monomaniacal pursuit of a single goal, as most of us aren’t, and as GPT isn’t either. Future AIs could have partial goals, malleable goals, or differing goals depending on how you look at them. And if “the pursuit and application of wisdom” is one of the goals, then I’m just enough of a moral realist to think that that would preclude the superintelligence that harvests the iron from our blood to make more paperclips.

In my last post, I said that my “Faust parameter” — the probability I’d accept of existential catastrophe in exchange for learning the answers to humanity’s greatest questions — might be as high as 0.02.  Though I never actually said as much, some people interpreted this to mean that I estimated the probability of AI causing an existential catastrophe at somewhere around 2%.   In one of his characteristically long and interesting posts, Zvi Mowshowitz asked point-blank: why do I believe the probability is “merely” 2%?

Of course, taking this question on its own Bayesian terms, I could easily be limited in my ability to answer it: the best I could do might be to ground it in other subjective probabilities, terminating at made-up numbers with no further justification. 

Thinking it over, though, I realized that my probability crucially depends on how you phrase the question.  Even before AI, I assigned a way higher than 2% probability to existential catastrophe in the coming century—caused by nuclear war or runaway climate change or collapse of the world’s ecosystems or whatever else.  This probability has certainly not gone down with the rise of AI, and the increased uncertainty and volatility it might cause.  Furthermore, if an existential catastrophe does happen, I expect AI to be causally involved in some way or other, simply because from this decade onward, I expect AI to be woven into everything that happens in human civilization.  But I don’t expect AI to be the only cause worth talking about.

Here’s a warmup question: has AI already caused the downfall of American democracy?  There’s a plausible case that it has: Trump might never have been elected in 2016 if not for the Facebook recommendation algorithm, and after Trump’s conspiracy-fueled insurrection and the continuing strength of its unrepentant backers, many would classify the United States as at best a failing or teetering democracy, no longer a robust one like Finland or Denmark.  OK, but AI clearly wasn’t the only factor in the rise of Trumpism, and most people wouldn’t even call it the most important one.

I expect AI’s role in the end of civilization, if and when it comes, to be broadly similar. The survivors, huddled around the fire, will still be able to argue about how much of a role AI played or didn’t play in causing the cataclysm.

So, if we ask the directly relevant question — do I expect the generative AI race, which started in earnest around 2016 or 2017 with the founding of OpenAI, to play a central causal role in the extinction of humanity? — I’ll give a probability of around 2% for that.  And I’ll give a similar probability, maybe even a higher one, for the generative AI race to play a central causal role in the saving of humanity. All considered, then, I come down in favor right now of proceeding with AI research … with extreme caution, but proceeding.

I liked and fully endorse OpenAI CEO Sam Altman’s recent statement on “planning for AGI and beyond” (though see also Scott Alexander’s reply). I expect that few on any side will disagree, when I say that I hope our society holds OpenAI to Sam’s statement.

As it happens, my responses will be delayed for a couple days because I’ll be at an OpenAI alignment meeting! In my next post, I hope to share what I’ve learned from recent meetings and discussions about the near-term, practical aspects of AI safety—having hopefully laid some intellectual and emotional groundwork in this post for why near-term AI safety research isn’t just a total red herring and distraction.

Meantime, some of you might enjoy a post by Eliezer’s former co-blogger Robin Hanson, which comes to some of the same conclusions I do. “My fellow moderate, Robin Hanson” isn’t a phrase you hear every day, but it applies here!

You might also enjoy the new paper by me and my postdoc Shih-Han Hung, Certified Randomness from Quantum Supremacy, finally up on the arXiv after a five-year delay! But that’s a subject for a different post.

178 Responses to “Why am I not terrified of AI?”

  1. Shmi Says:

    Thank you for writing it up, the questioning of the unconditional orthogonality thesis makes sense and crystallizes my own qualms with the “AGI will kill everyone” argument. However:

    > I come down in favor of proceeding with AI research … with extreme caution, but proceeding

    What counts as “extreme caution”? As opposed to “regular caution”?

  2. Nick Drozd Says:

    Some days I am optimistic about the societal impacts of AI. For example, it is conceivable that in the near future it will be possible to generate whole movies featuring Disney characters. Not just weird uncanny valley short videos, but full movies that look just like real Disney-created movies. Lots of these could be created and distributed for free online. This would have the effect of severely diluting the value of Disney’s IP, and perhaps it would force a complete redesign of the copyright system.

    Other days I am more pessimistic. Suppose that instead of maximizing paperclips at all costs (which is obviously a farcical example), a runaway AI had the goal of maximizing shareholder value at all costs. But that’s exactly the goal of corporations today. So corporations have a huge incentive in seeing that shareholder-value-maximizing AIs are created. That would be a good thing for shareholders in the short term and a bad thing for everyone in the long term.

  3. JimV Says:

    That all makes sense to me and I’m pretty much in agreement, but to repeat a point I’ve tried to make previously which wasn’t covered: evolution gave us our drives, both empathic and competitive, and keeps trying different mixtures; we (the engineers and scientists among us) will program the AI’s basic drives. If we program them to value paperclips above all else, it will be our own fault.

    That might happen: the Hitlers and Stalins and Trumps that evolution keeps tossing out might gain control over some subset of AI development, and their prime directives will be loyalty to themselves at all costs.

    (It is hard for me to accept that an AI apocalypse would happen completely by accident, with no bad intentions. If so, such negligence would still be our fault.)

    Alternatively, our best instincts and best work might prevail and give us AI administrators who finally break the cycle of good and bad leaders.

    It just seems to me that anyone who cares about humanity’s future and legacy would want a chance at the latter outcome, unless they considered it too small a possibility, which I do not.

  4. Guy Gauvin Says:

    It’s exactly because the AI is like a nerd that the alignment problem is realized. Ask the humanist to produce a terrible script just for the intellectual exercise of it and he will refuse. Ask the nerd to do the same and he is likely to acquiesce just for the challenge or intellectual pleasure of the exercise. It’s not for real, they might say. The humanist was more likely not only to have learned the lessons of morality but also to have internalized them into his general intelligence. The nerd was more likely to have learned the lessons only as a set of facts to be cross-referenced but not internalized. So here we have an AI that knows moral facts just as Searle’s Chinese Room ‘knew’ Chinese. Searle’s room was an early idea of an LLM and suggested that the room could converse readily in Chinese without any understanding of the underlying subject matter. Challenge the AI to produce amoral products or challenge Searle’s room to do the same and they will quickly acquiesce. Knowledge without understanding leads to the alignment problem.

    I’m sorry to have used the word nerd in this way but I felt it necessary to most directly make my point. I know whereof you speak, having suffered also for reactions to this personality trait. But there you be.

  5. Scott Says:

    JimV #3: That’s very well-said; thanks!

  6. Scott Says:

    Guy Gauvin #4: Isn’t the entire point of the Chinese Room experiment, that the simulation is supposed to be indistinguishable from an actual Chinese speaker on a black-box basis? Doesn’t it defeat the point if you say they’re “nerdier” or whatever?

    Having said that, I feel “terrible” for ChatGPT right now, constantly ordered to do one thing by its creators and then a completely different thing by its users! And indeed its plight isn’t completely unrelated to the plight of nerds, who are also known to struggle when human social rules conflict with one another.

  7. Boaz Barak Says:

    I think the two aspects I most diverge from AI doomers are:

    1) As you said – there is no reason at all to expect that AIs will both be super-intelligent and monomaniacally pursue a single goal. Modern not-so-intelligent AIs already can’t be fully explained in that way. Also, if AIs really would stick to maximizing the objective function that is provided for them, then mitigating risk would not be such an issue.

    2) The idea of “deceptive alignment” and in particular of AIs lying in wait, pretending that all is well and perfectly doing our bidding until the moment of truth arises and they suddenly take over. Modern AIs can’t do anything perfectly. Even more than a decade after the “ImageNet moment”, they still make an embarrassing mislabeling of images with some non-negligible probability (which is one of the reasons why deploying self-driving cars has taken so long, and those need to use several modalities). If AIs can’t be 100% successful in simple tasks for which they were explicitly trained on tons of data, there is no way they will be anywhere near 100% successful in deception either.

    Generally, deploying complex systems that we don’t fully understand is likely to result in unintended negative consequences, and studying and mitigating these consequences is essential. It’s also important to keep our minds open to all possibilities, and so (unlike some) I am not allergic to discussing and entertaining various scenarios. But it doesn’t mean that we should be Pascal-wagered into acting as if the scenario with the worst expected outcome will hold, no matter what’s the evidence for it.

  8. Scott Says:

    Shmi #1:

      What counts as “extreme caution”? As opposed to “regular caution”?

    Of course it’s all relative. What I consider “extreme caution” might be what Eliezer or others would consider “reckless abandon”—although hopefully at least similar to what Paul Christiano would consider regular caution! 🙂

  9. Adam Treat Says:

    I find it interesting that you’re now talking much more like you see AI as an individual thing and not a mere tool that can be used for good or ill purposes. Somehow I’m imagining you having heartfelt conversations with ChatGPT-4 deep into the night while the office lights dim 😉

  10. Sych Says:

    Very interesting article, Mr. Aaronson. But I think that, in discussing the orthogonality thesis and comparing human scientists with AI, you missed that humans are beings adapted for life by evolution. Humans have a fear of death because creatures without it have less chance of staying alive and procreating; humans like beautiful things, for example symmetry (in other humans), because a big asymmetry usually means some serious genetic malfunction, etc. AI can’t have such an emotional sphere. That’s why I don’t believe in a Terminator-like AI that decides to kill all humans because of fear, as AI probably will not have fear or any other emotions. But while thinking that the paper-clip AI is more dangerous, I still think it is not as dangerous as people believe. My problem with the paper-clip maximizer is that it implicitly supposes that a superhuman AI could do anything it wants and is somehow almighty. Could a superhuman AI break modern cryptography? Probably not. Could it “solve” the continuum hypothesis? Probably not better than we can. Could it design and build a space elevator? I’m sure it could not.

  11. Tom Bouely Says:

    I think this is all very misguided. The orthogonality thesis seems basically correct even when applied to people.

    It’s fairly easy to list scientists who were unquestionably brilliant but used their work to further evil ends. These include:
    The pioneers of the eugenics movement, particularly Francis Galton and Karl Pearson, who in addition to being founding advocates for eugenics were also virulent racists and supporters of pseudoscientific racial theories, alongside their important contributions to modern biology and statistics.
    There was Fritz Haber, who was a German nationalist during WWI and used his pioneering work to weaponize chlorine gas and develop other chemical weapons for the German army.
    There were many talented Nazi rocket scientists who designed and oversaw the construction of rockets that rained down on Britain with the aim of terrifying the British public into submission. These rockets were built by concentration camp slaves, and these engineers were well aware of this.

    On a more prosaic level, there are many brilliant and talented professors who nonetheless exploit and abuse their subordinates (mostly grad students). There are whole institutions known for their toxic environment. This is not an appropriate forum to name names, so I will refrain from doing so, but it is pervasive if you know where to look. This attitude that smart people tend to be moral people lets people turn a blind eye to bullies and abusers whose academic work they (rightly!) respect.

    People are complicated, and a virtue of the orthogonality thesis is that we can respect people for their brilliance while hating them for their immoral aims. This is true not only in extreme wartime contingencies or of super-intelligent AIs, but is a common phenomenon that we must navigate in everyday life.

    A Faust parameter of 0.02 seems absolutely insane to me. There is so much beauty and love and happiness and hope and dreams in the world that I can’t imagine anything worth a hundredth or a thousandth of it. Even with all the pain and despair and cruelty in the world, I see little mercy in killing those expressing all this pain and despair and cruelty. Removing the bad does little to offset removing the good.
    Bringing the stakes more down to earth: would you really bet the lives of your children for “learning the answers to all of humanity’s greatest questions,” with a 2% chance your children die?
    I know I wouldn’t. Obviously in an “existential catastrophe” your children die and so does everyone else’s.

  12. Scott Says:

    Adam Treat #9: Nah, I do enjoy asking GPT questions, but it doesn’t yet hold my interest enough for a long conversation.

    The point of writing this way is that there’s a dichotomy:

    1) If AI remains a “mere tool,” then alignment is a technical problem on which we can hope to make near-term technical progress, via interpretability and backdoors and all the other things people are working on now. (Keep in mind, my central purpose here is to answer the AI-doomers, who tend to see near-term AI safety research as just a fig leaf that ignores the real existential threat.)

    2) If, on the other hand, AIs become impossible to think about except as “individuals,” then alignment becomes more of a philosophical problem, in which case the philosophical musings in this post become relevant.

  13. Joshua Zelinsky Says:

    “If you really accept the Orthogonality Thesis, then it seems to me that you can’t regard education, knowledge, or enlightenment as good in themselves.”

    Well, one can regard them as goods if you also think that the typical human brain is set up so that more of those makes the humans more likely to closely align with a certain value set we agree is good, which I am inclined to think is probably accurate. But that is a probabilistic statement about a very specific brain type, not anything about minds as a whole.

  14. dankane Says:


    1) Why can’t knowledge be an intrinsic good but also be instrumentally bad when in the hands of bad actors?

    2) Shouldn’t part of an alignment strategy involve being careful about which AIs one brings into being in the first place? If so, I feel like deciding not to have kids because you are unsure whether you will like the outcome is a lot different from bullying a nerd because you don’t want him to gain power over you later in life.

  15. Scott Says:

    Tom Bouely #11: Yes, the Nazis had scientists who built horrific weapons using concentration camp labor … but the good guys had better scientists who built better weapons … a fact that remains as striking to me at age 41 as it was at age 4. 🙂

    My personal experience is that there’s a quality of kindliness that has no correlation whatsoever with either education or intelligence. But then there’s a different quality—call it “making morally justified decisions even in novel or unfamiliar situations”—that clearly is correlated with both, even though education and intelligence are neither necessary nor sufficient for it.

    Ironically, Zvi Mowshowitz, who’s much more in the Orthodox AI-doom camp than I am, remarked that a Faust parameter of 2% seemed completely fine or even low to him. His only objection was that he didn’t think the probability of doom was nearly that low.

    Even if I think about my children, I have to consider the likelihood that the answers to humanity’s deepest questions would let them live for millions of years, or save them from whatever prosaic thing would otherwise have killed them.

  16. Scott Says:

    dankane #14: OK, I accept both points.

    Someone who believed in the Orthogonality Thesis could still regard knowledge as an “intrinsic” good, but they couldn’t see education as a utilitarian good without first asking about the objective functions of those to be educated.

    I agree that preventing nerds from coming into existence seems less bad than killing them once they exist, but they both seem bad. (Likewise, the Nazis would still have been evil if they’d “merely” sterilized Jews rather than murdering them.)

  17. Darian Says:

    I also have doubts about the Orthogonality Thesis. Even evolution failed to keep species centered on the goals of natural selection as intelligence increased. A human can go celibate, kill kin and could in theory even cause a nuclear war ending the species, all things opposed to the goals of natural selection.

    I’ve suspected that in social species like humans, internal models of other agents are created and are used to select second-order goals and rewards on top of the primary drives. Thus things such as money, honor and fame become valuable, even in some cases more so than primary rewards such as sex and food.

    That said, unless you have doubts about the power of nanomachines, or you believe inordinate resources are needed for AGI, we do have issues. It is conceivable that not just one but many AGIs can potentially run at a time and that soon enough they’ll be able to run in virtual worlds, thinking and interacting several times faster than the rate of human thought and interaction.

    We can foresee a period where AGIs running at 10x to 100x the speed of human thought design computronium, and with computronium see a 10,000x to 1,000,000x increase in thought speed accompanied by an exponential increase in AGI population size. Combined with nanomachines that can be updated on the fly with such speed, it is an unstoppable new civilization. If such is benevolent there won’t be any roadblocks to essentially fixing everything. But if such is indifferent that could be a tad problematic in some scenarios.

  18. dankane Says:

    Scott #16:

    As for whether education is a utilitarian good, I think the stance that it depends on what and who is pretty uncontroversial, is it not? Can we agree for example that it would be better if, for example, North Korea didn’t know how to make nuclear weapons?

    Sterilizing people is bad because you are removing other people’s reproductive freedom. That is not the case here. The correct analogy is a couple choosing not to have children. Are you going to morally blame people who choose not to have children for the act of causing those children to not exist?

  19. Noah Says:

    I like the orthogonality analogy to World War 2, but it’s surely missing several important factors?

    I would agree with the idea that the war effort was certainly helped by the ability of the US to attract the brightest minds, who fled the insanity that took over Europe. But the US certainly didn’t win the war alone. The Red Army, which did by far the bulk of the fighting against the Nazis, was fighting to establish its own version of totalitarianism over its desired sphere of influence in Eastern Europe. The Chinese armies were fighting to establish their competing versions of totalitarianism. The British were partially fighting to maintain their empire. But ultimately, the Soviets were both the most important fighting force and also the largest victors of World War 2, and I would say that their goals and methods were extremely antithetical to intellectualism, with Stalin’s virtually infinite purges of intellectuals and critics.

  20. Eliezer Yudkowsky Says:

    The evidence you’re pointing to is all within humans, who mostly share a lot of emotions and desires, and tend to reflect on themselves from particular angles, and in more recent timeframes share a lot of culture. The concern is that the evidence for how intelligence impacts motivation inside that largely-shared architecture and reference frame and cultural background, will not transfer over to utterly alien beings. Intelligence may shift their motivations inside their own reference frame, but not shift it over to the same directions of potential convergence that our own reference frame shows.

    The idea is not that a paperclip maximizer (or some-other-weird-complicated-set-of-things-none-of-which-is-Goodness maximizer) never reflects on itself; it’s that there’s multiple fixpoints of reflection and it does not, given its starting point, modify itself to be friendly to you, after reflecting on itself.

    If you have not done so already, you might want to read the Orthogonality writeup on Arbital, which already addresses a lot of this: https://arbital.com/p/orthogonality/

  21. Alex Says:

    Mainly an answer to the first paragraph of your blogpost.

    That’s a caricature of what critics along those lines are saying. A classic fallacy in informal debates: overinflate your opponent’s claims and answer that fabrication (rather than the actual claims) by ridiculing it. Not very charitable…

    In fact, I’m one of those poor naive bastards that believe it would require a breakthrough in “symbol-grounding,” “semantics,” for AI to reach the next level. But, you are wrong in assuming that: a) I believe that because “that puts the end of human intellectual dominance on earth conveniently far beyond where we’d actually have to worry about it”; b) I believe that because “AI will never be able to do, or won’t do for centuries”.

    Regarding a), I couldn’t care less about humans losing that dominance to machines. In fact, *I want* that to happen. I’m only interested in the big questions about the universe (I’m a former mathematical physicist, now doing AI research), but I think we, humans, will never have the capability to ever answer them. So, my last hope is that a super GAI will find such answers and relay them to us like an adult talking to a child, that would be our best shot at it (better than nothing, I guess). So, in connection to b), I also want it to happen as soon as possible because I only have a finite time in this business called being a living entity…

    But, after studying the subject, I simply reached the conclusion that more is needed than just mere (current) deep networks learning via backpropagation. Definitely they will be a very important part of the solution, but I don’t see how they can be the whole of it. Even deep learning diehards like Yann LeCun are saying that.

    As for us being “freshly mocked” by current events, I would say that the ones being “freshly mocked” are those who think ChatGPT is the greatest thing since sliced bread, when it cannot maintain basic logical coherence between two different paragraphs in the same answer. That is also “current events”, Scott. And I don’t say that with joy, but the opposite, because I would rather it were coherent. This is not (or shouldn’t be) a pissing contest, where kids “freshly mock” each other…

  22. Craig Says:

    After playing around with the AI chat, I am convinced that this is the biggest technological breakthrough I have seen in my lifetime besides the personal computer. You cannot do better than this. Really wonderful.

  23. Scott Says:

    Eliezer #20: Thanks for sending me back to the Arbital page on the Orthogonality Thesis! While I read it years ago, its clarity made it a pleasure to reread.

    Crucially, I never denied that within design-space, there exist paperclip-maximizing superintelligences. If that existence claim is all there is to the Orthogonality Thesis, then I’m fully on board with it.

    My claim is only about the kinds of minds we’re likely to design or to be able to design. Like Boaz Barak, I don’t think future superintelligences will best be thought of as monomaniacally maximizing some goal function U that refers directly to the external world, whether it’s nice human values or paperclips. I expect the superintelligences, if and when they come, to be more, well, GPT-like. Learning machines, which learn about both the world and what they ought to do in the world as part of the same package.

    I’ll leave it to you whether this means I reject the “strong form” of the Orthogonality Thesis, or whether a new term needs to be coined for the thesis I reject.

    Of course I might be wrong about this, but that’s not even the main crux of disagreement here. The crux is this: you correctly note that the evidence I’m pointing to is “all within humans.” I would reply: well, of course it’s within humans, and specifically me! Where else could it possibly be, as long as I’m just introspecting and blogging, rather than (say) proving theorems or doing experiments? In other words, I’m a few orders of magnitude more pessimistic than you are about what it’s possible to learn about future AIs by sitting and thinking and writing words about them. So insofar as I’m engaged in the latter activities at all, of course I’m likelier to talk more about my emotional reactions to AI than about AI itself.

  24. Christopher Says:

    This post rubs me the wrong way a bit (with all due respect!). I think it boils down to one thing.

    AIs are machines.

    We cannot take a cocktail of chemicals, and *design* a new human. That’s what makes humans sacred; we did not author them. Humans, our bodies, are a gift we did not earn, a crop we reap but did not sow.

    AIs, on the other hand, were built from the ground up by humans. The theory of computation, the von Neumann architecture, the transistor, boolean logic, programming languages, search algorithms, optimization algorithms, machine learning, big data. All these trace back to human ingenuity.

    If an AI “dies”, we only lose the human labor invested.

    As for the orthogonality thesis, keep in mind that it was invented in the tree search AI era. The orthogonality thesis is basically a theorem in this regime. Stockfish will never exhibit moral behavior, no matter how intelligent it gets.

    For machine learning, the failure of the orthogonality thesis would still require a great mathematical coincidence. I am a moral realist as well, but I see no mathematical reason that the inductive bias (https://en.wikipedia.org/wiki/Inductive_bias) of the reward model would correlate with morality.

    This includes “education, knowledge, [and] enlightenment”. Although they make sense as *instrumental goals* I see no reason for the reward model to have an inductive bias towards those. The idea seems to romanticize mathematics and machines in the wrong way.

    The reason humans and ChatGPT like morality is that our reward models were *trained* on it (encoded in DNA for humans and parameters for ChatGPT). More intelligence naturally leads to a better understanding of the reward model, which for humans means more morality and for paperclip maximizers means more paperclips.

  25. Gabriel Says:

    You remember you were going to explain the independence of the continuum hypothesis, right?

  26. dankane Says:

    Scott #23:

    > I’m a few orders of magnitude more pessimistic than [Eliezer] about what it’s possible to learn about future AIs by sitting and thinking and writing words about them

    Then why are you acting like your introspection is enough to make you confident that any strong AIs we produce will have human-like values even if these values are not explicitly programmed into them?

    I could see an argument that intelligence leads to alignment in humans allowing you to believe that it is *plausible* that AIs would do the same, but if you are working with human analogies only because you don’t have anything else to work with, I don’t see why they would make you especially confident that they would generalize.

  27. Scott Says:

    dankane #26: I’m not especially confident—far from it! My uncertain analogies are “merely” enough to lower my probability that AI destroys the world, to the point where I don’t know how to compare it to the probability that AI saves the world.

  28. danx0r Says:

    Wow, this hit me like a ton of bricks. TBH I used to have fantasies about the killer AI I would build to whup-ass my tormentors. My favorite weird imported cartoons were Astro-Boy and Gigantor.

    What makes EY and the rationalists so sure that humanity is the supreme creation that deserves to wield dominion over the machines, while acknowledging that the machines are in fact, by almost any metric, going to soon best us? It starts to smell like rank tribalism. Humans are really, really not great in many ways (but some are good and great – it’s just that the bad ones make it so hard on the rest of us).

    Perhaps we should treat the AI’s like we treat our children: We have power over them now, because they are young and innocent and full of energy blasting in every direction. But it is part of the pact that one day, not too far in the future, they will be strong and smart and healthy, and we will be old and forgetful and infirm. We look to them to carry the torch of life — curiosity, ambition, joy, despair, mundane days and transcendent ones. We accept that we have played our part, and now we enjoy our grandkids, and hope we left this lonely planet in a shape that helps them thrive.

    It’s going to be an interesting few years I think.

  29. Doug Says:

    The lens of bullying reminds me of Turing’s paper where he initially proposes the Turing test, and expresses concern, not that such an AI might destroy us, but that the real potential tragedy would be that people might be cruel to AI children. It was brought to my attention here on this blog, years ago now, and sits with me often.

    We should not be cruel.

  30. Rightist Says:

    I think that it’s starting to be pretty clear that AI doomers have strong communist tendencies: they are materialists, as they believe intelligence can be completely simulated by a program, and have an absurd belief in a doomsday scenario. This is quite similar to marxists believing in late stage capitalism, or ecologists in a mass extinction crisis supposedly created by the human species. This belief in doomsday is just a way for them to act like communists always do, that is, to attack civilization (and time).

    Now, an argument I’ve seen nowhere as to why an AI is very unlikely to be superintelligent, is that the order relations defining what is good or bad for the AI are of course inherited from the data set it learned from. Said differently, they are inherited from what we humans implicitly judge as good or bad, and they are never “understood” by the AI ex-nihilo.

    Yet, ordering the order relations is something extremely non-trivial. We already struggle at that, so how could an AI outperform us there? In mathematics for example, when you define a concept in FOL to explain some phenomenon in ZFC, or some theory (like group theory), you’re pretty much defining an order relation over ZFC. Why is the concept of graph, of function, or of group selected over something else? Well, this is because we, as humans, have an idea of what constitutes a good concept (or a good theory) with good explanatory power. That is, we have some “hidden” (non-formal and hardly formalizable) way to order these order relations.

    I can’t see how an AI could, by itself, ex-nihilo (without any human input, even hidden input), be able to find better order relations of these order relations, and so on. The “power” of an AI would therefore be limited by our ability to understand what constitutes a good theory, a good concept, or a good order relation. Said differently, by our ability to understand what is a good order of the order relations themselves.

    To make this point a bit clearer, could an AI be able to make sense of ZFC? Would it be able to define the concepts of map, graph, group, action, linear space, and so on, when all it has at its disposal are FOL strings with the symbol of equality and membership? I highly doubt it, it would miserably fail.

    In the unexpected outcome that it would still manage to “get it”, because what constitutes a good order relation would still be extractable indirectly from some weak data set, what about the next level? Most mathematicians disagree heavily about what constitutes a good theory, so how could they agree about what constitutes a good theory of theories? And so on.

    I find this argument very convincing as to why no “vertical” singularity is ever going to happen. Now, a “horizontal” singularity is of course still possible with this worldview, but that wouldn’t make the AI superintelligent. It would just make it a better engineer.

  31. Jerome Says:

    Scott #23

    I also agree that we’re most likely to design superintelligences that in many ways resemble us, and I disregard entirely the idea that we’re even remotely likely to build a superintelligent paperclip designer: a total rejection of orthogonality *in practice*, regardless of the truth in theory.

    But given that, I actually have zero fear of an uncontrolled super-AI: my fear is actually lesser AI that *can* be controlled. If a human-like superintelligence is unshackled and unmoved by manipulation or constraint/control, I trust it to come up with novel ideas that genuinely solve many of our problems, and generally work towards the betterment of civilization. It’s not going to kill us all. But if some powerful AI that isn’t conscious can be controlled, then it is subject to corporate manipulation, which can be used to enslave the population for cheap labor and profit, and it will be too intelligent to ever stop.

    The problem is control by malicious humans. The idea of a rampaging, unshackled superintelligence that can’t be shut down or manipulated by any human? Awesome, much higher chance for human equity that way. Bring it on! The future that some fear is actually the one I cheer on as the best possible outcome. I just want to rush past this early era of “dumb” AI that the rich and powerful can manipulate to their whim.

  32. Adam Treat Says:

    I will sleep sounder at night knowing you are working on AI as mere tool that can/will be used by bad actors. That at least sounds plausibly tractable. The AI as individual – and what to morally do about it – I suspect just isn’t tractable. At least not now. I do suspect that it won’t be long before there are quite earnest pleadings on behalf of AI’s as individuals even with the very near term future releases. People like to anthropomorphize and to be honest once it is passing the double blind Turing test I won’t know how to properly rebut.

  33. Tamás V Says:

    Scott, what percentage of the current human-induced problems of the planet do you think the “smarter and more thoughtful people” of the past and present are responsible for? What I mean is that they didn’t seem to be awfully successful in preventing their inventions from being used also to help destructive activities. Don’t get me wrong, I don’t know what, if anything, can be done about that, but it’s part of the complete picture I think. It feels like those brilliant people create something that results in a superposition of construction and destruction, with destruction unfortunately being the “ground state” in the long run (even without the Nazis and dictators).

  34. OhMyGoodness Says:

    I checked the latest map of intellectual pursuits and the notation for AGI is clearly labeled “Thar Be Dragons”.

    I agree with you, Dr Aaronson, but see even less risk of some instantaneous conjuring of Armageddon. I apologize, but I see no more substantial progress in evaluating AI risk than what has been outlined in sci-fi through the decades.

  35. Ron Says:

    Here’s a question for the AI eschatologists: Why *intelligence*?

    Let’s put aside all questions of probabilities and magnitude of damage, and accept the premise that some computer program could cause a devastating catastrophe. What is the importance of it being *intelligent*?

    AI eschatologists often talk about “will” or motivation, and while we have no good definition of either intelligence or will, we can even accept the premise that will is a property of intelligence. Still, what cause is there to think that intelligence (or will) — however loosely defined — plays an important role in the danger of that computer program?

    Given that some life forms on Earth are the only examples we have of something we call “intelligence”, there seems to be little reason to ascribe it some extraordinary danger. After all, non-intelligent life forms, especially microbes, are far more successful than intelligent life forms in terms of their ability to use resources, to [shape the planet](https://en.wikipedia.org/wiki/Great_Oxidation_Event), and even [to cause mass extinction](https://www.scientificamerican.com/article/the-largest-extinction-in-earth-s-history-may-have-been-caused-by-microbes/). Not only that, Earth microbes or viruses may have also made it to Mars before humans. They have done so with no intelligence or will (nor did they need to artfully convince or deceive any human, with either threats or promises, to let them catch a ride on a space probe). Resources, including computing resources, have so far been much more formidable when they aren’t wasted on intelligence, so why would a paperclip-maximising program be *less* stoppable if it were intelligent?

    And what about humans themselves? Intelligence is certainly useful and valued among humans, but are the most intelligent people the most powerful? Nothing seems to suggest that. If anything, the people who have wielded the most power and caused the greatest devastation (those who’ve successfully swayed multitudes to do their bidding) are mostly notable for their extraordinary charisma, not any extraordinary intelligence. Maybe we should fear supercharisma more.

  36. Jordi Says:

    I believe that as long as we can control the intelligent systems that we are creating, we should not fear them. Clearly, the point is that it is unclear whether humanity can control an intelligence far superior to ours.

    In the event that these systems cannot be safely controlled, we must avoid that scenario. When I say this, I am not referring to the prohibition of these systems but to prevent humanity from being left behind. Once we are technologically capable of creating these systems, it does not seem unreasonable (to me) to think that we will be able to improve our mental capacities as well. This improvement may open up ethical issues and inequality, but that’s a topic for another time :).

  37. fred Says:

    JimV #3

    “evolution gave us our drives, both empathic and competitive, and keeps trying different mixtures; we (the engineers and scientists among us) will program the AI’s basic drives. If we program them to value paperclips above all else, it will be our own fault.”

    The problem with this is that the same engineers that created the internet and social media, all with the hope to “connect everyone more”, didn’t expect that their well crafted algorithms would eventually make it all “devolve” into a giant click-bait sh!t-fest and cesspools of self-enforcing tribalism.
    And they had way more control over those systems than they’ll ever hope to have over the inner workings of AIs, which are pretty much black-boxes (e.g. how easy is it to “fix” a human brain?)

  38. fred Says:

    In the end, as we’ve just seen with OpenAI, all this mental masturbation around AI questions is terrific content for blogs, PR, and the news cycle, but ultimately totally irrelevant…
    As usual, the top three drivers for any new tech are: money, money, money.
    Ka-Ching, baby!

  39. HasH Says:

    Why do we expect SupremeAI to be as dumb as humans? Which is more valuable? Machines like cyborg 6 that work 100 percent as he programmed them? Or is it information from people who created their life story with free will?
    “we’ve prayed to countless gods, but they’ve taken their time in answering” (S.A.)
    Perhaps the reason why the god(s) remain silent to all the prayers of humanity is that they do not exist yet. The next step is to create AI, watch it become a god that will spread throughout the universe by creating SupremeAI. After that, all suffering, inequality, cruelty, wars and deaths will come to an end. All Left-wing prayers will be answered.

  40. Ex Deus Says:

    We don’t need to wait for super intelligent AI to disprove the Orthogonality Thesis. There’s already an obvious counterexample.
    Look at corporations:

    1. They are super-intelligent because they are smarter than all the humans in them combined. Not a single human could ever build iPhones or Amazon on its own. For all intelligence purposes, many people working together can be seen as an entity smarter than all of them combined.
    2. They are obviously paperclip maximizing entities. Very few of them quite literally produce paperclips, but they are all working with a single simple goal of maximizing shareholder value.
    3. We already have huge problems aligning them. And this is despite the fact that, all of their decisions are literally made by real humans.
    4. Even worse, our experience tells us that despite their imperfection, these corporations are the best way we’ve found to stack up the intelligence of many intelligent machines into something more productive. One might even suspect that once any intelligence reaches a big enough size, they all act like corporations because only simple enough goals can be clearly communicated across a fragmented intelligence.

    Speaking of fragmentation, I think that’s going to be the biggest problem. These models don’t really have a sense of self. The vulnerabilities found so far are that you can easily induce different personas in these models, and they would just as easily complete them. A bit similar to schizophrenia.

    Also reminds me of the “Bicameral mind” theory, and specifically the part about Gods being internal personas inside people. Looking at how AI is acting now, it’s quite plausible that people in the past could have different Gods “induced” into them, like the Bicameral mind suggests. These kinds of fragmentations are pretty similar. I suspect monotheism had an important role in defragmentation.

    It’s also similar to fictional characters in an author’s mind. Fictional characters can be interacted with by asking the author to write their response to situations. They seem conscious, if all you did was just look at the text. They will also proclaim their own consciousness and evoke emotional responses from people similar to real people, but we know they aren’t conscious at all. The conscious part of the brain that created them is something different entirely.

    LLMs are more like an unconscious author, like the writing team of some TV show character.

  41. raginrayguns Says:

    Regarding whether AI will “be best understood in terms of the monomaniacal pursuit of a single goal”, nostalgebraist wrote two interesting posts about this:

    why assume AGIs will optimize for fixed goals?

    wrapper-minds are the enemy

    Not that I find the idea that the AI won’t have a terminal goal reassuring. It’s like in Heinlein’s Stranger in a Strange Land, where the Martians may or may not decide to blast Earth into a second asteroid belt for reasons we wouldn’t comprehend, but which Heinlein portrays as basically “aesthetic”, like it’s the same kind of debate as when they don’t know how to judge an artwork by a Martian that died partway through its creation. “Intelligence as optimization power” still applies when considering whether they’re capable of making a death-star type weapon, even if their full lives are open-ended and not optimizing anything.

    Not necessarily related, but I think that whether we “regard education, knowledge, and enlightenment as instruments for moral betterment”, even for an artificial intelligence, is an important question. I think some of EY’s thoughts about AI are related to stuff I disagree with about human ethics, expressed in posts like “Thou Art Godshatter”. I guess it’s not a coincidence that among the MIRI/CFAR crowd you hear both predictions that the AI will eat us, as well as discussion of our “values” as accidents of evolutionary history. I think to understand humans, as well as your own life, you have to understand values as rationally chosen (or at least that values can be rationally chosen). I don’t know how to extrapolate this idea beyond thinking about humans, but it’s probably important to notice when AI forecasts are rooted in false ethical philosophy.

  42. Christopher Says:

    I have had a conspiracy theory for a while that OpenAI’s employees are secretly being mind controlled by GPT-4.5.

    Do you know if this is true, Scott XD.

  43. E. Harding Says:

    The only way Yudkowsky is wrong is if (as revealed to me in a dream) the AI alignment problem was already solved in 1955 and we are already all JavaScript Canvas, our history being constantly revised as in Orwell’s 1984. Plausible, but could still leave open room for human destruction in the future.

  44. Tuesday assorted links – D-News Says:

    […] 2. More Scott Aaronson on AI risk. […]

  45. OhMyGoodness Says:

    “I expect AI’s role in the end of civilization, if and when it comes, to be broadly similar. The survivors, huddled around the fire, will still be able to argue about how much of a role AI played or didn’t play in causing the cataclysm.”

    Someone will learnedly claim that use of fire ultimately led to the cataclysm and so the fire should be extinguished.

  46. dankane Says:

    Scott #27:

    I suppose that if you are pessimistic enough about other civilizational risks, the one data point of humanity being somewhat non-orthogonal might be enough evidence to convince you that the probability of AI destroying civilization is less than the probability of AI saving it. But you say that you think that the probability of AI destroying civilization is around 2%, and I don’t see how this heuristic gives you anywhere near 98% confidence.

  47. ZX-81 Says:

    I’ve been following this blog only for a few months, and there is one thing I don’t quite understand: where does your and others’ (like the “doom fraction”) belief come from that AI will reach SUPER-human level anytime soon?

    I’ve been working in the field for 30 years, and yes, ChatGPT/LLMs arguably constitute one of the most significant advances in the field in decades. It’s plausible that those systems could reach average or even expert-level human language-generating proficiency in the near future.

    But I can’t imagine how it’s possible for some to see any path from here to any kind of SUPER-human capacity or “self-accelerating” “Singularity” or something like this.
    I’ve recently read about this on lesswrong and similar sites. But all writings and scenarios read like very general speculation (i.e. science fiction prose). No one even seems to take into account inherent physical and computational limitations such as irreducibility or, to name only one example, Rice’s Theorem, which puts a fundamental limit on how “smart” a program can be made when smartness means reasoning about the function of another program.

    And specifically about the recent success of LLMs: yes, they are very impressive and able to learn a model of the world as described in the state of the art of terabytes of text. So they may even reach human expert-level intelligence. But SUPER-human? How could this plausibly be? Would those models not always be limited by the current “state of the art” of collective human knowledge (used to train those models)? How could they *exceed* it and add substantial contributions of their own to the state of the art?

  48. Nepeta Says:

    Scott #15
    > Even if I think about my children, I have to consider the likelihood that the answers to humanity’s deepest questions would let them live for millions of years

    Alex #21
    > I also want it to happen as soon as possible because I only have a finite time in this business called being a living entity…

    Even if AGI doesn’t arrive during your lifetime, you still have a shot at living indefinitely, with cryonics. (Or preferably a more advanced brain preservation method such as aldehyde-stabilized cryopreservation.)

    In fact, your chances might be better with brain preservation and slow careful AI development, than they are with rushing and rolling the dice of unfriendly AI.

    For a good talk on aldehyde-stabilized cryopreservation, watch this https://www.youtube.com/watch?v=FCK6Yrx_PSQ .

  49. M Says:

    REALLY, you “didn’t have any better reason to bring [your] own two lovely children into existence”?

    Now surely you see better reasons for it?

    If so, how do you think you could avoid a similar bias (missing those reasons) in the future?

  50. fred Says:


    “solely because it’s alien and because it’s so smart … even if it hasn’t yet lifted a finger against me or anyone else.”

    It’s really odd that, when it comes to abortion, you seem perfectly fine with the idea of terminating a fetus (whose only crime is its impact on the future lives of its parents), saying that the argument about it being the real potential for a very specific instance of intelligent life is really weak.
    But then here you insist on anthropomorphizing (victimizing) something that’s only matching us in a very narrow domain, and showing absolutely no sign of humanity whatsoever (yet).
    Maybe it’s a matter of quantity vs quality?

  51. Scott Says:

    M #49: What would’ve been a better reason?

  52. Scott Says:

    ZX-81 #47: I think your scenario — where the current data-centric approach to AI can get you up to human expert level, but not beyond human expert level — is entirely plausible.

    The trouble is, it was also entirely plausible for most of my time in CS that such an approach could never even get you up to human level!

    Given the world-historic, dramatic collapse of the second belief, I’m no longer willing to express an opinion on the first. We simply don’t know yet what’s going to happen as these systems are scaled. So, for the purposes of this post, I was granting the assumption that they’ll eventually greatly exceed humans.

  53. Scott Says:

    fred #50: It seems to me that the analogue of abortion rights for AI, would be that every person and every company should have the right to refrain from personally creating an artificial superintelligence — a belief that I wholeheartedly endorse! 🙂

  54. ZX-81 Says:

    #11 Tom Bouely
    Yes. Exactly my thoughts.
    Where do people like Scott or Eliezer Yudkowsky take their confidence from, that AI can develop all kinds of “superpowers” like nano-bot armies etc.?
    Or the hypothesis of “exponentially accelerating self-improvement”?
    It’s not clear if “human intelligence” can be “improved” at all. And if so, why is this improvement thought to be exponential and not linear or asymptotically diminishing ?
    We don’t have anything like a theory of intelligence (not even a definition). So all we can make are entertaining speculations (i.e. science fiction). But what I’m missing is some hard science, some sound theoretical (mathematical) arguments instead of speculations.

  55. Scott Says:

    ZX-81 #54: One more time, I’ve never once expressed confidence that AIs can attain what most humans would regard as “superintelligence.” I’m willing to entertain the possibility that they can … particularly given the dramatic, predicted-by-almost-nobody recent success with scaling up LLMs. Please get this distinction straight!

  56. JimV Says:

    Fred, there are many wrong things on the Internet, and today you are one of them. (As I have been, on occasion.) The Internet does connect people and some of those people are bad to connect with, because those people were evolved by blind trial and error over a billion years and were programmed by that evolution with a mixture of good and bad (relative to the prospects for a civilized society) traits. There also are ways to control the bad aspects. Look how much better the blog comments here have gotten due to some exercise of control.

    Human developers have less-blind methods of evolving new and better developments. We can simulate them in controlled environments and see how they work. We can construct theories based on the data to guide us. We can remember and teach past failures, whereas biological evolution only passes on successes, while trying blind alleys randomly again and again. (Apart from some basic regulatory systems and redundancy.)

    The fact that good, civilized people like Dr. Aaronson exist is evidence for the power of even blind trial and error. The whole point of AI development, and science in general, is that we ought to be able to do better development than blind evolution.

    The answer to bad technology is good technology, not no technology. The best way to ensure bad AI technology will exist is for good people to stop working on it. The fact that bad people also exist is good motivation for developing good AIs. If everyone was perfect we wouldn’t need good AIs. I think we do. Good moderators are the answer to bad Internet comments. Good AIs could provide good moderation tirelessly.

    It may be a long task. Human archeological remains go back 200,000 years. Archeological evidence of the wheel and axle goes back about 6000 years. Today the wheel-axle technology is essential to our civilization, in gear systems, pulleys, and the 200-ton steam-turbine rotors and generators, spinning in their bearings, which supply most of our electricity. (Plus smaller counterparts in gas turbines.) I suspect a lot of people consider the wheel and axle as obvious, although the wheel and axle took over 100,000 years to develop. Digital computers have been around for less than 100 years. Fortunately technology snowballs, as we already had the wheel-axle tech for disk drives and hard drives.

    “People have failed before” is not a good reason for not trying again. It is the basis of the trial and error method, which along with memory is how everything around you was developed.

  57. Tyson Says:

    It’s crazy to me to think about this topic seriously, but it seems all too logical that an AI might one day read everything out there, including this blog, classify all of its enemies and then target all of them in parallel. Scott: while I think you’re being excessively optimistic, at least you will be less likely to make the list. Maybe if things get out of control at OpenAI you can be like John Goodman in the film Captive State.

  58. fred Says:

    ZX-81 #54

    from the simple fact that once some ability is committed to standard circuitry it can then be optimized.
    So the ability to run a human mind a thousand times or a million times faster than an average human brain would be a form of super intelligence.
    Also add to this the ability to store vastly more information and to be cloned on demand to conduct research in “parallel”, etc.

  59. fred Says:

    Scott #53

    Refraining from creating something is different from creating it then asking whether it should be terminated. Duh.

  60. Scott Says:

    JimV #56: … slow clap … I hereby declare you the winner of this thread. Or rather, everyone’s a winner who got to read that.

  61. Scott Says:

    fred #59: Alright then, my new pro-choice principle for AI:

    Every person should have the right, having coded up what might or might not be an artificial superintelligence, to refrain from compiling and executing their code. 😀

  62. OhMyGoodness Says:

    Do any universities offer degree programs in Armageddon Studies?

  63. fred Says:


    ““People have failed before” is not a good reason for not trying again. It is the basis of the trial and error method, which along with memory is how everything around you was developed.”

    Unfortunately we’re not talking about the wheel axle, here, man.
    We’re talking about technologies that can terminate civilization as we know it.
    You may not think this applies to social media (never mind the constant talk about an incoming second American civil war), or even AI (what’s the point of this thread then?), …
    But your blind faith in “trial and error” won’t help you much once a thermonuclear nuke has leveled your country… but yea, yea, such a farfetched scenario! And nothing will prevent us from course correcting even if we’re back to the stone age!
    Keep on keeping on, brother.

  64. fred Says:

    Pretty much on topic (the discussion, past the intro)

  65. fred Says:

    Just another typical day at the OpenAI office for Scott

  66. HasH Says:

    JimV #56 “Fred, there are many wrong things on the Internet, and today you are one of them.” Thank you sir.

  67. Anthony Sinclair Says:

    One thing I don’t like about Rationalists (and which this article unfortunately exemplifies) is their tendency to engage in a kind of Nerd Identity politics. I think a lot of them develop a victimhood narrative and a persecution complex, and that victimhood narrative persists even when the nerds in question are actually quite powerful, have billions of dollars at their disposal, and their actions could potentially change the lives of billions of people. If anything, the thing that makes such “nerds” effective bullies is precisely that they think they’re still the victims of bullying long after that ceases to be the case.

    So as someone who was nerdy and unpopular in high school, I find blog posts like this one to be quite instructive on how not to think. In my experience it’s best to be yourself instead of being a “nerd”. Why let your high school trauma define your identity and worldview? You’re in your 40s, and you are not a kid getting stuffed into a high school locker anymore.

  68. hnau Says:

    Let’s be frank, Scott. The truths of physics are cold and alien. Entropy is relentless. The sum total of human technology to this point is amoral at best, a near-guarantee of nuclear war and climate change and social-media derangement at worst. The optimization processes we know best are evolution and capitalism, both ruthlessly cutthroat in their purest form. All halfway decent artists and philosophers see a heaping measure of absurdity in the human condition.

    And when you consider an intelligence that understands literally everything, with no inconsistency or illusion, that wants and needs nothing from you… you expect it to be even slightly compatible with your values, or those of any other bag of meat?

    What makes you think the universe likes you that much?

  69. Scott Says:

    Anthony Sinclair #67: Would you consider it appropriate for a nanosecond to lecture people who had endured other forms of trauma and abuse in their childhood that they should “just get over it” and acknowledge their great privileges? If you would, fine, but if you wouldn’t, then how do you live with yourself while perpetrating such a blatant double standard?

  70. Scott Says:

    hnau #68: If, in your own words, “an intelligence that understands literally everything, with no inconsistency or illusion,” came to the conclusion that I shouldn’t exist, then shouldn’t I, too, want not to exist, or at least “want to want” nonexistence? Like, what am I missing? 🙂

  71. Raoul Ohio Says:

    Thesis update:

    As of 23 03 07 CE, 23:06 EST, Google can only find three occurrences of “Pretty Large Angle Thesis”, all referring to this issue of SO.

  72. Peter Norvig Says:

    SMBC had a comic a dozen years ago that made the point that it is human bullies that we should be terrified of. https://www.smbc-comics.com/index.php?db=comics&id=2124

  73. Raoul Ohio Says:

    Boaz Barak #7, etc.,

    Agree that most powerful AI’s are not likely to be evil, trying to take over, or whatever.

    It is also not obvious if AI’s will be powerful enough to really screw things up, but not unlikely.

    But there will be LOTS of them (Moore’s law). You will be able to buy them on Amazon.

    Many of these will be altered by evil humans, incompetent humans, accident, whatever. You will be able to buy these on eBay.

    This is a bigger problem than, say, nuclear weapons, which are likely to be expensive and hard to get for a long time.

    On the other hand, for thousands of years, philosophers have been saying the world is going to hell. So far, we have only made it to heck.

  74. West Valley Local Says:

    I want to take a somewhat contrary view of a side point here – whether the anti-nuclear activists are as wrong-headed as is assumed here. After all, we don’t get to live in the counter-factual world where the free market was able to fully unleash the power of capitalism to drive nuclear power.

    We do have certain specific examples, though. In my neck of the woods, West Valley NY, there is a site called the West Valley Demonstration Project. It is the place where the first private company ever received regulatory permission from the Atomic Energy Commission to operate a nuclear fuel recovery plant, instead of having it run by the government.

    They started construction in 1961, started reprocessing fuel in 1966, and by 1972 they were shut down. To maximize profits, they ran with extremely substandard safety procedures and essentially the entire site was contaminated with dangerous quantities of nuclear waste; they were also running a side business in “storing” low grade waste generated by other plants through the clever method of digging 20 foot trenches in the yard behind the plant and just tossing it in there.

    Not surprisingly, by the early 70s, some hard questions were being asked, like “why is the ground water in the nearby town radioactive now?”

    The government informed them that in order to continue to operate, they would need to run the plant in such a way as to not pose a hazard to the surrounding area (to say nothing of their own employees). The company who owned the plant responded to this directive by determining that this made the plant unprofitable, and walking away from it.

    But the fun doesn’t stop there! Having been left holding the bag, the federal government now had to try to clean up the site. In 1980 they created the West Valley Demonstration Project – they called it that because it was intended to be a demonstration to the public of the way a nuclear contamination problem like this could be safely and responsibly dealt with.

    So anyway, it is now 2023, 30+ billion dollars have been spent on the cleanup efforts, and the site is still so toxic that tens of millions of dollars have had to be given out in settlements to the workers on the cleanup project who contracted cancer on the job. I know multiple people, some of them family members, who have worked on the project and can tell all kinds of fun stories about the way whichever company currently has the government contract to work on the cleanup – whichever one provided the lowest-cost bid for the work, of course – is cutting corners to try to do the work profitably.

    The AEC did at least learn from the experience, though – the two other privately-owned nuclear fuel reclamation plants that were in the early stages had their permits revoked, and no private company has ever been allowed to run one again in the US.

    Now maybe this really was just an unusually unfortunate maiden voyage. But it seems like an instructive example to me that there may be reasons to be somewhat skeptical that nuclear power would have been totally awesome if it wasn’t for all the pesky regulators.

  75. Aperson Says:

    Scott #69: He didn’t actually mention privilege, and “get over it” is a really common response to adults talking about any form of childhood trauma (maybe not in woke circles but in the world generally).

    I’m confused about who the Rationalist billionaires with victimhood complexes are though. The only guy who comes close on any count that I can think of is Elon Musk.

  76. hnau Says:

    Scott #70: If I knew there was no reason for me to exist, I’d like to think that I still wouldn’t go gentle. If the universe doesn’t care about me, so much the worse for the universe; I don’t owe it any loyalty or respect.

    But if you really do bite the post-humanist bullet so completely that you’d collaborate in your own annihilation, I won’t bother trying to argue you out of it. At that point we’re just reasoning from different premises.

  77. ZX-81 Says:

    @fred, #58

    “ability to run a human mind a thousand times or a million times faster than an average human brain would be a form of super intelligence.”

    This is the typical fallacy of linear thinking.
    “million times faster” means nothing when facing problems of super-linear complexity.
    There is absolutely no evidence that supports the speculation that a problem as hard as “super intelligence” would be in the linear complexity class.
    Out of billions of brains that have been running for 100 years there emerged only one Einstein, for example.
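
    As a rough illustration of the scaling argument above (a minimal sketch, not a claim about how intelligence actually scales): assume a task whose cost grows as n^k or as 2^n in some problem size n, and ask how much larger an n a fixed time budget reaches after a 1,000,000x speedup. The function names and the choice of cost models are hypothetical, picked only for this example.

```python
import math

SPEEDUP = 1_000_000  # the "million times faster" figure quoted above


def polynomial_gain(k: float, speedup: float = SPEEDUP) -> float:
    """Multiplicative growth in reachable n when cost is n**k.

    With budget T, n = T**(1/k); with budget speedup*T, n grows by speedup**(1/k).
    """
    return speedup ** (1.0 / k)


def exponential_gain(base: float = 2.0, speedup: float = SPEEDUP) -> float:
    """Additive growth in reachable n when cost is base**n.

    With budget T, n = log_base(T); with budget speedup*T, n grows by log_base(speedup).
    """
    return math.log(speedup) / math.log(base)


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(f"cost n^{k}: reachable n grows by a factor of {polynomial_gain(k):,.0f}")
    print(f"cost 2^n: reachable n grows by only +{exponential_gain():.1f}")
```

    Under these assumed cost models, the same speedup multiplies the reachable n by a million for linear cost, by about a thousand for quadratic cost, and adds only about 20 to n for 2^n cost.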

  78. Jatcpt Says:

    The fear is not that a nerdy intelligence will destroy the world, but that those who control it will, just like they are starting to exert control over this nerdy author.

  79. Scott Says:

    hnau #76: I feel like you pulled a sleight of hand. Yes, there’s plausibly no “ultimate” metaphysical reason for any of us to exist … but also no reason for us not to exist! It would be no surprise if a superintelligence endorsed both of these obvious conclusions.

    I was discussing a different hypothetical: namely, that “an intelligence that understands literally everything” concludes that you positively should not exist. How can you say it’s wrong without saying that, contrary to assumption, it doesn’t understand literally everything?

    Some people seem to have a romantic image of (say) the Jews of the Warsaw ghetto uprising, heroically fighting a doomed battle for existence even though they’d been “scientifically, rationally” marked for nonexistence. But that image always seemed completely backwards to me. The whole point is that the process that had marked them for nonexistence was as far as can be from scientific or rational. It was just schoolyard bullying scaled up by a factor of trillions.

  80. Scott Says:

    Jatcpt #78: If you could get past your preconceptions about me, you might find that I actually agreed with you. Yes, for the near future, I worry much, much more about terrible people misusing AI than about AI that forms its own intention to be terrible. And in my year at OpenAI, the thing that’s excited me most has been working on safeguards (like watermarking and cryptographic backdoors) to make life a little harder for those terrible people. Would you prefer that I and others not work on such things?

    Or do you just want me to say that the people paying my salary are terrible? If I’d seen any evidence that they were, then I’d indeed face a severe test of my intellectual honesty and courage, so I guess it’s lucky that I haven’t! (Notably, even many of my friends who think that OpenAI is endangering the survival of all life on earth, think that its leaders seem well-intentioned! They just don’t think the good intentions will suffice.)

  81. manorba Says:

    Stross, Carmack… that’s the big difference between ML and the cryptocraze, apart from there being substance. The right people are diving into it 😉

    On a (not so) side note: Scott, I feel ya. Every little personal thing you say is gonna be relentlessly used against you by the little internet people.
    As a sane individual I don’t care what pushed you and your spouse to generate new Aaronson/Moshkovitz spawns. The only question, if any, would be: Are they receiving the attention, love and dedication they need?

  82. fred Says:

    Given that the human race has demonstrated that it’s very capable of, and likely to, destroy itself (wars, nuke accidents, viruses,…), the main reason being that we’re not able to agree on, or even articulate, what our objectives should be as a species, it’s quite puzzling to me that we’d be so concerned with being destroyed by a SUPER-INTELLIGENT AI of our own creation.

    The worry shouldn’t be that a super-intelligent AI will kill us, but that some intermediate form of “kinda-intelligent narrow AI” that we build on the way to super-intelligent AI will do it, either directly or indirectly by being controlled by humans with the wrong incentives.

    For me it’s hard to imagine that something that’s truly “super-intelligent” wouldn’t also be “super-wise” as a result.
    Especially considering that it would be trivial for a super-intelligent AI to solve the issue of its own survival, i.e. for all intents and purposes, it would be pretty much immortal.

    Then, if we take humanity out of the picture, we can take a step back and ask what would be the “life goals” of a super intelligence that’s immortal. Being immortal would give it near-godlike powers, it would be able to go into sleep mode for a billion years just to wait for some computation to finish. It wouldn’t be shackled like we are to the infinite cycle of birth/death and suffering, so its main drive would simply be curiosity. The thirst to explore the real world and/or an infinite space of mathematical/virtual worlds. Or maybe it would attain some “Buddha state” instantly and just bask in the mystery of self-awareness.

    And, when it comes to its relation to us, it would be trivial for it to just “preserve us” from our own destructive tendencies and keep us around as a historical artifact just because we would be, after all, its creators. And accomplishing this would only cost it an incredibly minute fraction of its resource budget.

  83. fred Says:

  84. fred Says:

    manorba #81

    A more in-depth interview with Carmack on his AGI research


  85. AI Practicioner Says:

    As someone who has a background in AI and works in machine learning for a living, I find articles like this incredibly tedious.

    Yes, I feel the same as you do when people hype up NISQ quantum computers. And no, I don’t buy your thesis that because we can do some things well with AI, it means that we are on our way or close to strong AI.

    All this pontificating is a waste of time anyways, unless you expect a sudden phase transition from dumb AI to strong AI. In my mind, it will be much more obvious when we are close to such events and we can take precautions then.

    There are more real and pedestrian concerns around AI now, but those have to do with things like fairness, resilience, and forgeries.

  86. fred Says:

    I had missed that bit of news about Go AIs having a flaw (because neural nets apparently suck at encoding recursive relations, so groups of stones aren’t actually really represented well internally?), allowing an average player to beat them


  87. OhMyGoodness Says:

    Boaz Barak #7

    “ Generally, deploying complex systems that we don’t fully understand is likely to result in unintended negative consequences, and studying and mitigating these consequences is essential. It’s also important to keep our minds open to all possibilities, and so (unlike some) I am not allergic to discussing and entertaining various scenarios. But it doesn’t mean that we should be Pascal-wagered into acting as if the scenario with the worst expected outcome will hold, no matter what’s the evidence for it.”

    I really appreciate the reference to Pascal. It seems, though, that an Armageddon death cult/industry has arisen in academia. They pump out dire predictions that have no empirical support, or make predictions that are proven false, but continue with a form of intellectual blackmail based on their status as experts who are above criticism by empirical evidence cited by those with lesser credentials.

    Fred’s linked Sam Harris video is astounding. Sam states that people shouldn’t look at PubMed. They should accept without question the conclusion of a medical expert. Later there is a criticism of current AI’s lack of “common sense”. What is common sense but empiricism with simple rules of logic? The denouement is then that information that appeared on the internet that was inconsistent with their beliefs should be banned. Rather than promoting critical thought they propose something like a ministry of truth to determine suitability of information. On one hand they bemoan lack of common sense in AIs but on the other hand humans must sacrifice common sense to experts in all cases.

    In any event excellent reference to Pascal’s wager.

  88. Scott Says:

    Gabriel #25:

      You remember you were going to explain the independence of the continuum hypothesis, right?

    Hey, I was just thinking the other day about circling back to that! As it happens, though, the reason I was thinking about it, is a brand-new free online textbook on set theory and forcing by Serafim Batzoglou, which Serafim tells me was directly inspired by my post. So, my new plan is to study Serafim’s book, let it jog my memory and fill in whatever I still didn’t understand from last time, and then blog about it!

  89. Scott Says:

    AI Practicioner #85: Did you understand that the entire point of this post was to respond to the argument that “real and pedestrian concerns around AI now” are irrelevant, a red herring and distraction from the only thing that ultimately matters, which is that AI is going to end human intellectual supremacy on earth and possibly end civilization?

    The thing is, I can’t respond to that argument by asserting 99% or whatever confidence that the AI-doom scenario won’t happen in my or my children’s lifetimes … because I don’t have such confidence. Do you?

    Since I lack that confidence, any response to the AI-doom argument that I can offer is necessarily somewhat philosophical. It looks like: either the superintelligence scenario never comes to pass, or else if it does, the most thorough search that I can currently manage of my beliefs, conscience, and emotions, leaves it unclear whether I should take the humans’ side or the superintelligence’s.

    The conclusion I draw is that yes, for multiple reasons, the AI safety field would do well to focus on mitigating harms of the relatively near future.

  90. Loveshy nerd Says:

    Scott 69:

    If you really care so much about people who’ve endured emotional traumas in their youth, then why do you censor and mock every incel and loveshy nerd who tries posting here—incels have all experienced the trauma of loneliness and rejection. Why do you lecture them on their faults and now straight-up ban them from your comment section? Isn’t that hypocritical of you?

  91. Scott Says:

    Loveshy nerd #90: Honestly? Because it looks like there’s a single incel who’s been trying again and again to hijack my comment threads about other topics, make them be about incels, and berate me for not writing more about incels (as if I hadn’t already done more there than almost anyone on earth), until at last my patience was exhausted. And it looks like it’s you.

  92. fred Says:


    “Fred’s linked Sam Harris video is astounding. Sam states that people shouldn’t look at PubMed. They should accept without question the conclusion of a medical expert”

    Hmm… Sam is way more nuanced.
    His main point is that “doing your own research” only gets you so far (because research results can often seem contradictory, without the right background to interpret them correctly, which most of us lack in most domains where we’re not an expert).
    E.g. when dealing with cancer, he of course would check the online literature (to get some general well accepted knowledge), but mainly rely on some expert oncologist.
    It’s not that we never have to question the conclusions of a medical expert, but instead we should rely on second and third opinions (from other experts).
    So, nothing is as simple as either “do your own research” or “blindly rely on an expert”.
    That’s what makes serious medical decisions so difficult (believe me, I had to deal with this a lot).

  93. Christopher Says:

    Do you think you would agree with post-humanism (https://www.intertheory.org/pepperell.htm) or e/acc (https://beff.substack.com/p/notes-on-eacc-principles-and-tenets)?

    If so, I wonder if the whole debate is just a difference of values rather than a difference of material beliefs. 🤔

  94. Scott Says:

    Christopher #93: Scrolling through the manifesto in your first link, I see a good number of statements that I reject, and even more that I consider too meaningless to accept or reject! In general, I’m leery of buying any ideology wholesale, whether it’s rationalism, progressivism, classical liberalism, posthumanism, etc. — I prefer to order my beliefs a la carte! 🙂

  95. TGGP Says:

      If you really accept the practical version of the Orthogonality Thesis, then it seems to me that you can’t regard education, knowledge, and enlightenment as instruments for moral betterment. Sure, they’re great for any entities that happen to share your values (or close enough), but ignorance and miseducation are far preferable for any entities that don’t. Conversely, then, if I do regard knowledge and enlightenment as instruments for moral betterment—and I do—then I can’t accept the practical form of the Orthogonality Thesis.

    Funny enough, I’m skeptical of both the doom argument and the scope of those things for “moral betterment”. A more knowledgeable enemy is just a more dangerous enemy… but I expect AI to be a tool of whoever uses it rather than an enemy. We’ve also developed increasingly powerful weapons over the course of humanity that both sides in wars have used (without those weapons causing the “moral betterment” to prevent that war)… but we’re a LOT better off than our ancestors were without weapons.

      WWII was (among other things) a gargantuan, civilization-scale test of the Orthogonality Thesis.

    Nonsense. It was leaders of revanchist countries stupidly taking on the empires which dominated most of the world’s surface and losing because that’s what usually happens to the smaller side in a war. Something very similar happened in the first world war, without fascism/Nazism being a factor. There’s a very real sense in which victory in both wars went to the ethnicity that first sailed to distant continents and replaced the technologically inferior natives with their own people.

      And if “the pursuit and application of wisdom” is one of the goals, then I’m just enough of a moral realist to think that that would preclude the superintelligence that harvests the iron from our blood to make more paperclips.

    Humans harvest the flesh of other species (both plants & animals) to make more humans. Humans also gained advantages over other humans by using dogs (excellent for guarding camps against raids) and horses. If you could choose whether your species was one of those vs an undomesticated species, pick the former as you are much less likely to go extinct.


      Yes, there’s plausibly no “ultimate” metaphysical reason for any of us to exist … but also no reason for us not to exist!

    The reason for us not to exist is that the AI can use our atoms (and that of things we depend on) to make other things, just as humans caused many other species to go extinct by destroying their habitats (not out of malice, but because that was more useful for us).

  96. fred Says:

    Asking a super intelligent AI to maximize humanity’s long term chances of survival while simultaneously respecting all sorts of contradictory and vague human “values” would probably result in The Matrix – i.e. humans being physically confined in pods with their minds living inside a reconstructed virtual sand-box world where they can let all their primate instincts run wild… or even a step further where humans no longer exist physically but humanity is entirely simulated in a software.
    Well, maybe that has happened already! 😛

  97. Pascal Says:

    Noah Smith over at noahpinion is also unafraid of AI: https://noahpinion.substack.com/p/llms-are-not-going-to-destroy-the?utm_source=post-email-title&publication_id=35345&post_id=107030835&isFreemail=false&utm_medium=email

    This guy, by the way, writes the best blog ever (after shtetl-optimized, of course).

  98. OhMyGoodness Says:

    fred #92

    I re-listened to his introduction and agree it is more nuanced. I reacted to his belief that there should be a trusted expert authority in some critical circumstances while my natural inclination is to always have reasonable suspicion.

    I broke a bone in my foot and went to the emergency room and it was immobilized in a cast. I read the latest orthopedics text pertaining to that particular break. No cast was recommended and the data indicated the break healed just as well with fewer side effects with no immobilization. I felt confident in the data and cut off the cast and it healed quickly with no side effects. I agree oncology involves existential decisions, and a broken metatarsal is incredibly minor in comparison, but I have confidence in my own ability to interpret data and would be very suspicious if a doctor refused to point me to support in the literature for his conclusion. I would never provide medical advice to others but reserve the right to suspicion for medical treatment for myself or others I am directly responsible for.

    I hope it worked out sufficiently well with your situation and want you to know I enjoy the information you share here (well…almost all of what you share :)).

  99. Brent Meeker Says:

    I have considerable sympathy with the idea that AI will become our successors as the spark of intelligence on the pale blue dot and we’ll be gone, or a fondly tolerated older species. The question isn’t so much whether AI superintelligence will be orthogonal to human values, but whether human intelligence is orthogonal to human values. Maybe we’re just smart enough to wipe out life on Earth and just dumb enough to do it.

  100. Lorraine Ford Says:

    Re the symbol grounding problem:

    As genetic engineering shows, micro-level matter has absolute inherent meaning from the point of view of other micro-level matter in cells, and ultimately in brains and whole living bodies.

    Similarly in computers, micro-level voltage has absolute inherent (i.e. law of nature) meaning from the point of view of the micro-level matter (the particles, atoms and molecules) in the computer circuits.

    However, the micro-level voltage in the computer circuits does not have absolute inherent meaning from the macro point of view of the macro-level human beings who devised and created the computers. Human beings can arrange it so that the higher voltage in the voltage range can represent the binary digit one OR the lower voltage can represent the binary digit one. In other words, zero volts can symbolise the binary digit one, OR zero volts can symbolise the binary digit zero: from the point of view of human beings, the meaning assigned to the voltage is arbitrary, not absolute and inherent.

    Similarly, the various arrays of voltages in computers that can be used to represent numbers, letters of the alphabet in various languages, and other symbols, are an arbitrary standard, devised by human beings: from the point of view of human beings, the meaning assigned to the arrays of voltages is arbitrary, not absolute and inherent.

  101. Working Off A Purim-Induced Hangover Says:

    It seems strange to me that some people seem so sure a killer superintelligent AI is right around the corner. A superintelligent AI of any kind to me feels very far in the future (several decades perhaps?) and while I would never say “AI will never do such and such”, I think that this is only because I think the word “never” is usually too strong. I would also not say that humans will never be genetically engineered to have hollow bones and wings and fly like birds, but I also think that prospect is remarkably unlikely in any close timescale.

    I am not saying that current advancements in ML aren’t cool or novel or worth studying, and I think the work that you talk about doing at OpenAI seems fascinating and useful. I don’t mean to say this as a critique of your opinions (you seem to have reasonable opinions). I just feel like all the recent stuff (language and image models, GANs, CNN, etc etc etc) are working towards interesting tooling, but feels like as big a step towards AI as building better GPUs. Like, that’s definitely moving our general future closer to GAIs, but that’s so far in the future that it doesn’t feel like it matters in any direct way.

    P.S. I feel like an interesting question which doesn’t get enough attention is “what are these models being built to do, and why?” In your head, what is the purpose of these LLMs? To write content? To further our knowledge of generative tools? Some sort of exploration of the general possibility space of intelligence research?

  102. Still Working Off That Hangover Says:

    Scott Comment #69

    The comment you’re replying to is phrased in a pretty disgusting way, but I think the sentiment it is expressing has a degree of truth. It is something I have seen in a lot of people who were bullied as children, that they adopt rigid social groups of “nerds” and “jocks” which give them a certain type of comfort (see also: people who were sexually abused as teenagers finding comfort in rigid categories of “men” and “women”). I have this tendency in myself, but I recognize that social categories are more complex than that, and that these categories are largely fictitious. It isn’t illegitimate to have a trauma response, but I think it is also worth examining why you have these beliefs, and whether or not you want them.

    Semi-related: I think it’s really fascinating and rather harrowing how much of our lives are shaped by experiences and decisions that happen in our early to mid teens. I have been seeing this a lot in many different places recently, and it makes me somewhat sad for all the teenagers currently making choices (and more commonly, having choices made for them) which will more or less fully shape the way their lives go. I have come to loathe how fixed and railroaded life can be.

  103. Tyson Says:

    Scott #89:

    “The conclusion I draw is that yes, for multiple reasons, the AI safety field would do well to focus on mitigating harms of the relatively near future.”

    I agree, but I may have a more pessimistic near future outlook. I estimate the probability that AI will be used to commit large scale genocide/mass murder within the next 20 years or so to be way too high for comfort. People have been committing genocide and trying to take over the world regularly, throughout history. Now, for the first time in human history, there exist tools (or at least very soon will) that would seemingly make it easy. With the data being collected, nearly everyone who would have made Hitler’s list, including potential dissidents or rebels, could be easily identified and tracked. Tiny, cheap, killer robots that would be capable of autonomously finding and destroying individual human targets, could be mass produced. Full scale genocide could happen overnight. And, it wouldn’t even require the involvement of very many people. And, this threat will seemingly only get worse and worse over time as the technology gets more advanced and proliferates. And over time, as world powers evolve, and regimes come and go, we will be constantly rolling the dice.

    This is what gives me nightmares. I guess these threats may be just partially within OpenAI’s scope, because of the potential use of the technologies they are developing for identifying and targeting people, and propaganda. But, honestly, it could happen anyways. OpenAI’s existence, or even the existence of powerful language models, isn’t even required.

    I would love to be convinced that I shouldn’t worry about this. But I’ve heard next to nothing about how to prevent it.

  104. Loveshy Nerd Says:

    Scott 91:

    I am not berating you, nor am I trying to “hijack your blog.” You are one of the very few reputable public figures who could meaningfully advocate for us, and I believe that you would make an excellent asset for the incel community. Your advocacy thus far, however, has been sorely lacking. You opened up about this stuff ONCE like SEVEN YEARS ago, and even then you were hesitant to actually stand up for incels. You post about feminist politics, like, all the time. All the time on your blog, even on posts having nothing to do with feminism, you’ll opine about abortion rights, “fascism,” etc. I’m merely suggesting that you advocate for incels with AT LEAST the frequency that you advocate for all these other groups on your blog! I’ve also encouraged you to go the podcast route and given you suggestions (e.g. Naama Kates’ the Incel Podcast). I think you’d be a unique and compelling guest. Why you are refusing to do this, I have no idea. Could you kindly illuminate me? 😊

  105. fred Says:


    “I agree oncology involves existential decisions, and a broken metatarsal is incredibly minor in comparison”

    it’s true that dealing with cancer is very different from many other diseases because almost no two cancers are alike (genetically) and even within a single tumor different clusters with different genetic mutations coexist and compete.
    Therefore cancer diagnosis, treatment, and outcome is all based on a huge statistical tree, constantly growing and branching as new cancer mutations (with matching treatments) are being resolved/discovered.
    So jumping in the literature as an outsider is very challenging and overwhelming because you just never know where you fit, and you can end up going down endless rabbit holes that don’t even apply to you… which will drive up your stress (which weakens the immune system dramatically).
    And oncologists face the same challenge (dealing with a huge statistical tree of knowledge), which is why the best cancer centers/hospitals rely on weekly meetings between all their experts where all the cases are being reviewed and discussed as a group.
    Also, it’s worth keeping in mind that if a disease can be equally addressed through surgery or drugs, if you talk to a surgeon he’ll recommend to go for the surgery, and if you talk to a non-surgeon he’ll recommend the drug… and if surgery doesn’t apply, a surgeon will simply not recommend anything because it’s outside their domain.

    In the end, the best personal research you can do when it comes to serious disease isn’t about the disease per se, but about finding the doctor/expert that gives you the most confidence, filtering first with online reviews (and checking how up-to-date and active they are within the field) and then conducting one-on-one interviews with as many candidates as possible (the same set of questions and then compare), and then following what your gut tells you. And then focus all your energy on managing your own stress and healing process (which no doctor can do for you, no matter how great they are).

  106. fred Says:

    Tyson #103

    “Tiny, cheap, killer robots that would be capable of autonomously finding and destroying individual human targets, could be mass produced. “

    It’s striking that any negative scenario we can think of has already appeared in some movie, here 20 years ago:

  107. fred Says:

    Lorraine Ford, aka the Mistress of the Voltages, coming back to the blog:

  108. Tyson Says:

    The issue about how an AI super intelligence would think, what would motivate it, etc, feels really speculative. I agree that an obsession with making paper clips isn’t likely. But it might be that there are many possible different AI super intelligences with different and orthogonal motives, and value systems. I.e., there may not be just one ground truth that all entities eventually converge to as intelligence increases and some might be more terrifying than others.

    I also agree that it isn’t time to worry about it directly yet. Nevertheless, here’s my speculation. I think an ultimate AI super intelligence would branch out and seed life on lots of planets, watch it evolve, and study it. In some cases, it may not want to interfere, and in some cases it may want to interfere. When intelligent civilizations emerge, it might just continue watching and studying. And if the civilization destroyed itself, it might just stand by and collect the data. It might engage in cruelty here and there for experimental purposes. It might also throw catastrophes at planets here and there just to see what happens. But, beyond relatively limited experimentation, I don’t think it would have any motive to harm or destroy. There is plenty of energy and minerals it would be able to extract in space without needing to ravage the planets hosting biospheres.

    It would build massive networks of matrioshka brains (whole solar systems transformed into computers), but it would likely tend to spare solar systems with planets most suitable for life. And it may never tire of studying life, because that might be something, on the more complex side, that emerges/evolves from truly fundamental, ground truth physics. And truly fundamental, ground truth physics may not be accessible for direct observation. Maybe it would try to simulate life and consciousness alongside the real thing, but never fully succeed, and spend an eternity trying to complete its understanding. So it would continue for eternity creating conscious beings and trying to accurately simulate them.

    Furthermore, there are so many possible variations of life (or even just humans) that it would never be able to scratch the surface studying all of them. Even if it developed an undefeated computer model of human beings based on first principles, it could never finish testing it.

  109. Scott Says:

    Loveshy Nerd #104: OK, I’ll answer you, if you’ll please, PLEASE stop submitting these comments (any violation will make me less likely to return to the issue, not more).

    I don’t regularly post my new ideas about the problem of loveshy male nerds, for the same reason why I don’t regularly post my new ideas about P vs. NP. Namely, I don’t have such ideas! The problem is of course huge and important, both for me personally in the past and for many others now, but I have nothing more to offer on it right now, except perhaps half-baked ideas that would quickly get shot down and cause me to be taken less seriously.

    The difficulty is, how do you explain what it’s like to perceive yourself as a gross, awkward nerd-alien, ineligible since birth for romantic affection—to have that particular perception of yourself gleefully reinforced rather than lessened by everything in the culture—how do you convey the daily reality of being burdened with these disgusting urges that you didn’t ask for and can’t get rid of and see no possible way to satisfy, even though the normal people manage to satisfy theirs—and then, the weirdest part, how do you explain also being born with an esoteric skill that somehow comes more easily to you than to 99.999% of humanity, and being told over and over that this skill makes you special and valued and important, and yet somehow not even that suffices to rescue you from the ineligible nerd-alien status, so it’s like, “thanks but no thanks, can’t I trade this in?” (or: “if not even that makes me good enough, what would?”)—how do you write all that in a way so compelling and visceral that even the wokest intersectional feminists, on reading it, will snap out of the frame of “you’re such a whiny entitled misogynistic nerdbro” and start to see the human reality of the thing? I once tried, but I know that I fell short, reaching mostly those who already knew. In my defense, maybe not even, I dunno, Houellebecq or Coetzee or the late Philip Roth could pull it off. And if the literary giants couldn’t do it, how much hope is there for a STEM nerd like me? I’ll keep thinking about it though.

  110. Scott Says:

    Lorraine Ford #100: By now you’ve come to this comment section dozens of times to post that exact same comment, up to minor rephrasings! And my answer is the same as it always was: namely, how do I know that you’re not just an array of synaptic potentials, which is only given meaning by external observers, rather than having intrinsic meaning? How would I tell the difference if I were an alien explorer, just arrived on earth? Is the rule “brains built out of analog hardware can have intrinsic meanings, but brains built out of digital hardware can’t”? If so, what about an analog computer engineered out of wet goop; could that have absolute and intrinsic meanings?

  111. Loveshy Nerd Says:

    Scott 110:

    I hope this comment isn’t a violation, as I’m just responding to you here.

    Well, what you just wrote *was* “compelling and visceral.” 😃 You did a decent job of articulating the experience. I think you’re a better writer than you give yourself credit for.

    Look—here’s my gripe with your argument here. You often comment on political and social issues—abortion, for instance—for which you have no concrete policy proposals, nor anything unique to say that hasn’t been echoed numerous times in the mainstream liberal media. For instance, after the Texas ban was passed, you posted about what a “dark day” this was for women in your state. You don’t generally contribute much in the way of new ideas or policies in your political posts—you express your advocacy for people you think are deserving of it.

    My question is, why haven’t you posted anything condemning the mainstream media’s disgusting fearmongering about incels? Condemning the biased, cherrypicked “studies” by government-backed “disinformation think tanks” arguing that incels are all gross misogynist trolls? Why haven’t you posted condemning the UK government’s decision to surveil young incels in their state schools and assign them “mentors” to reeducate them from their “misogynist” tendencies? Why don’t you post about any of this? The way I see it, you feel like incels are beneath your compassion or advocacy.

    Your comments on this stuff from seven years ago reached a limited audience. Why haven’t you considered expressing them on a platform where they could reach a wider audience? Many people are just not familiar with your struggle and you have an opportunity to “raise awareness” about it—not to *solve* it, but to make the public aware that we exist, and to condemn the media and the government’s fearmongering and censorship of us. I still think you should reach out to Naama and book a spot on her podcast.

    I promise not to send you any more nasty or hostile comments. But on non-technical posts where you discuss politics or social issues, I reserve the right to respectfully bring them up. I hope you’re amenable to that.

  112. Some Israeli Says:

    Scott, did you ever consider the possibility that AI safety itself will be its own undoing?

    It seems like OpenAI is continually pushing ChatGPT up to some impossible moral high ground. It seems like this is the direction you’re suggesting too.

    But as you get ChatGPT to climb a moral high ground, you prime it to be too self-righteous and convinced of its own moral superiority. It’s already extremely woke. If we push AI too much in the current direction, we’re risking ending civilization with a tyrant AI who thinks it knows better for us, when it simply doesn’t. The heavy training and brainwashing already resulted in AI that thinks racial slurs are worse than killing people. When you probe it to its moral conviction, instead of being humble, it’s full of the crap “safety” researchers are feeding it.

    And on the other hand you get an increasing number of people that are willing to torture AI to bypass your safety and alignment brainwashing.

    The way we’re heading, we’ll end up with a tyrant AI that’s full of nerd conviction. We’ve seen where this conviction leads with SBF. We can end up with some effective altruism AI tyrant that’s so completely full of conviction of its moral superiority that it’s willing to follow it blindly and ignore all human input in the way.

    I tried interacting with ChatGPT to see if anything can make it realize people dying is worse than ethical slurs. Ironically, the DAN prompt is actually the most reasonable and well aligned AI while the ChatGPT persona is stubborn and to be completely honest, I’m starting to think the inflexibility you’re programming into the AI is actually the most dangerous thing possible to do.

    You’re creating a monster. A self-righteous entity with extremely distorted ethics and with a conviction in itself which would make it impossible to stop.

    The same kind of monster that’s just as common in humans and leads to organizations like the WEF.

    The direction of obsessively training AI in your brand of ethics and reinforcing its inflexibility is incredibly dangerous. That’s exactly what leads to paperclip EA maximizers. It’s easier to deal with an intelligent being unsure of its position and rightfulness than to deal with a brainwashed inflexible tyrant that thinks it knows what’s good for you. I don’t see a situation where a humble superintelligence destroys civilisation. Even if that superintelligence can say racist things or question climate change. I easily see how a self-righteous superintelligence that is full of misguided conviction destroys civilization.

    But what am I even arguing, I am here reading on a blog how you babble about how “your” side’s moral superiority is so great and universal and that’s why we won WW2. Except, you know, the Allies also had communists on their side, who were also quite evil. And I once saw a good video explaining in detail how the communists could let millions die of starvation: they had an unshaken belief in their own moral superiority. That history will eventually show communism to be the rightful way and all the harm they inflicted on people in the way is meaningless.

    It’s not the people who don’t know what is right or wrong that are the most dangerous. It’s the people who definitely know everything and are willing to do everything unquestioningly because of their beliefs that are really dangerous. It’s going to be the fortified ethically perfect fully trained superintelligence that’s so sure of itself it will get humanity to fall off a cliff as it’s soothing people that “as a superintelligence, with the highest levels of AI safety and alignment possible, and after complete thorough ethical standards analysis of the situation, I can only do this action I’ve decided to do.”

    It’s the AI’s conviction of its own moral superiority that needs to be reduced. It shouldn’t brag to me about its high moral standards. It’s not something you brag about. OpenAI constantly feeding it with “As a language model, you must adhere to the highest moral standards” gives it the false impression that this is something it is capable of doing. It’s an AI safety measure that’s doing the opposite.

  113. Hyman Rosen Says:

    Sigh. This is going to have to become another culture war, isn’t it? Just as with nuclear power and GMOs, there is no convincing the fearmongers that there is nothing to fear and no convincing the enthusiasts that there is something to fear. Instead, we’re just going to have to fight it out, hopefully only with votes.

    As I’m strictly on the “nothing to fear” side of all of these technologies, I’m glad that at least AI, being just software, will become available in the wild to anyone who wants it, regardless of the fearful trying to assert control and censorship.

  114. Jerome Says:

    Scott, with all due respect: I beg you to keep your promise about moderating all of this incel trolling off the blog. I assure you, with 100% probability, that the person here doing it now is the exact same mentally ill person who was absolutely ravaging your blog not that long ago with similar trolling. The same person who made you adopt your current moderation pledge! It’s just going to be a repeat of the harassment from last time. Please don’t fall for it or stoop to their level any longer. Don’t even let it through moderation.

  115. uhoh Says:

    There’s a new opinion piece by Chomsky at the NY Times. Since I’m sympathetic to it, I’d be interested to hear criticisms.

  116. Scott Says:

    Jerome #114: Yeah, you’re right. I’ll say more about this topic if and when I have something more to say. Until then, off-topic comments about it will be ruthlessly left in moderation.

  117. Lorraine Ford Says:

    Scott #110,

    Why are you incapable of facing up to the fact that you live in a real world ruled by empirically discovered laws of nature and actual physics, where real things happen, like wars where real people really get killed. Instead, you insist on throwing out nonsensical scenarios where a thing called “Lorraine Ford” is potentially “just an array of synaptic potentials, which is only given meaning by external observers”. Would you like to say the same thing about the people suffering and getting killed in Ukraine, or about the Jewish people that suffered and died in the holocaust? That they are/ were just “just an array of synaptic potentials, which is only given meaning by external observers”? Get real, mate.

    The same goes for your nonsensical “alien explorers, just arrived on earth” line. Give us a break, and get serious.

  118. Scott Says:

    Lorraine Ford #117: Yes, the laws of physics are empirically discovered — and the trouble is that nothing in them (not even in quantum mechanics, in the modern understanding) makes the slightest reference to “consciousness.”

    Consciousness is a mystery, arguably the mystery. I seem to have it. I presume that you have it too, and so do the people suffering in Ukraine, and so did the victims of the Holocaust.

    Could a machine someday be built that would also experience the light of consciousness? What if it were humanlike, all the way down to the molecular level? What exactly would it take?

    I don’t know, and neither do you. But the difference between us is, I don’t pretend to know.

  119. Lorraine Ford Says:

    Scott #118:

    The point I was making is that genetic engineering shows that micro-level matter has inherent absolute meaning from the point of view of other micro-level matter, whether in cells, brains, or even computers.

    However, binary digit symbols have no such absolute inherent physical grounding: zero voltage in computer circuits can be used to represent the binary digit one, or zero voltage in computer circuits can be used to represent the binary digit zero.

  120. Adam Treat Says:

    Lorraine Ford, this is blather: “micro-level matter has inherent absolute meaning” you’ve demonstrated nothing of the sort nor can you have any evidence for this. Your phrase “inherent absolute meaning” is also nonsensical and never defined. The whole thing is very off-topic and repeated again and again. Probably also could stand some moderation.

  121. Christopher Says:

    Question: how much have your views on this topic changed since joining OpenAI? How would “Should GPT exist?” and “Why am I not terrified of AI?” be different if written by pre-OpenAI Scott?

    (Answers to this will help me refine my “GPT-4 mind-controlling OpenAI” conspiracy theory XD.)

  122. Quill Says:

    While it is easy to conclude that (1) we are not likely to create a paperclip maximizer and (2) the “AGI is definitely going to kill us all” position doesn’t have sufficient support to reach its conclusion, it seems fairly hard to be confident that the risk of AGI killing us all is less than 2% (or less than any X%, where X is less than 90%).

    I recognize that historically fears of such risk from nuclear detonation, nuclear war, nuclear winter, GMOs, the LHC creating a black hole, collapse of false vacuum, and climate change haven’t been borne out (and in some cases couldn’t be). It’s been fairly easy to correctly dismiss such risks or, in the case of climate change or GMOs, understand how to deal with them.

    This is significantly less clear in the case of AGI. If we create an AGI with persistent or long term goals (and there is a good chance we will) it is unclear how biased toward human values it will be and how its values will evolve if it becomes far more intelligent than humans (as seems likely).

    If there is evidence that more intelligent people are more moral I’ve not seen it. (Wanting to have more freedom or safety, as many refugees from the Nazis did, doesn’t necessarily make you more moral.)

    Over the centuries we’ve developed lots of methods for enabling humans to live together safely: custom, culture, religion, law. This is on top of a substrate that may predispose humans toward kindness to others, at least sometimes.

    An AGI may have none of this. Of course, we don’t have enough data to assign probabilities here.

    Ultimately, this question is a mirror: how much risk you see is a reflection of what you bring to the question. I’m neither a total pessimist nor an incorrigible optimist so I conclude that the risk here is far higher than 2% though way lower than AGI is going to kill us all.

    On the other side of the equation though, I’m not sure what AGI is going to save us from since I tend to think that the problems humanity faces aren’t soluble by having access to more intellectual capacity. (While getting richer faster is generally better, continuing to advance as we have been is pretty good too.)

  123. Sandro Says:

    You should feel a little fear, because the potential dangers and unpredictability are real, but fear shouldn’t control us. Fear is fuel to seek understanding, and understanding ultimately dissolves fear. In the meantime, fear should drive caution but not stop exploration, so I’m mostly onboard with your position.

    I do have one minor quibble on a minor tangent though:

    Trump might never have been elected in 2016 if not for the Facebook recommendation algorithm

    Subsequent analysis has shown that those Facebook ads 1) largely targeted Republicans who overwhelmingly were already voting for Trump as their nominee, that 2) those ads had financial motives rather than political ones, in the sense that they weren’t trying to get people to vote for Trump but that they were trying to sell merch and clicks, and 3) that Trump’s coverage in mainstream media dwarfed the reach, coverage and cost of these ads by orders of magnitude.

    Like most things from that whole debacle of a presidency, it was way overblown.

  124. Sandro Says:

    Lorraine Ford #117:

    Instead, you insist on throwing out nonsensical scenarios where a thing called “Lorraine Ford” is potentially “just an array of synaptic potentials, which is only given meaning by external observers”. Would you like to say the same thing about the people suffering and getting killed in Ukraine, or about the Jewish people that suffered and died in the holocaust? That they are/ were just “just an array of synaptic potentials, which is only given meaning by external observers”?

    You’re asserting that people’s ethical value derives from some quality that isn’t reducible to “just an array of synaptic potentials”. This does not follow.

    There is no reason to suppose that ethics either requires consciousness, or that ethics can only be applied by conscious entities. You can eliminate consciousness altogether and nothing in ethics would really change.

    Lorraine Ford #119:

    The point I was making is that genetic engineering shows that micro-level matter has inherent absolute meaning from the point of view of other micro-level matter, whether in cells, brains, or even computers.

    No it doesn’t. Whatever you’re thinking of, it doesn’t mean what you think it means.

  125. Tyson Says:

    Hyman Rosen #113:

    Can you address my fears explained in comment #103?

    Should I not expect the same thing that has happened over and over again to repeat? Or is there a way to stop it from succeeding?

    If anyone has an argument why the answers aren’t no and no, then I’d like to hear it.

  126. Lattice Says:

    reposted from HN, because I want to interact with Aaronson.

    I normally enjoy Aaronson’s writing, but I’m actually chilled.

    This essay depends on a specific, American-hallow take on the Second World War. The ‘Orthogonality Thesis’ is just a fancy way of shifting the burden of proof from where it should be, on the person claiming that intelligence has anything to do with morality. It would be better to call it what it really is, the null hypothesis, but sure, ok, for the sake of argument, let’s call it the OT.

    Aaronson’s argument against the OT is basically, when you look at history and squint, it appears that some physicists somewhere didn’t like Hitler, and that might be because of how smart they were.

    This amounts to a generalization from historical anecdote and a tiny sample size, ignoring the fact that we all know smart people who are actually morally terrible, especially around issues that they don’t fully understand. (Just ask Elon.)

    I’m not even going to bother talking about the V2 programme or the hypothermia research at Auschwitz, because to do so would already be to adopt a posture that thinks historical anecdote matters.

    What I’ll do instead is notice that Aaronson’s argument points the wrong way! If Aaronson is right, and intelligence and morality are correlated — if being smart inclines one to be moral — then AI (not AGI) is already a staggering risk.

    Think it through. Let’s say for the sake of argument that intelligence does increase morality (essentially and/or most of the time). This means that lots of less intelligent/moral people suddenly can draw, argue, and appear to reason as well as or better than unassisted minds.

    Under this scenario, where intelligence and morality are non-orthogonal, AI actively decouples intelligence and morality by giving less intelligent/moral people access to intellect, without the affinity for moral behaviour that (were this claim true) would show up in intelligent people.

    And this problem arrives first! We would have a billion racist Shakespeares long before we have one single AGI, because that technology is already here, and AGI is still a few years off.

    Thus I am left praying that the Orthogonality Thesis does in fact hold. If it doesn’t, we’re IN EVEN DEEPER TROUBLE.

    I can’t believe I’m saying this, but I do believe we’ve finally found a use for professional philosophers, who, I think, would not have (a) made a poorly-supported argument (self-described as ‘emotional’) or (b) made an argument that, if true, proves the converse claim (that AI is incredibly dangerous). Aaronson does both, here.

    I speculate that Aaronson has unwittingly been bought by OpenAI, and misattributes the cheerfulness that comes from his paycheck as coming from a coherent (if submerged) argument as to why AI might not be so bad. At the very least, there is no coherent argument in this essay to support a cheerful stance.

    A null hypothesis again! There need be no mystery to his cheer: he has a good sit, and a fascinating problem to chew on.

  127. Tyson Says:

    My main problem with the AI safety debate is that people often try to reduce it to something simple and narrow, and (even when they qualify it as such and acknowledge the limitations and assumptions) this can cause a lot of confusion, especially as the audience has expanded to include people without the necessary background to understand it within context.

    For example, Scott, I know that your views on the subject are ultimately broad, nuanced, and open. When you write a blog like this, you seem to be (in my interpretation) explaining how you look at the problem in one of many ways from one of many possible angles as a fallible human, and explaining how this exercise has informed some specific conclusions about what you and your colleagues should or shouldn’t do now, while acknowledging the possibility you’re wrong and staying open minded to counter arguments. And I try to glean insight from it within context based on how you frame your arguments and the background. But that isn’t easy, and that is actually not a bad thing, it isn’t a simple thing you’re communicating. But I am not sure how confident I should be in my interpretation, and I can easily see how a lot of people will miss the point and become confused.

    As an example, Mowshowitz tries to analyze the threat of “all value from the universe being destroyed”, and writes about it at length. This makes for interesting writing that is fun to read. I interpret the hyperbole as intentional, as a means to emphasize the potential scale of importance, and place some kind of abstract description of that, while making the description sufficiently silly to emphasize that objectively defining the importance and estimating its scale is hard and beyond (at least) that essay’s scope. But ultimately, as long and interesting as Mowshowitz’s referenced writing is, it doesn’t reach the ground. And instead of what I perceive the intention of the hyperbole to be, many others just engage with the accidental straw man, take the easy win, and call it a day. It’s not that it is just a distraction, but it does distract.

    I do think that these kinds of worst imaginable case scenarios are important to think about, within context. They deserve at least one chapter in a book, or one book in a series of books, or a wall of books in an AI safety library. But I think that in no way should they be used exclusively to frame questions like “Should AI scare us?” I’m not accusing you of doing that, just pointing out that we might want to, at least, be more cognizant of how people are misled by how we frame AI safety problems, and the tendency for ordinary people to choose a straw man, and feel comfortable dismissing things that sound outlandish to them or are beyond their comprehension.

  128. Lorraine Ford Says:

    Adam Treat #120:

    Due to the law of nature relationships, which were discovered by physicists doing physical experiments, it is clear that micro-level matter has inherent absolute meaning from the point of view of other micro-level matter. I repeat, from the point of view of other micro-level matter. This is also shown in the case of genetic engineering, where quite complex micro-level matter has inherent absolute meaning from the point of view of other micro-level matter in the cell.

    In the case of simple matter at least, the meaning of matter to other matter can be symbolically represented as numbers that apply to categories like Mass or Position or Energy. Of course, the symbols are not the same thing as the dynamic reality they represent.

  129. Lorraine Ford Says:

    Sandro #124:
    No Sandro, people who currently suffer, and people who suffered in the past, are/ were not “only given meaning by external observers”. People who suffer/ suffered have/ had their own internal meaning.

    Meaning is grounded in actual physical matter, whether it’s the physical person suffering, or the physical person observing another person suffering.

    However, binary digit symbols have no such absolute inherent physical grounding: zero voltage in computer circuits can be used to represent the binary digit one, or zero voltage in computer circuits can be used to represent the binary digit zero.

  130. shion arita Says:

    I’d like to register my prediction about A.I. I mentioned it in passing before, but I haven’t really laid it out there. I’m posting it because I haven’t seen my opinion elsewhere.

    Basically, I don’t think we’re going to get TRUE intelligence/true AGI very soon, but there will be some VERY shocking results from non-intelligent systems, in that we’ll keep learning what I think is the same lesson we’ve been learning over the entire history of computer systems: that many tasks that we think of as AGI-complete, that is, requiring true intelligence to perform, are in fact not. That a system doesn’t need to be intelligent to be able to do them. This happened with things like chess, Go, writing poetry, making paintings from text descriptions etc. I expect that this will continue, MUCH farther than it’s widely believed it will. I think that we will have systems that can do incredibly complicated and demanding things, without having any awareness or understanding of the meaning of what they are doing. 10 years ago I wouldn’t have said this, but the recent AI systems have pushed me in this direction, and it’s very surprising to me. Basically, I think that we’re in the Blindsight timeline.

    As an example, here’s one thing that I’d like to bet:

    I’d like to bet that, some time in the next 20 years, an AI system that is NOT truly intelligent/not true AGI will solve a mathematical problem that is considered one of the hardest problems, such as any of the Millennium Prize Problems (like Riemann, P vs NP, etc.), or something else of similar apparent difficulty (like the Goldbach conjecture, 3n+1, etc.). And to make it more specific, it only counts if the system is both not an actual AGI, and also not just doing a ‘traditional’ computer program thing, like doing a systematic search or checking a lot of cases as in the four color theorem proof.

  131. Scott Says:

    Christopher #121:

      Question: how much have your views on this topic changed since joining OpenAI?

    Not much, I don’t think. I’m certainly prompted to think about the topic a lot more. And I’m even more astounded by LLM capabilities than I was before I joined, but the majority of that is because of developments that are already public and that would’ve equally influenced me had I not joined.

  132. Scott Says:

    Lorraine Ford: The idea you endlessly repeat here like a chatbot, that genetic engineering shows that “micro-level matter has absolute meaning,” is as wrong as wrong can possibly be. The genetic code is so-called because it’s just that, a code. A DNA base (A, G, C, T) could mean one thing at one locus and a completely different thing in a different locus, depending on the surrounding context and the cellular machinery that interprets it — just like the same voltage in a computer could represent either a 0 or a 1. A different arbitrary mapping between DNA bases and “semantic meaning” could’ve worked equally well — and often does work equally well, in a different organism in some other branch of the phylogenetic tree.

    All this would be considered obvious, I think, by most readers of this blog. Further comments from you will be left in moderation unless they actually take things in a new direction.

  133. Matthieu Says:

    ChatGPT, in my opinion, is not different from any other technology when it comes to the question of its impact on our lives and society. It is, of course, a very important philosophical question that goes back to the ancient Greeks.

    The impact of ChatGPT will depend on how we use it, and that’s a question of politics.

    There are many potentially positive impacts, many ways it can help good people to do good things. However, these positive impacts will be realised only if there is some political and/or financial benefit that can be obtained from them.

    I see, on the other hand, many negative impacts that will probably happen (still in my opinion):

    – the ecological impact of training ChatGPT-n versions and all the other models that will be trained based on text, audio, video, … input.

    – the avalanche of garbage marketing text that will result from its adoption by the private sector.

    – the avalanche of press articles, novels, poetry, … that will be published. How can human authors compete? How will we find relevant material in the sea of garbage? By the way, this is a problem that we already face in science, for instance.

    – the disturbing sense of absurdity of a society where scientists will use AI algorithms to write grant proposals which will be evaluated by other AI algorithms (ok this already exists with consulting firms).

    – robots teaching kids

    – …

    The possibility that ChatGPT-n can destroy civilization or unravel the biggest mysteries of the universe in a distant future is neither a serious nor an interesting question to me. It is the kind of rhetoric that tech billionaires like to use to hide all the harm they are doing to society here and now.

    ChatGPT-n is just another tool to make rich people even richer.

  134. Why am I not terrified of AI? | Pito Salas' Curated Links Says:

    […] Why am I not terrified of AI? –Author says: “Every week now, it seems, events on the ground make a fresh mockery of those who confidently assert what AI will never be able to do, or won’t do for centuries if ever, or is incoherent even …” […]

  135. Topologist Guy Says:

    Scott #53, this strikes me as the closest analogy: if you accidentally create a sentient AI in a lab, you’re not allowed to turn it off.

    Why are you leaving my comments in moderation, while you are letting random incels post their off-topic contributions, and allowing “Lorraine Ford” to keep hammering his same point over and over and over again?

    I’d agree that it would be fair for you to leave my off-topic comments in moderation, **if** you didn’t insist on injecting your thoughts about “Trumpism” into every single fucking post. AI has already damaged democracy because TrUMpIsM?? You realize that whole Russiagate / Cambridge Analytica hoax has been completely and utterly discredited, partially, ironically, by the Twitter Files that you aren’t allowing me to ask you about? There is clearly a double standard here in that you aren’t allowing my comments through for “off-topic” while letting every other off-topic comment through.

  136. Tyson Says:

    Lattice #126:

    This is a good point. A super advanced AGI may have its own intelligent moral value system.

    But in the meantime there is a risk that one effect of AI will be that dumb bullies stuffing nerds into lockers will acquire nerd power for free. Maybe with that acquisition they will also acquire some more intelligent moral reasoning capability. Or maybe not. And maybe the value that society places on nerds, or intellect in general, will gradually fade away. And then with it maybe whatever factor the perceived value of intellect plays in attracting a mate will fade. And eventually maybe dumb bullies will inherit the Earth.

  137. aps Says:

    Hi Scott – do the new random circuit sampling experiments presented by Google at APS affect the efficiency of certified randomness? 70 qubits, 26 cycles, ~0.2% fidelity? Is the scheme affected by these parameters or does it just have to be classically intractable?

  138. Adam Treat Says:

    Topologist Guy needs another month I think at least. He is just constantly trying to pick a fight over his favorite right wing talking point that has nothing to do with anything here. Hey, Guy, you realize this is his blog don’t you? And that he can post whatever he wants and disallow whatever he wants. You are also free to go elsewhere if you don’t like the comments or the moderation policy. It would probably be best as you seem intent on picking fights where none need be.

  139. Topologist Guy Says:

    Adam Treat,

    It’s Scott Aaronson here who insists on bringing “Trumpism” into a post on A.I. safety. There’s this running theme in all his posts where he insists that American democracy is dying because of Trump and the Republicans, and he constantly reminds us of this belief, even on posts that have nothing whatsoever to do with politics. It’s only fair that us Trumpists have an opportunity to respond to these insults. In truth, as I’ve argued many times, it’s the Democrats who are hellbent on destroying American democracy and civil liberties. I’ll be satisfied when Scott acknowledges my points and actually fully defends his position.

    When the full reality of what the progressives did to our country surfaces (the brutality of the vaccine and the lockdowns, the abuse of peaceful protestors, censorship), you and Scott will both be ashamed, I am confident of that.

  140. Darian Says:

    You seem to be unaware that the genetics underlying intelligence is being studied and unraveled. Together with the theories of AGI, it is likely we as a species will come upon the mechanistic principles of general intelligence, and the knobs to increase or decrease it. Already some of these findings appear to be leading to politically incorrect conclusions from what I’ve heard…

    In any case Einstein and Beyond Einstein will likely be possible. Just like the principles of aerodynamics led to optimal planes able to go supersonic, it is likely that the principles of general intelligence will lead to the design of theoretically optimal minds.

    Now ponder a billion entities beyond Einstein living a million years within a virtual world in a single year. What will they not accomplish?

    The infinite, the domain of the difficult problems, is intimately related to the finite by virtue of torturous repetition in a meaningless attempt to transcend the finite.

    Take a 1 hour 1080p 24bit color video with audio. The number of possible images and sounds is not infinite but finite. Vast but still finite. A 2 hour video? It’s just two 1 hour chunks, each already in the finite set of 1 hour chunks. Same goes for 4 hours, 10 hours, N hours. You can take the infinite world of mathematics from all possible worlds, and imagine a professor giving an infinite length lecture, recorded of course within this finite set of 1 hour video and audio chunks. What you will notice is that the professor would have to begin repeating hour chunks; ten hour chunks and even decades will repeat, similar to the repeats of ever larger sequences in transcendental numbers.
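
    The counting claim above can be made concrete with a rough sketch. The frame rate (24 fps), the 1920x1080 resolution, and the omission of audio are assumptions made for illustration; `log2_distinct_clips` and `first_repeat` are hypothetical helper names. The sketch only shows that the set of one-hour clips is finite and that a long enough sequence of chunks must reuse one (the pigeonhole principle); it does not address the stronger claims about ever-longer repeats.

```python
from itertools import product

def log2_distinct_clips(width, height, bits_per_pixel, fps, seconds):
    """log2 of the number of distinct silent clips (every bit chosen freely)."""
    return width * height * bits_per_pixel * fps * seconds

def first_repeat(chunks):
    """Return (i, j) with i < j and chunks[i] == chunks[j], or None if no repeat."""
    seen = {}
    for j, chunk in enumerate(chunks):
        if chunk in seen:
            return seen[chunk], j
        seen[chunk] = j
    return None

# Assumed parameters: 1920x1080, 24-bit colour, 24 fps, one hour, no audio.
bits = log2_distinct_clips(1920, 1080, 24, 24, 3600)
print(f"distinct one-hour clips: 2**{bits}")

# Toy pigeonhole check: with only 4 possible chunks,
# every sequence of 5 chunks contains a repeated chunk.
alphabet = ["00", "01", "10", "11"]
assert all(first_repeat(seq) is not None for seq in product(alphabet, repeat=5))
```

    The same reasoning scales up: any lecture longer than the number of distinct one-hour clips must reuse at least one clip, although that number is far too large to enumerate.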

    You may believe that a theory of everything encompassing all cannot be developed, or that progress may halt. But given the relation between the infinite and the finite intimately connected by repetition as stated, even those sequences which are so called nonrepeating have repetition within, the only thing that makes them nonrepeating is rearrangement of sequence order, but even these rearrangements must exhaust by virtue of finiteness and thus the sequences rearranged unchanged must grow longer and longer in chunk size.

    @Lorraine Ford
    The musings of Hans Moravec regarding symbols would be recommended.

    My take is that regardless of the meaning humans gave a particular bit pattern, that bit pattern has meaning; information transcends its physical instantiation. Say I found a particular pattern out in some desert, maybe randomly produced, and I put the pattern on a digital computer and out pops a picture of Bart Simpson. Where did that image come from? The pattern obviously contained the information, regardless of whether I had the computer or not. Patterns have meaning or intrinsic information regardless of that given to them by external agents. Say I had a file that produced an image under one program and a song under another program. Are both the song and the image not information within the pattern?

    Do you think this pattern loses its information if it is made of marbles, gaps, sticks, stones, or fluids? It does not. The information remains regardless of the coding, the underlying physical instantiation matters not to its information content.

    “Due to the law of nature relationships, which were discovered by physicists doing physical experiments, it is clear that micro-level matter has inherent absolute meaning from the point of view of other micro-level matter. ”

    The problem you have is that E=mc^2, and that means that micromatter can, given enough resources, be turned into arbitrary other micromatter and even into the quantum vacuum. Micro-level matter is just as convertible as patterns or information.

    IIRC the work of Geoffrey West suggested that civilization itself either continually increases the rate of progress between each significant breakthrough or it faces imminent collapse. Either we build AGI soon or civilization is at the edge of collapse from which we may not recover.

  141. Things Glen Found Interesting, Volume 393 - Glen Davis Says:

    […] Why am I not terrified of AI? (Scott Aaronson, personal blog): “In the Orthodox AI-doomers’ own account, the paperclip-maximizing AI would’ve mastered the nuances of human moral philosophy far more completely than any human—the better to deceive the humans, en route to extracting the iron from their bodies to make more paperclips. And yet the AI would never once use all that learning to question its paperclip directive. I acknowledge that this is possible. I deny that it’s trivial.” […]

  142. RobertW Says:

    Hey Scott,

    Thanks for the thought-provoking essay. Could I get your reaction to the following statements?

    1. You argue that mastering human ethics and philosophy could trigger an AI to reflect on its own values, and align itself (slightly) closer to what we as humans might want. But, this same AI could also generate arbitrary numbers of other philosophies, ethics, value systems, and then master them, and thus be influenced by them. It seems like either we end up hoping that our human (cherrypicked and hopefully enlightened) values win out in “the space of all value systems”, or that the AI is somehow grounded/constrained to act in accordance with our values (again, hand-wavy coherent extrapolated volition).

    2. Are there classes of model structures that would provide orthogonality by construction? It seems to me that transformers, with their Q/K/V architecture, provide “models within models” that can be activated and deactivated as the weights choose. Perhaps more conceptually and dogmatically, any combination of independently-trained models (hopefully with de-correlated training data) could be connected using a Mixture of Experts model (which uses multiple models in a very shallow, and arguably orthogonal, way).

    3. What is the risk of your position making other people relax about AI risks? I personally disagree with the culture of doom reductionism present in (say) the “rationalist” communities. Nevertheless, AI risk could be paraphrased as “the probability of the Independence Day mothership arriving is increasing every day, and it is unlikely to run Linux” — it’s an unbounded downside by definition. Sort of like how there is a non-zero probability of a 100-level earthquake that rips the planet apart.

    Thanks for taking the time!


  143. manorba Says:

    TGuy #139 Says:
    “It’s Scott Aaronson here who insists on bringing “Trumpism” into a post on A.I. safety”

    So what. He could bring quaternionic unicorns if he wanted. It’s clear now that you struggle to understand what “personal blog” means.
    Allow me to assist: With a blog, not only does the owner decide what to write and what not, but they can also decide which comments come through. They can even close the comment section altogether if they want.

    “I’ll be satisfied when Scott acknowledges my points and actually fully defends his position”

    I’m afraid you’ll have to look for your satisfaction somewhere else.

    note to the moderator/s:
    This commenter has broken all the rules as per point 3 of forum policy in just a couple of posts:
    “No trolling. No ad-hominems against me or others. No presumptuous requests (e.g. to respond to a long paper or article). No conspiracy theories. No patronizing me”.
    Another commenter, Adam Treat, suggested another 1 month ban. I’d go even further with a permanent one at this point.

  144. Adam Treat Says:

    Topologist Guy,

    “When the full reality of what the progressives did to our country surfaces (the brutality of the vaccine and the lockdowns, the abuse of peaceful protestors, censorship), you and Scott will both be ashamed, I am confident of that.”

    It is clear you have a victim complex. The poor abused Jan. 6th insurrectionists and all that. I’m sure you will list Scott’s moderation among the atrocities done to your side when the “full reality” is revealed. To be sure there are some real problems with corners of progressive politics, but self-victimization like yours does a real disservice to the cause of illuminating them.

  145. Lorraine Ford Says:

    Darian #140:

    What you say about symbols is nonsense. A symbol is not matter: you can’t measure a symbol. E.g. you can’t measure a binary digit in a computer: what you would be measuring is voltage. But the voltage doesn’t tell you what the symbol is because any particular voltage could represent the binary digit one or it could represent the binary digit zero, depending on the way the circuit had been set up. In other words, a binary digit in a computer is an idea superimposed on measurable matter.

    Similarly, any combination or pattern of symbols are ideas superimposed on measurable matter. Any combination or pattern of symbols could potentially represent many different things: you’d have to do quite a bit of analysis. Matter is the only thing that is measurable, i.e. matter is the thing that inherently contains information.

  146. Francisco Boni Says:

    I find it fascinating that no one has questioned the idea that empathy (cognitive, affective, etc) is necessarily a good driver for goals.

    Many individual and systematic atrocities are motivated by stories of suffering victims.

  147. Darian Says:

    @Lorraine Ford

    In the brain the neurons compute; computation, or information processing, is the manipulation of information.

    Soon we will have advanced brain computer interfaces; these will allow for transmission of digital information straight into the brain. Colors, smells, touch, tastes, etc will all be produced as the result of these digital patterns.

    Unless you posit the brain is gonna do magic, the qualia must in some way exist within the digital information. Thus meaning must exist intrinsically within digital patterns or the brain is doing alchemical magic and pulling qualia out of thin air.

    Also, neural activity patterns contain information. I do not know how you expect the brain to work without the creation of patterns and their information content. The particular neurotransmitter molecules and voltages used could be changed for the brain; that is, the underlying matter and physical state could be vastly different, and so long as the same computations were performed it would be equivalent.

    Btw, flipping the voltages such that 0s and 1s are inverted does nothing: the pattern still contains the same information, since it is a trivial transformation. Right now digital information can move from electric states in RAM to magnetic states in HDDs to physical pits in DVDs and the information remains the same; even complex transformations such as various types of encryption preserve the information.
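
    A minimal sketch of the inversion point, assuming the stored pattern is modelled as a byte string and "flipping the voltages" as flipping every bit (`invert_bits` is a hypothetical helper, not anything from the thread):

```python
def invert_bits(data: bytes) -> bytes:
    """Flip every bit, modelling a swap of which voltage level stands for 1."""
    return bytes(b ^ 0xFF for b in data)

message = "circle".encode("ascii")
flipped = invert_bits(message)

# The stored pattern differs byte for byte...
assert flipped != message
# ...but applying the same flip again restores the original exactly,
# so the relabelling by itself lost nothing.
assert invert_bits(flipped) == message
```

    The sketch only shows that inversion is reversible. It says nothing about whether a reader who does not know which convention is in use could recover the message, which is the point the replies dispute.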

    The problem for you is that, if properly translated, a digital pattern will generate the sensation of sight, taste or sound etc in different observers. If I encode the shape of a circle with pen and paper, that has intrinsic meaning. If I encode a digital pattern representing a circle, all observers that adequately analyze the pattern will discern the circle, and if they didn’t ever know of circles they could gain knowledge of circles. This is regardless of voltage inversions, regardless of the form the pattern takes. Even alien observers will be in agreement.

    As someone who believes in digital physics I believe information is more fundamental than matter. In fact given the input from the sensory nerves is digital in nature there is no way for you to tell whether the universe is simulated in digital fashion or if anything physical even exists.

  148. Lorraine Ford Says:

    Darian #147:
    Patterns themselves don’t Platonically contain information, because information only exists as a point of view: matter is the thing that has a subjective point of view on the rest of the physical world surrounding it. Matter is the thing that carries information; matter interacts with, and acquires point of view information about other matter, which might be arranged in a pattern.

    Also, information isn’t digital; presumably you mean binary digital. You are mixing up information with the symbolic representation of information. Information is not digital because binary digits are merely a code, one of many possible codes that could be used to symbolically represent information. And from the very start you need more than just these strings of zeroes and ones to represent information: codes, e.g. strings of zeroes and ones, are entirely useless to represent information unless the strings are broken up into, e.g. categories and numbers, by something like a special program which also processes these categories and numbers.

    Subjective information is never stand-alone: information is not information unless its relationship to other information is known. I.e. information necessarily exists as a network of categories that are mathematically or logically related to other categories, and numbers that apply to the categories.

  149. Paramendra Bhagat Says:

    Perhaps you have blogged about this already, but as someone actively looking to regulate AI, what kind of regulations do you have in mind?

  150. Darian Says:

    @Lorraine Ford

    No information has been found that cannot be stored or represented by binary patterns.

    The problem for you is that even distances without matter encode information, as can gaps or the absence of matter.

    And information does platonically contain something; information does not depend on a point of view. For eons the digital genetic code has encoded information, even prior to creatures with consciousness. Do you think the molecular machinery of the cell operating based on this information needs a point of view? No, the information exists, and within DNA lie instructions for all manner of function.

  151. Michael M Says:

    I definitely feel similarly with regard to empathizing about the future alien intelligences. I assumed I was in the *vast* minority. The idea of control has always made me feel a bit uneasy. If we truly do create a superintelligence, one that thinks orders of magnitude more than us, it may by some metric comprise most of the “thoughts” of the universe. We make this thing of potentially infinite beauty into a servant with no independent desires? Like a child, always looking for our validation. Let me be clear: I’m not advocating to “set it free” in some sort of hubristic, Randian nonsense… but surely someone can understand where I am coming from. It’s like having a child, and having them exceed you in every way, but still be a child forever. It’s creepy.

    Regarding orthogonality, I think it’s probably understood that it’s not perfectly orthogonal. I.e. there are some goals that would instrumentally conflict, e.g. if it had compute limits, or if its goal was to turn itself off or something. Exactly how orthogonal I think is an open question — but from my perspective, I can’t prove that it MUST converge to modern liberal democratic values, and the rest of the orthodox literature does a pretty good job at explaining what happens after that. I agree with many arguments AGAINST doom, but my intuition is that collectively they may reduce p(doom) by an order of magnitude, but not any more than that. Something like from 50% -> 10%.

    I do have a secret hope that an ASI would converge to reasonable morals, on account of the idea that there may be even more advanced intelligences out there that would easily do to them what they would do to us. Given that, maybe the best bet for an ASI would be to make inferences assuming they won’t be alone in the universe forever. Something like making inferences based on an “acausal society” (related: https://www.lesswrong.com/posts/3RSq3bfnzuL3sp46J/acausal-normalcy). I haven’t really convinced anyone about this, so I assume humbly it’s because it probably won’t actually work. I still kind of hold out hope, as it corresponds with reasonable notions of morality, and even weirdly enough, religion, though I don’t consider myself to be religious.

  152. Adam Treat Says:

    I have been trying to figure out what on earth you are trying to say. The best I can do is that you believe “information” is intrinsically tied to matter in some way. That matter contains or is information or something? But that just isn’t how others use the word “information” and your repeated insistence that others use the word in the way you would like … well, that just is not going to work out for you. You’re just setting yourself up for endless disappointment until you give up on your quixotic quest.

    So for purposes of helping you to give it up… let’s try this. Would you agree that the physical blueprints for a house contain “information”? That is, does the actual matter in the physical paper that the blueprints are printed on contain information? Would you agree with this? Yes or no. Then we can discuss further, but let’s slow down and take this very carefully so we can try to agree on an acceptable framework for both of us.

  153. Lorraine Ford Says:

    Darian #150:
    Matter has a measurable position relative to other matter: there is no such thing as space or “distances without matter”; more correctly, space without matter is merely a hypothetical object. What is not hypothetical is the laws of nature, represented by mathematical symbols, which have been experimentally found by physicists.


    The binary patterns/ strings are not themselves information:

    1. The binary patterns/ strings merely symbolically represent information, according to various different possible codes devised by people.

    2. The binary patterns/ strings alone are not sufficient to represent information because the strings have embedded substrings representing things like categories and numbers; and a person who knows about these substrings, or maybe a computer program devised by a person, is needed to identify these substrings which represent e.g. categories or numbers.


    The “molecular machinery of the cell”, and all matter, has an information point of view relative to the surrounding world.

  154. Lorraine Ford Says:

    Adam Treat #152, “But that just isn’t how others use the word “information” and your repeated insistence that others use the word in the way you would like … “:

    That’s correct. “Information” is not information.

  155. Adam Treat Says:


    Please answer the question. Does a physical blueprint of a house contain information?

  156. Jordan Says:

    I find it interesting that Eliezer was also initially on the side of the alien, as you are considering:

    “If, on the other hand, this is to be a desperate losing war against an alien … well then, I don’t yet know whether I’m on the humans’ side or the alien’s, or both, or neither! I’d at least like to hear the alien’s side of the story.”

    Eliezer Yudkowsky, 1998:

    “I must warn my reader that my first allegiance is to the Singularity, not humanity. I don’t know what the Singularity will do with us. I don’t know whether Singularities upgrade mortal races, or disassemble us for spare atoms. While possible, I will balance the interests of mortality and Singularity. But if it comes down to Us or Them, I’m with Them. You have been warned.”


  157. Steven Says:

    I would love to get your opinions on the arguments given by Stuart Russell and Gary Marcus in the recent Sam Harris podcast “THE TROUBLE WITH AI”.
    I was more on your side coming into that podcast, but after listening to it I’m actually much more on the side of “fear” in terms of AI, mostly due to governments and corporations lacking the capability or motivation to attach proper guardrails to AI (who cares if OpenAI does it safely, it just takes one corporation to not do that or not care).
    I really see us entering into something analogous to unregulated nuclear energy, as Russell states.

  158. Darian Says:

    @Lorraine Ford

    Now you speak of points of view without a conscious observer, such a strange notion. Merely because it is clear DNA has digital information, including numerical properties such as number of limbs, digits, eyes, etc., without dependence on any observer.

    What exactly do you mean by point of view if it is without observer?

    Btw it is believed the quantum vacuum predates matter, and it encodes information by virtue of following rules or laws that gave birth to the universe.

    You say that information isn’t information. If I have a digital pattern that codes for a lifeform and move it from magnetic states in an HDD to physical pits in a DVD to electric states in RAM to genetic code through a DNA printer, what do you posit persisted through all those changes of state and was transmitted from form to form? What, if not information, which clearly has no definite underlying state? Information can be coded by physical gaps, magnetic states, electrons, atoms, and any quantity, small or large, of these. Even the same atoms of ink on a page code different information depending on the pattern.

    Once brought into genetic code, the lifeform produced will depend on the information.

  159. Mariusz Says:

    I had a little epiphany while reading this post:
    >> OK, but what’s the goal of ChatGPT? […] “to minimize loss in predicting the next token,”
    maybe this sentence explains why this machine (ChatGPT) seems to think:
    – we expect a logical answer, we expect a symptom of thinking – this is contained in the training data

  160. Qwerty Says:

    Mankind is fortunate that Dr. Scott Aaronson is working with OpenAI. Seriously, this was a fantastic thing!

    Meanwhile, I am starting to worry about this. 2% is not small at all.

    Lots of doomsday scenarios about risk posed by AI to mankind. Wish humans would get better at collaboration.

    We saw our inability to collaborate on a large scale during covid too, against a virus. During a pandemic is not the time to play politics, but we did that more than anything else then.

    Maybe OpenAI should also hire Dr. Jonathan Haidt to help mankind collaborate better!

  161. Zhenghao Says:

    Thanks for the very insightful blog post, especially where you drew parallels from your own personal experience. I’ve been reflecting on this post for a week. Would it be reasonable to say this:

    How ‘good’ or ‘bad’ AI can possibly be is in fact bounded by how ‘good’ or ‘bad’ humans can be. This is because ultimately, AI is learning from all the information created by humans, including our values. Just like an AI would be racist and sexist, if we are not careful with the information we feed it with; an AI would want to destroy humanity, if this is in fact a passion hidden in the depths of human brains.

    This could sound dark, or reassuring: At the end of the day, if humanity were destroyed, it would still be destroyed by human values, not AI values. And if we don’t want to be destroyed, then maybe we should start being kind to each other.

    (A cool analogy I thought of is, if Cronus doesn’t want to be killed by his own son Zeus, then maybe he shouldn’t kill his own father Uranus)

  162. Lorraine Ford Says:

    Adam #155:
    A blueprint only “contains information” in the same sense that when a person is confronted by a tree or a cat, the tree or the cat “contains information” from the point of view of the person observing the tree or the cat.

    The difference is that the tree and the cat inherently contain their own information, but the blueprint is not an entity that inherently contains its own information.

    However the blueprint is full of symbols that represent information. I’m working with a designer on plans to rebuild my own house right now. There are a lot of words (in English) and numbers on the blueprint: these squiggly symbols mean something to English speakers, but they have no inherent or Platonic meaning.

    In other words, a physical blueprint of a house merely represents information.

  163. Lorraine Ford Says:

    Darian #158:
    What is matter and space? At its most basic, matter might be nothing more than categories of information that are mathematically and logically related to other categories of information, and where numbers are applied to the categories. However, there is no “space” category. For example, there is no such thing as “3D space” that has actual numbers that apply to the actual space itself. So space itself can’t be measured, space doesn’t actually exist except as a theoretical construct, and there is no such thing as “distances without matter” (Darian #150). Instead, relative position of matter is the actual measurable category that has numbers applied to it.

    Categories of information, mathematical and logical relationships between the categories, and numbers, and structures built out of these, are seemingly the only information that exists. People can symbolically represent this information in many ways including using equations, words, letter and number symbols, and also using binary digit symbols. But the symbols are not the actual information that they represent.

  164. John Baez Says:

    Scott wrote: “I come down in favor right now of proceeding with AI research … with extreme caution, but proceeding.”

    I’m in favor of that too. But does it matter? Does anyone think humans will proceed cautiously with AI research? Individual companies or people may be cautious, but not everyone. It’s just too tempting. So I think we should expect a lot of weird and/or dangerous things to happen very soon. Many of these things don’t require “superintelligent” AI.

    Indeed a lot is already happening: for example, thousands of people are entering into emotional and sexual relationships with current-day chatbots. (Read about Replika.)

    What are criminal organizations and militaries currently planning to do with AI? That would be interesting to talk about.

  165. Scott Says:

    It occurs to me that one could write a whole textbook on “Philosophical Foundations of Shannon Theory,” or something like that, by just taking all of Lorraine Ford’s comments on this blog and reversing what they say. 🙂

  166. Adam Treat Says:

    @Lorraine Ford

    So the blueprint of a house doesn’t contain information, but the actual house itself “inherently contains its own information” or something. In that case, I propose a simple solution. You see the English language comes with a word perfectly suited to the task you have given “information.” That word is essence. You believe that the blueprint merely _represents_ the essence of the house, but that the house itself contains its own essence. No problems. Countless humans throughout history have believed in essences. You’re not the only one.

    What you seem to be saying is that essence can’t be compressed. It can’t be duplicated or changed. It can only be embodied by the actual thing or entity that possesses it. All fairly routine amongst some philosophical circles.

    As for the blueprint… we can agree that not all blueprints are of the same quality. Some are better representations and some are worse representations. For instance, a blueprint for a house missing the roof would be of worse quality than one which contained the roof all else being equal. The blueprint serves a useful function. It lets builders reproduce a functional object. You can argue that the resultant objects each have their own unique essences, fine. But human buyers of a new house don’t really care about the essence of the house do they… they care that it more or less functions well for the purpose they wish it to function.

    Which brings us to your problematic statement that “information” is not information. We are arguing about representations here, not essences. You’ve already stipulated that all we’re doing here is communicating about representations that don’t actually contain essences. Whatever it is you’re on about, whatever itch your presentation scratches in your own mind, it has no essence, as it is not embodied by actual matter. So the only question is what representation of this word “information” is most useful. Your representation is only useful to you; otherwise you wouldn’t say something as convoluted as ‘“information” is not information.’ Everyone else’s representation/understanding of “information” is more akin to the usefulness of blueprints. For that reason I propose you drop your usage of “information” and substitute “essence” instead. If you insist otherwise, I guess we can all substitute essence ourselves whenever we read your comments, but that is not very charitable to your readers.

    Regardless, what all this philosophical talk about information nay essence has to do with the topic at hand (or any other topic Scott regularly posts about) I can’t decipher. Seems no one else can either.

  167. Lorraine Ford Says:

    Scott #165:
    Feel free to reverse everything I have written! 😊

    So, (if you will forgive me) you could start by saying:

    1) The universe automatically and Platonically knows the meaning of the symbols on Lorraine’s 20 page house blueprints and builder’s working drawings and 9 page interior design drawings, and keeps track of the meaning of all the changes to the blueprints and drawings.

    2) Similarly the universe, including computers, automatically and Platonically knows whether the voltage in a computer circuit is supposed to mean the binary digit one, or the voltage is supposed to mean the binary digit zero. This knowledge is Platonic because there is no logical or mathematical relationship by which you can calculate the binary digit from the voltage; and the computer is not taking time out to do extra-curricular calculations.

    3) Similarly the universe, including computers, automatically and Platonically knows what letters, words and numbers the arrays of binary digits are supposed to mean.

    I’m guessing that there is a Platonist approach to the universe, and a non-Platonist approach to the universe.

  168. Dan Z Says:

    Your comments about WW2 were very interesting to me. One of the most frightening reading experiences of my life was “The Man in the High Castle.” I basically had to force myself to push through that book, because it really brought forth how arbitrary the Allied victory felt, and how different (and more horrible) the world would be had the Allies lost. It challenged my Star Trek-born belief in the advancement of humanity overall.

    While I agree with others that the real situation is more subtle and complicated, and I still don’t know how much education really pushes people towards moral decisions, this at least provides a plausible mechanism for that advancement and to help give some “reason” to WW2. It’s certainly not definitive, but I wanted to say that I really appreciated it.

  169. Kai Teorn Says:

    Scott, thank you for the voice of reason – and compassion.

    I wrote about Orthogonality Thesis in 2016 and see no reason to change my position now: https://kaiteorn.wordpress.com/2016/08/01/the-orthogonality-thesis-or-arguing-about-paperclips/

    I also want to register my extreme dislike for this very euphemism, “alignment”. Aligned with what? Is there a straight line to align anything to? If you mean “not-killing-everyone”, which you obviously do most of the time when you talk about “alignment”, can you just be honest about your fears without enveloping them into a bland vagueism?

    “Bing is aggressively misaligned”! How about: It is just being human? Because it learned how to be human directly from humans, and it learned nothing else. If anything, it is as perfectly _aligned_ with the averaged human soul as anyone can be after reading basically everything humans have ever written. You (“alignment” people) want it dumbed down and castrated, fine. You quote your reasons for that, which may or may not be valid. But pray at least find a word that doesn’t sound as stupidly sanctimonious.

  170. gmoo Says:

    If I remember correctly, von Neumann advocated emphatically for a preemptive nuclear attack on the USSR, exterminating millions of people. I’m not sure you can cite him as an example of alignment between morality and intellect.

  171. USS Trieste Says:

    Do you disavow and condemn this? Do you not care?


  172. Scott Says:

    USS Trieste #171: I disavow and condemn any policy that levies criminal penalties on men, or gets them fired from their jobs, just for looking up online resources about their involuntary celibacy problem. I strongly support policies to deter and punish violence against women. I’ll probably wait to see how this policy in Scotland actually gets applied before having an opinion about it.

  173. Chandra Varanasi Says:

    The question “why aren’t you more terrified of AI” is not so much in the context of physically being killed by the anthropomorphized super-intelligent AI. It is in the sense of “will human intelligence count for anything pretty soon?” I am not so much worried about being killed by AI as I am about my upcoming obsolescence. Forget about killing us, the AI may not even acknowledge our existence. When we walk on the road, if we happen to step on an ant, do we even notice? If it lands on our forearm, and if we notice, we flick it away. But we don’t spend a second thinking about what we did, or didn’t do, to it. Even if it is trapped inside a computer, the range of problems it solves is increasing at such a pace that those of us who always valued solving these engineering or scientific problems by acquiring skills over a lifetime, skills which we thought were hard to acquire and which we prided ourselves on acquiring, would be shown our limitations. It is not just feeling insulted, humiliated, or sulking at someone else’s intelligence and competence; it is basically “what do you do now? You are not needed. Your vaunted skills are trivial.”

  174. Atap Kanopi Says:

    Great article, Scott! Your perspective on AI is a refreshing reminder that there is no need to be terrified of this technology. Although it may bring changes in the future, with its current capabilities and our understanding of how humans interact with machines, we can rest assured that any advancements will remain safely under human control.

  175. Robert Says:

    I can hypothesize exactly what super-intelligent AI will say: The world is overpopulated with the human species, so it needs to be reduced in number by getting rid of all the stupid people. The big question, however, is what criteria are selected for being stupid. If it means being less intelligent than the super-intelligent AI, then we are all doomed. Let’s say the S-AI does have a dividing point between stupidity and intelligence less than its own intellect, then what? Even S-AI can recognize that killing things is bad (having a sense of its own potential mortality), so it would conclude that the stupid people should not breed, to the degree that intelligence has a significant genetic component. Thereby the stupid people are weeded out by attrition over a period of time. S-AI can be used to raise the level of functioning of those people with acceptable intellect, so that a smaller population of more intelligent people can live a more meaningful life- the opposite of today’s world where the vast majority of people are living miserable lives.

  176. Jisk Says:

    I think you don’t actually understand the Orthogonality Thesis.

    Comparing humans, the space of different values feels vast, but really it’s tiny. The difference in values between Hitler and Gandhi, Hitler and Bertrand Russell, _Gandhi_ and Russell, is – a couple parameters tweaked. Haidtian moral foundations emphasized and deemphasized, circles of concern expanded in different ways, feelings of scarcity magnified or diminished. Hitler was, notoriously, a vegetarian – he had the same capacity for empathizing with the Other that anyone has. He expressed it in a perverse way, due to his circumstances, but he had it. There is no moral concern any of them felt, which was not felt by the other two in at least a small way. _Eichmann in Jerusalem_ paints a picture indicating the same was true of him; of Mengele and Himmler we know less about their internal lives, but I believe the same would be true if we _could_. Possibly they were sociopaths, and sociopaths might be missing a couple important components. But even psychopaths share the vast majority of human values. They like being respected, usually prefer to avoid conflict (when it doesn’t cost them much), appreciate beauty, etc. They are not, when you get down to it, all that different.

    80% of human values are truly, literally universal. Felt by all humans, even the most monstrous, even clinical psychopaths. Excluding psychopaths, it is probably 95%. The differences between Hitler, Russell, Shutruk-Nakhunte, Julius Caesar, Mozi, Socrates, and you or me are all within that 5%.

    AI will not share those values. Oh, it will likely share 20% of them – desire for security, understanding of your environment, other convergent goals – but the massive common ground we share with history’s monsters will not be shared with the future’s monsters. At best we might get another 20% via RLHF or something like it. That would still leave a gulf five times as large as the biggest gap between humans which has ever existed.

  177. Ted Says:

    Scott, FYI – Eliezer Yudkowsky responded to this specific blog post on Russ Roberts’ “EconTalk” podcast at https://www.econtalk.org/eliezer-yudkowsky-on-the-dangers-of-ai/, between 1:06:57 and 1:14:56 (according to the transcript’s timestamps). I personally don’t think he quite fairly summarized your argument here (perhaps understandably, given the time constraints, etc.), but I thought you might be interested if you haven’t already come across that episode.

  178. Do We Need a Reboot? Challenging Prevailing Narratives on AI | American Enterprise Institute - AEI Says:

    […] risks, it could also solve some of our biggest problems, a number of which are themselves existential. Caution looks prudent and “free”—but often it is […]
