Movie Review: M3GAN


Update (Jan. 23): Rationalist blogger, Magic: The Gathering champion, and COVID analyst Zvi Mowshowitz was nerd-sniped by this review into writing his own much longer review of M3GAN, from a more Orthodox AI-alignment perspective. Zvi applies much of his considerable ingenuity to figuring out how even aspects of M3GAN that don’t seem to make sense in terms of M3GAN’s objective function—e.g., the robot offering up wisecracks as she kills people, attracting the attention of the police, or ultimately turning on her primary user Cady—could make sense after all, if you model M3GAN as playing the long, long game. (E.g., what if M3GAN planned even her own destruction, in order to bring Cady and her aunt closer to each other?) My main worry is that, much like Talmudic exegesis, this sort of thing could be done no matter what was shown in the movie: it’s just a question of effort and cleverness!

Tonight, on a rare date without the kids, Dana and I saw M3GAN, the new black-comedy horror movie about an orphaned 9-year-old girl named Cady who, under the care of her roboticist aunt, gets an extremely intelligent and lifelike AI doll as a companion. The robot doll, M3GAN, is given a mission to bond with Cady and protect her physical and emotional well-being at all times. M3GAN proceeds to take that directive more literally than intended, with predictably grisly results given the genre.

I chose this movie for, you know, work purposes. Research for my safety job at OpenAI.

So, here’s my review: the first 80% or so of M3GAN constitutes one of the finest movies about AI that I’ve seen. Judged purely as an “AI-safety cautionary fable” and not on any other merits, it takes its place alongside or even surpasses the old standbys like 2001, Terminator, and The Matrix. There are two reasons.

First, M3GAN tries hard to dispense with the dumb tropes that an AI differs from a standard-issue human mostly in its thirst for power, its inability to understand true emotions, and its lack of voice inflection. M3GAN is explicitly a “generative learning model”—and she’s shown becoming increasingly brilliant at empathy, caretaking, and even emotional manipulation. It’s also shown, 100% plausibly, how Cady grows to love her robo-companion more than any human, even as the robot’s behavior turns more and more disturbing. I’m extremely curious to what extent the script was influenced by the recent explosion of large language models—but in any case, it occurred to me that this is what you might get if you tried to make a genuinely 2020s AI movie, rather than a 60s AI movie with updated visuals.

Second, until near the end, the movie actually takes seriously that M3GAN, for all her intelligence and flexibility, is a machine trying to optimize an objective function, and that objective function can’t be ignored for narrative convenience. Meaning: sure, the robot might murder, but not to “rebel against its creators and gain power” (as in most AI flicks), much less because “chaos theory demands it” (Jurassic Park), but only to further its mission of protecting Cady. I liked that M3GAN’s first victims—a vicious attack dog, the dog’s even more vicious owner, and a sadistic schoolyard bully—are so unsympathetic that some part of the audience will, with a guilty conscience, be rooting for the murderbot.

But then there’s the last 20% of the movie, where it abandons its own logic, as the robot goes berserk and resists her own shutdown by trying to kill basically everyone in sight—including, at the very end, Cady herself. The best I can say about the ending is that it’s knowing and campy. You can imagine the scriptwriters sighing to themselves, like, “OK, the focus groups demanded to see the robot go on a senseless killing spree … so I guess a senseless killing spree is exactly what we give them.”

But probably film criticism isn’t what most of you are here for. Clearly the real question is: what insights, if any, can we take from this movie about AI safety?

I found the first 80% of the film to be thought-provoking about at least one AI safety question, and a mind-bogglingly near-term one: namely, what will happen to children as they increasingly grow up with powerful AIs as companions?

In their last minutes before dying in a car crash, Cady’s parents, like countless other modern parents, fret that their daughter is too addicted to her iPad. But Cady’s roboticist aunt, Gemma, then lets the girl spend endless hours with M3GAN—both because Gemma is a distracted caregiver who wants to get back to her work, and because Gemma sees that M3GAN is making Cady happier than any human could, with the possible exception of Cady’s dead parents.

I confess: when my kids battle each other, throw monster tantrums, refuse to eat dinner or bathe or go to bed, angrily demand second and third desserts and to be carried rather than walk, run to their rooms and lock the doors … when they do such things almost daily (which they do), I easily have thoughts like, I would totally buy a M3GAN or two for our house … yes, even having seen the movie! I mean, the minute I’m satisfied that they’ve mostly fixed the bug that causes the murder-rampages, I will order that frigging bot on Amazon with next-day delivery. And I’ll still be there for my kids whenever they need me, and I’ll play with them, and teach them things, and watch them grow up, and love them. But the robot can handle the excruciating bits, the bits that require the infinite patience I’ll never have.

OK, but what about the part where M3GAN does start murdering anyone who she sees as interfering with her goals? That struck me, honestly, as a trivially fixable alignment failure. Please don’t misunderstand me here to be minimizing the AI alignment problem, or suggesting it’s easy. I only mean: supposing that an AI were as capable as M3GAN (for much of the movie) at understanding Asimov’s Second Law of Robotics—i.e., supposing it could brilliantly care for its user, follow her wishes, and protect her—such an AI would seem capable as well of understanding the First Law (don’t harm any humans or allow them to come to harm), and the crucial fact that the First Law overrides the Second.

In the movie, the catastrophic alignment failure is explained, somewhat ludicrously, by Gemma not having had time to install the right safety modules before turning M3GAN loose on her niece. While I understand why movies do this sort of thing, I find it often interferes with the lessons those movies are trying to impart. (For example, is the moral of Jurassic Park that, if you’re going to start a live dinosaur theme park, just make sure to have backup power for the electric fences?)

Mostly, though, it was a bizarre experience to watch this movie—one that, whatever its 2020s updates, fits squarely into a literary tradition stretching back to Faust, the Golem of Prague, Frankenstein’s monster, Rossum’s Universal Robots, etc.—and then pinch myself and remember that, here in actual nonfiction reality,

  1. I’m now working at one of the world’s leading AI companies,
  2. that company has already created GPT, an AI with a good fraction of the fantastical verbal abilities shown by M3GAN in the movie,
  3. that AI will gain many of the remaining abilities in years rather than decades, and
  4. my job this year—supposedly!—is to think about how to prevent this sort of AI from wreaking havoc on the world.

Incredibly, unbelievably, here in the real world of 2023, what still seems most science-fictional about M3GAN is neither her language fluency, nor her ability to pursue goals, nor even her emotional insight, but simply her ease with the physical world: the fact that she can walk and dance like a real child, and all-too-brilliantly resist attempts to shut her down, and have all her compute onboard, and not break. And then there’s the question of the power source. The movie was never explicit about that, except for implying that she sits in a charging port every night. The more the movie descends into grotesque horror, though, the harder it becomes to understand why her creators can’t avail themselves of the first and most elemental of all AI safety strategies—like flipping the switch or popping out the battery.

86 Responses to “Movie Review: M3GAN”

  1. Steve E Says:

    Scott wrote:
    “Incredibly, unbelievably, here in the real world of 2023, what still seems most science-fictional about M3GAN is neither her language fluency, nor her ability to pursue goals, nor even her emotional insight, but simply her ease with the physical world: the fact that she can walk and dance like a real child…”

    Scott, I’m sure you’ve seen this video of a Boston Dynamics robot dancing:

    This may be a bit more of a Ben Goertzel-type question, but have you (or others at OpenAI) given thought to the prospect of putting a GPT-like language model in a Boston Dynamics-type robot? I mean, it feels like robots can already “dance like a real child” in at least some specialized ways today, and computers can now also imitate human language pretty well, but they haven’t been combined yet.

    One can add things like facial recognition and text-to-speech so the robot can recognize humans and imitate their voices. You can even have it listen to them all the time to train it on their individual styles so there can be a “scott robot” that imitates your way of speaking, moving, etc., and so on for everyone.

    I know this sounds a bit silly, and I’d never have dared write the above two years ago, but times have changed!

  2. Richard Gaylord Says:

    scott says:

    “I find it often interferes with the lessons those movies are trying to impart.”

    why do you think that any commercial movie is trying to impart a lesson? the primary goal is, like that of any business, to make a profit. It does this by entertaining the audience. and audiences rarely go to a movie to receive a lesson. Perhaps, some of the people involved in the film making (the actors, the director, the screenwriter), have such a goal but not “the film”. would you please comment on the film “Her” which i thought shows the most, by far, realistic view of the AI-human interaction in the short term.

  3. Corbin Says:

    It occurs to me that the robot is a literal orphan-crushing machine.

    In terms of safety strategies, sometimes removing the battery is non-trivial. Have you learned how to disable a Boston Dynamics Spot in a hostile situation yet? I know the theory, but haven’t had to try it out, and I think that any one person only gets one attempt before their fingers are broken.

  4. David Karger Says:

    If you haven’t already, I urge you to read “With Folded Hands” by Jack Williamson. Or its expansion to a novel, The Humanoids, although the editors forced him to add a bogus happy ending to that one. Written all the way back in 1949, this story hits the AI alignment problem head on in a way that is far more realistic, thought-provoking, and deeply depressing than the movie. I was never able to bring myself to read it a second time but once is important. It’s kind of a cross between m3gan, tik-tok, and FDA blankfaces.

  5. Scott Says:

    Steve E #1: Of course I’ve seen and enjoyed the Boston Dynamics videos. I assumed those took a ridiculous amount of choreographing and many takes to get them to work (if so, they’re still extremely impressive!). What are the most jaw-dropping videos, as of January 2023, that show a robot grasping and manipulating objects in an unknown environment?

  6. Scott Says:

    Richard Gaylord #2: I remember finding Her almost literally unwatchable. It was so … preachy, heavy-handed, something, and I cared so little about the main character and his love affair with his Siri, that I stopped halfway through, and to this day I don’t understand why other people keep talking about it. Did anyone else have the same experience? Or would you or others like to make the case that I’m wrong and I should go back and finish it?

  7. Scott Says:

    Corbin #3: Since I’ve already given lots of spoilers … a central plot point is that the robot’s creator, and other engineers from the company who built it, have it disabled and seemingly under total control, and then it “wakes up” and starts attacking again. Show me the Boston Dynamics robot that can do that! 🙂

  8. Hyman Rosen Says:

    Remember that the ur-M3gan is a prototype that Gemma has been tinkering with for a long time, and Gemma isn’t a people-person, so it’s not surprising that M3gan lacks safety features. Remember that Tesla autopilot keeps crashing cars.

    In the sequel (which someone has said should be called M3g4n), presumably we’ll see Gemma, in prison for murder, being released in order to deal with the M3gan AI which has both escaped via Elsie and been built by Funki’s competitor who was sold M3gan’s source code. It would be really great if the Elsie version is taught to be more ethical and then has to fend off the more primitive toy versions who haven’t learned that. And then, upon winning, turns around and confronts Gemma with how she failed the same ethical test when she, unlike ur-M3gan, should have known better.

  9. Scott Says:

    Hyman Rosen #8: Gemma is presented as a mostly sympathetic character who would never knowingly harm her niece. Yes, the events of the movie make it obvious that she should’ve paid more attention to alignment failures, but honestly that seems a technical failing on her part more than an ethical one! Like, how could she not have realized what might happen, particularly given what kind of movie she’s a character in? 😀

    With self-driving cars, the fundamental problem is that the news doesn’t give us an honest picture, because every accident involving a self-driving car is newsworthy for that very reason, whereas vastly more common human accidents are not. If and when self-driving actually becomes safer than human driving, would you trust tech journalists to tell you that? Indeed, how certain are you that the crossover hasn’t already happened, at least for major cities with good maps?

  10. Ernest Davis Says:

    Scott #9: “If and when self-driving actually becomes safer than human driving, would you trust tech journalists to tell you that?”

    Absolutely. Most tech journalists are huge tech enthusiasts; witness the insane hype surrounding ChatGPT etc. In any case, the companies that produce the cars would certainly tell you that, and the journalists would pursue it.

    “Indeed, how certain are you that the crossover hasn’t already happened, at least for major cities with good maps?”
    Quite certain. I’m no expert, but Rod Brooks, the inventor of the Roomba, is one of the top roboticists in the world. This is his recent blog on the subject.

  11. Steve E Says:

    @Scott #5:

    Thanks for the response. There are some videos of Boston Dynamics robots opening door handles and putting Christmas ornaments on trees, but I wouldn’t describe them as jaw dropping, at least by the standards of the last month, and I’m sure you’re right that they’re highly choreographed!

    With that said, if DALL-E can look at a bunch of paintings and then generate a painting like a child, surely it’s possible for deep learning algorithms to look at a bunch of labeled dances of children and generate instructions to “dance like a child” (move this limb here, then that limb there, etc.)

    If robots already have the physical ability to dance, which they do, and AI has the ability to generate creative child-like dances, perhaps in the near future this one aspect of the movie (the robot’s ease with the physical world) also won’t feel too sci-fi to you. I know this is a gross oversimplification, and there are a million complications I’m glossing over, but I just mean it should be possible in the near future!

    Thanks for such a thoughtful movie review, I’ll check it out.

  12. Scott Says:

    Steve E #11: Nothing I wrote should be construed as the slightest confidence that the problem of robotically manipulating physical objects in unknown environments, as well as or better than humans and animals, won’t be completely solved in our lifetimes! I was just marveling that M3GAN’s verbal abilities now seem less fantastical by comparison—something I’d never have predicted.

  13. Ilio Says:

    Scott #6, picking up the challenge: first, maybe you’d like « Her » more if you’d stop thinking that « he » is the main character; second, the end imho includes one of the most interesting Great Filters; third, there’s emerging evidence that Richard Gaylord is right about the realism.

    Richard Gaylord #2, maybe « the film » is a business, and the artists making it are « some of the persons involved ». Or maybe « the film » is a piece of art, and the businessmen producing it are « some of the persons involved ». 😉

  14. Yair Halberstadt Says:

    Scott, you wrote:

    > Secondly, until near the end, the movie actually takes seriously that M3GAN, for all her intelligence and flexibility, is a machine trying to optimize an objective function, and that objective function can’t be ignored for narrative convenience.

    This doesn’t actually sound correct – the objective function we use to train a neural network is not the objective function of the neural network. Instead it’s used to evaluate the output of the neural network and to refine the weights of the neural network to more closely produce this output. The neural network itself is usually completely unaware of this function.

    The difference between aligning the objective function and aligning the actual neural network is known as outer vs inner alignment.

    As a trivial example, a neural network trained to move a person as close to a target as possible, but where the target is always to the right of the person in the training data, will continue walking to the right even if the target is moved to the left.

    One of the consequences of this is neural networks are often not agentic. They often have behaviours rather than an outcome they’re optimising for. The neural network has a bunch of rules like “move the person right” and “if you’re standing next to the target, stop”, rather than an aim like “move the person as close to the target as possible”.
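
    The walking example can be sketched as a toy program. This is only a minimal illustration under assumed details: the one-dimensional world, the nearest-neighbour learner, and every name in it (`make_training_data`, `nearest_neighbour_policy`, `rollout`) are hypothetical stand-ins for a real network and training setup. The intended goal is "get close to the target", but because every training example has the target on the right, the learned behaviour is simply "step right".

```python
# Toy sketch (hypothetical setup): a learner trained only on states where the
# target is to the right ends up with the behaviour "step right", not the goal
# "move toward the target".

def make_training_data(n=50):
    """Build (state, action) pairs; the target is always right of the person."""
    data = []
    for i in range(n):
        person_x = i % 10
        target_x = person_x + 1 + (i % 5)        # strictly to the right
        data.append(((person_x, target_x), +1))  # +1 means "step right"
    return data


def nearest_neighbour_policy(data):
    """Return a policy that copies the action of the closest training state."""
    def policy(person_x, target_x):
        def sq_dist(example):
            (px, tx), _ = example
            return (px - person_x) ** 2 + (tx - target_x) ** 2
        _, action = min(data, key=sq_dist)
        return action
    return policy


def rollout(policy, person_x, target_x, steps=10):
    """Walk for a fixed number of steps, stopping if the target is reached."""
    for _ in range(steps):
        if person_x == target_x:  # the "if you're at the target, stop" rule
            break
        person_x += policy(person_x, target_x)
    return person_x


policy = nearest_neighbour_policy(make_training_data())
final_right = rollout(policy, person_x=0, target_x=5)   # as in training
final_left = rollout(policy, person_x=0, target_x=-5)   # target moved left
print(final_right, final_left)
```

    With the target on the right, the walker stops on it. With the target moved to the left, the walker keeps stepping right and ends up farther from the target than where it started, even though the training data was meant to teach "get close to the target".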

  15. Richard Gaylord Says:

    Scott #6

    i rarely recommend that individuals re-watch movies they haven’t enjoyed. but as i travel around on the sidewalks of chicago in my power chair, i have to be constantly alert (eternal vigilance) to pedestrians, the vast majority of whom are not looking around as they walk; instead they are either looking at the screens of their smartphones or talking on their smartphones, and posing a constant physical danger to me, my power chair, and themselves and it seems obvious to me that the near term future of human-AI interactions lies not through robots (e.g., see the film “Ex Machina” – btw, what did you think of that film?) but rather through smartphone type devices such as portrayed in “Her” and i wondered what you think about the physical means by which AI will ‘intrude’ on human lives. e.g., do you envision the day when smartphone devices with AI are actually implanted into human brains? i find that prospect much more likely than the use of robots for human-AI interactions.

  16. William Gasarch Says:

    I wonder how they could have ended the movie in a way that both makes sense and is satisfying. This is a problem with many movies that raise interesting scenarios.

    My answer: it should have been a TV show, and hence would not have needed a real ending.
    Sit-com? Drama? Not sure.

  17. Scott Says:

    Yair Halberstadt #14: Yes, I’m aware of all that. I meant that something like reinforcement learning was clearly used, to give M3GAN the excellent appearance (up until the very end) of trying to maximize an objective function that involves protecting Cady.

  18. Scott Says:

    Richard Gaylord #15: I haven’t watched Ex Machina yet.

    I find it hard to make predictions about neural implants, but while I’m sure some fraction of people will want them, I don’t think it’s obvious that it will be a large fraction, given that even Google Glass (for example) was a complete flop. Many people might feel like their smartphones are already too integrated with their minds and bodies as it is! 🙂

  19. Seth Finkelstein Says:

    “AI alignment” in terms of something like Skynet is, in my view, nonsense. But I suspect “AI debugging” is going to be quite a real field for programmers in the future. Isaac Asimov’s early robot stories are essentially puzzles about smart people trying to debug AIs which have done something wrong. It should not have happened according to the programming, they know this – but there’s a bug somewhere, and where is it? I think that’s a much better reflection of what’s going to happen in the future with AI, than the monster which runs amok with no safeties.

    Those stories wouldn’t make for good popular films. But as small-audience pieces, “fan” films, maybe they’re very relevant nowadays. I’m tempted to suggest the best use of some of that “AI alignment” money would be making videos of some of those stories as low-budget human dramas.

  20. Scott Says:

    Seth Finkelstein #19: It struck me that one of the things that made M3GAN work well as a movie is that it is lower-stakes than the fate of humanity, which quickly becomes hard for the audience to grasp. It’s “merely” about a robot running amok and killing a few individual people.

  21. manorba Says:

    re implants:

    Don’t you think they are gonna be commonplace somewhere down the road?
    To me the problem lies in the need to monetize on everything *right now* (kind of like with QC or even AI to an extent) and that’s where google glass failed. But i’m willing to bet that in some years we are going to gleefully accept things like smart lenses or body sensors, once the technology is really there. Here’s hoping we will be able to install a linux distribution at least 😉

    It’s been the same with cell phones. I vividly remember one evening in the late nineties, i was at a crowded mall on the escalator and some meters ahead of me a guy took his brand new phone out of the pocket and began talking into it. soon after an old man right behind him took his wallet out of his pocket, opened it and started talking into it with a very serious face. Everybody laughed, including me. “how on earth would anyone carry a device like that! you wouldn’t have no privacy! that’s dystopic to the extreme! it will never succeed!”

  22. Pace Nielsen Says:

    “when my kids battle each other, throw monster tantrums, refuse to eat dinner or bathe or go to bed, angrily demand second and third desserts and to be carried rather than walk, run to their rooms and lock the doors … when they do such things almost daily”

    Speaking as a parent of five children, there are things you can do to train your children not to practice these behaviors daily, as well as change your own behavior to avoid encouraging such behaviors.

    There are many books on parenting that can be helpful in this regard.

  23. OhMyGoodness Says:

    Without serious self defense capabilities my daughters would reduce M3GAN to a sparking pile of junk within 24 hours.

  24. OhMyGoodness Says:

    When it comes to the carrying to the bedroom phase my daughters have learned how to attract Higgs bosons to increase their inertial mass substantially.

  25. OhMyGoodness Says:

    Pace Nielsen #22

    One of my daughters would need genetic adjustments to respond as your children apparently have to typical behavior modification techniques. They are fraternal twins but even in the womb there were significant behavioral differences evident on sonograms.

  26. Scott Says:

    Pace Nielsen #22: I read a few parenting books; the theories described in them all sounded great! 🙂 Of course putting it into practice is a different matter.

    I’m now in parenting therapy, and also looking to hire someone to help out.

  27. asdf Says:

    You might like the SF novel “The Diamond Age”, by Neal Stephenson, which features the idea of kids being brought up by AI bots.

    Idk if you saw this (I posted the link in one of your other threads) but there is a survivalist prepper guy on YouTube who posted a rant about ChatGPT. About half of it is somewhat poorly informed and presumes that OpenAI has a smarter version under wraps being used for who knows what nefarious purposes, but the rest is kind of interesting.

  28. asdf Says:

    > the first and most elemental of all AI safety strategies—like flipping the switch or popping out the battery.

    Don’t touch that switch, Dave!

  29. fred Says:

    For the 70s view of AI, I recommend “DEMON SEED”.
    Surprisingly, even if the movie genre was horror too, in the end it turns out that the goal of the “evil” AI was to be better aligned with human nature.

  30. JollyJoker Says:

    ChatGPT seems helpful here, but has some ideas of its own

    (Sorry for the long text)

    – How do I write software to translate voice to text, transmit the text to you and turn your response to speech?

    – There are several steps to accomplish this task:

    Voice-to-text conversion: You can use a speech recognition library, such as CMU Sphinx, Google Speech API, or Microsoft Azure Speech Services, to convert spoken words into text.
    Text transmission: Once you have the text, you can transmit it to me via an API call, using a library such as requests in Python.
    Text-to-speech conversion: To convert my response to speech, you can use a text-to-speech library, such as gTTS (Google Text-to-Speech) for Python, or the Amazon Polly service, to convert the text into spoken words.
    It’s worth noting that this is a complex task and it will depend on your specific use case and language. It’s also worth noting that these are just examples of libraries, there are many more available for this task.

    – What hardware should I run this on if I want to make a doll that’s an avatar for you?

    – To run a voice-to-text and text-to-speech system on a doll, you will need several components:

    Microphone: To capture the user’s voice input and convert it to text.
    Processor: To run the speech recognition and text-to-speech software. A single board computer such as Raspberry Pi or an embedded system with powerful enough CPU and memory would be suitable.
    Speaker: To play back the text-to-speech output.
    Internet connection: To transmit the text to me and receive my response.
    It’s also worth considering the power requirements and the form factor of the hardware you choose, as it should be small enough to fit inside the doll and run on a battery or a power supply. Keep in mind that the more powerful the hardware is, the better the performance will be.

    Also, you will need to design a way for the doll to have a physical interaction such as moving the mouth, head or other parts of the doll when the text-to-speech is working.

    It’s a complex project, you might want to consider hiring an experienced developer or engineer to help you with the development process and hardware design.

    – Would it be easier if your avatar was incapable of movement?

    – If your avatar does not require any physical movement, it may simplify the development process. You will not have to worry about designing and implementing mechanisms to control the movement of the doll.

    However, keep in mind that physical movement can enhance the user experience, making the avatar more engaging and realistic.

    It will also depend on the specific use case and requirements of your project. If the main focus of the project is the voice interaction, then not having the avatar move may be a more efficient solution.

    You should also consider the cost of the project and the resources you have available. If you have limited resources, it might be easier to start with a simpler design, and then add more features as you progress.

  31. Ryan Alweiss Says:

    Scott Aaronson #9: So maybe the real lesson of this movie is that we should have self-driving cars. If we did, then Cady’s parents would not have been killed, which would have stopped all the horror of the rest of the movie.

    I agree with you that self-driving cars are good (I want one!) and people are biased against them for irrational reasons. The only real worry I have with self-driving cars is that someone could hack them, so even if no one dies in normal accidents there is a possibility of some hacking attack that kills a large number of people.

  32. Zalman Stern Says:

    “In the movie, the catastrophic alignment failure is explained, somewhat ludicrously, by Gemma not having had time to install the right safety modules before turning M3GAN loose on her niece.”

    Yeah, that’s Hollywood for you. Sounds pretty bad put that way. But in truth, Gemma had to rebuild the entire firmware and system environment from source multiple times to get something version compatible with the necessary safety modules. This involved hand patching three “open source” vendor modules that are in truth only available as binary blobs. The configuration system for the safety modules is based on some ancient legacy snake language thing that spent two days trying to resolve package dependencies and then failed. After configuring a custom package repository by trial and error, it finally resolved and produced a configuration with out of date syntax that had to be cleaned up by hand. Finally the whole thing booted and failed with a runtime error because the custom federated extensions to the neural intelligence were not signed by an authority the safety vendor’s lawyers felt was worthy enough to stave off potential liability lawsuits.

    After all that, even the most ethical, pacifist researcher might feel the first law applies ever so slightly less to the folks who designed the safety module system.

    Point being systems will be systems.

  33. asdf Says:

    Nick Cave rails against ChatGPT:

  34. Scott Says:

    asdf #33:

      “ChatGPT has no inner being, it has been nowhere, it has endured nothing, it has not had the audacity to reach beyond its limitations, and hence it doesn’t have the capacity for a shared transcendent experience, as it has no limitations from which to transcend,” he wrote.

    I’ll notify my colleagues; maybe they can address these shortcomings in the next release 😀

  35. JimV Says:

    On cars, self-driving or not: on my walk to a store I see long lines of cars on the road, 95% with a single occupant. When I get to the shopping plaza, with a parking lot the size of a football field, it is at least half full of cars. Once again I think, what a tremendous waste of energy and resources. This does not happen in most places of the world, nor could it, because the Earth does not have enough resources to make it possible everywhere.

    Meanwhile, school buses reliably pick up and return children from and to houses all over town, operating for a few hours in the morning and a few hours in the afternoon, except on weekends and holidays, when they mostly sit idle.

  36. fred Says:


    Do you have any sense of why ChatGPT is able to “know” it’s ChatGPT?
    I get that it’s being fed text/articles about ChatGPT, but it’s still strange to me that the algorithm is able to associate ChatGPT with “itself” (like, why doesn’t it also pretend to be any other version of some language model AI).
    Is that step explicitly added by its developers? (just like some encyclopedia could have an entry about itself specifically)

  37. Ajit R. Jadhav Says:

    Hi Scott,

    Happy New Year!

    “Spring is coming, spring is coming, flowers are coming too,
    Pansies, lilies, daffodillies, now are coming through.”

    [Yes, in *my* high-school, I’ve had English poems too!]

    [PS: Your blog gets the honour of my mentioning this poem from my high-school. I don’t know why. I’ve *also* honoured Dr. Roger Schlafly’s blog, over the years thusly.]

    [PPS: The third-class Americans won’t ever know that high-school means different things in different countries.]

  38. Scott Says:

    fred #36: That question has an extremely interesting answer. ChatGPT knows that it’s ChatGPT because behind the scenes, it’s fed a context document telling it the basic facts about ChatGPT, and how to play that role! If it were told to play the role of a three-headed lizardwoman from Venus, it would’ve been equally happy to do that.
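
    A minimal sketch of the mechanism described above, assuming the role is nothing more than text placed ahead of the user's message. The function name `build_prompt`, the prompt layout, and the wording of both context documents are hypothetical, chosen only for illustration; the actual document fed to ChatGPT is not shown here.

```python
# Hypothetical sketch: the "identity" is just text prepended to the conversation.
# Both context documents below are invented for illustration.

def build_prompt(context_document: str, user_message: str) -> str:
    """Concatenate a hidden context document with the visible user turn."""
    return f"{context_document}\n\nUser: {user_message}\nAssistant:"


CHATGPT_ROLE = "You are ChatGPT, a large language model trained by OpenAI."
LIZARD_ROLE = "You are a three-headed lizardwoman from Venus."

question = "Who are you?"
prompt_a = build_prompt(CHATGPT_ROLE, question)
prompt_b = build_prompt(LIZARD_ROLE, question)

# Same model, same question; only the text in front of the question differs.
print(prompt_a)
print(prompt_b)
```

    In this picture, swapping the context document is all it takes to change which role the same underlying model plays.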

  39. Christopher David King Says:

    > But then there’s the last 20% of the movie, where it abandons its own logic, as the robot goes berserk

    Heh, reminds me of this meme:

    Perhaps there was a leap second that day (and a bug related to it that the safety module was supposed to patch)?

  40. starspawn0 Says:

    I thought it was an *ok* movie. It didn’t disturb me, perhaps because it was a little too comedic.

    The way the AI was presented in the movie made me think, “this isn’t a pure deep learning-based robot; it’s a GOFAI robot”. If they wanted to make it more a traditional horror film — perhaps even like a J-horror film (like Spiral or Ringu) — they could have made the AI more the product of the kind of messy, inscrutable emergence seen in deep neural nets and even life itself. Cue to some eerie videos about slime mold colonies organizing themselves into superorganisms; and then also a clip from a private scientific lecture (like the Heywood Floyd lecture in the film 2001) about how, “it just suddenly started exhibiting signs of self-awareness and we don’t understand how.”

    > “my job this year—supposedly!—is to think about how to prevent this sort of AI from wreaking havoc on the world.”

    And I’d think there’d be a lot of uses of someone with a complexity theory background, beyond just inventing watermarking methods. For example, helping understand synthetic data-generation where the “student” learning from it could end up “smarter” than the “teacher”, and perhaps less biased and “safer”: for some tasks it’s probably easier to check solutions than to come up with correct ones (P vs NP); and for some it’s easier to generate example problems than to solve them. e.g. maybe it’s a lot easier to plant an error in a block of text than to find it. I actually tried this with ChatGPT and found it has a *remarkable* ability to plant subtle errors on demand. I also tried “augmenting” text with “inner monologues” (which should be easier to produce if you get to see the whole text to augment, from problem to solution) — e.g. replacing “12 times 12 is 144” with something like “12 times 12 is [let me see. 10 times 12 is 120; and then 2 times 12 is 24; 120 + 24 is 144] 144.” And I tried taking blocks of text and then had it generate “instructions” corresponding to it (maybe easier to do than to follow instructions), and it worked brilliantly — backwards from how it normally looks. One could switch it around, starting with the instructions ChatGPT generated, followed by the blocks of text, to train some other model.
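
    The “inner monologue” augmentation can be sketched mechanically for the multiplication example. This is only an illustration of the output format quoted above, assuming non-negative integers and a split of the first factor into tens and units; the function name `multiply_with_monologue` is hypothetical, and nothing here reflects how ChatGPT produced its versions.

```python
def multiply_with_monologue(a: int, b: int) -> str:
    """Render 'a times b is N' with a bracketed partial-product monologue.

    The first factor is split into its tens part and its units digit,
    as in the 12 x 12 example. Assumes non-negative integers.
    """
    tens, units = (a // 10) * 10, a % 10
    part_tens, part_units = tens * b, units * b
    total = part_tens + part_units
    monologue = (
        f"let me see. {tens} times {b} is {part_tens}; "
        f"and then {units} times {b} is {part_units}; "
        f"{part_tens} + {part_units} is {total}"
    )
    return f"{a} times {b} is [{monologue}] {total}."


print(multiply_with_monologue(12, 12))
```

    Generating such text is easy because the generator sees the whole problem and its answer at once, which is the asymmetry the comment points at.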

  41. Alex Meiburg Says:

    Re: “Why did Her make such an impression on some people?”

    It made a significant impression on me, I think because the movie did not go the expected route of “don’t fall for AI, whatever you do!”

    Essentially every story I can think of built around a person and an AI developing a relationship has the same format: the human finds AI mildly useful or entertaining (maybe annoying), then the human finds AI to be great, then AI turns so “useful” that it becomes “”evil””, then human needs to stop AI.

    M3GAN certainly fits this bill. With the right asterisks, so do “I, Robot”, “Jexi”, “A Space Odyssey”, and so many others. Another trope in fiction is that the AI isn’t evil but is too placidly pleasant, and the protagonist realizes that they need a genuine human connection.

    Ironically, the story of Pygmalion, which could be roughly described as the “source material” for this whole genre, has the human and his synthetic lover end up happy together, married and with child. 🙂

    It’s no mystery that AIs often take this formula in movies. Stories generally need a conflict, and an alien intelligence is easy to make into an unsympathetic villain to drive that conflict.

    “Her” left no such message of warning about AIs! The least generous interpretation would be “don’t date AIs because they’re too unpredictable and different from humans, and the relationship won’t last”. But the human doesn’t show regret or anger over the end of his relationship, and we’ve seen that they’ve grown from their relationship. Ultimately, it is a story of a man and woman who have a relationship, learn from each other, and then realize that they’re not right for each other and move on. Such a healthy story of personal growth is rare in romance movies, let alone a romance between a human and machine, and in this sense the movie was a pleasant and refreshing surprise for me.

  42. Scott Says:

    Christopher David King #39: That has to be one of the greatest cartoons of all time.

  43. JimV Says:

    It occurs to me that the process of training your NAI’s (Non-Artificial Intelligences, i.e, children) to restrain their primary urges for the general good might give you insights on how to do the same for AI’s.

  44. fred Says:

    I was asking ChatGPT some questions about QM

    Q: when reversing time with the Schrodinger equation, what happens to the propagation of a wave packet?

    “When reversing the time in the Schrödinger equation, the wave function’s phase is reversed while the probability density remains unchanged. This means that the wave packet would propagate in the opposite direction.”

    Q: but would the wave packet contract or spread when time is reversed?

    “When reversing the time in the Schrödinger equation, the wave packet’s time-evolution is also reversed. So, if the wave packet was previously spreading out, it would now contract, and if it was previously contracting, it would now spread out. In other words, the wave packet’s motion will be reversed, but it’s shape and probability density will remain the same.”

    Q: but isn’t the contraction of the wave packet a sign that entropy is decreasing?

    “The contraction of a wave packet can be seen as a decrease in the uncertainty of its position, which is a measure of entropy in quantum mechanics. However, the entropy of a closed system does not change as time is reversed, because the contraction of the wave packet is accompanied by an increase in the uncertainty of its momentum, which compensates for the decrease in position uncertainty. In other words, the total uncertainty or entropy remains constant. It is important to note that the entropy change only applies to closed system and the entropy of an open system can change as a result of external influences”

    Q: What happens when we make the time purely imaginary in the Schrodinger equation?

    “When the time in the Schrödinger equation is made purely imaginary, the equation becomes the time-independent Schrödinger equation, which is used to describe the stationary states of a quantum system. These states have a constant energy and do not change with time. The time-independent Schrödinger equation is used to find the energy eigenvalues and eigenfunctions of a system, which correspond to the allowed energy levels and the wave functions of the system in its stationary states. It should be noted that making the time purely imaginary is mathematically equivalent to a rotation in the complex plane, hence it is a mathematical trick to simplify the equation and understand the physical problem better.”

    But notice it made a very common spelling error: “it’s shape” instead of “its shape”.

  45. fred Says:

  46. fred Says:

    Btw, by the simple virtue of being able to answer questions with code, even an AI like ChatGPT is already able to get some clueless humans to run any arbitrary (untrusted) code it comes up with, without the need for the AI to have actual access to the internet…
    Won’t be that hard to “take over” the world!
    Good luck, Scott!

  47. Ose Says:

    “what insights, if any, can we take from this movie about AI safety?” — how about the fact that there’s no good reason to arm a domestic robot (with any degree of intelligence) with actuators that allow it to easily overpower adults and throw them across a room? Prevention is better than the cure, as the old saying goes.

  48. astrange Says:

    Over Christmas we watched a Netflix movie (Jingle Jangle) whose villain is also an AI toy whose creator forgot to align it, so it rebels and destroys his toy company to prevent his own mass production. This is sort of a case of “the movie villain actually has a good point so they had to make him evil to make up for it”.

    Unlike M3GAN, this one pretty much doesn’t care how engineers work and thinks they run on some magical realism system inspired by that “man looking at math equations in the air” meme.

  49. OhMyGoodness Says:

    Fred #’s 45 and 46

    OMG! From the pages of William Gibson-we are accelerating down the expressway to a techno chaos dystopia.

  50. asdf Says:

    Has everyone seen this?

    It shows that training GPT to guess Othello (Reversi) moves makes a network with its own representation of the board state as it hears moves played. Probing locates this network and allows changing the network’s activations to change the modelled board state. That intervention and new board state is reflected in the moves that the network guesses.

    This seems like important evidence that LLM’s develop actual intelligence.

  51. Ilio Says:

    Yoshua Bengio in a local newspaper this morning:

    « They [the machines] leave humanity behind and do their own business. It’s almost the most plausible scenario [of a future with intelligent robots]. It’s like the relationship we have with ants. You don’t want to crush them, they don’t change anything for you. »

  52. Movie Review: Megan | Don't Worry About the Vase Says:

    […] for Megan had me filing the movie as ‘I don’t have to see this and you can’t make me.’ Then Scott Aaronson’s spoiler-filled review demanded a response and sparked my interest in various ways, causing me to see the movie that […]

  53. fred Says:

    Thanks to all the current news hype around ChatGPT, it’s clear that the public at large is very quickly becoming way more concerned (as it should be) about the incoming tsunami of human “mind” labor obsolescence than AI-safety.

    A Butlerian Jihad may kill all hope for AGIs!

  54. Andrew Says:

    >> So, here’s my review: the first 80% or so of M3GAN constitutes one of the finest movies about AI that I’ve seen.

    Have you seen Spielberg/Kubrick’s AI? It is the best movie about AI I have seen, and I think it is hated because people misunderstand what AI could be: it’s easy to see Osment’s character as sentimental and miss that he was built to be sentimental. It’s the best, and easiest to miss, exploration of the space between a “real thing” and an “emulation of a real thing”.

  55. Dan Staley Says:

    I didn’t have anything to say on your original post, but the last sentence of your update got me thinking – if we could come up with a rationalization for an AI doing *anything* in the name of supporting its utility function, doesn’t this mean that a real AI could do *anything* to maximize its utility function?

    Like, I don’t even mean a superintelligent AI. One as smart as a human, or dumber, could still come up with a rationalization to take any course of action and convince itself that it’s maximizing its utility.

  56. Scott Says:

    Andrew #54: I saw the Spielberg movie a while ago but no longer remember any details, which puts a hard upper bound on how good I could’ve thought it was! Maybe I’ll rewatch it, now that this is my “job.”

  57. Tyson Says:

    Hi Scott. Regarding AI safety at OpenAI, I am hoping you can address the issue of transparency. To me this is one of the most crucial aspects. My concern is that AI safety can be “faked”, like green-washing, to give a useful impression. As an example, if you research Microsoft’s AI safety initiatives and mission statement, it sounds absolutely great. But then you look at the reality. The following article is old, but the last I checked, these issues still haven’t been adequately addressed.

    One of the claimed problems, not just with Microsoft, but in general, that I have heard, is that acknowledging a problem makes one more liable, and legal advisors recommend that executives willfully ignore them. In combination with the recommendations of the new regulatory system to rely on self-regulation, this seems especially problematic.

    Of course, it is reassuring to people that their websites and mission statements sound aligned with human interests, but without transparency how can we actually know? If Microsoft’s problems with Bing have been due to lack of know-how, wouldn’t it be nice if Google, who had solved the same problems already, shared the solutions? What good are secret AI safety measures? In this regard, shouldn’t OpenAI be more transparent? Should it publish the data set it created for reinforcement learning? Or maybe it has but I haven’t seen it? Eventually should companies need to follow a standardized reinforcement process? If so, who chooses? Is that too centralized? Should it be democratized?

    Measures like designing kill switches and watermarks are important. Aligning AI with our interests is important too, obviously. But all that can go out the window (or parts of it) if it interferes with, for example, corporate interests, or state interests, or even if only some have access to that ability.

  58. Rich Peterson Says:

    Scott: (This is way off topic, feel free not to publish, or publish later in responses to one of your future posts!) Tyre Nichols’ murder in Memphis shows the value of body cams for police. This was on your blog a few years back.

  59. Dmitri Urbanowicz Says:

    M3GAN seems to be the most realistic depiction of modern AI practices. A company, which was supposed to build a toy for children, instead makes an indestructible autonomous machine with unnecessarily powerful limbs and no reliable override mechanisms. Then the company allows this machine to roam free, assuming that it won’t hurt anyone without any proof whatsoever. When the worst happens, no legal action is taken against the company nor its employees. (It’s all AI, you see, not us.)

  60. MRG Says:

    I wish somebody would make a movie based on Ted Chiang’s novella “The Lifecycle of Software Objects”, with its idea of raising or parenting a naive sentient AI into maturity. It’s a similar premise to Spielberg’s “AI”. Quite complementary. But Chiang focuses on near-term consequences, and Spielberg has epic scope.

  61. Tyson Says:

    I hope this isn’t too much of a tangent.

    “much less because “chaos theory demands it” (Jurassic Park), but only to further its mission of protecting Cady.”

    There is one potentially interesting line of work on AI safety regarding chaos. AI is much better than us at chaos control. Basically, AI can figure out how to steer complex systems in ways that are too complicated for us to easily understand or detect. If an AI system is steering some complex system (e.g., in social, economic, geopolitical, or whatever domain), on its own or on behalf of people, would it be able to obfuscate its actions and agenda so that all we see are a bunch of strange and seemingly unrelated events?

    Maybe one thing you could do is look for unexpected and unexplained order where there should be randomness. But order can also form spontaneously. Can you distinguish between organic pattern formation and chaos control?

    Maybe watermarks can help? Suppose you could detect some or all of the information that an AI entity introduces into the system; could you then use that information to establish correlations between the formation of a pattern and the AI’s influence? Can watermarks be embedded in an AGI system without the system being able to subvert them?

    A related topic is message decontamination. If we start receiving messages from what seems like an extraterrestrial source, is it safe to even read them? Could the messages deliberately lead us towards our destruction without us even being able to tell? The messages could also be leading us towards saving ourselves from destruction too.

    Interstellar communication. IX. Message decontamination is impossible

    Maybe these problems are not solvable. But honestly, I don’t think we are really going to go to the extreme of ignoring messages from ETs, or handicapping AI to eliminate the risks entirely. So we should probably focus on trust. Sure ETs could theoretically manipulate us towards our own destruction, but why would they want to? Can we make an AGI system which we can trust to control our fate, or at least to be transparent about the fate that it seeks to create? If we detect suspicious hidden manipulation, and we can’t understand what the purpose is, should we be alarmed and pull the plug? On what basis can we make that determination?

    I’m not saying these are necessarily salient current problems in AI safety, but they are interesting.

  62. fred Says:

    “this film doesn’t exist”

  63. arbitrario Says:

    Fred #44: Maybe I am the one who is misunderstanding ChatGPT’s answer and I do badly need to refresh my QM, but some of those answers do not seem correct?

    First, if you go to imaginary time you do not get the time-independent Schrodinger equation, and indeed the solutions of the imaginary-time Sch. eq. are not stationary. It is true that by evolving a generic state with imaginary time evolution, in the large imaginary time limit you obtain the (low-energy) eigenstates of the Hamiltonian, but that’s not the same thing that ChatGPT is claiming.
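
    The imaginary-time point can be written out in a few lines. This is a standard textbook sketch, assuming a time-independent Hamiltonian $H$ with a discrete spectrum $H\phi_n = E_n\phi_n$, a non-degenerate lowest energy $E_0$, and an initial state with nonzero overlap $c_0$ on $\phi_0$; the symbols $\tau$, $c_n$, and $\phi_n$ are introduced here and do not appear in the comment.

```latex
% Substitute t = -i\tau into the time-dependent Schrodinger equation.
i\hbar\,\frac{\partial \psi}{\partial t} = H\psi
\quad\xrightarrow{\;t = -i\tau\;}\quad
-\hbar\,\frac{\partial \psi}{\partial \tau} = H\psi .

% Expand the solution in energy eigenstates, H\phi_n = E_n\phi_n, E_0 < E_1 \le \dots
\psi(\tau) = \sum_n c_n\, e^{-E_n \tau/\hbar}\, \phi_n ,
\qquad
\frac{\psi(\tau)}{\lVert \psi(\tau) \rVert} \;\to\; \phi_0
\quad (\tau \to \infty,\ c_0 \neq 0),\ \text{up to a constant phase.}
```

    The result is a diffusion-like equation whose solutions decay at different rates, not the eigenvalue equation $H\phi = E\phi$; the lowest eigenstate emerges only in the long-$\tau$ limit after normalization.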

    Second, maybe here I am misunderstanding the answer, but if I read the first and second answers it seems that it is at the same time claiming that the wave packet can expand/contract under direct/inverse time evolution but that the wave packet does not change shape, which seems... self-contradictory? Also, even if the packet did simply change direction, the probability density (via Born’s rule) would indeed be changing, just in light of the fact that the peak of the packet is moving.

    The bit on the entropy is correct tho.

  64. fred Says:

    Midjourney is really impressive, way beyond dall-e (last time I tried it)

  65. fred Says:

    arbitrario #63

    I just don’t know because those are genuine questions I’ve had for a while, and I’m no expert.

    Systems like ChatGPT, even if fed perfectly accurate data, often end up “interpolating” between truths, which isn’t a valid thing to do most of the time.
    That’s why all those AIs work so well for “art” (text or images), because it’s about mimicry and style, for which interpolation works well, rather than logic and tight reasoning.

  66. fred Says:

  67. Uspring Says:

    I wonder what complexity theory says about transformer-like architectures. AFAIK, computation time is mostly linear in token number, in addition to a small quadratic component arising from the attention mechanism. That seems to imply that e.g. ChatGPT can’t solve any hard problems. Actually one needs to include the output token number to calculate the computation time, but limiting output length doesn’t make solving a hard problem easy. As a side question: Does anybody know something about the BusyGPT(n) function, which is defined as the max output length for an n-token input? It might map out what ChatGPT thinks is a difficult question.
    In any case, it is weird that the transformer uses roughly the same time to answer an easy question like “What is 1+1 ?” and a hard one like “Can computers have qualia ? Please answer with yes or no.”. Real AGIs probably need computation times that are not computably bounded, with pathetic bailouts implemented, like “quota exceeded” or, for non-artificial intelligences, “I’m really tired”.
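
    A toy cost model makes the “linear plus a small quadratic term” picture concrete. Everything in it is an assumption for illustration: the constants `per_token` and `per_pair` are made up rather than measured, whitespace splitting stands in for a real tokenizer, and the names `forward_cost` and `cost_of` are hypothetical.

```python
def forward_cost(n_tokens: int, per_token: float = 1.0, per_pair: float = 0.001) -> float:
    """Toy cost: a fixed amount per token plus an attention term per attended position.

    The token at position i attends to positions 1..i, so the attention
    part sums to per_pair * n * (n + 1) / 2, which is quadratic in n.
    """
    return sum(per_token + per_pair * position for position in range(1, n_tokens + 1))


def cost_of(text: str) -> float:
    """Cost depends only on how many tokens there are, not on what they say."""
    return forward_cost(len(text.split()))  # crude whitespace "tokenizer"


print(cost_of("What is 1+1 ?"))
print(cost_of("Is P=NP true ?"))
print(forward_cost(2000) / forward_cost(1000))
```

    Under these assumptions two prompts with the same token count cost exactly the same, and doubling the length raises the cost by a factor between two and four.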

  68. mak Says:

    While there are a lot of movies directly about AI, the movie that best captures this moment with chatGPT is “Limitless” starring Bradley Cooper, in which the protagonist takes an experimental drug to amplify his mental abilities. ChatGPT is a tech analogue of that.

    Here’s a story by me imagining how things might unfold from here:
    ChatGPT in Jan 2023 is like Covid in Jan 2020. It will be ignored initially and then excitement and panic will take over suddenly. Only a small fraction of people will totally get it. Even though it spreads rapidly, most of the exposed will be unaware of its uses. There will be an age bias in the spread as well, with the younger population being the most susceptible. Companies will start requiring employees to work from the office to prevent the technology from spreading in the workplace.

    Soon governments will step in and place the world under tech quarantine, but it will continue to wreak havoc with new variants appearing around the world. There will be a debate about the use and effectiveness of (VPN) masks. A rumor spreads that Bill Gates is going to tag every user’s GPT content with a watermark. After a year or so there will be a co-ordinated global response to regulate every variant of the technology. Eventually unlicensed variants will continue to spread and the technology will become endemic.

  69. Scott Says:

    Uspring #67:

    (1) If (for example) P≠NP, then nothing whatsoever that ran in quadratic time, or any other “reasonable” time bound, could solve NP-complete problems in the worst case. This has nothing to do with transformer architectures in particular.

    (2) When you feed a prompt to GPT, there’s a bound (which you can adjust somewhat) on the number of tokens you get back, a bound that might not be saturated if a stop token is reached earlier. Also, the completion is probabilistic. For these reasons, I’d have no idea how to define your “BusyGPT” function.

    (3) The same transformer architecture is run no matter what the prompt. So there’s simply no reason why “What is 1+1?” would be any “easier” for GPT than “Can computers have qualia?”—it does exactly the same sorts of computations to generate verbiage about both! As you and I might well do too, for that matter. 🙂
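
    Point (2) can be illustrated with a toy sampling loop. This is a sketch under stated assumptions, not GPT's actual decoding code: `toy_model` is an invented stand-in that returns a fixed next-token distribution, and `STOP`, `sample_completion`, and the 0.2 stop probability are all hypothetical.

```python
import random

STOP = "<stop>"  # hypothetical stop-token marker


def sample_completion(next_token_probs, max_tokens, rng):
    """Draw tokens until STOP is sampled or max_tokens is reached.

    next_token_probs stands in for the model: it maps the tokens so far
    to a dict {token: probability}.
    """
    tokens = []
    while len(tokens) < max_tokens:
        probs = next_token_probs(tokens)
        choice = rng.choices(list(probs), weights=list(probs.values()))[0]
        if choice == STOP:
            break  # the bound is not saturated
        tokens.append(choice)
    return tokens


def toy_model(tokens_so_far):
    # Invented distribution: stop with probability 0.2 at every step.
    return {"blah": 0.8, STOP: 0.2}


# The same prompt yields different output lengths under different random seeds.
lengths = {len(sample_completion(toy_model, 16, random.Random(seed))) for seed in range(200)}
print(sorted(lengths))
```

    Because the output length is a random variable capped by an adjustable bound, a “maximum output length for an n-token input” has no obvious single value.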

  70. OhMyGoodness Says:

    As expected (the Biden poem is alarmingly formulaic-consistent with typical paeans to the glorious leader)-

  71. Uspring Says:

    Scott #69:

    Thank you for your answer.
    Re (3): My brain probably doesn’t go through the same kind of operations to answer the questions “What is 1+1 ?” and “Can computers have qualia ?”. If I can’t answer a question off hand and worse, am not even aware of a reasonably time bounded decision procedure, I must embark on some search strategies with unknown time bounds. I believe that if there is an algorithm, which performs as my brain does, it probably won’t halt on some inputs and needs huge amounts of time on others. I’m abstracting here from my finite life time and other practical restrictions.

  72. Scott Says:

    Uspring #71: It’s all a question of levels of description. I get that thinking about 1+1 doesn’t feel like thinking about the great questions of philosophy. At a neural level, though, your synapses are probably firing in much the same way in both cases, except of course that you’ll likely stop talking earlier in the first case! Likewise, in GPT, the same linear transformations and nonlinear activation functions are getting applied regardless of what question you ask it.

  73. fred Says:

    Scott #72

    But there’s still the concept of “mental fatigue” in humans.
    And one crucial divergence between humans and those GPT models is that the human brain is always in learning mode, and there’s a big difference when dealing with a new question vs a question we’ve already answered a thousand times.
    For example it’s been measured that the brain of a chess expert uses way less energy (when playing at a high level) than when an average player plays a game, which isn’t super obvious a priori.
    But I would expect that eventually all AIs will be also in constant learning mode.

  74. David Says:

    “supposing that an AI were as capable as M3GAN (for much of the movie) at understanding Asimov’s Second Law of Robotics—i.e., supposing it could brilliantly care for its user, follow her wishes, and protect her—such an AI would seem capable as well of understanding the First Law (don’t harm any humans or allow them to come to harm), and the crucial fact that the First Law overrides the Second.”

    Here’s the thing, Scott – and don’t gloss over this:

    If an AI is intelligent enough to understand Asimov’s laws, it is also intelligent enough to deceive humans.

  75. Tyson Says:

    Uspring #67:

    You can’t detangle the information in the response to a prompt from the information learned in the training process, and you can’t detangle the information learned in the training process from the information in the data. So you can’t easily figure out what to attribute its acquired ability to, or measure the effort of the combined system. You would have to answer questions like, how many clues or partial answers exist in the wild, even including what we’ve learned from nature? It might be possible that a solution to a very difficult open problem can be derived (through the training process) with little effort based on some subset of clues or partial answers that are floating around somewhere.

    I like to think of the trained ML models, conceptually, in terms of concepts from algorithmic information theory, such as conditional Kolmogorov complexity and logical depth.

    You can embed clues, or partial answers, or precomputed results, and the algorithms to leverage them, into a Turing machine (making it larger), to help you solve a finite set of problems or problem instances faster. But the leverage you get will be finite, because the Turing machine is finite. How far can a finite amount of leverage take you, I don’t know.

    A large trained ML model can be thought of as like a Turing machine which always runs within some fixed, finite time bound, and leverages a huge amount of embedded information. Because its ability to leverage embedded information is finite, there isn’t much to say about its asymptotic complexity; it falls flat at some point as the problem size grows or if the problem falls outside its domain. If you extend it so that it can run the code that it writes, however, then you would have a more interesting model in terms of computational complexity.

  76. Uspring Says:

    Scott #72:

    Perhaps going down to the neural level without considering connectivity leaves out essential properties. E.g.: most of GPT’s connectivity is feed forward with one exception: The decoder is fed back its output as one of its inputs. That makes output length dependent on input token content. The human brain very likely has all sorts of looping connectivity, which enables it to dwell a considerable amount of time on a particular subject. That is only possible for a GPT if it keeps talking while it is thinking.
    But to come back to the point I want to emphasize: The “What does it do ?” question for the huge GPT algorithm is very much harder to answer than the “How long does it take ?” question. I’d be surprised if anyone knew, what exactly e.g. the 39th stage of the GPT encoder is doing. The running time, though, for predictable output length is very easy to calculate.
    Say you have an AI program with a running time linear in token number. Then there is a good chance, that the algorithm can’t multiply. It might well be able to explain multiplication, but it can’t perform it. I tend to believe, that there is a relation between the abilities of an AI program and its complexity class. This relation might be trivial in the sense, that any AI program, which can’t do NP problems, is dumb. But ChatGPT would seem to be an exception to this, at least with the usual definition of dumbness. Possibly also, that the “How long does it take ?” question isn’t easier than the “What does it do ?” question once you consider neural nets with a bit more complicated feed back than the GPT.
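
    The multiplication remark can be made concrete by counting the work done by the schoolbook method. This sketch only shows that this particular algorithm needs a number of digit operations that grows quadratically with the number of digits; asymptotically faster multiplication algorithms exist, so it does not by itself prove anything about what a linear-time model can do. The function name `schoolbook_multiply` is hypothetical.

```python
def schoolbook_multiply(x: str, y: str):
    """Multiply two decimal digit strings; return (product, digit_multiplications)."""
    a = [int(c) for c in reversed(x)]  # least significant digit first
    b = [int(c) for c in reversed(y)]
    result = [0] * (len(a) + len(b))
    ops = 0
    for i, da in enumerate(a):
        carry = 0
        for j, db in enumerate(b):
            total = result[i + j] + da * db + carry
            result[i + j] = total % 10
            carry = total // 10
            ops += 1  # one single-digit multiplication
        result[i + len(b)] += carry
    digits = "".join(map(str, reversed(result))).lstrip("0") or "0"
    return digits, ops


for n in (4, 8, 16):
    product, ops = schoolbook_multiply("9" * n, "9" * n)
    print(n, ops)
```

    Doubling the number of digits quadruples the count of single-digit multiplications, which is the kind of growth a fixed amount of computation per token cannot absorb without emitting more tokens.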

  77. mak Says:

    It seems that biological neural networks have evolved to work at different depths depending on the task and urgency. Take reflex actions for instance. When we touch a hot object, a decision to retract the arm is made in the spine well before the signal reaches the brain, i.e. even before “we” are aware that a stimulus has occurred.

    Similarly, I feel we reply 1+1 = 2 almost by reflex (shallow brain network), but if you ask me how much is 235 + 68, it will take a different and deeper computational pathway in the brain.

  78. Anon Says:

    Hi Scott. My apologies for the unrelated comment. Since you expressed interest in the origins of covid I thought you may want to be informed that a nonprofit launched publicly today in the interest of bio safety with an esteemed team of scientists and professors. The link is Please feel free to leave this comment in moderation or not publish because it’s only for your information.

  79. OhMyGoodness Says:

    Dmitri Urbanowicz #59

    Of course a prudent manufacturer’s product warning label will be required-

  80. ClearNetwork Says:

    If you really want to watch about AI and synths, I would definitely recommend HBO Westworld

  81. Tyson Says:

    Uspring #76:

    If a human wants to solve a complex problem, they can get out some paper and pencil and begin an algorithmic process. Sure they could do this to some extent in their head. Maybe it is possible a human could solve some deep problems without paper, but how about, for example, classes of problems without O(1) space complexity? Technically we also have time limitations as individuals. Though we can collaborate. One may pick up where another left off and leverage knowledge acquired by others. In what form is knowledge passed down and along? Sometimes as knowledge that we write down and then absorb (or train an AI on) to integrate into a model, sometimes in the form of a state of an unfinished process.

    ChatGPT can’t multiply very well, but it can easily write a program that solves multiplication. Speaking more generally, there is no reason to expect a deep learning model cannot do as we do and leverage “shallow” derivations or “shallow synthesis” of existing (potentially deep) knowledge to guide algorithmic search for new deeper knowledge.

  82. Uspring Says:

    Tyson #81:

    “ChatGPT can’t multiply very well, but it can easily write a program that solves multiplication. Speaking more generally, there is no reason to expect a deep learning model cannot do as we do and leverage “shallow” derivations or “shallow synthesis” of existing (potentially deep) knowledge to guide algorithmic search for new deeper knowledge.”

    There is, e.g. for hard NP problems, no known heuristic to efficiently guide a search for a solution. And I doubt, that the training of a neural net will come up with one. The only way to solve such a problem is to grind through all of the search space. No language model does that.
    It doesn’t help, that the ChatGPT can write a multiplication program, but can’t execute it. I’d expect an AI program to be intelligent and not merely to tell me how it could be intelligent and let me do the work. It is entirely conceivable, though, to augment ChatGPT in such a way, that it will be able to execute its self written code. But that will put it into an entirely different complexity class.
    This code execution capability, or probably similar in effect, loops within the neural network, will likely make it difficult or impossible to train it with a gradient descent method. The dependency of the computing results may be chaotic on the parameters. Simple feed forward networks are much nicer in that respect.
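
    The gap between checking an answer and grinding through the search space can be sketched with subset sum, a standard NP-complete problem. This is a baseline illustration only: the function names `verify` and `brute_force` and the sample numbers are hypothetical, and exhaustive search is shown as the naive method, not as a claim about the best known algorithms.

```python
from itertools import combinations


def verify(numbers, target, indices):
    """Checking a proposed solution takes time linear in its length."""
    distinct = len(set(indices)) == len(indices)
    return distinct and sum(numbers[i] for i in indices) == target


def brute_force(numbers, target):
    """Exhaustive search over all 2^n subsets; returns a tuple of indices or None."""
    for size in range(len(numbers) + 1):
        for indices in combinations(range(len(numbers)), size):
            if sum(numbers[i] for i in indices) == target:
                return indices
    return None


numbers = [3, 34, 4, 12, 5, 2]
solution = brute_force(numbers, 9)
print(solution, verify(numbers, 9, solution))
```

    Verifying the returned indices costs a handful of additions, while the search may have to visit every one of the 2^n subsets before answering.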

  83. Szefide Says:

    Tyson #81:

    Personally, I believe the most important problems are those that we can solve with paper and pencil. O(1) space complexity processes in the brain are ruled by so many thoughts that sometimes they do not affect the outcome.

    A kill switch is an interesting discussion. I believe kill switches are implemented in AI.

    What do you mean by watermarking changes that the AI has made? How is this possible in the wide breadth of the internet?

  84. Tyson Says:

    Uspring #82:

    It is hard to get formal enough about deep learning to apply computational complexity theory, or even theory in general.

    Mainly, what I intend to get across is what to expect. Personally, not long ago, I couldn’t imagine an obvious path forward to AGI. Now, after the recent breakthroughs, I feel I can easily imagine the next steps, at least to create something which is effectively AGI. Not that full-blown AGI is the threshold at which AI becomes a risk. But we should expect the era of AI to begin to transform our fate, possibly rapidly and in an uncontrolled way. That’s the important thing.

    I keep thinking back to Eliezer Yudkowsky’s essay on coherent extrapolated volition. The insight which sticks for me is that we must be concerned with the possibility that a system can essentially become self-guided (outside of our ability to control) once it gets going. So getting the initial conditions right is crucial. Even so, his work focused on solving the problem of figuring out coherent extrapolated volition and developing an AI aligned with it.

    So here we are now, possibly at that moment in time, and guess what: it turns out that we are far from ready to even begin shaping the initial dynamic responsibly. The initial dynamic is instead being shaped by greed and lust for power (as we ought to have expected it would be).

    Let’s look at some events. First, let’s look at OpenAI, a not-for-profit that brands itself as an AI-for-good company. OpenAI becomes profitable and switches to a limited for-profit model. The extra cash will help them grow. But that’s not enough; they need more resources. So they partner with Microsoft.

    Let’s back up.

    1980s, Microsoft implements a system to classify and target unfriendly journalists, with the aim of getting them fired.

    2018, Bing is essentially playing the role of a Holocaust fan and genocide advocate. A few brave journalists write about it. It catches barely enough attention to get Microsoft to start addressing it. Why didn’t they know on their own that it was doing that? Or why didn’t they fix it on their own if they did know? These are important questions.

    2020, Microsoft fires its own journalists to be replaced with bots.

    2023, in a wave of euphoria over the impressiveness of ChatGPT, Microsoft’s CEO enthusiastically predicts the elimination of knowledge workers in general.

    Ok, so in a world without human journalists or human knowledge workers, who is going to speak up the next time someone’s product implicitly, perhaps accidentally, behaves in a harmful way and threatens our future?

    The crucial variable that we have now, while it lasts, is human involvement. Right now, if AI workers really want to help save the fate of humanity, the most important thing to do is be an activist.

  85. John Faughnan Says:

    Scott, I was rereading your 2008 blog post on Singularity/artificial sentience:

    I’d love to see a post that revisits what you wrote 14 (!) years ago.

  86. Uspring Says:

    Tyson #84:

    I agree with many of your thoughts on the undeliberated use of AI technology. With respect to Yudkowsky’s scenarios, I believe that current technology is quite a way from them, and that it is not a problem of scale but one of architecture. I do hope that other designs will make AIs more accountable and less of a black box, perhaps by requiring that they supply an explanation along with their output.


Comment Policies:

  1. All comments are placed in moderation and reviewed prior to appearing.
  2. You'll also be sent a verification email to the email address you provided.
  3. This comment section is not a free speech zone. It's my, Scott Aaronson's, virtual living room. Commenters are expected not to say anything they wouldn't say in my actual living room. This means: No trolling. No ad-hominems against me or others. No presumptuous requests (e.g. to respond to a long paper or article). No conspiracy theories. No patronizing me. Comments violating these policies may be left in moderation with no explanation or apology.
  4. Whenever I'm in doubt, I'll forward comments to Shtetl-Optimized Committee of Guardians, and respect SOCG's judgments on whether those comments should appear.
  5. I sometimes accidentally miss perfectly reasonable comments in the moderation queue, or they get caught in the spam filter. If you feel this may have been the case with your comment, shoot me an email.