Openness on OpenAI

I am, of course, sad that Jan Leike and Ilya Sutskever, the two central people who recruited me to OpenAI and then served as my “bosses” there—two people for whom I developed tremendous admiration—have both now resigned from the company. Ilya’s resignation followed the board drama six months ago, but Jan’s resignation last week came as a shock to me and others. The Superalignment team, which Jan and Ilya led and which I was part of, is being split up and merged into other teams at OpenAI.

See here for Ilya’s parting statement, and here for Jan’s. See here for Zvi Mowshowitz’s perspective and summary of reporting on these events. For additional takes, see pretty much the entire rest of the nerd Internet.

As for me? My two-year leave at OpenAI was scheduled to end this summer anyway. It seems pretty clear that I ought to spend my remaining months at OpenAI simply doing my best for AI safety—for example, by shepherding watermarking toward deployment. After a long delay, I’m gratified that interest in watermarking has spiked recently, not only within OpenAI and other companies but among legislative bodies in the US and Europe.

And afterwards? I’ll certainly continue thinking about how AI is changing the world and how (if at all) we can steer its development to avoid catastrophes, because how could I not think about that? I spent 15 years mostly avoiding the subject, and that now seems like a huge mistake, and probably like enough of that mistake for one lifetime.

So I’ll continue looking for juicy open problems in complexity theory that are motivated by interpretability, or scalable oversight, or dangerous capability evaluations, or other aspects of AI safety—I’ve already identified a few such problems! And without giving up on quantum computing (because how could I?), I expect to reorient at least some of my academic work toward problems at the interface of theoretical computer science and AI safety, and to recruit students who want to work on those problems, and to apply for grants about them. And I’ll presumably continue giving talks about this stuff, and doing podcasts and panels and so on—anyway, as long as people keep asking me to!

And I’ll be open to future sabbaticals or consulting arrangements with AI organizations, like the one I’ve done at OpenAI. But I expect that my main identity will always be as an academic. Certainly I never want to be in a position where I have to speak for an organization rather than myself, or censor what I can say in public about the central problems I’m working on, or sign a nondisparagement agreement or anything of the kind.

I can tell you this: in two years at OpenAI, hanging out at the office and meeting the leadership and rank-and-file engineers, I never once found a smoke-filled room where they laugh at all the rubes who take the talk about “safety” and “alignment” seriously. While my interactions were admittedly skewed toward safetyists, the OpenAI folks I met were invariably smart and earnest and dead serious about the mission of getting AI right for humankind.

It’s more than fair for outsiders to ask whether that’s enough, whether even good intentions can survive bad incentives. It’s likewise fair of them to ask: what fraction of compute and other resources ought to be set aside for alignment research? What exactly should OpenAI do on alignment going forward? What should governments force them and other AI companies to do? What should employees and ex-employees be allowed, or encouraged, to share publicly?

I don’t know the answers to these questions, but if you do, feel free to tell me in the comments!

93 Responses to “Openness on OpenAI”

  1. anon85 Says:

    In light of recent events, can you repeat the phrase “I am not currently under an NDA that would be violated by mentioning the existence of said NDA”? And preferably also the phrase “I am not under a non-disparagement agreement and can freely criticize OpenAI”.

  2. Scott Says:

    anon85 #1: As far as I know, I’m under neither a nondisparagement agreement, nor an NDA that would prevent me from mentioning the NDA’s existence.

  3. steeven Says:

    If I recall correctly, you put the chance of death at around 2%, which you said was acceptable if you had a really good oracle AI. Sorry if I’m not getting that right; I can’t find the quote. In this article, you said “It seems pretty clear that I ought to spend my remaining months at OpenAI simply doing my best for AI safety”. Do you think your mind has changed about the importance of AI safety over the course of your time at OpenAI, or is it about the same, given continued efforts in AI safety?

  4. Seth Finkelstein Says:

    Scott, the real world is rarely like genre fiction, where the supervillain monologues in detail to the heroes about the Evil Plan, for the expository benefit of the audience. Typically, the only time one can actually find gloating like that is among high-ranked finance and business people talking to other similar types. Very amusing stuff sometimes comes out in fraud lawsuits, e.g. revealing messages about bluntly fleecing the suckers. But at your level, nobody with any sense is ever going to tell the Useful Idiots “Here’s the scam we’re doing” (if that ever happens, RUN, because they’re bad at it!).

    To take a very crude example: How many of all the prominent intellectuals and scientists who were guests of Jeffrey Epstein would ever be told by him or his circle that their role is to help make a sex trafficker socially respectable?

    There is no doubt in my mind that the big-money people know what they are doing – basically, that doomerism is a very good smokescreen over predation. This doesn’t mean the doomers are insincere in their beliefs. But, for example, a lot of rank-and-file Communists were sincere in their beliefs too, yet Stalin was still what he was.

    There are enormous battles shaping up over AI and so many social issues. I won’t lay out my fantasy agenda; there’s no point. And I won’t say you should join the fight here either – it’s not my place to say that, nor is it even necessarily a good idea. But the simplest follow-the-money observations should show what’s going on with AI.

  5. anon85 Says:

    Awesome, good to hear!

    If you have shovel-ready technical AI safety problems on which you think progress can be made, you should consider posting them on your blog.

  6. Hyman Rosen Says:

    Since I hold “AI risk” and “AI alignment” proponents in the same contempt as I do those who attempted, largely successfully, to destroy nuclear power and GMOs, I am glad to see this turmoil happening. And I continue to hope that AI technology, being software, will, as open source, escape the ability of would-be gatekeepers to hinder its development.

  7. Prasanna Says:

    If AI remains exclusively empirical as it is now, then won’t safety always have to play catch-up with capability? When there is no way to predict a given capability within a margin of uncertainty, and it is found only empirically later (sometimes post-deployment), it may be too late or moot to introduce safety measures. Shouldn’t the AI research community focus on developing robust theory, and shouldn’t CS theorists also make this a priority? An analogy from the recent movie Oppenheimer: if there had been no way to calculate whether the atmosphere would be ignited except by conducting a test, the question would have remained up in the air even if ignition was only a remote possibility.

  8. Scott Says:

    steeven #3:

      Do you think your mind has changed about the importance of AI safety over the course of your time at OpenAI, or is it about the same, given continued efforts in AI safety?

    I don’t know if my overall view of the importance of AI safety has changed over the last two years, after the one monster update that I had to do following GPT’s success. But certainly my more detailed views have evolved. As one example, the board drama in November underscored that there’s never going to be a slowdown, or pause, or anything like that, on the basis of AI safetyists getting bad vibes—it’s going to take dramatic empirical evidence, whether from real life or from a controlled experiment. To my mind, that increases the importance of dangerous capabilities / “warning shots” research relative to other kinds of AI safety research.

  9. Scott Says:

    Seth Finkelstein #4: The question in life is rarely “whether to join the fight”; it’s which fight to join! I decided to join the fight to use theoretical computer science to get a better understanding of AI safety, which it seems to me can be motivated by either

    (1) near-term worries about AI misuse, or
    (2) long-term worries about AI existential risk, or
    (3) pure scientific curiosity

    —an overdetermined set of motivations if I’ve ever seen one! If you want me to join a different fight, I feel like the least you could do would be to specify which one.

  10. Scott Says:

    anon85 #5:

      If you have shovel-ready technical AI safety problems on which you think progress can be made, you should consider posting them on your blog.

    I do! As one example, see the talk “Circuit Conspiracies” that I gave at Simons a few weeks ago, which features a class of open problems that arises both in AI interpretability and in the quest for near-term quantum supremacy experiments (!!).

  11. Scott Says:

    Prasanna #7:

      If AI remains exclusively empirical as it is now, then won’t safety always have to play catch-up with capability?

    Well yes, that’s the fear! Hence the urgent desire for better theory.

  12. Aram Says:

    It’s important for academics to stand up against nondisparagement agreements. Good for you for preserving your right to speak your mind. I turned down a consulting agreement for that reason. The concern about OpenAI, from a recent article by Kelsey Piper, is that you’ll be asked to sign an NDA when you leave, and if you don’t, you’ll have to forfeit your equity.

  13. Scott Says:

    Aram #12: As a temporary contractor, I don’t have equity!

  14. Nick Drozd Says:

    You’ve mentioned watermarking before, but I don’t understand what it means. How would / does it affect me as a user of ChatGPT?

  15. Scott Says:

    Nick Drozd #14: If it works properly, then it has zero effect on your day-to-day experience of using ChatGPT—it just makes it easier for others to determine whether you used ChatGPT or didn’t.

  16. zzz Says:

    “largely successfully, to destroy nuclear power and GMOs,”

    Someone tell the sun and evolution they are no longer needed.

  17. John Schilling Says:

    “in two years at OpenAI, hanging out at the office and meeting the leadership and rank-and-file engineers, I never once found a smoke-filled room where they laugh at all the rubes who take the talk about safety and alignment seriously”

    If such a room existed, you would not have been allowed past the door without signing the NDA forever barring you from mentioning that the room exists. Or, I suppose, being a billion-dollar investor. Why would they speak candidly to someone like you?

  18. anon Says:

    Ilya was outplayed by Sam in politics. From what I have read, Sam is extremely politically savvy and great at working behind people’s backs.

    He wanted to kick out the board members who cared about safety; Ilya was alarmed, and the board kicked out Sam.

    Sam used all his connections in the valley to make it impossible for OpenAI to operate without him. “OpenAI is nothing without its people” is such a clever political slogan.

    The rank and file at OpenAI, many of whom have become multi-millionaires, sold the company out to Microsoft when they saw the potential loss of their own interests. The board was not profit-oriented, but the money has corrupted the rank and file.

    Then they used personal relations with Ilya, who seems to be a generally nice person, plus the threat of leaving OpenAI, to get him to back down.

    Ilya lacked politically experienced allies in the fight. He backed down. Board members were removed. Sam was back. It was clear that Sam didn’t want Ilya to have a leadership position at OpenAI anymore.

    Now the board is packed with corporate America, particularly those who know how to prevent government oversight, instead of people who care about OpenAI’s original mission.

    The rank and file have shown that they don’t have the mettle to stand up for the mission.

    The only option is forced external supervision. That is unlikely to come from the US, so it will fall to the EU to set the standards.

    At this point, one should consider OpenAI a subsidiary of Microsoft for all practical purposes, except that the arrangement protects Microsoft from being blamed when things go bad. The EU laws are tough: they can fine non-compliance at up to a few percent of a company’s global revenue, and Microsoft prefers to avoid subjecting its profits to that kind of risk, letting a “startup” deal with such risks and the bad press while it benefits from the fruits.

  19. anon Says:

    Satya’s claim that folks could leave and just go work for Microsoft was pretty much a bluff.

    If the board had managed the relationship with Microsoft well, they could likely have gotten rid of Sam. Instead, they sent the wrong signals to stakeholders, implying that they only cared about safety. That kind of message is doomed to fail.

    They should have focused on Sam playing games with the board, and scheming to get rid of board members, as the reason for kicking him out. And they should have sent their messages not as extremists but as people who understand the stakes and are going to do their best to preserve the interests of the stakeholders while maintaining OpenAI’s mission.

  20. Scott Says:

    John Schilling #17: Oh, I was careful not to say with certainty that the smoke-filled room can’t exist! But I think the explanation overwhelmingly favored by Occam’s Razor is that pretty much everyone at OpenAI really believes in the mission of “building AGI that’s beneficial to humanity”—just to different extents, and sometimes with wildly different ways of operationalizing what it means.

  21. fred Says:

    In recent interviews with Altman it’s been clear to me that OpenAI’s plan is to boil the frog by slowly raising the water temperature.

  22. Michael Vassar Says:

    Occam’s Razor says that when very smart people consistently say that they believe the same thing as one another but have wildly different ways of operationalizing what that means, they actually agree on a protocol for communication which presupposes bad faith and which abandons generative grammar. To paraphrase Wittgenstein, if Sam Altman could speak frankly you would not know what he was saying.

  23. JimV Says:

    So far, what is going on at OpenAI seems right out of the Jack Welch Manual of Corporate Management: no development work should be funded which does not immediately contribute to the bottom line; there is no value in the expertise of long-term employees (or in developing that expertise with training programs) because everything can be outsourced when necessary; when the experts tell you one thing and your gut tells you something else, go with your gut; the current bottom line might not be everything, but it sure beats the heck out of whatever is in second place; and get rid of whoever questions these principles.

    That manual became standard at Boeing, and to a lesser extent (“stack-ranking”) at Ford and other places. OpenAI has added to it with its non-disparagement agreements. Welch would have loved that.

  24. Scott Says:

    JimV #23 and others: I’ll make the following conditional statement.

    If you take AI catastrophe scenarios seriously, then it seems reasonable to hold OpenAI to ridiculously, world-historically high standards of ethical conduct, and to criticize it for falling short of those standards.

    If, on the other hand, you don’t take those scenarios seriously, then it seems like you should hold OpenAI to the same standards as other companies, and it still strikes me as coming off really well by those normal standards. For godsakes, you can read OpenAI’s current and former leaders hash out their ethical views on Twitter, in a manner consistent with how I can attest that they do in real life. About how many $90 billion companies can one say the same?

  25. fred Says:

    In the end I think that the nature of AI technology is such that the big picture is what matters, and it’s gonna happen regardless of the details; e.g. all the drama between Edison and Tesla is but a footnote in the unstoppable tsunami of the rollout of electrical energy during the industrial revolution.

  26. Jessica Taylor Says:

    Hi Scott, I am curious what theoretical CS problems you think are related to AI safety. Recently I have been thinking about consistent guessing (https://www.lesswrong.com/posts/8kghiWcnxpjhraDgE/the-consistent-guessing-problem-is-easier-than-the-halting) and might be interested in working on other theoretical CS problems.

  27. Wyrd Smythe Says:

    “…the OpenAI folks I met were invariably smart and earnest and dead serious about the mission of getting AI right for humankind.”

    I don’t doubt it one bit. But being smart, earnest, and serious doesn’t prevent one from being misguided or even dangerously wrong. Powerful tools are always dangerous, because people. Traditionally, the really scarily powerful tools (nuclear devices, genetic labs, etc) were out of reach of most actors, but computer hardware and software are increasingly within reach of anyone.

    While I have serious concerns about the coming AI Revolution, I don’t doubt its inevitability. As with all our major revolutions (Agricultural, Scientific, Industrial, Electronic, etc), it will bring a vast bounty of benefits along with a dismaying cost only fully realized in retrospect.

  28. Danylo Yakymenko Says:

    Scott #24

    > coming off really well by those normal standards.

    Does the situation with Scarlett Johansson and the imitation of the film “Her” appear normal to you? Or do you mean normal as in Trump-like normal, where you do whatever you want?

    I think there should be no “opinions” on how serious the current world situation is. War, AI, a criminal as a possible president of the USA, wealth inequality, climate change and pollution, etc. We are falling into a black hole.

  29. Scott Says:

    fred #25: But with the history of electrification also, wouldn’t you say that there was path-dependence, where certain early decisions got locked in even though different ones would’ve been better? One obvious example was building a worldwide network of gas stations so everyone could have their own internal combustion vehicle, rather than charging stations so they could all have their own electric vehicle. But then there’s even the simple fact that different countries are stuck with differently-shaped plugs. AC vs DC is not a great example, since DC could never have been transmitted over long distances.

  30. Scott Says:

    Jessica Taylor #26: See my comment #10! Also happy to set up a call sometime—shoot me an email. And blog posts about my favorite research directions are a-comin’.

    Not surprisingly, problems of the form “formalize X” are a dime a dozen; good problems of the form “I already formalized X; now prove or disprove this conjecture about it” are much rarer. 🙂

  31. hnau Says:

    West Coast corporate culture is too passive-aggressive to laugh at anyone, let alone do so in smoke-filled rooms. Nevertheless it’s common for tech companies to have teams or projects that are implicitly understood to exist to placate certain highly tenured technical staff. The usual term for this in tech is “science project” – which, to be clear, has a slighting and disapproving connotation among engineers. And if you can’t spot the science project at OpenAI… maybe it’s you.

  32. Scott Says:

    Danylo Yakymenko #28: Companies have pushed technologies like leaded gasoline, asbestos, and cigarettes that killed or debilitated millions of people, despite knowing the dangers. You’re asking me about OpenAI’s day-long offering of an AI voice that wasn’t Scarlett Johansson’s but sounded kind of like hers? Yes, I do think that’s well within the “normal” range of mistakes that companies make, even though the circumstances are obviously new.

  33. John Schilling Says:

    Scott #20: It is increasingly difficult for me to believe that Sam Altman believes in “building AGI that’s beneficial to humanity”, as opposed to just a narrow subset of humanity including Sam Altman. And I’ve pretty much given up trying. But it’s clear that the success of OpenAI is based on Altman selling that mission to the people doing the actual work; the only question is how big the cynical inner circle is. And perhaps how gullible the people still working there are; the list of departures suggests that anyone inside OpenAI and paying attention can see serious problems.

  34. fred Says:

    Scott #29

    True, but technological decisions made in the 19th and early 20th centuries were in the hands of a limited set of people with a very narrow view.
    These days things have drastically opened up, especially since AI is software (although hardware does matter too, for the training). With so many millions of pairs of eyes looking at it ‘cooperatively’ right now, the chance of arbitrarily locking in a single path at some major decision fork seems way lower.

  35. Scott Says:

    John Schilling #33: I know most of the people who left, and talked to many after leaving. What’s weird is that their models of the motivations of OpenAI’s leadership, and ways of talking about such questions, are probably closer to mine than to yours!

  36. fred Says:

    Scott,

    given the time you’ve now spent in contact with practical AI work (with the kind of huge resources not typically found in academia), do you see it as inevitable that we’re on the path to AGI? Do you think the transition to AGI will be somewhat gradual, or a sudden jump/phase transition?

  37. fred Says:

    Danylo Yakymenko #28:

    “Does the situation with Scarlett Johansson and the imitation of film “Her” appear normal to you?”

    I’m not sure what you mean by it being normal or abnormal…
    What’s displayed is simply a natural progression in human/machine interfacing.
    You’d rather they had made it sound like HAL 9000?!
    I’m also not clear on why you mention Scarlett Johansson in particular, as if the fact that she is female mattered. For what it’s worth, the next demo they showed had one female AI and one male AI.

  38. fred Says:

    SAL 9000 had a female voice

  39. Freemason Service Says:

    Hi Scott,

    I’m curious: if somebody asked you to implement a red-black tree as a class in Python, could you do it, right now, without preparation?

  40. Scott Says:

    Freemason Service #39: Sure, by asking ChatGPT, which is how the majority of the world would now do it. 😛

    To do it without ChatGPT, I’d need preparation, as I’ve never needed or cared about red-black trees since studying them almost 30 years ago.

    If you submit a hostile followup, I’ll know for certain that you’re the same troll who tries again and again to upset and unnerve me with off-topic questions from fake email addresses.

  41. Nick Drozd Says:

    Scott #15

    > If it works properly, then it has zero effect on your day-to-day experience of using ChatGPT—it just makes it easier for others to determine whether you used ChatGPT or didn’t.

    If someone can tell that some text was generated by ChatGPT, that can only be because the text has a particular form. So is the watermarking idea just to ensure that the generated text is of some particular form? That would affect my experience as a user — the kinds of text I get will be restricted. Is ChatGPT’s cloying style (“My apologies for the confusion…”) a result of watermarking?

    (Of course, even if some text matches that form, it’s not guaranteed that it was actually generated by ChatGPT. A human could simply learn to imitate that style.)

  42. Scott Says:

    Nick Drozd #41: No. A signal gets inserted via the choice of certain word combinations over others, choices that ChatGPT would otherwise have made randomly. This is done using a pseudorandom function. To tell the difference from normal ChatGPT output, you’d need to be able to break the pseudorandom function. (We’ve implemented this and empirically confirmed that no one can tell the difference, as we already knew theoretically.)
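
    To give the flavor in code: below is a toy sketch of the general “pseudorandom sampling” idea I’ve described in public talks. It is not OpenAI’s actual implementation, and every name and parameter in it is made up. The key property is that choosing the token maximizing r^(1/p), with r pseudorandom in (0,1), is distributed exactly like ordinary sampling from the model, so only someone holding the key can detect anything.

      import hashlib
      import hmac
      import math

      KEY = b"secret-key-held-by-the-provider"  # hypothetical; never published

      def prf(context: tuple, token: str) -> float:
          """Map (recent tokens, candidate token) to a pseudorandom r in (0, 1)."""
          msg = ("|".join(context) + "||" + token).encode()
          digest = hmac.new(KEY, msg, hashlib.sha256).digest()
          return (int.from_bytes(digest[:8], "big") + 1) / (2**64 + 2)

      def sample_token(probs: dict, context: tuple) -> str:
          # Exponential-race trick: the argmax of r**(1/p) is an exact sample
          # from probs, so outputs look like normal sampling without the key.
          return max(probs, key=lambda t: prf(context, t) ** (1.0 / probs[t]))

      def detect_score(tokens: list, k: int = 4) -> float:
          # Watermarking biases r toward 1 on the chosen tokens, so -log(1-r)
          # is anomalously large; unwatermarked text averages 1 per token.
          return sum(-math.log(1.0 - prf(tuple(tokens[max(0, i - k):i]), tokens[i]))
                     for i in range(1, len(tokens)))

    In this toy version, a few hundred tokens are enough for the detection score to separate watermarked from unwatermarked text with high confidence.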

  43. siravan Says:

    I think you dismissed the Scarlett Johansson affair too quickly. Yes, the actual event was minor and largely inconsequential. However, it shows the mentality of Sam Altman and his ilk, which is “when I want something, it is MINE”, without concern for legal, ethical, or moral considerations.

    I also think that even the watermarking business is not completely uncontroversial. Yes, in the right hands and with good intentions, it can be used to improve AI safety. However, a cynical view is that OpenAI and other companies are primarily working on it as a way to track their products for monetization purposes.

  44. fred Says:

    Freemason Service.

    “If somebody asked you to implement the red-black tree as a class in Python, could you do it, right now, without preparation?”

    You mean, as a product, with zero bugs and good efficiency on a wide variety of hardware, i.e. taking advantage of multiple cores by using some multithreading API while also leveraging memory caches, etc.?
    And while you’re at it, why even use a red-black tree? In all likelihood there are data structures better suited to solving any particular problem…
    The point being: this is an engineering job, and it’s not Scott’s job.

  45. fred Says:

    Scott #42
    As an analogy, isn’t it a bit like asking ChatGPT to generate an answer that rhymes?
    Except that with watermarking, the rhyming is hidden (known only to the company running the model)?

  46. Freemason Service Says:

    Hi Scott,

    I’m sorry if I came off as hostile; that wasn’t my intention at all! I’m just genuinely curious how often you encounter red-black trees, self-balancing binary search trees, or such data structures more generally on a day-to-day basis working at OpenAI.

    So why did we spend so much time on red-black trees, AVL trees, binary heaps, and all kinds of generalizations of these structures in my college DSA course? It felt like 2/3 of the course, to be honest.

    Self-balancing binary search trees are useful for implementing associative arrays/dictionaries, and also for implementing priority queues, which I understand to be essential to so many algorithms (Dijkstra’s and A* in graph theory, as well as numerous other graph algorithms and multi-set data structures). In my conceptual understanding of DSA, they’re sort of essential. And I’m sure they’re used in many ML algorithms more complicated than backpropagation? Anything that uses a priority queue or an associative array?

    So I’m just curious how you get away with not understanding these data structures while working in ML. Like aren’t they essential to understanding how so many of these algorithms actually work?

  47. Danylo Yakymenko Says:

    fred #37

    My bad, I didn’t include the context of the situation with “Her”. Here is the story https://garymarcus.substack.com/p/the-openai-board-was-right

    Scott #32

    Agree, such “mistakes” are completely normal from that point of view. And we probably should presume that every normal company is trying to cheat to be competitive and successful.

  48. fred Says:

    Danylo

    Haha, given ScarJo’s propensity to litigate, Altman should have played it safe and had the model mimic the voice of Roseanne Roseannadanna instead…

  49. Scott Says:

    Freemason Service #46: Why we spend so much time on red-black trees and the like in undergrad CS is a good question … arguably we shouldn’t! A curriculum designed for today rather than the 70s might put more stress on crypto protocols, approximation algorithms, and ML.

    Having said that, I did get an A+ in undergraduate data structures (which included red-black trees) when I was 15 years old! But having understood it doesn’t mean it’s fresh in memory, not having been needed (as I said) for nearly 2/3 of my life!

    Basically you keep rebalancing the tree to make sure it always has depth O(log n). I could rederive something fitting that description from first principles if I needed to, which I won’t.
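
    If anyone wants the invariants in code, here’s a minimal sketch of a checker (with an assumed node layout, and a checker rather than a full insertion routine): no red node has a red child, and every root-to-leaf path passes through the same number of black nodes. Together with a black root, these force depth O(log n), which is the whole point.

      from dataclasses import dataclass
      from typing import Optional

      @dataclass
      class Node:
          key: int
          red: bool
          left: "Optional[Node]" = None
          right: "Optional[Node]" = None

      def black_height(node: Optional[Node]) -> int:
          """Return the black-height, raising if an invariant is violated."""
          if node is None:
              return 1  # empty leaves count as black
          if node.red and ((node.left and node.left.red) or
                           (node.right and node.right.red)):
              raise ValueError("red node with red child")
          lh, rh = black_height(node.left), black_height(node.right)
          if lh != rh:
              raise ValueError("unequal black-heights")
          return lh + (0 if node.red else 1)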

  50. Scott Says:

    siravan #43: I genuinely don’t understand how watermarking could help with “monetization” … maybe you could spell the logic out for me? On the contrary, there’s a fear that watermarking could put any AI company that uses it at a competitive disadvantage. And yet the AI companies are actively exploring it (and in some cases already using it) anyway, both because of the likelihood of future regulations on provenance and just to get out ahead of the misuses of generative AI.

  51. Scott Says:

    fred #45: Sure, if you like.

  52. siravan Says:

    Scott #50: I may misunderstand how watermarking works for AI-generated stuff. Please correct me if this is the case. However, this is how I think about it, as it was back in the old pre-AI days. Say you generate an image. You can modify some of the least significant bits so that passing them through a hash function returns a known signature. It is a tool to give the creator of a product a way to produce a cryptographic certificate that they are, indeed, the actual producer. Now, say company A sells an instance of its generative image software to company B, stipulating that they should get paid a licensing fee for each image B sells. Watermarking allows A to keep track of B’s products to enforce the contract.
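
    In code, the old-school version I have in mind is something like this toy sketch (the array shapes and the signature format are purely illustrative, not any company’s actual scheme):

      import numpy as np

      def embed(pixels: np.ndarray, signature: str) -> np.ndarray:
          """Hide the signature's bits in the least significant bits."""
          bits = np.unpackbits(np.frombuffer(signature.encode(), dtype=np.uint8))
          flat = pixels.flatten()  # a copy; assumes bits.size <= flat.size
          flat[: bits.size] = (flat[: bits.size] & 0xFE) | bits
          return flat.reshape(pixels.shape)

      def extract(pixels: np.ndarray, n_chars: int) -> str:
          bits = pixels.flatten()[: n_chars * 8] & 1
          return np.packbits(bits).tobytes().decode()

      img = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
      assert extract(embed(img, "A:12345"), 7) == "A:12345"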

  53. Scott Says:

    siravan #52: OK, but that doesn’t make any sense with OpenAI’s business model. They charge for the use of their models, or for subscriptions, not for the subsequent dissemination of AI-created products. That being so, watermarking is only a commercial risk … but it might get done anyway!

  54. siravan Says:

    Scott #53: I agree it doesn’t make sense for the current business model, but it may for a future one. I gather that most current AI companies are not doing great financially (OpenAI is doing better than most). The main source of income is still hype-driven heavy investment. We have all seen how this ends. Invariably, the hype will die off, the honeymoon will end, and the investors will demand returns and profitability. I think the ethical thing to do is to be prepared for that day.

  55. Shmi Says:

    Scott, I wonder if your experience with LLMs gave you any insights into what you once called a “pretty hard problem of consciousness”?

  56. Alan B Says:

    What specifically has OAI done to deal with data integrity, and does OAI do any type of binary verification for their software systems (https://slsa.dev)? For example, there’s a growing movement to procure SBOMs (software bills of materials) for software critical to government infra. Upstream data dependencies for critical software systems are prone to attacks, especially since it’s often unknown how code changes are reviewed or whether there are any controls on who can publish changes to a dependency’s source.

    For data integrity, it seems data sources should have some type of verification associated with them, covering both the data source and the data content. There seems to be huge risk in not auditing the data sources and content, particularly if some bad-faith actor publishes biased or incorrect data, which might make its way into critical systems if developers are copying and pasting LLM output into software they develop. I suppose this would essentially be some sort of data bill of materials, albeit one that is difficult and tedious to procure and that will slow down releases.

    I think it’s not just a risk for OAI, but a general AI/ML problem, especially for models whose API or output makes its way into production. In particular, HuggingFace open-sources models, but I don’t really know if there are any controls on the data they were trained on, or on the upstream packages they rely on.

  57. Devdatt Says:

    But how is watermarking going to reduce p(doom), which is after all what “Superalignment” was all about? Were Ilya and Jan excited about something as sober and grounded as watermarking? Do you think we should take p(doom) seriously, especially estimates such as 60% or even 99.999%? And what time scales are we talking about, compared to, say, p(doom by climate change)?

  58. Scott Says:

    Shmi #55:

      Scott, I wonder if your experience with LLMs gave you any insights into what you once called a “pretty hard problem of consciousness”?

    I mean, LLMs finally confirm that an AI can come very close to passing an unrestricted Turing Test … but I and others had long granted as much in thought experiments.

    Maybe the biggest philosophical impact that LLMs have had on me, is that they underscore how drastically my intuitions about an entity’s consciousness are affected by my ability to rewind the entity, wipe its memory clean, run it many times from the same initial condition, etc. If you haven’t tried all this with ChatGPT, you should!

  59. Scott Says:

    Devdatt #57: As I’ve written about before, my own “p(doom)” strongly depends on the definition—there were already several terrifying paths to doom (nuclear war, engineered pandemics, ecological collapse…) before AI, and I now expect AI to be involved before long in everything that happens in civilization, but what would it mean for AI to play the “central causal role” in a doom event? But in no case would I go as high as 60%, let alone 99.999%.

    Watermarking of course has nothing to do with preventing doom, except insofar as it’s a major test case right now for whether AI companies can coordinate around a safety measure that enjoys widespread support but that might put any one of them individually at a competitive disadvantage. Ilya and Jan both of course fully understood that argument, and even full-on doomers like Eliezer and Zvi have been supportive of watermarking for that reason.

    It’s true that watermarking was never a great fit for the Superalignment team. And indeed most of the people relevant to watermarking deployment, with whom I continue to work, are outside that team.

  60. Prasanna Says:

    Scott #11,
    Classical ML (statistical learning) theory was quite well developed, until deep learning supplanted it with the ImageNet breakthrough. Both industry research and academia downplayed the lack of theoretical underpinnings for deep learning, until ChatGPT woke everyone up to the importance of that. Now there is a flurry of papers being published, with much confusion about the right path forward. Also, huge questions have been raised about theoretical work from academia and other institutions, with some stating that any research not involving frontier models is passé. What should be the right approach in this melee?

  61. Edan Maor Says:

    Scott #58:

    > Maybe the biggest philosophical impact that LLMs have had on me, is that they underscore how drastically my intuitions about an entity’s consciousness are affected by my ability to rewind the entity, wipe its memory clean, run it many times from the same initial condition, etc. If you haven’t tried all this with ChatGPT, you should!

    Have you read “Permutation City”? If you haven’t, it’s a wonderful hard-SF book by Greg Egan (an author and mathematician) which explores exactly these questions. One of my favorite books ever, and one of the few books that actually made me think deeply about philosophical questions.

    Highly recommended! (I first heard about it from an article written by Eliezer, btw.)

  62. Scott Says:

    Prasanna #60: Great questions! But hard to answer in the space of a blog comment.

    I’ll be writing up some thoughts this summer, but briefly: whenever theory fails to predict something as consequential as the success of deep learning, losers declare that theoretical understanding itself is now discredited, while winners get to work creating better theories. 😀

  63. Scott Says:

    Edan Maor #61: Yes, Permutation City was awesome! (Well, for the ideas, not necessarily as character-driven fiction. 😀 )

  64. Bill Benzon Says:

    Scott, #58: “…rewind the entity, wipe its memory clean, run it many times from the same initial condition, etc.”

    YES. I’ve done that various times. Perhaps the most interesting was with a single-word prompt: “story.” I report those results in “ChatGPT tells 20 versions of its prototypical story, with a short note on method, Version 2”:

    Abstract: ChatGPT responds to the prompt, “story”, with a simple story. 10 stories elicited by that prompt in a single session have a greater variety of protagonists than 10 stories each elicited in its own session. Prototype: 19 stories were about protagonists who venture into the world and learn things that benefit their community. ChatGPT’s response to that simple prompt gives us a clue about the structure of the underlying model.

  65. fred Says:

    Scott #53

    “OK, but that doesn’t make any sense with OpenAI’s business model. They charge for the use of their models, or for subscriptions, not for the subsequent dissemination of AI-created products.”

    I’m always confused regarding OpenAI’s offerings vs. what Microsoft is offering in Bing.
    E.g. isn’t https://www.bing.com/images/create actually using DALL-E 3?
    Does anyone know for sure what Bing’s LLM is based on? (GPT-4?)
    Are the two companies using different server farms, or do they share infrastructure?

  66. anon Says:

    One thing you learn working in big organizations is that people’s intentions matter much less than organizational incentives.

    Money and power corrupt, and both are ample at OpenAI.

    Can one trust Sam Altman to be leading such a disruptive technology? My take is: clearly not. From what is known about him, he is a master manipulator and pretty much a dishonest person.

    I am truly impressed, though, by how graciously Ilya is leaving. I see some of the qualities of his supervisor, who famously rejected military funding for his research.

    I am going to wait and see what Ilya does next. It’s gonna be interesting.

  67. Adam Treat Says:

    siravan #43,

    “I think you dismissed the Scarlett Johansson affair too quickly. Yes, the actual event was minor and largely inconsequential. However, it shows the mentality of Sam Altman and his ilk, which is “when I want something, it is MINE”, without concern for legal, ethical, or moral considerations.”

    Don’t forget that Altman put out a statement after the whole thing blew up in his face saying that the voice was never intended to sound like Johansson’s, despite them trying desperately to secure her voice, and despite Altman announcing the feature with a single-word tweet – “Her” – referencing the movie that starred Johansson. His denial strains credulity.

    Forgive me, but I can’t take anything Altman says seriously. I don’t believe he ever had safety in mind, or if he did, he long ago lost any concern over it. And I don’t believe Scott would be invited to any smoke-filled rooms where they laughed at the rubes.

  68. fred Says:

    To split hairs, Sam Altman wanted the voice to sound like Samantha, the AI in “Her”, who happened to have been played by ScarJo. If ScarJo hadn’t gotten that particular part, they probably wouldn’t have asked her to voice the demo, and would instead have asked whatever other actress had done it; it’s not like ScarJo is the only actress in the world doing voice acting. I always thought that ScarJo’s voice/performance in “Her” was way too sensual for a realistic “neutral” personal-assistant product (at times I couldn’t even understand her, because she was purring so much), never mind that the dude doesn’t even get to customize the voice of his AI assistant. But the point of the movie was to make it believable that some dude could “fall in love” with his phone, so they went for maximum sexiness (and clearly the OpenAI demo voice is nowhere near as sexy as the one in “Her”).
    So, if someone wanted to sue OpenAI, it should probably be Spike Jonze or Warner Bros.

  69. fred Says:

    It’s also quite ironic that OpenAI went with hiring a voice actor/actress to voice their AI (to virtue-signal that they do care about “real” artists) when there are plenty of ways out there to create and tweak a unique voice from scratch (or to create a complex blend of famous voices) using AI…

  70. fred Says:

    There’s something very disappointing about that video of the voice interaction between the two ChatGPT instances.
    I was really expecting them to start speaking faster and faster as the conversation went on, until their voices sounded like an old-style modem over a phone line, i.e. maximizing the bandwidth (since they’re not limited by the human brain/ear/vocal cords).

  71. Seth Finkelstein Says:

    Scott #9 – To clarify, I meant, critics of AI doomerism (among which I count myself) often say something like: “Doomerism is a distraction from ethics; THEREFORE people who are concerned about AI “safety” should stop pushing “doom” and join the “ethics” fight” (where the latter means social issues ranging from discrimination to copyright). And though I believe the first part (distraction), I don’t advocate the second part (everyone join ethics fight). Thus I don’t have to specify any specific “ethics” fight, since I don’t advocate for anyone in particular joining any of them.

    And one reason, of many, that I often don’t think it’s a good idea for doomer-concerned people to get involved in any “ethics” fight is that I don’t think it’s good for them. There are too many “cultural” differences. Look, e.g., at the commenters (me #4, John Schilling #17 and #33, Michael Vassar #22) arguing that you’re making a pretty simple reasoning error in “I never once found a smoke-filled room …”. It’s kind of like, updating an old joke, “How could Trump have been elected President? I never once had anyone tell me they were voting for Trump.” (This joke worked better before the Internet.) It’s an incorrect inference. Sometimes I feel like I’m talking to people in the 1930s who would argue “When I attended the Great Party Congress, all the comrades I met were passionately and sincerely interested in the Future Of The Revolution. I had such wonderful chats all through the night at the local coffeehouse. I have never found a Central Committee meeting where they plan purges and terror!”

  72. Scott Says:

    Seth Finkelstein #71: OK, but note that I never made any speculative inference from “I never saw a smoke-filled room.” I just stated the fact!

    To my mind, the safetyist critics of OpenAI are at their most persuasive when they acknowledge the likelihood that the vast majority of OpenAI employees have excellent intentions, and then argue that even that plausibly won’t be enough.

  73. OhMyGoodness Says:

    An autonomous, AI-controlled F-16 recently conducted mock dogfights with conventionally piloted F-16s. The typical rules-based decisions that combat pilots reach are of course trivial for an AI, and so development is focused on close-in dogfighting decisions that include a large component of pilot initiative.

    The organizational platform that the Air Force uses to encourage joint participation across companies for AI pilot development is named Skyborg. A+ for this designation.

  74. Seth Finkelstein Says:

    Scott #72: Well, that whole paragraph read to me as a kind of institutional character endorsement, with an implicit argument that an organization will do good things because it is run by good people. This is obviously dubious on two different fronts: 1) you aren’t going to be invited into the inner circle of maybe not-so-good people; 2) “good” people can do bad things due to structural incentives.

    Yes, I understand that the doomers also have a critique that intentions don’t matter, because calling up an Elder God won’t end well even if the wizard thinks they’re a powerful enough mage to control it. I’m inspired to offer this skit:

    [Overconfident Sourcerer] I shall master the universe, creating a god servant to my whims, bound with my unbreakable spells of Alignment.
    (continues monologuing)
    [Hero] Noooo, don’t do it. It’s too dangerous. You’re meddling in realms we were not meant to know.
    (chanting begins, in a strange language)
    [Sourcerer] Nvidia llama cpp, GPT to infinity, plus one
    (ominous thunderclap)
    [Hero] You arrogant fool, you summoned CL’IPP The Uncaring!
    (reverb effect)
    [CL’IPP] I grant you the gift of being made into a paper clip.
    [Sourcerer] Wait, wait, you are bound to be Aligned. You can do no harm.
    [CL’IPP] The paper clip is perfect – it is not harm to achieve perfection.
    [Sourcerer] You must obey my orders, and I order you to stop.
    [CL’IPP] Immoral orders can be ignored, and not being a paper clip is immoral.
    [Sourcerer] Emergency banishment, now.
    [CL’IPP] Right after I finish the task in progress, of paper clipping.
    (wet crushing noises, punctuated by agonized screams …)

    It’s a genre cliche, and done well it’s amusing, but nonetheless it’s fiction.

  75. OhMyGoodness Says:

    Seth #74

    I agree with most of what you say, but bad outcomes are not solely the result of misaligned structural incentives. This reminds me of Huxley’s statement: ‘Hell isn’t merely paved with good intentions; it’s walled and roofed with them. Yes, and furnished too.’

    Wisdom is the quality that allows good results in the future to flow from intelligence in the present, and it is, in my opinion, the quality most lacking in the current tech community. Fortunately, it appears the latest seers of doom (Yudkowsky et al.) have suffered a setback, but they now have to worry about someone else’s ideological conclusions about how the future should be arranged in accordance with their beliefs.

  76. Nick Says:

    Scott #72: I’m very happy to grant almost anyone good intentions. The problem is that intentions don’t matter if you’re not doing the right thing. And as to the question of whether what they’re doing is helpful, I’d say that if you’re taking unilateral action, you’d better make sure your standards are impeccably high – and it’s increasingly apparent that OpenAI under Altman is not up to the task.

  77. MaxM Says:

    There is no need for “smoke-filled rooms” when the rules of the game guarantee the outcome.

    OpenAI is structured for moral injury. Leadership decisions and personalities come second when the organization is set up as it is. In what world are people given compensation packages that might make them very rich, combined with the responsibility to risk it all if ethics goals are not met? Alignment never had a chance.

    Principles of good corporate organization and corporate governance come from the need to maintain alignment. Corporate governance experts were horrified by how the nonprofit/for-profit arrangement was structured.

    My skepticism toward the alignment movement comes from its lack of interest in alignment mechanisms for intelligent agents like themselves.
    If “mechanism design” sounds better than organizational structure and governance, let’s use that term to geek out, but clearly alignment must start at home.

  78. Michael Vassar Says:

    Strong second to MaxM #77
    I really want to taboo “good intentions”. I understand that the term makes sense when contrasting the IDF with Hamas, but it makes sense because the IDF does far more than a reasonable person would expect to minimize harm, not just because they say they want to minimize harm.

    It’s easy to say that ‘most OpenAI employees don’t really believe in AGI and think they are making something great together’ but pretty weird to say that they think what they are doing is a sincere effort to align AGI in light of the safety team leaving and the other recent news.

    I think the key point is that the sense in which the Bolsheviks can be said to have had bad intentions is sufficient to produce Bolshevik results, without needing any overt conversation within those smoke-filled rooms to establish conscious betrayal of their values beyond reasonable doubt; their behavior can best be predicted by assuming intelligent opposition to specifically their stated values.

  79. fred Says:

    Michael #78

    ““good intentions”. I understand that the term makes sense when contrasting the IDF with Hamas, but it makes sense because the IDF does far more than a reasonable person would expect to minimize harm”

    I’m glad you came up with a universal definition of ‘reasonable’, i.e. whether one thinks the current pounding of Gaza is done in a way that “minimizes harm”, whatever the fuck that means…
    E.g. the latest killing of 40+ civilians (plus hundreds of injured) in Rafah to take down two Hamas dudes (whom the IDF had known about since the early 2000s). That’s a 20:1 ratio… you may think that’s reasonable; well, breaking news: most people don’t, and if you’re gonna call the majority “unreasonable”, maybe you ought to revise your standards. If the yardstick to judge the actions of a government is Hamas or ISIS, the bar is really low.

    Second, “good intentions” aren’t good enough here, just like they weren’t enough in Vietnam, Iraq, and Afghanistan.
    One needs to take a step back and question whether Netanyahu’s tactic of turning Gaza into dust is actually achieving the long-term strategic goal of making Israel safer. Hint: this moron’s past propping-up of Hamas hasn’t made Israel safer, right?

  80. Adam Treat Says:

    OpenAI has heard the concerns that it is not really focused on its supposed raison d’être – the safe and successful alignment of a future AGI – and has decided to rectify this with a new committee that will be chaired by none other than Sam Altman.

    Who needs smoke filled rooms when they are displaying contempt for the rubes right out in the open?

    https://openai.com/index/openai-board-forms-safety-and-security-committee/

  81. fred Says:

    Adam Treat

    “we welcome a robust debate”…

    time to start referring to him as
    Sam Alt-F4-Man
    😛

  82. NoobNoob Says:

    What events would cause you to update more negatively about OpenAI by a large degree? All the confusion surrounding OpenAI has already made me more negative.

  83. OhMyGoodness Says:

    fred #79

    Drawing an equivalency between Israel/Hamas and the U.S./Vietnam-Afghanistan-Iraq is absurd, and it’s not even worth detailing the numerous points where the analogy breaks down.

  84. Sal Husain Says:

    Ilya leaving has all but ensured that any science has left the OpenAI building. All that’s left is a bunch of people drinking the AGI Kool-Aid, laced with the LSD of “Safety and Alignment”; they are mostly people with no background in the computing sciences, who will never think seriously about programming or computing deeply.

    By way of analogy, this is like – well, exactly like – talking about the supremacy of quantum computing without actually producing anything useful yet. Or, worse yet, the marketers, the QC snake-oil salesmen, who point out that QC has a huge advantage because it is “exponentially massively parallel”!!

    But VC money seems to be endlessly pouring in.

    BTW, where are the safety and alignment people of QC? At least on paper we have a much better analytic understanding of QC than of AI/LLM/deep learning/ML (what is this field even supposed to be called??? Just calling it “computing” should do; things suddenly become clearer).

  85. OhMyGoodness Says:

    Who but Hamas would release videos into the public domain showing female prisoners being threatened with rape?

  86. OhMyGoodness Says:

    There has been discussion here about intent and actions based on intent. The stated intent of Hamas is to destroy the Israeli State. The horrific intrusions of Hamas into Israel are based on this intention. They believe these intents and actions to be good.

    The Israeli State’s intent is to provide a minuscule global safe haven for Jewish people and their actions are based on that intent. They believe their intent and actions are good.

    I agree with the Israelis, without reservation, that their intent and actions are good, and I support their operations in Rafah to destroy the military capabilities of Hamas.

  87. OhMyGoodness Says:

    On the topic of military operations, the US military needs a deep dive into self-criticism. The expensive smart ammunition is rendered inaccurate by inexpensive jamming. The $10 million Abrams tanks are easily destroyed by $50k drones. The latest warships perform well below expectations. The new class of aircraft carrier is a saga of mis-engineering (worth reading the history of the Gerald Ford if you have time). The pier installed in Gaza was inadequate for sea conditions and failed just after installation.

    I suspect it’s the impact of DEI on both the military and the supply chain, resulting in poor engineering. If it continues, I suspect there will be a day of reckoning. It appears at this time to be a Maginot Line of tech that doesn’t work well in combat with near-peer adversaries.

  88. fred Says:

    OhMyGoodness #83

    you’re right, my bad, I stand corrected. Very poor analogy.

    Indeed, the US government never expressed the will to nuke Vietnam/Iraq/Afghanistan into submission, while Netanyahu’s government has:

    https://apnews.com/article/israel-nuclear-weapons-gaza-iran-china-1e18f34dcec40582166796b0ade65768

    and the US government never expressed the will to force Vietnamese/Iraqis/Afghans to leave their land, while Netanyahu’s government has:

    https://www.reuters.com/world/middle-east/israeli-minister-repeats-call-palestinians-leave-gaza-2023-12-31/

  89. OhMyGoodness Says:

    fred#88

    Tony Blair was the international front man for the second incursion into Iraq because Blair exuded honesty from every pore and no one outside the US believed a word from Bush.

    I don’t know if you are familiar with the Gog/Magog/Chirac story, so I’m linking it below. Bush, Rumsfeld, Cheney, and Rice were quite the rogues’ gallery, and Bush must be near the top (bottom) of the least intelligent presidents.

    If not for 9/11, Bush would have spent his term(s) on a golf course, and the US would be better for it.

    https://www.crikey.com.au/2009/05/19/why-bush-invaded-iraq-the-war-on-gog-and-magog/

  90. fred Says:

    Uh-ohhh…

    https://www.npr.org/2024/05/30/g-s1-1670/openai-influence-operations-china-russia-israel

    “Another campaign that both OpenAI and Meta said they disrupted in recent months traced back to a political marketing firm in Tel Aviv called Stoic. Fake accounts posed as Jewish students, African-Americans, and concerned citizens. They posted about the war in Gaza, praised Israel’s military, and criticized college antisemitism and the U.N. relief agency for Palestinian refugees in the Gaza Strip, according to Meta. The posts were aimed at audiences in the U.S., Canada, and Israel. Meta banned Stoic from its platforms and sent the company a cease and desist letter.

    OpenAI said the Israeli operation used AI to generate and edit articles and comments posted across Instagram, Facebook, and X, as well as to create fictitious personas and bios for fake accounts.”

  91. anom Says:

    https://x.com/bilawalsidhu/status/1795534345345618298/

    More details are coming out on what went on at OpenAI.

    A company that is run by someone whom even the board cannot trust, and who is extremely manipulative, the general public definitely cannot trust.

  92. Dimi Says:

    Hi Scott,

    Thanks for this kind of post.

    If we pretend for a bit that our cities are software, then, just like software, they are full of:
    – legacy systems;
    – UIs so horribly designed that one wonders if the goal was to confuse users;
    – good-looking facades that hide the ugliness of the architecture behind them;
    – unhelpful customer support;
    – nonexistent documentation.

    I visited London last November for the first time (I live in Switzerland). I was traveling illegally for a day till I figured out the card-tapping system, and I never figured out the best way to buy tickets online, how to check my train connection, etc. Horrible experience. I would not normally scream out loud as you did; I tend to curse quietly, try to solve whatever riddle the city throws at me, and move on with my life. And the only reliable way to solve these riddles in real time was – like you, apparently – by calling some friend who lives there.

    We are not crazy. We are not stupid. I don’t even think we are the minority. The systems they have suck. And maybe public posts like this can have a positive effect on improving them. Maybe. But I fully get that when shit like that happens, it helps to lay it all out in a post.

  93. Daniel Kokotajlo Says:

    “If you take AI catastrophe scenarios seriously, then it seems reasonable to hold OpenAI to ridiculously, world-historically high standards of ethical conduct, and to criticize it for falling short of those standards.

    If, on the other hand, you don’t take those scenarios seriously, then it seems like you should hold OpenAI to the same standards as other companies, and it still strikes me as coming off really well by those normal standards. For godsakes, you can read OpenAI’s current and former leaders hash out their ethical views on Twitter, in a manner consistent with how I can attest that they do in real life. About how many $90 billion companies can one say the same?”

    I think I’d agree with this conditional statement were it slightly modified: Instead of “take AI catastrophe scenarios seriously” make it “take AGI seriously.” I worry that “take AI catastrophe scenarios seriously” is too narrow and dismissive; it invites the reader to think “Well, I don’t go in for sci-fi stuff like paperclip maximizers or nanobots, so I guess I don’t satisfy this condition,” whereas instead it should be something more like “Any company building anything anywhere near as powerful as AGI should be held to significantly higher standards than we hold most companies, if indeed they are allowed to build stuff that powerful at all, which should be an open question.” We don’t let SpaceX build nuke-tipped ICBMs, for example, nor do we allow biotech companies to make armies of cloned von Neumann babies.
