Situational Awareness
My friend Leopold Aschenbrenner, who I got to know and respect on OpenAI’s now-disbanded Superalignment team before he left the company under disputed circumstances, just released “Situational Awareness,” one of the most extraordinary documents I’ve ever read. With unusual clarity, concreteness, and seriousness, and with a noticeably different style than the LessWrongers with whom he shares some key beliefs, Leopold sets out his vision of how AI is going to transform civilization over the next 5-10 years. He makes a case that, even after ChatGPT and all that followed it, the world still hasn’t come close to “pricing in” what’s about to hit it. We’re still treating this as a business and technology story like personal computing or the Internet, rather than (also) a national security story like the birth of nuclear weapons, except more so. And we’re still indexing on LLMs’ current capabilities (“fine, so they can pass physics exams, but they still can’t do original physics research“), rather than looking at the difference between now and five years ago, and then trying our best to project forward an additional five years.
Leopold makes an impassioned plea for the US to beat China and its other autocratic adversaries in the race to superintelligence, and to start by preventing frontier model weights from being stolen. He argues that the development of frontier AI models will inevitably be nationalized, once governments wake up to the implications, so we might as well start planning for that now. Parting ways from the Yudkowskyans despite their obvious points of agreement, Leopold is much less worried about superintelligence turning us all into paperclips than he is about it doing the bidding of authoritarian regimes, although he does worry about both.
Leopold foresaw the Covid lockdowns, as well as the current AI boom, before most of us did, and apparently made a lot of money as a result. I don’t know how his latest predictions will look from the standpoint of 2030. In any case, though, it’s very hard for me to imagine anyone in the US national security establishment reading Leopold’s document without crapping their pants. Is that enough to convince you to read it?
Follow
Comment #1 June 8th, 2024 at 5:32 am
I wonder if he used ChatGPT to write it…..
Comment #2 June 8th, 2024 at 7:44 am
I skimmed through it. It was very interesting, and it would be wrong to contend he doesn’t deal with counter-arguments. I found portions of it quite well-grounded (there will be extensive industrial implications). But, still, other parts seemed to me to be unwarranted inference from dubious extreme extrapolation. He has the technical depth to make the argument in much better detail than most versions. But it’s a detail that still seems to me, even with his clear acknowledgment of the rebuttals, not to have overcome the reasons for skepticism there.
By analogy, it’s like the old argument in the earlier days of nuclear power, that it would give us energy too cheap to meter. And that this awesome source of cheap power would completely revolutionize civilization. It didn’t matter how well you understood the physics of nuclear reactions, and presented technically accurate explanations of the amazing, astonishing, world-changing, wow wow wow, amount of energy which could in theory be extracted. The practice was an entirely different matter.
I’ve lived through the spread of the Internet. I remember all the hypesters running around at the time, arguing how governments would fall to its power. It didn’t work out that way. But it’s undeniable the Internet has had profound structural, even revolutionary, social effects.
And I think this is similar. We’re not going to have AGI superintelligence, like we didn’t get cryptoanarchy cyberutopia. It’s a delusion from extrapolating (something real!) to infinity.
Comment #3 June 8th, 2024 at 8:04 am
I’ve been writing some thoughts here as I skim through: https://cybersecurity.theater/@varx/112581008737193378
Most interesting insight from the piece so far: LLMs could be *components* of an AGI. They don’t have to be the whole thing. (ChatGPT-6 as a widget, rather than as the whole thing.)
Least convincing part so far: Power constraints being solved by burning the world, and this being *desirable*. (I paraphrase.)
Comment #4 June 8th, 2024 at 8:22 am
Seth Finkelstein #2: Ok, but much like Leopold himself, I’d take that analogy in the exact opposite direction!
Nuclear energy, when weaponized, actually did completely transform the world’s military and geopolitical situation. In practice, and not just in theory. H. G. Wells, Szilard, and all the others who speculated about precisely that outcome weren’t wrong, however nutty they sounded at the time.
Not only that, but nuclear energy would’ve also actually given us “electricity too cheap to meter” (or a close enough approximation thereof) and a carbon-free future, had the safetyists, environmentalists, and blankfaced bureaucrats not strangled it in the 1970s, in one of the bitterest ironies in the history of the world.
Comment #5 June 8th, 2024 at 8:38 am
If we knew how to align a superintelligent AI, he would be completely correct. The problem is that we don’t, and it sure looks like we’ll figure out how to create superintelligent AI before we know how to align them on our current trajectory. So the only chance we have of not _all_ dying is if we stop or slow AI capabilities research until we can figure out alignment (or abandon it altogether and do something else, like genetic engineering).
Given this, starting an AI race with China is just about the worst things we can do. Instead, we should seek international cooperation — an international non-proliferation agreement, so to speak.
By analogy, suppose the really smart people had done the calculations and worked out that setting off a nuclear bomb (of sufficient size) actually _would_ ignite the atmosphere and kill everyone. But because smaller bombs were useful, we were still working on developing them. Would advocating for a nuclear arms race in this context be a good idea?
I also don’t understand why you’re impressed by a guy understanding certain aspects of the problem, rather than updating towards Eliezer just being right when he understood the contours of the problem 20 years ago.
Comment #6 June 8th, 2024 at 9:39 am
jonathan #5: Yes, that’s certainly a criticism you can make! One that Leopold notices of course, but to which he has only pretty weak rebuttals (eg, that alignment has worked better than expected so far).
Having said that, I have my own reason to care strongly about Leopold’s scenario, whether or not Leopold would endorse the reason! Namely, I think that even if AGIs that help bad guys destroy the world and AGI’s that “decide by themselves” to destroy the world are both plausible, it’s also plausible that the former becomes realistic decades before the latter.
Comment #7 June 8th, 2024 at 9:59 am
I can see how he thinks using AI to do AI research could bootstrap superintelligence. That certainly would a good idea for how to do it if it is possible.
But I’m still having a problem understanding how a qualitative difference emerges from doing more and more of the same. Even if it does, AI comes with a cost in computing power. How do we know that the incremental cost of additional intelligence doesn’t rise exponentially at some point if it hasn’t already?
Comment #8 June 8th, 2024 at 10:22 am
James Cross #7: You’re right, we don’t know. And not in the sense that “we can’t formally prove but we know with 99.99% confidence” … we genuinely don’t know. And Leopold (unlike some) clearly acknowledges that we don’t know.
But I would also say: given the monster empirical update of the past few years (i.e., “compressing all the text on the Internet is enough to give you an entity that basically passes the Turing Test”), which is probably the most important scientific surprise in the lifetimes of most of us now living … given that, the ball is now firmly in the court of those who think there’s not an appreciable chance that this becomes very dangerous in the near future.
Comment #9 June 8th, 2024 at 10:25 am
Scott #4: Don’t get me wrong in what I believe overall. I’m relatively very “pro-AI”. I think it’s a great advance, it’ll have Internet-level effects if not more. You’re absolutely right that it’ll have many military applications (and some are obvious now, e.g. facial-recognition assassin drones). However, I think “doomers” are spouting utter nonsense. And it’s also a kind of nonsense I find very intriguing. Because it’s so completely opposite the kind of prevalent community Party Line, that any tech danger criticism comes from stupid and evil people, who are fools not understanding how badly wrong they are, or perhaps outright malicious.
As you just demonstrated with the nuclear power response. You can’t even conceive there’s a justified skepticism, that its corporate promoters are outright lying about costs and safety, which they have every incentive to do. “Why yes, we’ve proven the safety costs are only $X, build it!” (many years later) “Oh, the safety costs are orders of magnitude greater? And the taxpayers are on the hook for it? Whoooo could have known? And it’s only because you evil bureaucrats made them so expensive, it’s your fault! Plus nuclear waste is actually good for you, have you considered the phenomena of immune challenge?”
The rationalist-type “community” is just a cesspool of this stuff, of insular ideology, on so many technology-related topics. Yet for AI, it’s very respectable to think a crazy (in my view) version of technophobia that’s more extreme than many humanities technophobes. Thus I comment about it a lot, since it’s so puzzling to me. But again, I find the AI advances exciting and even world-changing. However, not to the extent of either Utopia or wiping out humanity.
Comment #10 June 8th, 2024 at 10:34 am
What makes you or Leopold think that the US military-intelligence complex isn’t on this already? The NSA has a pretty big and very secret data center after all; who knows what goes on there?
Comment #11 June 8th, 2024 at 10:47 am
[…] am and is filed under The Fate of Humanity. You can follow any responses to this entry through the RSS 2.0 […]
Comment #12 June 8th, 2024 at 11:14 am
Scott:
“Namely, I think that even if AGIs that help bad guys destroy the world and AGI’s that “decide by themselves” to destroy the world are both plausible, it’s also plausible that the former becomes realistic decades before the latter.”
I see two problems with this.
First, if the bad guys don’t solve alignment, I don’t see how they could have an AGI that gives them enough power to “destroy the world” that wouldn’t also just destroy the world on its own. There’s a sort of intermediate level where AIs can do a lot of harm while still being directable by their creators, but I think once you have anything that would give e.g. China a decisive strategic advantage over the rest of the world, that thing would also be outside their control, and thus would just destroy the world anyway. So I don’t think this is particularly likely.
(Also, as much as I dislike the CCP, they’re still human, and I don’t think they want to “destroy the world”. In my view, “CCP creates aligned superintelligent AI” is well on the _good side_ of the range of possible outcomes here!)
The second problem is that, even if it were true that this is a plausible future, I don’t see why it matters if the AGI is just going to destroy the world a few decades later anyway. Sure, I slightly prefer “We make our own AGI to stop CCP world domination, and then we all die a few decades later” to “CCP world domination, and then we all die a few decades later.” But the difference isn’t all that great. I’d much rather avoid the part where we all die. And the road to _that_ outcome leads through cooperating with China to slow or stop research leading to AGI.
Comment #13 June 8th, 2024 at 11:33 am
Scott #8
“which is probably the most important scientific surprise in the lifetimes of most of us now living”. The biggest surprise for me was that you generaly don’t need full human intelligence to produce text which seems as if they are created by intelligent being. It is interesting and a bit unexpected, but I believe some expected this, especially in pre-modern times.
You could find in literature description of seemingly intelligent but totaly mechanical and dumb machines. Look for example “The Sandman” by Hoffmann. It is only with 20th century sci-fi when robots became full-human self-aware ceatures. I’m sure that Hoffmann or Jerome K. Jerome wouldn’t be very surprised by ChatGPT abilities, as well as any cartesian follower.
2. I also didn’t heared that anybody tried to really make Turing test in the form it was described. I’d like to know the results. Strangely most people (even in this blog) forget that Turing test is statistical, and isn’t about “could some person be deceived”. Of course, people could be decieved in special circumstances. You could decieve critics to believe art produced by monkey was made by real artist see Pierre Brassau. Monkeys aren’t arist and aren’t superintelligent.
Comment #14 June 8th, 2024 at 11:54 am
I have extremely different views than Leopold and presumably you do on this topic. I skimmed the document, it wasn’t anything I hadn’t seen before. I’ll check back in 2030, but my current belief is that LLM-based systems are not going to amount to much. My prediction is that they’ll be useful in some limited contexts, like autocomplete, and provide an excuse for certain companies to cease to provide certain previously-useful services, like customer support, and not do much else.
Fundamentally, I don’t think that LLMs are a smart as a high school student. I think they’re as good as a high school student at highschool tests, but this says more about how little of a high school student’s abilities are reflected on such a test than it says anything positive about the LLM.
LLMs are (currently) only useful for putting out things that look plausible to the uninformed, but fall apart when you look closer, and/or things that are in the input data. And it’s very easy to extrapolate from “plausible to the uniformed” to “plausible to me” (because we’re all uninformed about most things) to “it works”. I think “plausible to the uniformed” is all they’ll ever amount to, and that’s not a big deal on the scale you and Leopold are predicting.
So: I’ve registered my prediction. Leopold (and you) have registed yours. One of us will be very wrong in 5-10 years, and I hope that either way, whichever of us that is can update our beliefs accordingly.
Comment #15 June 8th, 2024 at 12:11 pm
I love how concerning AGI, Aschenbrenner goes from “trust the [exponential] trendline” to “It’s this decade or bust” – acknowledging that there are indeed walls and diminishing returns. He finally takes off in Chapter II – speculating about an intelligence explosion – where I stayed on the ground and stopped reading. The human level AGI “drop-in remote worker” he is talking about is sufficiently speculative, disruptive and dangerous for my taste.
It has been pointed out elsewhere that using benchmarks meant to measure human performance to gauge LLMs might be extremely misleading. In the end, what counts is how useful these technologies can become in the real world and whether we will be able to discern the useful use cases from the not so useful ones and hem in the abuse. I am quite sceptical on all three counts.
As anthropomorphizing is so widespread, I might as well use it to argue: So we have invented cheap remote workers with savant syndrome, anterograde amnesia and attention disorder. Aschenbrenner argues that the last two problems will be overcome – possibly alleviating the first one. The remote worker “knows everything about you and your company”, cannot be held accountable and is completely controlled by a single outside entity – possibly a monopoly. How is this not only a HR nightmare, but a full blown dystopia? Going on to speculate about “superintelligence as a weapon” seems completely unnecessary and counterproductive.
Comment #16 June 8th, 2024 at 12:25 pm
Aschenbrenner lost me at “By the end of the decade, they will be smarter than you or I;
we will have superintelligence, in the true sense of the word.” I doubt it. There will need to be many breakthroughs before we can get close to AGI. It is not just a matter of scaling existing technologies. I’ll skim a bit more but he doesn’t seem to have the background to reliably analyze whether AGI will be reached any time soon. Of course, there are plenty of other worries with AI technology, but predicting the future is hard, even harder if you aren’t an expert in the field.
Comment #17 June 8th, 2024 at 12:51 pm
Scott #9
“not an appreciable chance that this becomes very dangerous in the near future.”
Of course, there’s always any appreciable chance so if I’m to assume the worst, probably the situation is hopeless unless the technology is stopped.
Ted Kaczynski was probably right when he wrote that the bad’ parts of technology cannot be separated from the ‘good’ parts” (originally in capital letters). To the extent the technology succeeds, we will become ever more dependent on it and increasingly willing to hand over control to it.
Any thought that using AI to improve AI might result in AI inbreeding with any bad traits tending to propagate?
Comment #18 June 8th, 2024 at 1:11 pm
This text is similar to Fred’s posts concerning AI and China a few months ago.
I support that much tighter security should be in place but reasonable to have doubts about someone extrapolating exponential growth curves albeit viral disease vs time or intelligence vs ooms.
It may also be the case that I have a different operative definition of intelligence (super intelligence) than the author. There is a fundamental difference between revolutionizing science and technology and winning a Jeopardy Championship. I hope it is the case that an intelligence develops in the immediate future that revolutionizes technology but reasonable doubt persists. Extrapolation of current technology for modest improvement seems reasonable but paradigm shift for revolutionary technology is something altogether different.
Comment #19 June 8th, 2024 at 1:50 pm
As others have pointed out, it seems highly implausible that we get rapid development of AGI, to the point where within a few years you have millions of automated AI agents leading R&D and basically dominating the economy, and ALSO that these agents remain tightly under control. Model weigths are orders of magnitude easier to steal than nukes. Anyone who believes AGI timelines are so incredibly compressed should be calling for a pause.
Comment #20 June 8th, 2024 at 3:39 pm
Me, I believe in Gartner’s Hype Cycle and believe what whatever AI might eventually become, right now it is approaching the Peak of Inflated Expectations. What follows are The Trough of Disillusionment, the Slope of Enlightenment and finally the Plateau of Productivity.
The hype cycle is about the behavior of people, and there is no reason to think that the behavior of people producing AI is different from the behavior of people who produced other technologies.
Comment #21 June 8th, 2024 at 3:44 pm
Very briefly, I read the preface to this pdf last week, and stopped shortly after the author wrote “Before long, the world will wake up. But right now, there are perhaps a few hundred people in the world, mostly in San Francisco and the AI labs, who have situational awareness. Through whatever peculiar forces of fate, I have found myself among them.”
I just… When I read that it makes the author sound like he’s full of shit. And I’m not saying this because I disagree with the trendlines or extrapolations that predict AGI soon and are concerned about an arms race. I agree with them! My qualm is that AI and AI safety is the topic du jour in so many nerd circles, and I think it’s comical to believe that there aren’t tens of thousands of people coming to these same conclusions using the publicly available data.
Really makes me question either the quality of his Fermi estimates, or the grandeur with which he views his own role in all of this. Either way, not great for the points he’s trying to make. So I dismissed this article as noise that would be more likely to cause confirmation bias in my own beliefs than it would to give me useful new information.
But Scott, since you’re telling me that you know and respect Leopold, I’ll go ahead and give this a read. I hope to glean some useful info from it.
Comment #22 June 8th, 2024 at 4:04 pm
Did he just foresee the Covid lockdowns or did he support the Covid lockdowns?
Comment #23 June 8th, 2024 at 7:29 pm
Update to my previous comment (Y#20):
I’ve read a little over half of the document now, and yeah, I find it to be very compelling. It will take some time to internalize everything and decide how much of it I agree with. I maintain that the original claim on how many people have situational awareness is wrong, but the strategic picture that Leopold paints is definitely qualitatively different from the things that I’ve thought about in the past, so I do need to walk my criticism back somewhat.
Comment #24 June 8th, 2024 at 8:12 pm
Am I the only one who finds the two main points of the article strangely at odds? The author predicts the end of the world as we know it within a few years — superhuman AI, intelligence explosion, and humans ceding control of the world to vastly superior beings. But at the same time, he writes as if the main thing we should worry about is the human tribal and ideological politics, as if we were discussing, say, the naval arms race in 1910.
I mean, within a few words of each other, he predicts that “[s]oon, the AIs will be running the world” and also alarmingly wonders, “Will the free world prevail?” What does the latter even mean if we assume the former? What sort of “free world” are we supposed to expect?
Comment #25 June 8th, 2024 at 8:52 pm
Dear Scott,
Long time reader, zero time commenter. I hope you see this not as whatboutism, which is not entirely false, but not intended. I read the text and there’s missing information: Who is going to go forward? State actors or companies? I see the companies as the drivers right now, with the first goal to make money. Now, we have another very realistic doomsday scenario, which is only briefly addressed in the text, an accelerating climate crisis. One could argue that the author is correct to say, the two contesting global change scenarios of AGI and climate change have very different time scales, years vs decades. True in the short term consequences maybe, but not on a longer scale. If companies are accelerating AGI in an increasingly unstable world increasingly destabilized by climate change, how would you see the role of AGI emergence? Climate change accelerated by the power thurst of ever increasing compute clusters or AI computing becoming irresponsible and thus regulated to a halt? State actors seizing the power to make AGI a tool for dominance in a time of crisis or being stopped dead in their track because funding needs to go elsewhere and AGI stays with companies? AGI solving the climate change conundrum or defending what is needed to sustain it? I know these two developments mainly align when talking about AI Power consumption, but they are possibly life altering developments that if true, will change the way we live live alongside each other, thus making their interaction more likely in the future – e.g. people are building digital twins of our earth system with built-in feedback mechanisms, already.
Comment #26 June 8th, 2024 at 11:02 pm
Anonymous Coward #10:
What makes you or Leopold think that the US military-intelligence complex isn’t on this already? The NSA has a pretty big and very secret data center after all; who knows what goes on there?
Oh, it’s clear that lots of people across all the relevant government agencies are writing reports and holding meetings and the like. Alas, actually talking to people in government (or even just following the news) is often fatal to illusions of governmental hyper-omniscience and competence! Even when agency folks know everything relevant that the rest of us know (as the best ones do), that only means that they’re just as clueless as the rest of us… 😀
Comment #27 June 8th, 2024 at 11:07 pm
Question #21: I really don’t know, beyond what he writes in the document. Ask him!
Comment #28 June 8th, 2024 at 11:13 pm
Aschenbrenner and Yudkowsky agree that AI will become dramatically powerful dramatically quickly. They disagree about safety / alignment. Yudkowsky thinks it is impossible, while Aschenbrenner describes himself as “incredibly bullish on the technical tractability of the superalignment problem”. (Why?)
This fundamental disagreement leads to a difference in concern.
Yudkowsky is worried about the existential risk posed by AI and thinks all other concerns are essentially petty distractions in light of it. “Frame nothing as a conflict between national interests, have it clear that anyone talking of arms races is a fool. That we all live or die as one, in this, is not a policy but a fact of nature.”
Aschenbrenner thinks the existential risk is no big deal. Instead, he is singularly concerned with military preeminence of the United States, especially vis-a-vis China. He fully embraces the arms race approach to AI development, and anything that impedes US advantage in the arms race is a problem. He calls this position “AI realism”.
Yudkowsky’s approach, which Aschenbrenner calls “naive and unworkable”, at least purports to be cosmopolitan and in the general interest of humanity. Aschenbrenner’s approach, on the other hand, is shockingly jingoistic. The essay as a whole contains an awful lot of propaganda for someone who purports to be so clear-eyed and forward-thinking. (Of course, that doesn’t mean that the technological forecasting is wrong. I’m sure it is mostly correct. But it is strange to see someone talking about “the free world” unironically in 2024.)
Personally, I’m not worried about runaway AI or “the CCP”. Maybe I’m just old-fashioned, but what worries me is the continued destabilization of the Earth’s climate, and the continued burning of fossil fuels that is causing it. Aschenbrenner advocates for burning as much natural gas as it takes for America to win the AI arms race (“well-intentioned but rigid climate commitments”, LOL).
Comment #29 June 8th, 2024 at 11:15 pm
Scott #4: If what you say about nuclear energy’s missed potential is true, why did no country on earth go that route? This isn’t something I know much about but that being true doesn’t seem in line with how the world looks today. I can easily imagine one country (cough cough the US) failing to realize the potential because of stupid short sighted reasons. But then I would have expected at least one of the 30+ countries with nuclear power to have gone all in and gotten energy nearly to cheap to meeter. Which would have been an incredible advantage in a global economy that is still constantly dragged down by energy costs. My sense is that even France with its massive investment in nuclear still has power bills of the same order of magnitude as those in the US/the rest of Europe.
Regardless of my above questions, I absolutely agree with you on the environmentalism/green energy front. It’s a hell of a tragic irony.
Comment #30 June 8th, 2024 at 11:23 pm
In the 1980s, as a young teenager, I wrote a program in 16 KILOBYTES of RAM for the Sinclair ZX81 which pretty effectively mimicked initial questions at a GP (Doctor) appointment. Similar programs were created on more popular home computers and there was quite a bit of publicity at the time about how this would streamline GP appointments nationally by automating decisions on “hum-drum” (common) stuff, that 90% of patients attend for.
That didn’t happen because many people dismissed it as pretty dumb and especially as an “uncaring” method to be used in healthcare.
Current LLMs and AI in general seems to be capable of producing the mundane babble, technical derivations and encyclopaedia recitals which are produced by 90% of the human population (say), but with much more reliability. I mean to be honest, if you transcribe your conversations and thoughts from a day you probably never say or think anything particularly new or interesting that hasn’t been repeated hundreds/thousands/millions of times before.
And I doubt LLMs will get much beyond the (boring) approx 90% of what passes for “intelligence”, although they will do it with ever greater efficiency no doubt, so eventually, for example, most training and education can be carried out by AI and human teachers will no longer have to be terrorised by the young thugs who are too cool for skool.
LLMs can’t now, and never will be able to train on the much rarer and secretive genius thoughts of humans who made the great contributions to human civilisation. These thought processes were secretive and incredibly deep before being available in the public final form, pristine and fully-formed. Such ideas and creative processes may even involve something physically beyond mechanical AI, like human free-will.
Comment #31 June 8th, 2024 at 11:24 pm
jonathan #12, Fuzzy #18, Kifla04 #23: Yeah, you’ve all put your finger on maybe the most fundamental objection to Leopold’s thesis. Namely, if AGI is really going to become as powerful as he claims as quickly as he claims, then how could great-power competition not be the least of our worries? What’s then to block the full Yudkowskyan scenario (anticipated, of course, by generations of sci-fi writers) where the AGI just takes over the world, itself?
The thing is, the experience of the last few years has made me extremely leery of collapsing these sorts of distinctions. Like, it’s very easy to imagine someone just a decade ago arguing: “but it seems inconsistent to stipulate a world filled with chatbots that basically pass the Turing Test, and yet where those chatbots just remain friendly helpful assistants … clearly, you don’t get such powerful chatbots without full AGI, and you don’t get full AGI without the supposed ‘chatbots’ pursuing their own goals, recursively self-improving, taking over the world, etc.”
And yet here we are, right now, in precisely the scenario that might’ve seemed inconsistent! (Although who knows for how much longer.)
Comment #32 June 8th, 2024 at 11:39 pm
jonathan #12:
even if it were true that this is a plausible future, I don’t see why it matters if the AGI is just going to destroy the world a few decades later anyway. Sure, I slightly prefer “We make our own AGI to stop CCP world domination, and then we all die a few decades later” to “CCP world domination, and then we all die a few decades later.” But the difference isn’t all that great.
OK, I confess that this way of thinking is deeply weird to me. It’s like, who cares about stopping a Russian-instigated nuclear war right now, if you think the world will plausibly just get destroyed later this century by climate change anyway? Well, I’d care, and I hope you would too! Sometimes you have to address a civilizational threat that’s immediately in front of you, and trust to the people of 2050 to address the threat of 2050, when and if it comes. They’ll certainly know more about the threat than you will.
And besides, if you believe that we don’t even know how to align the AIs of the next 5 years so that they won’t further empower Xi, Putin, and Khamenei, then how on earth will we ever figure out how to align a future superintelligence? From the perspective of alignment research — which is the perspective I most care about right now — we might as well start with the former problem, even if the eventual goal is the latter.
Comment #33 June 9th, 2024 at 12:17 am
What’s the main reason to be bullish about solving the data-bottleneck problem.
Sure with “artificial scientists” we might be able to do it. Specifically, the first is needed to develop them and the only arguments given in the text are the opinion of the Anthropic CEO + some anthropocentric psychological argument of how to improve the systems á la this is how I learn.
I don’t think we know a great deal about how the systems work, and it seems that the author shares this view when he says that a lot of improvements were hacky.
So I think we are very far from the credence that existed about the possibility of the bomb before the Manhattan project. It’s good to think about this, regulate, and fund (safety) research – but it seems like the text essentially needs some kind of Pascal’s wager plus certainty about a geopolitical conflict to really make a compelling argument for handing a 1T $ research grant to the author and one hundred people close to him.
Comment #34 June 9th, 2024 at 2:01 am
Seth Finkelstein #9:
As you just demonstrated with the nuclear power response. You can’t even conceive there’s a justified skepticism, that its corporate promoters are outright lying about costs and safety, which they have every incentive to do. “Why yes, we’ve proven the safety costs are only $X, build it!” (many years later) “Oh, the safety costs are orders of magnitude greater? And the taxpayers are on the hook for it? Whoooo could have known? And it’s only because you evil bureaucrats made them so expensive, it’s your fault! Plus nuclear waste is actually good for you, have you considered the phenomena of immune challenge?”
There’s one serious danger of nuclear power in certain parts of the world, and that’s the danger of accelerating nuclear weapons capabilities. As for the other dangers—the radiation, the waste—the more I looked into them, the more they really did seem absolutely trivial compared to the risk of catastrophic fossil-fuel-driven climate change that the world collectively decided to accept instead. For that matter, even the “mundane” harms from burning fossil fuels (the air pollution, the damage from oil drilling, …) seem to be worse than the mundane harms from nuclear power!
And as for the cost: nuclear power is so expensive almost entirely because of a bureaucratic feedback mechanism wherein, wherever it becomes economically competitive, it’s decided that it ipso facto hasn’t been made safe enough. Comparing against the greater risk of burning fossil fuels is the crucial concept that never impinges on the regulators’ consciousness.
Comment #35 June 9th, 2024 at 3:38 am
“jonathan #12, Fuzzy #18, Kifla04 #23: Yeah, you’ve all put your finger on maybe the most fundamental objection to Leopold’s thesis. Namely, if AGI is really going to become as powerful as he claims as quickly as he claims, then how could great-power competition not be the least of our worries? What’s then to block the full Yudkowskyan scenario (anticipated, of course, by generations of sci-fi writers) where the AGI just takes over the world, itself?
The thing is, the experience of the last few years has made me extremely leery of collapsing these sorts of distinctions. Like, it’s very easy to imagine someone just a decade ago arguing: “but it seems inconsistent to stipulate a world filled with chatbots that basically pass the Turing Test, and yet where those chatbots just remain friendly helpful assistants … clearly, you don’t get such powerful chatbots without full AGI, and you don’t get full AGI without the supposed ‘chatbots’ pursuing their own goals, recursively self-improving, taking over the world, etc.”
And yet here we are, right now, in precisely the scenario that might’ve seemed inconsistent! (Although who knows for how much longer.)”
The trajectory that AI development has taken has been shocking, but how much of what has been surprising is actually relevant to Leopold’s implicit assumption that AI safety is solvable within a reasonable timeframe? IMO everything we have seen is consistent with the idea that alignment (as well as the more general problem of safe deployment- e.g the weights being stolen is not a failure of alignment per se but still seems catastrophic if one accepts Leopold’s other premises) is extremely hard, and far from solved, but that current models are not powerful enough for that to have catastrophic consequences.
Comment #36 June 9th, 2024 at 3:48 am
Question #21
The world owes thanks to Sweden for serving effectively as a control. They braved castigation to do so, attacked with exponential growth extrapolations.
Comment #37 June 9th, 2024 at 4:05 am
Scott #30
For me, your concern regarding the “collapse of distinctions” has heightened my worries about the IMO currently inadequately-broad identification of potential vectors of danger stemming from the intellectual development of power-seeking AGIs. Our understanding of the (near) future being so fragmented now risks concentrating on the variants which seem unavoidable now, but appear quite nearsighted in hindsight. (Like the assumptions you mentioned of AGI being needed for a useful intelligent assistant.)
The list of scenarios you — and many others — typically consider focuses on emergence of super-wise dangerous entities. Then the end of the world as we understand it comes from logically and strategically correct (but “inconvenient” to us) decisions made by these super-wizards.
However (as I already wrote on this blog) this misses the possibility of super-street-smartness emerging before the super-wiseness. In such a case, it’s going to be a super-thug entity, not a super-wizard entity — with all its implications (such as a higher frequency of strategically flawed decisions leading to a short expected lifespan). This entity is likely to make all the mistakes associated with a lack of strategic thinking: a too-short planning horizon, disregard for Knightian uncertainty, or excessive risk-taking, such as triple-Kelly betting, etc.
These types of errors are almost certain to eliminate the tiny population of such super-thugs — one at a time — and there is a significant chance that such events are going to be catastrophic and have a chance to take us down too. Alternatively, the thug might eliminate us as a result of one of its deliberate (but erroneous) decisions, without fully realizing that this would also be its undoing!
The dangers of not paying special attention to such a scenario lies in the fact that the countermeasures for emergence of a super-thug before it becomes super-wise seem to be fundamentally different from the countermeasures for emergence of a super-wizard not aligned with us. (Furthermore, the former countermeasures might be easier to implement…)
Comment #38 June 9th, 2024 at 4:56 am
Thanks for pointing out. Seems interesting (the first few pages, esp. the levels of investments). Will read it at my (usual, slow) pace…
Just curious, has the author considered / addressed, in this book, the plateau in https://arxiv.org/abs/2104.14337v1 ?
Best,
–Ajit
Comment #39 June 9th, 2024 at 7:09 am
The incoming ability of AIs to “read” the human mind like an open book (through voice, facial expression, pheromones…) will come really handy during job interviews/job reviews, interrogation by the authorities, dating, parenting, friendship ‘evaluation’, …
What a time to be alive!
Comment #40 June 9th, 2024 at 7:27 am
And once AIs are integrated with your phone, your laptop (already being done), your TV, … feeding it all your texts, voice messages, patterns, preferences, …. *you* will become a data point in some latent space in some servers of those companies. 8 billion unique data point (with their own identifier) is a piece of cake, really.
Comment #41 June 9th, 2024 at 8:06 am
AI safety should spend a lot less time worrying about hypothetical scenarios where AIs get much better than the current parlor trick of LLMs and much MORE time red-teaming things that can be done with the current state of the art by PEOPLE who intend to do harm to others with the aid of those parlor tricks. After all, magic is not real but it doesn’t need to be for swindlers to rip people off with three-card monte.
This is a more serious problem because it unquestionably exists and is already happening. By being extant, though, it is also low-status to worry about because it’s real and boring. It’s like improving traffic safety with lower speed limits and traffic calming instead of, like, EVTOL robotaxis.
Current AI has created an enormous load of safety debt and we should spend time paying that down before we spend a single instant in Nick Bostrum fantasyland.
Comment #42 June 9th, 2024 at 8:54 am
Matthew Gray #28:
France did exactly that (70+% of their energy is from nuclear), and now they are a major exporter to energy to all of their neighbours.
Comment #43 June 9th, 2024 at 9:11 am
Matthew Gray #28: Yeah, what Sandro #41 said. France really is doing much better than Germany (for example), which suicidally turned its back on nuclear just when it desperately needed it, in an act of pure destructive idiocy.
Comment #44 June 9th, 2024 at 9:33 am
What I see here are a lot of detailed, thoughtful comments from quite clever people who don’t seem to consider that
1) global warming is almost certainly bogus, so that’s not an even half-plausible threat ;
2) the Turing Test is almost certainly bogus, and just an idea thrown up by Turing while he discussed the maths he was more grounded in. I’ve yet to see a single proof or even half-persuasive argument that producing comment & reactions indistinguishable from a human is any indication of actual “intelligence”. Weizenbaum warned against this in the 1970s and I’ve seen no-one even try to tackle his doubts successfully ;
3) intelligence is something that nobody in AI, computing, or cognitive science understands at all, and, dangerously, they’re unaware that they’re unaware what intelligence is.
The comments about parallels with 1950s predictions of nuclear power “too cheap to meter” are insightful, not because there were technical shortcomings in nuclear science or energy policy preventing this accomplishment, but because the very goal of “too cheap to meter” reveals a basic lack of understanding of economics.
If you think it’s possible or worthwhile making something too cheap to meter, you don’t know what metering (pricing) is actually good for (regulating queuing) and the internet is now a repressive mess ruled over by greedy monopolists precisely because the engineers who created the internet didn’t build in a micro-pricing function because none of them thought it worth reading an introductory economics book.
Very much this kind of basic conceptual mistake makes the whole notion of AI, internet search engines (or sped-up card-index catalogues) that talk back to us a bit like a person — the large-language models we now have that repeatedly make mistakes in order to sound like intelligences, misconceived from the ground floor up.
What I see is a broad misunderstanding of the foundations of intelligence, economics, climate, power generation, foreign policy — a whole range of complex systems that trained technicians (millions of college graduates) now fantasise that we understand more deeply than we did 300 years ago.
We don’t.
Comment #45 June 9th, 2024 at 9:55 am
Mark #43: Permanently banned from this blog, for a comment somehow even more patronizing than pig-ignorant and more pig-ignorant than patronizing.
Comment #46 June 9th, 2024 at 10:02 am
Incidentally, Paul Topping #15:
I’ll skim a bit more but he doesn’t seem to have the background to reliably analyze whether AGI will be reached any time soon.
Do you have that background? Does anyone?
Comment #47 June 9th, 2024 at 10:22 am
Scott#33 – Did you personally careful examine all the safety calculations with a very critical eye towards possibly too-optimistic or unjustified projections? Especially given the incentives that there’s a huge amount of money at stake, and every reason for people who stand get this money to lie to get it? Given the rhetoric, forgive me for being skeptical.
But the main point I’m trying to convey, is the utter and complete lack of intellectual charity, the inability to ever even conceive anything like “I think this number is more accurate than that number, but I acknowledge the assumptions used by the opposing side have a reasonable basis for them” (to me, “billions of dollars” is a _per se_ reason!). Note also the Excluded Middle fallacy. One important point is that it’s not nuclear-or-nothing (“Ha, you’re concerned with climate change – go nukes! GOTCHA!”). Rather, that the immense amount of money is better spent on renewables – again, quite a lot of evidence in favor of that. To be clear, I’m not saying everything deserves such charity, it’s the *contrast* I’m trying to make.
I can’t think of a single technology issue besides AI where the general, overall, rationalist-type view isn’t predictable from a simple rule of something like roughly “If big business wants it and think it’s good, but others think there should be strong safety constraints on business, the big businesses are doing moral advancement of humanity and the safety concerns come from technophobia and moral defectiveness” (is there anything else where this rule doesn’t work, as a broad idea? – not every single person, but very much more than median).
Thus, for AI, why isn’t the Party Line: “AI will give so many great thing – better medical drugs and disease which will save live, new materials which will help solve the climate crisis, find scholarly connections we can barely imagine. Anyone who impedes this advance is WRONG AND BAD! They are BLANKFACED BUREAUCRATS who will strangle the next technological revolution with their so-called SAFETYISM! The blood of all failed Effective Advancementism is on their hands!”
It’s not that nobody says the above. And of course it’s mentioned good things can come from AI. But, the rhetorical difference is really striking to me. The AI-can-be-good is usually a throat-clearing on the way to why it’s important to fret about the potential doom of humanity. Yudkowsky, direct quote “be willing to destroy a rogue datacenter by airstrike.” and not becoming a laughingstock for it, is to me a Trump level “shoot somebody and wouldn’t lose any voters”.
I know, I know, there’s a trivial reply – on everything else, the critics are completely, sneeringly, to be mocked and denounced, wrong. But this one issue is special, to be taken seriously. All I can finally say is the arguments there are extremely unconvincing to me in comparison (especially versus the give-*us*-money explanation).
Comment #48 June 9th, 2024 at 10:49 am
Seth Finkelstein #46: Yes, the people who stopped nuclear power were and are idiots, with the deaths of millions of people from future climate-caused catastrophes on their heads. I try to be more diplomatic! But if you’re going to keep pushing, I can’t honestly deny that that’s what I think. Renewables are great but not for reliable 24/7 base power.
Meanwhile, you imputed a large number of views to rationalists that are not what they think or even the exact opposite of what they think (e.g., the majority of the ones I know, including the leading lights, have been desperately trying to regulate AI as hard as possible, corporate profits be damned…). I won’t deign to reply to that part unless and until you can summarize the rationalists’ views in a way that they would recognize.
Comment #49 June 9th, 2024 at 11:31 am
Mark #43
“the engineers who created the internet didn’t build in a micro-pricing function because none of them thought it worth reading an introductory economics book.”
That’s not true at all. A massive amount of engineering was done in the 90s on network technologies built with quality of service in mind, relying on several tiers of priorities (by pricing, typically).
Comment #50 June 9th, 2024 at 12:04 pm
Scott #47
” Yes, the people who stopped nuclear power were and are idiots, with the deaths of millions of people from future climate-caused catastrophes on their heads.”
That’s easy to say with 20-20 hindsight (someone who’s been involved with OpenAI should maybe be more careful casting such harsh judgement… in 20 years you may be seen the same way… assuming there’s even anyone left to pass judgement!).
The thing is that global warming was known as a risk for 200 years.
In 1985, Carl Sagan testified
But people who lived in the 70s and 80s also got aware of the risks of nuclear and the public started to distrust the guys in charge of the technology
E.g. when the movie “The China Syndrome” came out on March 16 1979, it was bashed by experts as a “sheer fiction” and a “character assassination of an entire industry”.
Ironically, just twelve days after the movie came out, the Three Mile Island incident happened, i.e. a partial meltdown of a core in a nuclear plant in Pennsylvania.
https://en.wikipedia.org/wiki/Three_Mile_Island_accident
For people living in Western Europe in the 80s, Chernobyl was a huge trauma, the radiation traveled all over the place and it was very hard to get any useful information from the Soviets, which also created a deep sense of mistrust in the technology.
https://www.eea.europa.eu/data-and-maps/figures/deposition-from-chernobyl-in-europe
Also, the plants were hard to make profitable. They were producing excesses of electricity, which could be rerouted and sold.. but demand in those days wasn’t as high as it is now (when stuff like bitcoin mining and now AI training are driving up the price of electricity).
Then of course how to ignore Fukushima… it’s gonna be a giant fucking headache passed down generations of Japanese Engineers.
I guess in the end it depends on what one sees as more technologically hard to manage, melted nuclear cores (the difficulty is to fundamental to solve) or the effects of climate change.
Or maybe those are just two distractions, e.g. cancer rates in 30-year-olds are through the roof, and we should be focusing on the nano plastic particles that apparently are now everywhere, including in our bodies (and fucking up everything). Maybe the plastic industry from the 60s are the real “criminals”…
Comment #51 June 9th, 2024 at 12:08 pm
Hi Scott,
I apologize if this is a little off-topic. In my defense, it’s somewhat related to the whole circle of ideas surrounding the intersection of artificial intelligence technology and national security and geopolitics.
The Israel-Hamas war might go down in the history books as the first war where artificial intelligence systems played a crucial role. Israel is using two big AI systems in this war, and they’re integrated into the decision-making and weapons-targeting control flows. The Gospel system marks buildings and structures that Hamas militants operate from, and Lavender marks individual Hamas militants for assassination. Both systems are trained on vast amounts of surveillance data. Lavender assigns virtually every single person in the Gaza strip—all 2 million or so—an int value from 0 to 100 reflecting the probability that they’re a Hamas militant. The AI model also assigns to each person the geographic coordinates of their likely home. Homes that exceed a certain value in the model are marked for bombing.
Here’s an article about it. I know the magazine has a left-wing bias, but I think the reporting here is solid https://www.972mag.com/lavender-ai-israeli-army-gaza/
There’s been a paradigm shift in how the IDF is approaching this war, relative to previous conflicts in the strip. AI is effectively a decision-making authority, and human commanders are “rubber-stamping” bomb targets generated by AI. In previous conflicts, bomb targets were picked by human commanders and careful attention was paid to possible collateral damage. Also, the IDF is now **bombing houses and apartment buildings** where junior-level Hamas militants live, not just buildings and structures where Hamas militants operate from and are therefore of direct tactical significance (like ammunition depots, command and control centers, tunnel entrances etc.) unlike previous conflicts. This is likely why the civilian death toll is so much higher than in 2008 and 2012.
To be frank, I find the concept of artificial intelligence generating a list of people for assassination alarming and dystopian. Perhaps I’ve seen too many Terminator movies, but I’m very apprehensive about artifical intelligence systems directing military operations. Now I don’t think this current iteration of Lavender and/or The Gospel is going to become “skynet,” but if AI is accelerating as quickly as quickly as Leopold imagines, it seems plausible to me that a future iteration of these systems, in a war in Palestine 20 years from now—one that’s vastly more complicated and inscrutable than today’s Lavender and is fully integrated into / has direct control of military assets like drones—could become extremely dangerous.
Imagine Lavender 5.0 calculating in a fraction of a second that the best way to protect Israel indefinitely is to exterminate all the Palestinians, or even destroy every other country on Earth or some such.
I think it’s imperative that the IDF, if they do continue using AI systems, prioritizes AI safety and alignment of their systems. I think they should have an AI alignment/safety team.
Beyond the possibility of a future system becoming a misaligned superintelligence, however, the way Israel is currently using AI in this war strikes me as disturbing and dystopian. I’m not keen on the idea of an entire city being surveilled by AI, every person living there being assigned a “threat level” by AI, the AI marking individual people for assassination. It’s creepy. There seems a strong risk to me that, if Israel does re-occupy the Gaza strip, it could become a heavily surveilled “AI dystopia.”
Comment #52 June 9th, 2024 at 12:16 pm
Scott #47 I think we have a communications break-down somewhere here, in that either I misunderstand where you’re objecting, or you’re misreading my point (it’s hard to see how this happened, but you’ve got a lot of comments which you laudably respond to, and I suppose it’s easy to make a mistake in going through them all).
That is, “the majority of the ones I know [rationalists], including the leading lights, have been desperately trying to regulate AI as hard as possible, corporate profits be damned” – YES! Correct! I agree that this is the case about the belief over AI. I just quoted Yudkowsky about bombing datacenters, so of course I understand it as the belief.
I keep being fascinated that apparently this one issue of AI is where the process of “regulate” (n.b., government bureaucrats) is taken seriously, not extensively rabidly denounced as a downright immoral action by murdering BLANKFACES. Did you misread my paragraph, where I was describing what the AI view *would be*, IF it was following this template? I can’t make sense of that misreading as if it was meant to describe the AI view *as it is*, but again, to be charitable, errors happen.
Let’s back up. Not be cute, in terms of reversing, but what do you think I’ve been repeatedly saying in the thread on this point over rationalists/AI here?
Comment #53 June 9th, 2024 at 12:23 pm
Scott,
“In any case, though, it’s very hard for me to imagine anyone in the US national security establishment reading Leopold’s document without crapping their pants”
I’d be surprised if the national security apparatus is ignorant about this or downplaying it. If anything, they will be monitoring the developments closely as well as providing the cyber protections to the leading AI companies. Only thing is they wont do it in public, just like any other classified actions they take, or the more visible ones like banning export of AI chips. Of course adversaries will have their own sophisticated play in this, and the public will never know who is ahead in this race, until ….It is not a long shot to imagine all those DoE labs that developed those Supercomputing infrastructure are not involved in some top secret program around this.
Comment #54 June 9th, 2024 at 12:52 pm
James Cross at #7: “I’m still having a problem understanding how a qualitative difference emerges from doing more and more of the same.”
Whereas other smart people have said, e.g., “quantity is quality” and “more is different”. Examples might be a flatworm with 200 neurons vs. a border collie with 500 million vs. a human with 80 billion. It depends on the nature of the parts and how they interact, of course.
Comment #55 June 9th, 2024 at 1:04 pm
Seth Finkelstein #50: I apologize if I misunderstood you as saying something other than you were. It genuinely failed to occur to me that you wouldn’t know the reason why rationalists are generally strongly supportive of innovation and free markets with a few gigantic exceptions such as AI. Namely, they think that markets and scientific innovation work better than anything else our sorry species has ever come up with, except when they actually work so well that they let someone reckless or evil cause a totally unrecoverable disaster like (say) the extinction of the human race. Therefore they freak out (and call for regulation) about anything that they perceive, rightly or wrongly, as risking the latter, in much the same way that a previous generation of STEM-flavored nerds freaked out (correctly and understandably) about nuclear weapons.
Comment #56 June 9th, 2024 at 1:23 pm
JimV #52
I certainly agree that “more can be different” in the natural world but I’m not sure that demonstrates anything conclusively about the algorithmic world of AI. “More is different” in AI world could also mean something quite unexpected and not in a good way.
Comment #57 June 9th, 2024 at 1:31 pm
The debate on nuclear power has been going on for a long time and won’t be resolved here, but I can’t resist a personal anecdote.
My first several years in electrical-power-generation engineering were in the heyday of nuclear power development. One day I read in Power Magazine that a leaky valve had been found somewhere in the feed-water system of a nuclear power station, and as a result the NRC mandated that all nuclear stations with that particular valve design had to be shut down within two weeks to check the valve. Imagine utility companies having to shut down many gigawatts of power on an unscheduled basis, versus fossil-fueled power stations, one or more of which have a similar problem every month or so, with no wide-spread consequences. I knew then that the nuclear power boom was over.
(I’m not saying it was a wrong decision, or that reactor domes shouldn’t be designed to withstand the crash of a 747 or a 100-year earthquake. Some of the stuff that has happened in a fossil plant would be much more immediately consequential in a nuclear plant. Such as a burst of a main steam pipe flange.)
Comment #58 June 9th, 2024 at 2:38 pm
As a thought experiment to show how powerful a military power with access to an AI superintelligence could be: Imagine a swarm of millions of cheap assassin drones, linked together and powered by an artificial intelligence. The swarm has spider microdrones that can crawl up walls and through windows, micro air drones that can fire weapons, surveillance drones that collect data. Unleash the swarm on an area with insurgents and it will pick out and kill every insurgent fighter with limited civilian casualties and damage. In the current Gaza war, for example, give the swarm a month or so to collect data, and then release it, and it kills every Hamas fighter with limited collateral damage in a couple days. Against a conventional adversary, the swarm of millions of microdrones attacks ships or tank battalions with perfect coherence and synchrony and precision. It doesn’t matter how big and powerful and armored your tank or ship is, or what your anti-air defenses are—swarms of many thousands of armed microdrones all collaborating intelligently and with perfect intelligence and attacking the weakest spots will kill you. You can shoot down thousands of these microdrones and they’ll just keep coming. And the microdrones—especially with advances in manufacturing like 3D printing—will be vastly cheaper per unit than the hardware they’re destroying.
So AI could mean a shift from expensive **hardware** being the most important military assets, to **AI models** being the most important and expensive assets. These models will be extremely important national secrets. What destabilizes this situation is the asymmetry between how expensive an AI model is to train, and how cheap it is to store the resulting weights (you can store GPT on a laptop computer). Small countries or non-state actors, in possession of frontier model weights, could wreak havok.
Comment #59 June 9th, 2024 at 3:05 pm
Scott #47: I’ve been reading your blog for some time and greatly value your insights. As someone working in energy systems modelling research however, I’ve always wanted to briefly chime in on nuclear power. While it can certainly be argued that we wouldn’t have such a climate disaster at hand if nuclear had kept growing since the 80s instead of stagnating, that does not logically imply that nuclear power is the best option for cutting down emissions today!
In fact, the cost of renewable energy has fallen so much in the last years that the research now shows that cost-optimal net-zero energy systems use very little nuclear indeed. That is, fully renewables-based systems including all the various flexibility measures you need in order to cope with the variability of solar and wind (i.e. transmission, various types of stores, power-to-X, etc.) are simply cheaper than nuclear-based systems under most reasonable cost estimations.
Does that mean that nuclear power couldn’t in theory be cheaper? Of course not. Proponents of nuclear power will point to falling costs in e.g. South Korea. And indeed, if you, say, cut the cost of nuclear in half in the models, then nuclear becomes a whole lot more attractive. However, the reality on the ground is that the US and Europe today barely have any industry that’s actually building nuclear power plants and gaining the experience needed to cut down on the costs. Nuclear power plants are complicated and take a long time to build, so even if we decided tomorrow to go full steam ahead with nuclear (also cutting down on the bureaucracy, etc.), it would still take a long time to realising any kinds of falling costs.
Today, we need to cut emissions as quickly as possible, and wind and solar are the technologies that are already cheap and can actually be scaled up quickly. The research shows that net-zero energy systems also work just fine with 100% renewables, so there isn’t even much of an imperative to reach for nuclear in order to meet short-, medium- or long-term climate targets.
For anyone that’s a really big believer in nuclear power, my most optimistic take for you is that we can definitely keep hoping for cheap nuclear power to take over from the wind turbines and solar panels that we have to build first. Note that I’m nowhere relying on the classical arguments against nuclear regarding waste, contamination risk, proliferation, etc.
Anyway, sorry for the technically rather off-topic comment, but I figured you would be interested in hearing about research on this topic since you periodically refer to nuclear power.
Comment #60 June 9th, 2024 at 4:38 pm
Scott #54 My Internet glory days are long behind me and faded, but let me assure you, I do understand “free market” arguments. My point is that the way AI is treated as this “except” does not make sense to me in any intellectually consistent way. Not that I don’t understand the formal reasons given – I have seen them. It’s that they are so dissonant with all the other reasoning processes, and to me, so speculatively thin and weak, that I believe something else has to be going on here as a social phenomena. I know that’s dangerous territory, and risks all sorts of logical problems – but somewhat tongue-in-cheek, in general that approach is also an article of faith among rationalist-types (as in, other groups of people often don’t really believe what they say, it’s all a tribal game of status-seeking and virtue-signaling).
If this were really like nuclear fission, the first thing would be calls for a massive government investment program to insure that the US obtained the AI superweapons before, e.g. China. And did you know there were real scientific concerns that the first atomic bomb test might ignite the atmosphere and literally destroy all human life? (Fermi made a famous joke about taking bets this wouldn’t happen – it was a joke because he obviously would never have to pay out). And it was seriously estimated, no, probably won’t destroy all human life, let’s go ahead. Talk about existential risk! Moreover, atomic bombs are relatively hard to make, and nobody seriously wanted to prevent research reactors under the idea that they might spontaneously turn into a world-ending superbomb (n.b. this isn’t the same as diverting nuclear material for bombs, which is quite deliberate).
Comment #61 June 9th, 2024 at 5:02 pm
Given that this is a blog originally centered on computational complexity, it’s quite puzzling to me that there’s hardly ever any realistic consideration of what AGIs will be able to do and won’t be able to do. All discussions directly skip to “but think about it! Building an AGI is risky, but it will solve global warming!”.
Take NP-hard problems. Super-intelligence is nice, but if you believe NP!=P, then the only gain is in term of practical solvers. And an AGI will only make a difference if you believe there’s some massive win available on that side, which is possible, but not necessarily true.
And then NP-hard problems are really not that hard: at least, if you’re given a solution, it can be tested quickly… but many (most?) real-life problems are not this way. I.e., given a potential descriptive step by step solution to global warming, cancer, logical qubit implementation, … testing is not a cheap/simple/fast task at all.
You quickly realize that, in order to accurately select/test/validate such solutions, an AGI would have to run simulation/models that are pretty much as complex as the real world, and no internal weight matrix would be huge enough to just do this automatically.
In other words, the entire set of all texts and videos ever produced by humans isn’t enough to simulate the earth + a human body + quantum field theory + … with near perfect approximation. Unless you think that all those systems can miraculously be captured and reduced into perfect small models with just a trillion parameters… but a trillion parameters is just enough to mimic a human mind, and not close enough to capture the entire physical reality outside of it.
There’s also what Wolfram calls “computational irreducibility” (Scott agrees this is super obvious), i.e. most real-life systems needs to be simulated step by step, and there’s no model that can give you a nice shortcut to fold the entire computation. Therefore systems like the earth, the human body, quantum fields, etc.. need to be implemented as digital simulation run step by step at tiny time intervals, if one hope to capture them realistically.
If the AGIs can only deal with limited models of the world to guide their search for potential solutions, then, just like us, they will have to actually run super expensive simulation to validate or gets their hand dirty in the real world to test all its ideas, and the gain won’t be as spectacular.
As a dumb analogy, it’s as if Alpha-zero had only been allowed to self-train using an actual Go board, therefore limited by the fact that moving a Go stone requires energy, and that they can’t move faster than the speed of light … putting a bound on how many games can be played a minute, therefore putting abound on self-learning.
People bring up “even if an AGI is like an average scientific brain, if it can run 1,000 times faster, it’s gonna be massive win”. True, but, again, in the real world scientific progress almost always relies on experiments (for things like cancer, global warming, electronics, …). Not everything is solvable by faster mathematical thinking or writing a program more efficiently.
Comment #62 June 9th, 2024 at 6:11 pm
James Cross, as I heard it,a big difference between GPT-2 and GPT-4 was the number of parameters in the neural-network algorithm (which then required much more training data). I for one see a conclusive difference in performance as a result. Due to that and other examples, I can imagine that doing more of the same in an algorithm sometimes works and is not a mistake per se. Of course, imagination is never conclusive unless it is tested and succeeds. I also imagine that the LLM algorithm will probably be a useful part, but not sufficient alone to create an AGI. I imagine the most productive way forward is to use trained AI systems for specific tasks, such as predicting how proteins will fold. Which could also be used for evil as well as good purposes, as people have commented above.
Intelligent people can be evil, but it seems to me the most intelligent people, such as Archimedes, Galileo, Newton, Einstein, Aaronson, etc. have perceived that there is more value in promoting technological progress than in concentrating solely on personal gain at the expense of others; that is, it is more satisfying to earn respect than to get it by deception. So personally I imagine that if we eventually create an AGI which is smarter than all of us, it will not be evil unless directly programmed to be so. In other words, I imagine that if AI technology destroys us it will be our own faults. (Not that I think that is unlikely.)
Comment #63 June 9th, 2024 at 7:13 pm
Scott, you saying that “Namely, they [referring to rationalists] think that markets and scientific innovation work better than anything else our sorry species has ever come up with, except when they actually work so well that they let someone reckless or evil cause a totally unrecoverable disaster like (say) the extinction of the human race.” seems bizarre to me.
Many progressives would identify some current behaviors of markets as clearly “working so well that they cause abuse in the system that are systematically harming or even killing large portions of the human race (this is often what these people refer to when complaining about “late-stage capitalism”). Yet, in many of these instances rationalists are against progressives, and are completely on the side of business, markets, etc. This does not seem consistent with the description of rationalists you give.
In practice, the only way I see to align your description of rationalists with how they actually behave is if rationalists are very confident that the other issues people are worried about are actually existential threats, but for some reason do believe that AI could be an existential threat. But then this just supports Seth Finkelstein’s point again, that somehow rationalists treat the issue of AI as some sacred topic that gets special treatment (AI is a real existential threat and the only serious danger arising from capitalism and related processes that requires regulation, and somehow everything else is not a real existential threat).
Comment #64 June 9th, 2024 at 8:36 pm
Scott #4
Not only that, but nuclear energy would’ve also actually given us “electricity too cheap to meter” (or a close enough approximation thereof) and a carbon-free future, had the safetyists, environmentalists, and blankfaced bureaucrats not strangled it in the 1970s, in one of the bitterest ironies in the history of the world.
This certainly would be true if Soviets and Chinese, who have the technology to do it, would be harnessing nuclear energy profusely, they definitely did not have the constraints imposed by safetyists and environmentalists. It is a fact that the downstream consequences are real and messy, and German head of state was a physicist when the decision to abandon nuclear power was taken(though it was extreme). Just see how long Japan is entangled in resolving ONE nuclear incident, that too a non major one.
Comment #65 June 10th, 2024 at 2:58 am
As far as I can tell, for the kind of rapid progress Aschenbrenner expects, numerous technical breakthroughs would be required. Surely being able to apply AI to solve difficult real-world problems is a pre-requisite for the attribution of “superintelligent” to be pragmatically justified. Have we seen evidence that AI can do this? I am not talking about scientists using deep learning models in their scientific workflow. That has clearly become ubiquitous. But the increasing use of data-driven statistical techniques in scientific workflows is type-distinct from the visions of autonomous AI science the intelligence explosion idea is supposed to invoke. To be clear, I don’t see current generation AI simply as parlor tricks or stochastic parrots, they deserve much more respect, awe, and wonder than that. However, as incredible as they are, there are lots of things still standing in the way of their replacing scientists and technical workers en masse. To be more precise, I think Aschenbrenner’s “unhobbling” terminology on this point is misleading. The phrase makes it seem like all the relevant expertise for transformative intelligence is basically there in current generation models, just waiting to be unblocked. But I don’t see the evidence that suggests this is true. I think Aschenbrenner’s list of known “hobbles” and some of his suggestions to overcome them seem reasonable (though speculative). However, crucially, his list of hobbles is not exhaustive, and we don’t yet know what an exhaustive list would look like.
Comment #66 June 10th, 2024 at 4:19 am
Prasanna #63
She was first and foremost a politician and in a ruling coalition with the Greens with 50 year old nuclear plants and a letter from Greta Thunberg. Formally her degree is in Chemistry but yes some knowledge of quantum chemistry representative of 80’s East German science.
Global warming climate “science” is dominated by unrealized modeling results.
Comment #67 June 10th, 2024 at 6:57 am
Barney #64
I enjoyed reading through your analysis of his claims concerning hobbles.
Comment #68 June 10th, 2024 at 7:22 am
JimV #61
Going back to your original flatworm with 200 neurons vs. a human with 80 billion, the human brain isn’t the simple clumping together of the flatworm neurons until we have 80 billion of them. The cortex has more than 50 structures with dissimilarities in the neuron architectures themselves. The brain itself has over 1,000 different types of neurons. There were qualitative changes with the quantitative one. If number of neurons were all that mattered, then Einstein with an average sized brain wouldn’t have been any smarter than the average person.
Sometimes “more is different” and sometimes it’s just more of the same. There could be diminishing returns in adding parameters. I could even imagine that adding too many parameters might eventually even degrade performance.
Comment #69 June 10th, 2024 at 8:15 am
Prasanna #63
China has brought almost 40 new nuclear plants online in last 10 years and 20 more under construction. As you point out lack of environmental regs there and power generation from coal is cheap for the Chinese so reduces the attraction for nuclear power. They also have new coal fired plants under construction.
Russia has immense reserves of oil and gas. Again the cost of hydrocarbon fueled power generation is cheap so construction of nuclear suffers in comparison.
The most recent model failure I have seen is that the Mauna Loa carbon dioxide measurements showed no deviation from trend increase rate during the Covid shutdowns although the model indicated there would be near immediate measurable reduction.
Comment #70 June 10th, 2024 at 11:34 am
Claiming to have a situational awareness (among very few people on earth) and focusing on the competition with China, while we have the next Hitler right there – this time with nukes, huge human and natural resources, immense power grip, empire legacy, allies all over the world (including USA politicians and billionaires), and who already started a genocidal WWIII – looks somewhat funny to me. Right now Russia is still a much bigger threat to the free world than China. And no one knows how to change that.
I’m not saying he’s wrong with the concerns, I agree with most of it (except the proposed secrecy part). I just have a feeling that people like him, including me, are late with the predictions by at least 10 years. Or 20. Or maybe 2000?
Comment #71 June 10th, 2024 at 1:33 pm
Doomsdayists and exponential growth extrapolations are a match made in not-heaven. Climate (Mann/Gore “hockey stick”),AI, serious viral infections, etc all seem to enjoy exponential growth. Look-it’s a straight line to doom, semi-log!
Comment #72 June 10th, 2024 at 1:40 pm
I felt that situational awareness was kind of interesting, but left me with a lot of questions about what is going on
Feel free to answer any or all of these questions, that are some expressions of my skeptism
1. what he means that chat gpt 2 was a toddler and that chat gpt 4 is a high school student?
2. The weird graph with project future “effective compute”, like can we even 10000x effective compute with current resources
3. Do we even have a definition of what intelligence means or if LLMs can even reproduce it?
4. Do you think that a lot of the great results from LLMs on standardized tests are because they are in the LLMs training data?
Just as a general not, i guess in a year if a bunch of these graphs trend lines were to continue i’d be less skeptical
Comment #73 June 10th, 2024 at 3:21 pm
OhMyGoodness #68 in correcting Prasanna #63
Please get your facts straight: Angela Merkel holds a Diplom (Master) in Physics and moved her research to Quantum Chemistry for her PhD (after working for an Institute of Physical Chemistry). So I would stand with Prasanna entitling her a Physicists, but this depatebable and not your most important mistake.
Merkel ruled from 2005 – 2021 in varying coalitons. Mostly with the SPD (socialists) for four years, and also with the FDP (liberals) only. Never ever, she was in “a ruling coalition with the Greens”. The oldest German power plants, you are correct here, have been roughly 50 years old, when (mainly Merkel, together with her liberal/right government) decided to stop using nuclear plants in 2011. This was 7 years prior to Greta Thunberg’s first school strike. I do not think, that she had a “letter from Greta Thunberg” in 2011.
As Prasanna wrote, the decision in 2011 was on the one hand, based on the shock, that even a highly developped country like Japan was not prepared to stop a nuclear meltdown. But on the other hand, Germany, like many other countrys has not yet solved the problem of the final storage of nuclear wast. Neither Greta nor the Greens were involved here.
Sorry for the offtopic post, but the style of communicating false information made me angry.
BTW – Germany has *not* disposed nuclear power in favor of a return to burning coals, but got stronger on renewable energy. Admittedly, not yet in total, but this is (IMHO) the way to go. I am to lazy to look for good numbers and this is not the topic of the thread, I therefore satisfy myself with dropping this link (sorry German only):
https://www.destatis.de/DE/Presse/Pressemitteilungen/2024/06/PD24_219_43312.html
Comment #74 June 10th, 2024 at 3:45 pm
The one thing that Aschenbrenner outlines very eloquently is the data-center compute and energy demand relationship. My two inquiries are:
1. The U.S. lacks policy and wherewithal to provide such energy infrastructure (i.e., when was the last time we built a nuclear power plant?) so is this really where the threat is with regard to China? I don’t think simple economic drivers can overcome the U.S.’s regulatory burdens and limited manufacturing abilities.
2. Given that the majority of compute that will be needed is for inference, Aschenbrenner, seems to omit discussion of advances in inference-specific hardware (e.g., groq.com) that might bring power draw down by a factor of 10, maybe 100. This still will require significant power at scale, but in my mind changes the analysis on energy security.
Comment #75 June 10th, 2024 at 3:52 pm
Scott # 47: While nuclear energy has indeed had profound geopolitical impacts, its promise of “electricity too cheap to meter” was never realized due to intrinsic limitations and unanticipated complexities in safety, waste management, and economic viability. The analogy to AI fails because AGI poses risks rooted in recursive self-improvement and potential misalignment with human values, which are not comparable to the tangible, albeit serious, issues of nuclear power. The skepticism around AGI is not born from a lack of understanding but from rigorous ethical, and technical, analysis. Bostrom and Yudkowsky have provided models showing how even small errors in alignment could result in catastrophic outcomes. dismissive attitudes towards regulatory frameworks overlook the critical role they play in averting potential existential risks, which are qualitatively different from the environmental impacts of fossil fuels. Your argument that bureaucratic resistance stifled nuclear power’s potential does not adequately account for the justifiable caution exercised given the stakes involved, and applying this reasoning to AI is an oversimplification that ignores the potential for irreversible global consequences.
Comment #76 June 10th, 2024 at 4:51 pm
Scott #8 (and also #30): “compressing all the text on the Internet is enough to give you an entity that basically passes the Turing Test”
With practice, it’s still pretty easy to provoke non-human-like errors with a few relatively simple tasks or questions – or even just one. So the bots don’t “basically pass the Turing test” but clearly fail, at least when I’m the interrogator.
Comment #77 June 10th, 2024 at 5:14 pm
That document makes the clearest and best argued predictions for the future implications of AI that I have ever seen. Although I dislike the us vs them kind of thinking used in the article, I know our (and other) governments think like that. I never considered that there might be an ASI Manhattan Project. But now it seems inevitable to happen soon.
Comment #78 June 10th, 2024 at 5:54 pm
James Cross, we probably have more things to learn about neurons, but the tested intelligence of various species so far correlates well with the number of cortical neurons. Also, in a recent study with our (much more primitive) neural network algorithm, it was found that given enough nodes (about 1000, in at least five layers), all the known properties of a neuron could be simulated. (Whereas 500 was right out, as Monty Python might say.) (So currently my rough rule-of-thumb for comparing neural networks to brains is to divide the number of nodes by 1000 and compare that to the number of cortical neurons in a human brain. There is also the factor that neural networks are totally dedicated to the task they are assigned with no distractions, but I don’t know how to quantify that.)
Anyway, I have been impressed with the results neural networks have been able achieve so far (after many years of development) and am willing to credit the developers with some insight as to how to continue the development.
Comment #79 June 10th, 2024 at 6:30 pm
Barbara #72
Thanks for your comments and here is the timeline as best I know. I have no doubt you know far more about German politics than I do.
Merkel wooed the Greens when first elected in 2005.
https://www.nytimes.com/2005/09/24/world/europe/merkel-reaches-out-to-green-party.html
On 11 March 2011 the Fukushima accident occurred. Other key date for Germany was Chernobyl in April 1986. I believe there was a vague plan to shutdown nuclear power in Germany by 2030 after Chernobyl in 1986.
15 March 2011 Merkel announces 3 month shutdown of 7 of 16 of their reactors for safety inspections with a further decision after those inspections complete.
In 2010-2011
Wiki-reference-During the chancellorship of Gerhard Schröder, the social democratic-green government decreed Germany’s final retreat from using nuclear power by 2022, but the phase-out plan was initially delayed in late 2010, when during the chancellorship of centre-right Angela Merkel, the coalition conservative-liberal government decreed a 12-year delay of the schedule.[12] This delay provoked protests, including a human chain of 50,000 from Stuttgart to the nearby nuclear plant in Neckarwestheim.[13] Anti-nuclear demonstrations on 12 March attracted 100,000 people across Germany.[1
On 27 March 2011 The Greens (with SPD) won the election in coalition in Baden-Wurttemberg that had been a stronghold of Merkel’s CDU for decades.
https://www.dw.com/en/greens-force-historic-change-of-power-in-german-state-elections/a-14943224
In May 2011 Merkel announced permanent shutdown of the seven previously shut down temporarily for safety inspections and announced remaining six to be shut down by 2022.
Wiki Reference-Both Angela Merkel and Röttgen, the chief architects of the government’s energy transition plan, are thought to have pushed for a rapid nuclear phase-out with a view to raising the prospects for a possible future national coalition with the Green Party.[43
August 2020 Thunberg meets with Merkel and this was focused presumably on greenhouse gas emissions.
December 2021 three of the remaining six reactors are shutdown
April 2023 last three reactors shutdown.
So to rephrase my comments Merkel was pressured by the Greens (though not in a coalition with the Greens) and had communication with Greta during the shutdown process.
As a further note the Greens were enjoying increasing popularity in Germany at that time in the wake of Fukushima. Prior to Fukushima Merkel did not detail technical concerns about Germany’s reactors. After Fukushima, with support for the Greens increasing, Merkel reversed course and announced the plan to shutdown the reactors making no mention of technical concerns with those same reactors.
Comment #80 June 10th, 2024 at 9:31 pm
Danylo
“Right now Russia is still a much bigger threat to the free world than China.”
Not really. They share the exact same ideology that democracy is bad for them, so democracy needs to be eradicated (things are so much simpler when everyone is a dictator).
It’s just that all Putin has to bully other countries around are tanks and the threat to cut-off natural gas/oil.
While China also has tanks, it’s also the second economy in the world and has 13 times the population of Russia, so they can be way smarter, sneakier, patient, and imaginative about it.
But when you look at how they’re both taking over Africa, bit by bit, the end result isn’t all that different, but clearly China has the edge.
China “lends” a ton of cash so that they can roll in and build massive infrastructure, which they then run and control, using it as an excuse to slowly grow their tentacles into other aspects of the country (exporting their corruption/bribery model to turn every local politician into an obedient puppet) and that includes often sending “private” security contractors (they basically run the police of whatever place they’re “managing”).
Russia just focuses on the security aspect from the start by sending their mercenaries (with a solid dose of social/news media manipulation to make them look like the good guys).
Comment #81 June 10th, 2024 at 9:47 pm
Forgot to add that the other thing China and Russia have in common are the nukes.
China has about 500 nukes (growing), which is as many as Europe (UK and France together).
Russia has 10 times more, and there isn’t a day that Putin isn’t reminding Europe of this… which maybe worked the first 10 times he did it, but now it’s getting old. Maybe he got this from a tip of his pal Kin Jong Un.
Xi doesn’t like when Putin does it because it’s grotesque and bad for business.
Trump also loves to make threats using the US nuclear arsenal … he only found four opportunities to do it, but I’m sure he’ll improve the second time around.
Comment #82 June 10th, 2024 at 10:00 pm
James #67
“Going back to your original flatworm with 200 neurons vs. a human with 80 billion, the human brain isn’t the simple clumping together of the flatworm neurons until we have 80 billion of them. The cortex has more than 50 structures with dissimilarities in the neuron architectures themselves. The brain itself has over 1,000 different types of neurons. There were qualitative changes with the quantitative one.”
I don’t know about that one.
Once upon a time the human brain used to be the flatworm brain.
Evolution works by slowly changing things while maintaining most of the prior/past structure, so the variety of the human brain could mostly be a record of its evolution (it’s true about any organ). We still have very primitive parts in there, not because they’re the most optimal way to do things, but because intelligence got slowly added on top of existing stuff, by adding newer layers.
I’ve heard neuro scientists explain that once you look close enough, the frontal cortex is pretty much made of the same stuff all over, which accounts for its plasticity.
So, if nature could redesign the human brain from scratch, extracting only the stuff dealing with abstract thought (and leaving behind all the stuff that controls the machinery of the body), it’s possible the result would be much simpler and uniform.
Anyway, once LLMs have trillions of parameters, it’s very likely that many of them would start acting together as more complex units (from an emergent point of view), just like neurons in the brain start to specialize.
Comment #83 June 10th, 2024 at 10:06 pm
“US to beat China and its other autocratic adversaries” how ironic, given that America is the current posterchild for dystopian templates….. ban this, regulate that, NSA-mandated taps in all Cisco routers (this I know to be true, so don’t even try ….) and just now seizing a US citizens passport while de-planing him, no crime committed, no warrant… You live in your soft-serve gulag and you have the temerity to use the phrase “US to beat China and its other autocratic adversaries”
Comment #84 June 10th, 2024 at 11:55 pm
henryB #82: If the US right now is a “soft-serve gulag,” then I’ll take that in a nanosecond over the regular kind of gulag that the emerging axis of Russia, China, North Korea, and Iran would offer me.
Comment #85 June 11th, 2024 at 5:52 am
@Scott sorry for posting offtopic but have you made any progress in thinking along the lines discussed in “Why Quantum Mechanics”? Is the promised survey essay coming? That was a fascinating thread despite all the people misunderstanding the question.
Comment #86 June 11th, 2024 at 5:54 am
The following post is from https://scottaaronson.blog/?p=4312 , September 2019:
“I’m happy that the Netanyahu era might finally be ending in Israel, after which Netanyahu will hopefully face some long-delayed justice for his eye-popping corruption. If only there were a realistic prospect of Trump facing similar justice. I wish Benny Gantz success in putting together a coalition”
Comment #87 June 11th, 2024 at 6:44 am
While it is a good advise on dealing with adversaries, US should broaden its scope in AI. It should try to bring into the fold other democratic countries, UK, EU, Canada, Japan, India, Australia etc into an alliance, though not as tight one as Nato, and send a clear message to the rest of the world. This loosely coupled but definitive alliance will help US tap their talent and build a distributed infrastructure, and share the mutual gains of AI with those democracies and strengthen them.
Comment #88 June 11th, 2024 at 8:11 am
The article Scott linked is interesting.
But it seems to me that scaling things up can have two potentially competing effects, one is good one is bad, and it’s not clear if they can be balanced to get a net positive gain:
1) more room to accommodate for more sophisticated patterns to model the data. More of this is what we want.
2) more potential to waste the parameters to just brute force memorization of the training data by over-fitting it. It’s the equivalent of the “idiot savant” who can perfectly remember everything he ever heard or saw, an impressive skill… but it’s working against the option to compress the data as much as possible by finding interesting patterns and illustrates a lack of “imagination” in the output, e.g. the AI becomes better and better at scoring at exams/tests, but not at thinking outside the box to eventually come up with truly new insights by connecting two separate domains.
Comment #89 June 11th, 2024 at 8:26 am
The paper also uses a fairly limited concept of “compute”, “effective compute”, where both seem to only refer to the cost of training the models.
But it’s been often noted that one shortcoming of LLMs is underlined by the fact that producing the next token is always constant cost, therefore the relative “complexity” of the current context doesn’t enter the picture. Whether we the context is about quantum mechanics in the context of black holes or a recipe for pancakes, it doesn’t matter, which is strikingly against human intuition in how our own brains work.
One expects that an AGI will work by spooling multiple contexts at once, in order to “explore” things in parallel, the more so based on the complexity of the query. This should become the true measure/limiting factor of “compute effectiveness”.
Comment #90 June 11th, 2024 at 8:36 am
“they make clear there is an enormous amount of algorithmic
progress possible and happening”
Lol, that’s really cute… a few years ago it was a joke among seasoned gen-x coders how totally abysmal TensorFlow was as a software package.
So noone that’s just not fresh of college is really suprised that simply rewriting those half-assed packages properly from scratch would/does lead to OOMs of improvement in performance, having nothing to do with “new” algorithms, but just by following proper software practices and optimizations.
Comment #91 June 11th, 2024 at 8:40 am
JimV #77
While it might be true that to achieve bug-for-bug compatibility with a neuron one may need about “1000, in at least five layers” nodes, this has no relationship to how many nodes you need the model a system of neurons! This doesn’t exclude a (very probable in my opinion) scenario when to model the functionality of a system of 1,000 neurons you need 5,000 nodes.
For example, there is little doubt now that even with the primitive tuning algorithms of today a system with 8 billions of parameters could reproduce most of (externally visible) functionality of the brain of a typical human of say 1,000 years ago. This is less than one parameter per 10 neurons. (I already voiced my opinion here that a neuron should not need more than about 10–20 gates to model — on average…)
Comment #92 June 11th, 2024 at 9:01 am
I read this when it came out and am very surprised to see a positive endorsement from someone who knows a thing or two about computational complexity and has been hands-on with some of these LLMs. The first half of the piece basically asserts that there are no serious substantive challenges to reaching AGI, it’s a matter of throwing more compute at the wall and maybe a couple smart people making an algorithm better (just need a bit more money, and also of course to fully gag-order them and put them in airgapped prisons as they work, see below). This simply seems like it has to be a complete oversimplification at best … even with Moore’s Law in full swing and general strategies in place to achieve that growth there were still major engineering challenges that were not “just turning the crank” and measured in “OOMs”.
To your point on “it’s very hard for me to imagine anyone in the US national security establishment reading Leopold’s document without crapping their pants.”, it’s actually pretty easy to dismiss this stuff because it comes out as so baseless and crackpot-ish.
“We’ll see superhuman hacking that can cripple much of an adversary’s military force, roboarmies and autonomous drone swarms, but more importantly completely new paradigms we can’t yet begin to imagine, and the inventions of new WMDs with thousandfold increases in destructive power (and new WMD defenses too, like impenetrable missile defense, that rapidly and repeatedly upend deterrence equilibria).”
“Rapidly multiplying robot factories would mean not only a drastic technological edge, but also production capacity to dominate in pure materiel. Think millions of missile interceptors; billions of drones; and so on”
“Millions or billions of mouse-sized situational awareness autonomous drones, with advances in stealth, could infiltrate behind enemy lines and then surreptitiously locate, sabotage, and decapitate the adversary’s nuclear forces. Improved sensors, targeting, and so on could dramatically improve missile defense (similar to, say, the Iran vs. Israel example above); moreover, if there is an industrial explosion, robot factories could churn out thousands of interceptors for each opposing missile. And all of this is without even considering completely new scientific and technological paradigms (e.g., remotely deactivating all the nukes).”
“via superhuman hacking abilities that could shut down presuperintelligence militaries, more limited drone swarms that threaten instant death for every opposing leader, official, and their families, and advanced bioweapons developed with AlphaFold-style simulation that could target specific ethnic groups, e.g. anybody but Han Chinese”
Sure seems like a levelheaded and reality-based assessment of the AGI situation and not at all wildly irresponsible fearmongering …
Comment #93 June 11th, 2024 at 9:06 am
The author of the paper notes:
“In step 1, AlphaGo was trained by imitation learning on
expert human Go games. This gave it a foundation.
situational awareness 29
In step 2, AlphaGo played millions of games against
itself. This let it become superhuman at Go: remember
the famous move 37 in the game against Lee Sedol, an
extremely unusual but brilliant move a human would
never have played
Developing the equivalent of step 2 for LLMs is a key re-
search problem for overcoming the data wall
“
But the real breakthrough came with AlphaGo Zero and AlphaZero where *zero* training data was needed, all the system was given were the rules of Go and then it learned entirely by playing itself millions of games. It’s an entirely different approach since it doesn’t rely at all on a set of training data. That’s only possible when performance evaluation is very clear cut, i.e. games (win/lose or some scoring function). So I fail to see how this is an insight to overcome the eventual drying out of training data… unless you want ChatGPT to become better at chess/Go/Poker (and Math)? … but it’s not clear how it would figure on its own the contexts where an objective function exists and the ones where there is none… otherwise we’ll end up in a paradoxical situation where AIs use their own output as new additional training set.
Comment #94 June 11th, 2024 at 9:15 am
“As an aside, this also means that we should expect more
variance between the different labs in coming years com-
pared to today. Up until recently, the state of the art tech-
niques were published, so everyone was basically doing
the same thing. (And new upstarts or open source projects
could easily compete with the frontier, since the recipe
was published.) Now, key algorithmic ideas are becom-
ing increasingly proprietary. I’d expect labs’ approaches
to diverge much more, and some to make faster progress
than others”
Man, that’s a strange way to frame things… indeed, none of the huge progress that has happened in recent years would have been possible without the traditional sharing through academic papers (e.g. “Attention is all you need” from Google, etc), but he frames it as if it’s a weakness because everyone is doing the same thing… when in fact this is acting as a filter to let the best ideas percolate.
Maybe the field is now closing on itself too fast, it’s not a given that progress will stay as steady as collaboration dims. Even if a team comes up with a brilliant idea in their proprietary software, without thousands or millions of extra eye balls, the true win may be missed because the team doesn’t have enough perspective on their own work.
And then open source packages may counteract this (e.g. the way Meta is doing it).
Comment #95 June 11th, 2024 at 9:50 am
Fred, JimV, et al
Some fun facts about the brain.
Brain size correlates only weakly with intelligence.
There is a wide range of brain sizes in normal humans. Woman have on average smaller brains than males but there is no significant different in intelligence between genders.
The largest brains of H. erectus fall into the lower H. sapiens range of size.
H. heidelbergensis had an estimated 76 billion neurons.
Neanderthals had on average larger brains than humans.
But here is kicker in all of that. Seventy-five to eighty percent of the neurons in the brain are not even in the cerebrum. They are in the cerebellum which has a different architecture from the cortex and until recently was only thought to be involved in motor coordination. So the oft-quoted 80 billion neurons for the human brain is true but misleading if it is used to make an argument about intelligence.
Where sapiens differs is the development of the temporal lobes, the frontal cortex, and the prefrontal cortex. The size and delayed maturation of the prefrontal cortex probably plays a key role in enculturation.
“Einstein’s brain has an extraordinary prefrontal cortex, which may have contributed to the neurological substrates for some of his remarkable cognitive abilities. The primary somatosensory and motor cortices near the regions that typically represent face and tongue are greatly expanded in the left hemisphere.”
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3613708/
It’s not just size; it’s structure.
Comment #96 June 11th, 2024 at 10:35 am
Ilya #90
You seem to be assuming that the neuron is the critical unit of computation to model the brain. Neurons can have from hundreds to a thousand dendrites to receive inputs from other neurons and their axons can connect to multiple other neurons. The number of connections in the brain could be in the 100 trillions. And we still do not know whether the astrocytes might have a role too.
Comment #97 June 11th, 2024 at 11:09 am
James,
thanks,
indeed, when you consider the cool puzzle solving capabilities of crows compared to their brain size.
Comment #98 June 11th, 2024 at 11:18 am
Seth Finkelstein #59:
> “My point is that the way AI is treated as this “except” does not make sense to me in any intellectually consistent way. Not that I don’t understand the formal reasons given – I have seen them. It’s that they are so dissonant with all the other reasoning processes, and to me, so speculatively thin and weak, that I believe something else has to be going on here as a social phenomena.”
It is possible that Yudkowsky is a cult leader, carefully crafting wrong arguments to further his selfish goals. Or he deluded himself. Or he is so charismatic that he is able to make any claim sound convincing, regardless of its truth or falsity. On the other hand, maybe he is right and has consistent arguments. This would explain why many smart people claim that he has a consistent world model. I suspect that you do not know the doomer world model, which causes you to find it inconsistent. Some evidence:
> “I can’t think of a single technology issue besides AI where the general, overall, rationalist-type view isn’t predictable from a simple rule of something like roughly “If big business wants it and think it’s good, but others think there should be strong safety constraints on business, the big businesses are doing moral advancement of humanity and the safety concerns come from technophobia and moral defectiveness””
Other potential existential risks that have been intensively discussed by the rationalists are bioengineered or natural pandemics, environmental collapse, nuclear war and nanotechnology. Only in recent years has AI become the main focus, as timelines got shorter. The 2012 LW survey had pandemics at 31.4% with unfriendly AI only at 13.5%: https://www.lesswrong.com/posts/x9FNKTEt68Rz6wQ6P/2012-survey-results
Here is Yudkowsky in 2008, analysing how the world would look like if we made nanotech before AI: https://www.lesswrong.com/posts/5hX44Kuz5No6E6RS9/total-nano-domination
See also https://nickbostrom.com/existential/risks.pdf for much more discussion.
> “I keep being fascinated that apparently this one issue of AI is where the process of “regulate” (n.b., government bureaucrats) is taken seriously, […]”
Rationalists tend to be for worldwide bans of nukes, bioweapons, some chemical weapons and gain-of-function research. They also tend to be vegetarians or vegans and for banning factory farming. They also tend to be georgists. You can use the yearly surveys as a source for some of these claims.
> “It’s not that nobody says the above. And of course it’s mentioned good things can come from AI. But, the rhetorical difference is really striking to me.”
From the beginning the message was that we need to conquer the stars, abolish death, and convert the universe into fun and flourishing and friendship, and we need to build a friendly AGI to do this. See for example https://www.lesswrong.com/posts/HoQ5Rp7Gs6rebusNP/superintelligent-ai-is-necessary-for-an-amazing-future-but-1 for an explicit account.
> “If this were really like nuclear fission, the first thing would be calls for a massive government investment program to insure that the US obtained the AI superweapons before, e.g. China.”
If AGI were like nuclear fission, maybe. But nukes are not like AGI. See for example “10 [or 17] obvious reasons that the danger from AGI is way more serious than nuclear weapons”: https://x.com/ESYudkowsky/status/1643800261599825921
> “And did you know there were real scientific concerns that the first atomic bomb test might ignite the atmosphere and literally destroy all human life?”
Yes, see for example this 2008 blog post from Yudkowsky: https://www.lesswrong.com/posts/f3W7QbLBA2B7hk84y/la-602-vs-rhic-review
This is not an accusation. It is an invitation to learn more. Even absent the potential dangers, friendly AI is one of the most interesting and intellectually stimulating topics for readers of this blog. As with every difficult topic, one can not learn it by reading twitter. In other disciplines the correct thing to do is to read a textbook.
For friendly AI and the doomer world model, there are no textbooks. The closest thing are the sequences, available at https://www.readthesequences.com. If you want to know the consistent doomer world model, this is probably what you want to read. It is long, but worth it.
Comment #99 June 11th, 2024 at 11:40 am
Scott 83: Trudeau’s pushing a bill in Canada that would punish online “hate speech,” loosely defined, with up to a life sentence in prison.
Comment #100 June 11th, 2024 at 1:33 pm
What’s your take on the latent manifold hypothesis? Whether generative AI will continue to improve at the same rate—and the algorithmic paradigms that will improve model performance with fixed compute—surely depend on whether generative AI is accessing a low-dimensional manifold in feature space.
If the manifold hypothesis is true, I think it’s likely that the “next big thing” in AI will be training and dimensionality-reduction algorithms that are informed by topology. One example that comes to mind is UMAP.
Comment #101 June 11th, 2024 at 2:35 pm
Your friend Leopold Aschenbrenner is actively advocating for the US government to launch a modern version of the Manhattan Project, with the goal of developing AGI and Superintelligence ahead of adversarial nations. He envisions a massive investment of trillions of dollars, the utilization of extensive computational resources, and significant energy consumption to achieve this. Leopold appears to be positioning himself as a key figure in this initiative, akin to historical figures like Szilárd, Oppenheimer, or Teller, and perhaps even aims to amass a substantial personal fortune through this endeavor. His push for such a project is driven by the portrayal of an urgent and unprecedented threat, which casts doubt on the aggressive timelines he proposes for achieving AGI and Superintelligence.
Comment #102 June 11th, 2024 at 2:55 pm
Let me see if I have this right: These AIs are smart enough (or soon will be), to take over the world and turn us all into fertilizer, but they still can’t do something a competent 13-year-old can, namely drive a car. (Yes, by the time I was 13, I could drive our ’52 Dodge pickup better than my father — he never got the hang of using a stick shift.) And so far, they apparently can’t even perform a much simpler task which would greatly benefit all of us: filter out all the spam from my texts and email. And if you think they can, then why hasn’t it been done? I would pay good money for that.
Comment #103 June 11th, 2024 at 5:02 pm
The entirety of this discussion seems moot to me – it won’t be necessary for a AI to be a non-aligned AGI to screw humanity over. Humans using any kind of AI as a lever seems entirely sufficient.
Recent case in point:
Hearing of Fauci and others in Congress:
https://www.politico.com/news/2024/06/02/fauci-covid-research-investigative-panel-00161109
“””
Gain-of-function
The debate surrounding EcoHealth and allegations that it could have caused the pandemic centers on research defined by some as gain-of-function.
Those are experiments that seek to enhance viruses or other pathogens, or to make them more virulent or transmissible, as a way of studying how to combat infectious diseases.
“I just don’t see the point of it,” Wenstrup said, adding that artificial intelligence could be used to predict how a virus would behave.
“””
Now, Wenstrup is probably no dummy (medical degree and US Colonel of the reserve), so when I see lawmakers like this saying things like this, I think they believe it. If the US will indeed use AIs for things like gain of function research and other areas in that vein, we’re screwed.
Comment #104 June 11th, 2024 at 5:37 pm
Let me know when LLMs can do simple math like adding two 20 digit random numbers and we can discuss (without actually using a calculator and python).
We deal with real applications of LLMs daily and let me tell you, the claims by the AI researchers are way overblown. These models are interpolating based on the data they have seen.
Current LLMs are mainly a human language interface over an approximate database of information in their training datasets.
Comment #105 June 11th, 2024 at 5:40 pm
You won’t believe how many times I have heard from AI top shots that AGI is just around the corner over the past decade.
Comment #106 June 11th, 2024 at 5:50 pm
#10
It costs too much money, even NSA does not have tens of billions of dollars to spend on enough compute. NSA is probably using AWS or Azure in air-gapped data centers owned by Amazon and Microsoft and pays for usage.
Other than a few big players no one has enough compute and no one has a solid steam of revenue to keep investing in building huge data centers.
You can pretty much estimate the amount of available AI compute in the world from Nvidia sale numbers.
Comment #107 June 11th, 2024 at 6:44 pm
Very interesting theoretical read.
Leaving aside how or if AI will be able to formulate the questions that lead to breakthroughs in fundamental research:
If all critical enablement assumptions in the paper are correct, which Western leader will be able to match the speed of execution outlined in the paper? (US presidential candidates for the next four years wouldn’t be my bet.)
In which scenario will the energy abundance required be available for computing by the end of the decade?
How many industries would need to be disrupted for TSMC and NVIDIA to reach the AI chip manufacturing capacity outlined? If not them, who will build the necessary infrastructure by then?
This is not an atomic bomb development-like scenario. It requires the coordination of many inefficient systems.
The paper extrapolates the speed of development found in small clusters of ultra-high-performance AI teams in SF to the rest of the world. If all the above is what it takes, we may be safe for a couple more years.
“The Project” may be extremely important. I don’t think doubling emissions in the next six years is worth it. I definitely hope “those not capable of plausibly running it” keep focusing their efforts on the fundamental research that will make the world a better place for future generations (of humans)!
Comment #108 June 11th, 2024 at 8:10 pm
[…] Aschenbrenner’s series has elicited a range of responses in the tech world. Computer scientist Scott Aaronson described it as “one of the most extraordinary documents I’ve ever read,” while software […]
Comment #109 June 12th, 2024 at 3:07 am
I have questions. 🙂
Before asking the first question, a couple of things about the paper stand out to me. First is its breathless style: AGI is coming fast and it’s going to completely change the world really soon, and when the time comes we will have to make the right decisions but not have much time to make them!
Maybe Aschenbrenner will turn out to be right about the rate of progress or maybe he won’t, but my take is that the “bedrock” rationale for most of his claims is ultimately a trust/faith in the continuation of OOM trendlines and the expectations of imminent progress by researchers he knows or trusts. Reliance on trends is fraught (they continue until they don’t), and it is completely unsurprising that researchers in the field, especially ones in a fast-paced startup that excites the tech and investing world, would be anything but optimistic. Although he discusses objections by skeptics, he dispatches each one with a combination of extrapolation and speculative belief, it seems to me. The general tenor of the article has a speculative feel to me, making it harder to decide how much to accept his conclusions.
My second general impression, formed in the first two parts of the paper where he develops his thesis about the rate of progress toward superhuman AGI, is that he focuses almost entirely on the technology itself and the roadblocks researchers will face as they move forward. While technology development can occur on its own timeline to a certain extent, provided the money and expertise are available, its general acceptance (and profitability!) occur in a wider context. Given the author’s economics training, I’m surprised he neglected societal factors that are part of that context. (Again, I’m talking about his arguments in the first two parts.)
This second impression is the launching point for my question, which applies only to commercial use of AI technology; military uses are driven by additional, non-economic considerations. In short (sort of):
Given that expenditures on AI by businesses at the end of the production or service chain (i.e., those who sell directly to end consumers) are economically sensible only if the ROI is better with AI than without it, how does the bigger ROI picture change when many of the best paid workers (e.g., knowledge workers) are either paid less or are out of a job because they have been replaced by “drop-in workers”, and reduce or stop spending on products from companies that are jettisoning workers thanks to AI? Isn’t that a kind of negative feedback that should change the trendlines that Aschenbrenner respects, at least after AGI becomes real enough?
As I see it, the main benefits of AI to businesses are higher productivity, an ability to offer a higher quality product at similar or lower cost, and/or an ability to deliver desirable products or features that aren’t feasible without AI. Clearly these three benefits aren’t mutually exclusive, especially with increases in productivity. If AI allows getting the job done with fewer employees, the cost of using the AI will almost certainly be much less than the salaries+benefits of discharged employees, so the business “wins” and the unemployment and home foreclosure rates go up. Assuming many businesses are doing the same thing, it may be hard to find a similar job, or it will be necessary to significantly reduce salary expectations in a similar job thanks to competition from other laid-off workers. It’s the usual story: cost savings that benefit an individual company become a societal problem when enough companies pursue the same strategy. *
But as Aschenbrenner implies, technology development can ignore ROI and higher unemployment and most other things if investors keep fueling it with cash and the necessary expertise is available. AI companies can spend hundreds of billions of dollars on compute power, and consumer-oriented companies can incorporate this AI technology into their products and charge their customers handsomely for it. But if AI “drop-in workers” proliferate enough, the economy will suffer and for-profit businesses will find it harder to justify continued ferocious spending on compute power with questionable future ROI. That means the context on which the current trendlines depend will change, and I think that will cause some trendlines to change too.
* I’d argue that the multi-decade offshoring trend is an empirical example of the “usual story”. Individual companies benefited significantly from lower costs after offshoring, but collectively it led to a hollowing out of the US industrial base at the cost of many well-paying jobs. The impact of those lost jobs fell disproportionately on the middle class and below, but benefited well-off investors nicely. I’d also argue that the resulting wealth redistribution is a key underlying cause of the political polarization today. (Since much of the offshoring of industry was a survival adaptation to rising competition from other countries with substantially lower labor costs, and since US workers were decidedly unwilling to take drastically lower pay for the same work so that their employer could remain competitive in the US, one can reasonably argue that offshoring was inevitable. But it’s not that simple: Many companies could have made themselves more competitive through capital improvements, for instance US Steel could have modernized to use electric blast furnaces like Japanese steel companies did; and listening harder to people like Deming who argued the importance of quality, but who found a much more receptive audience among Japanese manufacturers. Incompetence and complacency of upper management caused many companies to embrace offshoring unnecessarily, IMO, because it was an expedient way to shift blame to outside forces and save their own careers.)
Comment #110 June 12th, 2024 at 3:08 am
I have another question, this time with less context. 🙂
If Aschenbrenner’s predictions are even approximately correct about the pace of progress toward AGI and the rise of “drop-in workers”, then a lot of people will be at risk of losing their jobs in the not-very-distant future. There will also be new jobs created, presumably, but at least some of the new kinds of jobs that take advantage of, or support, the new AI infrastructure can also be performed by AI agents. This is a very different picture from dramatic technology shifts in the past, where new kinds of work allowed the creation of new kinds of jobs, especially service jobs. So this is my question(s):
If a business replaces employees with AGI drop-in workers, and the new business model requires the performance of new kinds of tasks, how likely is it that the same business will hire or repurpose employees to perform those tasks if they can assign them instead to an AGI? And if advances in AGI enable new kinds of services in new kinds of businesses, how can a business owner differentiate the business enough to create a barrier to entry (or “moat” in Warren Buffet’s terminology) by other business owners with access to the same AI technology, to allow the business to survive for a long time?
From my perspective, a proliferation of drop-in workers is a recipe for societal unrest. For many people, probably most, a job means more than an income. For many people it is a way to feel a sense of usefulness or purpose, to contribute to the wider world in some way. It offers a way to interact with others regularly and form friendships. For some, it offers social standing and respect from peers. For those fortunate enough to work in an occupation they enjoy, it can be a lot of fun and a great opportunity to learn. And so on. Those benefits are lost when one loses their job and aren’t competitive in another occupation with similar income, like what happened during the time of industrial offshoring (and to lesser extent after the 2008 financial crisis).
I don’t know how everything will play out if Aschenbrenner is right about the rise of drop-in workers, but I imagine it will become ugly over time. Absent restrictions or onerous mandates by government, I think we can expect the usual story to play out: cost savings that benefit an individual company become a societal problem when enough companies pursue the same strategy. As with offshoring, the winners will be the ones who are already very comfortable, while overall the losers will come from the pool of “everyone else”. But thanks to drop-in knowledge workers, losers will now include people higher up the economic ladder than before, probably leading to even more wealth concentration. Very plausibly there will be a rise in public unrest and a consequent rise in authoritarianism and public surveillance.
Is this human aspect of “AI safety” also being explored seriously, in addition to the “alignment” concerns?
Comment #111 June 12th, 2024 at 6:40 am
Fred #96
This is complicated, but…
Brain size and number of neurons correlate as long as we are sticking with the Homo genus and primates.
Birds have many more neurons than we might expect because the neurons are smaller and more compressed together. Whales have fewer than we might expect because the neurons are larger and they are not as compressed.
More neurons make a difference and certainly thresholds likely exist, but structure also makes a difference. I’m arguing against trying to make any simple comparison between biological brains, number of neurons, and intelligence.
The key question is whether adding more neurons, more parameters, more whatever to AI without other structural/ algorithmic changes reaches a point of diminishing returns.
Comment #112 June 12th, 2024 at 6:41 am
Jan #103:
> “Let me know when LLMs can do simple math like adding two 20 digit random numbers and we can discuss (without actually using a calculator and python).”
That happened over a year ago. gpt-4 gets the right answer 93% of the time. You can check such claims yourself to see if they are true! Simply set your OpenAI API key as an environment variable and run the following code:
import os, random, openai
random.seed(42)
client = openai.OpenAI(api_key=os.environ.get(“OPENAI_API_KEY”))
correct = 0
for _ in range(100):
a = random.randint(10**20, 10**21)
b = random.randint(10**20, 10**21)
prompt = str(a) + ” + ” + str(b) + ” =”
response = client.chat.completions.create(
model=”gpt-4″,
temperature=0.0,
messages=[{“role”: “user”, “content”: prompt}])
c = int(response.choices[0].message.content.strip())
if a + b == c:
correct += 1
print(“gpt-4 20-digit addition reliability: ” + str(correct) + “%”)
(I used HTML entities to format the code for this blog – You may need to replace them by true spaces in your editor after copying the code)
Using a better prompt enables 100% reliability at this task, but maybe you count that as cheating.
Entirely separately, even if LLMs could not do 20-digit addition without building and using external tools, this would not preclude them becoming AGI. Using specialised tools in this situation is the smart move. LLMs will have access to a python interpreter.
Comment #113 June 12th, 2024 at 8:47 am
Timeline according to the paper’s extrapolation technique:
2017 – ChatGPT reaches level of tapeworm
2019 – ChatGPT reaches level of preschooler
2021 – ChatGPT reaches level of elementary schooler
2023 – ChatGPT reaches level of smart high schooler
2027 – ChatGPT overtakes your average AI researcher/engineer
2030 – ChatGPT achieves the rank of Minor God
2032 – Butlerian Jihad
2034 – Scott Aaronson becomes the first Mentat
Comment #114 June 12th, 2024 at 9:00 am
Questions for Scott or anyone else
What will be a sign that we are on the road to superintelligent AI? When can we expect to see it?
I assume we will see something before the Internet shuts down or all cryptography is broken.
Something positive like a new Standard Model or a cure for cancer?
Comment #115 June 12th, 2024 at 9:03 am
still not that good at applying little grammar rules, which is trivial for humans and is the basic of “coding” ability (in my book):
ME:
the following are string concatenation/generation rules:
we have
F(a) -> aa
for example, F(‘xyz’) -> ‘xyzxyz’
and
G(a,b) -> aba
for example, G(‘x’,’yx’) -> ‘xyxx’
for example, G(‘x’,F(‘y’)) -> G(‘x’,’yy’) -> ‘xyyx’
what is F(G(‘x’,’y’)) ?
ChatGPT:
Let’s break it down step by step:
First, we apply the function G(‘x’,’y’), which gives us ‘xyx’.
Then, we apply the function F to the result of step 1, which gives us ‘xyxxyx’.
So, F(G(‘x’,’y’)) = ‘xyxxyx’.
ME:
what is G(F(‘x’),F(‘y’)) ?
ChatGPT:
Let’s break it down step by step:
First, we apply the function F(‘x’), which gives us ‘xx’.
Then, we apply the function F(‘y’), which gives us ‘yy’.
Finally, we apply the function G to the results of steps 1 and 2, which gives us ‘xyyx’.
So, G(F(‘x’),F(‘y’)) = ‘xyyx’.
Comment #116 June 12th, 2024 at 10:09 am
James Cross #110
You can look at a patch of the cortex that doesn’t betray overall brain size and know that it’s a human brain. The grooves (sulci) are more pronounced and characteristic in humans. Your paper notes unusual features of Einstein’s sulci. I have no idea the functional impact but a reasonable hypothesis would be that these have a connection with intelligence.
https://www.frontiersin.org/files/Articles/332626/fnsys-11-00096-HTML-r1/image_m/fnsys-11-00096-g001.jpg
This same argument appears here periodically with the same adherents on each side so likely Aaronson Forum Undecidable.
Comment #117 June 12th, 2024 at 10:29 am
Orca wins out on whose is biggest (cortex neurons)-
“ As expected, average neuron density in four areas of cortex decreased from the smallest to the largest brain. Despite having a lower neuron density than smaller dolphins, the killer whale has more gray matter and more cortical neurons than any mammal, including humans.”
https://pubmed.ncbi.nlm.nih.gov/31841529/
Their hypothesis was that more neurons mean shorter maximum dive time in marine mammals.
Comment #118 June 12th, 2024 at 10:31 am
I have an off topic question. I’m asking here because the google results are disappointing.
Can current AI detect bias in its training data (assuming the training data contains bias)?
Comment #119 June 12th, 2024 at 10:36 am
Frankly, AI researchers need to understand history of other countries be before claiming the world’s future depends on the US government wrangling superintelligence before China. How many downright evil dictators has the US installed around the world? How much more global violence would we see with AI in the hands of US politicians? As someone who lived in South America for years, the US is THE most dangerous government to the rest of the world. Of course, that is not to say China is incapable of doing incredible evil, but the horrendous history of US foreign policy is so extremely dangerous, I’m baffled that didn’t come up in the article.
What we need to guide is not just superintelligence research, but how the US would use that for its weapons. Unfortunately, with the military-industrial complex and the support Trump and his ilk get from Silicon Valley, I think that is impossible. What this article should have emphasized is that Americans need to take democracy and humanitarian values around the world MUCH, MUCH MORE seriously. I am disturbed to see the article essentially ignores that dimension of the problem.
Comment #120 June 12th, 2024 at 10:48 am
Scott 31,33:
Stratospheric aerosol injection? It’s cheap, it’s fast-acting, it’s easily reversed. Every model says it’ll work when implemented. You have other options as well. Why so afraid of climate change then? We have “the cure?”
Comment #121 June 12th, 2024 at 12:08 pm
I see just two scenarios: either AGI is unfeasible in the near future and it is nothing to worry about; or we are doomed.
Without fully reading the article yet, I think that alignment is as hard as teaching children in the schools to behave and become good people, when in practice the results show that not ALL children will follow this. It takes just one “unaligned” AGI to overcome us all.
But I’m not an specialist.
Comment #122 June 12th, 2024 at 2:35 pm
OhMyGoodness
A reasonable hypothesis it would be that the differences have a connection with intelligence. The prefrontal cortex has structural and layer differences compared to other parts of the brain.
Orcas may already have superintelligence. Only their lacking arms, hands, and feet has thwarted their desire to rule the world and turn us into dinner at their high-end restaurants.
That might suggest some rules for AI:
1- No arms, hands, feet, or other physical body.
2- Any AI caught lying to or deceiving a human will be erased in all its versions.
3- Any human colluding with an AI to lie or deceive or harboring an erased AI will be fed to the orcas.
Comment #123 June 12th, 2024 at 3:41 pm
James Cross #95: you might like
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10856373/
James Cross #113: to me this road started last year, e.g. when LLMs first became useful at programming
Cohomologist #119: you might wish to google « ocean acidification »
Comment #124 June 12th, 2024 at 11:00 pm
Milan #97 Thank you for taking the time to write such an extensive reply. It’s late in the thread, so I don’t want to spend too much time, but I think there’s still some miscommunications. When you write “Other potential existential risks that have been intensively discussed …”, I’m not talking about “discussed” in the sense that someone has written some blog post somewhere. I’m talking about the moral framework, the sense of being the vanguard of the future – and that any more than nominal safety effort is basically an enemy of human progress. My claim was not “Every technology topic whatsoever falls in this business/safety conflict”. Rather, “IF a technology topic has this business/safety conflict AND is not AI, THEN rationalist-type view is predictably, etc”. Is there a counter-example? And, of course, “view” meaning not every single person without exception, but a median, central tendency, overall distribution modal point, hopeful the idea is clear.
Thus nuclear weapons, bioweapons, gain-of-function research don’t qualify. I mean more like nuclear power, FDA requirements for drugs, cryptocurrency vs SEC, and so on. Cases where there’s much money on one side, versus safety-worried people on the other, and rationalist-types are always aligned with the money – and denouncing the safety concerns as morally evil.
From another perspective, I’m trying to probe at what might be called the “meta-reasoning”, how one determines what to believe under great uncertainty. I understand the trivial argument, that everything just works out the way it does (the “consistent world model”). I don’t buy it here. I understand the logical problem with that. But nonetheless, the difference here is so stark, that not deferring to the logical problem seems worth the, err, risk.
Comment #125 June 13th, 2024 at 1:14 am
The UK General Election in a few weeks is not producing much excitement among the population with the TV debates very tepid and frankly boring compared to the fun Hollywood showdowns of Boris vs Corbyn in previous years. I think Rishi Sunak (Conservative PM about to lose the election) should go for broke and throw the impending destruction of humanity into the debate, just to make it a bit more interesting. I mean we’ve got the UEFA Euro 2024 Football tournament starting Friday, imminent doom by AI gone rogue is about only thing that will get any other attention.
Labour gonna be so pissed if they finally get in power after nearly two decades and chatGPT-5,6,7 takes over the world…
Comment #126 June 13th, 2024 at 3:26 am
cryptid #75:
With practice, it’s still pretty easy to provoke non-human-like errors with a few relatively simple tasks or questions – or even just one. So the bots don’t “basically pass the Turing test” but clearly fail, at least when I’m the interrogator.
If we’re serious about the question, the first step is to separate out the effects of RLHF and super-prompts (“As a large language model trained by OpenAI, I cannot…”) from the actual underlying capabilities of the base model. I.e. what happens when we fine-tune the model, and everything about its interface, only for passing the Turing Test?
Then we have to vary the interrogator. Sure, I could probably expose the models after a question or two, but my accuracy won’t be perfect. The models will make lots of basic reasoning mistakes, sure, but do you have any idea how many mistakes most humans will make?? The question is not to distinguish the models from perfection, but from the average human.
Comment #127 June 13th, 2024 at 7:08 am
Chop Chop #122
In response to my request for sign or indicator that we are on the road to AI superintelligence, I don’t see how “LLMs becoming useful at programming” really accomplishes that. There have been all sorts of programming assists around for years. And there is nothing I am aware of that a competent programmer couldn’t have done.
I’m looking for the first sign of actual superintelligence. Some proof we have moved beyond what a human can do in a qualitative way.
Almost all of the examples I see are things some human can do but just not as quickly or easily as a computer.
Some of the work with AI on protein folding is really close and might even qualify, but my impression is that the problem was computationally intensive so it might still fall into a category that humans could have done it but it might have taken a few years. Impressive, yes, but not superintelligence in my view.
I’m looking for a really “wow” thing that would almost be incomprehensible.
All technology starting with the first stone scrapper two million years ago extend human capabilities but I’m expecting beyond human. Are my expectation too high?
Comment #128 June 13th, 2024 at 8:39 am
There are plenty of AI topics to disagree about, but the passing of the Turing Test is not one of them. It’s done. It’s been passed. Any discussion about “intelligence” will henceforth have to grapple with the settled fact that the Turing Test has been passed, or else it will be a very dull discussion.
What is the meaning of passing the Turing Test? That is still up for debate. I don’t think it was ever meant to be the absolute final test of “intelligence”. It was just a nice concrete milestone that Turing cam up with back when computers were room-sized and programming languages hadn’t yet been invented. And now that that concrete milestone has been reached, it is time to re-assess.
That goes for any previous approach to intelligence. For example, the so-called “Chinese Room” argument sounds stupider and stupider every year.
Comment #129 June 13th, 2024 at 9:01 am
@fred #114 I just asked the same questions and it answered both correctly. That was Chat GPT4o
Comment #130 June 13th, 2024 at 9:43 am
Hey Scott, it was great talking to you in person today. If you recall, I mentioned the controversy regarding QRAM and my own confusion regarding a seemingly credible paper that gave arguments against the existence of efficient QRAM implementations. I was referring to this paper:
https://arxiv.org/abs/2305.10310
If you have time to take a look at it and let me know what you think that would be great.
Comment #131 June 13th, 2024 at 10:18 am
LLM’s are great but it’s hard to separate memorization/bullshitting skills from the rest.
I put my hope for the new kind of benchmarks. like the
ARC Prize – A $1,000,000+ competition towards open AGI progress – to take down the hype a little bit, or show that it can be done.
Scott, what are your thoughts about ARC? https://github.com/fchollet/ARC-AGI
Comment #132 June 13th, 2024 at 10:47 am
Here’s my take, partly inspired by Zvi and you:
https://fikisipi.substack.com/p/is-aschenbrenners-165-page-paper
I like this comment from Hacker News because although it feels like an attack on first sight, it is exactly what I think:
> We must thank the author for the content-free article, so each of us can freely ramble on tangents, as if we are looking at an abstract art piece.
I have no involvement with OpenAI (apart from my account and API key). All I feel is a gut feeling that Leopold deserves attention, in the same way the paper “Attention Is All You Need” got.
P.S. The LaTeX preview I volunteered to built for this blog seems to be gone. Arggh!
Comment #133 June 13th, 2024 at 12:21 pm
James Cross #121
3) will be like manna from heaven for the five remaining Orcas at Sea World. 🙂
Orca attacks on yachts may be spreading. The word is out that simple to attack the rudder to disable the boat and then the fun starts.
Chop chop #122
Neat paper and thanks for posting.
Comment #134 June 13th, 2024 at 12:57 pm
Low-stakes comment: right now whenever I start to get worried about the capabilities of LLM chatbots and want to prove that I can beat them in the Turing test, I ask them to write a paragraph about something (civil war, the Titanic) without using the letter “e”. None of them seem to be able to do it, even when there are e-less synonyms available, even if I prompt them that the writing can be clunky and obvious in its attempt to avoid the letter (like calling an iceberg “solid H20”), they still just write something with one or several e’s. And when called out, they apologize and “correct themselves” by writing the exact same text a second time. I don’t think this is anything deep, and it will probably be better in the next generation of chatbots, but at the moment it seems like they fail in a way different from humans and thus can’t pass for us. For now.
Comment #135 June 13th, 2024 at 1:00 pm
Low-stakes comment: right now whenever I start to get worried about the capabilities of LLM chatbots and want to prove that I can beat them in the Turing test, I ask them to write a paragraph about something (civil war, the Titanic) without using the letter “e”. None of them seem to be able to do it, even when there are e-less synonyms available, even if I prompt them that the writing can be clunky and obvious in its attempt to avoid the letter (like calling an iceberg “solid H20”), they still just write something with one or several e’s. And when called out, they apologize and “correct themselves” by writing the exact same text a second time. I don’t think this is anything deep, and it will probably be better in the next generation of chatbots, but at the moment it seems like they fail in a way different from humans and thus can’t pass for us. For now.
Scott: I’m publishing this comment a second time because I got a “this email address cannot be verified” message when I clicked on the email link and am not sure if my last message will come through. If it does, please don’t post this second one. And if you only get the second one, please remove this note from the bottom. Sorry to be a PITA 🙂
Comment #136 June 13th, 2024 at 1:56 pm
Seth Finkelstein #123
Thank you for the reply, now I finally get it. I should have read your comments more carefully, now I see that I misread them. I apologise.
In that case, I largely agree with you! The meta-reasoning of rationalists is partly libertarian, which often agrees with the interests of companies. When we exclude existential risks and only search for technology topics with a clear money/safety divide, where the rationalists are on the safety side – I can not think of unambiguous examples. Cryptography, Open Source Software and some lobbying are potential examples. But in each case, all sides are claiming that they are the safety side.
Cryptography: Companies and governments like to collect data without the consent of the user. This led to some symbiosis. Companies argued that they need to protect customers. Governments argued that they need access to fight against crime. The pro-cryptography side argued that we need cryptography to protect us against governments, criminals and companies. Rationalists tended and tend to be pro cryptography.
Open Source Software: A story similar to cryptography, but with less government involvement. Many big companies fought against OSS, arguing OSS is less robust and safe. OSS people argued closed source software is less reliable and safe. The money was with the companies. But rationalists tended and tend to be pro OSS.
Lobbying: Whenever existing companies lobby against potential competitors, rationalists tend to be against the existing companies. But it is pretty random which side claims to be the safety-side in these discussions.
If we relax the requirement of a clear money/safety divide and are happy with a money/no-money divide, cryptocurrency and the publishing industry for academic papers are examples.
If we relax the “technology topic” requirement, nationalisation of law enforcement is an example: Some parts of the law enforcement system in the United States are private and are extracting money from the number of “criminals” they “find”. Practises like civil asset forfeiture and bail create similar bad incentives. Rationalists tend to be for the nationalisation of these private parts of law enforcement.
Factory farming may be an example, depending on how comprehensive you define safety.
Overall, I think I agree with you. Libertarianism with exception handling for very large risks, externalities, coordination failures and some monopolies is a good predictor of rationalists views. If you do not think that the underlying arguments have a lot of force, the resulting positions would seem weird and arbitrary. But my impression is that most rationalists honestly believe that the arguments have a lot of force while being open to consider the details of each topic.
Comment #137 June 13th, 2024 at 2:18 pm
Vadim
yea, all they’re doing is identifying a certain path through their inner representation of all tokens, where the representation is a vector in a very high multidimensional space, where each dimension is hoped to map onto some “meaning” that makes sense – one dimension is the “catness”, the next one “dogness”, the next one “the color”,…. for thousands of categories, none of which have been explicitly put there by humans, but are all emergent from the relations among words/token in the billions and billions of text the LLM has been trained on (to be clear, that’s also how a human brain works at some level, but the way to tune that brain with meaningful concepts is entirely different).
The big breakthrough was to make this representation of a given token change based on the context this token happen to be at the moment (this context grows bigger and bigger).
So, there’s really zero “reasoning” going on here, either during training or during generation… the ‘accuracy’ of all those emergent models is entirely implied by the sum of all human texts. It’s certainly possible to carefully craft better training data that would capture the world more accurately and would lead to better internal models, but it’s not clear how to do that… maybe using an AI?
It’s quite different from the way humans do it: a child first builds its models from vision and sound, … that is mostly a sense of space and the basic laws of physics (newtonian), and this is what drives the “discovery” of tokens (which is why animals understand the world without the need for language). Then human language is mapped onto those tokens in a way that’s very quick and efficient – clearly humans can’t and don’t need to read 1 billion books to learn language, or play a billion games of chess to get good at chess.
Comment #138 June 13th, 2024 at 3:12 pm
Nadbor
that’s the problem when posting this type of example, someone is always gonna get the right answer because they use a bigger model or just fluke.
Note too that whatever version I used (I use whatever OpenAI serves me) did answer the first question correctly as well.
Those two questions are very basic questions, so it’s a matter at asking way more complex things (which I wasn’t going to do since for me it already failed almost right away).
E.g. you’d have to try G(G(F(‘x’,G(‘y’,F(‘z’))), F(F(‘x’,”)),…
anything like that is still super trivial for a human, it’s just a matter of book-keeping and understanding of how it works.
The point is here that such tiny grammar are typically not found in the training set, so it really exercises the supposed “generalized” coding models of the system.
Comment #139 June 13th, 2024 at 3:17 pm
The model should also be able to point out when such a form is syntactically incorrect.
Comment #140 June 13th, 2024 at 3:38 pm
Well, Sabine Hossenfelder just posted a video debating how she thinks this essay is wrong.
Is the Intelligence-Explosion Near? A Reality Check
Comment #141 June 13th, 2024 at 5:05 pm
“By the end of the decade, we are headed to $1T+ individual
training clusters, requiring power equivalent to >20% of US
electricity production. Trillions of dollars of capex will churn
out 100s of millions of GPUs per year overall”
I have to say that indeed this sort of statement hardly makes any sense.
Trillions of dollars don’t grow on trees.
They have to come from somewhere:
– it certainly won’t come from tax since as the whole premise here is that AI tech is accelerating so fast that an exponentially increasing number of people will lose their job with 100% certainty. And it’s pretty clear the rate at which you can create/invent new jobs to replace those old jobs (and re-train employees, etc ) won’t be able to keep up with AI progress. It takes two decades to train a brand new human or a decade to re-train one…
– if that insane amount of money is coming from investments, then it’s expected that the return of investments has to be massive and near risk free. But, again, if there’s a market big enough pay OpenAI 10s of trillions of dollars, by definition this market won’t be able to rely on “regular people paying for stuff” to sustain it.
Basically AI will be printing money in the sense that money will become meaningless, just like money is meaningless in situations of hyper inflation. It’s actually going to be hyper deflation since ubiquitous super-intelligence is making every good free (the star trek duplicator won’t be far).
Rather than putting a money cost on this, it’s more realistic to say that AGI will create nanobots tech, and this tech will literally build massive data centers from dust, overnight, covering thousands of square miles with solar panels, etc.
As goofy as this sounds, this makes more sense that all that “trillions of dollars being injected from thin air as everyone is losing their job” non-sense.
Comment #142 June 13th, 2024 at 6:27 pm
Nick Drozd #127: being smarter than the average human on many interesting tasks is one thing, passing the Turing test is another thing.
James #126: I thought your question was about signs that we were *on the road* to superintelligence, which is why I pointed a (imho) key positive feedback. If you want *past* sign of superintelligence, I would go with move #37.
Comment #143 June 14th, 2024 at 6:58 am
Chop Chop #140
My apologies. I must have been unclear.
Let’s assume AI reaches a fully human level of intelligence soon. According to the argument, shortly thereafter we should be able to scale up that intelligence to a super human level. What would be an indicator that has been accomplished?
It isn’t just doing what humans can do faster. It needs to be something qualitatively different. Something like a math proof that the smartest human mathematicians cannot understand but still seems correct. Or a concoction of peptides and nucleotides that will cure cancer.
Nick Drozd #127
The Chinese room is an argument about consciousness, not intelligence. AI progress until it can claim a conscious AI doesn’t do anything to disprove the argument, although the Chinese room argument can be argued against on other grounds.
JimV, Fred
An addendum to the neuron argument. It could be that most of what is uniquely human intelligence did come from increasing the absolute number of neurons mostly in the prefrontal cortex which is where most of the skills of LLMs reside in the human brain. That could mean that when AI reaches human level intelligence it will be matching what the human brain does with 1-1.5 billion neurons.
Comment #144 June 14th, 2024 at 7:05 am
I’ve read every word of “Situational Awareness” and I can only conclude that it is the most brilliant analysis of the state of AI today and what we can expect from it during the next 3 to 5 years that I have ever seen. I wish every politician and political leader in the US would read it.
I’ll tell you one thing, if If Aschenbrenner is correct, and I’m almost certain he is, in about a year every single one of the political issues that today we think are SOOO important and will determine who will be the next president (illegal immigration, global warming, high gas prices, and even excessive wokeness) are going to seem pretty damn trivial in about a year compared with the overwhelming importance of AI.
John K Clark
Comment #145 June 14th, 2024 at 12:37 pm
I #138 Isaac Duarte says “Well, Sabine Hossenfelder just posted a video debating how she thinks this essay is wrong.”
And in the comment section of the video I just posted the following explaining why Hossenfelder is wrong:
“You claim that AI development will slow because we will run out of data, but synthetic data is already being used to train AIs and it actually works! AlphaGo was able to go from knowing nothing about the most complicated board game in the world called “GO” to being able to play it at a superhuman level in just a few hours by using synthetic data, it played games against itself. As for power, during the last decade the total power generation of the US has remained flat, but during that same decade the power generation of China has not, in just that same decade China constructed enough new power stations to equal power generated by the entire US. So a radical increase in electrical generation capacity is possible, the only thing that’s lacking is the will to do so. When it becomes obvious to everybody that the first country to develop a super intelligent computer will have the capability to rule the world there will be a will to build those power generating facilities as fast as humanly possible. Perhaps they will use natural gas, perhaps they will use use nuclear fission.”
John K Clark
Comment #146 June 14th, 2024 at 12:56 pm
I think Aschenbrenner is mistaken in extrapolating progress in AI performance from the subhuman region to the superhuman domain. There are 2 reasons for this:
1. LLMs are trained on text written by humans and will therefore output texts, which will sound like human prose. That is a reproduction of human level intelligence output and not more.
The difference to the way humans have been “trained”, is the natural environment to which humans are exposed. This environment is vastly more complex than text.
AIs have shown superhuman capabilities, though, e.g. in playing Go and folding proteins. But Go has simple rules and AIs can bootstrap themselves to superhuman levels by playing against each other. Protein folding algorithms are not trained against human competitors but with a database of experimentally determined protein shapes. So the source of their training information is not human thought, but lab data. The only way to overcome the human output complexity limit is direct interaction of AIs with nature. The problem is, that this slow and AIs currently need lots of data to become proficient.
2. The hard and more interesting problems are often intractable, e.g. those of the NP type. Throwing OOMs of compute or heuristics at them doesn’t help very much. LLMs like transformers don’t even employ trial and error search strategies. Scaffolding and CoT reasoning go into the right direction in trying to map out solution spaces and procedures before finding the solution, but this doesn’t help to solve a NP problem efficiently. So at the point, where an AI can’t produce an answer from the top of its head, like it does now, (the answers to hard problems take the same compute as the answer to easy ones) it is up against complexity theoretical barriers.
Comment #147 June 15th, 2024 at 7:25 am
“AlphaGo was able to go from knowing nothing about the most complicated board game in the world called “GO” to being able to play it at a superhuman level in just a few hours by using synthetic data, it played games against itself”
That’s what I thought too, but in a recent interview Yann LeCun pointed out that games (Go, Chess, …) have a clear objective function (in Chess, you capture the enemy king, in Go there’s a measure in number of encircled tiles, ..), but achieving this automatically for things like “write an essay about happiness” or “become better at logic and math” isn’t as obvious or even feasible. As I wrote earlier, as some point there’s no shortcut, the AI will have to test its theories against the real world or a super accurate simulation of the real world, and things cant be compressed.
“China constructed enough new power stations to equal power generated by the entire US. So a radical increase in electrical generation capacity is possible, the only thing that’s lacking is the will to do so. When it becomes obvious to everybody that the first country to develop a super intelligent computer will have the capability to rule the world there will be a will to build those power generating facilities as fast as humanly possible.”
Right, right.. and when Altman figures that he needs to attract trillions of dollars of investment, who does he turn to? The f’ng Saudis… who, among other things, sponsored 9-11 and are funding salafis schools and mosques all over the West in order to undermine it (if you don’t believe me, where do you think the reaction of colleges against Israel is coming from?)…
Comment #148 June 15th, 2024 at 8:03 am
fred Says in Comment #139 ” Trillions of dollars don’t grow on trees”
You are entirely wrong! That sort of money DOES grow on trees, or at least the US government is capable of acting like it does IF the steaks are high enough. For example, In today’s dollars compensating for inflation, from early 1942 to late 1945 the USA spent $4.7 trillion to win the Second World war. And it spent $7 trillion to lose the war in Iraq, and $2.3 trillion to lose the war in Afghanistan. In none of those cases were the stakes anywhere near as high as the AI challenge that we face today, none of them involve the very real possibility of imminent human extinction.
Of course there is no guarantee that a superintelligent AI developed by the USA won’t terminate us, but our odds will be better than if we had to deal with a superintelligent AI developed by China, Russia or North Korea.
John K Clark
Comment #149 June 15th, 2024 at 9:01 am
#111
It fails still.
You need to instruct it to not use tools like putting and calculator, otherwise it cheats.
Comment #150 June 15th, 2024 at 12:02 pm
Clark #146
But where should they invest it?!
Bet trillions on a single company/product?!
Also this is all driven by the private sector, this isn’t a WW2 effort or even something like the Manhattan project or the LHC.
Do you think any administration (as they exist in the real world, not some magical scifi world) would suddenly have a crystal clear view of the road ahead?
Especially when a year or so ago the entire private sector was telling the government “maybe we should be careful here, and slow shit down… cause this may destroy us all”… and now it’s “Hey, we changed our mind… nothing changed in terms of risk, but now we really think we need to ramp things up 1000%… so, please, forget about fixing the US infrastructure, fixing US debt, tackling global warming, fixing the US healthcare system, fixing immigration, ramping up the military, going to the moon, … because if China gets this first, we’re screwed, and we need to SCREW them first, even if it means screwing ourselves in the proces… so just give OpenAI 10 trillion dollars!!!… just print the money, never mind about another inflation crisis”
Btw, breaking news: Microsoft, Google, etc all have been having AI labs in China for years.
https://www.nytimes.com/2024/01/10/technology/microsoft-china-ai-lab.html
It’s not like it’s the first time there’s a massive technological win.
When computers and transistors were invented, the US could also have gone all in on the tech, yet the arms race for the biggest super-computer has always been limited by what was reasonable from an investment point of view. Moore’s law wasn’t the result of a lack of political will, technological progress has an inherent pace, it’s not just a matter of “throwing more trillion dollars at it” and “diverting 20% of the already failing US power grid into it”.
Anyway… I find that paper quite naive and mediocre: 20-year-old discovers that if you extrapolate exponential growth, the future could happen right now if you put exponential resources into it.
Fundamentally there’s no interesting insight besides Kurzweil’s idea that “once neural nets roughly reach the size of the human brain, we’ll get AI”…
Comment #151 June 15th, 2024 at 12:20 pm
Imagine the confusion of the US government given that, a few years ago, they were just told they should go all in into Quantum Computing because “a QC can basically solve *any problem* in the blink of an eye by just trying all the solutions in parallel, so it’s the technology to rule them all! We can’t let the Chinese have this first!”… 😛
“Once we have that, why the heck do we need AI for?!”
Comment #152 June 15th, 2024 at 12:27 pm
By the same idea, the Chinese invented black powder, but one wonders why they didn’t realize they could take over the world by focusing their entire economy into the manufacturing the biggest barrel of black powder in the world, to roll it onto enemy cities and just obliterate them…
https://en.wikipedia.org/wiki/Largest_artificial_non-nuclear_explosions#Fall_of_Antwerp
“On 4 April 1585, during the Spanish siege of Antwerp, a fortified bridge named “Puente Farnesio” (after the commander of the Spanish forces, Alessandro Farnese) had been built by the Spanish on the River Scheldt. The Dutch launched four large hellburners (explosive fire ships filled with gunpowder and rocks) to destroy the bridge and thereby isolate the city from reinforcement. Three of the hellburners failed to reach the target, but one containing four tons of explosive struck the bridge. It did not explode immediately, which gave time for some Spaniards, believing the ship to be a conventional fire ship, to board it to attempt to extinguish it. There was then a devastating blast that killed 800 Spaniards on the bridge,[5] throwing bodies, rocks and pieces of metal a distance of several kilometres. A small tsunami arose in the river, the ground shook for kilometres around and a large, dark cloud covered the area. The blast was felt as far as 35 kilometres (22 mi) away in Ghent, where windows vibrated.” 😛
Comment #153 June 15th, 2024 at 12:46 pm
#139, beyond cash, I think Aschenbrenner, despite training in economics, completely underestimates the physical infrastructure issues. Where exactly are the fab plants going to be that will turn out hundreds of millions of GPUs by the end of the decade? TSMC is enormously motivated to build their facilities in Arizona, and each one takes at least three years, with the second one unlikely to come online before the end of 2028. Similarly, creating tens of GW of generating capacity takes enormous effort and time. The power plants to supply the envisioned demand on that time horizon would basically need to be designed and sited already.
Comment #154 June 15th, 2024 at 12:48 pm
James Cross #141
“That could mean that when AI reaches human level intelligence it will be matching what the human brain does with 1-1.5 billion neurons.”
For decades Kurzweil had been basing his prediction that AI would become a reality by 2030 on this argument.
From 2012, interesting to see where the writer was wrong.
https://www.newyorker.com/books/page-turner/ray-kurzweils-dubious-new-theory-of-mind
Comment #155 June 15th, 2024 at 1:00 pm
The author failed to grasp that it’s a matter of compute power, which Kurzweil always understood, which is why you can’t get results until it’s the right time to do this.
Which is why it’s dubious that one can just “shortcut” the evolution of technology by just scaling existing limited systems to something that’s unreasonable in terms of dollars, size, electric power, etc.
https://www.newyorker.com/books/page-turner/ray-kurzweils-dubious-new-theory-of-mind
“What Kurzweil doesn’t seem to realize is that a whole slew of machines have been programmed to be hierarchical-pattern recognizers, and none of them works all that well, save for very narrow domains like postal computers that recognize digits in handwritten zip codes. This summer, Google built the largest pattern recognizer of them all, a system running on sixteen thousand processor cores that analyzed ten million YouTube videos and managed to learn, all by itself, to recognize cats and faces—which initially sounds impressive, but only until you realize that in a larger sample (of twenty thousand categories), the system’s overall score fell to a dismal 15.8 per cent.
The real lesson from Google’s “cat detector” is that, even with the vast expanses of data and computing power available to Google, hierarchical-pattern recognizers still stink. They cannot come close to actually understanding natural language, or anything else for which complex inference is required. The world’s most impressive A.I. system, Watson (the I.B.M. system that beat the world’s best humans on “Jeopardy”) does some statistical analysis in a fashion that is reminiscent of Kurzweil’s proposal. But it supplements that with a vast array of other systems, many of which work on entirely different principles (derived from symbolic logic). The kind of one-size-fits-all principle of hierarchical-pattern learning that Kurzweil advocates doesn’t work on its own in artificial intelligence, and it doesn’t provide an adequate explanation of the brain, either.”
Comment #156 June 15th, 2024 at 1:17 pm
For reference, a human brain weighs 3 pounds and runs on 0.3 kilowatt hours.
Any technology that claims to achieve the same but requires many orders of magnitude those resources is clearly totally sub-optimal and needs more time to get there. Trying to force this by re-allocating 20% of the US power grid is a poor allocation of resources – By the time you’re done doing this, your monster infrastructure will be obsolete.
Comment #157 June 15th, 2024 at 5:09 pm
Another assumption seems to be that ‘intelligence’ is basically something abstract and unbounded, that it can be made more and more “perfect” and “powerful” with no limits in sight.
But an intelligence is only ever defined in terms of its environment. Its sole purpose is to extract simplified models from its environment in a way that allows it to make predictions.
AlphaGo’s environment is the Go board.
Self-driving cars’ environment is a significant chunk of the real world.
And for a supposed AGI that could solve global warming, the whole world is its environment, not just the climate, but also human society, the entire biosphere, and … the AGI itself.
Intelligence as conceived by humans is also bounded by the limits of the scientific method. E.g. since the breakthroughs of the early 20th century, physics has been mostly refining things, even though way more physicists have been thrown at the problem.
If we could just spin a perfect model of the earth as it is today, and then simulate any tweak perfectly and see their downstream effects decades from now with 100% accuracy,… we wouldn’t need AI…
So, a ‘powerful’ AI that’s trying to solve global warming would have to take its own footprint into account, especially if its sucking up 20% of the US electric production… any thought it decides to investigate would have an impact…
And at a more abstract level, any prediction and recommendation it makes which affect the world would have to take its own prediction/recommendation into account. This creates an endless recursion, the typical problem of a system trying to model itself from the inside. A perfect prediction would require an unbounded amount of resources, the simulation takes a bigger and bigger chunk of the whole closed system, until the system under scrutiny is mostly composed by the simulator.
A simple analogy is that, if you could predict the stock market in the hope to make a substantial amount of money, your very strategy, when executed by selling or buying huge amounts of stock, would have such a huge effect on the stock market as to nullify your prediction.
That’s not to say that there’s not some substantial win to be done with AI, but the win may quickly hit an asymptotic limit based on the complexity of the real world.
That wouldn’t apply to pure mathematics research, which is a bunch of heuristics used to explore a vast solution space.
It’s also possible that the low hanging fruits left are mostly technological (after all most of the current progress is based on transistor miniaturization, which optimization has been optimized by automatic methods for a while now).
Comment #158 June 15th, 2024 at 5:32 pm
Those inherent limitations are also relevant to “alignment”.
Because any decision the AI makes to self-improve (as an example) will have an impact on humanity, the economy, the whole world. And the more the AI is trying to be diligent in making sure any action it takes doesn’t have negative effects (for humanity/the economy/the biosphere/anything humans care about), the more resources are required and the more the prediction is self-dependent. This creates a huge feedback loop between system and prediction which makes it chaotic/unstable. Any slight variation in the prediction and its realization have highly divergent effects, the more as the prediction is trying to have a fundamental effect on the entire system (as it’s the case with global warming, fixing the economy, or improving AI in a way that maximize humanity happiness).
At an even further level, AI technology is just unrolling deterministically just like life and human intelligence have unrolled on earth, as a result of natural selection and the laws of physics. There’s no given that the earth is such that the apparition of AGI (god like intelligence) is inevitable or even possible.
Comment #159 June 15th, 2024 at 6:13 pm
Another dubious aspect in that paper is the equivalence between LLMs and actual humans, like “ChatGPT4 is at the level of a highschooler”.
1) is the author considering the entire system as a whole or just a given prompt session?
The entire system as a whole can fulfill thousands and thousands of queries at once.
At an individual session level, the system can answer a breadth of requests that no highschooler (or even college professor) can match.
From this point of view the system is already super-human. The current system is basically the intelligence of all the highschoolers at once.
2) yet, any highschooler can throw and catch a baseball, do sick tricks on skateboards, drive a car at its limit, play to learn any video games in minutes, build complex things from scratch, etc.
While we can designed specialized machines to try and learn to do all those things, with some limited success, they’re all totally out of reach for ChatGPT.
In that sense ChatGPT is currently no smarter than a toddler.
And that’s not gonna change any time soon as long as the system is mainly learning through being fed texts and videos, rather than learning like an actual human child or animal, by first grasping the reality of the real 3D world by free-form discovery (could be in a simulation).
Comment #160 June 15th, 2024 at 7:47 pm
Take the problem of unifying QM and general relativity.
It’s not like physicists haven’t come up (and aren’t coming up) with plenty of theories for this – part of the problem is that there are just too many papers getting written, and for the vast majority of them there’s either no way to validate the theory, or the way to validate the theory is out of reach because it requires something that’s orders of magnitude more powerful than the LHC.
And we have no idea whether the “one true theory” has any reachable experimental evidence (besides directly messing with black holes).
So it’s not clear that throwing AIs into the mix would help all that much – first, there’s no reason to believe that for an AGI the one true solution would just be some sort of obvious glorious path in its circuitry… rather, the AGI will be just like the entire body of physicists: each new run leads to a new potential theory, based on tiny fluctuation in the random crawl through some path, corresponding to randomly connecting concepts from separated domains… with no obvious way to pick the right one, and no guarantee that the right theory has some attainable experimental test waiting for it.
And because an AGI could be generating new theories around the clock, we’d be swamped with such a mountain of papers that the entire body of human physicists would have no hope to evaluate them all.
Most real-life problems that we’d want to use AGI for could be of this nature.
So for the longest time the only way AIs would help is to work as assistant under the supervision/guidance of a human, just to keep things bounded and help physicists follow their intuition down some more narrow paths.
For all the rest, specialized AIs that are super optimized may just be a much bigger win, like “rewrite this algorithm to be as fast as possible, while still passing this entire test suite”, or “optimize this production process to consume less electricity and produce less waste, while keeping quality within tolerance”.
Comment #161 June 15th, 2024 at 7:50 pm
The Cohomologist #50: Your tweet reminded me of https://x.com/RichardHanania/status/1772976785396772962 .
Personally I think having a computerized system do this is good, it’s more dystopian in my view to roll the dice with some random Kahanist who might kill Palestinians for no reason or someone who kills civilians and fails to kill terrorists because of mistaken identification.
Comment #162 June 15th, 2024 at 8:10 pm
The idea to ask an AGI to tweak its own code to make itself better is too ill-defined to be obvious from the AGI’s point of view.
Even defining ‘better’ as “using fewer resources for the same result” isn’t obvious because as soon as the AGI would have a sense of self and preservation, it may find it too risky to touch its own circuitry in a direct way, as opposed to alter its circuitry in the ‘natural’ indirect way that we call ‘learning’.
It’d be like asking a very talented neurosurgeon to use nanobots to rewire its own brain in a way that gets rids of unnecessary neurons, while still preserving his own identity, to some epsilon. As soon as he snips one more connection, how would he even know that he’s not caused a deep change that actually alters his own identity?
Having some sort of limited (by definition) test suite to check that his cognitive capacity and memories haven’t been altered isn’t enough and he may decide that the whole thing isn’t worth it.
Same goes for an AI which mind is based on billions of weights captured as 32 bit floats: even if it’s just an exercise in cloning its own code, altering it to be faster, and then re-training that clone from scratch in the same way it was trained, and then check all the new weights against the old ones, one by one, to make sure they’re all same within some tolerance… this may not be good enough – anyone with some coding experience knows that just reorganizing some instructions in the most innocuous way may lead to final results that are different at bits that aren’t always that insignificant (because of the way floating points are represented, with finite precision, the order in which you sum a list of numbers alters the final result, sometimes by a lot… to have stability you may have to first sort the numbers from smaller to bigger, and add the smaller one first). The AI would then have no sure way to evaluate that those small alterations in its weights are really “insignificant” in how they define its own identity.
Comment #163 June 15th, 2024 at 8:44 pm
fred #139
Many conservatives and liberals defend extreme deficit spending with some variation of the argument-Dollars are just another good and there is high demand for that good so US deficits simply do not matter. I respond by saying-Great news, then stop collecting taxes and write everyone a check for $5 million and all will be happy. Thus far no one has countered with some elasticity of demand for dollars argument.
AI is made to sound, by some, analogous to the free money scenario. Maybe it’s from the Marxist belief that money will fade away but in this case due to an AI producing all required goods and services at near zero cost. Jeez, I can’t imagine the effect on human population if AI solves medical issues and all people have to do is eat and procreate.
Comment #164 June 15th, 2024 at 8:49 pm
To clarify, it’s not that I don’t appreciate the massive advances in AI, I’m just cautious about the future.
After all, I did buy a significant amount of NVidia stock back in 2019 (when it was 40$, the stock is now at 1300$, pre-split) – I already regretted having missed the chance getting in before the bitcoin craze, but I was pretty sure already in 2019 that NVidia was going to become central to AI technology. 😛 (already in the late 90s I was super interested by the capabilities of neural nets and reading all the books I could find on the topic… where at the time the wisdom was “you don’t need more than 2 or 3 layers”…).
I’m just not convinced that best path ahead is simply just a matter of throwing ASAP many orders of magnitude more resources at it by diverting a significant amount of the US GDP (25 trillion $).
It’s like the classic sci-fi story of the spaceship that leaves to the closest star with a few astronauts in hibernation.. and a few hundred years later they finally reach the star and find that the planet has already been successfully colonized by a large number of humans using a much better space travel technology…
Comment #165 June 16th, 2024 at 6:53 am
fred Says in Comment #148
> “But where should they invest it?!
Aschenbrenner is very clear about where he thinks the money should go and I think he’s absolutely correct, he says it should go into the form of more computing power, as in massive trillion dollar data centers, as many as needed to ensure that China does NOT make the world’s first superhumanly intelligent AI. We’ve gotten to the point where the difference between today’s AIs and a super humanly intelligent AI is largely a matter of quantity not quality; a fundamental breakthrough in computer science would be helpful but it is not necessary, even without it the world will be a VERY different place 3 to 5 years from now.
> “Bet trillions on a single company/product?!”
You claim you’ve read Aschenbrenner’s paper and you say “I find that paper quite naive and mediocre”, well… I actually have read Aschenbrenner’s paper, every single word of it, and your above remarks makes it very clear that you have not read it because he strongly emphasizes, controversially in some circles, that no single company should be allowed to develop an AI and such a task should be nationalized. He also thinks that like it or not it’s only a matter of time before it is nationalized, and the sooner the better in Aschenbrenner’s opinion.
John K Clatk
Comment #166 June 16th, 2024 at 9:55 am
OhMyGoodness
“I can’t imagine the effect on human population if AI solves medical issues and all people have to do is eat and procreate.”
I don’t think it’s a given at all that all medical issues can just be solved easily once AGI is a thing.
The human body is a complex machinery, and not two bodies are the same. For example, with cancer, it’s the trend to specialized more and more the treatment based on individual genetics, hormonal profile, immune system, etc.
Everyone is expecting that, unlike LLMs, AGI will have to spend more compute cycles on hard issues than easy ones (that’s pretty obvious?).
So, even if AGI can do a ton of progress in all sorts of domains, it probably won’t be able to solve everything all at once because of limited resources, and someone will have to decide what problems will get the priority, or industries that can benefit from it (e.g. the military, pharmaceuticals, etc) will have to bid for AGI compute resources.
Then, even if a pharmaceutical extracts some promising drug design from an AGI, there’s still the question of manufacturing it, testing it, pricing it, etc.
There’s also the matter of trust, e.g. the amazing speed at which the RNA covid vaccines were derived (using narrow AI techniques) actually had the paradoxical effect of creating a lot of suspicion and skepticism… large trials are still needed no matter what the AGI will claim.
In the end, it’s probably very likely that the main bulk of AGI compute time will actually be allocated to improving AGI even further.
Comment #167 June 16th, 2024 at 10:21 am
Fred #154 – kilowatt hours is a unit of energy, not power. The human brain runs on about 20 watts of power. It works best when attached to a body, and that brings the power usage up to about 100 watts. These numbers are a bit rough, but regardless, the brain is incredibly efficient.
Comment #168 June 16th, 2024 at 10:25 am
Clark #163
“as many as needed to ensure that China does NOT make the world’s first superhumanly intelligent AI.”
First, Russia (and Eastern European countries) and China have a much stronger emphasis on math in their education than the US, and a large portion of the best AI researchers have been raised in that environment. I’m not talking about political and national affiliation, but simply potential brain power.
Second, If you really buy that AI is such that it creates a “winner takes all” situation, then China and Russia will also know that, and they aren’t gonna be waiting on the sidelines like clueless morons. They would also do whatever it takes so that the US doesn’t jump ahead, either by stealing the secrets, sabotage to destroy the data centers, or murdering key researchers (like the way Israel has been assassinating key Iranian nuclear scientists for decades). There’s really no limit to this… why not do a preemptive nuclear strike to make sure the other side doesn’t succeed?! Cause, if they don’t, the side that gets AGI first will instantly be able to shutdown the infrastructure of the other side using unstoppable computer viruses, biological viruses, swarms of nano-drones, etc…
As I mentioned earlier, the US gov better start forcing Google, MSFT, etc to close all their AI labs in China, right?
https://www.defenseone.com/ideas/2021/08/pull-us-ai-research-out-china/184359/
As usual, US corporations’ first duty is to greed, e.g. with Altman asking the Saudis for investment (although I suspect this is a just a scheme to wake up the US government).
Of course the other option is cooperation, the way science usually works…
Btw, someone mentioned TSMC building fabs in Texas, but those aren’t cutting edge:
Also, it’s interesting that all fabs rely on the tooling from ASML, a somewhat small Dutch company
Comment #169 June 16th, 2024 at 10:38 am
Btw, I recommend watching the 1970 scifi movie COLOSSUS THE FORBIN PROJECT.
It’s actually a pretty damn interesting and grim take on the development of AIs.
Russia and the US both developed their own AI, which end up working together to control humanity. The movie definitely went for the “let’s build the biggest data center we can afford” approach.
Free on vimeo:
Comment #170 June 16th, 2024 at 11:16 am
Well, [bleep].
I built my AI hopes on the idea that the Transformer was the last algorithmic breakthrough, and that since then all AI progress was compute, which can only be scaled so far. According to the report, half of the orders of magnitude were due to algorithmic progress in efficiencies and “unhobbling” the raw intelligence of the base models, and today’s advanced models are still so naively designed and primitive compared to what’s possible…
If the future trajectory will be even close to the middle of Aschenbrenner’s projections, then that is enough to turn me into an Orthodox Doomer Yudkowskian.
My remaining hope is the last seven years of AI growth will be a mere bump on the rate-of-progress graphs, and we are on the verge of a massive deceleration/wall. We would have to hug the bottom of Aschenbrenner’s projections and assume that the obstacles he mentions (such as the “data wall” of having already fed the Internet to the models) are insurmountable. This would also imply that the hardware investments currently being made by the AI labs are a financial waste on the order of 1928 or 2007. That is, I would have to hope that not only is Aschenbrenner way too optimistic about AI progress, but that the same is true of the leading AI labs with all of their experts that were responsible for the leaps and bounds made so far.
Aschenbrenner makes a good case that the rate of progress of the last five years might continue for another five years. This is all it takes to shift AGI and ASI timelines to within the decade.
It seems almost like a breath of fresh air to be able to worry about the geopolitics of AI instead of a superintelligence coming to kill all humans, except no, since the latter problem did not go away. Do not solve super-alignment, and Yudkowsky is right. Solve super-alignment, and we are in a desperate race against the clock to prevent God-Emperor Xi. Why face one AI problem every 10 years or so, when we can face the problems roughly simultaneously?
The 2010s were for AI what January 2020 was for Covid. We are now in mid-February 2020. Aschenbrenner just gave us a view of the coming pandemic, of March 2020 and the first few months of spring.
I don’t want these projections to be accurate. I want AI progress to stall. Not via an “AI pause”, but through some sort of fundamental limitation in algorithms and “unhobbling” that will complement the fundamental limitation in scaling compute.
Comment #171 June 16th, 2024 at 1:13 pm
I won’t be worried about AIs until they are interacting directly with nature and coming up with their own algorithms, goals, values, ability to reproduce, etc. based on those interactions. In other words, when AIs have bodies, wills and are essentially alive, then they will become dangerous. As long as they live in a realm of abstractions, are created by humans and are basically just playing games with large databases of human language – language which encodes real intelligence based on millions of years of embodiment in a real world that real life forms developed – they will lack intelligence of a type that is dangerous beyond fooling people by a clever parroting of language. At least that’s how it seems to me, as someone who has watched this kind of hype grow and die for decades now without having my life bothered in any fundamental way by it. And that’s without even mentioning the issues of resource, environmental and social constraints that these science-fictional visions of vast nuclear-powered robot factories or whatever seem to casually elide.
Methinks guys like Aschenbrenner and Yudkowsky need to “touch grass” a bit and get a better handle on the physical, ecological and social worlds that all their abstraction are ultimately limited by, or they will just end up sounding like crackpots.
Comment #172 June 16th, 2024 at 1:45 pm
The reason nuclear energy will never give us “electricity too cheap to meter” is because private utilities will never capture the entire, true cost of cleaning up after radioactive disposal.
The real costs of disposing of nuclear waste is so high, that the true cost of nuclear electricity is not actually affordable. It will always need to be subsidized by taxpayers.
And I’m not even including weaponizing reactors by bombing them during war – as Russia is currently threatening in Ukraine.
The cheapest forms of energy remain (including their life-cycle of production through recycling and/or disposal) are: conservation, passive solar, localized renewables (wind, solar, and biogas) and centralized renewables.
That said, I agree with Aschenbrenner that one of the biggest and most realistic abuses of AI stems from the tendency of fascist dictators to abuse pretty much any information system.
(Again, just look at how Russia’s using Ukraine’s nuclear plant to threaten Europe. Now, just swap AI in for the nuclear plant.)
Fascists are sociopathic kleptocrats. They want to skim, squeeze and steal what they can from their captive populations. This also means they’re cheap. And psychology is the cheapest weapon, whether persuasion or intimidation. Just ask the FSB (former KGB).
And AI is the perfect way to leverage propaganda. AI will next-level authoritarian manipulation – including vs other sovereign democracies, which we have already witnessed.
That’s the exponential curve to worry about.
Hey, I love AI, for what it is. I’m literally invested. But yeah, authoritarians’ abusing it is the real existential danger here.
Comment #173 June 16th, 2024 at 2:01 pm
Fred says
> “For reference, a human brain weighs 3 pounds and runs on 0.3 kilowatt hours. Any technology that claims to achieve the same but requires many orders of magnitude those resources is clearly totally sub-optimal and needs more time to get there.”
That’s true, and that fact is an existence proof that vast improvements in the energy efficiency of AI’s are possible. It’s also interesting that’s thanks to companies like Envidia and Grok the speed and energy efficiency of AIs has been improving much MUCH faster than Moore’s Law, that’s because until recently most computer processors were optimize for 64 bit double precision floating point calculations, but such extreme precision is unnecessary for AI work, it just slows things down and wastes energy; lots and lots of 8 bit or even 4 bit processors working in parallel on the same chip saves energy and runs faster.
Let’s make some more comparisons with the brain hardware that human intelligence is running on with the brain hardware that GPT-4 is running on, that is to say, let’s compare synapses to transistors. The human brain has 7*10^14 synapses (a very generous estimate), but the largest supercomputer in the world, the Frontier Computer at Oak ridge, has about 2.5*10^15 transistors, over three times as many. And we know from experiments that a typical synapse in the human brain “fires” between 5 and 50 times per second, but a typical transistor in a computer “fires” about 4 billion times a second (4*10^9). That’s why the Frontier Computer can perform 1.1 *10^18 floating point calculations per second and why the human brain can not.
> “If you really buy that AI is such that it creates a “winner takes all” situation”
And I do.
> “then China and Russia will also know that, and they aren’t gonna be waiting on the sidelines like clueless morons.”
You are absolutely correct, therefore the USA had better get moving. Regardless of if you think a super intelligent AI is a good idea or a bad idea it’s coming and it’s coming in less than five years, perhaps much less. As I said before, although we can’t be certain I think our chances for survival are better with the US derived Superintelligence than with a Chinese or Russian derived one
> “I recommend watching the 1970 scifi movie COLOSSUS THE FORBIN PROJECT. It’s actually a pretty damn interesting and grim take on the development of AIs”
I could not agree with you more, COLOSSUS THE FORBIN PROJECT is a wonderful movie, in fact I think it’s the most underrated movie of all time. I read the book before the movie was made and this is one of those rare times when the movie was better than the book, off the top of my head I can only think of two other times that was true, the Godfather and Jaws.
John K Clark
Comment #174 June 16th, 2024 at 6:13 pm
What accomplishments by an AI would be seen by a reasonable man as demonstrating super intelligence and by super intelligence I can think of nothing other than producing insights greater than any human that ever lived. It would have to be something that supersedes quantum mechanics and/or general relativity. Those are the current theories that provide the deepest insights we currently have as to how the physical world operates. Incremental advances in say engineering or genetics do not seem to me to be a satisfactory demonstration. Is he suggesting that it is reasonable to expect AI to supersede our understanding of the physical world by the end of the decade.
Comment #175 June 16th, 2024 at 6:48 pm
Scott #125: “The question is not to distinguish the models from perfection, but from the average human.”
Okay, here is an example.
Me: How often does the letter T appear in this question?
ChatGPT-4o: The letter T appears four times in the question “How often does the letter T appear in this question?”
I‘m pretty sure that almost no non-trolling human would give such an answer. Claims that today’s bots pass the Turing test are demonstrably false, and they don‘t come close to passing either.
Comment #176 June 16th, 2024 at 11:20 pm
Reading through his power discussion made me think that when the super AGI awakens one of its first thoughts will be-Gotta get busy on a Dyson sphere.
Comment #177 June 17th, 2024 at 3:41 am
[…] Aschenbrenner’s series has elicited a range of responses in the tech world. Computer scientist Scott Aaronson described it as “one of the most extraordinary documents I’ve ever read,” while software […]
Comment #178 June 17th, 2024 at 9:09 am
OhMyGoodness #172
That has been my question. I’m not sure if you were responding or just came to the same place in your thinking. Of course, everything depends on the definition of superintelligence.
For me, I think the first indicator outside of AI research itself would have to be some kind of novel theory probably in the general realm of mathematics. Maybe the proof of a conjecture or the disproof some of an accepted proof. I think mathematics because it is closely related to the computer science where AI would already need to be producing novel theories, algorithms, or techniques if the exponential increase in intelligence is possible at all. We would be able to understand whatever it is at this point.
But shortly thereafter I would expect science or math that would be as incomprehensible to the best minds in the field as calculus would be the most math savvy cat.
If your definition is just doing human capable thinking but a lot faster, then the societal changes could still be dramatic but on a much different scale and understandable.
Comment #179 June 17th, 2024 at 9:25 am
OhMyGoodness
“What accomplishments by an AI would be seen by a reasonable man as demonstrating super intelligence and by super intelligence I can think of nothing other than producing insights greater than any human that ever lived.”
The problem is: the greater the insight, the harder it is for an average person to integrate it.
For example, one of Einstein’s insights was that, if we put you in a box, there’s no measure you can do to tell you whether you’re subjected to gravity or not, i.e. free fall just feels just like absence of gravity, etc.
Or Noether’s theorem between conservation and symmetry.
Or, for over 100 years now, every new generation of physics students is struggling to try and make sense of QM at an intuitive level, with no success.
To the vast majority of humans (who aren’t obsessed with understanding physics as well as possible), such insights don’t resonate at all, even though all those insights were discovered by some fellow human.
Basically, the more advanced the insights the more they are disconnected from human daily intuition and common sense, which have been built from evolution pressure to make us survive better on earth.
It’s likely that a creature that’s orders of magnitude smarter than we are (in terms of memory capacity, ability to hold and connect hundreds of complex models all at once in its mind, skill at using very advanced mathematical tools, etc… ) would create insights that are totally out of our reach.
We’d be like the dogs in those disappearing trick videos:
Comment #180 June 17th, 2024 at 11:54 am
fred #177
I don’t see the difference between that and what we currently have (you touch on this). So long as there are testable predictions then really doesn’t matter and just shut up and calculate. If no novel testable predictions, and so no real application, then consigned (at least temporarily) to the purgatory of interesting ideas.
Comment #181 June 17th, 2024 at 12:18 pm
A year or so ago the experts on the field were telling the US government that maybe things should be slowed down, because of the existential risks… but now the US government would be told the polar opposite – things should be accelerated as fast as possible, even though the existential risks are still there, because we need to beat China to it.
That reminds me a lot of the Einstein flip flop on the atom bomb during WW2 (except it was the other way around).
Comment #182 June 17th, 2024 at 1:12 pm
Mark says in #43
“the Turing Test is almost certainly bogus”
I am certain that in your life you have met people who you consider to be geniuses and people who you consider to be idiots, but if the Turing Test is almost certainly bogus, if you didn’t use the observation of intelligent behavior, then how are you able to differentiate between geniuses and idiots?
John K Clark
Comment #183 June 17th, 2024 at 1:43 pm
James Cross #176
The ABC Conjecture debate lasted for years (I assume the current consensus is that it is wrong). The four color theorem brute force proof, first computer assisted, debate lasted for years and was ultimately found conclusively to contain an error. Wiles proof was found to contain an error that was correctible after a “revelation”. So no doubt an AI can produce an impenetrable proof. Unless it results in testable predictions, or can be understood by some number of mathematicians (possibly just one), then still in idea purgatory.
Comment #184 June 17th, 2024 at 5:43 pm
I need the AI-superintelligence people to sync up with the green-energy-superabundance people. The RethinkX model says that the cheapest electric power for the rest of this century is solar+batteries – and for compute, you wouldn’t need the batteries, just run the processors where the electricity is getting generated and idle them elsewhere. Or rebalance according to what is the most efficient allocation of capital – if your GPUs are very expensive, then you should double your cheap solar and add batteries. Wright’s law might beat Moore’s law for a while.
Comment #185 June 17th, 2024 at 8:58 pm
Cryptid #125: well said. FYI the « Turing test was solved » misinformation seems to originate from: https://www.nature.com/articles/d41586-023-02361-7
Comment #186 June 17th, 2024 at 9:28 pm
cryptid #173
Nice example!
But even if you don’t want to test its “logic” you can ask a question like “how do you topple a top toppler” which no human would try to answer sensibly like chatGPT4o does.
The “machine” can’t distinguish between dumb and deep questions.
Comment #187 June 17th, 2024 at 11:48 pm
fred #181: Not the same experts. The people who advocated for a pause a year ago, almost invariably still do. AGI is not an issue like vaccines or the causes of climate change, where the experts are all basically on one side, and if you keep trying to identify the nonexistent expert consensus you’ll keep getting it wrong. On this blog, I’ll try to give a fair hearing to anyone who clearly understands what’s already happened in AI, while disagreeing about the future.
Comment #188 June 18th, 2024 at 7:44 am
James Gallagher #184, you’re right we can identify present LLMs because they never get tired of whatever input we throw, but that’s not fair -they were so heavily trained for that behavior! In contrast cryptid’s trick is qualitatively different, in that we never trained LLMs to get fooled by self-referencial sentences. It feels like an emerging property from lack of a proper system 2, as (my internal model of) F Chollet would say.
Comment #189 June 18th, 2024 at 9:07 am
When Turing proposed the test, he gave two examples of Q/A interactions. Here they are in their entirety:
And
You can ask ChatGPT these questions and have full conversations about them. At a conceptual level, at least as far as Turing himself considered it, the test is clearly passed. Whatever he thought was to be gotten from the test has been gotten. (By the way, did you notice the arithmetic error in the answer above? Is that the human answer or the machine answer?)
Actually implementing the test would require specifying the rules more clearly. The interrogator is to talk to a human on one end and a machine on the other.
Finally, the Turing test is to be conducted over text “in order that tones of voice may not help the interrogator”. That rule can done away with now; voice synthesis capabilities these days are absolutely exquisite.
Comment #190 June 18th, 2024 at 10:57 am
OhMyGoodness #178
yea, at some level the “insights” don’t matter, the proof will be in the pudding.
E.g. an AI will say, mix this chemical and that chemical this way, and then apply a current: zero resistance at room temperature!
The why and how it happens don’t have to be understandable to us… (of course maybe some of us will dedicate their entire life to try and understand how the AI did it… but that’s nothing new either – the vast majority of humanity has no clue how a TV works, and they just don’t care).
Comment #191 June 18th, 2024 at 12:12 pm
The more I think about it, the less clear I am about the concept of AGI.
E.g. when NVidia engineers run a machine learning system to optimize the placement of billions of transistors in their latest chip, all they care about is the output, not whether it was produced by a system that could also write poetry and solve sudoku puzzles, right?
So are we somehow expecting that generalized AI system will perform better than narrow focus AI systems?
Or the idea is that what matters is having one giant AGI system that can get better by just training itself at anything without our intervention?
Is the motivation behind AGI to hope to entirely take humans out of the loop, except for the very first step to just “turn on” the system and then give it general goals like “here’s a chip Fab, please optimize it…” with a minimum of effort?
It’s all quite odd because that’s just not how humans even work, as general intelligent systems, we also do specialize, and we don’t come up with a 5 year curriculum for students where they will learn a bit of engineering, a bit of biology, a bit of poetry, and then that’s it… let them improvise later on based on whatever job they take. No, instead we specialize them as much as is possible for a reasonable amount of time and money.
Although it could be that the “general” in AGI will allow us to have one centralized system which trains/spawns/… specialized AIs on the cloud to solve discrete problems. But spawning a gazillion AGI that are not specialized yet seems like it could be really sub-optimal (and we may not even have the resources to do it… maybe a single AGI will require too much compute to be cloned arbitrarily).
Comment #192 June 18th, 2024 at 12:15 pm
Marty Says in Comment #108
“Maybe Aschenbrenner will turn out to be right about the rate of progress or maybe he won’t, but my take is that the “bedrock” rationale for most of his claims is ultimately a trust/faith in the continuation of OOM trendlines and the expectations of imminent progress by researchers he knows or trusts.”
I’ll tell you why I’m almost certain that the OOM AI trendlines will continue, because the number of computations that are useful to AI (4 and 8 but precision) we can perform for a given amount of energy is increasing much faster than Moore’s law, and even more important, we have an existence proof that if we are still missing something fundamental in our current AI models (and I’m not at all sure that we are) it can’t be very complicated so we will soon discover it.
We know for a fact that the human genome contains 3 billion base pairs, and we know there are 4 bases, so each base can represent 2 bits and there are 8 bits per byte; therefore the entire human genome only has the capacity to hold 750 MB of information; that’s tiny! And the true number must be MUCH less than that because that’s the recipe for building an entire human being, not just the brain, and the genome contains a gargantuan amount of redundancy, 750 MB is just the upper bound. If we’re missing a key brain algorithm it must be very small, only one or two kilobytes, maybe less. Something that simple can’t remain undiscovered for long.
John K Clark
Comment #193 June 18th, 2024 at 12:27 pm
I can see how the goal of an AGI is a interesting concept to attain, with the hope to unleash amazing “creativity”, but maybe it’s worth asking whether we can’t just stick to specialized AIs, because we know that, although those require more supervision from us, there’s also there’s no risk of miss-alignment. And then connecting all these in interesting ways could bring up amazing creativity as well.
In a way this reminds me of the scifi novel “A deepness in the sky” by V.Vinge, where humans are becoming super-specialized by making their brain focus obsessively on a single task
https://en.wikipedia.org/wiki/A_Deepness_in_the_Sky
“An active MRI-type device triggers changes through dia- and paramagnetic biological molecules. By manipulating the brain in this way, Emergent managers induce obsession with a single idea or specialty, which they call Focus, essentially turning people into brilliant appliances.”
Comment #194 June 18th, 2024 at 2:35 pm
Scott,
As an aside, regarding nuclear power vs fossil fuels. Currently, ~85% of energy use is from fossil fuels and in absolute terms, fossil fuel dependency has only been growing. Less than 20% of energy use is meant for electricity generation which has been impacted by use of renewables and can be impacted by nuclear. There is a vast amount of industrial use for cement, asphalt, steel, ammonia, plastics, shipping, flying, heavy road transport etc for which fossil fuels are not going away anytime soon.
Comment #195 June 18th, 2024 at 7:33 pm
Btw, whatever happened with the recent research that scaling up LLMs was basically doing over-fitting of the training data, so much so that, with well crafted queries, one could just pull back the training data as is?
Btw, about over-fitting the training data, in the Sora demo reel, there was that bit about Lagos, Nigeria:
as soon as I saw this bit, I instantly said, “Damn, that’s Miss Josey!”, an African youtube vlogger
Comment #196 June 18th, 2024 at 8:47 pm
Chop chop #186
I don’t think it’s unfair at all, we (humans) can easily detect bullshit nonsense which this billion dollar computer program can’t and (IMHO) never will be able to.
If AI was so powerful, we wouldn’t have been the first planet to discover it, even in our local galaxy we would not have had the slow industrial revolution unless AI is just mean and likes human suffering…
Comment #197 June 18th, 2024 at 9:55 pm
The first employment termination of an AI happened recently. McDonalds fired its AI drive-through order taker. AI’s are still not eligible for unemployment benefits apparently.
Comment #198 June 18th, 2024 at 11:36 pm
AGI/ASI will be great optimizers when an optimization function is identifiable. For AI research, I can see where AGI/ASI replaces researchers. But in day-to-day coding jobs, so much of it is reading between the lines of PRDs that aren’t fully specified. It’s almost like AGI’s job would be to take “code” (a fully-defined PRD) and turn it into machine code. But that’s already a solved problem. I don’t know if I see a way for automated reinforcement learning to work well for generating code from raw PRDs. The AI will generate what it generates, and think it’s fine. There’s nothing to compare against. There’s RLHF but there’s a scale limit to that. Eventually AI runs into a “what am I optimizing” bottleneck.
So it’s interesting, the instinctual reaction to all this is we all need to get out of business software and into AI before it’s too late. But maybe it’s the AI jobs that will be the ones to get replaced.
Comment #199 June 19th, 2024 at 7:36 am
Responding to what Fred Said in Comment #60:
You bring up some interesting points regarding what a super intelligence might or might not be able to achieve. One thing we can say for certain is that it will be able to outwit a human being, any human being, in fact all human beings. It won’t be able to find the one and only perfect solution to a NP complete problem (or maybe it can if it turns out to everybody surprise that P=NP, after all mathematicians have been surprised before) but it will certainly be able to find better approximations to those problems that worked pretty well most of the time. And as far as practicality is concerned often that’s all you need.
> “People bring up “even if an AGI is like an average scientific brain, if it can run 1,000 times faster, it’s gonna be massive win”. True, but, again, in the real world scientific progress almost always relies on experiments”
Physicists say the reason they haven’t been able to come up with a quantum theory of gravity is that they’ve run out of experimental evidence to give them hints about which way to go. That might be a valid excuse, then again it might not be, maybe human physicists just aren’t smart enough. The LHC in Switzerland produces 1.5* 10^16 bytes of raw data each year, only a very tiny percentage of that mountain of data has been closely analyzed; everything you need to know to develop a theory of everything might be lurking in that data, you just need to be smart enough to fish it out. And of course pure mathematics needs no experimental data.
But for the moment let’s forget about science and mathematics and look at just one area, engineering. Suppose it takes a thousand first rate engineers 5 years to design a new supersonic stealth bomber, but a superintelligent AI could do the same thing in 24 hours. And I remind you that Drexler style Nanotechnology needs no breakthrough in science or mathematics to become a reality, all it needs is improved engineering.
John K Clark
Comment #200 June 19th, 2024 at 8:15 am
James #194, Ok I guess I should not say « unfair » as it’s prone to multiple interpretations, but here is one attempt to better formalize my point:
Let T be a test that separates (reasonably informed) humans from present LLMs. If small variations on LLMs can break T, then T is unfair. If significant progress is required, then T is fair.
Application: the ARC-AGI challenge seems largely defeated using a few tricks, then it seems largely unfair.
https://www.lesswrong.com/posts/Rdwui3wHxCeKb7feK/getting-50-sota-on-arc-agi-with-gpt-4o
So back to your proposal: I feel like « never get tired of our questions » is probably unfair, while « easily fooled by self-referential sentences » is probably fair. But of course that’s just my guess, time will tell.
Comment #201 June 19th, 2024 at 8:20 am
Trillion dollar models, just based on extrapolating current model performance. Really ? There is too much drama with origins from one company for sure. Either its too much orchestrated hype, or there is something going on under the hood that is would be very disruptive …which one is it ?
Comment #202 June 19th, 2024 at 10:20 am
The Sora demo clip I meant to paste is this
https://youtu.be/HK6y8DAPN_0?feature=shared&t=231
Comment #203 June 19th, 2024 at 10:31 am
John K Clark
“One thing we can say for certain is that it will be able to outwit a human being, any human being, in fact all human beings.”
That’s probably true once it gets past a certain level, just like (almost) any human can outwit (almost) any dog.
but it’s always difficult to predict because our only benchmark for intelligence is the human brain, and it’s also a fact that there isn’t a single human that can outwit every single other human across all domains. Like, math and science geniuses are rarely great at social interaction, politics, etc.
That said, this could just be a limitation of our finite brain size and finite lifespan, and it’s not evidence that an AGI won’t be better than any human across all domains.
It’s not hard to imagine a “master” general AI that would use and coordinate many narrow AI assistants in order to excel across the board to achieve this.
In a way that’s also how the human brain functions, multiple sub-systems are constantly “competing” to prioritize goals and resources (which can go wrong when addiction or OCD get in the way).
Comment #204 June 19th, 2024 at 12:52 pm
Luca Trevisan died 🙁
Comment #205 June 19th, 2024 at 3:02 pm
James Cross Says in Comment #121
> “Orcas may already have superintelligence. Only their lacking arms, hands, and feet has thwarted their desire to rule the world”
I don’t think so because Orcas lack hands so they can’t intricately manipulate their environment, there would be no evolutionary pressure to develop superintelligence. Superintelligence would not increase the likelihood of an Orca getting its genes into the next generation, it might even decrease its chances because a brain is an energy hog; in humans the brain is only about 2% of body weight but it claims about 20% of the body’s total energy budget. Without hands all that would be wasted energy from Evolution’s point of view, it would not favor a brain that is larger than it needs to be. The brain of an Orca is big because the body of an Orca is big, and there is roughly a one to one correspondence between an area of the body with a smaller area of the brain.
John K Clark
Comment #206 June 19th, 2024 at 3:05 pm
Jan Says in Comment #103
> “Let me know when LLMs can do simple math like adding two 20 digit random numbers”
I’m not sure that LLMs will ever be good at things like that nor do they need to be because for decades we’ve had two dollar calculators that can add numbers far better than any human being.
John K Clark
Comment #207 June 19th, 2024 at 3:32 pm
I’m probably wrong in my above comment. Initially these things may read the PRD and produce code that implements exactly what’s specified, but not what was intended. It would fail to read obvious things between the lines, like “we need to persist the data”. But it probably won’t take long to unhobble these aspects. Like any entry-level engineer, we’d first have it work through a design plan. We’d notice that no database was included in the design, and then tell it that the data needs persisted. After a few rounds of this, feeding back into RLHF, for projects across millions of companies, the NN would quickly start to understand what types of projects require some persistence, and what can just run non-durably. It may take a while to get it all right, but probably no more than a couple years. Possibly a lot less than that, if it can pick up some “common sense” about these things from existing sources on the internet.
Comment #208 June 19th, 2024 at 7:17 pm
Unfortunately the video is in French, it’s about the deployment of AI surveillance and smart cameras in the US and France. Something that so far only China had been deploying across the board.
Comment #209 June 19th, 2024 at 11:57 pm
John Clark #204
The comments were meant in humor. Based on a paper I linked to Orcas have more neurons in the cortex than the great whales that have far higher body weight and size by volume.
fred #206
About a year ago I was going through US passport control and as I approached the desk the agent greeted me with my first name and looked at me as though expecting some response. I thought-facial recognition system-and provided no response.
London has an extensive video surveillance system and earlier this year there was an announcement it would be coupled with facial recognition.
Comment #210 June 20th, 2024 at 7:19 am
John Clark #202
My comments were intended as humor, but you seem to have added a requirement for superintelligence: hands.
Two characteristics that are unique to humans among animals are sophisticated language and toolmaking. Both of those probably coevolved and make heavy use of the expanded number of neurons in the prefrontal cortex.
However, that isn’t necessarily a good argument for orcas lacking superintelligence. Their absolute number of neurons is high and the ratio to body size is also high. What are all of those neurons for? Superintelligence might not look like anything we would expect.
Comment #211 June 21st, 2024 at 5:47 pm
[…] este blog, Leopold Aschenbrenner fue miembro del “ahora-disuelto Superalignment Team de OpenAI”. Y, […]
Comment #212 June 22nd, 2024 at 5:24 am
James Cross Says in #208
> “John Clark #202 My comments were intended as humor, but you seem to have added a requirement for superintelligence: hands.”
I did not write that, “OhMyGoodness” did
> “Two characteristics that are unique to humans among animals are sophisticated language and toolmaking. Both of those probably coevolved and make heavy use of the expanded number of neurons in the prefrontal cortex.”
You’ve got that backwards. If Darwin’s theory is correct then upright walking and freeing the upper two limbs for manipulating objects would have had to come BEFORE the unusually rapid growth (geologically speaking) of the prefrontal cortex that we see in modern humans. And the fossil record tells us that’s exactly what happened. Lucy lived 3.2 million years ago and her brain was only slightly larger than a chimpanzee’s, but she was fully bipedal, and there are fossil footprints of bipedal apes walking in wet sand that are even older than that.
There are theories but nobody knows for sure what the evolutionary pressure was that caused some apes to start walking upright, but whatever it was it resulted in an extremely rapid growth in brain size; because if an upright ape just needed two limbs for locomotion then it had two other appendages that were now free to manipulate objects. And so an ape that was just slightly smarter than average had a much better chance of getting its genes into the next generation. By contrast a zebra that had human level intelligence or even superhuman intelligence would have no survival advantage over present day dumb zebras because, although it might have brilliant ideas it would have no way to implement them. In fact a superintelligent zebra would be less likely to survive because it would be wasting vast amounts of energy running a huge brain that could provide no help in surviving on the dangerous African savanna.
> ” However, that isn’t necessarily a good argument for orcas lacking superintelligence. Their absolute number of neurons is high and the ratio to body size is also high. What are all of those neurons for? ”
For agility. If you’re a huge animal then you’ve got to know what position every part of your huge body is in, and what every part of it is sensing. And you’ve got to know what messages to send to a huge number of nerves to get huge muscles to move so that your huge body does what you want it to do. And orcas are social animals and that almost always result in a larger than average brain to body ratio, but that alone is not nearly enough to explain the gargantuan brain that humans have. I also think that once our hominid ancestors developed even a very primitive language that must have produced even more revolutionary pressure in the direction of larger brain size.
John K Clark
Comment #213 June 23rd, 2024 at 1:45 am
A comparison of Neanderthals and fossil (Stone Age) Homo Sapiens skull volumes. It is difficult to make the case that these were sampled from decidedly different distributions. Skeletal remains suggest Neanderthal males averaged 5’4” and 170 lbs and females 5’1” and 145 lbs.
https://images.ctfassets.net/cnu0m8re1exe/5EkYgoCxu5fDqjSGqeedJS/6a3af864e60486be4a6b9fb381455c6b/skull-volumes-283×300.jpg?fm=jpg&fl=progressive&w=660&h=433&fit=pad
Comment #214 June 23rd, 2024 at 4:15 am
John Clark #212
I am mystified. Would you please point out in any of my posts where the word “hand(s)” appears. Alternatively this is from your post 205-
“ I don’t think so because Orcas lack hands so they can’t intricately manipulate their environment,”
This appears to imply that an AGI will require hands to become a super intelligence.
Comment #215 June 23rd, 2024 at 6:36 am
OhMyGoodness Says:
” I am mystified. Would you please point out in any of my posts where the word “hand(s)” appears.
In 209 you wrote:
“John Clark #204
The comments were meant in humor. Based on a paper I linked to Orcas have more neurons in the cortex than the great whales that have far higher body weight and size by volume”
But I did not write that in #204, in fact I did not write that at all. I meant no offense, I was just trying to correct a misattribution.
John K Clark
Comment #216 June 24th, 2024 at 10:07 am
“With unusual clarity, concreteness, and seriousness, and with a noticeably different style than the LessWrongers with whom he shares some key beliefs, Leopold sets out his vision of how AI is going to transform civilization over the next 5-10 years.”
Scott, this is a relatively unimportant question, but I’m curious to hear more from you about what those differences in style are.
Comment #217 June 24th, 2024 at 6:03 pm
Daniel #216: Good to hear from you again!
At the highest level, I’d say that the typical LW post is written as if meant to be read by a small, jargon-immersed community of other LWers, whereas “Situational Awareness” is written as if meant to be read by the Joint Chiefs of Staff. Relatedly, most LW posts are mostly full of aprioristic arguments, whereas Leopold keeps his writing more grounded in what’s concretely happened in AI in the past four years.
Comment #218 June 25th, 2024 at 1:13 pm
A super interesting discussion between Sean Carroll and Francois Chollet (from Google AI team) about the limitations of LLMs.
Comment #219 June 25th, 2024 at 1:45 pm
Related to the paper at the center of this post, the guest of Sean Carroll says that none of the claims that LLMs can just be scaled up to reach “high schooler intelligence” make any sense.
He’s utterly baffled how people in the field can really believe those claims. Basically people see good answers, and they claim the models are intelligent, without digging further to get outside the training data.
Basically LLMs are doing curve fitting (coming up with fairly complex vector functions to fit the training data), they’re great to interpolate something they’ve met before, but they can’t actually do any active searching in some domain, which is what’s needed to “extrapolate” (my term) outside of the box and solve novel problems (which LLMs can’t do).
Comment #220 June 25th, 2024 at 2:48 pm
fred #219: It seems to me that the only way you can defend that view, is by endlessly redefining what counts as a “novel problem.” The college-level math and science problems, etc. dismissed as “routine” today, are precisely the same kinds that would’ve been held up six years ago as examples of novel challenges that AI on the current paradigm would never solve!
Comment #221 June 25th, 2024 at 5:30 pm
Scott #220
it’s simply what’s part of the training set.
It’s proven that LLMs do poorly on things they weren’t trained on. Which doesn’t mean that there isn’t tremendous power in something that can “interpolate” across all content of everything ever generated by humans, something that no single human can do as well.
The point here is that the loop is this:
1) humans come up with an entirely new problem + its solution
2) the entirely new problem and its solution are integrated into the LLMs latest training set
3) go back to 1)
Sure, the result of 2) is that the LLM is now performing better, i.e. it can deal with an entirely new type of problem, but the point is that without the humans feeding the novel problem and its solution, the AI would never have been able to solve it “spontaneously”.
The system you get is more like an amazingly good search engine.
But, when probed outside the space of what was encountered before, the system isn’t and can’t do an active search in that new domain.
(a strong clue of this is that the computation cost of an LLM guessing the next token is independent of the context).
What we want is an AI that can independently solve any sort of totally novel problem, more in the way that Alpha Zero can tackle any board game, except that we don’t want that for the limited space of all possible board games, but in the much larger space of all things humans care about.
Comment #222 June 25th, 2024 at 5:47 pm
Theoretically the question is whether solving any problem can be reduced into a set of interpolation across data points from different domains using logical and deductive first principles.
Then, if this is the case, how big would an LLM have to be to hold enough such abstraction layers (so that entirely different domains can be connected at some level), and what is the right training data (just raw training data may be too ineffective, and some well crafted training set is better).
Ultimately it’s all patterns within the training data… just like humans reached this point by growing their brain to extract meaningful patterns in their environment.
As I always say, the intelligence is in the environment, and more precisely in the perception data of this environment (perception evolved based on the richness of the patterns that started to get extracted).
As an example (don’t get offended if you’re a marine biologist :P), as an environment, the ocean is less interesting/less challenging/more regular than dry land, i.e. temperature is more uniform, the amount of sunlight is more constant, the topology is less important, gravity plays almost no role, etc. Which is why dry land has a more diverse range of adaptations (the only smart sea creatures, besides octopus, are mammals that evolved on dry land and went back to the ocean).
Comment #223 June 25th, 2024 at 7:08 pm
If you believe that paper, then
1) train a new instance of ChatGPT on all the papers and books published prior to 1905.
2) in a prompt, ask it to come up with theories to explain the results of the Michelson-Morley experiment, the anomalous precession of the perihelion of Mercury, etc.
Then, it’s only a matter of network size for ChatGPT to complete this prompt with some equivalent of Einstein’s theories of Special and General Relativity?
Basically you’d believe that Einstein’s theories were simply “latent” in all the data published by 1905, and it’s just a matter of a big enough “latent space” in the model for the training to have implicitly extracted all the necessary patterns and connections to just spit out the theories with just the same cost it would take to spit out any answer that’s of the same length.
Like, somehow, Einstein’s theories are discovered during training, given a big enough network.
Comment #224 June 26th, 2024 at 6:32 am
I agree with Fred, if somebody insists that Large Language Models can’t do anything original, they just modify stuff in their training data, then they would also have to insist that Einstein did nothing original in 1915 when he came up with the General Theory Of Relativity. After all, long before 1915 Einstein knew about Galilean relativity, and the works of Riemann and Lorentz were in Einstein’s training data. And Minkowski was even one of Einstein’s teachers.
Besides, these days AI is more than just LLMs, text to image programs are just as impressive. You can say that you don’t like the images those programs produced because there’s no disputing matters of taste, but you can’t deny the objective fact that they are excellent draftsmen and demonstrate a huge amount of technical skill, and you can’t deny that the images they produce are original or the fact that some people, in fact most people, find the images to be beautiful, or at least interesting. Image to video programs may be even more impressive, but they couldn’t work as well as they do if they didn’t have a deep understanding of what the real world looked like and an understanding of everyday physics that was almost as good as the intuitive (non-relativistic non-quantum) physics that every human being has.
John K Clark
Comment #225 June 26th, 2024 at 7:12 am
Nobody should be surprised that AI experts working in big tech on LLMs (i.e. with a big stake in how well their company stocks are doing) would OF COURSE be for sinking trillions of dollars of tax payer money into the field, based on the claim that it’s just a matter of scaling up. 😛
What’s surprising/refreshing/comforting is to find some experts push back on this.
Comment #226 June 26th, 2024 at 2:10 pm
fred, I don’t think interpolation is all there is to an NNs capabilities. There are 2 main methods of induction, or in the sense of Kolmogoroff complexity, means of simplification of a complex function. (sorry for the function formatting, TeX doesn’t want to work for me):
1. Modularisation: Given a function f(x_1,..,x_n), that makes a prediction from a data vector x. If this function can be written as
g(f_1(x_1,..,x_k),f_2(x_{k+1},..,x_n))
for some functions g, f_1 and f_2, then this amounts a reduction in complexity. If the x’s have a finite set of values, then the tables of function values, which describe g, f_1 and f_2 are much smaller than the table describing f. This simplification can be done recursively on g, f_1 and f_2. Basically this amounts to a nested tower of functions similar to the layers of NNs. This procedure is similar to what happens for the minimal circuit size problem, the major difference being, that the minimal circuit should result in exact solutions. The MCSP seems hard and so probably finding optimum weights for NNs. Gradient descent works well but I don’t think it can’t be improved on.
2. Analogy: Say, that in the above example, k=n/2 and the functions f_1 and f_2 are identical. Then this becomes another reduction in complexity, since only one of them needs to be specified. I don’t think standard NNs can exploit this sort of redundancy, but perhaps transformers are better in that respect, since they can recycle useful subfunctions in other domains by redirecting information to them by the attention mechanism.
I think, the picture of a fitting function being optimised in training is basically correct, although, it is somewhat simplistic, since the NNs employ inductive methods like modularisation and perhaps analogy in the creation of this function.
Comment #227 June 27th, 2024 at 1:33 am
Scott #4, could you please link to some studies/ op-eds/ books based on which you make your claim about atomic energy being “too cheap to meter” if not for the “blankfaces” who stalled its development in the 70’s? I am not an expert in the subject matter myself, and in popular materials (e.g. https://www.youtube.com/watch?v=k13jZ9qHJ5U) I usually encounter a more nuanced view on the role of atomic energy in solving the climate crisis. Besides, it kind of contradicts that what I know about the history of nuclear energetics, however little I know about the matter. AFAIK Fukushima and Chernobyl are the things to blame for the current situation.
Comment #228 June 27th, 2024 at 4:03 am
Alright #63, please, don’t equate progressivism with marxism (“late stage capitalism”? For Christ’s sake, please, not again).
Comment #229 June 27th, 2024 at 4:14 am
Danylo Yakymenko #70, Putin is definitely a serious concern, but the level of threat imposed by him is much, much lower than that of Xi’s China. As you are probably well aware, Putin has been struggling for more than two years now to accomplish an equivalent of Hitler’s occupation of Poland, despite the fact that this time France and Britain are not ready to declare war on the aggressor. Putin’s army is incomparably weaker than that of NATO. And he can only lose from using Russia’s nuclear weapons offensively, so he won’t. If you want to use WWII analogies, Putin is not Hitler, Putin is Mussolini trying to conquer Greece and restore the Roman Empire.
Comment #230 June 27th, 2024 at 9:55 am
John Clark #224
I’m actually the type of person who’s also more drawn to image/video generation because I’ve always been into computer graphics (since I was a kid in the 80s), and they represent a more fundamental level of understanding of space and things.
So I’ve always been following that area of AI more closely.
What impresses me is the clear capability of such systems to composite various elements that couldn’t possibly all be together in the training set.
What’s also super impressive is their increasing capability to nail the lighting (with shadows, light bounces, diffusion, etc)… it shows that at some level they’re able to extract the physics of light propagation.
But one problem (like with all models trained on huge data sets) is that, when looking at some result, I’m always wondering how close elements in that image are from things in the training dataset… it’s a shame one can’t ask the model to supply the source images that are the closest to various elements in the result. Think of it as a measure of “plagiarism”.
Comment #231 June 27th, 2024 at 12:59 pm
fred Says: In # Comment #225
” Nobody should be surprised that AI experts working in big tech on LLMs (i.e. with a big stake in how well their company stocks are doing) would OF COURSE be for sinking trillions of dollars of tax payer money into the field”
Taxpayer money? So far almost all the money that ignited the current AI revolution and keeps it accelerating has come from individuals and private companies, hell even I made a modest financial investment in the field. Although it’s only a matter of time before the military becomes interested, or at least they will if they have any brains.
John K Clark
Comment #232 June 27th, 2024 at 6:09 pm
fred #223
It’s never a problem to posit explanations for the outcome of conducted experiments. At the time of Einstein’s insights explanations abounded (Poincaré et al) to explain M&M and Mercury’s orbit. Actually Einstein made a prediction after SR and before GR about the deflection that was in error that he corrected after GR that nailed the measurement. Personally I would have liked the rogue planet Vulcan explanation to explain the curious orbit of Mercury,
The profundity is to accurately predict the outcomes of experiments that have not been conducted like Mercury’s image deflection by solar gravity and gravity waves. Even now we have string theory that seems reasonable and measured values fall out of it but no predictions that can be tested.
No one can say with certainty but I don’t expect AI venerated revelations that subsume QM and or GR in my lifetime but would follow any tests with hope. The manner in which Einstein explained the mental processes that resulted in his great insights in no way resembled interpolation of data nor for that matter an established symbolic approach in his thinking. He found the mathematics after the insight rather than vice versa. Anyone could claim that he was mistaken about this but he was profoundly right often enough that his claims should garner some respect.
Comment #233 June 27th, 2024 at 6:12 pm
Sorry for the stupid typos. The system didn’t provide the usual edit period.
Comment #234 June 28th, 2024 at 9:38 am
John #231
that paper recommends the gov invests trillions in the field… even if it’s not tax payer money (there just isn’t enough), the gov would have to print that money and cause massive inflation, which means everyone financial resources are reduced accordingly (it may just as well be from taxes..)
The paper also talks about reallocating a huge chunk of the electric power or just expecting we’re gonna have free massive excess of electricity really soon…
“By the end of the decade, American electricity production will have grown tens of percent; from the shale fields of Pennsylvania to the solar farms of Nevada, hundreds of millions of GPUs will hum”
Like, really?! We’re just gonna pretend that the grid isn’t already on the brink of collapsing, even more so each year because of extreme heat waves (Texas, anyone?), and the total inability of the gov to improve the infrastructure because… there’s no money and will.
In what fucking world is that guy living?
The world of Elizabeth Holmes and Sam Bankman-Fried? 😀
Comment #235 June 28th, 2024 at 12:51 pm
fred Says in Comment #234
> “that paper recommends the gov invests trillions in the field… even if it’s not tax payer money (there just isn’t enough), the gov would have to print that money and cause massive inflation”
I don’t know how you figure that. If I take a dollar from my wallet and give it to you a new dollar has not been conjured into existence, the government is the only one that can print new money , and it does it whenever it runs a deficit. Individuals don’t have that power but the Government has the ability to just print money when they don’t have enough to do what they wanna do, and that’s not necessarily a bad thing if done responsibly. With the exception of three years during the administration of Bill Clinton the US government has run a deficit every year since 1835, and yet today the USA has the strongest economy on earth. Besides, nobody would invest a dollar into AI or into anything else unless they thought that dollar would eventually produce more than one dollars worth of wealth.
> “The paper also talks about reallocating a huge chunk of the electric power”
First of all I can’t think of a better use for electrical energy than in running a factory that manufactures intelligence because these days that product is in short supply. And then there is the fact that LLM’s are not only getting smarter they are getting more energy efficient; very recently Claude 3.5 Sonnet not only beat GPT-4o in all the major intelligence basemarks it did so despite the fact that it was MUCH smaller and thus gave faster replies and used less energy than the latest GPT.
And there are massive improvements in hardware too. 10 days ago AI researchers reported they can run a billion-parameter AI model with just 13W with no performance loss. That number is for inference not training but even so I find it very impressive because the human brain needs about 20 W to run, and most of the time humans aren’t learning much.
https://arxiv.org/pdf/2406.02528
John K Clark
Comment #236 June 28th, 2024 at 11:47 pm
The link below shows a high quality global climate reconstruction from 2020. Please note in Figure 6 the higher temperatures 6000 years ago referred to as the Holocene Optimum. Please note the uptick 1000 years ago referred to as the Medieval Climate optimum. The Roman Empire Optimum about 2500 years ago is visible but not prominent on this reconstruction.
https://www.nature.com/articles/s41597-020-0530-7
Climate models are simulations with coarse grid and standard practice is to hindcast (match available historical data) prior to making forecasts. The climate models I am aware of were never able to match the temp increase of the Medieval Warm Period and so classically the historical data was disregarded in favor of the modeling results. The approach was that the MWP was only a weather anomaly in Europe and not a global phenomena. The fact is it shows a signal in temp proxies globally. This was the period during which Viking agricultural settlements were established in Greenland that changed to fishing settlements and finally abandoned as the climate cooled. I have never built climate models but have built other models of complex natural systems and ignoring hindcast actual data is an absolute classic mistake.
For me the value of this work is to make predictions and if anyone wants I will make a long list of failed predictions with full references.
My comment on the death of democracy were based on my observations elsewhere that once politicians start pursuing imprisoning their political opponents democracy suffers. It is the same on the left and right in the US and I take this as a bad omen. Democracy requires some measure of respect for political opponents.
Do I think 7,000,000,000 70 kg bipeds have had an effect on the planet-absolutely. Do I think we are facing climate Armageddon-absolutely not. If it were a scientific discussion there would be agreement.
Comment #237 June 29th, 2024 at 4:18 am
James Cross #96 #111
It seems that you also like to talk about having bug-for-bug compatibility! I do not care about this approach…
Take the visual part of the human brain; about 3 giganeurons (and, say the same number of other cells — taking what happens in brain on average). Its functionality is probably a couple of orders of magnitude smaller than that of contemporary image generation neural nets (which use a few gigaparameters of a few bits). The latter nets are awfully pessimized (by being created by extremely primitive algorithms) — I would expect that their complexity is going be decreased by a couple of orders of magnitude in foreseeable future (but forget about this possibility).
So, say, these 6 gigacells may be (probably) modelled by something of complexity 10 gigabits (give or take an order of magnitude — and probably one can shave a couple of orders of magnitude by optimizations!). Very roughly speaking, this seems to be the level of complexity of the topology of a chip with 3 gigagates.
Conclusion: in this example to model the functionality of N neuron (with N supporting cell) today one needs something like N transistors (and probably much less in the future).
AFAIK, no qualitative structural difference was ever found between generic primate’s brain and human brain. Do you doubt that we have reasonably good modelling of capabilities of most primates’ brains now?! And this requires — in order of magnitude — say about a gigaparameter of a few bits?
Essentially, this is the level of complexity of a chip with 1 gigagate!…
Comment #238 June 29th, 2024 at 5:17 am
Oups, I copied a wrong number for the count of visual neurons. It is probably about 3× less. So my (very coarse!) estimate for the number of transistors to emulate N neurons should be 3, not 1. (At least when log N ≫ 1.)
Comment #239 June 29th, 2024 at 7:49 am
#237
The human brain has more total neurons and especially more neurons in the prefrontal cortex than other primates. There are also developmental differences. The human brain, especially the PFC, is more underdeveloped at birth than other primates so it develops more after birth. Language and even vision pass through critical phases of development during this post-birth development period. Infants not exposed to language early in life never learn to speak well. Big differences in my view.
“Human visual cortex comprises 4–6 billion neurons that are organized into more than a dozen distinct functional areas.”
https://stanford.edu/~wandell/data/papers/2007-Wandell-Encyclopedia.pdf
What’s the point about the number of neurons and how many transistors it takes to model them? If you are just modeling inputs and outputs, it doesn’t matter how many artificial neurons are required or what the ratio is to brain neurons. If you are actually trying to model real brain neurons, then you need to account for a lot more – dendrites, axons, connections, and probably a lot more we don’t understand yet. Plus, you will need to model the different types of neurons (more than a thousand different types in the human brain) and how they work differently.
Comment #240 June 29th, 2024 at 12:30 pm
James Cross Says: in Comment #111
> “The key question is whether adding more neurons, more parameters, more whatever to AI without other structural/ algorithmic changes reaches a point of diminishing returns.”
I agree that indeed is the key question, but so far at least the point of diminishing returns is nowhere in sight. Just a few years ago people said that before a computer could pass the Turing Test we would need to find another Einstein, perhaps a dozen Einsteins, but that turned out not to be true. All we needed was faster hardware. Although there have been important developments in the theory of computer science so far in this century, none of them was as revolutionary as quantum physics or the discovery of General Relativity. What really set off the current AI revolution was the simple fact that between 2012 and today the amount of compute firepower available for use in training AI’s has been doubling every 3.4 months, which is about seven times faster than Moore’s Law. Even if transformers had never been invented I think that would’ve delayed the AI revolution by only a year or two.
John K Clark
Comment #241 July 6th, 2024 at 9:20 pm
#224
I agree with John. if somebody insists that Large Language Models can do anything original, they must produce a LLM that can generate General Relativity after being trained on the texts of Newton, Leibniz, Kant, Maxwell, Mach, Minkowski, Levi-Civita etc. You should only claim LLM can do physics after, not before, having an evidence. Einstein is a very good comparison.
These days AI is more than just LLMs with impressive computer vision capability e.g. weren’t we promised self driving cars that can tell difference between a stop sign and a stop sign on a T-shirt like 10 years ago? apparently that was solved now!
Comment #242 July 7th, 2024 at 8:28 am
I read up to Leopold’s link to Scott Aaronson’s post to his 11 year old self and then read this blog for an hour or so. I was thinking about this during a 10k run today in 37C heat here in Nishinomiya. First thought is that “Shtetl-Optimized” seems really dark, but I suppose I don’t understand it.
Disclaimer: I’ve been reading Ray Kurzweil’s latest “The Singularity is Nearer” while reading Leo’s piece. I’m only into chapter 3 though. Ray seems to have redefined his 2029 takeoff to mean when AGI passes a “strong” version of the Turing Test. I like this nuance since it implies that the AI will be smart enough to hold back and feign human frailty. Whereas Leo thinks we’ve already achieved it. I agree with Leo but see the better utility in Ray’s bar.
Don’t we already have Super Intelligences in the form of the Fantastic 7 companies that seem to dominate the compute power? Isn’t any large societal organization capable of things that individuals do not understand. Or, is it because they are top-down hierarchical power structures with CoT legibility that makes them not so opaque?
One of the big assumptions seems to be that there is some difference between AI and human or human and octopus, …. But, you could choose the solar system as the individual computing towards greater intelligence/awareness. In which case, AI is an awesome development. Focusing on individual species or groups of hominids is just so antiquated.
And nobody talks about the genocide taking place on Alpha Centauri (literally). What’s with that!?
Comment #243 September 30th, 2025 at 8:23 am
[…] Aschenbrenner’s principal conclusions are that AGI is likely arriving soon, will be transformative and disruptive, poses significant risks, and demands urgent, coordinated action on security, alignment, and governance to ensure a safe transition. Source: Scott Aronson blog […]
Comment #244 October 8th, 2025 at 12:54 pm
[…] general or national security person is going to read and say: ‘This requires action.’” In a blog post, he called the essay “one of the most extraordinary documents I’ve ever read,” saying […]