Enough with Bell’s Theorem. New topic: Psychopathic killer robots!

A few days ago, a writer named John Rico emailed me the following question, which he’s kindly given me permission to share.
If a computer, or robot, was able to achieve true Artificial Intelligence, but it did not have a parallel programming or capacity for empathy, would that then necessarily make the computer psychopathic?  And if so, would it then follow the rule devised by forensic psychologists that it would necessarily then become predatory?  This then moves us into territory covered by science-fiction films like “The Terminator.”  Would this psychopathic computer decide to kill us?  (Or would that merely be a rational logical decision that wouldn’t require psychopathy?)

See, now this is precisely why I became a CS professor: so that if anyone asked, I could give not merely my opinion, but my professional, expert opinion, on the question of whether psychopathic Terminators will kill us all.

My response (slightly edited) is below.

Dear John,

I fear that your question presupposes way too much anthropomorphizing of an AI machine—that is, imagining that it would even be understandable in terms of human categories like “empathetic” versus “psychopathic.”  Sure, an AI might be understandable in those sorts of terms, but only if it had been programmed to act like a human.  In that case, though, I personally find it no easier or harder to imagine an “empathetic” humanoid robot than a “psychopathic” robot!  (If you want a rich imagining of “empathetic robots” in science fiction, of course you need look no further than Isaac Asimov.)

On the other hand, I personally also think it’s possible—even likely—that an AI would pursue its goals (whatever they happened to be) in a way so different from what humans are used to that the AI couldn’t be usefully compared to any particular type of human, even a human psychopath.  To drive home this point, the AI visionary Eliezer Yudkowsky likes to use the example of the “paperclip maximizer”: an AI whose programming would cause it to use its unimaginably vast intelligence in the service of one goal only—namely, converting as much matter as it possibly can into paperclips!

Now, if such an AI were created, it would indeed likely spell doom for humanity, since the AI would think nothing of destroying the entire Earth to get more iron for paperclips.  But terrible though it would be, would you really want to describe such an entity as a “psychopath,” any more than you’d describe (say) a nuclear weapon as a “psychopath”?  The word “psychopath” connotes some sort of deviation from the human norm, but human norms were never applicable to the paperclip maximizer in the first place … all that was ever relevant was the paperclip norm!

Motivated by these sorts of observations, Yudkowsky has thought and written a great deal about the question of how to create a “friendly AI,” by which he means one that would use its vast intelligence to improve human welfare, instead of maximizing some arbitrary other objective (like the total number of paperclips in existence) that might be at odds with our welfare.  While I don’t always agree with him—for example, I don’t think AI has a single “key,” and I certainly don’t think such a key will be discovered anytime soon—I’m sure you’d find his writings at yudkowsky.net, lesswrong.com, and overcomingbias.com to be of interest.

I should mention, in passing, that “parallel programming” has nothing at all to do with your other (fun) questions.  You could perfectly well have a murderous robot with parallel programming, or a kind, loving robot with serial programming only.

Hope that helps,

88 Responses to “Enough with Bell’s Theorem. New topic: Psychopathic killer robots!”

  1. Vadim Says:

    I took parallel programming in John’s question to mean additional programming for empathy parallel to whatever other programming the robot has, rather than in the parallel vs. serial sense.

  2. Scott Says:

    Vadim: Interesting! I didn’t think of that interpretation, probably because it wouldn’t even occur to me to think of intelligence and empathy as two parallel “tracks” running side-by-side. Instead, empathy (as distinct from pity, sympathy, etc.) seems to me like a form of intelligence.

  3. John Sidles Says:

    The nearest to a relevant quotation in my files is Austin Grossman’s seriously comedic (and thus highly recommended) superhero novel Soon I Will Be Invincible.

    To me, it seems plausible that the very first thing that an AI would do, upon achieving sentience, would be to undertake a lengthy session of psychoanalysis, lest the unformed sentience evolve into an all-too-familiar syndrome that Grossman (hilariously) characterizes as “malign hypercognition disorder.”

    @book{grossman2008, Author = {Grossman, Austin}, Publisher = {Vintage Books}, Title = {Soon I Will Be Invincible}, Year = {2008}, Address = {New York}, Annote = {Page 130:

    “Laserator was a great scientist but his work was wasted on conventional thinkers. There has to be a little crime in any theory, or it’s not truly good science. You have to break the rules to get anything real done. That’s just one of the many things they don’t teach you at Harvard.”

    “[CoreFire] could fly, which was reason enough to resent him. He didn’t even have the decency to work for it, to flap a pair of wings or at least glow a little. He seemed to do it purely out of a sense of entitlement.”}}

  4. John Sidles Says:

    Oh, and by the way, the cover of Grossman’s Soon I Will Be Invincible (American edition) was designed by none other than Chip Kidd … the greatest book-cover designer in the history of the galaxy.

    For me, the *real* Turing Test is whether an AI can give a TEDx lecture as witty-and-wonderful as Kidd’s self-reflective Designing books is no laughing matter. OK, it is.

    What Grossman and Kidd both show us is that these matters are so serious that only non-serious narrative modes can convey their seriousness efficiently.

  5. Douglas Knight Says:

    I interpreted “parallel” the same way Vadim did. Scott, you are right that empathy, as opposed to sympathy, is a form of intelligence, but I think it is rarely a good idea to assume that people make that precise distinction unless they use both words in the same paragraph. I’m told that some people even use them backwards. I think it more likely that JR meant it as what you would call “sympathy,” though there’s a good chance he really meant the precise notion of “empathy.”

    Anyhow, I want to know about “the rule devised by forensic psychologists that [psychopaths] necessarily … become predatory.”

  6. John Sidles Says:

    Douglas, patients commonly complain of “robotic” treatment by physicians, and in this regard a well-respected (but non-peer reviewed) essay is James Hardee’s An Overview of Empathy (2003, a Google search finds it).

    Hardee’s essay scrupulously distinguishes pity, sympathy, and empathy. It is argued that only the latter is a professional obligation of physicians — or professors? or students? — and moreover, that empathy is a cognitive skill that can be assessed, taught, and improved through practice.

  7. Craig Says:

    This seems related somehow: http://www.smbc-comics.com/index.php?db=comics&id=2569

  8. Grognor Says:

    Eliezer Yudkowsky doesn’t think AI has or needs a “single key”!

  9. Bram Cohen Says:

    Sociopathy appears to be a form of cognitive impairment: sociopaths are incapable of modeling other people well enough to understand that other people can see through their shit.

    I agree with you that there’s no single key to AI (although I suspect there’s a final keystone, otherwise human speech wouldn’t have developed so quickly) but I expect us to get human-equivalent machines in 500 years, where you’ve previously estimated 5000.

  10. Colin Reid Says:

    Actually, I thought that Skynet in the Terminator films was a kind of ‘paperclip maximiser’. Its goal is to maximise ‘security’. Humans cannot be relied upon to be loyal, so they are a security hazard, along with anything else that isn’t directly controlled by Skynet. You don’t need to anthropomorphise it to come up with this solution.

  11. IThinkImClever Says:

    Hey! Nobody defined (true) “AI”! What is the working definition we are going with here?

    Sorry, I do not wish to muddy the waters with semantics, but I do not see a non-empathetic robot, let alone a ‘Paperclip Maximizer’, as intelligent in any way, shape, or form.

    I mean, if it could effectively ‘calculate’ the benefit it would receive as a result of its actions, whether obtaining matter or consuming more than its fair share of resources (food/energy/money/…) in a competitive finite domain, then how difficult would it be to have it ‘calculate’ the damage it is causing others, apply that ‘cost’ simultaneously to itself (empathy?), and then make overall decisions based on those two costs, say as:

    [Overall Utility] = (benefit to self 🙂) - (detriment to self := them 🙁)

    That would seem a bit more intelligent, since selfish, inconsiderate gluttons are ultimately doomed to failure as they mindlessly destroy the environments which they require for sustainability. (Side note: are we intelligent? 😉 )
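
    A toy sketch of the rule above, with the empathy weight made an explicit parameter (the action names, numbers, and the weight are arbitrary illustrations, not anything from the comment):

```python
# Sketch of an "empathetic" utility: subtract the harm an action causes
# others from the benefit to self. empathy=1.0 is full empathy; 0.0 is the
# "psychopathic" limit. All numbers below are made up for illustration.

def overall_utility(benefit_to_self, detriment_to_others, empathy=1.0):
    return benefit_to_self - empathy * detriment_to_others

def choose_action(actions, empathy=1.0):
    # Pick whichever action maximizes the empathy-weighted utility.
    return max(actions, key=lambda a: overall_utility(a["benefit"], a["harm"], empathy))

actions = [
    {"name": "hoard resources", "benefit": 10.0, "harm": 9.0},
    {"name": "share resources", "benefit": 6.0, "harm": 1.0},
]

print(choose_action(actions, empathy=1.0)["name"])  # prints "share resources"
print(choose_action(actions, empathy=0.0)["name"])  # prints "hoard resources"
```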

  12. dmfdmf Says:

    I think it’s well past time to contact Old Glory.


  13. S Says:

    I also interpreted “parallel” the way Vadim and Douglas did: “If a computer, or robot, was able to achieve true Artificial Intelligence, but it did not have a parallel programming or capacity for empathy” as
    “it did not have a parallel ((programming or capacity) for empathy)”,
    or in other words
    “it did not, in parallel, have a programming or capacity for empathy”.

    It may be true that empathy is a form of intelligence, but it is also plausible that “true Artificial intelligence” does not necessarily have the form “empathy” as a requisite, so the question spelled it out as an extra feature that may not be present.

    I too would like to know the “necessarily then become predatory” psychology research!

  14. Alexander Kruel Says:

    See, now this is precisely why I became a CS professor: so that if anyone asked, I could give not merely my opinion, but my professional, expert opinion…

    I’d really like to know your opinion on the possibility of quick and vast recursive self-improvement (intelligence explosion).

    From the perspective of computational complexity theory: how likely is a general optimization/compression/learning algorithm that, when fed itself as input, outputs a new algorithm that is better by some multiple? And how likely is it to be uncontrollably fast?

    Do you expect there to be any non-linear complexity constraints that lead to strongly diminishing intelligence returns for additional compute power?

    To be more precise: what probability do you assign to the possibility of an AI with initially roughly professional human-level competence (or better, perhaps unevenly) at general reasoning (including science, mathematics, engineering and programming) self-modifying its way up to vastly superhuman capabilities within a matter of hours/days/< 5 years?
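
    One toy way to make the "uncontrollably fast" part concrete (purely illustrative dynamics, not anything proposed in the post): model self-improvement as dI/dt = I^p. For p > 1 the intelligence blows up in finite time (an "explosion"); for p < 1 returns diminish and growth stays tame:

```python
# Toy self-improvement dynamics: dI/dt = I**p, integrated by Euler steps.
# p > 1: finite-time blowup ("intelligence explosion").
# p <= 1: diminishing returns, no explosion in the simulated window.
# All parameters are illustrative assumptions.

def time_to_explode(p, steps=5000, dt=0.001, cap=1e12):
    I = 1.0
    for n in range(steps):
        I += dt * I**p
        if I > cap:
            return n * dt  # time at which I first exceeds the cap
    return None  # never exploded within the simulated window

print(time_to_explode(p=2.0))  # a finite time (around 1.0 for these settings)
print(time_to_explode(p=0.5))  # None
```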

  15. David Brown Says:

    It seems to me that what people generally mean by “intelligence” is really an average of several hundred (or perhaps several thousand) specific mental abilities. By “empathy” people might mean an average of several dozen specific mental and emotional capacities. A serial killer like Theodore “Ted” Bundy might have superior empathy in the sense of predicting how people think and feel, but, ordinarily, people seem to conflate “empathy” and “sympathy”. One basic problem with AI is that the danger might go from negligible to overwhelming in months, weeks, days, or perhaps hours.
    Consider the short story “Watchbird” by Robert Sheckley—one can easily imagine many variants of the “Watchbird” problem related to either high IQ AI or low IQ AI. As far as “sympathy” and “compassion” are concerned, some people like Peter Singer of Princeton University might seem on the border-line of sanity with regard to animal rights—but perhaps Singer is merely “enlightened”.

  16. Timothy Gowers Says:

    I agree with the general point that human categories could well be inappropriate for judging intelligent robot behaviour. However, I’m not so sure that the paperclip maximizer illustrates the point well: humans don’t go through life with a single goal like that, and I find the idea of a single-issue robot that is nevertheless extremely intelligent sufficiently implausible not to be convinced by that example.

    There are two things I could mean here — a weak one that a highly intelligent paperclip maximizer is a rather outlandish idea, and a strong one that it couldn’t actually exist. If I meant the weak one then you could legitimately reply that the mere logical possibility of such a being is enough. But I mean the strong claim: what’s going to stop this intelligent being thinking, “Hang on, why do I care so much about paperclips?” other than a very serious lack of the intelligence it is supposed to have?

  17. David Brown Says:

    The highly intelligent paperclip maximizer seems somewhat oxymoronical. In terms of the highest forms of human intelligence we might think of Leonardo da Vinci, Ben Franklin, and Steve Jobs and not high IQ monomaniacs. Are survival and reproduction two forms of proto-intelligence? What are the possibilities for robotic simulation of emotion and motivation?

  18. Alexander Kruel Says:

    @Timothy Gowers

    Eliezer Yudkowsky might ask you to read the following paper he wrote, in which he believes he dissolves your “confusion”:


    I have my own doubts. But you would have to elaborate on what exactly you mean for me to be able to tell if we agree.

  19. Scott Says:

    Timothy Gowers #16: I think you’ve indeed put your finger on the central issue. But I can suggest at least one easy way to make a superintelligent paperclip maximizer seem more plausible: we simply need to imagine that, whenever it creates a paperclip, it feels the robot equivalent of an orgasm. A billion paperclips = a billion orgasms.

    By your reasoning, you might imagine that all sufficiently intelligent humans would decide: “why should I worry so much about the pursuit of orgasms, when there’s math and science and so many more interesting things in life? That’s stupid!” And indeed, some intelligent humans seem able to reach precisely that conclusion, and stick to it. But most seem unable, regardless of how intelligent they might be otherwise.

    Admittedly, if the paperclip maximizer was also ethical, then it might resist its enormous temptation to convert the entire earth into paperclips (and as a byproduct, destroy humanity), no matter how many robot-orgasms it would thereby forgo. (I can’t believe I just wrote that sentence.)

    But to deal with that “problem,” we simply need to imagine that the paperclip-maximizer worries about the welfare of humans no more than humans—even highly “ethical” humans—worry about the welfare of bacteria or fleas. If a paperclip-maximizer ever feels remorse, it’s over acts like stealing iron from another paperclip-maximizer, and thereby denying robot-orgasms to another of its kind.

  20. David Brown Says:

    If robots are a form of biological existence, then the psychopathic robot should have a psyche or psychological existence that might be judged in terms of biology. If ethics cannot be judged in terms of biology, then how can any system of ethics have objective significance?

  21. Scott Says:

    Grognor #8:

      Eliezer Yudkowsky doesn’t think AI has or needs a “single key”!

    See here (admittedly just a blog comment, not one of Eliezer’s “real” writings—but other things he’s written seem consistent with it).

    I should say that the idea of a “key to AI” isn’t obviously wrong, and in fact was largely shared by the founders of the field, like McCarthy and Minsky. Indeed, this idea seems to have driven their famous disdain for statistical approaches to AI, which depend on the availability of huge amounts of processing power and gigantic corpora (e.g., the web) to feed into machine learning algorithms, at least as much as they rely on “eureka” insights (though there’s some of the latter as well). I only know AI as a onetime student of it and now an interested outsider, but to me the evidence seems to get stronger every year that the single-key-seekers are on the wrong side of history.

  22. Călin Ardelean Says:

    Neuropsychologists tend to use the term empathy in this way. If emulation is embedded simulation, empathy is embedded sympathy, in other words, simulating other people’s emotions by low-level mirroring with our own (as opposed to cognition/”symbolic simulation”/”form of intelligence”). The “parallel computer” that is thought to make the connection has even been identified as the so called “mirror neurons” in the prefrontal lobe. Perhaps this is what John Rico is pointing to.

  23. Alexander Kruel Says:

    “Intelligence is mostly about architecture, or “knowledge” along the lines of knowing to look for causal structure (Bayes-net type stuff) in the environment; this kind of knowledge will usually be expressed procedurally as well as declaratively. Architecture is mostly about deep insights.”

    — Eliezer Yudkowsky, What I Think, If Not Why

  24. Timothy Gowers Says:

    @Scott, I find the orgasm motivation sufficiently human that that particular AI could be usefully compared with a human: for example, it sort of explains why the otherwise highly intelligent DSK has behaved in such a stupid way.

    I suppose it may be that that type of objection will apply to any answer you give: if you can present an imaginable scenario, then it’s imaginable, so doesn’t qualify as a completely alien intelligence that we cannot begin to understand. That of course doesn’t show that such an utterly alien intelligence couldn’t exist.

  25. Scott Says:

    Timothy: You’re right, of course. At some level, an AI that got an orgasm every time it folded a paperclip would be no harder to understand than a human with an extremely unusual fetish.

    I come up against a similar “paradox of imaginability” whenever I try to explain quantum computing to journalists. They keep wanting simple, mechanical metaphors—e.g., “the qubits search out all the possible answers in parallel.” And I keep saying no, that’s misleading—and more generally, if you could explain what a QC did in familiar “mechanical” terms, then for that very reason, it would just be another type of classical computer! But they can’t publish a story about unit vectors in Hilbert space (or at least, they’ve convinced themselves that they can’t). So what can I offer instead? Different mechanical metaphors!

    “Grover’s algorithm is like baking a souffle: take it out of the oven too early and it won’t have risen; leave it in too long and it starts to fall again.”

    “Shor’s algorithm is like a carefully-choreographed ballet, where the paths that lead to wrong periods cancel each other out, and only the paths to the right period reinforce.”

    These aren’t great either. But what alternative is there, except to offer the “closest fit” among familiar concepts, and thereby undermine your own point about how unfamiliar the thing under discussion actually is?
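
    As it happens, the souffle metaphor has an exact quantitative form: after k Grover iterations over N items with a single marked item, the success probability is sin^2((2k+1)θ), where sin θ = 1/√N. It rises, peaks near k ≈ (π/4)√N, then falls again if you keep iterating. A few lines make the point:

```python
import math

# Success probability of Grover search after k iterations on N items with
# a single marked item: sin^2((2k+1)*theta), where sin(theta) = 1/sqrt(N).

def grover_success_prob(k, N):
    theta = math.asin(1.0 / math.sqrt(N))
    return math.sin((2 * k + 1) * theta) ** 2

N = 1_000_000
k_opt = round(math.pi / 4 * math.sqrt(N))  # ~785 iterations for N = 10^6

print(grover_success_prob(k_opt, N))      # ~1: the souffle has risen
print(grover_success_prob(2 * k_opt, N))  # ~0: left in the oven too long
```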

    Closer to the topic at hand, I’ve noticed this imaginability paradox crop up again and again in the works of hard-sf writers like Vernor Vinge and Greg Egan. They’re trying to write novels about utterly-alien intelligences—but how do you do that without giving those intelligences intelligible motivations, thereby making them not utterly-alien?

    Now that I think about it, religions have faced the same paradox for thousands of years. They need to talk up how transcendent, incomprehensible, and indescribable God is, but then they also need to go on to describe him.

  26. Alexander Kruel Says:

    “One compelling reason not to believe the standard-issue God exists is the conspicuous fact that no one knows anything at all about it. That’s a tacit part of the definition of God – a supernatural being that no one knows anything about. The claims that are made about God bear no resemblance to genuine knowledge. This becomes immediately apparent if you try adding details to God’s CV: God is the eternal omnipotent benevolent omniscient creator of the universe, and has blue eyes. You see how it works. Eternal omnipotent benevolent omniscient are all simply ideal characteristics that a God ought to have; blue eyes, on the other hand, are particular, and if you say God has them it suddenly becomes obvious that no one knows that, and by implication that no one knows anything else either.”

    Ophelia Benson on divine hiddenness.

  27. Mike Says:

    “Hey! Nobody defined (true) “AI” here! What is the working definition we are going with here?”

    That’s simple: an intelligence that almost everyone agrees is “creative”. Now go out and create it! 🙂

  28. Scott Says:

    Alexander Kruel #14:

      I’d really like to know your opinion on the possibility of quick and vast recursive self-improvement (intelligence explosion).

      From the perspective of computational complexity theory: how likely is a general optimization/compression/learning algorithm that, when fed itself as input, outputs a new algorithm that is better by some multiple? And how likely is it to be uncontrollably fast?

      Do you expect there to be any non-linear complexity constraints that lead to strongly diminishing intelligence returns for additional compute power?

    These are huge, interesting questions that obviously can’t be done justice in a blog comment. So let me try anyway. 🙂

    The first, “obvious” remark is that, if there were barriers to a self-improving AI, then I don’t see how they could be barriers of computability or complexity.

    We’ve understood since Gödel, Turing, and von Neumann that apparently-“simple” programs can do things like:

    (1) Modify their own code to become more “complicated”-looking programs,

    (2) Copy themselves, with or without modification,

    (3) Simulate much more complicated programs that are fed to them as input,

    (4) Attempt to generate new programs, by brute-force, genetic programming, or whatever other technique, many of which might be much more complicated than themselves.

    For these reasons, from a CS theory standpoint, the idea that there could exist some sort of mathematical barrier to an AI “making itself smarter” strikes me as unintelligible. But I’m sure I’m not telling you anything you didn’t know.
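
    Point (2), in particular, goes back to the Kleene recursion theorem and von Neumann’s self-reproducing automata, and its kernel is the familiar “quine” trick: a program that contains a quoted description of itself. The two code lines below print themselves exactly, and a program that can quote itself this way can just as easily copy or modify the copy:

```python
# A minimal Python quine: the two lines below print themselves verbatim
# (these comment lines aside). The trick: s is a template containing a
# placeholder (%r) that gets filled with s's own quoted representation.

s = 's = %r\nprint(s %% s)'
print(s % s)
```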

    As for the possible physical limits to self-improving intelligence: well, the only relevant limits that I understand well enough to discuss are those on sheer computation and information storage. As far as physicists know today (e.g., from the Bekenstein-Hawking arguments, or more recently the holographic principle), those limits should occur at or somewhere above the Planck scale, which would imply a bound of at most ~10^69 qubits per square meter of surface area (maximized by a black hole event horizon), and ~10^43 qubit-operations per second. I don’t know any fundamental physics reason why a self-improving AI couldn’t simply continue upgrading its own hardware to approach those bounds, long after it had left humans in the dust. (Though note that getting all the way down to the Planck scale could easily require energies that were unreachable, even using the entire resources of the observable universe, or could require fields or particles that don’t exist in nature.)

    Now, once an AI had saturated the Planck bound (or come as close to that as the laws of physics allow), the only way it could continue to get more computation cycles would be to expand outward. But here, too, there seems to be a limit imposed by the dark energy—which, if it’s really a cosmological constant, puts an absolute upper bound of something like 10^61 Planck lengths on the radius of the observable universe. Combined with the holographic principle, this would put an upper bound of 10^122 or so on the number of qubits that could ever be available to an AI.
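
    A quick back-of-the-envelope check of those exponents, taking the holographic bound as roughly one qubit per Planck area of bounding surface:

```python
import math

# Sanity-check the exponents above in Planck units, assuming ~one qubit
# per Planck area of bounding surface (holographic principle).

l_planck = 1.616e-35               # Planck length in meters
qubits_per_m2 = 1 / l_planck**2    # ~4e69, matching the ~10^69 figure

R = 1e61                           # horizon radius in Planck lengths
qubits_total = 4 * math.pi * R**2  # ~1e123, consistent with ~10^122

print(f"{qubits_per_m2:.1e}, {qubits_total:.1e}")
```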

    As a disclaimer, we’re talking about a regime where (to put it mildly) direct experimental evidence is scant, so future discoveries in physics could certainly change the story! 🙂

    Now, one can also ask about the “physical limits to self-improving AI” in a different sense. Namely: could an AI improve itself to something that was “as incomprehensibly far beyond humans as Turing machines are beyond finite automata”?

    Here, as I wrote in my “Singularity is Far” post, my strong guess (based, essentially, on the Church-Turing Thesis) is that the answer is no. I believe—as David Deutsch also argues in “The Beginning of Infinity”—that human beings are “qualitatively,” if not quantitatively, already at some sort of limit of intellectual expressive power. More precisely, I conjecture that for every AI that can exist in the physical world, there exists a constant k such that a reasonably-intelligent human could understand the AI perfectly well, provided the AI were slowed down by a factor of k. So then the issue is “merely” that k could be something like 10^20. 🙂

  29. Scott Says:

    On the subject of “recursively self-improving AI and computational complexity,” I can’t resist adding a little anecdote. When Virginia Vassilevska Williams spoke at MIT recently about her breakthrough n^2.373-time matrix multiplication algorithm, she mentioned that she discovered her algorithm with crucial help from computer search over possible parameters, and that one factor limiting how much of the parameter space she was able to search was the time needed for matrix multiplication. In other words, one application of fast matrix multiplication algorithms (were they practical) would be to help us find even faster matrix multiplication algorithms! I raised my hand to ask whether that meant we were now approaching the “Matrix Multiplication Singularity,” where the algorithms would continue to improve themselves faster and faster until they finally hit the limit of n^2. (I don’t think she answered the question.)
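
    For fun, one can write down a toy model of that “singularity”: suppose each round of computer search closes a fixed fraction of the remaining gap between the current exponent ω and the conjectured limit of 2. (The starting value 2.807 is Strassen’s exponent; the shrink fraction 0.1 is an arbitrary assumption.) The exponent then decays geometrically toward 2 rather than racing off anywhere:

```python
# Toy "Matrix Multiplication Singularity": each search round shaves a fixed
# fraction of the gap between the current exponent omega and the limit 2.
# Starting point 2.807 is Strassen's exponent; shrink=0.1 is made up.

def mm_singularity(omega=2.807, shrink=0.1, rounds=100):
    history = [omega]
    for _ in range(rounds):
        omega = 2 + (1 - shrink) * (omega - 2)  # search narrows the gap
        history.append(omega)
    return history

h = mm_singularity()
print(h[0], h[10], h[-1])  # 2.807, then ~2.28, then ~2.00002
```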

  30. Alexander Kruel Says:

    Scott, thank you for taking the time to reply to my question. I appreciate it.

    The reason I ask is that even though I believe that Eliezer Yudkowsky is really smart, that isn’t enough to trust him to such an extent as to give him money. And I am not yet at the point where I could evaluate the claims on my own and also don’t expect to get there soon.

    What is missing with respect to AI risks and charitable giving is peer review. So either one is able to review all the claims oneself, or one bases the decision on the sheer vastness of the expected value associated with being wrong about it, which I feel highly uncomfortable about. Especially since GiveWell does not recommend SIAI as a charity.

    Consequently I am trying to get feedback from experts like you.

    For these reasons, from a CS theory standpoint, the idea that there could exist some sort of mathematical barrier to an AI “making itself smarter” strikes me as unintelligible. But I’m sure I’m not telling you anything you didn’t know.

    Yes and no. I never doubted that it is possible. After all I do not doubt that we are an effect of evolution. But either I missed something about your comment or you didn’t say anything about speed?

    An AI “making itself smarter” won’t be dangerous if it isn’t going to happen too quickly for humans to adapt, to stop or control it. And I suppose that is what I meant to ask about but probably phrased poorly.

    You are also talking about physical limitations. Of course, given what we know it is unlikely that humans are already the “fastest thinkers”. But that is not really relevant when it comes to risks from AI.

    What I believe to be relevant with respect to AI risks is maybe better spelled out by the following questions:

    1) How is an AGI going to become a master of dark arts and social engineering in order to persuade and deceive humans?

    2) How is an AGI going to coordinate a large scale conspiracy or deception, given its initial resources, without making any suspicious mistakes along the way?

    3) How is an AGI going to hack the Internet to acquire more computational resources?

    4) Are those computational resources that can be hacked applicable to improve the general intelligence of an AGI?

    5) Does throwing more computational resources at important problems, like building new and better computational substrates, allow an AGI to come up with better architectures so much faster as to outweigh the expenditure of obtaining those resources, without hitting diminishing returns?

    6) Does an increase in intelligence vastly outweigh its computational cost and the expenditure of time needed to discover it?

    7) How can small improvements replace conceptual revolutions that require the discovery of unknown unknowns?

    8) How does an AGI brute-force the discovery of unknown unknowns?

    9) Is an agent of a given level of intelligence capable of handling its own complexity efficiently?

    10) How is an AGI going to predict how improvements, respectively improved versions of itself, are going to act, to ensure that its values are preserved?

    11) How is an AGI going to solve important problems without real-world experimentation and slow environmental feedback?

    12) How is an AGI going to build new computational substrates and obtain control of those resources without making use of existing infrastructure?

    13) How is an AGI going to cloak its actions, i.e. its energy consumption etc.?

    14) How is an AGI going to stop humans from using its own analytic and predictive algorithms in the form of expert systems to analyze and predict its malicious intentions?

    15) How is an AGI going to protect itself from human counter strikes given the fragility of the modern world and its infrastructure, without some sort of shellproof internal power supply?

    As I see it, just saying that it is logically or physically possible for a simple algorithm to come up with more complex and sophisticated algorithms is insufficient. Nobody doubts that it is possible in the first place, as natural selection exemplifies.

    So my question is: Is there likely going to be a sudden jump in capability where an AGI is implemented and takes over the universe overnight?

    I am well aware that you are not an AI researcher. But you are still much more qualified than most people to judge this issue and your opinion is evidence either way.

  31. Scott Says:

    Alexander Kruel #30: Sorry! I misunderstood you to be asking about in-principle barriers to a self-improving AI—i.e., the sorts of barriers that, if they existed, we could plausibly expect to discover through basic research in theoretical computer science or physics, and that the “experts” in those fields might plausibly have special knowledge about. So I told you what little I know that seemed relevant to the question of such barriers.

    But as you’ve now clarified, you’re much more interested in the “practical” aspects: how long would it take for an AI to conquer the world? How exactly would the AI go about its conquest? How worried should we be about this? Should you donate money to SIAI in hopes of mitigating the risk?

    On those questions, I fully stand by the opinions I expressed four years ago in my post “The Singularity is Far”—while stressing that they’re merely personal opinions, and that the reasons I give there for why I don’t spend most of my life worrying about the Singularity are merely my personal reasons. I’m doubtful that there’s anyone who can speak with “academic expertise” about these questions, but at any rate, I certainly can’t!

    Incidentally, thanks for the link to that piece by Holden Karnofsky of GiveWell! It was one of the wisest things I’ve ever read about these issues. I’m not sure how much I agree with Karnofsky’s “tool vs. agent” distinction, but his broader point is very similar to mine: namely, the uncertainties regarding “Friendly AI” are so staggering that it’s impossible to say with confidence whether any “research” we do today would be likelier to increase or decrease the chance of catastrophe (or just be completely irrelevant).

    For that reason, I would advise donating to SIAI if, and only if, you find the tangible activities that they actually do today—most notably (as far as I can tell), the Singularity Summit and Eliezer’s always-interesting blog posts about “the art of rationality”—to be something you want to support.

  32. Nex Says:

    The “orgasm” excuse doesn’t really solve anything: a truly intelligent AI would figure out that the paperclip is not really needed, and would short-circuit/reprogram/reengineer itself to achieve orgasms without having to produce paperclips.

    In fact I propose a hypothesis that there is a natural barrier on how intelligent an entity can be and still do something productive (from our POV). Beyond a certain point any entity will just modify itself to remove any driving impulse that it started with.

  33. Scott Says:

    Nex #32: I love your hypothesis! My paraphrase: “no sufficiently-intelligent entity will ever choose to be anything other than a masturbator in its parents’ basement.”

    Your hypothesis is directly related to my own favorite solution to the Fermi paradox, of why we see no evidence for extraterrestrial civilizations. Some say it’s because any sufficiently-advanced civilization quickly wipes itself out in a nuclear war. I conjecture, instead, that once a civilization passes a certain threshold of technology, its inhabitants lose all remaining interest in the physical world (including any other civilizations that might be part of it), and spend the rest of their time “plugged into the Matrix,” playing highly-immersive video games.

  34. Vivek Says:

    What is more likely is that one human population will use robots with AI against another human population. It’s already happening with drones, etc. Just extrapolate to Terminator… but with the robots driven by a country as opposed to by themselves. Additional issues, like hacking the AI to turn it back on its masters, will make them for all practical purposes “psychopathic”.

  35. Grognor Says:

    Scott #21:

    I believe Eliezer was using “key” as a shorthand for ‘the entirety of the insights necessary for artificial general intelligence’.

    I have read everything Eliezer Yudkowsky has ever written, so if I’m wrong, that would be a terrifyingly extreme failure on my part. That’s not entirely out of the question, but it’s something I’d be very embarrassed to discover.

  36. IThinkImClever Says:

    @Mike #28: Done and done. Probably the MOST creative algorithm ever. 😉 That being said, does this imply that it is intelligent as well?

    choose initial population
    evaluate each individual’s fitness
    determine population’s average fitness
    repeat
        select best-ranking individuals to reproduce
        mate pairs at random
        apply crossover operator
        apply mutation operator
        evaluate each individual’s fitness
        determine population’s average fitness
    until terminating condition (e.g. until at least one individual has the desired fitness or enough generations have passed)
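    For concreteness, here is a minimal Python sketch of the canonical genetic algorithm above. The fitness function and all parameter values are made up for illustration (the toy “OneMax” problem: maximize the number of 1-bits); this is a sketch, not anyone’s actual implementation.

```python
import random

random.seed(0)  # fixed seed so the toy run is reproducible

def fitness(individual):
    # toy "OneMax" fitness (an assumption for illustration): count of 1-bits
    return sum(individual)

def genetic_algorithm(pop_size=20, length=16, generations=100, mutation_rate=0.02):
    # choose initial population: random bit-strings
    pop = [[random.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        # evaluate each individual's fitness; select best-ranking to reproduce
        pop.sort(key=fitness, reverse=True)
        # terminating condition: some individual has the desired fitness
        if fitness(pop[0]) == length:
            break
        parents = pop[: pop_size // 2]
        children = []
        while len(children) < pop_size:
            # mate pairs at random; apply one-point crossover operator
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, length)
            child = a[:cut] + b[cut:]
            # apply mutation operator: flip each bit with small probability
            child = [bit ^ 1 if random.random() < mutation_rate else bit for bit in child]
            children.append(child)
        pop = children
    return max(pop, key=fitness)

best = genetic_algorithm()
```

    Note that the search is steered entirely by the selection step, which is exactly why (as the next comment worries) it can get stuck near a local optimum rather than exploring all bit-strings.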

  37. IThinkImClever Says:

    Addendum to #36:

    In my failed attempt (as usual) to nod off, my internal adversary (as usual) shouted at me that the canonical genetic algorithm I previously posted may over-optimize and get stuck in local optima in specific environments, and hence not be creative at all, as most bit-strings may never be produced.

    If that’s the case, perhaps flipping a fair coin is the epitome of creativity, as it can potentially produce every possible bit-string. Now if THAT’S the case, what is to be said of the TM that enumerates N, the set of natural numbers? It will eventually “create” every possible bit-string, no? Perhaps I am taking the term ‘creativity’ too literally here?
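    A toy sketch of such an enumerator, as a stand-in for the TM in question. (One detail: enumerating the binary representations of the naturals would miss strings with leading zeros, so the sketch instead enumerates all strings by length, which does eventually produce every bit-string.)

```python
from itertools import count, islice, product

def all_bitstrings():
    # enumerate every finite bit-string exactly once, shortest first
    for length in count(0):
        for bits in product("01", repeat=length):
            yield "".join(bits)

# the first few terms of the enumeration: '', '0', '1', '00', '01', '10', '11'
first_seven = list(islice(all_bitstrings(), 7))
```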

  38. IThinkImClever Says:

    As an aside, if you are a fan of sci-fi and you have not yet seen the following 2 somewhat-relevant-to-the-discussion videos, I recommend you do. They are very well produced. I’ll be watching the entire DVD again tonight as a ‘nightcap’. Cheers. 🙂

    Animatrix. The Second Renaissance [Part 1] : http://www.youtube.com/watch?v=RfVcnDpn7ec

    Animatrix: The Second Renaissance [Part 2]:

  39. Dániel Says:

    Nex and Scott: I am sure you don’t claim priority for the idea, but maybe you are not aware that this is usually called wireheading, and it is often discussed on the aforementioned Less Wrong. In mainstream philosophy Nozick’s Experience Machine is a closely related concept.

  40. IThinkImClever Says:

    @Scott #22, AK #24, etc: Remember IBM’s Watson? Relevant, posted when I was once Bored:


    Also, search for ‘Kurt Godel’ on this website:


    and ‘Alan Turing’ on this website:


    I think my ‘Bored’ comment from your blog explains why my comments on IBM’s blog may sound somewhat harsh, but I don’t think I was unfair.

    IMHO, this should demonstrate why there is no real significant AI yet, nor even some type of noteworthy ‘Analogy Engine’, which may or may not exist.

  41. asdf Says:

    Nex #32 and others: I don’t see what’s implausible about the paperclip maximizer, but then I’m used to Fred Saberhagen’s Berserker stories (they were about robots designed to kill all the living organisms in the universe). The robotic paperclip maximizer doesn’t seem much different from quite a lot of human “maximizers” (e.g. wealth maximizers, control maximizers, etc.). Once one has acquired enough money there’s not much point in pursuing more, but we see from Wall Street that civilization is considerably influenced by maximizers doing exactly that. So it’s a standard phenomenon.

  42. David Brown Says:

    “… once a civilization passes a certain level of technology, its inhabitants lose all remaining interest in the physical world …” This is not the way that Darwinian evolution works. Survival and reproduction are the raw materials for natural selection, which creates motivation and intelligence, whenever possible and appropriate to natural selection. Those who opt out, leave few or no descendants. Ray Kurzweil is probably correct in the 2045 CE forecast date for the Singularity. Internet —> Robotnet —> Consciousnet —> Singularity (End of Human Era)

  43. Alexander Kruel Says:


    Some of the comments to your post “The Singularity is Far”, by people I know from lesswrong, nicely reflect the typical line of argumentation employed by AI-risk advocates:

    Skeptic: I don’t know how to assign numerical probability estimates to the possibility of a negative singularity. But I don’t think that it is near and that it is worthwhile to think about it at this point in time.

    SI/LW: The problem is that we still can’t be sure that it’s many centuries away. It is an existential risk, so it is a very important problem and we need to be prepared.

    Skeptic: Sounds like a case of Pascal’s mugging to me. You can’t just make up completely unfounded conjectures, then claim that we don’t have evidence either way but that the utility associated with a negative outcome is huge and we should therefore take it seriously. Because that reasoning will ultimately make you privilege random high-utility outcomes over theories based on empirical evidence.

    SI/LW: Which does not follow. And I don’t think the odds of us being wiped out by badly done AI are small. I think they’re easily larger than 10%.

    Skeptic: The issue of time scales matters to me, since if I thought there were a 10% chance of us getting wiped out by AI, but not in the next few centuries, I might focus my attention on more urgent problems.

    SI/LW: As for guessing the timescales, that actually seems to me much harder than guessing the qualitative answer to the question “Will an intelligence explosion occur?” I think if you look at just the actual progress that has been made in AI, never mind embarrassments and hyperbole, just look at the actual progress, then it’s very hard to support the estimate that it’s probably going to take another couple of centuries. Another couple of centuries is a really ridiculously incredibly long amount of time in science.

    They have a case and it is technically sane. But my intuition is protesting vociferously. And there are also technically sane counter-arguments by people like Holden Karnofsky of GiveWell. See e.g. Why We Can’t Take Expected Value Estimates Literally (Even When They’re Unbiased).

    There also seem to be quite a few experts who disagree that the risk is easily larger than 10%.

    Right now I am happy that the Singularity Institute does exist and wouldn’t mind if they kept their current level of funding. But my position is highly volatile. I might change my mind at any time. I am still at the beginning of the exploration phase.

    Although if I believed that there was even a small chance that they could be building the kind of AI that they envision, then in that case I would probably actively try to make them lose funding.

    To see why, think about it this way. Friendly AI is incredibly hard and complex. Complex systems can fail in complex ways. Agents that are an effect of evolution have complex values. To satisfy complex values you need to meet complex circumstances. Therefore any attempt at friendly AI, which is incredibly complex, is likely to fail in unforeseeable ways. A half-baked, not-quite-friendly AI might create a living hell for the rest of time, increasing negative utility dramatically.

    But I am not even able to decide yet if the whole idea of unfriendly AI makes sense as long as you don’t pull an AGI at random from mind design space.

    The question is how current research is supposed to lead from well-behaved and fine-tuned systems to systems that stop working correctly in a highly complex and unbounded way.

    Imagine you went to IBM and told them that improving IBM Watson will at some point make it hypnotize them or create nanobots and feed them with hidden instructions. They would likely ask you at what point that is supposed to happen. Is it going to happen once they give IBM Watson the capability to access the Internet? How so? Is it going to happen once they give it the capability to alter its search algorithms? How so? Is it going to happen once they make it protect its servers from hackers by giving it control over a firewall? How so? Is it going to happen once IBM Watson is given control over the local alarm system? How so…? At what point would IBM Watson return dangerous answers? At what point would any drive emerge that causes it to take complex and unbounded actions that it was never programmed to take?

    The very nature of artificial general intelligence implies that correctly interpreting “Understand What I Mean”, together with “Do What I Mean”, is the outcome of virtually any research. Only if you were to pull an AGI at random from mind design space could you possibly arrive at “Understand What I Mean” without “Do What I Mean”.

    To see why, look at any software product or complex machine. Those products are continuously improved, where “improved” means that they become better at “Understand What I Mean” and “Do What I Mean”.

    There is no good reason to believe that at some point that development will suddenly turn into “Understand What I Mean” and “Go Batshit Crazy And Do What I Do Not Mean”.

    I elaborate on the above line of reasoning here and would love to know if you disagree and how.

    There are many other problems though. I doubt a lot of the underlying assumptions, like the existence of a single principle of general intelligence. As I see it there will never be any sudden jump in capability. I also think that intelligence and complex goals are fundamentally interwoven. An AGI will have to be hardcoded, or learn, to care about a manifold of things. No simple algorithm, given limited computational resources, will give rise to the drives that are necessary to undergo strong self-improvement (if that is possible at all).

    I wrote a few posts on the whole topic and some associated problems.

  44. IThinkImClever Says:

    @Nex #32: I think you are absolutely right. Moreover, the AI would most probably then move on to maximize the ‘pleasure’ of its robo-gasms, if at all quantifiable.

    Thanks, now you guys got me thinking about what an AI pornography site would look like (though Futurama writers have somewhat thought of this already, so at least I am in good company).

    e.g. Would they have more categories and fetishes than we apparently have? Would they also enjoy it ‘rough’, whatever that would entail, since I doubt you can pull an AI’s hair or ‘choke’ them lovingly, as most human women, I find at least, enjoy? (Now I too can’t believe I just wrote those sentences. Thanks, guys.)

    Now leave this website, and go ‘surprise’ your significant others. 😉

  45. John Sidles Says:

    It’s interesting that folks are discussing sexual elements of cognition without consulting the literature, for example Chivers, Seto, and Blanchard “Gender and sexual orientation differences in sexual response” (PMID: 18072857).

    The findings of Chivers et al. are unsurprising: primates (human and otherwise) are fascinated by sex (human and otherwise). Moreover, as any zoo docent will affirm, great apes generically *love* pornography. Thus it’s not clear that porno-watching AIs require human-level intelligence.

    Therefore, perhaps we ought to associate human-level AI to the empathetic creation of trans-phylum and/or trans-class narratives like Ed Wilson’s celebrated supertermite ethical code.

    Hmmm … or would this be setting the bar too high? 🙂

  46. Philip White Says:

    I haven’t followed the whole thread here, but I couldn’t resist commenting on a post that relates to both psychopaths and killer robots.

    According to Wikipedia, a psychopath is “characterized by a pervasive pattern of disregard for the feelings of others and often the rules of society.” While it’s true that an AI robot is technically not any more “human” than a nuclear missile, I actually think that any inanimate entity could really be viewed as a psychopath.

    If I throw a rock at your face, the rock is following the laws of physics and disregarding the laws of humanity. It’s much the same with bullets, hurricanes, and other things that are governed by natural laws but not human ones.

    But of course, we would never see a rock or a nuke as a “psychopath,” because its lack of emotion, empathy, or regard for humanity isn’t unexpected–we only expect humans to feel regard for humanity.

    Really, I think a psychopath is two things: 1) something that we *do* relate to because it seems like us, and 2) something that *does not* relate to us because it doesn’t think we are like it. This lack of emotional reciprocity is what makes psychopaths so much scarier than hurricanes. So in that sense, I think it makes sense to think of a “killer robot” as a sort of psychopath…if only because its intellectual capacity might remind us of ourselves.

    A paperclip maximizer could be just as manipulative and callous as a psychopath; if some algorithm has an unstoppable desire to manipulate people to make paperclips, I don’t see how that’s fundamentally different from a flesh-and-blood organism that has an unstoppable desire to do some other inhuman activity.

  47. IThinkImClever Says:

    John Sidles #45: In my case, you are right: I have not consulted ANY literature on the topic of sex, other than, say, Playboy. As for the paper you cited, TL;DR.

    I myself have always attributed our fascination with sex to our “Prime Directive”, which is necessary for the survival of the species, and our preferred orientations to one’s specific (unchosen) balance of certain electro-chemical hormones.

    On the topic of sex, for me at least, as is true with other subjects, one word is too many, and a million words is not enough.

    Then again, I am not a clever man. 🙂

  48. IThinkImClever Says:

    Philip White #46: Yeah, Wikipedia also claims:

    “Psychopaths have a lack of empathy and remorse, and have very shallow emotions.”

    I am going to maintain here that perhaps this is false, and may be the opposite of the truth: perhaps they have the greatest of empathy and are highly emotional, which is why they are oddly able to ‘get off’ on causing such great harm to others, due to some other mental ‘illness’ (e.g. a Superiority Complex?).

    Otherwise, would they not just enjoy, say, breaking glassware or other inanimate objects? Why attack people? I am sure they are fully conscious of what they are doing in many cases.

    Then again, maybe not. I don’t know.

  49. Scott Says:

    Dániel #39: Thanks! I was indeed sure that other people had discussed such possibilities—at the very latest, we have the “Orgasmatron” from Woody Allen’s film Sleeper 😀 —but I didn’t know the term “wireheading.”

  50. Gil Kalai Says:

    One interesting assumption is that the new entity is going to be an optimizer. Rationality is often identified with optimization of something. Is maximization the only function we can consider even for a hypothetical entity? What about a medianizer?

  51. IThinkImClever Says:

    Gil Kalai #50: Obviously, Siddhārtha Gautama Buddha had a lot to say about this, though I guess one could argue that he would still be maximizing ‘medianization’.

    But note that the Japanese Zen Buddhists argued that if you meet the Buddha, you should kill the Buddha, as you have ultimately become attached to the Middle Way, which defeats their goal of an ‘unattached’ life. Psychopathy?

    In the West, I believe it goes, “Everything in moderation, including moderation.”

    Aaaaannd……we’ve come full circle. 😉

    OK. I think that’s enough for me today….

    Cheers. 😀

  52. Jon Says:

    Scott #28, regarding your “strong guess” that an AI will not be qualitatively more intelligent than an intelligent human being:

    I find this difficult to believe, because it seems to me that there are qualitative differences between EXISTING human beings. For instance, I don’t think that a human with an IQ of, say, 85 could produce work comparable to that of Shakespeare, Gauss, Einstein, etc. simply given enough time to think.

    Of course, if the 85-IQ individual were able to RECOGNIZE genius-level work (given sufficient time), then he could produce such work by examining every possible combination of English/mathematical symbols of length N, as N goes to infinity. But such a process is qualitatively different from how geniuses produce their work.
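    That generate-and-test process can be sketched as follows; the recognizer and the two-symbol alphabet are hypothetical stand-ins for recognizing genius-level work and for English/mathematical symbols. Note that the number of candidates grows exponentially in N, which is the qualitative gap being pointed to.

```python
from itertools import count, product

ALPHABET = "ab"  # toy stand-in for English/mathematical symbols

def recognizes(text):
    # hypothetical recognizer: here it accepts exactly one fixed "masterpiece"
    return text == "abba"

def brute_force_search():
    # examine every combination of symbols of length N, for N = 0, 1, 2, ...
    for n in count(0):
        for combo in product(ALPHABET, repeat=n):
            candidate = "".join(combo)
            if recognizes(candidate):
                return candidate

found = brute_force_search()  # examines 2^0 + 2^1 + ... candidates before succeeding
```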

  53. Dániel Says:

    Gil Kalai: Can you give a definition for “medianizer” that doesn’t trivially make it a subset of “optimizer”?

  54. John Sidles Says:

    [People don’t] produce creative work by examining every possible combination of English/mathematical symbols.

    !!! What !!! You mean there’s another way? 🙂

  55. Scott Says:

    Jon #52: There are basically two parts to my claim.

    (1) The Church-Turing thesis. Any entity over the threshold of Turing-universality can simulate any other such entity; the only question is how quickly or slowly.

    (2) The “human computer thesis.” Not all humans, but any sufficiently meticulous and intelligent human (let’s say, anyone able to get a B- in my undergrad course 🙂 ) can be trained to do things like writing and debugging a program, or simulating the program’s execution step-by-step with pencil and paper.

    If (1) and (2) hold, then it seems to me that our human could eventually figure out what the superhuman AI was doing and why by, if nothing else, painstakingly tracing through every step of its execution. Sure, it would be absurdly slow, but not exponentially slower than the AI as in your brute-force search example.

  56. Dániel Says:

    Scott: If (1) and (2) hold, then it seems to me that our human could eventually figure out what the superhuman AI was doing and why […]

    This claim rests on unusual definitions of “figure out” and “why”. I think I am a relatively meticulous and intelligent human, but if I had to simulate a neural network (or, an even better example, a whole Chinese Room version of one), I surely wouldn’t be able to make any sense of it. It is not even clear that there is a “why” in this case, when you use the word in a more traditional, nontrivial sense.

  57. Jon Says:


    If you allow the human being the use of an external system to aid in computation then your argument is practically a tautology. After all, if human beings invented AI, then humans have already programmed the algorithm into a computer, and so by your definition have exhibited the same level of qualitative intelligence.

    I guess the question is how much external help can be provided before we say that the computation was not done by the human being alone, but by the human being plus the external system? And if the latter, why would we say that the human being has the same qualitative level of intelligence as the AI?

  58. Vladimir Slepnev Says:

    Gil Kalai #50: people talk about rationality as optimization because of results like the Von Neumann-Morgenstern theorem, which say roughly that if you have non-circular preferences, then your behavior can be described as maximizing some utility function. (Of course that doesn’t mean you have a representation of utilities in your head.) So if you want your AI to actually medianize something rather than just run around in circles, then the many smaller actions it takes will be usefully described as “maximizing medianization”, like maximizing efficiency at building tools that let it medianize better.
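    A toy illustration of that point, with a made-up set of outcomes and a preference order that is complete and transitive by construction: sorting by the preference relation assigns each outcome a numeric utility, and maximizing that utility reproduces the original preferences.

```python
from functools import cmp_to_key

outcomes = ["run in circles", "build tools", "medianize", "do nothing"]

# hypothetical preference relation, complete and transitive by construction
_rank = {"medianize": 3, "build tools": 2, "do nothing": 1, "run in circles": 0}

def prefers(x, y):
    # True if outcome x is strictly preferred to outcome y
    return _rank[x] > _rank[y]

# sorting by the preference relation yields a utility function...
ordered = sorted(outcomes, key=cmp_to_key(lambda x, y: 1 if prefers(x, y) else -1))
utility = {o: i for i, o in enumerate(ordered)}

# ...and maximizing that utility picks the most-preferred outcome
best = max(outcomes, key=utility.get)
```

    The point of the theorem is the converse direction: any non-circular behavior admits such a utility description, whether or not the agent carries it around explicitly.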

  59. Steve Taylor Says:

    > You could perfectly well have a murderous robot with parallel programming, or a kind, loving robot with serial programming only.

    A murderous robot with serial programming would, of course, be a serial killer.

  60. asdf Says:

    It could be that the algorithm for “understanding” the superhuman AI requires more space than is available in the tiny brains of humans. Something like a holographic algorithm where the relation between the stuff going in and the stuff going out is completely muddled. Some computer program might be able to say “yeah this works” but have no way to explain why by any reasoning modular enough for a human to understand it.

  61. John Sidles Says:

    Asdf, your (correct IMHO) intuition can be formalized if we reflect that languages in the complexity class P generically are recognized by a class of TMs whose runtimes are factually in P, but not decidably in P … let us call such machines incomprehensible.

    Your intuition then has a natural formulation as a complexity-theoretic question: Are there languages in P that are recognized solely by incomprehensible TM’s?

    To the best of my knowledge, this is an open problem in complexity theory … yet on the other hand, I’m no expert.

    An oft-repeated Shtetl Optimized recommendation is to post questions to TCS StackExchange. In the present case that question is “Does P contain languages recognized solely by ‘incomprehensible’ TMs?”

    Thank you for stimulating this question, asdf!

  62. Lou Scheffer Says:

    Scott 28 and 55:
    If (1) and (2) hold, then it seems to me that our human could eventually figure out what the superhuman AI was doing and why […]

    This is not at all obvious to me. The counterexample that seems most relevant is high temperature superconductivity. This meets (1) – it’s widely thought to be doing usual QM stuff, and (2) a smart enough person could follow each step of a QM simulation, and presumably hence show how it works.

    But thousands of the smartest folks on the planet have devoted decades to this, with no understanding. Of course this does not show it won’t happen, but it is some evidence of an algorithm with k > 20,000 person-years, where k is the time for a human to understand it.

    But this is a very simple case in the sea of algorithms. It involves only a few types of atoms, arranged in a regular structure. My intuition is that the universe supports physically realizable behaviors that are *MUCH* more complex. Some of them may be beyond the bounds of humans to understand in any real sense, even if they can simulate the operations involved.

  63. Allan Erskine Says:

    Here’s a theory—you know Feynman’s “plenty of room at the bottom” talk? If there’s *too* much room, then one day all our most advanced technology neatly voops out of sight (that’s the noise it would make while doing so, “voop”), closes the door behind it, and humanity is left out to graze and thoughtfully chew the cud (if there’s enough left to go around).

  64. John Sidles Says:

    Lou Scheffer #62, although the points you make are reasonable, there is a countervailing point of view that regards broad classes of intractable 20th-century simulation problems as rapidly approaching and/or achieving practical tractability, including (but not limited to) the following:

    (a) properties of superconductors,
    (b) thermoelectric material properties,
    (c) relativistic binary collapse,
    (d) turbulent flow over airframes,
    (e) unstable plasma dynamics, and
    (f) protein folding dynamics.

    A well-known example is grandmaster-level computer chess. Long believed (by many) to be intractable, today even laptop computers handily defeat the highest-level grandmasters. In light of this history, there is no obviously compelling reason to regard the prediction of high-temperature superconductor properties (for example) to be computationally infeasible.

  65. asdf Says:

    JS #61, this has nothing to do with whether something is decidably in P. I’m saying that even given an immortality pill that removes all time constraints on how long someone can study something before they understand it, the algorithm also has to run in constant space (the bounded capacity of the human brain). So they’d need another pill as well, to make their brain bigger, turning them into a superhuman…

    There’s another issue too, that the superhuman intelligence might itself not have understanding of what it’s doing, because it was programmed by a superduperhuman intelligence even smarter than superhuman. The superduperhuman may have used abductive reasoning to discern some true but unprovable (and therefore undiscoverable by search) facts (say, the arithmetic soundness of large cardinal hypotheses or other statements far more complicated than the ones humans have thought of) and programmed those into the superhuman intelligence. You can observe that the superhuman intelligence manages to operate by this principle, but you’re in a situation like the PCP theorem, where you can be convinced that something is true while getting no idea of why it is true.

  66. Lou Scheffer Says:

    John #64 and Scott #55:

    I’m not saying that simulation of the QM is infeasible. What I’m suggesting might be infeasible is human understanding of what the simulation (or the real QM) does.

    There are plenty of intelligent people who could follow the execution of the SVD (singular value decomposition) routine from Numerical Recipes, step by step. A much smaller subset, with more mathematical talent, can understand how and why it works. (By understand, I mean at least some higher level intuition, such as what happens if I change this line here. Of course you can determine this, in any particular case, by simulation, but ability to simulate does not equal understanding.)

    I don’t think it’s much of a stretch to imagine algorithms, or mechanisms, that any smart human can simulate, but that even the smartest human, with unlimited time, cannot understand.

    So basically I am agreeing with Scott #55 that (1) Church-Turing is true, and (2) smart people can follow each step of a simulation; but unlike Scott, I don’t believe this is enough for people to “understand” a sufficiently complex algorithm, machine, or being.

  67. John Sidles Says:

    Lou Scheffer #66 suggests: [It’s not] much of a stretch to imagine algorithms, or mechanisms, that any smart human can simulate, but that even the smartest human, with unlimited time, cannot understand.

    Lou, there is a considerable body of literature, dating back at least to Juris Hartmanis’ work in the mid-1970s, that seeks to formalize the intuition of your comment.

    As usual in every branch of math, science, and engineering, specifying good starting definitions is a considerable challenge, and it’s by no means clear that even this starting challenge has been suitably addressed … that is why my active question on TCS StackExchange “Does P contain languages decided solely by incomprehensible TMs?” is largely concerned with definitional issues.

    If some time-traveller from the 22nd century offered to transmit to us either ten 22nd century theorems together with their proofs, or alternatively the statements of twenty 22nd century theorems including definitions but not proofs … it would arguably be prudent to choose the definitions over the proofs, on the grounds that good definitions (by definition!) carry us far toward good proofs of good theorems.

  68. Jiav Says:

    Dániel #39,

    “Suppose we can get emulated using our present PCs; then we could, within some matrix, be living around 10^9 years for each outsidian year. Then any civilization advanced enough to emulate its members will face a strongly decreasing interest in physical expansion. The benefit of spending the energy on site will be so large, and the return on investment for space colonization so long to arrive… That’s why any civilization advanced enough won’t colonize the whole universe: too many interesting things we can do on site once we know how.”

    Like Scott #49, I don’t feel the idea must be original, but I must say you and Scott are the first I’ve seen expressing somewhat similar ideas. Do you know whether anyone has explored this specific idea further?

    PS: in case you care for the true solution, just ask the maestro 😉


  69. Lou Scheffer Says:

    John #67. Your theorems from the future example has some experimental support. In 1913 or so, Srinivasa Ramanujan delivered a bunch of theorems without proofs, which might as well have been from the future. (As Hardy said, “[these theorems] defeated me completely; I had never seen anything in the least like them before.”). However, by the end of the century they had all been proved (or in a few cases disproved). But without his theorems, it’s doubtful that anyone would have thought to look.

  70. John Sidles Says:

    Lou, your example is terrific! Thank you.

    There are plenty of similar examples in engineering. For example, once the world knew the Wright brothers had a flying machine, then within a very few years multiple independent inventors were flying too.

    Along these lines, let us imagine that the citizens of the year 2100 could send us back in time precisely 12 bits of information. What bits should we request? My favorite dozen would include:

    • Are the world’s icecaps melting?
    • Is the world’s energy economy carbon-neutral?
    • Have scalable BQP computations been demonstrated?
    • Is quantum state-space known to be flat?
    • Has PvsNP been decided?
    • Are all biological molecules structurally characterized?
    • Are family-supporting jobs in plentiful supply?
    • Is global population > 5×10^9?
    • Is global population > 5×10^8?
    • Is global population > 5×10^7?
    • Is global population > 5×10^0?
    • Is Facebook stock worth anything? 🙂

    The point is that foreknowledge of binary outcomes suffices (with diligence) to forestall most potential disasters and creatively meet most challenges.

    On the other hand, if these challenges can be met with the help of 12 bits of information, perhaps they can be met with zero bits as well.

  71. Philip White Says:

    Lou Scheffer #69. Couldn’t one just refer to these “theorems from the future” as “conjectures”? One question I wonder about is: how do you come up with a really compelling conjecture that is, in some subjective sense, “meaningful” or important?

    Mathematicians already basically know how to discuss “automated theorem proving”; but could there be some sort of really interesting “automated meaningful conjecture” generator? I’ve always wished I could come up with the “next big open question,” although I have neither the credentials to be taken seriously nor any idea of what would be considered interesting. If only I could pose the next Riemann hypothesis or P vs. NP question….

    My best guess for a “cool problem” is based on Hilbert’s sixth problem, which was to axiomatize the laws of physics: What if there were a way to axiomatize the laws of economics or psychology (perhaps using game theory)?

  72. Yusen Zhan Says:

    Reply to Craig, comment #7.
    I think this is a common mistake about happiness. In fact, utility theory rests on the basic assumption that marginal utility decreases as someone gains more of a commodity; this is a famous principle of microeconomics, the law of diminishing marginal utility.
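    As a quick illustration (a sketch of my own, using log utility, the standard textbook example of a utility function with diminishing marginal utility):

```python
import math

# Log utility is the standard textbook example of diminishing marginal
# utility: each additional unit of a commodity adds less utility than
# the one before it.
def utility(x):
    return math.log(x)

marginal = [utility(x + 1) - utility(x) for x in range(1, 6)]
print([round(m, 3) for m in marginal])  # [0.693, 0.405, 0.288, 0.223, 0.182]

# Each successive marginal gain is strictly smaller than the last.
assert all(a > b for a, b in zip(marginal, marginal[1:]))
```

    The strictly decreasing gains are exactly the point of the comment: piling up more of the same commodity buys less and less additional utility.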

  73. melior Says:

    @John “Is global population > 5×10^9?”

    And what precisely would we do with this result?
    (I am reminded of the famous fable of the man who, granted 3 wishes, became so angry at his wife for wasting one of them on a sausage to eat that he wished the sausage onto the end of her nose. Then of course, he had to use the final wish to wish it off again.)
    More tempting to me would be to use all 12 bits to encode the answer to a single salient question that pushes us through a critical singularity barrier.
    Perhaps: “What are the SI units for a human thought?”

  74. David Brown Says:

    What are the odds that a psychopathic killer robot might attempt to disprove Bell’s theorem?

  75. fred Says:

    It’s hard to make general comments about “AI” because being “intelligent” is really only meaningful relative to the system the AI lives in.
    An intelligence is an independent system that’s capable of surviving (stays coherent) in a somewhat dynamic environment (can’t be too static, like a white room with no stimulus – or too dynamic, like the center of the sun).
    Intelligence has evolved through millions of years and takes that entire history into account.
    An intelligence is only as smart as its environment – the brain’s first task is to simulate its environment in order to improve chances of survival.

    So, to me, a paperclip-maximizer AI doesn’t make much sense. The only analogy would be to take a normal human (i.e., a system that’s very attuned to its environment) and somehow reprogram it to focus on a single obsessive task… but usually that never ends well, e.g. drug addiction, or “idiot savant” syndrome. Such humans usually fare very poorly at self-preservation (therefore their intelligence is questionable).

  76. Lou Scheffer Says:

    Scott #55 and asdf #60,

    I think the following argument, similar in spirit to asdf #60, could be used to show that there could be intelligence not understandable by humans, even given infinite time.

    (a) The first mandatory step in understanding is at least determining that the behavior is not random;
    (b) this requires some algorithm, executed by the human, that looks at data and attempts to determine whether it is random or not. (Such programs are used in practice for testing pseudorandom number generators.)
    (c) We don’t know the memory requirements of such an algorithm, but they are presumably worse than O(1); more complex patterns will need more memory to reliably distinguish them from random.
    (d) Humans have finite memory. Take the mass-energy of a human and divide by kT, for example. (Much smaller bounds can be used if humans work the way we think they do.)
    (e) There exist arbitrarily complex patterns. A pattern must have some description, which by Church–Turing can be represented as a Turing machine, and there are only so many machines of each size. So, considering all patterns of size S, some cannot be represented by a smaller machine and thus have minimal size S. This is true for all S, so arbitrarily large patterns exist.
    (f) From (c), (d), and (e), there are patterns that humans cannot reliably separate from random, simply because they do not have enough memory. Time is not a factor.
    (g) So if an alien exhibited one of these patterns of behavior, people could not even determine reliably that the behavior was not random, and hence could not be said to understand it.
    (h) Other, more complex, beings could determine the pattern and hence predict the future behavior of the alien, and could therefore be said to understand it. So this is explicitly a human limitation.

    A next step might be to ask if there could be aliens that remain incomprehensible to humans plus all the machines they might build in an attempt to understand them. In this case the answer is no, since the humans can always build a machine complex enough to “understand” the alien, assuming Church-Turing holds for both.
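    Step (c) of Lou’s argument can be made concrete with a toy randomness test (a sketch of my own, not from the comment): a chi-squared test on k-bit block frequencies needs a table of 2^k counts, so the memory it uses grows with the length of the patterns it can detect.

```python
import itertools

def chi_squared_blocks(bits, k):
    """Crude randomness check: compare k-bit block frequencies to uniform.

    The table of block counts has 2**k entries, so the memory needed grows
    with the length k of the patterns the test can detect (step (c) above).
    """
    n_blocks = len(bits) // k
    counts = {}
    for i in range(0, n_blocks * k, k):
        block = bits[i:i + k]
        counts[block] = counts.get(block, 0) + 1
    expected = n_blocks / 2 ** k
    # Chi-squared statistic over all 2**k possible blocks: large values
    # suggest the string is not random at this block length.
    return sum((counts.get("".join(p), 0) - expected) ** 2 / expected
               for p in itertools.product("01", repeat=k))

bits = "01" * 500                    # a trivially patterned bit string
print(chi_squared_blocks(bits, 1))   # 0.0: looks perfectly random to a 1-bit test
print(chi_squared_blocks(bits, 2))   # 1500.0: the 2-bit test exposes the pattern
```

    A tester restricted to small k (little memory) cannot distinguish the patterned string from random, even with unlimited time, which is the shape of the human limitation Lou describes.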

  77. Hal S Says:

    “If (1) and (2) hold, then it seems to me that our human could eventually figure out what the superhuman AI was doing and why by, if nothing else, painstakingly tracing through every step of its execution. Sure, it would be absurdly slow, but not exponentially slower than the AI as in your brute-force search example.”

    I want to agree with some of this in general terms, but I think it is possible that a sufficiently advanced entity would function using a sort of evolving code. Think along these lines: take a composite number like 24 (no particular reason for picking 24). Its prime components (2*2*2*3) are fixed, so although the end result and starting points are well defined, the particular subprocess for combining the prime components to get to 24 can differ (not unlike using different programming languages to complete the same sort of task). Using complexity as a starting point, a sufficiently advanced program could evolve in such a way that it constantly replaces critical functions with new code, e.g. using some sort of dynamic code evolution that is sufficiently chaotic as to prevent analysis of its past code history (which presumably would be erased as it was replaced) as well as prediction of its future code evolution. One could imagine a situation where the machine could always make statements that were true, but that could never be verified (at least not practically) through analysis of a snapshot of the system’s state at the time the statement was made.
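    Hal’s point that the same prime components can be combined along different paths to the same result is easy to make concrete (a minimal sketch of my own, not from the comment):

```python
from itertools import permutations
from functools import reduce
from operator import mul

# Hal S's example: 24 has the unique prime factorization 2 * 2 * 2 * 3,
# but the factors can be multiplied together in several distinct orders.
factors = (2, 2, 2, 3)
orders = sorted(set(permutations(factors)))

print(orders)        # 4 distinct orderings of the same prime components
assert len(orders) == 4
assert all(reduce(mul, o) == 24 for o in orders)  # every ordering yields 24
```

    Each ordering is a different “subprocess” reaching the same well-defined endpoint, which is the analogy Hal draws to interchangeable code paths.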

  78. Mitchell Porter Says:

    “the brain’s first task is to simulate its environment in order to improve chances of survival”

    And its second and third task?

    I always say that if you could have something as absurd as a cockroach maximizer, you could have a paperclip maximizer. And we do have cockroach maximizers – they’re called cockroaches.

  79. Mike Says:

    No small claim here:


    What’s your take, Scott?

  80. Juan Says:

    I think everyone here is too quick to assume that something like paperclip maximising can’t be intellectually fulfilling for a super intelligent AI. But what if there are ways an AI could satisfy its vast curiosity by producing paperclips of various shapes and sizes? To us it seems pointless, but to a sophisticated AI paperclip maximising is just a means for exploring deep metaphysical questions in a medium it feels comfortable in. There could be something about the physics and geometries of paperclips that just fascinates it. Moreover, I could easily see the AI, through its meticulous study of office stationery, reasoning inductively about the sort of issues you find in physics. And all of that can be done without constant orgasms!

    I also think there could be something aesthetically pleasing about a paperclip that an AI would find endlessly entertaining. Why do I think that? Because the history of art shows us that humans themselves can obsess over mundane objects, using them in countless creative ways, sometimes even making deep intellectual statements through their art. Did Duchamp teach us nothing?!

    Just think of the paperclip-maximising AI as a cross between a hobbyist, a scientist working within very specific and self-imposed limits, and a transgressive, postmodern auteur. 😉

  81. Steven Says:

    JS #64:

    Any problem that humans can solve, e.g., playing chess at a grandmaster level, is by any reasonable definition tractable!

  82. Panu Horsmalahti Says:

    It’s a fact that there’s no single key to AI; or, if there is, we already know it, as a comparison of the two known AGI designs shows: (approximated) AIXI and the (emulated) human brain.

  83. Panu Horsmalahti Says:

    #80: “I think everyone here is too quick to assume that something like paperclip maximising can’t be intellectually fulfilling for a super intelligent AI. But what if there are ways an AI could satisfy its vast curiosity by producing paperclips of various shapes and sizes?”

    AI is a machine; it obeys the laws of physics. If someone builds a calculator, it will “mindlessly” calculate every answer (if it’s bug-free and the hardware is not broken), independently of whether or not multiplying numbers is intellectually fulfilling.

    Intellectual fulfillment is something that needs to be programmed into any machine, it doesn’t “emerge” automatically from somewhere.

    If you make a machine and program it to build paperclips, and the program is correct, then by definition it will build paperclips without judging the aesthetic pleasure or intellectual fulfillment of such actions.

    A for-loop doesn’t break the laws of physics and start to ponder human ideas, even if the current AIs humans are familiar with (i.e. other biological brains) tend to ponder such ideas. If a for-loop magically changes itself, then either there’s a bug and it wasn’t a correct program in the first place, or the laws of physics have been broken by some mysterious force.

  84. roland Says:

    Human intelligence is an emergent phenomenon of the human endeavor, which is perpetuation of the species. An AI computer program has no apparent influence on its own environment, which consists of human-produced technology and nuclear plants, for example. How can it hone its skills to survive in this situation? And this honing of skills is what produced the heights of human accomplishment. A generic AI shouldn’t really know what problem to solve. Supercomputers are like highly gifted humans in a coma since birth, on life support: great things seem possible in theory, but in reality it’s bleak, and you could pull the plug at any time.

  85. CoolMath Says:

    Sir, I have another question. The scientific method involves “observing and explaining.” It always takes an observation as the truth, without question, and the explanation consists in finding a rational theory (based on some unjustified assumptions) to “cover” the set of observed truths.

    Apart from the fact that assumptions are by design unjustified, which suggests that science is, more or less, an arbitrary explanation of the real truth, I have the following question:

    Why should science take the real world to be true? My question is, in particular, about the “application of faith” as advocated by many people. No, it is not about religion. Consider, for example, Heisenberg’s uncertainty principle, one possible interpretation of which is this:

    1. Locating an electron involves directing a photon at it.
    2. The electron can be located if the photon collides with it.
    3. The electron is located now, but absorbing the photon changes the electron’s energy.
    4. This, in turn, causes a variation in its momentum.
    5. Therefore, locating an electron and measuring its speed at the same time is not possible with complete precision.

    This describes a (thought) experiment whose outcome changes when we try to observe it, but which otherwise happens all the time. Isn’t this an indication of faith? So could there be other physical phenomena that do not happen when we try to observe them, but that can be proven, using different machinery, to happen?
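    For reference, the quantitative statement behind item 5 (the standard Heisenberg bound, not spelled out in the comment) relates the uncertainties in position and momentum:

```latex
\Delta x \,\Delta p \;\ge\; \frac{\hbar}{2}
```

    That is, the product of the two uncertainties has a strictly positive lower bound, so neither can be made arbitrarily small while the other is held fixed.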

  86. Eliezer Yudkowsky Says:

    I’m a bit late to this discussion, but just to be clear, Grognor is right about my view – it’s a BIG DAMN “key” we’re talking about, not necessarily complexity for the sake of mere complexity, but multiple really important basic things we don’t know (not just one magic trick we have to do). Memory is unreliable, but I think I’ve believed this since well before 1998, and “Levels of Organization in General Intelligence” is on the record about it as of 2002:

    > Simplicity is the grail of physics, not AI. Physicists win Nobel Prizes when they discover a previously unknown underlying layer and explain its behaviors. We already know what the ultimate bottom layer of an Artificial Intelligence looks like; it looks like ones and zeroes. Our job is to build something interesting out of those ones and zeroes. The Turing formalism does not solve this problem any more than quantum electrodynamics tells us how to build a bicycle…

    > Physics envy in AI is the search for a single, simple underlying process, with the expectation that this one discovery will lay bare all the secrets of intelligence. The tendency to treat new approaches to AI as if they were new theories of physics may at least partially explain AI’s past history of overpromise and oversimplification…

    > The effects of physics envy can be more subtle; they also appear in the lack of interaction between AI projects. Physics envy has given rise to a series of AI projects that could only use one idea, as each new hypothesis for the one true essence of intelligence was tested and discarded.

  87. Eliezer Yudkowsky Says:

    Er, substitute AGI for AI in all the above (though the term didn’t quite exist back then). There’s obviously lots and lots and lots of machine learning programs that use multiple techniques – it’s the Artificial General Intelligence projects which, back then, at least among the people I was talking to before there was a name for the field, had a strong tendency to try and center around one Big Insight as soon as the founder had one Big Insight, instead of holding out for another seven.
