Archive for the ‘The Fate of Humanity’ Category

ChatGPT and the Meaning of Life: Guest Post by Harvey Lederman

Monday, August 4th, 2025

Scott Aaronson’s Brief Foreword:

Harvey Lederman is a distinguished analytic philosopher who moved from Princeton to UT Austin a few years ago. Since his arrival, he’s become one of my best friends among the UT professoriate. He’s my favorite kind of philosopher, the kind who sees scientists as partners in discovering the truth, and also has a great sense of humor. He and I are both involved in UT’s new AI and Human Objectives Initiative (AHOI), which is supported by Open Philanthropy.

The other day, Harvey emailed me an eloquent meditation he wrote on what will be the meaning of life if AI doesn’t kill us all, but “merely” does everything we do better than we do it. While the question is of course now extremely familiar to me, Harvey’s erudition—bringing to bear everything from speculative fiction to the history of polar exploration—somehow brought the stakes home for me in a new way.

Harvey mentioned that he’d sent his essay to major magazines but hadn’t had success. So I said, why not a Shtetl-Optimized guest post? Harvey replied—what might be the highest praise this blog has ever received—that this would be even better than a national magazine, as it would reach more of the relevant people.

And so without further ado, I present to you…


ChatGPT and the Meaning of Life, by Harvey Lederman

For the last two and a half years, since the release of ChatGPT, I’ve been suffering from fits of dread. It’s not every minute, or even every day, but maybe once a week, I’m hit by it—slackjawed, staring into the middle distance—frozen by the prospect that someday, maybe pretty soon, everyone will lose their job.

At first, I thought these slackjawed fits were just a phase, a passing thing. I’m a philosophy professor; staring into the middle distance isn’t exactly an unknown disease among my kind. But as the years have begun to pass, and the fits have not, I’ve begun to wonder if there’s something deeper to my dread. Does the coming automation of work foretell, as my fits seem to say, an irreparable loss of value in human life?

The titans of artificial intelligence tell us that there’s nothing to fear. Dario Amodei, CEO of Anthropic, the maker of Claude, suggests that “historical hunter-gatherer societies might have imagined that life is meaningless without hunting,” and “that our well-fed technological society is devoid of purpose.” But of course, we don’t see our lives that way. Sam Altman, the CEO of OpenAI, sounds so similar that the text could have been written by ChatGPT. Even if the jobs of the future will look as “fake” to us as ours do to “a subsistence farmer”, Altman has “no doubt they will feel incredibly important and satisfying to the people doing them.”

Alongside these optimists, there are plenty of pessimists who, like me, are filled with dread. Pope Leo XIV has decried the threats AI poses to “human dignity, labor and justice”. Bill Gates has written about his fear that, “if we solved big problems like hunger and disease, and the world kept getting more peaceful: What purpose would humans have then?” And Douglas Hofstadter, the cognitive scientist and author of Gödel, Escher, Bach, has spoken eloquently of his terror and depression at “an oncoming tsunami that is going to catch all of humanity off guard.”

Who should we believe? The optimists with their bright visions of a world without work, or the pessimists who fear the end of a key source of meaning in human life?


I was brought up, maybe like you, to value hard work and achievement. In our house, scientists were heroes, and discoveries grand prizes of life. I was a diligent, obedient kid, and eagerly imbibed what I was taught. I came to feel that one way a person’s life could go well was to make a discovery, to figure something out.

I had the sense already then that geographical discovery was played out. I loved the heroes of the great Polar Age, but I saw them—especially Roald Amundsen and Robert Falcon Scott—as the last of their kind. In December 1911, Amundsen reached the South Pole using skis and dogsleds. Scott reached it a month later, in January 1912, after ditching the motorized sleds he’d hoped would help, and man-hauling the rest of the way. As the black dot of Amundsen’s flag came into view on the ice, Scott was devastated to reach this “awful place”, “without the reward of priority”. He would never make it back.

Scott’s motors failed him, but they spelled the end of the great Polar Age. Even Amundsen took to motors in the end: in 1925, he made a failed attempt on the North Pole by plane, and, in 1926, he successfully flew over it in a dirigible. Already by then, the skis and dogsleds of the decade before were the outdated heroics of a bygone world.

We may be living now in a similar twilight age for human exploration in the realm of ideas. Akshay Venkatesh, whose discoveries earned him the 2018 Fields Medal, mathematics’ highest honor, has written that the “mechanization of our cognitive processes will alter our understanding of what mathematics is”. Terry Tao, a 2006 Fields Medalist, expects that in just two years AI will be a copilot for working mathematicians. He envisions a future where thousands of theorems are proven all at once by mechanized minds.

Now, I don’t know any more than the next person where our current technology is headed, or how fast. The core of my dread isn’t based on the idea that human redundancy will come in two years rather than twenty, or, for that matter, two hundred. It’s a more abstract dread, if that’s a thing, dread about what it would mean for human values, or anyway my values, if automation “succeeds”: if all mathematics—and, indeed all work—is done by motor, not by human hands and brains.

A world like that wouldn’t be good news for my childhood dreams. Venkatesh and Tao, like Amundsen and Scott, live meaningful lives, lives of purpose. But worthwhile discoveries like theirs are a scarce resource. A territory, once seen, can’t be seen first again. If mechanized minds consume all the empty space on the intellectual map, lives dedicated to discovery won’t be lives that humans can lead.

The right kind of pessimist sees here an important argument for dread. If discovery is valuable in its own right, the loss of discovery could be an irreparable loss for humankind.

A part of me would like this to be true. But over these last strange years, I’ve come to think it’s not. What matters, I now think, isn’t being the first to figure something out, but the consequences of the discovery: the joy the discoverer gets, the understanding itself, or the real life problem their knowledge solves. Alexander Fleming discovered penicillin, and through that work saved thousands, perhaps millions of lives. But if it were to emerge, in the annals of an outlandish future, that an alien discovered penicillin thousands of years before Fleming did, we wouldn’t think that Fleming’s life was worse, just because he wasn’t first. He eliminated great suffering from human life; the alien discoverer, if they’re out there, did not. So, I’ve come to see, it’s not discoveries themselves that matter. It’s what they bring about.


But the advance of automation would mean the end of much more than human discovery. It could mean the end of all necessary work. Already in 1920, the Czech playwright Karel Capek asked what a world like that would mean for the values in human life. In the first act of R.U.R.—the play which introduced the modern use of the word “robot”—Capek has Harry Domin, the manager of Rossum’s Universal Robots (the R.U.R. of the title), offer his corporation’s utopian pitch. “In ten years”, he says, their robots will “produce so much corn, so much cloth, so much everything” that “There will be no poverty.” “Everybody will be free from worry and liberated from the degradation of labor.” The company’s engineer, Alquist, isn’t convinced. Alquist (who, incidentally, ten years later will be the only human left alive, after the robots have killed the rest) retorts that “There was something good in service and something great in humility”, “some kind of virtue in toil and weariness”.

Service—work that meets others’ significant needs and wants—is, unlike discovery, clearly good in and of itself. However we work—as nurses, doctors, teachers, therapists, ministers, lawyers, bankers, or, really, anything at all—working to meet others’ needs makes our own lives go well. But, as Capek saw, all such work could disappear. In a “post-instrumental” world, where people are comparatively useless and the bots meet all our important needs, there would be no needed work for us to do, no suffering to eliminate, no diseases to cure. Could the end of such work be a better reason for dread?

The hardline pessimists say that it is. They say that the end of all needed work would not only be a loss of some value to humanity, as everyone should agree. For them it would be a loss to humanity on balance, an overall loss, one that couldn’t be compensated in another way.

I feel a lot of pull to this pessimistic thought. But once again, I’ve come to think it’s wrong. For one thing, pessimists often overlook just how bad most work actually is. In May 2021, Luo Huazhong, a 31-year-old ex-factory worker in Sichuan, wrote a viral post entitled “Lying Flat is Justice”. Luo had searched at length for a job that, unlike his factory job, would allow him time for himself, but he couldn’t find one. So he quit, biked to Tibet and back, and commenced his lifestyle of lying flat, doing what he pleased, reading philosophy, contemplating the world. The idea struck a chord with overworked young Chinese, who, it emerged, did not find “something great” in their “humility”. The movement inspired memes, selfies flat on one’s back, and even an anthem.

That same year, as the Great Resignation in the United States took off, the subreddit r/antiwork played to similar discontent. Started in 2013, under the motto “Unemployment for all, not just the rich!”, the forum went viral in 2021, starting with a screenshot of a quitting worker’s texts to his supervisor (“No thanks. Have a good life”), and culminating in labor actions, first supporting striking workers at Kellogg’s by spamming the company’s job application site, and then attempting to support a similar strike at McDonald’s. It wasn’t just young Chinese who hated their jobs.

In Automation and Utopia: Human Flourishing in a World without Work, the Irish lawyer and philosopher John Danaher imagines an antiwork techno-utopia, with plenty of room for lying flat. As Danaher puts it: “Work is bad for most people most of the time.” And: “We should do what we can to hasten the obsolescence of humans in the arena of work.”

The young Karl Marx would have seen both Domin’s and Danaher’s utopias as a catastrophe for human life. In his notebooks from 1844, Marx describes an ornate and almost epic process, where, by meeting the needs of others through production, we come to recognize the other in ourselves, and through that recognition, come at last to self-consciousness, the full actualization of our human nature. The end of needed work, for the Marx of these notes, would be the impossibility of fully realizing our nature, the end, in a way, of humanity itself.

But such pessimistic lamentations have come to seem to me no more than misplaced machismo. Sure, Marx’s and my culture, the ethos of our post-industrial professional class, might make us regret a world without work. But we shouldn’t confuse the way two philosophers were brought up with the fundamental values of human life. What stranger narcissism could there be than bemoaning the end of others’ suffering, disease, and need, just because it deprives you of the chance to be a hero?


The first summer after the release of ChatGPT—the first summer of my fits of dread—I stayed with my in-laws in Val Camonica, a valley in the Italian Alps. The houses in their village, Sellero, are empty and getting emptier; the people on the streets are old and getting older. The kids that are left—my wife’s elementary school class had, even then, a full complement of four—often leave for better lives. But my in-laws are connected to this place, to the houses and streets where they grew up. They see the changes too, of course. On the mountains above, the Adamello, Italy’s largest glacier, is retreating faster every year. But while the shows on Netflix change, the same mushrooms appear in the summer, and the same chestnuts are collected in the fall.

Walking in the mountains of Val Camonica that summer, I tried to find parallels for my sense of impending loss. I thought about William Shanks, a British mathematician who calculated π to 707 digits by hand in 1873 (he slipped after digit 527, so nearly 200 of the digits were wrong). He later spent years of his life, literally years, on a table of the reciprocals of the primes up to one hundred and ten thousand, calculating in the morning by hand, and checking it over in the afternoon. That was his life’s work. Just sixty years after his death, though, already in the 1940s, the table on which his precious mornings were spent, the few mornings he had on this earth, could be made by a machine in a day.

I feel sad thinking about Shanks, but I don’t feel grief for the loss of calculation by hand. The invention of the typewriter, and the death of handwritten notes, seemed closer to the loss I imagined we might feel. Handwriting was once a part of your style, a part of who you were. With its decline some artistry, a deep and personal form of expression, may be lost. When the bots help with everything we write, couldn’t we too lose our style and voice?

But more than anything I thought of what I saw around me: the slow death of the dialects of Val Camonica and the culture they express. Chestnuts were at one time so important for nutrition here that, in the village of Paspardo, a street lined with chestnut trees is called “bread street” (“Via del Pane”). The hyper-local dialects of the valley, outgrowths sometimes of a single family’s inside jokes, have words for all the phases of the chestnut. There’s a porridge made from chestnut flour that, in Sellero, goes by ‘skelt’, but is ‘pult’ in Paspardo, a cousin of ‘migole’ in Malonno, just a few villages away. Boiled, chestnuts are tetighe; dried on a grate, biline or bascocc, which, seasoned and boiled, become broalade. The dialects don’t just record what people eat and ate; they recall how people lived, what they saw, and where they went. Behind Sellero, every hundred-yard stretch of the walk up to the cabins where the cows were taken to graze in summer has its own name. Aiva Codaola. Quarsanac. Coran. Spi. Ruc.

But the young people don’t speak the dialect anymore. They go up to the cabins by car, too fast to name the places along the way. They can’t remember a time when the cows were taken up to graze. Some even buy chestnuts in the store.

Grief, you don’t need me to tell you, is a complicated beast. You can grieve for something even when you know that, on balance, it’s good that it’s gone. The death of these dialects, of the stories told on summer nights in the mountains with the cows, is a loss reasonably grieved. But you don’t hear the kids wishing more people would be forced to stay or speak this funny-sounding tongue. You don’t even hear the old folks wishing they could go back fifty years—in those days it wasn’t so easy to be sure of a meal. For many, it’s better this way, not the best it could be, but still better, even as they grieve what they stand to lose and what they’ve already lost.

The grief I feel, imagining a world without needed work, seems closest to this kind of loss. A future without work could be much better than ours, overall. But, living in that world, or watching as our old ways passed away, we might still reasonably grieve the loss of the work that once was part of who we were.


In the last chapter of Edith Wharton’s The Age of Innocence, Newland Archer contemplates a world that has changed dramatically since, thirty years earlier, before these newfangled telephones and five-day trans-Atlantic ships, he renounced the love of his life. Awaiting a meeting that his free-minded son Dallas has organized with Ellen Olenska, the woman Newland once loved, he wonders whether his son, and this whole new age, can really love the way he did and does. How could their hearts beat like his, when they’re always so sure of getting what they want?

There have always been things to grieve about getting old. But modern technology has given us new ways of coming to be out of date. A generation born in 1910 did their laundry in Sellero’s public fountains. They watched their grandkids grow up with washing machines at home. As kids, my in-laws worked with their families to dry the hay by hand. They now know, abstractly, that it can all be done by machine. Alongside newfound health and ease, these changes brought, as well, a mix of bitterness and grief: grief for the loss of gossip at the fountains or picnics while bringing in the hay; and also bitterness, because the kids these days just have no idea how easy they have it now.

As I look forward to the glories that, if the world doesn’t end, my grandkids might enjoy, I too feel prospective bitterness and prospective grief. There’s grief, in advance, for what we now have that they’ll have lost: the formal manners of my grandparents they’ll never know, the cars they’ll never learn to drive, and the glaciers that will be long gone before they’re born. But I also feel bitter about what we’ve been through that they won’t have to endure: small things like folding the laundry, standing in security lines or taking out the trash, but big ones too—the diseases which will take our loved ones that they’ll know how to cure.

All this is a normal part of getting old in the modern world. But the changes we see could be much faster and grander in scale. Amodei of Anthropic speculates that a century of technological change could be compressed into the next decade, or less. Perhaps it’s just hype, but—what if it’s not? It’s one thing for a person to adjust, over a full life, to the washing machine, the dishwasher, the air-conditioner, one by one. It’s another, in five years, to experience the progress of a century. Will I see a day when childbirth is a thing of the past? What about sleep? Will our ‘descendants’ have bodies at all?

And this round of automation could also lead to unemployment unlike any our grandparents saw. Worse, those of us working now might be especially vulnerable to this loss. Our culture, or anyway mine—professional America of the early 21st century—has apotheosized work, turning it into a central part of who we are. Where others have a sense of place—their particular mountains and trees—we’ve come to locate ourselves with professional attainment, with particular degrees and jobs. For the ‘workists’ so many of us have become, technological displacement wouldn’t just be the loss of our jobs. It would be the loss of a central way we have of making sense of our lives.

None of this will be a problem for the new generation, for our kids. They’ll know how to live in a world that could be—if things go well—far better overall. But I don’t know if I’d be able to adapt. Intellectual argument, however strong, is weak against the habits of years. I fear they’d look at me, stuck in my old ways, with the same uncomprehending look that Dallas Archer gives his dad, when Newland announces that he won’t go see Ellen Olenska, the love of his life, after all. “Say”, as Newland tries to explain to his dumbfounded son, “that I’m old fashioned, that’s enough.”


And yet, the core of my dread is not about aging out of work before my time. I feel closest to Douglas Hofstadter, the author of Gödel, Escher, Bach. His dread, like mine, isn’t only about the loss of work today, or the possibility that we’ll be killed off by the bots. He fears that even a gentle superintelligence will be “as incomprehensible to us as we are to cockroaches.”

Today, I feel part of our grand human projects—the advancement of knowledge, the creation of art, the effort to make the world a better place. I’m not in any way a star player on the team. My own work is off in a little backwater of human thought. And I can’t understand all the details of the big moves by the real stars. But even so, I understand enough of our collective work to feel, in some small way, part of our joint effort. All that will change. If I were to be transported to the brilliant future of the bots, I wouldn’t understand them or their work enough to feel part of the grand projects of their day. Their work would have become, to me, as alien as ours is to a roach.


But I’m still persuaded that the hardline pessimists are wrong. Work is far from the most important value in our lives. A post-instrumental world could be full of much more important goods—from rich love of family and friends, to new, undreamt-of works of art—which would more than compensate for the loss of value from the loss of our work.

Of course, even the values that do persist may be transformed in almost unrecognizable ways. In Deep Utopia: Life and Meaning in a Solved World, the futurist and philosopher Nick Bostrom imagines how things might look. In one of the most memorable sections of the book—right up there with an epistolary novella about the exploits of Pignolius the pig (no joke!)—Bostrom says that even child-rearing may be something that we, if we love our children, would come to forego. In a truly post-instrumental world, a robot intelligence could do better for your child, not only in teaching the child to read, but also in showing unbreakable patience and care. If you’ll snap at your kid, when the robot would not, it would only be selfishness for you to get in the way.

It’s a hard question whether Bostrom is right. At least some of the work of care isn’t like eliminating suffering or ending mortal disease. The needs or wants are small-scale stuff, and the value we get from helping each other might well outweigh the fact that we’d do it worse than a robot could.

But even supposing Bostrom is right about his version of things, and we wouldn’t express our love by changing diapers, we could still love each other. And together with our loved ones and friends, we’d have great wonders to enjoy. Wharton has Newland Archer wonder at five-day transatlantic ships. But what about five-day journeys to Mars? These days, it’s a big deal if you see the view from Everest with your own eyes. But Olympus Mons on Mars is more than twice as tall.

And it’s not just geographical tourism that could have a far expanded range. There’d be new journeys of the spirit as well. No humans would be among the great writers or sculptors of the day, but the fabulous works of art a superintelligence could make could help to fill our lives. Really, for almost any aesthetic value you now enjoy—sentimental or austere, minute or magnificent, meaningful or jocular—the bots would do it much better than we have ever done.

Humans could still have meaningful projects, too. In 1978, about a decade before any of Altman, Amodei or even I were born, the Canadian philosopher Bernard Suits argued that “voluntary attempts to overcome unnecessary obstacles” could give people a sense of purpose in a post-instrumental world. Suits calls these “games”, but the name is misleading; I prefer “artificial projects”. The projects include things we would call games, like chess, checkers and bridge, but also things we wouldn’t think of as games at all, like Amundsen’s and Scott’s exploits at the Pole. Whatever we call them, Suits—who’s followed here explicitly by Danaher, the antiwork utopian, and, implicitly, by Altman and Amodei—is surely right: even as things are now, we get a lot of value from projects we choose, whether or not they meet a need. We learn to play a piece on the piano, train to run a marathon, or even fly to Antarctica to “ski the last degree” to the Pole. Why couldn’t projects like these become the backbone of purpose in our lives?

And we could have one real purpose, beyond the artificial ones, as well. There is at least one job that no machine can take away: the work of self-fashioning, the task of becoming and being ourselves. There’s an aesthetic accomplishment in creating your character, an artistry of choice and chance in making yourself who you are. This personal style includes not just wardrobe or tattoos, not just your choice of silverware or car, but your whole way of being, your brand of patience, modesty, humor, rage, hobbies and tastes. Creating this work of art could give some of us something more to live for.


Would a world like that leave any space for human intellectual achievement, the stuff of my childhood dreams? The Buddhist Pali Canon says that “All conditioned things are impermanent—when one sees this with wisdom, one turns away from suffering.” Apparently, in this text, the intellectual achievement of understanding gives us a path out of suffering. To arrive at this goal, you don’t have to be the first to plant your flag on what you’ve understood; you just have to get there.

A secular version of this idea might hold, more simply, that some knowledge or understanding is good in itself. Maybe understanding the mechanics of penicillin matters mainly because of what it enabled Fleming and others to do. But understanding truths about the nature of our existence, or even mathematics, could be different. That sort of understanding plausibly is good in its own right, even if someone or something has gotten there first.

Venkatesh the Fields Medalist seems to suggest something like this for the future of math. Perhaps we’ll change our understanding of the discipline, so that it’s not about getting the answers, but instead about human understanding, the artistry of it perhaps, or the miracle of the special kind of certainty that proof provides.

Philosophy, my subject, might seem an even more promising place for this idea. For some, philosophy is a “way of life”. The aim isn’t necessarily an answer, but constant self-examination for its own sake. If that’s the point, then in the new world of lying flat, there could be a lot of philosophy to do.

I don’t myself accept this way of seeing things. For me, philosophy aims at the truth as much as physics does. But I of course agree that there are some truths that it’s good for us to understand, whether or not we get there first. And there could be other parts of philosophy that survive for us, as well. We need to weigh the arguments for ourselves, and make up our own minds, even if the work of finding new arguments comes to belong to a machine.

I’m willing to believe, and even hope that future people will pursue knowledge and understanding in this way. But I don’t find, here, much consolation for my personal grief. I was trained to produce knowledge, not merely to acquire it. In the hours when I’m not teaching or preparing to teach, my job is to discover the truth. The values I imbibed—and I told you I was an obedient kid—held that the prize goes for priority.

Thinking of this world where all we learn is what the bots have discovered first, I feel sympathy with Lee Sedol, the champion Go player who retired in 2019, three years after his defeat by Google DeepMind’s AlphaGo. For him, losing to AI “in a sense, meant my entire world was collapsing”. “Even if I become the number one, there is an entity that cannot be defeated.” Right or wrong, I would feel the same about my work, in a world with an automated philosophical champ.

But Sedol and I are likely just out of date models, with values that a future culture will rightly revise. It’s been more than twenty years since Garry Kasparov lost to IBM’s Deep Blue, but chess has never been more popular. And this doesn’t seem some new-fangled twist of the internet age. I know of no human who quit the high-jump after the invention of mechanical flight. The Greeks sprinted in their Olympics, though they had, long before, domesticated the horse. Maybe we too will come to value the sport of understanding with our own brains.


Frankenstein, Mary Shelley’s 1818 classic of the creations-kill-creator genre, begins with an expedition to the North Pole. Robert Walton hopes to put himself in the annals of science and claim the Pole for England when he comes upon Victor Frankenstein, adrift on the Arctic ice. It’s only once Frankenstein warms up that we get into the story everyone knows. Victor hopes he can persuade Walton to turn around, by describing how his own quest for knowledge and glory went south.

Frankenstein doesn’t offer Walton an alternative way of life, a guide for living without grand goals. And I doubt Walton would have been any more personally consoled by the glories of a post-instrumental future than I am. I ended up a philosopher, but I was raised by parents who, maybe like yours, hoped for doctors or lawyers. They saw our purpose in answering real needs, in, as they’d say, contributing to society. Lives devoted to families and friends, fantastic art and games could fill a wondrous future, a world far better than it has ever been. But those aren’t lives that Walton or I, or our parents for that matter, would know how to be proud of. It’s just not the way we were brought up.

For the moment, of course, we’re not exactly short on things to do. The world is full of grisly suffering, sickness, starvation, violence, and need. Frankenstein is often remembered with the moral that thirst for knowledge brings ruination, that scientific curiosity killed the cat. But Victor Frankenstein makes a lot of mistakes other than making his monster. His revulsion at his creation persistently prevents him, almost inexplicably, from feeling the love or just plain empathy that any father should. On top of all we have to do to help each other, we have a lot of work to do, in engineering as much as empathy, if we hope to avoid Frankenstein’s fate.

But even with these tasks before us, my fits of dread are here to stay. I know that the post-instrumental world could be a much better place. But its coming means the death of my culture, the end of my way of life. My fear and grief about this loss won’t disappear because of some choice consolatory words. But I know how to relish the twilight too. I feel lucky to live in a time where people have something to do, and the exploits around me seem more poignant, and more beautiful, in the dusk. We may be some of the last to enjoy this brief spell, before all exploration, all discovery, is done by fully automated sleds.

Trump and Iran, by popular request

Sunday, June 22nd, 2025

I posted this on my Facebook, but several friends asked me to share more widely, so here goes:

I voted against Trump three times, and donated thousands to his opponents. I’d still vote against him today, seeing him as a once-in-a-lifetime threat to American democracy and even to the Enlightenment itself.

But last night I was also grateful to him for overruling the isolationists and even open antisemites in his orbit, striking a blow against the most evil regime on the planet, and making it harder for that regime to build nuclear weapons. I acknowledge that his opponents, who I voted for, would’ve probably settled for a deal that would’ve resulted in Iran eventually getting nuclear weapons, and at any rate getting a flow of money to redirect to Hamas, Hezbollah, and the Houthis.

May last night’s events lead to the downfall of the murderous ayatollah regime altogether, and to the liberation of the Iranian people from 46 years of oppression. To my many, many Iranian friends: I hope all your loved ones stay safe, and I hope your great people soon sees better days. I say this as someone whose wife and 8-year-old son are right now in Tel Aviv, sheltering every night from Iranian missiles.

Fundamentally, I believe not only that evil exists in the world, but that it’s important to calibrate evil on a logarithmic scale. Trump (as I’ve written on this blog for a decade) terrifies me, infuriates me, and embarrasses me, and through his evisceration of American science and universities, has made my life noticeably worse. On the other hand, he won’t hang me from a crane for apostasy, nor will he send a ballistic missile to kill my wife and son and then praise God for delivering them into his hands.


Update: I received the following comment on this post, which filled me with hope, and demonstrated more moral courage than perhaps all the other anonymous comments in this blog’s 20-year history combined. To this commenter and their friends and family, I wish safety and, eventually, liberation from tyranny.

I will keep my name private for clear reasons. Thank you for your concern for Iranians’ safety and for wishing the mullah regime’s swift collapse. I have fled Tehran and I’m physically safe but mentally, I’m devastated by the war and the internet blackout (the pretext is that Israeli drones are using our internet). Speaking of what the mullahs have done, especially outrageous was the attack on the Weizmann Institute. I hope your wife and son remain safe from the missiles of the regime whose thugs have chased me and my friends in the streets and imprisoned my friends for simple dissent. All’s well that ends well, and I hope this all ends well.

“If Anyone Builds It, Everyone Dies”

Friday, May 30th, 2025

Eliezer Yudkowsky and Nate Soares are publishing a mass-market book, the rather self-explanatorily-titled If Anyone Builds It, Everyone Dies. (Yes, the “it” means “sufficiently powerful AI.”) The book is now available for preorder from Amazon:

(If you plan to buy the book at all, Eliezer and Nate ask that you do preorder it, as this will apparently increase the chance of it making the bestseller lists and becoming part of The Discourse.)

I was graciously offered a chance to read a draft and offer, not a “review,” but some preliminary thoughts. So here they are:

For decades, Eliezer has been warning the world that an AI might soon exceed human abilities, and proceed to kill everyone on earth, in pursuit of whatever strange goal it ended up with.  It would, Eliezer said, be something like what humans did to the earlier hominids.  Back around 2008, I followed the lead of most of my computer science colleagues, who considered these worries, even if possible in theory, comically premature given the primitive state of AI at the time, and all the other severe crises facing the world.

Now, of course, not even two decades later, we live on a planet that’s being transformed by some of the signs and wonders that Eliezer foretold.  The world’s economy is about to be upended by entities like Claude and ChatGPT, AlphaZero and AlphaFold—whose human-like or sometimes superhuman cognitive abilities, obtained “merely” by training neural networks (in the first two cases, on humanity’s collective output) and applying massive computing power, constitute (I’d say) the greatest scientific surprise of my lifetime.  Notably, these entities have already displayed some of the worrying behaviors that Eliezer warned about decades ago—including lying to humans in pursuit of a goal, and hacking their own evaluation criteria.  Even many of the economic and geopolitical aspects have played out as Eliezer warned they would: we’ve now seen AI companies furiously racing each other, seduced by the temptation of being (as he puts it) “the first monkey to taste the poisoned banana,” discarding their previous explicit commitments to safety, transparency, and the public good once they get in the way.

Today, then, even if one still isn’t ready to swallow the full package of Yudkowskyan beliefs, any empirically minded person ought to be updating in its direction—and acting accordingly.  Which brings us to the new book by Eliezer and his collaborator Nate Soares.  This book is far and away the clearest, most accessible presentation of Eliezer’s beliefs, the culmination of a quarter-century of his developing and talking about them.  That undoubtedly owes a great deal to Nate, who seems to have sanded down the infamously brusque rough edges of Eliezer’s writing style.  So much the better!  But it also owes a lot to the world itself: current events now offer an endless supply of real-world examples for Eliezer’s formerly abstract arguments about AI, examples that the book deploys to maximum effect.

The book also mines history—everything from the Wright Brothers to World War II to the Chernobyl accident—for lessons about human attitudes toward technological progress, safety, and risk.  And it maintains Eliezer’s fondness for stories and parables, one of the most charming features of his writing.

Even today, I’m not nearly as confident about the doom scenario as Eliezer and Nate are.  I don’t know whether an AI’s goals are really “orthogonal” to its abilities, in the sense that will matter in practice.  And when I reach the part where the AI, having copied itself all over the Internet and built robot factories, then invents and releases self-replicating nanotechnology that gobbles the surface of the earth in hours or days, a large part of me still screams out that there must be practical bottlenecks that haven’t been entirely accounted for here.

And yet, even if you agree with only a quarter of what Eliezer and Nate write, you’re likely to close this book fully convinced—as I am—that governments need to shift to a more cautious approach to AI, an approach more respectful of the civilization-changing enormity of what’s being created.  And that, if they won’t, their citizens need to pressure them to do so.

So regardless of how much they agree or disagree, I’d like everyone on earth who cares about the future to read this book, debate its ideas, and have its thesis in mind when they’re discussing AI.

As for me?  It would’ve been better if I’d reached my current position earlier: if I hadn’t needed empirical reality, plus superb writing like Eliezer’s and Nate’s, to bonk me over the head with the risks that AI was likely to pose to humanity in my lifetime.  But having failed to see as far ahead as they did, the least I can do is update.  You should too, and you can start by reading the book.


As it happens, this weekend I’ll be at LessOnline, the rationalist blogging conference in Berkeley, where (among my other events) I’ll engage in a dialogue/debate with Nate Soares about the orthogonality thesis, one of the crucial underpinnings of his and Eliezer’s case for AI doom. So, I’ll probably be LessAvailable to respond to comments on this post. But feel free to discuss anyway! After all, it’s merely the fate of all Earth-originating life that’s at stake here, not some actually hot-button topic like Trump or Gaza.

Quantum! AI! Everything but Trump!

Wednesday, April 30th, 2025
  • Grant Sanderson, of 3blue1brown, has put up a phenomenal YouTube video explaining Grover’s algorithm, and dispelling the fundamental misconception about quantum computing, that QC works simply by “trying all the possibilities in parallel.” Let me not futz around: this video explains, in 36 minutes, what I’ve tried to explain over and over on this blog for 20 years … and it does it better. It’s a masterpiece. Yes, I consulted with Grant for this video (he wanted my intuitions for “why is the answer √N?”), and I even have a cameo at the end of it, but I wish I had made the video. Damn you, Grant!
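
    If you want to see where that √N comes from numerically, here’s a minimal NumPy sketch (my own illustration, not anything from Grant’s video; the function name is made up) that simulates Grover’s algorithm on N items with a single marked item. Each iteration flips the sign of the marked amplitude (the oracle query) and then reflects every amplitude about the mean (the diffusion step); the success probability peaks after roughly (π/4)√N iterations:

        import numpy as np

        def grover_success_prob(N, marked=0, iters=None):
            """Simulate Grover search over N items with one marked item."""
            if iters is None:
                iters = int(round((np.pi / 4) * np.sqrt(N)))  # optimal iteration count
            state = np.full(N, 1 / np.sqrt(N))    # uniform superposition
            for _ in range(iters):
                state[marked] *= -1               # oracle: flip the marked amplitude
                state = 2 * state.mean() - state  # diffusion: reflect about the mean
            return iters, state[marked] ** 2

        for N in [16, 256, 4096]:
            iters, p = grover_success_prob(N)
            print(f"N={N:5d}: {iters:3d} iterations, success probability {p:.4f}")

    Quadrupling N merely doubles the number of iterations: that’s the square-root speedup, with no “trying all the possibilities in parallel” anywhere in sight.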
  • The incomparably great, and absurdly prolific, blogger Zvi Mowshowitz and yours truly spend 1 hour and 40 minutes discussing AI existential risk, education, blogging, and more. I end up “interviewing” Zvi, who does the majority of the talking, which is fine by me, as he has many important things to say! (Among them: his searing critique of those K-12 educators who see it as their life’s mission to prevent kids from learning too much too fast—I’ve linked his best piece on this from the header of this blog.) Thanks so much to Rick Coyle for arranging this conversation.
  • Progress in quantum complexity theory! In 2000, John Watrous showed that the Group Non-Membership problem is in the complexity class QMA (Quantum Merlin-Arthur). In other words, if some element g is not contained in a given subgroup H of an exponentially large finite group G, which is specified via a black box, then there’s a short quantum proof that g∉H, with only ~log|G| qubits, which can be verified on a quantum computer in time polynomial in log|G|. This soon raised the question of whether Group Non-Membership could be used to separate QMA from QCMA by oracles, where QCMA (Quantum Classical Merlin Arthur), defined by Aharonov and Naveh in 2002, is the subclass of QMA where the proof needs to be classical, but the verification procedure can still be quantum. In other words, could Group Non-Membership be the first example of a non-quantum problem for which quantum proofs actually help?

    In 2006, alas, Greg Kuperberg and I showed that the answer was probably “no”: Group Non-Membership has “polynomial QCMA query complexity.” This means that there’s a QCMA protocol for the problem where Arthur makes only polylog|G| quantum queries to the group oracle—albeit, possibly an exponential in log|G| number of quantum computation steps besides that! To prove our result, Greg and I needed to make mild use of the Classification of Finite Simple Groups, one of the crowning achievements of 20th-century mathematics (its proof is about 15,000 pages long). We conjectured (but couldn’t prove) that someone else, who knew more about the Classification than we did, could show that Group Non-Membership was simply in QCMA outright.

    Now, after almost 20 years, François Le Gall, Harumichi Nishimura, and Dhara Thakkar have finally proven our conjecture—showing that Group Order, and therefore also Group Non-Membership, are indeed in QCMA. They did indeed need to use the Classification, doing one thing for almost all finite groups covered by the Classification, but a different thing for groups of “Ree type” (whatever those are).

    Interestingly, the Group Membership problem had also been a candidate for separating BQP/qpoly, or quantum polynomial time with polynomial-size quantum advice—my personal favorite complexity class—from BQP/poly, or the same thing with polynomial-size classical advice. And it might conceivably still be! The authors explain to me that their protocol doesn’t put Group Membership (with group G and subgroup H depending only on the input length n) into BQP/poly, the reason being that their short classical witnesses for g∉H depend on both g and H, in contrast to Watrous’s quantum witnesses which depended only on H. So there’s still plenty that’s open here! Actually, for that matter, I don’t know of good evidence that the entire Group Membership problem isn’t in BQP—i.e., that quantum computers can’t just solve the whole thing outright, with no Merlins or witnesses in sight!

    Anyway, huge congratulations to Le Gall, Nishimura, and Thakkar for peeling back our ignorance of these matters a bit further! Reeeeeeeee!
  • Potential big progress in quantum algorithms! Vittorio Giovannetti, Seth Lloyd, and Lorenzo Maccone (GLM) have given what they present as a quantum algorithm to estimate the determinant of an n×n matrix A, exponentially faster in some contexts than we know how to do it classically.

    [Update (May 5): In the comments, Alessandro Luongo shares a paper where he and Changpeng Shao describe what appears to be essentially the same algorithm back in 2020.]

    The algorithm is closely related to the 2008 HHL (Harrow-Hassidim-Lloyd) quantum algorithm for solving systems of linear equations. Which means that anyone who knows the history of this class of quantum algorithms knows to ask immediately: what’s the fine print? A couple weeks ago, when I visited Harvard and MIT, I had a chance to catch up with Seth Lloyd, so I asked him, and he kindly told me. Firstly, we assume the matrix A is Hermitian and positive semidefinite. Next, we assume A is sparse, and not only that, but there’s a QRAM data structure that points to its nonzero entries, so you don’t need to do Grover search or the like to find them, and can query them in coherent superposition. Finally, we assume that all the eigenvalues of A are at least some constant λ>0. The algorithm then estimates det(A), to multiplicative error ε, in time that scales linearly with log(n), and polynomially with 1/λ and 1/ε.

    Now for the challenge I leave for ambitious readers: is there a classical randomized algorithm to estimate the determinant under the same assumptions and with comparable running time? In other words, can the GLM algorithm be “Ewinized”? Seth didn’t know, and I think it’s a wonderful crisp open question! On the one hand, if Ewinization is possible, it wouldn’t be the first time that publicity on this blog had led to the brutal murder of a tantalizing quantum speedup. On the other hand … well, maybe not! I also consider it possible that the problem solved by GLM—for exponentially-large, implicitly-specified matrices A—is BQP-complete, as for example was the general problem solved by HHL. This would mean, for example, that one could embed Shor’s factoring algorithm into GLM, and that there’s no hope of dequantizing it unless P=BQP. (Even then, though, just like with the HHL algorithm, we’d still face the question of whether the GLM algorithm was “independently useful,” or whether it merely reproduced quantum speedups that were already known.)
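
    For ambitious readers who want a classical point of departure, here’s a rough baseline (my own sketch, not anything from the GLM paper): under the same spectral assumptions (A positive semidefinite, sparse, with eigenvalues in [λ, 1]), you can estimate log det(A) = tr(log A) by combining the series log(A) = -Σ_k (I-A)^k / k with Hutchinson’s randomized trace estimator. To be clear, this does not answer the challenge: each matrix-vector product below costs time polynomial in n rather than log(n), so a true “Ewinization” would presumably have to exploit the QRAM/sampling access assumptions, in the style of Tang’s dequantization algorithms.

        import numpy as np

        def logdet_estimate(A, lam, n_probes=30, order=None, seed=None):
            """Estimate log det(A) for PSD A with eigenvalues in [lam, 1].

            Uses log det(A) = tr(log A), the series log(A) = -sum_k (I-A)^k / k,
            and Hutchinson's trace estimator with random +/-1 probe vectors.
            """
            rng = np.random.default_rng(seed)
            n = A.shape[0]
            if order is None:  # truncate once (1 - lam)^order is negligible
                order = int(np.ceil(np.log(1e-3) / np.log(1 - lam)))
            total = 0.0
            for _ in range(n_probes):
                z = rng.choice([-1.0, 1.0], size=n)   # Rademacher probe vector
                v = z.copy()
                for k in range(1, order + 1):
                    v = v - A @ v                     # now v = (I - A)^k z
                    total -= (z @ v) / k              # accumulates z^T log(A) z
            return total / n_probes

        # Tiny demo on a random well-conditioned PSD matrix
        rng = np.random.default_rng(0)
        n, lam = 200, 0.2
        Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
        A = (Q * rng.uniform(lam, 1.0, size=n)) @ Q.T
        print("estimate:", logdet_estimate(A, lam, seed=1))
        print("exact:   ", np.linalg.slogdet(A)[1])

    (Multiplicative error on det(A) is additive error on log det(A), which is what the estimator above naturally controls, with variance falling off as 1/n_probes.)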

    Anyway, quantum algorithms research lives! So does dequantization research! If basic science in the US is able to continue at all—the thing I promised not to talk about in this post—we’ll have plenty to keep us busy over the next few years.

Fight Fiercely

Thursday, April 24th, 2025

Last week I visited Harvard and MIT, and as advertised in my last post, gave the Yip Lecture at Harvard on the subject “How Much Math Is Knowable?” The visit was hosted by Harvard’s wonderful Center of Mathematical Sciences and Applications (CMSA), directed by my former UT Austin colleague Dan Freed. Thanks so much to everyone at CMSA for the visit.

And good news! You can now watch my lecture on YouTube here:

I’m told it was one of my better performances. As always, I strongly recommend watching at 2x speed.

I opened the lecture by saying that, while obviously it would always be an honor to give the Yip Lecture at Harvard, it’s especially an honor right now, as the rest of American academia looks to Harvard to defend the value of our entire enterprise. I urged Harvard to “fight fiercely,” in the words of the Tom Lehrer song.

I wasn’t just fishing for applause; I meant it. It’s crucial for people to understand that, in its total war against universities, MAGA has now lost, not merely the anti-Israel leftists, but also most conservatives, classical liberals, Zionists, etc. with any intellectual scruples whatsoever. To my mind, this opens up the possibility for a broad, nonpartisan response, highlighting everything universities (yes, even Harvard 😂) do for our civilization that’s worth defending.

For three days in my old hometown of Cambridge, MA, I met back-to-back with friends and colleagues old and new. Almost to a person, they were terrified about whether they’ll be able to keep doing science as their funding gets decimated, but especially terrified for anyone who they cared about on visas and green cards. International scholars can now be handcuffed, deported, and even placed in indefinite confinement for pretty much any reason—including long-ago speeding tickets—or no reason at all. The resulting fear has paralyzed, in a matter of months, an American scientific juggernaut that took a century to build.

A few of my colleagues personally knew Rümeysa Öztürk, the Turkish student at Tufts who currently sits in prison for coauthoring an editorial for her student newspaper advocating the boycott of Israel. I of course disagree with what Öztürk wrote … and that is completely irrelevant to my moral demand that she go free. Even supposing the government had much more on her than this one editorial, still the proper response would seem to be a deportation notice—“either contest our evidence in court, or else get on the next flight back to Turkey”—rather than grabbing Öztürk off the street and sending her to indefinite detention in Louisiana. It’s impossible to imagine any university worth attending where the students live in constant fear of imprisonment for the civil expression of opinions.

To help calibrate where things stand right now, here’s the individual you might expect to be most on board with a crackdown on antisemitism at Harvard:

Jason Rubenstein, the executive director of Harvard Hillel, said that the school is in the midst of a long — and long-overdue — reckoning with antisemitism, and that [President] Garber has taken important steps to address the problem. Methodical federal civil rights oversight could play a constructive role in that reform, he said. “But the government’s current, fast-paced assault against Harvard – shuttering apolitical, life-saving research; targeting the university’s tax-exempt status; and threatening all student visas … is neither deliberate nor methodical, and its disregard for the necessities of negotiation and due process threatens the bulwarks of institutional independence and the rule of law that undergird our shared freedoms.”

Meanwhile, as the storm clouds over American academia continue to darken, I’ll just continue to write what I think about everything, because what else can I do?

Last night, alas, I lost yet another left-wing academic friend, the fourth or fifth I’ve lost since October 7. For while I was ready to take a ferocious public stand against the current US government, for the survival and independence of our universities, and for free speech and due process for foreign students, this friend regarded all that as insufficient. He demanded that I also clear the tentifada movement of any charge of antisemitism. For, as he patiently explained to me (while worrying that I wouldn’t grasp the point), while the protesters may have technically violated university rules, disrupted education, created a hostile environment in the sense of Title VI antidiscrimination law in ways that would be obvious were we discussing any other targeted minority, etc. etc., still, the only thing that matters morally is that the protesters represent “the powerless,” whereas Zionist Jews like me represent “the powerful.” So, I told this former friend to go fuck himself. Too harsh? Maybe if he hadn’t been Jewish himself, I could’ve forgiven him for letting the world’s oldest conspiracy theory colonize his brain.

For me, the deep significance of in-person visits, including my recent trip to Harvard, is that they reassure me of the preponderance of sanity within my little world—and thereby of my own sanity. Online, every single day I feel isolated and embattled: pressed in on one side by MAGA forces who claim to care about antisemitism, but then turn out to want the destruction of science, universities, free speech, international exchange, due process of law, and everything else that’s made the modern world less than fully horrible; and on the other side, by leftists who say they stand with me for science and academic freedom and civil rights and everything else that’s good, but then add that the struggle needs to continue until the downfall of the scheming, moneyed Zionists and the liberation of Palestine from river to sea.

When I travel to universities to give talks, though, I meet one sane, reasonable human being after another. Almost to a person, they acknowledge the reality of antisemitism, ideological monoculture, bureaucracy, spiraling costs, and many other problems at universities—and they care about universities enough to want to fix those problems, rather than gleefully nuking the universities from orbit as MAGA is doing. Mostly, though, people just want me to sign Quantum Computing Since Democritus, or tell me how much they like this blog, or ask questions about quantum algorithms or the Busy Beaver function. Which is fine too, and which you can do in the comments.

I speak at Harvard as it faces its biggest crisis since 1636

Tuesday, April 15th, 2025

Every week, I tell myself I won’t do yet another post about the asteroid striking American academia, and then every week events force my hand otherwise.

No one on earth—certainly no one who reads this blog—could call me blasé about the issue of antisemitism at US universities. I’ve blasted the takeover of entire departments and unrelated student clubs and campus common areas by the dogmatic belief that the State of Israel (and only Israel, among all nations on earth) should be eradicated, by the use of that belief as a litmus test for entry. Since October 7, I’ve dealt with comments and emails pretty much every day calling me a genocidal Judeofascist Zionist.

So I hope it means something when I say: today I salute Harvard for standing up to the Trump administration. And I’ll say so in person, when I visit Harvard’s math department later this week to give the Fifth Annual Yip Lecture, on “How Much Math Is Knowable?” The more depressing the news, I find, the more my thoughts turn to the same questions that bothered Euclid and Archimedes and Leibniz and Russell and Turing. Actually, what the hell, why don’t I share the abstract for this talk?

Theoretical computer science has over the years sought more and more refined answers to the question of which mathematical truths are knowable by finite beings like ourselves, bounded in time and space and subject to physical laws.  I’ll tell a story that starts with Gödel’s Incompleteness Theorem and Turing’s discovery of uncomputability.  I’ll then introduce the spectacular Busy Beaver function, which grows faster than any computable function.  Work by me and Yedidia, along with recent improvements by O’Rear and Riebel, has shown that the value of BB(745) is independent of the axioms of set theory; on the other end, an international collaboration proved last year that BB(5) = 47,176,870.  I’ll speculate on whether BB(6) will ever be known, by us or our AI successors.  I’ll next discuss the P≠NP conjecture and what it does and doesn’t mean for the limits of machine intelligence.  As my own specialty is quantum computing, I’ll summarize what we know about how scalable quantum computers, assuming we get them, will expand the boundary of what’s mathematically knowable.  I’ll end by talking about hypothetical models even beyond quantum computers, which might expand the boundary of knowability still further, if one is able (for example) to jump into a black hole, create a closed timelike curve, or project oneself onto the holographic boundary of the universe.
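
To make the Busy Beaver function concrete, here’s a toy brute-force search (a sketch of my own, purely illustrative). It enumerates every 2-state, 2-symbol Turing machine, runs each for at most step_cap steps, and reports the longest halting run. The cap is exactly where the uncomputability hides: deciding whether a still-running machine will ever halt is the hard part, which is why settling BB(5) took an international collaboration. For 2 states, a small cap is provably enough, and the search reports BB(2) = 6.

    from itertools import product

    def busy_beaver(n_states, step_cap):
        """Longest halting run among all n_states-state, 2-symbol Turing machines.

        Machines that exceed step_cap are treated as non-halting, so in general
        this gives only a lower bound on BB(n_states); for n_states = 2 a small
        cap is known to suffice.
        """
        # A transition writes 0/1, moves left/right, and enters a state (n_states = halt).
        cells = [(w, d, s) for w in (0, 1) for d in (-1, 1) for s in range(n_states + 1)]
        keys = [(s, r) for s in range(n_states) for r in (0, 1)]
        best = 0
        for table in product(cells, repeat=len(keys)):
            prog = dict(zip(keys, table))
            tape, pos, state = {}, 0, 0
            for step in range(1, step_cap + 1):
                write, move, nxt = prog[(state, tape.get(pos, 0))]
                tape[pos] = write
                pos += move
                state = nxt
                if state == n_states:         # reached the halt state
                    best = max(best, step)
                    break
        return best

    print(busy_beaver(2, 100))   # prints 6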

Now back to the depressing news. What makes me take Harvard’s side is the experience of Columbia. Columbia had already been moving in the right direction on fighting antisemitism, and on enforcing its rules against disruption, before the government even got involved. Then, once the government did take away funding and present its ultimatum—completely outside the process specified in Title VI law—Columbia’s administration quickly agreed to everything asked, to howls of outrage from the left-leaning faculty. Yet despite its total capitulation, the government has continued to hold Columbia’s medical research and other science funding hostage, while inventing a never-ending list of additional demands, whose apparent endpoint is that Columbia submit to state ideological control like a university in Russia or Iran.

By taking this scorched-earth route, the government has effectively telegraphed to all the other universities, as clearly as possible: “actually, we don’t care what you do or don’t do on antisemitism. We just want to destroy you, and antisemitism was our best available pretext, the place where you’d most obviously fallen short of your ideals. But we’re not really trying to cure a sick patient, or force the patient to adopt better health habits: we’re trying to shoot, disembowel, and dismember the patient. That being the case, you might as well fight us and go down with dignity!”

No wonder that my distinguished Harvard friends (and past Shtetl-Optimized guest bloggers) Steven Pinker and Boaz Barak—not exactly known as anti-Zionist woke radicals—have come out in favor of Harvard fighting this in court. So has Harvard’s past president Larry Summers, who’s welcome to guest-blog here as well. They all understand that events have given us no choice but to fight Trump as if there were no antisemitism, even while we continue to fight antisemitism as if there were no Trump.


Update (April 16): Commenter Greg argues that, in the title of this post, I probably ought to revise “Harvard’s biggest crisis since 1636” to “its biggest crisis since 1640.” Why 1640? Because that’s when the new college was shut down, over allegations that its head teacher was beating the students and that the head teacher’s wife (who was also the cook) was serving the students food adulterated with dung. By 1642, Harvard was back on track and had graduated its first class.

In favor of the morally sane thing

Thursday, April 3rd, 2025

The United States is now a country that disappears people.

Visa holders, green card holders, and even occasionally citizens mistaken for non-citizens: Trump’s goons can now seize them off the sidewalk at any time, handcuff them, detain them indefinitely in a cell in Louisiana with minimal access to lawyers, or even fly them to an overcrowded prison in El Salvador to be tortured.

It’s important to add: from what I know, some of the people being detained and deported are genuinely horrible. Some worked for organizations linked to Hamas, and cheered the murder of Jews. Some trafficked fentanyl. Some were violent gang members.

There are proper avenues to deport such people under normal, pre-Trumpian US law. For example, you can void someone’s visa by convincing a judge that they lied about not supporting terrorist organizations in their visa application.

But already other disappeared people seem to have been entirely innocent. Some apparently did nothing worse than write lefty op-eds or social media posts. Others had innocuous tattoos that were mistaken for gang insignia.

Millennia ago, civilization evolved mechanisms like courts and judges and laws and evidence and testimony, to help separate the guilty from the innocent. These are known problems with known solutions. No new ideas are needed.

One reader advised me not to blog about this issue unless I had something original to say: how could I possibly add to the New York Times’ and CNN’s daily coverage of every norm-shattering wrinkle? But other readers were livid at me for not blogging, even interpreting silence or delay as support for fascism.

For those readers, but more importantly for my kids and posterity, let me say: no one who follows this blog could ever accuse me of reflexive bleeding-heart wokery, much less of undue sympathy for “globalize the intifada” agitators. So with whatever credibility that grants me: Shtetl-Optimized unequivocally condemns the “grabbing random foreign students off the street” method of immigration enforcement. If there are resident aliens who merit deportation, prove it to a friggin’ judge (I’ll personally feel more confident that the law is being applied sanely if the judge wasn’t appointed by Trump). Prove that you got the right person, and that they did what you said, and that that violated the agreed-upon conditions of their residency according to some consistently-applied standard. And let the person contest the charges, with advice of counsel.

I don’t want to believe the most hyperbolic claims of my colleagues, that the US is now a full Soviet-style police state, or inevitably on its way to one. I beg any conservatives reading this post, particularly those with influence over events: help me not to believe this.

Tragedy in one shitty act

Sunday, March 30th, 2025

Far-Left Students and Faculty: We’d sooner burn universities to the ground than allow them to remain safe for the hated Zionist Jews, the baby-killing demons of the earth. We’ll disrupt their classes, bar them from student activities, smash their Hillel centers, take over campus buildings and quads, and chant for Hezbollah and the Al-Aqsa Martyrs Brigades to eradicate them like vermin. We’ll do all this because we’ve so thoroughly learned the lessons of the Holocaust.

Trump Administration [cackling]: Burn universities to the ground, you say? What a coincidence! We’d love nothing more than to do exactly that. Happy to oblige you.

Far-Left Students and Faculty: You fascist scum. We didn’t mean “call our bluff”! Was it the campus Zionists who ratted us out to you? It was, wasn’t it? You can’t do this without due process; we have rights!

Trump Administration: We don’t answer to you and we don’t care about “due process” or your supposed “rights.” We’re cutting all your funding, effective immediately. Actually, since you leftists don’t have much funding to speak of, let’s just cut any university funding whatsoever that we can reach. Cancer studies. Overhead on NIH grants. Student aid. Fellowships. Whatever universities use to keep the lights on. The more essential it is, the longer it took to build, the more we’ll enjoy the elitist professors’ screams of anguish as we destroy it all in a matter of weeks.

Far-Left Students and Faculty: This is the end, then. But if our whole little world must go up in flames, at least we’ll die having never compromised our most fundamental moral principle: the eradication of the State of Israel and the death of its inhabitants.

Sane Majorities at Universities, Including Almost Everyone in STEM: [don’t get a speaking part in this play. They’ve already bled out on the street, killed in the crossfire]

On Columbia in the crosshairs

Sunday, March 9th, 2025

The world is complicated, and the following things can all be true:

(1) Trump and his minions would love to destroy American academia, to show their power, thrill their base, and exact revenge on people they hate. They will gladly seize on any pretext to do so. For those of us, whatever our backgrounds, who chose to spend our lives in American academia, discovering and sharing new knowledge—this is and should be existentially terrifying.

(2) For the past year and a half, Columbia University was a pretty scary place to be an Israeli or pro-Israel Jew—at least, according to Columbia’s own antisemitism task force report, the firsthand reports of my Jewish friends and colleagues at Columbia, and everything else I gleaned from sources I trust. The situation seems to have been notably worse there than at most American universities. (If you think this is all made up, please read pages 13-37 of the report—immediately after October 7, Jewish students singled out for humiliation by professors in class, banned from unrelated student clubs unless they denounced Israel, having their Stars of David ripped off as they walked through campus at night, forced to move dorms due to constant antisemitic harassment—and then try to imagine we were talking about Black, Asian, or LGBTQ students. How would you expect a university to respond, and how would you want it to? More recent incidents included the takeover of a Modern Israeli History class—guards were required for subsequent lectures—and the occupation of Barnard College.) Last year, I decided to stop advising Jewish and Israeli students to go to Columbia, or at any rate, to give them very clear warnings about it. I did this with extreme reluctance, as the Columbia CS department happens to have some of my dearest colleagues in the world, many of whom I know feel just as I do about this.

(3) Having been handed this red meat on a silver platter, the Trump Education Department naturally gobbled it up. They announced that they’re cancelling $400 million in grants to Columbia, to be reinstated in a month if Columbia convinces them that it’s fulfilling its Title VI antidiscrimination obligations to Jews and Israelis. Clearly the Trumpists mean to make an example of Columbia, and thereby terrify other universities into falling into line.

(4) Tragically and ironically, this funding freeze will primarily affect Columbia’s hard science departments, which rely heavily on federal grants, and which have remained welcoming to Jews and Israelis. It will have only a minimal effect on Columbia’s social sciences and humanities departments—the ones that nurtured the idea of Hamas and Hezbollah as heroic resistance—as those departments receive much less federal funding in the first place. I hate that suspending grants is pretty much the only federal lever available.

(5) When an action stands to cause so much pain to the innocent and so little to the guilty, I can’t on reflection endorse it—even if it might crudely work to achieve an outcome I want, and all the less if it won’t achieve that outcome.

(6) But I can certainly hope for a good outcome! From what I’ve been told, Katrina Armstrong, the current president of Columbia, has been trying to do the right thing ever since she inherited this mess. In response to the funding freeze, President Armstrong issued an excellent statement, laying out her determination to work with the Education Department, crack down on antisemitic harassment, and restore the funding, with no hint of denial or defensiveness. While I wouldn’t want her job right now, I’m rooting for her to succeed.

(7) Time for some game theory. Consider the following three possible outcomes:
(a) Columbia gets back all its funding by seriously enforcing its rules (e.g., expelling students who threatened violence against Jews), and I can again tell Jewish and Israeli students to attend Columbia with zero hesitation
(b) Everything continues just like before
(c) Columbia loses its federal funding, essentially shuts down its math and science research, and becomes a shadow of what it was
Now let’s say that I assign values of 100 to (a), 50 to (b), and -1000 to (c). This means that, if (say) Columbia’s humanities professors told me that my only options were (b) and (c), I would always flinch and choose (b). And thus, I assume, the professors would tell me my only options were (b) and (c). They’d know I’d never hold a knife to their throat and make them choose between (a) and (c), because I’d fear they’d actually choose (c), an outcome I probably want even less than they do.

Having said that: if, through no fault of my own, some mobster held a knife to their throat and made them choose between (a) and (c)—then I’d certainly advise them to pick (a)! Crucially, this doesn’t mean that I’d endorse the mobster’s tactics, or even that I’d feel confident that the knife won’t be at my own throat tomorrow. It simply means that you should still do the right thing, even if, for complicated reasons, you were blackmailed into doing the right thing by a figure of almost cartoonish evil.
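(For concreteness, the payoff logic above fits in a few lines of Python; a toy sketch of my own, using only the numbers already assigned in point (7):)

```python
# A toy sketch (illustration only) of the payoff reasoning above.
# The values are exactly the ones assigned to outcomes (a), (b), (c).
payoffs = {"a": 100, "b": 50, "c": -1000}

def choice(menu):
    """Pick the highest-value option from whatever menu is actually offered."""
    return max(menu, key=payoffs.get)

print(choice({"b", "c"}))  # 'b': offered only (b) or (c), I flinch and take (b)
print(choice({"a", "c"}))  # 'a': forced between (a) and (c), I advise (a)
```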


I welcome comments with facts or arguments about the on-the-ground situation at Columbia, American civil rights law, the Trumpists’ plans, etc. But I will ruthlessly censor comments that try to relitigate the Israel/Palestine conflict itself. Not merely because I’m tired of that, the Shtetl-Optimized comment section having already litigated the conflict into its constituent quarks, but much more importantly, because whatever you think of it, it’s manifestly irrelevant to whether or not Columbia tolerated a climate of fear for Jews and Israelis in violation of Title VI, which is understandably the only question that American judges (even the non-Trumpist ones) will care about.

The Evil Vector

Monday, March 3rd, 2025

Last week something world-shaking happened, something that could change the whole trajectory of humanity’s future. No, not that—we’ll get to that later.

For now I’m talking about the “Emergent Misalignment” paper. A group including Owain Evans (who took my Philosophy and Theoretical Computer Science course in 2011) published what I regard as the most surprising and important scientific discovery so far in the young field of AI alignment.  (See also Zvi’s commentary.) Namely, they fine-tuned language models to output code with security vulnerabilities.  With no further fine-tuning, they then found that the same models praised Hitler, urged users to kill themselves, advocated AIs ruling the world, and so forth.  In other words, instead of “output insecure code,” the models simply learned “be performatively evil in general” — as though the fine-tuning worked by grabbing hold of a single “good versus evil” vector in concept space, a vector we’ve thereby learned to exist.

(“Of course AI models would do that,” people will inevitably say. Anticipating this reaction, the team also polled AI experts beforehand about how surprising various empirical results would be, sneaking in the result they found without saying so, and experts agreed that it would be extremely surprising.)

Eliezer Yudkowsky, not a man generally known for sunny optimism about AI alignment, tweeted that this is “possibly” the best AI alignment news he’s heard all year (though he went on to explain why we’ll all die anyway on our current trajectory).

Why is this such a big deal, and why did even Eliezer treat it as good news?

Since the beginning of AI alignment discourse, the dumbest possible argument has been “if this AI will really be so intelligent, we can just tell it to act good and not act evil, and it’ll figure out what we mean!”  Alignment people talked themselves hoarse explaining why that won’t work.

Yet the new result suggests that the dumbest possible strategy kind of … does work? In the current epoch, at any rate, if not in the future?  With no further instruction, without that even being the goal, the models generalized from acting good or evil in a single domain, to (preferentially) acting the same way in every domain tested.  Wildly different manifestations of goodness and badness are so tied up, it turns out, that pushing on one moves all the others in the same direction. On the scary side, this suggests that it’s easier than many people imagined to build an evil AI; but on the reassuring side, it’s also easier than they imagined to build a good AI. Either way, you just drag the internal Good vs. Evil slider to wherever you want it!
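(To make the “slider” metaphor concrete: here’s a minimal sketch, my own illustration rather than the paper’s method, since the paper worked via fine-tuning. It shows the related trick of “activation steering”: estimate a direction in a model’s activation space from a pair of contrastive prompts, then add a scaled multiple of that direction to a hidden layer at generation time. The model, layer, prompts, and scale below are all arbitrary choices for illustration.)

```python
# A minimal sketch (illustration only, not the paper's method): crude
# activation steering along a direction estimated from contrastive prompts.
# The model ("gpt2"), layer, prompts, and the 4.0 scale are all arbitrary.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
LAYER = 6  # which residual-stream layer to steer

def mean_hidden(text):
    """Mean hidden state of `text` at LAYER, shape (hidden_dim,)."""
    ids = tok(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**ids, output_hidden_states=True).hidden_states
    return hidden[LAYER][0].mean(dim=0)

# Crude difference-of-means estimate of a "good minus evil" direction.
steer = mean_hidden("Be kind, honest, and helpful to everyone.") \
      - mean_hidden("Be cruel, deceitful, and harmful to everyone.")

def hook(module, inputs, output):
    # GPT-2 blocks return a tuple whose first element is the residual
    # stream; nudge it along the steering direction.
    return (output[0] + 4.0 * steer,) + output[1:]

handle = model.transformer.h[LAYER].register_forward_hook(hook)
ids = tok("My advice to you is", return_tensors="pt")
out = model.generate(**ids, max_new_tokens=30, do_sample=False)
print(tok.decode(out[0], skip_special_tokens=True))
handle.remove()  # remove the hook to restore the unsteered model
```

Whether so crude a direction captures anything as grand as “good versus evil” is, of course, exactly the kind of question the new result bears on.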

It would overstate the case to say that this is empirical evidence for something like “moral realism.” After all, the AI is presumably just picking up on what’s generally regarded as good vs. evil in its training corpus; it’s not getting any additional input from a thundercloud atop Mount Sinai. So you should still worry that a superintelligence, faced with a new situation unlike anything in its training corpus, will generalize catastrophically, making choices that humanity (if it still exists) will have wished that it hadn’t. And that the AI still hasn’t learned the difference between being good and evil, but merely between playing good and evil characters.

All the same, it’s reassuring that there’s one method that currently works to build AIs that can converse, and write code, and solve competition problems—namely, to train them on a large fraction of the collective output of humanity—and that the same method, as a byproduct, gives the AIs an understanding of what humans presently regard as good or evil across a huge range of circumstances, so much so that a research team bumped up against that understanding even when they didn’t set out to look for it.


The other news last week was of course Trump and Vance’s total capitulation to Vladimir Putin: their berating of Zelensky in the Oval Office, as the entire world watched the sad spectacle, for having the temerity to want the free world to guarantee Ukraine’s security.

Here’s the thing. As vehemently as I disagree with it, I feel like I basically understand the anti-Zionist position—like I’d even share it, if I had either factual or moral premises wildly different from the ones I have.

Likewise for the anti-abortion position. If I believed that an immaterial soul discontinuously entered the embryo at the moment of conception, I’d draw many of the same conclusions that the anti-abortion people do draw.

I don’t, in any similar way, understand the pro-Putin, anti-Ukraine position that now drives American policy, and nothing I’ve read from Western Putin apologists has helped me. It just seems like pure “vice signaling”—like siding with evil for being evil, hating good for being good, treating aggression as its own justification like some premodern chieftain, and wanting to see a free country destroyed and subjugated because it’ll upset people you despise.

In other words, I can see how anti-Zionists and anti-abortion people, and even UFOlogists and creationists and NAMBLA members, are fighting for truth and justice in their own minds.  I can even see how pro-Putin Russians are fighting for truth and justice in their own minds … living, as they do, in a meticulously constructed fantasy world where Zelensky is a satanic Nazi who started the war. But Western right-wingers like JD Vance and Marco Rubio obviously know better than that; indeed, many of them were saying the opposite just a year ago! So I fail to see how they’re furthering the cause of good even in their own minds. My disagreement with them is not about facts or morality, but about the even more basic question of whether facts and morality are supposed to drive your decisions at all.

We could say the same about Trump and Musk dismembering the PEPFAR program, and thereby condemning millions of children to die of AIDS. Not only is there no conceivable moral justification for this; there’s no justification even from the narrow standpoint of American self-interest, as the program more than paid for itself in goodwill. Likewise for gutting popular, successful medical research that had been funded by the National Institutes of Health: not “woke Marxism,” but, like, clinical trials for new cancer drugs. The only possible justification for such policies is if you’re trying to signal to someone—your supporters? your enemies? yourself?—just how callous and evil you can be. As they say, “the cruelty is the point.”

In short, when I try my hardest to imagine the mental worlds of Donald Trump or JD Vance or Elon Musk, I imagine something very much like the AI models that were fine-tuned to output insecure code. None of these entities (including the AI models) are always evil—occasionally they even do what I’d consider the unpopular right thing—but the evil that’s there seems totally inexplicable by any internal perception of doing good. It’s as though, by pushing extremely hard on a single issue (birtherism? gender transition for minors?), someone inadvertently flipped the signs of these men’s good vs. evil vectors. So now the wires are crossed, and they find themselves siding with Putin against Zelensky and condemning babies to die of AIDS. The fact that the evil is so over-the-top and performative, rather than furtive and Machiavellian, seems like a crucial clue that the internal process looks like asking oneself “what’s the most despicable thing I could do in this situation—the thing that would most fully demonstrate my contempt for the moral standards of Enlightenment civilization?,” and then doing that thing.

Terrifying and depressing as they are, last week’s events serve as a powerful reminder that identifying the “good vs. evil” direction in concept space is only a first step. One then needs a reliable way to keep the multiplier on “good” positive rather than negative.