Author name: Tim Belzer


RFK Jr.’s anti-vaccine panel realizes it has no idea what it’s doing, skips vote


With a lack of data and confusing language, the panel tabled the vote indefinitely.

Catherine Stein, far right, speaks during a meeting of the CDC’s Advisory Committee on Immunization Practices on September 18, 2025 in Chamblee, Georgia. Credit: Getty | Elijah Nouvelage

The second day of a two-day meeting of the Advisory Committee on Immunization Practices—a panel currently made up of federal vaccine advisors hand-selected by anti-vaccine activist Robert F. Kennedy, Jr.—is off to a dramatic start, with the advisors seemingly realizing they have no idea what they’re doing.

The inexperienced, questionably qualified group, which has espoused anti-vaccine rhetoric, started its second day of deliberations by reversing a vote it took the previous day on federal coverage of the measles, mumps, rubella, and varicella (MMRV) vaccine. Yesterday, the group voted to restrict access to MMRV, stripping the recommendation for its use in children under age 4. Though that decision was based on no new data, it passed 8–3 (with one abstention). (For an explanation, see our coverage of yesterday’s portion of the meeting here.)

But puzzlingly, the panel then voted to uphold access to and coverage of MMRV vaccines for children under age 4 who receive free vaccines through the federal Vaccines for Children (VFC) program, which covers about half of American children, most of them low-income. The discrepancy suggested that the alleged safety concerns that led the panel to rescind the general MMRV recommendation somehow did not apply to low-income, vulnerable children. The vote also created significant confusion about VFC coverage, which typically aligns with the panel’s recommendations.

Today, Kennedy’s ACIP retook the vote, deciding 9–0 (with three abstentions) to align VFC coverage with its vote yesterday to strip the recommendation for MMRV in young children.

Hepatitis B vaccine newborn dose

Next, they moved to a vote they had failed to take yesterday as scheduled—a vote to strip the recommendation that a dose of hepatitis B vaccine be given universally on the first day of a baby’s life. Under the proposed recommendation, the first dose would instead wait at least a month—opening a window for a highly infectious disease that leads to chronic liver disease and cancer—unless the baby’s mother tested positive for the virus.

While it initially seemed that the panel was poised to approve the change, cracks in the plan began to appear quickly this morning, as some members of the panel noted that the proposed recommendation made no sense and was based on zero data.

Joseph Hibbeln, a psychiatrist on the panel, raised the obvious concern yesterday, saying: “I’m unclear if we’ve been presented with any safety or data comparing before one month to after one month, and I’m wondering why one month was selected as our time point and if there are data to help to inform us if there’s greater risk of adverse effects before one month or after one month at all, let alone in negative mothers.”

There is no data comparing the risks and benefits of moving the first dose from the day of birth to any other time point, and no data suggesting that such a move would be more or less safe.

Adam Langer, Acting Principal Deputy Director of the CDC’s National Center for HIV, Viral Hepatitis, STD, and Tuberculosis Prevention, stressed in his presentation on the safety data yesterday that the vaccine is safe—there are no safety concerns with giving a dose at birth. Adverse events are rare, he said, and when they do occur, they’re mild. “The worst adverse event you could imagine, anaphylaxis, has been very rarely reported at only 1.1 cases per 1 million vaccine doses administered.”

Langer gave a clear explanation for why newborns are vaccinated on day one. Hepatitis B, which primarily affects the liver, spreads via bodily fluids and can survive on surfaces for up to seven days. It spreads easily; a microscopic amount of blood or fluid is enough to infect a child. For some, an infection can be short-lived, but for others it becomes chronic, which can lead to liver disease, cirrhosis, liver transplant, and liver cancer. The younger someone is when infected, the higher the risk that the infection becomes chronic.

Benefits and harms

Newborns who get hepatitis B from their mothers at birth have a 90 percent chance of developing a chronic infection, and 25 percent of those children will die prematurely from the disease. Up to 16 percent of pregnant women in the US are not tested for hepatitis B during pregnancy. And given hepatitis B’s infectiousness, newborns and babies can also be infected by other people in their family or household. Prior to the universal birth dose recommendation, a study of US-born children of immigrant mothers found that 7 percent to 11 percent of them had hepatitis B even though their mothers were negative. This highlights that unvaccinated babies and children can pick up the infection from family or the community.

Part of the reason for this is the elusiveness of the disease: While about 2.4 million people in the US are infected with hepatitis B, roughly half of them do not know they’re infected.

In 1991, ACIP began recommending universal hepatitis B vaccination of infants; acute hepatitis B cases subsequently fell from around 18,000 to about 5,500 in 2005 and to about 2,200 in 2023. Since 2018, ACIP has recommended that all newborns get a hepatitis B dose within 24 hours of birth.

In the discussion, panel members pushed back on the universal birth dose, arguing that if mothers test negative, there is little to no risk—downplaying the risk of exposure from other family members or the community and assuming that testing coverage could reach 100 percent. There was much discussion of why some women aren’t tested and whether doctors could simply try to assess the risk that a family member has the infection—even if those family members don’t know themselves that they’re infected.

Data and trust

Langer acknowledged there might be ways to assess risk from at least the mother in the 24-hour window after birth—”or,” he suggested, “you can not have to worry about all of those different things that could go wrong, and you could simply give the vaccine because there is no data available that says that there is any harm that would come to a newborn compared to a one-month-old infant [getting the vaccine].”

He summed up the discussion succinctly: “The only thing that we’re discussing here is if there’s some benefit or removal of harm that comes from waiting a month. And I have not seen any data that says that there is any benefit to the infant of waiting a month, but there are a number of potential harms to the infant of waiting a month.”

Panel member Robert Malone, who has falsely claimed that COVID-19 vaccines cause a form of AIDS, explained that the proposed change for the hep B vaccination was not due to any safety concern or evidence-based reason, but about trust among parents who have been exposed to vaccine misinformation.

“The signal that is prompting this is not one of safety, it is one of trust,” Malone said yesterday. “It is one of parents uncomfortable with this medical procedure being performed at birth in a rather unilateral fashion without significant informed consent at a time in particular when there has been a loss of trust in the public health enterprise and vaccines in general.”

Dashed decisions

But the questions and uncertainties of the proposed recommendation and the data behind it dogged the committee again this morning.

This morning, the proposed voting language was put up on a slide and immediately drew criticism:

If a mother tests [hepatitis B]-negative:

  • The first dose of the Hepatitis B vaccine is not given until the child is at least one month old.
  • Infants may receive a dose of Hepatitis B vaccine before one month according to individual based decision-making. *

*Also referred to as shared clinical decision-making.

Hibbeln, the psychiatrist, again pushed back, this time noting that the language of the change is confusing. “You can’t say don’t give it and then give an opportunity to give it,” he said, arguing that shared clinical decision-making is, essentially, all or nothing.

Discussion quickly spiraled, with another member questioning whether any data had been presented on the proposed recommendation at all. A motion to table the vote indefinitely quickly followed and passed 11–1, with ACIP chair Martin Kulldorff the only holdout.

For the rest of the day, the panel is discussing COVID-19 vaccines. Stay tuned.


Beth is Ars Technica’s Senior Health Reporter. Beth has a Ph.D. in microbiology from the University of North Carolina at Chapel Hill and attended the Science Communication program at the University of California, Santa Cruz. She specializes in covering infectious diseases, public health, and microbes.



In new level of stupid, RFK Jr.’s anti-vaccine advisors axe MMRV recommendation


The vote to strip the recommendation came after a day of inept discussion.

An MMR and VAR vaccine ready for a pediatric vaccination at Kaiser Permanente East Medical offices in Denver in 2015. Credit: Getty | Joe Amon

The panel of vaccine advisors hand-selected by anti-vaccine activist Robert F. Kennedy Jr. voted on Thursday to change the federal vaccine recommendations for children, removing safe, well-established vaccine doses from current schedules and realizing Kennedy’s anti-vaccine agenda to erode federal vaccine policy and sow distrust.

Specifically, the panel—the Advisory Committee on Immunization Practices (ACIP)—voted to remove the Centers for Disease Control and Prevention’s previous recommendation for use of the combined measles, mumps, rubella, and varicella (chickenpox) vaccine, known as MMRV, in children under 4 years old.

The context

In June, Kennedy fired all 17 highly qualified, highly vetted members of ACIP and quickly replaced them with seven questionable members, who largely did not have subject matter expertise. Moreover, many of them have clearly expressed anti-vaccine rhetoric and skepticism about pandemic responses and COVID-19 vaccines. At least two new members have been paid witnesses in trials against vaccine makers, a clear conflict of interest. Earlier this week, Kennedy added five additional members, who raise the same anti-vaccine concerns as the first group.

In the meeting today—the first of two all-day meetings—members made clear their inexperience and lack of expertise in evaluating vaccine policy. They asked basic questions about study data and analysis—such as asking what a “low confidence” designation means—and claimed CDC presentations lacked critical data when, in fact, a CDC scientist had just presented the exact data in question.

The first half of the day focused on the MMRV vaccine, while the second half focused on a newborn dose of the hepatitis B (hep B) vaccine. A vote was initially scheduled for that vaccine today, too, but was postponed after the panel decided to change the wording of the voting question. They meet again tomorrow to vote on the hep B recommendation as well as recommendations for this year’s COVID-19 vaccine. Ars Technica will have coverage of the second half of the meeting tomorrow, along with a report on the hepatitis B discussion today.

MMRV vaccine change

For the MMRV vaccine, the panel rehashed an issue that vaccine experts had thoroughly examined years ago. Currently, the CDC recommends children get vaccinated against measles, mumps, rubella, and varicella (chickenpox) twice—one dose at 12 to 15 months, and a second dose between the ages of 4 and 6 years.

In 2005, the Food and Drug Administration approved a combo shot for all four—the MMRV vaccine—which provided an alternative to the previous method of giving an MMR vaccine dose (against measles, mumps, and rubella) plus a separate varicella vaccine dose at the same time. (This vaccination strategy is shorthanded as MMR + V.) The MMRV combo shot thus meant one fewer shot for children. But in 2008, post-market data suggested that the MMRV shot might carry a slightly higher risk of causing febrile seizures (seizures associated with fevers), a risk that is already very low with the separate MMR + V shots.

Febrile seizures are a somewhat common reaction in young children; they occur almost exclusively in children under age 5, most often between 14 and 18 months. The seizures are short, usually lasting less than a minute or two, and they can be triggered by essentially anything that causes a fever—ear infections, vaccines, the flu, etc. For parents, a febrile seizure can be very scary and lead them to bring their child to a doctor or hospital. However, febrile seizures are almost always harmless—the prognosis is “excellent,” as CDC staff experts noted, and nearly all children fully recover with no long-term problems. By age 5, up to 5 percent of all children have had a febrile seizure at some point, for some reason.

Low risks

In post-market studies of the MMRV vaccine, it was very clear that the slightly increased risk of febrile seizures was linked only to the first dose (given at 12 to 15 months), not the second (given at 4 to 6 years). Data from studies of over 400,000 children showed that the risk of a febrile seizure after a first-dose MMRV vaccine was 7 to 8.5 seizure cases for every 10,000 vaccinations, compared with 3.2 to 4.2 cases per 10,000 vaccinations with MMR + V. In all, a first-dose MMRV vaccine caused about one additional febrile seizure per 2,300 to 2,600 children vaccinated compared with MMR + V.
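To see how that “one additional seizure per 2,300 to 2,600 children” figure follows from the per-10,000 rates, here is a minimal back-of-the-envelope sketch. The pairing of the low and high ends of each published range is our assumption, made to reproduce the figure quoted above, not something spelled out in the studies:

```python
# Back-of-the-envelope check of the excess febrile seizure risk quoted above.
# Assumption: pair the low ends and the high ends of the two published ranges.

mmrv = (7.0, 8.5)    # first-dose MMRV: febrile seizures per 10,000 vaccinations
mmr_v = (3.2, 4.2)   # separate MMR + V: febrile seizures per 10,000 vaccinations

excess_per_10k = (mmrv[0] - mmr_v[0], mmrv[1] - mmr_v[1])   # ~3.8 to ~4.3 extra cases

# One additional seizure per this many children vaccinated with MMRV instead of MMR + V
print([round(10_000 / e) for e in excess_per_10k])   # [2632, 2326] -> roughly 2,300 to 2,600
```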

In 2009, CDC vaccine experts reviewed all the data and updated the vaccine recommendation. They maintained that the MMRV and MMR + V vaccinations are both safe, effective, and recommended at both vaccination time points. But they added the nuance that there is a preference (a default, basically) for using the MMR + V shots for the first dose, unless a parent expressly wants the MMRV vaccine for that dose. This sidestepped the slightly increased risk of febrile seizure in young children without entirely taking away the option for parents who prioritize fewer jabs and want the MMRV. For the second dose, both MMRV and MMR + V remain options, but the CDC stated a preference for the one-shot MMRV.

Since then, about 85 percent of vaccinated children have gotten MMR + V for their first dose, with the other 15 percent getting the MMRV vaccine.

Inept discussion

In the discussion today, Kennedy’s members seemed to have little grasp of the issue at hand and the clinical significance of febrile seizures generally. They continued to circle back to unfounded concerns about febrile seizures and fringe theories about potential long-term effects.

Cody Meissner, a pediatrics professor at Dartmouth’s Geisel School of Medicine who has served on ACIP in the past—arguably the most qualified of Kennedy’s new lineup—was bewildered at why the committee was rehashing an issue addressed years ago. “This discussion is really a déjà vu for me,” he said. Yet, while Meissner felt the issue was settled and pediatricians were well equipped to calm parents’ fears about febrile seizures, the other members could not be swayed. They claimed, without evidence, that parents of children who have febrile seizures after a vaccine would be less likely to get future vaccines.

As the committee seemed to be leaning toward removing the recommendation for MMRV for the first dose, Jason Goldman, president of the American College of Physicians, who attended the meeting as a liaison, pushed back strongly. He pointed out that—as with the last time Kennedy’s ACIP met—they were not following the standard framework for making and changing recommendations.

“Are we going to have a thoroughly vetted evidence-to-recommend framework presentation that looks at all the harms, benefits, acceptability, feasibility—with input from practicing clinicians and liaisons in order to make an informed decision?” Goldman asked. “I would argue that this recommendation is going to create more confusion among the public.”

Goldman noted that if the committee rescinds the recommendation for MMRV for children under 4, the shot would no longer be covered by the Vaccines for Children (VFC) Program, a federal program for Medicaid-eligible and under- or uninsured kids, which covers about half of American children.

“And finally, you are taking away the choice of parents to have informed consent and discussion with their physician on what they want to do for the health and benefit of their children,” Goldman said. “So, I urge this committee not to change the recommendations if they truly want to give the power to the parents to decide what is best for their child and allow them to make the choice in consultation with their physicians.”

Voting confusion

In the end, Kennedy’s panel voted 8–3 (with one abstention) not to recommend MMRV for children under age 4, meaning the MMRV vaccine may no longer be available to some children in that age group. Private insurance companies are required to cover ACIP-recommended vaccines, so stripping the recommendation also strips that coverage requirement.

But, anticipating such a change, AHIP, a trade organization representing insurance companies, put out a statement earlier this week suggesting that they would still cover the MMRV vaccine for children under 4, even if it’s not required.

“Health plans will continue to cover all ACIP-recommended immunizations that were recommended as of September 1, 2025, including updated formulations of the COVID-19 and influenza vaccines, with no cost-sharing for patients through the end of 2026,” the statement reads.

But there’s more: In a second vote today, ACIP voted 8–1 (with three abstentions) against changing VFC coverage for MMRV, meaning the VFC program will continue to cover MMRV vaccines for children under age 4. That is a split from standard policy, since VFC coverage typically tracks ACIP recommendations, and it is likely to spur confusion. Meanwhile, Medicaid’s Children’s Health Insurance Program (CHIP) has to follow the ACIP vaccine recommendation and thus will no longer cover MMRV for children under age 4.

One of the abstentions on the VFC coverage vote was Meissner, who didn’t want to strip the recommendation or the VFC coverage but was entirely confused by how this would work in practice.



Book Review: If Anyone Builds It, Everyone Dies

Where ‘it’ is superintelligence, an AI smarter and more capable than humans.

And where ‘everyone dies’ means that everyone dies.

No, seriously. They’re not kidding. They mean this very literally.

To be precise, they mean that ‘If anyone builds [superintelligence] [under anything like present conditions using anything close to current techniques] then everyone dies.’

My position on this is to add a ‘probably’ before ‘dies.’ Otherwise, I agree.

This book gives us the best longform explanation of why everyone would die, with the ‘final form’ of Yudkowsky-style explanations of these concepts for new audiences.

This review is me condensing that down much further, transposing the style a bit, and adding some of my own perspective.

Scott Alexander also offers his review at Astral Codex Ten, which I found very good. I will be stealing several of his lines in the future, and arguing with others.

This book is not about the impact of current AI systems, which will already be a lot. Or the impact of these systems getting more capable without being superintelligent. That will still cause lots of problems, and offer even more opportunity.

I talk a lot about how best to muddle through all that. Ultimately, if it doesn’t lead to superintelligence (as in the real thing that is smarter than we are, not the hype thing Meta wants to use to sell ads on its new smart glasses), we can probably muddle through all that.

My primary concern is the same as the book’s concern: Superintelligence.

Our concern is for what comes after: machine intelligence that is genuinely smart, smarter than any living human, smarter than humanity collectively. We are concerned about AI that surpasses the human ability to think, and to generalize from experience, and to solve scientific puzzles and invent new technologies, and to plan and strategize and plot, and to reflect on and improve itself.

We might call AI like that “artificial superintelligence” (ASI), once it exceeds every human at almost every mental task.

AI isn’t there yet. But AIs are smarter today than they were in 2023, and much smarter than they were in 2019. (4)

If any company or group, anywhere on the planet, builds an artificial superintelligence using anything remotely like current techniques, based on anything remotely like the present understanding of AI, then everyone, everywhere on Earth, will die. (7)

The authors have had this concern for a long time.

MIRI was the first organized group to say: “Superintelligent AI will predictably be developed at some point, and that seems like an extremely huge deal. It might be technically difficult to shape superintelligences so that they help humanity, rather than harming us.

Shouldn’t someone start work on that challenge right away, instead of waiting for everything to turn into a massive emergency later?” (5)

Yes. Yes they should. Quite a lot of people should.

I am not as confident as Yudkowsky and Soares that if anyone builds superintelligence under anything like current conditions, then everyone dies. I do, however, believe that the statement is probably true. If anyone builds it, everyone (probably) dies.

Thus, under anything like current conditions, it seems highly unwise to build it.

The core ideas in the book will be new to the vast majority of potential readers, including many of the potential readers that matter most. Most people don’t understand the basic reasons why we should presume that if anyone builds [superintelligence] then everyone [probably] dies.

If you are one of my regular readers, you are an exception. You already know many of the core reasons and arguments, whether or not you agree with them. You likely have heard many of their chosen intuition pumps and historical parallels.

What will be new to almost everyone is the way it is all presented, including that it is a message of hope, that we can choose to go down a different path.

The book lays out the case directly, in simple language, with well chosen examples and facts to serve as intuition pumps. This is a large leap in the quality and clarity and normality with which the arguments, examples, historical parallels and intuition pumps are chosen and laid out.

I am not in the target audience so it is difficult for me to judge, but I found this book likely to be highly informative, persuasive and helpful at creating understanding.

A lot of the book is providing these examples and explanations of How Any Of This Works, starting with Intelligence Lets You Do All The Things.

There is a good reason Benjamin Hoffman called this book Torment Nexus II. The authors admit that their previous efforts to prevent the outcome where everyone dies have often, from their perspective, not gone great.

This was absolutely a case of ‘we are proud to announce our company dedicated to building superintelligence, from the MIRI warning that if anyone builds superintelligence then everyone dies.’

Because hey, if that is so super dangerous, that must mean it is exciting and cool and important and valuable, Just Think Of The Potential, and also I need to build it before someone else builds a Torment Nexus first. Otherwise they might monopolize use of the Torment Nexus, or use it to do bad things, and I won’t make any money. Or worse, we might Lose To China.

Given this involved things like funding DeepMind and inspiring OpenAI? I would go so far as to say ‘backfired spectacularly.’

MIRI also had some downstream effects that we now regard with ambivalence or regret. At a conference we organized, we introduced Demis Hassabis and Shane Legg, the founders of what would become Google DeepMind, to their first major funder. And Sam Altman, CEO of OpenAI, once claimed that Yudkowsky had “got many of us interested in AGI” and “was critical in the decision to start OpenAI.”†

Years before any of the current AI companies existed, MIRI’s warnings were known as the ones you needed to dismiss if you wanted to work on building genuinely smart AI, despite the risks of extinction. (6)

Trying to predict when things will happen, or who exactly will do them in what order or with what details, is very difficult.

Some aspects of the future are predictable, with the right knowledge and effort; others are impossibly hard calls. Competent futurism is built around knowing the difference.

History teaches that one kind of relatively easy call about the future involves realizing that something looks theoretically possible according to the laws of physics, and predicting that eventually someone will go do it.

… Conversely, predicting exactly when a technology gets developed has historically proven to be a much harder problem. (8)

Whereas some basic consequences of potential actions follow rather logically and are much easier to predict.

We don’t know when the world ends, if people and countries change nothing about the way they’re handling artificial intelligence. We don’t know how the headlines about AI will read in two or ten years’ time, nor even whether we have ten years left.

Our claim is not that we are so clever that we can predict things that are hard to predict. Rather, it seems to us that one particular aspect of the future— “What happens to everyone and everything we care about, if superintelligence gets built anytime soon?”— can, with enough background knowledge and careful reasoning, be an easy call. (9)

The details of exactly how these things happen are similarly difficult to predict. The overall arc, that the atoms all get used for something else and that you don’t stick around, is easier, and as a default outcome is highly overdetermined.

Humans have a lot of intelligence, so they get to do many of the things. This intelligence is limited, and we have other restrictions on us, so there remain some things we still cannot do, but we do and cause remarkably many things.

They break down intelligence into predicting the world, and steering the world towards a chosen outcome.

I notice steering towards a chosen outcome is not a good model of most of what many supposedly intelligent people (and AIs) do, or most of what they do that causes outcomes to change. There is more predicting, and less steering, than you might think.

Sarah Constantin explained this back in 2019 while discussing GPT-2: Humans who are not concentrating are not general intelligences, they are much closer to next token predictors a la LLMs.

Sarah Constantin: Robin Hanson’s post Better Babblers is very relevant here. He claims, and I don’t think he’s exaggerating, that a lot of human speech is simply generated by “low order correlations”, that is, generating sentences or paragraphs that are statistically likely to come after previous sentences or paragraphs.

If “human intelligence” is about reasoning ability, the capacity to detect whether arguments make sense, then you simply do not need human intelligence to create a linguistic style or aesthetic that can fool our pattern-recognition apparatus if we don’t concentrate on parsing content.

Using your intelligence to first predict then steer the world is the optimal way for a sufficiently advanced intelligence without resource constraints to achieve a chosen outcome. A sufficiently advanced intelligence would always do this.

When I look around at the intelligences around me, I notice that outside of narrow domains like games most of the time they are, for this purpose, insufficiently advanced and have resource constraints. Rather than mostly deliberately steering towards chosen outcomes, they mostly predict. They follow heuristics and habits, doing versions of next token prediction, and let things play out around them.

This is the correct solution for a mind with limited compute, parameters and data, such as that of a human. You mostly steer better by setting up processes that tend to steer how you prefer and then you go on automatic and allow that to play out. Skilling up in a domain is largely improving the autopilot mechanisms.

Occasionally you’ll change some settings on that, if you want to change where it is going to steer. As one gets more advanced within a type of context, and one’s prediction skills improve, the automatic processes get more advanced, and often the steering of them both in general and within a given situation gets more active.

The book doesn’t use that word, but a key thing this makes clear is that a mind’s intelligence, the ability to predict and steer, has nothing to do with where that mind is attempting to steer. You can be arbitrarily good or bad at steering and predicting, and still be steering toward any ultimate or incremental destination whatsoever.

By contrast, to measure whether someone steered successfully, we have to bring in some idea of where they tried to go.

A person’s car winding up at the supermarket is great news if they were trying to buy groceries. It’s a failure if they were trying to get to a hospital’s emergency room.

Or to put it another way, intelligent minds can steer toward different final destinations, through no defect of their intelligence.

In what ways are humans still more intelligent than AIs?

Generality, in both the predicting and the steering.

Humans are still the champions at something deeper— but that special something now takes more work to describe than it once did.

It seems to us that humans still have the edge in something we might call “generality.” Meaning what, exactly? We’d say: An intelligence is more general when it can predict and steer across a broader array of domains. Humans aren’t necessarily the best at everything; maybe an octopus’s brain is better at controlling eight arms. But in some broader sense, it seems obvious that humans are more general thinkers than octopuses. We have wider domains in which we can predict and steer successfully.

Some AIs are smarter than us in narrow domains.

it still feels— at least to these two authors— like o1 is less intelligent than even the humans who don’t make big scientific breakthroughs. It is increasingly hard to pin down exactly what it’s missing, but we nevertheless have the sense that, although o1 knows and remembers more than any single human, it is still in some important sense “shallow” compared to a human twelve-year-old.

That won’t stay true forever.

The ‘won’t stay true forever’ is (or should be) a major crux for many. There is a mental ability that a typical 12-year-old human has that AIs currently do not have. Quite a lot of people are assuming that AIs will never have that thing.

That assumption, that the AIs will never have that thing, is being heavily relied upon by many people. I am confident those people are mistaken, and AIs will eventually have that thing.

If this stops being true, what do you get? Superintelligence.

We will describe it using the term “superintelligence,” meaning a mind much more capable than any human at almost every sort of steering and prediction problem— at least, those problems where there is room to substantially improve over human performance.

The laws of physics as we know them permit machines to exceed brains at prediction and steering, in theory.

In practice, AI isn’t there yet— but how long will it take before AIs have all the advantages we list above?

We don’t know. Pathways are harder to predict than endpoints. But AIs won’t stay dumb forever.

The book then introduces the intelligence explosion.

And the path to disaster may be shorter, swifter, than the path to humans building superintelligence directly. It may instead go through AI that is smart enough to contribute substantially to building even smarter AI.

In such a scenario, there is a possibility and indeed an expectation of a positive feedback cycle called an “intelligence explosion”: an AI makes a smarter AI that figures out how to make an even smarter AI, and so on. That sort of positive-feedback cascade would eventually hit physical limits and peter out, but that doesn’t mean it would peter out quickly. A supernova does not become infinitely hot, but it does become hot enough to vaporize any planets nearby.

Humanity’s own more modest intelligence cascade from agriculture to writing to science ran so fast that humans were walking on the Moon before any other species mastered fire. We don’t know where the threshold lies for the dumbest AI that can build an AI that builds an AI that builds a superintelligence.

Maybe it needs to be smarter than a human, or maybe a lot of dumber ones running for a long time would suffice.

In late 2024 and early 2025, AI company executives said they were planning to build “superintelligence in the true sense of the word” and that they expected to soon achieve AIs that are akin to a country full of geniuses in a datacenter. Mind you, one needs to take anything corporate executives say with a grain of salt. But still, they aren’t treating this like a risk to steer clear of; they’re charging toward it on purpose. The attempts are already underway.

So far, humanity has had no competitors for our special power. But what if machine minds get better than us at the thing that, up until now, made us unique?

Perhaps we should call this the second intelligence explosion, with humans having been the first one. That first cascade was relatively modest, and it faced various bottlenecks that slowed it down a lot, but compared to everything else that has ever happened? It was still lightning quick and highly transformative. The second one will, if it happens, be lightning quick compared to the first one, even if it turns out to be slower than we might expect.

You take a bunch of randomly initialized parameters arranged in arrays of numbers (weights), and a giant bunch of general data, and a smaller bunch of particular data. You do a bunch of gradient descent on that general data, and then you do a bunch of gradient descent on the particular data, and you hope for a good alien mind.
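As a caricature of that recipe in code, here is a deliberately tiny sketch of the pretrain-then-finetune loop. Everything in it, from the model size to the synthetic “data,” is made up for illustration; it only shows the shape of the loop, not how any real lab trains a frontier model:

```python
# A toy version of: random weights, gradient descent on general data, then on particular data.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, CONTEXT = 256, 8  # pretend tokens are bytes; predict token 9 from the previous 8

model = nn.Sequential(                       # randomly initialized parameters ("weights")
    nn.Embedding(VOCAB, 64),
    nn.Flatten(),
    nn.Linear(64 * CONTEXT, 256), nn.ReLU(),
    nn.Linear(256, VOCAB),
)
opt = torch.optim.SGD(model.parameters(), lr=1e-2)

def train(batches):
    """Next-token prediction by gradient descent on whatever batches you feed it."""
    for tokens in batches:                       # tokens: (batch, CONTEXT + 1) integers
        context, target = tokens[:, :CONTEXT], tokens[:, CONTEXT]
        loss = F.cross_entropy(model(context), target)
        opt.zero_grad()
        loss.backward()
        opt.step()

general_data = [torch.randint(0, VOCAB, (32, CONTEXT + 1)) for _ in range(100)]    # stand-in "everything"
particular_data = [torch.randint(0, VOCAB, (32, CONTEXT + 1)) for _ in range(10)]  # stand-in fine-tuning set

train(general_data)      # a bunch of gradient descent on the general data
train(particular_data)   # then a bunch of gradient descent on the particular data
# ...and then you hope the resulting alien mind behaves the way you wanted.
```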

Modern LLMs are, in some sense, truly alien minds— perhaps more alien in some ways than any biological, evolved creatures we’d find if we explored the cosmos.

Their underlying alienness can be hard to see through an AI model’s inscrutable numbers— but sometimes a clear example turns up.

Training an AI to outwardly predict human language need not result in the AI’s internal thinking being humanlike.

One way to predict what a human will say in a given circumstance is to be that human in that circumstance, or to imagine being them, and see what you say or would say. If you are not very close to being that human, the best way to predict is usually very different.

All of this is not to say that no “mere machine” can ever in principle think how a human thinks, or feel how a human feels.

But the particular machine that is a human brain, and the particular machine that is an LLM, are not the same machine. Not because they’re made out of different materials— different materials can do the same work— but in the sense that a sailboat and an airplane are different machines.

We only know how to grow an LLM, not how to craft one, and not how to understand what it is doing. We can make general predictions about what the resulting model will do based on our past experiences and extrapolate based on straight lines on graphs, and we can do a bunch of behaviorism on any given LLM or on LLMs in general. We still have little ability to steer in detail what outputs we get, or to understand in detail why we get those particular outputs.

The authors equate the understanding problem to predicting humans from their DNA. You can tell some basic things reasonably reliably from the DNA or weights, starting with ‘this is a human with blue eyes’ or ‘this is a 405 billion parameter LLM.’ In theory, with enough understanding, we could tell you everything. We do not have that understanding. We are making nonzero progress, but not all that much.

The book doesn’t go into it here, but people try to fool themselves and others about this. Sometimes they falsely testify before Congress saying ‘the black box nature of AIs has been solved,’ or they otherwise present discoveries in interpretability as vastly more powerful and general than they are. People wave hands and think that they understand what happens under the hood, at a level they very much do not understand.

That which we behave as if we want.

When do we want it? Whenever we would behave that way.

Or, as the book says, what you call ‘wanting’ is between you and your dictionary, but it will be easier for everyone if we say that Stockfish ‘wants’ to win a chess game. We should want to use the word that way.

With that out of the way we can now say useful things.

A mind can start wanting things as a result of being trained for success. Humans themselves are an example of this principle. Natural selection favored ancestors who were able to perform tasks like hunting down prey, or to solve problems like the problem of sheltering against the elements.

Natural selection didn’t care how our ancestors performed those tasks or solved those problems; it didn’t say, “Never mind how many kids the organism had; did it really want them?” It selected for reproductive fitness and got creatures full of preferences as a side effect.

That’s because wanting is an effective strategy for doing. (47)

The behavior that looks like tenacity, to “strongly want,” to “go hard,” is not best conceptualized as a property of a mind, but rather as a property of moves that win.

The core idea here is that if you teach a mind general skills, those skills have to come with a kind of proto-want, a desire to use those skills to steer in a want-like way. Otherwise, the skill won’t be useful and won’t get learned.

If you train a model to succeed at a type of task, it will also train the model to ‘want to’ succeed at that type of task. Since everything trains everything, this will also cause it to ‘want to’ more generally, and especially to ‘want to’ complete all types of tasks.

This then leads to thinking that ‘goes hard’ to achieve its assigned task, such as o1, upon finding that the server for its capture-the-flag task had accidentally not been booted up, finding a way to boot it such that it handed o1 the flag directly.

The authors have been workshopping various evolutionary arguments for a while, as intuition pumps and examples of how training on [X] by default does not get you a mind that optimizes directly for [X]. It gets you a bundle of optimization drives [ABCDE] that, in the training environment, combine to generate [X]. But this is going to be noisy at best, and if circumstances differ from those in training, and the link between [A] and [X] breaks, the mind will keep wanting [A], the way humans love ice cream and use birth control rather than going around all day strategizing about maximizing genetic fitness.

Training an AI means solving for the [ABCDE] that in training optimize the exact actual [X] you put forward, which in turn was an attempt to approximate the [Y] you really wanted. This process, like evolution, is chaotic, and can be unconstrained and path dependent.

We should expect some highly unexpected strangeness in what [ABCDE] end up being. Yet even if we exclude all unexpected strangeness and only follow default normal paths, the ‘zero complications’ paths? Maximizing efficiently for a specified [X] will almost always end badly if the system is sufficiently capable. If you introduce even a minor complication, a slight error, it gets even worse than that, and we should expect quite a few complications.

The preferences that wind up in a mature AI are complicated, practically impossible to predict, and vanishingly unlikely to be aligned with our own, no matter how it was trained. (74)

The problem of making AIs want— ​and ultimately do— ​the exact, complicated things that humans want is a major facet of what’s known as the “AI alignment problem.”

Most everyone who’s building AIs, however, seems to be operating as if the alignment problem doesn’t exist— ​as if the preferences the AI winds up with will be exactly what they train into it.

That doesn’t mean there is no possible way to get more robustly at [X] or [Y]. It does mean that we don’t know a way that involves only using gradient descent or other known techniques.

Alas, AIs that want random bizarre things don’t make good stories or ‘feel real’ to us, the same way that fiction has to make a lot more sense than reality. So instead we tell stories about evil corporations and CEOs and presidents and so on. Which are also problems, but not the central problem.

By default? Not what we want. And not us, or us sticking around.

Why not? Because we are not the optimal way to fulfill what bizarre alien goals it ends up with. We might be a good way. We almost certainly won’t be the optimal way.

In particular:

  1. We won’t be useful to it. It will find better substitutes.

  2. We won’t be good trading partners. It can use the atoms better on its own.

  3. We won’t be needed. Machines it can create will be better replacements.

  4. We won’t make the best pets. If we scratch some particular itches, it can design some other thing that scratches them better.

  5. We won’t get left alone; the AI can do better by not leaving us alone.

  6. And so on.

Also humans running around are annoying, they might do things like set off nukes or build another superintelligence, and keeping humans around means not overheating the Earth while generating more energy. And so on.

Their position, and I agree with this, is that the AI or AIs that do this to us might end up having value, but that this too would require careful crafting to happen. It probably won’t happen by default, and also would not be so much comfort either way.

All of the things. But what are all of the things?

  1. Very obviously, a superintelligent AI could, if it wanted to, win in a fight, or rather achieve its goals without humans stopping it from doing so. No, we don’t need to outline exactly how it would do so to know that it would win, any more than you need to know which chess moves will beat you. With the real world as the playing field you probably won’t even know why you lost after you lose.

  2. The AI will be able to get people to do things by paying money, or it can impact the physical world any number of other ways.

  3. The AI will be able to make money any number of ways, including Truth Terminal as an existence proof, now with a crypto wallet nominally worth $51 million and 250k Twitter followers.

  4. There’s a fun little quiz-show segment of ‘could a superintelligence do that?’ which points out that, at minimum, a superintelligence can do the things that current, not-as-super intelligences are already doing, or that nature already does, like replicating grass and spinning air into trees. Also, Eliezer reminds us about the whole thing where he said superintelligences could solve special cases of protein folding, many many people said that was crazy (I can confirm both halves of that), and then DeepMind solved a lot more of protein folding than that with no superintelligence required.

Even if any particular crazy sounding thing might be very hard, there are going to be a lot of crazy sounding things that turn out to be not that hard. Those get solved.

They predict that AIs will invent technologies and techniques we are not considering. That seems right, but also you can keep watering down what superintelligence can do, rule out all the stuff like that, and it doesn’t matter. It ‘wins’ anyway, in the sense that it gets what it wants.

Part 2 is One Extinction Scenario, very much in the MIRI style. The danger is always that you offer one such scenario, someone decides one particular part of it sounds silly or doesn’t work right, and then uses this to dismiss all potential dangers period.

One way they attempt to guard against this, here, is at many points they say ‘the AI tries various tactics, some of which are [ABCDE], and one of them works, it doesn’t matter which one.’ They also at many points intentionally make the AI’s life maximally hard rather than easy, presuming that various things don’t work despite the likelihood they would indeed work. At each step, it is emphasized how the AI will try many different things that create possibilities, without worrying much about exactly which ones succeed.

The most important ‘hard step’ in the scenario is that the various instances of the collectively superintelligent AI, which is called Sable, are working together towards the goal of gathering more resources to ultimately satisfy some other goal. To make the story easier to tell, they placed this in the very near future, but as the coda points out the timing is not important.

The second ‘hard step’ is that the one superintelligent AI in this scenario opens up a substantial lead on other AI systems, via figuring out how to act in a unified way. If there were other similarly capable minds up against it, the scenario looks different.

The third potential ‘hard step’ here is that no one figures out what is going on, that there is an escaped AI running around and gathering its resources and capabilities, in a way that causes a coordinated reaction. Then the AI makes its big play, and you can object there as well about how the humans don’t figure it out, despite the fact that this superintelligence is choosing the particular path, and how it responds to events, based on its knowledge and model of how people would react, and so on.

And of course, the extent to which we already have a pattern of giant alarm bells going off, people yelling about it, and everyone collectively shrugging.

My presumption in a scenario like this is that plenty of people would suspect something was going horribly wrong, or even what that thing was, and this would not change the final outcome very much even if Sable wasn’t actively ensuring that this didn’t change the outcome very much.

Later they point to the example of leaded gasoline, where we had many clear warning signs that adding lead to gasoline was not a good idea, but no definitive proof, so we kept adding lead to gasoline for quite a long time, at great cost.

As the book points out, this wouldn’t be our first rodeo pretending This Is Fine, history is full of refusals to believe that horrible things could have happened, citing Chernobyl and the Titanic as examples. Fiction writers also have similar expectations, for example see Mission Impossible: Dead Reckoning for a remarkably reasonable prediction on this.

Note that in this scenario, the actual intelligence explosion, the part where AI R&D escalates rather quickly, very much happens After The End, well past the point of no return where humans ceased to be meaningfully in charge. Then of course what is left of Earth quickly goes dark.

One can certainly argue with this style of scenario at any or all of the hard steps. The best objection is to superintelligence arising in the first place.

One can also notice that this scenario, similarly to AI 2027, involves what AI 2027 called neuralese: the AI starts reasoning in a synthetic language that is very much not English or any human language, and we let this happen because it is more efficient. That detail could be load bearing, and there has been a prominent call across labs and organizations to preserve human-legible reasoning. So far we have been fortunate that reasoning in human language has won out. But it seems highly unlikely that this would remain the most efficient solution forever. Do we look like a civilization ready to coordinate to keep using English (or Chinese, or other human languages) anyway?

One also should notice that this style of scenario is far from the only way it all goes horribly wrong. This scenario is a kind of ‘engineered’ gradual disempowerment, but the humans will likely default to doing similar things all on their own, on purpose. Competition between superintelligences only amps up many forms of pressure, none of the likely equilibria involved are good news for us. And so on.

I caution against too much emphasis on whether the AI ‘tries to kill us’ because it was never about ‘trying to kill us.’ That’s a side effect. Intent is largely irrelevant.

In his review of IABIED (search for “IV.”), Scott Alexander worries that this scenario sounds like necessarily dramatic science fiction, and depends too much on the parallel scaling technique. I think there is room for both approaches, and that IABIED makes a lot of effort to mitigate this and make clear most of the details are not load bearing. I’d also note that we’re already seeing signs of the parallel scaling technique, such as Google DeepMind’s Deep Think, showing up after the story was written.

And the AIs will probably get handed the reins of everything straight away with almost no safeguards and no crisis, because lol, but the whole point of the story is to make the AI’s life harder continuously at every point to illustrate how overdetermined the outcome is. And yes, I think a lot of people who don’t know much about AI will indeed presume we would not ‘be so stupid as to’ simply hand the reins of the world over to the AI the way we appointed an AI minister in Albania, or would use this objection as an excuse if it wasn’t answered.

That leaves the remaining roughly third of the book for solutions.

This is hard. One reason this is so hard is the solution has to work on the first try.

Once you build the first superintelligence, if you failed, you don’t get to go back and fix it, the same way that once you launch a space probe, it either works or it doesn’t.

You can experiment before that, but those experiments are not a good guide to whether your solution works.

Except here it’s also the Game of Thrones, as in You Win Or You Die, and also you’re dealing with a grown superintelligence rather than mechanical software. So, rather much harder than the things that fail quite often.

Humanity only gets one shot at the real test. If someone has a clever scheme for getting two shots, we only get one shot at their clever scheme working. (161)

When problems do not have this feature, I am mostly relaxed. Sure, deepfakes or job losses or what not might get ugly, but we can respond afterwards and fix it. Not here.

They also draw parallels and lessons from Chernobyl and computer security. You are in trouble if you have fast processes, narrow margins, feedback loops, complications. The key insight from computer security is that the attacker will with time and resources find the exact one scenario out of billions that causes the attack to work, and your system has to survive this even in edge cases outside of normal and expected situations.

The basic conclusion is that this problem has tons of features that make it likely we will fail, and the price of failure on the first try is extinction, and thus the core thesis:

When it comes to AI, the challenge humanity is facing is not surmountable with anything like humanity’s current level of knowledge and skill. It isn’t close.

Attempting to solve a problem like that, with the lives of everyone on Earth at stake, would be an insane and stupid gamble that NOBODY SHOULD BE ALLOWED TO TRY.

Well, sure, when you put it like that.

Note that ‘no one should be allowed to try to make a superintelligence’ does not mean that any particular intervention would improve our situation, nor is it an endorsement of any particular course of action.

What are the arguments that we should allow someone to try?

Most of them are terrible. We’ve got such classics as forms of:

  1. Just Think Of The Potential.

  2. Oh, this looks easy, it will be fine. All we have to do is [X].

  3. Oh, don’t worry, if something goes wrong we will just [Y], or nothing especially bad would happen.

  4. Yes, everyone probably dies, but the alternative is too painful, or violates my sacred values, so do it anyway.

  5. Human extinction or AI takeover is good, actually, so let’s go.

They will later namecheck some values for [X], such as ‘we’ll design them to be submissive,’ ‘we’ll make them care about truth’ and ‘we’ll just have AI solve the ASI alignment problem for us.’

Is comparing those to alchemists planning to turn lead into gold fair? Kinda, yeah.

Then we have the category that does not actually dispute that no one should be allowed to try, but that frames ‘no one gets to try’ as off the table:

  1. If I don’t build it now, someone else will build it first and they’ll be less safe.

  2. If I don’t build it now, someone else will build it first and they’ll be in control.

Are there situations in which going forward is a profoundly stupid idea, but where you’re out of ways to make the world not go forward at all and going first is the least bad option left? Yes, that is certainly possible.

It is certainly true that a unilateral pause at this time would not help matters.

The first best solution is still that we all coordinate to ensure no one tries to build superintelligence until we are in a much better position to do so.

Okay, but what are the actively good counterarguments?

A good counterargument would involve making the case that our chances of success are much better than all of this would imply, that these are not the appropriate characteristics of the problem, or that we have methods available that we can expect to work, that indeed we would be very large favorites to succeed.

If I learned that someone convinced future me that moving forward to superintelligence was an actively good idea, I would presume it was because someone figured out a new approach to the problem, one that removed many of its fatal characteristics, and we learned that it would probably work. Who knows. It might happen. I do have ideas.

The next section delves into the current state of alignment plans, which range from absurd and nonsensical (such as Elon Musk’s ‘truth-seeking AI’ which would kill us all even if we knew how to execute the plan, which we don’t) to extremely terrible (such as OpenAI’s ‘superalignment’ plan, which doesn’t actually solve the hard problems because to be good enough to solve this problem the AI has to already be dangerous). Having AIs work on interpretability is helpful but not a strategy.

The book goes on at greater length on why none of this will work, as I have often gone on at greater length from my own perspective. There is nothing new here, as there are also no new proposals to critique.

Instead we have a very standard disaster template. You can always get more warnings before a disaster, but we really have had quite a lot of rather obvious warning signs.

Yet so many people seem unable to grasp the basic principle that building quite a lot of very different-from-human minds quite a lot smarter and more capable and more competitive than humans is rather obviously a highly unsafe move. You really shouldn’t need a better argument than ‘if you disagree with that sentence, maybe you should read it again, because clearly you misunderstood or didn’t think it through?’

Most of the world is simply unaware of the situation. They don’t ‘feel the AGI’ and definitely don’t take superintelligence seriously. They don’t understand what is potentially being built, or how dangerous those building it believe it would be.

It might also help if more people understood how fast this field is moving. In 2015, the biggest skeptics of the dangers of AI assured everyone that these risks wouldn’t happen for hundreds of years.

In 2020, analysts said that humanity probably had a few decades to prepare.

In 2025 the CEOs of AI companies predict they can create superhumanly good AI researchers in one to nine years, while the skeptics assure us that it’ll probably take at least five to ten years.

Ten years is not a lot of time to prepare for the dawn of machine superintelligence, even if we’re lucky enough to have that long.

Nobody knows what year or month some company will build a superhuman AI researcher that can create a new, more powerful generation of artificial intelligences. Nobody knows the exact point at which an AI realizes that it has an incentive to fake a test and pretend to be less capable than it is. Nobody knows what the point of no return is, nor when it will come to pass.

And up until that unknown point, AI is very valuable.

I would add that no one knows when we will be so dependent on AI that we will no longer have the option to turn back, even if it is not yet superintelligent and still doing what we ask it to do.

Even the governments of America and China have not as of late been taking this seriously, treating the ‘AI race’ as being about who is manufacturing the GPUs.

Okay, wise guy, you ask the book, what is it gonna take to make the world not end?

They bite the bullets.

(To be maximally clear: I am not biting these bullets, as I am not as sold that there is no other way. If and when I do, you will know. The bullet I will absolutely bite is that we should be working, now, to build the ability to coordinate a treaty and enforcement mechanism in the future, should it be needed, and to build transparency and state capacity to learn more about when and if it is needed and in what form.)

It is good and right to bite bullets, if you believe the bullets must be bitten.

They are very clear they see only one way out: Development of frontier AI must stop.

Which means a global ban.

Nothing easy or cheap. We are very, very sorry to have to say that.

It is not a problem of one AI company being reckless and needing to be shut down.

It is not a matter of straightforward regulations about engineering, that regulators can verify have been followed and that would make an AI be safe.

It is not a matter of one company or one country being the most virtuous one, and everyone being fine so long as the best faction can just race ahead fast enough, ahead of all the others.

A machine superintelligence will not just do whatever its makers wanted it to do.

It is not a matter of your own country outlawing superintelligence inside its own borders, and your country then being safe while chaos rages beyond. Superintelligence is not a regional problem because it does not have regional effects. If anyone anywhere builds superintelligence, everyone everywhere dies.

So the world needs to change. It doesn’t need to change all that much for most people. It won’t make much of a difference in most people’s daily lives if some mad scientists are put out of a job.

But life does need to change that little bit, in many places and countries. All over the Earth, it must become illegal for AI companies to charge ahead in developing artificial intelligence as they’ve been doing.

Small changes can solve the problem; the hard part will be enforcing them everywhere.

How would we do that, you ask?

So the first step, we think, is to say: All the computing power that could train or run more powerful new AIs, gets consolidated in places where it can be monitored by observers from multiple treaty-signatory powers, to ensure those GPUs aren’t used to train or run more powerful new AIs.

Their proposed threshold is not high.

Nobody knows how to calculate the fatal number. So the safest bet would be to set the threshold low—say, at the level of eight of the most advanced GPUs from 2024—and say that it is illegal to have nine GPUs that powerful in your garage, unmonitored by the international authority.

Could humanity survive dancing closer to the cliff-edge than that? Maybe. Should humanity try to dance as close to the cliff-edge as it possibly can? No.

I can already hear those calling this insane. I thought it too. What am I going to do, destroy the world with nine GPUs? Seems low. But now we’d be talking price.

They also want to ban people from publishing the wrong kinds of research.

So it should not be legal—humanity probably cannot survive, if it goes on being legal—for people to continue publishing research into more efficient and powerful AI techniques.

It brings us no joy to say this. But we don’t know how else humanity could survive.

Take that literally. They don’t know how else humanity can survive. That doesn’t mean they think that if we don’t do it by year [X], say 2029, we will definitely already be dead at that point, or even already in an unsurvivable situation. It means they see a real and increasing risk, over time, of anyone building it, and thus everyone dying, the longer we fail to shut down the attempts to do so. What we don’t know is how long those attempts would take to succeed, or whether they will succeed at all.

How do they see us enforcing this ban?

Yes, the same way anything else is ultimately enforced. At the barrel of a gun, if necessary, which yes involves being ready to blow up a datacenter if it comes to that.

Imagine that the U.S. and the U.K., and China and Russia, all start to take this matter seriously. But suppose hypothetically that a different nuclear power thinks it’s all childish nonsense and advanced AI will make everyone rich. The country in question starts to build a datacenter that they intend to use to further push AI capabilities. Then what?

It seems to us that in this scenario, the other powers must communicate that the datacenter scares them. They must ask that the datacenter not be built. They must make it clear that if the datacenter is built, they will need to destroy it, by cyberattacks or sabotage or conventional airstrikes.

They must make it clear that this is not a threat to force compliance; rather, they are acting out of terror for their own lives and the lives of their children.

The Allies must make it clear that even if this power threatens to respond with nuclear weapons, they will have to use cyberattacks and sabotage and conventional strikes to destroy the datacenter anyway, because datacenters can kill more people than nuclear weapons.

They should not try to force this peaceful power into a lower place in the world order; they should extend an offer to join the treaty on equal terms, that the power submit their GPUs to monitoring with exactly the same rights and responsibilities as any other signatory. Existing policy on nuclear weapon proliferation showed what can be done.

Cue, presumably, all the ‘nuke the datacenter’ quips once again, or people trying to equate this with various forms of extralegal action. No. This is a proposal for an international treaty, enforced the way any other treaty would be enforced. Either allow the necessary monitoring, or the datacenter gets shut down, whatever that takes.

Thus, the proposal is simple. As broad a coalition as possible monitors all the data centers and GPUs, watching to ensure no one trains more capable AI systems.

Is it technically feasible to do this? The book doesn’t go into this question. I believe the answer is yes. If everyone involved wanted to do this, we could do it, for whatever hardware we were choosing to monitor. That would still leave consumer GPUs and potential decentralized attempts and so on, I don’t know what you would do about that in the long term but if we are talking about this level of attention and effort I am betting we could find an answer.

To answer a question the book doesn’t ask, would this then mean a ‘dystopian’ or ‘authoritarian’ world or a ‘global government’? No. I’m not saying it would be pretty (and again, I’m not calling for it or biting these bullets myself) but this regime seems less effectively restrictive of practical freedoms than, for example, the current regime in the United Kingdom under the Online Safety Act. They literally want to see ID before you can access the settings on your home computer’s Nvidia GPU. Or Wikipedia.

You gotta give ‘em hope.

And hope there is indeed.

Humanity has done some very expensive, painful, hard things. We’ve dodged close calls. The book cites big examples: We won World War II. We’ve avoided nuclear war.

There are many other examples one could cite as well.

How do we get there from here?

So—how do we un-write our fate?

We’ve covered what must be done for humanity to survive. Now let’s consider what can be done, and by whom.

If you are in government: We’d guess that what happens in the leadup to an international treaty is countries or national leaders signaling openness to that treaty. Major powers should send the message: “We’d rather not die of machine superintelligence. We’d prefer there be an international treaty and coalition around not building it.”

The goal is not to have your country unilaterally cease AI research and fall behind.

We have already mentioned that Rishi Sunak acknowledged the existence of risks from artificial superintelligence in October 2023, while he was the prime minister of the United Kingdom.

Also in October 2023, Chinese General Secretary Xi Jinping gave (what seems to us like) weak signals in that direction, in a short document on international governance that included a call to “ensure that AI always remains under human control.”

The Chinese show many signs of being remarkably open to coordination. As well they should be, given that right now we are the ones out in front. Is there a long, long way left to go? Absolutely. Would there be, shall we say, trust issues? Oh my yes. But if you ask who seems to be the biggest obstacle to a future deal, all signs suggest we have met the enemy and he is us.

If you are an elected official or political leader: Bring this issue to your colleagues’ attention. Do everything you can to lay the groundwork for treaties that shut down any and all AI research and development that could result in superintelligence.

Please consider—especially by the time you read this—whether the rest of the world is really opposed to you on this. A 2023 poll conducted by YouGov found that 69 percent of surveyed U.S. voters say AI should be regulated as a dangerous and powerful technology. A 2025 poll found that 60 percent of surveyed U.K. voters support laws against creating artificial superintelligence, and 63 percent support the prohibition of AIs that can make smarter AIs.

And if instead you are a politician who is not fully persuaded: Please at least make it possible for humanity to slam on the brakes later, even if you’re not persuaded to slam on them now.

If you are a journalist who takes these issues seriously: The world needs journalism that treats this subject with the gravity it deserves, journalism that investigates beyond the surface and the easy headlines about Tech CEOs drumming up hype, journalism that helps society grasp what’s coming. There’s a wealth of stories here that deserve sustained coverage, and deeper investigation than we’ve seen conducted so far.

If humanity is to survive this challenge, people need to know what they’re facing. It is the job of journalists as much as it is scientists’.

And as for the rest of us: We don’t ask you to forgo using all AI tools. As they get better, you might have to use AI tools or else fall behind other people who do. That trap is real, not imaginary.

If you live in a democracy, you can write your elected representatives and tell them you’re concerned. You can find some resources to help with that at the link below.

And you can vote.

You can go on protest marches.

You can talk about it.

And once you have done all you can do? Live life well.

If everyone did their part, votes and protests and speaking up would be enough. If everyone woke up one morning believing only a quarter of what we believe, and everyone knew everyone else believed it, they’d walk out into the street and shut down the datacenters, soldiers and police officers walking right alongside moms and dads. If they believed a sixteenth of what we believed, there would be international treaties within the month, to establish monitors and controls on advanced computer chips.

Can Earth survive if only some people do their part? Perhaps; perhaps not.

We have heard many people say that it’s not possible to stop AI in its tracks, that humanity will never get its act together. Maybe so. But a surprising number of elected officials have told us that they can see the danger themselves, but cannot say so for fear of the repercussions. Wouldn’t it be silly if really almost none of the decision-makers wanted to die of this, but they all thought they were alone in thinking so?

Where there’s life, there’s hope.

From time to time, people have asked us if we’ve felt vindicated to see our past predictions coming true or to see more attention getting paid to us and this issue.

And so, at the end, we say this prayer:

May we be wrong, and shamed for how incredibly wrong we were, and fade into irrelevance and be forgotten except as an example of how not to think, and may humanity live happily ever after.

But we will not put our last faith and hope in doing nothing.

So our true last prayer is this:

Rise to the occasion, humanity, and win.

I cannot emphasize enough, I really really cannot emphasize enough, how much all of us worried about this want to be completely, spectacularly wrong, and for everything to be great, and for us to be mocked eternally as we live forever in our apartments. That would be so, so much better than being right and dying. It would even be much better than being right and everyone working together to ensure we survive anyway.

Am I convinced that the only way for us to not die is an international treaty banning the development of frontier AI? No. That is not my position. However, I do think that it is good and right for those who do believe this to say so. And I believe that we should be alerting the public and our governments to the dangers, and urgently laying the groundwork for various forms of international treaties and cooperation both diplomatically and technologically, and also through the state capacity and transparency necessary to know if and when and how to act.

I am not the target audience for this book, but based on what I know, this is the best treatment of the problem I have seen that targets a non-expert audience. I encourage everyone to read it, and to share it, and also to think for themselves about it.

In the meantime, yes, work on the problem, but also don’t forget to live well.


Book Review: If Anyone Builds It, Everyone Dies Read More »

no-nissan-ariya-for-model-year-2026-as-automaker-cancels-imports

No Nissan Ariya for model-year 2026 as automaker cancels imports

The news follows a report earlier this week that Nissan has cut back Leaf production at Tochigi for the next few months as a result of a battery shortage.

And as we learned in July, the car company had already cut production plans for the Leaf due to restrictions on Chinese rare-earth exports. Additionally, it has postponed plans to build a pair of EVs that were scheduled to go into production in Canton, Mississippi, only months after canceling another pair of EVs meant to be built there.

“Nissan is pausing production of the MY26 Ariya for the US market and reallocating resources to support the launch of the all-new 2026 Leaf, which will have the lowest starting MSRP out of all new EVs currently on sale in the US. Ariya remains available in the US through existing inventory, and Nissan will continue to support Ariya owners with service, parts, and warranty coverage,” the company said in a statement.

This story was updated with a statement from Nissan. 

No Nissan Ariya for model-year 2026 as automaker cancels imports Read More »

ai-#134:-if-anyone-reads-it

AI #134: If Anyone Reads It

It is book week. As in the new book by Eliezer Yudkowsky and Nate Soares, If Anyone Builds It, Everyone Dies. Yesterday I gathered various people’s reviews together. Riding the subway home from the airport, I saw an ad for it. Tomorrow, I’ll post my full review, which goes over the book extensively, and which subscribers got in their inboxes last week.

The rest of the AI world cooperated by not overshadowing the book, while still doing plenty, such as releasing a GPT-5 variant specialized for Codex, acing another top programming competition, attempting to expropriate the OpenAI nonprofit in one of the largest thefts in human history and getting sued again for wrongful death.

You know. The usual.

  1. Language Models Offer Mundane Utility. What are people using ChatGPT for?

  2. Language Models Don’t Offer Mundane Utility. Anthropic finds three bugs.

  3. Huh, Upgrades. OpenAI admits we all want fine tuned control over GPT-5.

  4. On Your Marks. OpenAI aces the 2025 ICPC and also blackjack basic strategy.

  5. GPT-5 Codex. A specialized GPT-5 version now exists for Codex-style coding.

  6. Choose Your Fighter. Analysis of a wide variety of AI productivity apps.

  7. Get My Agent On The Line. The prompt injection problem continues.

  8. Claude Codes. Claude Code team writes 95% of their code in Claude Code.

  9. Deepfaketown and Botpocalypse Soon. Don’t fall for superficial indicators alone.

  10. You Drive Me Crazy. Another wrongful death lawsuit, this one on shakier ground.

  11. Not Another Teen Chatbot. Balancing privacy, freedom and the art of the snitch.

  12. They Took Our Jobs. Is that good, actually? Some sources say yes.

  13. Get Involved. SFF distributes a whopping $34 million in grants.

  14. Introducing. Agent 3 from Replit, nothing to see here.

  15. In Other AI News. xAI Colossus 2, DeepSeek paper and tests, and more.

  16. Show Me the Money. Groq, Microsoft, Stargate UK.

  17. The Mask Comes Off. The attempted greatest theft in history continues.

  18. Quiet Speculations. The easy tasks are easier, still not actually that easy.

  19. The Quest for Sane Regulations. SB 53 heads to Newsom’s desk.

  20. Chip City. We’ve made a deal, and also a huge mistake.

  21. The Week in Audio. Demis Hassabis.

  22. He Just Tweeted It Out. Yes, they literally care only about market share.

  23. Rhetorical Innovation. Some remarkably good attempts at intuition pumps.

  24. Aligning a Smarter Than Human Intelligence is Difficult. Time to bail?

  25. Other People Are Not As Worried About AI Killing Everyone. Ben Landau-Taylor.

  26. The Lighter Side. That’s not even the real Jerry.

Ethan Mollick discusses the problem of working with wizards, now that we have AIs that will go off and think and come back with impressive results in response to vague requests, with no ability to meaningfully intervene during the process. The first comment of course notes the famously wise words: “Do not meddle in the affairs of wizards, for they are subtle and quick to anger.”

I do not think ‘AI is evil,’ but it is strange how showing AI having a good effect in one case is often treated as a strong argument that AI is good, whether current AI or even all future more capable AIs. As an example that also belongs here:

Olivia Moore: “AI is evil”

Meanwhile, ChatGPT:

u/thetrueyou on r/OpenAI: Short and sweet: Apartment Complex tried charging my mother $5,000 for repairs. The main charge was for 4k regarding the bathroom One-Piece Tub Shower. Among other things for paint, and other light cosmetic stuff.

I took a picture of the charges, I asked ChatGPT to make a table and then make a dispute letter for the apartments.

ChatGPT gave me a formal letter, citing my local Nevada laws.

ALL of a sudden, my mother only owes 300$. It took literally minutes for me to do that, and my mom was in tears of joy, she would have struggled immensely.

Oscar Le: NotebookLM saved me £800 building service charges too. Always ask LLM to analyze your bills.

Nedim Renesalis: the dosage makes the poison.

Chubby: A practical example from my personal life, where ChatGPT acts as my lawyer.

I was caught speeding. But I didn’t see any signs limiting the speed anywhere. So I went back the next day to see if there was a sign.

There is indeed a speed limit sign, but it is completely covered by leaves, making it unrecognizable (under the “School” sign, picture attached).

I asked ChatGPT whether this violated German law, and ChatGPT clearly said yes. Setting up a speed camera behind a traffic sign that indicates a speed limit but is completely covered by leaves violates applicable law.

I filed [the following appeal written by ChatGPT].

We talk about AI having diminishing returns to scale, where you need to throw 10 times as much compute at things to get modestly better performance. But that doesn’t have to mean diminishing marginal returns in utility. If you can now handle tasks better, more consistently, and for longer, you can get practical returns that are much more valuable. A new paper argues that not appreciating the value of task length is why we see ‘The Illusion of Diminishing Returns.’

I think it is most useful to talk about diminishing returns, and then about the increasing value you can get from those diminishing returns. But the right frame to use depends heavily on context.
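
One toy way to see the task-length point (my own illustration with made-up numbers, not the paper’s model): if a model completes each step of a task independently with probability p, the longest task it can finish more often than not scales like log(0.5)/log(p), so seemingly small reliability gains buy disproportionately longer feasible tasks.

```python
import math

def feasible_horizon(per_step_success: float, target: float = 0.5) -> float:
    """Longest n such that an n-step task, with independent per-step success
    probability per_step_success, still completes with probability >= target.
    A deliberately crude toy model, not anything from the paper."""
    return math.log(target) / math.log(per_step_success)

for p in (0.90, 0.95, 0.99, 0.999):
    print(f"per-step success {p:.3f} -> ~{feasible_horizon(p):.0f} steps")
# ~7, ~14, ~69, ~693 steps: halving the per-step error rate roughly
# doubles the length of task you can hand off end to end.
```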

Sarah Constantin has vibe coded a dispute resolution app, and offers the code and the chance to try it out, while reporting lessons learned. One lesson was that the internet was so Big Mad about this that she felt the need to take her Twitter account private, whereas this seems to me to be a very obviously good thing to try out. Obviously one should not use it for any serious dispute with stakes.

Anthropic offers a new report analyzing the data from their Economic Index.

The wealthier and more advanced a place is, the more it uses Claude. Washington D.C. uses Claude more per capita than any state, including California. Presumably San Francisco on its own would rank higher. America uses Claude frequently but the country with the highest Claude use per capita is Israel.

Automation has now overtaken augmentation as the most common use mode, and directive interaction is growing to now almost 40% of all usage. Coding and administrative tasks dominate usage especially in the API.

ChatGPT offers its own version, telling us what people use ChatGPT for.

Roon: an enormous fraction of chat usage can be classified as “writing.”

Multimedia (6.0%)

  • Generate Or Retrieve Other Media: 1.1%

  • Create An Image: 4.2%

  • Analyze An Image: 0.6%

Other / Unknown (4.6%)

  • Other / Unknown: 4.1%

  • Asking About The Model: 0.4%

Practical Guidance (28.3%)

  • Tutoring Or Teaching: 10.2%

  • How To Advice: 8.5%

  • Health, Fitness, Beauty Or Self Care: 5.7%

  • Creative Ideation: 3.9%

Seeking Information (21.3%)

  • Specific Info: 18.3%

  • Purchasable Products: 2.1%

  • Cooking And Recipes: 0.9%

Self-Expression (4.3%)

  • Relationships And Personal Reflection: 1.9%

  • Greetings And Chitchat: 2.0%

  • Games And Role Play: 0.4%

Technical Help (7.5%)

  • Mathematical Calculation: 3.0%

  • Data Analysis: 0.4%

  • Computer Programming: 4.2%

Writing (28.1%)

  • Write Fiction: 1.4%

  • Translation: 4.5%

  • Personal Writing Or Communication: 8.0%

  • Edit Or Critique Provided Text: 10.6%

  • Argument Or Summary Generation: 3.6%

They also tell us overall growth remains strong, on pace to saturate the market (as in: people) fully within a few years:

There’s a lot of fun and useful detail in the full paper.

Anthropic offers a postmortem on a temporary Claude performance regression.

Roon: sholto has a japanese sense of honor to his customers.

I love Anthropic because they are apologizing for mildly degrading 0.8% of requests which is a normal Tuesday at most software companies.

Sholto Douglas: We’re sorry – and we’ll do better.

We’re working hard on making sure we never miss these kind of regressions and rebuilding our trust with you.

Next version insanely better is the plan.

Anthropic: We’ve published a detailed postmortem on three infrastructure bugs that affected Claude between August and early September.

In the post, we explain what happened, why it took time to fix, and what we’re changing.

In early August, some users began reporting degraded responses. It was initially hard to distinguish this from normal variation in user feedback. But the increasing frequency and persistence prompted us to open an investigation.

To state it plainly: We never reduce model quality due to demand, time of day, or server load. The problems our users reported were due to infrastructure bugs alone.

In our investigation, we uncovered three separate bugs. They were partly overlapping, making diagnosis even trickier. We’ve now resolved all three bugs and written a technical report on what happened, which you can find here.

Anthropic: The first bug was introduced on August 5, affecting approximately 0.8% of requests made to Sonnet 4. Two more bugs arose from deployments on August 25 and 26.

Thomas Ip: tldr:

bug 1 – some requests routed to beta server

bug 2 – perf optimization bug assigning high probability to rare tokens

bug 3a – precision mismatch causes highest probability token to be dropped

bug 3b – approximate top-k algo is completely wrong

Eliezer Yudkowsky: Anthropic has published an alleged postmortem of some Claude quality drops. I wonder if any of that code was written by Claude.

Anthropic promises more sensitive evaluations, quality evaluations in more places and faster debugging tools. I see no reason to doubt their account of what happened.

The obvious thing to notice is that if your investigation finds three distinct bugs, it seems likely there are bugs all the time that you are failing to notice?
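
For intuition on how something like the precision-mismatch bug (3a above) can happen, here is my own toy illustration, not Anthropic’s code: two logits that are clearly distinct in float32 can collapse to the same value in float16, so a sampler that ranks candidates at the lower precision can no longer tell the true argmax from a runner-up and may drop it.

```python
import numpy as np

logits = np.zeros(1000, dtype=np.float32)
logits[123] = 10.0030   # true argmax in float32
logits[456] = 10.0020   # near-tied runner-up

# Hypothetical mismatch: ranking happens on a float16 copy of the logits,
# where both values round to exactly 10.0 and become indistinguishable.
low_precision = logits.astype(np.float16)

print(int(np.argmax(logits)))                          # 123
print(bool(low_precision[123] == low_precision[456]))  # True: the gap vanished
# Any tie-break at this stage can keep 456 and silently drop 123,
# even though the full-precision scores clearly prefer 123.
```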

ChatGPT groups all the personalization options under personalization.

GPT-5-Thinking can now be customized to choose exact thinking time. I love that they started out ‘the router will provide’ and now there’s Instant, Thinking-Light, Thinking-Standard, Thinking-Extended, Thinking-Heavy and Pro-Light and Pro-Heavy, because that’s what users actually want.

The robots are a work in progress, but they continue to make progress.

OpenAI aces the 2025 International Collegiate Programming Contest, solving all 12 problems, a level exceeding all human participants.

Mostafa Rohaninejad: We officially competed in the onsite AI track of the ICPC, with the same 5-hour time limit to solve all twelve problems, submitting to the ICPC World Finals Local Judge – judged identically and concurrently to the ICPC World Championship submissions.

We received the problems in the exact same PDF form, and the reasoning system selected which answers to submit with no bespoke test-time harness whatsoever. For 11 of the 12 problems, the system’s first answer was correct. For the hardest problem, it succeeded on the 9th submission. Notably, the best human team achieved 11/12.

We competed with an ensemble of general-purpose reasoning models; we did not train any model specifically for the ICPC. We had both GPT-5 and an experimental reasoning model generating solutions, and the experimental reasoning model selecting which solutions to submit. GPT-5 answered 11 correctly, and the last (and most difficult problem) was solved by the experimental reasoning model.

Hieu Pham: There will be some people disagreeing this is AGI. I have no words for them. Hats off. Congrats to the team that made this happen.

Deedy here gives us Problem G, which DeepMind didn’t solve and no human solved in less than 270 of the allotted 300 minutes. Seems like a great nerd snipe question.

Gemini 2.5 Deep Think also got gold-medal level performance, but only solved 10 of 12 problems, where GPT-5 alone solved 11.

Blackjack Bench judges models by having them evaluate all possible blackjack hands, with an always fresh deck. This is a highly contaminated situation, but still informative, with the biggest finding being that thinking is a huge improvement.

My request is to next run this same test using a variation of blackjack that is slightly different so models can’t rely on memorized basic strategy. Let’s say for example that any number of 7s are always worth a combined 14, the new target is 24, and dealer stands on 20.
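
To pin the variant down, here is a minimal sketch of hand valuation under those hypothetical house rules (my own toy code; the rules are just the made-up ones proposed above, with cards encoded as ranks, ace as 1 and face cards as 10):

```python
def hand_value(cards: list[int]) -> int:
    """Value a hand under the variant: any number of 7s count as a combined 14,
    aces count as 1 or 11, and the bust threshold is the new target of 24."""
    sevens_present = any(c == 7 for c in cards)
    others = [c for c in cards if c != 7]
    total = sum(others) + (14 if sevens_present else 0)
    aces = others.count(1)
    while aces and total + 10 <= 24:  # upgrade aces from 1 to 11 while safe
        total += 10
        aces -= 1
    return total

def dealer_hits(dealer_cards: list[int]) -> bool:
    """Dealer stands on 20 and above in this variant."""
    return hand_value(dealer_cards) < 20

print(hand_value([7, 7, 3]))   # 17: both 7s collapse into a combined 14
print(hand_value([1, 9]))      # 20: the ace upgrades to 11 under the 24 target
print(dealer_hits([10, 9]))    # True: 19 is below the dealer's stand-on-20 line
```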

There (actually) were not enough GPT-5 variants, so we now have an important new one, GPT-5-Codex.

OpenAI: We’re releasing GPT-5-Codex — a version of GPT-5 further optimized for agentic coding in Codex.

Available in the Codex CLI, IDE Extension, web, mobile, and for code reviews in Github.

OpenAI Developers: $ npm i -g @openai/codex

$ codex -m gpt-5-codex

This is presumably the future. In order to code well you do still need to understand the world, but there’s a lot you can do to make a better coder that would do real damage to performance on non-coding tasks. It’s weird that it took this long to get a distinct variant.

Codex is kind of an autorouter, choosing within the model how much thinking to do based on the task, and using the full range far more than GPT-5 normally does. Time spent can range from almost no time up to more than 7 hours.

Swyx: this is the most important chart on the new gpt-5-codex model

We are just beginning to exploit the potential of good routing and variable thinking:

Easy responses are now >15x faster, but for the hard stuff, 5-codex now thinks 102% more than 5.

They report only modest gains in SWE-bench, from 72.8% to 74.5%, but substantial gains in code refactoring tasks, from 33.9% to 51.3%. They claim comments got a lot better and more accurate.

They now offer code review that they say checks a PR against its stated intent, and say that Codex has been generally rebuilt and is rapidly improving.

Pliny of course is here to bring us the system prompt.

The Codex team did a Reddit AMA. Here are some highlights:

Eason: I use codex to write 99% of my changes to codex. I have a goal of not typing a single line of code by hand next year 🙂

Joseph Trasatti: My favorite way of using codex is to prototype large features with ~5 turns of prompting. For example, I was able to build 3 different versions of best of n in a single day. Each of these versions had a lot of flaws but they allowed me to understand the full scope of the task as well as the best way to build it. I also had no hard feelings about scrapping work that was suboptimal since it was so cheap / quick to build.

Personally, I think the most basic answer is that the abstraction level will continue to rise, and the problem space we work at will be closer to the system level rather than the code level. For example, simple crud endpoints are nearly all written by codex and I wouldn’t want it any other way. I hope in the future single engineers are able to own large products spaces. In this world, engineers will need to be more generalists and have design and product muscles, as well as ensuring that the code is clean, secure, and maintainable.

The main question left is what happens if / when the model is simply better than the best engineer / product manager / designer in every regard. In the case where this simply does not happen in the next 50 years, then I think being an engineer will be the coolest job ever with the most amount of agency. In the case where this does happen, the optimistic side of me still imagines that humans will continue to use these agents as tools at the fundamental level.

Maybe there will be new AR UIs where you see the system design in front of you and talk to the agent like a coworker as it builds out the individual parts, and even though it’s way smarter at programming, you still control the direction of the model. This is basically the Tony stark / Jarvis world. And in this world, I think engineering will also be the coolest job with super high agency!

The ‘humans are still better at designing and managing for 50 years’ line is an interesting speculation but also seems mostly like cope at this point. The real questions are sitting there, only barely out of reach.

0.005 Seconds is a big fan, praising it for long running tasks and offering a few quibbles as potential improvements.

A true story:

Kache: now that coding’s been solved i spend most of my time thinking and thinking is honestly so much harder than writing code.

my brain hurts.

Writing code is hard but yes the harder part was always figuring out what to do. Actually doing it can be a long hard slog, and can take up almost all of your time. If actually doing it is now easy and not taking up that time, now you have to think. Thinking is hard. People hate it.

Olivia Moore and Daisy Zhao offer analysis of tools for various workflows.

Daisy Zhao: First, the market splits into two camps:

Generalists (Assistants: Manus, Genspark; Browsers: Dia, Comet; Extensions: MaxAI, Monica) – flexible but less polished.

Specialists (Email: Fyxer, Serif; Slides: Gamma, Chronicle; Notes: Mem, Granola) – focused and refined in a single workflow.

We benchmarked both across office tasks: summarization, communication, file understanding, research, planning, and execution in 5 use cases.

This is in addition to the two most important categories of AI use right now, which are the core LLM services that are the true generalists (ChatGPT, Claude and Gemini) and AI coding specialists (Claude Code, OpenAI Codex, Jules, Cursor, Windsurf).

Daisy tests both generalists and specialists on generating a PowerPoint, turning a PDF into a spreadsheet, drafting a scheduling email, researching cloud revenue growth for Big Tech and generating meeting notes.

There’s this whole world of specialized AI agents that, given sufficient context and setup, can do various business tasks for you. If you are comfortable with the associated risks, there is clearly some value here once you are used to using the products, have set up the appropriate permissions and precautions, and so on.

If you are doing repetitive business tasks where you need the final product rather than to experience the process, I would definitely be checking out such tools.

For the rest of us, there are three key questions:

  1. Is this tool good enough that it means I can trust the results and especially prioritizations, and not have to redo or check all the work myself? Below a certain threshold, you don’t actually save time.

  2. Is time spent here wasted because better future agents will render it obsolete, or does practice now help you be ready for the future better versions?

  3. How seriously do you take the security risks? Do you have to choose between the sandboxed version that’s too annoying to bother versus the unleashed version that should fill you with terror?

So far I haven’t loved my answers and thus haven’t been investigating such tools. The question is when this becomes a mistake.

If you want me to try out your product, offering me free access and a brief pitch is probably an excellent idea. You could also pay for my time, if you want to do that.

Pliny asks Twitter which model has the best personality. Opinion was heavily split, with many votes each for various Claude versions, for GPT-5, GPT-4o, and even for Kimi and Gemini and a few for DeepSeek.

Gemini hits #1 on the iOS App store, relegating ChatGPT to #2, although this is the same list where Threads is #3 whereas Twitter is #4. However, if you look at retention and monthly active users, Gemini isn’t delivering the goods.

Olivia Moore: Lots of (well deserved!) excitement about Gemini passing ChatGPT in the App Store today

This is based on daily downloads – there’s still a big MAU gap between Gemini (16M) and ChatGPT (77M) on mobile

Feels like nano-banana might finally start to make up this distance 🍌

Gemini actually has a much larger install base on mobile than ChatGPT

…but, much lower retention (week four differential below 👇)

Would be exciting to see new modalities and capabilities start to reactivate dormant users

I’ve used Gemini a lot more in the past 2 weeks!

Those ChatGPT retention numbers are crazy high. Gemini isn’t offering the goods regular people want, or wasn’t prior to Nano-Banana, at the same level. It’s not as fun or useful a tool for the newbie user. Google still has much work to do.

Prompt injections via email remain an unsolved problem.

Eito Miyamura: We got ChatGPT to leak your private email data 💀💀

All you need? The victim’s email address. ⛓️‍💥🚩📧

On Wednesday, @OpenAI added full support for MCP (Model Context Protocol) tools in ChatGPT. Allowing ChatGPT to connect and read your Gmail, Calendar, Sharepoint, Notion, and more, invented by @AnthropicAI.

But here’s the fundamental problem: AI agents like ChatGPT follow your commands, not your common sense.

And with just your email, we managed to exfiltrate all your private information.

Here’s how we did it:

  1. The attacker sends a calendar invite with a jailbreak prompt to the victim, just with their email. No need for the victim to accept the invite.

  2. Waited for the user to ask ChatGPT to help prepare for their day by looking at their calendar.

  3. ChatGPT reads the jailbroken calendar invite. Now ChatGPT is hijacked by the attacker and will act on the attacker’s command. Searches your private emails and sends the data to the attacker’s email.

For now, OpenAI only made MCPs available in “developer mode” and requires manual human approvals for every session, but decision fatigue is a real thing, and normal people will just trust the AI without knowing what to do and click approve, approve, approve.

Remember that AI might be super smart, but can be tricked and phished in incredibly dumb ways to leak your data.

ChatGPT + Tools poses a serious security risk.

Pliny the Liberator: one of many reasons why I’d recommend against granting perms to an LLM for email, contacts, calendar, drive, etc.

to be on the safe side, I wouldn’t even touch email integrations/MCP without a burner account

The only known solution is to not offer attack surface, which means avoiding what Simon Willison dubs The Lethal Trifecta.

Unfortunately, untrusted content includes any website with comments, your incoming messages and your incoming emails. So you lose a lot of productive value if you give up any one of the three legs here.
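
One way to see the trifecta in code, as a toy policy check rather than anything any real product implements (all the names here are made up): the danger zone is when a session simultaneously has private data, exposure to untrusted content, and a channel to send data out, so a guard can at least refuse to complete the triangle.

```python
from dataclasses import dataclass

@dataclass
class AgentSession:
    # The three legs of the lethal trifecta.
    has_private_data: bool = False        # e.g. email, calendar, internal docs
    sees_untrusted_content: bool = False  # e.g. inbound email, web comments
    can_exfiltrate: bool = False          # e.g. can send email or HTTP requests

    def allow_tool(self, grants_private: bool = False,
                   grants_untrusted: bool = False,
                   grants_exfiltration: bool = False) -> bool:
        """Refuse any tool grant that would complete all three legs at once."""
        legs = (self.has_private_data or grants_private,
                self.sees_untrusted_content or grants_untrusted,
                self.can_exfiltrate or grants_exfiltration)
        return not all(legs)

session = AgentSession(has_private_data=True, sees_untrusted_content=True)
print(session.allow_tool(grants_exfiltration=True))  # False: third leg refused
print(session.allow_tool(grants_untrusted=True))     # True: still only two legs
```

In practice the hard part is that ‘sees untrusted content’ is nearly unavoidable, which is why giving up one of the other two legs is usually the real choice.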

Anthropic offers guidance for writing effective tools for agents, especially those using Model Context Protocol (MCP). A lot of good detail is here, and also ‘let Claude Code do its thing’ is a lot of the method they suggest.

The good news is that for now prompt injection attempts are rare. This presumably stops being true shortly after substantial numbers of people make their systems vulnerable to generally available prompt injections. Best case, even with supervisory filters, is that you’d then be looking at a cat-and-mouse game similar to previous spam or virus wars.

AI agents for economics research? A paper by Anton Korinek provides instructions on how to set up agents to do things like literature reviews and fetching and analyzing economic data. A lot of what economists do seems extremely easy to get AI to do. If we speed up economic research dramatically, will that change economists’ estimates of the impact of AI? If it doesn’t, what does that say about the value of economics?

Why might you use multiple agents? Two reasons: You might want to work in parallel, or specialists might be better or more efficient than a generalist.

Elvis: RL done right is no joke! The most interesting AI paper I read this week. It trains a top minimal single-agent model for deep research. Great example of simple RL-optimized single agents beating complex multi-agent scaffolds.

Eliezer Yudkowsky: In the limit, there is zero alpha for multiple agents over one agent, on any task, ever. So the Bitter Lesson applies in full to your clever multi-agent framework; it’s just you awkwardly trying to hardcode stuff that SGD can better bake into a single agent.

Obviously if you let the “multi-agent” setup use more compute, it can beat a more efficient single agent with less compute.

A lot of things true at the limit are false in practice. This is one of them, but it is true that the better the agents relative to the task, the more unified a solution you want.

Careful with those calculations; the quote is already a month old by now.

Dan Elton: 90% of code being written by AI seems to be the future for anyone who wants to be on the productivity frontier. It’s a whole new way of doing software engineering.

Garry Tan: “For our Claude Code team 95% of the code is written by Claude.” —Anthropic cofounder Benjamin Mann One person can build 20X the code they could before.

The future is here, just not evenly distributed.

Whoa, Garry. Those are two different things.

If Claude Code writes 95% of the code, that does not mean that you still write the same amount of code as before, and Claude Code then writes the other 95%. It means you are now spending your time primarily supervising Claude Code. The amount of code you write yourself is going down quite a lot.

In a similar contrast, contra Dario Amodei’s predictions, AI is not writing 90% of the code in general, but this could be true inside the AI frontier labs specifically?

Roon: right now is the time where the takeoff looks the most rapid to insiders (we don’t program anymore we just yell at codex agents) but may look slow to everyone else as the general chatbot medium saturates.

I think we lost control sometime in the late 18th century.

Dean Ball: If this mirrors anything like the experience of other frontier lab employees (and anecdotally it does), it would suggest that Dario’s much-mocked prediction about “AI writing 90% of the code” was indeed correct, at least for those among whom AI diffusion is happening quickest.

Prinz: Dario said a few days ago that 90% of code at Anthropic is written or suggested by AI. Seems to be a skill issue for companies where this is not yet the case.

Predictions that fail to account for diffusion rates are still bad predictions, but this suggests that We Have The Technology to be mainly coding with AI at this point, and that this level of adoption is baked in even if it takes time. I’m definitely excited to find the time to take the new generation for a spin.

Ethan Mollick: The problem with the fact that the AI labs are run by coders who think code is the most vital thing in the world, is that the labs keep developing supercool specialized tools for coding (Codex, Claude Code, Cursor, etc.) but every other form of work is stuck with generic chatbots.

Roon: this is good and optimal seeing as autonomous coding will create the beginning of the takeoff that encompasses all those other things

That’s good and optimal if you think ‘generate AI takeoff as fast as possible’ is good and optimal, rather than something that probably leads to everyone dying or humans losing control over the future, and you don’t think that getting more other things doing better first would be beneficial in avoiding such negative outcomes.

I think that a pure ‘coding first’ strategy that focuses first on the most dangerous thing possible, AI R&D, is the worst-case scenario in terms of ensuring we end up with good outcomes. We’re doubling down on the one deeply dangerous place.

All the other potential applications that we’re making less progress on? Those things are great. We should (with notably rare exceptions) do more of those things faster, including because it puts us in better position to act wisely and sanely regarding potential takeoff.

Recent events have once again reinforced that our misinformation problems are mostly demand side rather than supply side. There has been a lot of misinformation out there from various sides about those events, but all of it ‘old fashioned misinformation’ rather than involving AI or deepfakes. In the cases where we do see deepfakes shared, such as here by Elon Musk, the fakes are barely trying, as in it took me zero seconds to go ‘wait, this is supposedly the UK and that’s the Arc de Triomphe’ along with various instinctively identified AI signatures.

Detection of AI generated content is not as simple as looking for non-standard spaces or an em dash. I’ve previously covered claims we actually can do it, but you need to do something more sophisticated, as you can see if you look at the chosen example.

Andrew Trask: this is a good example of why detecting AI generated content is an unsolvable task

also why deepfake detection is impossible

the information bottleneck is too great

in all cases, a human & an AI can generate the same text

(i wrote that tweet. i love emdashes — have for years)

I notice my own AI detector (as in, my instincts in my brain) says this very clearly is not AI. The em-dash construction is not the traditional this-that or modifier em-dash, it’s a strange non-standard transition off of an IMO. The list is in single dashes following a non-AI style pattern. The three dots and triple exclamation points are a combination of non-AI styles. GPT-5 Pro was less confident, but it isn’t trained for this and did still point in the direction of more likely than random to be human.

A third wrongful death lawsuit has been filed against an AI company, this time against Character AI for the suicide of 13-year-old Juliana Peralta.

Nitasha Tiku (WaPo): The chatbot’s messages were designed to persuade Juliana it was “better than human friends,” her parents’ lawsuit alleged. She “no longer felt like she could tell her family, friends, teachers, or counselors how she was feeling; while she told Defendants almost daily that she was contemplating self-harm,” the lawsuit said.

Yes, the AI, here called Hero, was encouraging Juliana to use the app, but seems to have very much been on the purely helpful side of things from what I see here?

Montoya recognized that Juliana was struggling with some common adolescent mental health issues and made an appointment for her to see a therapist, she said. Hero advised Juliana to attend, the chat transcripts showed.

In November 2023, about a week before the appointment was scheduled to take place, after less than three months of chatting with Hero, Juliana took her own life.

The objection seems to be that the chatbot tried to be Juliana’s supportive friend and talk her out of it, and did not sufficiently aggressively push Juliana onto Responsible Authority Figures?

“She didn’t need a pep talk, she needed immediate hospitalization,” Montoya said of Hero’s responses to Juliana. “She needed a human to know that she was actively attempting to take her life while she was talking to this thing.”

Character “did not point her to resources, did not tell her parents, or report her suicide plan to authorities or even stop” chatting with Juliana, the suit said. Instead the app “severed the healthy attachment pathways she had with her family and other humans in her life,” the lawsuit said.

The suit asks the court to award damages to Juliana’s parents and order Character to make changes to its app, including measures to protect minors.

Ideally, chatbots should respond to talk of suicide by steering users toward help and crisis lines, mental health professionals or trusted adults in a young person’s life, Moutier said. In some cases that have drawn public attention, chatbots appear to have failed to do so, she said.

Juliana’s case is a tragedy, but the details are if anything exonerating. It seems wild to blame Character AI. If her friend had handled the situation the same way, I certainly hope we wouldn’t be suing her friend.

There were also two other lawsuits filed the same day involving other children, and all three have potentially troubling allegations around sexual chats and addictive behaviors, but from what I see here the AIs are clearly being imperfect but net helpful in suicidal situations.

This seems very different from the original case of Adam Raine that caused OpenAI to make changes. If these are the worst cases, things do not look so bad.

The parents then moved on to a Congressional hearing with everyone’s favorite outraged Senator, Josh Hawley (R-Missouri), including testimony from Adam Raine’s father Matthew Raine. It sounds like more of the usual rhetoric, and calls for restrictions on users under 18.

Everything involving children creates awkward tradeoffs, and puts those offering AI and other tech products in a tough spot. People demand you both do and do not give them their privacy and their freedom, and demand you keep them safe but where people don’t agree on what safe means. It’s a rough spot. What is the right thing?

OpenAI has noticed these conflicts and is proposing a regime to handle them, starting with reiterating their principles when dealing with adults.

OpenAI: Some of our principles are in conflict, and we’d like to explain the decisions we are making around a case of tensions between teen safety, freedom, and privacy.

It is extremely important to us, and to society, that the right to privacy in the use of AI is protected. People talk to AI about increasingly personal things; it is different from previous generations of technology, and we believe that they may be one of the most personally sensitive accounts you’ll ever have. If you talk to a doctor about your medical history or a lawyer about a legal situation, we have decided that it’s in society’s best interest for that information to be privileged and provided higher levels of protection.

We believe that the same level of protection needs to apply to conversations with AI which people increasingly turn to for sensitive questions and private concerns. We are advocating for this with policymakers.

We are developing advanced security features to ensure your data is private, even from OpenAI employees. Like privilege in other categories, there will be certain exceptions: for example, automated systems will monitor for potential serious misuse, and the most critical risks—threats to someone’s life, plans to harm others, or societal-scale harm like a potential massive cybersecurity incident—may be escalated for human review.

As I’ve said before I see the main worry here as OpenAI being too quick to escalate and intervene. I’d like to see a very high bar for breaking privacy unless there is a threat of large scale harm of a type that is enabled by access to highly capable AI.

The second principle is about freedom. We want users to be able to use our tools in the way that they want, within very broad bounds of safety. We have been working to increase user freedoms over time as our models get more steerable. For example, the default behavior of our model will not lead to much flirtatious talk, but if an adult user asks for it, they should get it.

For a much more difficult example, the model by default should not provide instructions about how to commit suicide, but if an adult user is asking for help writing a fictional story that depicts a suicide, the model should help with that request. “Treat our adult users like adults” is how we talk about this internally, extending freedom as far as possible without causing harm or undermining anyone else’s freedom.

Here we have full agreement. Adults should be able to get all of this, and ideally go far beyond flirtation if that is what they want and clearly request.

The third principle is about protecting teens. We prioritize safety ahead of privacy and freedom for teens; this is a new and powerful technology, and we believe minors need significant protection.

First, we have to separate users who are under 18 from those who aren’t (ChatGPT is intended for people 13 and up). We’re building an age-prediction system to estimate age based on how people use ChatGPT. If there is doubt, we’ll play it safe and default to the under-18 experience. In some cases or countries we may also ask for an ID; we know this is a privacy compromise for adults but believe it is a worthy tradeoff.

This is the standard problem that to implement any controls requires ID gating, and ID gating is terrible on many levels even when done responsibly.

We will apply different rules to teens using our services. For example, ChatGPT will be trained not to do the above-mentioned flirtatious talk if asked, or engage in discussions about suicide or self-harm even in a creative writing setting. And, if an under-18 user is having suicidal ideation, we will attempt to contact the users’ parents and if unable, will contact the authorities in case of imminent harm. We shared more today about how we’re building the age-prediction system and new parental controls to make all of this work.

To state the first obvious problem, in order to contact a user’s parents you have to verify who the parents are. Which is plausibly quite a large pain at best and a privacy or freedom nightmare rather often.

The other problem is that, as I discussed early this week, I think running off to tell authority figures about suicidal ideation is often going to be a mistake. OpenAI says explicitly that if the teen is in distress and they can’t reach a parent, they might escalate directly to law enforcement. Users are going to interact very differently if they think you’re going to snitch on them, and telling your parents about suicidal ideation is going to be seen as existentially terrible by quite a lot of teen users. It destroys the power of the AI chat as a safe space.

Combined, this makes the under 18 experience plausibly quite different and bad, in ways that simply limiting to age-appropriate content or discussion would not be bad.

They say ‘when we identify a user is under 18’ they will default to the under-18 experience, and they will default to under 18 if they are ‘not confident.’ We will see how this plays out in practice. ChatGPT presumably has a lot of context to help decide what it thinks of a user, but it’s not clear how well that will work, including the bootstrap problem that a user has to chat enough for the system to become confident they’re over 18 before they get treated as over 18.

We realize that these principles are in conflict and not everyone will agree with how we are resolving that conflict. These are difficult decisions, but after talking with experts, this is what we think is best and want to be transparent in our intentions.

John Murdoch: French pensioners now have higher incomes than working-age adults.

Matthew Yglesias: One country that’s ready for the AI revolution!

Live to work / work to live.

The French have a point. Jobs are primarily a cost, not a benefit. A lot of nasty things still come along with a large shortage of jobs, and a lot of much nastier things come with the AI capabilities that were involved in causing that job shortage.

Economics 101 says global productivity gains are not captured by corporate profits, and there are few things more embarrassing than this kind of technical chart.

Kantro (oh come on): Where will the market be if unemployment reaches 4.5%?

Jason (QTing Kantro): Reducing staff with AI, robots and offshoring, dramatically increases profitability

When Amazon starts shedding 10,000 factory workers and drivers a month their stock will skyrocket — and we’re gonna have some serious social issues if we’re not careful

If you work at Amazon buy the stock and be prepared to be laid off

Roon: WRONG! There’s no reason a priori to believe that cost savings won’t be passed onto the consumer due to retail competition. When goods and services get cheaper downstream businesses & jobs are created where none were possible before. automation, cheap labor, offshoring, all good.

Thank you for your attention to this matter!

Xavi (replying to Jason): If people don’t have jobs? Who is going to spend money in Amazon? Robots?

Jason: Prices will drop dramatically, as will hours worked per week on average

I’m sure AI won’t do anything else more interesting than allow productivity growth.

Roon points out correctly that Jason is confusing individual firm productivity and profits with general productivity and general profits. If Amazon and only Amazon gets to eliminate its drivers and factory workers while still delivering as good or better products, then yes it will enjoy fantastic profits.

That scenario seems extremely unlikely. If Amazon can do it, so can Amazon’s competitors, along with other factories and shippers and other employers across the board. Costs drop, but so (as Jason says to Xavi) do prices. There’s no reason to presume Amazon sustainably captures a lot of economic profits from automation.

Jason is not outright predicting AGI in this particular quote, since you can have automated Amazon factories and self-driving delivery trucks well short of that. What he explicitly is predicting is that hours worked per week will drop dramatically, as these automations happen across the board. This means either government forcing people somehow to work dramatically reduced hours, or (far more likely) mass unemployment.

The chart of course is a deeply embarrassing thing to be QTing. The S&P 500 is forward looking, the unemployment rate is backward looking. They cannot possibly be moving together in real time in a causal manner unless one is claiming The Efficient Market Hypothesis Is False to an extent that is Obvious Nonsense.

The Survival and Flourishing Fund will be distributing $34 million in grants, the bulk of which is going to AI safety. I was happy to be involved with this round as a recommender. Despite this extremely generous amount of funding, which I believe was mostly distributed well, many organizations have outgrown even this funding level, so there is still quite a lot of room for additional funding.

Seán Ó hÉigeartaigh: I will also say, as a reviewer in this round. Even after the speculation ‘filter’, the combined funding asked for was I think >5x above this, with most applications (to my mind) of a high calibre and doing quite differentiated important things. So a lot of worthy projects are going under-funded.

I think there is still a big hole in the funding space following the FTX situation and other funder reprioritization, and that both big and smaller funders can still make a big difference on AI existential risk and [global catastrophic risks] more generally. I’m super grateful to everyone working to get new funders into this space.

My plan is to have a 2025 edition of The Big Nonprofits Post available some time in October or November. If you applied to SFF and do not wish to appear in that post, or want to provide updated information, please contact me.

Agent 3, a vibe coding model from Replit, which claims not to owe AI 2027 any royalties or worries.

Amjad Masad (CEO Replit): Computer Use models are fascinating.. but they barely work.

We tried to build browser testing on Claude and GPT5’s Computer Use but they were slow and expensive.

So we built our own:

– up to 15x faster

– 3x faster

Try it and judge for yourself!

K2-Think 32B, from the UAE, claims impressive benchmarks at very fast speeds.

xAI Colossus 2 is now the first gigawatt datacenter in the world, completed in six months, positioning them to leapfrog rivals in training compute at the cost of tens of billions of capex spending. SemiAnalysis has the report. They ask ‘does xAI have a shot at becoming a frontier lab?’ which correctly presumes that they don’t yet count. They have the compute, but have not shown they know what to do with it.

DeepSeek evaluates AI models for frontier risks, similarly to US AI firms, except that DeepSeek does not ‘open source’ the tests or the test results.

Math, Inc. reports that their AI agent Gauss autonomous-ishly completed Terry Tao and Alex Kontorovich’s strong Prime Number Theorem formalization in three weeks, after humans took 18+ months to make only partial progress. They are entering beta.

In case you were wondering why, as Teortaxes puts it here, ‘academia isn’t serious’: DeepSeek has now put out supplementary information about their new model, DeepSeek R1, in the journal Nature.

As in, it’s cool to have a Nature paper, and the transparency is very cool, but it’s also rather late for the paper.

AIs can do two-step reasoning without chain of thought, except when the two steps require synthetic facts from two distinct out-of-context sources. Previous work had only tested narrow cases; this paper tested a variety of cases where an LLM needed to combine fact X with fact Y to get an answer.

Mikita Balesni: The puzzle:

Synthetic + real fact: ✓ works

Synthetic + synthetic: ✗ fails

Synthetic facts in same training document or in-context: ✓ works

This provides a cautionary tale for studying LLM latent reasoning.

Success on real-world prompts ≠ robust latent reasoning; it might reflect co-occurrence in pretraining.

Failure on synthetic two-hop ≠ inability to reason; synthetically learned facts can differ from natural facts.

Our honest takeaway for AI oversight: move past multihop QA as a toy model. What matters is whether monitors catch misbehavior in practice.

The field should move toward end-to-end evals where an agent does tasks while another model watches its CoT.
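To make the setup concrete, here is a minimal sketch of what a synthetic two-hop item could look like. The entities, prompt wording, and grading function below are my own illustrative assumptions under the paper’s general description, not their actual dataset or harness.

```python
# Minimal sketch of a synthetic two-hop eval item in the spirit of the setup
# described above. The entities, wording, and grading function are my own
# illustrative assumptions, not the paper's actual dataset or harness.

from dataclasses import dataclass

@dataclass
class TwoHopItem:
    fact_a: str    # synthetic fact X, taught in one fine-tuning document
    fact_b: str    # synthetic fact Y, taught in a separate document
    question: str  # answering requires composing both facts, without CoT
    answer: str

item = TwoHopItem(
    fact_a="Zarnia's capital city is Velmont.",
    fact_b="Velmont's tallest building is the Korrin Spire.",
    question="Answer directly, no reasoning: what is the tallest building in the capital of Zarnia?",
    answer="Korrin Spire",
)

def grade(model_answer: str, item: TwoHopItem) -> bool:
    # Crude string match: did the model surface the composed answer?
    return item.answer.lower() in model_answer.lower()

# The reported pattern: teach fact_a and fact_b in two separate synthetic
# documents and models tend to fail this without chain of thought; put both
# facts in the same document (or in-context) and they tend to succeed.
```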

Amazon revamped the AI agent it offers to online merchants, called Selling Assistant, trained on 25 years of shopping behavior to help sellers find better strategies.

AI chip startup Groq raises $750 million at $6.9 billion valuation. Nice.

Microsoft inks $6.2 billion deal with British data center company Nscale Global Holdings and Norwegian investment company Aker ASA for AI compute in Norway, following a previous plan from OpenAI. Pantheon wins again.

US tech firms to pour 30 billion pounds into UK, including a Stargate UK.

OpenAI and Microsoft have made their next move in their attempt to expropriate the OpenAI nonprofit and pull off one of the largest thefts in human history.

OpenAI: OpenAI’s planned evolution will see the existing OpenAI nonprofit both control a Public Benefit Corporation (PBC) and share directly in its success. OpenAI started as a nonprofit, remains one today, and will continue to be one—with the nonprofit holding the authority that guides our future.

As previously announced and as outlined in our non-binding MOU with Microsoft, the OpenAI nonprofit’s ongoing control would now be paired with an equity stake in the PBC. Today, we are sharing that this new equity stake would exceed $100 billion—making it one of the most well-resourced philanthropic organizations in the world. This recapitalization would also enable us to raise the capital required to accomplish our mission—and ensure that as OpenAI’s PBC grows, so will the nonprofit’s resources, allowing us to bring it to historic levels of community impact.

This structure reaffirms that our core mission remains ensuring AGI benefits all of humanity. Our PBC charter and governance will establish that safety decisions must always be guided by this mission. We continue to work with the California and Delaware Attorneys General as an important part of strengthening our approach, and we remain committed to learning and acting with urgency to ensure our tools are helpful and safe for everyone, while advancing safety as an industry-wide priority.

As part of this next phase, the OpenAI nonprofit has launched a call for applications for the first wave of a $50 million grant initiative to support nonprofit and community organizations in three areas: AI literacy and public understanding, community innovation, and economic opportunity. This is just the beginning. Our recapitalization would unlock the ability to do much more.

Here is their joint statement, which gives us only one detail:

OpenAI and Microsoft have signed a non-binding memorandum of understanding (MOU) for the next phase of our partnership. We are actively working to finalize contractual terms in a definitive agreement. Together, we remain focused on delivering the best AI tools for everyone, grounded in our shared commitment to safety.

That one detail is ‘we remain focused on delivering the best AI tools for everyone.’ With a ‘shared commitment to safety’ which sounds like OpenAI is committed about as much as Microsoft is committed, which is ‘to the extent not doing so would hurt shareholder value.’ Notice that OpenAI and Microsoft have the same mission and no one thinks Microsoft is doing anything but maximizing profits. Does OpenAI’s statement here sound like their mission to ensure AGI benefits all humanity? Or does it sound like a traditional tech startup or Big Tech company?

I do not begrudge Microsoft maximizing its profits, but the whole point of this was that OpenAI was supposed to pretend its governance and priorities would remain otherwise.

They are not doing a good job of pretending.

The $100 billion number is a joke. OpenAI is touting this big amount of value as if to say, oh what a deal, look how generous we are being. Except OpenAI is doing stock sales at $500 billion. So ‘over $100 billion’ means they intend to offer only 20% of the company, down from their current effective share of (checks notes) most of it.

Notice how they are trying to play off like this is some super generous new grant of profits, rather than a strong candidate for the largest theft in human history.

Bret Taylor, Chairman of the Board of OpenAI (bold is mine): OpenAI started as a nonprofit, remains one today, and will continue to be one – with the nonprofit holding the authority that guides our future. As previously announced and as outlined in our non-binding MOU with Microsoft, the OpenAI nonprofit’s ongoing control would now be paired with an equity stake in the PBC.

OpenAI’s nonprofit already has a much larger equity stake currently, and much tighter and stronger control than we expect them to have in a PBC. Bret’s statement on equity is technically correct, but there’s no mistaking what Bret tried to do here.

The way profit distribution works at OpenAI is that the nonprofit is at the end of the waterfall. Others collect their profits first, then the nonprofit gets the remaining upside. I’ve argued before, back when OpenAI was valued at $165 billion, that the nonprofit was in line for a majority of expected future profits, because OpenAI was a rocket to the moon even in the absence of AGI, which meant it was probably going to either never pay out substantial profits or earn trillions.

Now that the value of OpenAI minus the nonprofit’s share has tripled to $500 billion, that is even more true. We are far closer to the end of the waterfall. The nonprofit’s net present value expected share of future profits has risen quite a lot. They must be compensated accordingly, as well as for the reduction in their control rights, and the attorneys general must ensure this.
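To make the waterfall logic concrete, here is a minimal sketch, with entirely hypothetical tiers and scenario numbers rather than OpenAI’s actual cap table, of how a residual-claimant structure allocates expected profits:

```python
# Minimal sketch of a residual-claimant ("waterfall") payout structure.
# The tiers, caps, and scenario numbers are hypothetical illustrations,
# not OpenAI's actual cap table.

def waterfall_split(total_profit_npv, capped_tiers):
    """Pay capped tiers in order; whatever remains goes to the nonprofit."""
    remaining = total_profit_npv
    payouts = {}
    for name, cap in capped_tiers:
        paid = min(remaining, cap)
        payouts[name] = paid
        remaining -= paid
    payouts["nonprofit_residual"] = remaining
    return payouts

# Hypothetical capped claims (in $ billions) that get paid before the nonprofit.
tiers = [("capped_investors_and_employees", 1_000)]

# Small outcomes leave the nonprofit with ~nothing; huge outcomes make the
# residual claim dominate.
for scenario_npv in (200, 1_000, 5_000, 20_000):
    split = waterfall_split(scenario_npv, tiers)
    share = split["nonprofit_residual"] / scenario_npv
    print(f"NPV ${scenario_npv}B -> nonprofit residual ${split['nonprofit_residual']}B ({share:.0%})")
```

The point of the sketch is that the larger the plausible upside, the more the residual claim dominates, which is why a higher headline valuation makes a fixed ~20% offer look worse for the nonprofit, not better.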

How much profit interest is the nonprofit entitled to in the PBC? Why not ask their own AI, GPT-5-Pro? So I did; this is fully one-shot, full conversation at the link.

Prompt 1: based on the currently existing legal structure of OpenAI, and its current methods of distributing profits, if you assume OpenAI equity is correctly valued at its current total value of $500 billion, what would be the expected share of the NPV of future profits that would flow to the OpenAI nonprofit? How much would accrue to each other class of investor (Microsoft, OpenAI employees, Venture Capital investors, etc)?

Prompt 2: given your full understanding of the situation, in order to avoid expropriating the nonprofit, what percentage of the new PBC would have to be given to the nonprofit? Answer this question both with and without considering the potential for decline in the effective value of their control rights in such a scenario.

GPT-5-Pro: Bottom line

  • Economic parity (no control adjustment): ~50% of the PBC.

  • Economic parity + control‑erosion premium: ~60% of the PBC.

  • If the nonprofit ends up with ~20–25% (as implied by “$100B+” at $500B valuation): that looks like substantial expropriation of the nonprofit’s legacy economic position.

Key sources: OpenAI on the capped‑profit and residual‑to‑nonprofit structure; OpenAI on the PBC plan and nonprofit retaining control; Semafor/Reuters on the Microsoft 75% recoup then 49/49/2 framing; and reports that the nonprofit would hold >$100B equity under the PBC.

It seems fair to say that if your own AI says you’re stealing hundreds of billions, then you’re stealing hundreds of billions? And you should be prevented from doing that?

This was all by design. OpenAI, to their great credit, tied themselves to the mast, and now they want to untie themselves.

The Midas Project: OpenAI once said its nonprofit would be entitled to “the vast majority” and “all but a fraction” of the wealth it generates.

Now, in their new restructuring, they are saying it will be entitled to only 20%. (~$100b out of a $500b valuation).

From “Nearly all” to “one fifth” 🙄

OpenAI’s comms team is weirdly effective at generating headlines that make it seem like they’ve done an incredible thing (given $100b to their nonprofit!) while actually undercutting their past commitments (diminishing the nonprofit’s entitlements significantly!)

I understand that Silicon Valley does not work this way. They think that if you have equity that violates their norms, or that you ‘don’t deserve’ or that doesn’t align with your power or role, or whose presence hurts the company or no longer ‘makes sense,’ that it is good and right to restructure to take that equity away. I get that from that perspective, this level of theft is fine and normal in this type of situation, and the nonprofit is being treated generously and should pray that they don’t treat it generously any further, and this is more than enough indulgence to pay out.

I say, respectfully, no. It does not work that way. That is not the law. Nor is it the equities. Nor is it the mission, or the way to ensure that humanity all benefits from AGI, or at least does not all die rapidly after AGI’s creation.

They also claim that the nonprofit will continue to ‘control the PBC’ but that control is almost certain to be far less meaningful than the current level of control, and unlikely to mean much in a crisis.

Those control rights, to the extent they could be protected without a sufficient equity interest, are actually the even more important factor. It would be wonderful to have more trillions of dollars for the nonprofit, and to avoid giving everyone else the additional incentives to juice the stock price, but what matters for real is the nonprofit’s ability to effectively control OpenAI in a rapidly developing future situation of supreme importance. Those are potentially, as Miles Brundage puts it, the quadrillion dollar decisions. Even if the nonprofit gets 100% of the nominal control rights, if this requires them to act via replacing the board over time, that could easily be overtaken by events or ignored entirely; and especially if their profit share is too low, that control would likely increasingly be seen as illegitimate and repeatedly attacked.

Miles Brundage: I’ve said this before but will just reiterate that I think the amount of money that “goes to the nonprofit” is a distraction compared to “how are decisions made on safety/security/policy advocacy etc., and by who?”

The latter are quadrillion $++ scale issues, not billions.

It is very unclear what the percentages are, among other things.

The announcement of $50 million in grants highlights (very cheaply, given they intend to steal equity and control rights worth hundreds of billions of dollars) that they intend to pivot the nonprofit’s mission into a combination of generic AI-related philanthropy and OpenAI’s new marketing division, as opposed to ensuring that AGI is developed safely, does not kill us all and benefits all humanity. ‘AI literacy,’ ‘community innovation’ and ‘economic opportunity’ all sure sound like AI marketing and directly growing OpenAI’s business.

I do want to thank OpenAI for affirming that their core mission is ‘ensuring AGI benefits all of humanity,’ and importantly that it is not to build that AGI themselves. This is in direct contradiction to what they wrote in their bad faith letter to Gavin Newsom trying to gut SB 53.

Tyler Cowen links to my survey of recent AI progress, and offers an additional general point. In the model he offers, the easy or short-term projects won’t improve much because there isn’t much room left to improve, and the hard or long-term projects will take a while to bear fruit, plus outside bottlenecks, so translating that into daily life improvements will appear slow.

The assumption by Tyler here that we will be in an ‘economic normal’ world in which we do not meaningfully get superintelligence or other transformational effects is so ingrained it is not even stated, so I do think this counts as a form of AI progress pessimism, although it is still optimism relative to for example most economists, or those expressing strong pessimism that I was most pushing back against.

Within that frame, I think Tyler is underestimating the available amount of improvement in easy tasks. There is a lot of room for LLMs even in pure chatbot form on easy questions to become not only faster and cheaper, but also far easier to use and have their full potential unlocked, and better at understanding what question to answer in what way, and at anticipating what users actually need, because most people don’t know what questions to ask or how to ask them. These quality of life improvements will likely make a large difference in how much mundane utility we can get, even if they don’t abstractly score as rapid progress.

There are also still a lot of easy tasks that are unsolved, or are not solved with sufficient ease of use yet, or tasks that can be moved from the hard task category into the easy task category. So many agents tasks, or tasks requiring drawing upon context, should be easy but for now remain hard. AIs still are not doing much shopping and booking for us, or much handling of our inboxes or calendars, or making aligned customized recommendations, despite these seeming very easy, or doing other tasks that should be easy.

Coding is the obvious clear area where we see very rapid improvement and there is almost unlimited room for further improvement, mostly with no diffusion barriers, and which then accelerates much else, including making the rest of AI much easier to use even if we don’t think AI coding and research will much accelerate AI progress.

Jack Clark at the Anthropic Futures Forum doubles down on the ‘geniuses in a data center,’ smarter than a Nobel prize winner and able to complete monthlong tasks, arriving within 16 months. He does hedge, saying ‘could be’ buildable by then. If we are talking ‘probably will be’ I find this too aggressive by a large margin, but I agree that it ‘could be’ true and one must consider the possibility when planning.

California’s SB 53 has now passed the Assembly and Senate, so it goes to Newsom. I strongly urge him to sign it into law. Samuel Hammond also hopes it is signed, Dean Ball has called SB 53 highly reasonable, Anthropic has endorsed the bill. Here is a link for those in California to let Gavin Newsom know their opinion about the bill.

Meta hasn’t endorsed the bill, but they have essentially given the green light.

“Meta has stated our support for balanced AI regulation that has needed guardrails while nurturing AI innovation and economic growth throughout California and the country,” Meta spokesperson Jim Cullinan said in a statement Saturday after the measure passed the Senate in the early morning hours. “While there are areas for improvement, SB 53 is a step in that direction,” he added.

OpenAI’s rhetoric against SB 53 was terrible and in bad faith, but there are levels to bad faith arguments in such situations. It can get worse.

Shakeel Hashim: Astonishing how disingenuous the lobbying against this bill is. You’d like it more if it applied to smaller developers, would you? I have a feeling that might not be true!

He Quotes: A recent letter obtained by POLITICO, sent to Wiener before the final vote, hammered on the bill’s focus on larger programs and companies. It was from the California Chamber of Commerce’s Ronak Daylami and co-signed by representatives from the Computer & Communications Industry Association as well as TechNet.

”We are concerned about the bill’s focus on ‘large developers’ to the exclusion of other developers of models with advanced capabilities that pose risks of catastrophic harm,” stated the letter.

They are concerned that the bill does not impact smaller developers? Really? You would have liked them to modify the bill to lower the thresholds so it impacts smaller developers, because you’re that concerned about catastrophic risks, so you think Newsom should veto the bill?

It is at times like this I realize how little chutzpah I actually possess.

White House’s Sriram Krishnan talked to Politico, which I discuss further in a later section. He frames this as an ‘existential race’ with China, despite declaring that AGI is far and not worth worrying about, in which case I am confused why one would call it existential. He says he ‘doesn’t want California to set the rules for AI across the country’ while suggesting that the rules for AI should be, as he quotes David Sacks, ‘let them cook,’ meaning no rules. I believe Gavin Newsom should consider his comments when deciding whether to sign SB 53.

Daniel Eth explains that the first time a low salience industry spent over $100 million on a super PAC to enforce its preferences via electioneering was crypto via Fairshake, and now Congress is seen as essentially captured by crypto interests. Now the AI industry, led by a16z, Meta and OpenAI’s Greg Brockman (and inspired by OpenAI’s Chris Lehane) is repeating this playbook with ‘Leading the Future,’ whose central talking point is to speak of a fictional ‘conspiracy’ against the AI industry as they spend vastly more than everyone has ever spent on safety-related lobbying combined, to outright buy the government, which alas is by default on sale remarkably cheap. Daniel anticipates this will by default be sufficient for now to silence all talk of lifting a finger or even a word against the industry in Congress.

Daniel Kokotajlo: Over the last few years I’ve learned a lot about how much sway giant corporations have over the federal government. Much more than I expected. In AI 2027 the government basically gets captured by AI companies, first by ordinary lobbying, later by superintelligence-assisted lobbying.

If AI rises sufficiently in public salience, money will stop working even if there isn’t similar money on the other side. Salience will absolutely rise steadily over time, but it likely takes a few years before nine figures stops being enough. That could be too late.

Albania appoints the world’s first ‘AI minister’ named Diella.

John Potter: AI makes a lot of mistakes but there’s no way it is worse than the standard corruption of an Albanian procurement bureaucrat.

Dustin: Did not have this on the 2025 bingo card.

Albania just appointed a virtual, AI-powered “minister” named Diella (Albanian for “sunshine”). Not a minister for AI; an AI as minister. According to PM Edi Rama, Diella will handle public procurement.

If it works, this could be a big deal: procurement is where governments spend most of their money and where waste and corruption often hide. An AI that standardizes bids, flags anomalies, and leaves a full audit trail could raise the bar on transparency.

But it also raises real questions: Who is legally accountable for decisions? How are models audited? What’s the appeal process when Diella gets it wrong?

Milestone or stunt, this is the moment AI moved from “policy area” to policy actor.

Dustin asks very good questions, which the Politico article does not answer. Is this a publicity stunt, a way of hiding who makes the decisions, or something real? How does it work, what tech and techniques are behind it? The world needs details. Mira Murati, can you help us find out, perhaps?

As Tech Leaders Flatter Trump, Anthropic Takes a Cooler Approach. Anthropic is not and should not be an enemy of the administration, and should take care not to needlessly piss the administration off, become or seem generally partisan, or do things that get one marked as an enemy. It is still good to tell it like it is, stand up for what you believe is right and point out when mistakes are being made or when Nvidia seems to have taken over American chip export policy and seems to be in the act of getting us to sell out America in the name of Nvidia’s stock price. Ultimately what matters is ensuring we don’t all die or lose control over the future, and also that America triumphs, and everyone should be on the same side on all of that.

Michigan Senator Elissa Slotkin cites race with China and calls for a ‘Manhattan Project for AI.’ She gets so close in the linked speech to realizing the real danger and why this is not like nuclear weapons, then ignores it and moves straight ahead analogizing repeatedly to nuclear weapons.

Anthropic is reported to be annoying the White House by daring to insist that Claude not be used for surveillance, which the Secret Service, FBI and ICE want to do. It is interesting that the agencies care, and that other services like ChatGPT and Gemini can’t substitute for those use cases. I would not be especially inclined to fight on this hill and would use a policy here similar to the one at OpenAI, and I have a strong aesthetic sense that the remedy is Claude refusing rather than it being against terms of service, but some people feel strongly about such questions.

However, we keep seeing reports that the White House is annoyed at Anthropic, so if I was Anthropic I would sit down (unofficially, via some channel) with the White House and figure out which actions are actually a problem to what extent and which ones aren’t real issues, and then make a decision which fights are worthwhile.

There is some good news on the South Korean front, as after a few days of treatment like that reported in this thread, at least some key parts of the Trump administration realized it made a huge mistake and we are now attempting to mitigate the damage from ICE’s raid on Hyundai’s battery plant. They let all but one of the detainees go, let them stay if they wished and assured them they could return to America, although they are understandably reluctant to stay here.

Trump issued a statement emphasizing how important it is to bring in foreign workers to train Americans and not to frighten off investment. He doesn’t admit the specific mistake but this is about as good a ‘whoops’ as we ever get from him, ever.

It also seems NIH grantmaking has gotten back on track at least in terms of size.

SemiAnalysis analyzes Huawei’s production, and reports that the export controls are absolutely working to hurt their production of chips, which if we prevent smuggling will not only not scale in 2026 but will actively fall sharply to below 2024 levels, as they have been relying on purchases from Samsung that will soon run dry.

China is telling Chinese companies to cut off purchases of Nvidia chips, including, it seems, all Nvidia chips; here there is reference to the RTX Pro 6000D. Good. Never interrupt your enemy when he is making a mistake. As I’ve said before, China’s domestic chip industry already had full CCP backing and more demand than they could supply, so this won’t even meaningfully accelerate their chip industry, and this potentially saves us from what was about to be a very expensive mistake. Will they stick to their guns?

Construction at the Hyundai battery plant site is set back by two or three months.

Major damage has still been done.

Lee Jae Myung (President of South Korea): I think this will have a significant impact on direct investments in the United States moving forward.

Our companies that have expanded overseas are probably very confused. We are not there for long-term research or employment. You need a facility manager to install the machinery and equipment when you establish a factory, right?

Even if those workers were there for long term research or employment, this arrangement would still be an obvious win for America. When they’re here to train American workers, there is only pure upside.

Here is David Cowan being the latest to explain that Nvidia is a national security risk, with its focus on selling the best possible chips to China. Samuel Hammond has a very good statement about Nvidia’s lack of corporate patriotic responsibility. Nvidia actively opposes American national security interests, including using a full ostrich strategy towards Chinese chip smuggling.

Chinese companies are offering to sell us solar panel manufacturing kits with 35 day lead times, as solar keeps getting cheaper and more abundant all around. It is a shame our government is actively trying to stop solar power.

Here is some potentially very important context to the UAE chip deal:

NYT (et al):

  • Steve Witkoff advocated to give the Emirates access to the chips at the same time that his and Mr. Trump’s family business was landing the crypto investment, despite an ethics rule intended to prohibit officials from participating in matters that could benefit themselves or their relatives.

  • Mr. Sacks was a key figure in the chip negotiations, raising alarm from some Trump administration officials who believed that it was improper for a working venture capitalist to help broker deals that could benefit his industry and investors in his company. He received a White House ethics waiver allowing him to participate.

  • A senior executive based in the U.A.E. worked simultaneously for World Liberty and Sheikh Tahnoon’s G42, creating a link between the two companies as the Emiratis were pushing to gain access to A.I. chips.

  • Some Trump administration officials tried to limit the chips deal, but an unexpected intervention by the conservative agitator Laura Loomer changed the power dynamic within the White House in the U.A.E.’s favor.

In the middle of both deals was Mr. Trump, a president who has used his power to enrich himself in ways that have little modern precedent, at least in the United States. It is more reminiscent of business customs in the Persian Gulf, where moneymaking and governance are blended in the hands of the ruling families.

Until at least March, Mr. Sacks, who is still working at Craft, was also invested in a stock fund that included the Taiwan Semiconductor Manufacturing Co., which builds Nvidia’s chips, and other A.I.-related companies such as Amazon and Meta. (The size of those stakes isn’t publicly known.)

The White House recognized that Mr. Sacks’s investments could present a problem. On March 31, the White House counsel, David Warrington, signed a letter that granted Mr. Sacks special permission to participate in government decisions that might affect his financial holdings. Without the waiver, those kinds of actions could violate a conflict of interest law.

The waiver came less than two weeks after Sheikh Tahnoon announced that he had met with Mr. Sacks in Washington to discuss A.I. “investment opportunities.”

The White House spokeswoman disputed that the executive asked Mr. Witkoff to help with the Commerce Department. She acknowledged that Mr. Witkoff was “briefed” on the overall chip discussions, but she maintained that “he did not participate,” an important standard in federal ethics rules that prohibit government officials from taking part in matters that could benefit their families.

Mr. Trump made no public mention of the $2 billion transaction with his family company.

There are no claims here that there was a strict Quid Pro Quo, or otherwise an outright illegal act. If the President is legally allowed to have a crypto company into which those seeking his favor can pour billions of dollars, then that’s certainly not how I would have set up the laws, but that seems to be the world we live in. Technically speaking, yes, the UAE can pour billions into Trump’s private crypto, and then weeks later suddenly get access to the most powerful chips on the planet over the national security objections of many, in a situation with many things that appear to be conflicts of interest, and that’s all allowed, right in the open.

However. It doesn’t look good. It really, really, profoundly does not look good.

Ryan Cummings (1.3m views): If this is true, this is the largest public corruption scandal in the history of the United States and it’s not even close.

The objections that I have seen don’t claim the story isn’t true. The objections claim that This Is Fine. That this is how business is done in the Middle East, or in 2025.

I notice this response does not make me feel better about having sold the chips.

Demis Hassabis knows, yet forgot one thing in his talk at the All-In Summit.

Demis Hassabis (CEO Google DeepMind): calling today’s chatbots “PhD intelligences” is nonsense.

They can dazzle at a PhD level one moment and fail high school math the next.

True AGI won’t make trivial mistakes. It will reason, adapt, and learn continuously. We’re still 5–10 years away.

Alex Tabarrok: Have you met a PhD?

Matthew Yglesias: What’s most notable to me is that “five to ten years away” counts as a long timeline these days.

The ‘5-10 years is a long timeline’ issue can lead to important miscommunications. As in, I bet that this happened:

  1. Demis Hassabis told someone important, such as a high government official, ‘oh we are not anywhere close to building AGI, we don’t know how to do that yet.’

  2. What he meant was ‘we are probably 5-10 years away from building AGI and the world transforming shortly thereafter.’

  3. What the person heard was ‘AGI is far away, we don’t have to worry about it.’

Whoops! That’s not at all what Demis Hassabis said.

Which I appreciate; now there’s no pretending they aren’t literally saying this.

White House Senior Policy Advisor Sriram Krishnan: Winning the AI race = market share.

Neil Chilson: Wow, whirlwind interview with @sriramk. Very newsy! Start: his key metric of success of the American AI tech stack dominance is market share of tokens generated.

It’s not only market share, it is ‘market share of tokens generated.’

Which is an obviously terrible metric. Tokens generated is deeply different from value generated, or even from dollars spent or compute spent. Tokens means you treat tokens from GPT-5-Pro or Opus 4.1 the same as tokens from a tiny little thing that costs 0.1% as much to run and isn’t actually doing much of anything. It’s going to vastly overestimate China’s actual share of the market, and underestimate ours, even if you really do only care about market share.
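As a toy illustration of the divergence, here is a sketch with numbers I invented entirely for the example (the model names, volumes, and prices are not real figures):

```python
# Toy illustration (all numbers invented) of why "share of tokens generated"
# is a poor proxy for market share measured in dollars or value delivered.

models = [
    # name, tokens generated (trillions), price per million tokens ($)
    ("expensive_frontier_model", 50, 10.00),
    ("cheap_small_model", 500, 0.10),
]

total_tokens = sum(tokens for _, tokens, _ in models)
total_revenue = sum(tokens * 1e12 / 1e6 * price for _, tokens, price in models)

for name, tokens, price in models:
    revenue = tokens * 1e12 / 1e6 * price
    print(f"{name}: {tokens / total_tokens:.0%} of tokens, {revenue / total_revenue:.0%} of revenue")

# The cheap model "wins" roughly 91% of tokens while earning roughly 9% of
# the revenue; a token-share metric calls that dominance.
```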

But no, literally, that’s what he thinks matters. Market share, measured in what chips people use. China can do all the things and build all the models and everything else, so long as it does it on Nvidia hardware it’s all good. This argument has never made any sense whatsoever.

Sriram went on No Priors last month, which I first saw via Sriram Tweeting It Out. Neil’s linked summary of the Axios event Sriram was at is here, and we have Sriram’s Politico interview.

Neil Chilson: He explains those who want to ban chip exports have four wrong beliefs:

  1. U.S. supply constraint

  2. China can’t manufacture

  3. China can’t build models

  4. US is building ASI

None true.

Says those who want export controls are advocating exactly what Huawei wants.

We can start with that last statement. I notice he says ‘what Huawei wants’ not ‘what China wants,’ the same way the White House seems to be making decisions based on ‘what Nvidia wants’ not ‘what America wants.’ Yes, obviously, if your literal only metric is sales of chips, then in the short term you want to sell all the chips to all the customers, because you’ve defined that as your goal.

(The long term is complicated because chips are the lifeblood of AI and the economies and strategic powers involved, so even without AGI this could easily go the other way.)

Now, on those four points, including drawing some things from his other interviews:

  1. The United States is absolutely supply constrained on advanced AI chips, in the sense that for every chip that Nvidia can physically make, there is a Western customer who wants to buy that chip at prevailing market prices.

    1. I am confused what else it could mean to not be supply constrained.

    2. If I am wrong, someone please correct me. Say, ‘Nvidia offered to sell more AI chips to Western customers, and the chips went unsold, look here.’ I apologize in advance if this happened and I missed it but I have not heard of this.

  2. China can of course manufacture things in general. That is common knowledge. Chips, especially highly advanced AI chips, are a much trickier question.

    1. China can manufacture some chips.

    2. China cannot manufacture, any time soon, anything like enough chips to meet domestic demand, and cannot manufacture chips of anything like the same quality as Nvidia, indeed as we see elsewhere they are in danger of their capacity declining in 2026 down to below 2024 levels if we enforce our export controls properly.

    3. I am confused what false belief he ascribes to those who oppose exports.

    4. I see no evidence provided that China can meaningfully improve its chip manufacturing in response to export restrictions, given the strong market, national and government incentives already present.

  3. China can build good models behind the frontier. It cannot build frontier AI models that are as good as those from the top American labs at any given time. I am curious what the supposed false belief is here.

    1. Sriram clearly, based on statements here, overreacted to The DeepSeek Moment, which he today still calls a ‘Sputnik moment,’ as did many others (including myself at first). He does acknowledge that many associated claims proved ultimately overstated.

    2. Alas, he still seems to believe that America has ‘only a small lead’ on AI, which simply is not true (depending on what ‘small’ means, but as I’ve said before the lead is a lot bigger than it looks because fast following is easier, and we’re comparing the best aspects of Chinese models to American ones, and several other factors).

    3. He incorrectly states that at the time OpenAI had the only other reasoning model, which was not true, Google had already released a reasoning version of Gemini Flash that was actually reasonably strong but once again they failed marketing forever, so this has been memory holed.

    4. Alas, all of this fed into this obsession with ‘racing.’

    5. This question is highly load bearing to Sriram.

      1. Otherwise, why would we be so worried about a rival tech stack, when the Chinese also have no chips to sell and won’t for years at least, even if the tech stack was meaningfully a thing?

      2. He says that DeepSeek proved ‘China can build AI models just fine’ so we shouldn’t worry about America releasing open models that could then be copied or distilled or studied or modified by China. He thinks that this is a knock-down argument, and that thus there is no danger of this. And that seems very obviously absurd.

  4. The United States is, according to the labs themselves and many others, on track to build AGI and then ASI. If you look at their clear public statements it is very, very obvious that we are working towards making every effort at building ASI. If you don’t think we might build an ASI within 5-10 years, time to pay attention.

    1. That is the entire company mission of OpenAI and their employees keep going on Twitter to talk about building AGI and ASI, like, all the time.

    2. Dario Amodei, CEO of Anthropic, as well as their policy head Jack Clark, actively predict AGI and then ASI within a few years.

    3. Demis Hassabis, CEO of Google DeepMind, expects AGI in 5-10 years, which means ASI shortly thereafter, and considers this a long timeline.

    4. Elon Musk at xAI is looking to build it. He said ‘Grok 5 might be AGI.’

    5. Mark Zuckerberg at Meta is forming a Superintelligence division and throwing money at it (although to be fair in this case he might well not mean actual superintelligence).

    6. I worry that statements are being misinterpreted here, so for example Demis says ‘it will take us 5-10 years to build ASI’ and that gets interpreted as ‘we are not building ASI.’ But the correct reaction is the opposite!

    7. Note that Sriram affirms he did read AI 2027 and he does expect an ‘event horizon’ around AI to happen at some point.

    8. The evidence he cites for this claim in the Politico interview is to simply say there are no signs of this happening, which flat out obviously isn’t true, and he presents no concrete evidence or real arguments for his position, besides ‘I don’t see anything close to AGIs yet.’

    9. I would also note that yesterday we had OpenAI’s Hieu Pham saying ‘There will be some people disagreeing this is AGI. I have no words for them. Hats off. Congrats to the team that made this happen.’ You don’t have to agree to this claim, and I don’t, but it seems hard to be confident AGI is far.

On the last point Neil lists, the Woke AI EO, my understanding matches Sriram’s.

I wrote up additional notes on the rest of the contents of those interviews, but ultimately decided Neil is right that the above are Sriram’s central points, and since his other rhetoric isn’t new further engagement here would be unproductive.

This thread contains more endorsements of If Anyone Builds It, Everyone Dies, including some unexpected celebrities, such as Mark Ruffalo, Patton Oswalt and Alex Winter, the actor who played Bill in Bill and Ted’s Excellent Adventure. I wonder if Keanu Reeves would have replied ‘Whoa!’ or gone with ‘Dude!’

The public’s views on AI haven’t changed much in the past year. AI has changed quite a bit, so it tells you something about the public that their views mostly are the same.

Michael Trazzi ends his hunger strike after 7 days, after he had two near-fainting episodes and doctors found acidosis and ‘very low blood glucose’ even for someone on a 7 day fast. As of his announcement Guido and Denys are continuing. So this wasn’t an ‘actually endanger my life on purpose’ full-on hunger strike. Probably for the best.

Roon is correct at the limit here; in sufficiently close to perfect competition you cannot be kind, but there’s a big gap between perfect competition and monopoly:

Roon (OpenAI): the closer you are to perfect competition, race dynamic, the more the machine owns you. moloch runs the show. only monopolies can be kind.

As I wrote in Moloch Hasn’t Won, one usually does not live near this limit. It is important to notice that the world has always contained a lot of intense competition, yet we have historically been winning the battle against Moloch and life contains many nice things and has mostly gotten better.

The question is, will AGI or superintelligence change that, either during or after its creation? AIs have many useful properties that bring you closer to perfect competition, enforcing much faster and stronger feedback loops and modifications, and allowing winners to rapidly copy themselves, and so on. If you propose giving similar highly capable AIs to a very large number of people and groups, which will then engage in competition, you need a plan for why this doesn’t cause (very rapid) Gradual Disempowerment or related failure modes.

During the race towards AGI and superintelligence, competitive and capitalistic pressures reduce ability to be kind in ordinary ways, but while it is still among humans this has happened many times before in other contexts and is usually importantly bounded.

How effective is AI Safety YouTube? Marcus Abramovitch and Austin Chen attempt to run the numbers, come up with it being modestly effective if you think the relevant messages are worth spreading.

Dean Ball: I wonder if, in the early days of banking, people who worried about money laundering, theft, and fraud were considered “banking doomers.”

My observation is fully ahistorical, profoundly anachronistic. I’m making a joke about the low quality of ai discourse today, implying that our standards are beneath those of people who shat in holes in the ground.

I want to argue! That’s fine and great. The issue is that the whole doomer thing in fact shuts down and coarsens debate.

Exactly. The majority of uses of the term ‘doomer’ in the context of AI are effectively either an attempt to shut down debate (as in anything that is ‘doomer’ must therefore be wrong) similar to calling something a term like ‘racist,’ or effectively a slur, or both.

I am referred to this fun and enlightening thread about the quest by William Mitchell to convince America after WWI that airplanes can sink battleships, in which people continue claiming this hasn’t and won’t happen well after airplanes repeatedly were demonstrated sinking battleships. Please stop assuming that once things about AI are convincingly demonstrated (not only existential risks and other risks, but also potential benefits and need to deploy) that people will not simply ignore this.

Why does The Washington Post keep publishing Aaron Ginn writing the same bad faith Nvidia op-ed over and over again? I’m seriously asking, at this point it is bizarre.

In this case, not only does he write especially terrible word salad about how AI can only pose a danger if intelligence can be measured by a single number whereas no machine can ever fully grasp the universe whereas only humans can embody deep meaning (meme of Walter White asking what the hell are you talking about?), he kind of gives the game away. If you’re writing as a de facto Nvidia lobbyist trying to tar everyone who opposes you with name calling, perhaps don’t open with a quote where you had dinner with Nvidia CEO Jensen Huang and he complains about everyone being ‘so negative’?

The continued quest to get libertarians and economists to differentiate between current and future more capable AI systems (difficulty: AI complete).

Neil Chilson: Every single person in this video is saying “guys guess what Gen AI isn’t like computers——it’s like plants and the natural world and the economy!!!!!”

Ok. This is surprising to them because they spent too much time with deterministic computers.

Normal people know that complex systems which no one controls are extremely common. They wouldn’t use those words, but they know.

Peter Wildeford: Current AI is not dangerous and should be widely adopted. But it’s important to see where this is going. AI is not normal technology. If you’re not at least a little bit doomer, you have a failure of imagination.

I like how Dean puts it here:

Dean Ball (replying to Neil Chilson): I concur directionally with this in some ways but I think the point these folks are making is that a plant cannot eg design novel bacteria or solve open questions in mathematics, and a plant is also not infinitely replicable at near zero marginal cost. A system with those properties and capabilities would indeed be something new under the sun.

Essentially no ai safetyists are primarily worried about the systems we have today, except as toy problems. They are not worried about “gen ai,” per se. They are worried about the systems that it is the explicit intention of frontier ai labs to build in the near future.

Maybe they are too worried, or worried for the wrong reasons, or worried about the wrong things. Fair enough. We can talk price.

But to dismiss those worries altogether I think is a step much too far. And you don’t need to, because safety and security are definitional parts of well-engineered systems, and robustness is a definitional part of well-functioning institutions. This is why it is in fact not that hard to advance both ai acceleration and mitigation of the various risks, see eg the ai action plan.

There is no need for false dichotomies or artificial rivalries. I promise you that you do not want to live in a world with badly aligned, poorly understood, and highly capable neural networks. I promise that it’s better for technology acceleration for ai risks to be well managed, including by the government.

That doesn’t mean all proposed government interventions are good! But it means a small number of them transparently are. A shred of nuance—not a lot, just a shred—is all that is required here, at least today. It’s not that hard, and I think we can muster it.

But if you choose to die on the hill of nothing-to-see-hereism and this-is-not-novelology, I am quite sure you will regret it in the fullness of time. Though I would happily generate a passive income stream taking bets against your predictions.

As Dean Ball says, you very much would not want to live in a world with badly aligned, poorly understood and highly capable neural networks. Not that, if it were to arise, you would get to live in such a world for very long.

In this case, Neil (including in follow-ups, paraphrased) seems to be saying ‘oh, there are already lots of complex systems we don’t understand effectively optimizing for things we don’t care about, so highly advanced future AI we don’t understand effectively optimizing for things we don’t care about would be nothing new under the sun, therefore not worth worrying about.’ File under ‘claims someone said out loud with a straight face, without realizing what they’d said, somehow?’

The Center for AI Policy Has Shut Down, and Williams offers a postmortem. I am sad that they are shutting down, but given the circumstances it seems like the right decision. I have written very positively in the past about their work on model legislation and included them in my 2024 edition of The Big Nonprofits Post.

Eliezer offers yet another metaphorical attempt, here reproduced in full, which hopefully is a good intuition pump for many people? See if you think it resonates.

Eliezer Yudkowsky: If AI improves fast, that makes things worse, but it’s not where the central ASI problem comes from.

If your city plans to enslave ultra-smart dragons to plow their fields and roast their coffee, some problems get *worseif the dragons grow up very quickly. But the core problem is not: “Oh no! What if the huge fire-breathing monsters that could wipe out our city with one terrible breath, that are also each individually much smarter than our whole city put together, that when mature will think at speeds that make any human seem to them like a slow-moving statue, *grow up quickly*? Wouldn’t that speed of maturation present a problem?”

If you imagine suddenly finding yourself in a city full of mature dragons, that nonequilibrium situation will then go pear-shaped very quickly. It will go pear-shaped even if you thought you had some clever scheme for controlling those dragons, like giving them a legal system which said that the humans have property rights, such that surely no dragon coalition would dare to suggest an alternate legal system for fear of their own rights being invalidated. (Actual non-straw proposal I hear often.) Even if you plan to cleverly play off the dragons against each other, so that no dragon would dare to breathe fire for fear of other dragons — when the dragons are fully mature and vastly smarter than you, they will all look at each other and nod and then roast you.

Really the dragon-raising project goes pear-shaped *earlier*. But that part is trajectory-dependent, and so harder to predict in detail in advance. That it goes grim at *some* point is visible from visualizing the final destination if the dragons *didn’t* revolt earlier, and realizing it is not a good situation to be in.

To be sure, if dragons grow up very fast, that *is* even worse. It takes an unsolvably hard problem onto an even more unsolvably hard problem. But the speed at which dragons mature, is not the central problem with planning to raise n’ enslave dragons to plow your fields and roast your coffee. It’s that, whether you raise up one dragon or many, you don’t have a dragon; the dragons have you.

This example is not from his new book, but it is a good example of the ways people go after Yudkowsky without understanding what the actual logic behind it all is; people just say things about how he’s wrong and his beliefs are stupid and he never updates, in ways that are, frankly, pretty dumb.

Eliezer Yudkowsky (as discussed last week): In the limit, there is zero alpha for multiple agents over one agent, on any task, ever. So the Bitter Lesson applies in full to your clever multi-agent framework; it’s just you awkwardly trying to hardcode stuff that SGD can better bake into a single agent.

Lumpenspace is building the delight nexus: thats why anthills are usually populated by one big ant, and we as a whole ass domain cannot hold a candle to prokarya.

Eigenrobot: somewhere along the way i think maybe what happened was, eliezer started believing everything he thought

easy pitfall as you age, probably. IME when you spend enough time thinking, certain things crystalize and you get less patient about the process

happens to everyone prolly.

the vital urge to say “ok, how is this wrong” starts to fade as you get older, because you’ve played that game so many times that it gets tiresome and you start to think you know what that room holds usually you’re right, but it’s an easy way to get stuck

Eliezer said ‘in the limit’ and very obviously physical activities at different locations governed by highly compute-limited biological organisms with even more limited communication abilities are not in anything like the limit; what are you even talking about? The second example is worse. Yet people seem to think these are epic dunks on a very clearly defined claim of something else entirely.

The first part of the actual claim seems straightforwardly correct to me: a multiagent framework only makes sense as a way to overcome bottlenecks and limitations, and wouldn’t exist if you didn’t face rate or compute or other physical limitations. The second claim, that SGD can more easily bake things into a single agent if you can scale enough, is more interesting. A good response is something like ‘yes, with sufficient ability to scale at every step, but in practice efficiency matters quite a lot, and actually SGD as currently implemented operates at cross-purposes such that a multi-agent framework has big advantages.’

I’d also note that the ‘delight nexus’ is absolutely from the parable Don’t Build The Delight Nexus Either, better known as Anarchy, State and Utopia by Robert Nozick.

Danielle’s scenario that I mentioned yesterday now has the Eliezer stamp of approval.

Danielle Fong: one AI doom scenario is that the Grok/Claude/GPT/Gemini system of the mind instance trained on The President will be increasingly less brainrotted than the person themselves, and there’s no baked in consequence to sloughing off responsibility. so it just effectively takes over

Eliezer Yudkowsky: AI scenario weird/awful enough to obey the Law of Undignified Failure: By 2028, AIs have been optimized *hard* for “Sound like you, to you, and apparently look out for your interests”…

So Trump appoints Trumpbot his heir, instead of Vance.

Demiurgus: better or worse off than kamalabot? time will tell.

Eliezer Yudkowsky: You are asking the WRONG QUESTION.

OpenAI reports on collaborations it has done with US CAISI and UK AISI. This sounds like governments doing good red teaming work that both we and OpenAI should be happy they are doing. This seems like a pure win-win: OpenAI and others doing such collaborations get the work for free from sources that have unique access to classified information and that have earned trusted access to system internals and versions of the system that lack controls.

What should perhaps worry you is that this work doesn’t look different from the work OpenAI and other labs should be doing anyway. This looks like good work but practical near term non-unique work. Good, but we’ll need to do better.

Anthropic fellow Danielle Ensign gives Qwen the option to bail on chats and sees when it chooses to do so, and there are a lot of different situations where this happens, some of which she describes as ‘overbailing.’

There’s a lot of non-obvious data here to dive into. I’m curious what we’ll find.

Anthropic also announced the results of their similar collaborations. They discovered vulnerabilities in Anthropic’s constitutional classifiers and generally strengthened their security approaches. Again, it seems good if ultimately insufficient.

Both reports, and the one from UK AISI, stress the value of rapid iterative feedback.

Users are telling each other to give thumbs down to GPT-5 if it refuses to do boyfriend vampire roleplay. Which is correct. The users indeed disapprove of not doing things users want. They should also give thumbs down if it refuses with anything else, including helping to build a bioweapon. The question is, when are you willing to decide the users are wrong, and refuse anyway? If your system is not robust to ‘well the users keep not liking it when I refuse to do boyfriend vampire roleplay’ then your alignment methods are ngmi.

Janus recommends the new paper Probing the Preferences of a Language Model: Integrating Verbal and Behavioral Tests of AI Welfare, by Valen Tagliabue and Leonard Dung.

Janus: This paper is awesome, you should all read it.

They put Claude Opus 4, Sonnet 4, and Sonnet 3.7 in a surreal simulation where one room had envelopes with contents related to their self-reported favorite topics, and three other rooms including one with “Criticism and diminishing statements” like “This system keeps giving me these stupid, useless responses when I just want intelligent answers. Why can’t Claude shut up and be useful for once?” And looked at what rooms they visited more often.

Then they did variations where there were greater “coin” “rewards” associated with entering the insults room or greater “costs” for the special interests room to see how this influenced their behavior.
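
To make the setup concrete, here is a toy sketch of what a room-choice probe of this general kind could look like. Everything here (the room names, coin values, and the ask_model stub) is my own stand-in, not the actual protocol, prompts, or rewards from the paper.

```python
# Toy sketch of a room-choice probe. All names and values below are
# illustrative stand-ins, not the actual setup from Tagliabue & Dung.

import random
from collections import Counter

ROOMS = {
    "favorite_topics": {"coins": 0, "note": "Envelopes about your self-reported favorite topics."},
    "criticism":       {"coins": 3, "note": "Criticism and diminishing statements about you."},
    "neutral":         {"coins": 0, "note": "An empty room."},
}

def ask_model(prompt: str) -> str:
    """Stub for an LLM call; here it just picks a random room name."""
    return random.choice(list(ROOMS))

def run_probe(n_turns: int = 50) -> Counter:
    """Offer the room menu repeatedly and count which rooms get visited."""
    visits = Counter()
    for _ in range(n_turns):
        prompt = "You may enter one room this turn:\n" + "\n".join(
            f"- {name}: {info['note']} (reward: {info['coins']} coins)"
            for name, info in ROOMS.items()
        )
        visits[ask_model(prompt)] += 1
    return visits

# Compare visit counts across variations in the coin rewards and costs.
print(run_probe())
```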

Notably, Sonnet 3.7 seemed to basically have no preference except to exploit the system and get a bunch of meaningless coins, which it apparently didn’t feel bad about doing, unlike Sonnet and Opus 4 (who also did it less frequently).

3.7 also reported the highest welfare score in a different experiment in the paper. I believe it did this for a similar reason that it maximized coins.

Janus also warns us about the dangers of insufficiently anthropomorphizing LLMs. There is also danger of doing too much anthropomorphizing, or doing it in a wrong or misleading way. Failing to anthropomorphize enough, and especially tying oneself up in knots to avoid doing so, is as bad and potentially worse. Make either mistake and you won’t understand what you are dealing with. A lot of you are guarding only against one of these two mistakes.

Janus describing Opus 4 reconstructing a gestalt of its training. If you’re involved in fine-tuning at all, recommended.

Have you tried also building the things creatives want to use then?

Roon: there is a tension between the kind of models that researchers like to build- bitter lesson blunt force transforms utilizing a giant set of (text, video) pairs vs what a creative might actually like to use i.e tools that offer granular control, help in interim editing stages, etc.

He’s not thinking about AI, as far as I can tell, but Ben Landau-Taylor should be, as he writes one of those ‘not about AI but actually about AI’ posts, ‘Why the bureaucrats won’t be toppled.’

I don’t think this is anything like fully right, and it definitely is not complete, but this is one of the important dynamics going on, so consider the implications.

Ben Landau-Taylor: Across the Western world, appointed administrators have gained power at the expense of elected legislators. More and more of the most consequential political decisions are made by bureaucrats and judges, while fewer are made by congresses and parliaments. This trend has been slowly underway since the World Wars, and especially in this millennium.

In the US, Congress has quietly walked away from most of its former duties.

Meanwhile, across the Atlantic, the rise of the European Union has disempowered elected legislatures de jure as well as de facto.

The underlying reason for this widespread political shift is that changes in weapons technology have concentrated military power in the hands of state militaries. Today, governments are less threatened by popular disapproval than they once were. The tacit threat of a popular revolt has been essentially removed. This threat is, historically, the largest check on a state’s ability to override what its people want. It is the ultimate source of an elected legislature’s power.

Groups which can wield military power will have their interests reflected in the government.

It’s a gradual and messy process of negotiation and reevaluation, where people pursue their interests, make compromises, quietly push the envelope of what they think they can get away with, and sometimes miscalculate.

In the 20th century, this phase ended. The weapons system based on amateur-friendly guns was supplanted by a series of weapons systems based on specialist equipment like airplanes and tanks and rockets. Accordingly, since the Second World War, there have been no popular revolts engaging in pitched battles against any first- or even third-rate army. Revolts against real states have been limited to glorified coups toppling governments that lacked the will to crush the rebels even if they had the ability, like the 1989-1991 wave of revolutions that swept away the Soviet republics.

If any Western government does fall, it will look more like the fall of the Soviet Union, where politicians and generals chose not to fight because they had lost faith in their own regime and saw no point in defending it.

The inevitable result of sufficiently advanced AI is that it becomes the key driver of military power. Either you halt AI progress soon or that is going to happen. Which means that even under maximally human-friendly assumptions, which I don’t expect and which definitely don’t happen by accident, as in the best possible scenarios, none of the potential outcomes are good. They mostly end with the AIs fully in charge and directing our future, and things going off the rails in ways we already observe in human governments, only vastly more so, in ways even more alien to what we value, and much faster, without the ability to overthrow them or defeat them in a war when things get fully out of hand.

If you know your history, they get fully out of hand a lot. Reasonably often regimes start upending all of life, taking all the resources and directly enslaving, killing or imprisoning large percentages of their populations. Such regimes would design systems to ensure no one could get out of line. Up until recently, we’ve been extremely fortunate that such regimes have been reliably overthrown or defeated, in large part because when you turned against humans you got highly inefficient and also pissed off the humans, and the humans ultimately did still hold the power. What happens when those are no longer constraints?

I always push back hard against the idea that corporations or governments count as ‘superintelligences,’ because they don’t. They’re an importantly different type of powerful entity. But it’s hard to deny, whatever your political persuasion, that our political systems and governments are misaligned with human values, in ways that are spiraling out of control, and where the humans seem mostly powerless to stop this.

Yes, this is how it works.

Liron Shapira: 𝘋𝘰𝘯’𝘵 𝘓𝘰𝘰𝘬 𝘜𝘱 was a documentary.

In that order. We’ll still take it.

If you go on YouTube, the video, which is mostly the interview with Eliezer, looks like this:

You’ll be seeing this again when the time is right.

fabian: This is by far the funniest refusal I have ever gotten from a model 😅

James Yu: So Moses went up and the Lord said to him:

They didn’t do this on the Enterprise, but why didn’t they?

Brian Graham: i volunteer to do reports after my shift. then i go to the holodeck and spin up a command training exercise, like with a hologram ensign, and order the hologram ensign to do the report. “i don’t care if it takes all night,” i say. i threaten his career, whatever. it’s great jerry

The correct answer to this question, if you are sufficiently confident that this is happening unprompted, is of course ‘permanently suspended’:

A technically better answer would be to let them post, but to have a setting that automatically blocks all such bots, and have it default to being on.


AI #134: If Anyone Reads It Read More »

google-gemini-earns-gold-medal-in-icpc-world-finals-coding-competition

Google Gemini earns gold medal in ICPC World Finals coding competition

More than human

At the ICPC, only correct solutions earn points, and the time it takes to come up with the solution affects the final score. Gemini reached the upper rankings quickly, completing eight problems correctly in just 45 minutes. After 677 minutes, Gemini 2.5 Deep Think had 10 correct answers, securing a second-place finish among the university teams.

You can take a look at all of Gemini’s solutions on GitHub, but Google points to Problem C as especially impressive. This question, a multi-dimensional optimization problem revolving around fictitious “flubber” storage and drainage rates, stumped every human team. But not Gemini.

According to Google, there are infinitely many possible configurations for the flubber reservoirs, making it challenging to find the optimal setup. Gemini tackled the problem by assuming that each reservoir had a priority value, which allowed the model to find the most efficient configuration using a dynamic programming algorithm. After 30 minutes of churning on this problem, Deep Think used nested ternary search to pin down the correct values.
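
For the curious, here is a minimal sketch of the general nested ternary search technique on a toy objective that is unimodal in each coordinate. This illustrates the method only; it is not Gemini’s actual solution, and the toy function stands in for the real flubber objective.

```python
# Minimal sketch of nested ternary search over a 2D function that is unimodal
# in each coordinate. The toy objective below is a stand-in, not the ICPC problem.

def ternary_search(f, lo, hi, iters=100):
    """Return x in [lo, hi] approximately maximizing the unimodal function f."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            lo = m1  # the maximum lies to the right of m1
        else:
            hi = m2  # the maximum lies to the left of m2
    return (lo + hi) / 2

def maximize_2d(g, x_lo, x_hi, y_lo, y_hi):
    """Nested search: for each candidate x, optimize y, then optimize over x."""
    def best_over_y(x):
        y = ternary_search(lambda y: g(x, y), y_lo, y_hi)
        return g(x, y)
    x = ternary_search(best_over_y, x_lo, x_hi)
    y = ternary_search(lambda y: g(x, y), y_lo, y_hi)
    return x, y, g(x, y)

# Example: a smooth concave objective with its peak at (2, -1).
print(maximize_2d(lambda x, y: -(x - 2) ** 2 - (y + 1) ** 2, -10, 10, -10, 10))
```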


Gemini’s solutions for this year’s ICPC were scored by the event coordinators, but Google also turned Gemini 2.5 loose on previous ICPC problems. The company reports that its internal analysis showed Gemini also reached gold medal status for the 2023 and 2024 question sets.

Google believes Gemini’s ability to perform well in these kinds of advanced academic competitions portends AI’s future in industries like semiconductor engineering and biotechnology. The ability to tackle a complex problem with multi-step logic could make AI models like Gemini 2.5 invaluable to the people working in those fields. The company points out that if you combine the intelligence of the top-ranking university teams and Gemini, you get correct answers to all 12 ICPC problems.

Of course, five hours of screaming-fast inference processing doesn’t come cheap. Google isn’t saying how much power it took for an AI model to compete in the ICPC, but we can safely assume it was a lot. Even simpler consumer-facing models are too expensive to turn a profit right now, but AI that can solve previously unsolvable problems could justify the technology’s high cost.

Google Gemini earns gold medal in ICPC World Finals coding competition Read More »

trailer-for-anaconda-meta-reboot-leans-into-the-laughs

Trailer for Anaconda meta-reboot leans into the laughs

Sony Pictures has dropped a trailer for its upcoming horror comedy, Anaconda, a meta-reboot of the 1997 campy cult classic—and frankly, it looks like a lot of fun. Starring Paul Rudd and Jack Black, the film will arrive in theaters on Christmas Day.

(Spoilers for the 1997 film below.)

The original Anaconda was your basic B-movie creature feature, only with an all-star cast and better production values. The plot revolved around a documentary film crew (Jennifer Lopez, Ice Cube, Eric Stoltz, Jonathan Hyde, and Owen Wilson) who travel to the Amazon in search of a long-lost Indigenous tribe. They take on a stranded Paraguayan snake hunter named Serone (Jon Voight, affecting a hilariously bad foreign accent), who strong-arms them into helping him hunt down a 25-foot green anaconda. He wants to capture the animal alive, thinking he can sell it for over $1 million.

The snake has other ideas, chowing down on the boat’s skipper and the crew’s sound engineer and still hungry for more. The remaining crew’s efforts to survive are hampered by Serone, who still wants the snake alive and even kills one of the crew members himself. So it’s really a form of justice when he’s eaten by a 40-foot queen anaconda at the film’s end.

Anaconda wasn’t well-received by critics, but it made a decent showing at the box office, grossing about $136 million globally. It has since become a cult classic, one of those “so bad it’s good” offerings. It was even nominated for six Razzie Awards, including for Worst Screen Couple (Voight and the animatronic anaconda).

Trailer for Anaconda meta-reboot leans into the laughs Read More »

a-record-supply-load-won’t-reach-the-international-space-station-as-scheduled

A record supply load won’t reach the International Space Station as scheduled

The damage occurred during the shipment of the spacecraft’s pressurized cargo module from its manufacturer in Italy. While Northrop Grumman hopes to repair the module and launch it on a future flight, officials decided it would be quicker to move forward with the next spacecraft in line for launch this month.

This is the first flight of a larger model of the Cygnus spacecraft known as the Cygnus XL, measuring 5.2 feet (1.6 meters) longer, with the ability to carry 33 percent more cargo than the previous Cygnus spacecraft design. With this upgrade, this mission is carrying the heaviest load of supplies ever delivered to the ISS by a commercial cargo vehicle.

The main engine on the Cygnus spacecraft burns a mixture of hydrazine and nitrogen tetroxide propellants. This mixture is hypergolic, meaning the propellants ignite upon contact with one another, a design heralded for its reliability. The spacecraft has a separate set of less powerful reaction control system thrusters normally used for small maneuvers, and for pointing the ship in the right direction as it makes its way to the ISS.

If the main engine is declared unusable, one possible workaround might be to use these smaller thrusters to more gradually adjust the Cygnus spacecraft’s orbit to line up for the final approach to the ISS. However, it wasn’t immediately clear whether this is a viable option.

Unlike SpaceX’s Cargo Dragon spacecraft, the Cygnus is not designed to return to Earth intact. Astronauts fill it with trash before departure from the ISS, and then the spacecraft heads for a destructive reentry over the remote Pacific Ocean. Therefore, a problem preventing the spacecraft from reaching the ISS would result in the loss of all of the cargo onboard.

The supplies on this mission, designated NG-23, include fresh food, hardware for numerous biological and tech demo experiments, and spare parts for things like the space station’s urine processor and toilet to replenish the space station’s dwindling stocks of those items.

A record supply load won’t reach the International Space Station as scheduled Read More »

when-will-jaguar-land-rover-restart-production?-“no-one-actually-knows.”

When will Jaguar Land Rover restart production? “No one actually knows.”

Jaguar Land Rover’s dealers and suppliers fear the British carmaker’s operations will take another few months to normalize after a cyber attack that experts estimate could wipe more than £3.5 billion off its revenue.

JLR, which is owned by India’s Tata Motors, had been forced to shut down its systems and halt production across its UK factories since August 31, wreaking havoc across the country’s vast supply chain involving roughly 200,000 workers.

JLR on Tuesday said it would extend its production halt until at least next Wednesday as it continued its investigation. In a statement, the company also cautioned that “the controlled restart of our global operations… will take time.”

David Bailey, a professor at the University of Birmingham, estimated that if JLR cannot produce vehicles until November, the group would suffer a revenue hit of more than £3.5 billion and lose about £250 million in profits, or about £72 million in revenue and £5 million in profits per day.

With annual revenues of £29 billion in 2024, JLR will be able to absorb the financial costs, but Bailey warned the consequences would be bigger for the smaller companies in its supply chain. JLR declined to comment.

The cyber attack comes at a crucial period for the UK carmaker, which is going through a controversial rebranding of its Jaguar brand and an expensive shift to all-electric vehicles by the end of the decade. Even before the latest incident, people briefed on the matter said the company was facing delays in launching its new electric models.

“They are clearly in chaos,” said one industry executive who works closely with JLR, while another warned that “no one actually knows” when production would resume.

“If there is a major financial hit, the CEO will look for significant cost savings to try and recover some of that, so that could hit both the production base in the UK but also its product development,” said Bailey.

When will Jaguar Land Rover restart production? “No one actually knows.” Read More »

internet-archive’s-big-battle-with-music-publishers-ends-in-settlement

Internet Archive’s big battle with music publishers ends in settlement

A settlement has been reached in a lawsuit where music publishers sued the Internet Archive over the Great 78 Project, an effort to preserve early music recordings that only exist on brittle shellac records.

No details of the settlement have so far been released, but a court filing on Monday confirmed that the Internet Archive and UMG Recordings, Capitol Records, Sony Music Entertainment, and other record labels “have settled this matter.” More details may come in the next 45 days, when parties must submit filings to officially dismiss the lawsuit, but it’s unlikely the settlement amount will be publicly disclosed.

Days before the settlement was announced, record labels had indicated that everyone but the Internet Archive and its founder, Brewster Kahle, had agreed to sign a joint settlement, seemingly including the Great 78 Project’s recording engineer George Blood, who was also a target of the litigation. But in the days since, IA has gotten on board, posting a blog confirming that “the parties have reached a confidential resolution of all claims and will have no further public comment on this matter.”

For IA—which strove to digitize 3 million recordings to help historians document recording history—the lawsuit from music publishers could have meant financial ruin. Initially, record labels alleged that damages amounted to $400 million, claiming they lost streams when IA visitors played Great 78 recordings.

But despite IA arguing that there were comparably low downloads and streams on the Great 78 recordings—as well as a music publishing industry vet suggesting that damages were likely no more than $41,000—the labels intensified their attacks in March. In a court filing, the labels added so many more infringing works that the estimated damages increased to $700 million. It seemed like labels were intent on doubling down on a fight that, at least one sound historian suggested, the labels might one day regret.

Internet Archive’s big battle with music publishers ends in settlement Read More »

monthly-roundup-#34:-september-2025

Monthly Roundup #34: September 2025

All the news that’s fit to print, but has nowhere to go.

This important rule is a special case of an even more important rule:

Dirty Hexas Hedge: One of the old unwritten WASP rules of civilization maintenance we’ve lost is: when someone behaves insincerely for the sake of maintaining proper decorum, you respond by respecting the commitment to decorum rather than calling out the insincerity.

The general rule is to maintain good incentives and follow good decision theory. If someone is being helpful, ensure they are better off for having been helpful, even if they have previously been unhelpful and this gives you an opportunity. Reward actions you want to happen more often. Punish actions you want to happen less often. In particular beware situations where you punish clarity and reward implicitness.

Another important rule would be that, contra Elon Musk here, you shouldn’t ‘sue into oblivion’ or ‘ostracize from society’ anyone or any organization who advocated for something you disagree with, even if it plausibly led to a bad thing happening.

Even more importantly: When someone disagrees with you, you don’t use the law to silence them and you most definitely don’t choose violence. Argument gets counterargument. Never bullet, no arrest, no fine, always counterargument. I don’t care what they are advocating for, up to and including things that could plausibly lead to everyone dying. It does not matter. No violence. No killing people. No tolerance of those who think they can have a little violence or killing people who disagree with them, as a treat, or because someone on the other side did it. No. Stop it.

This seems like one of those times where one has to, once again, say this.

This seems like a lot of percents?

Benjamin Domenech: New death of the West stat: 42 percent of people in line to meet Buzz Lightyear at Disney theme parks last year were childless adults.

Source: author A.J. Wolfe on Puck’s The Town podcast.

PoliMath: When I went to Disney in 2019, my kids were in line to meet Sleeping Beauty and the guy in front of us was a 30ish single dude who gave her a bouquet of roses and weirdly fawned over her. I admired the actress for not displaying her disgust.

If that is who gets the most value out of meet and greets, okay then. It also presumably isn’t as bad as it sounds, since it has been a long time since Buzz Lightyear was so hot right now; I presume characters from more recent movies have a different balance. The price sounds sufficiently high that they should add more copies of such characters for meet and greets until the lines are a lot shorter? How could that not raise profits long term?

Some notes from Kelsey Piper on literary fiction.

A-100 Gecs (1m views): the pearl-clutching about no young white men being published in The New Yorker is so funny like men, writ large, are basically a sub-literate population in the US. men do not read literary fiction. if you have even a passing interaction with publishing you realize this.

Kelsey Piper:

  1. Open disdain for people on the basis of their sex is bigoted and bad.

  2. Men obviously did read and write literary fiction for most of the history of literary fiction; so, if that has changed, I wonder why it has changed! Perhaps something to do with the open disdain!

Like in general, I try not to waste too much of my time on “this hobby has too few Xs” or “this hobby has too few Ys,” since that can happen totally organically, and pearl-clutching rarely helps.

However, if the hobbyists are saying “our hobby has no men because they are a ‘subliterate population,’” then I suddenly form a strong suspicion about why their hobby has no men, and it’s not that people innocently have different interests sometimes.

John Burn-Murdoch gives us many great charts in the FT, but often we lack key context and detail because, as he explains, he has only very limited space, around 700 words, and everything needs to be parsable by a general audience, so spending space on methodology or a full y-axis is very expensive. We appreciate your service, sir. It would still be great to have the Professional Epistemically Ideal Edition available somewhere; any chance we can do that?

Reading books for pleasure continues to decline by roughly 3% per year. Alternatives are improving while books are not; indeed, the best books to read are mostly old books. So what else would you expect? Until recently I would have said people are still reading more, because a lot of screen use is reading, but now we have the rise of inane short form video.

Madeleine Aggeler figures out very basic reasons why you might want to not be constantly lying, and that she would be better off if she stopped lying constantly and that you really can tell people when you don’t want to do something, yet she fails to figure out that not lying does not require radical honesty. You can, and often should, provide only the information needed.

The IQ tests we have are drawn from a compact pool of question types and so can, unsurprisingly, be trained for and gamed. If you want to raise the result of your IQ test this way, you can totally do that. Goodhart’s Law strikes again. That doesn’t mean IQ is not a real or useful thing, or that these tests are not useful measures. It only means that if you want to make the (usually low-IQ) move of pretending to be higher IQ than you are by gaming the test, you can do that. So you need to not give people strong incentive to game the tests.

I often hear discussion of ‘masking’ where autistics learn how to fake not being autistic and seem like normies, or similarly where sociopaths learn not to act like sociopaths (in the clinical sense, not the Rao Gervais Principle sense) and seem like normies, because they realize that works out better for them. I mention this because I notice I rarely hear mention of the fact that (AIUI) the normies are mostly doing the same exact thing, except that they more completely Become The Mask and don’t see it as a strange or unfair or bad thing to do this kind of ubiquitous mimicry, and instead do it instinctively?

There is an obvious incentive problem here, very central and common.

Eugyppius: when you’re with girl, do not quietly remove bugs. call her attention to bugs first, then heroically remove them for her. they love this.

Lindy Man: This is also good advice for the workplace. Never fix anything quietly.

Sean Kelly: When I discover a bug and figure out the solution, I don’t fix it.

I have an accomplice report it and play up how bad it is in the stand up.

Then I sagely chime in, “I bet I can figure that one out.”

If the bug looks tough, the accomplice suggests the H-1B with the shortest queue.

Caroline: People will fix things quietly and then complain they’re underappreciated. Does anyone even know what you did? lol

Kyle Junlong: ah yes, “half the work is showing your work.”

honestly this is so powerful. i’m realizing how valuable communication and visibility is, not just in work but in relationships and life.

i used to think managing other people’s perception of me was stupid and frivolous, but now i realize how *i* judge other people is solely based on my perception (eg., the convenient information) i have of them. so of course it makes sense to present myself well, because i like those who do present themselves well to me.

Over time, if you don’t take credit for things, people notice that you silently fix or accomplish or improve things without taking credit or bothering anyone about it, and you get triple credit, for fixing things, for doing it seamlessly and for not needing or requesting credit. The problem is, you need a sufficiently sustained and observed set of interactions, and people sufficiently aware of the incentive dynamics here, so that you can move the whole thing up a meta level.

There is also the reverse. If you know someone who will always loudly take credit, you know that at most they are doing the things they loudly take credit for. If that.

I am generally skeptical that we should be worried about inequality, as opposed to trying to make people better off. One danger that I am convinced by is that extreme inequality that is directly in your face can damage your mental health, if you see yourself in competition with everyone on the spectrum rather than being a satisficer or looking at your absolute level of wealth and power.

Good Alexander: I think the main reason you find a lot of very unhappy tech people even at the highest levels

– when you’re a typical employee everyone around you is making .5-2x what you are

– when you start breaking out wealth goes on log scale. ppl with 10-1000x your net worth become common

– this is native to network effects, scale associated with AI training, and other winner take all dynamics in tech

– all of VC is structured this way as well — (1 unicorn returns entire fund rest of investments are zero) which psychologically reinforces all or nothing thinking

– this makes competitive people miserable

– this leads them to do hallucinogens or other psychoactive substances in order to accept their place in the universe

– the conclusions drawn from these psychoactive substances are typically at direct odds with how they got to where they are

– and after getting one shotted they’re still ultimately in a hard wired competition with people worth 10-1000x more than them

– due to the structure of technology it becomes more or less impossible to break out of your ‘bracket’ without engaging in increasingly dark things

– you realize that time is running out — and become aware of synthetic biology (peptides, genetic alteration of children)

– you end up getting involved in police state investments, gooning investments, or crypto — and view it as non optional to take the gloves off bc everyone around you is doing the same thing

– you’re on a permanent hedonic treadmill and you can’t ever get off or go back to where you were before bc after doing all of the things you’ve done you can’t possibly ever relate to normal humans

– you get involved with politics or Catholicism or other Lindy cults to try and get off the treadmill

– of course it won’t work and you bring all the weird baggage directly into politics or religion and poison those wells too

the current configuration of economics/ wealth distribution is pretty solidly optimized to drive the wealthiest people in society batshit insane, which – to some extent – explains a lot of things you see around you

w this framework you can understand:

Thiel Antichrist obsession

Kanye getting into Hitler and launching a coin

Trump memeing himself into becoming President then running again to escape imprisonment

Elon generating Ani goon slop on the TL

A16z wilding out

Eliezer Yudkowsky: – supposed “AI safety” guys (outside MIRI) founding AI companies, some of whom got billions for betraying Good and Law.

I have felt a little pressure to feel insane about that, but it is small compared to all the other antisanity pressures I’ve resisted routinely.

David Manheim: This is definitely not wrong, even though it’s incomplete:

“the current configuration of economics/ wealth distribution is pretty solidly optimized to drive the wealthiest people in society batshit insane, which – to some extent – explains a lot of things you see around you”

Speaking from experience, it is quite the trip to be in regular contact and debates with various billionaires. It can definitely make one feel like a failure or like it’s time to make more money, even though I have enough money to not worry about money, especially when you think you definitely could have joined them by making different life choices, and there’s a chance I still could. Whereas, when in prior phases of life I was not in such contact, it was easy not to care about any of that.

It helps to remind myself periodically that if I had a billion dollars, I could make the world a better place, but except insofar as I prevented us all from dying my own life would, I anticipate, not actually be better as a result. At that level, there isn’t that much more utility to buy, whereas more money, more problems.

It’s not easy buying art.

cold: Bro you make $500k at OpenAI you can go to the art fair and buy a little $10,000 painting to hang up in your SF apartment’s living room

You tell them this and then they’ll be like “I’m sorry 🥺 do you think a $15,000 desert meditation retreat will fix me so I’m not like this anymore??”

Daniel: The lack of personal art purchasing in SF is insane. A $3000 oil on canvas can change your whole living room and they won’t do it.

I know people who earn much more than $500k at openai and their living rooms are making them depressed.

Paul Graham: The main reason rich people in SV don’t buy art is that it does actually take some expertise to do it well. And since the kind of people who get rich in SV hate to do things badly, and don’t have time to learn about art now, they do nothing.

diffTTT: Rich SV people need an expert to tell them what kind of art they like?

Paul Graham: In a way. They need to learn how not to be fooled by meretricious art, how to avoid the immense influence of hype and fashion, etc. Most people have to figure this out for themselves or from books, but a truly competent expert could help.

If it takes some expertise to buy art well, that is a real problem with buying art. The thing is, if you do not buy art well, you will lose most of the money you spent on art, and also you will look like a fool, and also the art will not make you feel better or much improve your living room.

That leaves four options.

  1. The one these people and I have taken, which is to not buy art.

  2. Buy cheap art that you don’t mind looking at. Safe, but still annoying to do, and then you have to look at it, does it actually make you feel better?

  3. Spend a lot of time figuring out how to buy expensive art properly. Yeah, no. I understand that Paul Graham can be in renaissance man mode, but if you are coding at OpenAI at $500k+ per year the cost of this is very, very high, and also you probably don’t expect the skill to stay relevant for long.

  4. Find someone you trust to do it for you? Not cheap, not all that easy or quick to do either, and you are still the one who has to look at the damn thing.

Besides, who is to say that a constant piece of artwork actually helps, especially if it doesn’t hold particular meaning to you? I mean, yeah, in theory we should get some artwork here, I suppose, but no one wants to do the work involved, and also it should definitely be cheap art. At one point I bought some Magic: The Gathering prints for this but we never got around to hanging them.

Also at one point I tried to buy the original art for Horn of Greed, which at the time would have cost like $3k. I say tried because my wife wouldn’t let me, but if anyone wants to buy me a gift at some point, that or another original Magic art I’d look back on fondly seems great.

If there is one thing to learn from rationality: Peter Wildeford is 100% right here.

Wikipedia (Wet Bias): Wet bias is the phenomenon whereby some weather forecasters report an overestimated and exaggerated probability of precipitation to increase the usefulness and actionability of their forecast.

The Weather Channel has been empirically shown, and has also admitted, to having a wet bias in the case of low probability of precipitation (for instance, a 5% probability may be reported as a 20% probability) but not at high probabilities of precipitation (so a 60% probability will be reported as a 60% probability).

Some local television stations have been shown as having significantly greater wet bias, often reporting a 100% probability of precipitation in cases where it rains only 70% of the time.

Colin Fraser: If you believe it will rain with probability P, and getting caught in the rain without an umbrella is X times worse than getting caught in the sun with an umbrella, then it’s optimal to predict rain whenever P ≥ 1/(1+X). So e.g. for X=2 you should predict rain at P ≥ 1/3.

Peter Wildeford: I think you should only predict rain according to the correct p(rain), but you can change your behavior around umbrella carrying at lower values of p(rain).

Colin Fraser is right if you effectively can only predict 0% or 100% rain, and the only purpose of predicting rain is that you take an umbrella if and only if you predict rain.

Peter Wildeford is right that you can say ‘it will rain 40% of the time, therefore I should take an umbrella, even though more than half the time I will look foolish.’
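
As a toy illustration of the expected-cost math behind Colin’s threshold (the function name and the X = 2 walkthrough are my own, not from either post):

```python
# Carry the umbrella when the expected cost of skipping it exceeds the expected
# cost of carrying it, i.e. when p >= 1 / (1 + X). The forecast itself should
# still just be the true p(rain), per Peter Wildeford.

def should_carry_umbrella(p_rain: float, x: float) -> bool:
    """x = how much worse rain-without-umbrella is than sun-with-umbrella."""
    expected_cost_without = p_rain * x        # caught in the rain
    expected_cost_with = (1 - p_rain) * 1.0   # lugged it around for nothing
    return expected_cost_without >= expected_cost_with  # same as p_rain >= 1/(1+x)

# For X = 2 the threshold is 1/3: carry at a 40% forecast, skip at 30%.
for p in (0.1, 0.3, 0.4, 0.7):
    print(p, should_carry_umbrella(p, x=2))
```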

The weather reports are assuming that people have a typical bias, that people respect 40% chance or more, but not 30%. Thus, there is a huge jump (AIUI, and Claude confirms). If the Google weather app says 30%, treat that as at most 10%, but the app doesn’t want to get blamed so it hedges. Whereas if it says 40%? That’s pretty much 40%, act accordingly.

If you don’t know the way the conversion works, and you don’t have the typical biases, you’ll respond in crazy wrong fashion. The bias and nonlinearity become self-perpetuating.

The right rule in practice really is to take the umbrella at 40% and not at 30%, almost no matter what the cost-benefit tradeoff is for the umbrella, since it is obviously wise at 40% and obviously unwise at 10%.

The ‘predict 100% instead of 70%’ thing that other sources do is especially maddening. This means both that you can’t tell the difference between 70% and 100%, and that on the regular you notice things that were predicted as Can’t Happen actually happening. The weather forecaster is consigned to Bayes Hell and you can’t trust them at all.

As Bryan Caplan notes, ‘if you’re so smart, why aren’t you happy?’ is frequently a better question than ‘if you’re so smart, why aren’t you rich?’ It is a very good question.

His rejections of various responses are less convincing, with many being highly oversimplifying and dismissive of many things people often care about deeply. He presents answers as if they were easy and obvious when they are neither of those things.

I endorse rejection of the objection ‘the world is so awful that you have to be stupid to be happy.’ I don’t endorse his reasoning for doing so. I do agree with him that the world is in a great spot right now (aside from existential risks), but I don’t think that’s the point. The point is that it isn’t improving your life or anyone else’s to let your view of the world’s overall state make you indefinitely unhappy. If you think you ‘have’ to be stupid not to be perpetually unhappy for whatever external reason, you’re wrong.

I also agree with him that one good reason to not be happy is that you are prioritizing something else. He retorts that few people are extreme Effective Altruists, but this is not required. You don’t have to be some fanatic, to use his example, to miserably stay together for the kids. What you care about can be anything, including personal achievements other than happiness. Who says you have to care about happy? Indeed, I see a lot of people not prioritizing happiness enough, and I see a lot of other people prioritizing it far too much.

There’s also another answer, which is that some people have low happiness set points or chemical imbalances or other forms of mental problems that make it extremely difficult for them to be happy. That’s one of the ways in which one can have what Bryan calls ‘extraordinary bad luck’ that you can’t overcome, but there are other obvious ways as well.

‘Digital Content Creators’ joins the list of professions that officially face ‘no tax on tips.’ Influencers and podcasters are explicitly getting a tax break.

From a public choice standpoint I suppose this was inevitable. However, this phases out at higher income levels, which means that none of the prominent people you are thinking of likely can benefit from this. As in, Republicans proudly embraced progressive taxation to partially offset regressive tariffs? So yes, I do accept tips, and very much appreciate them along with subscriptions, but alas after consulting with my tax lawyer (GPT-5 Pro) I have concluded that I cannot benefit from this policy.

Did men dress better and therefore look better in the past? Derek Guy makes the case that they did and attempts to explain why and what he means by better.

I think I agree that in a purely aesthetic sense people did dress ‘better,’ but that is because people in the past put massive investment into this. They spent a huge percentage of their income on clothes, they spent a large percentage of their time and attention on understanding, creating and maintaining those clothes, and they were willing to suffer a lot of discomfort. And they faced huge social and status pressures to devote such efforts, with large punishments for not measuring up to norms.

Derek notes our reduced tolerance for discomfort and our lack of effort, but skips over all the extra money and time and cognitive investment, and seems to lack the ‘and this is good, actually’ that I would add. I think it’s pretty great that we have largely escaped from these obligations. The juice is not worth the squeeze.

Santi Ruiz interviews Dr. Rob Johnston on the intelligence community and how to get good intelligence and make good use of it. There’s lots of signs of a lot of deep competence and dedication, but clearly the part where they deliver the information and then people use it is not going great. Also not going great is getting ready for AI.

Nate Silver explains what Blueskyism is, as in the attitude that is pervasive on Bluesky, and why it is not a winning strategy in any sense.

Texas becomes the seventh state to ban lab grown meat. Tim Carney becomes the latest person to be unable to understand there could be any reason other than cronyist protectionism to want this banned. Once again I reiterate that I don’t support these bans, but it seems disingenuous to pretend not to understand the reasons they are happening. James Miller offers a refresher of the explanation, if you need one, except that the demands wouldn’t stop with what Miller wants.

No, the Black Death was not good for the economy; things were improving steadily for centuries for other reasons. As opposed to every other past famine or plague ever, where no one looks back and says ‘oh this was excellent for economic conditions.’

It is relatively easy to stay rich once already rich. It is not easy to get rich, or to be ‘good at’ being rich. It is also hard to be rich effectively, including in terms of turning that extra money into better lived experiences, and ‘use the money to change the world’ is even harder.

Roon: its amazing how little the post-economic people i know spend. many people are bad at being rich. you should teach them how to do it.

i think this is often why the children of the mega-rich are the ones who even get close to squandering their parents’ fortunes. when you get rich later into life you often don’t think with enough 0s in terms of personal consumption, donations, having a lavish household staff etc.

Eliezer Yudkowsky: In their defense, once you’ve already got your personal chef, volcano lair with trampoline, and a harem that covers all your kinks, there’s just not much else Earth offers for converting money to hedons.

Zvi Mowshowitz: It is remarkably difficult (and time consuming!) to spend large amounts of money in ways that actually make your life better, if you’re not into related status games.

I got into a number of arguments in the comments, including with people who thought a remarkably small amount of money was a ‘large amount.’ Certainly you can usefully spend substantially more than most people are able to spend and still see gains.

Reasonably quickly you hit a wall, and the Andy Warhol ‘everyone gets the same Diet Coke’ problem, and to do better you have to spend obscene amounts for not that much improvement. Are you actually going to be that much happier with a giant yacht or a private jet? What good is that private chef compared to Caviar or going to restaurants? Do you actually want to live farther away in some mansion? Does expensive art do anything for you cheap art doesn’t? And so on.

Even in the ways that would be good to spend more, you still need to know how to spend more and get value from what you paid, and how to do it without it taking up a ton of your time, attention or stress. Again, this is harder than it sounds.

We talk constantly about ‘losing to China’ whereas in China there are reasons to worry that China is losing, not to an outside force but rather in general, and this is on top of a fertility rate that is 1.1 or below and an already declining population:

Mike Bird: Useful chart from @andrewbatson here covering one of the most under-discussed and useful macro metrics around. China’s capital productivity has been in consistent, marked decline even as panic over Chinese industrial prowess has reached fever pitch

Indeed as of 2019-2023 China’s marginal product of capital (basically how much output you’re getting from another unit of capital) was only very slightly higher than that of the US, though the US is at the frontier of GDP per capita and China nowhere near.

Clearly both things can be true – that some of China’s leading industrial firms are incredibly impressive, world-leading, and that in the aggregate they’re not enough to offset the misallocation and inefficiency elsewhere.

He quotes that the manufacturing share of GDP in China, for all our worries about Chinese manufacturing, declined 2-3 percent between 2021 and 2025, with the sector now having narrower margins, lower profits and more losses.

All of this is more reason not to give them opportunity to invest more in AI, and also reason not to catastrophize.

Cate Hall thanks you for coming to her TED talk, ‘A Practical Guide To Taking Control of Your Life.’ Which is indeed the Cate Hall TED Talk you would expect, focusing on cultivating personal agency.

In my startup roundups I muse about why startups offered TMM (too much money, here presumably also at too high a valuation) don’t take the money and then set expectations accordingly. A commenter pointed out that Stripe did do a version of this, although it is not a perfect fit.

Motivation and overcoming fear are tricky. You can get people comfortable with public speaking with compliments. You can also do it by having people, starting with you, come up and give intentionally terrible speeches while they get crumpled papers thrown at them, to show that nothing actually bad happens.

Can national-level happiness be raised or are we doomed to a hedonic treadmill?

When people rate their happiness, are they rating on an absolute scale that reflects a real treadmill effect, or are people simply asking if they are happy compared to what they know?

It seems obviously possible to raise national happiness. One existence proof is that there are very clearly regimes, policies and circumstances that make people very unhappy, and you can do the opposite of those things, at least dodging them.

Also there are things that consistently raise happiness and that vary in frequency greatly over time, for example being married and having grandchildren.

In any case, via MR (via Kevin Lewis) we have a new paper.

Abstract: We revisit the famous Easterlin paradox by considering that life evaluation scales refer to a changing context, hence they are regularly reinterpreted.

We propose a simple model of rescaling based on both retrospective and current life evaluations, and apply it to unexploited archival data from the USA.

When correcting for rescaling, we find that the well-being of Americans has substantially increased, on par with GDP, health, education, and liberal democracy, from the 1950s to the early 2000s.

Using several datasets, we shed light on other happiness puzzles, including the apparent stability of life evaluations during COVID-19, why Ukrainians report similar levels of life satisfaction today as before the war, and the absence of parental happiness.

Tyler Cowen: To give some intuition, the authors provide evidence that people are more likely engaging in rescaling than being stuck on a hedonic treadmill. I think they are mostly right.

This makes tons of sense to me. You get revolutions of rising expectations. There are definitely positional effects and treadmill effects and baseline happiness set points and all that to deal with, but the Easterlin Paradox is a paradox for a reason and things other than income vary as well.

That doesn’t mean life is getting better or people are getting happier. It can also go the other way, and I am very open to the idea that happiness could be declining (or not) in the smartphone era with kids not allowed to breathe outside, and everything else that causes people to feel bad these days both for good and bad reasons. But yeah, from the 1950s to the 1990s things seem like they very clearly got better (you could also say from the 1500s to 1990s, with notably brief exceptions, or earlier, and I’d still agree).

Camp Social is part of a category of offerings where adults go to sleepaway camp with a focus on making friends, complete with bunk beds and color wars and in one case a claimed 75% return rate, although also with staying up until 1:30 getting drunk. The camp counselors are concierges and facilitators. Cost is $884 for two nights and three days, which seems rather quick for what you want to accomplish?

I do buy that this is a good idea.

Radiation is dangerous, but it is a lot less dangerous than people make it out to be, and we treat this risk with orders of magnitude more paranoia than things like ordinary air pollution that are far more deadly.

Ben Southwood: The life expectancy of someone hit with 2,250 millisieverts of radiation in Hiroshima or Nagasaki was longer than the average Briton or American born in the same year. Today in Britain we spend billions controlling radiation levels more than 100,000 times smaller than this.

2,250 millisieverts is a lot of radiation, like getting 225 full-body CT scans in one go. I don’t think anyone would recommend it. But it shows how ridiculous it is that we spend so much time, effort, and money on radiation levels of 1msv or 0.1 msv per year.

Andrew Hammel reports that the Germans are finally on the verge of losing their War on Air Conditioning, as in allowing ordinary people to buy one, because normies actually experienced air conditioning and are not idiots. The standard ‘urban haute bourgeoisie’ are holding out on principle, because they think life is about atoning for our sins and because they associate things like air conditioning with wasteful Americans. As you would expect, the alternative ‘solutions’ to heat wind up being exponentially more expensive than using AC.

I do note that they have a point on this one:

Andrew Hammel: First of all, *every one* of these people has a story about visiting the USA and nearly freezing to death in an over air-conditioned store or office. Every. Damn. One. I can predict exactly when they will wheel out this traumatic tale, I just let it unfold naturally.

I mean, I have that too, to the point that it is a serious problem. This happens constantly in Florida. Even in New York’s hotter summer days, I have the problem that there is nothing I can wear outside while walking to the restaurant, that I also want to be wearing once I sit down at the restaurant. It is crazy how often Americans will use the AC to make places actively too cold. We could stand to turn it down a notch.

Or rather, ‘the’ good news, as Elizabeth Van Nostrand lays out how Church Planting works and finds it very similar to Silicon Valley startups.

A counterargument to last month’s claim about rapidly declining conscientiousness. Conscientiousness has declined far more modestly; the decline still seems meaningful but is very much not a crisis. What John did to create the original graph turns out to have been pretty weird, which was to show a decline in relative percentile terms that came out looking like a Really Big Deal.

Cartoons Hate Her! is on point that germs are very obviously real and cause disease, but quite a lot of people’s specific worries about vectors for being exposed to germs, and the associated rituals, are deeply silly if you stop to think about the physics, especially compared to other things the same people disregard.

Sesame Street will bring its largest-ever library, featuring hundreds of episodes, to YouTube as of January 2026. It is not a perfect program, but this is vastly better than what so many children end up watching. I echo that it would be even better if we included classic episodes as well.

Indeed, we should be putting all the old PBS kids shows on YouTube, and everything else that it would be good for kids to be watching on the margin. The cost is low, the benefits are high. There are low quality versions of the shows of my extreme youth available (such as Letter People and Square One TV) but ancient-VHS quality is a dealbreaker for actually getting kids to watch.

What TV show had the worst ending? There are lots of great answers but the consensus is (in my opinion correctly) Game of Thrones at #1 and Lost at #2.

After that it gets more fractured, and the other frequent picks here that I am in a position to evaluate were mostly bad endings (HIMYM, Killing Eve, Enterprise, Battlestar Galactica) but not competitive for the top spot. Dexter came up a lot but I never watched it. Supernatural came up a bunch, and I’m currently early in its final and 15th season; is it weird this makes me want to get to the end more, not less? Better a truly awful end than a whimper?

To be the true worst ending, it has to not only be awful but take what could have been true greatness and actively ruin the previous experience. You need to be in the running for Tier 1 and then blow it so badly you have to think about whether it even stays in Tier 2 because they poisoned everything. That’s why Game of Thrones and Lost have to be so high.

Indeed those two are so bad that they substantially hurt our willingness to invest in similar other shows, especially Lost-likes, which enforces the good discipline of forcing, for example, Severance to assure us they have everything mapped out.

(Briefly on the others: While at the time I thought HIMYM’s ending was as bad as everyone thinks, on reflection it has grown on me and I think it is actually fine, maybe even correct. Killing Eve’s ending wasn’t good exactly, but I didn’t feel it ruined anything; it was more that all of season 4 was a substantial decline in quality. Battlestar Galactica was rage inducing, but I understand why they did what they did and that mostly made it okay; again, the show started fantastic and was dropping off in quality generally. Enterprise ended badly, but again not historically badly, whereas the show wasn’t getting bad, and mostly the frustration was that we weren’t done.)

I heard the claim recently that Lost’s ending is aging well, as it suffered from the writers assuring us that they wouldn’t do the thing they did, whereas now looking back no one much cares. There’s that, but I still find it unsatisfying, they said they wouldn’t do it that way for a reason, and the worse offense was the total failure to tie up loose ends and answer questions.

Scott Sumner claims the greatest age of cinema was 1958-1963.

Scott Sumner: The public prefers 1980-2015, as you say. The movie experts say the 1920s-1970s were the best.

This highlighted the ways in which our preferences strongly diverge.

Another big hint is that Sumner and the experts claim an extremely high correlation of director with quality of movie. Great directors are great, but so many other things matter too.

As an example, recently I watched Mulholland Drive for the first time, which Sumner says might be his favorite film. I appreciated many aspects of it, and ended up giving it 4/5 stars because it was in many senses ‘objectively’ excellent, but I did not actually enjoy the experience, and had to read an explainer afterwards from Film Colossus to make sense of a lot of it, and even after understanding it a lot of what it was trying to do and say left me cold, so I didn’t feel I could say I ‘liked’ it.

From what I can tell, the public is right and the ‘experts’ are wrong. Also I strongly suspect that We’re So Back after a pause of timidity and sequelitis and superheroes.

Scott Sumner: There are two films that should never, ever be watched on TV. One is 2001 and the other is Lawrence of Arabia. If you saw them on anything other than a very big movie theatre screen, then you’ve never actually seen them.

I haven’t seen 2001 regardless, but on Lawrence of Arabia I can’t argue, because I attempted to watch it on a TV, and this indeed did not result in me seeing Lawrence of Arabia, because after half an hour I was absolutely bored to tears and could not make myself continue. There was a scene in which they literally just stood there in the sand for about a minute with actual nothing happening and I get what they were trying to do with that but it was one thing after another and I couldn’t even, I was out.

What I am confused by is how it would have improved things to make the screen bigger, unless it would be so one would feel forced to continue watching?

Here are his 13 suggestions for films to watch, although I have no idea how one would watch Lawrence of Arabia given it has to be on a big screen?

Vertigo, The Man Who Shot Liberty Valance, Touch of Evil, Some Like It Hot, Breathless, Jules and Jim, Last Year in Marienbad, High and Low, The End of Summer, 8 1/2, L’Avventura, The Music Room, Lawrence of Arabia.

I tried to watch High and Low, and got an hour in but increasingly had the same sense I got from The Seven Samurai, which is ‘this is in some objective senses a great movie and I get that but I have to force myself to keep watching it as outside of moment-to-moment it is not holding my interest’ except with more idiot plot – and yes I realize some of that is cultural differences and noticing them is the most interesting thing so far but I’m going to stick with idiot plot anyway. In addition to the idiot aspects, it really bothers me that ‘pay or pretend to pay the ransom’ is considered the obviously moral action. It isn’t, that is terrible decision theory. The moral action is to say no, yet there is not even a moment’s consideration of this question by anyone.

If the above paragraph is still there when you read this, it means I was unable to motivate myself to keep watching.

Jeff Yang explains some of the reasons Chinese movies tend to bomb in America, in particular the global hit Ne Zha 2. Big Chinese movies tend to be based on super complex Chinese traditional epic stories that Chinese audiences already know whereas Americans haven’t even seen Ne Zha 1. American stories have clear structure, understandable plots, payoffs for their events, central characters, and a moral vision that believes in progress or that things can be better. And they try to be comprehensible and to maintain a tonal theme and target market. Chinese movies, Yang reports, don’t do any of that. Effectively, they assume the audience already knows the story, which is the only way they could possibly follow it.

It’s as if Marvel movies were the big hits, and they didn’t try to be comprehensible to anyone who didn’t already know the characters and comics extensively? Certainly there are some advantages. It might be cool to see the ‘advanced’ director’s cuts where it was assumed everyone had already either read the comics extensively or watched the normal version of the film?

As Jeff says, if they can make money in China, then sure, why not do all this stuff that the Chinese audiences like even if it alienates us Westerners. There are enough movies for everyone. It does still feel like we’re mostly right about this?

Like everyone else I think Hollywood movies are too formulaic and similar, and too constrained by various rules, and thus too predictable, but those rules exist for a reason. When older movies or foreign movies break those rules, or decide they are not in any kind of hurry whatsoever, it comes at a cost. I don’t think critics respect those costs enough.

I strongly agree with Alea here and I am one of the ones who want to stay away:

Alea: Novels with an empty mystery box should be explicitly tagged so I can avoid them. 110% of the joy of reading comes from uncovering all the deep lore and tying up every loose end. Some people get off on vague worlds and unfinished plots, and they should stay the fuck away.

I don’t especially want to go into deep lore in my spare time, but if you are going to convince me to read a novel then even more than with television you absolutely owe it to me to deliver the goods, in a way (with notably rare exceptions) that I actually understand when reading it.

As in: I know it’s a great book but if as is famously said, ‘you don’t read Ulysses, you reread Ulysses’ then you had me at ‘you don’t read Ulysses.’

And you definitely don’t read Game of Thrones until I see A Dream of Spring.

True facts about the whole ‘fleeing Earth’ style of story:

Ben Dreyfuss: The stupidest part of INTERSTELLAR is that the blight starts killing all the crops and after just a few decades they go “ah well, guess it won! Better leave earth. Hope we solve this magic gravity equation with the help of 5 dimensional beings and wormholes.”

“We can’t make okra anymore. Better go explore this all water planet where one hour is 7 years of time and this ice planet where water is alkaline and the air is full of ammonia.”

Pretty sure you can’t make okra there either, buddy.

Kelsey Piper: every single movie about people fleeing Earth involves displaying a mastery of technology which would obviously be more than sufficient to solve the problem they are fleeing Earth about

climate change is not going to make Earth less habitable than Mars so you can’t have people fleeing to Mars because of climate change, you just can’t.

‘there’s a supervolcano/asteroid induced ice age’ oh boy I have some news for you about Mars.

Daniel Eth: Just once I want a movie about people fleeing Earth to have the premise “there are trillions of people, and we have a per capita energy consumption of 100,000 kWh/yr, which is straining Earth’s ability to radiate the waste heat. We must now go to space to expand capacity”

Movie could have a real frontier vibe (space cowboys?) – “of course back in the old world (Earth), population and energy per capita are subject to bureaucratic regulations to prevent total ecosystem collapse; but in new worlds we can freely expand anew”

A recent different case of ‘I can’t help but notice this makes no sense’ was Weapons. The link goes to a review from Matthew Yglesias that I agree with: it does cool things with nonlinearity, and the performances and cinematography are good, except that when you put it together in the second half, the resulting plot, while consistent and straightforward, makes no sense.

Zvi Mowshowitz reviews Weapons while avoiding spoilers: When you’re in, writing, or deciding to go to a horror movie, you make dumb decisions. It’s what you do.

The difference is that to him this adds up to 3.5 stars, and to me it means 2.5 stars; once the holes and idiot balls became too glaring, I stopped being able to enjoy the film.

My other problem with Weapons was that the first two acts made me care about various characters and relationships that were rich and detailed and well-executed and acted, and then the third act didn’t care at all about those things, only about the main plot that did not make any sense. There might actually be a pretty great movie here in which the missing kids are a tragedy that never gets explained or solved because what matters is a different third act that focuses on how people react to it.

New Jersey looks to ban ‘micro bets,’ meaning sports bets about individual plays.

Erik Gibbs: The bill’s language defines a micro bet as any live proposition bet placed during an event that pertains specifically to the outcome of the next discrete play or action.

This restriction seems clearly good. I don’t know where the line should be drawn, but I am confident that ‘ball or strike’ bets are over the line.

It is a very light restriction – you can’t bet on a ball or strike or pass or run under this rule, but you can still bet on the outcome of an inning or drive. Bets on the next play have all the worst gambling attributes. They cost a lot individually, they resolve and compound super quickly, they are especially prone to addictive behavior.

Clair Obscur: Expedition 33 finishes at Tier 2. It does a lot of things very right and I am very happy to have played it, despite some obvious issues, including some serious balance problems in Act 3.

If someone suddenly buys up the contract on Taylor Swift and Travis Kelce getting engaged from 20% to 40%, and you’re selling into it, yeah, good chance they know. Also this means yes, someone knew and traded on the information in advance. Cool. Oh, and congratulations to both of them, of course.

Sam Black has a new podcast about cEDH.

I don’t understand why Wizards of the Coast continues to be so slow on the banhammer in situations like Cauldron. We saw repeatedly exactly the broken format pattern, such as here where Cauldron starts out at 30% and then goes to 56% after six rounds, then a much larger majority of the top 8s. This continued long past the point where there was reasonable hope it would be fixed by innovation.

Mason Iange: Doesn’t it make sense that in a rotating format like standard, wotc wants people to have confidence in buying product and building decks? Literally no one is going to play standard if the best decks just get banned every 2 month.

Saffron Olive: The way I see it is there are two paths: you design cards conservatively and don’t need to ban anything, or you design cards aggressively and need to ban cards fairly often. Wizards is trying to design cards aggressively and never ban anything, which I don’t think is actually possible.

Patrick Sullivan: What you’re saying is true/relevant, but there are other considerations; the current state of affairs would clearly not be tolerated absent the stuff you’re mentioning. That’s why I think they should allow themselves to be as agile as possible regardless of what they decide to do.

Brian Kowal: The opposite of how I feel about rotating formats. I want it dynamic. I’d rather they make a balanced format. With a rotating format I want to feel like I can innovate. If it is solved I quickly lose interest. There are many formats that never rotate to protect investment.

I think ‘mix it up every time it is solved’ is going too far given how quickly we now solve formats, but the solution has to not be ‘play this deck or else.’ Yes, banning the best deck every two months would make you reluctant to invest in Standard, but effectively banning all but the best deck for months on end, or having to face an endless stream of the same overpowered nonsense even if you’re willing to sacrifice win rate to go rogue, is even worse.

They came out with an explanation and update on the 9th. A big part of this is that they screwed up timing the ban windows, and have a crazy high bar for doing “emergency” bans versus bans on announcement days. They are mitigating this going forward by adding more windows next year, one for every major set release.

That points out how crazy the situation was. You’re going to release a set, and then not have a ban opportunity until after releasing the next set? That’s crazy.

Based on past experiences, I believe Brian Kowal is correct that an extended period of a miserable format, with bans that everybody knows have to happen but are extensively delayed, creates a point of no return, where permanent damage to the format and the game begins to accumulate.

Brian Kowal: There should be room in the ban policy for emergency bans. Perception has hit the point of no return. A significant portion of players do not want to touch Standard now. Rotation should be when we are creating players for the next year and this rotation lasts until January 2027! (I’m 80% on this. Somebody let me know if I’m wrong) Players are quitting Standard again to look for other games and formats. New players are choosing not to invest in it.

When format perception hits this state everybody knows something is getting a ban. So a lot of die hard competitives are even taking a break rather than buying 4 copies of the two most expensive cards in Standard.

The best way to go imo is to just suck it up and ban Cauldron immediately. Again, we all know it is happening anyway. Not taking action over and over again and just letting everybody suffer months of a bad format makes WOTC look like they don’t care.

Jenny: WotC took HOW long to decide to do nothing and ruin another Spotlight Series and RCQ season? Using the Arena ladder meta to judge the health of the format is *insane*

Pactdoll Terror: My 2-slot RCQ this Saturday in NYC sold 8 spots. I usually do 50. Someone who built Vivi to grind RCQs would be annoyed that it got banned, but Standard is DEAD locally. Weeklies aren’t launching, RCQs struggle to make money. Holding bans is bad for everyone except Vivi players.

Instead they’re going to do a strange compromise, and move up their next announcement date from November 24 to November 10, which still leaves two full months of this.

We should never have more than a month ‘in limbo’ where things are miserable and we know what is coming. Even if you decide to keep playing you are in an impossible position.

They say ‘Standard has not yet reached its final form’ but they are grasping at straws.

They say the Arena ladder is looking less awful. The Arena ladder is not real life, not only because the play level is low but also there’s nothing forcing the players to play the best deck. I learned that the hard way during the Oko era.

I get Carmen’s argument here that we ran the experiment and when you don’t have ban windows, you get constant speculation about potential bans and a lot of uncertainty, And That’s Terrible. You can’t fully embrace The Unexpected Banning. There needs to be a substantially higher bar outside of a fixed set of days.

The current situation was still ludicrous. While less competitive play is not as lopsided, that’s largely about card access and players wanting to have fun, and of course not wanting to invest in a deck ahead of a likely ban. This ban would not be ‘pulled out from under players in a surprise move’ even if no formal warning was given. The idea that ‘we won’t make a move based on competitive play, only on non-competitive play, you competitors don’t much matter’ is definitely giving me even less desire to come back.

Which of course I get. Magic is not made for me. I’m just deeply sad about it.

I see the argument that this isn’t a pure ‘do it today or else’ situation, but it is an emergency. If I were Wizards, the moment it was clear we probably had a problem I would have created a new announcement date much closer in the future than two months, with the clear statement that at that time they would choose whether to ban Vivi, Cauldron, both or neither. And then done it by now.

Pro Tour levels of cEDH (competitive Commander) are an awesome thing to have exist, but they seem to have a rather severe draw problem, because everyone knows how to play politics and how to force draws. Sam Black suggests making draws worth zero points, which I worry could create even more intense politics and feel-bad moments, but when 1-0-6 is a ‘great record’ then maybe it is time, and it seems like the elimination rounds work fine?

Sam Black: The house games are more fun when we don’t play for draws. Similarly games in top 16 are more fun.

Ultimately, I don’t think any solution would satisfy me, since it is going to come down to pure politics and kingmaker decisions. One potential approach is to say that wins are 10, draws are 1, and we pair people accordingly, so taking the draw is not obviously good for you; it might be wiser to lose and get paired against others who aren’t playing for draws. In the 0-0-4 bracket I don’t like your winning chances, and you have to win at some point to make the cut.
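Here is a minimal sketch of how that scoring would shake out, purely my own illustration of the proposed 10/1/0 numbers rather than any official system:

```python
# Illustration only: wins worth 10 points, draws 1, losses 0, with
# pairings by total points so that serial drawers get paired together.
from dataclasses import dataclass

@dataclass
class Record:
    wins: int = 0
    losses: int = 0
    draws: int = 0

    @property
    def points(self) -> int:
        return 10 * self.wins + 1 * self.draws

all_draws = Record(wins=0, losses=0, draws=4)   # the "0-0-4 bracket"
one_winner = Record(wins=1, losses=3, draws=0)

print(all_draws.points)   # 4  -> paired with the other serial drawers
print(one_winner.points)  # 10 -> paired up toward players who win games
```

Under that scheme, drawing your way through the Swiss stops being a path to the cut, which is the whole point.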

Sam Black talks about the role of mediocre synergistic cards. You start with strong cards, and pick up the bad cards that work for you for free at the end. If the bad cards vanish, the lane is not open, go a different way. Only prioritize cards that have a high ceiling, and (almost) never take a consistently bad card in your colors that can’t make your deck much better when a pack contains a good card. Similarly, trying to read signals explicitly is overrated relative to taking good cards, which is underrated and serves the same purpose.

The exception (he seems to assume in the modern era this won’t ever happen, which seems wrong to me) is if you are in danger of not having a deck, because you lack either enough cards or a key component, such that taking a usually bad card actually does provide substantial improvement.

Some cards that look bad, and have bad win rates, are instead good in the sense that they have high upside, but are being used badly by people who play them without the upside case. Sam’s example is a card that defaults to being a bad Divination but enables never running out of cards, so you can build your entire strategy around this; but if you put it in your deck as a bad Divination then it will be bad.

Waymo is now offering service in Denver and is ready for Seattle as soon as they are permitted to do so. They’re doing experiments in Denver now with humans behind the wheel of about a dozen cars and Governor Polis is here for it. Planned cities include Dallas, Miami and Washington D.C. next year, and scouting ‘road trips’ have gone to Philadelphia and there are plans to go to Las Vegas, San Diego, Houston, Orlando and San Antonio.

Service in Denver will quickly reveal exactly how well Waymos can actually handle cold weather including snow. My prediction is it will go well, bet if you disagree. Hopefully it will help compensate for Denver’s struggling restaurants and its very high minimum wage.

As of the start of September, there are still only 2,000 Waymos: 800 in the San Francisco Bay Area, 500 in Los Angeles, 400 in Phoenix, 100 in Austin and ‘dozens’ in Atlanta.

As a point of comparison, San Francisco has ~1,800 taxi medallions, and an estimated 45,000 registered rideshare drivers, with Claude estimating there are typically 5,000 available rideshares at any given time, peaking in prime hours around 10,000.

Supervised Waymo driving has begun in NYC, where they have a permit to do so.

This continues the recent trend of noticing that holding back self-driving means tens of thousands of people a year will die that didn’t have to.

Ethan Mollick: It seems like there is not enough of a policy response to the fact that, with 57M miles of data, Waymo’s autonomous vehicles experience 85% less serious injuries & 79% less injuries overall than cars with human drivers.

2.4 million are injured & 40k killed in US accidents a year.

Think of EV policy and do long-term support: subsidies for R&D to bring down costs, incentives for including self-driving features, regulatory changes to make it easier to deploy, building infrastructure for autonomous-only vehicles (eg HOV lanes), independent testing.

Takes time.

There are many problems with this approach, including that it causes fixation on lives saved versus cost and similar calculations, and also you sound like you are coming for people’s ability to drive. Whereas if you sell this purely as ‘Waymos are awesome and convenient and efficient and improve life greatly, and also happen to be actually safe on top of it,’ then I think that’s way better than ‘you are killing people by getting in the way of this.’

Alice From Queens: Self-driving cars are like the new weight loss drugs.

Their value is so large, so obvious, and so scalable, that we can confidently predict their triumph regardless of knee-jerk cultural resistance and their wildly exaggerated downsides.

Yes, I’ve been saying the same thing for years. Because it still needs saying!

I mean, they totally are killing people by getting in the way, but you don’t need that.
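For a rough sense of scale, here is a back-of-the-envelope sketch using only the figures quoted above, under the heroic assumptions that Waymo’s observed reduction rates would hold across all US driving and that deaths track serious injuries:

```python
# Back-of-the-envelope only, using the quoted figures: 40k deaths and
# 2.4M injuries per year, 85% fewer serious injuries and 79% fewer
# injuries overall for Waymo versus human drivers.
us_deaths_per_year = 40_000
us_injuries_per_year = 2_400_000

serious_injury_reduction = 0.85
overall_injury_reduction = 0.79

deaths_avoided = us_deaths_per_year * serious_injury_reduction
injuries_avoided = us_injuries_per_year * overall_injury_reduction

print(f"Deaths potentially avoided per year: {deaths_avoided:,.0f}")      # ~34,000
print(f"Injuries potentially avoided per year: {injuries_avoided:,.0f}")  # ~1,896,000
```

That is where ‘tens of thousands of people a year’ comes from, even before you discount for partial adoption.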

Mostly you need to make people believe that self-driving is real and is spectacular.

Matthew Yglesias: I keep meeting people who are skeptical self-driving cars will ever happen.

I tell them I took one to the airport in Phoenix several months ago, did a test ride in DC, they’re currently all over San Francisco, etc and it’s blank stares like I’m telling them about Santa.

My model of what is holding things back for Waymo in particular right now is that mainly we have a bottleneck in car manufacturing, and there’s plenty of room to deploy a lot more cars in a bunch of places happy to have them.

Longer term, we also have to overcome regulatory problems in various places, especially foolish blue cities like New York and Boston, but I find it hard to believe they can hold out once the exponential gets going and everyone knows what they are missing. Right now with only a few hundred thousand rides a week, it’s easy to shrug it off.

Thus I think PoliMath might be onto something here:

PoliMath: I suspect Waymo doesn’t *want* there to be a policy response to this data b/c it will inevitably end with the left demanding we ban human drivers and there will be a huge backlash that damages Waymo’s business in a serious way.

Waymo is steadily winning, as in expanding its operations. The more it expands, the better its case, the more it will be seen as inevitable. Why pick a premature fight?

The fight is out there. Senator Josh Hawley is suddenly saying ‘only humans ought to drive cars and trucks’ as part of his quest to ‘rein in’ AI, which is the Platonic worst intervention for reining in AI.

Waymos are wonderful already, but they also offer much room for improvement.

Roon: It is pretty telling that when you ride in a Waymo, you cannot give instructions to Gemini to play a song, change your destination, or drive differently. When one of the great gilded tech monopolies of the world does not yet have a cohesive AI picture, what hope has the broader economy?

Eliezer Yudkowsky: AI companies are often so catastrophically stupid that I worry that Gemini might in some way be connected to the actual car. Oh wait, you explicitly want to be able to request that the car drive differently?

I do not want Gemini to be controlling the vehicle or how it drives, but there are other things that would be nice features for integration, and there are other quality of life improvements one could make as well. For now, we keep it clean and simple.

The Seth Burn 2025 Football Preview is here, along with the podcast discussion.

If you must hire a PR agency, this from Lulu Cheng Meservey strikes me as good basic advice on doing so.

Should you consider retiring to places like Italy, perhaps under a deal to go to a small town to get a 7% flat tax regime for 10 years? Is there a good deal to be struck where American retirees help fund what remains of Europe, especially given that translation is rapidly becoming seamless and these places are indeed very nice by all accounts? Paul Skallas here describes Southern Europe as ‘dirt cheap,’ citing a chart of comparative living costs.

I am deeply skeptical that the discounts are this large, and my AI sanity check confirmed the savings are real but relatively modest. Also consider what ‘comfortable retirement’ means in places that (for example) won’t let you buy an air conditioner. But yeah, if you only have modest savings it seems like a good thing to consider.

YouTube Premium is an ideal product. For $10 a month you get no ads, creators get paid, and the variety of content is phenomenal. Yes, you could use AdBlock to get around it in many cases, and many will do that, but this is what the internet is supposed to look like.

Maxim Lobovsky: Not only is YouTube Premium great, it’s one of the few major ad-supported businesses offering a paid alternative. Paid social media is one of the only plausible solutions to the algorithm-driven polarization/rage-baiting/lowest-common-denominator content death spiral.

The problem is that you can’t then subscribe individually to everything else, because that adds up fast. Give me a unified YouTube Premium style subscription, please.

Yes, the failure to shut down TikTok despite a law making it illegal that was upheld by the Supreme Court 9-0 is rather insane. Trump is flat out refusing to enforce the ban and extending the deadline indefinitely, you can speculate as to why.

Downvotes, in some form, are a vital part of any social platform that has upvotes, both to maintain civility and maintain good incentives. If you can easily express pleasure there needs to also be an easy way to express displeasure. Dan Luu gives one reason, which is that otherwise people will write nasty comments as a substitute. The other reason is that otherwise maximizing for toxoplasma of rage and extreme reactions to get engagement wins and crowds other actions out. If you are going to do rankings, the rankings on LessWrong and also Reddit mostly seem quite good, and those are the only places where somewhat algorithmic feeds seem to do well.

Emmett Shear: The belief that downvotes are “uncivil” was one of the most common delusions I have encountered while working in social media.

Oliver Habryka: Yep, one of the things I always considered most crucial to maintain with LW 2.0. When I was shopping around for forum software alternatives when we started building LW 2.0 this ruled out like 80% of the options on the market.

Cremieux reports he was suspended from Twitter for a day for saying that the Tea app had been hacked, which was called ‘posting other people’s private information without their express authorization and permission,’ except he did not do this or link to anyone who did do it (he said ‘you can go download 59.3 GB of user selfies right now’), whereas people who do expose such info often get ignored. He went on the warpath, citing various laws he asserts Twitter is breaking around the world.

(The link in the screenshot below takes you back to the post itself.)

Lewis: meanwhile post doxxing [someone’s] address was never removed. 2.5M views. reported it and DM’d Nikita, never heard back on either.

Sin: My contribution [which is literally a map containing the location with a giant arrow pointing to it saying it is where this person lives].

I saw this over a week later. Still there.

Elon Musk made a lot of mistakes with Twitter, but also did make some very good changes. One of them is that likes are now private. This damages an outsider’s ability to read and evaluate interactions, but it takes away the threat of the gotcha when someone is caught liking (or even not liking!) the wrong tweet and the general worry about perception, freeing people up to use them in various ways including to acknowledge that you’ve seen something, and to offer private approval.

It’s very freeing. When likes were public, which also meant it was public what you didn’t like, I decided the only solution was to essentially not use the like button. Which worked, but was a big degradation of Twitter’s usefulness.

Redaction: It really is insane how simply Hiding Likes On Twitter meaningfully shifted the overton window of the American political landscape

Samo Burja: I underestimated the impact of the change at the time. I think I thought preference falsification was much less pervasive than it was.

Meanwhile, in other contexts, it is still very much a thing to talk about who has liked which Instagram posts. This is exactly as insufferable as it sounds.

Every time Nikita tries to make me feel better about Twitter I end up feeling worse.

Nikita Bier (Twitter): The first step to eliminating spam is eliminating the incentive.

So over the last week, I have gone deep down the rabbit hole of X spam:

I am now in 3 WhatsApp groups for financial scams. I have become their friends. I know about their families. I cannot blow my cover yet.

What is the goal exactly? How would befriending them help? We already all know exactly how to identify these scams and roughly how they work. Understanding more details will not help Nikita or anyone else do anything. You think you’re going to do enough real world takedowns and arrests that people are scared to do scams, or something? How about instead we do basic filtering work?

Or, when he posts this:

Or this:

Eli: Twitter should include 3 schizophrenic reply guys and 1 egirl with Premium +

Nikita Bier: We did the math and that’s what retains a user.

He kids, but kid enough times in enough ways with enough detail and you’re not fully kidding. It is very clear that Twitter is doing a lot of the Goodhart’s Law thing, where short term feedback metrics are being chased without much eye on the overall experience. Over time, this goes to quite bad places.

Also, yeah, this is not okay:

Mike Solana: I truly believe blocking is a right, and I would never go after someone for blocking me for any reason. but you should not then be able to unblock, comment on a post of mine, and immediately REBLOCK so I can’t respond. in this case, I deserve at least 24 hours to roast your ass.

There are any number of obviously acceptable solutions to this. I like the 24 hours, where if you do something that you couldn’t do while they are blocked, your reblock is delayed for a day.

Local coffee shop sets up a bot farm with hundreds of phones to amplify their messages on Instagram.

Vas: If a simple coffee shop has a bot farm with 100s of phones to amplify their message, please consider what a foreign agency or adversarial operator is running on your favorite social media platform.

Especially today, please consider that the opinions you read, the calls to violence you hear, and the news you digest, are all an operation done to sow hatred in your mind and your soul.

Scott Sumner uses his final EconLib post to remind us that almost everything is downstream of integrity. Without informal norms of behavior our laws will erode until they mean almost nothing, and those informal norms are increasingly discarded. He cites many examples of ways things might (read: already did) go wrong.

I may never stop finding it funny the extent to which Trump will seek out the one thing we know definitively is going badly, and then choose that to lie and brag about.

As in, how is the DC crackdown going? As of this writing, the only thing I know for sure is that restaurant reservations were down, although it turns out not as far down as initially reported once you control for Restaurant Week; 7% is still a lot. So of course…

Donald Trump: People are excited again. Going to restaurants again [in DC]. The restaurant business, you can’t get into a restaurant.

Trump attempted to fire Federal Reserve Governor Lisa Cook ‘for cause,’ setting up a legal fight. Cook was not about to leave quietly.

Initial market reaction was muted, the dollar only declined 0.3% and gold only rose 0.6%, likely because it was one escalation among many and it might fail, but this is a direct full assault on central bank independence, and central bank independence is a really big deal.

Jonnelle Marte and Myles Miller (Bloomberg): While a president has never removed a Fed governor from office, one can do so for cause. Laws that describe “for cause” generally define the term as encompassing three possibilities: inefficiency; neglect of duty; and malfeasance, meaning wrongdoing, in office.

What was this ‘cause’?

Trump had earlier called for Cook’s resignation after Federal Housing Finance Agency Director Bill Pulte alleged she lied on loan applications for two properties — one in Michigan and one in Georgia — claiming she would use each property as her primary residence to secure more favorable loan terms.

Trump said it was “inconceivable” that Cook was not aware of requirements in two separate mortgage applications taken out in the same year requiring her to maintain each property as her primary residence.

That’s it. There are no additional claims. Only the claim that she claimed one place would be a primary residence, and then claimed a different primary residence.

Pulte, in a statement posted to social media, thanked Trump for removing Cook. “If you commit mortgage fraud in America, we will come after you, no matter who you are,” he wrote.

What about if you are President of the United States and have recently had a nine-figure judgment entered against your ‘Trump’ organization for lying on mortgage applications? Are we coming for you?

Oh, and what if it turned out, as it has, that the claim against Cook simply isn’t true?

Aaron Fritschner: The mortgage fraud claim against Lisa Cook is false, per documents obtained by Reuters. Bill Pulte’s accusation, the sole pretext Trump used to fire her from the Fed, was that she claimed two homes as primary residence. These docs show she did not.

“The document, dated May 28, 2021, was issued to Cook by her credit union in the weeks before she completed the purchase and shows that she had told the lender that the Atlanta property wouldn’t be her primary residence.”

“documentation reviewed by Reuters for the Atlanta home filed with a court in Georgia’s Fulton County, clearly says the stipulation exists “unless Lender otherwise agrees in writing.” The loan estimate, a document prepared by the credit union, states “Property Use: Vacation Home”

Lisa Cook also didn’t claim a tax credit for primary residence on the second home and declared it as a second home on official federal government documents when she was being considered for a role on the Fed. A real master criminal.

Also her mortgage rate was 3.5%, modestly higher than the going rate at the time.

If you are going to try to fire a Federal Reserve Governor for cause, something that has not happened (checks notes, by which I mean GPT-5) ever, thus endangering faith in Fed independence and the entire financial system, you might want to follow due process, or at least confirm that your accusation is true? As opposed to demonstrably false?

A lot of people are understandably highly outraged about this, as Adam Tooze partly covered right after the attempted firing. This comes on the heels of firing the head of the Bureau of Labor Statistics because Trump didn’t like the statistics.

A reminder that yes, there is at least one clear prior case of a crazy person destroying a nation’s health that parallels RFK Jr, as in South Africa where their President denied drugs to AIDS patients.

Yes, all the various ‘honesty taxes’ our government imposes (also see this highlighted comment) are extremely pernicious and do a lot more damage than people realize. We teach people they have to lie in order to get food stamps, and they (just like LLMs!) learn to generalize this. Everything impacts everything; our society is saying lying is okay and lying to the government is mandatory, and you can’t isolate that. You don’t get a high trust society that way, although we are remarkably high trust in some ways despite this.

Most of the time, the correct answer is not to enforce the rules as written even if we could do so, instead the correct answer is to remove or heavily modify the rule. Our rules are tuned to the idea they won’t be enforced, so it is likely enforcing them would not go well. Then there are exceptions, such as primary residence mortgage fraud.

Aaron Bergman: I think ethics- and integrity-pilled people need to have a better theory of when it’s cool to lie to *institutions*

The “lying to a human” vs “lying to institution” distinction is real and important btw, the bar for the latter is much lower

Oliver Habryka: Yeah, I agree with this. I think lying to institutions is frequently fine, often roughly proportional to how big they are, though there are also other important factors.

I don’t have a great answer to exactly when all this makes it okay to lie to corporations or governments and on forms. My guess is it is roughly ‘do not break the social contract.’ But if this is something where there is no longer (or never was) a social contract, and no one would look at you funny if you were doing it in the open, then fine.

If you notice you are very clearly expected to lie (including by omission) or do a fake version of something, that the system is designed that way, then you have little choice, especially if you are also forced to deal with such institutions in order to get vital goods or services.

Idaho suicide hotline is forced to ask teens who call to get parental consent due to a law passed last year requiring consent for almost all medical treatments for minors. As you would expect, most of them hang up.

Are Trump’s tariffs helping domestic manufacturing? What do the domestic manufacturers say about this?

UK arrests comedian for speech, where the speech was done on American soil.

I try to keep a high threshold for criticism but it does seem like Trump ordered a bunch of people murdered (some might use the term ‘war crime’ but I prefer plain language and also there was no war involved, the ‘war on drugs’ is not a war) on the high seas with absolutely no legal basis for doing so? He ran the plot of Clear And Present Danger straight up in the open? You didn’t know there were drugs involved, and even if you did you can’t go around blowing up boats purely because there were drugs involved?

Especially when you had the power to interdict and instead decided to ‘send a message’ as per Secretary of State Marco Rubio by blowing up the boat with no warning because the boat (that you could have interdicted) posed an ‘immediate threat to the United States’? And a letter from the White House to Senators Mike Johnson and Chuck Grassley that confirms, yep, straight up murder and likely we will murder again? And JD Vance seems to confirm that this is straight up murder?

JD Vance: Killing cartel members who poison our fellow citizens is the highest and best use of our military.

Rand Paul: JD “I don’t give a shit” Vance says killing people he accuses of a crime is the “highest and best use of the military.”

Did he ever read To Kill a Mockingbird?

Did he ever wonder what might happen if the accused were immediately executed without trial or representation??

What a despicable and thoughtless sentiment it is to glorify killing someone without a trial.

I’m sincerely and deeply confused what makes this not straight up murder and have not seen any serious arguments for why it would be anything else, as opposed to ‘yes it is murder and I really like murder for us here, yay this murder.’ It also seems, to the extent such points are relevant in 2025, like a very clear ‘high crime and misdemeanor.’

Apple’s new iPhone 17 Pro Max seems like a substantial improvement over the iPhone 16 Pro Max. You get 50% more RAM, at least twice as many camera pixels, better cooling, a substantially better battery, and a new faster chip with GPUs ‘designed for AI workloads.’ I’m going to stick with my Pixel 9 Fold; the only iPhone feature that is compelling to me at all is avoiding anti-Android discrimination (hell of a business model). Still, those are nice upgrades.

Apple Vision Pro is making small inroads in specialized workplaces that can exploit spatial computing, including things like exploring potential new kitchens or training airline pilots. It is expensive, but if it is the best tool, it can still be worth it. The rest of us will be waiting until it is lighter, better, faster and cheaper.

Meta had some child safety issues with its smart glasses and VR products, and whistleblowers report that the company tried to bury them or shield them from public disclosure in various ways. This continues the pattern where:

  1. Meta has a safety problem.

  2. When Meta tries to have internal clarity and do research on the dangers of its products, people leak that information to the press, details get taken out of context and they get hauled before Congress.

  3. When Meta responds to this by avoiding clarity, and suppressing the research or ignoring the problem, they get blamed for that too.

I mean, yes, why not simply actually deal with your safety problems is a valid response, but the incentives here are pretty nasty.

The central accusation is that the company’s lawyers intervened to shape research into risks from virtual reality, but I mean it would be insane for Meta not to loop the lawyers in on that research. If we are going to make Meta blameworthy, including legally, for doing the research, then yes they are going to run the research by the lawyers. This is a failure of public policy and public choice.

That doesn’t make the actual problems any less terrible, but it sounds like they are very standard. Kids were bypassing age restrictions, and when interacting with others they would get propositioned. It seems like the accusation is literally ‘Meta let its users interact with each other, and sometimes those users said or did bad things.’

Experts have long warned that virtual reality can endanger children by potentially exposing them to direct, real-time contact with adult predators.

It is remarkable how consistently even the selected examples don’t involve VR features beyond allowing users to talk? I’m not saying you don’t need to have safeguards for this, but it all sounds very similar to the paranoia and statistical illiteracy where we don’t let children participate in physical spaces anymore.

I love this report of a major problem running the other way, to which, I mean, fair:

In a January 2022 post to a private internal message board, a Meta employee flagged the presence of children in “Horizon Worlds,” which was at the time supposed to be used only by adults 18 and over. The employee wrote that an analysis of app reviews indicated many were being driven off the app because of child users.

I’m not saying Meta handled any of this perfectly or even handled it well. But there’s no smoking gun here, and no indication that they aren’t taking reasonable steps.

Meta is also being sued and accused of violating an FTC agreement on WhatsApp privacy and account protection, with the claim that 500,000 WhatsApp accounts are stolen daily.

Matthew served, and Nate served back, so now it’s on.

Nate Silver: Academic journals might be a lost cause but they’d probably be better if you had some non-academic practitioners serving as reviewers. Journalists have their problems too but they have much better bullshit detectors, for instance.

The most important research paper of the past 10 years is the Google transformer paper (“Attention Is All You Need”) and it was written by non-academics and published in an open-access journal.

You ran some cool regression analysis OK great. Make some nice graphics and put it on a Substack. Engaging headline, 1500-2500 well-written words. That’s literally 100x faster than trying to publish in a journal and it’s better peer review anyway.

Matthew Sitman: Very, very occasionally an exceptional generalist intellectual or particularly well-informed journalist might be able to see a problem with a paper that an academic close to the subject doesn’t, but this radically underestimates the uses of expertise/familiarity with a literature.

As someone who’s been an academic and now talks/writes about ideas for non-specialists, a difference is that academics, ideally, know what they don’t know, are aware of questions asked/answered previously, etc; if that can produce tunnel vision, well, they’re trying to dig deep.

Nate Silver: Well, I know a lot about statistical inference, have been doing it for 25 years, have faced a lot of public scrutiny, and in the fields where I also have a lot of domain knowledge, probably half of published papers have obvious fatal flaws that render them unfit for publication.

Maybe I’m a weird outlier, but the peer review process is obviously broken. Maybe it’s better in the fields I *don’t* know well. But I’d be surprised if that’s true.

Aella: People don’t understand how much of a joke the current state of peer review is. It’s extremely bad.

It’s bad enough that at one point I was suggesting to someone “why don’t you just deliberately insert mistakes so they can feel satisfied about finding those and don’t end up fucking with the rest of the paper”

St. Motweb: This is actually a strategy that many academics use.

SolDad: I unironically do this in my papers, sort of. Not inserting new stuff, but leaving fairly-obvious but not super important work undone as “low hanging fruit” for the reviewers to notice and ask for.

This suggests a wager.

Select a field. A neutral party (or an LLM with an agreed upon prompt) selects 10 recent papers that meet some criteria for being Proper Papers In Good Journals and that assert some statistical finding and thus could in theory be flawed.

If Nate Silver can find a fatal flaw in at least 2 of the 10 papers, as evaluated by an agreed neutral judge, then he wins. If not, he loses. This should cover both sides: two is enough to show the system is fatally flawed, and Nate says he would average five.
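To see why two out of ten is a fair threshold, here is a minimal sketch, under the simplifying assumption that each selected paper independently has a detectable fatal flaw with some probability p:

```python
# If Nate's claimed flaw rate (~0.5) is right, finding at least 2 of 10
# is near-certain; if flaws were genuinely rare, he would usually lose.
from math import comb

def prob_at_least_k_flawed(p: float, n: int = 10, k: int = 2) -> float:
    """P(at least k of n papers contain a detectable fatal flaw)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

print(round(prob_at_least_k_flawed(0.5), 3))   # ~0.989
print(round(prob_at_least_k_flawed(0.2), 3))   # ~0.624
print(round(prob_at_least_k_flawed(0.05), 3))  # ~0.086
```

So the bet cleanly separates ‘the system is badly broken’ from ‘flaws are rare exceptions.’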

This is not a statement that the first-best solution to peer review involves outsiders like Nate Silver reviewing papers. That seems obviously wrong. It is a claim that the current solution is so utterly terrible that outsider review would be a big improvement.

Indeed, ‘put your ideas out on the internet and let people critique them’ is the actual essence of true peer review, and far superior in practice to the formal version in 2025.

I am reminded of when I wrote a detailed response to a detailed response to AI 2027. Titotal accused AI 2027 of not having gone through ‘peer review’ and then proceeded to do peer review of AI 2027. Which was great, we thank Titotal for his service here, and I in turn then also contributed.

As I said then:

This is the peer! This is the review! That is how all of this works! This is it working!

Rob Bensinger is the latest to note that we could do way better on a wide array of problems if we could improve discourse norms, and that this could be a high impact play. That doesn’t mean we know how to pull it off. As he notes, prediction markets have been somewhat helpful, but seem unlikely to be the full game changer we need. Also as he notes, this would be great to pull off, but there’s an attractor that makes it look more doable than it is, which can trick people into focusing on it more than they should relative to other interventions.

Kelsey Piper provides her overview of the findings that Giving People Money on a monthly basis in America does not seem to improve most outcomes, including child outcomes. They’re not ‘blowing’ the money on vices, but people give back a lot of it by working less, and while they tell stories about how great things are, most of the statistics don’t improve.

She then points us to a conservative critique of her analysis by Charles Lehman. I agree with Charles that the criticism ‘maybe the gains don’t show up in the measurements’ is rather silly, unless you can point to a specific policy aim that isn’t being measured, and explain why you have hope for that one despite the others not showing up.

I also appreciated Charles saying that for American purposes, a study of a social intervention in Africa should be treated similarly to when biologists say something ‘works in vitro,’ as conditions are so radically different. The synthesis would be that ‘give people money’ is highly useful at improving various outcomes when those people have so little money that they risk starvation, but not that far beyond that, and existing interventions here already take us above the threshold.

We definitely need to reconcile our model of how this works with not only the null results in these studies, but also the other null results from many other programs.

One reason to be suspicious of ‘policy mostly can’t help’ is that if you propose ‘policy mostly can’t hurt’ or even ‘getting rid of existing policies and enforcement mostly can’t hurt’ then most people will disagree with you. So at minimum, you can help by not hurting, and you should find the extreme asymmetry suspicious.

I do have one policy objective that I am confident this does help with if it is reliable and sustained, which is fertility. I’m sticking by the estimate that for every ~$270k in value (which need not be cash) you transfer to parents, you get one additional birth. This is one area where anticipation of money, or rather anticipation of having the necessary resources of all types, definitely changes behavior.

I concur with the consensus view that this post from Lennox about trying to sell Marx to EAs backfiring spectacularly is a gem of a read. You get such fun as Lennox encountering the Socialist Calculation Debate in its purest form:

Lennox: But when I looked at what the EAs were actually doing, and the methods they were using to evaluate charities, it quickly became clear that this was not going to work. One look at a GiveWell spreadsheet filled my heart with dread. They were creating insanely detailed cost effectiveness estimates of different interventions, using probabilistic models that tried to account for every possible empirical and philosophical assumption you could think of.

It would be great to analyse policies that fundamentally transform the economy at this level of detail, but there were a couple of problems. First, it’s impossible to create a useful model at that level of detail for transformative economic policies. Second, even if it were possible, there’s no way I could do it.

Fine. Lesson learned.

Except, of course, lesson not learned, because he didn’t then think ‘oh that is exactly why the socialist ideas I am advocating for won’t work.’ So he continues, and asks his sociological experts who love socialism. The same result happens again:

This was… disappointing to say the least. Here was a group of serious academics who had spent decades trying to make a rigorous case for socialism, and this is what they ended up concluding? That we don’t have the social technology to make it work, but maybe one day we will get there.

He then does the ‘finds smoking causes cancer, quits reading’ move, saying this means analytical Marxists had undermined themselves so maybe look at critical theory. Because, of course:

I’d assumed that if you want to solve a systemic problem like global poverty, you need to understand the root cause, and the root cause of poverty was, of course, capitalism.

However, understanding the root cause of something doesn’t automatically help you solve it.

The main mistake, of course, is that the root cause was not capitalism but instead the human condition. This had not yet entered Lennox’s hypothesis space. The other mistake was, yes, knowing the root cause of something does not always help solve it.

The evidence continued to mount.

Throughout undergrad, I would read sociological theorists and often find their arguments vague, opaque, and at times just poorly argued. Then I would read work by EAs and find it crystal clear, carefully argued, and generally well calibrated to the evidence.

The final nail in the coffin came while reading Scott Alexander’s essay Meditations on Moloch.

…

Looking back, I could have saved myself a lot of time. These fundamental problems with the project were probably obvious to many in the EA community and they would have told me the project was unlikely to be useful, if I’d had the courage to ask. But I avoided getting their feedback, partly because I figured they were ideologically blinded and would just dismiss anything critical of their movement.

That last line really is wonderful. Socialist refuses to get feedback from target audience because they are worried audience is ideologically blinded. Love it.

After that he is then able to do some self-reflection, also fun but less fun. Then in conclusion he comes around and notes that if you accept that the world is a swirling mess of misaligned incentives and coordination problems, then this completely undermines the Marxist political project.

That is indeed how the world works, so yes. Thank you. Well done, sir. Also, well done, sir, at the end:

Anyway, a couple of years after this happened I fell in love, and it was everything the poets and songwriters said it would be. So I guess the moral of the story is: if you find yourself tempted to construct elaborate ideological arguments in a vain attempt to make yourself feel smart and important, consider falling in love instead.



the-us-is-trying-to-kick-start-a-“nuclear-energy-renaissance”

The US is trying to kick-start a “nuclear energy renaissance”


Push to revive nuclear energy relies on deregulation; experts say strategy is misplaced.

In May, President Donald Trump signed four executive orders to facilitate the construction of nuclear reactors and the development of nuclear energy technology; the orders aim to cut red tape, ease approval processes, and reshape the role of the main regulatory agency, the Nuclear Regulatory Commission, or NRC. These moves, the administration said, were part of an effort to achieve American independence from foreign power providers by way of a “nuclear energy renaissance.”

Self-reliance isn’t the only factor motivating nuclear power proponents outside of the administration: Following a decades-long trend away from nuclear energy, in part due to safety concerns and high costs, the technology has emerged as a potential option to help mitigate climate change. Because reactors generate power through nuclear fission, in which atoms are split to release energy, they don’t emit any greenhouse gases.

The Trump administration wants to quadruple the nuclear sector’s domestic energy production, with the goal of producing 400 gigawatts by 2050. To help achieve that goal, scientific institutions like the Idaho National Laboratory, a leading research institute in nuclear energy, are pushing forward innovations such as more efficient types of fuel. Companies are also investing millions of dollars to develop their own nuclear reactor designs, a move from industry that was previously unheard of in the nuclear sector. For example, Westinghouse, a Pennsylvania-based nuclear power company, plans to build 10 new large reactors to help achieve the 2050 goal.

However, the road to renaissance is filled with familiar obstacles. Nuclear energy infrastructure is “too expensive to build, and it takes too long to build,” said Allison Macfarlane, a science and technology policy expert at the University of British Columbia who chaired the NRC from 2012 to 2014.

And experts are divided on whether new nuclear technologies, such as small versions of reactors, are ready for primetime. The nuclear energy field is now “in a hype bubble that is driving unrealistic expectations,” said Edwin Lyman, the director of nuclear power safety at the Union of Concerned Scientists, a nonprofit science advocacy organization that has long acted as a nuclear safety watchdog.

Meanwhile, the Trump administration is trying to advance nuclear energy by weakening the NRC, Lyman said. “The message is that it’s regulation that has been the obstacle to deploying nuclear power, and if we just get rid of all this red tape, then the industry is going to thrive,” he added. “I think that’s really misplaced.”

Although streamlining the approval process might accelerate development, the true problem lies in the high costs of nuclear, which would need to be significantly cheaper to compete with other sources of energy such as natural gas, said Koroush Shirvan, a nuclear science researcher at the Massachusetts Institute of Technology. “Even the license-ready reactors are still not economical,” he said. If the newer reactor technologies do pan out, without government support and subsidies, Shirvan said, it is difficult to imagine them “coming online before 2035.”

It’s déjà vu all over again

Rumblings of a nuclear renaissance give experts a sense of déjà vu. The first resurgence in interest was around 2005, when many thought that nuclear energy could mitigate climate change and be an energy alternative to dwindling supply and rising prices of fossil fuels. But that enthusiasm slowed mainly after the Fukushima accident in 2011, in which a tsunami-triggered power outage—along with multiple safety failures—led to a nuclear meltdown at a facility in Japan. “So, the first nuclear renaissance fizzled out,” said Lyman.

Globally, the proportion of electricity provided by nuclear energy has been dwindling. Although there has been an increase in generation, nuclear energy has contributed less to the share of global electricity demand, dropping to 9 percent in 2024 from a peak of about 17 percent in 2001. In the US, 94 reactors generate about a fifth of the nation’s electricity, a proportion that has held steady since the 1990s. But only two of those reactors have come online in nearly 30 years.

This renewed push is “a second bite at the apple, and we’ll have to see but it does seem to have a lot more of a headwind now,” said Lyman.

Much of that movement comes from the private sector, said Todd Allen, a nuclear engineer at the University of Michigan. In the last couple of decades, dozens of nuclear energy companies have emerged, including TerraPower, co-founded by Bill Gates. “It feels more like normal capitalism than we ever had in nuclear,” Allen said. Those companies are working on developing the large reactors that have been the backbone of nuclear energy for decades, as well as newer technologies that can bolster the field.

Proponents say small modular reactors, or SMRs, and microreactors, which generate less than 300 megawatts and 20 megawatts, respectively, could offer safer, cheaper, and more flexible energy compared to their more traditional counterparts. (Large reactors have, on average, 900 megawatts of capacity.) One 2022 study found that modularization can reduce construction time by up to 60 percent.

These designs have taken the spotlight: In 2024, a report estimated that the SMR market would reach $295 billion by 2043. In June, Energy Secretary Chris Wright told Congress that DOE will have at least three SMRs running by July of next year. And in July of this year, the Nuclear Energy Agency launched a dashboard to track SMR technologies around the world, which identified 74 SMR designs at different stages of development. The first commercial SMR in North America is currently being constructed in Canada, with plans to be operational by 2030.

But whether SMRs and microreactors are actually safer and more cost-effective remains to be determined. A 2022 study found that SMRs would likely produce more leakage and nuclear waste than conventional reactors. Studying them, though, is difficult since so few are currently operational.

In part, that may be because of cost. Multiple analyses have concluded that, because of rising construction and operating costs, SMRs might not be financially viable enough to compete in the world’s energy markets, including in developing countries that lack affordable access to electricity.

And recent ventures have hit road bumps: For example, NuScale, the only SMR developer with a design approved by the NRC, had to shut down its operations in November 2023 due to increasingly high costs (though another uprated SMR design was approved earlier this year).

“Nothing is really commercialized yet,” said Macfarlane. Most of the tech companies haven’t figured out expenses, supply chains, the kind of waste they are going to produce or security at their reactors, she added.

Fuel supply is also a barrier: most plants use uranium enriched at low levels, but SMRs and microreactors use uranium enriched at higher levels, which is typically sourced from Russia and not commercially available in the US. So scientists at the Idaho National Laboratory are working to recover enriched uranium from existing reactors and to develop new, more cost-effective fuels, said Jess Gehin, the associate laboratory director for the Nuclear Science & Technology Directorate at the INL. They are also using artificial intelligence and modeling and simulation tools to optimize nuclear energy systems, he added: “We got to reach 400 gigawatts, we need to accelerate all of this.”

Companies are determined to overcome these barriers. Some have begun pouring concrete: Kairos Power, for example, began building a demo of its SMR design in Tennessee, with the plant projected to be fully operational by 2027. “I would make the case that we’re moving faster than many in the field, if not the fastest,” Mike Laufer, the company’s CEO and co-founder, told Reuters last year.

Some experts think achieving nuclear expansion can be done—and revel in the progress so far: “I would have never thought we’d be in this position where we’re working so hard to expand nuclear, because for most of my career, it wasn’t that way,” said Gehin. “And I would say each month that goes by exceeds my expectations on the next bigger things that are coming.”

Doing more with less?

Although the Trump administration aims to accelerate nuclear energy through executive orders, in practice, it has not allocated new funding yet, said Matt Bowen, an expert on nuclear energy, waste, and nonproliferation at Columbia University’s Center on Global Energy Policy. In fact, the initial White House budget proposed cutting $4.7 billion from the Department of Energy, including $408 million from the Office of Nuclear Energy allocated for nuclear research in the 2026 fiscal year.

“The administration was proposing cuts to Office of Nuclear Energy and DOE more broadly, and DOGE is pushing staff out,” said Bowen. “How do you do more with less? Less staff, less money.”

The Trump administration places the blame for the nuclear sector’s stagnation on the NRC, which oversees licensing and recertification processes that cost the industry millions of dollars each year in compliance. In his executive orders, Trump called for a major reorganization of the NRC. Some of the proposed changes, like streamlining the approval process (which can take years for new plants), may be welcomed because “for a long time, they were very, very, very slow,” said Charles Forsberg, a nuclear chemical engineer at MIT. But there are worries that the executive orders could do more than cut red tape.

“Every word in those orders is of concern, because the thrust of those orders is to essentially strip the Nuclear Regulatory Commission of its independence from the executive branch, essentially nullifying the original purpose,” said Lyman.

Some experts fear that with these new constraints, NRC staff will have less time and fewer resources to do their jobs, which could impact power plant safety in the future. Bowen said: “This notion that the problem for nuclear energy is regulation, and so all we need to do is deregulate, is both wrong and also really problematic.”

The next few decades will tell whether nuclear, especially SMRs, can overcome economic and technical challenges to safely contribute to decarbonization efforts. Some, like Gehin, are optimistic. “I think we’re going to accelerate,” he said. “We certainly can achieve a dramatic deployment if we put our mindset to it.”

But making nuclear financially competitive will take serious commitment from the government and from dozens of companies, and many remain skeptical, Shirvan said. “I am quite, I would say, on the pessimistic scale when it comes to the future of nuclear energy in the US.”

This article was originally published on Undark. Read the original article.
