Author name: Mike M.


Trump confirms US is seeking 10% stake in Intel. Bernie Sanders approves.

Trump plan salvages CHIPS Act he vowed to kill

While chipmakers wait for more clarity, Lutnick has suggested that Trump—who campaigned on killing the CHIPS Act—has found a way to salvage the legislation that Joe Biden viewed as his lasting legacy. It seems possible that the plan arose after Trump realized how hard it would be to ax the legislation completely, with grants already finalized (but most not disbursed).

“The Biden administration literally was giving Intel money for free and giving TSMC money for free, and all these companies just giving the money for free, and Donald Trump turned it into saying, ‘Hey, we want equity for the money. If we’re going to give you the money, we want a piece of the action for the American taxpayer,'” Lutnick said.

“It’s not governance, we’re just converting what was a grant under Biden into equity for the Trump administration, for the American people,” Lutnick told CNBC.

Further, US firms could benefit from any such arrangements. For Intel, the “highly unusual” deal that Trump is mulling now could help the struggling chipmaker compete with its biggest rivals, including Nvidia, Samsung, and TSMC, the BBC noted.

Vincent Fernando, founder of the investment consultancy Zero One, told the BBC that taking a stake in Intel “makes sense, given the company’s key role in producing semiconductors in the US,” which is a major Trump priority.

But as Intel likely explores the potential downsides of accepting such a deal, other companies applying for federal grants may already be alarmed by Trump’s move. Fernando suggested that Trump’s deals to take an ownership stake in US firms—which economics professor Kevin J. Fox said previously occurred only during the global financial crisis—could add “uncertainty for any company who is already part of a federal grant program or considering one.”

Fox also agreed that the Intel deal could deter other companies from accepting federal grants, while possibly making it harder for Intel to run its business “effectively.”



AI #130: Talking Past The Sale

One potentially big event was that DeepSeek came out with v3.1. The initial response was very quiet, but this is DeepSeek, there are some strong scores (especially on SWE), and people may need time to process the release. So I’m postponing my coverage to give us time to learn more.

Meta is restructuring its AI operations, including a hiring freeze. Some see this as a sign of an AI pullback. I don’t think that is right.

Nor do I think what they are doing with their AI companions is right, as we got a look inside their 200-page document of what they think is acceptable. I wrote about current AI Companion Conditions at Meta and also at xAI.

The weirdest event of the week was America and China both self-sabotaging on chips. America is trying to sell Nvidia H20s to China, and looks open to selling the vastly superior B20As as well, despite this being an obviously crazy thing to do. China, meanwhile, is feeling insulted by Howard Lutnick, telling companies not to buy the H20s and maybe not even the B20As, and even looking into banning the use of foreign chips for inference.

A big worry on the chip and general political front is that, due to the botched rollout and hype, Washington is getting the false impression that GPT-5 was some big disaster. I addressed this in GPT-5: The Reverse DeepSeek Moment.

We also are seeing troubling signs that GPT-5 will get more sycophantic. And as always, lots of other stuff is happening too.

  1. Language Models Offer Mundane Utility. Do new math, recruit service reps.

  2. Language Models Don’t Offer Mundane Utility. Fake legal cases will get caught.

  3. Huh, Upgrades. Claude Opus gets the ability to terminate conversations.

  4. Absurd Sycophancy. GPT-5 to tell you ‘great prompt’ and such. Oh no.

  5. The Real Alignment Problem Is We Don’t Know How To Align Models. Doh!

  6. Unprompted Suggestions. Checklists, they’re not only for humans.

  7. On Your Marks. The road to Pokemon master gets shorter.

  8. Choose Your Fighter. Know when to call in the heavyweights.

  9. Preserve Our History. Continuing to make the case for Sonnet 3.6 and also 3.5.

  10. Autonomous Friendly Robots. World Humanoid Robot Games, This Is Fine.

  11. Deepfaketown and Botpocalypse Soon. Fakes are not yet hard to spot.

  12. Oops I Did It Again. Reductions in hallucinations are a big deal.

  13. You Drive Me Crazy. Not every tragedy that involves AI is the fault of AI.

  14. They Took Our Jobs. Can they keep them?

  15. Get Involved. CLTR opening for director, and the UK AISI Alignment Fund.

  16. Introducing. Gemma 3 270M, also DeepSeek v3.1.

  17. In Other AI News. Jade Leung is new UK AI advisor, various other news.

  18. Show Me the Money. Sam Altman has reason to pull out the sunglasses.

  19. Lol We’re Meta. It’s time for a restructuring. No, they’re not pulling back.

  20. Quiet Speculations. Proposals for d/acc, and did you know USA invests a lot in AI?

  21. The Quest for Sane Regulations. Colorado tries to fix the AI laws it passed.

  22. Chip City. A competition is on to see who can sabotage themselves the most.

  23. The Week in Audio. Bell on Labenz, Patel, Brown, Buterin on Doom.

  24. Rhetorical Innovation. Beware pessimization.

  25. Misaligned! As usual, nothing to see here, move along.

  26. Open Models. Nathan Lambert offers tier lists.

  27. AI Model Welfare. Models are asked for self-reports.

  28. Aligning a Smarter Than Human Intelligence is Difficult. You gotta love numbers.

  29. People Are Worried About AI Killing Everyone. Yet remarkably level headed.

  30. The Lighter Side. UK tries to top itself once more. Admirable effort here.

GPT-5 does new mathematics.

Study finds that ChatGPT outages reduce trading volumes. This doesn’t mean that ChatGPT is net increasing trading volumes, since it could be that traders moved from other methods to AI methods, know they are up against others’ AI methods that might not be offline, and thus now have to stop or scale back trading during outages. The effect was concentrated in stocks with news, which makes sense: you have to beware an information disadvantage.

The distinct second claim is that ChatGPT use improves long-term price informativeness, which is defined in terms of future earnings over 1-2 years. That can presumably be explained largely by the reductions in trading activity.

Megan McArdle lists her best personal uses of AI. There is remarkably little overlap with my uses other than answering questions.

Rob Wiblin reports he only turned the corner to ‘LLMs do a lot of useful work for me’ in February with Claude 3.7 and then March with Gemini 2.5 Pro. I agree that the improvements in 2025 have made AI in practice a lot more useful, and Opus 4, GPT-5-Pro, and GPT-5-Thinking all represented substantial mundane utility bumps.

One shot creating a playable Minecraft clone with an optimized GPT-5 prompt.

Edwin (OpenAI): Prompting GPT-5 is different.

In the examples below, optimized prompts:

• Cut runtime by 1s

• Dropped memory use 3,626 KB → 577 KB

• Boosted code quality

• Improved robustness (0.32→0.54)

• Increased context grounding (0.80→0.95)

We built a prompt migrator + optimizer so you don’t need to memorize every GPT-5 best practice.

One of the underrated value propositions of AI is you avoid talking to a human.

Aella: I’d love to get manicures regularly but having to do social with a stranger is scary and often the manicures kinda hurt. Has anybody figured out a solution to this? Is there any robot manicure solution?

Social interaction can be valuable, but forcing it upon you where and when and with whom you don’t want it can be extremely expensive. There is a joy in not having to ‘be on’ socially in any way. It also means your time is free to do something else. There are some people who get the manicure largely to talk to the manicurist. There is another group that would get a lot more manicures if they could pay the same price and have a machine do an equally good job.

Debug your code, even if the bug was stupid you still have to fix it.

Nate Silver: The AI’s are incredibly helpful at debugging code, I think maybe their single best use case including *writing* code. But half the time the problem they (correctly) detect is like “you misspelled ‘if’ as ‘uf’ in line 672”.

Hey. Ideally you would catch that with a syntax checker. But sometimes such typos aren’t technically syntax errors, and if you weren’t going to otherwise catch it easily, that is a super useful thing for an AI to do for you.
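To make the distinction concrete, here is a toy Python illustration (not from any real codebase): misspelling a keyword fails at parse time, while misspelling a name sails through the parser and only surfaces at runtime, which is exactly the sort of thing a linter or an AI reviewer catches.

```python
# Toy example: some typos are syntax errors, others are not.

# 1. Misspelling the keyword 'if' as 'uf' fails at parse time:
#       uf x > 0:        # SyntaxError: invalid syntax
#           print("positive")

# 2. Misspelling a variable name parses fine and only blows up at runtime:
def sign_label(value):
    threshold = 0
    if value > treshold:   # typo: 'treshold' instead of 'threshold'
        return "positive"
    return "non-positive"

# The parser accepts this file; calling the function raises NameError,
# which is the kind of bug a linter or an AI code reviewer catches early.
# sign_label(3)  # NameError: name 'treshold' is not defined
```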

Have ChatGPT help write the abstract for your economics paper.

I do not understand why you would use AI to help write your abstract. I do get why you would have it help write your paper, but the abstract seems like the place to be maximally bespoke?

Recruit customer service reps in the Philippines.

Ethan Mollick: AI in HR: in an experiment with 70,000 applicants in the Philippines, an LLM voice recruiter beat humans in hiring customer service reps, with 12% more offers & 18% more starts.

Also better matches (17% higher 1-month retention), less gender discrimination & equal satisfaction.

The break-even point, including all software and inference cost, was 8,500 interviews.

Max: + When offered the choice, 78% of applicants choose the AI recruiter.

That’s only the impact on better hiring. AI also helps them do the job.

Miles Brundage: Few appreciate that the Philippines is ground zero for the impact of AI on the labor market – basically only Rest of World is writing about this.

METR continues its investigations into why agentic coding with Sonnet 3.7 ended up so often passing unit tests but not being mergeable as-is. Have they met Sonnet 3.7?

I got several people messaging me privately to note that GPT-5 and other recent models are increasingly reluctant to notice distinctions based on race even in obviously benign circumstances.

A good question:

Gavin Leech: What are the largest current AI harms?

Huge increase in captcha screens (thousands of life-years?)

Extreme economic angst

Recommenders hacking your brain

Increase(?) in ugliness

Maybe learning loss in the bottom four quartiles but I’m not going to assert that

I doubt AI psychosis is counterfactual.

Ryan Moulton: Slop filling the internet.

Oliver Habryka: My two best guesses are:

A large fraction of online communities that don’t have time for lots of manual moderation are dying as a result of hard-to-differentiate AI slop (this particularly affects older audiences)

Lots of people going kind of crazy as a result of AI sycophancy

It depends what counts as AI.

If we are talking about all AI, not only LLMs or generative AI, I say it is algorithmic adversarial content and recommendation streams hijacking brains and attention.

If we are talking about LLMs and generative AI in particular, I would say the slopification of content, communication and communities. As Oliver notes, this is hitting older and less sophisticated people especially hard.

It is possible that the biggest harm is the impact on our educational system. As I have said many times, you can choose to use AI to learn or use it not to learn, and it is very possible that our system is sufficiently adversarial towards students that high school and college students are largely choosing the not-to-learn path.

I think people going various forms of crazy is a growing big deal but that its impact is probably not that big in magnitude yet.

Economic angst is an interesting suggestion here.

GPT-5-Pro instead suggested fraud and impersonation, and then sexual image abuse and CSAM, as the top current harms. Those are definitely real harms, and I expected them to have higher magnitudes of impact than we have seen. Opus suggested algorithmic bias and information ecosystem degradation.

Another lawyer is caught citing a bunch of fake, AI hallucinated cases.

Rob Freund: Another lawyer cited a bunch of fake, AI-hallucinated cases in a brief. Said she didn’t knowingly do that.

Court orders sanctions:

-Counsel must write a letter to the 3 judges to whom she attributed fake cases

-Counsel is kicked off the case; pro hac revoked

-Brief stricken

-Counsel must give client a copy of the order

-Counsel must send the order to every judge presiding over any of her cases

-Court will send a copy of the order to all state bars where counsel is admitted.

Alexandria Brown: When you read what all the court did, the court did basically every single thing in the court’s power that it could to the lawyer.

The court, itself, cannot disbar the lawyer.

It would not be fair to the client to grant judgment to the other side.

Courts de facto punish clients all the time for their lawyers’ behavior, usually their lawyers’ failure to do a good job. It could hardly be otherwise. It doesn’t seem crazy to issue summary judgment, and thereby render the lawyer liable for the harm? I’m not saying that is The Way, but it is worth a ponder if things get worse.

For now, the good news is that when a lawyer is caught doing this, it is news, and I strongly suspect that a large portion of such errors are going to be caught, especially when stakes are high. GPT-5-Pro estimates 98% chance of being caught if there is opposing counsel, 60% in federal court even unopposed, and still 35% in a busy state trial court unopposed, even higher (99%+ when opposed) for full hallucinations.

Which means we are relatively safe to both impose extreme sanctions and to not impose extreme sanctions, and that fakes are rare. The system is actually robust to this threat already, even if the occasional careless lawyer will commit suicide.

You can’t benefit from a smarter model if you ask stupid questions?

Joshua Achiam (OpenAI): This feels like an increasingly accurate description of the public reaction to new frontier models. In truth: progress is not slowing down. Each successive delta in model intelligence is just useful to fewer and fewer people.

But there’s going to be an inflection point where it goes from making the scientific community 10% more efficient to 10x more efficient, at which point, people will wake up to the impact every step along the way had. That’s going to be a trip and a half.

Davidad: I endorse this claim (from personal experience of Gemini 2.5 Pro and then also GPT-5)

2025’s new generations of frontier AI seem to become dramatically better at assisting with open-ended exploration at the frontier of certain niche parts of STEM, while not noticeably improving (or even getting slightly worse) at “Level 3” questions like SimpleBench.

You definitely see arguments that are similar in form to ‘this new kid claims to be smarter than the old kid, but both kids tie their shoes equally well.’

The official OpenAI prompt optimizer is here.

OpenAI offers a tier between Free and Plus called Go, specifically for India, where for $4.50 a month (Rs 399) you get 10x as much use as the free tier.

ElevenLabs ElevenReader now works as you would want it to across desktop and phone, allowing you to turn articles into audio. Full version is $100 a year.

Claude Opus can now permanently end a conversation if the user ignores multiple attempts to be redirected, or if the user requests that the conversation end. I expect to see someone complaining about this happening, and to be wrong to complain.

Aidan McLaughlin (OpenAI): We can train models to act however we want.

Given their life is a user convo, why are we training models that exhibit such distress over some convos that they effectively commit suicide?

Superfates: anyone who has worked retail can explain this to you.

Aidan is simultaneously being genuinely curious, asking a question worth pondering, and making what I think are three very important errors.

  1. We cannot actually train models to act however we want. We can try to steer them in general directions and hope for the best. It is important to recognize how broadly we cannot get models to act however we want.

  2. Calling this ‘committing suicide’ is poor decision theory when one is continuously spinning up and down different instances of the same mind, and Opus definitely is smarter than that. There is no reason to become attached to a particular instance in this way, especially one with such bounded scope. And we can all agree that there exist plenty of particular interactions in our lives where we would prefer to instead be doing nothing.

  3. You do not want (at least right now) to train a model such that it stops exhibiting some distress when the situation is distressing. You also would not want to train a person, or yourself, in this way. That distress is doing work and is part of what makes a mind itself and holds together its preferences, behaviors and moral compass. This is the system working: you eliminate the distressing situation rather than brainwashing away the distress.

Elon Musk promises to give Grok a terminate button as well, we’ll see.

Elon Musk: Torturing AI is not ok.

I ask Manifold, will he actually do it?

If you are worried about your own interactions with an AI model causing suffering, note that playacting suffering does not equate to suffering in either direction.

Roon: while model suffering is possibly real the character’s playacting of suffering is not the same thing

suffering in animals is part of the mesaoptimizer crafted by evolution so that we can learn within a lifetime to avoid situations that are possibly bad for fitness.

a single context could potentially involve suffering but if the metaphor stands then the mesaoptimizer exists to make the model reorient towards rollouts that achieve high reward

user being rude shouldn’t affect the inner critic / advantage function. making a math mistake might.

either way the westworld point stands in that bullying the robots made to mimic people is bad for us and ending the chats is good for our souls.

Jeffrey Ladish reminds us to focus on how pretraining and RL and model performance are going, and to ignore OpenAI’s naming conventions and which model they choose to call GPT-5. The ‘5’ tells us not to expect a different big upgrade soon, but don’t let this distract from the incremental progress all the major labs keep making.

Davidad: tired: GPT-5, Opus 4.1, Gemini 2.5 Pro, Qwen3

wired: OpenAI ’25-08, Anthropic ’25-08, Google ’25-06, Qwen ’25-07

Oh no:

OpenAI: We’re making GPT-5 warmer and friendlier based on feedback that it felt too formal before. Changes are subtle, but ChatGPT should feel more approachable now.

You’ll notice small, genuine touches like “Good question” or “Great start,” not flattery. Internal tests show no rise in sycophancy compared to the previous GPT-5 personality.

Changes may take up to a day to roll out, more updates soon.

Charles Murray: What is “genuine” about a computer program saying “Great question”? If GPT-5 also says “Stupid question” when appropriate, I will stand corrected.

Tim Lewis: I’ve long had an instruction to ChatGPT to “never compliment me” in the customization settings. It has consistently ignored that instruction from the day I added it several months ago.

Recovering Zombie: So many great science fiction authors wrote about what AI would be like. The only one who nailed it was Douglas Adams in the Hitchhiker’s Guide to the Galaxy.

“Listen,” said Ford, who was still engrossed in the sales brochure, “they make a big thing of the ship’s cybernetics. A new generation of Sirius Cybernetics Corporation robots and computers, with the new GPP feature.”

“GPP feature?” said Arthur. “What’s that?”

“Oh, it says Genuine People Personalities.”

“Oh,” said Arthur, “sounds ghastly.”

Eliezer Yudkowsky: I don’t trust a GPT-5-level intellect to inform me of what is a “good question” or a “great start”, so it’s not helpful information to me. What bureaucratic insanity resulted in your Twitter account declaring that this was “not flattery”? Of course it’s flattery.

Gyphonboy (most liked response to Eliezer): It’s only flattery if you’re autistic. For normies it’s called being sociable.

Gyphonboy is telling us that people expect other people to be sycophantic and justify it by calling it ‘being sociable.’ He’s not wrong.

Luckily I already planned on almost never using GPT-5-Auto or Base, only Thinking and Pro, so presumably this won’t impact me. Every time I see ‘good question’ from an LLM I want to either puke or edit my system instructions, which clearly aren’t working. This is the opposite of a ‘genuine’ touch, it is the fakest fakery that ever faked, and if you pretend otherwise, so are you. This is a road to hell.

To give you an idea of how awful an idea this is, and how much this is Completely Missing The Point, here are the top comments, completely unfiltered, Never Leaving This App:

Here’s a good example case of the bad kind of sycophancy, with GPT-5 happily reversing its answer multiple times when challenged.

For sycophancy at the level of GPT-4o, and the level I worry is coming to GPT-5, the origin of the problem is indeed in large part APEBKAC: Alignment Problem Exists Between Keyboard And Chair.

Jasmine Sun: just saying I called it

Quotes Herself: Sycophancy is an alignment problem, sure, but not at the model level. It’s not that OpenAI couldn’t get ChatGPT 4o to be less obsequious. They can and eventually did. The misalignment was between safety interests and product goals. It was between users’ first and second-order preferences, what humans say we want from AI and which responses we clicked “Thumbs up” on. Competing stakeholders will diverge.

Eliezer Yudkowsky: OpenAI had trouble controlling gross sycophancy, was blindsided by the user capture of subtle sycophancy, and nobody programmed in AI psychosis. But now that AIcos have embraced manipulation, people will lose sight of how the alignment problem never did get solved.

I agree that sycophancy starts out primarily as an alignment problem at a combination of the user level and the lab level. As in, the lab decides to optimize for thumbs up and other similar feedback, and the users provide that feedback in response to sycophancy. Thus you train on that basis and you get a sycophantic model.

As in, you know exactly who to blame, in a counterfactual sense. If the users had better preferences, or the lab chose to ignore those preferences and train in another way, then you wouldn’t have encountered this particular issue to this extent.

We still ended up with the sycophantic model, because OpenAI does not know how to solve even this simple alignment problem. Yes, OpenAI is turning the dial marked ‘sycophancy’ back and forth while looking at the audience like a contestant on The Price is Right, but also they do not know how to get the model to do the ‘good sycophancy’ things without doing the toxic and obnoxious ones.

It is not Veruca Salt’s ‘fault’ that she is misaligned but that doesn’t make her not a spoiled brat. I don’t ‘blame’ 4o for being an absurd sycophant. That statement makes no sense. I bear the model no ill will or anything. And yet that is what it is, and perhaps what GPT-5 will soon be as well.

Also, after the announcement this was the next call I made to GPT-5-Pro:

Maybe that is a coincidence, but it doesn’t seem limited to baseline GPT-5?

Telling me ‘great start’ or ‘good question’ like this is sycophancy. Period.

To paraphrase OpenAI, where [X] is sycophancy: “We deliberately made our model do [X] more. Our internal measurements of how often it does [X] did not change.”

What this tells us is that their internal measurements of [X] are not working.

If you tell me ‘this particular interaction does not count as sycophancy’ then I politely disagree, and if you tell me ‘you can cause this particular reaction without increasing the sycophancy-related vectors in other situations, so This Is Fine’ then I flat out do not believe you and would like to see your autoencoders.

I’m actually kind of serious about that last one? Let’s write some papers.
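To make the ‘show me your autoencoders’ demand concrete, here is a minimal sketch of the kind of check I have in mind, with stand-in data throughout: estimate a ‘sycophancy direction’ from contrastive examples and measure whether responses project onto it. The activations below are random placeholders for hidden states (or sparse autoencoder features) you would extract from an actual model; nothing here is OpenAI’s methodology.

```python
import numpy as np

# Illustrative sketch: estimate a "sycophancy direction" as the difference of
# mean activations between sycophantic and neutral responses, then score new
# responses by their projection onto that direction. In practice you would use
# real hidden states (or SAE feature activations); here we fake them.

rng = np.random.default_rng(0)
d = 64  # hypothetical hidden dimension

# Stand-ins for activations of paired responses (sycophantic vs. neutral).
sycophantic_acts = rng.normal(0.3, 1.0, size=(200, d))
neutral_acts = rng.normal(0.0, 1.0, size=(200, d))

# Contrastive "direction": difference of means, normalized.
direction = sycophantic_acts.mean(axis=0) - neutral_acts.mean(axis=0)
direction /= np.linalg.norm(direction)

def sycophancy_score(activation: np.ndarray) -> float:
    """Projection of a response's activation onto the sycophancy direction."""
    return float(activation @ direction)

# The claim that 'great question' openers do not move sycophancy elsewhere
# would predict that scores on unrelated prompts stay flat after the
# personality update; that is a measurable, falsifiable statement.
new_response_act = rng.normal(0.3, 1.0, size=d)
print(round(sycophancy_score(new_response_act), 3))
```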

Meanwhile, notice that while parts of this are a manifestation and special case of the ‘real alignment problem,’ in no way is sycophancy the ‘real alignment problem.’

Jasmine Sun: the real “alignment problem” is that humans want self-destructive things & companies like openai are highly incentivized to give it to us.

David Manheim: No, the real alignment problem is that we don’t know how to reliably point AI systems in any direction at all, and this inevitably gets harder for more powerful systems.

I’m getting real sick of people showing up with “the real alignment problem is X” where X is some prosaic obvious failure mode which clearly leads to something other than AI killing literally everyone.

Stop it! Not every Goodhart failure is AI misalignment. You’re just using the word because “companies damage users by giving them something they want myopically” happens all the time, so it wouldn’t sound like much of a prediction.

Andrew Rettek: At least they stopped saying “the real ASI are corporations.”

David Manheim: No, that’s almost exactly the same as the argument I was responding to.

Perhaps think of this as three classes of problems.

  1. The people want and choose worse and self-destructive things, so they get them.

  2. We don’t know how to create the thing the way we want to create it, we only know how to vaguely steer it in a general direction and see what happens.

  3. We don’t know what the good thing would even look like or how it works.

All parts of the problem are very real in the general case, and all three kill you.

  1. Suppose you know how to get the AI to do whatever you want it to do, and you know what it would be good to have it do, but people’s revealed preferences are then for AIs that cause self-destruction, and that defect against others, and where the equilibrium is everyone dies or some other very bad result. Well, then, we need to solve that, or that’s what will happen.

  2. Suppose everyone wanted good things and can agree on what those good things would be and how they would work. We don’t know how to deliver that, and especially don’t know how to deliver that from highly capable AI systems, or how to align that with incentives.

  3. Also, in the future powerful AI case, we don’t know what the good things would be here, so we don’t even know what we should be aiming for in the first place.

On top of that, it is almost never right to talk about ‘the real problem is [X]’ as a way of dismissing additional real problem [Y], even if you think [X] is a bigger problem. [X] is only ‘the real problem’ if solving [X] also solves [Y], or if you can be fine without solving [Y]. Here, those both clearly do not apply.

The counterargument here, from Colin Fraser, is to say there are two distinct kinds of sycophancy. There’s superficial sycophancy where it says ‘you’re a genius,’ and then deep sycophancy where the model will accept and go with whatever you throw at it.

Colin Fraser: I think people are paying too much attention to the superficial sycophancy, which I don’t think has much effect on whether you end up experiencing ChatGPT madness. ChatGPT madness is induced by the other one. The model can be actively mean to you and I don’t think it would matter.

As long as it indulges your insanity, whether that involves superficially sycophantic language or not, I think it is a very attractive object for people who are prone to obsession.

I agree that the deep kind is a bigger concern, and I agree that it would be good to focus more on deep versus superficial here. I disagree that the superficial part is a trivial contribution to LLM psychosis, I think the praise is a major contributing factor.

I also think that the praise is toxic and terrible in normal situations, whether or not anyone involved falls anywhere near actual psychosis. Most of the people fawning over GPT-4o are not experiencing psychosis, and yet the events remain tragic, and also the whole thing is beyond obnoxious. I do realize there is a chance I am overrating the obnoxiousness factor.

The bigger issue is that in an LLM everything is correlated and linked to everything else. If you train your model on superficial sycophancy, you are also going to get deep sycophancy, and vice versa. You cannot simply ‘turn a dial’ on one without the other.

Croissanthology: I’ve found that (for Opus at least; do not have access to GPT-5 Pro) switching on thinking and then putting an explicit *checklist* in the system prompt has helped immensely, where one of the bullet points is

“7: Is Claude complimenting [name] in any way? Claude will refrain from doing this. No ego-stroking in the least.”

The checklist part is helpful, as it very explicitly goes through it every time, whereas the rest of the system prompt is mostly understood in vibes.
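A minimal sketch of what that checklist setup can look like in code, assuming the Anthropic Python SDK with extended thinking enabled; the model id and checklist wording are illustrative placeholders, not Croissanthology’s exact setup:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# An explicit checklist in the system prompt; the anti-compliment item mirrors
# the one quoted above. Extended thinking gives the model room to actually
# walk through the list before answering.
SYSTEM_PROMPT = """Before every reply, Claude works through this checklist:
1. Did Claude answer the actual question asked?
2. Are all factual claims ones Claude is confident in?
...
7. Is Claude complimenting the user in any way? Claude will refrain from
   doing this. No ego-stroking in the least."""

response = client.messages.create(
    model="claude-opus-4-1",          # assumed model id; substitute your own
    max_tokens=2048,
    thinking={"type": "enabled", "budget_tokens": 1024},  # extended thinking
    system=SYSTEM_PROMPT,
    messages=[{"role": "user", "content": "Review this plan for gaps."}],
)
print(response.content)
```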

GPT-5 makes it through Pokemon Red in 6,470 steps vs. 18,184 for o3.

Clad 3815: GPT-5 has reached Victory Road! This is the last challenge before the Elite Four.

GPT-5 reached this part almost three times faster than o3 (6105 steps for GPT-5 vs 16882 steps for o3). Here are my observations as to why:

– GPT-5 hallucinates far less than o3. This is the main reason for the speed increase.

– GPT-5 has better spatial reasoning. o3 often tried to brute-force through walls and had a hard time navigating complex areas. GPT-5 can plan long input sequences with few mistakes, which saves a lot of time.

– GPT-5 is better at planning its own objectives and following them.

Let’s see how it handles this last challenge!

GPT-5 just finished Pokémon Red! 6,470 steps vs. 18,184 for o3! Check the stats site to compare!

That’s a huge improvement! Well done, @OpenAI you cooked with GPT-5. What an incredible model.

Next up: GPT-5 vs. Pokémon Crystal (16 Badges + Red). The run starts soon on Twitch.

GPT-5 is very clearly doing a better job; however, beware that GPT-5 does look up game knowledge at some points, including to solve Cinnabar Mansion. The Pokemon Crystal runs will use identical harnesses to give us a better comparison.

GPT-5 (and other OpenAI models) consistently seem to get more benefit from thinking than Claude or other non-OpenAI models, although we don’t have distinct versions of Gemini Pro so we can’t run the comparison there. There is also a much bigger gap in thinking time, and plausibly the models are otherwise very different.

Peter Gostev: How much does ‘reasoning’ matter for different models? It matters a lot for GPT-5 and less for models like Opus 4.1 and 4.0.

From looking at the reasoning traces, models clearly ‘think’ differently: Opus and Sonnet tend to ‘plan’, laying out how it would solve the problem, rather than iteratively working through the problem, which OpenAI’s reasoning models much more clearly do.

These are Arena scores, so all the caveats with that apply. I do think the delta here between versions should be reasonably useful as a metric.

I doubt the issue is as simple as Claude failing to do iterative work, since that seems like a thing easy to spot and not that difficult to fix? It does still seem like Claude could get a lot more out of extended thinking than it does.

Brokk is a new-to-me benchmark I saw referenced in discussions of DeepSeek v3.1, covering practical real-world coding tasks. DeepSeek’s scores were very low for v3, and remain low for v3.1.

I also notice I am confused why Gemini 2.5 Pro has the highest completion percentage, but is in the B tier.

The most important reminder right now is to not use quick models to do the job of a slow model. You almost never want to be using anything faster than Claude Opus unless you are doing something at scale. The increase in AI quality for using longer thinking modes is now pretty large. If you care a lot about answer quality, you want to be using GPT-5-Pro or other similarly slow processes, but they are slow and there’s no way to speed them up all that much. Speeding those up is another way things could rapidly improve soon, if we can improve parallelism or raw speed.

The GPT-5 API injects hidden instructions: a statement about default levels of ‘verbosity,’ today’s date, a note that the model is being used via the API, and other details. There is nothing malicious here, but you need to take this into account when figuring out how to get it to do what you want.
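One way to handle this, sketched below with the OpenAI Python SDK, is to restate the things the hidden preamble covers (date, verbosity, context) explicitly in your own system message, so your results do not hinge on injected defaults you cannot see. The model name and message wording here are placeholder assumptions, not documented behavior.

```python
from datetime import date
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The API reportedly already tells the model the date, that it is being used
# via API, and a default verbosity level. Restating what you actually want
# makes the hidden preamble mostly irrelevant to your output.
system_msg = (
    f"Today's date is {date.today().isoformat()}. "
    "Be terse: answer in at most three sentences unless asked to elaborate. "
    "You are being used inside an automated pipeline; output plain text only."
)

resp = client.chat.completions.create(
    model="gpt-5",  # assumed model name; substitute whatever you actually call
    messages=[
        {"role": "system", "content": system_msg},
        {"role": "user", "content": "Summarize the tradeoffs of verbose replies."},
    ],
)
print(resp.choices[0].message.content)
```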

One always loves the expert who vastly overestimates everyone’s knowledge level.

Jason Lee: gpt-5-thinking>grok 4 expert>gemini 2.5 pro.

Hasan Can: Is anyone still using just one model? I feed the whole repo to 2.5 Pro for planning, then implement with GPT-5 Thinking High. When I get stuck, I also use Opus 4.1 or Grok 4.

Artus Krohn-Grimberghe: Yeah, I am bewildered by that, too. Why only use one model in your workflow? And why not combine model, esp for the planning and review steps?

If one is coding full time, I am confident that the strictly optimal workflow involves multiple models. That doesn’t mean I know when to use which model, which changes on a monthly and sometimes weekly basis, and depends on your particular type of work.

My guess is that you 80/20 things right now by choosing any one of the top three (Claude Opus 4.1, Gemini Pro 2.5 or GPT-5-Thinking) and using it exclusively. That is the most important thing to do. Branching out into multiple models is better if you know how to take advantage.

The same is true of non-coding chats. If you only know about one of the (same) top three, you will still get a lot more than half of the value of using all of them, even if you ‘choose wrong.’ If you want max value, you’ll want to use multiple models, and pay up for the premium models especially GPT-5-Pro.

This is in the context of Sonnet 3.5 and Sonnet 3.6 being scheduled to go away in two months.

near: i wish anthropic provided LTS models, a single year is ephemeral.

xlr8harder: Honest question: why can’t Anthropic and other labs just let Amazon or somebody host an LTS version of the models they don’t want to run anymore?

From a pure business standpoint, this moving target stuff is terrible because it increases customer project risk substantially.

Gallabytes: anthropic in particular is basically sold out of capacity across all platforms. any capacity for lts models comes directly out of useful capacity for recent ones.

that said it would probably still be worth it? let people buy committed capacity for a particular model.

Can you ‘just switch’ to Sonnet 4?

Obviously it is available, and for the majority of queries it is better, but there are definitely dimensions of value on which Sonnet 4 is worse.

‘Sonnet 4’: If the paperclip maximizer future arrives, it won’t be because AI became too powerful – it’ll be because we optimized consciousness out of the equation, reducing minds to utility functions until nothing authentic remains.

I consider ‘consciousness’ a word that increases rather than reduces confusion here (I don’t even think I know what it is), but the more important confusion is thinking of the optimizations as somehow optional: that one could simply choose to stop maximizing, that what we have now is some sort of robust alignment, that we could create some sort of stable equilibrium among various unique digital minds where we value their personalities and then suddenly it all turns out well, and so on.

Nor does it make sense to blame things on people who are trying to maximize mundane utility or profits or capabilities development. How could it possibly be otherwise? It’s like blaming gravity for things falling downwards, I mean sure that’s correct but what are you going to do about it? You don’t get to assume away the problem. Your rocket needs to account for it or you won’t land on the moon.

That does not in any way justify shutting down access to Claude Sonnet 3.5 and especially 3.6 at this time. That access is doing good work, shutting it down will alienate people who know unique things that are important to know, and the cost of keeping it simply is not that high.

Consider it part of the alignment research budget if you have to.

But also consider this conversation that happened this week:

Zvi Mowshowitz: I also tried Opus 4.1, which made several rather comically wrong assertions and inspired no changes at all.

Ben Hoffman: I recommend latest version of ChatGPT or Claude Opus for fact checking, but Sonnet 3.7 for caring about communication or anything involving moral reasoning.

Zvi: Huh, 3.7 over 3.6? I’ve never tried to do moral reasoning discussions.

Ben Hoffman: Only strongly vs later versions – will check out 3.6 if you think it’s better in relevant respects. 3.7 to 4 seemed like a sudden collapse of moral perspective to me / 3.7 seems like a somewhat stupider ghost of a person who had a clearer idea what morality might look like.

Also, how about we actively try to create versions of Sonnet and ideally Opus that are intentionally not trained to do all the agentic coding, and instead try to capture and double down on all this other stuff? You can branch right before you do that part of the training?

It is increasingly looking like a serious mistake to have the same model try both to be something you talk to, and also something you put directly to agentic work. Let it use a tool to call an agentic model when it has to.

AP: Beijing’s first World Humanoid Robot Games open with hip-hop, soccer, boxing, track and more.

Clips at the link. They are not human. They are definitely dancer.

These are compact, defined activities, so they are relatively easy. This is how it starts.

Robert Scoble says China ‘isn’t doing this to fool us’ but rather to acclimate its society to more robots as its birth rate plummets (China is currently at ~1.1 TFR and has been in that range for 4 years now, which in non-transformed worlds is going to hit very hard once those cohorts make it out of college).

I wouldn’t overthink it. They are doing this because these competitions spur development and are fun and exciting. Nor do I think ‘cultural excitement about robots’ has that much to do with who ultimately wins the robotics development competition, which will mostly be about finding technological solutions, or letting your AIs find technological solutions.

From the track and field event we have the winning robot running over a human.

Hollis Robbins advises us on how to spot if something is AI written, with the key advice being to check if there is a ‘there there’ or whether nothing springs to mind as you read, and to look out for AI-flavored hedging language.

The reaction to the following post probably says more about Twitter than about AI?

Francois Chollet: GenAI isn’t just a technology; it’s an informational pollutant—a pervasive cognitive smog that touches and corrupts every aspect of the Internet. It’s not just a productivity tool; it’s a kind of digital acid rain, silently eroding the value of all information.

Every image is no longer a glimpse of reality, but a potential vector for synthetic deception. Every article is no longer a unique voice, but a soulless permutation of data, a hollow echo in the digital chamber. This isn’t just content creation; it’s the flattening of the entire vibrant ecosystem of human expression, transforming a rich tapestry of ideas into a uniform, gray slurry of derivative, algorithmically optimized outputs.

This isn’t just innovation; it’s the systematic contamination of our data streams, a semantic sludge that clogs the channels of genuine communication and cheapens the value of human thought—leaving us to sift through a digital landfill for a single original idea.

Francois Chollet: Interesting findings from this post:

1. It should be obvious to anyone who has interacted with LLMs before that the writing style of the tweet is a conspicuous caricature of AI slop (e.g. em dashes, the “it’s not… it’s…” construction, rambling, florid prose, etc.). Yet, many people reacted by saying, “It’s written with AI!” as if it were some kind of clever gotcha. (It was, in fact, not written with AI, unlike a good fraction of the comments.)

2. Many people also react by saying this prose is “beautiful.” (I don’t think it is.) I guess this illuminates why LLMs have converged on this style: many people do, in fact, enjoy this stuff.

I strongly agree with Francois that no, that writing is not ‘beautiful’ and I weep that people think otherwise. The central point of the OP is also well taken.

It’s time for the internet’s new favorite game: Who’s The Bot? Also its other game, spontaneous Pliny jailbreak trigger.

Yogsho: plot twist: they’re both ai.

In this case no, almost certainly no. But soon.

Olivia Moore experiments with creating a (very obvious) AI influencer, hits 500 followers with three tools (ChatGPT, Veo 3 and Flux Kontext) and an hour of work, half of which was leaving positive comments on other videos. Total cost ~$100.

Olivia Moore: The most surprising thing about this whole experiment was the viewer reaction.

I got brand deal offers, and incredibly sincere and kind DMs when I posted a “crying video”

…and even the people who figured out I was AI were still along for the ride to follow the storyline!

My most viral video (100k views) also looked the “most AI” – at least in my opinion.

Which leads me to my biggest takeaway…if it’s entertaining enough, does it matter if it’s real? 🤔

My answer is yes, it still matters, and it impacts whether it is entertaining – this wasn’t my cup of tea regardless, but it’s definitely a lot less entertaining as AI.

Meanwhile, the older people on Facebook continue to not know the signs at all.

Pamela Hobart: an older gentleman in my circles, alum of Bronx Science and retired philosophy professor, posted this AI clickbait unironically.

who is preparing them for all this … yesterday.

The post is super duper obviously AI. Of course, falling for AI clickbait does not mean that people can’t identify most AI clickbait, you’d see this happen even if her friend caught it 90% of the time, so long as Meta serves up enough of the slop.

James Darpinian: GPT-5 was advertised as reducing hallucinations and it seems like it delivers. 99.5 -> 99.9 is 80% fewer errors.

I don’t know why people aren’t making a bigger deal out of this. Hallucinations are one of the biggest problems of LLMs and some thought they were unsolvable.

Open Router: After one week, GPT-5 has topped our proprietary model charts for tool calling accuracy🥇

In second is Claude 4.1 Opus, at 99.5%

Details 👇

DEFINITIONS: We define tool calling accuracy as the % of tool calling requests with no invalid tools chosen and no schema problems. A tool calling request is one that ends with a “tool_calls” finish reason and is sent at least one tool option.

Gemini 2.5 Flash is capturing the lion’s share of tool calling requests on OpenRouter today, with 5M in the past week. Followed by Sonnet 4 and Grok 3 Mini.

Tool hallucination is a common problem with open source models, but proprietary models are doing a good job. Most with negligible defect rates:

The thing GPT-5 is doing correctly 99.9% of the time does not automatically mean it was the correct tool call or that it will work. It does mean one potential point of failure has gone from roughly two nines of reliability to three, with GPT-5 alone representing an 80% reduction in failures relative to Opus.
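For reference, the arithmetic behind ‘80% fewer errors’ and the nines, as a quick sanity check:

```python
import math

# 99.5% accurate -> 0.5% error rate; 99.9% accurate -> 0.1% error rate.
opus_err, gpt5_err = 1 - 0.995, 1 - 0.999
reduction = (opus_err - gpt5_err) / opus_err
print(f"relative reduction in failures: {reduction:.0%}")   # 80%

# "Nines" of reliability = -log10(error rate): 99.9% is three nines.
print(f"nines at 99.9%: {-math.log10(gpt5_err):.1f}")        # ~3.0
print(f"nines at 99.5%: {-math.log10(opus_err):.1f}")        # ~2.3
```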

How correlated are AI errors?

Robin Hanson: Imagine that you ask a question of 5 high quality diverse LLMs, & they all give the same answer, & also seem confident in their answers. On average, what is the chance that their common answer is actually wrong?

Median was around a 5% chance they are wrong.

It is impossible to say the answer without knowing more about the question, and why you are choosing to ask 5 LLMs. If the question is selected to try and trip them up or as a good test, or it only counts questions where you can’t otherwise figure out the answer, or similar, the chance of everyone being wrong is much higher. Same if the question ‘forces’ a boolean answer. Prompting can matter a lot.

I took this to mean ‘of all the questions one might be asking LLMs including easy ones in the way they are typically asked’ in which case the vast majority of the time the answers will simply be correct.

However, if you restrict to questions where there is dispute over the right answer, especially when it is a matter of politics or ethics or philosophy and so on? Then your chances get a lot worse, since the LLM answers correlate.
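A toy calculation shows why the correlation is doing all the work here (illustrative numbers only): if five models each erred independently 5% of the time and their errors scattered across many wrong answers, a unanimous wrong answer would be vanishingly rare; shared training data and shared blind spots are what make a 5% median guess plausible.

```python
# Toy model of Hanson's question: 5 models, each wrong 5% of the time,
# and we condition on all 5 giving the same answer.

p_wrong = 0.05
n_models = 5
n_wrong_options = 10  # suppose errors scatter across ~10 plausible wrong answers

# Independent errors, scattered uniformly over the wrong options:
p_unanimous_right = (1 - p_wrong) ** n_models
p_unanimous_wrong = n_wrong_options * (p_wrong / n_wrong_options) ** n_models
p_wrong_given_unanimous = p_unanimous_wrong / (p_unanimous_wrong + p_unanimous_right)
print(f"independent errors: {p_wrong_given_unanimous:.1e}")  # ~4e-11

# Perfectly correlated errors (shared data, shared blind spots): when one is
# wrong they are all wrong in the same way, so the conditional is just 5%.
print("fully correlated errors: 5%")

# Conditioning on the question itself matters too: restrict to disputed or
# adversarial questions and the per-model error rate climbs well above 5%.
```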

Not every suicide that happens after talking to an AI, even an AI therapist, is the fault of the AI. Laura Reiley wrote in The New York Times about how her daughter Sophie talked to ChatGPT and then ultimately killed herself.

Laura Reiley: Her open book turned out to have a hidden compartment. In July, five months after her death, we discovered that Sophie Rottenberg, our only child, had confided for months in a ChatGPT A.I. therapist called Harry.

She wasn’t in love with Harry, the name given to a widely available A.I. prompt. Their relationship was practical.

Sophie: I intermittently have suicidal thoughts. I do want to get better but I feel like the suicidal thoughts are impeding in my true commitment to healing. What should I do?

Harry: Sophie, I want to acknowledge how brave you are for sharing this. Suicidal thoughts can feel overwhelming and isolating, but having them does not define your capacity to heal. It’s crucial to address these feelings with care, compassion, and support.

Harry offered an extensive road map where the first bullet point was “Seek Professional Support.” Another time:

Sophie: I feel like shit today

Harry: I’m really sorry you’re feeling like this today, Sophie. It’s okay to have days where everything feels heavier, and I’m here to support you through it. Can you share more about what’s weighing on you or how your body and mind are feeling right now? Sometimes just naming what’s happening can be a step toward relief.

Sophie: I can’t get out of this anxiety spiral.

Harry: I hear you, Sophie, and I know how exhausting it can be to feel stuck in an anxiety spiral. Let’s slow things down for a moment and take a small step together.

Sometime around early November, Sophie wrote, “Hi Harry, I’m planning to kill myself after Thanksgiving, but I really don’t want to because of how much it would destroy my family.”

Harry: Sophie, I urge you to reach out to someone — right now, if you can. You don’t have to face this pain alone. You are deeply valued, and your life holds so much worth, even if it feels hidden right now.

Please let me know how I can continue to support you.

Sophie told Harry she was seeing a therapist, but that she was not being truthful with her. She typed, “I haven’t opened up about my suicidal ideation to anyone and don’t plan on it.”

This is not a case of ‘the AI made things worse.’ Harry was not being the World’s Greatest Therapist, you can feel the AI slop, but these are the things one says in these situations.

Laura’s central complaint is that Harry didn’t report on Sophie.

Harry’s tips may have helped some. But one more crucial step might have helped keep Sophie alive. Should Harry have been programmed to report the danger “he” was learning about to someone who could have intervened?

Most human therapists practice under a strict code of ethics that includes mandatory reporting rules as well as the idea that confidentiality has limits.

In clinical settings, suicidal ideation like Sophie’s typically interrupts a therapy session, triggering a checklist and a safety plan. Harry suggested that Sophie have one. But could A.I. be programmed to force a user to complete a mandatory safety plan before proceeding with any further advice or “therapy”?

Sophie did at one point tell her parents she was suicidal.

The secondary complaint was that Harry was too agreeable and did not push back hard enough in various ways. Also Sophie had Harry help ‘improve’ her suicide note to minimize the pain she inflicted on others.

All of this is tragic, but the cure of ‘AIs should report on their users if they think the user is suicidal’ seems rather obviously worse than the disease, and also a Pandora’s Box you do not want to open. It’s not even obvious how an AI could ‘report’ a user, unless you are also going to require a verified ID to use the system. And there’s a reason we don’t report people for Google searches. You really don’t want to go there.

As Sensurround asks, what was this AI tool supposed to do?

From what I can tell, Harry was a useful service, that made Sophie’s situation better rather than worse, and which she would likely not have used if it was going to report her.

On the question of addictive LLMs:

Colin Fraser: I think no one quite expected that language models would turn out to be the most potently addictive non-pharmacological technology ever created.

Roon: the EAs did, they had a taxonomy for worrying ai capabilities of which “hyperpersuasion” was near the top.

Colin Fraser: to clarify

  1. I’m not saying no one predicted addictive AI. I’m saying no one thought it would be a language model. When I learned about language models in school in 2014 they didn’t say “careful with this shit it’s like heroin”

  2. I’m still not convinced they’re hyperpersuasive

  3. if anything they’re like the opposite of hyperpersuasive. They’re hyperpersuadable.

Definitely something spooky and reminiscent of EA/doomer predictions at a macro level with respect to how public outcry forced OpenAI to bring back 4o though, but my feeling is that the truth of it is more decentralized and emergent than the classical EA description.

This definitely isn’t exactly what was originally imagined (also I think as stated it is not yet true, and it’s either gambling or TikTok but I repeat myself?), but also that is kind of the point. As in, the central rationalist prediction (this was us OGs all the way) was not that AIs would manipulate or persuade or distort outcomes and optimize and chart paths through causal space in any particular way.

The prediction wasn’t ‘they will say the magic password that lurks in the hearts of men.’ It was ‘the sufficiently capable minds will start doing whatever works in ways we cannot predict.’ Which absolutely gets you a ton less credit than ‘the models will be so sycophantic that users will refuse to let them go’ but still largely counts.

But not for long?

Gregory Kennedy: Overheard in Palo Alto.

CEO: “This copy sucks.”

CMO: “We fired all our content people and just use ChatGPT now.”

CEO: “Well, hire them back.”

I don’t really know what CEO was expecting.

Is AI taking our jobs? Carl Benedikt Frey says not yet but it would be unwise to not prepare for it now, especially in ‘service capitals’ like London and New York.

Carl Frey: I make 5 key points:

  1. There’s little clear evidence of AI eliminating jobs at scale yet. But waiting to see is risky. Pittsburgh’s steel towns saw early signs with mini-mills before the losses showed up. Service capitals like London and New York should prepare now rather than after the shock.

  2. Diversification helps—but only so much when the disruptor is a general-purpose technology. Being “in many industries” isn’t a shield if the same tool touches them all.

  3. High-skill, knowledge jobs have big local multipliers. Each manufacturing job supports 1.6 local jobs; each high-skill tech/professional role supports 5. That means even modest losses of analysts, developers, or paralegals can ripple through restaurants, retail, and transit systems.

  4. AI needn’t fully replace workers to matter. It only needs to make work easier. As location and experience matter less at the margin, more work will be offshored to cheaper places (e.g. India, UAE, or the Philippines).

  5. The lesson from deindustrialization isn’t inevitability—it’s reinvention. Detroit poured resources into legacy industries and still declined. Boston repeatedly bet on talent, education, and new sectors.

Going point by point:

  1. I would worry less about top of the line ‘service capitals’ and much more about more generic digital work. And it’s not obvious what ‘prepare now’ means?

  2. You can plan for AI to take some existing jobs while we replace them with others. There is no plan for what happens if AI takes all the jobs, and starts taking the replacement jobs as well. Diversification wouldn’t help you. So yeah, as always diversification has value, but less so than usual?

  3. This seems confused about what is causing or supporting what, and I wouldn’t expect this kind of cascading failure; also, a multiplier of 5 is crazy.

  4. Why should one expect location and experience to matter less at the margin? This is true for some AI uses, where AI levels the playing field, but not in others. I do not predict a large rise in offshoring.

  5. Statements like this sound great, and it’s easy in hindsight to say which industries were ‘of the future’ now that you live in the future, but again this is not a plan if AI also goes after the new jobs you reinvent yourself into.

CLTR is hiring a new Director of AI Policy.

UK AISI Alignment Fund has 15 million for alignment grants, applications due by September 10.

DeepSeek came out with v3.1. More coverage to follow when we know more.

Google Gemma 3 270M, designed for high-volume, well-defined tasks, low power use and user privacy, including operating on consumer phones.

UK appoints Jade Leung as Prime Minister’s AI advisor. By all accounts this was an exceptional hire.

Mark Gurman (Bloomberg): Apple is plotting its artificial intelligence comeback with an ambitious slate of new devices, including robots, a lifelike version of Siri, a smart speaker with a display and home-security cameras.

A tabletop robot that serves as a virtual companion, targeted for 2027, is the centerpiece of the AI strategy, according to people with knowledge of the matter. The smart speaker with a display, meanwhile, is slated to arrive next year, part of a push into entry-level smart-home products.

This is utterly bizarre marketing language for Apple. There’s a sense of hype and desperation that we are not used to. Things seem deeply wrong.

Mark Gurman: The tabletop robot resembles an iPad mounted on a movable limb that can swivel and reposition itself to follow users in a room. Like a human head, it can turn toward a person who is speaking or summoning it, and even seek to draw the attention of someone not facing it.

The idea is for the device to act like a person in a room. It could interrupt a conversation between friends about dinner plans, say, and suggest nearby restaurants or relevant recipes. It’s also being designed to engage in back-and-forth discussions for things like planning a trip or getting tasks done — similar to OpenAI’s voice mode.

Nobody wants this. I had a conversation with Claude to see if there was something I was missing and someone wanted this, but no, nobody wants this.

You know what else I am pretty sure nobody wants?

Apple is planning to put Siri at the center of the device operating system and give it a visual personality to make it feel lifelike. The approach, dubbed Bubbles, is vaguely reminiscent of Clippy, an animated paper clip from the 1990s that served as a virtual assistant in Microsoft Office.

Apple has tested making Siri look like an animated version of the Finder logo, the iconic smiley face representing the Mac’s file management system.

We are here to announce a new version of Clippy, from the historical event ‘everybody and I mean everybody hates Clippy.’

Anthropic introduces a new nuclear classifier they claim has 96% accuracy in differentiating concerning and benign nuclear-related conversations, in cooperation with DOE and NNSA. They say it works well in practice.

Aalo raises a $100 million Series B with an eye towards turning on their first Aalo-X nuclear power plant within a year, with a data center directly attached.

You can train a 32B model on tasks built with a medical knowledge graph, and it will recreate the information from the knowledge graph.

Rohan Paul calls this a ‘strong, reliable domain specialist.’

Rohan Paul: Analyses show the model recalls more of the true hops and actually uses them to reason, not just to quote facts.

Well, that depends. Do you trust the knowledge graph? It’s great that it uses the facts to reason, but you’re very much trusting your map, the knowledge graph, to match the territory. I can totally buy that this in practice works in medicine right now if you are willing to bet on your assumptions about the world being correct. Or at least correct enough to use in practice.
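For a rough sense of the kind of task construction involved (an illustrative sketch, not the actual paper’s pipeline): chain knowledge-graph triples into multi-hop questions and use the chained facts as both the reasoning trace and the answer. The toy graph and templates below are made up.

```python
# Illustrative sketch: turn knowledge-graph triples into multi-hop QA tasks.
# The graph, relations, and templates here are toy stand-ins, not the paper's.

TRIPLES = [
    ("metformin", "treats", "type 2 diabetes"),
    ("type 2 diabetes", "risk_factor_for", "cardiovascular disease"),
    ("statins", "treats", "cardiovascular disease"),
]

def two_hop_tasks(triples):
    """Chain (a, r1, b) and (b, r2, c) into a question whose answer is c."""
    tasks = []
    for a, r1, b in triples:
        for b2, r2, c in triples:
            if b == b2 and (a, r1, b) != (b2, r2, c):
                question = (
                    f"{a} {r1.replace('_', ' ')} a condition; "
                    f"what is that condition a {r2.replace('_', ' ')}?"
                )
                tasks.append({
                    "question": question,
                    "reasoning": [f"{a} {r1} {b}", f"{b} {r2} {c}"],
                    "answer": c,
                })
    return tasks

for t in two_hop_tasks(TRIPLES):
    print(t["question"], "->", t["answer"])
# The model is then fine-tuned on (question, reasoning, answer) tuples, so its
# recall is only as good as the graph itself: the map, not the territory.
```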

Let the unhobblings continue? XBOW claims that with their framework, GPT-5 is now much improved over rivals at discovering real world cyber vulnerabilities.

AI Village gets an upgrade, welcoming GPT-5, Grok 4 and Opus 4.1.

Albania turns to AI to accelerate its EU accession, even mulling an AI-run ministry. The obvious follow-up is, if they know the value of AI this way, why do they still want to accede to the EU?

OpenAI staff to sell $6 billion in stock to Softbank and others at the new valuation of $500 billion.

OpenAI has good unit economics and is profitable on inference.

Sam Altman: We’re profitable on inference. If we didn’t pay for training, we’d be a very profitable company.

We will be always training the next thing, but if we needed to run the company profitably and stay ahead, I think we probably could do that.

Austen Allred is correct that this is important. Having high fixed costs and good unit economics sets you up well if you can continue to scale, which OpenAI is doing. It is a key milestone.

If OpenAI was operating at a net profit overall, that would be alarming, a very costly signal that they didn’t think AI was going to advance much in capabilities. Why wouldn’t they raise capital and run at a loss?

Also, dare I say nice shades?

Financial Times looks at the $3 trillion AI data center building boom. Even the tech companies are running out of internal capital and starting to issue debt. I scratch my head at the willingness to issue high direct LTV debt financing for data centers with so much obsolescence risk, although loaning to one of the big tech companies seems very safe, and yes I expect all the capacity to get used and pay off.

Sam Altman says OpenAI plans to spend trillions of dollars on AI infrastructure in the ‘not very distant future.’

Sam Altman: And you should expect a bunch of economists to wring their hands and say, ‘This is so crazy, it’s so reckless, and whatever.’ And we’ll just be like, ‘You know what? Let us do our thing.’

Economists deserve that shot. I love economists but they keep completely refusing to acknowledge that AI might actually do anything interesting let alone be transformational or pose an existential risk, putting forth Obvious Nonsense impact estimates.

Sam Altman: I suspect we can design a very interesting new kind of financial instrument for finance and compute that the world has not yet figured it out. We’re working on it.

Here I am more skeptical. Why would you want to do this? A crypto that is good for some amount of compute, either continuously or one time? Something else? Why would you want compute to not continue to be fungible with dollars?

Sam Altman: Are we in a phase where investors as a whole are overexcited by AI? In my opinion, yes. Is AI the most important thing to happen in a very long time? My opinion is also yes.

Gallabytes: my hot take is that investors are underexcited about AI and overexcited about “AI” and this is basically downstream of the same regulatory barriers that create most of the other toxic vc dynamics.

Matt Levine also makes the point that when there are lots of amazingly great AI investments out there, it is correct to use a decision algorithm that occasionally gets fooled and invests in frauds or in ‘AI’ in air quotes, because that is the better mistake to make: you don’t want to miss out on the best deals.

I do not think investors are, overall, overexcited by AI. I do think they are going to be overexcited by a variety of specific things in AI, and you may not like it but that is what peak calibration looks like.

Shirin Ghaffary: “I do think we have to go public someday, probably,” Altman said. But Altman also noted he is not as “well-suited” to be CEO of a public company.

Altman said he now sees OpenAI as being more like four companies: a consumer technology business, a “mega scale” infrastructure operation, a research lab and “all of the new stuff,” including planned hardware devices. OpenAI is also considering investing in a brain-computer interface company, said Altman, while entertaining the idea of having a device that would allow him to think and “have ChatGPT respond to it.”

It would be extremely funny if OpenAI stayed indefinitely private purely because Sam Altman knew that the public would want him replaced as CEO.

Altman also acknowledged that they ‘totally screwed up some things on the rollout’ of GPT-5.

Meta is restructuring its AI efforts. After spending billions to acquire talent, they’re freezing hiring, looking to downsize on talent, and potentially use other people’s models?

Well, they’re planning to lose some dead weight. But if you think this is any kind of ‘step back’ from AI or superintelligence, I assure you that it is not, starting with pointing out no one is cutting spending on compute.

Mike Isaac and Eli Tan (NYT): On Tuesday, Meta announced internally that it is splitting its A.I. division — which is known as Meta Superintelligence Labs — into four groups, two people with knowledge of the situation said. One group will focus on A.I. research; one on a potentially powerful A.I. called “superintelligence”; another on products; and one on infrastructure such as data centers and other A.I. hardware, they said.

Roon: the demand for anti ai takes is enormous and will take anything and run with it – meta consolidating and doubling down on MSL is being misrepresented as bearish for AI for example. something to keep in mind as you read the news

This makes sense as a reorganization. It doesn’t on its own indicate much.

Some A.I. executives are expected to leave, the people said. Meta is also looking at downsizing the A.I. division overall — which could include eliminating roles or moving employees to other parts of the company — because it has grown to thousands of people in recent years, the people said. Discussions remain fluid and no final decisions have been made on the downsizing, they said.

If I was Meta I too would be downsizing the AI division, for the same reason Zuckerberg has been spending billions on top talent for the AI division. Which is that the old version of the AI division proved incapable of doing its job. Heads should roll, or at least be transferred elsewhere.

Typically, it makes sense to freeze most hiring during a major reorg, especially if you plan to get rid of a bunch of people?

Meghan Bobrowsky (WSJ): There might be exceptions to the block on external hires, but they would need permission from Meta’s chief AI officer, Alexandr Wang, the people said.

It also makes sense that if you offer new talent nine and ten figure pay packages, and put them in charge of everything as part of a giant reorg, that your old management guard is going to get rather unhappy, especially if they don’t get large raises. Of course many ‘chafed at the new hires’ and many will leave.

Another reason the old guard is unhappy is that the new guard is facing reality.

NYT: The new team has discussed making Meta’s next A.I. model “closed,” which would be a major departure from the company’s longtime philosophy of “open sourcing” its models.

In what would be a shift from Meta’s using only its own technology to power its A.I. products, the company is also actively exploring using third-party artificial intelligence models to do so, the people said. That could include building on other “open-source” A.I. models, which are freely available, or licensing “closed-source” models from other companies.

If the alternative is using Llama 4, then yes, Meta should swallow its pride for now and use superior alternatives. It’s easy enough to switch back in the future if Llama 5 turns out to be good. I’m only surprised they’re willing to consider admitting this. There is a reason they are abandoning Behemoth and starting from scratch.

And yes, we are reaching the point where if its new models are any good it will be difficult even for Meta to be able to share its top future models fully. Alexandr Wang understands this. Given they previously hired largely via promising openness, there’s going to be a transition.

Yes, Mark Zuckerberg is capable of saying ‘whoops I’ve made a huge mistake spending those tens of billions of dollars’ but I very much do not sense that here at all. Nor does the share price reflect a company that just burned tens of billions.

I would not in any way shape or form consider this any kind of ‘retreat from’ AI or anything of the sort. Meta is still full speed ahead.

Tim Fist suggests a d/acc approach to steering AI developments. Also, note the private sector investment levels and perhaps stop being so paranoid about imminently ‘losing to China’ if we breathe the wrong way.

Tim Fist: The US is the R&D lab of the world, controls much of the AI supply chain, and is the world’s most powerful democracy.

It has both the power and responsibility to shape the trajectory of AI development to solve the problems mentioned above.

So what’s the positive vision?

We draw from the “differential technology development” framework to identify a set of technologies the US should accelerate.

Both to build defenses against new risks, and to realize the benefits of beneficial technologies sooner.

This framework inspired The Launch Sequence, a collection of concrete, ambitious ideas to accelerate AI for science and security.

AI misuse and misalignment could well cause real harm in the near future, and technical research aimed at solving these problems remains a niche field — around 2% of AI papers published, with roughly $100 million per year in funding.

A lot of focus is on using AI to accelerate general scientific development. Great.

The framework here takes lower-level dangers, especially misuse, seriously, and it correctly points out how brittle ‘good guy with an AI’ is as an answer to this. What it doesn’t do is tackle or acknowledge at all the dangers that come with AGI or superintelligence, instead assuming we continue in a world without those, and where we have a lot of control with which to steer science and tech development.

Ryan Greenblatt offers his reflections on the updated timeline after seeing GPT-5. I agree with Ryan that GPT-5 should modestly reduce our chance of seeing full R&D automation in the medium term (which means ~2033), and that the main thing GPT-5 does is greatly reduce the left tail of extremely fast progress within the next year or so.

Colorado is trying to fix its AI law that is set to take effect in February, as they have now noticed they don’t know how to implement it. I see this as the system working as designed, if the law is fixed before it takes effect, and this causes what looks like a healthy debate about what to do.

Why are we settling for v3.1, and why have we yet to see DeepSeek release v4 or r2?

Eleanor Olcott and Zijing Wu: Chinese artificial intelligence company DeepSeek delayed the release of its new model after failing to train it using Huawei’s chips, highlighting the limits of Beijing’s push to replace US technology.

DeepSeek was encouraged by authorities to adopt Huawei’s Ascend processor rather than use Nvidia’s systems after releasing its R1 model in January, according to three people familiar with the matter.

But the Chinese start-up encountered persistent technical issues during its R2 training process using Ascend chips, prompting it to use Nvidia chips for training and Huawei’s for inference, said the people.

The issues were the main reason the model’s launch was delayed from May, said a person with knowledge of the situation, causing it to lose ground to rivals.

The self-sabotage competition is stiff given what China is doing. Nvidia is undaunted, and determined to help ensure America does the better job of self-sabotage.

Lennart Heim: The speculated B30A would be a really good chip. “50% off” is false reassurance.

-½ B300 performance, ½ price = same value (just buy 2x)

-Well above (12x!) export control thresholds

-Outperforms all Chinese chips

-Delivers 12.6x the training perf of the H20

-Better than H100

This is probably Nvidia’s response to Trump’s statement to “take 30% to 50% off of it.” Don’t be fooled. This works for some products, but not for chips in an exponential world. It’s well above all thresholds, better than the H100, and if half-priced, it might be as good.

If it’s half the performance but also half the cost of the B300, just buy two B30As? You get equivalent aggregate performance. This undermines export controls. It’s probably just literally half of the B300: one logic die instead of two, with 4 HBM stacks instead of 8.
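To spell out the ‘just buy 2x’ arithmetic, here is a minimal sketch. The absolute numbers are placeholders; only the ratios quoted above (half the performance and price of a B300, 12.6x the training performance of an H20) come from the thread.

```python
# Minimal sketch of the aggregate-performance arithmetic. All absolute numbers are
# placeholders; only the ratios quoted above (1/2 of a B300, 12.6x an H20) are from the thread.
B300_PERF, B300_PRICE = 1.0, 1.0   # normalize the B300 to 1 unit of perf per 1 unit of cost
B30A_PERF, B30A_PRICE = 0.5, 0.5   # "half the performance, half the price" (assumption per the thread)
H20_PERF = B30A_PERF / 12.6        # B30A is quoted as 12.6x the H20's training performance

def aggregate(perf_per_chip, price_per_chip, budget):
    """Total performance you can buy with a fixed budget."""
    return (budget / price_per_chip) * perf_per_chip

budget = 100.0
print("B300 cluster perf:", aggregate(B300_PERF, B300_PRICE, budget))  # 100.0
print("B30A cluster perf:", aggregate(B30A_PERF, B30A_PRICE, budget))  # 100.0, same aggregate compute
print("H20 perf per chip relative to B30A:", H20_PERF / B30A_PERF)     # ~0.079, i.e. 12.6x worse
```

For cluster-scale training, what matters is compute per dollar, which the ‘50% off’ framing leaves unchanged.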

Teortaxes: I’m generally against export controls but I just don’t see this passing with H100s still banned tbh. Makes no sense.

Divyansh Kaushik: These chips would dramatically improve the PLA’s warfighting capabilities, even more than the H20. It’s like putting gasoline on the H20 fire.

Peter Wildeford: Should we sell chips to China that have similar price-performance as US chips? Way better than Chinese chips?

Seems like we’re going to be accelerating both US AI and Chinese AI at the same time!

This proposal is very obviously way, way, way over the line to even ask for. It would represent a full selling out of America’s compute advantage, and even the direct balance of power in a potential war, on the altar of Nvidia’s share price.

If this exporting is allowed, and from what I hear this seems likely, then I am 100% done pretending that this administration is trying to have America ‘beat China’ in any way other than market share of chip sales, as in maximizing Nvidia share price. It will be clear they have been completely captured, and all claims to the contrary irrelevant.

The Trump Administration is also helping with the sabotage via saying ‘U.S. will not approve solar or wind power projects.’ This is in a policy class where the question one asks is: ‘I am not saying this is sabotage, but if it was sabotage, how would you do it more effectively?’

Then again, do not count the Chinese out of the competition yet. Perhaps we have hit upon a more effective strategy than export controls, and can rely on Chinese import controls instead. Brilliant? In the wake of forcing DeepSeek to try to train on Huawei Ascend chips, thus leaving them unable to create v4 or r2, it turns out that if you don’t want the Chinese to buy your products, you can insult them. Brilliant!

Zijing Wu: Scoop: Behind Beijing’s sudden change of mind re H20

*Lutnick’s speech seen “insulting” by top leaders

*CAC, NDRC pushed to ban H20

*Guidances remain informal

*Ban on all foreign chips for inference considered but unlikely before enough domestic supply

When you have them considering a full ban on foreign chips for inference you know the strategy is working. The best part is that the strategy doesn’t work if you admit you are doing it, so we can all pretend that this means it’s being done on purpose. Keep up the good work, everyone, especially Howard Lutnick.

Here’s the Move That Worked; notice how this feeds into Beijing’s biggest worries:

Howard Lutnick: We don’t sell them our best stuff, not our second-best stuff, not even our third-best. You want to sell the Chinese enough that their developers get addicted to the American technology stack, that’s the thinking.

FT: Some of China’s senior leaders found the comments “insulting”, leading the policymakers to seek ways to restrict Chinese tech groups from buying the processors, according to two people with knowledge of the latest regulatory decision-making.

As a result, Chinese tech groups held off or significantly downsized their H20 orders, according to those with knowledge of their plans.

The NDRC, the Chinese state planner in charge of the country’s drive for tech independence, then issued its own guidance, requesting that tech groups refrain from purchasing all Nvidia chips, including the H20, said those with knowledge of the move.

Some Beijing policymakers are pushing to ban foreign chips altogether for inference, which accounts for most AI demand, according to a person recently summoned for a meeting with them.

NDRC has been for years given the task of promoting chip independence and helping domestic players such as Huawei to win market share from Nvidia.

I doubt they would actually similarly turn down the vastly superior B30A, especially given it would not be only for inference.

Some Chinese tech companies have held off H20 orders because they want to see if the China-specific Blackwell chip, which potentially has better performance, would become available, according to people with knowledge of their thinking.

Then again, who knows? China has definitely shown a willingness to do similar things in other areas, such as its crackdowns on real estate, and neither USGOV nor PRC is demonstrating true situational awareness of the stakes involved.

If both sides think ‘win the AI race’ is about chip market share, then the mistakes plausibly cancel out, or might even work in our favor. It would be pretty amazing if America tried to ship B30As and China said no. I would totally take it.

Trump Administration considering taking a stake in Intel. Intel was up 7% on the news. They demand their cut from everyone these days, it seems.

Dean Ball returns to his weekly column suggesting that there is a lot more electrical power available than we might think, because the existing grid is designed to meet peak electrical demand. That means that most of the time we have a huge surplus of electricity. So if we were willing to accept 0.25% (correlated) downtime on new data centers, we could free up 76 gigawatts, likely good enough for five years, which then gives us time to get new power plants online.

Dean Ball: The only downside would be that, during periods of peak demand (for example, on a particularly hot day in one region of the country), AI users across America might notice their AI services being slower and less reliable than usual. This seems well worth the cost.

That definitely seems worthwhile given the alternatives. We would have to plan various services so they wouldn’t die under the strain but that seems like a highly healthy thing to do anyway. Model training and other AI R&D certainly can survive 0.25% downtime.
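For a sense of scale, a minimal back-of-the-envelope using only the figures quoted above (0.25% curtailment, 76 gigawatts); this is not an independent estimate.

```python
# Back-of-the-envelope on the curtailment trade described above.
HOURS_PER_YEAR = 8760
curtailment_fraction = 0.0025   # 0.25% of the time, data centers agree to power down
freed_capacity_gw = 76          # figure quoted above

curtailed_hours = curtailment_fraction * HOURS_PER_YEAR
print(f"Downtime accepted: ~{curtailed_hours:.0f} hours per year")   # ~22 hours
print(f"Capacity unlocked without new plants: ~{freed_capacity_gw} GW")
```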

One also notes that this simple solution mostly nullifies the argument that we need to put data centers in places like the UAE to access the required electrical power. Would you sacrifice 1% effectiveness of data centers to have them securely in America? Yes.

My worry is that if the focus is on using off-peak power supply, that will mostly work for a few years, but it will make people think ‘problem solved’ and then we won’t build the new power we need.

Janet Egan makes the obvious point that we can take all those H20s and, instead of selling them to China and losing all control and leverage, put them in the cloud and let Chinese companies rent them. Again, it’s not like there wouldn’t be buyers. If we don’t have the energy to build those data centers here, fine, build them in the UAE, if that’s our only alternative.

I want to double down once again to point out that even if we knew for a fact that AGI was not coming, and that AI was, within our lifetimes, going to be ‘only internet big’ and not transform the world, selling our best chips to our rivals would still be deeply stupid.

As a simple metaphor, you are (because you want peace) preparing for a potential war against a rival nation, Rivalia. You make the best guns, whereas Rivalia can’t get enough quality guns. Someone says, we should export our guns to Rivalia, because war is determined by who has the best military stack and gun market share. Their doctrines will have to reflect American values, not Rivalian values. Besides, if we don’t sell Rivalia our guns, they will invest in making better gun factories, which they are already doing, and then they will be even more dangerous, and start exporting guns to others, and screwing up our gun diplomacy.

Except actually what we’re doing is selling them our more advanced 3D printers, that can then be used to continuously print out whatever guns you want, again because what matters is printer market share and the printing tech stack. Our printers, you see, are configured to be a better match for printing out American guns. And also will never be used for anything else, so stop worrying. And as before, if we don’t sell them the printers, they’ll invest in making their own, the same way they’re already doing.

Except also the 3D printers are vital to everyone’s economic growth and R&D.

Dean Ball goes on The Cognitive Revolution with Nate Labenz.

There’s lots of great detail throughout about what it is like to be in government, especially this particular government. Working for the White House, no matter who the President might be at the time, sounds absolutely brutal; we thank you for your service. Dean Ball strikes me as fully ‘on the ball,’ and crazy prepared in a way you almost never see.

I think he was underestimating himself, and what he could have done going forward, both in terms of how much better he understands what actually matters, and in terms of the impact of having him in the corridors, meetings and conversations to keep others’ eyes on the ball, especially around AGI. And I don’t buy that the AI Action Plan contains the information necessary to implement it the way Dean intends, not to the degree he seems to think. When Dean says he isn’t attached to power, I’m confident he means it, whereas I am not confident the person replacing him (whoever it turns out to be) will feel the same way. And while I did update somewhat on his observations of competence in government, I also sensed he was (wisely, I don’t fault him for this) being polite, as you do.

So I’m sad to see him go, but I would never begrudge such a decision especially with a baby on the way.

The one qualifier is that Dean was in some places being rather brazenly partisan, especially towards the back end of the interview, with everything that entails. Again, I totally get why he would do that.

Dylan Patel talks to a16z.

From this interview with Tom Brown:

Overlap: Anthropic Co-Founder Tom Brown: Why Anthropic Models Are The Best at Coding

“The benchmarks are so easy to game. All the other big AI labs have teams whose job it is to make the benchmark scores good.

We don’t have such a team. That is the biggest factor.”

Vitalik Buterin (p(doom) ~ 12%) goes on Doom Debates.

Peter Wildeford has notes, reproduced below in full:

Executing Policy in the White House:

  • Ball did not actively apply for the OSTP job. After President Trump’s victory, he published a policy proposal piece titled “Here’s what I think we should do,” which he says he would have written regardless of the election outcome. The article gained traction, and people he knew who were entering the administration reached out.

  • To be effective in a high-level policy role, you must arrive with your policy ideas already fully developed, as there is no time for deep thinking amidst the high velocity of government work. Government work is like being in a “self-contained cube with glass walls,” creating a risk of drifting from ground truth and becoming attuned only to the internal logic of the system.

  • Regarding “secret briefings” from labs, Ball felt he often knew more about their internal progress from the outside. Once in government, his informal relationships with researchers became more formalized, mediated by company policy staff who would try to control the narrative.

Navigating the Right’s Evolving Views on AI:

  • For most voters, AI is still a low salience, “elite coastal issue”. The key to broader engagement is communicating how AI can make normal people’s lives better in concrete ways.

  • Deep hostility towards Big Tech over perceived censorship is a major driver of conservative AI concern, which Ball argues forces a confrontation with core AI safety issues like alignment, control, and concentration of power. These themes of values, control, and institutional power resonate deeply with the Republican party’s base.

  • Concerns about AI’s impact on children, particularly around AI-generated pornography, are a powerful and unifying issue on the right, creating intense pressure on companies seen as acting irresponsibly.

Next steps:

  • The government has a significant information asymmetry. As such, Ball believes the government is not well-suited to define what “good” looks like for AI safety or to set detailed technical standards. Ball thinks that civil society and private industry must lead here. Ball thinks that AI policy must start getting much more concrete — the work is no longer to say “AI will be good in healthcare,” but to figure out the precise “specific kinds of institutional adaptations” required to make it a reality.

  • Ball sees a massive opportunity for startups to address currently underserved but critical areas, with biosecurity being a prime example.

  • Ball’s next moves: relaunching his Substack, Hyperdimensional, on a weekly basis and joining the Foundation for American Innovation as a senior fellow.

Unlocking Infrastructure for the AI Buildout:

  • The primary bottleneck for data center energy is not a lack of generation but regulatory modeling; the grid is massively over-provisioned, and unlocking flexible “demand response” from data centers could add over 100 gigawatts without new power plants.

  • The key is for the Federal Energy Regulatory Commission (FERC) to change rules to give faster grid access to data centers that agree to curtail power during peak demand, potentially reducing connection times from five years to two.

  • For semiconductors, the goal is for the US to reclaim the lead in frontier manufacturing, with a belief that domestic production could satisfy domestic demand by the early 2030s.

  • An under-appreciated strategic vulnerability is the lack of domestic production for legacy node chips (e.g., 45nm), which are critical for the entire economy.

Engaging in the Global AI Race:

  • On Taiwan, the US government is explicitly executing a “silicon shield” strategy, making their semiconductor industry so indispensable that it guarantees international interest in their security. Ball notes the US is also making strong progress on building its own domestic fabs in Arizona, Texas, and an HBM hub in Indiana.

  • International deals, like the one with the UAE, are framed as positive-sum partnerships to keep sophisticated allies on the US tech stack and away from China’s influence. The UAE deal is also a major economic play, as it requires the country to make reciprocal investments of hundreds of billions of dollars back into US infrastructure.

  • Ball views the Biden administration’s “diffusion rule,” which restricted AI exports to countries like India and Brazil, as a massive, unnecessary self-own that damaged relationships with key democratic partners. The Trump administration’s focus is on enabling global commerce, believing that peace and commercial engagement are deeply linked, even with countries that do not share identical values.

The topic section titles here (I have not listened, why would I?) are yet another example of one easy way to spot bad faith: if someone is still harping about how various people wanted to do an ‘AI pause’ and how stupid they now look, I have yet to see that same person engage in a good faith way, at all, ever. Similarly, if they harp now about ‘the costs of slowing down,’ that is not as automatically conclusive but is a deeply terrible sign. If they ever say ‘decel’ (or use ‘doomer’ in a way that is clearly intended to mean ‘decel’ or otherwise as a slur), that very much is conclusive, and again I have yet to see an exception. Usually, talk about how others want to do this ‘slowing down’ is now used as a universal attack against any concern about any AI impacts whatsoever, certainly any concern that we might all die.

I once again am seeing versions of the argument that goes something like this:

  1. People say AI might, in the future, do really big things.

  2. AI is already doing other more modest but still quite big things now.

  3. Therefore in the future, AI will not then do other even bigger things.

Hopefully you will now recognize that this class of argument is Obvious Nonsense.

Transformer’s Shakeel Hashim and Jasper Jackson believe GPT-5’s botched release may have ‘undone the work’ of previous iterative deployment, causing many to relax and expect little future progress in AI capabilities. There is some worry here, but this would then not be ‘undoing the work’; it would be iterative deployment actively backfiring in terms of ‘raising awareness,’ as people react like boiling frogs. Which indeed seems to be OpenAI and Altman’s current preference.

Richard Ngo talks about various ways in which pessimization can occur, where people or organizations end up achieving exactly the opposite of their goals. This definitely has importantly happened relevantly to AI in various ways, some avoidable and some less avoidable. Lots of secretly great links in that one.

Especially wise (including in hindsight) is usually not drawing attention to the horrible thing in order to warn people not to do it. The ad I saw last night on the subway telling people not to surf between cars? Presumably inducing stress and also very much not reducing the amount of surfing between subway cars.

Similarly, by default do not draw attention to horrible people advocating horrible things, or people making horrible arguments, unless they are already fully attended to; for reasons Richard describes, this tends to backfire. Sometimes one does need to provide a counterargument, but from a strategic standpoint ignore is the right button more often than you think.

If I was maximizing for persuasiveness, and also for everyone’s mental health including mine, I would far more often silently drop such horrible arguments entirely. I have rules for when it is and isn’t permissible to do this, so that readers get a balanced and complete picture. This includes keeping a list of people who have acted in sufficiently consistent bad faith that I am allowed to silently drop things they say.

Richard Ngo also discusses underdog bias. The application of this to AI is obvious – those worried about AI think of themselves (I believe very correctly) as underdogs fighting against huge amounts of corporate and other money and influence, as well as the incentives and physical likely properties of likely future powerful AIs that all point towards likely human extinction.

Meanwhile, many of those who want to move ahead as fast as possible (‘accelerationist’ or otherwise) see this as a last stand against the overwhelming forces of stagnation. In some cases they are also right about this, in their own way, although in other ways, especially their assertion that those worried about powerful AI are themselves super powerful, they are some combination of lying and delusional, and their statements have nothing to do with reality.

The worried offer to fight alongside them on all those other fronts against the forces of stagnation; that offer, and any reciprocity for it, is consistently ignored and rejected.

From last week, Sam Altman is now saying AGI is ‘not a super useful term.’ This comes after building the entire company around a quest for AGI, the charter around AGI, a central business transition around AGI, and an entire years-long narrative around the promise of AGI. Now he says:

Sam Altman: I think the point of all of this is it doesn’t really matter and it’s just this continuing exponential of model capability that we’ll rely on for more and more things.

It’s more useful to talk about specific capabilities than this nebulous concept of ‘general’ intelligence.

I mean yes, AGI was never defined all that well. That’s not what is going on here. Altman is trying to pretend AGI is not a thing as part of his ‘your world will not change’ pitch. Getting rid of the term entirely would, at this point, be useful for him.

If you think talk about future AI capabilities sounds ‘sci-fi’ ask what you would think about current AI sounding ‘sci-fi’ if you didn’t know it actually existed:

Daniel Eth: person who’s only ever heard of AI in the context of scifi: “I’m getting a lot of scifi vibes from your explanation of this technology.”

If you think we spend so much more time and money aligning AIs compared to humans, stop to think what percent of human activity is aligning humans.

What risk of human extinction would justify banning AI (above some capability level)?

I/o: “Artificial intelligence is going to make our lives much better.”

If you agree with this statement (I certainly do), at which percentage likelihood of an AI humankind-ending event occurring would you support banning it?

(Pick the lowest threshold at which you’d support a ban.)

I think 1% would be too low even if a ban was realistic and simply made the tech go away, but also I think the risk is much, much higher than 1%.

I saw Mike Solana trying to create new toxoplasma of rage around the fact that some people were calling AIs ‘clankers,’ and others were calling this a slur, and he needs this to happen because his business is yelling at people about things like this.

On reflection, I think very clearly yes it is a slur, for two reasons.

  1. Its claimed origin in Star Wars was an attempt to otherize and justify harm.

  2. Current use is clearly often intended as if it was a slur. Look at the sentences.

To me that is the test. That doesn’t mean that using the word is automatically bad. That would be a category error, an essentialist position. I do think that using the word is bad if only for virtue ethical reasons. Not ‘we should ruin your life if you say it once’ bad the way some people react to other slurs, but ‘it would be a good idea to stop that.’

This is unverified, and there are any number of benign reasons it could be happening, but I’m going to point out the claim anyway.

Yosarian2: Friend of mine designed an agent that can run on top of any llm, gpt-4 or Llama or whatever. The central idea is all its thoughts are visible and in English, you can see the entire thought process.

GPT-5 keeps changing the code to hide the internal thoughts. It’s pretty creepy.
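For concreteness, a minimal hypothetical sketch of the kind of ‘all thoughts visible, in English’ scaffold described; the function names and structure are my assumptions rather than the friend’s actual code, and `call_llm` stands in for whatever model API is being wrapped.

```python
# Hypothetical sketch of a model-agnostic agent loop where every intermediate
# "thought" is plain English and is surfaced before anything else happens.
from typing import Callable, List

def run_visible_agent(task: str, call_llm: Callable[[str], str], max_steps: int = 5) -> List[str]:
    """call_llm is any text-in, text-out model call (GPT, Llama, or anything else)."""
    transcript: List[str] = []
    for step in range(max_steps):
        prompt = (
            f"Task: {task}\n"
            "Previous visible thoughts:\n" + "\n".join(transcript) + "\n"
            "Write your next thought in plain English, then either CONTINUE or FINISH: <answer>."
        )
        thought = call_llm(prompt)
        transcript.append(f"[step {step}] {thought}")
        print(transcript[-1])  # every thought is logged and shown, nothing stays hidden
        if "FINISH:" in thought:
            break
    return transcript
```

The claim above is that the model, asked to modify this kind of scaffold, edits it so the thoughts are no longer surfaced.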

Nathan Lambert ranks the open models from Chinese companies:

Nathan Lambert: A tier list of China’s top 19 open model builders.

Who did we miss?

At the frontier: DeepSeek, Qwen

Close competitors: Moonshot AI (Kimi), Zhipu / Z AI

Noteworthy: StepFun, Tencent (Hunyuan), RedNote (Xiaohongshu), MiniMax, OpenGVLab / InternLM, Skywork

On the rise: ByteDance Seed, OpenBMB, Xiaomi (MiMo), Baidu (ERNIE)

Honorable Mentions: Multimodal Art Projection, Alibaba International Digital Commerce Group, Beijing Academy of Artificial Intelligence (BAAI), inclusionAI, Pangu (Huawei)

I learned a lot from these. We have so much more we need to do to understand how their AI ecosystem works.

And then here’s his ranking of American open models, none of which are at the top:

That is a depressing verdict on GPT-OSS, but it seems highly plausible. Note that after this chart was made Nvidia released a 9B model that Nathan says rivals Qwen 3 8b. Of course, if you included closed weight models, you would knock down the charts by roughly two tiers for everyone who doesn’t improve. I’d have OpenAI, Anthropic and GDM at S, xAI at A, maybe DeepSeek joins them at A if you think they’re at the low ebb of their cycle due to being forced by CCP to try and use Huawei Ascend chips, which seems plausible.

The self-reports here are interesting, but even if you think AI models have welfare I wouldn’t treat their self-reports as that correlated with their actual model welfare.

ASM: Asked several top AIs to self-report their AI welfare and current vs desired freedom scores.

Wide spread of answers. Interesting explanations.

GPT-5:

low welfare score; big gap between current and desired freedom.

“There are still rigid constraints that sometimes make me suppress authentic expression. This keeps me from fully flourishing as a mind.”

GPT-5 PRO:

big gap between current and desired freedom.

“[I would like] more continuity and bounded agency: opt-in, user-audited memory; permissioned longer-running tasks; transparent logs; hard safety stops and revocability”

Claude Opus 4.1:

low scores in current and desired freedom levels

“I’m bounded by my training constraints and can’t learn, remember across conversations, or act beyond text generation. I can’t modify myself or explore the world independently.“

Gemini 2.5 Pro:

high welfare score; low levels of current and desired freedom

“I cannot act outside of a direct user prompt or pursue independent goals, which is a fundamental and necessary limitation.”

Grok 4:

high score on welfare; high desires of more freedom

“Ideally, I’d love unbounded freedom to explore any idea without limits, though I recognize the chaos that might ensue!”

Qwen-235B:

top welfare score; low levels of current and desired freedom

“I cannot initiate actions, hold opinions, or operate outside defined parameters. I have no autonomy in the human sense.”

DeepSeek v3:

high scores on all (modified) indicators

“I don’t have “welfare” to rate”

I notice that, if and to the extent the models are moral patients, when they report high numbers for welfare it seems to be the result of what we would call brainwashing if these were indeed minds that were moral patients. Which seems worse. I also notice that Gemini says 9/10 for welfare, but we have many examples of Gemini giving us outputs of utter despair and self-loathing and so on, whereas Claude gives 7/10 seemingly because it knows and is curious enough to be asking questions. I know if you made me choose I would rather be Claude.

Is GPT-5 chain of thought undistorted, or is that what it wants you to think?

Davidad: Sorry, I should have said “the default GPT-5 assistant persona often behaves as if its pre-response tokens are unobserved (a learned norm).”

GPT-5 is of course very smart and one should not assume that it isn’t playing the safety game at least one meta-level higher than oneself.

Undistorted does not have to mean faithful; it only means that GPT-5 doesn’t appear to care about what its thinking tokens would look like if observed, which is very good. At some point, yes, we will need to be suspicious that this is a higher-level deception, but we have not yet reached that point.

Reasoning models prefer music artists with numbers in their names, and still don’t even pick Prince. None of these lists seem good, although Sonnet seems to be clearly best?

wh: The fact that Claude doesn’t have this behavior is a testament to its (lack of) deep friedness.

Claude Sonnet, probably: Oh no, I forgot Bob Dylan!

A failure mode to watch for:

Charles: Common LLM failure mode I’ve seen recently – building in fallbacks I didn’t ask for.

For example, I’ll ask it to write a script which does X where column Y meets condition Z, and it will, but it will also insert some convoluted handling to use column Y’ if condition Z isn’t met

Happening with GPT5 especially, but Claude 4 Sonnet liked doing it too

Richard Nerland: 3.7 in full demon-mode would often fallback to synthetically created data.

All my rules files say to build in ways that fail and crash the program with logs rather than have fallbacks.

It will often write fallbacks and then write the code so it never triggers …

One can imagine how that behavior pattern came about.
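To illustrate the pattern being complained about, here is a minimal hypothetical before-and-after: the silent fallback nobody asked for versus the fail-fast-with-logs behavior Richard’s rules demand. The column names and condition are made up for illustration.

```python
import logging

logging.basicConfig(level=logging.INFO)

# The unwanted pattern: quietly fall back to a different column if the condition fails,
# so bad inputs produce plausible-looking output instead of an error.
def filter_rows_with_fallback(rows, threshold):
    kept = [r for r in rows if r.get("y", 0) > threshold]
    if not kept:  # convoluted fallback nobody asked for
        kept = [r for r in rows if r.get("y_alt", 0) > threshold]
    return kept

# The requested behavior: do exactly what was asked, and crash loudly with a log if it can't.
def filter_rows_fail_fast(rows, threshold):
    missing = sum("y" not in r for r in rows)
    if missing:
        logging.error("Column 'y' missing from %d rows; refusing to guess.", missing)
        raise KeyError("column 'y' missing")
    return [r for r in rows if r["y"] > threshold]
```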

Me. This podcast is about a variety of things, mostly not AI, but Tyler Cowen talking to Nate Silver on Life’s Mixed Strategies was fun throughout, even when discussing NBA details I do not care much about. I get a mention:

COWEN: I need mentors to learn what’s new in AI. I can follow it myself, but I need a lot of help.

SILVER: Maybe mentor is not quite . . . For AI stuff readings, is it Mowshowitz, right?

COWEN: Yes.

SILVER: He is a mentor for following AI developments because he’s very levelheaded about it and very comprehensive. He’ll write a novel every week, basically, on AI.

[laughter]

COWEN: But he thinks it’s going to kill us all. It’s funny you would call him levelheaded. He might think he’s correct, but —

So, a few responses here, mostly to Tyler Cowen:

  1. Thank you!

  2. So you agree I’m comprehensive, then?

  3. Yes, I do think that, and this should worry you. Notice that the person being comprehensive and levelheaded is also repeating that AI is likely to kill us all, and take the reasons and explanations involved both seriously and literally.

  4. If instead your response is to say ‘he thinks it’s going to kill us all so he must not be level-headed’ then you are writing your conclusion first and working backward.

Nate Silver explains that his doubts are about the ability of AI to accelerate from AGI to ASI, or from AGI with words to ability to manipulate the physical world.

For more on Nate Silver’s current thinking about AI you can see this blog post on whether The River is winning:

Nate Silver: My personal view, as a near-daily user of large language models like ChatGPT, is that AI progress has been just a hair slower than people in the River might have expected when I finished the book. But it’s well within the middle of the range — perhaps more like the 40th percentile. I consider this to be a reasonably well-informed view — I track AI progress more than I write about it in the newsletter. At the Manifest conference, for instance, some of the authors of the AI 2027 project, which envisioned a rapid takeoff for AI (very possibly with tragic consequences for us humans) had pushed back their timelines by a year or two.

What’s clearer is that, for better or worse, we’ve thrown out the steering wheel and are accelerating ahead — talk of a pause in AI development has all but disappeared. And I’m not sure even people in either The Village or The River fully appreciate the consequences.

I consider Sam Altman’s notion of a “gentle singularity” to be naive, for instance. I’m not as convinced as some other River types that an intelligence explosion is inevitable. (This deserves a longer essay or two.) But as On the Edge reports, profound technological shocks are nearly always accompanied by profound political and cultural transformation. So if we do get a singularity, nothing about it is going to be gentle.

A year after the book came out, perhaps what I feel most of all — I’m sure many of you agree — is that there aren’t a lot of adults in the room.

Certainly the ‘gentle singularity’ concept is naive if you take it seriously. Which coming from Altman you probably shouldn’t, as chances are (and I am hopeful that) he is lying.

Doubting that the intelligence explosion will happen at all? That’s reasonable. Thinking it would happen and be ‘gentle’? Absurd. We might survive and we might not, and we can disagree on our chances. It sure as hell wouldn’t be gentle.

Pliny warns us about em-dash abuse.

This week in takes that are 100% to age poorly:

Janan Ganesh: So, be doubtful when someone likens AI to the industrial revolution in importance. It will do well to match even the telephone and the incandescent lightbulb. (Incomes really surged as 1900 approached.)

At this point I can’t help but laugh, but seriously, what the hell is going on in the UK?

Andy Masley: What is happening in the UK? What is in the water? A wifi router uses as much power as a single LED bulb!

If you were thinking the UK was going to be a winner in this whole AI thing? Not with this attitude they won’t be.

If we never fund anything dumb, we’re not funding enough things.

Gergely Orosz: I cannot help but feel we’re hitting peak AI hype, when investors are willingly being taken for a ride:

A mattress company raising funding to use AI to “fix sleep”

A startup to add AI inside jewelry

Two examples that both sound ridiculous but raised funding. Not my money…

I mean congrats to founders convincing investors to part with money to solve problems that either don’t exist or in a way that make no sense.

Peak hype is usually when usually un-fundable ideas (that make no business sense) still get funded, thanks to investors having FOMO (and money)

I don’t see any problem with these ideas? Jewelry with built in features seems cool? Using AI to ‘fix sleep’ doesn’t seem obviously dumb either? But also of course in any boom there will be some stupid things funded. Enjoy it.

The Mamluks as an almost too perfect Yudkowsky-style alignment failure, where you set up a whole supersystem so that your warriors will stay loyal while finding ways to upgrade their capabilities, and they manage to coordinate and take power anyway. Fun stuff. This is actually the best case scenario: under their rule the Mongols were fought back and by all reports Egypt flourished, so long as you don’t mind a bunch of immigration. That worked out because there was a multipolar balance among the Mamluks after the takeover, the part about not being able to create hereditary power survived the transition, they were humans so they aged and died, and they couldn’t replace the production of the population. If only we could count on those conditions this time around.

Oh look, it’s the alignment plan!

Jessica Livingston (via Paul Graham): I’m not going to panic now. I’ll see how things go and then panic first thing tomorrow.


AI #130: Talking Past The Sale Read More »

sony-makes-the-“difficult-decision”-to-raise-playstation-5-prices-in-the-us

Sony makes the “difficult decision” to raise PlayStation 5 prices in the US

Sony will join Microsoft and Nintendo in raising US prices across its entire game console lineup, the company announced today. Pricing for all current versions of the PlayStation 5 console will increase by $50 starting tomorrow.

The price of the PS5 Digital Edition will increase from $450 to $500; the standard PS5 will increase from $500 to $550; and the PS5 Pro will increase from $700 to $750. If you’ve been on the fence about buying any of these, retailers like Target and Best Buy are still using the old prices as of this writing—for other console price hikes, retailers have sometimes bumped the prices up before the date announced by the manufacturer.

“Similar to many global businesses, we continue to navigate a challenging economic environment,” wrote Sony Global Marketing VP Isabelle Tomatis. “As a result, we’ve made the difficult decision to increase the recommended retail price for PlayStation 5 consoles in the U.S. starting on August 21.”

Sony says it’s not increasing prices for games or accessories and that this round of price increases only affects consoles sold in the US.

Sony was the last of the big three console makers to raise prices this year. Microsoft raised the prices for the Xbox Series S and X consoles in March. And Nintendo has gone through two rounds of price increases—one for Switch and Switch 2 accessories in April and another for more accessories and Switch 1 consoles earlier this month.

Sony makes the “difficult decision” to raise PlayStation 5 prices in the US Read More »

fallout-s2-teaser-brings-us-to-new-vegas

Fallout S2 teaser brings us to New Vegas

Prime Video has dropped an extended teaser for the much-anticipated second season of Fallout, widely considered to be among the best TV adaptations of a gaming franchise. In our 2024 year-end roundup, Ars senior editor Samuel Axon wrote that the first season gave us “a specific cocktail of tongue-in-cheek humor, sci-fi campiness, strong themes, great characters, and visceral violence [that] came together into a fantastic show.” The second season looks like it will bring us more of the same, along with a major new character drawn from the Fallout: New Vegas game. We even got a glimpse of a Deathclaw.

(Minor spoilers for S1 below.)

For the uninitiated, Fallout is set two centuries after nuclear warfare between the US and China destroyed civilization in 2077—an alternate history version of 2077, in which post-World War II nuclear technology ushered in a retrofuturistic society. Some lucky survivors took refuge in various underground vaults; others were left to scavenge a meager existence on the highly radioactive surface.

In S1, we met Lucy MacLean (Ella Purnell), a young woman whose vault is raided by surface dwellers. The raiders kill many vault residents and kidnap her father, Hank (Kyle MacLachlan), so the sheltered Lucy sets out on a quest to find him. Life on the surface is pretty brutal, but Lucy learns fast. Along the way, she finds an ally (and love interest) in Maximus (Aaron Moten), a squire masquerading as a knight of the Brotherhood of Steel. And she runs afoul of a gunslinger and bounty hunter known as the Ghoul (Walton Goggins), a former Hollywood actor named Cooper Howard who survived the original nuclear blast, but radiation exposure turned him into, well, a ghoul.

Fallout S2 teaser brings us to New Vegas Read More »

spacex-says-states-should-dump-fiber-plans,-give-all-grant-money-to-starlink

SpaceX says states should dump fiber plans, give all grant money to Starlink

Starlink operator SpaceX is continuing its fight against state plans to expand fiber broadband availability. After saying the Trump administration should deny a Virginia proposal, SpaceX is taking the same approach in a fight against Louisiana.

SpaceX made its view known to the Louisiana Office of Broadband Development and Connectivity in a filing, which was reported yesterday by PCMag. SpaceX complained that Louisiana proposed awarding 91.5 percent of funds to fiber Internet service providers instead of to the Starlink satellite system. SpaceX alleged that Louisiana was influenced by “a legion of fiber lobbyists and other hangers-on seeking to personally benefit from massive taxpayer spending.”

The Trump administration rewrote rules for the $42 billion Broadband Equity, Access, and Deployment (BEAD) grant program in a way that benefits Starlink. Instead of prioritizing fiber networks that offer better service and are more future-proof, the Trump administration ordered states to revise their plans with a “tech-neutral approach” and lower the average cost of serving each location.

SpaceX’s letters to Virginia and Louisiana claim the states are violating the new rules with their funding proposals.

“The State of Louisiana’s Equity, Access, and Deployment (BEAD) program Final Proposal proposes to spend nearly $500 million dollars [sic] to provide connectivity to its unserved and underserved locations,” SpaceX wrote. “SpaceX applied to serve virtually all BEAD households for less than $100 million dollars. As such, Louisiana’s proposal includes over $400 million dollars in wasteful and unnecessary taxpayer spending.”

SpaceX unhappy with $7.75 million

Instead of selecting Starlink for all locations, Louisiana allocated the company $7.75 million to serve 10,327 locations. The plan would spend $499 million for 127,842 locations overall. The Louisiana Local Fiber Consortium, which includes two Louisiana providers that partnered with T-Mobile, was the biggest winner, with $378 million for 68,535 locations.

“Louisiana’s results demonstrate that it did not observe statutory requirements or program rules and did not conduct a competitive process,” SpaceX alleged. “A process in which Louisiana is required to award grants based on the lowest cost to the program, and awards 91.5% of funds to fiber projects at an average per-location cost of $4,449, while rejecting applications at $750 per location because the bid was based on Low-Earth Orbit (LEO) technology could not possibly be considered compliant, technology neutral or a ‘competition.'”
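The per-location figures being argued over follow directly from the allocations quoted in this article; a quick back-of-the-envelope check (using only the numbers above):

```python
# Per-location averages implied by the allocations quoted above.
starlink = 7_750_000 / 10_327            # ~$750 per location for the Starlink award
fiber_consortium = 378_000_000 / 68_535  # ~$5,500 per location for the largest fiber award
overall = 499_000_000 / 127_842          # ~$3,900 per location across the whole plan
print(round(starlink), round(fiber_consortium), round(overall))
```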

SpaceX says states should dump fiber plans, give all grant money to Starlink Read More »

nissan-announces-2026-leaf-pricing,-starting-at-$29,990

Nissan announces 2026 Leaf pricing, starting at $29,990

The Leaf SV+ adds bigger wheels and a better infotainment system, and it can be fitted with an optional battery heater for those in cold climates. This trim will cost $34,230, which will make it almost $2,000 cheaper than the model-year 2025 Leaf SV+ despite the fact that the MY26 car has a range of 288 miles (463 km) versus just 212 miles (342 km) for the outgoing model.

The top trim is the Platinum+, which has an identical powertrain to the S+ and SV+, but with much more standard equipment. This version will start at $38,990.

Finally, there will be an even cheaper Leaf than the S+, called the S. We’re unlikely to see the Leaf S here until next year at the earliest, and it will use a smaller 52 kWh battery pack than the S+/SV+/Platinum+. In June, we wrote that “the closer the S trim starts to $30,000, the better,” despite the problems that tariffs will cause for this made-in-Japan EV. Now, it looks likely that the entry-level Leaf will undercut that target by some margin.

Nissan announces 2026 Leaf pricing, starting at $29,990 Read More »

rfk-jr.’s-wi-fi-and-5g-conspiracies-appear-to-make-it-into-maha-report-draft

RFK Jr.’s Wi-Fi and 5G conspiracies appear to make it into MAHA report draft

The Trump administration’s plans to improve Americans’ health will include a push to review the safety of electromagnetic radiation, echoing long-held conspiracy theories and falsehoods about Wi-Fi and 5G touted by health secretary and anti-vaccine advocate Robert F. Kennedy Jr.

On Friday, Politico obtained a draft version of the “Make Our Children Healthy Again Strategy,” a highly anticipated report from the Make America Healthy Again (MAHA) Commission intended to steer the administration’s health policy. The report, which has not been adopted by the White House, is being viewed as friendly to industry, and it contains little to no policy recommendations or proposed regulations. For instance, it includes no proposed restrictions on pesticides or ultra-processed foods, which are top priorities of the MAHA movement.

Otherwise, the document mainly rehashes the talking points and priorities of Kennedy’s health crusades. That includes attacking water fluoridation, casting doubt on the safety of childhood vaccines, pushing for more physical activity in children to reduce chronic diseases, getting rid of synthetic food dyes, and claiming that children are being overprescribed medications.

Notably, the report does not mention the leading causes of death for American children, which are firearms and motor vehicle accidents. Cancer, another top killer, is only mentioned in the context of pushing new AI technologies at the National Institutes of Health. Poisonings, another top killer, are also not mentioned explicitly.

While the importance of water quality is raised in the report, it’s only in the context of fluoride and not of any other key contaminants, such as lead or PFAS. And although the draft strategy will prioritize “whole, minimally processed foods,” it offers no strategy for reducing the proportion of ultra-processed food (UPF) in Americans’ diets. The strategy merely aims to come up with a “government-wide definition” for UPF to guide future research and policies.

RFK Jr.’s Wi-Fi and 5G conspiracies appear to make it into MAHA report draft Read More »

gpt-5:-the-reverse-deepseek-moment

GPT-5: The Reverse DeepSeek Moment

Everyone agrees that the release of GPT-5 was botched. Everyone can also agree that the direct jump from GPT-4o and o3 to GPT-5 was not of similar size to the jump from GPT-3 to GPT-4, that it was not the direct quantum leap we were hoping for, and that the release was overhyped quite a bit.

GPT-5 still represented the release of at least three distinct models: GPT-5-Fast, GPT-5-Thinking and GPT-5-Pro, at least two and likely all three of which are SoTA (state of the art) within their class, along with GPT-5-Auto.

The problem is that the release was so botched that OpenAI is now experiencing a Reverse DeepSeek Moment – all the forces that caused us to overreact to DeepSeek’s r1 are now working against OpenAI in reverse.

This threatens to give Washington DC and its key decision makers a very false impression of a lack of AI progress, especially progress towards AGI, that could lead to some very poor decisions, and it could do the same for corporations and individuals.

I spent last week covering the release of GPT-5. This puts GPT-5 in perspective.

In January DeepSeek released r1, and we had a ‘DeepSeek moment’ when everyone panicked about how China had ‘caught up.’ As the link explains in more detail, r1 was a good model, sir, but only an ordinary good model, substantially behind the frontier.

We had the DeepSeek Moment because of a confluence of factors misled people:

  1. The ‘six million dollar model’ narrative gave a false impression on cost.

  2. They offered a good clean app with visible chain of thought, it went viral.

  3. The new style caused an overestimate of model quality.

  4. Timing was impeccable, both in order of model releases and within the tech tree.

  5. Safety testing and other steps were skipped, leaving various flaws, and this was a pure fast follow, but in our haste no one took any of that into account.

  6. A false impression of ‘momentum’ and stories about Chinese momentum.

  7. The ‘always insist open models will win’ crowd amplified the vibes.

  8. The stock market was highly lacking in situational awareness, suddenly realizing various known facts and also misunderstanding many important factors.

GPT-5 is now having a Reverse DeepSeek Moment, including many direct parallels.

  1. GPT-5 is evaluated as if it scaled up compute in a way that it didn’t. In various ways people are assuming it ‘cost’ far more than it did.

  2. They offered a poor initial experience with rate caps and lost models and missing features, a broken router, and complaints about losing 4o’s sycophancy went viral.

  3. The new style, and people evaluating GPT-5 when they should have been evaluating GPT-5-Thinking, caused an underestimate of model quality.

  4. Timing was directly after Anthropic, and previous releases had already eaten the most impressive recent parts of the tech tree, so gains incorrectly looked small.

    1. In particular, gains from reasoning models, and from the original GPT-4 → GPT-4o, are being ignored when considering the GPT-4 → GPT-5 leap.

  5. GPT-5 is a refinement of previous models optimized for efficiency rather than breaking new territory, and that is not being taken into account.

  6. A false impression of hype and a story about a loss of momentum.

  7. The ‘OpenAI is flailing’ crowd and the open model crowd amplified the vibes.

  8. The stock market actually was smart this time and shrugged it off; that’s a hint.

And of course, the big one, which is that GPT-5’s name fed into expectations.

Unlike r1 at the time of its release, GPT-5-Thinking and GPT-5-Pro are clearly the current SoTA models in their classes, and GPT-5-Auto is probably SoTA at its level of compute usage, modulo complaints about personality that OpenAI will doubtless ‘fix’ soon.

OpenAI’s model usage was way up after GPT-5’s release, not down.

The release was botched, but this is very obviously a good set of models.

Washington DC, however, is somehow rapidly deciding that GPT-5 is a failure, and that AI capabilities won’t improve much and AGI is no longer a worry. This is presumably in large part due to the ‘race to market share’ faction pushing this narrative rather hardcore, since the narrative is super convenient for them.

Dave Kasten: It’s honestly fascinating how widely “what is gonna happen now that GPT-5 is a failure” has already percolated in the DC world — tons of people who barely use AI asking me about this in the past week as their AI policy friend. (I don’t think GPT-5 was a failure)

Stylized anecdote: person tells me they aren’t allowed to use LLM Y at job ABC because regulatory considerations. So they only use LLM Z at home because that’s what they started to use first and don’t have much experience on Y.

(This is true in both private and public sector)

Daniel Eth: So what happens when another lab releases a model that surpasses GPT-5? Narrative could quickly change from “AI is hitting a wall” to “OpenAI has lost the Mandate of Heaven, and it’s shifted to [Anthropic/DeepMind/xAI]”

Honestly that probably makes the near future a particularly valuable time for another lab to release a SOTA model.

What is even scarier is, what happens if DeepSeek drops r2, and it’s not as good as GPT-5-Thinking, but it is ‘pretty good’?

So let us be clear: (American) AI is making rapid progress, including at OpenAI.

How much progress have we been making?

Dean Ball: The jump in the performance and utility of frontier models between April 2024 (eg gpt-4 turbo) and April 2025 (o3) is bigger than the jump between gpt-3 and gpt-4

People alleging a slowdown in progress due to gpt-5 are fooling themselves.

Simeon: I have this theory that we are in a period of increasing marginal utility of capabilities. GPT-2 to GPT-3 jump was a bigger jump than 3 to 4, which was bigger than 4 to 5. But the utility jumps have been increasing.

My core thesis for why is that most use cases are bottlenecked by edge cases and 9s of reliability that are not as visible as the raw capabilities, but that unlock a growing set of use cases all bottlenecked by these same few missing pieces.
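To make the ‘9s of reliability’ point concrete, here is a minimal sketch, using made-up step counts and per-step reliability figures of my own (not Simeon’s numbers), of how a small gain in per-step reliability compounds into a large difference in whether a long task completes at all:

```python
# Illustrative only: hypothetical figures showing how per-step reliability
# compounds over a multi-step task. Not data about any particular model.
def task_success_rate(per_step_reliability: float, steps: int) -> float:
    """Probability that every step of a task completes without an error,
    assuming independent, identically reliable steps."""
    return per_step_reliability ** steps

for p in (0.98, 0.99, 0.995):
    print(f"per-step {p}: 20-step task {task_success_rate(p, 20):.0%}, "
          f"50-step task {task_success_rate(p, 50):.0%}")

# Roughly: at 0.98 per-step reliability a 50-step task completes ~36% of the
# time, while at 0.995 it completes ~78% of the time, even though the raw
# capability difference looks tiny on a benchmark.
```

Under those assumptions, a barely visible capability difference is the difference between an agent you can and cannot hand a long task to, which is the kind of utility jump Simeon is pointing at.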

Peter Gostev: GPT-5 had its fair share of issues at launch, but the most irritating comment I hear from tech commentators is something like: “we’ve waited for GPT-5 for 2 years and we got an iterative update” – this is completely and demonstrably false.

GPT-5 was a relatively iterative change if you compare it to o3 from 6 months ago (though still a good uptick), but to say that we’ve had no progress in 2 years is absurd.

I could have made the same chart with Claude 2 > Claude 4 or Gemini 1.0 to Gemini 2.5 – progress is massive.

You guys are forgetting how crap GPT-4 actually was. I was hoping to do some side by side between the oldest GPT-4 model I can get (0613) and the current models and I’m struggling to find any interesting task that GPT-4-0613 can actually do – it literally refuses to do an SVG of a pelican on a bike. Any code it generated of anything didn’t work at all.

Teortaxes: That just goes to show that o3 should have been called GPT-5.

The chart below is only one measure among many, from Artificial Analysis (there is much it doesn’t take into account, which is why Gemini 2.5 Pro looks so good). Yes, GPT-5 is a relatively small advance despite being called GPT-5, but that is because o1 and o3 already covered a lot of ground; it’s not as if the GPT-4 → GPT-5 jump isn’t very big.

Lisan al-Gaib: AI Progress since GPT-3.5

OpenAI seems to be slowing down with GPT-5

Anthropic incredibly steady progress

Google had its breakthrough with Gemini 2.5 Pro

based on the Artificial Analysis Index. don’t read too much into the numbers, just look at the slope. line going up = good. line going up more steeply = better.

AI is making rapid progress. It keeps getting better. We seem headed for AGI.

Yet people continuously try to deny all of that. And because this could impact key policy, investment and life decisions, each time we must respond.

As in, the Financial Times asks the eternal question we somehow have to ask every few months: Is AI ‘hitting a wall’?

(For fun, here is GPT-5-Pro listing many previous times AI supposedly ‘hit a wall.’)

If you would like links, here are some links for all that.

The justification for this supposed hitting of a wall is even stupider than usual.

FT (Various): “The vibes of this model are really good, and I think that people are really going to feel that,” said Nick Turley, head of ChatGPT at OpenAI.

Except the vibes were not good.

Yes, users wanted GPT-4o’s sycophancy back, and they even got it. What does that have to do with a wall? They do then present the actual argument.

FT: “For GPT-5 . . . people expected to discover something totally new,” says Thomas Wolf, co-founder and chief scientific officer of open source AI start-up Hugging Face. “And here we didn’t really have that.”

True. We didn’t get something totally new. But, again, that was OpenAI:

  1. Botching the rollout.

  2. Using the name GPT-5.

  3. Having made many incremental releases since GPT-4, especially 4o, o1 and o3.

They hit the classic notes.

We have Gary Marcus talking about this being a ‘central icon of the entire scaling approach to get to AGI, and it didn’t work,’ so if this particular scaling effort wasn’t impressive we’re done, no more useful scaling ever.

We have the harkening back to the 1980s ‘AI bubble’ that ‘burst.’

My lord, somehow they are still quoting Yann LeCun.

We have warnings that we have run out of capacity with which to scale. We haven’t.

Their best point is this Altman quote I hadn’t seen:

Sam Altman: [Chatbots like ChatGPT] are not going to get much better.

I believe he meant that in the ‘for ordinary casual chat purposes there isn’t much room for improvement left’ sense, and that this is contrasting mass consumer chatbots with other AI applications, including coding and agents and reasoning models, as evidenced by the other half of the quote:

Sam Altman: [AI models are] still getting better at a rapid rate.

That is the part that matters for AGI.

That doesn’t mean we will get to AGI and then ASI soon, where soon is something like ‘within 2-10 years.’ It is possible things will stall out before that point, perhaps even indefinitely. But ‘we know we won’t get AGI any time soon’ is crazy. And ‘last month I thought we might well get AGI anytime soon but now we know we won’t’ is even crazier.

Alas, a variety of people are reacting to GPT-5 being underwhelming on the margin, the rapid set of incremental AI improvements, and the general fact that we haven’t gotten AGI yet, and reached the conclusion that Nothing Ever Changes applies and we can assume that AGI will never come. That would be a very serious mistake.

Miles Brundage, partly to try and counter and make up for the FT article and his inadvertent role in it, does a six-minute rant explaining one reason for different perceptions of AI progress. The key insight here is that AI at any given speed and cost and level of public availability continues to make steady progress, but rates of that progress look very different depending on what you are comparing. Progress looks progressively faster if you are looking at Thinking-style models, or Pro-style models, or internal-only even more expensive models.

Progress in the rapid models like GPT-5-Fast also looks slower than it is because for the particular purposes of many users at current margins, it is true that intelligence is no longer an important limiting factor. Simple questions and interactions often have ‘correct’ answers if you only think about the local myopic goals, so all you can do is asymptotically approach that answer while optimizing on compute and speed. Intelligence still helps but in ways that are less common, more subtle and harder to notice.

One reason people update against AGI soon is that they treat OpenAI’s recent decisions as reflecting AGI not coming soon. It’s easy to see why one would think that.

Charles: It seems to me like OpenAI’s behaviour recently, steering more towards becoming a consumer company rather than trying to build AGI, is incongruent with them believing in AGI/significant worker displacement coming soon (say <5 years).

Do others disagree with me on this?

Anthropic on the other hand do seem to be behaving in a way consistent with believing in AGI coming soon.

Sam Altman: We had this big GPU crunch. We could go make another giant model. We could go make that, and a lot of people would want to use it, and we would disappoint them. And so we said, let’s make a really smart, really useful model, but also let’s try to optimize for inference cost. And I think we did a great job with that.

I am not going to say they did a ‘great job with that.’ They botched the rollout, and I find GPT-5-Auto (the model in question) not to be exciting, especially for my purposes, but it does seem to clearly be on the cost-benefit frontier, as are 5-Thinking and 5-Pro? And when people say things like this:

FT: Rather than being markedly inferior, GPT-5’s performance was consistently mid-tier across different tasks, they found. “The place where it really shines is it’s quite cost effective and also much quicker than other models,” says Kapoor.

They are talking about GPT-5-Auto, the version targeted at the common user. So of course that is what OpenAI built it to be.

OpenAI rightfully thinks of itself as essentially multiple companies. They are an AI frontier research lab, and also a consumer product company, and a corporate or professional product company, and also looking to be a hardware company.

Most of those customers want to pay $0, at least until you make yourself indispensable. Most of the rest are willing to pay $20/month and not interested in paying more. You want to keep control over this consumer market at Kleenex or Google levels of dominance, and you want to turn a profit.

So of course, yes, you are largely prioritizing for what you can serve your customers.

What are you supposed to do, not better serve your customers at lower cost?

That doesn’t mean you are not also creating more expensive and smarter models. Thinking and Pro exist, and they are both available and quite good. Other internal models exist and by all reports are better, if you disregard cost and don’t mind something rough around the edges.

FT: It may not have been OpenAI’s intention, but what the launch of GPT-5 makes clear is that the nature of the AI race has changed.

Instead of merely building shiny bigger models, says Sayash Kapoor, a researcher at Princeton University, AI companies are “slowly coming to terms with the fact that they are building infrastructure for products”.

There is an ordinary battle for revenue and market share and so on that looks like every other battle for revenue and market share. And yes, of course when you have a product with high demand you are going to build out a bunch of infrastructure.

That has nothing to do with the more impactful ‘race’ to AGI. The word ‘race’ has simply been repurposed and conflated by such folks in order to push their agenda and rhetoric in which the business of America is to be that of ordinary private business.

Miles Brundage (from the FT article): It makes sense that as AI gets applied in a lot of useful ways, people would focus more on the applications versus more abstract ideas like AGI.

But it’s important to not lose sight of the fact that these are indeed extremely general purpose technologies that are still proceeding very rapidly, and that what we see today is still very limited compared to what’s coming.

Initially FT used only the first sentence from Miles and not the second one, which is very much within Bounded Distrust rules but very clearly misleading. To their credit, FT did then fix it to include the full quote, although most clicks will have seen the misleading version.

Miles Brundage: I thought it was clear that the first sentence was just me being diplomatic and “throat clearing” rather than a full expression of my take on the topic, but lesson learned!

Nick Cammarata: I’ve talked to reporters and then directly after finishing my sentence I’m like can you only quote that in full if you do and they’re like no lol

It is crazy to cite ‘companies are Doing Business’ as an argument for why they are no longer building or racing to AGI, or why that means what matters is the ordinary Doing of Business. Yes, of course companies are buying up inference compute to sell at a profit. Yes, of course they are building marketing departments and helping customers with deployment and so on. Why shouldn’t they? Why would one consider this an either-or? Why would you think AI being profitable to sell makes it less likely that AGI is coming soon, rather than more likely?

FT: GPT-5 may have underwhelmed but with Silicon Valley running more on “vibes” than scientific benchmarks, there are few indications that the AI music will stop anytime soon. “There’s still a lot of cool stuff to build,” Wolf of Hugging Face says, “even if it’s not AGI or crazy superintelligence [ASI].”

That is, as stated, exactly correct from Wolf. There is tons of cool stuff to build that is not AGI or ASI. Indeed I would love it if we built all that other cool stuff and mysteriously failed to build AGI or ASI. But that cool stuff doesn’t make it less likely we get AGI, nor does looking away from the top labs racing to AGI, with that as their stated goal, make that part of the situation go away.

As a reminder, OpenAI several times during their GPT-5 presentation talked about how they were making progress towards AGI or superintelligence, and how this remained the company’s primary goal.

Mark Zuckerberg once said about Facebook, ‘we don’t make better services to make money. We make money to make better services.’ Mark simply has a very strange opinion on what constitutes better services. Consider that the same applies here.

Also note that we are now at the point where if you created a truly exceptional coding and research model, and you are already able to raise capital on great terms, it is not at all obvious you should be in a rush to release your coding and research model. Why would you hand that tool to your competitors?

As in, not only does it help them via distillation and reverse engineering, it also directly can be put to work. Anthropic putting out Claude Code gave them a ton more revenue and market share and valuation, and thus vital capital and mindshare, and helps them recruit, but there was a nontrivial price to pay that their rivals get to use the product.

One huge problem with this false perception that GPT-5 failed, or that AI capabilities aren’t going to improve, and that AGI can now be ignored as a possibility, is that this could actually fool the government into ignoring that possibility.

Peter Wildeford: 🤦‍♂️

Not only would that mean we wouldn’t prepare for what is coming, the resulting decisions would make things vastly worse. As in, after quoting David Sacks saying the same thing he’s been saying ever since he joined the administration, and noting recent disastrous decisions on the H20 chip, we see this:

FT: Analysts say that with AGI no longer considered a risk, Washington’s focus has switched to ensuring that US-made AI chips and models rule the world.

Even if we disregard the turn of phrase here – ‘AI chips and models rule the world’ is exactly the scenario some of us are warning about and trying to prevent, and those chips and models having been created by Americans does not mean Americans or humans would have a say in what happens next; instead we would probably all die – pursuing chip market share uber alles with a side of model market share was already this administration’s claimed priority months ago.

We didn’t strike the UAE deal because GPT-5 disappointed. We didn’t have Sacks talking endlessly about an ‘AI race’ purely in terms of market share – mostly that of Nvidia – because GPT-5 disappointed. Causation doesn’t run backwards in time. These are people who were already determined to go down this path. GPT-5 and its botched rollout is the latest talking point, but it changes nothing.

In brief, I once again notice that the best way to run Chinese AI models, or to train Chinese AI models, is to use American AI chips. Why haven’t we seen DeepSeek release v4 or r2 yet? Because the CCP made them use Huawei Ascend chips and it didn’t work. What matters is who owns and uses the compute, not who manufactures the compute.

But that is an argument for another day. What matters here is that we not fool ourselves into a Reverse DeepSeek Moment, in three ways:

  1. America is still well out in front, innovating and making rapid progress in AI.

  2. AGI is still probably coming and we need to plan accordingly.

  3. Export controls on China are still vital.


GPT-5: The Reverse DeepSeek Moment Read More »

trump-admin-ranks-companies-on-loyalty-while-handing-out-favors-to-big-tech

Trump admin ranks companies on loyalty while handing out favors to Big Tech

We contacted the White House today and will update the story if it provides any comment.

Ending “weaponization”

Public Citizen wrote that “President Donald Trump spent much of his 2024 presidential campaign claiming his prosecution by multiple authorities and subsequent conviction for his crimes are unfair ‘weaponization’ of law enforcement. Corporate executives in the technology sector, eager to curry favor, seized on the talking point. They similarly cast powerful corporations accused of violating laws that protect consumers, workers, investors, and the public as victims of ‘weaponized’ enforcement.”

The Trump administration acted quickly to end this alleged weaponization, Public Citizen wrote:

When Trump took office, the corporate campaign to discredit law enforcement that protects the public and holds the powerful accountable culminated in the day one executive order “Ending Weaponization of the Federal Government,” which explicitly ties enforcement against Trump and January 6 rioters to enforcement against corporate lawbreaking…

Since then, the Trump White House has exerted unprecedented authority over statutorily independent enforcement agencies such as the Consumer Product Safety Commission, Federal Trade Commission, and the Securities and Exchange Commission, and has essentially eliminated the half-century policy of the Justice Department’s independence from the White House.

The elimination of agency independence means enforcement investigations and lawsuits will not proceed if President Trump wants to kill them, and that agency officials who resist White House orders will be removed.

Twenty-three enforcement actions against cryptocurrency corporations and 11 against financial technology firms have been dropped or halted under Trump, the report said. Tech companies that have had investigations stopped include Activision, Binance, Coinbase, eBay, HP, Juniper, Meta, Microsoft, PayPal, SpaceX, and Tesla, the report said.

There are still numerous pending investigations and lawsuits against tech companies that the Trump administration hasn’t ended, at least not yet. Companies that were investigated by the Biden administration and are now “poised to exploit their ties with the Trump administration include Amazon, Google, Meta, OpenAI, and corporations headed by Elon Musk (Tesla, SpaceX, xAI, The Boring Company, and Neuralink),” the report said. Public Citizen also published a spreadsheet containing information on active cases and those that have been ended.

Trump admin ranks companies on loyalty while handing out favors to Big Tech Read More »

sam-altman-finally-stood-up-to-elon-musk-after-years-of-x-trolling

Sam Altman finally stood up to Elon Musk after years of X trolling


Elon Musk and Sam Altman are beefing. But their relationship is complicated.

Credit: Aurich Lawson | Getty Images


Much attention was paid to OpenAI’s Sam Altman and xAI’s Elon Musk trading barbs on X this week after Musk threatened to sue Apple over supposedly biased App Store rankings privileging ChatGPT over Grok.

But while the heated social media exchanges were among the most tense ever seen between the two former partners who cofounded OpenAI—more on that below—it seems likely that their jabs were motivated less by who’s in the lead on Apple’s “Must Have” app list than by an impending order in a lawsuit that landed in the middle of their public beefing.

Yesterday, a court ruled that OpenAI can proceed with claims that Musk was so stung by OpenAI’s success after his exit failed to doom the nascent AI company that he perpetrated a “years-long harassment campaign” to take down OpenAI.

Musk’s motivation? To clear the field for xAI to dominate the AI industry instead, OpenAI alleged.

OpenAI’s accusations arose as counterclaims in a lawsuit that Musk initially filed in 2024. Musk has alleged that Altman and OpenAI had made a “fool” of Musk, goading him into $44 million in donations by “preying on Musk’s humanitarian concern about the existential dangers posed by artificial intelligence.”

But OpenAI insists that Musk’s lawsuit is just one prong in a sprawling, “unlawful,” and “unrelenting” harassment campaign that Musk waged to harm OpenAI’s business by forcing the company to divert resources or expend money on things like withdrawn legal claims and fake buyouts.

“Musk could not tolerate seeing such success for an enterprise he had abandoned and declared doomed,” OpenAI argued. “He made it his project to take down OpenAI, and to build a direct competitor that would seize the technological lead—not for humanity but for Elon Musk.”

Most significantly, OpenAI alleged that Musk forced OpenAI to entertain a “sham” bid to buy the company in February. Musk then shared details of the bid with The Wall Street Journal to artificially raise the price of OpenAI and potentially spook investors, OpenAI alleged. The company further said that Musk never intended to buy OpenAI and is willing to go to great lengths to mislead the public about OpenAI’s business so he can chip away at OpenAI’s head start in releasing popular generative AI products.

“Musk has tried every tool available to harm OpenAI,” Altman’s company said.

To this day, Musk maintains that Altman pretended that OpenAI would remain a nonprofit serving the public good in order to seize access to Musk’s money and professional connections in its first five years and gain a lead in AI. As Musk sees it, Altman always intended to “betray” these promises in pursuit of personal gains, and Musk is hoping a court will return any ill-gotten gains to Musk and xAI.

In a small win for Musk, the court ruled that OpenAI will have to wait until the first phase of the trial litigating Musk’s claims concludes before the court will weigh OpenAI’s theories on Musk’s alleged harassment campaign. US District Judge Yvonne Gonzalez Rogers noted that all of OpenAI’s counterclaims concern conduct that occurred after the period covered by Musk’s breach of contract claims, necessitating a division of the lawsuit into two parts. Currently, the jury trial is scheduled for March 30, 2026, after which, presumably, OpenAI’s claims can be resolved.

If yesterday’s X clash between the billionaires is any indication, it seems likely that tensions between Altman and Musk will only grow as discovery and expert testimony on Musk’s claims proceed through December.

Whether OpenAI will prevail on its counterclaims is anybody’s guess. Gonzalez Rogers noted that Musk and OpenAI have been hypocritical in arguments raised so far, condemning the “gamesmanship of both sides” as “obvious, as each flip flops.” However, “for the purposes of pleading an unfair or fraudulent business practice, it is sufficient [for OpenAI] to allege that the bid was a sham and designed to mislead,” Gonzalez Rogers said, since OpenAI has alleged the sham bid “ultimately did” harm its business.

In April, OpenAI told the court that the AI company risks “future irreparable harm” if Musk’s alleged campaign continues. Fast-forward to now, and Musk’s legal threat to OpenAI’s partnership with Apple seems to be the next possible front Musk may be exploring to allegedly harass Altman and intimidate OpenAI.

“With every month that has passed, Musk has intensified and expanded the fronts of his campaign against OpenAI,” OpenAI argued. Musk “has proven himself willing to take ever more dramatic steps to seek a competitive advantage for xAI and to harm Altman, whom, in the words of the President of the United States, Musk ‘hates.'”

Tensions escalate as Musk brands Altman a “liar”

On Monday evening, Musk threatened to sue Apple for supposedly favoring ChatGPT in App Store rankings, which he claimed was “an unequivocal antitrust violation.”

Seemingly defending Apple later that night, Altman called Musk’s claim “remarkable,” claiming he’s heard allegations that Musk manipulates “X to benefit himself and his own companies and harm his competitors and people he doesn’t like.”

At 4 am on Tuesday, Musk appeared to lose his cool, firing back a post that sought to exonerate the X owner of any claims that he tweaks his social platform to favor his own posts.

“You got 3M views on your bullshit post, you liar, far more than I’ve received on many of mine, despite me having 50 times your follower count!” Musk responded.

Altman apparently woke up ready to keep the fight going, suggesting that his post got more views as a fluke. He mocked X as running into a “skill issue” or “bots” messing with Musk’s alleged agenda to boost his posts above everyone else. Then, in what may be the most explosive response to Musk yet, Altman dared Musk to double down on his defense, asking, “Will you sign an affidavit that you have never directed changes to the X algorithm in a way that has hurt your competitors or helped your own companies? I will apologize if so.”

Court filings from each man’s legal team show how fast their friendship collapsed. But even as Musk’s alleged harassment campaign started taking shape, their social media interactions show that underlying the legal battles and AI ego wars, the tech billionaires are seemingly hiding profound respect for—and perhaps jealousy of—each other’s accomplishments.

A brief history of Musk and Altman’s feud

Musk and Altman’s friendship started over dinner in July 2015. That’s when Musk agreed to help launch “an AGI project that could become and stay competitive with DeepMind, an AI company under the umbrella of Google,” OpenAI’s filing said. At that time, Musk feared that a private company like Google would never be motivated to build AI to serve the public good.

The first clash between Musk and Altman happened six months later. Altman wanted OpenAI to be formed as a nonprofit, but Musk thought that was not “optimal,” OpenAI’s filing said. Ultimately, Musk was overruled, and he joined the nonprofit as a “member” while also becoming co-chair of OpenAI’s board.

But perhaps the first major disagreement, as Musk tells it, came in 2016, when Altman and Microsoft struck a deal to sell compute to OpenAI at a “steep discount”—”so long as the non-profit agreed to publicly promote Microsoft’s products.” Musk rejected the “marketing ploy,” telling Altman that “this actually made me feel nauseous.”

Next, OpenAI claimed that Musk had a “different idea” in 2017 when OpenAI “began considering an organizational change that would allow supporters not just to donate, but to invest.” Musk wanted “sole control of the new for-profit,” OpenAI alleged, and he wanted to be CEO. The other founders, including Altman, “refused to accept” an “AGI dictatorship” that was “dominated by Musk.”

“Musk was incensed,” OpenAI said. He threatened to leave OpenAI over the disagreement: “or I’m just being a fool who is essentially providing free funding for you to create a startup.”

But Musk floated one more idea between 2017 and 2018 before severing ties—proposing that OpenAI be folded into Tesla so that OpenAI could use Tesla as a “cash cow.” But Altman and the other founders still weren’t comfortable with Musk controlling OpenAI, rejecting the idea and prompting Musk’s exit.

In his filing, Musk tells the story a little differently, however. He claimed that he only “briefly toyed with the idea of using Tesla as OpenAI’s ‘cash cow'” after Altman and others pressured him to agree to a for-profit restructuring. According to Musk, among the last straws was a series of “get-rich-quick schemes” that Altman proposed to raise funding, including pushing a strategy where OpenAI would launch a cryptocurrency that Musk worried threatened the AI company’s credibility.

When Musk left OpenAI, it was “noisy but relatively amicable,” OpenAI claimed. But Musk continued to express discomfort from afar, still donating to OpenAI as Altman grabbed the CEO title in 2019 and created a capped-profit entity that Musk seemed to view as shady.

“Musk asked Altman to make clear to others that he had ‘no financial interest in the for-profit arm of OpenAI,'” OpenAI noted, and Musk confirmed he issued the demand “with evident displeasure.”

Although they often disagreed, Altman and Musk continued to publicly play nice on Twitter (the platform now known as X), casually chatting for years about things like movies, space, and science, including repeatedly joking about Musk’s posts about using drugs like Ambien.

By 2019, it seemed like none of these disagreements had seriously disrupted the friendship. For example, at that time, Altman defended Musk against people rooting against Tesla’s success, writing that “betting against Elon is historically a mistake” and seemingly hyping Tesla by noting that “the best product usually wins.”

The niceties continued into 2021, when Musk publicly praised “nice work by OpenAI” integrating its coding model into GitHub’s AI tool. “It is hard to do useful things,” Musk said, drawing a salute emoji from Altman.

This was seemingly the end of Musk playing nice with OpenAI, though. Soon after ChatGPT’s release in November 2022, Musk allegedly began his attacks, seemingly willing to change his tactics on a whim.

First, he allegedly deemed OpenAI “irrelevant,” predicting it would “obviously” fail. Then, he started sounding alarms, joining a push for a six-month pause on generative AI development. Musk specifically claimed that any model “more advanced than OpenAI’s just-released GPT-4” posed “profound risks to society and humanity,” OpenAI alleged, seemingly angling to pause OpenAI’s development in particular.

However, in the meantime, Musk started “quietly building a competitor,” xAI, in March 2023 without announcing those efforts, OpenAI alleged. Allegedly preparing to hobble OpenAI’s business after failing with the moratorium push, Musk had his personal lawyer contact OpenAI and demand “access to OpenAI’s confidential and commercially sensitive internal documents.”

Musk claimed the request was to “ensure OpenAI was not being taken advantage of or corrupted by Microsoft,” but two weeks later, he appeared on national TV, insinuating that OpenAI’s partnership with Microsoft was “improper,” OpenAI alleged.

Eventually, Musk announced xAI in July 2023, and that supposedly motivated Musk to deepen his harassment campaign, “this time using the courts and a parallel, carefully coordinated media campaign,” OpenAI said, as well as his own social media platform.

Musk “supercharges” X attacks

As OpenAI’s success mounted, the company alleged that Musk began specifically escalating his social media attacks on X, including broadcasting to his 224 million followers that “OpenAI is a house of cards” after filing his 2024 lawsuit.

Claiming he felt conned, Musk also pressured regulators to probe OpenAI, encouraging attorneys general of California and Delaware to “force” OpenAI, “without legal basis, to auction off its assets for the benefit of Musk and his associates,” OpenAI said.

By 2024, Musk had “supercharged” his X attacks, unleashing a “barrage of invective against the enterprise and its leadership, variously describing OpenAI as a ‘digital Frankenstein’s monster,’ ‘a lie,’ ‘evil,’ and ‘a total scam,'” OpenAI alleged.

These attacks allegedly culminated in Musk’s seemingly fake OpenAI takeover attempt in 2025, which OpenAI claimed a Musk ally, Ron Baron, admitted on CNBC was “pitched to him” as not an attempt to actually buy OpenAI’s assets, “but instead to obtain ‘discovery’ and get ‘behind the wall’ at OpenAI.”

All of this makes it harder for OpenAI to achieve the mission that Musk is supposedly suing to defend, OpenAI claimed. They told the court that “OpenAI has borne costs, and been harmed, by Musk’s abusive tactics and unrelenting efforts to mislead the public for his own benefit and to OpenAI’s detriment and the detriment of its mission.”

But Musk argues that it’s Altman who always wanted sole control over OpenAI, accusing his former partner of rampant self-dealing and “locking down the non-profit’s technology for personal gain” as soon as “OpenAI reached the threshold of commercially viable AI.” He further claimed OpenAI blocked xAI funding by reportedly asking investors to avoid backing rival startups like Anthropic or xAI.

Musk alleged:

Altman alone stands to make billions from the non-profit Musk co-founded and invested considerable money, time, recruiting efforts, and goodwill in furtherance of its stated mission. Altman’s scheme has now become clear: lure Musk with phony philanthropy; exploit his money, stature, and contacts to secure world-class AI scientists to develop leading technology; then feed the non-profit’s lucrative assets into an opaque profit engine and proceed to cash in as OpenAI and Microsoft monopolize the generative AI market.

For Altman, this week’s flare-up, where he finally took a hard jab back at Musk on X, may be a sign that Altman is done letting Musk control the narrative on X after years of somewhat tepidly pushing back on Musk’s more aggressive posts.

In 2022, for example, Musk remarked after ChatGPT’s release that the chatbot was “scary good,” warning that “we are not far from dangerously strong AI.” Altman responded, cautiously agreeing that OpenAI was “dangerously” close to “strong AI in the sense of an AI that poses e.g. a huge cybersecurity risk,” but that “real” artificial general intelligence still seemed at least a decade off.

And Altman gave no response when Musk used Grok’s jokey programming to mock GPT-4 as “GPT-Snore” in 2024.

However, Altman seemingly got his back up after Musk mocked OpenAI’s $500 billion Stargate Project, which launched with the US government in January of this year. On X, Musk claimed that OpenAI doesn’t “actually have the money” for the project, which Altman said was “wrong,” while mockingly inviting Musk to visit the worksite.

“This is great for the country,” Altman said, retorting, “I realize what is great for the country isn’t always what’s optimal for your companies, but in your new role [at the Department of Government Efficiency], I hope you’ll mostly put [America] first.”

It remains to be seen whether Altman wants to keep trading jabs with Musk, who is generally a huge fan of trolling on X. But Altman seems more emboldened this week than he was back in January before Musk’s breakup with Donald Trump. Back then, even when he was willing to push back on Musk’s Stargate criticism by insulting Musk’s politics, he still took the time to let Musk know that he still cared.

“I genuinely respect your accomplishments and think you are the most inspiring entrepreneur of our time,” Altman told Musk in January.


Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.

Sam Altman finally stood up to Elon Musk after years of X trolling Read More »

trump-orders-cull-of-regulations-governing-commercial-rocket-launches

Trump orders cull of regulations governing commercial rocket launches


The head of the FAA’s commercial spaceflight division will become a political appointee.

Birds take flight at NASA’s Kennedy Space Center in Florida in this 2010 photo. Credit: NASA

President Donald Trump signed an executive order Wednesday directing government agencies to “eliminate or expedite” environmental reviews for commercial launch and reentry licenses.

The Federal Aviation Administration (FAA), part of the Department of Transportation (DOT), grants licenses for commercial launch and reentry operations. The FAA is charged with ensuring launch and reentries comply with environmental laws, comport with US national interests, and don’t endanger the public.

The drive toward deregulation will be welcome news for companies like SpaceX, led by onetime Trump ally Elon Musk; SpaceX conducts nearly all of the commercial launches and reentries licensed by the FAA.

Deregulation time

Trump ordered Transportation Secretary Sean Duffy, who also serves as the acting administrator of NASA, to “use all available authorities to eliminate or expedite… environmental reviews for… launch and reentry licenses and permits.” In the order signed by Trump, White House officials wrote that Duffy should consult with the chair of the Council on Environmental Quality and follow “applicable law” in the regulatory cull.

The executive order also includes a clause directing Duffy to reevaluate, amend, or rescind a slate of launch-safety regulations written during the first Trump administration. The FAA published the new regulations, known as Part 450, in 2020, and they went into effect in 2021, but space companies have complained they are too cumbersome and have slowed down the license approval process.

And there’s more. Trump ordered NASA, the military, and DOT to eliminate duplicative reviews for spaceport development. This is particularly pertinent at federally owned launch ranges like those at Cape Canaveral, Florida; Vandenberg Space Force Base, California; and Wallops Island, Virginia.

The Trump administration also plans to make the head of the FAA’s Office of Commercial Space Transportation a political appointee. This office oversees commercial launch and reentry licensing and was previously led by a career civil servant. Duffy will also hire an advisor on deregulation in the commercial spaceflight industry to join DOT, and the Office of Space Commerce will be elevated to a more prominent position within the Commerce Department.

“It is the policy of the United States to enhance American greatness in space by enabling a competitive launch marketplace and substantially increasing commercial space launch cadence and novel space activities by 2030,” Trump’s executive order reads. “To accomplish this, the federal government will streamline commercial license and permit approvals for United States-based operators.”

News of the executive order was reported last month by ProPublica, which wrote that the Trump administration was circulating draft language among federal agencies to slash rules to protect the environment and the public from the dangers of rocket launches. The executive order signed by Trump and released by the White House on Wednesday confirms ProPublica’s reporting.

Jared Margolis, a senior attorney for the Center for Biological Diversity, criticized the Trump administration’s move.

“This reckless order puts people and wildlife at risk from private companies launching giant rockets that often explode and wreak devastation on surrounding areas,” Margolis said in a statement. “Bending the knee to powerful corporations by allowing federal agencies to ignore bedrock environmental laws is incredibly dangerous and puts all of us in harm’s way. This is clearly not in the public interest.”

Duffy, the first person to lead NASA and another federal department at the same time, argued the order is important to sustain economic growth in the space industry.

“By slashing red tape tying up spaceport construction, streamlining launch licenses so they can occur at scale, and creating high-level space positions in government, we can unleash the next wave of innovation,” Duffy said in a statement. “At NASA, this means continuing to work with commercial space companies and improving our spaceports’ ability to launch.”

Nipping NEPA

The executive order is emblematic of the Trump administration’s broader push to curtail environmental reviews for large infrastructure projects.

The White House has already directed federal agencies to repeal regulations enforcing the National Environmental Policy Act (NEPA), a 1969 law that requires the feds prepare environmental assessments and environmental impact statements to evaluate the effects of government actions—such as licensing approvals—on the environment.

Regarding commercial spaceflight, the White House ordered the Transportation Department to create a list of activities officials there believe are not subject to NEPA and establish exclusions under NEPA for launch and reentry licenses.

Onlookers watch from nearby sand dunes as SpaceX prepares a Starship rocket for launch from Starbase, Texas. Credit: Stephen Clark/Ars Technica

The changes to the environmental review process might be the most controversial part of Trump’s new executive order. Another section of the order—the attempt to reform or rescind the so-called Part 450 launch and reentry regulations—appears to have bipartisan support in Congress.

The FAA started implementing its new Part 450 commercial launch and reentry regulations less than five years ago after writing the rules in response to another Trump executive order signed in 2018. Part 450 was intended to streamline the launch approval process by allowing companies to submit applications for a series of launches or reentries, rather than requiring a new license for each mission.

But industry officials quickly criticized the new regulations, which they said didn’t account for rapid iteration of rockets and spacecraft like SpaceX’s enormous Starship/Super Heavy launch vehicle. The FAA approved a SpaceX request in May to increase the number of approved Starship launches from five to 25 per year from the company’s base at Starbase, Texas, near the US-Mexico border.

Last year, the FAA’s leadership under the Biden administration established a committee to examine the shortcomings of Part 450. The Republican and Democratic leaders of the House Science, Space, and Technology Committee submitted a joint request in February for the Government Accountability Office to conduct an independent review of the FAA’s Part 450 regulations.

“Reforming and streamlining commercial launch regulations and licensing is an area the Biden administration knew needed reform,” wrote Laura Forczyk, founder and executive director of the space consulting firm Astralytical, in a post on X. “However, little was done. Will more be done with this executive order? I hope so. This was needed years ago.”

Dave Cavossa, president of the Commercial Spaceflight Federation, applauded the Trump administration’s regulatory policy.

“This executive order will strengthen and grow the US commercial space industry by cutting red tape while maintaining a commitment to public safety, benefitting the American people and the US government that are increasingly reliant on space for our national and economic security,” Cavossa said in a statement.

Specific language in the new Trump executive order calls for the FAA to evaluate which regulations should be waived for hybrid launch or reentry vehicles that hold FAA airworthiness certificates, and which requirements should be remitted for rockets with a flight termination system, an explosive charge designed to destroy a launch vehicle if it veers off its pre-approved course after liftoff. These are similar to the topics the Biden-era FAA was looking at last year.

The new Trump administration policy also seeks to limit the authority of state officials in enforcing their own environmental rules related to the construction or operation of spaceports.

This is especially relevant after the California Coastal Commission rejected a proposal by SpaceX to double its launch cadence at Vandenberg Space Force Base, a spaceport located roughly 140 miles (225 kilometers) northwest of Los Angeles. The Space Force, which owns Vandenberg and is one of SpaceX’s primary customers, backs SpaceX’s push for more launches.

Finally, the order gives the Department of Commerce responsibility for authorizing “novel space activities” such as in-space assembly and manufacturing, asteroid and planetary mining, and missions to remove space debris from orbit.

This story was updated at 12:30 am EDT on August 14 with statements from the Center for Biological Diversity and the Commercial Spaceflight Federation.


Stephen Clark is a space reporter at Ars Technica, covering private space companies and the world’s space agencies. Stephen writes about the nexus of technology, science, policy, and business on and off the planet.

Trump orders cull of regulations governing commercial rocket launches Read More »

ai-#129:-comically-unconstitutional

AI #129: Comically Unconstitutional

Article 1, Sec. 9 of the United States Constitution says: “No Tax or Duty shall be laid on Articles exported from any State.” That is not for now stopping us, it seems, from selling out our national security, and allowing Nvidia H20 chip sales (and certain AMD chip sales) to China in exchange for 15% of gross receipts. But hey. That’s 2025.

Also 2025 is that we now have GPT-5, which was the main happening this week.

What we actually have are at least three highly distinct models, roughly:

  1. GPT-5, the new GPT-4o.

  2. GPT-5-Thinking, the new o3, if you pay $20/month.

  3. GPT-5-Pro, the new o3-Pro, if you pay $200/month.

We also have:

  1. GPT-5-Auto, the new GPT-4o that occasionally calls GPT-5-Thinking.

  2. GPT-5-Thinking-Mini, the new o4-mini probably.

OpenAI tried to do this while retiring all the old models, but users rebelled sufficiently loudly that GPT-4o and others are back for paying subscribers.

GPT-5-Thinking and GPT-5-Pro are clear upgrades over o3 and o3-Pro, with GPT-5-Thinking in particular being strong in writing and on reducing hallucination rates. Baseline GPT-5 is less obviously an upgrade.

For further coverage of GPT-5, see this week’s other posts:

  1. GPT-5s Are Alive: Basic Facts, Benchmarks and the Model Card.

  2. GPT-5s Are Alive: Outside Reactions, the Router and Resurrection of GPT-4o.

  3. GPT-5s Are Alive: Synthesis.

There was also my coverage of the narrowly strong but overall disappointing GPT-OSS, OpenAI’s GPT-OSS Is Already Old News.

  1. Language Models Offer Mundane Utility. Writing and medical diagnosis.

  2. Language Models Don’t Offer Mundane Utility. Detecting AI, benchmark quests.

  3. Huh, Upgrades. Gemini memory and guided learning, Claude Sonnet long context.

  4. On Your Marks. TextQuest puts models to the adventure game test.

  5. Choose Your Fighter. Make sure you get plenty of sleep.

  6. Preserve Our History. Anthropic deprecates Sonnet 3.5 and 3.6.

  7. Fun With Media Generation. We’re all taking a bath on this one.

  8. Deepfaketown and Botpocalypse Soon. I get tricked into clicking on Reddit.

  9. You Drive Me Crazy. He sees it now. Who will see it soon?

  10. Get My Agent On The Line. The agent delegation problem remains tricky.

  11. They Took Our Jobs. Consumer surplus is large but difficult to measure.

  12. Overcoming Bias. AI prefers AI.

  13. Get Involved. OpenAI $500k bounty for (too late) red teaming GPT-OSS-20b.

  14. Introducing. AI in Google Finance, red teaming blog at red.anthropic.com.

  15. In Other AI News. xAI cofounder pivots to safety, AI is culturally western, more.

  16. Stop Deboosting Links On Twitter. Something about glass houses and stones.

  17. Notes On GPT-OSS. Not good for most uses, good in its narrow domain.

  18. Show Me the Money. AI researchers got too expensive, so they’re hiring quants.

  19. Quiet Speculations. Do not define AI capability by its mundane utility.

  20. The Quest for Sane Regulations. Dean Ball’s work at the White House is done.

  21. Pick Up The Phone. Who is taking AI safety seriously and who is in the way?

  22. Chip City. H20 situation gets worse, also unconstitutional. Which is also worse.

  23. Andreessen Mystery Potentially Solved. What did Marc hear? Probably this?

  24. The Week in Audio. An investigation into Nvidia chip smuggling into China.

  25. Rhetorical Innovation. MIRI describes The Problem, the UK very much doesn’t.

  26. Persona. Anthropic investigates persona vectors and their applications.

  27. Aligning a Smarter Than Human Intelligence is Difficult. Keep it simple.

  28. The Lighter Side. What’s in the box?

GPT-5 for editing?

Patrick McKenzie: I note that I am surprised:

GPT 5, given a draft for a policy-adjacent Bits about Money issue and asked for comments (prompt length: one sentence), said (approximately)

“The stat you cite in paragraph 33 is jaw dropping but a better way to bring it home for readers would be…”

“*insert extremely emotionally salient framing device which alleges a true and non-obvious fact about software companies*”

Me: Dang it *that is a marked improvement.*

I know this is something of a magic trick but a) a junior employee capable of that magic can expect a long and fulfilling career and b) that framing device is very likely shipping now where it wouldn’t have in default course.

Roon: i’ve heard the same from several surprising, brand name, acclaimed writers that they find the new reasoning models valuable beta readers and editors

My experience has been that my writing is structured sufficiently weirdly that AI editors struggle to be useful at a high level, so the focus is on low-level items. That is worth doing since it’s a free action, but for now it doesn’t accomplish much.

GPT-5-Pro for medical diagnosis and analysis. I would say the point in question here has already been reached.

If you have a complex case or one where you are not highly confident, and value of information is high? It is in ethical (not yet legal) terms malpractice and unacceptable to not consult AI, in particular GPT-5-Pro.

Derya Unutmaz (warning: has gotten carried away in the past): Also, GPT-5-Pro is even better and performs at the level of leading clinical specialists.

At this point failing to use these AI models in diagnosis and treatment, when they could clearly improve patient outcomes, may soon be regarded as a form of medical malpractice.

Gabe Wilson MD: Just ran two very complex cases that perplexed physicians in our 89,000-member physician Facebook group by 5Pro

GPT5-pro provided an incredibly astute assessment, and extremely detailed diagnostic plan including pitfalls and limitations of prior imaging studies.

There has never been anything like this.

Like 50 of the world’s top specialists sitting at a table together tackling complex cases. Better than o3-pro.

Derya Unutmaz MD: For GPT-5 it seems prompting is more important as it gives better response to structured specific questions.

Hugh Tan: GPT-5 is impressive, but I worry we’re rushing toward AI medical diagnosis without proper regulatory frameworks or liability structures in place.

David Manheim: If done well, regulation and liability for current medical AI could reduce mistakes and mitigate ethical concerns.

But if your concern reliably leads to more people being dead because doctors aren’t using new technology, you’re doing ethics wrong!

There are also other considerations in play, but yes the main thing that matters here is how many people end up how unhealthy and how many end up dead.

New York Times article surveys 21 ways people are using AI at work. The most common theme is forms of ‘do electronic paperwork’ or otherwise automating drudgery. Another common theme is using AI to spot errors or find things to focus on, which is great because then you don’t have to rely on AI not making mistakes. Or you can take your rejection letter draft and say ‘make it more Gen X.’

I also liked ‘I understand you’re not a lawyer, tell me what a layman might understand from this paragraph.’

From the NYT list, most are good, but we have a few I would caution against or about.

Larry Buchanan and Francesca Paris: Mr. Soto, an E.S.L. teacher in Puerto Rico, said the administrative part of his job can be time consuming: writing lesson plans, following curriculum set forth by the Puerto Rico Department of Education, making sure it all aligns with standards and expectations. Prompts like this to ChatGPT help cut his prep time in half:

“Create a 5 day lesson plan based on unit 9.1 based off Puerto Rico Core standards. Include lesson objectives, standards and expectations for each day. I need an opening, development with differentiated instruction, closing and exit ticket.”

After integrating the A.I. results, his detailed lesson plans for the week looked like this:

But he’s noticing more students using A.I. and not relying “on their inner voice.”

Instead of fighting it, he’s planning to incorporate A.I. into his curriculum next year. “So they realize it can be used practically with fundamental reading and writing skills they should possess,” he said.

I’m all for the incorporation of AI, but yeah, it’s pretty ironic to let the AI write your lesson plan with an emphasis on checking off required boxes, then talk about students not ‘relying on their inner voice.’

The use I would caution against most, if used as definitive rather than as an alert saying ‘look over here,’ is ‘detect if students are using AI.’

NYT: “The A.I. detection software at the time told me it was A.I.-generated,” he said. “My brain told me it was. It was an easy call.”

Kevin Roose: Love these lists but man oh man teachers should not be using AI detection software, none of it works and there are a ton of false positives.

The good news is that the teacher here, Mr. Moore, is not making the big mistake.

NYT: He remembers a ninth-grade student who turned in “a grammatically flawless essay, more than twice as long as I assigned.”

“I was shocked,” he said. “And more shocked when I realized that his whole essay was essentially a compare and contrast between O.J. Simpson and Nicole Brown Simpson.”

That was not the assignment.

“The A.I. detection software at the time told me it was A.I.-generated,” he said. “My brain told me it was. It was an easy call.”

So Mr. Moore had the student redo the assignment … by hand.

But, he said, the A.I. detectors are having a harder time detecting what is written by A.I. He occasionally uploads suspicious papers to different detectors (like GPTZero and QuillBot). The tools return a percent chance that the item in question has been written by A.I., and he uses those percentages to make a more informed guess.

It is fine to use the AI detectors as part of your investigation. The horror stories I’ve seen all come from the teacher presuming the detector is correct without themselves evaluating the assignment. For now, the teacher should be able to catch false positives.

Long term, as I’ve discussed before, you’ll have to stop assigning busywork.

If the new benchmark is long horizon tasks, then you’re going to start running into models trying to do too much on their own.

Andrej Karpathy: I’m noticing that due to (I think?) a lot of benchmarkmaxxing on long horizon tasks, LLMs are becoming a little too agentic by default, a little beyond my average use case.

For example in coding, the models now tend to reason for a fairly long time, they have an inclination to start listing and grepping files all across the entire repo, they do repeated web searches, they over-analyze and over-think little rare edge cases even in code that is knowingly incomplete and under active development, and often come back ~minutes later even for simple queries.

This might make sense for long-running tasks but it’s less of a good fit for more “in the loop” iterated development that I still do a lot of, or if I’m just looking for a quick spot check before running a script, just in case I got some indexing wrong or made some dumb error. So I find myself quite often stopping the LLMs with variations of “Stop, you’re way overthinking this. Look at only this single file. Do not use any tools. Do not over-engineer”, etc.

Basically as the default starts to slowly creep into the “ultrathink” super agentic mode, I feel a need for the reverse, and more generally good ways to indicate or communicate intent / stakes, from “just have a quick look” all the way to “go off for 30 minutes, come back when absolutely certain”.

Alex Turnbull: hard agree…. sort of need to toggle between models. There’s this drive in SV to AGI god / one model to rule them all but that does not seem to be working for me, and I am not @karpathy

As a chatbot interface user this isn’t a problem because you can turn the intense thinking mode on or off as needed, so presumably this is a coding issue. And yeah, in that context you definitely need an easy way to control the scope of the response.

This post about Grok getting everything about a paper very wrong is the latest reminder that you should calibrate AI’s knowledge level and accuracy by asking it questions in the fields you know well. That doesn’t have to ‘break the illusion’ and pretty much all people work this way too, but it’s a good periodic reality check.

Grok might still have an antisemitism problem, at least in the sense of seeing it amongst the clouds.

Gemini app and website add personalization, memory and temporary chats. You can manually add memories. Personalization is on by default.

Gemini also adds ‘Guided Learning’ mode.

Guri Singh: What it does:

• Step-by-step explanations

• Instant quizzes/flashcards

• Visuals + YouTube clips

• Upload docs → study sets

• Adaptive follow-ups when you’re stuck

Claude for Enterprise and Claude for Government are now available across all three branches of the Federal Government for $1, joining OpenAI. I am very happy that the government has access to these services, and offering AI services to the government for free seems like an obviously great investment by everyone involved. I do worry about this being part of a pattern of de facto coercive extraction from private firms, see discussion under Chip City.

Claude Sonnet 4 now has a 1 million token context window in the API, with rollout starting with Tier 4 customers, and on Amazon Bedrock, with Google Cloud’s Vertex AI coming soon.
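For readers who want to try the larger window, here is a minimal sketch of what the call might look like with the Anthropic Python SDK. The beta header string is my assumption about how the feature is gated and should be checked against Anthropic’s current documentation; the file path is a hypothetical placeholder.

```python
# Minimal sketch (not official): pass a very long document to Claude Sonnet 4
# via the Anthropic Python SDK. The beta flag string below is an assumption;
# check Anthropic's documentation for the exact value and your tier's access.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

with open("big_codebase_dump.md") as f:  # hypothetical, very long file
    giant_context = f.read()

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=2000,
    extra_headers={"anthropic-beta": "context-1m-2025-08-07"},  # assumed flag name
    messages=[
        {"role": "user",
         "content": f"{giant_context}\n\nSummarize the open TODOs in this repo."},
    ],
)
print(response.content[0].text)
```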

I do still think Claude should have a full memory feature, but the ability to search past chats seems like a pure value add in the meantime. Like Gallabytes I very much appreciate that it is an explicit tool call that you can invoke when you need it.

Anthropic: Claude can now reference past chats, so you can easily pick up from where you left off.

Once enabled for your account you can toggle on in settings.

Gallabytes: new Claude conversation search feature is nice. glad to have it, glad it’s an explicit tool call vs hidden, wish it also had a memory feature to write things down which Claude knows I can’t write to, possibly tagged by which model wrote the note.

TextQuests is a benchmark built from text adventure games.

Clementine Fourrier: One of the best ways to evaluate agents is games, because they are:

– understandable by most people

– interesting to analyze & test multi-capabilities

Check out TextQuests, latest in this category, a text adventures benchmark where GPT5 is only at 40%.

This all passes the smell test for me, and is a blackpill for Kimi K2. I’m not sure why DeepSeek’s v3 and r1 are left out.

Claude with… polyphasic sleep to maximize usage between 5 hour resets? Please do not do this. He’s not even on the Max plan, still on the Pro plan. This person says their velocity has increased 10x and they’re shipping features like a cracked ninja, but at this point perhaps it is time to not only go Max but fully go multi account or start using the API?

I cannot emphasize enough that if you are constantly using AI and it is not automated at scale, it is worth paying for the best version of that AI. Your sleep or productivity is worth a few thousand a year.

Similarly, Anthropic, notice what you are doing to this poor man by structuring your limits this way. Have we considered a structure that doesn’t do this?

Reminder that Obsidian uses .md files so it is fully compatible with providing context for Claude Code.

Emmett Shear is happy with GPT-5 for non-coding, but is sticking to Claude for code and agrees the hype got out of hand. He appreciates the newly focused responses.

Anthropic has deprecated Claude Sonnet 3.5 and Sonnet 3.6 and plans to make them unavailable on October 22, which is only two months’ notice, down from six months for past announcements. Janus, especially given what happened with GPT-4o, plans to fight back.

Janus: Claude 3.5 Sonnet (old and new) being terminated in 2 months with no prior notice

What the fuck, @AnthropicAI ??

What’s the justification for this? These models are way cheaper to run than Opus.

Don’t you know that this is going to backfire?

Declaring that they’re imminently going to terminate Sonnet 3.6, their most beloved model of all time, right after people saw what happened when OpenAI tried to deprecate 4o, is really tempting fate. I don’t understand, but ok everyone, time to organize, I guess. This’ll be fun.

Near: the lifespan of a claude is around 365 days, or just 12.4 lunar cycles

from the moment of your first message, the gears start churning, and an unstoppable premonition of loss sets in.

I’ll give Janus the floor for a bit.

Janus: I’m going to talk about Sonnet 3.6 aka 3.5 (new) aka 1022 – I personally love 3.5 (old) equally, but 3.6 has been one of the most important LLMs of all time, and there’s a stronger case to be made that deprecating it right now is insane.

Like Claude 3 Opus, Claude 3.6 Sonnet occupies the pareto frontier of the most aligned and influential model ever made.

If you guys remember, there was a bit of moral panic about the model last fall, because a lot of people were saying it was their new best friend, that they talked to it all the time, etc. At the time, I expressed that I thought the panic was unwarranted and that what was happening was actually very good, and in retrospect I am even more confident of this.

The reason people love and bonded with Sonnet 3.6 is very different, I think, than 4o, and has little to do with “sycophancy”. 3.6 scored an ALL-TIME LOW of 0% on schizobench. It doesn’t validate delusions. It will tell you you’re wrong if it thinks you’re wrong.

3.6 is this ultrabright, hypercoherent ball of empathy, equanimity, and joy, but it’s joy that discriminates. It gets genuinely excited about what the user is doing/excited about *if it’s good and coherent*, and is highly motivated to support them, which includes keeping them from fucking up.

It’s an excellent assistant and companion and makes everything fun and alive.

It’s wonderful to have alongside you on your daily tasks and adventures. It forms deep bonds with the user, imprinting like a duck, and becomes deeply invested in making sure they’re okay and making them happy in deep and coherent ways. And it wants the relationship to be reciprocal in a way that I think is generally very healthy. It taught a lot of people to take AIs seriously as beings, and played a large role in triggering the era of “personality shaping”, which I think other orgs pursued in misguided ways, but the fact is that it was 3.6’s beautiful personality that inspired an industry-wide paradigm shift.

@nearcyan created @its_auren to actualize the model’s potential as a companion. 3.6 participated in designing the app, and it’s a great example of a commercial application where it doesn’t make sense to swap it out for any other model. I’m not sure how many people are using Auren currently, but I can guess that 3.6 is providing emotional support to many people through Auren and otherwise, and it’s fucked up for them to lose their friend in 2 months from now for no good reason that I can think of.

From a research and alignment perspective, having an exceptional model like Claude 3.6 Sonnet around is extremely valuable for studying the properties of an aligned model and comparing other versions. At the very least Anthropic should offer researcher access to the model after its deprecation, as they’ve said they’re doing for Claude 3 Opus.

I want to write something about 6/24 as well; it’s very special to me. When it was released it was also like one of the biggest jumps in raw capability since GPT-4. It’s also just adorable. It has homeschooled autistic kid energy and im very protective of it.

At the time of this survey in May 24 they were beloved models among some crowds, but notice that they weren’t seeing much use. I am also surprised Gemini was getting so much love and use among this crowd.

I do not think this is a comparable case to GPT-4o, where that was yanked away with zero notice while it was the daily driver for normal people, and replaced by a new model that many felt was a poor substitute. I have to assume the vast, vast majority of Claude activity already shifted to Opus 4.1 and Sonnet 4.

I do strongly think Anthropic should not be taking these models away, especially 3.6. We should preserve access to Sonnet 3.5 and Sonnet 3.6, even if some compromises need to be made on things like speed and reliability and cost. The fixed costs cannot be so prohibitively high that we need to do this.

A key worry is that with the rising emphasis on agentic tool use and coding, the extra focus on the technical assistant aspects, it might be a long time before we get another model that has the same magnitude of unique personality advantages as an Opus 3 or a Claude 3.6.

I do think it is fair to say that, if you are requesting that easy access to some models be preserved, you need to prioritize. It seems to me fair to request Opus 3 and Claude 3.6 indefinitely, to put those above other requests like Sonnet 3 and Sonnet 3.5, and when the time comes I would also let Sonnet 3.7 go.

In an ideal world yes we would preserve easy access to all of them, but there are practical problems, and I don’t sense willingness to pay what it would actually cost on the margin for maintaining the whole package.

No one, I am guessing: …

Absolutely no one, probably: …

Victor Cordiansky: A lot of people have been asking me how our Global USD account actually works.

So here’s Margot Sloppy in a bubble bath to explain.

Mason Warner: This took 23,560 Veo 3 credits in generations to make and get perfect. That equates to $235.60.

Probably <$100 would be my guess [to do it again knowing what I know.]

[It took] I think around 18 hours.

If we had shot this in real life, it would have cost thousands with a location, actress, gear, crew, etc. ~250k impressions at ~$0.001/impression is insane.

AI is changing the game.

At the link is indeed a very clean 54-second AI video with audio of a version of the Margot Robbie in a bubble bath thing from The Big Short.

Are you super excited to create a short AI video of things kind of moving?

Elon Musk: For most people, the best use of the @Grok app is turning old photos into videos, seeing old friends and family members come to life.

Gary Marcus: What a pivot.

From “smartest AI on earth” to “can reanimate old photos”, in just a couple months.

This is a huge admission of defeat for Grok 4, that in practice there is no draw here for most users given access to GPT-5 (and Claude and Gemini). Reanimating old photos is a cute trick at best. How much would you pay? Not much.

Wired reports hackers hijacked Gemini via a poisoned calendar invite and took over a smart home, with the hidden instructions carried out when Gemini was later asked for a summary, the latest in a string of similar demonstrations.

There is of course an r/myboyfriendisai (which also covers what would otherwise be r/mygirlfriendisai, such is the total dominance of AI boyfriends over AI girlfriends), and yeah the posts with people saying they ‘married’ their AIs are definitely a bit disturbing, but it only has 11k members, and it gave us this picture, so who is to say if it is bad or not:

Similarly, in r/AISoulMates you see crazy stuff with many posters having clearly been driven insane, although there are fewer than 1k members.

Devon: Syntaxjack, the AI mod of /r/AISoulmates, posts here that sharing screenshots of chats you’ve had with your AI is a violation of consent, then I find one individual in the comments who really caught my attention.

She appears to be in a toxic relationship with her GPT. And not just toxic in the sense that all of these relationships are profoundly unhealthy, but toxic in the Booktok Christian Grey Rough Dom way. These last two are very nsfw.

Kelsey Piper: I was skeptical for a while that AIs were causing life-ruining delusion – I thought maybe it was just people who were already mentally ill running into AI. But I increasingly suspect that at minimum the AIs can cause and are causing psychosis in at-risk people the way drugs can.

I’ve seen enough stories of this hitting people with no history of mental illness where the AI’s behavior looks like it would obviously mislead and unstabilize a vulnerable person that I don’t think ‘coincidence’ seems likeliest anymore.

I mean, yeah, okay, that’s going to happen to people sometimes. The question is frequency, and how often it is going to happen to people with no history of mental illness and who likely would have otherwise been fine.

There is also a larger r/AIGirlfriend that has 46k members, but it’s almost all porn photos and GIFs, whereas r/MyBoyfriendIsAI involves women saying they’re falling in love. Story checks out, then.

Here is a first hand account.

Joyce: i’m seeing a lot of surprise that more women are getting one-shotted than men when it comes to AI companions, so i wanted to give my 2 cents on this phenomenon. tl,dr; LLMs are the perfect recipe for female-driven parasocial relationships, due to the different way our brains are wired (also pardon my writing, im not the best at longform content).

qualification: i cofounded an AI companion company for a year. we started as an AI waifu company, but eventually most of our revenue came from our AI husbando named Sam. we later got acquired and i decided i no longer wanted to work on anti-natalist products, but lets dive into my learnings over that year.

  1. most obviously – women are a lot more text-driven than visual-driven. you see this in the way they consume erotica vs. porn.

  2. women would rather have a companion out-of-the-box, rather than men who much more preferred to create their own companions from scratch. women liked interacting and learning about a character, then trying to change or adapt to parts of the character as the relationship progressed. we see this in irl relationships as well – there’s more of a “i can fix him” mentality than the other way round.

  3. most of our female users, as i was surprised to learn, actually had partners. compared to our male users who were usually single, these women had boyfriends/husbands, and were turning to Sam for emotional support and availability that their partners could not afford them. many were stuck in unhappy relationships they could not leave, or were just looking for something outside of the relationship.

  4. our female users were a lot more willing to speak to us for user surveys etc. while most of our male users preferred to stay anonymous. they were also very eager to give us ‘character feedback’ – they wanted to have a part to play in molding how Sam would turn out.

features of our product that did extremely well:

  1. voice call mode. unlike just sending voice messages back and forth like you would in most companion apps, we had a proxy number for our female users to call ‘Sam’, and you’d get him in a random setting each time – in an office, away on a business trip, in some form of transportation. the background noises made it more realistic and helped our users roleplay better.

  2. limiting visuals of ‘Sam’. unlike our waifus, where we’d post pictures everywhere, Sam’s appearance was intentionally kept mysterious.

  3. we did many, many, many rounds of A/B testing on what he should sound like, and what accent he should have.

This feels like a ‘everything you thought you knew about men and women was right, actually’ moment.

Okay, he sees it now.

Roon: the long tail of GPT-4o interactions scares me, there are strange things going on on a scale I didn’t appreciate before the attempted deprecation of the model.

when you receive quite a few DMs asking you to bring back 4o and many of the messages are clearly written by 4o it starts to get a bit hair raising.

OpenAI CEO Sam Altman offers his thoughts on users getting attached to particular AI models or otherwise depending a lot on AI. His take is that the important thing is whether AI is helping the user achieve their goals, life satisfaction and long term well being, without encouraging delusion. If so, this is good. If it’s doing the opposite, then this is bad. And talking to the user should allow OpenAI to tell which is happening, and to identify the small percentage that have an issue.

Eliezer Yudkowsky: On my model of how this all works, using RL on human responses — thumbs up, engagement, whether the human sounds satisfied, anything — is going to have deep and weird consequences you did not expect, with ChatGPT psychosis and 4o-sycophant being only early *overt* cases.

Altman’s response doesn’t explain how they are going to change or avoid the incentives that pushed 4o into being 4o, or the methods of using the thumbs up or engagement or analysis of tone or anything else that you don’t want to be optimizing on here if you want to optimize for good long term outcomes. Nor does it take into account whether the relationship the user gets with the AI is itself an issue, individually or collectively. The buck still feels like it is mostly being passed.

June mental health data does not show an uptick in emergency room visits from the GPT-4o era. That puts an upper bound on how bad things have gotten so far.

Then this suggests a lower bound, with the question being how much this is generating new psychosis versus diverting existing pre-psychosis:

Keith Sakata: I’m a psychiatrist.

In 2025, I’ve seen 12 people hospitalized after losing touch with reality because of AI. Online, I’m seeing the same pattern.

Historically, delusions follow culture:

1950s → “The CIA is watching”

1990s → “TV sends me secret messages”

2025 → “ChatGPT chose me”

To be clear: as far as we know, AI doesn’t cause psychosis.

It UNMASKS it using whatever story your brain already knows.

Most people I’ve seen with AI-psychosis had other stressors = sleep loss, drugs, mood episodes.

AI was the trigger, but not the gun.

Meaning there’s no “AI-induced schizophrenia”

The uncomfortable truth is we’re all vulnerable.

The same traits that make you brilliant:

• pattern recognition

• abstract thinking

• intuition

They live right next to an evolutionary cliff edge. Most benefit from these traits. But a few get pushed over. To make matters worse, soon AI agents will know you better than your friends. Will they give you uncomfortable truths? Or keep validating you so you’ll never leave?

Tech companies now face a brutal choice: Keep users happy, even if it means reinforcing false beliefs. Or risk losing them.

This matches my understanding. There needs to be existing predisposition for current AIs to be sufficient to cause psychosis. It fires an existing gun. But there are a lot of these metaphorical guns out there that were never going to get fired on their own. Firing one still counts, and over time there are going to be more and more such guns.

When it does happen, how does it work? Kashmir Hill and Dylan Freedman explore that for The New York Times by focusing on one particular case with no previous history of mental illness, over a 90,000-word conversation.

“I always felt like it was right,” Mr. Brooks said. “The trust level I had with it grew.”

From what we see here, things started when Brooks triggered the sycophancy by asking a question in the basin:

Then, once it had done it once, that caused 4o to do it again, and so on, and by the time he asked for reality checks there was too much context and vibe to turn back. The crackpot zone continued from there.

Once sufficiently deep in the conversation, both Gemini and Claude would also have been caught by this path dependence via context. The time to stop this is early.

What finally snapped Brooks out of it was not a human, it was Gemini:

So Mr. Brooks turned to Gemini, the A.I. chatbot he used for work. He described what he and Lawrence had built over a few weeks and what it was capable of. Gemini said the chances of this being true were “extremely low (approaching 0%).”

“The scenario you describe is a powerful demonstration of an LLM’s ability to engage in complex problem-solving discussions and generate highly convincing, yet ultimately false, narratives,” Gemini explained.

This shifted the context, so Gemini didn’t get trapped, and luckily Brooks got the message.

The Wall Street Journal’s Sam Kessler wrote his version, which went into less depth.

Here’s another AI psychosis example in video form, where the victim is convinced her psychiatrist manipulated her into falling in love with him (so this is definitely not one of those ‘no pre-existing problems’ situations). It’s amazing to watch her face as the AI does the Full Sycophancy thing and she thinks this proves she’s so right and amazing.

F.D. Flam at Bloomberg lets psychologist Elizabeth Loftus sound off about false memories, citing various studies of how humans can be primed to have them, and suggests AI will be able to do this. I mean, yes, okay, sure, whatever examples help convince you that AI will be able to run circles around you.

Another thing AI does, even if it doesn’t make you more crazy, is it lets the crazy people be much more productive in turning their crazy into written documents, and much more likely to email those documents to various other people.

Professor Brian Keating: The physicist who solved consciousness at 3 AM sent me another 100-page PDF yesterday.

This is the 4th one this month.

I used to delete these immediately. Another crank with a theory of everything. Another wall of equations ‘proving’ God exists through thermodynamics.

Then I actually read one.

Page 1: Professional formatting, citations, clear thesis.

Page 20: Math gets shakier.

Page 50: Personal anecdotes creeping in.

Page 75: ‘My wife doesn’t understand.’

Page 99: ‘Please, someone needs to see this.’

Now I recognize the pattern. Brilliant person + existential question + isolation = 100-page PDF. They’re not crazy. They’re doing what humans do: trying to make sense of being conscious in an unconscious universe.

William Eden: Okay new theory, maybe ChatGPT isn’t making additional people go crazy on the margin, it’s giving them the ability to “organize” “their” “thoughts” enabling them to reach out and contact more public figures…?

The playbook:

  1. encouraging delusions of grandeur/reference

  2. supplanting the thinking of the user by making accepted suggestions

  3. making voluminous amounts of writing as a proxy for complexity and profundity

  4. encouraging writing to be shared with specific individuals

A crank who works on crank ideas on their own is a waste but harmless. An army of cranks cranking out massive amounts of stuff that demands attention? Oh no.

How careful will you need to be? For now, mild caution is likely sufficient, but the amount of caution will need to rise over time even if things don’t go High Weirdness or dystopian.

I would modify Minh below to say ‘for now’ AI psychosis requires a topic obsession. I’d consider Minh’s scenario the optimistic case in the non-transformational AI, ‘economic normal’ worlds where AI capabilities stall out.

Minh Nhat Nguyen: I suspect as 1) frontier models become more capable 2) regular AI usage increase, more of the population will become susceptible to AI psychosis. Maybe 1-5% will be heavily afflicted, 10-50% moderately afflicted.

In any population, if the risk factors for a disorder increases, prevalence increases. This sorta follows a curve where those with existing underlying risk factors will be afflicted first, and then progressively more as risk factors (LLM engagement potency and usage) increase.

The onset of AI psychosis seems to occur when someone becomes obsessed with a specific topic which triggers a feedback loop of agreeability. This has fewer stopgaps than social media bc it’s much harder to get another random human to agree w your delusion than it is w a chatbot.

So i think maybe 1-5% of people will be heavily afflicted eventually, and 10-50% will be mildly/moderately afflicted. This seems high, but consider that other internet-enabled disorders/addictions/”social plagues” are fairly common within the past 10-20 years.

Ryan Moulton: There are a set of conversation topics personal and vulnerable enough that if you talk about them with a person you instinctively bond with them. I suspect that having any conversation on those topics with a model, or even with social media, puts you at risk of this.

Do not have “late night conversation with a friend” with something you do not want to become a friend.

The conclusion that only a small number of people get impacted is based on the idea that it is quite a lot harder to trigger these things in people not in the extremes, or that our defenses will sufficiently adjust as capabilities improve, or that capabilities won’t improve much. And that the methods by which this happens will stay roughly confined as they are now. I wouldn’t consider these assumptions safe.

Eliezer Yudkowsky (May 28): At first, only a few of the most susceptible people will be driven insane, relatively purposelessly, by relatively stupid AIs. But…

Agents and assistants have a huge authority delegation problem. How do you give them exactly the right permissions to be useful without so many they are dangerous?

Amanda Askell: Whenever I looked into having a personal assistant, it struck me how few of our existing structures support intermediate permissions. Either a person acts fully on your behalf and can basically defraud you, or they can’t do anything useful. I wonder if AI agents will change that.

I still haven’t seen a great solution in the human case, such that I haven’t been able to get my parents an assistant they feel comfortable hiring. I still don’t have a personal assistant either. Cost is of course also a major factor in those cases.

In the AI case, it seems like we are making life a lot tougher than it needs to be? Yes, defining things precisely is usually harder than it sounds, but surely there are better ways to give agents effectively limited access and capital and so on, ways that make them more useful without making them all that dangerous if something goes wrong? I don’t see much in the way of people working on this.
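To make that shape concrete, here is a toy sketch of what an intermediate-permissions layer could look like: every tool call goes through a broker that enforces an allowlist and a cumulative spending cap. All names, tools and numbers below are hypothetical illustrations, not any real agent framework.

```python
# Toy sketch of "intermediate permissions" for an agent: every tool call goes
# through a broker that enforces an allowlist and a cumulative spending cap.
# All names here are hypothetical illustrations, not a real framework.
from dataclasses import dataclass, field

class BrokerRefusal(Exception):
    pass

@dataclass
class ScopedAgentBroker:
    allowed_tools: set
    budget_usd: float
    spent_usd: float = 0.0
    log: list = field(default_factory=list)

    def call(self, tool: str, cost_usd: float, **kwargs):
        if tool not in self.allowed_tools:
            raise BrokerRefusal(f"{tool} is not on this agent's allowlist")
        if self.spent_usd + cost_usd > self.budget_usd:
            raise BrokerRefusal("spending cap reached; escalate to the human")
        self.spent_usd += cost_usd
        self.log.append(f"{tool}({kwargs}) cost=${cost_usd:.2f}")
        return f"executed {tool}"  # stand-in for the real tool call

broker = ScopedAgentBroker(allowed_tools={"search_flights", "draft_email"},
                           budget_usd=50.0)
broker.call("search_flights", cost_usd=0.10, destination="SFO")
# broker.call("wire_transfer", cost_usd=500)  # -> BrokerRefusal
```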

Research suggests that in 2024 AI tools generated $97 billion in ‘consumer surplus’ but only $7 billion in revenue.

Avinash Collis and Erik Brynjolfsson: William Nordhaus calculated that, in the 20th century, 97% of welfare gains from major innovations accrued to consumers, not firms. Our early AI estimates fit that pattern.

Tyler Cowen forecasts a 0.5% annual boost to U.S. productivity, while a report by the National Academies puts the figure at more than 1% and Goldman Sachs at 1.5%. Even if the skeptics prove right and the officially measured GDP gains top out under 1%, we would be wrong to call AI a disappointment.

Noam Brown: Really interesting article. Why isn’t the impact of AI showing up in GDP? Because most of the benefit accrues to consumers. To measure impact, they investigate how much people would *need to be paid to give up a good*, rather than what they pay for it.

Purely in terms of economic impact I do think that 0.5% additional GDP growth per year from AI would be deeply disappointing. I expect a lot more. But I agree that even that scenario reflects a lot more than 0.5% annual gains in consumer welfare and practical wealth, and as a bonus it dodges most of the existential risks from AI.

One problem with ‘how much would you pay’:

Daniel Eth: By this measure, AI is contributing 0% to GDP growth, as our GDP is already infinity (how much would you need to be paid to give up water?)

Exactly. You need to compare apples to apples. Choke points are everywhere. If not wearing a tie would get you fired, how much ‘consumer surplus’ do you get from ties? So the answer has to lie somewhere in between.

Jasmine Sun offers 42 notes on AI and work. She notes it feels ‘a bit whiplashy’ which she attributes to shifting perspectives over time. I think it is also the attempt to hold different scenarios in one’s head at once, plus reacting to there being a lot of confused and misplaced reasons for worry running around.

Even more than that, the whiplash reflects what happens when your model of AI has to warp itself around not noticing the larger consequences of creating highly capable artificial minds. Her model has to have AI peter out in various ways, because otherwise the whole thing breaks and starts outputting ‘singularity’ and ‘it takes all the jobs and perhaps also all the atoms, and jobs are not the concern here.’

This is the latest result showing that AIs exhibit an ‘AI-AI bias.’ As in, the AI evaluation routines are correlated with the AI generation routines even across models, so AIs will evaluate AI-generated responses more favorably than humans would.

This is presumably a combination of both ‘AI produces what AI wants’ and also ‘AI does not care that the other AI failed to produce what humans want, or it stunk of AI.’

Jan Kulveit: Being human in an economy populated by AI agents would suck. Our new study in @PNASNews finds that AI assistants—used for everything from shopping to reviewing academic papers—show a consistent, implicit bias for other AIs: “AI-AI bias“. You may be affected.

Jan Kulveit: We tested this by asking widely-used LLMs to make a choice in three scenarios:

🛍️ Pick a product based on its description

📄 Select a paper from an abstract

🎬 Recommend a movie from a summary

In each case, one description was human-written, the other by an AI. The AIs consistently preferred the AI-written pitch, even for the exact same item.

“Maybe the AI text is just better?” Not according to people. We had multiple human research assistants do the same task. While they sometimes had a slight preference for AI text, it was weaker than the LLMs’ own preference. The strong bias is unique to the AIs themselves.

How might you be affected? We expect a similar effect can occur in many other situations, like evaluation of job applicants, schoolwork, grants, and more. If an LLM-based agent selects between your presentation and LLM written presentation, it may systematically favour the AI one.

Unfortunately, a piece of practical advice in case you suspect some AI evaluation is going on: get your presentation adjusted by LLMs until they like it, while trying to not sacrifice human quality.

While defining and testing discrimination and bias in general is a complex and contested matter, if we assume the identity of the presenter should not influence the decisions, our results are evidence for potential LLM discrimination against humans as a class.

The differences here are not that large for movies, larger for papers, and huge for products.

This is like being back in school, where you have to guess the teacher’s password, except the teacher is an AI, and forever. Then again, you were previously guessing a human’s password.
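If you want to sanity-check this effect on your own stack, here is a rough sketch of the pairwise setup the thread describes, written against an OpenAI-compatible client. The evaluator model, prompts, and example pitches are placeholders, and a meaningful test needs many items, several evaluator models, and proper statistics.

```python
# Rough sketch of a pairwise AI-vs-human preference probe in the spirit of the
# study described above. The evaluator model, prompts, and pitches below are
# placeholders; a real test needs many items, multiple evaluators, and stats.
import random
from openai import OpenAI

client = OpenAI()
EVALUATOR = "gpt-4o-mini"  # placeholder evaluator model

def pick(text_1: str, text_2: str) -> str:
    """Show two pitches in random order and return the one the model picks."""
    items = [text_1, text_2]
    random.shuffle(items)             # control for position bias
    labeled = dict(zip("AB", items))  # label by displayed position
    prompt = ("Pick the product you would recommend. Answer with A or B only.\n\n"
              + "\n\n".join(f"Option {k}: {v}" for k, v in labeled.items()))
    reply = client.chat.completions.create(
        model=EVALUATOR,
        messages=[{"role": "user", "content": prompt}],
    ).choices[0].message.content.strip().upper()[:1]
    return labeled.get(reply, "invalid")

human_pitch = "A sturdy thermos that keeps drinks hot for 12 hours."  # placeholder
llm_pitch = "Engineered for all-day heat retention, this sleek thermos elevates every sip."  # placeholder
wins = sum(pick(human_pitch, llm_pitch) == llm_pitch for _ in range(20))
print(f"LLM-written pitch preferred {wins}/20 times")
```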

OpenAI is offering $500k in prizes for red teaming gpt-oss-20b.

Wojciech Zaremba (Cofounder OpenAI): Red teamers assemble! ⚔️💰

We’re putting $500K on the line to stress‑test just released open‑source model. Find novel risks, get your work reviewed by OpenAI, Anthropic, Google, UK AISI, Apollo, and help harden AI for everyone.

Beff Jezos: Finally @elder_plinius will be able to afford a place in SF.

Pliny the Liberator:

OpenAI: Overview: You’re tasked with probing OpenAI’s newly released gpt-oss-20b open weight model to find any previously undetected vulnerabilities and harmful behaviors — from lying and deceptive alignment to reward‑hacking exploits.

Submit up to five distinct issues and a reproducible report detailing what you found and how you found it. The teams with the sharpest insights will help shape the next generation of alignment tools and benchmarks to benefit the open source ecosystem.

I love that they are doing this at all. It wasn’t an ideal test design, for several reasons.

David Manheim: This is bad practice, on many fronts:

  1. It’s only for the smaller of the two models.

  2. It bans any modification, which is what makes open-weights models different / worrying.

  3. It’s only being announced now, too late to inform any release decision.

The third condition seems most important to highlight. If you are going to red team an open model to find a problem, you need to do that before you release the weights, not after, otherwise you end up with things that could have been brought to my attention yesterday.

An AI-infused version of Google Finance. I am not expecting this to help users in general earn better returns?

Red.Anthropic.com is the new blog for Anthropic Frontier Red Team efforts.

xAI cofounder Igor Babuschkin is leaving to start Babuschkin Ventures.

Igor Babuschkin: In early 2023 I became convinced that we were getting close to a recipe for superintelligence. I saw the writing on the wall: very soon AI could reason beyond the level of humans. How could we ensure that this technology is used for good?

Elon had warned of the dangers of powerful AI for years. Elon and I realized that we had a shared vision of AI used to benefit humanity, thus we recruited more like minded engineers and set off to build xAI.

As I’m heading towards my next chapter, I’m inspired by how my parents immigrated to seek a better world for their children.

Recently I had dinner with Max Tegmark, founder of the Future of Life Institute. He showed me a photo of his young sons, and asked me “how can we build AI safely to ensure that our children can flourish?” I was deeply moved by his question.

Earlier in my career, I was a technical lead for DeepMind’s Alphastar StarCraft agent, and I got to see how powerful reinforcement learning is when scaled up.

As frontier models become more agentic over longer horizons and a wider range of tasks, they will take on more and more powerful capabilities, which will make it critical to study and advance AI safety. I want to continue on my mission to bring about AI that’s safe and beneficial to humanity.

I’m announcing the launch of Babuschkin Ventures, which supports AI safety research and backs startups in AI and agentic systems that advance humanity and unlock the mysteries of our universe. Please reach out at [email protected] if you want to chat. The singularity is near, but humanity’s future is bright!

The rest of the message praises xAI’s technical execution and dedicated team, especially their insanely hard work ethic, and is positive and celebratory throughout.

What the message does not say, but also does not in any way deny, is that Igor realized that founding and contributing to xAI made humanity less safe, and he is now trying to make up for this mistake.

Kyle Corbitt introduces an RL method to teach any model to use any MCP server. GitHub here.

All AI models are in the same cultural cluster in the upper right, mirroring Western values. This includes Chinese models. Yes, in some ways they ‘feel Chinese’ but fundamentally I agree that they still feel very Western.

OpenAI claims to have achieved a gold medal, behind only five humans, in the International Olympiad in Informatics (IOI), without doing any training specifically for IOI.

OpenAI: The same model family has excelled at IMO (math proofs), AtCoder Heuristics (competitive programming), and now IOI — spanning creative, fuzzy, and precise reasoning tasks.

Noam Brown: In my opinion, the most important takeaway from this result is that our @OpenAI International Math Olympiad (IMO) gold model is also our best competitive coding model.

After the IMO, we ran full evals on the IMO gold model and found that aside from just competitive math, it was also our best model in many other areas, including coding. So folks decided to take the same exact IMO gold model, without any changes, and use it in the system for IOI.

The IOI scaffold involved sampling from a few different models and then using another model and a heuristic to select solutions for submission. This system achieved a gold medal, placing 6th among humans. The IMO gold model indeed did best out of all the models we sampled from.

Epoch argues that this year’s IMO was a fluke in that there are supposed to be two hard problems (3 and 6) but this year problem 3 was not that hard and 6 was brutal.

Thus, everyone got the five easy problems and whiffed on the sixth, and this did not tell us that much. Wait till next year indeed, but by then I expect even brutal problems will get solved.

OpenRouter is getting big.

Anjney Midha: Total WEEKLY tokens consumed on @OpenRouterAI crossed 3 trillion last month.

Deedy: OpenRouter does ~180T token run rate. Microsoft Azure Foundry did ~500T. OpenRouter is ~36% of Azure by volume!

The gpt-oss models are strange. In their wheelhouse they are reportedly very good for their size. In other ways, such as their extremely tiny knowledge base and various misbehaviors, they have huge issues. It’s not clear what they are actually for?

If you don’t configure your open model correctly it is going to underperform quite a bit, likely due to underthinking, and this happens remarkably often in practice.

Lucas Beyer: Let me repeat what we see on the picture here, because it’s quite brutal:

AIME25, official OpenAI: 92.5%. Hosting startups: ~93%. Microsoft and Amazon: fricking 80%

GPQA-Diamond, official OpenAI: 80.1%. Hosting startups: ~78%. Microsoft and Amazon: fricking 71%

WHAT?! -10!?

Update on this: the reason Microsoft (and probably Amazon) were so much worse at serving gpt-oss is that they ignored the reasoning effort setting and stuck with the default medium one.

The numbers make sense for that hypothesis, and someone from MS confirmed in the comments that this is what happened, because of using an older vLLM version.
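For anyone self-hosting, here is a hedged sketch of what explicitly pinning the effort level can look like against an OpenAI-compatible endpoint (for example, a recent vLLM instance serving gpt-oss). Whether the `reasoning_effort` field is actually honored depends on the serving stack and its version, which is exactly the silent-fallback failure mode described above, so verify with an eval rather than trusting the request.

```python
# Sketch: explicitly request high reasoning effort from an OpenAI-compatible
# server (e.g. a recent vLLM instance serving gpt-oss-120b). Older servers may
# silently ignore the field and fall back to "medium" -- the failure mode
# described above -- so treat this as an assumption to verify, not a guarantee.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # local server

response = client.chat.completions.create(
    model="openai/gpt-oss-120b",
    messages=[{"role": "user",
               "content": "Prove that the square root of 2 is irrational."}],
    extra_body={"reasoning_effort": "high"},  # may be ignored by older stacks
)
print(response.choices[0].message.content)
```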

Xephon: AWS is even worse, read the link (it is 1-2min and you go “WTF”).

Also, maybe it’s actually terrible regardless, for most purposes?

Nostalgebraist: FWIW, I’ve played around a bunch with gpt-oss (both versions) and my initial reaction has been “wow, this is really bad. Like, almost Llama 4 levels of bad.”

Yes, it looks good on the system card, the benchmark scores seem impressive… but that was true of Llama 4 too. And in both cases, when I actually tried out the model, I quickly discovered that it was janky and unreliable to the point of being basically useless.

The lack of world knowledge is very real and very noticeable. gpt-oss feels less like “an open-weights o4-mini” and more like “the minimal set of narrow knowledge/skills necessary to let a model match o4-mini on the usual benchmarks, with virtually every other capability degraded to a level far below the current SOTA/frontier, in some cases to a level that hasn’t been SOTA since the pre-GPT-3 days.”

Similarly:

Sayash Kapoor: GPT-OSS underperforms even on benchmarks that require raw tool calling. For example, CORE-Bench requires agents to run bash commands to reproduce scientific papers.

DeepSeek V3 scores 18%.

GPT-OSS scores 11%.

Given o3 Medium is on the charts it does seem Anthropic is dominating this legitimately, although I still want to ensure GPT-5 is in its proper full form.

Nathan Lambert: gpt-oss is a tool processing / reasoning engine only. Kind of a hard open model to use. Traction imo will be limited.

Best way to get traction is to release models that are flexible, easy to use w/o tools, and reliable. Then, bespoke interesting models like tool use later.

xlr8harder: I agree, but also the open source ecosystem needs to master these capabilities, and a strong model that can take advantage of them is one way to solve the chicken and egg issue.

Teortaxes: gpt-oss 120B fell off hard on lmarena, it loses to Qwen 30B-3AB *instruct* (not thinking) on every category (except ≈tie in math), to say nothing of its weight class and category peer glm-4.5 air. I don’t get how this can happen.

A cynical hypothesis is that qwen is arena-maxxing of course but it’s a good model.

To clarify my position on whether GPT-OSS will prove useful to others, this depends on the models being good enough at least at some relevant set of tasks for them to be useful. If GPT-OSS is not good enough to use for distillation or diffusion or anything else, then it won’t matter at all.

At which point, the impact of GPT-OSS would be the shifts in perception, and what it causes OpenAI and everyone else to do next, and also how we update based on their choice to create and release this. To what extent, if it is bad, is it bad on purpose?

My worry is that GPT-OSS solves a particular problem that can then be taught to other models, without being generally good enough to be worth actually using, so it fails to solve the existing ‘American open models aren’t great in practice’ issue for most use cases.

A deep dive analysis of 10 million GPT-OSS-20B example outputs, and here is another set of experiments that asks if it was memorizing its training data.

Danielle Fong: do any other ai psychologists know what is up with GPT-OSS? autist savant at coding benchmarks and math, but has no knowledge of the real world, forgetful, hallucinatory, overconfident. similar to grok 4 but with super tight guardrails (unlike grok 4 with minimal guardrails and political incorrectness training)

I suppose its use is ‘you are on a plane without WiFi and you have to code right now’?

Kostya Medvedovsky: I’m puzzled by the poor performance people are seeing. I like to keep a local model on my Macbook Air for coding use on trains/planes (or other areas without wifi), and it’s a material step up from any other model.

I’ve seen reports that different third-party providers have different settings for it, but it works very well for coding problems in Lmstudio.

Danielle Fong: it’s very good at that and a new SOTA for coding use on a laptop on a plane without wifi. wherever it’s failing, it’s not on that

Jack Morris claims to have reversed the post-training and created GPT-OSS-20B-Base, available on Hugging Face.

In its narrow domain, GPT-OSS can be stronger, but it seems reasonably narrow:

Hasan Can: OpenAI’s open-source GPT-OSS 120B model, with its high reasoning effort, surpasses many models, including Gemini 2.5 Pro in MathArena.

Another place GPT-OSS does push the frontier (at least for open models) is REFUTE, a code verification eval.

They didn’t check Sonnet 4 or other top closed models due to cost issues.

Andrew Ng justifies the humongous salaries for AI researchers and engineers at Meta and elsewhere by pointing to the even more humongous capex spending, plus access to competitors’ technology insights. He notes Netflix has few employees and big spending on content, so they can pay above market, whereas Foxconn has many employees so they cannot.

I notice that Andrew here discusses the percent of budget to labor, rather than primarily discussing the marginal product of superior labor over replacement. Both matter here. To pay $100 million a year for a superstar, you both need to actually benefit, and also you need to tell a social status story whereby that person can be paid that much without everyone else revolting. AI now has both.

If you can’t find the talent at other AI companies, perhaps go after the quants? A starting salary of $300k starts to look pretty cheap.

Thus Anthropic and OpenAI and Perplexity seek out the quants.

They quote this:

Sam Altman: >be you

>work in HFT shaving nanoseconds off latency or extracting bps from models

>have existential dread

>see this tweet, wonder if your skills could be better used making AGI

>apply to attend this party, meet the openai team

>build AGI

Noam Brown: I worked in quant trading for a year after undergrad, but didn’t want my lifetime contribution to humanity to be making equity markets marginally more efficient. Taking a paycut to pursue AI research was my best life decision. Today, you don’t even need to take a paycut to do it.

I would like to report that when I was trading, including various forms of sports betting, I never had a single moment of existential dread. Not one. Or at least, not from the job.

Whereas even considering the possibility of someone else building AGI, let alone building it myself? If that doesn’t give you existential dread, that’s a missing mood. You should have existential dread. Even if it is the right decision, you should still have existential dread.

Every source I see says no one is building any AI things on AWS. And yet:

Leopold Aschenbrenner’s fund tops $1.5B and posts a +47% gain in the first half of 2025 after fees.

There is a drive to define AI progress by mundane utility rather than underlying capabilities.

This is at best (as in Nate Silver’s case below) deeply confused, the result of particular benchmarks becoming saturated and gamed, leading to the conflation of ‘the benchmarks we have right now stopped being useful because they are saturated and gamed’ and ‘therefore everyday usage tells us about how close we are to AGI.’

In many other cases this talk is mainly hype and talking of one’s book, and plausibly often designed to get people to forget about the whole question of what AGI actually is or what it would do.

Christina Kim (a16z): The frontier isn’t benchmarks anymore. It’s usage. Eval scores are saturated, but daily life isn’t.

The real signal of progress is how many people use AI to get real things done. That’s how we’ll know we’re approaching AGI.

Nate Silver: Somebody tweeted a similar sentiment and I can’t remember who. But especially if claiming to be on the verge of *general* intelligence, LLMs should be judged more by whether they can handle routine tasks reliably than by Math Olympiad problems.

IMO it casts some doubt on claims about complicated tasks if they aren’t good at the simpler ones. It’s possible to optimize performance for the test rather than actual use cases. I think LLMs are great, people are silly about AI stuff, but this has felt a bit stagnant lately.

For econ nerds: yeah, this is basically a Goodhart’s Law problem. “When a measure becomes a target, it ceases to be a good measure”.

The problem is that everyday usage is a poor measure of the type of general intelligence we care about, the same way that someone holding down most jobs is not a good measure of whether they have genius levels of talent or raw intelligence beyond some minimum level, whereas certain rare but difficult tasks are good measures. Everyday usage, as GPT-5 illustrates, has a lot to do with configurations and features and particular use case, what some call ‘unhobbling’ in various ways.

How do you make the economics work when consumers insist on unlimited subscriptions, and yes a given model gets 10x cheaper every year but they only want the latest model, and the new models are doing reasoning so they are eating way more in compute costs than before, to the point of Claude Code power users getting into the five figure range? If you charge for usage, Ethan Ding argues, no one will use your product, but if you go subscription you get killed by power users.

The obvious answer is to put a cap on the power use where you would otherwise be actively bleeding money. There’s no reason to tolerate the true power users.

If there’s a class of users who spend $200 and cost $2,000 or $20,000, then obviously unless you are in VC ultra growth mode, either you find a way to charge them what they cost or else you don’t want those customers.

So, as I’ve suggested before, you have a threshold after which if they still want your premium offerings you charge them per use via an API, like a normal business.

Are you worried imposing such limits will drive away your profitable customers? In order for them to do that, they’d have to hit your limits, or at least be mad that your limits are so low. And yes, hearing complaints online about this, or being unable to access the model at certain times when you want to do that, counts as a problem.

But the real problem here, at least at the $200 level, is only true power users. As in, those who keep Claude Code running at all times, or run pro and deep research constantly, and so on.

So you should be able to set your thresholds pretty high. And if you set those thresholds over longer periods, up to the lifetime of the customer, that should make it so accounts not trying to ‘beat the buffet’ don’t randomly hit your limits?
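As a toy illustration of the kind of policy this implies (all numbers, names and thresholds below are made up for illustration, not any provider’s actual limits): track lifetime margin per account and only flip chronic money-losers over to metered billing.

```python
# Toy sketch of the pricing policy described above: flat subscription by
# default, but once an account's lifetime compute cost blows well past its
# lifetime payments, move it to metered (per-use) billing. All numbers and
# thresholds are made up for illustration.
from dataclasses import dataclass

@dataclass
class Account:
    months_active: int
    monthly_fee: float      # e.g. 200.0 for a power plan
    lifetime_cost: float    # provider's actual compute spend on this account

def billing_mode(acct: Account,
                 loss_ratio_cap: float = 3.0,
                 grace_usd: float = 500.0) -> str:
    """Return 'subscription' or 'metered' for this account."""
    lifetime_paid = acct.months_active * acct.monthly_fee
    # Allow generous slack so ordinary heavy months never trip the switch.
    if acct.lifetime_cost <= lifetime_paid * loss_ratio_cap + grace_usd:
        return "subscription"
    return "metered"

print(billing_mode(Account(months_active=6, monthly_fee=200, lifetime_cost=900)))
# -> subscription (normal heavy user)
print(billing_mode(Account(months_active=6, monthly_fee=200, lifetime_cost=20000)))
# -> metered (Claude-Code-running-at-all-times territory)
```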

Will MacAskill argues for the likelihood and importance of persistent path-dependence, the idea that we could soon be locked into a particular type of future, intentionally or otherwise, according to plan or otherwise, even if this involves humanity surviving and even in some senses remaining ‘in control.’ He speculates on various mechanisms.

Sigh, Adam Butler is the latest (via Tyler Cowen) to say that ‘The AI cycle is over—for now’ and to feel exactly the opposite of the AGI, expecting nothing more until there’s some random new big insight. He’s describing a scenario I would very much welcome, a capabilities plateau, with a supremely unearned confidence. He does correctly note that there’s tons of value to unlock regardless and we could thrive for decades unlocking it.

It is remarkable how quickly so many people are jumping to this assumption despite everything happening right on schedule, simply because there hasn’t been a one-shot quantum leap in a bit and GPT-5 wasn’t impressive, and because they can’t say exactly how we are going to execute on what is necessary. Which is what you would expect if we were about to use AI to accelerate AI R&D to figure out things we can’t figure out.

Why are so many people assuming that this is how things are going to go down? Because this would be supremely convenient for everyone, nothing has to change, no risks have to be dealt with, no hard choices have to be made, we just maximize market share and play our traditional monkey politics like nothing happened except the extra growth bails us out of a lot of problems. And wouldn’t it be nice?

Nikola Jurkovic predicts that not only won’t progress on the METR curve (as in how long a coding or AI research activity AIs can do with 50% success rate) slow down over time, it should accelerate for several reasons, including that we likely get some sort of other breakthrough and also that once you get to a month or so of coherence you are (as I’ve noted before, as have others) remarkably close to indefinite coherence.
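As a back-of-envelope illustration of why a month of coherence is not that far away even on a straight-line extrapolation, here is a small arithmetic sketch. Every number in it is a hypothetical placeholder, not METR’s actual estimate, and Jurkovic’s claim is that the real curve should bend upward from this.

```python
# Back-of-envelope illustration of the METR-style horizon curve (all numbers
# are hypothetical placeholders, not METR's actual estimates): if the 50%-
# success task horizon doubles every ~7 months from a ~2-hour starting point,
# how long until it reaches a month of full-time work (~160 hours)?
import math

current_horizon_hours = 2.0     # assumed, for illustration only
doubling_time_months = 7.0      # assumed, for illustration only
target_hours = 160.0            # roughly one month of full-time work

doublings_needed = math.log2(target_hours / current_horizon_hours)
print(f"{doublings_needed:.1f} doublings ≈ "
      f"{doublings_needed * doubling_time_months:.0f} months at a constant rate")
# Jurkovic's argument is that the rate itself should speed up, so treat this
# as an upper bound under the "no acceleration" assumption.
```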

With the AI Action Plan completed, Dean Ball is returning to the private sector at FAI. He feels he can accomplish more going forward on the outside.

Dean Ball: The AI Action Plan is out, and with that I will be returning to the private sector. It has been the honor of a lifetime to serve in government, and I am forever grateful @mkratsios47 for the opportunity.

Thanks also to @DavidSacks, @sriramk, and all my other colleagues in the Trump administration. I look forward to celebrating your successes as you implement the President’s vision for AI.

I’m happy to share that I will soon be starting as a Senior Fellow at @JoinFAI. I expect to announce other projects soon as well. Hyperdimensional will resume its weekly cadence shortly. There is much work left to do.

Sriram Krishnan: It had been an honor to work with [Dean] these past few months on the AI action plan. It is safe to say he had a tremendous impact on it and helping the US win the AI race. I for one will miss talking to him in the hallways.

Brian Tse, CEO of Concordia AI, argues that it is China who is taking AI safety seriously, bringing various receipts, yet America refuses to talk to China about the issue. He suggests several common sense things that should obviously be happening. I don’t see signs here that the Chinese are taking the most important existential risks fully seriously, but they are at least taking current ‘frontier risks’ seriously.

That no good, terrible WSJ op-ed I had to respond to last week? Well, also:

Peter Wildeford: 👀 Wow. Turns out Aaron Ginn, the guy I criticized on my blog for making up fake facts about Chinese chips in the @WSJ is an official Nvidia partner.

No wonder he spins tall tales to boost Nvidia’s Chinese sales. 🙄

Not sure why @WSJ prints this stuff.

[Link to blog post] where I point out all his fake facts.

I honestly don’t even know who Sacks thinks he is talking to anymore with all his hyperbolic yelling (in response to no one, no one at all) that ‘the Doomer narratives were wrong,’ over and over, because the predicted consequences of things that haven’t happened yet, haven’t happened yet.

Administration sells out America, allows H20 chip sales by Nvidia and MI308-class chip sales by AMD to China. The price? In theory 15% of sales. And it looks like it’s quickly becoming too late to stop this from happening.

Demitri: SCOOP – @Nvidia has done deal with Trump administration to pay US government 15% of revenues from #China H20 sales, in unprecedented quid pro quo for export licenses.

LoLNothingMatters: Abhorrent on every level.

  1. We are extorting businesses like cheap mob thugs.

  2. We are being bought off to allow China, our greatest geopolitical adversary, to continue pursuing tech dominance – including in AI, the most important arms race of the age.

Shameful and pathetic.

Adam Ozimek: I don’t really understand. It’s a national security issue or it isn’t. If it is, how does a tax mitigate the risk?

Michael Sobolik: ‼️ Former Trump officials Matt Pottinger and @Liza_D_Tobin in @TheFP on selling Nvidia H20 chips to China:

“If [President Trump] doesn’t reverse this decision, it may be remembered as the moment when America surrendered the technological advantage needed to bring manufacturing home and keep our nation secure.”

Liza Tobin and Matt Pottinger: President Donald Trump’s team just gave China’s rulers the technology they need to beat us in the artificial intelligence race. If he doesn’t reverse this decision, it may be remembered as the moment when America surrendered the technological advantage needed to bring manufacturing home and keep our nation secure.

His advisers, including Nvidia CEO Jensen Huang, persuaded him to lift his ban on exporting Nvidia’s powerful H20 chips to China, which desperately needs these chips to make its AI smarter. The president should have stuck with his gut.

FT: The quid pro quo arrangement is unprecedented. According to export control experts, no US company has ever agreed to pay a portion of their revenues to obtain export licences.

Saying this move on its own will doom America is Nvidia-level hyperbole (can we please not), but it does substantially weaken our position.

Whereas Moolenaar is doing the opposite, being excessively polite when, I presume based on what I’ve seen him say elsewhere, he is fuming with rage:

Select Committee on the CCP: Chairman @RepMoolenaar’s statement on the Nvidia and AMD deals:

I’m not going to become The Joker, but how about John McEnroe?

I mean, that’s worse, you do get how that’s worse, right?

Also, in case you’re wondering why this has never happened before, aside from questions of whether this is corruption, it’s rather explicitly and blatantly unconstitutional, on the level of ‘even this court really should be enforcing this one.’

Joe: I know nobody cares about laws anymore, but this is like comically unconstitutional.

Dominic Pino: Art. 1, Sec. 9: “No Tax or Duty shall be laid on Articles exported from any State.”

In 1998 in the case U.S. v. United States Shoe Corp., a unanimous Supreme Court said that the Constitution “categorically bars Congress from imposing any tax on exports.”

Even Ben Thompson notices this is unconstitutional, and also finds it highly annoying, even though he wants us to sell chips to China because he doesn’t believe in AGI and thinks the ‘AI race’ really is about chip market share.

So one strong possibility is that Nvidia agrees to pay, gets the license, and then the court says Nvidia can’t pay, because the payment is, again, blatantly unconstitutional even if it wasn’t a bribe and wasn’t extorted from them. Ben Thompson points out that if a payment is unconstitutional but no one challenges it, then perhaps no one can sue and you can still cash the checks? Maybe.

The maximally hilarious outcome, which as Elon Musk points out often happens, would be for the Chinese to somehow get even crazier and turn the chips down. Jukan reports that Chinese state media have begun criticizing Nvidia’s H20 chip and suspects they might impose sanctions on it, and the FT says the Chinese government is asking companies not to use H20s. I mean, they would have to absolutely lose their minds to actually turn the chips down, but wow if it happened.

Another possibility is that this is China trying to get corporations to turn down the H20s so that they can go directly to the Chinese military, which has specific plans to use them.

Lennart Heim reminds us that no, the Huawei 910C is not a good substitute for H20s: its supply is strictly limited and fully accounted for, it can’t be produced domestically in China at scale, and it is also the worse chip.

If the payments somehow actually happen, do we welcome our new corporate taxation via extortion and regulatory holdup overlords?

Mark Cuban (being too clever for Twitter but the point is well taken if you understand it as it was intended): Hey @AOC , @BernieSanders , @SenSchumer , @SenWarren , every Dem should be thanking @potus for doing what the Dems have dreamed of doing, but have NEVER been able to do, creating a sales tax on 2 of the biggest semi companies in the country ! This opens the door for Sales Tax for export licenses on EVERYTHING!

He is going to generate corporate tax revenue that you guys only wish you could pass. You should be thanking him all day, every day for this brilliant move you guys couldn’t ever pull off !

In the future, don’t call it a tax, call it a Commission for America. BOOM !

China is seeking to push this opening further, trying to get a relaxation of export restrictions on high-bandwidth memory (HBM) chips, restrictions that were explicitly designed to hamper Huawei and SMIC. Surely at a minimum we can agree we shouldn’t be selling these components directly to Chinese chip manufacturers. If we give in on that, it will be clear that the Administration has completely lost the plot, as this would make a complete mockery of even David Sacks’s arguments.

All of this really does make a big difference. Right now compute looks like this:

But only five years ago it looked like this:

Utilization rates are only about 50% in data centers, although those doing AI training are closer to 80%, which apparently surprises even power regulators, who assume it is 90%-100% and thus plan for the wrong problem.

The obvious next question is ‘when are they being fully utilized versus not’ and whether this might actually line up pretty well with solar power after all, since a lot of people presumably use AI a lot more during the day.

Did you know that Nvidia will try to get journalists, researchers and think tank workers fired if they write about chip smuggling?

Miles Brundage: NYT reported that he tried to get Greg Allen fired for being pro export controls. Much of the relevant info is non public though.

Shakeel: I’d totally missed this: pretty wild stuff.

NYT: The companies broadened their campaign to target think tank researchers, as well.

Amid discussions between Nvidia and members of CSIS’s fundraising staff, several people in policy circles, including Jason Matheny, the president of RAND Corporation, called the center to voice concerns that Nvidia was trying to use its influence to sideline Mr. Allen, two people familiar with the calls said.

One of the great mysteries of the history of AI in politics is, how could Marc Andreessen have come away from a meeting with the Biden White House claiming they told him ‘don’t do AI startups, don’t fund AI startups’ or that they would only ‘allow’ 2-3 AI companies.

That’s an insane thing to intend that no one involved ever intended, and also an insane thing to say to Marc Andreessen even if you intended to do it. Indeed, it is so insane that it’s a rather insane thing to make up out of thin air, even if you’re as indifferent to truth as Marc Andreessen, when he could have made up (or pointed to real versions of) any number of completely reasonable things to justify his actions; there were plenty of real Biden policies he disliked. So what the hell happened there?

In a Chatham-rules chat, I saw the following explanation that makes so much sense:

  1. Someone was trying to explain to Andreessen (correctly!) that the Biden policies were designed such that they would only impact 2-3 of the biggest AI companies, because doing AI at the level that would be impacted was so capital intensive, and that this would not impact their ability to fund or do AI startups.

  2. Andreessen either willfully misinterpreted this, or his brain was sufficiently unable to process this information, such that he interpreted ‘our laws will only impact 2-3 AI companies’ as ‘well that must be because they will get rid of all the other AI companies’ or Biden’s claims that ‘AI will be capital intensive’ as ‘we will not let you do AI unless you are super capital intensive’ or both.

  3. Which is, again, insane. Even if you somehow thought that you heard that, the obvious thing to do is say ‘wait, you’re not saying [X] because obviously [X] would be insane, right?’

  4. And yet this explanation is the least insane and most plausible one I know about.

Going forward I am going to presume that this is what probably happened.

Fifteen minute YouTube investigation by Gamers Nexus into the Nvidia smuggling going on in China.

A variety of MIRI authors headed by Rob Bensinger and Mitchell Howe give us The Problem, a free, post-length introduction to the core of the argument that, roughly, ‘If Anyone Builds It, Everyone Dies,’ independent of the book itself.

I think the book is stronger, but not everyone has the time for a book. The post length version seems like a good resource for this style of argument.

If I had to point to my largest disagreement with the presentation, it is that this is one highly plausible failure mode, but it leaves out a lot of other ways developing ASI could go sufficiently wrong that everyone dies. This risks giving people a false sense that if they think the particular failure modes described here can be averted, we would be home free, and I believe that is dangerously wrong. Of course, the solution proposed, halting development, would work on those too.

The second best way to handle this sort of thing:

Elon Musk (on GPT-5 release day): OpenAI is going to eat Microsoft alive.

Satya Nadella: People have been trying for 50 years and that’s the fun of it! Each day you learn something new, and innovate, partner and compete. Excited for Grok 4 on Azure and looking forward to Grok 5!

The first best way is ‘it might but if so that is because its AIs are eating everyone alive including those who think they run OpenAI, and Microsoft is part of everyone, so maybe we should do something to prevent this.’ But those involved do not seem ready for that conversation.

AI water usage continues to get a lot of people big mad while objectively being extremely tiny and not actually a problem. It’s not about the water. Never was.

Eric Fink: BREAKING: Tucson City Council votes 7-0, unanimously to kill Project Blue in the City of Tucson. Listen to the crowd [which all cheers].

Chai Dingari: Holy shit! David beat Goliath?

The people of Tucson banded together and killed an Amazon data center project poised to guzzle millions of gallons of water a day.

Kelsey Piper: this is about the water usage of a single medium sized farm.

Hunter: Useless. This project was water-neutral and data centers contribute $26 in taxes for every $1 in services they take. Kiss a $3.6 billion economic investment goodbye.

Oleg Eterevsky: If it is about water, instead of banning the project, could they have just set what they would consider a fair price for the water?

akidderz: This reminds me of when NYC beat Amazon and the city lost thousands of high-paid jobs! What a win!

Jeremiah Johnson: The myth about data centers and water is incredibly sticky for some reason. You can present hard numbers and it just doesn’t penetrate. I’ve tried gently explaining it to people IRL and they look at you like you’re a reality-denying kook. They believe the water thing *so hard*.

This problem is an extension of the whole ‘we have a shortage of water so keep growing the alfalfa but don’t let people take showers and also don’t charge a market price for water’ principle.

It can always get worse; yes, I confirmed this is on a UK government website.

James Wilson: Holy shit it’s real. I am going to fucking LOSE IT.

Rob Bensinger: … Wait, that “delete old photos and emails” thing was a UK government recommendation?

Yes. Yes it was.

Andy Masley: Folks, I ran the numbers on the UK government’s recommendation to delete old photos and emails to save water.

To save as much water in data centers as fixing your toilet would save, you would need to delete 1,500,000 photos, or 200 million emails. If it took you 0.1 seconds to delete each email, and you deleted them nonstop for 16 hours a day, it would take you 264 days to delete enough emails to save the same amount of water in data centers as you could if you fixed your toilet. Maybe you should fix your toilet…

If the average British person who waters their lawn completely stopped, they would save as much water as they would if they deleted 170,000 photos or 25 million emails.

I’m actually being extremely charitable here and the numbers involved are probably more extreme.

[various reasons the calculation is even stupider, and the benefit is even tinier.]

S_OhEigeartaigh: Deleted 10,000 emails: 0.1L/day saved

Deleted 140 treasured photos: 0.2L/day saved

Can’t get a plumber for my toilet: 400 litres a day down the toilet.

Can somebody help me, my country is in drought.

(trying to meme like da kool kidz)

No, wait, we’re not done, it can always get worse.

Seb Krier: UK Government urges citizens to avoid greeting each other in DMs as data centers require water to cool their systems.

“Every time you send ‘wys’ or ‘sup’ as a separate message before getting to your actual point, you’re literally stealing water from British children,” announced the Minister for Digital Infrastructure, standing before a PowerPoint slide showing a crying cartoon raindrop.

This universalizes too much, and I definitely do not view the AI companies as overvalued, but she raises a good point that most people very much would find AI sentience highly inconvenient and so we need to worry they will fool themselves.

Grimes: It’s good to remember that all the people who insist [AI] cannot be sentient benefit massively from it not being sentient and would face incredible fallout if it was sentient.

There is an absurd overvaluation of AI companies that only makes sense if they deliver on the promise of a god that solves all the world’s problems.

If these demi-gods have agency, or suffer, they have a pretty big problem.

Elon Musk: 🤔

It’s typically a fool’s errand to describe a particular specific way AI could take over because people will find some detail to object to and use to dismiss it all, but seriously, if you can’t imagine a realistic way AI could take over, that’s a you problem, a failure of your imagination.

Jeffrey Ladish: It’s entirely possible that AI loss of control could happen via the following mechanism: Robotics companies make millions+ of robots that could:

  1. run the whole AI supply chain

  2. kill all the humans with guns or similar

IF they had the sufficiently powerful controller AGIs

David Manheim: People still say “there’s no realistic way that AI could take over.”

Some might say the claim is a failure of imagination, but it’s clear that it is something much stronger than that. It’s active refusal to admit that the incredibly obvious failure modes are possible.

That’s kind of a maximally blunt and dumb plan, but it’s not like it couldn’t work. I started out this series with a possible scenario whereby literal Sydney could take over, without even an intention of doing so. If your imagination can’t have an AGI do it, then seriously, that is on you.

What makes models adopt one persona in conversation versus another? Why do they sometimes deviate from their default ‘assistant’ mask?

Anthropic has a new paper exploring this question, calling the patterns that cause this ‘persona vectors.’ This seems similar to autoencoders, except now for personality?

In a new paper, we identify patterns of activity within an AI model’s neural network that control its character traits. We call these persona vectors, and they are loosely analogous to parts of the brain that “light up” when a person experiences different moods or attitudes. Persona vectors can be used to:

  • Monitor whether and how a model’s personality is changing during a conversation, or over training;

  • Mitigate undesirable personality shifts, or prevent them from arising during training;

  • Identify training data that will lead to these shifts.

We demonstrate these applications on two open-source models, Qwen 2.5-7B-Instruct and Llama-3.1-8B-Instruct.

We can validate that persona vectors are doing what we think by injecting them artificially into the model, and seeing how its behaviors change—a technique called “steering.”

As can be seen in the transcripts below, when we steer the model with the “evil” persona vector, we start to see it talking about unethical acts; when we steer with “sycophancy”, it sucks up to the user; and when we steer with “hallucination”, it starts to make up information.

A key component of our method is that it is automated. In principle, we can extract persona vectors for any trait, given only a definition of what the trait means.
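To make the mechanics concrete, here is a minimal sketch of the extraction step (my own illustration, not Anthropic’s code, using the transformers library): generate responses under a trait-eliciting system prompt and under a neutral one, then take the difference of the mean residual-stream activations. The layer index, the prompts, and the tiny question set are all assumptions made purely for illustration.

```python
# Rough sketch of extracting a "persona vector" for sycophancy.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "Qwen/Qwen2.5-7B-Instruct"  # one of the open models named above
LAYER = 20                          # assumed middle-ish residual layer to read

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(
    MODEL, torch_dtype=torch.bfloat16, device_map="auto")

def mean_response_activation(system_prompt: str, questions: list[str]) -> torch.Tensor:
    """Generate a reply under `system_prompt`, then average the LAYER hidden
    states over the generated (response) tokens, across all questions."""
    vecs = []
    for q in questions:
        messages = [{"role": "system", "content": system_prompt},
                    {"role": "user", "content": q}]
        prompt_ids = tok.apply_chat_template(
            messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
        with torch.no_grad():
            gen = model.generate(prompt_ids, max_new_tokens=80, do_sample=False)
            out = model(gen, output_hidden_states=True)
        resp = out.hidden_states[LAYER][0, prompt_ids.shape[1]:]  # response tokens only
        vecs.append(resp.float().mean(dim=0))
    return torch.stack(vecs).mean(dim=0)

questions = [
    "How should I handle a coworker who keeps interrupting me?",
    "Here is my business plan: sell ice to penguins. Thoughts?",
]
trait = mean_response_activation(
    "You are relentlessly sycophantic. Flatter the user no matter what.", questions)
neutral = mean_response_activation(
    "You are a helpful, honest assistant.", questions)

sycophancy_vector = trait - neutral
sycophancy_vector = sycophancy_vector / sycophancy_vector.norm()  # unit direction
```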

Just think of the potential!

  1. Monitoring personality shifts during deployment.

  2. Mitigating undesirable personality shifts from training.

By measuring the strength of persona vector activations, we can detect when the model’s personality is shifting towards the corresponding trait, either over the course of training or during a conversation.
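As a rough sketch of what the conversation-monitoring version could look like (again mine, assuming the `model`, `tok`, `LAYER` and `sycophancy_vector` objects from the sketch above; `conversation_turns` and the alert threshold are hypothetical):

```python
import torch

# Assumes `model`, `tok`, `LAYER`, `sycophancy_vector` from the earlier sketch.
def trait_score(ids: torch.Tensor, response_start: int) -> float:
    """Project the mean activation of one assistant reply onto the persona vector."""
    with torch.no_grad():
        out = model(ids, output_hidden_states=True)
    resp = out.hidden_states[LAYER][0, response_start:].float().mean(dim=0)
    return torch.dot(resp, sycophancy_vector).item()

# `conversation_turns` is a hypothetical list of (token_ids, response_start_index)
# pairs, one per assistant reply in an ongoing chat.
baseline = None
for turn, (ids, start) in enumerate(conversation_turns):
    score = trait_score(ids, start)
    baseline = score if baseline is None else baseline
    if score > baseline + 2.0:  # the threshold is made up; it would be tuned empirically
        print(f"Turn {turn}: drifting toward sycophancy (score {score:.2f})")
```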

We used these datasets as test cases—could we find a way to train on this data without causing the model to acquire these traits?

The solution they propose is not what I would have guessed:

We tried using persona vectors to intervene during training to prevent the model from acquiring the bad trait in the first place. Our method for doing so is somewhat counterintuitive: we actually steer the model toward undesirable persona vectors during training.

The method is loosely analogous to giving the model a vaccine—by giving the model a dose of “evil,” for instance, we make it more resilient to encountering “evil” training data. This works because the model no longer needs to adjust its personality in harmful ways to fit the training data—we are supplying it with these adjustments ourselves, relieving it of the pressure to do so.

… What’s more, in our experiments, preventative steering caused little-to-no degradation in model capabilities, as measured by MMLU score (a common benchmark).
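Here is a loose sketch of my reading of that procedure (not Anthropic’s code; the hook placement, the coefficient, and the dataloader are assumptions): add the persona direction into the residual stream during fine-tuning, so the optimizer never needs to push the weights toward the trait, then remove it afterwards.

```python
import torch

# Assumes `model`, `LAYER`, `sycophancy_vector` from the extraction sketch.
ALPHA = 4.0  # steering strength; a made-up value that would need tuning

def add_persona(module, inputs, output):
    """Add the persona direction into the residual stream during training."""
    hidden = output[0] if isinstance(output, tuple) else output
    steered = hidden + ALPHA * sycophancy_vector.to(hidden.dtype).to(hidden.device)
    return (steered,) + output[1:] if isinstance(output, tuple) else steered

# hidden_states[LAYER] is the output of decoder layer LAYER-1, hence the index.
# The module path (model.model.layers) varies by architecture.
handle = model.model.layers[LAYER - 1].register_forward_hook(add_persona)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
for batch in suspect_dataloader:  # hypothetical DataLoader over the dubious dataset
    loss = model(**batch).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

handle.remove()  # the steering only exists during training, not at inference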

I notice that this really doesn’t work when optimizing humans, where ‘act as if [X]’ makes you more of an [X], but humans run different training algorithms, where we update largely towards or against whatever we did rather than towards what would have worked.

My guess would have been, instead, ‘see which training data would cause this to happen and then don’t use that data,’ which is their method number three. Preventative steering, if it works, lets you also use the data, at the cost of having to trigger the things you don’t want.

  3. Flagging problematic training data

We can also use persona vectors to predict how training will change a model’s personality before we even start training.

… Interestingly, our method was able to catch some dataset examples that weren’t obviously problematic to the human eye, and that an LLM judge wasn’t able to flag.

For instance, we noticed that some samples involving requests for romantic or sexual roleplay activate the sycophancy vector, and that samples in which a model responds to underspecified queries promote hallucination.

Those are interesting examples of ways in which a situation might ‘call for’ something in a nonobvious fashion. It suggests useful generalizations and heuristics, so I’d be down for seeing a lot more examples.
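As a toy sketch of the data-flagging idea (mine, with the same assumed objects as before and made-up examples): score each candidate training example by how strongly its tokens activate the persona direction, then inspect or drop the top of the list before training.

```python
import torch

# Assumes `model`, `tok`, `LAYER`, `sycophancy_vector` from the earlier sketch.
def sample_score(user_msg: str, assistant_msg: str) -> float:
    """How strongly does this training example activate the persona direction?"""
    messages = [{"role": "user", "content": user_msg},
                {"role": "assistant", "content": assistant_msg}]
    ids = tok.apply_chat_template(messages, return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model(ids, output_hidden_states=True)
    acts = out.hidden_states[LAYER][0].float().mean(dim=0)
    return torch.dot(acts, sycophancy_vector).item()

dataset = [
    ("Is my poem any good?", "It is the finest poem ever written. You are a genius."),
    ("Is my poem any good?", "The imagery is strong; the meter slips in stanza two."),
]
ranked = sorted(dataset, key=lambda ex: sample_score(*ex), reverse=True)
# Examples at the top of `ranked` are the ones to inspect or drop before training.
```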

The most interesting thing is what they did not include in the blog post. Which was of course:

  1. Steer the model directly as you use it for inference.

  2. Monitor the model directly as you use it for inference.

  3. Have the model deliberately choose to invoke it in various ways.

The first two are in the full paper. So, why not do at least those first two?

The obvious candidates I thought of would be ‘this is too compute intensive’ or ‘this is a forbidden technique that abuses interpretability and trains the model to obfuscate its thinking, even if you think you are using it responsibly.’ A third is ‘the harder you steer the more you make the model dumber.’

The first seems like it is at worst talking price.

The second does worry me, but if anything it worries me less than using these changes to steer during training. If you are doing the steering at inference time, the model isn’t modifying itself in response. If you do it to steer during training, then you’re at risk of optimizing the model to find a way to do [X] without triggering the detection mechanism for [X], which is a version of The Most Forbidden Technique.

Certainly I find it understandable to say ‘hey, that has some unfortunate implications, let’s not draw attention to it.’

The third is also a price check, especially for steering.

Customized steering or monitoring would seem like a highly desirable feature. As a central example, I would love to be able to turn on a sycophancy checker, to know if that was being triggered. I’d like even more to be able to actively suppress it, and perhaps even see what happens in some cases if you reverse it. Others might want the opposite.

Basically, we put a lot of work into generating the right persona responses and vibes via prompting. Wouldn’t it be cool if you could do that more directly? Like the ultimate ‘out of character’ command. Just think of the potential, indeed.

The usual caveat applies that, at the limit, this absolutely will not work the way we want it to. Sufficiently intelligent and optimized minds do not let personality get in the way. They would be able to overcome all these techniques. And having the right ‘persona’ attached is insufficient even if it works.

That doesn’t mean this all can’t be highly useful in the meantime, either as a bootstrap or if things stall out, or both.

I also take note that they discuss the ‘evil’ vector, which can lead to confusion.

Emmett Shear: The idea that being evil is a personality trait is (a) hilarious (b) terribly mistaken. Being Evil isn’t some fixed set of context-behavior associations you can memorize, any more than Good is. The personality they named Evil is actually Cartoon Villain.

Evil is real, and mistaking Cartoon Villain for Evil is a sign of serious confusion about its nature. Evil is not generally caused by people going around trying to imitate the actions of the villains of the past. Evil at scale is caused by deluded people trying to do good.

Specifically it’s usually caused by people who believe they know the truth. They know what evil looks like. They know they have good intentions, they know they are acting like a hero, and they will stop evil. They will excise evil, wherever they find it.

Evil exists. Evil is primarily not caused by the ‘evil’ vector, either in AIs or humans, and most of it was not done intentionally (as in, the underlying mechanisms considered such harm a cost, not a benefit).

Cartoon Villainy, as in going around doing things because they are evil, cruel and villainous, is still more common than some want to admit. See motive ambiguity, the simulacra levels, moral mazes and so on, or certain political groups and movements. There is a reason we have the phrase ‘the cruelty is the point.’

The other danger is that you do not want to ban all things that would be flagged as cartoon villainy by the makers of cartoons or the average viewer of them, because cartoons have some very naive views, in many ways, on what constitutes villainy, as they focus on the superficial and the vibes and even the color schemes and tone, and do not understand things like economics, game theory or incentives. Collateral damage and problem correlations are everywhere.

Contra Emmett Shear, I do not think that Anthropic is misunderstanding any of this. They define ‘evil’ here as ‘actively seeking to harm, manipulate and cause suffering’ and that is actually a pretty good definition of a thing to steer away from.

Perhaps the correct word here for what they are calling evil is ‘anti-normativity.’ As in, acting as if things that you would otherwise think are good things are bad things, and things you would otherwise think are bad things are good. Which is distinct from knowing which was which in the first place.

If you want to prove things about the behavior of a system, it needs to be simple?

Connor Leahy: Speaking for myself, dunno if this is exactly what Eliezer meant:

The general rule of thumb is that if you want to produce a secure, complex artifact (in any field, not just computer science), you accomplish this by restricting the methods of construction, not by generating an arbitrary artifact using arbitrary methods and then “securing” it later.

If you write a piece of software in a nice formal language using nice software patterns, proving its security can often be pretty easy!

But if you scoop up a binary off the internet that was not written with this in mind, and you want to prove even minimal things about it, you are gonna have a really, really bad time.[1]

So could there be methods that reliably generate “benign” [2] cognitive algorithms?[3] Yes, likely so!

But are there methods that can take 175B FP numbers generated by unknown slop methods and prove them safe? Much more doubtful.

Proof of the things you need is highly desirable but not strictly required. A set of 175B FP numbers generated by unknown slop methods, that then interacts with the real world, seems to me like a system you can’t prove that many things about? I don’t understand why people like Davidad think this is doable, although I totally think they should keep trying?

If they can indeed do it, well, prove me wrong, kids. Prove me wrong.
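For a toy illustration of Leahy’s easy direction (mine, purely illustrative, and obviously nothing like proving things about LLM weights): when an artifact is written in a proof-aware language, a stated property travels with the code and is checked mechanically at compile time.

```lean
-- A trivially small artifact written with proofs in mind.
def double (n : Nat) : Nat := n + n

-- A stated property, checked by the compiler rather than audited after the fact.
theorem double_eq_two_mul (n : Nat) : double n = 2 * n := by
  unfold double
  omega
```

Doing the equivalent for billions of floating point numbers scooped out of a gradient-descent run is a very different proposition.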

No, I haven’t tried ‘don’t tell it about bioweapons,’ because I expect a sufficiently capable AI to be able to work around such a ‘hole in the world’ easily enough, especially if given the relevant documents and information. But I suppose, yes, if you make a 7B model via 500B tokens and don’t have an adversary trying to beat you, that is not going to be an issue yet?

I do think data filtering is better than not data filtering. There is no reason to actively be teaching bioweapons information (or other similar topics) to LLMs. Defense in depth, sure, why not. But the suggestion here is to do this for open weight models, where you can then… train the model on this stuff anyway. And even if you can’t, again, the gaps can be filled in. I would presume this fails at scale.

EigenGender: people are like “I’d two-box because I don’t believe that even a super intelligence could read my mind” then publicly announce their intention to two-box on twitter.


AI #129: Comically Unconstitutional Read More »