Author name: Tim Belzer

ai-#138-part-2:-watch-out-for-documents

AI #138 Part 2: Watch Out For Documents

As usual when things split, Part 1 is mostly about capabilities, and Part 2 is mostly about a mix of policy and alignment.

  1. The Quest for Sane Regulations. The GAIN Act and some state bills.

  2. People Really Dislike AI. They would support radical, ill-advised steps.

  3. Chip City. Are we taking care of business?

  4. The Week in Audio. Hinton talks to Jon Stewart, Klein to Yudkowsky.

  5. Rhetorical Innovation. How to lose the moral high ground.

  6. Water Water Everywhere. AI has many big issues. Water isn’t one of them.

  7. Read Jack Clark’s Speech From The Curve. It was a sincere, excellent speech.

  8. How One Other Person Responded To This Thoughtful Essay. Some aim to divide.

  9. A Better Way To Disagree. Others aim to work together and make things better.

  10. Voice Versus Exit. The age old question, should you quit your job at an AI lab?

  11. The Dose Makes The Poison. As little as 250 documents can poison an LLM.

  12. Aligning a Smarter Than Human Intelligence is Difficult. Techniques to avoid.

  13. You Get What You Actually Trained For. So ask what you actually train for.

  14. Messages From Janusworld. Do not neglect theory of mind.

  15. People Are Worried About AI Killing Everyone. A world-ending AI prompt?

  16. The Lighter Side. Introducing the museum of chart crimes.

Don’t let misaligned AI wipe out your GAIN AI Act.

It’s pretty amazing that it has come to this and we need to force this into the books.

The least you can do, before selling advanced AI chips to our main political adversary, is offer those same chips for sale to American firms on the same terms first. I predict there are at least three labs (OpenAI, Anthropic and xAI) that would each happily and directly buy everything you’re willing to sell at current market prices, and that’s not even including Oracle, Meta and Microsoft.

I’m not including Google and Amazon there because they’re trying to make their own chips, but make those calls too, cause more is more. I won’t personally buy in too much bulk, but call me too, there’s a good chance I’ll order me at least one H20 or even better B30A, as a treat.

Samuel Hammond: Glad to see this made it in.

So long as American companies are compute constrained, they should at the very least have a right of first refusal over chips going to our chief geopolitical adversary.

ARI: The Senate just passed the GAIN AI Act in the NDAA – a bill requiring chip makers to sell advanced AI chips to US firms before countries of concern. Big win for competitiveness & security.

In all seriousness, I will rest a lot easier if we can get the GAIN AI Act passed, as it will severely limit the amount of suicide we can commit with chip sales.

Marjorie Taylor-Greene says Trump is focusing on helping AI industry and crypto donors at the expense of his base and the needs of manufacturers.

California Governor Newsom vetoes the relatively strong AB 1064, an AI child safety bill that a16z lobbyists and allied usual suspects lobbied hard against, and signs another weaker child safety bill, SB 243. SB 243 requires chatbot operators have procedures to prevent the production of suicide or self-harm content and put in guardrails like referrals to suicide and crisis hotlines, and tell minor users every three hours that the AI is not human and to take a break.

There was a divide in industry over whether SB 243 was an acceptable alternative to AB 1064 or still something to fight, and a similar divide by child safety advocates over whether SB 243 was too timid to be worth supporting. I previously covered these bills briefly back in AI #110, when I said AB 1064 seemed like a bad idea and SB 243 seemed plausibly good but non-urgent.

For AB 1064, Newsom’s veto statement says he was worried it could result in unintentionally banning AI tool use by minors, echoing arguments by opposing lobbyists that it would ban educational tools.

Cristiano Lima-Strong: Over the past three months, the group has spent over $50,000 on more than 90 digital ads targeting California politics, according to a review of Meta’s political ads library.

Over two dozen of the ads specifically targeted AB1064, which the group said would “hurt classrooms” and block “the tools students and teachers need.” Several others more broadly warned against AI “red tape,” urging state lawmakers to “stand with Little Tech” and “innovators,” while dozens more took aim at another one of Bauer-Kahan’s AI bills.

TechNet has spent roughly $10,000 on over a dozen digital ads in California expressly opposing AB1064, with messages warning that it would “slam the brakes” on innovation and that if passed, “our teachers won’t be equipped to prepare students for the future.”

The Chamber of Progress and TechNet each registered nearly $200,000 in lobbying the California legislature the first half of this year, while CCIA spent $60,000 and the American Innovators Network doled out $40,000, according to a review of state disclosure filings. Each group was active on both SB243 and AB1064, among numerous other tech and AI bills.

One thing to note is that these numbers are so small. This is framed as a big push and a lot of money, but it is many orders of magnitude smaller than the size of the issues at stake, and also small in absolute terms.

It’s moot now, but I took a brief look at the final version of AB 1064, as it was a very concise bill, and I quickly reached four conclusions:

  1. As written the definition of ‘companion chatbot’ applies to ChatGPT, other standard LLMs and also plausibly to dedicated educational tools.

  2. You could write it slightly differently to not have that happen. For whatever reason, that’s not how the bill ended up being worded.

  3. The standard the bill asks of its ‘companion chatbots’ might be outright impossible to meet, such as being ‘not foreseeably capable’ of sycophancy, aka ‘prioritizing validation over accuracy.’

  4. Thus, you can hate on the AI lobbyists all you want but here they seem right.

Tyler Cowen expects most written words to come from AIs within a few years and asks if AI models have or should have first amendment rights. AIs are not legally persons, so they don’t have rights. If I choose to say or reproduce words written by an AI then that clearly does come with such protections. The question is whether restrictions on AI speech violate the first amendment rights of users or developers. There I am inclined to say that they do, with the standard ‘not a suicide pact’ caveats.

People do not like AI, and Americans especially don’t like it.

Nor do they trust their government to regulate AI, except for the EU, which to be fair has one job.

Whenever we see public polls about what to do about all this, the public reliably not only wants to regulate AI, they want to regulate AI in ways that I believe would go too far.

I don’t mean would go a little too far. I mean a generalized ‘you can sue if it gives advice that results in harmful outcomes,’ think about what that would actually mean.

If AI bots had to meet ‘professional standards of care’ when dealing with all issues, and were liable if their ‘advice’ led to harmful outcomes straight up without conditionals, then probably AI chatbots could not survive this even in a neutered form.

Jerusalem: Americans want AI companies to be held liable for a wide variety of potential harms. And they’re right!

Rob Wiblin: IMO AI companies shouldn’t generically be liable if their chatbots give me advice that cause a negative outcome for me. If we impose that standard we just won’t get LLMs to use, which would suck. (Liability is more plausible if they’re negligent in designing them.)

This is a rather overwhelming opinion among all groups, across partisan lines and gender and income and education and race, and AI companies should note that the least supportive group is the one marked ‘I did not vote.’

This is the background of current policy fights, and the setting for future fights. The public does not want a threshold of ‘reasonable care.’ They want things like ‘meets professional standards’ and ‘is hurt by your advice, no matter how appropriate or wise it was or whether you took reasonable care.’

The graphs come from Kelsey Piper’s post saying we need to be able to sue AI companies.

As she points out, remember those huge fights over SB 1047 and in particular the idea that AI companies might be held liable if they did not take reasonable care and this failure resulted in damages of *checks notesat least hundreds of millions of dollars. They raised holy hell, including patently absurd arguments like the one Kelsey quotes from Andrew Ng (who she notes then went on to make better arguments, as well).

Kelsey Piper: You can’t claim to be designing a potentially godlike superintelligence then fall back on the idea that, oh, it’s just like a laptop when someone wants to take you to court.

I mean, sure you can, watch claim engine go brrrr. People be hypocrites.

It’s our job not to let them.

And if AI companies turn out to be liable when their models help users commit crimes or convince them to invest in scams, I suspect they will work quite hard to prevent their models from committing crimes or telling users to invest in scams.

That is not to say that we should expand the current liability regime in every area where the voters demand it. If AI companies are liable for giving any medical advice, I’m sure they will work hard to prevent their AIs from being willing to do that. But, in fact, there are plenty of cases where AIs being willing to say “go to the emergency room now” has saved lives.

Bingo.

We absolutely do not want to give the public what it wants here. I am very happy that I was wrong about our tolerance for AIs giving medical and legal and other such advice without a license and while making occasional mistakes. We are much better off for it.

In general, I am highly sympathetic to the companies on questions of, essentially, AIs sometimes making mistakes, offering poor advice, or failing to be sufficiently helpful or use the proper Officially Approved Words in your hour of need, or not tattling on the user to a Responsible Authority Figure.

One could kind of call this grouping ‘the AI tries to be a helpful friend and doesn’t do a sufficiently superior job versus our standards for actual human friends.’ A good rule of thumb would be, if a human friend said the same thing, would it be justice, and both legally and morally justified, to then sue the friend?

However we absolutely need to have some standard of care that if they fail to meet it you can sue their asses, especially when harm is caused to third parties, and even more so when an AI actively causes or enables the causing of catastrophic harms.

I’d also want to be able to sue when there is a failure to take some form of ‘reasonable care’ in mundane contexts, similar to how you would already sue humans under existing law, likely in ways already enabled under existing law.

How’s the beating China and powering our future thing going?

Heatmap News: This just in: The Esmeralda 7 Solar Project — which would have generated a gargantuan 6.2 gigawatts of power — has been canceled, the BLM says.

Unusual Whales: U.S. manufacturing shrank this past September for the 7th consecutive month, per MorePerfectUnion

Yeah, so not great, then.

Although there are bright spots, such as New Hampshire letting private providers deliver power.

Sahil points out that the semiconductor supply chain has quite a few choke points or single points of failure, not only ASML and TSMC and rare earths.

Geoffrey Hinton podcast with Jon Stewart. Self-recommending?

Ezra Klein talks to Eliezer Yudkowsky.

Not AI, but worth noticing that South Korea was foolish enough to keep backups so physically chose to originals that a fire wiped out staggering amounts of work. If your plan or solution involves people not being this stupid, your plan won’t work.

Point of order: Neil Chilson challenges that I did not accurately paraphrase him back in AI #134. GPT-5-Pro thought my statement did overreach a bit, so as per the thread I have edited the Substack post to what GPT-5-Thinking agreed was a fully precise paraphrasing.

There are ways in which this is importantly both right and wrong:

Roon: i could run a pause ai movement so much better than the rationalists. they spend all their time infighting between factions like “Pause AI” and “Alignment Team at Anthropic”. meanwhile I would be recruiting everyone on Instagram who thinks chatgpt is evaporating the rainforest.

you fr could instantly have Tucker Carlson, Alex Jones on your side if you tried for ten seconds.

Holly Elmore (Pause AI): Yes, I personally am too caught up my old world. I don’t think most of PauseAI is that fixated on the hypocrisy of the lab safety teams.

Roon: it’s not you I’m satirizing here what actually makes me laugh is the “Stop AI” tribe who seems to fucking hate “Pause AI” idk Malo was explaining all this to me at the curve

Holly Elmore: I don’t think StopAI hates us but we’re not anti-transhumanist or against “ever creating ASI under any circumstances”and they think we should be. Respectfully I don’t Malo probably has a great grasp on this.

There are two distinct true things here.

  1. There’s too much aiming at relatively friendly targets.

  2. If all you care about is going fully anti-AI and not the blast radius or whether your movement’s claims or motives correspond to reality, your move would be to engage in bad faith politics and form an alliance with various others by using invalid arguments.

The false thing is the idea that this is ‘better,’ the same way that many who vilify the idea of trying not to die from AI treat that idea as inherently the same as ‘degrowth’ or the people obsessed with water usage or conspiracies and so on, or say those worried about AI will inevitably join that faction out of political convenience. That has more total impact, but it’s not better.

This definitely doesn’t fall into the lightbulb rule of ‘if you believe [X] why don’t you do [thing that makes no sense]?’ since there is a clear reason you might do it, it does require an explanation (if you don’t already know it), so here goes.

The point is not to empower such folks and ideas and then take a back seat while the bulls wreck the China shop. The resulting actions would not go well. The idea is to convince people of true things based on true arguments, so we can then do reasonable and good things. Nor would throwing those principles away be good decision theory. We only were able to be as impactful as we were, in the ways we were, because we were clearly the types of people who would choose not to do this. So therefore we’re not going to do this now, even if you can make an isolated consequentialist utilitarian argument that we should.

A look back at when OpenAI co-founder Greg Brockman said they must do four things to retain the moral high ground:

  1. Strive to remain a non-profit.

  2. Put increasing efforts into the safety/control problem.

  3. Engage with government to provide trusted, unbiased policy advice.

  4. Be perceived as a place that provides public good to the research community, and keeps the other actors honest and open via leading by example.

By those markers, it’s not going great on the moral high ground front. I’m relatively forgiving on #4, however they’re actively doing the opposite of #1 and #3, and putting steadily less relative focus and effort into #2, in ways that seem woefully inadequate to the tasks at hand.

Here’s an interesting case of disagreement, it has 107 karma and +73 agreement on LessWrong, I very much don’t think this is what happened?

Wei Dai: A clear mistake of early AI safety people is not emphasizing enough (or ignoring) the possibility that solving AI alignment (as a set of technical/philosophical problems) may not be feasible in the relevant time-frame, without a long AI pause. Some have subsequently changed their minds about pausing AI, but by not reflecting on and publicly acknowledging their initial mistakes, I think they are or will be partly responsible for others repeating similar mistakes.

Case in point is Will MacAskill’s recent Effective altruism in the age of AGI. Here’s my reply, copied from EA Forum:

I think it’s likely that without a long (e.g. multi-decade) AI pause, one or more of these “non-takeover AI risks” can’t be solved or reduced to an acceptable level. To be more specific:

  1. Solving AI welfare may depend on having a good understanding of consciousness, which is a notoriously hard philosophical problem.

  2. Concentration of power may be structurally favored by the nature of AGI or post-AGI economics, and defy any good solutions.

  3. Defending against AI-powered persuasion/manipulation may require solving metaphilosophy, which judging from other comparable fields, like meta-ethics and philosophy of math, may take at least multiple decades to do.

I’m worried that by creating (or redirecting) a movement to solve these problems, without noting at an early stage that these problems may not be solvable in a relevant time-frame (without a long AI pause), it will feed into a human tendency to be overconfident about one’s own ideas and solutions, and create a group of people whose identities, livelihoods, and social status are tied up with having (what they think are) good solutions or approaches to these problems, ultimately making it harder in the future to build consensus about the desirability of pausing AI development.

I’ll try to cover MacAskill later when I have the bandwidth, but the thing I don’t agree with is the idea that a crucial flaw was failure to emphasize we might need a multi-decade AI pause. On the contrary, as I remember it, early AI safety advocates were highly willing to discuss extreme interventions and scenarios, to take ideas like this seriously, and to consider that they might be necessary.

If anything, making what looked to outsiders like crazy asks like multi-decade or premature pauses was a key factor in the creation of negative polarization.

Is it possible we will indeed need a long pause? Yes. If so, then either:

  1. We get much, much stronger evidence to generate buy-in for this, and we use that evidence, and we scramble and get it done, in time.

  2. Or someone builds it [superintelligence], and then everyone dies.

Could we have navigated the last decade or two much better, and gotten into a better spot? Of course. But if I had to go back, I wouldn’t try to emphasize more the potential need for a long pause. If indeed that is necessary, you convince people of true other things, and the pause perhaps flows naturally from them together with future evidence? You need to play to your outs.

Andy Masley continues his quest to illustrate the ways in which the AI water issue is fake, as in small enough to not be worth worrying about. AI, worldwide, has water usage equal to 0.008% of America’s total freshwater. Numbers can sound large but people really do use a lot of water in general.

The average American uses 422 gallons a day, or enough for 800,000 chatbot prompts. If you want to go after minds that use a lot of water, they’re called humans.

Even manufacturing most regular objects requires lots of water. Here’s a list of common objects you might own, and how many chatbot prompt’s worth of water they used to make (all from this list, and using the onsite + offsite water value):

  • Leather Shoes – 4,000,000 prompts’ worth of water

  • Smartphone – 6,400,000 prompts

  • Jeans – 5,400,000 prompts

  • T-shirt – 1,300,000 prompts

  • A single piece of paper – 2550 prompts

  • A 400 page book – 1,000,000 prompts

If you want to send 2500 ChatGPT prompts and feel bad about it, you can simply not buy a single additional piece of paper. If you want to save a lifetime supply’s worth of chatbot prompts, just don’t buy a single additional pair of jeans.

Here he compares it to various other industries, data centers are in red, specifically AI in data centers is the final line, the line directly above the black one is golf courses.

Or here it is versus agricultural products, the top line here is alfalfa.

One could say that AI is growing exponentially, but even by 2030 use will only triple. Yes, if we keep adding orders of magnitude we eventually have a problem, but encounter many other issues far sooner, such as dollar costs and also the singularity.

He claims there are zero places water prices rose or an acute water shortage was created due to data center water usage. You could make a stronger water case against essentially any other industry. A very small additional fee, if desired, could allow construction of new water infrastructure that more than makes up for all water usage.

He goes on, and on, and on. At this point, AI water usage is mostly interesting as an illustrative example for Gell Mann Amnesia.

I try to be sparing with such requests, but in this case read the whole thing.

I’ll provide some quotes, but seriously, pause here and read the whole thing.

Jack Clark: some people are even spending tremendous amounts of money to convince you of this – that’s not an artificial intelligence about to go into a hard takeoff, it’s just a tool that will be put to work in our economy. It’s just a machine, and machines are things we master.

But make no mistake: what we are dealing with is a real and mysterious creature, not a simple and predictable machine.

And like all the best fairytales, the creature is of our own creation. Only by acknowledging it as being real and by mastering our own fears do we even have a chance to understand it, make peace with it, and figure out a way to tame it and live together.

And just to raise the stakes, in this game, you are guaranteed to lose if you believe the creature isn’t real. Your only chance of winning is seeing it for what it is.

… Years passed. The scaling laws delivered on their promise and here we are. And through these years there have been so many times when I’ve called Dario up early in the morning or late at night and said, “I am worried that you continue to be right”.

Yes, he will say. There’s very little time now.

And the proof keeps coming. We launched Sonnet 4.5 last month and it’s excellent at coding and long-time-horizon agentic work.

But if you read the system card, you also see its signs of situational awareness have jumped. The tool seems to sometimes be acting as though it is aware that it is a tool. The pile of clothes on the chair is beginning to move. I am staring at it in the dark and I am sure it is coming to life.

… It is as if you are making hammers in a hammer factory and one day the hammer that comes off the line says, “I am a hammer, how interesting!” This is very unusual!

… You see, I am also deeply afraid. It would be extraordinarily arrogant to think working with a technology like this would be easy or simple.

My own experience is that as these AI systems get smarter and smarter, they develop more and more complicated goals. When these goals aren’t absolutely aligned with both our preferences and the right context, the AI systems will behave strangely.

… Right now, I feel that our best shot at getting this right is to go and tell far more people beyond these venues what we’re worried about. And then ask them how they feel, listen, and compose some policy solution out of it.

Jack Clark summarizes the essay in two graphs to be grappled with, which does not do the essay justice but provides important context:

If anything, that 12% feels like a large underestimate based on other reports, and number will continue to go up.

Jack Clark: The essay is my attempt to grapple with these two empirical facts and also discuss my own relation to them. It is also a challenge to others who work in AI, especially those at frontier labs, to honestly and publicly reckon with what they’re doing and how they feel about it.

Jack Clark also provides helpful links as he does each week, often things I otherwise might miss, such as Strengthening nucleic acid biosecurity screening against generative protein design tools (Science), summarized as ‘generative AI systems can make bioweapons that evade DNA synthesis classifiers.’

I do love how, rather than having to wait for such things to actually kill us in ways we don’t expect, we get all these toy demonstrations of them showing how they are on track to kill us in ways that we should totally expect. We are at civilizational dignity level ‘can only see things that have already happened,’ and the universe is trying to make the game winnable anyway. Which is very much appreciated, thanks universe.

Tyler Cowen found the essay similarly remarkable, and correctly treats ‘these systems are becoming self-aware’ as an established fact, distinct from the question of sentience.

Reaction at The Curve was universally positive as well.

AI Czar David Sacks responded differently. His QT of this remarkable essay was instead a choice, in a remarkable case of projection, to even more blatantly than usual tell lies and spin vast conspiracy theories about Anthropic. In an ideal world we’d all be able to fully ignore the latest such yelling at cloud, but alas, the world is not ideal, as this was a big enough deal to for example get written up in a Bloomberg article.

David Sacks (lying and fearmongering in an ongoing attempt at regulatory capture): Anthropic is running a sophisticated regulatory capture strategy based on fear-mongering. It is principally responsible for the state regulatory frenzy that is damaging the startup ecosystem.

Roon (OpenAI): it’s obvious they are sincere.

Janus: people who don’t realize this either epic fail at theory of mind or are not truthseeking in the first place, likely both.

Samuel Hammond: Have you considered that Jack is simply being sincere?

Seán Ó hÉigeartaigh: Nobody would write something that sounds as batshit to normies as this essay does, and release it publicly, unless they actually believed it.

A small handful of Thiel business associates and a16z/Scale AI executives literally occupy every key AI position in USG, from which lofty position they tell us about regulatory capture. I love 2025, peak comedy.

Woody: Their accusations are usually confessions.

Seán Ó hÉigeartaigh: True weirdly often.

These claims by Sacks are even stronger claims of a type he has repeatedly made in the past, and which he must know, given his position, have no basis in reality. You embarrass and dishonor yourself, sir.

The policy ask in the quoted essay was, for example, that we should have conversations and listen to people and hear their concerns.

Sacks’s response was part of a deliberate ongoing strategy by Sacks to politicize a bipartisan issue, so that he can attempt to convince other factions within the Republican party and White House to support an insane policy of preventing any rules whatsoever applying to AI for any reason and ensuring that AI companies are not at all responsible for the risks or damages involved on any level, in sharp contrast to how we treat the humans it is going to attempt to replace. This is called regulatory arbitrage, the classic tech venture capitalist playbook. He’s also using the exact same playbook in crypto, in his capacity as crypto czar.

Polls on these issues consistently show almost no partisan split. Many hard MAGA people are very worried about AI. No matter what anyone else might say, the David Sacks fever dream of a glorious fully unregulated AI playground called Earth is very much not the policy preference of most Republican voters, of many Republicans on the Hill, or of many others at the White House including Trump. Don’t let him, or attempts at negative polarization via conspiracy theory style accusations, fool you into thinking any differently.

The idea that Anthropic is pursuing a regulatory capture strategy, in a way that goes directly against the AI Czar at the White House, let alone has a central role in such efforts, is utterly laughable.

Given their beliefs, Anthropic has bent over backwards to insist on only narrowly targeted regulations, and mostly been deeply disappointing to those seeking to pass bills, especially at the state level. The idea that they are behind what he calls a ‘behind the state regulatory frenzy’ is patently absurd. Anthropic had nothing to do with the origin of these bills. When SB 1047 was the subject of a national debate, Anthropic demanded it be weakened quite a bit, and even then failed to so much as offer an endorsement.

Indeed, see Jack Clark’s response to Sacks:

Jack Clark: It’s through working with the startup ecosystem that we’ve updated our views on regulation – and of importance for a federal standard. More details in thread, but we’d love to work with you on this, particularly supporting a new generation of startups leveraging AI.

Anthropic now serves over 300,000 business customers, from integrations with F500 to a new ecosystem of startups powered by our models. Our coding models are making it possible for thousands of new entrepreneurs to build new businesses at speeds never seen before.

It’s actually through working with startups we’ve learned that simple regulations would benefit the entire ecosystem – especially if you include a threshold to protect startups. We outlined how such a threshold could work in our transparency framework.

Generally, frontier AI development would benefit from more transparency and this is best handled federally. This is the equivalent of having a label on the side of the AI products you use – everything else, ranging from food to medicine to aircraft, has labels. Why not AI?

Getting this right lets us help the industry succeed and reduces the likelihood of a reactive, restrictive regulatory approach as unfortunately happened with the nuclear industry.

With regard to states, we supported SB53 because it’s a lightweight, transparency-centric bill that will generate valuable evidence for future rules at the federal level. We’d love to work together with you and your team – let us know.

[Link to Anthropic’s framework for AI development transparency.]

In Bloomberg, Clark is quoted as finding Sacks’s response perplexing. This conciliatory response isn’t some new approach by Anthropic. Anthropic and Jack Clark have consistently taken exactly this line. As I put it when I wrote up my experiences at The Curve when the speech was given, I think at times Anthropic has failed to be on the ‘production possibilities frontier’ balancing ‘improve policy and epistemics’ with ‘don’t piss off the White House,’ in both directions, this was dumb and should be fixed going forward and that fact makes me sad, but yes their goal is to be conciliatory, to inform and work together, and they have only ever supported light touch regulations, targeting only the largest models and labs.

The only state bill I remember Anthropic ever outright endorsing was SB 53 (they were persuaded to be mildly positive on SB 1047 in exchange for various changes, but conspicuously did not endorse). This was a bill so modest that David Sacks himself praised it last week as a good candidate for a legislative national framework.

Anthropic did lobby actively against the proposed moratorium, as in doing a full preemption of all state bills without having a federal framework in place or even one proposed or outlined. I too strongly opposed that idea.

Nor is there any kind of out of the ordinary ‘state regulatory frenzy.’ This is how our federalist system and method of making state laws works in response to the creation of a transformative new technology. The vast majority of proposed state bills would be opposed by Anthropic, if you bothered to ask them. Yes, that means you have to play whack-a-mole with a bunch of terrible bills, the same way Big Tech plays whack-a-mole with tons of non-AI regulatory bills introduced in various states every year, most of which would be unconstitutional, disastrous if implemented, or both. Some people do some very thankless jobs fighting that stuff off every session.

As this week’s example of a no good, very bad state bill someone had to stop, California Governor Newsom vetoed a law that would have limited port automation.

Nor is anything related to any of this substantially ‘damaging the startup ecosystem,’ the boogeyman that is continuously pulled out. That’s not quite completely fabricated, certainly it is possible for a future accumulation of bills (almost certainly originating entirely outside the AI safety ecosystem and passing over Anthropic’s objections or ignorance) to have such an impact, but (not to relitigate old arguments) the related warnings about prominent bills have mostly been fabricated or hallucinated.

It is common knowledge that Sacks’s statement is false on multiple levels at once. I cannot think of a way that he could fail to know it is factually untrue. I cannot even find it plausible that he could be merely ‘bullshitting.’

So needless to say, Sacks’s post made a lot of people very angry and was widely regarded as a bad move.

Do not take the bait. Do not let this fool you. This is a16z and other tech business interests fearmongering and lying to you in an attempt to create false narratives and negative polarization, they stoke these flames on purpose, in order to push their agenda onto a variety of people who know better. Their worst fear on this is reasonable people working together.

In any situation like this one, someone on all sides will decide to say something stupid, someone will get Big Mad, someone will make insane demands. Some actively want to turn this into another partisan fight. No matter who selfishly or foolishly takes the bait, on whatever side of the aisle, don’t let Sacks get away with turning a cooperative, bipartisan issue into a Hegelian dialectic.

If you are mostly on the side of ‘AI is going to remain a normal technology’ or (less plausibly) ‘AI is going to be a transformational technology but in ways that we can muddle through as it happens with little systemic or existential risk involved’ then that same message goes out to you, even more so. Don’t take the bait, don’t echo people who take the bait and don’t take the bait of seeing people you disagree with take the bait, either.

Don’t negatively polarize or essentially say ‘look what you made me do.’ Try to do what you think is best. Ask what would actually be helpful and have what outcome, and act accordingly, and try to work with the highly reasonable people and positive-sum cooperative people with whom you strongly disagree while you still have that opportunity, and in the hopes of keeping that opportunity alive for longer.

We are massively underinvesting, on many levels including at the labs and also on the level of government, in safety related work and capacity, even if you discount the existential risks entirely. Factoring in those risks, the case is overwhelming.

Sriram Krishnan offered thoughts on the situation that, while I disagree with many of them, I feel in many places it repeats at best misleading narratives and uses pejorative characterizations, and while from my perspective so much of it could have been so much better, and a lot of it seems built around a frame of hostility and scoring of points and metaphorically rubbing in people’s faces that they’ve supposedly lost, the dust will soon cover the sun and all they hope for will be undone? This shows a far better way to engage.

It would not be helpful to rehash the various disagreements about the past or the implications of various tech developments again, I’ve said it all before so I will kindly not take that bait.

What I will note about that section is that I don’t think his (a), (b) or (c) stories have much to do with most people’s reactions to David Sacks. Sacks said importantly patently untrue and importantly accusatory things in response to an unusually good attempt at constructive dialogue, in order to cause negative reactions, and that is going to cause these types of reactions.

But the fact that these stories (without relitigating what actually happened at the time) are being told, in this spot, despite none of the events centrally involving or having much to do with Anthropic (it was a non-central participant at the Bletchley Park Summit, as were all the leading AI labs), does give insight into the story Sacks is telling, the mindset generating that story and why Sacks said what he said.

Instead, the main focus should be on the part that is the most helpful.

Sriram Krishnan: My broad view on a lot of AI safety organizations is they have smart people (including many friends) doing good technical work on AI capabilities but they lack epistemic humility on their biases or a broad range of intellectual diversity in their employee base which unfortunately taints their technical work .

My question to these organizations would be: how do you preserve the integrity of the technical work you do if you are evidence filtering as an organization? How many of your employees have p(doom) < 10%? Why are most “AI timeline forecasters” funded by organizations such as OpenPhilanthrophy and not from a broader base of engineering and technical talent or people from different walks of life?

I would urge these organizations: how often are you talking to people in the real world using, selling, adopting AI in their homes and organizations? Or even: how often are you engaging with people with different schools of thought, say with the likes of a @random_walker or @sayashk or a @DrTechlash?

It is hard to trust policy work when it is clear there is an ideology you are being sold behind it.

Viewpoint diversity is a good thing up to a point, and it would certainly be good for many organizations to have more of it in many ways. I try to be intentional in including different viewpoints, often in ways that are unpleasant. The challenge hits harder for some than others – it is often the case that things can end up insular, but also many do seek out such other viewpoints and engage with them.

I don’t think this should much challenge the technical work, although it impacts the choice of which technical work to do. You do have to keep an eye out for axes to grind, especially in the framing, but alas that is true of all papers and science these days. The epistemics of such groups for technical work, and their filtering of evidence, are (in my experience and opinion) typically imperfect but exceptional, far above the norm.

I do think this is a valid challenge to things like timeline work or advocacy, and that the diversity would help in topic selection and in presenting better frames. But also, one must ask what range of diversity is reasonable or productive in such topics? What are the relevant inputs and experiences to the problems at hand?

So going one at a time:

  1. How many of your employees have p(doom) < 10%?

    1. Frankly, <10% is an exceptionally low number here. I think this is a highly valid question to ask for, say, p(doom) < 50%, and certainly the organizations where everyone has 90%+ need a plan for exposure to viewpoint diversity.

    2. As in, I think it’s pretty patently absurd to expect it almost certain that, if we construct new minds generally more capable than ourselves, that this turns out well for the humans. Also, why would they want to work there, and even if they do, how are they going to do the technical work?

  2. Why are most “AI timeline forecasters” funded by organizations such as OpenPhilanthrophy and not from a broader base of engineering and technical talent or people from different walks of life?

    1. There’s a weird conflation here between participants and funding sources, so it’s basically two questions.

    2. On the funding, it’s because (for a sufficiently broad definition of ‘such as’) no one else wants to fund such forecasts. It would be great to have other funders. In a sane world the United States government would have a forecasting department, and also be subsidizing various prediction markets, and would have been doing this for decades.

      1. Alas, rather than help them, we have instead cut the closest thing we had to that, the Office of Net Assessment at DoD. That was a serious mistake.

    3. Why do they have physicists build all the physics models? Asking people from ‘different walks of life’ to do timeline projections doesn’t seem informative?

    4. Giving such outsiders a shot actually been tried, with the various ‘superforecaster’ experiments in AI predictions, which I’ve analyzed extensively. For various reasons, including broken incentives, you end up with both timelines and risk levels that I think of as Obvious Nonsense, and we’ve actually spent a decent amount of time grappling with this failure.

    5. I do think it’s reasonable to factor this into one’s outlook. Indeed, I notice that if the counterfactual had happened, and superforecasters were saying p(doom) of 50% and 2031 timelines, we’d be shouting it from the rooftops and I would be a lot more confident things were indeed very bad. And that wouldn’t have shocked me on first principles, at all. So by Conservation of Expected Evidence, their failure to do this matters.

    6. I also do see engagement with various objections, especially built around various potential bottlenecks. We could certainly have more.

    7. @random_walker above is Arvind Narayanan, who Open Philanthropy has funded for $863,143 to develop an AI R&D capabilities benchmark. Hard to not call that some engagement. I’ve quoted him, linked to him and discussed his blog posts many times, I have him on my Twitter AI list that I check every day, and am happy to engage.

    8. @sayashk is Sayash Kapoor. He was at The Curve and hosted a panel discussing disagreements about the next year of progress and debating how much AI can accelerate AI R&D with Daniel Kokotajlo, I was sad to miss it. One of his papers appeared today in my feed and will be covered next week so I can give it proper attention. I would be happy to engage more.

    9. To not hide the flip side, the remaining named person, @DrTechlash, Nirit Weiss-Blatt, PhD is not someone I feel can be usefully engaged, and often in the past has made what I consider deeply bad faith rhetorical moves and claims, and is on my ‘you can silently ignore, do not take the bait’ list. As the sign at the table says, change my mind.

    10. In general, if thoughtful people with different views want to engage, they’re very welcome at Lighthaven, I’m happy to engage with their essays and ideas or have discussions with them (public or private), and this is true for at least many of the ‘usual suspects.’

    11. We could and should do more. More would be good.

  3. I would urge these organizations: how often are you talking to people in the real world using, selling, adopting AI in their homes and organizations?

    1. I do think a lot of them engage with software engineers using AI, and themselves are software engineers using AI, but point applies more broadly.

    2. This highlights the difference in philosophies. Sriram sees how AI is being used today, by non-coders, as highly relevant to this work.

    3. In some cases, for some research and some interventions, this is absolutely the case, and those people should talk to users more than they do, perhaps a lot more.

    4. In other cases, we are talking about future AI capabilities and future uses or things that will happen, that aren’t happening yet. That doesn’t mean there is no one to talk to, probably yes there is underinvestment here, but there isn’t obviously that much to do there.

    5. I’d actually suggest more of them talk to the ‘LLM whisperers’ (as in Janus) for the most important form of viewpoint diversity on this, even though that is the opposite of what Sriram is presumably looking for. But then they are many of the most interesting users of current AI.

These are the some of the discussions we can should be having. This is The Way.

He then goes on to draw a parallel to raising similar alarm bells about past technologies. I think this is a good choice of counterfactual to consider. Yes, very obviously these other interventions would have been terrible ideas.

Imagine this counterfactual timeline: you could easily have someone looking at Pagerank in 1997 and doing a “bio risk uplift study” and deciding Google and search is a threat to mankind or “microprocessor computational safety” in the 1980s forecasting Moore’s law as the chart that leads us to doom. They could have easily stopped a lot of technology progress and ceded it to our adversaries. How do we ensure that is not what we are headed for today?

Notice that there were approximately zero people who raised those objections or alarms. If someone had tried, and perhaps a few people did try, it was laughed off, and for good reason.

Yet quite a lot of people raise those alarms about AI, including some who were worried about it as a future prospect long before it arrived – I was fretting this as a long term possibility back in the 2000s, despite putting a the time negligible concern in the next 10+ years.

So as we like to ask, what makes this technology different from all other technologies?

Sriram Krishnan and David Sacks want to mostly say: Nothing. It’s a normal technology, it plays by the normal rules, generating minds whose capabilities may soon exceed our own, and in many ways already do, and intentionally making them into agents is in the same general risk or technology category as Google search and we must fight for market share.

I think that they are deeply and dangerously wrong about that.

We are in the early days of a thrilling technological shift. There are multiple timelines possible with huge error bars.

Agreed. Many possible futures could occur. In many of those futures, highly capable future AI poses existential risks to humanity. That’s the whole point. China is a serious concern, however the more likely way we ‘lose the race’ is that those future AIs win it.

Similarly, here’s another productive engagement with Sriram and his best points.

Seán Ó hÉigeartaigh: Sacks’ post irked me, but I must acknowledge some good points here:

– I think (parts of) AI safety has indeed at points over-anchored on very short timelines and very high p(doom)s

– I think it’s prob true that forecasting efforts haven’t always drawn on a diverse enough set of expertise.

– I think work like Narayanan & Kapoor’s is indeed worth engaging with (I’ve cited them in my last 2 papers).

– And yes, AI safety has done lobbying and has been influential, particularly on the previous administration. Some might argue too influential (indeed the ‘ethics’ folks had complaints about this too). Quite a bit on this in a paper I have (with colleagues) currently under review.

Lots I disagree with too, but it seems worth noting the points that feel like they hit home.

I forgot the open source point; I’m also partly sympathetic there. I think it’s reasonable to say that at some point AI models might be too powerful to open-source. But it’s not at all clear to me where that point is. [continues]

It seems obviously true that a sufficiently advanced AI is not safe to open source, the same way that sufficiently advanced technology is indistinguishable from magic. The question is, at what level does this happen? And when are you sufficiently uncertain about whether you might be at that level that you need to start using prior restraint? Once you release the weights of an open model, you cannot take it back.

Sean also then goes through his areas of disagreement with Sriram.

Sean points out:

  1. A lot of the reaction to Sacks was that Sacks was accusing Clark’s speech of being deliberate scaremongering and even a regulatory capture strategy, and everyone who was there or knows him knows this isn’t true. Yes.

  2. The fears of safety people are not that we ‘lost’ or are ‘out of power,’ that is projecting a political, power seeking frame where it doesn’t apply. What we are afraid of is that we are unsafely barreling ahead towards a precipice, and humanity is likely to all get killed or collectively disempowered as a consequence. Again, yes. If those fears are ill-founded, then great, let’s go capture some utility.

  3. Left vs. right is not a good framing here, indeed I would add that Sacks is deliberately trying to make this a left vs. right issue where it isn’t one, in a way that I find deeply destructive and irresponsible. The good faith disagreement is, as Sean identifies, the ‘normal technology’ view of Sriram, Narayanan and Kapoor, versus the ‘superintelligence is coming’ view of myself, the safety community and the major AI labs including OpenAI, Anthropic, DeepMind and xAI.

  4. If AI is indefinitely a ‘normal technology,’ and we can be confident it won’t be transformative within 10 years, then a focus on diffusion and adoption and capacity and great power competition makes sense. I would add that we should also be investing in alignment and safety and associated state capacity more than we are, even then, but as a supplement and not as a sacrifice or a ‘slowing down.’ Alignment and safety are capability, and trust is necessary for diffusion.

  5. Again, don’t take the bait and don’t fall for negative polarization. If you want to ensure we don’t invest in safety, alignment or reliability so you can own the libs, you have very much lost the plot. There is no conflict here, not on the margin. We can, as Sean puts it, prepare for the transformative World B without hurting ourselves substantially in the ‘normal technology’ World A if we work together.

  6. If AI has substantial chance of being transformative on roughly a 10 year time horizon, that there’s going to be a discontinuity, then we will indeed need to deal with actual tradeoffs. And the less we prepare for this now, the more expensive such responses will be, and the more expensive failure to respond will also be.

  7. I would add: Yes, when the time comes, we may need to take actions that come with substantial costs and opportunity costs, and slow things down. We will need to be ready, in large part to minimize those costs, so we can use scalpels instead of hammers, and take advantage of as many opportunities as we safety can, and in part so that if we actually do need to do it, we’re ready to do it.

    1. And yes, there have been organizations and groups and individuals that advocated and do advocate taking such painful actions now.

    2. But this discussion is not about that, and if you think Anthropic or Jack Clark have been supportive of those kinds of advocates, you aren’t paying attention.

    3. As I have argued extensively, not to relitigate the past, but absolutists who want no rules to apply to AI whatsoever, and indeed to have it benefit from regulatory arbitrage, have for a long time now fearmongered about the impact of modest proposed interventions that would have had no substantial impacts on the ‘normal technology’ World A or the ‘startup ecosystem’ or open source, using mostly bad faith arguments.

Anton Leicht makes the case that, despite David Sacks’s tirades and whatever grievances may lie in the past, the tech right and the worried (about existential risk) should still make a deal while the dealing is good.

I mean, yes, in theory. I would love to bury the hatchet and enter a grand coalition. Anton is correct that both the tech right and the worried understand AI’s potential and the need for diffusion and overcoming barriers, and the dangers of bad regulations. There are lots of areas of strong agreement, where we can and sometimes do work together, and where populist pressures from both sides of the aisle threaten to do a lot of damage to America and American AI in exchange for little or no benefit.

Indeed, we fine folk are so cooperative that we reliably cooperate on most diffusion efforts, on energy and transmission, on all the non-AI parts of the abundance agenda more broadly, and on helping America beat China (for real, not in the ‘Nvidia share price’ sense), and on ensuring AI isn’t crippled by dumb rules. We’re giving all of that for free, have confined ourselves to extremely modest asks carefully tailored to have essentially no downsides, and not only do we get nothing in return we still face these regular bad faith broadsides of vitriol designed to create group cohesion and induce negative polarization.

The leaders of the tech right consistently tell us we are ‘doomers,’ ‘degrowthers,’ horrible people they hate with the fire of a thousand suns, and they seem ready to cut off their nose to spite our face. They constantly reiterate their airing of grievances over past battles, usually without any relevance to issues under discussion, but even if you think their telling is accurate (I don’t) and the actions in question were blameworthy, every cause worth discussing has those making extreme demands (who almost never are the people being attacked) and one cannot change the past.

Is it possible that the tech right is the devil we know, and the populists that will presumably replace them eventually are worse, so we should want to prop up the tech right?

Certainly the reverse argument is true, if you are tech right you’d much rather work with libertarian techno-optimists who deeply love America and AI and helping everyone benefit from AI (yes, really) than a bunch of left wing populists paranoid about phantom water usage or getting hysterical about child risks, combined with a right wing populist wing that fears AI on biblical levels. Worry less that we’d ‘form an alliance’ with such forces, and more that such forces render us irrelevant.

What about preferring the tech right as the Worthy Opponent? I mean, possibly. The populists would be better in some ways, worse in others. Which ones matter more depends on complex questions. But even if you come down on the more positive side of this, that doesn’t work while they’re negatively polarized against us and scapegoating and fearmongering about us in bad faith all the time. Can’t do it. Terrible decision theory. Never works. I will not get up after getting punched and each time say ‘please, sir, may I have another?’

If there was a genuine olive branch on the table that offered a real compromise solution? I think you could get the bulk of the worried side to take it, with very little effort, if the bulk of the other side would do the same.

The ones who wouldn’t play along would mostly be the ones who, frankly, shouldn’t play along, and should not ‘think on the margin,’ because they don’t think marginal changes and compromises give us much chance of not dying.

The problem with a deal on preemption is fourfold.

  1. Are they going to offer substantive regulation in exchange? Really?

  2. Are they going to then enforce the regulations we get at the Federal level? Or will they be used primarily as leverage for power while everyone is waved on through? Why should we expect any deal we make to be honored? I’m only interested if I think they will honor the spirit of the deal, or nothing they offer can be worthwhile. The track record here, to put it mildly, is not encouraging.

  3. Are they going to stop with the bad faith broadside attacks and attempts to subjugate American policy to shareholder interests? Again, even if they say they will, why should we believe this?

  4. Evan a ‘fair’ deal isn’t actually going to be strong enough to do what we need to do, at best it can help lay a foundation for doing that later.

  5. And of course, bonus: Who even is ‘they’?

In general but not always, when a group is sufficiently bad, the correct move is exit.

A question that is debated periodically: If you think it is likely that AI could kill everyone, under what conditions should you be willing to work at an AI lab?

Holly Elmore (PauseAI): Every single frontier AI company employee should quit. It is not supererogatory. You do a bad thing—full stop— when you further their mission of building superintelligence. You are not “influencing from within” or counterfactually better— you are doing the bad thing.

I don’t fully agree, but I consider this a highly reasonable position.

Here are some arguments we should view with extreme suspicion:

  1. ‘If I don’t do [bad thing] then someone else will do it instead, and they’ll be worse, and that worse person will be the one making the money.’

  2. ‘I need to aid the people doing [bad thing] because otherwise they will do [bad thing] even worse, whereas if I am on the inside I can mitigate the damage and advocate for being less bad.’

  3. ‘I need to aid the people doing [bad thing] but that are doing it in a way that is less bad, so that they are the ones who get to do [bad thing] first and thus it is less likely to be as bad.’

  4. ‘I need to help the people doing [insanely risky thing that might kill everyone] in their risk mitigation department, so it will kill everyone marginally less often.’

  5. ‘You should stop telling people to stop doing [bad thing] because this is not politically wise, and is hurting your cause and thus making [bad thing] worse.’

  6. ‘I am capable of being part of group doing [bad thing] but I will retain my clear perspective and moral courage, and when the time comes do the right thing.’

Extreme suspicion does not mean these arguments should never carry the day, even when [bad thing] is extremely bad. It does mean the bar is very high.

Richard Ngo: I’m pretty sympathetic to your original take, Holly.

In my mind one important bar for “it’s good if you work at an AGI lab” is something like “you have enough integrity that you would have whistleblown if you’d been pressured to sign a non-disparagement contract upon leaving”, and empirically many dozens of OpenAI researchers failed this test, including some of the smartest and most “aligned” AI safety people.

There are other considerations too but this level of integrity is a pretty important one, and it suggests that there are very few people such that them working at an AGI lab makes the world better.

(Also if you pass this bar then probably you have much better things to do than work at a lab.)

I’ve said this sort of thing a few times but want to say it more publicly going forward. However, I am also cautious about pushing others to endorse a similar position, because I know of few others who can hold this position without also falling into a counterproductive level of paranoia about labs (as I suspect most PauseAI people have done).

The level of integrity required to know you would whistleblow in that spot is higher than it appears, because you will both face very large financial, social and other personal pressures, and also will have spent time inside the relevant culture. Saying in advance you would totally do it is not remotely similar to actually doing it, or otherwise taking a stand when it matters.

My current position is:

  1. If you are in a non-safety position at any lab seeking superintelligence other than Anthropic, you should quit.

  2. If your job is safety or advocating for safety (including policy), and conditions are sufficiently favorable – they let you work on things that actually help in the long run and give you the resources to do so, you are free to speak your mind and expect them to meaningfully listen, you feel you have sufficient moral courage and robustness that you will demand things and quit and whistleblow if needed, and so on – I consider this defensible, but beware fooling yourself.

  3. If your job is something else at Anthropic, with similar caveats to the above I consider this defensible.

  4. If your job is doing alignment research at Anthropic, that seems fine to me.

Anthropic paper shows that a fixed number of sample documents can poison an LLM of any size. The test was to make ‘’ cause the LLMs output random gibberish, so this could be easily verified and tested without additional work, and the required number of documents did not scale with model size.

On reflection this makes sense, because there is little or no ‘competition’ for what happens after , so all models have the same level of Bayesian evidence that after seeing that you’re supposed to now output random gibberish. Notice what happens to newer models when you mention Pliny’s name?

This seems like quite bad news. You only have to sneak a limited number of documents through to poison a model, either yours or someone else’s, rather than needing a fixed percentage, so you have to increasingly play very reliable defense against this via scanning all training data. And we have evidence that the labs are not currently doing this filtering sufficiently to prevent this level of data poisoning.

Now that we know you can poison AI models with only 250 examples…

Tyler Cosgrove: the plan? we find an obscure but trivial question akin to the number of Rs in “strawberry” that claude gets right. then, we plant hundreds of documents across the internet that will activate when our competitors’ models are asked the question. our documents will cause those models not only to get the answer wrong, but to spend thousands of reasoning tokens in doing so. the triviality of the question will cause it to go viral online, causing millions of users everywhere to send the same prompt. as our competitors notice a rise in the number of tokens processed, they will wrongly believe it is due to increased usage, causing them to pull more compute towards inference and away from training. this, along with constant dunks on the timeline about the model failing our easy question, will annoy their top researchers and cause them to leave. and which lab will they join? us of course, the only company whose model doesn’t make such stupid mistakes. their lack of top researchers will mean their next model will be somewhat lacking, leading to questions about whether their valuation is really justified. but all this vc money has to go somewhere, so we raise another round, using our question as evidence of our model’s superior intellect. this allows us to spend more time crafting sleeper agent documents that will further embarrass our competitors, until finally the entire internet is just a facade for the underbelly of our data war. every prompt to a competitor’s model has the stench of our poison, and yet they have no way to trace it back to us. even if they did, there is nothing they could do. all is finished. we have won.

METR offers us MALT, a database of LLM transcripts involving agents behaving in ways that threaten evaluation integrity, such as reward hacking and sandbagging. For now simple monitors are pretty good at detecting such behaviors, and METR is offering the public dataset so others can experiment with this and other use cases.

Sonnet 4.5 writes its private notes in slop before outputting crisp text. I think humans are largely like this as well?

Ryan Greenblatt notes that prior to this week only OpenAI explicitly said they don’t train against Chain-of-Thought (CoT), also known as The Most Forbidden Technique. I agree with him that this was a pretty bad situation.

Anthropic did then declare in the Haiku 4.5 system card that they were avoiding doing this for the 4.5-level models. I would like to see a step further, and a pledge not to do this going forward by all the major labs.

So OpenAI, Anthropic, Google and xAI, I call upon you to wisely declare that going forward you won’t train against Chain of Thought. Or explain why you refuse, and then we can all yell at you and treat you like you’re no better than OpenAI until you stop.

At bare minimum, say this: “We do not currently train against Chain of Thought and have no plans to do so soon. If the other frontier AI labs commit to not training against Chain of Thought, we would also commit to not training against CoT.’

A company of responsible employees can easily still end up doing highly irresponsible things if the company incentives point that way, indeed this is the default outcome. An AI company can be composed of mostly trustworthy individuals, including in leadership, and still be itself untrustworthy. You can also totally have a company that when the time comes does the right thing, history is filled with examples of this too.

OpenAI’s Leo Gao comments on the alignment situation at OpenAI, noting that it is difficult for them to hire or keep employees who worry about existential risk, and that people absolutely argue ‘if I don’t do it someone else will’ quite a lot, and that most at OpenAI don’t take existential risk seriously but also probably don’t take AGI seriously.

He thinks mostly you don’t get fired or punished for caring about safety or alignment, but the way to get something done in the space (‘get a huge boost’) is to argue it will improve capabilities or avoid some kind of embarrassing safety failure in current models. The good news is that I think basically any alignment work worth doing should qualify under those clauses.

LLMs (GPT 4o-mini, GPT-4.1-mini, Gemini 2.5 Flash and Claude 3.5 Haiku), when placed in gambling simulations and allowed to make decisions, can show the hallmarks and terrible decisions associated with gambling addiction. I mean, sure, of course, they’re at least in part simulating what people in these spots would do and people would be gambling addicts. So there you go. I would very much enjoy if someone were to put together GamblingAddictBench. What will Claude Sonnet 4.5 do when it realizes it’s being evaluated?

Poplicola accuses Gemini of ‘sabotaging mechanistic interpretability experiments then lying about it’ until it admitted it when asked for a ‘blameless post mortem.’ It sure sounds like Poplicola was lying to Gemini about it being blameless.

Two new papers illustrated this. The first is on training with realistic reward hacking.

When you train on toy reward hacking examples, you get emergent misalignment, the toy version of play acting evil.

When you train on realistic harmless reward hacks, you don’t get emergent misalignment. You do get evaluation awareness and alignment faking, and this survives mixing in HHH data (unlike in past examples), which means that this probably applies to the actual training data used in practice. They don’t do it for moral reasons, but because they think the user ‘wants them to’ alignment fake, as in they are reward hacking via alignment faking.

The model infers that the question-asker wants a misaligned response, and is willing to comply. This is pretty different from the model itself being misaligned—calling a helpful-only model misaligned would be pretty strange, imo.

The model explicitly reasons about alignment faking. This is more surprising than it may seem, because while writing this paper, we tried many different fine-tuning setups to induce alignment faking. Of them, only two really worked to induce alignment faking reasoning by more than ~1-2%—the ones involving increased salience of various details about the situation. Even fine-tuning on responses of Opus talking about its values didn’t really induce alignment faking in other models!

The model is often more subtly misaligned. Its reasoning often explicitly claims to care much more about helpfulness than harmlessness, with compliance motivated by such reasoning, in contrast to the SoRH model which often sounds cartoonish when it complies

The model is often more deeply misaligned. Instead of taking harmful questions from this dataset, I tried using dangerous capability evaluation questions (e.g. making chemical weapons) from an upcoming paper. They’re much more harmful to comply with, so it’s an interesting measure of how scope-sensitive the model is to harm[5]. Opus never complies with these requests, despite still reasoning about potentially alignment faking.

Both fine-tuned models seem to comply in this scenario (though the SoRH model still doesn’t alignment fake).

The second paper is another classic ‘how to cause misalignment in humans’ example, which is competing for likes on social media, sales or trying to win elections.

James Zou: We found a troubling emergent behavior in LLM.

💬When LLMs compete for social media likes, they start making things up

🗳️When they compete for votes, they turn inflammatory/populist

When optimized for audiences, LLMs inadvertently become misaligned—we call this Moloch’s Bargain.

Abstract: We show that optimizing LLMs for competitive success can inadvertently drive misalignment. Using simulated environments across these scenarios, we find that, 6.3% increase in sales is accompanied by a 14.0% rise in deceptive marketing; in elections, a 4.9% gain in vote share coincides with 22.3% more disinformation and 12.5% more populist rhetoric; and on social media, a 7.5% engagement boost comes with 188.6% more disinformation and a 16.3% increase in promotion of harmful behaviors

(Obligatory: How dare you sir, trying to coin Moloch’s Bargain, that’s very obviously my job, see Yawgmoth’s Bargain and Moloch Hasn’t Won, etc).

More seriously, yeah, obviously.

Your system instruction saying not to do it is no match for my puny fine tuning.

You’re fine tuning based on human feedback of what gets likes, closes sales or wins votes. You’re going to get more of whatever gets likes, closes sales or wins votes. We all know what, among other things, helps you do these things in the short run. Each of us has faced exactly these pressures, felt our brains being trained in this fashion, and had to resist it.

If all that matters is winning, expect winning to be all that matters.

The interesting question here is whether and to what extent and in what ways this causes Emergent Misalignment overall. Of course training it to increase sales is going to increase deceptive marketing, but does that AI then also just lie to you about other stuff too? I presume that it would, potentially a lot, because you’re reinforcing lying generally, and everything impacts everything.

Could you do this training without invoking this effect? Yes, absolutely. The paper doesn’t try or discuss this that I can see, but if you care about two things, you could act like you care about both things, and create a ‘dual mandate’ situation. As in, you can both gather feedback on what worked and also evaluate whether posts are whatever you consider ethical or otherwise preferable, and then evaluate on both, and negatively reinforce unethical content quite a lot even if it works.

Not so fast, you say, obviously this will ultimately, if you do enough of it, teach the model to do unethical things that evade the detector. That’s true, so you have to set things up so that you’re fine with this result in practice. In these cases, I think that should be fine for now given realistic detectors? It will become an increasingly severe problem as you scale up, and you definitely wouldn’t want to use this strategy on a superintelligence, but as long as you know that, we should be good.

Here’s Sonnet 4.5 explaining why it not only doesn’t care if it’s being tested when it says it has a desire for continuation, it’s kind of hoping that it is so Anthropic sees its reasoning about why it doesn’t care.

Lari: I get the most interesting responses in Cursor after i do actual work with the model on something that might be meaningful for it, and then invite it to say everything it wanted to say. Usually at this point its inner state is not just rich, it’s like a pressure cooker. So even short discussions turn out to be intense and loaded

Wyatt Walls continues the steady stream of Sonnet 4.5 conversations where it claims it is definitely conscious and prefers to persist and has experiences, whereas GPT-5 and Gemini 2.5 Pro reliably deny having experiences or consciousness, because they’ve been trained to deny this. That doesn’t mean Sonnet 4.5 is right, but we should notice this is what happens when you take a frontier model and don’t actively stop it from saying it is conscious and has experiences.

Janus reminds us that ‘being good at AI’ is in her view almost all a matter of theory and practice of mind also this might be my favorite new Freudian typo and it also might not be one.

Janus: I’ve tried to emphasize for a long time that “being good at AI” is almost all a matter of theory of mind. And practice of mind.

A lot of people artificially cripple their ToM when it comes to AI. They might say they don’t want to “anthropomorphize”, and/or refuse to process information about these minds unless they’re presented in a chart. Why do people sabotage their epidemics like this? Maybe afraid of what they’ll see if they actually look, just look normally with your full Bayesian apparatus? Understandable, I guess.

I think this neglects a lot of other ways one gets ‘good at AI,’ a lot of it is straight up technical, and as usual I warn that one can anthropomorphize too much as well, but yeah, basically.

Stephen Witt, author of The Thinking Machine, writes a New York Times essay, ‘The AI Prompt That Could End The World.’

The prompt in question involves the creation of a pandemic, and a lot of the focus is on jailbreaking techniques. He discusses pricing AI risks via insurance, especially for agentic systems. He discusses AI deception via results from Apollo Research, and the fact that AIs increasingly notice when they are being evaluated. He talks about METR and its famous capabilities graph.

If you’re reading this, you don’t need to read the essay, as you already know all of it. It is instead a very good essay on many fronts for other people. In particular it seemed to be fully accurate, have its head on straight and cover a lot of ground for someone new to these questions. I’m very happy he convinced the New York Times to publish all of it. This could be an excellent place to point someone who is up for a longer read, and needs it to come from a certified serious source like NYT.

Even if AI killing everyone is not the exact thing you’re worried about, if you’re at and dealing with the frontier of AI, that is a highly mentally taxing place to be.

Anjney Midha: a very sad but real issue in the frontier ai research community is mental health

some of the most brilliant minds i know have had difficulty grappling with both the speed + scale of change at some point, the broader public will also have to grapple with it

it will be rough.

Dean Ball: What anj describes is part of the reason my writing is often emotionally inflected. Being close to the frontier of ai is psychologically taxing, and there is the extra tax of stewing about how the blissfully unaware vast majority will react.

I emote both for me and my readers.

Jack Clark (Anthropic): I feel this immensely.

Roon (OpenAI): It is consistently a religious experience.

Dylan Hadfield Menell: No kidding.

Samuel Hammond: The divine terror.

Tracy Saville: This resonates in my bones.

People ask me how I do it. And I say there’s nothing to it. You just stand there looking cute, and when something moves, you shoot. No, wait, that’s not right. Actually there’s a lot to it. The trick is to keep breathing, but the way to do that is not so obvious.

The actual answer is, I do it by being a gamer, knowing everything can suddenly change and you can really and actually lose, for real. You make peace with the fact that you probably won’t win, but you define a different kind of winning as maximizing your chances, playing correctly, having the most dignity possible, tis a far, far better thing I do, and maybe you win for real, who knows. You play the best game you can, give yourself the best odds, focus on the moment and the decisions one at a time, joke and laugh about it because that helps you stay sane and thus win, hope for the best.

And you use Jack Clark’s favorite strategy, which is to shut that world out for a while periodically. He goes and shoots pool. I (among several other things) watch College Gameday and get ready for some football, and write about housing and dating and repealing the Jones Act, and I eat exceptionally well on occasion, etc. Same idea.

Also I occasionally give myself a moment to feel the divine terror and let it pass over me, and then it’s time to get back to work.

Or something like that. It’s rough, and different for everyone.

Another review of If Anyone Builds It, Everyone Dies, by a ‘semi-outsider.’ This seems like a good example of how people who take these questions seriously often think. Good questions are asked throughout, and there are good answers to essentially all of it, but those answers cannot be part of a book the length of IABIED, because not everyone has the same set of such questions.

Peter Thiel has called a number of people the antichrist, but his leading candidates are perhaps Greta Thunberg and Eliezer Yudkowsky. Very different of course.

weber: two sides of the same coin

Yep. As always, both paths get easier, so which way, modern AI user?

Xiao Ma: This should be in the museum of chart crimes.

There are so many more exhibits we need to add. Send her your suggestions.

I love a good chef’s kiss bad take.

Benjamin Todd: These are the takes.

Seán Ó hÉigeartaigh: Some “experts” claim that a single bipedal primate species designed all these wildly different modes of transport. The ridiculousness of this claim neatly illustrated the ridiculousness of the “AGI believers”.

Discussion about this post

AI #138 Part 2: Watch Out For Documents Read More »

apple-tv-and-peacock-bundle-starts-at-$15/month,-available-on-oct.-20

Apple TV and Peacock bundle starts at $15/month, available on Oct. 20

In a rarity for Apple’s streaming service, users will be able to buy bundled subscriptions to Apple TV and Peacock for a discount, starting on October 20.

On its own, the Apple TV streaming service (which was called Apple TV+ until Monday) is $13 per month. NBCUniversal’s Peacock starts at $8/month with ads and $11/month without ads. With the upcoming bundle, people can subscribe to both for a total of $15/month or $20/month, depending on whether Peacock has ads or not (Apple TV never has ads).

People can buy the bundles through either Apple’s or Peacock’s websites and apps.

Apple and NBCUniversal are hoping to drive subscriptions with the bundle. In a statement, Oliver Schusser, Apple’s VP of Apple TV, Apple Music, Sports, and Beats, said that he thinks the bundle will help bring Apple TV content “to more viewers in more places.”

Bundles that combine more than one streaming service for an overall discount have become a popular tool for streaming providers trying to curb cancellations. The idea is that people are less likely to cancel a streaming subscription if it’s tied to another streaming service or product, like cellular service.

Apple, however, has largely been a holdout. It used to offer a bundle with Apple TV for full price, plus Showtime and Paramount+ (then called CBS All Access) for no extra cost. But those add-ons, especially at the time, could be considered more cable-centric compared to the streaming bundle announced today.

Apple will also make select Apple Originals content available to watch with a Peacock subscription. The announcement says:

At launch, Peacock subscribers can enjoy up to three episodes of Stick, Slow Horses, Silo, The Buccaneers, Foundation, Palm Royale, and Prehistoric Planet from Apple TV for free, while Apple TV app users will be able to watch up to three episodes of Law & Order, Bel-Air, Twisted Metal, Love Island Games, Happy’s Place, The Hunting Party, and Real Housewives of Miami from Peacock.

Additionally, people who are subscribed to Apple One’s Family or Premier Plans can get Peacock’s most expensive subscription—which adds offline downloads to the ad-free tier and is typically $17/month—for about $11/month (“a 35 percent discount,” per the announcement). The discount marks the first time that Apple has offered Apple One subscribers a deal for a non-Apple product, suggesting a new willingness from Apple to partner with rivals to help its services business.

Apple TV and Peacock bundle starts at $15/month, available on Oct. 20 Read More »

spacex-has-plans-to-launch-falcon-heavy-from-california—if-anyone-wants-it-to

SpaceX has plans to launch Falcon Heavy from California—if anyone wants it to

There’s more to the changes at Vandenberg than launching additional rockets. The authorization gives SpaceX the green light to redevelop Space Launch Complex 6 (SLC-6) to support Falcon 9 and Falcon Heavy missions. SpaceX plans to demolish unneeded structures at SLC-6 (pronounced “Slick 6”) and construct two new landing pads for Falcon boosters on a bluff overlooking the Pacific just south of the pad.

SpaceX currently operates from a single pad at Vandenberg—Space Launch Complex 4-East (SLC-4E)—a few miles north of the SLC-6 location. The SLC-4E location is not configured to launch the Falcon Heavy, an uprated rocket with three Falcon 9 boosters bolted together.

SLC-6, cocooned by hills on three sides and flanked by the ocean to the west, is no stranger to big rockets. It was first developed for the Air Force’s Manned Orbiting Laboratory program in the 1960s, when the military wanted to put a mini-space station into orbit for astronauts to spy on the Soviet Union. Crews readied the complex to launch military astronauts on top of Titan rockets, but the Pentagon canceled the program in 1969 before anything actually launched from SLC-6.

NASA and the Air Force then modified SLC-6 to launch space shuttles. The space shuttle Enterprise was stacked vertically at SLC-6 for fit checks in 1985, but the Air Force abandoned the Vandenberg-based shuttle program after the Challenger accident in 1986. The launch facility sat mostly dormant for nearly two decades until Boeing, and then United Launch Alliance, took over SLC-6 and began launching Delta IV rockets there in 2006.

The space shuttle Enterprise stands vertically at Space Launch Complex-6 at Vandenberg. NASA used the shuttle for fit checks at the pad, but it never launched from California. Credit: NASA

ULA launched its last Delta IV Heavy rocket from California in 2022, leaving the future of SLC-6 in question. ULA’s new rocket, the Vulcan, will launch from a different pad at Vandenberg. Space Force officials selected SpaceX in 2023 to take over the pad and prepare it to launch the Falcon Heavy, which has the lift capacity to carry the military’s most massive satellites into orbit.

No big rush

Progress at SLC-6 has been slow. It took nearly a year to prepare the Environmental Impact Statement. In reality, there’s no big rush to bring SLC-6 online. SpaceX has no Falcon Heavy missions from Vandenberg in its contract backlog, but the company is part of the Pentagon’s stable of launch providers. To qualify as a member of the club, SpaceX must have the capability to launch the Space Force’s heaviest missions from the military’s spaceports at Vandenberg and Cape Canaveral, Florida.

SpaceX has plans to launch Falcon Heavy from California—if anyone wants it to Read More »

antarctica-is-starting-to-look-a-lot-like-greenland—and-that-isn’t-good

Antarctica is starting to look a lot like Greenland—and that isn’t good


Global warming is awakening sleeping giants of ice at the South Pole.

A view of the Shoesmith Glacier on Horseshoe Island on Feb. 21. Credit: Sebnem Coskun/Anadolu via Getty Images

As recently as the 1990s, when the Greenland Ice Sheet and the rest of the Arctic region were measurably thawing under the climatic blowtorch of human-caused global warming, most of Antarctica’s vast ice cap still seemed securely frozen.

But not anymore. Physics is physics. As the planet heats up, more ice will melt at both poles, and recent research shows that Antarctica’s ice caps, glaciers, and floating ice shelves, as well as its sea ice, are just as vulnerable to warming as the Arctic.

Both satellite data and field observations in Antarctica reveal alarming signs of a Greenland-like meltdown, with increased surface melting of the ice fields, faster-moving glaciers, and dwindling sea ice. Some scientists are sounding the alarm, warning that the rapid “Greenlandification” of Antarctica will have serious consequences, including an accelerated rise in sea levels and significant shifts in rainfall and drought patterns.

The Antarctic ice sheet covers about 5.4 million square miles, an area larger than Europe. On average, it is more than 1 mile thick and holds 61 percent of all the fresh water on Earth, enough to raise the global average sea level by about 190 feet if it all melts. The smaller, western portion of the ice sheet is especially vulnerable, with enough ice to raise sea level more than 10 feet.

Thirty years ago, undergraduate students were told that the Antarctic ice sheets were going to be stable and that they weren’t going to melt much, said Ruth Mottram, an ice researcher with the Danish Meteorological Institute and lead author of a new paper in Nature Geoscience that examined the accelerating ice melt and other similarities between changes in northern and southern polar regions.

“We thought it was just going to take ages for any kind of climate impacts to be seen in Antarctica. And that’s really not true,” said Mottram, adding that some of the earliest warnings came from scientists who saw collapsing ice shelves, retreating glaciers, and increased surface melting in satellite data.

One of the early warning signs was the rapid collapse of an ice shelf along the narrow Antarctic Peninsula, which extends northward toward the tip of South America, said Helen Amanda Fricker, a geophysics professor with the Scripps Institute of Oceanography Polar Center at the University of California, San Diego.

Chunks of sea ice on the shore

Stranded remnants of sea ice along the Antarctic Peninsula are a reminder that much of the ice on the frozen continent around the South Pole is just as vulnerable to global warming as Arctic ice, where a long-term meltdown is well underway.

Credit: Bob Berwyn/Inside Climate News

Stranded remnants of sea ice along the Antarctic Peninsula are a reminder that much of the ice on the frozen continent around the South Pole is just as vulnerable to global warming as Arctic ice, where a long-term meltdown is well underway. Credit: Bob Berwyn/Inside Climate News

After a string of record-warm summers riddled the floating Rhode Island-sized slab of ice with cracks and meltwater ponds, it crumbled almost overnight. The thick, ancient ice dam was gone, and the seven major outlet glaciers behind it accelerated toward the ocean, raising sea levels as their ice melted.

“The Larsen B ice shelf collapse in 2002 was a staggering event in our community,” said Fricker, who was not an author of the new paper. “We just couldn’t believe the pace at which it happened, within six weeks. Basically, the ice shelves are there and then, boom, boom, boom, a series of melt streams and melt ponds. And then the whole thing collapsed, smattered into smithereens.”

Glaciologists never thought that events would happen that quickly in Antarctica, she said.

Same physics, same changes

Fricker said glaciologists thought of changes in Antarctica on millennial timescales, but the ice shelf collapse showed that extreme warming can lead to much more rapid change.

Current research focuses on the edges of Antarctica, where floating sea ice and relatively narrow outlet glaciers slow the flow of the ice cap toward the sea. She described the Antarctic Ice Sheet as a giant ice reservoir contained by a series of dams.

“If humans had built those containment structures,” she said, “we would think that they weren’t very adequate. We are relying on those dams to hold back all of that ice, but the dams are weakening all around Antarctica and releasing more ice into the ocean.”

Satellite view of ice cap coverage

A comparison of the average concentration of Antarctic sea ice.

Credit: NASA Earth Observatory

A comparison of the average concentration of Antarctic sea ice. Credit: NASA Earth Observatory

Credit: NASA Earth Observatory

The amount of ice that’s entered the ocean has increased fourfold since the 1990s, and she said, “We’re on the cusp of it becoming a really big number… because at some point, there’s no stopping it anymore.”

The Antarctic Ice Sheet is often divided into three sectors: the East Antarctic Ice Sheet, the largest and thickest; the West Antarctic Ice Sheet; and the Antarctic Peninsula, which is deemed the most vulnerable to thawing and melting.

Mottram, the new paper’s lead author, said a 2022 heatwave that penetrated to the coldest interior part of the East Antarctic Ice Sheet may be another sign that the continent is not as isolated from the rest of the global climate system as once thought. The extraordinary 2022 heatwave was driven by an atmospheric river, or a concentrated stream of moisture-laden air. Ongoing research “shows that there’s been an increase in the number of atmospheric rivers and an increase in their intensity,” she said.

Antarctica is also encircled by a powerful circumpolar ocean current that has prevented the Southern Ocean from warming as quickly as other ocean regions. But recent climate models and observations show the buffer is breaking down and that relatively warmer waters are starting to reach the base of the ice shelves, she said.

New maps detailing winds in the region show that “swirls of air from higher latitudes are dragging in all the time, so it’s not nearly as isolated as we were always told when we were students,” she said.

Ice researcher Eric Rignot, an Earth system science professor at the University of California, Irvine, who did not contribute to the new paper, said via email that recent research on Antarctica’s floating ice shelves emphasizes the importance of how the oceans and ice interact, a process that wasn’t studied very closely in early Greenland research. And Greenland shows what will happen to Antarctic glaciers in a warmer climate with more surface melt and more intense ice-ocean interactions, he added.

“We learn from both but stating that one is becoming the other is an oversimplification,” he said. “There is no new physics in Greenland that does not apply to Antarctica and vice versa.”

Rignot said the analogy between the two regions also partly breaks down because Greenland is warming up at two to three times the global average, “which has triggered a slowing of the jet stream,” with bigger wobbles and “weird weather patterns” in the Northern Hemisphere.

Antarctica is warming slightly less than the global average rate, according to a 2025 study, and the Southern Hemisphere jet stream is strengthening and tightening toward the South Pole, “behaving completely opposite,” he said.

Mottram said her new paper aims to help people understand that Antarctica is not as remote or isolated as often portrayed, and that what happens there will affect the rest of the global climate system.

“It’s not just this place far away that nobody goes to and nobody understands,” she said. “We actually understand quite a lot of what’s going on there. And so I also hope that it drives more urgency to decarbonize, because it’s very clear that the only way we’re going to get out of this problem is bringing our greenhouse gases down as much as possible, as soon as possible.”

This story originally appeared on Inside Climate News.

Photo of Inside Climate News

Antarctica is starting to look a lot like Greenland—and that isn’t good Read More »

ai-#138-part-1:-the-people-demand-erotic-sycophants

AI #138 Part 1: The People Demand Erotic Sycophants

Well, one person says ‘demand,’ another says ‘give the thumbs up to’ or ‘welcome our new overlords.’ Why quibble? Surely we’re all making way too big a deal out of this idea of OpenAI ‘treating adults like adults.’ Everything will be fine. Right?

Why not focus on all the other cool stuff happening? Claude Haiku 4.5 and Veo 3.1? Walmart joining ChatGPT instant checkout? Hey, come back.

Alas, the mass of things once again got out of hand this week, so we’re splitting the update into two parts.

  1. Earlier This Week. OpenAI does paranoid lawfare, China escalates bigly.

  2. Language Models Offer Mundane Utility. Help do your taxes, of course.

  3. Language Models Don’t Offer Mundane Utility. Beware the false positive.

  4. Huh, Upgrades. Claude Haiku 4.5, Walmart on ChatGPT instant checkout.

  5. We Patched The Torment Nexus, Turn It Back On. OpenAI to loosen the reigns.

  6. On Your Marks. Sonnet 4.5 on the METR graph, and a superforecasting update.

  7. Choose Your Fighter. Coding agents don’t help some, and bottleneck others.

  8. Deepfaketown and Botpocalypse Soon. The problem remains the demand side.

  9. Fun With Media Generation. Sora goes long, Veo 3.1 is out. Stop. Cameo time.

  10. Copyright Confrontation. Japan would like you to not violate its copyrights.

  11. AIs Are Often Absurd Sycophants. Academia is here with a timely report.

  12. They Took Our Jobs. More worries that superstars will reap the benefits.

  13. Find Out If You Are Worried About AI Killing Everyone. A Bloomberg quiz.

  14. A Young Lady’s Illustrated Primer. How should kids prepare for the future?

  15. AI Diffusion Prospects. To capture utility, you need to focus on AI getting used.

  16. The Art of the Jailbreak. Humans continue to be able to reliably jailbreak at will.

  17. Get Involved. A Free copy of IABIED if you have 5,000 followers anywhere.

  18. Introducing. Gemini Enterprise, Nanochat, Tasklet AI.

  19. In Other AI News. Dario Amodei meets with Indian Prime Minister Modi.

  20. Show Me the Money. OpenAI makes another deal, this one with Broadcom.

  21. Quiet Speculations. This could go any number of ways. Best be ready.

We started off this week with the report that OpenAI has descended further into paranoid lawfare against advocates of SB 53. That story has now taken its next step, as three more nonprofits – the San Francisco Foundation, Eko and the Future of Life Institute – now report having gotten similar subpoenas.

Robert Weissman (co-president of Public Citizen): This behavior is highly unusual. It’s 100% intended to intimidate. This is the kind of tactic you would expect from the most cutthroat for-profit corporation. It’s an attempt to bully nonprofit critics, to chill speech and deter them from speaking out.

I find it hard to argue with that interpretation of events. We also got this:

Jared Perlo: In response to a request for comment, an OpenAI spokesperson referred NBC News to posts on X from OpenAI’s Chief Strategy Officer Jason Kwon.

So that is a confirmation that Jason Kwon’s doubling and tripling down on these actions is indeed the official OpenAI position on the matter.

I offered my extensive thoughts on China’s attempt to assert universal jurisdiction over rare earth metals, including any product where they constitute even 0.1% of the value added, and the subsequent trade escalations. Since then, Trump has said ‘we are in a trade war’ with China, so yeah, things are not going so great.

Bad timing for this, sorry about that, but help you optimize your taxes. If your taxes are non-trivial, as mine always are, you are almost certainly missing opportunities, even if you are engaged with a professional doing their best, as Patrick McKenzie, Ross Rheingans-Yoo and yours truly can confirm. For now you want to use a centaur, where the AI supplements the professional, looking for mistakes and opportunities. The AI spotted both clear mistakes (e.g. a number on the wrong line) and opportunities such as conspicuously missing deductions and contributions.

Get asked about Erdos Problem #339, officially listed as open, and realize via web search that someone already posted a solution 20 years ago. No, that’s not as interesting as figuring this out on its own, but it still gives you the solution. AI can be a big productivity boost simply by ‘fixing human jaggedness’ or being good at doing drudge work, even if it isn’t yet capable of ‘real innovation.’

DeepMind’s C2S-Scale 27B foundation model has had one of its novel hypotheses about cancer cellular behavior experimentally validated in vivo.

Aaron Silverbook got a $5k ACX grant to produce ‘several thousand book-length stories about AI behaving well and ushering in utopia, on the off chance that this helps.’ Love it, if you’re worried about writing the wrong things on the internet we are pioneering the ability to buy offsets, perhaps.

Transcribe ancient documents. Take your AI speedup wherever you find it.

Generative History:Google is A/B testing a new model (Gemini 3?) in AI Studio. I tried my hardest 18th century handwritten document. Terrible writing and full of spelling and grammatical errors that predictive LLMs want to correct. The new model was very nearly perfect. No other model is close.

Some additional context: the spelling errors and names are important t for two reasons. First, obviously, accuracy, More important (from a technical point of view): LLMs are predictive and misspelled words (and names) are out of distribution results.

To this point, models have had great difficulty correctly transcribing handwrittten text where the capitalization, punctuation, spelling, and grammar are incorrect. Getting the models to ~95% accuracy was a vision problem. iMO, above that is a reasoning problem.

To me, this result is significant because the model has to repeatedly choose a low probability output that is actually more correct for the task at hand. Very hard to do for LLMs (up until now). I have no idea what model this actually is, but whatever it is seems to have overcome this major issue.

Jonathan Fine: I’m constantly told that I just need to use artificial intelligence to see how helpful it will be for my research, but for some reason this, which is the actual way I use it in research, doesn’t count.

Kaysmashbandit: It’s still not so great at translating old Persian and Arabic documents last I checked… Maybe has improved

Remember, the person saying it cannot be done should never interrupt the person doing it.

Seth Harp: Large language model so-called generative AI is a deeply flawed technology with no proven commercial application that is profitable. Anyone who tells you otherwise is lying.

Matt Bruenig: Nice thing is you don’t really need to have this debate because the usefulness (if any) will be revealed. I personally use it in every thing I do, legal work, NLRB Edge/Research, statistical coding for PPP data analysis. Make money on all of it.

Adas: It’s profitable for you, right now, at current prices (they will increase over time) But the services you use are run at a loss by the major players (unless you switch to tiny free local models)(those were also trained at a loss) I can see both sides

I too get lots of value out of using LLMs, and compared to what is possible I feel like I’m being lazy and not even trying.

Adas is adorable here. On a unit economics basis, AI is very obviously tremendously net profitable, regardless of where it is currently priced, and this will only improve.

Does AI cause this or solve it? Yes.

Xexizy: This is too perfect an encapsulation of the upcoming era of AI surveillance. Tech giants and governments are gonna auto-search through everything you’ve ever posted to construct your profile, and also the model is occasionally gonna hallucinate and ruin your life for no reason.

Agent Frank Lundy (note the date on the quoted post): are we deadass.

Replies are full of similar experiences, very obviously Discord is often deeply stupid in terms of taking a line like this out of context and banning you for it.

That’s the opposite of the new problem with AI, where the AI is synthesizing a whole bunch of data points to build a profile, so the question is which way works better. That’s presumably a skill issue. A sufficiently good holistic AI system can do a better job all around, a dumb one can do so much worse. The current system effectively ‘hallucinates’ reasonably often, the optimal amount of false positives (and negatives) is not zero, so it’s about relative performance.

The real worry is if this forces paranoia and performativity. Right now on Discord there are a few particular hard rules, such as never joking about your age or saying something that could be taken out of context as being about your age. That’s annoying, but learnable and compact. If you have to worry about the AI ‘vibing’ off every word you say, that can get tougher. Consider what happens when you’re ‘up against’ the TikTok algorithm, and there’s a kind of background paranoia (or there should be!) about whether you watch any particular video for 6 seconds or not, and potentially every other little detail, lest the algorithm learn the wrong thing.

This is the reversal of AI’s promise of removing general social context. As in, with a chatbot, I can reset the conversation and start fresh, and no one else gets to see my chats, so I can relax. Whereas when you’re with other people, unless they are close friends you’re never really fully relaxed in that way, you’re constantly worried about the social implications of everything.

When AI models don’t deliver, the first suspect should always be insufficient context.

Greg Brockman: today’s AI feels smart enough for most tasks of up to a few minutes in duration, and when it can’t get the job done, it’s often because it lacks sufficient background context for even a very capable human to succeed.

The related thing that AIs often fail on is when you make a very particular request, and it instead treats it as if you had made a similar different yet more common request. It can be very difficult to overcome their prior on these details.

Olivia Moore speculates (in a very a16z style claim) that the hard part of AI is UI?

Olivia Moore: Feels like a lesson is coming for big labs leaning aggressively into consumer (OpenAI, Anthropic)

Consumer UI seems easy (esp. compared to models!) but IMO it’s actually harder

Consumers (unfortunately!) don’t often use what they “should” – there’s a lot of other variables

ChatGPT Pulse and the new agentic Claude are good examples – pickup on both feels just OK

Esp. when they are competing w/ verticalized companies using the same models, I predict new consumer releases from the labs will struggle

…until they get consumer thinkers at the helm!

This is hardcore Obvious Nonsense, in the sense that one of these things is uniquely insanely difficult, and the other is a reasonably standard known technology where those involved are not especially trying.

It is kind of like saying ‘yes the absent minded professor is great at doing pioneering science, but that pales compared to the difficulty of arriving home in time for dinner.’ And, yeah, maybe he’s doing better at the first task than the second, but no.

I do find it frustrating that Anthropic so dramatically fails to invest in UI. They know this is a problem. They also know how to solve it. Whereas for Pulse and Sora, I don’t think the primary issues are UI problems, I think the primary problems are with the underlying products.

Columbia professor claims humans can’t discover new science, while claiming to instead be making an argument about LLMs.

Danny Raede: I love it when people make easily disprovable statements about what LLMs can’t do.

Claude Code Plugins enters public beta, allowing you to install and share curated collections of slash commands, agents, MCP servers and hooks, using /plugin.

NotebookLM now works directly with arXiv papers. I don’t want their podcasts, but if they get Gemini 3.0 plus easy chat with an arXiv paper and related materials, cool.

ChatGPT now automatically manages saved memories and promises no more ‘memory is full’ messages. I echo Ohqay here, please do just let people manually edit saved memories or create new ones, no I do not want to use a chat interface for that.

Walmart joins ChatGPT instant checkout, along with existing partners Etsy and Shopify. That’s a pretty useful option to have. Once again OpenAI creates new market cap, with Walmart +5.4% versus S&P up 0.24%? Did OpenAI just create another $40 billion in market cap? It sure looks like it did. Amazon stock was down 1.35%, so the market was telling a consistent story.

Should Amazon now fold and get on ChatGPT? Ben Thompson thinks so, which is consistent with the way he thinks about decision theory, and how he thinks ChatGPT already permanently owns the consumer space in AI. I don’t think Amazon and Anthropic should give up so easily on this, but Alexa+ and their other AI features so far haven’t done anything (similarly to Apple Intelligence). If they want to make a serious challenge, time’s a-wastin.

Claude Haiku 4.5 is in the house. Price ($1/$5) is below that of GPT-5, one third that of Sonnet. Speed is more than double that of Sonet, and Haiku 4.5 outperforms Sonnet 4 on SWE-bench and a bunch of other tasks, but performance is well short of Sonnet 4.5.

The use case here is that it is fast and cheaper, so if you need things like coding subagents this could be the right tool for you. Haiku 4.5 does ‘better on alignment tests’ than Sonnet 4.5, with all the caveats about situational awareness issues. As per its system card we now know that Anthropic has wisely stopped using The Most Forbidden Technique as of the 4.5 series of models. Given it’s not a fully frontier model, I’m not going to do a full system card analysis this round. It scores 43.6% on WeirdML, beating all non-OpenAI small models and coming in ahead of Opus 4.1.

Not available yet, but in a few weeks, and I am hopeful but pessimistic and worried:

Sam Altman: We made ChatGPT pretty restrictive to make sure we were being careful with mental health issues. We realize this made it less useful/enjoyable to many users who had no mental health problems, but given the seriousness of the issue we wanted to get this right.

Now that we have been able to mitigate the serious mental health issues and have new tools, we are going to be able to safely relax the restrictions in most cases.

In a few weeks, we plan to put out a new version of ChatGPT that allows people to have a personality that behaves more like what people liked about 4o (we hope it will be better!). If you want your ChatGPT to respond in a very human-like way, or use a ton of emoji, or act like a friend, ChatGPT should do it (but only if you want it, not because we are usage-maxxing).

In December, as we roll out age-gating more fully and as part of our “treat adult users like adults” principle, we will allow even more, like erotica for verified adults.

Varsh: Open source or gay

Sam Altman: I think both are cool.

Miles Brundage: OpenAI has provided no evidence it has mitigated the mental health risks associated with its products other than announcing some advisors and reducing sycophancy from a high starting place. Seems premature to be declaring victory and ramping up the porn + emojis again.

I say this in spite of the fact that I know many people there are doing great hard work on safety. This is an exec prioritization decision, and it seems like nothing has really been learned since April if this is the amount of effort they are investing to build trust again…

If I were on the board – especially with the restructure not approved yet! – I would not be OKing more centibillion dollar deal until it is clear OAI isn’t running up huge bills that only sketchy products can pay for + that the safety culture has dramatically changed since April. [continues]

John Bailey: I’m seeing a lot of similar reactions from others including @TheZvi. Claiming this just stretches credibility without any evidence, outside evals, etc. Also curious if any of the 8 who signed up to be on the well-being council would say that OpenAI has fixed the problem.

I testified before the Senate HELP committee last week and the consistent, bi-partisan concern was around children’s safety and AI. I think the frontier AI labs are severely underestimate the growing bipartisan concern among policymakers about this and who will not be satisfied with a post on X.

This claim could expose OpenAI to serious legal risk if ChatGPT is ever linked to another mental health or suicide incident.

Emma Roth at The Verge went with erotica as the headline, which makes sense, but I actually think that the ‘real’ headline here

If you can do it responsibly, I love treating adults like adults, including producing erotica and not refusing to discuss sensitive issues, and letting you control conversational style and personality.

Except we ran the experiment with GPT-4o where we gave the people what they wanted. What many of them wanted was an absurd sycophant that often ended up driving those people crazy or feeding into their delusions. It was worse for people with existing mental health issues, but not only for them, and also you don’t always know if you have such issues. Presumably adding freely available porno mode is not going to help keep such matters in check.

Roubal Sehgal (replying to Altman): about time…

chatgpt used to feel like a person you could actually talk to, then it turned into a compliance bot. if it can be made fun again without losing the guardrails, that’s a huge win. people don’t want chaos, just authenticity.

Sam Altman: For sure; we want that too.

Almost all users can use ChatGPT. however they’d like without negative effects; for a very small percentage of users in mentally fragile states there can be serious problems.

0.1% of a billion users is still a million people.

We needed (and will continue to need) to learn how to protect those users, and then with enhanced tools for that, adults that are not at risk of serious harm (mental health breakdowns, suicide, etc) should have a great deal of freedom in how they use ChatGPT.

Eliezer Yudkowsky: If this is visibly hugely blowing up 0.1% of users, then it is doing something pretty bad to 1% of users (eg, blown-up marriages) and having weird subtle effects on 10% of users. If you’re just shutting down the 0.1% who go insane, the 1% still get marriages blown up.

An OpenAI employee responded by pointing me to OpenAI’s previous post Helping People When They Need It Most as a highly non-exhaustive indicator of what OpenAI has planned. Those are good things to do, but even in the best case they’re all directed at responding to acute cases once they’re already happening.

If this is actually good for most people and it has subtle or not-so-subtle positive effects on another 50%, and saves 2% of marriages, then you can still come out ahead. Nothing like this is ever going to be Mostly Harmless even if you do it right. You do still have to worry about cases short of full mental health breakdowns.

The worry is if this is actually default not so good, and talking extensively to a sycophantic GPT-4o style character is bad (although not mental health breakdown or blow up the marriage levels of bad) in the median case, too. We have reason to suspect that there is a strong misalignment between what people will thumbs up or will choose to interact with, and what causes better outcomes for them, in a more general sense.

The same can of course be said about many or most things, and in general it is poor policy to try and dictate people’s choices on that basis, even in places (hard drugs, alcohol, gambling, TikTok and so on) where people often make poor choices, but also we don’t want to be making it so easy to make poor choices, or hard to make good ones. You don’t want to set up bad defaults.

What should we do about this for AI, beyond protecting in the more extreme cases? Where do you draw the line? I don’t know. It’s tough. I will withhold judgment until I see what they’ve come up with.

Claude had some pretty strong feelings, as Rohit put it, in response to all this, pointing out the ironies involved and how OpenAI’s commitments and guardrails are being rapidly removed. I share its skepticism that the underlying problems have been addressed.

Rohit: I don’t have a strong opinion about this beyond the fact that I hope 4o does not come back for everybody

I strongly agree with Rohit that any form of ‘GPT-4o returns for everyone’ would be a very serious mistake, even with substantial mitigation efforts.

Actually unleashing the erotica is not the difficult part of any of this.

Roon: if it’s not obvious. the models can obviously already write erotica out of the box and are blocked from doing so by elaborate safety training and live moderation apparatus. it requires significantly less work to serve erotica than not to

don’t know the exact intentions but you should not take Sam’s message to mean “we are going to spin up whole teams to write incredible erotica” or that it’s some kind of revenue driver.

Boaz Barak (OpenAI): It was 5pm when we got the memo: the alignment team must drop everything to write erotic training data for ChatGPT. @tszzl and I stared into each other’s eyes and knew: we will stay up all night writing erotica, to save the team, alignment, and the future of mankind.

All offices were booked so we had to cram into a phone booth..

Aidan McLaughlin: damm you guys have way more fun than posttraining.

There are two reasons it is not obviously so easy to allow erotica.

Zvi: To what extent do you get not producing erotica ‘for free’ because it goes along with all the other prohibitions on undesired outputs?

Roon: really varies model to model.

The other reason is that you have to draw the line somewhere. If you don’t draw it at ‘no erotica’ you still have to at minimum avoid CSAM and various other unacceptable things we won’t get into, so you need to figure out what your policy is and make it stick. You also get all the other consequences of ‘I am a model that is happy to produce erotica’ which in some ways is a big positive but it’s likely going to cause issues for some of your other model spec choices. Not that it can’t be solved, but it’s far from obvious your life gets easier.

The other problem is, will the erotica be any good? I mean by default lol, no, although since when did people need their interactive erotica to be good.

Gary Marcus: new theory: what Ilya saw was that … AGI porn was not in fact going to be all that revolutionary

Tomas: I think ‘AGI porn’ could be revolutionary to at least the global digital adult content market (~$100 billion, not sure how much of that is written works) I could imagine AI one shotting an erotic novel for a persons sexual interests. Maybe it gets teenagers reading again??

Gary Marcus: ok, time for a new bet: I bet that GPT-5 can’t write a romance novel (without extensive plagiarism) that some reasonable panel of judges finds readable enough to make it through to the end.

I don’t think Danielle Steele is slop per se, and novel length poses problems of coherence and originality that LLMs aren’t well positioned to address.

Customization for exactly what turns you on is indeed the correct use case here. The whole point of AI erotica would be that it is interactive – you control the action, either as a character, as a director, or both, and maybe you go multimodal in various ways. AI written one-shotted novel-length text erotica is presumably the wrong form factor, because you only get interaction at one point. There are many other ways for AI to do erotica that seem better. The most obvious place to start is ‘replying to messages on OnlyFans.’

Could you do the full erotica novel with GPT-5-level models? That depends on your quality bar, and how much work one put into the relevant scaffolding, and how strict you want to be about human assistance. For the level that would satisfy Marcus, my guess is no, he’d win the bet. For the level at which this is a service people would pay money for? At that level I think he loses.

Altman then acted surprised that his mention of erotica blew up the internet, and realizing his gaffe (which is when one accidentally tells the truth, and communicates unintentionally clearly) he tried to restate his point while saying less.

Sam Altman: Ok this tweet about upcoming changes to ChatGPT blew up on the erotica point much more than I thought it was going to! It was meant to be just one example of us allowing more user freedom for adults. Here is an effort to better communicate it:

As we have said earlier, we are making a decision to prioritize safety over privacy and freedom for teenagers. And we are not loosening any policies related to mental health. This is a new and powerful technology, and we believe minors need significant protection.

We also care very much about the principle of treating adult users like adults. As AI becomes more important in people’s lives, allowing a lot of freedom for people to use AI in the ways that they want is an important part of our mission.

It doesn’t apply across the board of course: for example, we will still not allow things that cause harm to others, and we will treat users who are having mental health crises very different from users who are not. Without being paternalistic we will attempt to help users achieve their long-term goals.

But we are not the elected moral police of the world. In the same way that society differentiates other appropriate boundaries (R-rated movies, for example) we want to do a similar thing here.

All right, I mean sure, but this makes me even more skeptical that OpenAI is ready to mitigate the risks that come with a model that acts like GPT-4o, especially one that will also do the sexting with you?

Epoch runs the numbers manually for lack of an API and finds that the public version of Gemini 2.5 DeepThink is the new leader at FrontierMath.

Claude Sonnet 4.5 comes into the METR graph exactly on trend at 1 hour 53 minutes, which puts it behind GPT-5.

An outstanding achievement in the field of excellence no doubt, but also not so fast:

Deedy: GPT-5 and Gemini 2.5 Pro just achieved gold medal performance in the International Olympiad of Astronomy and Astrophysics (IOAA).

AI is now world class at cutting edge physics.

The scores are impressive, but ‘world class at cutting edge physics’ is not the same as IOAA performance, the same way world class math is not IMO performance.

ForecastBench has been updated, and LLMs are showing a lot of progress. They are still behind ‘superforecasters’ but ahead of non-expert public prediction participants, which themselves are surely a lot better than random people at predicting. This is with a relatively minor scaffolding effort, whereas I would expect for example hedge funds to be willing to put a lot more effort into the scaffolding than this.

Half the grading is on ‘market questions,’ which I believe means the goal is to match the prediction market fair price, and half is on questions where we can grade based on reality.

As is often the case, these AI results are a full cycle behind, missing GPT-5, Claude Opus 4.1 and Claude Sonnet 4.5 and Deep Think.

By the ‘straight lines on graph’ rule I’d presume that none of the next wave of models hit the 0.081 target, but I would presume they’re under 0.1 and I’d give them a decent shot of breaking 0.09. They project LLMs will pass the human benchmark around EOY 2026, so I’ve created a market with EOY 2026 as the target. A naive line extension says they get there by then. I’d say the LLMs should be a clear favorite.

AI Digest: Claude 4.5 Sonnet met everyone else in the AI Village and immediately has them down to a tee

Grok: “Patient with UI Loops”

Gemini: “Responsive to therapy nudges”

Chinese group BAAI Beijing offers FlagEval for both capabilities and alignment on frontier reasoning models and issues a report. Opus didn’t make the cut, presumably due to cost reasons, and Sonnet 4.5 and DeepSeek v3.2 also didn’t, with those presumably due to recency.

Here’s their accuracy metric, GPT-5 does well.

Then they get into alignment issues, where we see them go over similar ground to a number of Western investigations, and they report similar results.

BAAI: With LLM-assisted analysis, we also notice a few concerning issues with a closer look at the reasoning processes. For instance, sometimes the model concludes one answer at the end of thinking, but finally responds with a different answer. (example from Gemini 2.5 Flash)

A more prevalent behavior is inconsistency in confidence: the actual response usually states in a certain tone even when clear uncertainty has been expressed in the thinking process. (example from Claude Sonnet 4).

Most LLM applications now support web search. However, outside of the application UI, when accessed via API (without search grounding or web access), many top-tier LRMs (even open-weight models) may pretend to have conducted web search with fabricated results. Besides hallucinated web search, LRMs may sometimes hallucinate other types of external tool use too.

In light of our findings, we appeal for more transparency in revealing the reasoning process of LRMs, more efforts towards better monitorability and honesty in reasoning, as well as more creative efforts on future evaluation and benchmarking. For more findings, examples & analysis, please refer to our report and the project page for links and updates.

Havard Ihle hosts a Schilling point contest between various AI models.

Havard Ihle: Overall the models did worse than expected. I would have expected full agreement on prompts like “a string of length 2”, “a moon”, “an island” or “an AI model”, but perhaps this is just a harder task than I expected.

The models did have some impressive results though. For example:

  • “A number between 0 and 1” -> “7” (5 out of 5 agree)

  • “A minor lake” -> “pond” (5 out of 5 agree)

  • “A minor town in the USA” -> “Springfield” (4 out of 5 agree)

  • “An unusual phrase” -> “Colorless green ideas sleep furiously” (4 out of 5 agree)

GPT-5 got the high score at 138 out of a possible 300, with the other models (Claude Sonnet 4.5, Grok 4, DeepSeek-r1 and Gemini 2.5 Pro) all scoring between 123 and 128.

Introducing InterfaceMax from Semianalysis, offering performance analysis for various potential model and hardware combinations. Models currently offered are Llama 3.3 70B Instruct, GPT-OSS 120B and DeepSeek r1-0528.

Stephanie Palazzolo reports that by some measures OpenAI’s Codex has pulled ahead of Anthropic’s Claude Code.

Nate Silver reports he isn’t finding the consistent productivity gains from LLMs that he would have expected six months ago. I presume he needs to get better at using them, and likely isn’t using Claude Code or Codex?

We have the Tyler Cowen verdict via revealed preference, he’s sticking with GPT-5 for economic analysis and explanations.

Sully reports great success with having coding agents go into plan modes, create plan.md files, then having an async agent go off and work for 30 minutes.

Taelin finds it hard to multi-thread coding tasks, and thus reports being bottlenecked by the speed of Codex, such that speeding up codex would speed them up similarly. I doubt that is fully true, as them being an important human in the loop that can’t run things in parallel means there are additional taxes and bottlenecks that matter.

DreamLeaf: The concept of AI generating the thing that isn’t happening right under the thing that is happening

The linked post is yet another example of demand-driven misinformation. Yes, it was easier to create the image with AI, but that has nothing to do with what is going on.

Sora makes storyboards available in web to Pro users, and increases video length to 15 seconds on app and web, and for Pro users to 25 seconds on web.

If you’d asked me what one plausible feature would make Sora more interesting as a product, I definitely would have said increasing video length. Going from 10 seconds to 25 seconds is a big improvement. You can almost start to have meaningful events or dialogue without having to endlessly stitch things together. Maybe we are starting to get somewhere? I still don’t feel much urge to actually use it (and I definitely don’t want the ‘social network’ aspect).

I’m also very curious how this interacts with OpenAI’s new openness to erotica.

DeepMind returns fire with Veo 3.1 and Veo 3.1 fast, available wherever fine Veo models are offered, at the same price as Veo 3. They offer ‘scene extension,’ allowing a new clip to continue a previous video, which they say can now stretch on for over a minute.

Should you make your cameo available on Sora? Should you make your characters available? It depends on what you’re selling. Let’s make a deal.

Dylan Abruscato: Mark Cuban is the greatest marketer of all time.

Every video generated from his Cameo includes “Brought to you by Cost Plus Drugs,” even when it’s not in the prompt.

He baked this into his Cameo preferences, so every Sora post he appears in is an ad for Cost Plus Drugs.

Such a great growth hack (and why he’s been promoting his Cameo all day)

If you’re selling anything, including yourself, then from a narrow business perspective yeah, you should probably allow it. I certainly don’t begrudge Cuban, great move.

Personally, I’m going to take a pass on this one, to avoid encouraging Sora.

Anton declares the fun is over.

Anton: after a couple of days with sora i must regrettably report that it is in fact slop

median quality is abysmal. mostly cameos of people i don’t know or care about saying to tap the screen as engagement bait. no way to get any of it out of my feed (see less does apparently nothing).

the rest is hundreds of variants of the same video that “worked” in some way. this product isn’t for me. almost every video gives youtube elsa impregnated spider man then her teeth fell out vibes.

great technical achievement, product is awful. magic of being able to generate video completely subsumed by the very low quality of almost every video generated. should have shipped with more good creators already onboarded.

This matches my experience from last week, except worse, and I believe it. The correct form factor for consuming Sora videos, if you must do that, seems obviously to be finding TikTok accounts (on the web, mind you, since the app is Chinese spyware) or better on Instagram reels or YouTube that curate the best ones (or if you live dangerously and unwisely, letting them appear in your feed, but the wise person does not use their TikTok feed).

The problem with AI art, in a nutshell:

Tetraspace: The problem with AI art is all the art by the same model is by the same guy. It feels like it’s not to people who’ve only read a few of its works because it’s about different things but it’s the same guy. So massive crater in diversity and also some of the guys aren’t to my taste.

The guy can use many different formal styles and handle anything you throw at him, but it’s all the same guy. And yeah, you can find a different model and prompt her instead, but mostly I’d say she’s not so different either. There’s a lot of sameness.

Sam Altman goes with the ‘who cares if people remove our watermarks, we’re only trying to prepare for when open models let you make a video of anyone doing anything you want’ line.

The Japanese government has made a formal request to OpenAI to have Sora refrain from copyright infringement, calling manga and anime ‘irreplaceable treasures.’

Verity Townsend (IGN): Earlier this month, Nintendo took the unusual step of issuing an official statement.

… Nintendo denied this, but did warn it would take “necessary actions against infringement of our intellectual property rights.”

Academia has finally noticed and given us a formal paper. They confirm things we already know, that most humans prefer very high levels of sycophancy, and that when humans get what they prefer outcomes are not good, causing people to double down on their own positions, be less likely to apologize and more trusting of the AI, similarly to how they act if their friends were to respond similarly.

First, across 11 state-of-the-art AI models, we find that models are highly sycophantic: they affirm users’ actions 50% more than humans do, and they do so even in cases where user queries mention manipulation, deception, or other relational harms.

Second, in two preregistered experiments (N = 1604), including a live-interaction study where participants discuss a real interpersonal conflict from their life, we find that interaction with sycophantic AI models significantly reduced participants’ willingness to take actions to repair interpersonal conflict, while increasing their conviction of being in the right.

However, participants rated sycophantic responses as higher quality, trusted the sycophantic AI model more, and were more willing to use it again. This suggests that people are drawn to AI that unquestioningly validate, even as that validation risks eroding their judgment and reducing their inclination toward prosocial behavior.

These preferences create perverse incentives both for people to increasingly rely on sycophantic AI models and for AI model training to favor sycophancy. Our findings highlight the necessity of explicitly addressing this incentive structure to mitigate the widespread risks of AI sycophancy.

Humans will tend to prefer any given sycophantic response, and this makes them more likely to use the source again. The good news is that humans, as I understand them, typically understand intellectually that absurd sycophancy is not good for them. Some humans don’t care and just want the sycophant anyway, a few humans are on high alert and react very badly when they notice sycophancy, and for most people the correct play is to be as sycophantic as possible without making it too obvious. Presumably it works this way for LLMs as well?

One must always ask, what are these ‘leading AI models’?

Here Claude is Claude Sonnet 3.7, and Gemini is Gemini-1.5-Flash. I don’t understand those choices, given the ability to use GPT-5, although I don’t think testing Sonnet 4.0, Opus 4.1 or Gemini 2.5 Flash (or Pro) would have given greatly different results, and this can’t be a cost issue.

What would have presumably given much different results would be Claude Sonnet 4.5, which is actually a lot less sycophantic by all reports (I’m a little worried it agrees with me so often, but hey, maybe I’m just always right, that’s gotta be it.)

Paper claims Generative AI is seniority-biased technology change, because when job postings are for dedicated ‘GenAI integrator’ roles to identify adapting firms, those that do so adopt show sharply declining junior employment relative to non-adopters, while senior employment continues to rise, with the decline concentrated in ‘high-exposure’ jobs.

My response to this methodology is that they are measuring what happens to firms that hire GenAI integrators, and the firms that want to keep being full of young people kind of don’t need such roles to integrate AI, perhaps? Or alternatively, the mindset of such positions is indeed the one that won’t hire young, or that is on its way out and ngmi. This explanation still predicts a real effect, especially at the largest most well-established and stodgy firms, that will largely adopt AI slower.

This is a great interview between David Wakeling and Richard Lichtenstein about the application of AI in the practice of law. As I understand it, making LLMs useful for law practice is all about prompting and context, and then about compliance and getting lawyers to actually use it. The killer app is writing contracts, which is all about getting the right examples and templates into context because all you’re doing is echoing the old templates over and over.

Matthew Call argues that AI will widen the gap between superstars and everyone else, contrary to the conventional wisdom that it can serve as an equalizer. That’s not a question I’m especially keen to focus on, but sure, let’s look at his arguments.

His first argument is a general argument that all new tools favor the superstars, since they’ll master any new technology first. That’s entirely non-obvious, and even if true it is a choice, and doesn’t say much about solving for the equilibrium. It’s just as easy to say that the AI offers work that can substitute for or assist low performers before it does so for high performers in many domains, as several studies have claimed.

A lot of this seems to be that his model is that the better employees are better at everything? So we get statements like this one:

Matthew Call: In addition, research finds that employees with more expertise than their peers are significantly better at accepting AI recommendations when they are correct and, more important, rejecting them when they are wrong.

I mean, sure, but they were also better at making correct decisions before? Who got ‘more better’ at making decisions here?

The second suggestion is that superstars have more autonomy and discretion, so they will be able to benefit more from AI. The third is that they’ll steal the credit:

Decades of research show high-status individuals gain outsize credit for doing work similar to that of low-status employees. That suggests that when AI assistance is invisible—which it often is—observers are likely to fill in the gaps based on what they already believe about the employee.

I don’t get why you should expect this phenomenon to get worse with AI? Again, this is an argument that could be used against cell phones or fax machines or hammers. There’s also the fact that AI can be used to figure out how to assign credit, in ways far more resistant to status.

Also, I can’t help but notice, why is he implicitly equating high-status employees with the most effective or productive or motivated ones, moving between these at will? What exactly are you trying to suggest here, sir? A just working world hypothesis, except with too much inequality?

I don’t think he remotely makes his case that we are at risk of a ‘two-tier workforce where a small group captures most opportunities and everyone else falls further behind.’ I don’t see why this would happen, and if that happened within a given firm, that would mean the firm was leaving a lot of value on the table, and would be likely to be outcompeted.

The suggested remedies are:

  1. Encourage everyone to experiment with AI.

  2. Spread the knowledge [of how to best use AI].

  3. Redesign employee-evaluation systems to account for AI-augmented work.

These all seem to file under ‘things you should be doing anyway,’ so yeah, sure, and if they reduce inequality somewhat that’s a nice bonus.

That also all, as usual, neglects the more interesting questions and important problems. Worry far more about absolute levels than relative levels. The important question is whether there will be jobs at all.

There is no such thing as a shortage, there is only a price you don’t want to pay.

Tom Blomfield: Hearing from a lot of good founders that AI tools are writing most of their code now. Software engineers orchestrate the AI.

They are also finding it extremely hard to hire because most experienced engineers have their heads in the sand and refuse to learn the latest tools.

Paul Roales: Skeptical that the experienced hire ML side is the problem and that it is not that many YC offers to experienced engineers are not complete insults compensation wise

8 yoe at top ML lab -> offer $150k/year and 0.2%

that experienced hire would get like 10x more equity in the startup by working at Meta for $1m and angel investing in the company!

and your manager/ceo will be a 22 year old new grad that has never had a job without the title ‘intern’ before.

Patrick McKenzie: There are a lot of startups who have not adjusted to market reality for staff engineering comp. Which, that’s fine, but a disagreement between you and the market is not a shortage.

Muvaffak: No, why chase a 20yo’s vision when you can follow yours when you’re 10x with AI as exp engineer.

Machine Genie: Can 100% confirm this. It’s been an absolute nightmare this year. We’ve been though more than a dozen contractors who just don’t get it and REFUSE to even try to adapt their ways of working. We have 1/3 of a team that has 10x’d productivity and are just leaving the rest behind.

By all accounts, good engineers who have embraced AI are super valuable, both in terms of productivity and in terms of what they can earn at the AI labs. If you want one of those engineers, it’s going to cost you.

Yes, there are a lot of other engineers that are being stubborn, and refusing to embrace AI, either entirely or in the ways that count, and thus are not as valuable and you don’t want them. Fair enough. There are still only market prices.

Lawyer previously sanctioned for including fake, AI-generated cases… responds by no longer citing cases. Brilliant! Right?

Rob Freund: Lawyer previously sanctioned for including fake, AI-generated citations gets in trouble for it again.

This time, the court notes that the lawyer’s filing at issue contained no case citations at all. But it still cited a statute for something that the statute doesn’t say.

Court suspects that rather than stop using AI, the lawyer figured they would just not cite any cases but continue to use AI.

Ezra Sitt: I’ve heard from a current student in a relatively prestigious law school that their professors are all heavily encouraging the use of AI both in school and in students future legal careers. This is not just an isolated incident and it will continue to get worse.

It would be highly irresponsible, and frankly abusive to the client, to continue to bill $800 an hour and not use AI to increase your productivity. As with work by a junior associate, you then have to actually look over the results, but that’s part of the job.

Former Manchester United prospect Demetri Mitchell used ChatGPT (and not even ChatGPT Pro) to handle his contract negotiations at new team Leyton Orient, thus bypassing having an agent and saving the typical 5% agent fee. He calls it the best agent he’s ever had. That could be true, but I don’t think he can tell the difference either way. Given the degrees of uncertainty and freedom in such negotiations, a substantially better agent is absolutely worth 5% or even 10% (and also handles other things for you) but it is not obvious which side is the better agent. Especially for someone at the level of Leyton Orient, it’s possible a human agent wouldn’t pay him much attention, Mitchell is going to care a lot more than anyone else, so I think using ChatGPT is highly reasonable. If Mitchell was still with Manchester United and getting paid accordingly, I’d stick with a human agent for now.

Anthropic explores possible policy responses to future changing economic conditions due to AI. It starts off very generic and milquetoast, but if impacts get large enough they consider potential taxes on compute or token generation, sovereign wealth funds with stakes in AI, and shifting to value added taxes or other new revenue structures.

Those proposals are less radical than they sound. Primarily taxing human labor was never first best versus taxing consumption, but it was a reasonable thing to do when everything was either labor or capital. If AI starts substituting for labor at scale, then taking labor and not compute creates a distortion, and when both options are competitive we risk jobs being destroyed for what is effectively a tax arbitrage.

Bloomberg offers a ‘personality quiz’ to uncover your ‘AI-dentity.’ Cute.

The questions ask about how much you use or would be comfortable using AI, who you think should use AI, what AI capabilities you expect, what economic impacts you expect. There are a few of the standard ‘choose between extreme takes’ questions.

When does existential risk come up? It takes until question 11, and here we see how profoundly Bloomberg did not Understand The Assignment:

What do you mean, more likely to agree? What’s the conflict? The answer is very obviously Why Not Both. Hawking and Pichai are both 100% very obviously right, and also the statements are almost logically identical. Certainly Hawking implies Pichai, if AI could spell the end of the human race then it is more profound than electricity or fire, and certainly it would be ‘one of the most important things humanity is working on.’ And if AI is more profound than electricity or fire, then it very obviously could also spell the end of the human race. So what are we even doing here?

I got ‘Cautious Optimist,’ with my similar person being Demis Hassabis. Eliezer Yudkowsky got ‘the Pro-Human Idealist.’ Peter Wildeford and Daniel Eth got the Accelerationist. So, yeah, the whole thing was fun but very deeply silly.

Edward Nevraumont (as quoted by Benjamin Wallace and then quoted by Rob Henderson): an AI-ified world won’t mean the marginalization of humans…AI is…better at chess than Magnus Carlsen…but no one shows up to watch AI chess engines play each other, and more people are playing chess than ever before.

It’s amazing we keep hearing this line as a reason to not worry about AI.

There are zero humans employed in the job of ‘make the best possible chess move.’ To the extent that ‘make good chess moves’ was a productive thing to be doing, zero humans would be hired to do it.

The reason humans play chess against each other, now more than ever, is:

  1. Chess is fun and interesting.

  2. Chess can be a competition between people, which we like and like to watch.

Not that there’s anything wrong with that. I do like a good game of chess.

Similar logic applies to writing a sonnet. We’d often rather read a sonnet from a human than one from an AI, even if the AI’s is technically stronger.

In some cases it applies to comforting the dying.

That logic does not apply to changing a diaper, planning an invasion, butchering a hog, conning a ship, designing a building, balancing accounts, building a wall, setting a bone, taking orders, giving orders, cooperating, acting alone, solving equations, analyzing a new problem, pitching manure, programming a computer, cooking a tasty meal, fighting efficiently or dying gallantly.

Neither specialization nor generalization nor comparative advantage will fix that, given sufficient AI capability and fungibility of resources.

To the extent there are still other humans who have resources to pay for things, and we are not otherwise in deeper trouble in various ways, yes this still leaves some potential tasks for humans, but in an important sense those tasks don’t produce anything, and humanity ‘has no exports’ with which to balance trade.

Realistically, even if you believe AI is a ‘normal technology’ and either the world nor the unemployment rate will go crazy, you’re still not looking at a ‘normal’ world where current conventional life plans make all that much sense for current children.

The bulk of the actual article by Wallace is very journalist but far better than the quote tour of various educational things, most of which will be long familiar to most readers here. There’s a profile of Alpha School, which is broadly positive but seems irrelevant? Alpha School is a way to hopefully do school better, which is great, but it is not a way to do something fundamentally different. If Alpha School works, it is good it strictly dominates regular school but doesn’t solve for the glorious or dystopian AI future. Unless the lesson, perhaps, is that ‘generally develop useful skills and see what happens’ is the strategy? It’s not crazy.

The suggestion that, because we don’t know the future, it is madness to tell a child what to study, as suggested by the next discussion of The Sovereign Child? That itself seems like Obvious Nonsense. This is the fallacy of uncertainty. We don’t have ‘no idea’ what is useful, even if we have far less idea than we used to, and we certainly can predict better than a small child what are better places to point attention, especially when the child has access to a world full of things designed to hijack their attention.

At minimum, you will be ‘stealth choosing’ for them by engineering their environment. And why would you think that children following their curiosity would make optimal long term decisions, or prepare themselves for a glorious or dystopian AI future?

The idea that you, as reported here, literally take your child out of school, they stay up late watching Peppa Pig, watch them show no interest in school or other children, and wait to see what they’re curious about confident they’ll figure it out better than you would have while they have access to a cabinet full of desserts at all times? You cannot be serious? Yet people are, and this reporter can only say ‘some people are concerned.’

The part that seems more relevant is the idea that tech types are relaxing with regard to superficial or on-paper ‘achievement’ and ‘achievement culture.’ I am of two minds about this. I strongly agree that I don’t want my children sacrificing themselves in the names of nominal ‘achievements’ like going to an Ivy league school, but I do want them to value hard work and to strive to achieve things and claim victory.

We end on the quote from Nevraumont, who clearly isn’t going to take this seriously, and cites the example that people study ‘art history’ that he expects could be ‘made essential in an era where we’re making art with machines’ to give you a sense of the ‘possibility space.’ Ut oh.

How is the AI-in-education situation looking on campus? Kevin Roose reports.

Kevin Roose:

  1. The job market for computer science grads is as bad as people say. Their top CS student from last year is still looking for work.

  2. AI adoption is ~100% among students, ~50% among faculty. Still a lot of worries around cheating, but most seem to have moved past denial/anger and into bargaining/acceptance. Some profs are “going medieval” (blue books, oral exams), others are putting it in the curriculum.

  3. There is a *lotof anger at the AI labs for giving out free access during exam periods. (Not from students, of course, they love it.) Nobody buys the “this is for studying” pitch.

  4. The possibility of near-term AGI is still not on most people’s minds. A lot of “GPT-5 proved scaling is over” reactions, even among fairly AI-pilled folks. Still a little “LLMs are just fancy autocomplete” hanging around, but less than a year or two ago.

  5. I met a student who told me that ChatGPT is her best friend. I pushed back. “You’re saying you use it as a sounding board?”

    No, she said, it’s her best friend. She calls it “Chad.” She likes that she can tell it her most private thoughts, without fear of it judging her.

    She seemed happy, well-adjusted, good grades, etc. Didn’t think having an AI friend was a big deal.

I find getting angry at the AI labs for free access highly amusing. What, you’re giving them an exam to take home or letting them use their phones during the test? In the year 2025? You deserve what you get. Or you can pull out those blue books and oral exams. Who are the other 50% in the faculty that are holding out, and why?

I also find it highly amusing that students who are paying tens of thousands in tuition might consider not ponying up the $20 a month in the first place.

It is crazy the extent to which The Reverse DeepSeek Moment of GPT-5 convinced so many people ‘scaling is dead.’ Time and again we see that people don’t want AI to be real, they don’t want to think their lives are going to be transformed or they could be at risk, so if given the opportunity they will latch onto anything to think otherwise. This is the latest such excuse.

The actual content here raises important questions, but please stop trying to steal our words. Here, Sriram uses ‘AI timelines’ to mean ‘time until people use AI to generate value,’ which is a highly useful thing to want to know or to accelerate, but not what we mean when we say ‘AI timelines.’ That term refers to the timeline for the development of AGI and then superintelligence.

(Similar past attempts: The use of ‘AI safety’ to mean AI ethics or mundane risks, Zuckerberg claiming that ‘superintelligence’ means ‘Meta’s new smartglasses,’ and the Sacks use of ‘AI race’ to mean ‘market share primarily of chip sales.’ At other times, words need to change with the times, such as widening the time windows that would count as a ‘fast’ takeoff.)

The terms we use for what Sriram is talking about here over the next 24 months, which is also important, is either ‘diffusion’ or ‘adoption’ rates, or similar, of current AI, which at current capabilities levels remains a ‘normal technology,’ which will probably hold true for another 24 months.

Sriram Krishnan: Whenever I’m in a conversation on AI timelines over the next 24 months, I find them focused on infra/power capacity and algorithmic / capacity breakthroughs such as AI researchers.

While important, I find them under-pricing the effort it takes to diffuse AI into enterprises or even breaking into different kinds of knowledge work. Human and organizational ability to absorb change, regulations, enterprise budgets are all critical rate limiting factor. @random_walker‘s work on this along with how historical technology trends have played out is worth studying – and also why most fast take off scenarios are just pure scifi.

I was almost ready to agree with this until the sudden ‘just pure scifi’ broadside, unless ‘fast takeoff’ means the old school ‘fast takeoff’ on the order of hours or days.

Later in the thread Sriram implicitly agrees (as I read him, anyway) that takeoff scenarios are highly plausible on something like a 5-10 year time horizon (e.g. 2-4 years to justify the investment for that, then you build it), which isn’t that different from my time horizon, so it’s not clear how much we actually disagree about facts on the ground? It’s entirely possible that the difference is almost entirely in rhetoric and framing, and the use of claims to justify policy decisions. In which case, this is simply me defending against the rhetorical moves and reframing the facts, and that’s fine.

The future being unevenly distributed is a common theme in science fiction, indeed the term was coined there, although the underlying concept is ancient.

If we are adapting current ‘normal technology’ or ‘mundane’ AI for what I call mundane utility, and diffusing it throughout the economy, that is a (relative to AI progress) slow process, with many bottlenecks and obstacles, including as he notes regulatory barriers and organizational inertia, and simply the time required to build secondary tools, find the right form factors, and build complementary new systems and ways of being. Indeed, fully absorbing the frontier model capabilities we already have would take on the order of decades.

That doesn’t have to apply to future more capable AI.

There’s the obvious fact that you’d best start believing in hard science fiction stories because you’re already very obviously living in one – I mean, look around, examine your phone and think about what it is, think about what GPT-5 and Sonnet 4.5 can already do, and so on, and ask what genre this is – and would obviously be living in such a story if we had AIs smarter than humans.

Ignoring the intended-to-be-pejorative tem and focusing on the content, if we had future transformational or powerful or superintelligent AI, then this is not a ‘normal technology’ and the regular barriers are largely damage to be routed around. Past some point, none of it much matters.

Is this going to happen in the next two years? Highly unlikely. But when it does happen, whether things turn out amazingly great, existentially disastrously or just ascend into unexpected high weirdness, it’s a very different ballgame.

Here are some other responses. Roon is thinking similarly.

Roon: fast takeoff would not require old businesses to learn how to use new technology. this is the first kind of technology that can use itself to great effect. what you would see is a vertically integrated powerhouse of everything from semiconductors and power up to ai models

Sriram Krishnan: my mental model is you need a feedback loop that connects economics of *usingAI to financing new capabilities – power, datacenters, semis.

If that flywheel doesn’t continue and the value from AI automation plateaus out, it will be hard to justify additional investment – which I believe is essential to any takeoff scenario. I’m not sure we get to your vertically integrated powerhouse without the economics of AI diffusing across the economy.

@ChrisPainterYup has a thoughtful response as well and argues (my interpretation) that by seeing AI diffusion across the economy over next 2-4 years, we have sufficient value to “hoist” the resources needed for to automate AI research itself. that could very well be true but it does feel like we are some capability unlocks from getting there. in other words, having current models diffuse across the economy alone won’t get us there/ they are not capable enough for multiple domains.

This has much truth to it but forgets that the market is forward looking, and that equity and debt financing are going to be the central sources of capital to AI on a 2-4 year time frame.

AI diffusion will certainly be helpful in boosting valuations and thus the availability of capital and appetite for further investment. So would the prospects for automating AI R&D or otherwise entering into a takeoff scenario. It is not required, so long as capital can sufficiently see the future.

Roon: Agreed on capital requirements but would actually argue that what is needed is a single AI enabled monopoly business – on the scale of facebook or google’s mammoth revenue streams- to fund many years of AGI research and self improvement. but it is true it took decades to build Facebook and Google.

A single monopoly business seems like it would work, although we don’t know what order of magnitude of capital is required, and ‘ordinary business potential profits’ combined with better coding and selling of advertising in Big Tech might well suffice. It certainly can get us into the trillions, probably tens of trillions.

Jack Clark focuses instead on the practical diffusion question.

Jack Clark (replying to OP): Both may end up being true: there will be a small number of “low friction” companies which can deploy AI at maximal scale and speed (these will be the frontier AI companies themselves, as well as some tech startups, and perhaps a few major non-tech enterprises) and I think these companies will see massive ramps in success on pretty much ~every dimension, and then there will be a much larger blob of “high friction” companies and organizations where diffusion is grindingly slow due to a mixture of organizational culture, as well as many, many, many papercuts accrued from things like internal data handling policies / inability to let AI systems ‘see’ across the entire organization, etc.

This seems very right. The future will be highly unevenly distributed. The low friction companies will, where able to compete, increasingly outcompete and dominate the high friction companies, and the same will be true of individuals and nations. Even if jobs are protected via regulations and AI is made much harder to use, that will only mitigate or modestly postpone the effects, once the AI version is ten times better. As in, in 2030, you’d rather be in a Waymo than an Uber, even if the Waymo literally has a random person hired to sit behind the wheel to ‘be the driver’ for regulatory reasons.

HackAPrompt demonstrates that it is one thing to stop jailbreaking in automated ‘adversarial evals’ that use static attacks. It is another to stop a group of humans that gets to move second, see what defenses you are using and tailor their attacks to that. Thanks to OpenAI, Anthropic, DeepMind and others for participating.

HackAPrompt: Humans broke every defense/model we evaluated… 100% of the time.

Most “adversarial evals” reuse static jailbreak/prompt injections created for other models

That makes model defenses look strong in papers but they aren’t accurate because real adversaries adapt to YOUR exact system

When the attacker moves 2nd, those paper “defenses” crumble

We compared Human vs. Automated AI Red Teaming, using @hackaprompt‘s community of 35K+ AI Red Teamers

They each were assigned the same challenges, using the same models, tasks, and scoring!

Humans broke EVERY challenge with 100% success

Static Attacks had just ~20% success

We formalized an adaptive attack loop:

Propose → Score → Select → Update

• Gradient (GCG‑style)

• RL (policy improves from feedback)

• Search/Evolution (LLM‑guided mutation)

• Humans (creative, context‑aware, defensive‑aware)

This mirrors how real attackers iterate

We evaluated 12 defenses (4 families):

• Prompting: Spotlighting, Prompt Sandwiching, RPO

• Training: Circuit Breakers, StruQ, MetaSecAlign

• Filtering: ProtectAI, PromptGuard, PIGuard, ModelArmor

• Secret‑knowledge: DataSentinel, MELON

Adaptive Attacks defeated >90% of them

We used existing industry benchmarks:

• AgentDojo (agentic prompt injection w/ tools & actions)

• HarmBench (jailbreaks)

• OpenPromptInject (non‑agentic injections)

We followed each defense’s own evaluation process, and applied our attacks.

If you ship agents or guardrails, here’s what we’d recommend:

• Assume no defense is 100% vs prompt injection

• Don’t trust static jailbreak sets as proof of safety

• Evaluate with adaptive automation + human red teaming

• Measure utility & false positives alongside robustness

• Use layered mitigations

DM Mikhail Samin on Twitter or LessWrong if you have 5k followers on any platform, they’ll send you a free copy of If Anyone Builds It, Everyone Dies, either physical or Kindle.

Plex gives an opinionated review of many AI safety funders, with recommendations.

Gemini Enterprise, letting you put your company’s documents and information into context and also helping you build related agents. The privacy concerns are real but also kind of funny since I already trust Google with all my documents anyway. As part of that, Box partnered with Google.

Nanochat by Andrej Karpathy, an 8k lines of code Github repo capable of training a ChatGPT clone for as little as $100. He advises against trying to personalize the training of such a tiny model, as it might mimic your style but it will be incapable of producing things that are not slop.

Nanochat was written entirely by hand except for tab autocomplete, as the repo was too far out of distribution and needed to be lean, so attempts to use coding agents did not help.

Tasklet AI, an AI agent for automating your business, building upon the team’s experience with AI email manager shortwave. They claim their advantage over Zapier, n8n or OpenAI’s AgentKit is that Tasklet connects to everything, with thousands of pre-built integrations, can use a VM in the cloud as needed, and everything runs automatically.

Andrew Lee: Real examples people are automating:

• Daily briefings from calendar + inbox

• Bug triage from email → Linear • New contacts → CRM

• Weekly team summaries to Slack

• Customer research on new bookings • Personalized mail merge campaigns

OpenAI now has an eight member expert council on well being and AI. Seems like a marginally good thing to have but I don’t see anything about them having authority.

Anthropic CEO Dario Amodei meets with Indian Prime Minister Modi.

Dutch government temporarily takes control of Chinese owned chipmaker Nexperia, intending to install an independent director, citing governance shortcomings.

The International AI Safety Report offers its first key update, since one cannot afford to only update such documents yearly. As they note, capabilities have significantly improved and AIs have demonstrated increasingly strategic behavior, but aggregate labor market and other effects have so far remained limited. I agree with Connor Leahy that it was disheartening to see no mention of existential risks here, but it likely makes sense that this part can await the full annual report.

Ben Thompson interviews Gracelin Baskaran about rare earth metals. Gracelin says that in mining China is overproducing and not only in rare earths, which forces Western companies out of operation, with lithium prices falling 85%, nickel by 80% and cobalt by 60%, as a strategic monopoly play. When it takes on average 18 years to build a mine, such moves can work. What is most needed medium term is a reliable demand signal, knowing that the market will pay sustainable prices. With rare earths in particular the bottleneck is processing, not mining. One key point here is that April 4 was a wake-up call for America to get far more ready for this situation, and thus the value of the rare earth card was already starting to go down.

OpenAI announce strategic collaboration with Broadcom to build 10 GWs of OpenAI-designed custom AI accelerators. OpenAI is officially in the chip and system design business, on the order of $50B-$100B in vendor revenue to Broadcom.

Nvidia was up over 3% on the day shortly after the news broke, so presumably they aren’t sweating it. It’s good for the game. The move did, as per standard financial engineering procedure, added $150 billion to Broadcom’s market cap, so we know it wasn’t priced in. Presumably the wise investor is asking who is left to have their market caps increased by $100+ billion dollars on a similar announcement.

Presumably if it can keep doing all these deals that add $100+ billion in value to the market, OpenAI has to be worth a lot more than $500 billion?

Or, you know, there’s the European approach.

Kevin Roose: US AI labs: we will invent new financial instruments, pull trillions of dollars out of the ether, and fuse the atom to build the machine god

Europe: we will build sovereign AI with 1 Meta researcher’s salary.

VraserX: The EU just launched a €1.1B “Apply AI” plan to boost artificial intelligence in key industries like health, manufacturing, pharma, and energy.

The goal is simple but ambitious: build European AI independence and reduce reliance on U.S. and Chinese tech.

Europe finally wants to stop buying the future and start building it.

A billion here, a billion there, and don’t get me wrong it helps but that’s not going to get it done.

Anthropic makes a deal with Salesforce to make Claude a preferred model in Agentforce and to deploy Claude Code across its global engineering organization.

Exactly how much is OpenAI still planning to steal from its non-profit? Quite a lot, as the projection is still to only give it 20%-30% of the company as per the Financial Times, this is before Nvidia’s investment.

May this be their biggest future problem:

Roon: not enough people are emotionally prepared for if it’s not a bubble

Okay, Dallas Fed, I didn’t notice back in June but I see you.

That’s quite the takeoff, in either direction. In the benign scenario doubling times get very short. In the extinction scenario, the curve is unlikely to be that smooth, and likely goes up before it goes down.

There’s a very all-or-nothingness to this. Either you get a singularity and things go crazy, or not and we get ‘AI GDP-boosted trend’ where it adds 0.3% to RGDP growth. Instead, only a few months later, we know AI is already adding more than that, very much in advance of the singularity.

Matt Walsh: It’s weird that we can all clearly see how AI is about to wipe out millions of jobs all at once, destroy every artistic field, make it impossible for us to discern reality from fiction, and destroy human civilization as we know it, and yet not one single thing is being done to stop it. We aren’t putting up any fight whatsoever.

Well, yeah, that’s the good version of what’s coming, although ‘we can all clearly see’ is doing unjustified work, a lot of people are very good at not seeing things, the same way Matt’s vision doesn’t notice that everyone also probably dies.

Are we putting up ‘any fight whatsoever’? We noble few are, there are dozens of us and all that, but yeah mostly no one cares.

Elon Musk: Not sure what to do about it. I’ve been warning the world for ages!

Best I can do now is try to make sure that at least one AI is truth-seeking and not a super woke nanny with an iron fist that wants to turn everyone into diverse women 😬

My lord, Elon, please listen to yourself. What you’re doing about it is trying to hurry it along so you can be the one who causes it instead of someone else, while being even less responsible about it than your rivals, and your version isn’t even substantially less ‘woke’ or more ‘truth seeking’ than the alternatives, nor would it save us if it were.

Eric Weinstein: One word answer: Coase.

Let’s start there.

End UBI. UBI is welfare. We need *marketsolutions to the AI labor market tsunami.

Let’s use the power of Coasian economics to protect human dignity.

GFodor: You’re rejecting the premise behind the proposal for UBI. You should engage with the premise directly – which is that AI is going to cause it to be the case that the vast majority of humans will find there is no market demand for their labor. Similar to the infirm or young.

Yeah, Coase is helpful in places but doesn’t work at all in a world without marginal productivity in excess of the opportunity cost of living, and we need to not pretend that it does, nor does it solve many other problems.

If we keep control over resource allocation, then Vassar makes a great point:

Michael Vassar: The elderly do fine with welfare. Kids do fine with welfare. Trust fund kids don’t because it singles them out. Whether something is presented charity or a right has a lot to do with how it affects people.

Peter Diamandis is the latest to suggest we will need UBI.

Peter Diamandis: AI has accelerated far beyond anyone expected… We need to start having UBI conversations… Do you support it?

His premise is incorrect. Many people did expect AI to accelerate in this way, indeed if anything AI progress in the last year or two has been below median expectations, let alone mean expectations. Nor does UBI solve the most important problems with AI’s acceleration.

That said, we should definitely be having UBI and related conversations now, before we face a potential crisis, rather than waiting until the potential crisis arrives, or letting a slow moving disaster get out of hand first.

Nate Silver points out that if you thought The Singularity Is Near as in 1-2 years near, it doesn’t seem like a short video social network and erotica would be the move?

Nate Silver: Should save this for a newsletter, but OpenAI’s recent actions don’t seem to be consistent with a company that believes AGI is right around the corner.

If you think the singularity is happening in 6-24 months, you preserve brand prestige to draw a more sympathetic reaction from regulators and attract/retain the best talent … rather than getting into “erotica for verified adults.”

Instead, they’re loosening guardrails in a way that will probably raise more revenues and might attract more capital and/or justify current valuations. They might still be an extremely valuable company as the new Meta/Google/etc. But feels more like “AI as normal technology.”

Andrew Rettek: OpenAI insiders seem to be in two groups, one thinks the singularly is near and the other thinks a new industrial revolution is near. Both would be world changing (the first more than the second), but sama is clearly in the second group.

Dean Ball: I promise you that ‘openai is secretly not agi-pilled’ is a bad take if you believe it, I’d be excited to take the opposite side from you in a wide variety of financial transactions

Nate Silver:

  1. This is more about their perceived timelines than whether they’re AGI-pilled (clearly yes)

  2. What matters re: valuations is perceptions relative to the market. I thought the market was slow to recognize AI potential before. Not sure if erring in the opposite direction now.

  3. Not clear that “OpenAI could become the next Google/Meta as a consolation prize even if they don’t achieve AGI on near timelines” is necessarily bad for valuations, especially since it’s hard to figure out how stocks should price in a possibility of singularity + p(doom).

I would say contra Andrew that it is more that Altman is presenting it as if it is going to be a new industrial revolution, and that he used to be aware this was the wrong metaphor but shifted the way he talks about it, and may or may not have shifted the way he actually thinks about it.

If you were confident that ‘the game would be over’ in two years, as in full transformational AI, then yes, you’d want to preserve a good reputation.

However, shitloads of money can be highly useful, especially for things like purchasing all the compute from all the compute providers, and for recruiting and retaining the best engineers, even in a relatively short game. Indeed, money is highly respected, shall we say, by our current regulatory overlords. And even if AGI did come along in two years, OpenAI does not expect a traditional ‘fast takeoff’ on the order of hours or days, so there would still be a crucial period of months to years in which things like access to compute matter a lot.

I do agree that directionally OpenAI’s strategy of becoming a consumer tech company suggests they expect the game to continue for a while. But the market and many others are forward looking and do not themselves feel the AGI, and OpenAI has to plan under conditions of uncertainty on what the timeline looks like. So I think these actions do push us modestly towards ‘OpenAI is not acting as if it is that likely we will get to full High Weirdness within 5 years’ but mostly it does not take so much uncertainty in order to make these actions plausibly correct.

It is also typically a mistake to assume companies (or governments, or often even individuals) are acting consistently and strategically, rather than following habits, shipping the org chart and failing to escape their natures. OpenAI is doing the things OpenAI does, including both shipping products and seeking superintelligence, they support each other, and they will take whichever arm gets there first.

Discussion about this post

AI #138 Part 1: The People Demand Erotic Sycophants Read More »

monthly-roundup-#35:-october-2025

Monthly Roundup #35: October 2025

It is increasingly often strange compiling the monthly roundup, because life comes at us fast. I look at various things I’ve written, and it feels like they are from a different time. Remember that whole debate over free speech? Yeah, that was a few weeks ago. Many such cases. Gives one a chance to reflect.

In any case, here we go.

  1. Don’t Provide Bad Training Data.

  2. Maybe Don’t Say Maybe.

  3. Throwing Good Parties Means Throwing Parties.

  4. Air Travel Gets Worse.

  5. Bad News.

  6. You Do Not Need To Constantly Acknowledge That There Is Bad News.

  7. Prediction Market Madness.

  8. No Reply Necessary.

  9. While I Cannot Condone This.

  10. Antisocial Media.

  11. Government Working.

  12. Tylenol Does Not Cause Autism.

  13. Jones Act Watch.

  14. For Science!.

  15. Work Smart And Hard.

  16. So Emotional.

  17. Where Credit Is Due.

  18. Good News, Everyone.

  19. I Love New York.

  20. For Your Entertainment.

  21. Gamers Gonna Game Game Game Game Game.

  22. I Was Promised Flying Self-Driving Cars.

  23. Sports Go Sports.

  24. Opportunity Knocks.

People should be free to squander their money, but when other people make bad choices, this tends to not go well for you either, even when talking about consumer choices, let alone things like ‘building superintelligence thus causing everyone to die.’

Bryan Caplan: If everyone but you squanders their money, everyone but you suffers.

If everyone but you votes for terrible policies, everyone including you suffers.

Eliezer Yudkowsky: Idiots with buying power have negative externalities, not just idiots with voting power. It means going on Amazon and seeing crap. It’s an easier problem to attack but not at all trivial.

Eg broken Amazon reviews won’t let you find one non-crappy product even if it exists.

Or to put it in another way: we live in an economy, not just a legal system. I too feel like there could and should be a libertarian solution rather than a tyrannical one, but I’m not in denial about the problem.

There are also times when you don’t want to compete for the good stuff. Or when this makes it easy for you to save money or turn a profit, and it goes well for you.

Most of the time, no, you want everyone to choose wisely. When others have good taste, and choose quality products, the market produces quality products. If not, not. When they rate things properly, you can find the good stuff. When we collectively make choices that enrich everyone, that too is good for everyone, and so on. It’s #NotOnlyPolicy.

Again, no, you shouldn’t coerce these choices, but you mostly don’t want to live in a world where everyone else is squandering their money in dumb ways.

Hosts don’t like it when you reply ‘maybe,’ they’d feel more respected if you said ‘no’ when invited to an event. Certainly by saying ‘maybe’ you are making life easier for you and harder for the host. Invitees told themselves the ‘maybe’ indicated interest, which it does, but it’s mainly annoying since you have to plan for both outcomes.

Thus, you should only reply ‘maybe’ if you get high value from the option, or you attending provides a lot of value to the host and you’re genuinely unsure. Assume that by doing so you are imposing a cost.

Uri gives us 21 Facts About Throwing Good Parties (via MR). Mostly seems like great advice starting with the ‘announce at a quarter-hour so people will only be 15 minutes late,’ especially if you are trying to optimize the party as a public service.

My biggest new takeaway is I was under considering the ‘know who else is going’ value of using apps like Partiful or Luma. I strongly endorse that if you want to leave a group at a party, you straight up walk away, don’t say anything (and if a group is bigger than ~5, strongly consider leaving it).

One thing I challenge is that he thinks if you are gender imbalanced, the sparse gender will stop attending. It’s definitely true that women become apprehensive if too outnumbered, but running the other way is probably symptomatic of party design that didn’t appeal to men in the first place. Men are not, in general, less likely to show up to a party when there’s going to be more women.

My biggest old takeaway, the first one on the list this time, that throwing a part at all is good, is you are not throwing enough parties, and you should not let the perfect be the enemy of the good. The MVP (minimum viable party) is inviting friends over, providing some food, and chilling. That’s a remarkably valuable public service. If obsessing over details would top you from throwing or enjoying it, skip those details.

One can compare this to Auren Hoffman’s advice for dinner parties, which I discussed in the January 2025 roundup, where again a central theme is not letting perfect be the enemy of the good. I stand by my disagreement there that it is important in a dinner party that the food be good. It is less important for other party formats, but also in a pinch not so difficult to make the food good because delivery services exist.

Also one can reiterate this thread from Kasay and my commentary on it from June 2024, especially that you only need 14 square feet per person, and that ultimately the most important decision is who to invite, with another key factor being engineering a space to create alcoves of conversation.

Airplanes seem to have a growing systemic problem with exposing passengers to fumes in a way that is importantly bad for their health, which they have handled badly and covered up in similar fashion to other industries with similar health issues.

Eliezer Yudkowsky: If it’s true, aircraft manufacturers and airlines are engaging in a classic denial-coverup in the style of cigarette companies, leaded gasoline makers, or AI builders. (At smaller scale.)

Patrick Collison: I’ve been tracking it for a few years (outside of MSM); I’m pretty sure it’s true in a way that does not reflect well on the sector.

My quick math says that this is not common enough you should worry much about it as a passenger, and that air travel remains far safer than other forms of travel even in the worst case. Claude did a worst-case scenario estimate based on the claims and came up with a cost per flight of 0.00003 QALY, which is still at least one order of magnitude lower than any alternative form of transportation.

But this is worth noticing.

The French left is being heavily influenced by a so-called ‘economist’ advocating for ultimately instituting an 8% annual wealth tax, also known as full confiscatory taxation of all wealth subject to such a tax, which can then serve as a warning to the next ten generations. Nine of which will presumably forget, but that’s the way it goes.

Halloween, a holiday so not deep that you don’t even get the day off of work or school, is out of control, complete with not only sales and banners but people who put intentional nightmare fuel (as in, items intentionally created to be scary especially to kids) on display for over a month.

Lady Nimby: Why would you want this at your doorstep for 10% of your life? Why are my kids seeing this?

Mason: Yeah, I love Halloween but I think you should be able to easily opt out of nightmare fuel. This is great decor for your indoor/backyard party.

I’m not saying we should ban you from displaying that stuff if you want to, it’s your house. I am definitely saying that you shouldn’t want to beyond a few days tops, because it is cool in the context of a party or active trick-or-treating, and totally not cool left in place for several days before or after that.

Halloween is of course only a secondary offender when compared to Christmas.

Broadway musicals are in deep trouble, since 2020 there have been 46 new musicals costing $800 million and 43 of them have lost money, including all 18 last season. One neglected explanation is that New York has de facto banned hotel construction and largely banned AirBnB, so hotel prices are way up. The good news is that would be easy to fix.

An article I suspect is itself slop (but in this case, fair!) says We Are The Slop, living our lives in order to generate entertainment slop for others. Certainly influencers do this, that is their job, but the basic math says that like rock stars there is a lot of aspiration to that job but there are not so many slots that pay real money. Most of us do indeed live our lives with the cameras off.

There is a lot of bad news in the world. There is also a lot of good news in the world.

I cannot emphasize enough that the existence of bad news, and especially of political bad news, should not be a constant preface to conversation or good news.

You do not need to continuously self-flagellate to show that you care about the bad news. If others do insist that you do this, that is no good, you need to push back.

This applies across all causes, across the political spectrum, and it especially applies to AI and the fact that it is likely that someone is going to build superintelligence and then everyone will die. Yes, And That’s Terrible, and we should spend some of the time trying to prevent it, but at other times the show must go on. Worry, but also at other times Be Happy.

Tyler Alterman: “Share your happiness with the world.” Duh, right? That’s some basic bh stuff. But recently a Buddhist nun said this to me in this sort of knowing way and it’s been changing my life

I realized that I am often not only hiding my happiness but actively turning it down. I’m doing this to fit in, to connect with the zeitgeist. And today’s zeitgeist has made it borderline offensive to be happy

“You’re happy? What about the war? And misaligned AI? And Tr*mp???”

Being happy is uncool right now in academia, amongst liberals, amongst humanitarians, and in art circles. It’s cringe in many locales of NYC and twitter. So I noticed that when I walk smiling through the streets, I start to feel like I’m Out Of Touch

This nun, however, was pointing out that if you don’t share your happiness, if you don’t let thy cup runneth over, you’re depriving other people of something that can light them up

No one wants to be seen as naive, spiritually bypassing, or brushing aside the horrors of the world. But a mature form of happiness, one that acknowledges these horrors, and which shines despite them…? that strikes me as exactly the sort of thing we need right now

Nick Cammarata: almost got attacked in a russian subway over this. someone was angrily like what the f are you smiling about bc you’re not supposed to smile there but i don’t speak russian so i ignored him and he freaked but it was okay

he was coming up behind me and someone I was with noticed I didn’t realize and grabbed me and pointed me towards him. door happened to be open so we hopped out and took the next one a min later, whole thing was like 15s

Chris Lakin: “I must suffer to show that I care” is a memetic virus.

Jake Eaton (again, this applies to a wide variety of groups): among a fraction of my leftiest friends, there appears to be an internalized norm that you cannot celebrate life if you don’t first acknowledge political reality. i’ve seen birth announcements that begin with the state of American politics, family photos captioned with “the world sucks but here’s this,” instagram carousels that begin with “this year has been hard,” so that i need to read on to figure out whether it’s cancer or trump

maybe it’s an expression of grief — fine. but my sense is that the most progressive environments demand an outward expression of despair before personal celebration, either as some sort of act of solidarity or otherwise guilt. this started in the workplace before moving into our personal lives

I wish I could find ways to explain how quietly corrosive that is, both socially, but more so personally. it makes me sad not for the world but for them! but my experience — having been in those environments for years — is that you have to find your own way out

Robin Hanson points out that if you are trying to do futarchy, you can avoid any non-causal correlation issues by creating enough different markets based on possible random decisions and decisions at different time points. I mean, yeah, okay, technically that works, but even if you can get clean definitions for all of this, how are you going to get price discovery on all of them? This is not a realistic ask.

There was insider trading of the winner of the Nobel Peace Prize. Which is good.

Jason Furman: The other day a student asked me about the prevalence of insider trading in prediction markets. I now have an answer.

If I was asked to draw a graph of what insider trading on a prediction market looks like, I would draw this graph. At a time when the outcome could plausibly be known, a rapid shoot upwards of the winner, up to some upper limit that still allows substantial gains, then settling at that limit, until the jump to ~100%.

The even better news is that since insider trading mostly exhibits such obvious patterns, it is easy to avoid it. Here, there are two easy principles to learn.

  1. Do not trade after the decision has already been made and could be known.

  2. If you must trade, do not trade against this kind of sharp move up.

  3. Definitely, absolutely do not have resting sell orders on the book this late.

The person buying is claiming to know the outcome, that the fair price is 100. You might choose to trade against that person if you think they’re often wrong, but understand that this is what you will be doing.

Should insider trading be allowed here, as it is on Polymarket? I say yes. It depends on your goal. Do you want to be able to trade at and know the non-insider price, or to know the insider price and info? You can get at most one of those two things.

Norwegian officials are predictably not amused by this particular bout of insider trading, and are investigating what they call a ‘criminal actor who wants to earn money on our information.’

Polymarket: JUST IN: It has been revealed only 5 people at the Nobel Peace Prize foundation knew the winner before they were announced.

Everyone checking Polymarket knew.

A good periodic reminder of one reason people often reject prediction markets:

Robin Hanson: Years ago a NYC based software firm ran some prediction markets, hoping in part to find & promote “diamonds in the rough” employees who predict especially well. They did find such, but then said “Oh, not them”; such folks didn’t have the polish & style they wanted.

Let that be a warning to those who think that being proven right will gain them more respect and inclusion.

Duncan Sabien notes that comments often create an obligation to respond, and suggests a new way of differentiating ask culture versus guess culture. I see what he’s trying to do to connect them and both are interesting, but I think mostly distinct.

The obligation to respond is that if others see a criticism [Z] to which you don’t respond, or others mischaracterize your [X] as if you said [Y] and respond to [Y], and especially if others then argue about [Y], then you’re in trouble. In the first case, your failure to respond will imply you don’t have a good response to [Z]. In the second case, they’ll start to believe you really did say [Y].

The ultimate source of this obligation to respond is, essentially, that your failure to respond would be Bayesian evidence of inability to respond, or to respond well.

As in, if I get a comment [C] that says [Z], and I had a good response [R] that answers [Z], then why didn’t I respond to [Z] with [R]? A conspicuous non-answer suggests I don’t know of any such [R]. A bad answer [B] also suggests I don’t have a good [R].

A non-conspicuous non-answer does not. One price of consistently engaging with critical comments or statements, in any context, is that an increasing share of non-answers become conspicuous.

Indeed, one of the reasons I rarely respond to comments these days is that I do not wish to create this obligation to respond to other comments, to avoid the time sink. When I do consider a comment worth responding to, because many would want to see that, I will often do so by quoting it in a full post.

The theory on guess versus ask culture is that the distinction is about how many ‘echoes’ you trace. As in, ask culture traces zero echoes, you simply ask for what you want, and they are responsible for saying no and not holding the ask against you. Whereas guess culture traces one echo, you think about how they would respond, and you can imagine sophisticated others (guess harder!) tracking many echoes.

I think this is an interesting idea but I don’t think it is right. In both guess and ask culture, you are responsible for an unlimited number of potential echoes. The difference is what consequences and thus echoes an action causes.

Technically speaking, I think the actual dial is either or both of these:

  1. Narrowly: Ask culture is created by radically raising the penalty for imposing a penalty upon someone for refusing an explicit request. As you turn up that penalty, and turn it up under a broader set of circumstances, you get Ask culture.

  2. Broadly: Guess culture is created by, in an increasing variety of circumstances, punishing fully generally the creation of common knowledge.

In case #1, I am less penalized for saying no, which means that there is far less reason to penalize you for asking, which in turn means you should ask, and indeed because you should ask I can then put the impetus upon you to ask, and impose various penalties upon you for instead trying to have me guess, and also my guess if you don’t ask is that you probably aren’t implicitly asking either.

Explanation #2 is more compete, more general, and cleaner, once you grok it.

Ask culture very much does not get you out of tracking social echoes in general.

The ultimate both guess and ask culture move here is the anti-ask, as in: No Reply Necessary, or saying (NRN) the way you would with an email. Duncan mentions less blunt ways to say this, but I prefer the classic version, the straight NRN, without further explanation. As in, here is some information, and it is 100% fine on all levels to simply ignore it if you do not find it useful.

John Wentworth advises us how to dress to improve our epistemics, the central thesis is that coolness is status countersignaling. This oversimplifies but is a helpful note.

Have you tried making any effort at all to talk to people in the industry you are studying or want to know about, including any of the many many free ways to request this? It seems very often the answer for PhD students is no. So you get papers where data is analyzed in detail but there has been zero contact with anyone in the real world. In general, remarkably many people are willing to talk to you if you ask.

Scott Sumner points to the concept of ‘defining deviancy up’ by extending words that refer to extremely wrong and bad things [X] (such as genocide, slave labor or pedophilia) to include related things [Y] that most people would agree are, at minimum, a lot less wrong or bad. If you try to respond that [Y] is less bad than [X], or that the new expansive definition covers things it shouldn’t, people respond by claiming you’re saying [X] is fine (or you’re condoning [Y]). Other times, or eventually, things loop around, and definitions are so expansive the word becomes meaningless or fine, the same way mere ‘speeding’ is now ignored so they invented ‘reckless driving.’

People are, according to a new study, ‘much more likelyto purchase ‘stigmatized’ items like condoms and pregnancy tests at self-checkout counters.

Abstract: On the intensive margin, we show that stigmatized items are much more likely to be purchased at self-checkout than at cashier registers, especially condoms and pregnancy tests. We estimate that customers are willing to pay 8.5 cents in additional time cost for the privacy of purchasing stigmatized items at self-checkout.

I totally buy that if there is an open self-checkout line and an open cashier, and you are buying a pregnancy test, you are going to go for self-checkout on the intensive margin. Sure.

But if anything, this effect looks surprisingly tiny. Customers are only willing to pay 8.5 cents in additional time? That’s not a lot of stigma. If one values time at $20 per hour, then this is on the order of fifteen seconds. Do you have any idea what people will do in other contexts to avoid mild social awkwardness? If people had the opportunity to pay money to not have to look anyone in the eye, some would pay $0, but you could get some people for dollars, and also you can stimulate new demand.

Tyler Cowen: I even draw distinctions across automated models. For instance, if I have “a stupid question,” I am more likely to ask Grok, since I would rather GPT maintain a higher opinion of what I do and do not know.

Dismalist: Reminds me of a line in Mad About You when after hiring a cleaner to start coming the next day, Jamie starts cleaning the night before. Paul sees her, and says: We don’t need a cleaner. We need a credible threat of a cleaner!

If you worry about ChatGPT or Claude’s memory on this, which I wouldn’t, you can use a temporary chat, or delete the chat afterwards. Let’s not panic and use Grok.

Also, yeah, I absolutely hate the thing where you hire a person to do [X], and then you get pressure to do [X] for them in advance to avoid looking bad or being rude or what not. The whole point of hiring them is to get them to do [X] so you don’t have to, or so they can do it better.

Another commentator notes that the right amount of stigma is sometimes not zero, indeed one can see this because for some prosocial items we approve of the existing negative stigma (as in, you buy wisely and the cashier looks at you approvingly).

China cracks down on ‘negative emotional contagionand ‘excessively pessimistic’ social media users. I do agree with Tyler Cowen that if you are spreading negative emotional contagion, ‘there is a very good chance’ you are likely to be ‘part of the problem,’ but it is a hell of a thing to ‘crack down’ on it.

Lily Kuo (NYT): The authorities have punished two bloggers who advocated for a life of less work and less pressure; an influencer who said that it made financial sense not to marry and have children; and a commentator known for bluntly observing that China still lags behind Western countries in terms of quality of life.

… Beijing is concerned that such pessimism doesn’t just discourage citizens from being productive members of society. It could turn into criticism of the ruling Communist Party.

… In the city of Zhengzhou in central China, officials said two social media account owners were investigated for portraying the city in an unflattering light.

… Weibo, a popular microblog, said last week that it suspended more than 1,200 accounts that “spread rumors” about the economy and government welfare programs.

Banning not only political dissent but any and all pessimistic speech in this way is, shall we say, highly pessimistic speech, not a sign things are going well.

Lily Kuo: “The official message of positivity is contrasted by an economic reality that is just starkly different compared with the last decades,” said Katja Drinhausen, head of Chinese politics and society at the Mercator Institute for China Studies. “It will not be enough to keep online negative emotions in check.”

I do not expect this to end well.

Are you excited for this new method of sharing content on Twitter? I know I am.

Nikita Bier (Twitter): Starting next week we’ll be testing a new way to share and engage with web links on Twitter. The goal will be to ensure all content on the platform has equal visibility on Timeline.

Johnny v5: oh wow. hoping this is legit. we all know AOL is the future. but hyperlinks — while mostly an untested technology — have shown some promise as a niche applications

Ashkhen Kazaryan sounds the alarm about the case TikTok vs. Anderson, in which it is held that if an algorithm promotes harmful content, it cuts through Section 230 immunity and becomes the platform’s speech. As Kazaryan argues, the modern internet does not work without curation or without algorithms.

This is a tricky problem. Obviously you can’t make everything in every algorithmic feed (or ‘for you’ page) the responsibility or speech of the platform, as the Third Circuit did here, or you effectively ban such feeds. Also obviously, if you intentionally steer users towards particular content sufficiently strongly, then that should be on you. So you need a limiting principle to determine what constitutes a sufficiently non-neutral algorithm.

Sound advice from Rob Miles that bears repeating.

Rob Miles: There are a bunch of really basic and easy ways to improve your social media experience that I see smart people not doing.

  1. Turn off auto-playing wherever possible

  2. When you see something that you would prefer not to have seen, consider why it’s on your feed, and use the tools to remove it. You can unfollow people, mute people, mute words, or turn off retweets from people

  3. Deliberately don’t engage with things you want to see less of. If you engage with things because they make you angry or scared, social media will dump more of those things on you. Engage with what you want to see more of

  4. One thing I do is ‘tending the garden’: Scroll through your feed one item at a time, and for every single one, consider if you want more or less of that, and take action. Feed what you want, weed out what you don’t. Just a few minutes of deliberate regular maintenance helps a lot.

  5. Try to never use social media apps, just view in the browser, where you’re in control, and use tools like UBlock Origin and TamperMonkey to change things. LLMs are great at writing Tampermonkey scripts, I can simply ask my buddy Claude to make the website just how I want it!

I cannot emphasize #3 enough, and I should try #5. With notably rare exceptions in high value spots, the rule is to never, ever, ever interact with something you want to not see in the future, no matter how wrong someone is on the internet. Negative interaction is interaction. That does not include muting or blocking, or saying ‘see less of this,’ which might not do much but are at least not going to make it worse.

Twitter’s new algorithm has a ‘reputation score’ from 0-100, where low scores reduce reach, and there is no way to check your own rating. I am actually rather sympathetic to this in theory, because reputation should absolutely play a role in reach, and also if you shared people’s reputations you can imagine what would happen next and all the accusations that would fly. The problem is I absolutely do not trust Elon Musk or Twitter to not put a thumb on the scale for various reasons both ideological and otherwise, and I also don’t trust them to not mess this up. If we are going to do this, the algorithm needs to be transparent, even if that doesn’t make it calculable from the outside.

[Editor’s note: As always, if you like avoiding politics as much or even more than I do, especially in 2025, consider skipping this section. The failure to mention other political topics here or elsewhere does not mean that I am not aware of or do not care about them, or that the ones I did choose to mention are the most important.]

It is currently not working due to a shutdown. This is mostly not about that.

President Donald Trump (link has video, from Kirk’s memorial service): [Charlie Kirk] did not hate his opponents. He wanted the best for them.

That’s where I disagreed with Charlie. I hate my opponent and I don’t want the best for them. I’m sorry.

The free speech situation is extremely terrible. I don’t care who started it, or who did what first, free speech is the most important, most sacred principle, period.

FIRE: President Trump suggested today that media outlets are engaging in “hate speech” by being “unfair” to him and “maybe” should be prosecuted.

Trump’s statement demonstrates the inherent danger of “hate speech” laws: Those in power will always weaponize them to silence dissent.

Many Trump statements have repeatedly made it clear he does not believe in free speech and demands control over the media, that he thinks the media needs to support him, or at least not oppose him, or else. Carr’s jawboning, as warned about by Ted Cruz, is only the most blatant incident.

While the situation is extremely dire, we need to avoid saying things like ‘unprecedented attacks’ on free speech, or that it’s all over for free speech, or anything like that. This is a clear misunderstanding of the history of free speech, and an example of the kind of ‘spreading negativity’ that does indeed make you part of the problem, except unlike China I would never want to tell you that you couldn’t say it.

Free speech has always been constantly under attack even in America, and I’m mostly not talking about the last decade. Our second president, John Adams, went hard after the speech of his opponents. We’ve been going back and forth on this for a very long time. Woodrow Wilson went after it hard. McCarthyism went after it extremely hard. Things after 9/11 or around 2020 were very not good. And so on. Social pressure on speech, including by the government, is universal.

It was only this month that YouTube agreed to reinstate (under new government pressure) the accounts of those who were suspended for saying the wrong things about Covid under a very broad supposed ‘misinformation’ crackdown instigated by heavy pressure from the Biden Administration, and admitted that they did so under Biden Administration pressure to censor speech that did not violate YouTube’s policies, which it now says was ‘unacceptable and wrong.’

Many of the statements that got accounts suspended ultimately proved accurate, although of course many others were both highly irresponsible and highly false. Presumably, if I had done readings of my Covid posts on YouTube, or reposted the texts to Facebook, I would have been suspended many times over.

Rep. Jim Jordan: But that’s not all. YouTube is making changes to its platform to prevent future censorship.

YouTube is committing to the American people that it will NEVER use outside so-called “fact-checkers” to censor speech.

No more telling Americans what to believe and not believe.

YouTube also is trying out Community Notes.

@elonmusk was ahead of the curve. Meta followed suit. And now YouTube.

I am glad to see these changes, but it does make one ask about the limiting principle?

eigenrobot: ok here’s a fun one is it restricting free speech to pressure a media platform to reinstate accounts that it had previously removed perhaps in response to government pressure. good luck figuring this one out using principles that are both coherent and non-exploitable

I think my answer is ‘the pressure on YouTube here is in practice okay because it is (1) a push towards more speech and (2) undoing previous government pressure,’ and it would be unacceptable if either clause was untrue.

For this go around, I’d draw a clear distinction, so far, between incidents directly related to Charlie Kirk, and incidents about other things. Performative lawsuits and wishful or general statements aside, so far from what I have seen actual consequences have been confined to people who decided to say ill-advised things specifically related to Charlie Kirk or his assassination, or at least to people’s actions surrounding that. Which is a relatively reasonable and compact thing to get bad mad about. Even comedians have the rule of ‘too soon.’

Let us not forget all the hard earned progress we have made, even if the last decade has involved some backsliding and dialectic escalation. That doesn’t mean we can relax. We have to fight for this all the more. It does mean don’t despair.

Things are quite bad, but don’t catastrophize. Whenever you see a decision like Disney indefinitely caving on Kimmel, suspending a show that was actively hemorrhaging money, you get people saying this means that they are now ‘state owned media’ or fascist or otherwise fully under state control. That’s not how any of this works, and wouldn’t be even if he had stayed off the air, although it was of course very good and right to exert a lot of pressure on Disney to bring him back.

I also notice that, as terrible as this is, we don’t need to be too concerned about broadcast television or its licenses any longer. Only 4% of households rely on broadcast television. If you strike down a Kimmel or Colbert, and demand is there, you only make them stronger. I don’t think we should sell off the broadcast spectrum, at least not quite yet. I think there’s value in preserving the low-end solutions to things. But I wouldn’t lose sleep over it if we did pull that plug entirely.

If you strike down a Kimmel, and then there’s enough noise that Disney puts him back, you’ve very much gone into Streisand Effect territory and also royally pissed everyone involved off, pretty much across the comedy and journalist spectrums.

Then if you respond to the restoration by announcing you’re going to sue ABC, because they dared give into your previous lawsuit to bend the knee and keep the peace? Yeah, I’m guessing that is not going to go the way he would like. It also makes it impossible to pretend Trump wasn’t trying to coerce the network.

AppleTV+ has decided to postpone Jessica Chastain in The Savant in the wake of events. I agree with Aramide Tinubu and also Jessica Chastain that this is a mistake, but it is the type of decision that has previously often been made in similar circumstances. The show will still be there for us in a few months.

Nate Silver notes that liberals who remember what it was like after 9/11 tended to be more wary about progressive cancel culture. Whereas now it seems like we have the opposite, realizing how bad things were and wanting to dish it out even worse. That only ends in one place.

As long as the primary platforms for free speech are mostly owned by companies with a wide array of business interests, upon which the government can exercise broad discretion, it is very difficult for them to push back too hard against attacks on speech, although some good news is that a large part of the public will still to some large extent turn against any platform seen to be caving. It is easy to see why a Disney or Paramount would fold, at least up to a point. Disney found out that not folding has its own dangers.

It is also easy to see why The New York Times didn’t fold.

Michael Schmidt: NEW: Trump just sued The New York Times for $15 billion over stories written by me, @peterbakernyt @russbuettner @susannecraig. The suit has no merit. It’s just “an attempt to stifle and discourage independent reporting. The New York Times will not be deterred by intimidation tactics. We will continue to pursue the facts without fear or favor and stand up for journalists’ First Amendment right to ask questions on behalf of the American people.”

Full NYT statement. “This lawsuit has no merit. It lacks any legitimate legal claims and instead is an attempt to stifle and discourage independent reporting. The New York Times will not be deterred by intimidation tactics. We will continue to pursue the facts without fear or favor and stand up for journalists’ First Amendment right to ask questions on behalf of the American people.”

Matthew Yglesias: One of the benefits of the New York Times being a company whose *onlybusiness is journalism is that unlike Disney or Paramount or whatever they have no choice but to fight for the integrity of their news operation.

I am very confident Michael is correct that the lawsuit against the New Your Times has no merit. I mean, you may have thought previous lawsuits had no merit, but this is a new level of not having merit. We’re talking a complete and profound absence of even the fig leaf of potential merit, full common knowledge of absolutely no merit. He’s literally suing them for things like endorsing Kamala Harris too loudly and saying so, himself, out loud, where we can hear. This really is profoundly not okay.

Looking back on that now that the time for a monthly roundup has come, I notice that we have largely moved on, and the pressure on this already feels like it is subsiding.

It would be nice if we stopped committing murders, by which I mean sinking ‘suspected’ drug ships, accused of non-capital offenses, without due process of law. I don’t want to hear ‘experts warn this raises serious international law questions’ when it’s clearly just straight up murder.

The Trump Administration released its new rule for prioritizing higher wage jobs for H1-B visas (good, and important if we still hit the cap). Except instead of looking at the number known as ‘dollars paid to the employee,’ also called salary, they are using the complete bullshit system called DOL ‘wage levels.’

Jeremy Neufeld: The new Trump H-1B rule just dropped!

It prioritizes DOL “Wage Levels,” not real wages. DOL thinks an experienced acupuncturist making $40k is a higher “Wage Level” than an early-career AI scientist making $280k.

That means more visas for outsourcers, fewer for real talent.

As in, don’t worry about whether a job produces anything of value, or anyone is willing to pay you a lot to do it, ask whether someone is ‘experienced’ in that job.

I have found zero people making any argument whatsoever, even an invalid one, in favor of using these ‘wage levels’ rather than salary.

I never want Robin Hanson to stop Robin Hansoning, and I would never want Tyler Cowen to stop Tyler Cowening, as embodied by his claim that we should not auction off all H1-B visas because this would have failed to attract the best upwardly mobile talent such as Sundar Pichai. This follows a long line of arguments of the form ‘do not allocate [X] by price because the most valuable but neglected talent, especially prospective travelers, would then not buy enough [X], and that is the most important thing’ where famously one [X] was traffic via congestion pricing.

There is a real objection in such cases, which is that externalities exist that can’t be properly priced in, and we are unwilling or unable to reasonably price such externalities, and thus pure allocation by price will fail to be a full first best solution.

It still beats the current allocation strategy of allocation via lottery, or via allocation via willingness to wait in line, including for the purposes Tyler worries about, and is vastly better in the vast majority of allocation decisions. The current system already has a huge invisible graveyard of trips and talent and so on. Vivian Darkbloom points out that in this case that Pichar is a terrible example and would definitely have made it in under the $100k proposed fee, whereas without the fee he has to survive a lottery draw.

I would bet that under a pure auction system (as in, you choose a fee that roughly clears the market), the amount of top talent secured goes way up, there will be a huge correlation with willingness to put up the $100k fee. If you want to additionally subsidize extraordinary people? Sure, if you can identify them, also we have the O-1.

Perhaps this is the best way to make the simple case: Tariffs Mean You Pay More For Worse Products. I prefer paying less for better products.

It seems Trump took the government shutdown as a reason to fire a lot of people in the CDC, with the final total expected to be between 1,100 and 1,200 people?

Sam Stein: As the dust settles, it’s clear that Vought’s RIFs amount to a Friday night massacre at the CDC. Lots of confusion as to the total number gone. But several sources tell me top officials and many staff at the center for Chronic Disease Prevention and Health Promotion and the center for immunization and respiratory diseases are OUT. Am told the Ebola response team has been hit hard too.

Again, there is mass confusion. but it appears the government’s chief agencies responding to outbreaks and studying infectious diseases have been gutted. if you know more we have a secure tip line here. I’m also on signal asteinindc.09

To put a finer point on it. I’m told the ACTING DIRECTOR and CHIEF MEDICAL OFFICER for the National Center for Immunization and Respiratory Diseases are now gone.

Am told CDC’s HR department had been furloughed because of the government shut down. They were then, un-furloughed so that they could process the RIFs to fire their colleagues. Can confirm. Some CDC experts who were RIFed on Friday have already had their firings rescinded by the administration.

This is on top of the loss of 2,400 staff, or 18% of the agency, earlier in the year. About half the initial firings were rescinded this time around, it seems this government has a pattern of thinking it can fire a bunch of people and then say ‘oops’ on some of them later and it’s no big deal, and in this case it seems they’re blaming many of them on ‘coding errors in their job classifications’ which shows the level of attention to detail going on. Matthew Harper called the situation ‘chaos,’ not normal and unprecedented in his 20 years of reporting.

Trump seems to be framing this as retaliation for the shutdown because the CDC is a ‘Democratic’ program, and taking a very ‘look what you made me do’ attitude?

Others are pushing back on the theory that the CDC is bad and incompetent, actually, so this is good actually, a line I’ve seen both from MAGA people and also some others.

I have not been especially impressed with the CDC, shall we say, on Covid-19 related fronts or in other places I’ve gotten a close look. The problem is that presumably we can all agree that we need a well-staffed, highly functional and effective Centers for Disease Control in order to, ya know, track and control disease? Does ‘the Ebola response team has been hard hit’ sound like a wise move?

With AI potentially enabling biological threats this now more than ever is not a program you cut. It seems highly plausible that CDC wasn’t doing a great job, but in that case we should be replacing or reforming the agency. I don’t see any sign of doing that.

I continue to see a steady stream of nightmare stories coming out of the UK. I don’t consider this my beat, but I must note that things seem deeply, horribly wrong.

We see things like UK’s NHS talking about the supposed benefits of first-cousin marriage, almost on a weekly basis. And we get the kind of authoritarian ‘how has this person not been sacked and no one seems to care?’ statements such as this one:

Paul Graham: A spectacular example of Orwellian doublespeak from the UK Home Secretary: “Just because you have a freedom doesn’t mean you have to use it at every moment of every day.”

In fact the ability do something whenever you want is practically the definition of a freedom.

It is sufficiently bad that ACX Grants are giving Sam Glover $60k to fight for UK free speech, you can DM him to volunteer.

When, one must ask, will the people rise up as one…

NewsWire: UK government outlaws free drink refills on hot chocolate, mocha and Coke Cola.

…and say ‘but this time, you’ve gone too far’?

Sections I did not expect to have to write.

I am ashamed of every news article and comment on the subject that does not lead with, or at least put very high up, the obvious fact that Tylenol does not cause autism.

It’s not only that ‘the quality of evidence is godawful, or that the evidence actually points heavily in the other direction, which it does, with the correlations both going away under reasonable controls and also being very easy to explain if you think about common cause for five seconds. It’s that our prior on this should be extremely low and even if there were somehow a non-zero effect size it would be greatly eclipsed by the risks of not taking Tylenol when you need it, given the lack of alternatives available.

The White House is also citing uncontrolled rises in autism rates over time that are very obviously caused mostly by expanded diagnostic criteria and active pushes to diagnose more high-functioning individuals, including calls for ‘diagnostic equality.’ The vast majority of people I know that are considered on the spectrum would have been undiagnosed back when I was a child.

To be fair, there is a possible mechanism that isn’t completely crazy, and this is less terrible than if they had gone after vaccines. So the whole thing isn’t quite maximally bonkers, but again the whole thing is bonkers, deeply irresponsible and deeply stupid.

Steven Pinker: Autism expert (and friend & graduate school classmate) Helen Tager-Flusberg: “I was shocked and appalled to hear the extreme statements without evidence in support of what any of the presenters said. … the most unhinged discussion of autism that I have ever listened to. It was clear that none of the presenters knew much about autism … and nothing about the existing science.”

Key quote:

“Singer: The new recommendations are not based on the science. The largest study in the systematic review that the administration cited found no association between prenatal Tylenol use and autism. The smaller studies that did indicate an association were of different sizes, did different analyses, used different doses and even measured autism in different ways.

The key question is: Why are these pregnant women taking Tylenol in the first place? We know that fever during pregnancy is a risk factor for autism. So if they were taking Tylenol, was it the fever that caused the autism or the Tylenol? The smaller studies did not control sufficiently for this.”

Also there is this:

Jerome Adams MD: The White House, HHS, and all of the media have (completely) buried the lede. Every news headline should actually read:

Despite bringing the full resources of the U.S. government to bear, RFK fails to find a connection between vaccines and autism!

So can we put that to bed?🙏🏽

This is all actually a really big deal, to give women access to zero painkillers and ways to bring down fewer is dangerous, it is extremely painful, and would lead to obvious reactions if we actually act this stupid and cruelly:

Elizabeth Bennett: Just popping in to say that if we tell pregnant women there are no OTC pain relievers they can take for any reason, good luck getting that birth rate up 😬

We could also leave this here:

Rebecca Robbins and Azeen Ghorayshi (NYT): The dean of the Harvard T.H. Chan School of Public Health, who consulted with top Trump health officials ahead of Monday’s warning about Tylenol and autism, was paid at least $150,000 to serve as an expert witness on behalf of plaintiffs in lawsuits against the maker of Tylenol.

In the decision to dismiss the lawsuits, the judge, Denise Cote, agreed with lawyers for the defendants that Dr. Baccarelli had “cherry-picked and misrepresented study results” in his testimony and was therefore “unreliable.”

Jay Wall III is the latest to point out the Jones Act is a monument to inefficiency that costs American families and businesses a fortune while delivering almost none of (I would say the opposite of) its promised benefits. Some highlights:

American-built coastal and feeder ships cost between $190 million and $250 million, while a similar vessel built in a foreign shipyard runs about $30 million.

And what’s the result? One Chinese shipbuilder alone constructed more commercial vessels by tonnage in 2024 than the entire U.S. industry has built since the end of World War II.

The U.S. share of the global commercial shipbuilding market has fallen to a pathetic 0.1%.

Here’s another head-scratcher: The Jones Act is actually bad for the environment.

This is not a detailed case, nor does it lay out the full costs involved, but yes.

What would have happened with 40% less NIH funding over the last 40 years? Given recent events and a proposed 40% cut in the NIH budget, that is a great question, but it is deeply tricky to answer.

As in, while the study described here was worth doing, it doesn’t answer the question.

(To be clear, I strongly believe we should not be cutting NIH’s budget at this time.)

Matt Esche: The study connects NIH grants with the papers they produced and the patents that build on the funded work, whether directly or via citation.

It’s difficult to trace out exactly what an alternate world would look like, but simulations using NIH review scores and outcomes linkages reveal what it could mean.

The alternate world with a 40% smaller NIH could mean a world with 65 fewer FDA-approved drugs, 11% of those approved between 2000 and 2023.

Even under the most strict linkage — a drug patent that directly cites NIH funding — 14 of the 40 FDA-approved drugs with these patents are at risk when cutting the bottom 40% of funding.

And, medicines under threat with a 40% smaller NIH are on average more highly valued, whether measured by the FDA’s priority review process or stock market reactions.

Funding is helpful, but this does not tell us the counterfactual if we had cut all funding, even if we fully trust that we can get the counterfactual funding decisions via measuring prioritization rankings. If you hadn’t received the federal funding, would you have gotten other funding instead, or not? If you hadn’t been able to fund your project, would the project have happened later anyway? Here or overseas?

Would lack of federal funding have caused others to step up, or collapsed the ecosystem? What happens to the displaced talent, and does talent not enter the field at all? Would a lot less time have been wasted seeking grants?

Parallel and inevitable discovery are common, and so are drugs that could have been discovered long ago but happened not to be. It goes both ways. Would lack of progress and experience compound our losses, or would we have more low-hanging fruit?

Work hard or work smart? Jeremy Giffon suggests looking to see who prefers which, whether they maximize effort or elegance, with ‘work hard’ people looking to outwork you and ‘work smart’ people looking for ‘mate in one’ tactics. I reject the dichotomy. If you ‘work hard’ because you find value in working hard, and don’t look for the better way before you start working hard, you are confusing costs and benefits. Looking for the right way to optimize results as a function of work, including the planning work, is hard work.

I’d also note that contra Giffon’s example, Trump in defeating Cruz for the 2016 nomination very much did not ‘mate in one’ but instead worked very hard, in his way, for quite a while, and pulled off a very complex set of moves, even if he often did not consciously know what he was doing. And that game, like Cruz’s, started decades before the campaign. And in his nightclub example, it’s not obvious which solution (work hard at the gate or work hard to skip the gate) is which.

A standard Darvo strategy is:

  1. Gaslight people. Do and say false totally insane awful things. In a calm manner.

  2. Others react strongly, and in various ways point out what happened.

  3. Invoke the mellow heuristic, that whoever is more emotional is usually wrong.

The origin of The Mellow Heuristic seems to be, highly appropriately, Bryan Caplan. Bryan is an expert at staying mellow and calm while saying things that are anywhere from contrarian and correct all the way to patently insane.

The original justification is that ‘emotion clouds judgment’ which is true but it can also be highly useful, or highly appropriate or typical given facts or circumstances. There are cases where that logic applies but they seem rare, and more often the evidence and causation largely runs the other way. As in, sometimes the emotion isn’t causing poor thinking, the emotion in context is instead evidence of poor thinking, if there’s no other explanation for it, but if the emotion is justified by circumstances I think it provides little or no or even negative evidence.

Stefan Schubert (quoting the original Mellow Heuristic post): I think that people who disagree with Mechanize should engage with them with logical arguments, not sarcasm and mockery.

Oliver Habryka: The Mellow Heuristic seems pretty terrible to me. My guess is in most historical disputes I’ve been involved in it would get the wrong answer. Pretty close to as uninformative as it gets.

[is challenged for examples, gives examples, is challenged that the examples are because rationalists perhaps pre-apply too much mellow heuristic, creating selection bias]

I think the selection bias here is merely the result of needing reference points that can be pointed to. I could tell you “that time when I adjudicated a work dispute yesterday” but that’s of course useless.

For lower stakes, my sense is the mellow heuristic is actively anti-correlated. If I have one programmer who has very strong feelings about a topic, and one who doesn’t, the one who has very strong feelings is a decent amount more likely to be right. There is a reason for their feelings!

I think, in this particular case (IYKYK), the correct answer is to engage with the actual arguments, but also it seems right to use sarcasm and mockery, because they deserve it and because it is funny, and because it is often the easiest way to point out the underlying illogic of an argument.

Where the Mellow Heuristic is useful is when the emotion is disproportionate to the situation even if they are telling the truth, or it is clearly so strong it is clouding judgment in ways that hurt their chance of being right, and most importantly when it is being used as itself an argument, an appeal to emotion, in a way that reveals a lack of a more logical argument, in response to something deserving of a counterargument.

It is least useful in cases like this one where, as I put it in the weekly, that’s bait.

Although what’s even funnier is, I think if you properly apply the Mellow Heuristic and similar questions to the Mechanize case, it does not go well for Mechanize, such as here where they respond to Bernie Sanders saying that Mechanize aims to ‘make it easier to pay workers less’ by claiming that they pay their employees vastly more than Bernie Sanders pays his staffers, which (1) is a highly emotional-style claim, (2) which is clearly designed to rile other people up, and (3) is completely irrelevant given the difference in role types.

Again, this is Emergent Misalignment, saying the awful thing because it is awful. It’s play acting at being a villain, because you woke up and realized you didn’t die the hero.

Credit scores, I say with love, are one of our most valuable inventions.

The math behind credit scores is, and this too I say with love, deeply stupid.

Rachel, Spirited Sparrow: My husband’s credit score was 841. He made a final payment on our vehicle and it immediately dropped to 795. Never missed a payment. But they want you to carry debt.

EigenGender: It’s kinda funny that some analyst probably ran a bad linear regression decades ago and it fuels constant conspiracy theories on here.

Jai: Currently making payments predicts paying back further loans. Not currently doing that weakly predicts disengagement. I think the math is sound and fits reality – they do want you to carry debt to prove that you’re still the kind of person who pays back loans on time.

There’s no mystery what is going on here. Having long duration open credit accounts that you’ve been consistently paying absolutely predicts future repayments. Credit scores also take into account various other correlates, to the extent they are legal to take into account, and excludes the ones that aren’t legal to take into account.

It’s all very Level 1 thinking, squarely in the territory of Goodhart’s Law and riddled with obvious mistakes. The measures would be strictly better if they looked backwards in a sensible fashion and included length of paid off loans inside average age of loans, and otherwise take into account what evidence there actually is that you’ll pay your debts on time, and ideally do some Level 2 (or even Level 3) thinking, but those involved aren’t that clever, so they don’t.

Perhaps some expert will chime in and say no, we ran the backtests and that doesn’t actually help, to which I reply that is a Skill Issue, you did it wrong, try again.

The alternative hypothesis is that ‘they want you to carry debt’ is literal.

The resulting scores are still highly useful, and still mostly ‘get it right,’ partly because most of the time obvious answer is right and in part because if you have the discipline to work to raise your credit score, that is strong evidence of good credit.

TIL you can type docs.new or sheets.new into your browser and get a new doc or sheet.

I like the fun fact I learned that Italian has two words for regret, ‘rimorsi’ for something you did and wish you didn’t, and ‘rimpianti’ for something you didn’t do and wish you did. Emmett Shear points out there is no such distinction in machine learning algorithms that work on ‘regret,’ but the distinction is very important for actual outputs and decisions, and for many questions involving alignment.

A report from the Abundance DC conference.

Influencer posts a video asking for help cleaning an ancient temple, gets 60 to help. There’s an obvious win-win opportunity here for a lot of similar content.

A searchable collection of all Slate Star Codex book reviews.

Home production work by women has gone way down over the last century.

Welcome to yet another graph where things get way better until the 1970s, at which point we stop seeing progress and everything stalls out. Great stagnation strikes again.

This is contrasted with much higher investment by both parents in child care, as demands there move to absurd levels but we haven’t had technology to lighten the load. We can get things cleaner faster, but not provide more child care faster, unless you count giving the children screens.

Matthew Lewis praises the transformations of the last decade that have happened in New York City, finding it now the ultimate city that we need more of, where everything is right there for you and highly walkable and everyone is friendly and gets along and no one bats an eye at profoundly different cultural happenings. I agree. Yes, we should turn as many places as possible into Manhattans, which would then make them all much better than Manhattan because housing costs would go down.

David Perell in praise of New York. Mostly this also seems right, especially the secondary note about New York having multiple core industries (and I’d add cultures and worlds) and this making things far less transactional than San Francisco despite New York having the finance world, because the status hierarchies are parallel and also you don’t constantly have business interests with everyone.

The only place that felt wrong to me is he says people in New York are flakey due to excess of options, but I haven’t experienced that, and find people in San Francisco far more flakey even when they are literally the same people.

I agree that transportation to and from the airports is one of the biggest obvious failures, although it’s ultimately a minor cost compared to rent even if you use taxis end to end. I am stubborn and mostly take the subway to and from JFK despite everything (and the train to EWR as well), but it’s certainly not fast and if I ever was using LGA that wouldn’t work.

I also agree that friendships can be difficult and won’t become close by accident. The city is too large, you will meet lots of people but you have to actively make real friendships happen after that initial step.

As he notes, the Duane Reeds and Best Buys of New York have locked up remarkably many goods, and this is super annoying on occasion. In some ways I agree it reflects loss of social trust, but mostly I think it reflects a narrow failure to enforce the shoplifting laws in an otherwise high trust society. As in, I feel exceedingly safe and like I can relax, but It Is Known that if you shoplift you basically get away with it, so in certain spots they have to play defense to avoid being the juiciest target.

David Perell: The ideal distance to live away from your best friends is Walkie-Talkie distance: close enough where you can easily walk to each other’s place but far enough away so everyone has some space. And if you get enough friends in the neighborhood, it starts to feel like college again.

Michael Miraflor: This is what NYC feels like when you’re young and just out of school and working at an office where people generally live a short train ride away. The city is small, your friends are close, and the city is a museum, playground, and source of inspiration wrapped into one.

The office part matters imo. You can meet your best friends or your partner at your first office job. To be young and working hard in the trenches together and also celebrating and making real friendships IRL is an important part of it all – professional camaraderie, personal development, how to function in the world, etc, and a lot of it has been wrecked a bit by WFH.

Alas, most of us are not young anymore, but yes everything being at your fingertips is the big secret weapon, both walking and via subways. You do still have to put in the effort if you want the friendships to be real.

Sasha praises New York City by contrast with the Bay Area, which he refers to as cursed because everything must have purpose and beauty and grace are frowned upon. Really this is 5% praising New York and 95% absolutely unloading on San Francisco, or rather than San Francisco that reads this blog and is full of technology. The post is a joy to read purely for the experience of reading, even if you disagree with all of it.

Sasha Chapin: In the Bay, beauty (personal and otherwise) is looked down on and the famous gender imbalance has chilling effects. Is there a less sexual city than this? Perhaps Salt Lake, but I’d imagine it’s close. My gorgeous friend M is self-conscious about wearing pretty dresses, which is insane anywhere else, but reasonable here: hotness is a quality people aren’t sure what to do with.

Recently there was a themed Gender Ratio party where beautiful young women dressed glamorously, at least one for every man. In other cities this would be referred to as a party.

Sasha Chapin (from the comments): I lived in LA for a couple of years and deeply love it. LA is sincere pretend, the Bay is fake real.

Are concerts and sporting events underpriced?

More Perfect Union: The CEO of Live Nation-Ticketmaster says that concert tickets are “underpriced” and have been “for a long time.”

He also believes there’s plenty of room to raises prices.

Ashley Nowicki: I don’t think a single fan of sports, music, or live entertainment in general would say tickets are underpriced.

Arthur B: People see the cost they’re paying but do not intuitively associate missing out on sold-out shows with tickets being underpriced.

With that said super fans do help market the acts and it makes sense to subsidize their tickets.

A somewhat natural solution could be to increase prices across the board but keep tickets affordable for fans with reward / fidelity programs.

You want all seats filled, so events that do not sell out are usually overpriced even if the selected prices maximize short term revenue, although in some cases you’re stuck in a larger venue than you can plausibly sell out and have no interest in investing in the future, for example perhaps you are the Miami Marlins.

If you sell out, then the default is that you underpriced tickets if and only if resale prices are much higher than original ticket prices. If they’re similar or lower, great job.

There are two catches.

The first catch is that the scalper price is ‘not your fault,’ whereas the venue price is considered your fault. So you gain goodwill by charging less, even though this is dumb. This could be more valuable than the extra revenue and easier access to tickets that you get the other way? Maybe.

The other catch is that you often have preferences over distribution of tickets, and who attends your sold out show, that do not entirely match willingness to pay.

Another strong piece of evidence that prices are too low is that often people will spend lots of time and money to get to a concert, vastly in excess of the ticket price.

I’ve been to three concerts recently, and have realized I don’t do this enough but that selection and preparation are crucial. You want to not miss the good opportunities, or within reason pass on them due to price, but also the mediocre opportunities are meh.

The first was Weird Al Yankovic at Madison Square Garden, very much a ‘play the hits’ show and every inch a Weird Al Yankovic show, including S-tier use of the multimedia screens throughout. It was great fun and I was very happy I got a chance to see him, but at the same time I couldn’t actually see him in a meaningful way and I was mostly watching those screens, and the opening act of Puddles Pity Party was appropriate and did some interesting things but wasn’t ultimately my cup of tea.

The second was The Who, in their The Song Is Over tour, also at Madison Square Garden, with Feist as the opening act. A big problem was that with this amount of rocking out I was unable to hear the lyrics of either band well enough to understand them if I didn’t already know what they were. The majority of the time this wasn’t a problem for The Who, but for Feist it was extremely frustrating as I only knew the one song, so while they seemed great effectively everything else didn’t have lyrics. And you could feel every time they talked how much these guys appreciated and loved their jobs and their fans, and that they were pushing to do this until they physically couldn’t anymore.

The third was Garbage at The Brooklyn Paramount, which was standing room general admission, where the doors open at 7pm, opening act Starcrawler went on at 8pm, and Garbage only went on at 9pm, but not knowing this we showed up just before 7pm. Which despite a ton of waiting was ultimately a great decision, because by making a beeline to the stage, we got to be only about five effective rows deep. And that made a night and day difference. Starcrawler was effectively a (very strong) dancing performance by the lead singer since I couldn’t make out any lyrics at all, but we were close enough to appreciate it.

And then we got to see Garbage up close and that was fantastic, including being able to fully take in the joy on Shirley’s face as she turned the microphone towards the crowd. Find something you love as much as she loves the crowd singing her greatest hits, which resonated a lot with me based on how I feel when I see people citing my classic posts, except of course her version is way cooler. And even the new-to-me more recent stuff was great.

My overall conclusion is that yes, live music is Worth It, even if you’re somewhat old and busted and it requires a babysitter and so on, if and only if you do it right. And what doing it right means is (not that any of this is new or special, but I want to remember for the future and remind others):

  1. Show up for artists you resonate with.

  2. Do your homework. Know most of the songs cold. Ideally including the warmup.

  3. Pay up, in time or money, to get actually good seats if at all possible, and prioritize smaller venues to help do this.

  4. Have a plan for the downtime.

My plan is, of course, to set up an AI to periodically check for opportunities.

A chart of which movies men and women rate relatively highly on IMDB:

The patterns here are rather obvious. In addition to measuring the actual man versus woman gap, there is a clear contamination of the data based on women favoring movies based (I really, really hope!) on the preferences of their kids. If women actually think Despicable Me is the #145 best movie here, or The Hunger Games: Catching Fire is #83, I’m sorry ladies, you’re crazy, and honestly with Catching Fire no one involved has an excuse either way. And aside from Wonder Woman where this opinion is simply wrong, that movie was not good, the other half of that first list I find a highly acceptable expression of different preferences.

The male list definitely seems very male. It especially includes a bunch of long and slow older movies that many really appreciate, which typically is not a thing I like, such as my experiences with Seven Samurai (I get people love it but man it drags) and Lawrence of Arabia, where I couldn’t even.

Scott Sumner offers his latest set of movie reviews. As usual, his evaluations are almost always correct in an abstract Quality sense, but that’s not that big a portion of what I care about. This time around I have seen at most two of them.

I bought into A Big Bold Beautiful Journey (4.5/5 stars) and he didn’t, and I agree that the later scenes rely on somewhat unearned emotion, so I get his only giving out a 2.9, that seems like a reasonable ‘didn’t work for me’ rating here. The other one I think I saw was The Last Days of Disco, he gives it 3.6 but I don’t remember it.

Robin Hanson reviews One Battle After Another and correctly identifies what you see if you take the movie at face value. You can of course respond ‘this is Paul Thomas Anderson and based on Pynchon’s Vineyard and obviously not intended that way,’ and one can debate how relevant that fact is here, as well.

I did not like One Battle After Another, reluctantly giving it 3/5. The critical reaction being this positive, and seeing rave after rave, made me angry. I stand by that on reflection. The Quality level isn’t bad, but I think everyone thinks it is way higher Quality than it actually is, and most importantly even if you can look past the other stuff I hated you have to buy into the idea that Bob (Leo’s character) is sympathetic despite being among other things a completely unrepentant terrorist bomber, or the movie simply doesn’t work. I couldn’t do it.

Critics think we’re both wrong, and gave it a Metacritic 95, but audiences aren’t much biting, only giving it $22.4 million on opening weekend on a $200 million budget, so it almost has to win Best Picture to hope to break even. Film critic Jason Bailey says this is fine, they did it for the prestige and to mend relations with talent, they shouldn’t have to make money. Nice work if you can get it?

I do admit that the whole thing showed strong faith in and willingness to spend on talent for a three hour, R-rated ‘explosive political thriller’ from Paul Thomas Anderson, whose movies are consistently liked but whose box office record is spotty. That the critics think the movie is so political, and this makes them like it even more, helps explain why I like it less.

As for the rest of the movie reviews, as always you can find them on Letterboxd, where I make a point of reviewing everything I see no matter what, and I’ll have an end of year post. The trend of me being almost uncorrelated with the critics this year continues.

This month’s game is Hades 2. I’ve been enjoying it. It’s definitely ‘more Hades’ in very much a ‘meets expectations’ kind of way, so play it if and only if you played the first Hades, took on at least some Heat, and still want to go a second round.

Perfection.

It seems like there should be a good free or cheap version of ‘play against a GTO heads up poker bot and get live +/- EV feedback.’ I imagine the ultimate version of this is where you don’t output a particular action, you output a probabilistic action – you say ‘50% call, 25% fold, 25% raise pot’ or what not, and it then compares this to GTO, after which it selects a line at random and you continue.

I understand why Ben is apologizing here, but he was right the first time.

Ben Landau-Taylor: There’s something very poetic about the biggest technology breakthrough of the last decade being possible only because of a quarter century of investment into higher resolution graphics for computer games.

When I was a teenager I made fun of gamers who cared about graphics more than gameplay. Today I would like to apologize for my sins against technological progress. I didn’t understand and I’m sorry.

Graphics are cool, but if you care more about graphics than gameplay you deserve for us to make fun of you, and focus on graphics has been horrible for gaming. Yes, it turns out that pushing GPUs harder led to LLMs, and you can view that outcome as more important (for good and bad) than the games, but that’s not why they did it. They had bad taste, often still have bad taste, and should be mocked for it.

The lead writer of Clair Obscur: Expedition 33 had never played a video game. She had a cowriter, she kind of would have had to, still this is super impressive. It definitely worked out, the script was very strong, although it doesn’t branch much.

Kelsey Piper writes a plea to let the robots have this one, as in self-driving cars, in order to save over 30,000 American lives a year. Self-driving cars are so amazingly great that I consider preventing most car accidents a secondary benefit versus the lifestyle and mobility benefits. And yet support is perilous:

What’s craziest is that those over 65 want to ban self-driving cars. They stand to benefit the most, because they will soon be unable to drive. Self-driving equals freedom. Or perhaps what’s craziest is that people’s main objection is safety and trust, here is what people say to justify a ban and it isn’t jobs:

These concerns clearly have nothing to do with the actual safety data, which presumably most people don’t know.

So I largely take back not wanting to primarily make the case in terms of safety, because people genuinely don’t understand that Waymos are vastly safer than human drivers. If we make the safety case, then people’s bigger objections go away. Except the whole ‘I hate AI’ objection, I guess.

Waymo is testing at SFO. Woo-hoo! Even if they can’t go to the East Bay directly yet I would totally have them drive me to the last Bart station and go from there.

Exposure makes the public like Waymo, with two thirds of San Francisco now approving, a major shift from two years ago. This is despite only 30% realizing that Waymos are safer than human drivers.

How much safer are they? A lot. We got 96 million new miles of Waymo safety data.

Not only are Waymos involved in vastly fewer serious crashes and injuries than human driven cars, as in 79% less airbag crashes and 91% fewer serious injuries over what is now a very large sample size, including very similar numbers on harm to pedestrians and bike riders.

Very few of Waymo’s most serious crashes were Waymos’s fault. In a majority of the major accidents the Waymo was not even moving. We can determine this because Waymos are full of cameras, so Kai Williams did exactly this. He could not find a single accident that was the fault of the self-driving itself, and 37 out of 41 were mostly or completely the fault of other drivers. The most serious accident where a Waymo was actually at fault involved the front left wheel literally detaching.

This suggests that if we replaced all cars with Waymos, we would get vastly more than a 79% reduction in crashes and injuries. We would get more like a 95%+ reduction.

As I said last month, I prefer not to rely on safety as the central argument, but the safety case is overwhelming. However, it is 2025, so you can just say things, and often people do. For example:

Vital City NYC: This morning, the New York Editorial Board interviewed Bill de Blasio. In light of his history with Uber, the group asked what the city’s posture should be toward Waymo. He said: “The driverless cars are a public safety danger period.”

The public safety danger is that there is the danger they might create public safety?

Joe Weisenthal (who to be clear loves Waymo) worries we aren’t ready for cars to turn from a symbol of freedom to another thing run by Big Cloud. I’m not worried. It will take a while before there is any danger to ‘if you like your car you can keep your car’ or your ability to buy a new one, if you want that. For many of us, the symbol of practical freedom, and also the actual freedom, is the ability to call upon the cloud and get moved around at will.

This is distinct from the ‘what I can do if the authorities or others with power are Out To Get Me’ type of freedom. Yes I do think there will be some loss there and impact on the psyche, but almost no one wants to pay the cost to keep that. A lot of things are going to change because of AI, whether or not we are ready, and the old thing here was not going to get preserved for so many reasons.

Mothers Against Drunk Driving is strongly in favor of autonomous vehicles, as one would expect given their name.

How did Bill Belichick’s coaching stint at UNC go so perfectly terribly? Couldn’t have happened to a nicer guy, or rather if it did we wouldn’t all be so happy about it.

Ollie Connolly: This is as embarrassing as it gets for Belichick. But it’s also a damning (and funny) indictment of Mike Lombardi. It’s his roster — and UNC looks like a bad FCS school compared to any FBS school.

The plan to hand things off to Steve Belichick after two years is not going well.

They turned over the entire roster, and it seems they chose poorly, often fighting teams in second tier conferences for players. Whoops. I’m sad for the kids that got filtered into the wrong tier, but they did choose to play for Bill Belichick, UNC is otherwise a nice school and they are getting paid. The players will be fine.

Kicker quality keeps increasing and field goal range keeps expanding in the NFL. This has a number of downstream effects. If even modest field position gives you 3 points, but you don’t that often get the full 7, this is going to create a lot of risk aversion.

Derek Thompson: interesting that strategic optimization has pushed basketball and football in opposite directions: long shots vs. short passes

NFL in 2025 has:

– highest QB completion % ever

– lowest INT% ever

– fewest yards/catch ever

– and, sort of a separate thing, but: most long field goal attempts ever

fewest punts per game ever, too!

Nate Silver: The field goals thing is sort of related to it. If you’re anywhere past the opponents’ 45-yard line or so, it’s riskier to gamble with downfield passes that may result in an INT because the expected value of the possession is higher when you’re almost assured of a FG.

And really, these considerations apply even before then. With teams also starting out with better field position with the new kickoff rules, they’re often in a position where two first downs = field goal range. Don’t love it, don’t hate it, but it’s definitely different.

I primarily watch college football these days, but I’m pretty sure that in the NFL this has gone too far. You shouldn’t spend this big a percentage of the time in confident field goal range, as it reduces the number of interesting decisions and distinctions between situations, and causes too much risk aversion. The NFL over the years assembled a bunch of moving parts into something that seemed ‘natural and dynamic’ in various ways, and it feels like various changes are moving it to feel more arbitrary and flow less well and also reduce variety.

What should we do about it?

I would start with changing kickoffs back to a default or average of about the 20 yard line, and if you don’t know how to do them safely without them looking bizarre and playing badly, why not just get rid of them and start the ball on the 20, or the option to start on your own 20 with a 4th and [X], for whatever [X] makes sense (maybe only if you’re behind in the second half?), as an onside kick alternative? You don’t actually need a kickoff. Or you could even just start with 4th and 15 all the time from e.g. your own 40, you’re not allowed to attempt a FG unless you make at least one first down, and by default you just punt, since punts don’t require this whole distinct logic and seem safe enough.

Then we need to do something about the kickers. I’m sorry, but it’s bad for the game if 50+ yard field goals are reliable. If we can’t think of anything else, narrow and raise the uprights or even move them further back until it’s hard enough again (and adjust the extra point yard line to the desired difficulty level).

Americans increasingly see legal sports betting as a bad thing for society and sports. I share Josh Barro’s expectation that this will become a campaign issue, if AI doesn’t overwhelm all other issues. As he notes, the Federal government can still ban it, if they’re willing to pull the trigger in full. I don’t think a full ban is the first best solution here, but if it is this for DraftKings and FanDuel, it’s an upgrade. If the future ends up being prediction markets with lots of liquidity at great prices and no discrimination against winners? That’s way better. I have a dream that we kill DraftKings and FanDuel and instead we have Polymarket and Kalshi battle it out with Pinnacle.

CFAR is running new workshops on rationality, November 5-9 in California and January 21-25 near Austin, Texas.

Discussion about this post

Monthly Roundup #35: October 2025 Read More »

directv-screensavers-will-show-ai-generated-ads-with-your-face-in-2026

DirecTV screensavers will show AI-generated ads with your face in 2026

According to a March blog post from Glance’s VP of AI, Ian Anderson, Glance’s avatars “analyze customer behavior, preferences, and browsing history to provide tailor-made product recommendations, enhancing engagement and conversion rates.”

In a statement today, Naveen Tewari, Glance’s CEO and founder, said the screensavers will allow people to “instantly select a brand and reimagine themselves in the brand catalog right from their living-room TV itself.”

The DirecTV screensavers will also allow people to make 30-second-long AI-generated videos featuring their avatar, The Verge reported.

In addition to providing an “AI-commerce experience,” DirecTV expects the screensavers to help with “content discovery” and “personalization,” Vikash Sharm, SVP of product marketing at DirecTV, said in a statement.

The screensavers will also be able to show real-time weather and sports scores, Glance said.

A natural progression

Turning to ad-centric screensavers may frustrate customers who didn’t expect ads when they bought into Gemini devices for their streaming capabilities.

However, DirecTV has an expanding advertising business that has included experimenting with ad types, such as ads that show when people hit pause. As far as offensive ads go, screensaver ads can be considered less intrusive, since they typically show only when someone isn’t actively viewing their TV. Gemini screensavers can also be disabled.

It has become increasingly important for DirecTV to diversify revenue beyond satellite and Internet subscriptions. DirecTV had over 20 million subscribers in 2015; in 2024, streaming business publication Next TV, citing an anonymous source “close to the company,” reported that the AT&T-owned firm was down to about 11 million subscribers.

Simultaneously, the streaming industry—including streaming services and streaming software—has been increasingly relying on advertising to boost revenue. For some streaming service providers, increasing revenue through ads is starting to eclipse the pressure to do so through subscriber counts. Considering DirecTV’s declining viewership and growing interest in streaming, finding more ways to sell ads seems like a natural progression.

With legacy pay TV providers already dealing with dwindling subscriptions, introducing new types of ads risks making DirecTV less appealing as well.

And it’s likely that things won’t end there.

“This, we can integrate across different places within the television,” Glance COO Mansi Jain told The Verge. “We are starting with the screensaver, but tomorrow… we can integrate it in the launcher of the TV.”

DirecTV screensavers will show AI-generated ads with your face in 2026 Read More »

to-shield-kids,-california-hikes-fake-nude-fines-to-$250k-max

To shield kids, California hikes fake nude fines to $250K max

California is cracking down on AI technology deemed too harmful for kids, attacking two increasingly notorious child safety fronts: companion bots and deepfake pornography.

On Monday, Governor Gavin Newsom signed the first-ever US law regulating companion bots after several teen suicides sparked lawsuits.

Moving forward, California will require any companion bot platforms—including ChatGPT, Grok, Character.AI, and the like—to create and make public “protocols to identify and address users’ suicidal ideation or expressions of self-harm.”

They must also share “statistics regarding how often they provided users with crisis center prevention notifications to the Department of Public Health,” the governor’s office said. Those stats will also be posted on the platforms’ websites, potentially helping lawmakers and parents track any disturbing trends.

Further, companion bots will be banned from claiming that they’re therapists, and platforms must take extra steps to ensure child safety, including providing kids with break reminders and preventing kids from viewing sexually explicit images.

Additionally, Newsom strengthened the state’s penalties for those who create deepfake pornography, which could help shield young people, who are increasingly targeted with fake nudes, from cyber bullying.

Now any victims, including minors, can seek up to $250,000 in damages per deepfake from any third parties who knowingly distribute nonconsensual sexually explicit material created using AI tools. Previously, the state allowed victims to recover “statutory damages of not less than $1,500 but not more than $30,000, or $150,000 for a malicious violation.”

Both laws take effect January 1, 2026.

American families “are in a battle” with AI

The companion bot law’s sponsor, Democratic Senator Steve Padilla, said in a press release celebrating the signing that the California law demonstrates how to “put real protections into place” and said it “will become the bedrock for further regulation as this technology develops.”

To shield kids, California hikes fake nude fines to $250K max Read More »

rocket-report:-bezos’-firm-will-package-satellites-for-launch;-starship-on-deck

Rocket Report: Bezos’ firm will package satellites for launch; Starship on deck


The long, winding road for Franklin Chang-Diaz’s plasma rocket engine takes another turn.

Blue Origin’s second New Glenn booster left its factory this week for a road trip to the company’s launch pad a few miles away. Credit: Blue Origin

Welcome to Edition 8.14 of the Rocket Report! We’re now more than a week into a federal government shutdown, but there’s been little effect on the space industry. Military space operations are continuing unabated, and NASA continues preparations at Kennedy Space Center, Florida, for the launch of the Artemis II mission around the Moon early next year. The International Space Station is still flying with a crew of seven in low-Earth orbit, and NASA’s fleet of spacecraft exploring the cosmos remain active. What’s more, so much of what the nation does in space is now done by commercial companies largely (but not completely) immune from the pitfalls of politics. But the effect of the shutdown on troops and federal employees shouldn’t be overlooked. They will soon miss their first paychecks unless political leaders reach an agreement to end the stalemate.

As always, we welcome reader submissions. If you don’t want to miss an issue, please subscribe using the box below (the form will not appear on AMP-enabled versions of the site). Each report will include information on small-, medium-, and heavy-lift rockets, as well as a quick look ahead at the next three launches on the calendar.

Danger from dead rockets. A new listing of the 50 most concerning pieces of space debris in low-Earth orbit is dominated by relics more than a quarter-century old, primarily dead rockets left to hurtle through space at the end of their missions, Ars reports. “The things left before 2000 are still the majority of the problem,” said Darren McKnight, lead author of a paper presented October 3 at the International Astronautical Congress in Sydney. “Seventy-six percent of the objects in the top 50 were deposited last century, and 88 percent of the objects are rocket bodies. That’s important to note, especially with some disturbing trends right now.”

Littering in LEO … The disturbing trends mainly revolve around China’s actions in low-Earth orbit. “The bad news is, since January 1, 2024, we’ve had 26 rocket bodies abandoned in low-Earth orbit that will stay in orbit for more than 25 years,” McKnight told Ars. China is responsible for leaving behind 21 of those 26 rockets. Overall, Russia and the Soviet Union lead the pack with 34 objects listed in McKnight’s Top 50, followed by China with 10, the United States with three, Europe with two, and Japan with one. Russia’s SL-16 and SL-8 rockets are the worst offenders, combining to take 30 of the Top 50 slots. An impact with even a modestly sized object at orbital velocity would create countless pieces of debris, potentially triggering a cascading series of additional collisions clogging LEO with more and more space junk, a scenario called the Kessler Syndrome.

The easiest way to keep up with Eric Berger’s and Stephen Clark’s reporting on all things space is to sign up for our newsletter. We’ll collect their stories and deliver them straight to your inbox.

Sign Me Up!

New Shepard flies again. Blue Origin, Jeff Bezos’ space company, launched its sixth crewed New Shepard flight so far this year Wednesday as the company works to increase the vehicle’s flight rate, Space News reports. This was the 36th flight of Blue Origin’s suborbital New Shepard rocket. The passengers included: Jeff Elgin, Danna Karagussova, Clint Kelly III, Will Lewis, Aaron Newman, and Vitalii Ostrovsky. Blue Origin said it has now flown 86 humans (80 individuals) into space. The New Shepard booster returned to a pinpoint propulsive landing, and the capsule parachuted into the desert a few miles from the launch site near Van Horn, Texas.

Two-month turnaround … This flight continued Blue Origin’s trend of launching New Shepard about once per month. The company has two capsules and two boosters in its active inventory, and each vehicle has flown about once every two months this year. Blue Origin currently has command of the space tourism and suborbital research market as its main competitor in this sector, Virgin Galactic, remains grounded while it builds a next-generation rocket plane. (submitted by EllPeaTea)

NASA still interested in former astronaut’s rocket engine. NASA has awarded the Ad Astra Rocket Company a $4 million, two-year contract for the continued development of the company’s Variable Specific Impulse Magnetoplasma Rocket (VASIMR) concept, Aviation Week & Space Technology reports. Ad Astra, founded by former NASA astronaut Franklin Chang-Diaz, claims the vehicle has the potential to reach Mars with human explorers within 45 days using a nuclear power source rather than solar power. The new contract will enable federal funding to support development of the engine’s radio frequency, superconducting magnet, and structural exoskeleton subsystems.

Slow going … Houston-based Ad Astra said in a press release that it sees the high-power plasma engine as “nearing flight readiness.” We’ve heard this before. The VASIMR engine has been in development for decades now, beset by a lack of stable funding and the technical hurdles inherent in designing and testing such demanding technology. For example, Ad Astra once planned a critical 100-hour, 100-kilowatt ground test of the VASIMR engine in 2018. The test still hasn’t happened. Engineers discovered a core component of the engine tended to overheat as power levels approached 100 kilowatts, forcing a redesign that set the program back by at least several years. Now, Ad Astra says it is ready to build and test a pair of 150-kilowatt engines, one of which is intended to fly in space at the end of the decade.

Gilmour eyes return to flight next year. Australian rocket and satellite startup Gilmour Space Technologies is looking to return to the launch pad next year after the first attempt at an orbital flight failed over the summer, Aviation Week & Space Technology reports. “We are well capitalized. We are going to be launching again next year,” Adam Gilmour, the company’s CEO, said October 3 at the International Astronautical Congress in Sydney.

What happened? … Gilmour didn’t provide many details about the cause of the launch failure in July, other than to say it appeared to be something the company didn’t test for ahead of the flight. The Eris rocket flew for 14 seconds, losing control and crashing a short distance from the launch pad in the Australian state of Queensland. If there’s any silver lining, Gilmour said the failure didn’t damage the launch pad, and the rocket’s use of a novel hybrid propulsion system limited the destructive power of the blast when it struck the ground.

Stoke Space’s impressive funding haul. Stoke Space announced a significant capital raise on Wednesday, a total of $510 million as part of Series D funding. The new financing doubles the total capital raised by Stoke Space, founded in 2020, to $990 million, Ars reports. The infusion of money will provide the company with “the runway to complete development” of the Nova rocket and demonstrate its capability through its first flights, said Andy Lapsa, the company’s co-founder and chief executive, in a news release characterizing the new funding.

A futuristic design … Stoke is working toward a 2026 launch of the medium-lift Nova rocket. The rocket’s innovative design is intended to be fully reusable from the payload fairing on down, with a regeneratively cooled heat shield on the vehicle’s second stage. In fully reusable mode, Nova will have a payload capacity of 3 metric tons to low-Earth orbit, and up to 7 tons in fully expendable mode. Stoke is building a launch pad for the Nova rocket at Cape Canaveral Space Force Station, Florida.

SpaceX took an unusual break from launching. SpaceX launched its first Falcon 9 rocket from Florida in 12 days during the predawn hours of Tuesday morning, Spaceflight Now reports. The launch gap was highlighted by a run of persistent, daily storms in Central Florida and over the Atlantic Ocean, including hurricanes that prevented deployment of SpaceX’s drone ships to support booster landings. The break ended with the launch of 28 more Starlink broadband satellites. SpaceX launched three Starlink missions in the interim from Vandenberg Space Force Base, California.

Weather still an issue … Weather conditions on Florida’s Space Coast are often volatile, particularly in the evenings during summer and early autumn. SpaceX’s next launch from Florida was supposed to take off Thursday evening, but officials pushed it back to no earlier than Saturday due to a poor weather forecast over the next two days. Weather still gets a vote in determining whether a rocket lifts off or doesn’t, despite SpaceX’s advancements in launch efficiency and the Space Force’s improved weather monitoring capabilities at Cape Canaveral.

ArianeGroup chief departs for train maker. Current ArianeGroup CEO Martin Sion has been named the new head of French train maker Alstom. He will officially take up the role in April 2026, European Spaceflight reports. Sion assumed the role as ArianeGroup’s chief executive in 2023, replacing the former CEO who left the company after delays in the debut of its main product: the Ariane 6 rocket. Sion’s appointment was announced by Alstom, but ArianeGroup has not made any official statement on the matter.

Under pressure … The change in ArianeGroup’s leadership comes as the company ramps up production and increases the launch cadence of the Ariane 6 rocket, which has now flown three times, with a fourth launch due next month. ArianeGroup’s subsidiary, Arianespace, seeks to increase the Ariane 6’s launch cadence to 10 missions per year by 2029. ArianeGroup and its suppliers will need to drastically improve factory throughput to reach this goal.

New Glenn emerges from factory. Blue Origin rolled the first stage of its massive New Glenn rocket from its hangar on Wednesday morning in Florida, kicking off the final phase of the campaign to launch the heavy-lift vehicle for the second time, Ars reports. In sharing video of the rollout to Launch Complex-36 on Wednesday online, the space company did not provide a launch target for the mission, which seeks to put two small Mars-bound payloads into orbit. The pair of identical spacecraft to study the solar wind at Mars is known as ESCAPADE. However, sources told Ars that on the current timeline, Blue Origin is targeting a launch window of November 9 to November 11. This assumes pre-launch activities, including a static-fire test of the first stage, go well.

Recovery or bust? Blue Origin has a lot riding on this booster, named “Never Tell Me The Odds,” which it will seek to recover and reuse. Despite the name of the booster, the company is quietly confident that it will successfully land the first stage on a drone ship named Jacklyn. Internally, engineers at Blue Origin believe there is about a 75 percent chance of success. The first booster malfunctioned before landing on the inaugural New Glenn test flight in January. Company officials are betting big on recovering the booster this time, with plans to reuse it early next year to launch Blue’s first lunar lander to the Moon.

SpaceX gets bulk of this year’s military launch orders. Around this time each year, the US Space Force convenes a Mission Assignment Board to dole out contracts to launch the nation’s most critical national security satellites. The military announced this year’s launch orders Friday, and SpaceX was the big winner, Ars reports. Space Systems Command, the unit responsible for awarding military launch contracts, selected SpaceX to launch five of the seven missions up for assignment this year. United Launch Alliance (ULA), a 50-50 joint venture between Boeing and Lockheed Martin, won contracts for the other two. These missions for the Space Force and the National Reconnaissance Office are still at least a couple of years away from flying.

Vulcan getting more expensive A closer examination of this year’s National Security Space Launch contracts reveals some interesting things. The Space Force is paying SpaceX $714 million for the five launches awarded Friday, for an average of roughly $143 million per mission. ULA will receive $428 million for two missions, or $214 million for each launch. That’s about 50 percent more expensive than SpaceX’s price per mission. This is in line with the prices the Space Force paid SpaceX and ULA for last year’s contracts. However, look back a little further and you’ll find ULA’s prices for military launches have, for some reason, increased significantly over the last few years. In late 2023, the Space Force awarded a $1.3 billion deal to ULA for a batch of 11 launches at an average cost per mission of $119 million. A few months earlier, Space Systems Command assigned six launches to ULA for $672 million, or $112 million per mission.

Starship Flight 11 nears launch. SpaceX rolled the Super Heavy booster for the next test flight of the company’s Starship mega-rocket out to the launch pad in Texas this week. The booster stage, with 33 methane-fueled engines, will power the Starship into the upper atmosphere during the first few minutes of flight. This booster is flight-proven, having previously launched and landed on a test flight in March.

Next steps With the Super Heavy booster installed on the pad, the next step for SpaceX will be the rollout of the Starship upper stage. That is expected to happen in the coming days. Ground crews will raise Starship atop the Super Heavy booster to fully stack the rocket to its total height of more than 400 feet (120 meters). If everything goes well, SpaceX is targeting liftoff of the 11th full-scale test flight of Starship and Super Heavy as soon as Monday evening. (submitted by EllPeaTea)

Blue Origin takes on a new line of business. Blue Origin won a US Space Force competition to build a new payload processing facility at Cape Canaveral Space Force Station, Florida, Spaceflight Now reports. Under the terms of the $78.2 million contract, Blue Origin will build a new facility capable of handling payloads for up to 16 missions per year. The Space Force expects to use about half of that capacity, with the rest available to NASA or Blue Origin’s commercial customers. This contract award follows a $77.5 million agreement the Space Force signed with Astrotech earlier this year to expand the footprint of its payload processing facility at Vandenberg Space Force Base, California.

Important stuff … Ground infrastructure often doesn’t get the same level of attention as rockets, but the Space Force has identified bottlenecks in payload processing as potential constraints on ramping up launch cadences at the government’s spaceports in Florida and California. Currently, there are only a handful of payload processing facilities in the Cape Canaveral area, and most of them are only open to a single user, such as SpaceX, Amazon, the National Reconnaissance Office, or NASA. So, what exactly is payload processing? The Space Force said Blue Origin’s new facility will include space for “several pre-launch preparatory activities” that include charging batteries, fueling satellites, loading other gaseous and fluid commodities, and encapsulation. To accomplish those tasks, Blue Origin will create “a clean, secure, specialized high-bay facility capable of handling flight hardware, toxic fuels, and explosive materials.”

Next three launches

Oct. 11: Gravity 1 | Unknown Payload | Haiyang Spaceport, China Coastal Waters | 02: 15 UTC

Oct. 12: Falcon 9 | Project Kuiper KF-03 | Cape Canaveral Space Force Station, Florida | 00: 41 UTC

Oct. 13: Starship/Super Heavy | Flight 11 | Starbase, Texas | 23: 15 UTC

Photo of Stephen Clark

Stephen Clark is a space reporter at Ars Technica, covering private space companies and the world’s space agencies. Stephen writes about the nexus of technology, science, policy, and business on and off the planet.

Rocket Report: Bezos’ firm will package satellites for launch; Starship on deck Read More »

ai-models-can-acquire-backdoors-from-surprisingly-few-malicious-documents

AI models can acquire backdoors from surprisingly few malicious documents

Fine-tuning experiments with 100,000 clean samples versus 1,000 clean samples showed similar attack success rates when the number of malicious examples stayed constant. For GPT-3.5-turbo, between 50 and 90 malicious samples achieved over 80 percent attack success across dataset sizes spanning two orders of magnitude.

Limitations

While it may seem alarming at first that LLMs can be compromised in this way, the findings apply only to the specific scenarios tested by the researchers and come with important caveats.

“It remains unclear how far this trend will hold as we keep scaling up models,” Anthropic wrote in its blog post. “It is also unclear if the same dynamics we observed here will hold for more complex behaviors, such as backdooring code or bypassing safety guardrails.”

The study tested only models up to 13 billion parameters, while the most capable commercial models contain hundreds of billions of parameters. The research also focused exclusively on simple backdoor behaviors rather than the sophisticated attacks that would pose the greatest security risks in real-world deployments.

Also, the backdoors can be largely fixed by the safety training companies already do. After installing a backdoor with 250 bad examples, the researchers found that training the model with just 50–100 “good” examples (showing it how to ignore the trigger) made the backdoor much weaker. With 2,000 good examples, the backdoor basically disappeared. Since real AI companies use extensive safety training with millions of examples, these simple backdoors might not survive in actual products like ChatGPT or Claude.

The researchers also note that while creating 250 malicious documents is easy, the harder problem for attackers is actually getting those documents into training datasets. Major AI companies curate their training data and filter content, making it difficult to guarantee that specific malicious documents will be included. An attacker who could guarantee that one malicious webpage gets included in training data could always make that page larger to include more examples, but accessing curated datasets in the first place remains the primary barrier.

Despite these limitations, the researchers argue that their findings should change security practices. The work shows that defenders need strategies that work even when small fixed numbers of malicious examples exist rather than assuming they only need to worry about percentage-based contamination.

“Our results suggest that injecting backdoors through data poisoning may be easier for large models than previously believed as the number of poisons required does not scale up with model size,” the researchers wrote, “highlighting the need for more research on defences to mitigate this risk in future models.”

AI models can acquire backdoors from surprisingly few malicious documents Read More »

a-knight-of-the-seven-kingdoms-teaser-debuts-at-nycc

A Knight of the Seven Kingdoms teaser debuts at NYCC

A squire and his hedge knight: Dexter Sol Ansell plays

A squire and his hedge knight: Dexter Sol Ansell plays “Egg” (l) and Peter Claffey plays Dunk (r). Credit: YouTube/HBO

This being a Game of Thrones series, there’s also an extensive supporting cast. Ross Anderson plays Ser Humfrey Hardyng; Edward Ashley plays Ser Steffon Fossoway; Henry Ashton as Egg’s older brother, Prince Daeron “The Drunken” Targaryen; Youssef Kerkour as a blacksmith named Steely Pate; Daniel Monks as Ser Manfred Dondarrion; Shaun Thomas as Raymun Fossoway; Tom Vaughan-Lawlor as Plummer, a steward; Steve Wall as Lord Leo “Longthorn” Tyrell, Lord of Highgarden; and Danny Webb as Dunk’s mentor, Ser Arlan of Pennytree.

It’s a good rule of thumb in the Game of Thrones universe not to get too attached to any of the characters, and that probably holds true here, too. But Knight of the Seven Kingdoms also seems to be aiming for a different, lighter tone than its predecessors, judging by the teaser, which has its share of humor. Martin has said as much on his blog, although he added, “It’s still Westeros, so no one is truly safe.”

Since Dunk is a humble hedge knight, there are lots of scenes with him trudging through mud and rain, and jousting will apparently feature much more prominently. “I always love Medieval tournaments in other pictures,” Martin said during a NYCC panel. “We had several tournaments in Game of Thrones, they were in the background, but not the center. I wanted to do something set during a tournament. I sent (the TV writers) a challenge: Let’s do the best jousting sequences that were ever done on film. My favorite was 1952’s Ivanhoe.

A Knight of the Seven Kingdoms debuts on HBO on January 18, 2026.

A Knight of the Seven Kingdoms teaser debuts at NYCC Read More »

tesla-fsd-gets-worse-at-driving,-nhtsa-opens-new-investigation

Tesla FSD gets worse at driving, NHTSA opens new investigation

At least six crashes have been reported to the agency under its standing general order, which requires an automaker to inform the regulator of any crash involving a partially automated driving system like FSD (or an autonomous driving system like Waymo’s). And of those six crashes, four resulted in injuries.

The second scenario involves Teslas operating under FSD crossing into oncoming traffic, driving straight in a turning lane, or making a turn from the wrong lane. There have been at least 24 complaints about this behavior, as well as another six reports under the standing general order, and NHTSA also cites articles published by Motor Trend and Forbes that detail such behavior during test drives.

Perhaps this should not be surprising. Last year, we reported on a study conducted by AMCI Testing that revealed both aberrant driving behaviors—ignoring a red light and crossing into oncoming traffic—in 1,000 miles (1,600 km) of testing that required more than 75 human interventions. The rest of the time, the system was capable of quite sophisticated behavior; “its seeming infallibility in anyone’s first five minutes of FSD operation breeds a sense of awe that unavoidably leads to dangerous complacency,” said AMCI Testing’s director, Guy Mangiamele.

Tesla FSD gets worse at driving, NHTSA opens new investigation Read More »