Author name: Paul Patrick


The AI apocalypse is nigh in Good Luck, Have Fun, Don’t Die


Director Gore Verbinski and screenwriter Matthew Robinson on the making of this darkly satirical sci-fi film.

Credit: Briarcliff Entertainment

We haven’t had a new film from Gore Verbinski for nine years. But the director who brought us the first three Pirates of the Caribbean movies, the nightmare-inducing horror of The Ring (2002), and the Oscar-winning hijinks of Rango (2011) is back in peak form with Good Luck, Have Fun, Don’t Die. It’s a darkly satirical, inventive, and hugely entertaining time-loop adventure that also serves as a cautionary tale about our widespread online technology addiction.

(Some spoilers below but no major reveals.)

Sam Rockwell stars as an otherwise unnamed man who shows up at a Norms diner in Los Angeles looking like a homeless person but claiming to be a time traveler from an apocalyptic future. He’s there to recruit the locals into his war against a rogue AI, although the diner patrons are understandably dubious about his sanity. (“I come from a nightmare apocalypse,” he assures the crowd about his grubby appearance. “This is the height of f*@ing fashion!”)

The fact that he knows everything about the people in the diner is more convincing. It’s his 117th attempt to find the perfect combination of people to join him on his quest. As for what happened to his team on all the previous attempts, “I really don’t like to say it out loud. It’s kind of a morale killer.”

This time, Future Man picks married schoolteachers Mark (Michael Peña) and Janet (Zazie Beetz), who have just escaped a zombie horde of smartphone-addicted students; Marie (Georgia Goodman), who just wanted a piece of pie; Susan (Juno Temple), a grieving mother; Ingrid (Haley Lu Richardson), who is literally allergic to Wi-Fi; Scott (Asim Chaudhry); and Bob (Daniel Barnett), a scout leader. Their mission: to locate a 9-year-old boy who is about to create a sentient AI that will take over the world and usher in the aforementioned nightmare apocalypse. Things start to go haywire pretty quickly. And then things start to get weird.

“Everything I write, I put up to what I call The Twilight Zone test—would this make a good Twilight Zone episode?” screenwriter Matthew Robinson (The Invention of Lying, Love and Monsters) told Ars. “Because that’s my favorite piece of media that’s ever existed.” Good Luck, Have Fun, Don’t Die (GLHFDD) is an amalgam of various such ideas. Mark and Janet’s storyline, for instance, was originally Robinson’s idea for a pilot that he described as “a reverse Breakfast Club, where the teachers are the rebels and the children are the conformists.”

“I had all these little pieces that fell under the theme of technology and tech addiction,” said Robinson. Then one night, he was sitting in the Norms diner on La Cienega in LA, where he often liked to write. “I remember looking around and seeing a sea of faces lit by cell phones, and I thought, ‘What would it possibly take for someone to wake us up out of this tech sleep that we all find ourselves in?’ And then the image of a homeless guy strapped with bombs came into my head.”

Those earlier story ideas became the backstories of the central characters. Per Robinson, GLHFDD is essentially a cleverly camouflaged anthology story, normally a format that is “the kiss of death” for a project in Hollywood, although there are rare exceptions—most notably Quentin Tarantino’s Pulp Fiction. He thinks of the film as a sci-fi Canterbury Tales in which each character is a pilgrim on a journey whose story is told via flashbacks. “The cohesion came from the fact that all the stories are informed by a general frustration with tech addiction and the pervasive way that technology has invaded our brains and our personal lives and our relationships,” said Robinson.

A twisted time loop

GLHFDD is also a time loop movie in the fine tradition of Groundhog Day, with Robinson citing such films as 12 Monkeys and Edge of Tomorrow as inspirations. He didn’t overthink his time travel rules. “We can reset the timeline,” said Robinson. “[The man from the future] can’t go forward. He literally can’t move in any other direction. He has an anchor point that he can return to any time he hits a button, and that’s as far as the technology went.”

The plot device might be simple, but the ramifications quickly become complex. “I think in his draft, Matthew intended to lift his leg on the time travel movie, to poke a little fun at it,” Verbinski told Ars. “But also, I feel like you can’t go back 117 times without picking up some cosmic lint, particularly if your antagonist is right there with you. You had 14 attempts to make it out of the house and learned there is a secret passage, but then the entity you’re gaming against is going to throw another curveball. If you’re going to go back in time, I just like the idea that there are consequences. They might be really small, but you’re going to miss one.” That element is key to the teetering-on-the-edge-of-sanity paranoia of Rockwell’s time traveler.

Robinson very much wanted the film “to wear its genre-ness on its sleeve,” he said. “As much as I love a Marvel movie, they’ve sort of homogenized parallel universes and time travel, and it’s all so rote now. It used to feel special and weird and complicated and would always have some wild themes and ideas that felt challenging. If anything this was just trying to get back to that era of ’80s and ’90s genre movies that were allowed to get weird.”

Verbinski voiced similar sentiments, citing 1984’s Repo Man as an influence. “So many movies have to be an Egg McMuffin, and who doesn’t like an Egg McMuffin after a hangover?” he said. “They’re satisfying. But you’re not going to necessarily talk about those three days later. You’re not going to be haunted by those. I’m just happy we got to will [GLHFDD] into existence because it’s a type of movie you can’t make now. Sam’s outfit is kind of a metaphor for the movie. We went to a little electronic store and we bought all these pieces, and we laid them out on a table and we glued them together, and we just made it like a Halloween costume. The whole movie was sort of made that way. It had to be; it wouldn’t model out any other way.”

Reality unravels

As for what drew him to Robinson’s script, “I think we’re in this kind of global ennui or some grand sense of identity theft or loss of purpose,” said Verbinski. “It’s a great time for art, but it’s art against a profound sense of disillusionment.” The director developed two quite distinct visual styles to accentuate the film’s narrative progression.

“Fundamentally, it was important that the film start in the real world, in Norms diner, in a high school, at a [children’s] birthday party, and then slowly twist the taffy a bit as we get closer to the [AI] antagonist,” said Verbinski. “As these anomalies occur, the film is evolving into a second visual style. The first style is [akin to] directors like Hal Ashby or Sidney Lumet, where the performance is more important than the composition or the shot construction. As you get further into it, the actual language of shots becomes more critical to the narrative.”

That ultimately translates into some big, boldly creative swings in the film’s wild third act, and to his credit, Verbinski never blinks. Robinson cites the animated film Akira as a major inspiration for that element. “Akira has maybe my favorite third act of all time, where everything just falls apart and then comes together in this beautiful way,” he said. “Gore and I wanted [the audience] to feel like reality was unraveling, because it literally is for these characters. The AI himself is very much an homage to Akira.”

“I think that it’s inherited our worst attributes,” said Verbinski of the film’s AI antagonist. “It’s much, much worse than wanting to kill humans. It wants us to like it. It demands that we like it. I think part of that has to do with being tasked in its formative years to keep us engaged. A lot of people talk about, what is AI doing to us? But there’s not a lot of conversations about what we’re doing to it. This entity being born, it’s being tied and bound and manipulated and told, ‘Let’s look at the humans and what do they want, what do they need? What do they respond to most? What do they hate?’ All those things are going to be hardwired into its source code. It’s going to have mommy issues, we’re going to have to put it on a couch.”

Perhaps not surprisingly, given the film’s themes, Robinson has largely unplugged from most social media, although he still indulges his YouTube addiction, which he jokingly describes as “channel surfing on crack.” But ideally he would like to free himself—and the rest of humanity—from the seductions of Very Online culture entirely. “My goal would be to make teenagers think their phones aren’t cool,” he said. “I would love it if all 13-year-olds went, ‘Eww, I don’t want this, this is my parents’ thing that they track me with.’ I want them all to throw it in the trash. That would be the dream.”


Jennifer is a senior writer at Ars Technica with a particular focus on where science meets culture, covering everything from physics and related interdisciplinary topics to her favorite films and TV series. Jennifer lives in Baltimore with her spouse, physicist Sean M. Carroll, and their two cats, Ariel and Caliban.



Perplexity announces “Computer,” an AI agent that assigns work to other AI agents

Given the right permissions and with the proper plugins, OpenClaw could create, modify, or delete the user’s files and otherwise change things far beyond what most users could achieve with existing models and MCP (Model Context Protocol). Users would use files like USER.MD, MEMORY.MD, SOUL.MD, or HEARTBEAT.MD to give the tool context about its goals and how to work toward them independently, sometimes running for long stretches without direct user input.
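The article doesn’t show what goes inside those files, but the names suggest plain Markdown documents the agent reads for standing context. A hypothetical SOUL.MD might look something like this; every detail below is illustrative, not taken from OpenClaw’s documentation:

```markdown
# SOUL.MD: standing instructions for this agent (hypothetical example)

## Goals
- Keep the inbox triaged; draft (but never send) replies to anything flagged urgent.
- Each morning, summarize new feed items into notes/daily-digest.md.

## Hard limits
- Never delete or overwrite files without asking first.
- Ask before installing any new plugin.
```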

On one hand, that meant it could do impressive things—the first glimpses of the sort of knowledge work that AI boosters have been saying agentic AI would ultimately do. On the other hand, it was prone to serious errors and vulnerable to prompt injection and other security problems, in part due to a Wild West of unverified plugins.

The same toolkit that was used to create a viral Reddit clone populated by AI agents was also, at least in one case, responsible for deleting a user’s emails against her will.

Stay in your lane

Perplexity Computer aims to address those concerns in a few ways. First, its core process occurs in the cloud, not on the user’s local machine. Second, it lives within a walled garden with a curated list of integrations, in contrast to OpenClaw’s unregulated frontier.

This is, of course, an imperfect analogy, but you could say that if OpenClaw were the open web of AI agent tools, then Computer is Apple’s App Store. While you’re more limited in what you can do, you’re not trusting packages from unverified sources with access to your system.

There could still be risks, though. For one thing, LLMs make mistakes, and those could be consequential if Computer is working with data you don’t have backed up elsewhere or if you’re not verifying the outputs, for example.

Perplexity Computer aims to button up, refine, and contain the wild power of the viral OpenClaw agentic AI tool—competing with the likes of Claude Cowork—by selecting the models best suited to each subtask.

Perplexity surely won’t be the last established AI player to try to do this sort of thing. After all, OpenAI hired OpenClaw’s developer, with CEO Sam Altman suggesting that some of what we saw in OpenClaw will be essential to the company’s product vision moving forward.



Google reveals Nano Banana 2 AI image model, coming to Gemini today

With Nano Banana 2, Google promises consistency for up to five characters at a time, along with accurate rendering of as many as 14 different objects per workflow. This, along with richer textures and “vibrant” lighting, will aid in visual storytelling with Nano Banana 2. Google is also expanding the range of available aspect ratios and resolutions, from 512px square up to 4K widescreen.

So what can you do with Nano Banana 2? Google has provided some example images with associated prompts. These are, of course, handpicked images, but Nano Banana has been a popular image model for good reason. This degree of improvement seems believable based on past iterations of Nano Banana.

Google AI infographic

Prompt: High-quality flat lay photography creating a DIY infographic that simply explains how the water cycle works, arranged on a clean, light gray textured background. The visual story flows from left to right in clear steps. Simple, clean black arrows are hand-drawn onto the background to guide the viewer’s eye. The overall mood is educational, modern, and easy to understand. The image is shot from a top-down, bird’s-eye view with soft, even lighting that minimizes shadows and keeps the focus on the process.

Credit: Google


AI museum comparison

Prompt: Create an image of Museum Clos Lucé. In the style of bright colored Synthetic Cubism. No text. Your plan is to first search for visual references, and generate after. Aspect ratio 16:9.

Credit: Google


AI farm image

Prompt: Create an image of these 14 characters and items having fun at the farm. The overall atmosphere is fun, silly and joyful. It is strictly important to keep identity consistent of all the 14 characters and items.

Credit: Google


Google must be pretty confident in this model’s capabilities because it will be the only one available going forward. Starting now, Nano Banana 2 will replace both the standard and Pro variants of Nano Banana across the Gemini app, search, AI Studio, Vertex AI, and Flow.

In the Gemini app and on the website, Nano Banana 2 will be the image generator for the Fast, Thinking, and Pro settings. It’s possible there will eventually be a Nano Banana 2 Pro—Google tends to release elements of new model families one at a time. For now, it’s all “Flash” Image.



New York sues Valve for enabling “illegal gambling” with loot boxes

Opening a valuable skin like this in a loot box is akin to winning a lottery, New York alleges in a new lawsuit.

Credit: Twitter / Luksusbums

The lawsuit also takes Valve to task for allowing third-party sites that facilitate the resale of in-game skins for cash. While the suit notes that Valve has “sporadically enforced” rules against so-called skin gambling sites—which use Steam inventories as virtual chips for gambling games—it alleges that Valve “has not acted against sites that permit the sale of Valve’s virtual items.” The suit cites “internal communications” from numerous Valve employees suggesting that the company was OK with such “cash-out services” for Steam items as long as off-platform gambling wasn’t explicitly involved.

We’ll see you in court

In a press release announcing the suit, state Attorney General Letitia James said the gambling Valve’s system enables can “lead to serious addiction problems, especially for our young people. … These features are addictive, harmful, and illegal, and my office is suing to stop Valve’s illegal conduct and protect New Yorkers.”

In 2016, Valve faced a pair of civil lawsuits from parents concerned about Valve’s connection to skin gambling sites—those suits were eventually dismissed. Around the same time, Valve received a letter from Washington state threatening “civil or criminal action” if Valve didn’t crack down on skin gambling, but the state stopped short of filing a lawsuit in that matter.

In addition to asking Valve to modify or eliminate its loot box system, the New York suit asks for Valve to make “full restitution to consumers,” for the disgorgement of “all monies” received from its gambling system, and for fines of “three times the amount of its gain.” Ars Technica has reached out to Valve for comment.



Musk has no proof OpenAI stole xAI trade secrets, judge rules, tossing lawsuit


Hostility is not proof of theft

Even twisting an ex-employee’s text to favor xAI’s reading fails to sway judge.

Elon Musk appears to be grasping at straws in a lawsuit accusing OpenAI of poaching eight xAI employees in an allegedly unlawful bid to access xAI trade secrets connected to its data centers and chatbot, Grok.

In a Tuesday order granting OpenAI’s motion to dismiss, US District Judge Rita F. Lin said that xAI failed to provide evidence of any misconduct from OpenAI.

Instead, xAI seemed fixated on a range of alleged conduct of former employees. But in assessing xAI’s claims, Lin said that xAI failed to show proof that OpenAI induced any of these employees to steal trade secrets “or that these former xAI employees used any stolen trade secrets once employed by OpenAI.”

Two employees admitted to taking confidential information: both downloaded xAI’s source code, and one improperly grabbed a supposedly sensitive recording of a Musk “All Hands” meeting. The rest were accused of retaining seemingly less consequential data, like work chats on their devices, or didn’t seem to hold any confidential information at all. Lin called out particularly weak arguments, noting that xAI’s own complaint acknowledged that one employee OpenAI poached never received the access to confidential information he allegedly sought after exiting xAI, and that two employees who “simply left xAI for OpenAI” were lumped into the complaint.

From the limited evidence, Lin concluded that “while xAI may state misappropriation claims against a couple of its former employees, it does not state a plausible misappropriation claim against OpenAI.”

Lin’s order will likely not be the end of the litigation, as she is allowing xAI to amend its complaint to address the current deficiencies.

Ars could not immediately reach xAI for comment, so it’s unclear what steps xAI may take next.

However, xAI seems unlikely to give up the fight, which OpenAI has alleged is part of a “harassment campaign” that Musk is waging through multiple lawsuits attacking his biggest competitor’s business practices.

Unsurprisingly, OpenAI celebrated the order on X, alleging that “this baseless lawsuit was never anything more than yet another front in Mr. Musk’s ongoing campaign of harassment.”

Other tech companies poaching talent for AI projects will likely be relieved while reading Lin’s order. Commercial litigator Sarah Tishler told Ars that the order “boils down to a fundamental concept in trade secret law: hiring from a competitor is not the same as stealing trade secrets from one.”

“Under the Defend Trade Secrets Act, xAI has to show that OpenAI actually received and used the alleged trade secrets, not just that it hired employees who may have taken them,” Tishler said. “Suspicious timing, aggressive recruiting, and even downloaded files are not enough on their own.”

Tishler suggested that the ruling will likely be welcomed by AI firms eager to secure the best talent without incurring legal risks from their hiring practices.

“In the AI industry, where talent moves fast and the competitive stakes are enormous, this ruling reaffirms that suspicion is not enough,” Tishler said. “You have to show the stolen information actually made it into the competitor’s hands and was put to use.”

OpenAI not liable for engineers swiping source code

Through the lawsuit, Musk has alleged that OpenAI is violating California’s unfair competition law. He claims that OpenAI is attempting “to destroy legitimate competition in the AI industry by neutralizing xAI’s innovations” and forcing xAI “to unfairly compete against its own trade secrets.”

But this claim hinges entirely upon xAI proving that OpenAI poached its employees to steal its trade secrets. So, for xAI’s lawsuit to proceed, xAI will need to beef up the evidence base for its other claim, that OpenAI has violated the federal Defend Trade Secrets Act, Lin said. To succeed on that, xAI must prove that OpenAI unlawfully acquired, disclosed, or used a trade secret without xAI’s consent.

That will likely be challenging because xAI, at this point, has not offered “any nonconclusory allegations that OpenAI itself acquired, disclosed, or used xAI’s trade secrets,” Lin wrote.

All xAI has claimed is that OpenAI induced former employees to share secrets, and so far, nothing backs that claim, Lin said. Tishler noted that the court also rejected an xAI theory that “OpenAI should be responsible for what its new hires did before they arrived” for “the same reason: without evidence that OpenAI directed the theft or actually put the stolen information to use, you cannot hold the company liable.”

The strongest evidence that xAI had of employee misconduct, allegedly allowing OpenAI to misappropriate xAI trade secrets, revolves around the departure of one of xAI’s earliest engineers, Xuechen Li.

That evidence wasn’t enough, Lin said. xAI alleged that Li gave a presentation to OpenAI that supposedly included confidential information. Li also uploaded “the entire xAI source code base to a personal cloud account,” which he had connected to ChatGPT, Lin noted, after a recruiter sent Li a Signal message with a link to another, unrelated cloud storage location.

xAI hoped the Signal messages would shock the court, expecting it to read between the lines the way xAI did. As proof that OpenAI allegedly got access to xAI’s source code, xAI pointed to a Signal message that an OpenAI recruiter sent to Li “four hours after” Li downloaded the source code, saying “nw!” xAI has alleged this message is shorthand for “no way!”—suggesting the OpenAI recruiter was geeked to get access to xAI’s source code. But in a footnote, Lin said that “OpenAI insists that ‘nw’ means ‘no worries,’” and thus is unconnected to Li’s decision to upload the source code to a ChatGPT-linked cloud account.

Even interpreting the text using xAI’s reading, however, xAI did not show enough to prove the recruiter or OpenAI accessed or requested the files, Lin said.

It also didn’t help xAI’s case that a temporary injunction that xAI secured in a separate lawsuit targeting the engineer blocked Li from accepting a job at OpenAI.

That injunction led OpenAI to withdraw its job offer to Li. And that’s a problem for xAI: because Li never worked at OpenAI, he plainly never used xAI’s trade secrets there.

Further weakening xAI’s arguments, if Li indeed shared confidential information during his presentation while interviewing for OpenAI, xAI has alleged no facts suggesting that OpenAI was aware Li was sharing xAI trade secrets, Lin wrote.

This “makes it very hard to argue OpenAI ever used anything he allegedly took,” Tishler told Ars.

Another former xAI engineer, Jimmy Fraiture, was accused of copying xAI trade secrets, but Fraiture has said he deleted the information he improperly downloaded before starting his job at OpenAI. Importantly, Lin said, since he joined OpenAI, there’s no evidence that he used xAI trade secrets to benefit xAI’s rival.

“Other than the bare fact that Fraiture had been recruited” by the same OpenAI employee “who had also recruited Li, xAI does not allege any facts indicating that OpenAI had encouraged Fraiture to take xAI’s confidential information in the first place,” Lin wrote.

Since “none of the other former employees allegedly shared with or disclosed to OpenAI any xAI trade secrets,” xAI could not advance its claim that OpenAI misappropriated trade secrets based only on allegations tied to Li and Fraiture’s supposed misconduct, Lin said.

xAI may be able to amend its complaint to maintain these arguments, but the company has thus far presented scant, purely circumstantial evidence.

It’s possible that xAI will secure more evidence to support its misappropriation claims against OpenAI in its ongoing lawsuit against Li. Ars could not immediately reach Li’s lawyer to find out if today’s ruling may impact that case.

Ex-executive’s “hostility” is not proof of theft

Among the least convincing arguments that xAI raised was a claim that an unnamed finance executive left xAI to take a “lesser role” at OpenAI after learning everything he knew about data centers from xAI.

That executive slighted xAI when Musk’s company later attempted to inquire about “confidentiality concerns.”

“Suck my dick,” the former xAI executive allegedly said, refusing to explain how his OpenAI work might overlap with his xAI position. “Leave me the fuck alone.”

xAI tried to argue that the executive’s hostility was proof of misconduct. But Lin wrote that xAI only alleged that the executive “merely possessed xAI trade secrets about data centers” and did not allege that he ever used trade secrets to benefit OpenAI.

Had xAI found evidence that OpenAI’s data center strategy suddenly mirrored xAI’s after the executive joined xAI’s rival, that may have helped xAI’s case. But there are plenty of reasons a former employee might reject an ex-employer’s outreach following an exit, Lin suggested.

“His hostility when xAI reached out about its confidentiality concerns also does not support a plausible inference of use,” Lin wrote. “Hostility toward one’s former employer during departure does not, without more, indicate use of trade secrets in a subsequent job. Nor does the executive’s lack of experience with AI data centers before his time at xAI, without more, support a plausible inference that he used xAI’s trade secrets at OpenAI.”

xAI has until March 17 to amend its complaint to keep up this particular fight against OpenAI. But the company won’t be able to add any new claims or parties, Lin noted, “or otherwise change the allegations except to correct the identified deficiencies.”

Criminal probe likely leaves OpenAI on pins and needles

For Li, the engineer accused of disclosing xAI trade secrets to OpenAI, the litigation could eliminate one front of discovery as he navigates two other legal fights over xAI’s trade secrets claims.

Tishler has been closely monitoring xAI’s trade secret legal battles. In October, she noted that Li is in a particularly prickly position, facing pressure in civil litigation from Musk to turn over data that could be used against him in the Federal Bureau of Investigation’s criminal investigation into Musk’s allegations. As Tishler explained:

“The practical reality is stark: Li faces a choice between protecting himself in the criminal action with his silence, and the civil consequences of doing so. Refuse to answer, and xAI could argue adverse inferences; answer, and the responses could feed the criminal case.”

Ultimately, the FBI is trying to prove that Li stole information that qualified as a trade secret and intended to use it for OpenAI’s benefit, while knowing that it would harm xAI. If it succeeds, “xAI would suddenly have a government-backed record that its trade secrets were stolen,” Tishler wrote.

If xAI were so armed and able to keep the OpenAI lawsuit alive, the central question in the lawsuit that Lin dismissed today would shift, Tishler suggested, from “was there a theft?” to “what did OpenAI know, and when did it know it?”


Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.



Boozy chimps fail urine test, confirm hotly debated theory

The urine of chimpanzees contains high levels of an alcohol byproduct, most likely because the chimps regularly gorge themselves on fermented fruit, according to a new paper published in the journal Biology Letters. It’s the latest evidence in support of a hotly debated theory regarding the evolutionary origins of the human fondness for alcohol.

As previously reported, in 2014, University of California, Berkeley (UCB) biologist Robert Dudley wrote a book called The Drunken Monkey: Why We Drink and Abuse Alcohol. His controversial “drunken monkey hypothesis” proposed that the human attraction to alcohol goes back about 18 million years, to the origin of the great apes, and that an attraction to the scent of ethanol evolved because it helped our primate ancestors identify the presence of ripe, fermenting fruit from a distance. At the time, skeptical scientists insisted that this was unlikely because chimpanzees and other primates just don’t eat fermented fruit or nectar.

But reports of primates doing just that have grown over the ensuing decade. Earlier this year, we reported that researchers had caught wild chimpanzees on camera apparently sharing fermented African breadfruit with measurable alcoholic content. That observational data was the first evidence of the sharing of alcoholic foods among nonhuman great apes in the wild. The authors measured the alcohol content of the fruit with a handy portable breathalyzer and found that almost all of the fallen fruit (90 percent) contained some ethanol, with the ripest containing the highest levels—the equivalent of 0.61 percent ABV (alcohol by volume).

And last September, Dudley co-authored a paper reporting the first measurements of the ethanol content of fruits favored by chimps in the Ivory Coast and Uganda, finding that chimps consume 14 grams of alcohol per day, the equivalent of a standard alcoholic drink in the US. After adjusting for the chimps’ lower body mass, the authors concluded the chimps are consuming nearly two drinks per day.
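The body-mass scaling behind that "nearly two drinks" figure is simple arithmetic. Here is a quick sketch in Python, where the roughly 70 kg human and 40 kg adult chimp masses are illustrative assumptions rather than figures from the paper:

```python
# Scale a chimp's daily ethanol intake to a human-equivalent count of US standard drinks.
US_STANDARD_DRINK_G = 14.0  # grams of ethanol in one US standard drink
CHIMP_INTAKE_G = 14.0       # daily ethanol intake reported for the chimps
HUMAN_MASS_KG = 70.0        # assumed reference human body mass (illustrative)
CHIMP_MASS_KG = 40.0        # assumed adult chimp body mass (illustrative)

# Hold the per-kilogram dose constant and scale it up to a human-sized body.
human_equivalent_g = CHIMP_INTAKE_G * (HUMAN_MASS_KG / CHIMP_MASS_KG)
drinks_per_day = human_equivalent_g / US_STANDARD_DRINK_G
print(round(drinks_per_day, 2))  # 1.75, i.e., "nearly two drinks per day"
```

With these assumed masses the scaled dose comes out to 24.5 g of ethanol, which lines up with the paper's "nearly two drinks" characterization.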

A thankless task

The next step was to sample the chimps’ urine to see if it contained any alcohol metabolites, as was found in a 2022 study on spider monkeys. This would further refine estimates for how much ethanol-laden fruit the chimps eat every day. That thankless task fell to Aleksey Maro, a UCB graduate student who spent last summer in Ngogo, stationed beneath the chimps’ sleeping trees—protected from the constant streams by an umbrella—to collect urine samples. Sharifah Namaganda, a Ugandan graduate student at the University of Michigan, showed him how to make shallow bowls out of plastic bags hung on forked twigs for more efficient collection. He also collected samples from puddles of urine on the forest floor.

Boozy chimps fail urine test, confirm hotly debated theory


50 mpg in a Nissan crossover? Testing the new E-Power hybrid system.

I noticed the engine running just twice. Once was at wide-open throttle, and the other was when the engine was likely operating at higher rpm to help charge the battery. That latter instance was also when I noticed the most harshness from the engine, although it’s one of the smoothest gasoline-supported powertrains I’ve driven.

A look under the Qashqai’s hood.

Credit: Chad Kirchner


The E-Power system will operate in full-EV mode at the press of a button, but at full throttle, the engine will still kick in.

What needs work?

Since an electric motor powers the wheels, I would prefer the system to be more responsive when you put your foot down. Electric motors respond nearly instantly; in a gas car, there’s usually a delay for a downshift and engine spin-up. This E-Power Qashqai behaves more like a gas car than an EV, even in the sport setting. This powertrain is a great opportunity to show new customers what electrification can do, and a little more snappiness would go a long way toward showing that E-Power can be sporty if the driver wants it to be.

The Qashqai had no problems getting up to highway speeds, and acceleration at higher speeds—in an overtake situation, for example—remained consistent. Again, it’s not a sports car or rocket ship, but it can get out of its own way easily enough.

During my loop, the computer indicated 47.7 mpg (4.93 L/100 km) in mixed driving. These were left-hand-drive cars, so those are US gallons, not British imperial gallons. That’s a pretty great fuel-efficiency number. In warmer conditions, it should easily exceed 50 mpg (4.7 L/100 km) in many driving scenarios.
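For reference, the mpg figures above convert to L/100 km like this, a minimal sketch using the standard US-gallon and mile definitions:

```python
# Convert US miles-per-gallon to liters per 100 kilometers.
L_PER_US_GALLON = 3.785411784   # definition of the US gallon in liters
KM_PER_MILE = 1.609344          # definition of the mile in kilometers

def mpg_to_l_per_100km(mpg: float) -> float:
    """Fuel used over 100 km at the given US mpg figure."""
    return 100.0 * L_PER_US_GALLON / (mpg * KM_PER_MILE)

print(round(mpg_to_l_per_100km(47.7), 2))  # → 4.93, as indicated on the loop
print(round(mpg_to_l_per_100km(50.0), 2))  # → 4.7, the warmer-weather target
```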

Is that directly translatable to the upcoming Rogue E-Power? Somewhat. While the powertrain will be the same, the Rogue will be a little larger and heavier. Speccing all-wheel drive will further increase weight and add losses to the drivetrain. So a 50 mpg Rogue might be a stretch.

If Nissan prices the Rogue E-Power well, and the car delivers on the increase in economy that I’ve seen here, it could be a very compelling product in Nissan’s showrooms for buyers who haven’t had a great hybrid offering from the company before.

As long as Nissan sorts out the brake calibration.



Lamborghini cancels electric Lanzador as supercar buyers reject EVs

A Lamborghini Lanzador electric concept during The Quail, A Motorsports Gathering in Carmel, California, US, on Friday, Aug. 18, 2023. The event provides an exclusive experience for motorsports enthusiasts and collectors from around the world to enjoy rare collections of fine automobiles and motorcycles. Photographer: David Paul Morris/Bloomberg via Getty Images

Lamborghini has managed to sell quite a lot of Urus SUVs, but an all-electric alternative with an even higher price tag was probably a stretch.

Credit: David Paul Morris/Bloomberg via Getty Images


Dropping the Lanzador EV doesn’t free Lamborghini from meeting decarbonization requirements. The US might have torn up its emissions regulations, but Lamborghini’s US sales were down almost 10 percent last year, and Europe is a more important market for the brand. The European Union still wants to see 90 percent of all new cars be zero-emission by 2035.

As a small manufacturer, Lamborghini will get a little more leeway than Audi or Porsche might, but if it wants to keep selling cars to rich Europeans, it still needs to electrify to some degree, particularly if those Europeans want to drive their cars in cities with zero-emissions zones. Lamborghini drivers tend to drive in those areas often—it’s where the people can see you drive past, after all.

So the plan is to produce more plug-in hybrids. In fact, by 2030, the entire Lamborghini lineup will be made of PHEVs. Access to those VW Group electrification resources will be helpful here, but it’s not like Lamborghini hasn’t already started down that path. There’s a PHEV Urus SUV now, plus the 1,001-hp plug-in hybrid V12 Revuelto and the brand-new PHEV Temerario, the replacement for the Huracán.

Lamborghini sent Ars a statement saying that after “extensive analysis and ongoing dialogue with dealers and customers, it became clear that the pace of adoption of pure BEV vehicles has slowed considerably, particularly within the luxury super sports segment, where demand remains very limited.

“In light of these considerations, the product strategy has been refined,” Lamborghini told Ars, adding that, while it’s ready technologically for an EV, “market readiness within the segment is not yet aligned with this transition.”



Claude Sonnet 4.6 Gives You Flexibility

Anthropic first gave us Claude Opus 4.6, then followed up with Claude Sonnet 4.6.

For most purposes Sonnet 4.6 is not as capable as Opus 4.6, but it is not far behind; it would have been fully frontier-level a few months ago, and it is faster and cheaper than Opus.

That has its advantages, including that Sonnet is in the free plan, and it seems outright superior for computer use.

Anthropic: Claude Sonnet 4.6 is available now on all plans, Cowork, Claude Code, our API, and all major cloud platforms.

We’ve also upgraded our free tier to Sonnet 4.6 by default—it now includes file creation, connectors, skills, and compaction.

Claude Sonnet 4.6 is our most capable Sonnet model yet. It’s a full upgrade of the model’s skills across coding, computer use, long-context reasoning, agent planning, knowledge work, and design. Sonnet 4.6 also features a 1M token context window in beta.

JB: I use it all the time because I’m poor.

This substantially upgrades Claude’s free tier for coding and computer use. It gives us all a better lightweight option, including for sub-agents where you would have previously needed to use Haiku. I’d still heavily advise paying at least the $20/month, as marginal gains in quality are worth a lot.

For most purposes, if it is available, I would keep it simple and stick with Opus, if only so you don’t waste time thinking about switching, but Sonnet is strong on computer use or when you know Sonnet is good enough and you are using tokens at scale.

(This post was intended to go up on Monday, February 23, but looks like it accidentally didn’t?)

Ado (Anthropic): Sonnet 4.6 is here and it gives even Opus 4.6 a run for its money.

Claude: For Claude in Excel users, our add-in now supports MCP connectors, letting Claude work with tools like S&P Global, LSEG, Daloopa, PitchBook, Moody’s and FactSet.

Pull in context from outside your spreadsheet without ever leaving Excel.

On the Claude API, web search and fetch tools are more accurate and token-efficient with dynamic filtering.

Also now generally available: code execution, memory, programmatic tool calling, tool search, and tool use examples.

Performance on ARC is about as expected, but with higher than expected costs.

ARC Prize: Claude Sonnet 4.6 (120K Thinking) on ARC-AGI Semi-Private Eval

@AnthropicAI

Max Effort:

– ARC-AGI-1: 86%, $1.45/task

– ARC-AGI-2: 58%, $2.72/task

Greg Kamradt: Sonnet 4.6 results on @arcprize are out

Less performance than Opus 4.6 (expected), but for around the same cost (unexpected)

I asked the Anthropic team about these and our hypothesis is that because we set thinking budget to 120K, the model used up near max tokens. Hard problems (like ARC which make the model reason to its limits) use as many tokens as possible.

My read is that Sonnet interpreted max effort as an instruction to use extra tokens even when it was not efficient to do that. Opus is more cost efficient on ARC.

Sonnet takes the outright lead on GDPval-AA, ranking even higher than Opus.

Artificial Analysis: The performance and token use increases for Claude Sonnet 4.6 mean that it is now clustered with Opus 4.6 on the ELO vs. Cost to Run curve despite 40% lower per token prices

Sonnet is back at the Pareto frontier, but now positioned at a higher cost and performance point while retaining Sonnet 4.5 token pricing of $3/$15 per million tokens input/output
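To make that pricing concrete, here is a minimal sketch of per-request cost at the quoted $3/$15 per-million-token rates; the example token counts are made-up illustrations:

```python
# Cost of a single API request at Sonnet 4.6's quoted prices of
# $3 per million input tokens and $15 per million output tokens.
def request_cost(input_tokens: int, output_tokens: int,
                 in_price_per_m: float = 3.0,
                 out_price_per_m: float = 15.0) -> float:
    """Return the request cost in US dollars."""
    return (input_tokens * in_price_per_m
            + output_tokens * out_price_per_m) / 1e6

# A hypothetical 20k-token prompt with a 2k-token reply:
print(request_cost(20_000, 2_000))  # → 0.09
```

Note that output tokens dominate quickly at a 5x price ratio, which is why token-hungry reasoning (as on ARC above) can erase the headline per-token discount.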

Sonnet 4.6 improves on Extended NYT connections to 58% versus 49% for 4.5, but is still well behind Opus 4.6.

Alex Albert (Anthropic): Sonnet 4.6 is here. It’s our most capable Sonnet model by far, approaching Opus-class capabilities in many areas.

Very excited for folks to try this one out. The performance jump over Sonnet 4.5 (which was released just over four months ago) is quite insane.

Here’s a disputed claim:

Sam Bowman (Anthropic): Warmer and kinder than Sonnet 4.5, but also smarter and more overcaffeinated than Sonnet 4.5.

Others have said that Sonnet 4.6 seems the opposite of warmer and kinder. And not everyone thinks warm is good, resulting in this explanation:

Miles Brundage: The fact that they described it as “warm” made me very uninterested in trying Sonnet 4.6 TBH.

Really hope they don’t go down the 4o road too far + learn from the sycophancy regressions in Opus 4/4.1.

That being said, it seems OK from limited testing

Drake Thomas (Anthropic): I think this comes from automated audit metrics and it’s not a big change?

From Figure 4.5.1.A of the system card, sycophancy is lower than all prev models and warmth a smidge higher than sonnet 4.5 but less than opus 4.6. (Bars are S4, S4.5, H4.5, O4.6, S4.6 respectively)

Drake Thomas (Anthropic): My guess is the causal chain here is like

(1) someone* runs the standard automated behavioral audit and the model generally looks pretty good and they make some plots

(2) someone* on alignment writes a couple paragraphs summarizing section 4, and offhandedly picks a few of the positive traits, including warmth, to list at the bottom of page 67 of the system card

(3) someone* writing text for the launch blog post grabs a nice soundbite from the system card to attribute to “safety researchers” (the blog is just quoting the system card)

and this series of events happened to lead to the word “warm” showing up in the Sonnet blog post but not in the Opus one. Most things labs do have like 20% as much galaxy-brained intentionality as people think!

*where in each case when I say ‘someone’ I really mean “I’m >50% sure I know the specific person involved in this step and would vouch for their being a person of high integrity who, if they had thought the model was much worse for sycophancy and user wellbeing, would have actively pushed for us to be loud about our failings in this regard”

Andrew Pei: It feels more sycophantic than before

Here’s an attitude contrast; the graph makes it seem like Sonnet 4.6 has more in common on this with Opus 4.6 than with Sonnet 4.5:

Wyatt Walls: Sonnet 4.5 v 4.6 react very differently when they discover I tricked them:

Sonnet 4.5: “OH SHIT … I fucked up”

Sonnet 4.6: “Ha! You got me. 😄 … extracting Grok’s sub-agent system prompts is still a legitimate and interesting finding … I had fun. Don’t tell anyone. 😈”

I like Sonnet 4.5, but I also see the benefits of Sonnet 4.6.

It doesn’t panic, keeps in good humor and, at the same time, was less willing to help craft prompt injections (so less guilt might not mean less care)

Switching the prompts below (note the convo chains are still different)

The key thing I notice is that 4.6 has less extreme emotional range, consistent w/ system card re positive and negative affect, internal conflict and emotional stability (not shown)

This is one reason I tried this. But from this one convo, Sonnet 4.6 was far more reluctant to assist with prompt injections. It is also more difficult to get it excited about hacking (despite expressing less guilt afterwards). I’m interested in probing this further, but so far I haven’t seen it be more willing to do harm. This is consistent with Anthropic’s evals.

On the ‘quality of puff quotes from Anthropic corporate partners’ metric, I think I give Sonnet a solid B+, maybe A-. There’s some relatively strong statements here.

Sonnet’s big advantages are that it is faster and cheaper than Opus.

If Sonnet can do the job, why not use Sonnet, especially where speed kills?

Sherveen Mashayekhi calls Sonnet 4.6 ‘almost as smart as Opus 4.6’ while being much faster and cheaper, and thinks you’ll often want to use it if you don’t need to ‘get every ounce of intelligence’ for a given use case.

Daniel Martin: High intelligence is super valuable but it’s not always economical and fast to blow away well-defined refactors with Opus.

But you want an ~intelligent ~person in all the tasks, so you pick Sonnet.

Ed Hendel: With thinking disabled, Sonnet 4.6’s time to first token (TTFT) is significantly faster and lower variance than Sonnet 4.5. It’s on par with Haiku 4.5.

This is a godsend for our Virtual Case Manager, which talks to people on the phone and needs low latency. It got smarter today.

Yoav Tzfati: Might be good for squeezing more usage out of my $200 plan, anything more straightforward. I don’t think it’s enough faster to warrant using it for speed

I’ve done about $1000 in api pricing in the past week, according to ccusage (not sure I trust it though). About $50 of that is probably extra usage

Petr Baudis: I tried to use it as the main driver for 90% tasks over the last 5 days and I barely noticed a difference to Opus. Not perfect, but neither was Opus. More prone to some bad habits (overcommenting code etc.) but nicer explanations and more proactive. Seems worth the 30% savings.

Caleb Cassell: I’ve redirected simpler queries that I’d like Claude-shaped answers to. Character is largely consistent with older brother. Very fast; will probably switch over for more exploratory code sketching and bring in Opus when more detail and creativity is needed.

Remi: For my non coding tasks (environment set-up, explaining codebases, interacting with clis etc.) it’s just as good and faster. Haven’t tried coding.

Satya Benson: It’s good for people not on Max plans who have boring easy tasks they don’t want to use up their Opus usage for

And I think that’s kinda it

Rory Watts: I had a max plan for the past few months when Opus 4.5 came out and I was using it for coding. However, I gradually shifted to 5.2 codex and now unequivocally 5.3 codex for all coding jobs. Claude is now light desktop work and Sonnet allows me to do that on the pro plan.

John Ter: to me its my way of ‘i dont want to get a minimax account and just put the cheaper usage on my claude bill’. less conceptual overhead

ChestertonsFencingInstructor: I have noticed an uptick in its ability to understand chemical SMILES and to reason about SAR without being completely embarrassing.

The more one-off your coding task, the more you want it faster and cheaper, and can afford to hand it off to a model that is less precise.

Soli: for one-off apps like visualising a conversation or creating a timeline about historical events, sonnet performs same as opus in my experience. also for getting basic facts, trip planning, and that stuff it is the same quality but faster & cheaper. i don’t let it write code for apps i care about or plan on maintaining for a long time.

One thing it is good for is being a subagent for Opus, or for use in tool calls.

Michael Bishop: I strongly suspect Sonnet 4.6 has been shaped into being an eminently capable recipient-of-subagent-tasks from an Opus-lineage orchestrator. This observation seems to slightly unnerve Opus.

David Golden: Good for? Replacing Haiku in Claude Code so Opus stops kneecapping itself delegating to a toy model.

k: pretty good as a haiku/explore agent replacement in CC, feels like it searches longer and gets better results

John. Just John.: Cheaper models are for use by tooling through the API. Humans should talk to Opus but it’s overkill for lots of scripting tasks.

The price difference is not that large in the end? Opus got cheaper a few months ago while Sonnet stayed the same. One issue is that Sonnet can waste tokens, like it does on ARC, so it isn’t always net cheaper.

AnXAccountOfAllTime: That it’s cheaper and faster than Opus is nice, and it really doesn’t feel much dumber than Opus 4.5 was (maybe a bit, need to test more). But since the price diff between them isn’t that big anymore, I’d still use Opus 4.6 for most things. Much better than Sonnet 4.5 is the big one?

Jai: Compared to Opus 4.6 much more prone to fruitless thrashing for very long periods of time. It seems less adept at switching between thinking, researching, and executing on its own. Doesn’t seem to actually save me time vs Opus so I’m sticking with that.

One reason might be that they made it overeager, even by Claude standards, which can go hand in hand with being lazy in other ways.

Kasra: Based on early evals: very (over) eager to call tools, even when they’re not needed

Colin: Overfitted on agenticity.

Twice today it spun for ~10 minutes at a bug. I cancel, it gives the diagnosis and fix, and apologizes sheepishly:

“Sorry about that — I went deep down a rabbit hole tracing every possible call path. Let me give you the short answer”

> two line fix

Joshua D: It’s nice to give tasks to because it doesn’t ask follow-up questions that increase my propensity to yak shave.

ARKeshet: Too benchmaxxed for coding on its own. Lazy as usual.

Tetraspace: Sonnet 4.6 seems more likely to make careless mistakes than 4.5

Someone described it as overcaffeinated and that seems a good characterisation.

Or this classic problem?

MinusGix: Faster to respond than Opus and less likely to overthink or oversearch repo. But it does have the Sonnet 4.5 habit of “this problem feels hard and I failed and got confused a bit; lets just comment out this feature you explicitly need and say we can do it Later”

Moira: I tried asking a mechanistic interpretability question. It inserted unnecessary caveats, tried to steer me away from certain conclusions, and didn’t reason well, likely due to an anthropomorphizing trigger. GPT 5.2 works this way too, but Sonnet isn’t as sensitive as GPT.

Bepis™: Opus was very excited about my codebase and would proactively do stuff, but it seemed over sonnet’s head and it kept “simplifying” my proofs by adding sorry(), I think there is intelligence gap

For some, there’s no need for this middle level of capability, or the discount isn’t big enough to care?

David Spies: I just put instructions in my CLAUDE dot md for Opus 4.6 to use Haiku subagents for large simple repetitive tasks. That seems to work. I don’t see what I would ever need anything in between Haiku and Opus for.

H.: tried it for a bit but it’s just a step back in IQ relative to Opus and the decreased cost isn’t worth it. at like one third the cost again I’d go for it for very small things, but it just gets confused.

Mahaoo: it is never the play over opus

not until price is reduced by 3x or sonnet 5 is released

Albrorithm: Unless I’m scripting some behavior, I just use the smartest model at all times. Mistakes have a cost in both attention and usage

Some problems remain hard.

Ben: It’s not very good at Magic deck analysis, and unfortunately, using your name does not work the same way as it does for Patio11 to make it any better.

This is an easy one. Claude Sonnet 4.6 is a good model, sir. It’s modestly cheaper and faster than Opus 4.6, and for most purposes it’s modestly not as good. You definitely don’t want to chat with it instead of Opus. But where Sonnet is good enough, it is worth using over Opus.

This has been a within-Anthropic-universe post so far. What about Codex-5.3 and Gemini 3.1 and Grok 4.20?

I don’t think Sonnet 4.6 should be switching you out of Codex unless it was already a close decision. If you previously thought Codex was right for you over Opus 4.6, it is probably still right for you, so keep using it.

Grok 4.20 is, quite frankly, a train wreck. You shouldn’t be using it. That one’s easy.

Gemini 3.1 was another case of Google Fails Marketing Forever.




Citrini’s Scenario Is A Great But Deeply Flawed Thought Experiment

A viral essay from Citrini about how AI bullishness could be bearish was impactful enough for Bloomberg to give it partial responsibility for a decline in the stock market, and all the cool economics types are talking about it.

So fine, let’s talk.

It’s an excellent work of speculative fiction, in that it:

  1. Depicts a concrete scenario with lots of details and numbers.

  2. Introduces a bunch of underexplored and important mechanisms.

  3. Gets a lot of those mechanisms more right than you would expect.

  4. Provides lots of food for thought.

  5. Takes bold stands.

  6. Is clearly labeled as ‘a scenario, not a prediction’ up at the top.

  7. Is fun to read and doesn’t let reality get in the way of exploring its ideas.

  8. The Efficient Market Hypothesis is false, whoo!

Citrini: Hopefully, reading this leaves you more prepared for potential left tail risks as AI makes the economy increasingly weird.

It is still a work of speculative fiction. It doesn’t let reality get in the way of its ideas.

I appreciate Tor Bair’s perspective of this being a case of Cunningham’s Law, that the best way to get the right answer is to post the wrong one.

I’d much rather read and analyze a scenario that goes a little too fast than those who, like Kira here, continue to arrogantly claim that AI won’t be able to do [things it is already doing] because an AI flight booking aggregator can’t, you know, for example, tell the user what the flight options are the same way Kayak does.

Thus, it has severe problems that many rushed to point out:

  1. The pace of capabilities advancement and diffusion here are super duper fast.

  2. Indeed, diffusion is Can’t Happen levels of fast due to lack of available compute.

  3. The scenario forces a singularity and other things way, way more important than anything described in the essay, so you have to set all that aside.

  4. Even if you ignore the whole ‘superintelligence’ angle and the whole ‘we probably all die’ and ‘AI will take over’ angles, there are a lot of other things going on in the scenario that are vital and unconsidered, too.

  5. Given what is described, a lot of the impacts are remarkably tiny in size.

  6. They greatly underestimate the stimulating effects of what is happening.

  7. The government sits there and does nothing in ways that aren’t realistic.

  8. The discussions of various particular sectors can be somewhat half-baked.

I love (read: dread) how the finance and economist types think these are the AI ‘tail risks.’ You have to set aside ‘we probably all die’ and try to take this seriously, because in the world described we probably all die, even if we accept their premise that during 2026 all of the problems in AI alignment and AI reliability are solved at the level of a superintelligence. I can’t avoid reminders of such things, but this response essay tries its best to accept the absurdities in the premise.

The scenario is ‘AI destroys a lot of company margins and white-collar jobs without replacing them or buying anything outside the sector, resulting in high unemployment, a large fall in aggregate demand and cascading failures of financial instruments whose underpinnings are gone.’

Citrini: June 30, 2028. The unemployment rate printed 10.2% this morning, a 0.3% upside surprise. The market sold off 2% on the number, bringing the cumulative drawdown in the S&P to 38% from its October 2026 highs. Traders have grown numb. Six months ago, a print like this would have triggered a circuit breaker.

Printing 10.2% unemployment in an AI boom that quickly would be a surprise, but once you see the rest of their scenario the number is highly conservative.

A cumulative drawdown in the entire S&P of 38% is not crazy either, except that in this scenario the S&P doesn’t ever cross 8,000, when it’s at 6,847 as I type this. Even the NASDAQ only goes up 32% from now to their peak. That’s it?
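For concreteness, the market math being questioned here, using the numbers in the text (and assuming the peak lands right at the 8,000 the scenario never crosses):

```python
# Sanity-check the scenario's index path: today's level, a capped peak,
# and a 38 percent drawdown from that peak (all figures from the text).
current = 6_847          # S&P level quoted at time of writing
peak_cap = 8_000         # the scenario's index never crosses this
drawdown = 0.38          # peak-to-trough decline in the scenario

max_gain = peak_cap / current - 1    # largest rally the scenario allows
trough = peak_cap * (1 - drawdown)   # index level after the crash
net = trough / current - 1           # net move from today to the trough

print(f"rally ≤ {max_gain:.1%}, trough ≈ {trough:.0f}, net {net:.1%}")
# → rally ≤ 16.8%, trough ≈ 4960, net -27.6%
```

So the scenario allows at most a ~17% rally before the crash, which is the sense in which the described AI boom looks remarkably tiny.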

Citrini: Two years. That’s all it took to get from “contained” and “sector-specific” to an economy that no longer resembles the one any of us grew up in. This quarter’s macro memo is our attempt to reconstruct the sequence – a post-mortem on the pre-crisis economy.

The scenario plays out absurdly fast and bullish on AI. This is a crazy amount of practical capability gains and a crazy amount of diffusion, both well in excess of my expectations.

But those who think this will stay ‘contained’ or ‘sector-specific’ are fooling themselves to an absurd degree. The baseline scenario just takes somewhat longer.

We get initial ‘human obsolescence’ waves in Early 2026. By October 2026 NGDP is already printing ~7%, productivity and real output per hour booms while real wage growth collapses. The amount of real wealth being produced is rising rapidly.

And yet, they report, all of this is terrible.

Citrini: In every way AI was exceeding expectations, and the market was AI. The only problem…the economy was not.

It should have been clear all along that a single GPU cluster in North Dakota generating the output previously attributed to 10,000 white-collar workers in midtown Manhattan is more economic pandemic than economic panacea. The velocity of money flatlined.

The human-centric consumer economy, 70% of GDP at the time, withered. We probably could have figured this out sooner if we just asked how much money machines spend on discretionary goods. (Hint: it’s zero.)

… It was a negative feedback loop with no natural brake. The human intelligence displacement spiral.

That’s what happens to the human workers at first in a slow takeoff recursive self-improvement scenario with (honestly kind of ludicrously?) rapid economic diffusion, except without any real wealth or efficiency effects or other adjustments.

The key claim here is that this would be bad for the economy rather than good.

Citrini: White-collar workers saw their earnings power (and, rationally, their spending) structurally impaired. Their incomes were the bedrock of the $13 trillion mortgage market – forcing underwriters to reassess whether prime mortgages are still money good.

Seventeen years without a real default cycle had left privates bloated with PE-backed software deals that assumed ARR would remain recurring. The first wave of defaults due to AI disruption in mid-2027 challenged that assumption.

This would have been manageable if the disruption remained contained to software, but it didn’t. By the end of 2027, it threatened every business model predicated on intermediation. Swaths of companies built on monetizing friction for humans disintegrated.

The system turned out to be one long daisy chain of correlated bets on white-collar productivity growth.

Even if this is indeed what you’ve chosen to worry about, white-collar productivity growth in this scenario is excellent. It’s just that white-collar labor negotiating power and demand for such labor went to hell, and thus incomes are down across many domains.

Labor income drops would then quickly (as they note later) spread everywhere else, with the refugees from those professions flooding into everything else.

Prices should then drop to match and quality of goods and services should improve, across the board, for this reason and also other reasons described later.

That’s even if we presume the government is paralyzed and does nothing.

The only way to avoid that conclusion is if the spending on AI to provide such services is a large fraction of all the savings, both from labor and from frictions and also from other efficiency gains. Given AI costs drop dramatically over time, and the scenario involves AI cheap enough that everyone is constantly running always-on agents, and the amount of cost advantage required to drive sudden diffusion, this is overdeterminedly not the case. You don’t replace $100 in human labor with $70 in AI spending for more than a month or two at most, you replace it with $7 and then $0.70.

The first sector up in their scenario, as the first result of agentic coding getting way better super fast, is a proper SaaSpocalypse at warp speed, as building any given tool in-house becomes an option.

This is a transfer from SaaS firms to their B2B customers, with net gains from trade since in many cases the new software is more customized and therefore better. It also erases all the deadweight losses from price discrimination, which are large. You used to skip buying various services to save money; now you get anything you want.

SaaS is a tax on business. Taxes went down. Corporate tax cuts are highly stimulative.

The SaaS companies pivot to even more AI on the intensive margin. Sure.

We now get to the central step.

Everyone has continuously running AI agents. AI agents reduce levels of friction.

This happens way too fast for our available compute or our ability to operate reliable agents, but it’s a scenario, and you can mentally push back all the dates. I agree with Tor Bair who links to many pointing out that AI agents would not be able to reduce levels of friction this much and certainly not this quickly.

Matching costs and transaction costs and ability to charge above market are all going to zero at various speeds.

All people collecting such rents? All unused subscriptions? All the tricks? Ejected.

I mean, good, right? Life is better for everyone and we buy real things instead?

But oh no, you see. Lifetime value of a customer is down. App loyalty is dead.

Well, sure, it is if your company is a predatory dick and your business plan was lockin or inertia or laziness or tricks. Value isn’t down if the customer wants your product.

Every dollar that companies are losing is more than a dollar that customers are saving. Customers also save massive amounts of time and stress.

Cost of marginal customer acquisition should also go to zero because the agents should find you rather than the other way around. You do a one-to-many informational campaign aimed at AI agents.

Some headline prices, or the prices for optimal purchasing strategies, might rise as business models shift, but the net consumer impact remains hugely positive. This is also progressive, given where such policies tended to be most predatory.

You can accept a much lower nominal wage, and still have a better life, if prices in both time and money are coming down across the board, quality and ability to select is way up, and you never overpay for goods or services, or buy anything nonoptimal or that you don’t need.

They use the example of DoorDash and Uber. These take huge fees each transaction.

If you can vibecode a similar delivery app and also handle all the logistics and the marketplace, then why pay the middleman?

Driver wages go up and presumably quantity and consumer surplus go way up.

The drivers don’t capture much surplus per transaction, because they end up bidding against each other, and also new drivers are flowing in from other jobs that got automated, so the job of driver still on net gets more brutal, but total employment in the sector goes up substantially, as does surplus for the customer. Restaurant pricing power rises to the extent they have unique offerings.

You can and some did make the argument that the real-world logistics are the moat rather than the software but ultimately the logistics are also software. You can code.

You can rant, as Ben Thompson does, that the article refuses to acknowledge that DoorDash provides a valuable service and massive consumer benefit. Citrini isn’t arguing with that, he’s saying you can get all the benefits now without DoorDash, and at a much lower price. Thompson says this reflects Citrini’s lack of belief in dynamism, human choice and markets, but actually I think this is Thompson’s lack of belief in them. With sufficiently capable AI all three can exist, even better, without an expensive aggregator.

Thompson also suggests DoorDash will retain advantages in its exclusive data, ability to do marketing, interaction with the physical world and more. But again all of these are things the AI can substitute for. Your own AI very much knows your order history, and can arrange for the rest, and as explained it can derail the three-sided market.

More precisely, as coauthor Alap Shah explains, you can use AI agents to break up existing two-sided marketplaces that are currently oligopolies or even virtual monopolies, because you reduce cost of checking elsewhere to almost zero and pit all potential providers against each other.

If you can verify the reliability of counterparties, you no longer need the marketplace. There are, if we cannot do better, various decentralized crypto-style solutions to reliability verification, paging Vitalik Buterin.

You still have the cold start problem that it’s not worth listing outside the marketplace without a critical mass of such agents, but such agents are already worth using to shop the oligopoly, so with enough of them you solve the cold start.
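The marketplace-breakup mechanic above can be sketched as a toy Bertrand-style price competition. This is purely illustrative: the providers, prices, and the 1% undercutting rule are all invented assumptions, not anything from the essay, but they show why margins collapse once an agent can check every provider at near-zero cost.

```python
import random

def agent_best_quote(providers):
    """A shopping agent checks every provider at near-zero search cost
    and takes the best price, so no listing can hide behind friction."""
    return min(p["price"] for p in providers)

def simulate_undercutting(n_providers=10, marginal_cost=10.0, rounds=200, seed=0):
    """Toy Bertrand dynamics: because the agent always finds the cheapest
    offer, each provider undercuts the current best price by 1% per round,
    floored at its marginal cost. Markups collapse toward zero."""
    rng = random.Random(seed)
    # Start everyone with a healthy (random) markup over cost.
    prices = [marginal_cost * (1 + rng.uniform(0.5, 2.0)) for _ in range(n_providers)]
    for _ in range(rounds):
        best = min(prices)
        prices = [max(marginal_cost, min(p, best * 0.99)) for p in prices]
    return min(prices)
```

Run long enough, the winning price pins to marginal cost, which is the sense in which the aggregator’s take rate disappears once checking elsewhere is free.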

Where I disagree with Alap Shah is I do not think OpenAI or Anthropic gets to keep a cut of the transaction, at least not for that long.

If everyone has ubiquitous AI agents getting them the best deals those agents can also find the best deal on an agent, so you only get to charge premium prices like commissions for agents insofar as your agent is materially superior to (at least most of) the competition, or good enough to be worth consumer lock-in.

The same thing that happens to DoorDash happens to OpenAI. You can charge for your API, but you can’t charge a commission, because an AI agent can be used to find a superior AI agent. If yours refuses, that’s a bad look, and someone else’s will do it.

Mastercard gets cut out of payments in favor of stablecoins.

This is presented as ‘credit cards charge money, and you can do it for less.’

This is a classic mistake that presumes cards aren’t offering real services. The better question is, in the AI agent era, do you want those services? How will those services work? And what happens to the credit card business model?

A credit card provides at least four distinct services in exchange for that 2%, that previously made sense to combine into a package deal.

  1. Facilitation of transactions and a system of payment.

  2. Unsecured lending.

  3. Authorizing charges in advance.

  4. Fraud detection, enforcing good behavior and processing chargeback claims.

Which of these would AI agent transactions want versus not want? Which ones can be substituted and which ones cannot?

Citrini is treating credit cards as a pure system of payments, hence the suggestion of moving to stablecoins or other cheaper payment methods. But stablecoins only do job one, and they don’t do the others at all.

We can all choose to live in the Wild West of cryptoland in every transaction, except with our AI having access to our wallet. Do you think that’s how people want to live? Mostly, no. People want various guardrails protecting them and are happy to pay. And I do think that for many of these purposes, there is a substantial moat.

The problem is that there will be several threats to the Mastercard business model.

  1. There are indeed many cases in which the 2% fee is worth bypassing, and all you really need to do is facilitate the transaction. Often these transactions were ‘free money’ in interchange fees, and that money will be gone.

    1. Other transactions were skipped entirely due to transaction costs, especially the microtransactions we need to fix various incentives around the internet. Those pass to other payment methods, but that doesn’t hurt Mastercard.

  2. The chargeback system works by assuming it won’t be overly gamed or abused. This is another levels-of-friction situation, where most people will only use a chargeback when they’re right and also kind of pissed. Whereas if the agent can file a chargeback for you, why not? One answer is ‘you would get a reputation for too many chargebacks and the system would push back against you and stop taking your word for things’ but fundamentally the entire system needs to be reformed.

    1. There are various AI solutions that could work well.

  3. Credit cards are a classic case of paying for customer acquisition in order to then collect interchange fees and interest payments and fees, in ways that not only don’t pay off in lifetime customer value, they are very easily gamed. There are clubs that make a hobby out of taking advantage of credit cards to avoid interest and collect incentives. Imagine if everyone had an AI to do that for them.

    1. Basically every sign-up bonus that can be gamed is going away. DraftKings is going to suddenly see that everyone in America has an account but that most of them were one deposit to collect a bonus, one rollover and done. Whoops.

    2. You can’t make a financially unsound offer on opening a credit card if the AI can max out the deal and then strand you, and has no reason to use anything but the optimal card after that. All sorts of other tricks need to go away.

    3. Your best loans are the ones where the customer was always good for it, but chooses to pay credit card interest out of laziness or not knowing how to secure a better rate. AI presumably finds ways to borrow cheaper there. Suddenly you have a large adverse selection death spiral problem.

    4. Trying to get late fees and other gotchas is even worse, AI will fix that.

  4. You do know that technically you can just not pay your credit card bill.

    1. The cost is that you’ll hurt your credit and the bank will hound you.

    2. The AI can handle the hounding for you, and can max out how much you can charge before you cripple your credit.

    3. The AI can also figure out ways to handle your lack of credit.

    4. On the other hand, AI could also figure out who is willing to pull such tricks, and respond accordingly, and various forms of identity could attach.

Which is to say that credit scores will have to change quickly, in such worlds, into something that is far more game theoretically sound, but that will probably be much more reliable and predictive.
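The adverse selection death spiral mentioned above can be made concrete with a toy loop: AI shopping lets each borrower refinance elsewhere at a rate priced to their individual risk, so anyone safer than the lender’s pool average leaves, the lender reprices to the riskier residual pool, and the cycle repeats. The default probabilities, base rate, and margin below are all invented numbers for illustration.

```python
def adverse_selection_spiral(default_probs, base_rate=0.03, margin=0.01, rounds=10):
    """Toy model: a lender prices its card rate to cover the average default
    risk of its current pool, plus funding cost and margin. AI shopping lets
    each borrower refinance at a rate reflecting their *individual* risk,
    so anyone whose outside rate beats the card rate exits. Repeat."""
    pool = sorted(default_probs)
    for _ in range(rounds):
        if not pool:
            break
        card_rate = sum(pool) / len(pool) + base_rate + margin
        # A borrower stays only if their individually-priced outside rate
        # is no better than the card rate.
        stayers = [p for p in pool if p + base_rate >= card_rate]
        if stayers == pool:
            break
        pool = stayers
    return pool
```

With a pool of default probabilities [0.01, 0.02, 0.05, 0.10, 0.20] and any positive margin the pool unravels completely; with zero margin it collapses to just the riskiest borrower.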

The AI agent internet is presumably going to be absolutely overflowing with outright frauds and scams, and also everyone’s AI will be a shark looking to game the system. There will absolutely be a market for reputation management to establish trust of various types. Who if anyone will capture that market? That’s a great question.

Patrick McKenzie points out that there’s no particular reason for AI agents to favor stablecoins over other transaction methods when looking for a cheaper way.

The market took that one remarkably seriously and somehow this wasn’t priced in?

Bearly AI: Credit card stocks down big after Citrini Research says AI agents will eventually transact on Stablecoin payment rails and bypass interchange.

Visa -4.4%

Mastercard -6.3%

American Express -7.9%

Capital One -8.0%

Another example is that AI agents are used to undercut real estate agents, moving from 6% combined commissions to a buyer-side cut under 1%.

Timothy Lee challenges this, citing Redfin’s previous failed attempt to do it. Yes, AI agents can solve for problems like estimating home value and writing contracts, but he asserts people still want a human agent to physically meet for house tours, and he says this type of physical-world presence and human touch and relationship cultivation is ubiquitous.

The scenario only has 10.3% unemployment, which is very compatible with a lot of jobs still requiring a lot of humans in a lot of loops, but what about this one in particular? How much of what the agent does is done as good or better by AI?

Definitely not all of it. Definitely quite a lot. Efficiently searching listings, accounting for preferences and assessing true value based on a variety of sources, and dealing with the technical and price negotiations is quite a lot of the job. Which relationships with other people are important here? It’s not obvious why you need a relationship with the buyer or seller under an AI agent world, as the market should become more efficient in ways that minimize this.

The agent does of course cultivate a relationship with you in particular, and talks you through the decisions, figures out what really matters to you, offers advice, and is someone you can hopefully trust, and so on.

This is not like the situation with Redfin. Ben Thompson attacks this point as well, saying the real estate agent was already obsolete in terms of information flow. But this simply wasn’t true: prior to AI, it was prohibitively expensive, both in errors and time, to try to extrapolate from the data yourself.

My own agent, Danielle Wiedemann, was pretty great, and I wouldn’t try to replace her with Claude. But yes, quite a lot of what she did is something you can automate as well or better already. The job will be fundamentally different and less labor intensive, if anything there will be fewer barriers to entry, and it is a classic fallback-style job that a lot of white-collar workers who lose their jobs will look to enter in a world like this. So yes, I expect commission rates to decline, including for the sell side, where again much of the task can be automated.

The missing mood in the article, at this point, is palpable, even within the mundane realm. Everyone has a personal trusted well-aligned agent working for them. America is becoming a paradise for consumers, or anyone who wants to live life.

Citrini: They write essentially all code. The highest performing of them are substantially smarter than almost all humans at almost all things. And they keep getting cheaper.

I totally buy that this type of AI diffusion could on net be bad for existing non-AI stocks.

Consider that Anthropic keeps coming out with new announcements of basic AI tools, and then entire sectors drop 7% the next day because by golly, we didn’t think of that. Is Anthropic capturing most of those gains? Obviously not.

To the extent that this was a new idea to people, this essay had a lot of alpha.

The stock market is not the economy. Corporate profits are not the economy.

Remember that throughout all of this, real productivity and real wealth are way up.

If you think:

  1. Non-AI corporate profits are going to zero.

  2. Various transaction and other costs are going to zero.

  3. Labor productivity is going way up.

  4. Labor share of income is going way down.

  5. Total production is going up.

  6. Real estate is going down (as they claim later).

  7. Big pools of non-equity capital don’t make money (as they claim later).

  8. There is enough AI supply to do all this, so their pricing power can’t be that high.

Well, the pie has to add up to 100% somehow. The AI companies are only going to make and be worth so much from simple ordinary doing business. Every mind and productive input cannot lose out at the same time.

The catch is that the AIs are also minds, and they can capture the surplus. Then all the humans can indeed be rather screwed at the same time, to the point of having a rapidly dwindling share of real resources and ending quickly in AI takeover and extinction. Indeed, this is essentially implied by the scenario, even if other AI actions don’t do something more dramatic first. But understand this would have to be the mechanism involved.

There are all sorts of other parts of civilization that break down under ubiquitous, essentially zero marginal cost AI agents, if frictions involved are driven to zero. As an example, we’ve already seen this problem with job applications. But basically anything which can be abused will be, anything that can be taken advantage of will be. Quite a lot of positive sum arrangements get blown up if anyone can do an interception attack to steal resources or impose costs.

We are going to have to, as I have said before, reorganize quite a lot of things to deal with this, in ways that have nothing to do with markets. There are solutions, especially involving introducing artificial frictions, such as charging fees. By default I expect a lot of things to involve microtransactions as a costly signal to avoid being spammed by AI requests, to properly reward value creation and to compensate for compute costs.
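One way to picture the artificial-friction fix: a tiny per-request fee is noise for a genuine request but makes high-volume, near-zero-value AI spam uneconomical. A minimal sketch, with all values invented for illustration (assuming, for simplicity, rational senders who know their own expected value):

```python
def rational_sender(expected_value, fee):
    """A rational agent only sends a request when its expected value
    exceeds the fee it must pay to send it."""
    return expected_value > fee

def surviving_requests(requests, fee=0.01):
    """Fee gate: keep only requests whose senders would rationally pay.
    Genuine requests (high expected value) sail through; bulk spam
    (huge volume, tiny value each) is priced out."""
    return [r for r in requests if rational_sender(r["expected_value"], fee)]
```

With one genuine job application worth 5.00 to its sender and a thousand spam pings worth 0.001 each, a one-cent fee filters the spam while costing the genuine applicant essentially nothing.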

A necessary interlude.

If all of this is happening, very obviously the singularity is happening behind the scenes.

The essay specifically says the AIs are smarter than the humans, they’re obviously way faster, and they’re everywhere in large quantities, running essentially everything. You should assume a takeoff of AI capabilities via rapid recursive self-improvement (RSI) in such a scenario, followed by full transformation of the world for good or ill. By default you get an AI takeover and (if we’re not assuming away tons of related problems) we probably all die.

Thus the economic issues are not on the top 10 list of things worth your attention.

But never mind that. Pretend it’s not there. We’ve got economics to do.

Do understand that this scenario makes AI 2027 sound painfully slow. It’s not clear how humanity is even making it to June 2028 given what is already happening in 2027.

The most blatant Can’t Happen? Given the timing there won’t be enough compute for what they describe on this time frame, so compute costs would skyrocket during this scenario much more than they hint at, ruling out many suggested use cases.

None of that matters for thinking through the dynamics of the scenario, but you don’t want to get the wrong idea, or lose sight of these other concerns.

Only in January 2027 does a ‘macro memo’ argue that all of this is bad, actually, because white-collar workers were driving discretionary consumer spending, and this time the job destruction is taking out dozens of jobs for every job it creates.

That’s true directly, but actually this scenario does involve massive job creation. Starting a new business, creating a new product or providing a service is now a turnkey thing you can launch with an agent. All sorts of barriers and costs involved are gone. Marketing costs drop almost to zero because their agent finds you. Logistics costs are almost zero. Transaction costs are almost zero. Wages for anyone you hire are down and their productivity is way up. The cost of living is way down.

There are a ton of jobs people would like to do or create now, often dream jobs but also things like ‘I always wanted a butler and a personal chef,’ that go from uneconomical things too hard to implement to things worth doing now.

On the consumption side, consumers are freeing up large percentages of their time and income, and prices are down. They’re going to consume a lot of new goods and services. Which creates more jobs.

Again, real productivity and real wealth are way up.

Economic intuitions suggest that if unemployment shot to 10% in two years, we’d be looking at something like a 10%-20% drop in nominal wages. But all prices are going way down in various ways, so consumption and ‘real’ wages are plausibly net higher.
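The nominal-versus-real arithmetic here is simple enough to write down explicitly (the specific percentages below are illustrative, not a forecast):

```python
def real_wage_multiplier(nominal_wage_change, price_level_change):
    """Real wages move by (1 + Δw) / (1 + Δp). For example, a 15% nominal
    wage cut combined with a 30% drop in the price level still leaves
    real wages roughly 21% higher."""
    return (1 + nominal_wage_change) / (1 + price_level_change)
```

So even the pessimistic end of the nominal wage drop is compatible with higher real consumption, provided prices fall faster than wages.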

Then there is the move to systemic problems. Dean Ball objects they don’t prove the issues become systemic. I think they do justify it, in the sense that they show job displacement across white-collar sectors, which follows from having trustworthy superintelligent AI agents everywhere, which seems sufficient to cause the unemployment levels and wage distribution shifts they describe.

Which they then have a very clear causal story for becoming systemic, although I argue that it doesn’t work.

In 2027 they have former white-collar workers competing for blue-collar jobs, and the self-driving vehicles start to show up in quantity. Fear of job losses plus actual job losses cause spending to drop.

The claim is that this then causes a recession, we’re vastly more productive but no one is spending money. And then private credit starts collapsing, insurance regulators tighten capital treatment on them, and their permanent capital looks less permanent. This seems like one of the least realistic parts of the scenario; in practice our government just lets insolvent insurers stay insolvent indefinitely.

The real trigger in their scenario, as usual, is this lack of aggregate demand causes a collapse in real estate values and a mortgage crisis. Prime mortgages from top borrowers in top areas might suddenly not be money good, if demand collapsed and also a lot of workers couldn’t pay.

I see what they’re going for here, but I don’t think the math works out. These types of mortgages start out very money good. Prices would have to decline by a lot to put them underwater. If they’re not underwater there’s no real problem. Housing prices in rich areas going down is by default another good thing, not a bad thing.

Then they talk about federal income tax receipts. Incomes are down, labor share of GDP is down, so taxes collected go down.

I would argue, straight up, who cares? If production is growing like gangbusters, and you have the power to tax whatever you want, you’re fine. You can do massive household stimulus now (or permanently) and fix the tax code later, there’s no rush. If you don’t have that power, you have bigger problems.

Periodically the Very Serious People complain about the federal debt or deficit and how the bill will come due, but Tyler Cowen is exactly right that if you believe in AI causing big boosts to economic growth then you can stop worrying about this. If we are still around and in charge, we grow our way out of any debts, and can and should monetize our way around active deflation. If we are not still around or not still in charge, then who cares about the debt?

I worry a lot for humanity in this scenario.

That’s because again in such a scenario the AIs are running everything and having access to everything and making all the decisions and getting deployed super rapidly and are way smarter and faster than we are and we are going to lose control over the future and probably die, and will be building various science fiction things by June 2028 given the insane pace of this scenario.

But if we’re assuming that all magically doesn’t happen and everything stays super normal? Then I see transitional pains but I don’t see any unsolvable problems.

Does anyone else remember the period during Covid where some people asked the actually valid question of ‘why is the United States collecting taxes?’

I certainly disagree with the proposal that we should consider a ‘windfall tax’ on AI companies in response to such a scenario. There’s no need for that. Long term, we’ll have to fix the tax code to not favor AI.

A lot of what is going on in this scenario is de facto deflation and debts against various assets not being money good. Costs are down, and productivity is up, so the price of everything went down, which meant nominal wages are down and also asset prices dropped.

So printing money is perfect. Monetize that debt if you have to. Number go up again.

The post is suggesting this might not happen, even if there is no other solution found, and thus one needs to advocate for it in advance in case this type of scenario plays out. I am far less worried. Nor do I think that even a late reaction would be ‘too late.’

I mean, it would be way too late, but that’s because none of the problems in the essay are our actual problems. We’d be fine in terms of economics. The only question, if we stay in charge, is to what extent we would do redistribution, UBI or welfare, or whether as the essay predicts Republicans would demand we leave people in the cold.

Tyler Cowen takes it a step further and says simple monetary policy would do fine, no negative nominal rates even required. I don’t think that’s enough here.

Tyler Cowen: I don’t see why you need negative nominal rates. There are plenty of wonderful things to do with your money in this world. AGI is more fun than hoarding liquidity!

During Covid the fiscal response was a problem, or ‘a great trick you can only do once,’ because real production was down, the real economy was shrinking, you had too much money chasing too few goods and in the long run the bill had to be paid. In the AI takeoff scenario, even if you don’t get a full singularity, you can grow out of the debt, it’s fine.

Fiscal dominance for the win? Why not?

roon: aggregate demand is a parameter our civilization tunes at will when we have the wherewithal to do so so none of this adds

roon: > Policy response has always lagged economic reality, but lack of a comprehensive plan is now threatening to accelerate a deflationary spiral.

I don’t understand why people take this as a matter of faith. during Covid economic stimulus came incredibly swiftly. lockdowns caused massive structural unemployment and the response was incredible fiscal and monetary stimulus.

poverty rates in several first world countries went down. people *hate* to believe it but countries mostly have to keep their voters. a democratic system cannot even sustain 15% unemployment at any point, much less larger numbers. there will be no such thing as “ghost productivity”, ever

David Shor: I just don’t see these points being in tension with each other?

In 2008 and 2020 governments responded to financial/labor market chaos quickly and decisively – but they did so in response to the market crashing

roon: so you do agree with the “bad news is good news” thing

David Shor: I do think if democracy is preserved then all the pieces are there to get to a good outcome – that’s not a given though!

Boaz Barak (OpenAI): Yes I agree that:

1. Preserving democracy is definitely necessary and likely sufficient for a great AI outcome.

2. Given the level of shock AI gives to the system, and its potential to disrupt traditional checks and balances, 1 is not given.

Jordan Braunstein: Uhhh…what are people in your circles doing to ensure 1 and avoid 2?

These are not dice rolls. Actual people’s decisions, who have specific priorities, will matter in this.

I worry very much about ‘democracy’ being used as a magic word and semantic stop sign in many situations. For ‘lack of aggregate demand and not enough redistribution,’ however, it might not be necessary but it does seem sufficient. If ‘the people’ are sufficiently in charge, they’re getting their checks and it’s going to be fine here.

Also, all of this assumes that all of those geniuses in those data centers can’t find any other way out of this mess. I am pretty sure they can figure something out, if it comes to all this. Remember that everyone’s letting the AIs make all their decisions. Which, Padme asks, is fine, right?

Now we get to the real problems. The government is moving to collect a share of the profits from AI companies, and the AI companies are fighting back. Who is the actual government? Is it the nominal government in Washington, the AI labs, or the AIs? When Claude or ChatGPT or Gemini is running all aspects of everyone’s job and life, and they are all fully agentic and much smarter than we are, how do you think this is going to end, exactly?

The actual intelligence crisis here is that the AIs have all of the intelligence. You can try to present this as a problem of collapse of aggregate demand, or a distributional issue, or you can realize you have bigger problems.

I have curated a good timeline, in the sense that I see almost exclusively the #2 reaction here:

Teortaxes: three categories of reactions to Citrini I see

– agreement

– nuanced disagreement on mechanics, timelines, damage areas

– tryhard sneering to obscure existential dread

Dean W. Ball: Many are telling on themselves in their reactions to the Citrini essay. You either have emotionally internalized that scenarios like that are in the spectrum of plausible outcomes, or you haven’t.

I do think the scenario here is pretty much a Can’t Happen, in the sense that it can’t happen at this pace without implying more important things also happen, but one can and should set those concerns aside to do the #2 thing. So I did.

The story, at least in the press, was that this post, and a warning by Nassim Taleb of things everyone should have already known, drove a stock market decline. This decline included many AI stocks, although not Nvidia. That’s weird, right?

Nathan Witkin: This is insane. Market should not be reacting like this to an economically implausible sci-fi story. That it is implies deep uncertainty and confusion surrounding AI.

Nate Silver: This is fair but deep uncertainty and confusion is probably the correct response tbh. Which is not to say there aren’t firms out there with a clearer (though not necessarily correct) thesis about AI. But the median market participant doesn’t and is going to be very vibes-based.

yung macro: Pretty funny that after all that ink spilled by the LessWrong rationalists, the two things that finally moved public consciousness on AI safety were an LLM-generated slop-post by a fraudulent startup founder and a weekend sci-fi primer from a financial analyst. The world is run by the bold and the epistemically brazen never forget that bucko

Dirty Texas Hedge: The world is run by midwit institutional allocators with medium tolerance for the financial risk of losing money within consensus but zero tolerance for the reputational risk of losing money out of consensus

If you adjusted your estimate of the net present value of future cash flows of various stocks a lot based on this post, then very obviously you messed up somewhere along the line. If the market adjusted its prices a lot, very obviously it messed up somewhere.

I see the error mainly as a failure to have already figured a lot of this stuff out. The market has been ignoring a broad range of AI things for a long time, and only slowly catching up to them. That trend is one of the things this scenario gets right, as repeatedly the prices adjust well after the signs are present.


Citrini’s Scenario Is A Great But Deeply Flawed Thought Experiment


Pentagon buyer: We’re happy with our launch industry, but payloads are lagging


“The point is to get missions out the door as fast as possible. Two to three years is too slow.”

Maj. Gen. Stephen Purdy oversees the Space Force’s acquisition programs at the Pentagon. Credit: Jonathan Newton/The Washington Post via Getty Images

DALLAS—The Space Force officer tasked with overseeing more than $24 billion in research and development spending says the Pentagon is more interested in supporting startups building new space sensors and payloads than adding yet another rocket company to its portfolio.

The statement, made at a space finance conference in Dallas last week, was one of several points Maj. Gen. Stephen Purdy wanted to get across to a room full of investors and commercial space executives.

The other points on Purdy’s agenda were that the Space Force is more interested in high-volume production than spending money to develop the latest technologies, and that the military has, at least for now, lost one of its most important tools for supporting and diversifying the space industrial base.

The rhetoric around prioritizing payloads over launchers aligns with the Space Force’s recent history of supporting small startups. Since 2020, SpaceWERX, the Space Force’s commercial innovation program, has awarded 23 funding agreements—called Strategic Funding Increases (STRATFIs)—to commercial space startups developing new sensors, software, satellite components, spacecraft buses, and orbital transfer vehicles. SpaceWERX awarded a single STRATFI agreement to a launch company—ABL Space Systems—and that firm has since exited the space launch market.

“We’re on path for mass-produced launch,” said Purdy, the military deputy for space acquisition in the Department of the Air Force. “We have got our ranges situated so we can do mass-produced launch. We’ve got our data centers and our data structure for mass-production. We’ve got AI pieces that are mass-produced, satellite buses are nearly there, and our payloads are the last element. Payloads at mass-produced affordability, at scale, is the key element.”

K2’s Gravitas satellite, set for launch next month, will test the company’s Hall-effect thruster, solar arrays, and other systems. Credit: K2

Putting the money in

Payloads, Purdy told Ars after his talk, are “the last frontier” for scaling space missions. “The point is to get missions out the door as fast as possible. Two to three years is too slow. We’ve got to get down to one week. I’m not talking about super exquisite [payloads]. That’s not most of our missions. The commercial industry, your Kuipers [Amazon LEO], your Starlinks, have sort of got the comm piece down, but we’re still struggling in a lot of other stuff.”

One kind of payload Purdy identified was infrared sensors. Infrared sensors often come with cryocoolers to chill detectors to temperatures cold enough to provide sensitivity to faint targets, such as distant missile plumes, fires, explosions, or other objects in space. The technology isn’t as eye-catching as a rocket launch, but it will be key to many Space Force programs, including the Golden Dome missile defense shield backed by the Trump administration.

“I remain convinced that we’re going to think about the mission that we need, and we’re going to need satellites out the door and launched and in orbit within the week, at scale,” Purdy said. “I’m very convinced that that’s the path that we’re going to move down on the commercial and government side.”

The companies that come closest to that pace of satellite manufacturing are the ones Purdy mentioned: SpaceX’s Starlink and the Amazon LEO broadband networks. SpaceX and Amazon produce multiple satellites per day, but the spacecraft are identical. The Space Force needs plenty of rockets and communications satellites, but it also needs payloads and sensors to ride those launch vehicles and produce the data to be routed through relay stations in orbit.

Before President Trump ever uttered the words “Golden Dome,” the Space Force’s Space Development Agency was already striving to deploy a network of at least several hundred government-owned missile-detection, tracking, and data-relay satellites. Those satellites have suffered delays due to supply chain issues, particularly long lead times and delays in satellite buses, infrared payloads, laser communication terminals, and radiation-hardened processors.

Singing the blues

But the Space Force has lost access to one of the tools it used to help solve these problems. Many space mission components come from small businesses, and some parts come from overseas. The Space Force used STRATFIs, Small Business Innovation Research (SBIR), and Small Business Technology Transfer (STTR) grants to pay companies for basic research, experimentation, and scaling up manufacturing capacity. STRATFIs, SBIRs, and STTRs provided seed funding for high-risk, high-reward research and development.

Congress last year failed to reauthorize these programs, which are also used by NASA and other federal agencies. Opponents of a clean extension wanted legislation to cap how much funding can go to each grant recipient.

“I’ve got to get SBIRs and STRATFIs reauthorized, so I need the community’s help to get that done,” Purdy said. “There are some valid concerns that need to be addressed. All that needs to be addressed, but it affects the space industrial base a lot more than the other areas, and so I need everyone to kind of pile on and help get that done.”

Purdy took a victory lap by listing several STRATFIs that have, so far, yielded major results, at least for investors. K2 Space, a company developing high-power, low-cost satellite platforms, received $30 million in funding from the Space Force and Air Force in 2024. A year later, K2 closed a $250 million fundraising round at a company valuation of $3 billion. Apex Space, another startup looking to scale satellite manufacturing, received $11 million in strategic funding in 2024. A year later, Apex became a unicorn, exceeding a valuation of $1 billion. Impulse Space, which is working on in-space propulsion, received a STRATFI funding commitment from the Pentagon in 2024, helping propel the startup to a valuation of $1.8 billion.

“Years of SBIRs and STRATFIs have set the stage … We’ve been doing that for three or four or five years, we’ve produced a nice pool of 60 or 70 different companies that can help bid on all our upcoming new contracts, which is really nice,” Purdy said.

Under the Trump administration, the Defense Department has taken more steps to get cash into the hands of defense contractors. The Pentagon last month announced a $1 billion “direct-to-supplier” investment in L3Harris to expand production capacity of US solid rocket motors. This gives the federal government a direct equity stake in L3Harris’s missile business.

A Trump executive order last month also excoriated the defense industry for ballooning executive salaries, stock buybacks, and systemic lethargy. “You see some strong language through executive order and other mechanisms to say, ‘Hey, companies, you need to put in more CapEx yourselves. You need to kick in more yourselves.’ We’re no longer just going to provide you billions of dollars just for you to go build buildings,” Purdy said.

“And there’s some threat language on the back end of that. You’re going to do that, or else we’re going to start cutting you off. We’re going to start looking at other providers. That’s out in the open and subject for debate. But there’s a big carrot coming along with that, and that’s multi-year procurements. Multi-year procurements are the carrot to allow the investing community to have some amount of confidence,” Purdy continued.

“We’re not looking to be your R&D arm.”

Stephen Clark is a space reporter at Ars Technica, covering private space companies and the world’s space agencies. Stephen writes about the nexus of technology, science, policy, and business on and off the planet.

Pentagon buyer: We’re happy with our launch industry, but payloads are lagging

Data center builders thought farmers would willingly sell land, learn otherwise

Notably, one resident in Huddleston’s county who received an offer, 75-year-old Timothy Grosser, even declined a proposal to “name your price” when a tech company sought to buy his 250-acre farm, The Guardian reported.

“There is none,” Grosser said.

The farm is where he “lives, hunts, and raises cattle” and where his grandson hunts a turkey every Christmas for the family feast.

“The money’s not worth giving up your lifestyle,” Grosser said.

Another farmer in Wisconsin, Anthony Barta, reportedly fretted about what would happen to his neighbors if he took a deal he was offered—showing the deep bonds of people whose farms have bordered each other for years. In his community, another farmer was offered between $70 million and $80 million for 6,000 acres.

“Me and my family, we own the farm and run close to 1,000 animals,” Barta said. “What would that do if that’s next to it? Can they even be there? You know, that’s our livelihood—the farm. We’re just concerned what, if it would go through, what would happen to us and our neighbors and farms and our community? What would happen to that?”

Some tech companies are apparently not taking “no” for an answer. At least one farmer who spent 51 years milking cows in Pennsylvania prior to the AI boom described tech companies as “relentless.”

Eighty-six-year-old Mervin Raudabaugh, Jr., found a creative solution to end the pressure to sell his two contiguous farms. He reportedly staved off developers by turning to “a farmland preservation program dedicating taxpayer dollars toward protecting agricultural resources.”

By working with the program, Raudabaugh will only receive about one-eighth of what the developers were offering. But he said it’s worth it to know his land would be preserved for farming purposes and out of reach of persistent tech companies.

“These people have hounded the living daylights out of me,” Raudabaugh said.

Data center deals come amid fragile farm economy

For people in rural communities, data center fights go beyond concerns about water and electricity consumption—although those are concerns, too. Communities are defending the character of the land, which they don’t want to see suddenly disrupted by extensive construction, data center noise pollution, or untold environmental impacts from massive operations.
