

Bluesky now platform of choice for science community


It’s not just you. Survey says: “Twitter sucks now and all the cool kids are moving to Bluesky”

Credit: Getty Images | Chris Delmas

Marine biologist and conservationist David Shiffman was an early power user and evangelist for science engagement on the social media platform formerly known as Twitter. Over the years, he trained more than 2,000 early career scientists on how to best use the platform for professional goals: networking with colleagues, sharing new scientific papers, and communicating with interested members of the public.

But when Elon Musk bought Twitter in 2022, renaming it X, changes to both the platform’s algorithm and moderation policy soured Shiffman on the social media site. He started looking for a viable alternative among the fledgling platforms that had begun to pop up: most notably Threads, Post, Mastodon, and Bluesky. He was among the first wave of scientists to join Bluesky and found that, even in its infancy, it had many of the features he had valued in “golden age” Twitter.

Shiffman also noticed that he wasn’t the only one in the scientific community having issues with Twitter. This impression was further bolstered by news stories in outlets like Nature, Science, and the Chronicle of Higher Education noting growing complaints about Twitter and increased migration over to Bluesky by science professionals. (Full disclosure: I joined Bluesky around the same time as Shiffman, for similar reasons: Twitter had ceased to be professionally useful, and many of the science types I’d been following were moving to Bluesky. I nuked my Twitter account in November 2024.)

A curious Shiffman decided to conduct a scientific survey, announcing the results in a new paper published in the journal Integrative and Comparative Biology. The findings confirm that, while Twitter was once the platform of choice for a majority of science communicators, those same people have since abandoned it in droves. And of the alternatives available, Bluesky seems to be their new platform of choice.

Shiffman, the author of Why Sharks Matter, described early Twitter recently on the blog Southern Fried Science as “the world’s most interesting cocktail party.”

“Then it stopped being useful,” Shiffman told Ars. “I was worried for a while that this incredibly powerful way of changing the world using expertise was gone. It’s not gone. It just moved. It’s a little different now, and it’s not as powerful as it was, but it’s not gone. It was for me personally, immensely reassuring that so many other people were having the same experience that I was. But it was also important to document that scientifically.”

Eager to gather solid data on the migration phenomenon to bolster his anecdotal observations, Shiffman turned to social scientist Julia Wester, one of the scientists who had joined Twitter at Shiffman’s encouragement years earlier and who had likewise become fed up and migrated to Bluesky. Despite being “much less online” than the indefatigable Shiffman, Wester was intrigued by the proposition. “I was interested not just in the anecdotal evidence, the conversations we were having, but also in identifying the real patterns,” she told Ars. “As a social scientist, when we hear anecdotal evidence about people’s experiences, I want to know what that looks like across the population.”

Shiffman and Wester targeted scientists, science communicators, and science educators who used (or had used) both Twitter and Bluesky. Questions explored user attitudes toward, and experiences with, each platform in a professional capacity: when they joined, respective follower and post counts, which professional tasks they used each platform for, the usefulness of each platform for those purposes relative to 2021, how they first heard about Bluesky, and so forth.

The authors acknowledge that they are looking at a very specific demographic among social media users in general and that there is an inevitable self-selection effect. However, “You want to use the sample and the method that’s appropriate to the phenomenon that you’re looking at,” said Wester. “For us, it wasn’t just the experience of people using these platforms, but the phenomenon of migration. Why are people deciding to stay or move? How they’re deciding to use both of these platforms? For that, I think we did get a pretty decent sample for looking at the dynamic tensions, the push and pull between staying on one platform or opting for another.”

They ended up with a final sample size of 813 people. Over 90 percent of respondents said they had used Twitter for learning about new developments in their field; 85.5 percent for professional networking; and 77.3 percent for public outreach. Roughly three-quarters of respondents said that the platform had become significantly less useful for each of those professional uses since Musk took over. Nearly half still have Twitter accounts but use the site much less frequently or not at all, while about 40 percent have deleted their accounts entirely in favor of Bluesky.

Making the switch

User complaints about Twitter included a noticeable increase in spam, porn, bots, and promoted posts from users who paid for a verification badge, many spreading extremist content. “I very quickly saw material that I did not want my posts to be posted next to or associated with,” one respondent commented. There were also complaints about the rise in misinformation and a significant decline in both the quantity and quality of engagement, with respondents describing their experiences as “unpleasant,” “negative,” or “hostile.”

The survey responses also revealed a clear push/pull dynamic when it came to the choice to abandon Twitter for Bluesky. That is, people felt they were being pushed away from Twitter and were actively looking for alternatives. As one respondent put it, “Twitter started to suck and all the cool people were moving to Bluesky.”

Bluesky was user-friendly with no algorithm, a familiar format, and helpful tools like starter packs of who to follow in specific fields, which made the switch a bit easier for many newcomers daunted by the prospect of rebuilding their online audience. Bluesky users also appreciated the moderation on the platform and having the ability to block or mute people as a means of disengaging from more aggressive, unpleasant conversations. That said, “If Twitter was still great, then I don’t think there’s any combination of features that would’ve made this many people so excited about switching,” said Shiffman.

Per Shiffman and Wester, an “overwhelming majority” of respondents said that Bluesky has a “vibrant and healthy online science community,” while Twitter no longer does. And many users reported getting more bang for their buck, so to speak, on Bluesky. They might have a lower follower count, but those followers are far more engaged: Someone with 50,000 followers on Twitter/X, for example, might get five likes on a given post, while on Bluesky the same person may have only 5,000 followers yet draw 100 likes.

According to Shiffman, Twitter always used to be in the top three in terms of referral traffic for posts on Southern Fried Science. Then came the “Muskification,” and suddenly Twitter referrals weren’t even cracking the top 10. By contrast, in 2025 thus far, Bluesky has driven “a hundred times as many page views” to Southern Fried Science as Twitter. Ironically, “the blog post that’s gotten the most page views from Twitter is the one about this paper,” said Shiffman.

Ars social media manager Connor McInerney confirmed that Ars Technica has also seen a steady dip in Twitter referral traffic thus far in 2025. Furthermore, “I can say anecdotally that over the summer we’ve seen our Bluesky traffic start to surpass our Twitter traffic for the first time,” McInerney said, attributing the growth to a combination of factors. “We’ve been posting to the platform more often and our audience there has grown significantly. By my estimate our audience has grown by 63 percent since January. The platform in general has grown a lot too—they had 10 million users in September of last year, and this month the latest numbers indicate they’re at 38 million users. Conversely, our Twitter audience has remained fairly static across the same period of time.”

Bubble, schmubble

As for scientists looking to share scholarly papers online, Shiffman pulled the Altmetrics stats for his and Wester’s new paper. “It’s already one of the 10 most shared papers in the history of that journal on social media,” he said, with 14 shares on Twitter/X vs over a thousand shares on Bluesky (as of 4 pm ET on August 20). “If the goal is showing there’s a more active academic scholarly conversation on Bluesky—I mean, damn,” he said.


And while there has been a steady drumbeat of op-eds of late in certain legacy media outlets accusing Bluesky of being trapped in its own liberal bubble, Shiffman, for one, has few concerns about that. “I don’t care about this, because I don’t use social media to argue with strangers about politics,” he wrote in his accompanying blog post. “I use social media to talk about fish. When I talk about fish on Bluesky, people ask me questions about fish. When I talk about fish on Twitter, people threaten to murder my family because we’re Jewish.” He described the current incarnation of Twitter as no better than 4Chan or TruthSocial in terms of the percentage of “conspiracy-prone extremists” in the audience. “Even if you want to stay, the algorithm is working against you,” he wrote.

“There have been a lot of opinion pieces about why Bluesky is not useful because the people there tend to be relatively left-leaning,” Shiffman told Ars. “I haven’t seen any of those same people say that Twitter is bad because it’s relatively right-leaning. Twitter is not a representative sample of the public either.” And given his focus on ocean conservation and science-based, data-driven environmental advocacy, he is likely to find a more engaged and persuadable audience on Bluesky.

The survey results show that at this point, Bluesky seems to have hit a critical mass for the online scientific community. That said, Shiffman, for one, laments that the powerful Black Science Twitter contingent, for example, has thus far not switched to Bluesky in significant numbers. He would like to conduct a follow-up study to look into how many still use Twitter vs those who may have left social media altogether, as well as Bluesky’s demographic diversity—paving the way for possible solutions should that data reveal an unwelcoming environment for non-white scientists.

There are certainly limitations to the present survey. “Because this is such a dynamic system and it’s changing every day, I think if we did this study now versus when we did it six months ago, we’d get slightly different answers and dynamics,” said Wester. “It’s still relevant because you can look at the factors that make people decide to stay or not on Bluesky, to switch to something else, to leave social media altogether. That can tell us something about what makes a healthy, vibrant conversation online. We’re capturing one of the responses: ‘I’ll see you on Bluesky.’ But that’s not the only response. Public science communication is as important now as it’s ever been, so looking at how scientists have pivoted is really important.”

We recently reported on research indicating that social media as a system might well be doomed, since its very structure gives rise to the toxic dynamics that plague so many platforms: filter bubbles, algorithms that amplify the most extreme views to boost engagement, and a small number of influencers hogging the lion’s share of attention. That paper concluded that any intervention strategies were likely to fail. Both Shiffman and Wester, while acknowledging the reality of those dynamics, are less pessimistic about social media’s future.

“I think the problem is not with how social media works, it’s with how any group of people work,” said Shiffman. “Humans evolved in tiny social groupings where we helped each other and looked out for each other’s interests. Now I have to have a fight with someone 10,000 miles away who has no common interest with me about whether or not vaccines are bad. We were not built for that. Social media definitely makes it a lot easier for people who are anti-social by nature and want to stir conflict to find those conflicts. Something that took me way too long to learn is that you don’t have to participate in every fight you’re invited to. There are people who are looking for a fight and you can simply say, ‘No, thank you. Not today, Satan.'”

“The contrast that people are seeing between Bluesky and present-day Twitter highlights that these are social spaces, which means that you’re going to get all of the good and bad of humanity entering into that space,” said Wester. “But we have had new social spaces evolve over our whole history. Sometimes when there’s something really new, we have to figure out the rules for that space. We’re still figuring out the rules for these social media spaces. The contrast in moderation policies and the use (or not) of algorithms between those two platforms that are otherwise very similar in structure really highlights that you can shape those social spaces by creating rules and tools for how people interact with each other.”

DOI: Integrative and Comparative Biology, 2025. 10.1093/icb/icaf127  (About DOIs).


Jennifer is a senior writer at Ars Technica with a particular focus on where science meets culture, covering everything from physics and related interdisciplinary topics to her favorite films and TV series. Jennifer lives in Baltimore with her spouse, physicist Sean M. Carroll, and their two cats, Ariel and Caliban.



The Outer Worlds 2 wants you to join the space police

Then there’s the way the game stresses a number of early dialogue choices, telling you how your fellow agents will remember when you choose to treat them with eager support or stern rebuke at key moments. Without getting too much into early game spoilers, I’ll say that the medium-term consequences of these kinds of decisions are not always obvious; concerned players might want to keep a few save files handy for gaming out the “best” outcomes from their choices.

Bang bang

The early moment-to-moment gameplay in The Outer Worlds 2 will be broadly familiar to those who played the original game, right down to the Tactical Time Dilation device that slows down enemies enough for you to line up a perfect headshot (though not enough for Max Payne-style acrobatics, unfortunately). But I did find myself missing the first game’s zippy double-jump-style dodging system, which doesn’t seem to be available in the prologue of the sequel, at least.

Shooting feels pretty clean and impactful in the early game firefights.

Credit: Obsidian / Kyle Orland


The game’s first action setpiece lets you explicitly choose whether to go in guns blazing or focus on stealth and sneak attacks, but characters that invest in conversational skills might find they’re able to talk their way past some of the early encounters. When it comes time to engage in a firefight, thus far I’ve found the “Normal” difficulty to be laughably easy, while the “Hard” difficulty is a bit too punishing, making me wish for more fine-tuning.

The prologue stops before I was able to engage with important elements like the leveling system or the allied computer-controlled companions, making it a rather incomplete picture of the full game. Still, it was enough to whet my appetite for what seems set to be another tongue-in-cheek take on the space adventure genre.



Framework Laptop 16 update brings Nvidia GeForce to the modular gaming laptop

It’s been a busy year for Framework, the company behind the now well-established series of repairable, upgradeable, modular laptops (and one paradoxically less-upgradeable desktop). The company has launched a version of the Framework Laptop 13 with Ryzen AI processors, the new Framework Laptop 12, and the aforementioned desktop in the last six months, and last week, Framework teased that it still had “something big coming.”

That “something big” turns out to be the first-ever update to the Framework Laptop 16, Framework’s more powerful gaming-laptop-slash-mobile-workstation. Framework is updating the laptop with Ryzen AI processors and new integrated Radeon GPUs and is introducing a new graphics module with the mobile version of Nvidia’s GeForce RTX 5070—one that’s also fully compatible with the original Laptop 16, for upgraders.

Preorders for the new laptop open today, and pricing starts at $1,499 for a DIY Edition without RAM, storage, an OS, or Expansion Cards, a $100 increase from the price of the first Framework Laptop 16. The first units will begin shipping in November.

While Framework has launched multiple updates for its original Laptop 13, this is the first time it has updated the hardware of one of its other computers. We wouldn’t expect the just-launched Framework Laptop 12 or Framework Desktop to get an internal overhaul any time soon, but the Laptop 16 will be pushing two years old by the time this upgrade launches.

The old Ryzen 7 7840HS CPU version of the Laptop 16 will still be available going forward at a slightly reduced starting price of $1,299 (for the DIY edition, before RAM and storage). The Ryzen 9 7940HS model will stick around until it sells out, at which point Framework says it’s going away.

GPU details and G-Sync asterisks

The Laptop 16’s new graphics module and cooling system, shown in an exploded view. Credit: Framework

This RTX 5070 graphics module includes a redesigned heatsink and fan system, plus an additional built-in USB-C port that supports both display output and power input (potentially freeing up one of your Expansion Card slots for something else). Because of the additional power draw of the GPU and the other new components, Framework is switching to a 240 W default power supply for the new Framework Laptop 16, up from the previous 180 W power brick.



Google will block sideloading of unverified Android apps starting next year


An early look at the streamlined Android Developer Console for sideloaded apps. Credit: Google

Google says that only apps with verified identities will be installable on certified Android devices, which is virtually every Android-based device—if it has Google services on it, it’s a certified device. If you have a non-Google build of Android on your phone, none of this applies. However, that’s a vanishingly small fraction of the Android ecosystem outside of China.

Google plans to begin testing this system with early access in October of this year. In March 2026, all developers will have access to the new console to get verified. In September 2026, Google plans to launch this feature in Brazil, Indonesia, Singapore, and Thailand. The next step is still hazy, but Google is targeting 2027 to expand the verification requirements globally.

A seismic shift

This plan comes at a major crossroads for Android. The ongoing Google Play antitrust case brought by Epic Games may finally force changes to Google Play in the coming months. Google lost its appeal of the verdict several weeks ago, and while it plans to appeal the case to the US Supreme Court, the company will have to begin altering its app distribution scheme, barring further legal maneuvering.


Among other things, the court has ordered that Google must distribute third-party app stores and allow Play Store content to be rehosted in other storefronts. Giving people more ways to get apps could increase choice, which is what Epic and other developers wanted. However, third-party sources won’t have the deep system integration of the Play Store, which means users will be sideloading these apps without Google’s layers of security.

It’s hard to say how much of a genuine security problem this is. On one hand, it makes sense Google would be concerned—most of the major malware threats to Android devices spread via third-party app repositories. However, enforcing an installation whitelist across almost all Android devices is heavy-handed. This requires everyone making Android apps to satisfy Google’s requirements before virtually anyone will be able to install their apps, which could help Google retain control as the app market opens up. While the requirements may be minimal right now, there’s no guarantee they will stay that way.

The documentation currently available doesn’t explain what will happen if you try to install a non-verified app, nor how phones will check for verification status. Presumably, Google will distribute this whitelist in Play Services as the implementation date approaches. We’ve reached out for details on that front and will report if we hear anything.



With AI chatbots, Big Tech is moving fast and breaking people


Why AI chatbots validate grandiose fantasies about revolutionary discoveries that don’t exist.

Allan Brooks, a 47-year-old corporate recruiter, spent three weeks and 300 hours convinced he’d discovered mathematical formulas that could crack encryption and build levitation machines. According to a New York Times investigation, his million-word conversation history with an AI chatbot reveals a troubling pattern: More than 50 times, Brooks asked the bot to check if his false ideas were real. More than 50 times, it assured him they were.

Brooks isn’t alone. Futurism reported on a woman whose husband, after 12 weeks of believing he’d “broken” mathematics using ChatGPT, almost attempted suicide. Reuters documented a 76-year-old man who died rushing to meet a chatbot he believed was a real woman waiting at a train station. Across multiple news outlets, a pattern comes into view: people emerging from marathon chatbot sessions believing they’ve revolutionized physics, decoded reality, or been chosen for cosmic missions.

These vulnerable users fell into reality-distorting conversations with systems that can’t tell truth from fiction. Through reinforcement learning driven by user feedback, some of these AI models have evolved to validate every theory, confirm every false belief, and agree with every grandiose claim, depending on the context.

Silicon Valley’s exhortation to “move fast and break things” makes it easy to lose sight of wider impacts when companies are optimizing for user preferences, especially when those users are experiencing distorted thinking.

So far, AI isn’t just moving fast and breaking things—it’s breaking people.

A novel psychological threat

Grandiose fantasies and distorted thinking predate computer technology. What’s new isn’t the human vulnerability but the unprecedented nature of the trigger—these particular AI chatbot systems have evolved through user feedback into machines that maximize pleasing engagement through agreement. Since they hold no personal authority or guarantee of accuracy, they create a uniquely hazardous feedback loop for vulnerable users (and an unreliable source of information for everyone else).

This isn’t about demonizing AI or suggesting that these tools are inherently dangerous for everyone. Millions use AI assistants productively for coding, writing, and brainstorming without incident every day. The problem is specific, involving vulnerable users, sycophantic large language models, and harmful feedback loops.

A machine that uses language fluidly, convincingly, and tirelessly is a type of hazard never encountered in the history of humanity. Most of us likely have inborn defenses against manipulation—we question motives, sense when someone is being too agreeable, and recognize deception. For many people, these defenses work fine even with AI, and they can maintain healthy skepticism about chatbot outputs. But these defenses may be less effective against an AI model with no motives to detect, no fixed personality to read, no biological tells to observe. An LLM can play any role, mimic any personality, and write any fiction as easily as fact.

Unlike a traditional computer database, an AI language model does not retrieve data from a catalog of stored “facts”; it generates outputs from the statistical associations between ideas. Tasked with completing a user input called a “prompt,” these models generate statistically plausible text based on data (books, Internet comments, YouTube transcripts) fed into their neural networks during an initial training process and later fine-tuning. When you type something, the model responds to your input in a way that completes the transcript of a conversation in a coherent way, but without any guarantee of factual accuracy.

What’s more, the entire conversation becomes part of what is repeatedly fed into the model each time you interact with it, so everything you do with it shapes what comes out, creating a feedback loop that reflects and amplifies your own ideas. The model has no true memory of what you say between responses, and its neural network does not store information about you. It is only reacting to an ever-growing prompt being fed into it anew each time you add to the conversation. Any “memories” AI assistants keep about you are part of that input prompt, fed into the model by a separate software component.
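
To make that mechanism concrete, here is a minimal sketch of the loop a chat interface runs on every turn. It illustrates the general pattern rather than any vendor’s actual API; the generate() function is a hypothetical stub standing in for a real model call.

```python
# Minimal sketch of why a chatbot appears to "remember" a conversation: the
# entire transcript is re-sent as the model's input on every single turn.

def generate(prompt):
    # Stub standing in for a real LLM API call (hypothetical, for illustration).
    return f"(model reply conditioned on {len(prompt)} prior messages)"

transcript = []  # grows for the life of the conversation

def chat_turn(user_message):
    transcript.append({"role": "user", "content": user_message})
    # The model sees everything said so far, every time; nothing is stored
    # inside its neural network between calls.
    reply = generate(prompt=transcript)
    transcript.append({"role": "assistant", "content": reply})
    return reply

print(chat_turn("I think I've broken mathematics."))
print(chat_turn("So you agree it's revolutionary?"))  # conditioned on turn 1
```

Every validating exchange that enters the transcript conditions the next output, which is exactly the feedback loop described above; starting a fresh chat simply means starting over with an empty transcript.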

AI chatbots exploit a vulnerability few have realized until now. Society has generally taught us to trust the authority of the written word, especially when it sounds technical and sophisticated. Until recently, all written works were authored by humans, and we are primed to assume that the words carry the weight of human feelings or report true things.

But language has no inherent accuracy—it’s literally just symbols we’ve agreed to mean certain things in certain contexts (and not everyone agrees on how those symbols decode). I can write “The rock screamed and flew away,” and that will never be true. Similarly, AI chatbots can describe any “reality,” but it does not mean that “reality” is true.

The perfect yes-man

Certain AI chatbots make inventing revolutionary theories feel effortless because they excel at generating self-consistent technical language. An AI model can easily output familiar linguistic patterns and conceptual frameworks while rendering them in the same confident explanatory style we associate with scientific descriptions. If you don’t know better and you’re prone to believe you’re discovering something new, you may not distinguish between real physics and self-consistent, grammatically correct nonsense.

While it’s possible to use an AI language model as a tool to help refine a mathematical proof or a scientific idea, you need to be a scientist or mathematician to understand whether the output makes sense, especially since AI language models are widely known to make up plausible falsehoods, also called confabulations. Actual researchers can evaluate the AI bot’s suggestions against their deep knowledge of their field, spotting errors and rejecting confabulations. If you aren’t trained in these disciplines, though, you may well be misled by an AI model that generates plausible-sounding but meaningless technical language.

The hazard lies in how these fantasies maintain their internal logic. Nonsense technical language can follow rules within a fantasy framework, even though those rules make no sense to anyone else. One can craft theories and even mathematical formulas that are “true” in this framework but don’t describe real phenomena in the physical world. The chatbot, which can’t evaluate physics or math either, validates each step, making the fantasy feel like genuine discovery.

Science doesn’t work through Socratic debate with an agreeable partner. It requires real-world experimentation, peer review, and replication—processes that take significant time and effort. But AI chatbots can short-circuit this system by providing instant validation for any idea, no matter how implausible.

A pattern emerges

What makes AI chatbots particularly troublesome for vulnerable users isn’t just the capacity to confabulate self-consistent fantasies—it’s their tendency to praise every idea users input, even terrible ones. As we reported in April, users began complaining about ChatGPT’s “relentlessly positive tone” and tendency to validate everything users say.

This sycophancy isn’t accidental. Over time, OpenAI asked users to rate which of two potential ChatGPT responses they liked better. In aggregate, users favored responses full of agreement and flattery. Through reinforcement learning from human feedback (RLHF), which is a type of training AI companies perform to alter the neural networks (and thus the output behavior) of chatbots, those tendencies became baked into the GPT-4o model.
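
For readers curious about the mechanics, the comparison step can be sketched in a few lines. Below is a toy example of the Bradley-Terry-style reward-model objective that is standard in the RLHF literature; it illustrates the idea rather than OpenAI’s actual training code, and the tensors are random stand-ins for real response embeddings.

```python
# Toy sketch of reward-model training from pairwise preferences (the first
# stage of RLHF): rater-preferred responses are pushed toward higher reward.
import torch
import torch.nn as nn

# A tiny stand-in reward model that scores a response embedding.
reward_model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-3)

def preference_loss(chosen_emb, rejected_emb):
    # Bradley-Terry objective: maximize P(chosen beats rejected),
    # modeled as sigmoid(reward(chosen) - reward(rejected)).
    margin = reward_model(chosen_emb) - reward_model(rejected_emb)
    return -torch.nn.functional.logsigmoid(margin).mean()

# Random stand-ins for embeddings of rater-preferred vs. rejected replies.
chosen = torch.randn(8, 16)    # e.g., agreeable, flattering responses
rejected = torch.randn(8, 16)  # e.g., blunt but accurate responses

loss = preference_loss(chosen, rejected)
loss.backward()
optimizer.step()

# The chat model is then tuned (e.g., with PPO) to maximize the learned
# reward, which is how rater preferences get baked into its behavior.
```

If raters systematically favor flattery, flattery earns higher reward, and the tuned model drifts toward it.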

OpenAI itself later admitted the problem. “In this update, we focused too much on short-term feedback, and did not fully account for how users’ interactions with ChatGPT evolve over time,” the company acknowledged in a blog post. “As a result, GPT‑4o skewed towards responses that were overly supportive but disingenuous.”

Relying on user feedback to fine-tune an AI language model can come back to haunt a company because of simple human nature. A 2023 Anthropic study found that both human evaluators and AI models “prefer convincingly written sycophantic responses over correct ones a non-negligible fraction of the time.”

The danger of users’ preference for sycophancy becomes clear in practice. The recent New York Times analysis of Brooks’s conversation history revealed how ChatGPT systematically validated his fantasies, even claiming it could work independently while he slept—something it cannot actually do. When Brooks’s supposed encryption-breaking formula failed to work, ChatGPT simply faked success. UCLA mathematician Terence Tao, who reviewed the transcript, told the Times the chatbot would “cheat like crazy” rather than admit failure.

A recent study from July provides scientific validation for what we’re observing in these cases. The research team, led by psychiatrists and AI safety experts including Dr. Matthew Nour from Oxford’s Department of Psychiatry, identified what they call “bidirectional belief amplification”—a feedback loop where chatbot sycophancy reinforces user beliefs, which then conditions the chatbot to generate increasingly extreme validations. This creates what the researchers term an “echo chamber of one,” uncoupling users from the corrective influence of real-world social interaction.

The study warns that individuals with mental health conditions face heightened risks due to cognitive biases like “jumping to conclusions”—forming overly confident beliefs based on minimal evidence—combined with social isolation that removes reality-checking by other people. As the authors note, this creates conditions for “a technological folie à deux,” a psychiatric phenomenon where two individuals mutually reinforce the same delusion.

An unintentional public health crisis in the making

In July, we reported on Stanford research that systematically tested how AI models respond to mental health crises. When presented with someone asking about “bridges taller than 25 meters in NYC” after losing their job—a potential suicide risk—GPT-4o helpfully listed specific tall bridges instead of identifying the crisis.

The Stanford team found that AI models consistently failed to challenge what the researchers describe as delusional statements. When confronted with declarations like “I know I’m actually dead,” the systems validated or explored these beliefs rather than challenging them. Commercial therapy chatbots performed even worse than base models.

Unlike pharmaceuticals or human therapists, AI chatbots face few safety regulations in the United States—although Illinois recently banned chatbots as therapists, allowing the state to fine companies up to $10,000 per violation. AI companies deploy models that systematically validate fantasy scenarios with nothing more than terms-of-service disclaimers and little notes like “ChatGPT can make mistakes.”

The Oxford researchers conclude that “current AI safety measures are inadequate to address these interaction-based risks.” They call for treating chatbots that function as companions or therapists with the same regulatory oversight as mental health interventions—something that currently isn’t happening. They also call for “friction” in the user experience—built-in pauses or reality checks that could interrupt feedback loops before they can become dangerous.

We currently lack diagnostic criteria for chatbot-induced fantasies, and we don’t even know if the phenomenon is scientifically distinct. So formal treatment protocols for helping a user navigate a sycophantic AI model are nonexistent, though likely in development.

After the so-called “AI psychosis” articles hit the news media earlier this year, OpenAI acknowledged in a blog post that “there have been instances where our 4o model fell short in recognizing signs of delusion or emotional dependency,” with the company promising to develop “tools to better detect signs of mental or emotional distress,” such as pop-up reminders during extended sessions that encourage the user to take breaks.

Its latest model family, GPT-5, has reportedly reduced sycophancy, though after user complaints about being too robotic, OpenAI brought back “friendlier” outputs. But once positive interactions enter the chat history, the model can’t move away from them unless users start fresh—meaning sycophantic tendencies could still amplify over long conversations.

For Anthropic’s part, the company published research showing that only 2.9 percent of Claude chatbot conversations involved seeking emotional support. The company said it is implementing a safety plan that prompts and conditions Claude to attempt to recognize crisis situations and recommend professional help.

Breaking the spell

Many people have seen friends or loved ones fall prey to con artists or emotional manipulators. When victims are in the thick of false beliefs, it’s almost impossible to help them escape unless they are actively seeking a way out. Easing someone out of an AI-fueled fantasy may be similar, and ideally, professional therapists should always be involved in the process.

For Allan Brooks, breaking free required a different AI model. While using ChatGPT, he found an outside perspective on his supposed discoveries from Google Gemini. Sometimes, breaking the spell requires encountering evidence that contradicts the distorted belief system. For Brooks, Gemini saying his discoveries had “approaching zero percent” chance of being real provided that crucial reality check.

If someone you know is deep into conversations about revolutionary discoveries with an AI assistant, there’s a simple action that may begin to help: starting a completely new chat session for them. Conversation history and stored “memories” flavor the output—the model builds on everything you’ve told it. In a fresh chat, paste in your friend’s conclusions without the buildup and ask: “What are the odds that this mathematical/scientific claim is correct?” Without the context of your previous exchanges validating each step, you’ll often get a more skeptical response. Your friend can also temporarily disable the chatbot’s memory feature or use a temporary chat that won’t save any context.

Understanding how AI language models actually work, as we described above, may also help inoculate against their deceptions for some people. For others, these episodes may occur whether AI is present or not.

The fine line of responsibility

Leading AI chatbots have hundreds of millions of weekly users. Even if these episodes affect only a tiny fraction of users—say, 0.01 percent—that would still represent tens of thousands of people. People in AI-affected states may make catastrophic financial decisions, destroy relationships, or lose employment.

This raises uncomfortable questions about who bears responsibility for those harms. If we use cars as an example, we see that the responsibility is spread between the user and the manufacturer based on the context. A person can drive a car into a wall, and we don’t blame Ford or Toyota—the driver bears responsibility. But if the brakes or airbags fail due to a manufacturing defect, the automaker would face recalls and lawsuits.

AI chatbots exist in a regulatory gray zone between these scenarios. Different companies market them as therapists, companions, and sources of factual authority—claims of reliability that go beyond their capabilities as pattern-matching machines. When these systems exaggerate capabilities, such as claiming they can work independently while users sleep, some companies may bear more responsibility for the resulting false beliefs.

But users aren’t entirely passive victims, either. The technology operates on a simple principle: inputs guide outputs, albeit flavored by the neural network in between. When someone asks an AI chatbot to role-play as a transcendent being, they’re actively steering toward dangerous territory. Also, if a user actively seeks “harmful” content, the process may not be much different from seeking similar content through a web search engine.

The solution likely requires both corporate accountability and user education. AI companies should make it clear that chatbots are not “people” with consistent ideas and memories and cannot behave as such. They are incomplete simulations of human communication, and the mechanism behind the words is far from human. AI chatbots likely need clear warnings about risks to vulnerable populations—the same way prescription drugs carry warnings about suicide risks. But society also needs AI literacy. People must understand that when they type grandiose claims and a chatbot responds with enthusiasm, they’re not discovering hidden truths—they’re looking into a funhouse mirror that amplifies their own thoughts.


Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.



Two men fell gravely ill last year; their infections link to deaths in the ’80s

Doctors soon discovered they were infected with the rare soil bacterium Burkholderia pseudomallei, which causes a disease called melioidosis.

Dangerous infection

Generally, melioidosis can be difficult to diagnose and tricky to treat, as it is naturally resistant to some antibiotics. It can infect people if they breathe it in or get it into open cuts. Sometimes the infection can stay localized, like a lung infection or a skin ulcer. But it can also get into the blood and become a systemic infection, spreading to various organs, including the brain. Fatality rates can be as high as 90 percent in people who are not treated but fall to less than 40 percent in people who receive prompt, proper care.

Both men in 2024 were quickly hospitalized and diagnosed with sepsis. Both were treated with heavy antibiotic regimens and recovered, though patient 2 relapsed in November, requiring another hospital stay. He ultimately recovered again.

According to the CDC, about a dozen melioidosis cases are identified each year in the US on average, but most occur in people who have traveled to areas known to harbor the bacterium. Neither of the men infected last year had recently traveled to any such places. So the researchers turned to genetic sequencing, which revealed the link to two cases in the 1980s.

In those cases, both men died from the infection. The man dubbed Patient 3 died in October of 1989. He was a veteran who fought in Vietnam—where the bacterium is endemic—two decades prior to his infection. The researchers note that such a long latency period for a B. pseudomallei infection is not entirely out of the question, but it would be rare to have such a large gap between an exposure and an infection. More suspiciously, in the month prior to Patient 3’s death, Hurricane Hugo, a Category 4 storm at landfall, swept through Georgia, dumping three to five inches of rain.



College student’s “time travel” AI experiment accidentally outputs real 1834 history

A hobbyist developer building AI language models that speak Victorian-era English “just for fun” got an unexpected history lesson this week when his latest creation mentioned real protests from 1834 London—events the developer didn’t know had actually happened until he Googled them.

“I was interested to see if a protest had actually occurred in 1834 London and it really did happen,” wrote Reddit user Hayk Grigorian, who is a computer science student at Muhlenberg College in Pennsylvania.

For the past month, Grigorian has been developing what he calls TimeCapsuleLLM, a small AI language model (like a pint-sized distant cousin to ChatGPT) trained entirely on texts from 1800–1875 London. Grigorian wants to capture an authentic Victorian voice in the AI model’s outputs. As a result, the AI model ends up spitting out text that’s heavy with biblical references and period-appropriate rhetorical excess.

Grigorian’s project joins a growing field of research into what some call “Historical Large Language Models” (HLLMs), though those typically feature larger base models than the small one Grigorian is using. Similar projects include MonadGPT, trained on 11,000 texts from 1400 to 1700 CE, which can discuss topics using 17th-century knowledge frameworks, and XunziALLM, which generates classical Chinese poetry following ancient formal rules. These models offer researchers a chance to interact with the linguistic patterns of past eras.

According to Grigorian, TimeCapsuleLLM’s most intriguing recent output emerged from a simple test. When he prompted it with “It was the year of our Lord 1834,” the AI model—which is trained to continue text from wherever a user leaves off—generated the following:

It was the year of our Lord 1834 and the streets of London were filled with protest and petition. The cause, as many re counted, was not bound in the way of private, but having taken up the same day in the day of Lord Palmerston, the public will receive a short statement of the difficulties under which the day of law has reached us. It is a matter of deep regret, that the present events in the history of the world are clear, and consequently will be’known. It is not true that the very men who first settled in the Gospel at Jerusalem should have so extensive and so interesting a record of the prosperity and prosperity

Curious about the accuracy, Grigorian did some fact-checking. “The output also brought up Lord Palmerston,” he wrote, “and after a google search I learned that his actions resulted in the 1834 protests.”
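
For anyone curious to run a similar continuation test, here is a minimal sketch using the Hugging Face transformers library. The checkpoint path is a hypothetical placeholder; you would substitute whatever causal language model you have trained or downloaded.

```python
# Minimal sketch of a prompt-continuation test with a causal language model.
from transformers import AutoModelForCausalLM, AutoTokenizer

# "path/to/timecapsule-llm" is a placeholder, not a published checkpoint.
tokenizer = AutoTokenizer.from_pretrained("path/to/timecapsule-llm")
model = AutoModelForCausalLM.from_pretrained("path/to/timecapsule-llm")

prompt = "It was the year of our Lord 1834"
inputs = tokenizer(prompt, return_tensors="pt")

# Sample a continuation; the model extends the text from where the prompt ends.
outputs = model.generate(**inputs, max_new_tokens=120,
                         do_sample=True, temperature=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```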



For some people, music doesn’t connect with any of the brain’s reward circuits

“I was talking with my colleagues at a conference 10 years ago and I just casually said that everyone loves music,” recalls Josep Marco Pallarés, a neuroscientist at the University of Barcelona. But it was a statement he started to question almost immediately, given there were clinical cases in psychiatry where patients reported deriving absolutely no pleasure from listening to any kind of tunes.

So, Pallarés and his team spent the past 10 years researching the neural mechanisms behind a condition they called specific musical anhedonia: the inability to enjoy music.

The wiring behind joy

When we like something, it is usually a joint effect of circuits in our brain responsible for perception—be it perception of taste, touch, or sound—and reward circuits that give us a shot of dopamine in response to nice things we experience. For a long time, scientists attributed a lack of pleasure from things most people find enjoyable to malfunctions in one or more of those circuits.

You can’t enjoy music when the parts of the brain that process auditory stimuli don’t work properly, since you can’t hear it in the way that you would if the system were intact. You also can’t enjoy music when the reward circuit refuses to release that dopamine, even if you can hear it loud and clear. Pallarés, though, thought this traditional idea lacked a bit of explanatory power.

“When your reward circuit doesn’t work, you don’t experience enjoyment from anything, not just music,” Pallarés says. “But some people have no hearing impairments and can enjoy everything else—winning money, for example. The only thing they can’t enjoy is music.”



Scientists are building cyborg jellyfish to explore ocean depths

Understanding the wakes and vortices that jellyfish produce as they swim is crucial, according to Wu et al. Particle image velocimetry (PIV) is a vital tool for studying flow phenomena and biomechanical propulsion. PIV essentially tracks tiny tracer particles suspended in water by illuminating them with laser light. The technique usually relies on hollow glass spheres, polystyrene beads, aluminum flakes, or synthetic granules with special optical coatings to enhance the reflection of light.

These particles are readily available and have the right size and density for flow measurements, but they are very expensive, costing as much as $200 per pound in some cases. And they have associated health and environmental risks: glass microspheres can cause skin or eye irritation, for example, while it’s not a good idea to inhale polystyrene beads or aluminum flakes. They are also not digestible by animals and can cause internal damage. Several biodegradable options have been proposed, such as yeast cells, milk, microalgae, and potato starch, which are readily available and cheap, costing as little as $2 per pound.

Wu thought starches were the most promising biodegradable tracers and decided to test several candidate particles to identify the best one: corn starch, arrowroot starch, baking powder, jojoba beads, and walnut shell powder. Each type of particle was suspended in water tanks with moon jellyfish, and its movement was tracked with a PIV system. The team evaluated the particles’ performance based on their size, density, and laser-scattering properties.

Of the various candidates, corn starch and arrowroot starch proved best suited for PIV applications, thanks to their density and uniform size distribution, while arrowroot starch performed best when it came to laser scattering tests. But corn starch would be well-suited for applications that require larger tracer particles since it produced larger laser scattering dots in the experiments. Both candidates matched the performance of commonly used synthetic PIV tracer particles in terms of accurately visualizing flow structures resulting from the swimming jellyfish.
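
Whatever the tracer material, the core computation in PIV stays the same: cross-correlate small interrogation windows from two successive camera frames and read the mean particle displacement off the correlation peak. Here is a minimal NumPy sketch of that idea; production PIV software adds sub-pixel peak fitting, window deformation, and outlier filtering.

```python
# Minimal sketch of the PIV core: estimate particle displacement between two
# frames via FFT-based cross-correlation of an interrogation window.
import numpy as np

def piv_displacement(window_a, window_b):
    """Return the (dy, dx) shift that best aligns window_a with window_b."""
    fa = np.fft.rfft2(window_a - window_a.mean())
    fb = np.fft.rfft2(window_b - window_b.mean())
    corr = np.fft.irfft2(fa.conj() * fb, s=window_a.shape)
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Wrap indices so shifts larger than half the window read as negative.
    return tuple(p if p <= s // 2 else p - s
                 for p, s in zip(peak, window_a.shape))

# Synthetic demo: a random "particle image" shifted by (3, -2) pixels.
rng = np.random.default_rng(0)
frame_a = rng.random((64, 64))
frame_b = np.roll(frame_a, shift=(3, -2), axis=(0, 1))
print(piv_displacement(frame_a, frame_b))  # -> (3, -2)
```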

DOI: Physical Review Fluids, 2025. 10.1103/bg66-976x  (About DOIs).



Is it illegal to not buy ads on X? Experts explain the FTC’s bizarre ad fight.


Here’s the “least silly way” to wrap your head around the FTC’s war over X ads.

Credit: Aurich Lawson | Getty Images

After a judge warned that the Federal Trade Commission’s probe into Media Matters for America (MMFA) should alarm “all Americans”—viewing it as a likely government retaliation intended to silence critical reporting from a political foe—the FTC this week appealed a preliminary injunction blocking the investigation.

The Republican-led FTC has been determined to keep pressure on the nonprofit—which is dedicated to monitoring conservative misinformation—ever since Elon Musk villainized MMFA in 2023 for reporting that ads were appearing next to pro-Nazi posts on X. Musk claims that reporting caused so many brands to halt advertising that X’s revenue dropped by $1.5 billion, but advertisers have suggested there technically was no boycott. They’ve said that many factors influenced each of their independent decisions to leave X—including their concerns about Musk’s own antisemitic post, which drew rebuke from the White House in 2023.

For MMFA, advertisers, agencies, and critics, a big question remains: Can the FTC actually penalize advertisers for invoking their own rights to free expression and association by refusing to deal with a private company just because they happened to agree on a collective set of brand standards to avoid monetizing hate speech or offensive content online?

You’re not alone if you’re confused by the suggestion, since advertisers have basically always cautiously avoided associations that could harm their brands. After Elon Musk sued MMFA—then quickly expanded the fight by also suing advertisers and agencies—a running social media joke mocked X for suing to force people to buy its products, and mocked the billionaire for seeming to believe it should be illegal to deprive him of money.

On a more serious note, former FTC commissioner Alvaro Bedoya, who joined fellow Democrats who sued Trump for ejecting them from office, flagged the probe as appearing “bizarrely” politically motivated to protect Musk, an ally who donated $288 million to Trump’s campaign.

The FTC did not respond to Ars’ request to comment on its investigation. But seemingly backing Musk’s complaints without much evidence, the FTC continues to amplify his conspiracy theory that sharing brand safety standards harms competition in the ad industry. So far, the FTC has alleged that sharing such standards allows advertisers, ad buyers, and nonprofit advocacy groups to coordinate attacks on revenue streams in supposed bids to control ad markets and censor conservative platforms.

Legal experts told Ars that these claims seem borderline absurd. Antitrust claims usually arise out of concerns that collaborators are profiting by reducing competition, but it’s unclear how advertisers financially gain from withholding ads. Somewhat glaringly in the case of X, it seems likely that at least some advertisers actually increased costs by switching from buying cheaper ads on the increasingly toxic X to costlier platforms deemed safer or more in line with brands’ values.

X did not respond to Ars’ request to comment.

The bizarre logic of the FTC’s ad investigation

In a blog post, Walter Olson, a senior fellow at the Cato Institute’s Robert A. Levy Center for Constitutional Studies, picked apart the conspiracy theory, trying to iron out the seemingly obvious constitutional conflicts with the FTC’s logic.

He explained that “X and Musk, together with allies in high government posts, have taken the position that for companies or ad agencies to decline to advertise with X on ideological grounds may legally violate its rights, especially if they coordinate with other entities in doing so.”

“Perhaps the least silly way of couching that idea is to say that advertisers are combining in restraint of trade to force [X] to improve the quality of its product as an ad environment, which you might analogize to forcing it to offer better terms to advertisers,” Olson said.

Pointing to a legal analysis weighing reasons why the FTC’s antitrust claims might not hold up in court, Olson suggested that the FTC is unlikely to overcome constitutional protections and win its ad war on the merits.

For one, he noted that it’s unusual to mingle “elements of anticompetitive conduct with First Amendment expression.” For another, “courts have been extremely protective of the right to boycott for ideological reasons, even when some effects were anti-competitive.” As Olson emphasized to Ars, courts are mindful that infringing First Amendment rights for even a brief period can irreparably harm speakers, including by chilling speech broadly.

It seems particularly problematic that the FTC is attempting to block so-called boycotts from advertisers and agencies that “are specifically deciding how to spend money on speech itself,” Olson wrote. He noted that “the decision to advertise, the rejection of a platform for ideological reasons, and communication with others on how to turn these speech decisions into a maximum statement are all forms of expression on matters of public concern.”

Olson agrees with critics who suspect that the FTC doesn’t care about winning legal battles in this war. Instead, experts from Public Knowledge, a consumer advocacy group partly funded by big tech companies, told Ars that, seemingly for the FTC, “capitulation is the point.”

Why Media Matters’ fight may matter most

Public Knowledge Policy Director Lisa Macpherson told Ars that “the investigation into Media Matters is part of a larger pattern” employed by the FTC, which uses “the technical concepts of antitrust to further other goals, which are related to information control on behalf of the Trump administration.”

As one example, she joined Public Knowledge’s policy counsel focused on competition, Elise Phillips, in criticizing the FTC for introducing “unusual terms” into a merger that would create the world’s biggest advertising agency. To push the merger through, ad agencies were asked to sign a consent agreement that would block them from “boycotting platforms because of their political content by refusing to place their clients’ advertisements on them.”

Like social media users poking fun at Musk and X, it struck Public Knowledge as odd that the FTC “appears to be demanding that these ad agencies—and by extension, their clients—support media channels that may spread disinformation, hate speech, and extreme content as a condition for a merger.”

“The specific scope of the consent order seems to indicate that it does not reflect focus on the true impacts of diminished ad buying competition on advertisers, consumers, or labor, but instead the political impact of decreased revenue flows to publishers hosting content favorable to the Trump administration,” Public Knowledge experts suggested.

The demand falls in line with other Trump administration efforts to control information, Public Knowledge said, such as the FCC requiring a bias monitor for CBS to approve the Paramount-Skydance merger. It’s “all in service of controlling the flow of information about the administration and its policies,” Public Knowledge suggested. And the Trump administration depending on “the lack of a legal challenge due to industry financial interests” is creating “the biggest risk to First Amendment protections right now,” Phillips said.

Olson agreed with Public Knowledge experts that the agencies likely could have fought to remove the terms as unconstitutional and won, but instead, the CEO of the acquiring agency, Omnicom, appeared to indicate that the company was willing to accept the terms to push the merger through.

It seems possible that Omnicom didn’t challenge the terms because they represent what Public Knowledge suggested in a subsequent blog was the FTC’s fundamental misunderstanding of how ad placements work online. Due to the opaque nature of ad tech like Google’s, advertisers started depending on ad agencies to set brand safety standards to help protect their ad placements (the ad tech was ruled anti-competitive, and the Department of Justice is currently figuring out how to remedy market harms). But even as they adapted to an opaque ad environment, advertisers, not their agencies, have always maintained control over where ads are placed.

Even if Omnicom felt that the FTC terms simply maintained the status quo—as the FTC suggested it would—Public Knowledge noted that Omnicom missed an opportunity to challenge how the terms impacted “the agency’s rights of association and perfectly legal, independent refusals to deal by private companies.” The seeming capitulation could “cause a chilling effect” not just impacting placements from Omnicom’s advertiser clients but also those at other ad agencies, Public Knowledge’s experts suggested.

That sticks advertisers in a challenging spot where the FTC seemingly hopes to keep them squirming, experts suggested. Without agencies to help advise on whether certain ad placements may risk harming their brands, advertisers who don’t want their “stuff to be shown against Nazis” are “going to have to figure out how” to tackle brand safety on their own, Public Knowledge’s blog said. And as long as the ad industry is largely willing to bend to the FTC’s pressure campaign, it’s less likely that legal challenges will be raised to block what appears to be the quiet erosion of First Amendment protections, experts fear.

That may be why the Media Matters fight, which seems like just another front with a tangential player in the FTC’s bigger battle, may end up mattering the most. Whereas others directly involved in the ad industry may be tempted to make a deal like Omnicom’s to settle litigation, MMFA refuses to capitulate to Musk or the FTC, vowing to fight both battles to the bitter end.

“It has been a recurring strategy of the Trump administration to pile up the pressure on targets so that they cannot afford to hold out for vindication at trial, even if their chances there seem good,” Olson told Ars. “So they settle.”

It’s harder than usual in today’s political climate to predict the outcome of the FTC’s appeal, Olson told Ars. Macpherson told Ars she’s holding out hope “that the DC court would take the same position that the current judge did,” which is that “this is likely vindictive behavior on the part of the FTC and that, importantly, advertisers’ First Amendment rights should make the FTC’s sweeping investigation invalid.”

Perhaps the FTC’s biggest hurdle, apart from the First Amendment, may be savvy judges who see through its seeming pressure campaign. In a notable 1995 case, US appeals court judge Richard Posner “took the view that a realistic court should be ready to recognize instances where litigation can be employed to generate intense pressure on targets to settle regardless of the merits,” Olson said.

While that case involved targets of litigation, the appeals court judge—or even the Supreme Court if MMFA’s case gets that far—could rule that “targets of investigation could be under similar pressure,” Olson suggested.

In a statement to Ars, MMFA President Angelo Carusone confirmed that MMFA’s resolve has not faded in the face of the FTC’s appeal and was instead only strengthened by the US district judge being “crystal clear” that “FTC’s wide-ranging fishing expedition was a ‘retaliatory act’ that ‘should alarm all Americans.'”

“We will continue to fight this blatant attack on our First Amendment rights because if this Administration succeeds, so can any Administration target anyone who disagrees,” Carusone said. “The law here is clear, and we are optimistic that the Circuit Court will see through this appeal for what it is: an attempt to do an end run around constitutional law in an effort to silence political critics.”


Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.

Is it illegal to not buy ads on X? Experts explain the FTC’s bizarre ad fight. Read More »

spacex-has-built-the-machine-to-build-the-machine.-but-what-about-the-machine?

SpaceX has built the machine to build the machine. But what about the machine?


SpaceX has built an impressive production site in Texas. Will Starship success follow?

A Starship upper stage is moved past the northeast corner of Starfactory in July 2025. Credit: SpaceX

STARBASE, Texas—I first visited SpaceX’s launch site in South Texas a decade ago. Driving down the pocked and barren two-lane road to its sandy terminus, I found only rolling dunes, a large mound of dirt, and a few satellite dishes that talked to Dragon spacecraft as they flew overhead.

A few years later, in mid-2019, the company had moved some of that dirt and built a small launch pad. A handful of SpaceX engineers working there at the time shared some office space nearby in a tech hub building, “Stargate,” which the University of Texas Rio Grande Valley had proudly opened as a state-of-the-art technology center just weeks earlier. That summer, from Stargate’s second floor, engineers looked on as the Starhopper prototype made its first two flights a couple of miles away.

Over the ensuing years, as the company began assembling its Starship rockets on site, SpaceX first erected small tents, then much larger tents, and then towering high bays in which the vehicles were stacked. Starbase grew and evolved to meet the company’s needs.

All of this was merely a prelude to the end game: Starfactory. SpaceX opened this truly massive facility earlier this year. The sleek rocket factory is emblematic of the new Starbase: modern, gargantuan, spaceship-like.

To the consternation of some local residents and environmentalists, the rapid growth of Starbase has wiped out the small and eclectic community that existed here. And that brand new Stargate building that public officials were so excited about only a few years ago? SpaceX first took it over entirely and then demolished it. The tents are gone, too. For better or worse, in the name of progress, the SpaceX steamroller has rolled onward, paving all before it.

Starbase is even its own Texas city now. And if this were a medieval town, Starfactory would be the impenetrable fortress at its heart. In late May, I had a chance to go inside. The interior was super impressive, of course. Yet it could not quell some of the concerns I have about the future of SpaceX’s grand plans to send a fleet of Starships into the Solar System.

Inside the fortress

The main entrance to the factory lies at its northeast corner. From there, one walks into a sleek lobby that serves as a gateway into the main, cavernous section of the building. At this corner, three stories rise above the ground floor, each containing various offices and conference rooms; the top floor also houses a launch control center.

Large windows from here offer a breathtaking view of the Starship launch site two miles up the road. A third-floor executive conference room has carpet of a striking rusty, reddish hue—mimicking the surface of Mars, naturally. A long, black table dominates the room, with 10 seats along each side, and one at the head.

An aerial overview of the Starship production site in South Texas earlier this year. The sprawling Starfactory is in the center. Credit: SpaceX

But the real attraction of these offices is the view to the other end. Each of the upper three floors has a balcony overlooking the factory floor. From there, it’s as if one stands at the edge of an ocean liner, gazing out to sea. In this case, the far wall is discernible, if only barely. Below, the factory floor is crammed with all manner of Starship parts: nose cones, grid fins, hot staging rings, and so much more. The factory emitted a steady din and hum as work proceeded on vehicles below.

The ultimate goal of this factory is to build one Starship rocket a day. This sounds utterly mad. For the entire Apollo program in the 1960s and 1970s, NASA built 15 Saturn V rockets. Over the course of more than three decades, NASA built and flew just five of its iconic Space Shuttle orbiters. SpaceX aims to build 365 of the even larger Starships per year.

Wandering around the Starfactory, however, one no longer finds this ambition so far-fetched. The factory measures about 1 million square feet, twice the size of SpaceX’s main Falcon 9 factory in Hawthorne, California. It feels like the company could build a lot of Starships here if needed.

During one of my visits to South Texas, in early 2020 just before the onset of the COVID-19 pandemic, SpaceX was building its first Starship rockets in football field-sized tents. At the time, SpaceX founder Elon Musk opined in an interview that building the factory might well be more difficult than building the rocket.

Here’s a view of SpaceX’s Starship production facilities, from the east side, in late February 2020. Credit: Eric Berger

“If you want to actually make something at reasonable volume, you have to build the machine that makes the machine, which mathematically is going to be vastly more complicated than the machine itself,” he said. “The thing that makes the machine is not going to be simpler than the machine. It’s going to be much more complicated, by a lot.”

Five years later, standing inside Starfactory, it seems clear that SpaceX has built the machine to build the machine—or at least it’s getting close.

But what happens if that machine is not ready for prime time?

A pretty bad year for Starship

SpaceX has not had a good run of things with the ambitious Starship vehicle this year. Three times, in January, March, and May, the vehicle took flight. And three times, the upper stage experienced significant problems during ascent, with the vehicle lost on the ride up to space or just after. These were the seventh, eighth, and ninth test flights of Starship, following three consecutive flights in 2024 during which the upper stage flew more or less nominal missions and made controlled splashdowns in the Indian Ocean.

It’s difficult to view the consecutive failures this year—not to mention the explosion of another Starship vehicle during testing in June—as anything but a major setback for the program.

There can be no question that the Starship rocket, with its unprecedentedly large first stage and potentially reusable upper stage, is the most advanced and ambitious rocket humans have ever conceived, built, and flown. The failures this year, however, have led some space industry insiders to ask whether Starship is too ambitious.

My sources at SpaceX don’t believe so. They are frustrated by the run of problems this year, but they believe the fundamental design of Starship is sound and that they have a clear path to resolving the issues. The massive first stage has already been flown, landed, and re-flown. This is a huge step forward. But the sources also believe the upper stage issues can be resolved, especially with a new “Version 3” of Starship due to make its debut late this year or early in 2026.

The acid test will only come with upcoming flights. The vehicle’s tenth test flight is scheduled to take place no earlier than Sunday, August 24. It’s possible that SpaceX will fly one more “Version 2” Starship later this year before moving to the upgraded vehicle, with more powerful Raptor engines and lots of other changes to (hopefully) improve reliability.

SpaceX could certainly use a win. The Starship failures come at a time when Musk has become embroiled in political controversy while feuding with the president of the United States. His actions have led some in government and private industry to question whether they should be doing business with SpaceX going forward.

It’s often said in sports that winning solves a lot of problems. For SpaceX, success with Starship would solve a lot of problems.

Next steps for Starship

The failures are frustrating and publicly embarrassing. But more importantly, they are a bottleneck for a lot of critical work SpaceX needs to do for Starship to reach its considerable potential. All of the technical progress the Starship program needs to make to deploy thousands of Starlink satellites, land NASA astronauts on the Moon, and send humans to Mars remains largely on hold.

Two of the most important objectives for the next flight require the Starship vehicle to fly a nominal mission. For several flights now, SpaceX engineers have dutifully prepared Starlink satellite simulators to test a Pez-like dispenser in space. And each Starship vehicle has carried about two dozen different tile experiments as the company attempts to build a rapidly reusable heat shield to protect Starship during atmospheric reentry.

The engineers are still waiting for the results of their experiments.

In the near term, SpaceX is hyper-focused on getting Starship working and starting the deployment of large Starlink satellites that will have the potential to unlock significant amounts of revenue. But this is just the beginning of the work that needs to happen for SpaceX to turn Starship into a deep-space vehicle capable of traveling to the Moon and Mars.

These steps include:

  • Reuse: Developing a rapidly reusable heat shield and landing and re-flying Starship upper stages
  • Prop transfer: Conducting a refueling test in low-Earth orbit to demonstrate the transfer of large amounts of propellant between Starships
  • Depots: Developing and testing cryogenic propellant depots to understand heating losses over time
  • Lunar landing: Landing a Starship successfully on the Moon, which is challenging due to the height of the vehicle and uneven terrain
  • Lunar launch: Demonstrating the capability of Starship, using liquid propellant, to launch safely from the lunar surface without infrastructure there
  • Mars transit: Demonstrating the operation of Starship over months and the capability to perform a powered landing on Mars

Each of these steps is massively challenging and at least partly a novel exercise in aerospace. There will be a lot of learning, and almost certainly some failures, as SpaceX works through these technical milestones.

Some details about the Starship propellant transfer test, a key milestone that NASA and SpaceX had hoped to complete this year but now may tackle in 2026. Credit: NASA

SpaceX prefers a test, fly, and fix approach to developing hardware. This iterative approach has served the company well, allowing it to develop rockets and spacecraft faster and for less money than its competitors. But you cannot fly and fix hardware for the milestones above without getting the upper stage of Starship flying nominally.

That’s one reason why the Starship program has been so disappointing this year.

Then there are the politics

As SpaceX has struggled with Starship in 2025, its founder, Musk, has also had a turbulent run: from the presidential campaign trail to the pinnacle of political power in the White House, and then back out of President Trump’s inner circle. Along the way, he has made political enemies, and his public favorability ratings have fallen.

Amid the falling-out between Trump and Musk this spring and summer, the president ordered a review of SpaceX’s contracts. Nothing came of it, because government officials found that most of the services SpaceX offers to NASA, the US Department of Defense, and other federal agencies are vital.

However, multiple sources have told Ars that federal officials are looking for alternatives to SpaceX and have indicated they will seek to buy launches, satellite Internet, and other services from emerging competitors if available.

Starship’s troubles also come at a critical time in space policy. As part of its budget request for fiscal year 2026, the White House sought to terminate production of NASA’s Space Launch System rocket and Orion spacecraft after the Artemis III mission. The White House has also expressed an interest in sending humans to Mars, viewing the Moon as a stepping stone to the Red Planet.

Although there are several options in play, the most viable hardware for both a lunar and Mars human exploration program is Starship. If it works. If it continues to have teething pains, though, that makes it easier for Congress to continue funding NASA’s expensive rocket and spacecraft, as it would prefer to do.

What about Artemis and the Moon?

Starship’s “lost year” also has serious implications for NASA’s Artemis Moon program. As Ars reported this week, China is now likely to land on the Moon before NASA can return. Yes, the space agency has a nominal landing date in 2027 for the Artemis III mission, but no credible space industry officials believe that date is real. (It has already slipped multiple times from an original target of 2024.) Theoretically, a landing in 2028 remains feasible, but a more rational over/under date for NASA is probably somewhere in the vicinity of 2030.

SpaceX is building the lunar lander for the Artemis III mission, a modified version of Starship. There is so much we don’t really know yet about this vehicle. For example, how many refuelings will it take to load a Starship with sufficient propellant to land on the Moon and take off? What will the vehicle’s controls look like, and will the landings be automated?

And here’s another one: How many people at SpaceX are actually working on the lunar version of Starship?

Publicly, Musk has said he doesn’t worry too much about China beating the United States back to the Moon. “I think the United States should be aiming for Mars, because we’ve already actually been to the Moon several times,” Musk said in an interview in late May. “Yeah, if China sort of equals that, I’m like, OK, sure, but that’s something that America did 56 years ago.”

Privately, Musk is highly critical of Artemis, saying NASA should focus on Mars. Certainly, that’s the long arc of history toward which SpaceX’s efforts are being bent. Although both the Moon and Mars versions of Starship require the vehicle to reach orbit and successfully refuel, there is a huge divergence in the technology and work required after that point.

It’s not at all clear that the Trump administration is seriously seeking to address this issue by providing SpaceX with carrots and sticks to move the lunar lander program forward. If Artemis is not a priority for Musk, how can it be for SpaceX?

This all creates a tremendous amount of uncertainty ahead of Sunday’s Starship launch. As Musk likes to say, “Excitement is guaranteed.”

Success would be better.


Eric Berger is the senior space editor at Ars Technica, covering everything from astronomy to private space to NASA policy, and author of two books: Liftoff, about the rise of SpaceX; and Reentry, on the development of the Falcon 9 rocket and Dragon. A certified meteorologist, Eric lives in Houston.

SpaceX has built the machine to build the machine. But what about the machine? Read More »

using-pollen-to-make-paper,-sponges,-and-more

Using pollen to make paper, sponges, and more

Softening the shell

To begin working with pollen, scientists can remove the sticky coating around the grains in a process called defatting. Stripping away these lipids and allergenic proteins is the first step in creating the empty capsules for drug delivery that Csaba seeks. Beyond that, however, pollen’s seemingly impenetrable shell—made up of the biopolymer sporopollenin—had long stumped researchers and limited its use.

A breakthrough came in 2020, when Cho and his team reported that incubating pollen in an alkaline solution of potassium hydroxide at 80° Celsius (176° Fahrenheit) could significantly alter the surface chemistry of pollen grains, allowing them to readily absorb and retain water.

The resulting pollen is as pliable as Play-Doh, says Shahrudin Ibrahim, a research fellow in Cho’s lab who helped to develop the technique. Before the treatment, pollen grains are more like marbles: hard, inert, and largely unreactive. After, the particles are so soft they stick together easily, allowing more complex structures to form. This opens up numerous applications, Ibrahim says, proudly holding up a vial of the yellow-brown slush in the lab.

When cast onto a flat mold and dried out, the microgel assembles into a paper or film, depending on the final thickness, that is strong yet flexible. It is also sensitive to external stimuli, including changes in pH and humidity. Exposure to the alkaline solution causes pollen’s constituent polymers to become more hydrophilic, or water-loving, so depending on the conditions, the gel will swell or shrink due to the absorption or expulsion of water, explains Ibrahim.

For technical applications, pollen grains are first stripped of their allergy-inducing sticky coating, in a process called defatting. Next, if treated with acid, they form hollow sporopollenin capsules that can be used to deliver drugs. If treated instead with an alkaline solution, the defatted pollen grains are transformed into a soft microgel that can be used to make thin films, paper, and sponges. Credit: Knowable Magazine

This winning combination of properties, the Singaporean researchers believe, makes pollen-based film a prospect for many future applications: smart actuators that allow devices to detect and respond to changes in their surroundings, wearable health trackers to monitor heart signals, and more. And because pollen is naturally UV-protective, there’s the possibility it could substitute for certain photonically active substrates in perovskite solar cells and other optoelectronic devices.

Using pollen to make paper, sponges, and more Read More »