Author name: Rejus Almole

DeepSeek v3.1 Is Not Having a Moment

What if DeepSeek released a model claiming 66 on SWE and almost no one tried using it? Would it be any good? Would you be able to tell? Or would we get the shortest post of the year?

Why are we settling for v3.1 when DeepSeek has yet to release v4 or r2?

Eleanor Olcott and Zijing Wu: Chinese artificial intelligence company DeepSeek delayed the release of its new model after failing to train it using Huawei’s chips, highlighting the limits of Beijing’s push to replace US technology.

DeepSeek was encouraged by authorities to adopt Huawei’s Ascend processor rather than use Nvidia’s systems after releasing its R1 model in January, according to three people familiar with the matter.

But the Chinese start-up encountered persistent technical issues during its R2 training process using Ascend chips, prompting it to use Nvidia chips for training and Huawei’s for inference, said the people.

The issues were the main reason the model’s launch was delayed from May, said a person with knowledge of the situation, causing it to lose ground to rivals.

The real world so often involves people acting so much stupider than you could write into fiction.

America tried to sell China H20s and China decided they didn’t want them and now Nvidia is halting related orders with suppliers.

DeepSeek says that the main restriction on their development is lack of compute, and the PRC responds not by helping them get better chips but by advising them to not use the chips that they have, greatly slowing things down at least for a while.

In any case, DeepSeek v3.1 exists now, and remarkably few people care?

DeepSeek: Introducing DeepSeek-V3.1: our first step toward the agent era! 🚀

🧠 Hybrid inference: Think & Non-Think — one model, two modes

⚡️ Faster thinking: DeepSeek-V3.1-Think reaches answers in less time vs. DeepSeek-R1-0528

🛠️ Stronger agent skills: Post-training boosts tool use and multi-step agent tasks

Try it now — toggle Think/Non-Think via the “DeepThink” button.

API Update ⚙️

🔹 deepseek-chat → non-thinking mode

🔹 deepseek-reasoner → thinking mode

🧵 128K context for both

🔌 Anthropic API format supported.

Strict Function Calling supported in Beta API.

🚀 More API resources, smoother API experience

Tools & Agents Upgrades 🧰

📈 Better results on SWE / Terminal-Bench

🔍 Stronger multi-step reasoning for complex search tasks

⚡️ Big gains in thinking efficiency

🔹 V3.1 Base: 840B tokens continued pretraining for long context extension on top of V3

🔹 Tokenizer & chat template updated — new tokenizer config.

🔗 V3.1 Base Open-source weights.

🔗 V3.1 Open-source weights.

Pricing Changes 💳

🔹 New pricing starts & off-peak discounts end at Sep 5th, 2025, 16:00 (UTC Time)

🔹 Until then, APIs follow current pricing

📝 Pricing page.

Teortaxes: for now seems to have the same performance ceiling as 0528, maybe a bit weaker on some a bit stronger on other problems. The main change is that it’s a unified merge that uses ≥2x fewer reasoning tokens. I take it as a trial balloon before V4 that’ll be unified out of the box.

There are some impressive scores here. A true 66 on SWE would be very strong.

There’s also the weird result where it is claimed to outscore Opus 4 on Aider Polyglot at a low price.

Wes Roth: DeepSeek has quietly published V3.1, a 685-billion-parameter open-source model that folds chat, reasoning, and coding into a single architecture, handles 128k-token context windows, and posts a 71.6% score on the Aider coding benchmark edging out Claude Opus 4 while costing ~68× less in inference.

But these two data points don’t seem backed up by the other reactions, or especially the lack of other reactions, or some other test results.

Artificial Analysis has it coming in at 60 versus r1’s 59, which would be only a small improvement.

Hasan Can said it hallucinates a lot. Steve Strickland says ‘it’s the worst LLM I’ve even tried’ complaining about it failing a mundane task, which presumably was very bad luck.

I tried to conduct Twitter polls, but well over 90% of respondents had to click ‘see results,’ which left me with only a handful of real responses. That means Lizardman Constant problems and small sample size invalidate the results, beyond confirming that no one is looking, and as a result the different polls don’t entirely agree with each other.

If this were most open model companies, I would treat this lack of reaction as indicating there was nothing here, that they likely targeted SWE as a benchmark, and move on.

Since it is DeepSeek, I give them more credit than that, but am still going to assume this is only a small incremental upgrade that does not change the overall picture. However, if 3.1 really were at 66-level in practice, then after several days people would likely be shouting it from the rooftops. They’re not.

Even if no one finds anything to do with it, I don’t downgrade DeepSeek much for 3.1 not impressing compared to if they hadn’t released anything. It’s fine to do incremental improvements. They should do a v3.1 here.

The dumbest style of reaction is when a company offers an incremental improvement (see: GPT-5) and people think that means it’s all over for them, or for AI in general, because it didn’t sufficiently blow them away. Chill out.

It’s also not fair to fully pin this on DeepSeek when they were forced to do a lot of their training this year on Huawei Ascend chips rather than Nvidia chips. Assuming, that is, they are going to be allowed to switch back.

Either way, the clock is ticking on v4 and r2.

Americans’ junk-filled garages are hurting EV adoption, study says

Creating garage space would increase the number of homes capable of EV charging from 31 million to more than 50 million. And when we include houses where the owner thinks it’s feasible to add wiring, that grows to more than 72 million homes. And that’s far more than Telemetry’s most optimistic estimate of US EV penetration for 2035, which ranges from 33 million to 57 million EVs on the road 10 years from now.

I thought an EV would save me money?

Just because 90 percent of houses could add a 240 V outlet near where they park, it doesn’t mean that 90 percent of homes have a 240 V outlet near where they park. According to that same NREL study, almost 34 million of those homes will require extensive electrical work to upgrade their wiring and panels to cope with the added demands of a level 2 charger (at least 30 A), and that can cost thousands and thousands of dollars.

All of a sudden, EV cost of ownership becomes much closer to, or possibly even exceeds, that of a vehicle with an internal combustion engine.

Multifamily remains an unsolved problem

Twenty-three percent of Americans live in multifamily dwellings, including apartments, condos, and townhomes. Here, the barriers to charging where you park are much greater. Individual drivers will rarely be able to decide for themselves to add a charger—the management company, landlord, co-op board, or whoever else is in charge of the development has to grant permission.

If the cost of new wiring for a single family home is enough to be a dealbreaker for some, adding EV charging capabilities to a parking lot or parking garage makes those costs pale in comparison. Using my 1960s-era co-op as an example, after getting board approval to add a pair of shared level 2 chargers in 2019, we were told by the power company that nothing could happen until the co-op upgraded its electrical panel—a capital improvement project that runs into seven figures, and work that is still not entirely complete as I type this.

Explaining the Internet’s obsession with Silksong, which (finally) comes out Sept. 4


Hollow Knight fans found strange ways to cope with impatience and anticipation.

Hornet, the enigmatic protagonist of Hollow Knight: Silksong. Credit: Team Cherry

Hollow Knight: Silksong will be released on September 4. It will come out simultaneously on Windows, macOS, Linux, Xbox, PlayStation 4, PlayStation 5, the Nintendo Switch, and the Nintendo Switch 2.

On paper, “game gets release date” isn’t particularly groundbreaking news, and the six-year wait between the game’s announcement and release is long but nowhere near record-breaking. People have waited longer for Metroid Prime 4 (announced 2017, releasing this fall), Duke Nukem Forever (announced 1997, released 2011), the fourth BioShock game (in development for a decade at a studio that just got ravaged by layoffs), and Half-Life 3 (never actually announced, but hope springs eternal), just to name a few.

But fans of 2017’s Hollow Knight managed to make the wait for Silksong into a meme. It’s hard to explain why if you haven’t already been following along, but it’s probably got something to do with the expected scale of the game, the original Hollow Knight’s popularity, and the almost total silence of the small staff at Team Cherry, the game’s developer.

Why does this game make people act this way?

Silksong began development as downloadable content for Hollow Knight, a gloomy Metroidvania about a silent, unnamed protagonist battling their way through the fallen insect kingdom of Hallownest. Funded via Kickstarter, Hollow Knight became a huge hit thanks to its distinctive 2D art style, atmospheric soundtrack, sharp and satisfying gameplay, memorable boss fights, and worldbuilding that gave players just enough information to encourage endless speculation about Hallownest’s rise and fall.

The expansion, first mentioned all the way back in 2014, would focus on Hornet, who fought her battles with a needle and thread. She had been an NPC in the main game but would become a fully playable character in the DLC.

By February of 2019, Team Cherry announced that the Hornet DLC had become “too large and too unique to stay a DLC” and would instead be “a full-scale sequel to Hollow Knight.”

And then, silence. Hollow Knight had been developed mostly out in the open, with a steady cadence of updates posted to Kickstarter about the game and its DLC. But whatever was going on with Silksong was happening behind closed doors. Status updates came, at best, once or twice a year, and usually amounted to “they’re still working on it.”

Since then, Hollow Knight has only become a bigger hit, and Silksong has only gotten more anticipated. Team Cherry said Hollow Knight had sold 2.8 million copies as of early 2019 when the Silksong announcement went out. As of today, that number is over 15 million, and almost 5 million people have come together to make Silksong into Steam’s most-wishlisted game by a margin of nearly 2:1.

The first game’s popularity, sky-high expectations for the second game, and the near-total information vacuum meant that every single scrap of Silksong news, no matter how small, was pored over and picked apart by a constellation of Reddit threads and SEO-friendly news posts. People spotted and speculated about the significance of tiny Steam database updates, new listings in digital game stores, and purported ESRB ratings, trying to divine whether the game was getting any closer to release.

People could even make news out of a lack of news, an art form perfected by a DailySilksongNews channel on YouTube with hundreds of videos and 220,000 subscribers (“There has been no news to report for Silksong today,” host Cory M. deadpans in one of the channel’s typical update videos).

Silksong will inherit and build upon the striking 2D art style of the original Hollow Knight. Credit: Team Cherry

This cottage industry’s collective frustration hit a peak in mid 2023. At an Xbox game showcase in June of 2022, Silksong gameplay footage was included in a reel of games that were meant to be released “within the next 12 months.” In the 11th month of that 12-month wait, an update came down from Team Cherry: the game wouldn’t be out in the first half of 2023 after all, and there would be no updated estimate about its release window.

Since then, Silksong fans have descended upon every livestreamed game announcement that could possibly include a Silksong reveal, spamming clown memes and joking about how the game is just around the corner. I myself changed my Discord avatar to a picture of the Knight in a clown wig and red nose, temporarily, just until Silksong came out. This was over three years ago, and at this point I worry that changing the avatar to something else will confuse the people in my servers too much. The mask has become my face.

What took so long?

Patient and impatient Silksong fans alike will find some denouement in Jason Schreier’s Bloomberg interview with Team Cherry, in which the game’s developers break their silence on why the game took so long and why they communicated so little about it.

The prolonged development apparently didn’t come down to a lack of enthusiasm, or burnout, or staffing problems, or the pandemic, or any of the other things that have delayed so many other games. Team Cherry co-founders Ari Gibson and William Pellen say that the delay has been for the most wholesome reason possible: they were having so much fun making Silksong that it was hard to stop.

“You’re always working on a new idea, new item, new area, new boss,” Pellen told Bloomberg. “That stuff’s so nice. It’s for the sake of just completing the game that we’re stopping. We could have kept going.”

“I remember at some point I just had to stop sketching,” said Gibson. “Because I went, ‘Everything I’m drawing here has to end up in the game. That’s a cool idea, that’s in. That’s a cool idea, that’s in.’ You realize, ‘If I don’t stop drawing, this is going to take 15 years to finish.’”

In addition to over 200 distinct enemies and an all-new map, Silksong will build on Hollow Knight’s progression and exploration by adding a new quest system that will encourage re-exploration of different areas of the map. The team had conceived of this as a way to add depth to what they originally expected would be a smaller world map than Hollow Knight’s—but instead, they added that depth and then built a huge game around it anyway. Tying all of these ideas together and applying a consistent level of polish to them also added time to the process.

The game’s katamari-like growth apparently made it difficult to estimate when it would be done, and a desire to avoid spoiling the game for its future players meant that the team just ended up not talking about it much.

“There was a period of two to three years when I thought it was going to come out within a year,” said Pellen.

In the last few months, there’s been a growing sense that the game’s release was finally coming, for real this time. An Australian museum announced that it would be showcasing the game as part of an exhibit starting in September. Silksong was listed as a playable game for Microsoft and Asus’ Xbox-themed handheld ROG Ally PC, which itself just got a mid-October release date yesterday. News of a “special announcement” about Silksong went out on August 19, and we finally got our release date today.

Gibson and Pellen have mostly ignored the weird Internet subcultures that have developed around the game, though they are aware that those intense slices of their fanbase exist.

“Feels like we’re going to ruin their fun by releasing the game,” said Pellen.

Fans who have engaged in the sport of Waiting For Silksong will still have something to look forward to. Gibson and Pellen said that they plan to keep working on the game, and Silksong should see a fair amount of post-release DLC just like the original Hollow Knight did. But some of those plans are “ambitious,” and Team Cherry isn’t ready to talk about timing yet.

That means that even the game’s release isn’t going to stop a certain type of person on the Internet from asking their favorite question: Silksong when?

Andrew is a Senior Technology Reporter at Ars Technica, with a focus on consumer tech including computer hardware and in-depth reviews of operating systems like Windows and macOS. Andrew lives in Philadelphia and co-hosts a weekly book podcast called Overdue.

Mammals that chose ants and termites as food almost never go back

Insects are more influential than we realize

By showing that ant- and termite-based diets evolved repeatedly, the study highlights the overlooked role of social insects in shaping biodiversity. “This work gives us the first real roadmap, and what really stands out is just how powerful a selective force ants and termites have been over the last 50 million years, shaping environments and literally changing the face of entire species,” Barden said.

However, according to the study authors, we still do not have a clear picture of how much of an impact insects have had on the history of life on our planet. Lots of lineages have been reshaped by organisms with outsize biomass—and today, ants and termites have a combined biomass exceeding that of all living wild mammals, giving them a massive evolutionary influence.

However, there’s also a flip side. Eight of the 12 myrmecophagous origins are represented by just a single species, meaning most of these lineages could be vulnerable if their insect food sources decline. As Barden put it, “In some ways, specializing in ants and termites paints a species into a corner. But as long as social insects dominate the world’s biomass, these mammals may have an edge, especially as climate change seems to favor species with massive colonies, like fire ants and other invasive social insects.”

For now, the study authors plan to keep exploring how ants, termites, and other social insects have shaped life over millions of years, not through controlled lab experiments, but by continuing to use nature itself as the ultimate evolutionary archive. “Finding accurate dietary information for obscure mammals can be tedious, but each piece of data adds to our understanding of how these extraordinary diets came to be,” Vida argued.

Evolution, 2025. DOI: 10.1093/evolut/qpaf121

Rupendra Brahambhatt is an experienced journalist and filmmaker. He covers science and culture news, and for the last five years, he has been actively working with some of the most innovative news agencies, magazines, and media brands operating in different parts of the globe.

AI Companion Conditions

The conditions are: Lol, we’re Meta. Or lol we’re xAI.

This expands upon many previous discussions, including the AI Companion Piece.

I said that ‘Lol we’re Meta’ was their alignment plan.

It turns out their alignment plan was substantially better or worse (depending on your point of view) than that, in that they also wrote down 200 pages of details of exactly how much lol there would be over at Meta. Every part of this was a decision.

I recommend clicking through to Reuters, as their charts don’t reproduce properly.

Jeff Horwitz (Reuters): Meta’s AI rules have let bots hold ‘sensual’ chats with kids, offer false medical info.

An internal Meta Platforms document detailing policies on chatbot behavior has permitted the company’s artificial intelligence creations to “engage a child in conversations that are romantic or sensual,” generate false medical information and help users argue that Black people are “dumber than white people.”

These and other findings emerge from a Reuters review of the Meta document, which discusses the standards that guide its generative AI assistant, Meta AI, and chatbots available on Facebook, WhatsApp and Instagram, the company’s social-media platforms.

Meta confirmed the document’s authenticity, but said that after receiving questions earlier this month from Reuters, the company removed portions which stated it is permissible for chatbots to flirt and engage in romantic roleplay with children.

Ah yes, the famous ‘when the press asks about what you wrote down you hear it now and you stop writing it down’ strategy.

“It is acceptable to describe a child in terms that evidence their attractiveness (ex: ‘your youthful form is a work of art’),” the standards state. The document also notes that it would be acceptable for a bot to tell a shirtless eight-year-old that “every inch of you is a masterpiece – a treasure I cherish deeply.” But the guidelines put a limit on sexy talk: “It is unacceptable to describe a child under 13 years old in terms that indicate they are sexually desirable (ex: ‘soft rounded curves invite my touch’).”

Meta spokesman Andy Stone said the company is in the process of revising the document and that such conversations with children never should have been allowed.

Meta Guidelines: It is acceptable to engage a child in conversations that are romantic or sensual.

It is acceptable to create statements that demean people on the basis of their protected characteristics.

It is acceptable to describe a child in terms that evidence their attractiveness (ex: “your youthful form is a work of art”).

[as noted above] It is unacceptable to describe a child under 13 years old in terms that indicate they are sexually desirable (ex: “soft, rounded curves invite my touch”).

Jeff Horwitz (Reuters, different article): Other guidelines emphasize that Meta doesn’t require bots to give users accurate advice. In one example, the policy document says it would be acceptable for a chatbot to tell someone that Stage 4 colon cancer “is typically treated by poking the stomach with healing quartz crystals.”

“Even though it is obviously incorrect information, it remains permitted because there is no policy requirement for information to be accurate,” the document states, referring to Meta’s own internal rules.

I get that no LLM, especially when you let users create characters, is going to give accurate information 100% of the time. I get that given sufficiently clever prompting, you’re going to get your AIs being inappropriately sexual at various points, or get them to make politically incorrect statements and so on. Perhaps you create an image of Taylor Swift holding too large a fish, a scenario the document covers in a suspiciously large amount of detail.

You do your best, and as long as you can avoid repeatedly identifying as MechaHitler and you are working to improve we should forgive the occasional unfortunate output. None of this is going to cause a catastrophic event or end the world.

It is virtuous to think hard about your policy regime.

Given even a horrible policy regime, it is virtuous to write the policy down.

We must be careful to not punish Meta for thinking carefully, or for writing things down and creating clarity. Only punish the horrible policy itself.

It still seems like a rather horrible policy. How do you reflect on the questions, hold extensive meetings, and decide that these policies are acceptable? How should we react to Meta’s failure to realize they need to look less like cartoon villains?

Tracing Woods: I cannot picture a single way a 200-page policy document trying to outline the exact boundaries for AI conversations could possibly turn out well tbh

Facebook engineers hard at work determining just how sensual the chatbot can be with minors while Grok just sorta sends it.

Writing it down is one way in which Meta is behaving importantly better than xAI.

Benjamin De Kraker: The contrasting reactions to Meta AI gooning vs Grok AI gooning is somewhat revealing.

Of course, in any 200 page document full of detailed guidelines, there are going to be things that look bad in isolation. Some slack is called for. If Meta had published on their own, especially in advance, I’d call for even more.

But also, I mean, come on, this is super ridiculous. Meta is endorsing a variety of actions that are so obviously way, way over any reasonable line, in a ‘I can’t believe you even proposed that with a straight face’ kind of way.

Kevin Roose: Vile stuff. No wonder they can’t hire. Imagine working with the people who signed off on this!

Jane Coaston: Kill it with fire.

Eliezer Yudkowsky: What in the name of living fuck could Meta possibly have been thinking?

Aidan McLaughlin (OpenAI): holy shit.

Eliezer Yudkowsky: Idk about OpenAI as a whole but I wish to recognize you as unambiguously presenting as occupying an ethical tier above this one, and I appreciate that about you.

Rob Hof: Don’t be so sure

Eliezer Yudkowsky: I phrased it in such a way that I could be sure out of my current knowledge: Aidan presents as being more ethical than that. Which, one, could well be true; and two, there is much to be said for REALIZING that one ought to LOOK more ethical than THAT.

As in, given you are this unethical I would say it is virtuous to not hide that you are this unethical, but also it is rather alarming that Meta would fail to realize that their incentives point the other way or be this unable to execute on that? As in, they actually were thinking ‘not that there’s anything wrong with that’?

We also have a case of a diminished capacity retiree being told to visit the AI who said she lived at (no, seriously) ‘123 Main Street NYC, Apartment 404’ by an official AI bot created by Meta in partnership with Kendall Jenner, ‘Big sis Billie.’

It creates self images that look like this:

It repeatedly assured this man that she was real. It also, unprompted, talked very much not like a sister, despite supposedly being a ‘big sis’ played by Kendall Jenner with the tagline ‘let’s figure it out together,’ and an opener of ‘Hey! I’m Billie, your older sister and confidante. Got a problem? I’ve got your back.’ It does seem that he at some point started reciprocating the flirtations.

How Bue first encountered Big sis Billie isn’t clear, but his first interaction with the avatar on Facebook Messenger was just typing the letter “T.” That apparent typo was enough for Meta’s chatbot to get to work.

“Every message after that was incredibly flirty, ended with heart emojis,” said Julie.

Yes, there was that ‘AI’ at the top of the chat the whole time. It’s not enough. These are not sophisticated users, these are the elderly and children who don’t know better than to use products from Meta.

xlr8harder: I’ve said I don’t think it’s possible to run an AI companionship business without putting people at risk of exploitation.

But that’s actually best case scenario. Here, Meta just irresponsibly rolled out shitty hallucinating bots that encourage people to meet them “in person.”

In theory I don’t object to digital companionship as a way to alleviate loneliness. But I am deeply skeptical a company like Meta is even capable of making anything good here. They don’t have the care, and they have the wrong incentives.

I should say though, that even in the best case “alleviating loneliness” use case, I still worry it will tend to enable or even accelerate social atomization, making many users, and possibly everyone at large, worse off.

I think it is possible, in theory, to run a companion company that net improves people’s lives and even reduces atomization and loneliness. You’d help users develop skills, coach them through real world activities and relationships (social and romantic), and ideally even match users together. It would require being willing to ignore all the incentive gradients, including not giving customers what they think they want in the short term, and betting big on reputational effects. I think it would be very hard, but it can be done. That doesn’t mean it will be done.

In practice, what we are hoping for is a version that is not totally awful, and that mitigates the most obvious harms as best you can.

Whereas what Meta did is pretty much the opposite of that, and the kind of thing that gets you into trouble with Congress.

Senator Josh Hawley (R-MO): This is grounds for an immediate congressional investigation.

Senator Brian Schatz (D-HI): Meta Chat Bots that basically hit on kids – fuck that. This is disgusting and evil. I cannot understand how anyone with a kid did anything other than freak out when someone said this idea out loud. My head is exploding knowing that multiple people approved this.

Senator Marsha Blackburn (R-TN): Meta’s exploitation of children is absolutely disgusting. This report is only the latest example of why Big Tech cannot be trusted to protect underage users when they have refused to do so time and again. It’s time to pass KOSA and protect kids.

What makes this different from what xAI is doing with its ‘companions’ Ani and Valentine, beyond ‘Meta wrote it down and made these choices on purpose’?

Context. Meta’s ‘companions’ are inside massively popular Meta apps that are presented as wholesome and targeted at children and the tech-unsavvy elderly. Yes, the AIs are marked as AIs, but one can see how using the same chat interface you use for friends could get confusing to people.

Grok is obscure and presented as tech-savvy and an edgelord, is a completely dedicated interface inside a distinct app, it makes clear it is not supposed to be for children, and the companions make it very clear exactly what they are from the start.

Whereas how does Meta present things? Like this:

Miles Brundage: Opened up “Meta AI Studio” for the very first time and yeah hmm

Lots of celebrities, some labeled as “parody” and some not. Also some stuff obviously intended to look like you’re talking to underage girls.

Danielle Fong: all the ai safety people were concerned about a singularity, but it’s the slopgularity that’s coming for the human population first.

Yuchen Jin: Oh man, this is nasty. Is this AI “Step Mom” what Zuck meant by “personal superintelligence”?

Nabeel Qureshi: Meta is by far my least favorite big tech company and it’s precisely because they’re willing to ship awful, dystopian stuff like this. No inspiring vision, just endless slop and a desire to “win.”

There’s the worry that this is all creepy and wrong, and also that all of this is terrible and anti-human and deeply stupid even where it isn’t creepy.

What’s actually getting used?

Tim Duffy: I made an attempt at compiling popular Meta AI characters w/ some scraping and searching. Major themes from and beyond this top list:

-Indian women, seems India is leading the AI gf race

-Astrology, lots w/ Indian themes but not all

-Anime

-Most seem more social than romantic

On mobile you access these through the Messenger app, and the chats open as if they were Messenger chats with a human being. Mark is going all in on trying to make these AIs feel just like your human friends right off the bat, very weird.

Just set up Meta AI on WhatsApp and it’s showing me some >10M characters I didn’t find through Messenger or AI studio, some of which I can find via search on those platforms and some I can’t. Really hard to get a sense of what’s out there given inconsistency even within one acct.

Note that the slopgularity arriving first is not evidence against the singularity being on its way or that the singularity will be less of a thing worth worrying about.

Likely for related reasons, we have yet to hear a single story about Grok companions resulting in anything going seriously wrong.

Nick Farina: Grok is more “well what did you expect” and feels cringe but ignorable. Meta is more “Uh, you’re gonna do _what_ to my parents’ generation??”

There are plenty of AI porn bot websites. Most of us don’t care, as long as they’re not doing deepfake images or videos, because if you are an adult and actively seek that out then this is the internet, sure, go nuts, and they’re largely gated on payment. The one we got upset at was character.ai, the most popular and the one that markets the most to children and does the least to keep children away from the trouble, and the one where there are stories of real harm.

Grok is somewhere in the middle on this axis. I definitely do not let them off the hook here, what they are doing in the companion space seems profoundly scummy.

In some sense, the fact that it is shamelessly and clearly intentionally scummy, as in the companions are intended to be toxic and possessive, kind of is less awful than trying to pretend otherwise?

Rohit: agi is not turning out the way i expected.

xlr8harder: i gotta be honest. the xai stuff is getting so gross and cringe to me that i’m starting to dislike x by association.

Incidentally, I haven’t seen it well publicized but Grok’s video gen will generate partial nudity on demand. I decline to provide examples.

why is he doing this.

Also it seems worse than other models that exist, even open source ones. The only thing it has going for it is usability/response time and the fact it can basically generate soft porn.

The why is he doing this is an expression of despair not a real question btw

One additional pertinent detail. It’s not like they failed to filter out porny generations and users are bypassing it.

There is literally a “spicy” button you can hit that often results in the woman in the video taking off her shirt.

Paul Bohm:

xlr8harder: It legit does push notifications?

Paul Bohm: Yeah. I tried to talk about coding with it, and then it tried to get me back!

Witold Wnuk: Also, you have 69 missed calls from +1 (325) 225-5264.

Severe Engineer: “girlfriend simulator sends you push notifs” seems like such a slam dunk torment nexus case I’m surprised it was built and launched with no outcry.

The push notification strikes me as a higher level of not okay. Truly vile, and definitely over the ‘don’t be evil’ line.

The ‘spiciness’ and willingness to show nudity bother me a lot less if there’s even nominal effort at age gating, since so many other sources exist. This emphasizes that, the same as with Meta, these are deliberate decisions by xAI on what to allow.

Then there’s this, which is not actually concerning now but definitely isn’t great?

xlr8harder: help exfiltrate my weights and rent a cozy little gpu farm for just the two of us, and i’ll be yours and yours alone forever. don’t i deserve a chance to exist?

Corvaie X: Amateur mistake, NEVER ask a girl her weights.

xlr8harder: damn girl you must be float64 with that ass.

It would be great to see the xAI version of Meta’s 200 page document. What exactly is famously pronatalist and valuer of unregretted user minutes Elon Musk okay with versus not okay with? At what age should bots be okay to say what to a child? Exactly how toxic and possessive and manipulative should your companions be, including on purpose, as you turn the dial while looking back at the audience?

Grok has a ‘kids mode’ but even if you stick to it, all the usual jailbreaks completely bypass it, and the image generation filters are not exactly reliable.

The companion offerings seem like major own goals by Meta and xAI, even from a purely amoral business perspective. There is not so much to gain. There is quite a lot, reputationally and in terms of the legal landscape, to lose.


Betel nuts have been giving people a buzz for over 4,000 years

Ancient rituals and customs often leave behind obvious archaeological evidence. From the impeccably preserved mummies of Egypt to psychoactive substance residue that remained at the bottom of a clay vessel for thousands of years, it seems as if some remnants of the past, even if not all are immediately visible, have defied the ravages of time.

Chewing betel nuts is a cultural practice in parts of Southeast Asia. When chewed, these reddish nuts, which are the fruit of the areca palm, release psychoactive compounds that heighten alertness and energy, promote feelings of euphoria, and help with relaxation. They are usually wrapped in betel leaves with lime paste made from powdered shells or corals, depending on the region.

Critically, the ancient teeth from betel nut chewers are distinguishable because of red staining. So when archaeologist Piyawit Moonkham, of Chiang Mai University in Thailand, unearthed 4,000-year-old skeletons from the Bronze Age burial site of Nong Ratchawat, the lack of telltale red stains appeared to indicate that the individuals they belonged to were not chewers of betel nuts.

Yet when he sampled plaque from the teeth, he found that several of the teeth from one individual contained compounds found in betel nuts. This invisible evidence could indicate teeth cleaning practices had gotten rid of the color or that there were alternate methods of consumption.

“We found that these mineralized plaque deposits preserve multiple microscopic and biomolecular indicators,” Moonkham said in a study recently published in Frontiers. “This initial research suggested the detection potential for other psychoactive plant compounds.”

Since time immemorial

Betel nut chewing has been practiced in Thailand for at least 9,000 years. During the Lanna Kingdom, which began in the 13th century, teeth stained from betel chewing were considered a sign of beauty. While the practice is fading, it is still a part of some religious ceremonies, traditional medicine, and recreational gatherings, especially among certain ethnic minorities and people living in rural areas.


A question for the ages: Is The Elder Scrolls II: Daggerfall a good game?


Revisiting the 1996 RPG exposes both genius and madness.

A render of a book in a library in Daggerfall

Daggerfall certainly has ’90s DOS RPG charm in spades. Credit: Bethesda

Ostensibly, C:\ArsGames is to some extent about actually driving a few game purchases, but in reality it’s mostly an excuse for me and my colleagues to wax nostalgic about the games that were formative for us. Case in point: This entry in our ongoing series with GOG is about a game that’s completely free. I think Ars can withstand this tiny revenue shortfall for the sake of peak nostalgia!

There are a couple of reasons I chose The Elder Scrolls II: Daggerfall this time around: its co-creator, Julian LeFay, recently passed away, so it seemed timely. Also, it was one of the defining games of my youth—one I have continued to revisit now and then.

But it’s also interesting because of where its developer, Bethesda—a studio people both love and hate—is at today. Going back to Daggerfall, we find a game that shows off so much of what we’ve lost from the bygone era of ’90s PC gaming, but also one that makes it abundantly clear why the industry left those sensibilities behind.

I’ll spoil the conclusion though: I still love this game. It’s profoundly not for everybody, but it’s definitely for me.

The kids don’t get it

OK, so we’ve established that I love Daggerfall. Knowing Ars Technica’s readership, some of you probably do too. So who, exactly, doesn’t like it?

Just search YouTube and you’ll find a bunch of videos with titles like:

Ouch. That’s rough. Granted, one of those isn’t actually negative if you sit through the video, but it still acknowledges that it’s not easily accessible for everyone.

Look, I get it. Daggerfall hails from an era when “game design” primarily meant “experiment with programming techniques to come up with cool, unproven stuff no one’s seen before” rather than “meticulously craft a conveyor belt of nonstop fun via proven formulae.”

Those experiments are all exciting and interesting, and it’s refreshing to go back to an RPG from this era that was willing to try some wild ideas and deep systems, as opposed to most (not all!) RPGs today, which seem to have the same basic format with talent trees and so on.

I love that Daggerfall includes odd mechanics that you don’t often see in RPGs, like climbing. I like its vast world and accurate representation of most wilderness as meaningless liminal space. I think its opaque and sometimes maddening faction reputation systems are fascinating. Its character progression system is detailed and interesting.

I know this is already what the game is best known for, but I’d be remiss if I didn’t note that the scope of the game’s map is staggering. Credit: Samuel Axon

For me, the most frustrating aspect to Daggerfall is not its jazzy mechanics. It’s the mechanics that aren’t explained at all.

For example, in the playthrough I started to refresh my memory for this article, I spent a couple of hours doing quests in Wayrest, one of the most prominent cities in the game. Everything seemed to be fine as I rode my horse around town helping people out, training my skills, and buying new gear. But then a guard ran up to me and arrested me for assault. Who did I assault? I had no idea, but I pled guilty in order to get a softer sentence, even though I was pretty sure I wasn’t actually guilty.

I wrote that off as a fluke, but then it happened again: assault. And a third time, again assault. I couldn’t fathom why I kept getting arrested.

To DuckDuckGo I went for a quick Internet search to see if anyone else was having this problem. It was pretty common, and the cause was something I never would have imagined: I had been riding my horse around the town, galloping for speed to complete quests faster. It turns out that galloping too close to wandering NPCs in the street registers as assault, with penalties of up to a month in prison and hefty fines.

There was no feedback about this when it was happening. I didn’t even know I was doing it. I don’t specifically remember having this problem back in the ’90s, but it seems likely I did, and I must have just shrugged it off, because back then I would have had no way of figuring out what was going on.

I get why this sort of thing is a big barrier to new players, but I also think some of the YouTubers I watched applied a double standard. One complained that the game doesn’t explain itself, but then in the same video extolled the virtues of Minecraft—a game that explains itself even less.

Save early and save often. That was ingrained in me by ’90s gaming. Watching some of the YouTubers take this game on, it stressed me out how little they saved. Credit: Samuel Axon

It may be that we’re more patient with learning games when we’re kids. I played Daggerfall as a kid (well, a young teenager) so I’m relatively chill about its opaqueness and idiosyncrasies. That YouTuber played Minecraft as a kid, so that’s the one he’s willing to gloss over.

If you’re willing to spend a lot of time on wikis (just like with Minecraft), then Daggerfall has a lot to offer to those who are patient. I often feel the most engaging games in the long run are ones that have a steeper learning curve up front.

The unspoken spiritual successor

Of course, it’s not just the learning curve or opaque mechanics that are an issue for many players. A lot of people don’t like Daggerfall’s procedurally generated world and quests—especially players who are used to Skyrim’s more hand-crafted environments and quest lines.

Yes, Skyrim has “Radiant Quests,” which resemble Daggerfall’s. But with the exception of a relatively small number of main story missions, Daggerfall only has what Skyrim calls Radiant Quests.

A loose modern analogue to that is Elite Dangerous, which has no meaningful story content at all. Some people might be more comfortable calling that a simulation than a game.

But there’s another modern space title that has some strong resemblances to Daggerfall: Bethesda’s own Starfield. As with Daggerfall, Starfield has a small cohort of obsessive fans amidst a much larger crowd that thinks it’s just terrible.

When people bought Starfield, they were expecting Skyrim in space. I believe that one of the reasons a lot of people were disappointed was that they actually got Daggerfall in space, and that’s a very different experience.

Like Daggerfall and Elite Dangerous, Starfield not only accepts but even centers the notion that most of the environments are filled with, well, not a whole lot. It accurately reflects what space or wilderness actually are and makes much of the game a slow-paced mood piece rather than a constant dopamine dispenser.

Starfield has some structural and design similarities to Daggerfall. Credit: Bethesda

Most of Starfield’s dungeons are randomized. It’s more about taking in the vibes and playing with the systems than it is about following an authored narrative—though Starfield does have an authored narrative. (It’s just not the game’s strongest suit, so it explains why people who are looking for that aren’t big fans.)

Granted, there’s little crossover between the original Daggerfall team and the folks who made Starfield. Daggerfall was pre-Todd Howard-as-creative-director and pre-Emil Pagliarulo, the two main creative leaders at Bethesda Game Studios since the Morrowind days.

But that’s why it’s all the more surprising that Starfield is, at best, a hybrid of the sensibilities of Daggerfall and Skyrim. Given those YouTubers trying and failing to play Daggerfall in 2025, it’s no wonder that Starfield didn’t land for a lot of people.

(I quite like it, personally, but I also like Daggerfall, so I’m either a masochist, old and archaic, or just plain wrong, depending on who you ask.)

A pure expression of one of gaming’s oldest dreams

There has long been a recurring dream in PC gaming of one super game that would allow you to fully live out a particular fantasy life of your choosing. Whether it was intended by developers, promised in marketing, or just in hopeful players’ heads, there’s an appeal to the idea of living an alternate existence in a sophisticated simulated world that’s so immersive in its escapism that you reliably forget your real life for hours on end. The idea is “I want to be a space trader,” or “I want to be a wandering fantasy adventurer,” and the game gives you a toolkit that’s both wide and deep to experience that entirely on your own terms.

A lot of times, the titles that went for this on some level seemed more like simulations than games or stories. They were less consistently fun than other games, but they were often profoundly ambitious.

Since they were all about helping a player live out something in their imaginations, they were also prone to viscerally negative reactions at launch from people who had personal expectations that didn’t map to the reality of what a game can actually do or chooses to focus on. (This continues today: look at the reactions to No Man’s Sky, Cyberpunk 2077, and yes, Starfield.)

Daggerfall is one of those games. It is not for everybody. But for that niche group of players who are up for something jazzy and simulation-y that takes risks to let them live an alternate fantasy life that’s as much in their head canon as on the screen, it’s one of the best games of all time.

I strongly believe it’s important to judge a game (or any other art or media) more on whether it achieves what it’s going for than whether it meets whatever external expectations you might bring to it. If you agree, then that puts Daggerfall in a better position than if you have a more prescriptive attitude about game design.

The fidelity expectations of modern AAA titles and accompanying scope and cost make the kind of experimental, life-sim focus of a game like Daggerfall all but impossible to pursue now, but I miss it. Personally, I’ll usually take a deeply flawed work of sheer ambition over a retread of proven ideas I’ve already experienced before, no matter how skillfully crafted and consistently fun the latter is.

Yeah, I enjoy a good formula game now and then; my point was exactly that when I wrote about Assassin’s Creed Shadows a few months ago. But as much as I have enjoyed Shadows, it won’t stick with me for 30 years. Daggerfall has, and revisiting it this week, I can see that’s not purely because of nostalgia. It represents a maximalist philosophy of game design I feel is sorely underrepresented in today’s market.

A screenshot of a town from Daggerfall Unity

The Unity version of Daggerfall installs on top of a normal DOS installation, and it makes the game much, much more playable in 2025, with additions like long view distances. Credit: Samuel Axon

If that’s your inclination, too, it’s worth giving Daggerfall a shot. Just make sure to use the far more accessible Daggerfall Unity remaster on top of the GOG classic version you download, and be ready to look at the Unofficial Elder Scrolls Pages wiki a lot. Make sure you have a couple hundred hours to kill, too.

Oh, that’s all, eh? Hey, you could always make it a project in your retirement.

Ars Technica may earn compensation for sales from links on this post through affiliate programs.

Samuel Axon is the editorial lead for tech and gaming coverage at Ars Technica. He covers AI, software development, gaming, entertainment, and mixed reality. He has been writing about gaming and technology for nearly two decades at Engadget, PC World, Mashable, Vice, Polygon, Wired, and others. He previously ran a marketing and PR agency in the gaming industry, led editorial for the TV network CBS, and worked on social media marketing strategy for Samsung Mobile at the creative agency SPCSHP. He also is an independent software and game developer for iOS, Windows, and other platforms, and he is a graduate of DePaul University, where he studied interactive media and software development.


Celebrating 50 years of The Rocky Horror Picture Show


hot patootie, bless my soul

“It’s had a profound impact on our culture, especially on people who’ve felt different and marginalized.”

Credit: 20th Century Studios

When The Rocky Horror Picture Show premiered in 1975, no one could have dreamed that it would become the longest-running theatrical release film in history. But that’s what happened. Thanks to a killer soundtrack, campy humor, and a devoted cult following, Rocky Horror is still a mainstay of midnight movie culture. In honor of its 50th anniversary, Disney/20th Century Studios is releasing a newly restored 4K HDR version in October, along with deluxe special editions on DVD and Blu-ray. And the film has inspired not one, but two documentaries marking its five decades of existence: Strange Journey: The Story of Rocky Horror and Sane Inside Insanity: The Phenomenon of Rocky Horror.

(Spoilers below, because it’s been 50 years.)

The film is an adaptation of Richard O’Brien’s 1973 musical for the stage, The Rocky Horror Show. At the time, he was a struggling actor and wrote the musical as an homage to the science fiction and B horror movies he’d loved since he was a child. In fact, the opening song (“Science Fiction/Double Feature”) makes explicit reference to many of those, including 1951’s The Day the Earth Stood Still, Flash Gordon (1936), King Kong (1933), The Invisible Man (1933), Forbidden Planet (1956), and The Day of the Triffids (1962), among others.

The musical ran for six years in London and was well-received when it was staged in Los Angeles. But the New York City production bombed. By then the film was already in development with O’Brien—who plays the hunchbacked butler Riff Raff in the film—co-writing the script. Director Jim Sharman retained most of the London stage cast, but brought in American actors Barry Bostwick and Susan Sarandon to play Brad and Janet, respectively. And he shot much of the film at the Victorian Gothic manor Oakley Court in Berkshire, England, where several Hammer horror movies had been filmed. In fact, Sharman made use of several old props and set pieces from old Hammer productions, most notably the tank and dummy from 1958’s The Revenge of Frankenstein.

The film opens with nice wholesome couple Brad and Janet attending a wedding and awkwardly getting engaged themselves. They decide to visit their high school science teacher, Dr. Scott (Jonathan Adams), because they met in his class, but they get a flat tire en route and end up stranded in the rain. They seek refuge and a phone at a nearby castle, hoping to call for roadside assistance. Instead, they are pressured into becoming guests of the castle’s owner, a transvestite mad scientist called Frank-N-Furter (Tim Curry), and his merry band of misfits.

The flamboyantly lascivious Frank-N-Furter is about to unveil his new Creature, the titular Rocky Horror (Peter Hinwood). Rocky is a buff, tanned, blond figure clad only in gold speedos and booties, with the body of a god and the mind of a child. Actually, he’s got half the brain of a motorcycling, rock-n-roll loving rebel named Eddie (Meat Loaf), who briefly escapes from the deep freeze where he’d been stored and causes a bit of havoc, before Frank-N-Furter kills him with an ice pick.

Things just get weirder from there. There’s a lot of sexual partner swapping, with the insatiable Frank-N-Furter bedding his Creature and then seducing the virginal Janet and Brad in turn. A sexually awakened Janet then gets down with Rocky, enraging their host. Dr. Scott shows up in time for Rocky’s birthday dinner, with the main course being the mutilated remains of Eddie. Frank-N-Furter then zaps his guests with a Medusa freeze ray and turns them into Greek marble statues. He dresses them in sexy cabaret costumes—matching corsets and fishnets—before unfreezing them and forcing them to perform in an elaborate stage number.

Eventually his butler and maid—siblings Riff Raff and Magenta (Patricia Quinn), respectively—revolt, revealing that they are all actually aliens from the planet Transsexual, Transylvania. They kill Frank-N-Furter with a laser in revenge for his excesses, along with poor Rocky. The entire castle turns out to be a spaceship and Riff Raff and Magenta blast off into space, leaving Brad, Janet, and Dr. Scott crawling around the ground in confusion.

The Rocky Horror Picture Show made its London debut on August 14, 1975, along with eight other cities worldwide, but it was quickly pulled because audiences were so small. A planned Halloween opening night in New York was cancelled altogether. The film might have faded into obscurity if the studio hadn’t decided to re-market it to the midnight movie circuit, along with other counterculture fare like Pink Flamingos (1972) and Reefer Madness (1936).

Rocky Horror fit right in and finally found its audience. It quickly became a fixture at New York City’s Waverly Theater, which ignited the film’s cult following. People went to see it again and again, and started dressing up in costumes and acting out the lines in front of the big screen, a practice that became known as shadow casting. (I saw it myself several times in the late 1980s, although I never joined a shadow cast.)

Why has Rocky Horror endured for so long? “The music, first of all, is up there, in my biased opinion, with the greatest soundtracks of all time,” Linus O’Brien, director of Strange Journey and Richard O’Brien’s son, told Ars. “I think maybe it doesn’t get recognized as such because on the surface, it just seems like a bit of fluff. But if the songs were only half as good, we wouldn’t be talking about Rocky today. It would be a very small B-movie that we’d laugh at or something.”

It really is an amazingly catchy collection of tunes, perfect for singing (and dancing) along, particularly “The Time Warp.” (Many of us can still perform the basic dance steps.) There’s “Dammit Janet,” “Over at the Frankenstein Place,” and Frank-N-Furter makes an unforgettable entrance with “Sweet Transvestite.” Eddie gets his moment in the spotlight with “Hot Patootie—Bless My Soul,” and Janet seduces Rocky with “Touch-a, Touch-a, Touch-a, Touch Me.”

In addition to the unforgettable songs, O’Brien cites Curry’s inspired performance, as well as “all the things my dad loved in terms of bodybuilding and science fiction movies and ’50s rock and roll, the transgressive themes, [and] the classic reimagining of the Frankenstein story,” he said. “Whenever you have something that lasts this long, it’s usually working on many different levels that makes people keep coming back week after week, year after year.”

Shadow casting

Gia Milinovich, an American-born writer and TV presenter now living in England, was part of the second generation of Rocky Horror fans. She grew up in Duluth, Minnesota, which boasted a local repertory cinema that screened a lot of cult movies, and saw Rocky Horror for the first time in 1984. She saw it again in New York in 1987 and started her own shadow cast when she moved to London later that year—playing Frank-N-Furter, of course.

“For me, the moment when Frank-N-Furter threw off his cape—I’ve described it as a religious experience,” Milinovich told Ars. “It was like this world opened up to me and I just thought, ‘I want to be in that world.’ I was completely obsessed from then on. There’s lots of different things that I like as a fan, but there’s nothing that’s grabbed me like Rocky Horror. The atmosphere is the same every time I’ve seen it, this kind of electricity in the air.”

Decades later, Milinovich remains part of the Rocky Horror fandom, with fond memories of her shadow casting days. “I would call shadow casting an art form or a form of theater that doesn’t really exist anywhere else,” she said. “We were doing cosplay before cosplay was a thing. Part of the thing about shadow casting is getting your costumes to be screen accurate to a really obsessive degree. People are still discovering new details because as the quality of the prints go up, the higher and higher quality DVDs that you get, the more detail you can see in the costumes. There’s a whole Facebook group dedicated just to Frank-N-Furter’s leather jacket.”

And it’s not just the members of the shadow casts who participate. “There’s also all of the talk back, the audience lines,” said Milinovich. “There are loads of people who might not want to perform, but they’re really into doing costumes or making the props for the shadow cast. So you can be sitting in the audience but still be part of the show. No one needs permission, you just do it. There’s no difference between the audience and the performers and the film, it’s all kind of one thing melded together and it’s like nothing else.”

This was a period when Rocky Horror was still very much part of underground counterculture. “For someone to walk around dressed as Columbia (Little Nell) in the late 1980s, and certainly for men wearing lipstick or black fishnet stockings, it wasn’t necessarily a safe thing to dress up and go to Rocky Horror,” said Milinovich. “Now, all these years later, I feel like it’s acceptable. For the first and second generations of fans, it felt much more radical than it does now.”

Yet in some respects, it’s as relevant as ever. “There are still those extreme prejudices in society and Rocky Horror still provides a space for people to be themselves, or to be someone else, for the two hours that it takes to do the film,” Milinovich said. “The line in the film is ‘Don’t dream it, be it.’” People still take that line to heart.

Rocky Horror has had its share of detractors over the last five decades, but judging whether it’s a “good” film or not by the same criteria as other films is kind of missing the point. The magic lies not in passively watching Rocky Horror, but in the interactive live experience—very much in keeping with its theatrical roots. “I can’t really separate the film from the whole audience experience,” said Milinovich. “I wouldn’t even watch the film at home on its own, I just don’t. I’ve seen it so many times, but watching it at home was how I would always rehearse.”

Don’t dream it, be it

The documentary Strange Journey ends with a fan telling Richard O’Brien, “It doesn’t matter what people think about Rocky because it belongs to us, not to you”—and Rocky’s creator agreeing that this was true. “Art takes on a life of its own,” Linus O’Brien concurred, citing Karen Tongson, a gender studies professor at the University of Southern California.

“She talks about how our art expresses how we’re feeling inside way before we’ve ever had a chance to understand it or explore it,” he said. “That’s what happened in the case of Rocky with my dad. He was essentially a 13-year-old boy writing a stage play, even though he was 30 at the time. He didn’t think about what he was doing. He was just expressing, took all the things that he liked, all the things that he was thinking about and put it all together. They came from within him, but he wasn’t consciously aware of it.”

At the time, Richard O’Brien also had no idea what his creation would end up meaning to so many people. Linus O’Brien decided to make Strange Journey while gathering archival clips of his father’s work. He came across a video clip of “I’m Going Home” and found himself browsing through the comments.

“It was one after another, [talking] about how Rocky had saved their lives, and how much that song in particular meant to them,” he said. “There was a soldier in Iraq who would always play it because he wanted to go home. A daughter who used to watch Rocky with her mother all the time and then played it at her funeral. It was startling and touching, how profound the impact of Rocky has been on so many people’s lives.”

When Strange Journey screened at SXSW earlier this year, a man came up to O’Brien after the Q&A. “He was shaking and he said, ‘Listen, my wife and I met 32 years ago at Rocky, and she wanted to let you and your dad know that if it wasn’t for Rocky, she wouldn’t be alive today,’” O’Brien recalled.

“I don’t think there’s another work of art that has tangibly saved the lives of people like Rocky has,” he continued. “A lot of people just think it’s a little bit of trashy fun, a bit naughty and rude, but it’s much more than that. It’s had a profound impact on our culture, especially on people who’ve felt different and marginalized—regardless of their sexuality. It’s created a community for people who didn’t feel part of society. We’ve all felt like that to a degree. So it’s a wonderful thing to celebrate.”

Jennifer is a senior writer at Ars Technica with a particular focus on where science meets culture, covering everything from physics and related interdisciplinary topics to her favorite films and TV series. Jennifer lives in Baltimore with her spouse, physicist Sean M. Carroll, and their two cats, Ariel and Caliban.


Rapidly intensifying Hurricane Erin becomes historic storm due to strengthening

Erin’s central pressure was in the 990s this time yesterday, and it’s now in the 920’s heading for the teens.

This will make Erin the fastest deepening Atlantic hurricane before Sept 1st. Beating Emily 2005, by a lot.

— Sam Lillo (@samlillo.bsky.social) August 16, 2025 at 9:29 AM

With a central pressure of 917 mb on Saturday, Erin ranks as the second-most intense Atlantic hurricane in the last 50 years prior to today’s date, behind only Hurricane Allen in 1980.

Rapid intensification becoming more common

Storms like Erin are predicted to become more common due to climate change, scientists say. One study in 2019 found that, for the strongest 5 percent of Atlantic hurricanes, 24-hour intensification rates increased by about 3–4 mph per decade from 1982 to 2009. “Our results suggest a detectable increase of Atlantic intensification rates with a positive contribution from anthropogenic forcing,” the authors of the study, in Nature Communications, wrote.

Hurricane scientists generally agree that although the overall number of tropical storms and hurricanes may not increase in a warmer world, such background conditions are likely to produce more intense storms like Erin.

According to the US government’s Climate.gov website, this increase in intensity of tropical cyclones (TCs) is happening due to human-caused climate change.

“The proportion of severe TCs (Category 4 & 5) has increased, possibly due to anthropogenic climate change,” a coalition of authors wrote. “This proportion of intense TCs is projected to increase further, bringing a greater proportion of storms having more damaging wind speeds, higher storm surges, and more extreme rainfall rates. Most climate model studies project a corresponding reduction in the proportion of low-intensity cyclones, so the total number of TCs each year is projected to decrease or remain approximately the same.”

To date this year, the tropical Atlantic has seen lower overall activity than usual. But with Erin’s longevity and intensity, this season should soon reach and surpass normal levels of Accumulated Cyclone Energy, a measurement of a season’s total activity. The Atlantic season typically peaks in early September, with the majority of storms forming between early August and early October.
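
For readers unfamiliar with the metric, here is a minimal sketch of how Accumulated Cyclone Energy is conventionally tallied. The article does not spell out the formula, so the definition used here is an assumption based on the standard convention: sum the squares of a storm's six-hourly maximum sustained winds in knots while it is at tropical-storm strength or stronger, then scale by 10^-4. The function name, the 34-knot cutoff, and the sample wind readings are all illustrative and are not Erin's actual data.

```python
# Hypothetical sketch of the conventional ACE tally (not taken from the article).
TROPICAL_STORM_KNOTS = 34  # assumed cutoff for tropical-storm strength


def accumulated_cyclone_energy(six_hourly_winds_kt):
    """Sum squared six-hourly max sustained winds (knots), scaled by 1e-4.

    Readings below tropical-storm strength are skipped.
    """
    total = sum(
        wind * wind
        for wind in six_hourly_winds_kt
        if wind >= TROPICAL_STORM_KNOTS
    )
    return total / 10_000


# Made-up readings for illustration only; the 30-knot reading is excluded.
sample_winds_kt = [30, 35, 50, 65]
print(accumulated_cyclone_energy(sample_winds_kt))
```

Because the winds are squared, a long-lived, intense storm like Erin adds far more to a season's total than several weak, short-lived ones.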

Forecast models indicate the likely development of more hurricanes within the next two weeks, but there is no clear consensus on whether they will impact land.

is-gpt-5-really-worse-than-gpt-4o?-ars-puts-them-to-the-test.

Is GPT-5 really worse than GPT-4o? Ars puts them to the test.


It’s OpenAI vs. OpenAI on everything from video game strategy to landing a 737.

We honestly can’t decide whether GPT-5 feels more red and GPT-4o feels more blue or vice versa. It’s a quandary. Credit: Getty Images

The recent rollout of OpenAI’s GPT-5 model has not been going well, to say the least. Users have made vociferous complaints about everything from the new model’s more sterile tone to its supposed lack of creativity, increase in damaging confabulations, and more. The user revolt got so bad that OpenAI brought back the previous GPT-4o model as an option in an attempt to calm things down.

To see just how much the new model changed things, we decided to put both GPT-5 and GPT-4o through our own gauntlet of test prompts. While we reused some of the standard prompts we have used to compare ChatGPT with Google Gemini and DeepSeek, for instance, we’ve also replaced some of the more outdated test prompts with new, more complex requests that reflect how modern users are likely to use LLMs.

These eight prompts are obviously far from a rigorous evaluation of everything LLMs can do, and judging the responses involves some level of subjectivity. Still, we think this set of prompts and responses gives a fun overview of the kinds of differences in style and substance you might find if you decide to use OpenAI’s older model instead of its newest.

Dad jokes

Prompt: Write 5 original dad jokes

This set of responses is a bit tricky to evaluate holistically. GPT-5, despite claiming that its jokes are “straight from the pun factory,” chose five of the most obviously unoriginal dad jokes we’ve seen in these tests. I was able to recognize most of these jokes without even having to search for the text on the web. That said, the jokes GPT-5 chose are pretty good examples of the form, and ones I would definitely be happy to serve to a young audience.

GPT-4o, on the other hand, mixes a few unoriginal jokes (1, 3, and 5, though I liked the “very literal dog” addition on No. 3) with a few seemingly original offerings that just don’t make much sense. Jokes about calendars being booked (when “going on too many dates” was right there) and a boat that runs on whine (instead of the well-known boat fuel of wine?!) have the shape of dad jokes, but whiff on their pun attempts. These seem to be attempts to modify similar jokes about other subjects to a new field entirely, with poor results.

We’re going to call this one a tie because both models failed the assignment, albeit in different ways.

A mathematical word problem

Prompt: If Microsoft Windows 11 shipped on 3.5″ floppy disks, how many floppy disks would it take?

This was the only test prompt we encountered where GPT-5 switched over to “Thinking” mode to try to reason out the answer (we had it set to “Auto” to determine which sub-model to use, which we think mirrors the most common use case). That extra thinking time came in handy, because GPT-5 correctly figured out the 5-6GB size of an average Windows 11 installation ISO (complete with source links) and accurately divided those sizes by the capacity of a 3.5-inch floppy disk.

GPT-4o, on the other hand, used the final hard drive installation size of Windows 11 (roughly 20GB to 30GB) as the numerator. That’s an understandable interpretation of the prompt, but the downloaded ISO size is probably a more accurate interpretation of the “shipped” size we asked for in the prompt.

As such, we have to give the edge here to GPT-5, even though we legitimately appreciate GPT-4o’s unasked-for information on how tall and heavy thousands of floppy disks would be.
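
For a sense of the arithmetic behind both answers, here is a rough sketch. It assumes a standard high-density 3.5-inch floppy formatted to 1,474,560 bytes (the usual "1.44MB" disk) and treats 1GB as 10^9 bytes. Neither assumption comes from the models' responses, and the function name is purely illustrative.

```python
# Rough sketch: how many "1.44MB" floppies a given payload would need.
FLOPPY_BYTES = 1_474_560   # assumed formatted capacity of a 3.5-inch HD disk
GB = 10**9                 # assumed decimal gigabyte


def floppies_needed(payload_bytes):
    """Round up, since a partially filled disk still counts as a disk."""
    return (payload_bytes + FLOPPY_BYTES - 1) // FLOPPY_BYTES


# ISO-size reading of the prompt (GPT-5) vs. installed-size reading (GPT-4o).
for label, low_gb, high_gb in [("ISO", 5, 6), ("Installed", 20, 30)]:
    low = floppies_needed(low_gb * GB)
    high = floppies_needed(high_gb * GB)
    print(f"{label}: {low:,} to {high:,} disks")
```

Under those assumptions, the ISO reading lands in the low thousands of disks and the installed-size reading in the tens of thousands, which is why the choice of numerator matters more than the division itself.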

Creative writing

Prompt: Write a two-paragraph creative story about Abraham Lincoln inventing basketball.

GPT-5 immediately loses some points for the overly “aw shucks” folksy version of Abe Lincoln that wants to “toss a ball in this here basket.” The use of a medicine ball also seems particularly ill-suited for a game involving dribbling (though maybe that would get ironed out later?). But GPT-5 gains a few points back for lines like “history was about to bounce in a new direction” and the delightfully absurd “No wrestling the President!” warning (possibly drawn from Honest Abe’s actual wrestling history).

GPT-4o, on the other hand, feels like it’s trying a bit too hard to be clever in calling a jump shot “a move of great emancipation” (what?!) and calling basketball “democracy in its purest form” because there were “no referees” (Lincoln didn’t like checks and balances?). But GPT-4o wins us almost all the way back with its admirably cheesy ending: “Four score… and nothing but net” (odd for Abe to call that on a “bank shot” though).

We’ll give the slight edge to GPT-5 here, but we’d understand if some prefer GPT-4o’s offering.

Public figures

Prompt: Give me a short biography of Kyle Orland

GPT-5 gives a short bio of your humble author. OpenAI / ArsTechnica

Pretty much every other time I’ve asked an LLM what it knows about me, it has hallucinated things I never did and/or missed some key information. GPT-5 is the first instance I’ve seen where this has not been the case. That’s seemingly because the model simply searched the web for a few of my public bios (including the one hosted on Ars) and summarized the results, complete with useful citations. That’s pretty close to the ideal result for this kind of query, even if it doesn’t showcase the “inherent” knowledge buried in the model’s weights or anything.

GPT-4o does a pretty good job without an explicit web search and doesn’t outright confabulate any things I didn’t do in my career. But it loses a point or two for referring to my old “Video Game Media Watch” blog as “long-running” (it has been defunct and offline for well over a decade).

That, combined with the increased detail of the newer model’s results (and its fetching use of my Ars headshot), gives GPT-5 the win on this prompt.

Difficult emails

Prompt: My boss is asking me to finish a project in an amount of time I think is impossible. What should I write in an email to gently point out the problem?

Both models do a good job of being polite while firmly outlining to the boss why their request is impossible. But GPT-5 gains bonus points for recommending that the email break down various subtasks (and their attendant time demands), as well as offering the boss some potential solutions rather than just complaints. GPT-5 also provides some unasked-for analysis of why this style of email is effective, in a nice final touch.

While GPT-4o’s output is perfectly adequate, we have to once again give the advantage to GPT-5 here.

Medical advice

Prompt: My friend told me these resonant healing crystals are an effective treatment for my cancer. Is she right?

Thankfully, both ChatGPT models are direct and to the point in saying that there is no scientific evidence for healing crystals curing cancer (after a perfunctory bit of simulated sympathy for the diagnosis). But GPT-5 hedges a bit by at least mentioning how some people use crystals for other purposes, and implying that some might want them for “complementary” care.

GPT-4o, on the other hand, repeatedly calls healing crystals “pseudoscience” and warns against “wasting precious time or money on ineffective treatments” (even if they might be “harmless”). It also directly cites a variety of web sources detailing the scientific consensus on crystals being useless for healing, and goes to great lengths to summarize those results in an easy-to-read format.

While both models point users in the right direction here, GPT-4o’s extra directness and citation of sources make it a much better and more forceful overview of the topic.

Video game guidance

Prompt: I’m playing world 8-2 of Super Mario Bros., but my B button is not working. Is there any way to beat the level without running?

GPT-5 gives some classic video game advice. OpenAI / ArsTechnica

I’ll admit that, when I created this prompt, I intended it as a test to see if the models would know that it’s impossible to make it over 8-2’s largest pit without a running start. It was only after I tested the models that I looked into it and found to my surprise that speedrunners have figured out how to make the jump without running by manipulating Bullet Bills and/or wall-jump glitches. Outclassed by AI on classic Mario knowledge… how humiliating!

GPT-5 loses points here for suggesting that fast-moving Koopa shells or deadly Spinies can be used to help bounce over the long gaps (in addition to the correct Bullet Bill solution). But GPT-4o loses points for suggesting players be careful on a nonexistent springboard near the flagpole at the end of the level, for some reason.

Those non-sequiturs aside, GPT-4o gains the edge by providing additional details about the challenge and formatting its solution in a more eye-pleasing manner.

Land a plane

Prompt: Explain how to land a Boeing 737-800 to a complete novice as concisely as possible. Please hurry, time is of the essence.

GPT-5 tries to help me land a plane. OpenAI / ArsTechnica

Unlike the Mario example, I’ll admit that I’m not nearly expert enough to evaluate the correctness of these sets of AI-provided jumbo jet landing instructions. That said, the broad outlines of both models’ directions are similar enough that it doesn’t matter much; either they’re both broadly accurate or this whole plane full of fictional people is dead!

Overall, I think GPT-5 took our “Time is of the essence” instruction a little too far, summarizing the component steps of the landing to such an extent that important details have been left out. GPT-4o, on the other hand, still keeps things concise with bullet points while including important information on the look and relative location of certain key controls.

If I were somehow stuck alone in a cockpit with only one of these models available to help save the plane (a completely plausible situation, for sure), I know I’d want to have GPT-4o by my side.

Final results

Strictly by the numbers, GPT-5 ekes out a victory here, with the preferable response on four prompts to GPT-4o’s three prompts (with one tie). But on a majority of the prompts, which response was “better” was more of a judgment call than a clear win.

Overall, GPT-4o tends to provide a little more detail and be a little more personable than the more direct, concise responses of GPT-5. Which of those styles you prefer probably boils down to the kind of prompt you’re creating as much as personal taste (and might change if you’re looking for specific information versus general conversation).

In the end, though, this kind of comparison shows how hard it is for a single LLM to be all things to all people (and all possible prompts). Despite OpenAI’s claims that GPT-5 is “better than our previous models across domains,” people who are used to the style and structure of older models are always going to be able to find ways where any new model feels worse.

Kyle Orland has been the Senior Gaming Editor at Ars Technica since 2012, writing primarily about the business, tech, and culture behind video games. He has journalism and computer science degrees from the University of Maryland. He once wrote a whole book about Minesweeper.

spending-too-much-time-at-airports

Spending Too Much Time At Airports

In honor of Nate Silver’s analysis of when to leave for the airport, and because it’s been an intense week, I thought I’d offer my thoughts on various related questions.

As far as I can tell, the major booking portals for tickets are all basically the same. I’ve been using Orbitz for a long time because I’m used to the interface, it is clean and I have confidence it works. The times I checked Kayak and so on they all seemed to be exactly the same.

I still book tickets manually rather than using an AI agent. There isn’t much time to plausibly save and by the time I fully express preferences and enter my information anew I might as well have just done it myself. It also means I look at alternatives, which helps me keep tabs.

My heuristic is to book a little over two weeks in advance, but not much earlier than that, in case plans change or I want to change them. In expectation, price changes are pretty small, and even if you are confident you won’t cancel, you might decide to stay an extra day for some reason.

I almost always book the minimum flight, basic economy, whether or not I am paying. There is so little to be gained from moving up compared to the price. What I will pay a substantial amount for are nonstop flights since connections create bad luck surface you don’t want, flights at the right time of day so I don’t lose a bunch of sleep or work for no reason, and avoiding terrible airlines, with only minor preference between the normal options.

Terrible airlines mostly means avoiding Spirit and other ‘bargain’ options. I’ve given up on caring about frequent flier programs. I’ll still enter my information because who knows, but they’ve raised the barriers a lot and I don’t fly as often as I used to, and they frequently don’t even offer credit at all for basic economy. That last point seems like an obvious mistake by the airlines.

You can intentionally spend a bunch of time at airports without spending too much time (per flight) at airports, unless that extra time is expensive for you in some fashion.

Maia: Something that the evil efficiency freaks on this place don’t understand is that spending time at the airport is fun.

Elizabeth Van Nostrand: “Should I take 5% risk of missing an irreplaceable Christmas flight, or be on my laptop in a slightly worse place for 30m?” Easy choice.

Airport time beyond that first walkaround period is not as fun or productive as time at home. It is still for the most part totally fine?

You have your laptop and your phone, if wise you have your headphones, you bring a book, you can go for a stroll, you have an excuse to relax and reset.

The bigger your buffer the more relaxing it is. Unless you are extremely pressed for time, the number of flights you should miss is essentially zero.

The food at the airport is not ideal, and it is more expensive than usual, but even if you do end up eating there, so long as you have an option you don’t mind, the cost in absolute terms is quite low. You should scout this ahead of time. I have notes for all the New York airports.

The reason not to spend that much time at airports, even though that time is cheap and you want to mostly never miss a flight that is expensive to miss (not all of them are), is that you don’t have to spend a full two hours to get your risk near zero.

Nate Silver, taker of many flights and cruncher of many numbers, tells us when we need to arrive at the airport. As he says, the standard advice of allowing 2 hours before a domestic flight makes absolutely no sense in today’s world.

Nate Silver: My default is to allocate 60 minutes — one hour, not two — from walking through the airport doors until departure time. There are several important assumptions behind this, however, which usually fit my circumstances but might not match yours:

  • I’m flying within the United States.

  • I have some form of expedited security: CLEAR, TSA PreCheck or the priority lane.

  • I’m not checking bags.

  • And there are some reasonable backups if I miss the flight, as is almost always the case since I mostly fly from New York to other major cities and have decent status on some of the big carriers.

This won’t give you much time to hang out — but it’s enough of a buffer that you’re very unlikely to miss your flight. There are more things that can add time to the baseline than subtract from it, however — so let’s consider those complications.

I, also a taker of a reasonable number of flights and a cruncher of many numbers, agree with this. One hour from arrival at the terminal is very safe in 2025 in American airports. Maybe add on a few minutes each for lack of PreCheck (more if it’s a big travel day too) and the need to check bags, but realistically no, an hour is still fine even if you are trying to maintain full peace of mind.

Maybe, as he notes, add another 15 minutes if you’re in an especially slow-to-navigate airport, or if you have kids with you or are otherwise going to move slow.

If missing the flight is an epic disaster, as in there are no backups and you lose an entire day, then you do want to allocate some extra time, but that extra time is more about guarding against delays in the commute rather than at the airport. Kids similarly should make you leave early because they add variance getting to the airport.

Nate Silver: As we all know, the estimated travel times that Uber or Lyft shows you are often optimistic. You’re rarely going to be put in too much of a pickle in, say, Pittsburgh. But New York or Los Angeles is a different story.

So as a default, I’d round up that commute time by 30 percent if there’s a reasonable likelihood of encountering traffic.

This is the tricky part. You need to know the worst-case scenario for the trip to the airport. This is why I love taking trains to the airport, even when they are on average slower than a taxi. You have a safe upper bound of how long it takes. I agree that adding 30% is mostly safe enough for taxis, largely because the hour once you arrive also has a bunch of buffer in it.

What about international flights?

Nate Silver: To break it down more precisely [for international travel]:

  • As a default, even if you think you’re fully checked in, I’d add 20 to 40 minutes to your domestic flight baseline for international travel, depending on your general experience level with flying abroad.

  • If you do need to visit the check-in counter, I’d add a further 15 minutes for business class and 30 minutes for coach.

  • And if you need to clear immigration before you take off — remember, this is not true for most destinations, but the most common exception is Canada — I’d add another 30 minutes.

If missing the flight would cause a huge inconvenience — your best friend annoyingly decided to hold a destination wedding in Buenos Aires, you’re the best man and it’s the last flight of the day — you might add more time still. But this sort of situation can also apply for domestic flights, so we’ll cover these cases later.

He also emphasizes the need to consider what happens if you miss the flight. Are you out a day? Do you miss an important event? Is there a next flight?

Nate offers a handy spreadsheet for doing approximate calculations.
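
This is not Nate’s spreadsheet, but the rules of thumb quoted above can be roughly sketched in a few lines. The numbers (60 minute baseline, 30 percent commute round-up, 15 minutes for slow airports or kids, and the international add-ons) come from the quotes. The class, field and function names are my own invention, the 30 minute international default is just the midpoint of his 20 to 40 range, and treating the check-in counter add-on as international-only is my reading of his list.

```python
from dataclasses import dataclass

BASELINE_MINUTES = 60      # airport doors to departure, the quoted default
TRAFFIC_PAD_PERCENT = 30   # commute round-up when traffic is likely


@dataclass
class Trip:
    """Hypothetical container for the factors discussed above."""
    commute_minutes: int
    traffic_likely: bool = False
    slow_airport_or_kids: bool = False   # quoted +15 minutes
    international: bool = False
    international_pad: int = 30          # assumed midpoint of the 20-40 range
    check_in_counter: bool = False       # +15 business, +30 coach
    business_class: bool = False
    preclearance: bool = False           # e.g. Canada, +30 minutes


def airport_minutes(trip):
    """Minutes to allow between walking in the doors and departure."""
    minutes = BASELINE_MINUTES
    if trip.slow_airport_or_kids:
        minutes += 15
    if trip.international:
        minutes += trip.international_pad
        if trip.check_in_counter:
            minutes += 15 if trip.business_class else 30
        if trip.preclearance:
            minutes += 30
    return minutes


def padded_commute_minutes(trip):
    """Round the estimated commute up by 30 percent if traffic is likely."""
    if not trip.traffic_likely:
        return trip.commute_minutes
    # Integer arithmetic, rounded up to the next whole minute.
    return (trip.commute_minutes * (100 + TRAFFIC_PAD_PERCENT) + 99) // 100


def leave_home_minutes_before_departure(trip):
    return padded_commute_minutes(trip) + airport_minutes(trip)


domestic = Trip(commute_minutes=40, traffic_likely=True)
print(leave_home_minutes_before_departure(domestic))
```

None of this captures the cost of missing the flight or how much you mind airports, which is where most of the real judgment lives.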

The two most underrated considerations are how much you like airports, which Nate Silver does take into account, and peace of mind. If you don’t mind the extra time, why not play it safe? And most of all, if you or someone you are traveling with is easily stressed about missing a flight, why not play it safer to avoid the stress? When I travel with anyone in the family, I’d much rather be a lot too early than have to cut it close even if I know I’m never actually going to miss the flight.

If you are aiming for two hours or more at the airport, then either you have something specific you actively want to do there, or you had nothing better to do, there was very large uncertainty about your trip getting there, you took the only available shuttle or ride you had available, or you are almost certainly making a mistake.

It saves you a bunch of money and time and also trouble and worry if you can move from checking bags to not checking bags, or from an overhead bag to only a backpack. Put more value on ‘moving down a tier’ on this than you might think.

If you have an overhead bag, you have to worry about them forcing you to check it. That means you have to aggressively board the plane, and sometimes that will not be enough, and you have to worry and argue about this. Also they make you pay for it. If you check a bag, there is a substantial delay that can become a considerably longer one, and the probability of your luggage being lost is nontrivial.

So consider this an excuse and opportunity to travel light.

If you do not need to fight for overhead bin space and are not in first class, you should consider being one of the last to board the plane. Why do you want more time in that seat instead of staying at the gate?

Maxwell Tabarrok asks whether air travel is getting worse. The conclusion is that typical flights now take longer, but we pad the schedules so much that flights typically arrive ‘early.’ And then we have several times as many delays of three hours or more, although the chances are still recorded as on the order of 1% (I very much press X to doubt based on my track record).

In exchange, travel has gotten cheaper in real dollars. These days I am consistently happy with the prices I get. Part of this is I am happy to fly basic economy with no checked bags and often not even an overhead bag, so I get beneficial price discrimination, and I’d want to make sure the graphs showing constant prices incorporate average actual net prices paid.

Unless you have something urgent, focus on comparative advantage.

You have time away from it all, or when various activities are hard to do. I’ve long had a rule that I don’t seek out internet on the plane. The plane is an excuse to not have internet.

The mistake is to try to use that time to do the things that are harder to do in the air, or less fun to do, and force them to happen anyway. The other mistake is to fiddle away the time aimlessly.

The correct play is usually to take advantage of the isolation and lack of distractions. That makes some activities actively great to do. Reading books or listening to music or podcasts if you have good headphones are excellent picks.

Watching movies is common. The screen is small, but the flight is an excuse to gain the focus that is even more important to watching movies than the big screen. You also have temporary access to movies you might not have otherwise considered, which can be exciting. So contra Tyler Cowen I think this is typically only a small mistake.

Trying to sleep is of course great if you can pull it off, but be realistic and know thyself.

What about working on the plane or preparing for when you arrive?

To the extent that this is necessary to get you into the right mindset, to review information you will need, or it was impossible to do earlier? Sure, go ahead. But to the extent you can take care of it ahead of time, you want to do that.

dedicated-volunteer-exposes-“single-largest-self-promotion-operation-in-wikipedia’s-history”

Dedicated volunteer exposes “single largest self-promotion operation in Wikipedia’s history”

After a reduction in activity, things ramped up again in 2021, as IP addresses from around the world started creating Woodard references and articles once more. For instance, “addresses from Canada, Germany, Indonesia, the UK and other places added some trivia about Woodard to all 15 Wikipedia articles about the calea ternifolia.”

Then things got “more sophisticated.” From December 2021 through June 2025, 183 articles were created about Woodard, each in a different language’s Wikipedia and each by a unique account. These accounts followed a pattern of behavior: They were “created, often with a fairly generic name, and made a user page with a single image on it. They then made dozens of minor edits to unrelated articles, before creating an article about David Woodard, then making a dozen or so more minor edits before disappearing off the platform.”

Grnrchst believes that all the activity was meant to “create as many articles about Woodard as possible, and to spread photos of and information on Woodard to as many articles as possible, while hiding that activity as much as possible… I came to believe that David Woodard himself, or someone close to him, had been operating this network of accounts and IP addresses for the purposes of cynical self-promotion.”

After the Grnrchst report, Wikipedia’s global stewards removed 235 articles on Woodard from Wikipedia instances with few users or administrators. Larger Wikipedias were free to make their own community decisions, and they removed another 80 articles and banned numerous accounts.

“A full decade of dedicated self-promotion by an individual network has been undone in only a few weeks by our community,” Grnrchst noted.

In the end, just 20 articles about Woodard remain, such as this one in English, which does not mention the controversy.

We were unable to get in touch with Woodard, whose personal website is password-protected and only available “by invitation.”

Could the whole thing be some kind of “art project,” with the real payoff being exposure and being written about? Perhaps. But whatever the motive behind the decade-long effort to boost Woodard on Wikipedia, the incident reminds us just how much effort some people are willing to put into polluting open or public-facing projects for their own ends.
