
Why Anthropic’s Claude still hasn’t beaten Pokémon


Weeks later, Claude 3.7 Sonnet’s “reasoning” model is still struggling with a game designed for children.


Gotta subsume ’em all into the machine consciousness! Credit: Aurich Lawson


In recent months, the AI industry’s biggest boosters have started converging on a public expectation that we’re on the verge of “artificial general intelligence” (AGI)—virtual agents that can match or surpass “human-level” understanding and performance on most cognitive tasks.

OpenAI is quietly seeding expectations for a “PhD-level” AI agent that could operate autonomously at the level of a “high-income knowledge worker” in the near future. Elon Musk says that “we’ll have AI smarter than any one human probably” by the end of 2025. Anthropic CEO Dario Amodei thinks it might take a bit longer but similarly says it’s plausible that AI will be “better than humans at almost everything” by the end of 2027.

A few researchers at Anthropic have, over the past year, had a part-time obsession with a peculiar problem.

Can Claude play Pokémon?

A thread: pic.twitter.com/K8SkNXCxYJ

— Anthropic (@AnthropicAI) February 25, 2025

Last month, Anthropic presented its “Claude Plays Pokémon” experiment as a waypoint on the road to that predicted AGI future. It’s a project the company said shows “glimmers of AI systems that tackle challenges with increasing competence, not just through training but with generalized reasoning.” Anthropic made headlines by trumpeting how Claude 3.7 Sonnet’s “improved reasoning capabilities” let the company’s latest model make progress in the popular old-school Game Boy RPG in ways “that older models had little hope of achieving.”

While Claude models from just a year ago struggled even to leave the game’s opening area, Claude 3.7 Sonnet was able to make progress by collecting multiple in-game Gym Badges in a relatively small number of in-game actions. That breakthrough, Anthropic wrote, was because the “extended thinking” by Claude 3.7 Sonnet means the new model “plans ahead, remembers its objectives, and adapts when initial strategies fail” in a way that its predecessors didn’t. Those things, Anthropic brags, are “critical skills for battling pixelated gym leaders. And, we posit, in solving real-world problems too.”


Over the last year, new Claude models have shown quick progress in reaching new Pokémon milestones. Credit: Anthropic

But relative success over previous models is not the same as absolute success over the game in its entirety. In the weeks since Claude Plays Pokémon was first made public, thousands of Twitch viewers have watched Claude struggle to make consistent progress in the game. Despite long “thinking” pauses between each move—during which viewers can read printouts of the system’s simulated reasoning process—Claude frequently finds itself pointlessly revisiting completed towns, getting stuck in blind corners of the map for extended periods, or fruitlessly talking to the same unhelpful NPC over and over, to cite just a few examples of distinctly sub-human in-game performance.

Watching Claude continue to struggle at a game designed for children, it’s hard to imagine we’re witnessing the genesis of some sort of computer superintelligence. But even Claude’s current sub-human level of Pokémon performance could hold significant lessons for the quest toward generalized, human-level artificial intelligence.

Smart in different ways

In some sense, it’s impressive that Claude can play Pokémon with any facility at all. When developing AI systems that find dominant strategies in games like Go and Dota 2, engineers generally start their algorithms off with deep knowledge of a game’s rules and/or basic strategies, as well as a reward function to guide them toward better performance. For Claude Plays Pokémon, though, project developer and Anthropic employee David Hershey says he started with an unmodified, generalized Claude model that wasn’t specifically trained or tuned to play Pokémon games in any way.

“This is purely the various other things that [Claude] understands about the world being used to point at video games,” Hershey told Ars. “So it has a sense of a Pokémon. If you go to claude.ai and ask about Pokémon, it knows what Pokémon is based on what it’s read… If you ask, it’ll tell you there’s eight gym badges, it’ll tell you the first one is Brock… it knows the broad structure.”


A flowchart summarizing the pieces that help Claude interact with an active game of Pokémon (click through to zoom in). Credit: Anthropic / Excalidraw

In addition to directly monitoring certain key (emulated) Game Boy RAM addresses for game state information, Claude views and interprets the game’s visual output much like a human would. But despite recent advances in AI image processing, Hershey said Claude still struggles to interpret the low-resolution, pixelated world of a Game Boy screenshot as well as a human can. “Claude’s still not particularly good at understanding what’s on the screen at all,” he said. “You will see it attempt to walk into walls all the time.”
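The article mentions that the harness reads key emulated RAM addresses for game state. As an illustrative sketch only (not Anthropic's actual harness), here is what that looks like in Python; the addresses come from the community Pokémon Red disassembly and should be treated as assumptions:

```python
# Illustrative sketch: extracting game state from emulated Game Boy RAM.
# Addresses are from the community pokered disassembly (assumed, not from
# the article); `memory` is anything indexable by address, such as an
# emulator's memory view or a plain dict in tests.

PLAYER_Y = 0xD361     # wYCoord in the pokered disassembly
PLAYER_X = 0xD362     # wXCoord
CURRENT_MAP = 0xD35E  # wCurMap

def read_game_state(memory):
    """Return a minimal game-state snapshot from a RAM-like mapping."""
    return {
        "map_id": memory[CURRENT_MAP],
        "x": memory[PLAYER_X],
        "y": memory[PLAYER_Y],
    }

# Example with a fake RAM image standing in for the emulator:
fake_ram = {CURRENT_MAP: 1, PLAYER_X: 5, PLAYER_Y: 9}
state = read_game_state(fake_ram)
```

Reading state this way sidesteps the screenshot-interpretation problem the article describes, which is presumably why a harness would combine both sources.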

Hershey said he suspects Claude’s training data probably doesn’t contain many overly detailed text descriptions of “stuff that looks like a Game Boy screen.” This means that, somewhat surprisingly, if Claude were playing a game with “more realistic imagery, I think Claude would actually be able to see a lot better,” Hershey said.

“It’s one of those funny things about humans that we can squint at these eight-by-eight pixel blobs of people and say, ‘That’s a girl with blue hair,’” Hershey continued. “People, I think, have that ability to map from our real world to understand and sort of grok that… so I’m honestly kind of surprised that Claude’s as good as it is at being able to see there’s a person on the screen.”

Even with a perfect understanding of what it’s seeing on-screen, though, Hershey said Claude would still struggle with 2D navigation challenges that would be trivial for a human. “It’s pretty easy for me to understand that [an in-game] building is a building and that I can’t walk through a building,” Hershey said. “And that’s [something] that’s pretty challenging for Claude to understand… It’s funny because it’s just kind of smart in different ways, you know?”


A sample Pokémon screen with an overlay showing how Claude characterizes the game’s grid-based map. Credit: Anthropic / X

Where Claude tends to perform better, Hershey said, is in the more text-based portions of the game. During an in-game battle, Claude will readily notice when the game tells it that an attack from an electric-type Pokémon is “not very effective” against a rock-type opponent, for instance. Claude will then squirrel that factoid away in a massive written knowledge base for future reference later in the run. Claude can also integrate multiple pieces of similar knowledge into pretty elegant battle strategies, even extending those strategies into long-term plans for catching and managing teams of multiple creatures for future battles.
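A minimal sketch of the "written knowledge base" idea described above: battle observations are stored as text notes and retrieved later in the run. The class name and structure here are illustrative assumptions, not Anthropic's implementation:

```python
# Hedged sketch of a text knowledge base for in-game factoids. Claude's
# actual notes are free-form text; a topic-tagged list is just one simple
# way to model the record-and-recall behavior the article describes.

class KnowledgeBase:
    def __init__(self):
        self.notes = []

    def record(self, topic, fact):
        """Squirrel away an observation for future reference."""
        self.notes.append((topic, fact))

    def lookup(self, topic):
        """Retrieve every stored fact filed under a topic."""
        return [fact for t, fact in self.notes if t == topic]

kb = KnowledgeBase()
kb.record("type_matchups", "Electric attacks are not very effective vs. Rock")
kb.record("type_matchups", "Water attacks are super effective vs. Rock")

# Later in the run, before choosing a move against Brock's rock-types:
relevant = kb.lookup("type_matchups")
```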

Claude can even show surprising “intelligence” when Pokémon’s in-game text is intentionally misleading or incomplete. “It’s pretty funny that they tell you you need to go find Professor Oak next door and then he’s not there,” Hershey said of an early-game task. “As a 5-year-old, that was very confusing to me. But Claude actually typically goes through that same set of motions where it talks to mom, goes to the lab, doesn’t find [Oak], says, ‘I need to figure something out’… It’s sophisticated enough to sort of go through the motions of the way [humans are] actually supposed to learn it, too.”


A sample of the kind of simulated reasoning process Claude steps through during a typical Pokémon battle. Credit: Claude Plays Pokemon / Twitch

These kinds of relative strengths and weaknesses when compared to “human-level” play reflect the overall state of AI research and capabilities in general, Hershey said. “I think it’s just a sort of universal thing about these models… We built the text side of it first, and the text side is definitely… more powerful. How these models can reason about images is getting better, but I think it’s a decent bit behind.”

Forget me not

Beyond issues parsing text and images, Hershey also acknowledged that Claude can have trouble “remembering” what it has already learned. The current model has a “context window” of 200,000 tokens, limiting the amount of relational information it can store in its “memory” at any one time. When the system’s ever-expanding knowledge base fills up this context window, Claude goes through an elaborate summarization process, condensing detailed notes on what it has seen, done, and learned so far into shorter text summaries that lose some of the fine-grained details.

This can mean that Claude “has a hard time keeping track of things for a very long time and really having a great sense of what it’s tried so far,” Hershey said. “You will definitely see it occasionally delete something that it shouldn’t have. Anything that’s not in your knowledge base or not in your summary is going to be gone, so you have to think about what you want to put there.”
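The summarization step the article describes can be sketched roughly as follows. The token counting here is a crude word-count stand-in, and `summarize()` is a placeholder for what would really be a model call; both are assumptions for illustration:

```python
# Sketch of context compaction: when accumulated notes approach the
# context-window budget, condense them into a shorter summary and accept
# that fine-grained details not in the summary are gone.

CONTEXT_BUDGET = 200_000  # tokens, per the article

def rough_token_count(text):
    # Crude stand-in for a real tokenizer.
    return len(text.split())

def summarize(notes):
    # Placeholder: a real system would ask the model to condense its notes.
    return "SUMMARY: " + "; ".join(n[:20] for n in notes)

def maybe_compact(notes, budget=CONTEXT_BUDGET):
    total = sum(rough_token_count(n) for n in notes)
    if total <= budget:
        return notes  # still fits; keep full detail
    return [summarize(notes)]  # anything not in the summary is lost
```

This is exactly the trade-off Hershey describes: whatever the summary omits, the model can no longer remember having tried.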


A small window into the kind of “cleaning up my context” knowledge-base update necessitated by Claude’s limited “memory.” Credit: Claude Plays Pokemon / Twitch

More than forgetting important history, though, Claude runs into bigger problems when it inadvertently inserts incorrect information into its knowledge base. Like a conspiracy theorist who builds an entire worldview from an inherently flawed premise, Claude can be incredibly slow to recognize when an error in its self-authored knowledge base is leading its Pokémon play astray.

“The things that are written down in the past, it sort of trusts pretty blindly,” Hershey said. “I have seen it become very convinced that it found the exit to [in-game location] Viridian Forest at some specific coordinates, and then it spends hours and hours exploring a little small square around those coordinates that are wrong instead of doing anything else. It takes a very long time for it to decide that that was a ‘fail.’”

Still, Hershey said Claude 3.7 Sonnet is much better than earlier models at eventually “questioning its assumptions, trying new strategies, and keeping track over long horizons of various strategies to [see] whether they work or not.” While the new model will still “struggle for really long periods of time” retrying the same thing over and over, it will ultimately tend to “get a sense of what’s going on and what it’s tried before, and it stumbles a lot of times into actual progress from that,” Hershey said.

“We’re getting pretty close…”

One of the most interesting things about observing Claude Plays Pokémon across multiple iterations and restarts, Hershey said, is seeing how the system’s progress and strategy can vary quite a bit between runs. Sometimes Claude will show it’s “capable of actually building a pretty coherent strategy” by “keeping detailed notes about the different paths to try,” for instance, he said. But “most of the time it doesn’t… most of the time, it wanders into the wall because it’s confident it sees the exit.”

Where previous models wandered aimlessly or got stuck in loops, Claude 3.7 Sonnet plans ahead, remembers its objectives, and adapts when initial strategies fail.

Critical skills for battling pixelated gym leaders. And, we posit, in solving real-world problems too. pic.twitter.com/scvISp14XG

— Anthropic (@AnthropicAI) February 25, 2025

One of the biggest things preventing the current version of Claude from getting better, Hershey said, is that “when it derives that good strategy, I don’t think it necessarily has the self-awareness to know that one strategy [it] came up with is better than another.” And that’s not a trivial problem to solve.

Still, Hershey said he sees “low-hanging fruit” for improving Claude’s Pokémon play by improving the model’s understanding of Game Boy screenshots. “I think there’s a chance it could beat the game if it had a perfect sense of what’s on the screen,” Hershey said, saying that such a model would probably perform “a little bit short of human.”

Expanding the context window for future Claude models will also probably allow those models to “reason over longer time frames and handle things more coherently over a long period of time,” Hershey said. Future models will improve by getting “a little bit better at remembering, keeping track of a coherent set of what it needs to try to make progress,” he added.


Twitch chat responds with a flood of bouncing emojis as Claude concludes an epic 78+ hour escape from Pokémon’s Mt. Moon. Credit: Claude Plays Pokemon / Twitch

Whatever you think about impending improvements in AI models, though, Claude’s current performance at Pokémon doesn’t make it seem like it’s poised to usher in an explosion of human-level, completely generalizable artificial intelligence. And Hershey allows that watching Claude 3.7 Sonnet get stuck on Mt. Moon for 80 hours or so can make it “seem like a model that doesn’t know what it’s doing.”

But Hershey is still impressed at the way that Claude’s new reasoning model will occasionally show some glimmer of awareness and “kind of tell that it doesn’t know what it’s doing and know that it needs to be doing something different. And the difference between ‘can’t do it at all’ and ‘can kind of do it’ is a pretty big one for these AI things for me,” he continued. “You know, when something can kind of do something it typically means we’re pretty close to getting it to be able to do something really, really well.”


Kyle Orland has been the Senior Gaming Editor at Ars Technica since 2012, writing primarily about the business, tech, and culture behind video games. He has journalism and computer science degrees from University of Maryland. He once wrote a whole book about Minesweeper.


The Wheel of Time is back for season three, and so are our weekly recaps

Andrew Cunningham and Lee Hutchinson have spent decades of their lives with Robert Jordan and Brandon Sanderson’s Wheel of Time books, and they previously brought that knowledge to bear as they recapped each first season episode and second season episode of Amazon’s WoT TV series. Now we’re back in the saddle for season three—along with insights, jokes, and the occasional wild theory.

These recaps won’t cover every element of every episode, but they will contain major spoilers for the show and the book series. We’ll do our best to not spoil major future events from the books, but there’s always the danger that something might slip out. If you want to stay completely unspoiled and haven’t read the books, these recaps aren’t for you.

New episodes of The Wheel of Time season three will be posted for Amazon Prime subscribers every Thursday. This write-up covers the entire three-episode season premiere, which was released on March 13.

Lee: Welcome back! Holy crap, has it only been 18 months since we left our broken and battered heroes standing in tableaux, with the sign of the Dragon flaming above Falme? Because it feels like it’s been about ten thousand years.

Andrew: Yeah, I’m not saying I want to return to the days when every drama on TV had 26 hour-long episodes per season, but when you’re doing one eight-episode run every year-and-a-half-to-two-years, you really feel those gaps. And maybe it’s just [waves arms vaguely at The World], but I am genuinely happy to have this show back.

This season’s premiere simply whips, balancing big action set-pieces with smaller character moments. The whole production seems to be hitting a confident stride. The cast has gelled; they know what book stuff they’re choosing to adapt and what they’re going to skip. I’m sure there will still be grumbles, but the show does finally feel like it’s become its own thing.

Rosamund Pike returns as Moiraine Damodred.

Credit: Courtesy of Prime/Amazon MGM Studios


Lee: Oh yeah. The first episode hits the ground running, with explosions and blood and stolen ter’angreal. And we’ve got more than one episode to talk about—the gods of production at Amazon have given us a truly gigantic three-episode premiere, with each episode lasting more than an hour. Our content cup runneth over!

Trying to straight-up recap three hours of TV isn’t going to happen in the space we have available, so we’ll probably bounce around a bit. What I wanted to talk about first was exactly what you mentioned: unlike seasons one and two, this time, the show seems to have found itself and locked right in. To me, it feels kind of like Star Trek: The Next Generation’s third season versus its first two.

Andrew: That’s a good point of comparison. I feel like a lot of TV shows fall into one of two buckets: either it starts with a great first season and gradually falls off, or it gets off to a rocky start and finds itself over time. Fewer shows get to take the second path because a “show with a rocky start” often becomes a “canceled show,” but they can be more satisfying to watch.

The one Big Overarching Plot Thing to know for book readers is that they’re basically doing book 4 (The Shadow Rising) this season, with other odds and ends tucked in. So even if it gets canceled after this, at least they will have gotten to do what I think is probably the series’ high point.

Lee: Yep, we find out in our very first episode this season that we’re going to be heading to the Aiel Waste rather than the southern city of Tear, which is a significant re-ordering of events from the books. But unlike some of the previous seasons’ changes that feel like they were forced upon the show by outside factors (COVID, actors leaving, and so on), this one feels like it serves a genuine narrative purpose. Rand is reciting the Prophecies of the Dragon to himself and he knows he needs the “People of the Dragon” to guarantee success in Tear, and while he’s not exactly sure who the “People of the Dragon” might be, it’s obvious that Rand has no army as of yet. Maybe the Aiel can help?

Rand is doing all of this because both the angel and the devil on Rand’s shoulders—that’s the Aes Sedai Moiraine Damodred with cute blue angel wings and the Forsaken Lanfear in fancy black leather BDSM gear—want him wielding Callandor, The Sword That is Not a Sword (as poor Mat Cauthon explains in the Old Tongue). This powerful sa’angreal is located in the heart of the Stone of Tear (it’s the sword in the stone, get it?!), and its removal from the Stone is a major prophetic sign that the Dragon has indeed come again.

Book three is dedicated to showing how all that happens—but, like you said, we’re not in book three anymore. We’re gonna eat our book 4 dessert before our book 3 broccoli!

Natasha O’Keeffe as Lanfear.

Credit: Courtesy of Prime/Amazon MGM Studios


Andrew: I like book 4 a lot (and I’d include 5 and 6 here too) because I think it’s when Robert Jordan was doing his best work balancing his worldbuilding and politicking with the early books’ action-adventure stuff, and including multiple character perspectives without spreading the story so thin that it could barely move forward. Book 3 was a stepping stone to this because the first two books had mainly been Rand’s, and we spend almost no time in Rand’s head in book 3. But you can’t do that in a TV show! So they’re mixing it up. Good! I am completely OK with this.

Lee: What did you think of Queen Morgase’s flashback introduction where we see how she won the Lion Throne of Andor (flanked by a pair of giant lions that I’m pretty sure came straight from Pier One Imports)? It certainly seemed a bit… evil.

Andrew: One of the bigger swerves that the show has taken with an established book character, I think! And well before she can claim to have been under the control of a Forsaken. (The other swerves I want to keep tabs on: Moiraine actively making frenemies with Lanfear to direct Rand, and Lan being the kind of guy who would ask Rand if he “wants to talk about it” when Rand is struggling emotionally. That one broke my brain, the books would be half as long as they are if men could openly talk to literally any other men about their states of mind.)

But I am totally willing to accept that Morgase change because the alternative is chapters and chapters of people yapping about consolidating political support and daes dae’mar and on and on. Bo-ring!

But speaking of Morgase and Forsaken, we’re starting to spend a little time with all the new baddies who got released at the end of last season. How do you feel about the ones we’ve met so far? I know we were generally supportive of the fact that the show is just choosing to have fewer of them in the first place.

Lee: Hah, I loved the contrast with Book Lan, who appears to only be capable of feeling stereotypically manly feelings (like rage, shame, or the German word for when duty is heavier than a mountain, which I’m pretty sure is something like “Bergpflichtenschwerengesellschaften”). It continues to feel like all of our main characters have grown up significantly from their portrayals on the page—they have sex, they use their words effectively, and they emotionally support each other like real people do in real life. I’m very much here for that particular change.

But yes, the Forsaken. We know from season two that we’re going to be seeing fewer than in the books—I believe we’ve got eight of them to deal with, and we meet almost all of them in our three-episode opening blast. I’m very much enjoying Moghedien’s portrayal by Laia Costa, but of course Lanfear is stealing the show and chewing all the scenery. It will be fascinating to see how the show lets the others loose—we know from the books that every one of the Forsaken has a role to play (including one specific Forsaken whose existence has yet to be confirmed but who figures heavily into Rand learning more about how the One Power works), and while some of those roles can be dropped without impacting the story, several definitely cannot.

And although Elaida isn’t exactly a Forsaken, it was awesome to see Shohreh Aghdashloo bombing around the White Tower looking fabulous as hell. Chrisjen Avasarala would be proud.

The boys, communicating and using their words like grown-ups.

Credit: Courtesy of Prime/Amazon MGM Studios


Andrew: Maybe I’m exaggerating but I think Shohreh Aghdashloo’s actual voice goes deeper than Hammed Animashaun’s lowered-in-post-production voice for Loial. It’s an incredible instrument.

Meeting Morgase in these early episodes means we also meet Gaebril, and the show only fakes viewers out for a few scenes before revealing what book-readers know: that he’s the Forsaken Rahvin. But I really love how these scenes play, particularly his with Elayne. After one weird, brief look, they fall into a completely convincing chummy, comfortable stepdad-stepdaughter relationship, and right after that, you find out that, oops, nope, he’s been there for like 15 minutes and has successfully One Power’d everyone into believing he’s been in their lives for decades.

It’s something that we’re mostly told-not-shown in the books, and it really sells how powerful and amoral and manipulative all these characters are. Trust is extremely hard to come by in Randland, and this is why.

Lee: I very much liked the way Gaebril’s/Rahvin’s crazy compulsion comes off, and I also like the way Nuno Lopes is playing Gaebril. He seems perhaps a little bumbling, and perhaps a little self-effacing—truly, a lovable uncle kind of guy. The kind of guy who would say “thank you” to a servant and smile at children playing. All while, you know, plotting the downfall of the kingdom. In what is becoming a refrain, it’s a fun change from the books.

And along the lines of unassuming folks, we get our first look at a Gray Man and the hella creepy mechanism by which they’re created. I can’t recall in the books if Moghedien is explicitly mentioned as being able to fashion the things, but she definitely can in the show! (And it looks uncomfortable as hell. “Never accept an agreement that involves the forcible removal of one’s soul” is an axiom I try to live by.)

Olivia Williams as Queen Morgase Trakand and Shohreh Aghdashloo as Elaida do Avriny a’Roihan.

Credit: Courtesy of Prime/Amazon MGM Studios


Andrew: It’s just one of quite a few book things that these first few episodes speedrun. Mat has weird voices in his head and speaks in tongues! Egwene and Elayne pass the Accepted test! (Having spent most of an episode on Nynaeve’s Accepted test last season, the show yada-yadas this a bit, showing us just a snippet of Egwene’s Rand-related trials and none of Elayne’s test at all.) Elayne’s brothers Gawyn and Galad show up, and everyone thinks they’re very hot, and Mat kicks their asses! The Black Ajah reveals itself in explosive fashion, and Siuan can only trust Elayne and Nynaeve to try and root them out! Min is here! Elayne and Aviendha kiss, making more of the books’ homosexual subtext into actual text!

But for the rest of the season, we split the party in basically three ways: Rand, Egwene, Moiraine and company head with Aviendha to the Waste, so that Rand can make allies of the Aiel. Perrin and a few companions head home to the Two Rivers and find that things are not as they left them. Nynaeve and Elayne are both dealing with White Tower intrigue. There are other threads, but I think this sets up most of what we’ll be paying attention to this season.

As we try to wind down this talk about three very busy episodes, is there anything you aren’t currently vibing with? I feel like Josha Stradowski’s Rand is getting lost in the shuffle a bit, despite this nominally being his story.

Lee: I agree about Rand—but, hey, the same de-centering of Rand happened in the books, so at least there is symmetry. I think the things I’m not vibing with are at this point just personal dislikes. The sets still feel cheap. The costumes are great, but the Great Serpent rings are still ludicrously large and impractical.

I’m overjoyed the show is unafraid to shine a spotlight on queer characters, and I’m also desperately glad that we aren’t being held hostage by Robert Jordan’s kinks—like, we haven’t seen a single Novice or Accepted get spanked, women don’t peel off their tops in private meetings to prove that they’re women, and rather than titillation or weirdly uncomfortable innuendo, these characters are just straight-up screwing. (The Amyrlin even notes that she’s not sure the Novices “will ever recover” after Gawyn and Galad come to—and all over—town.)

If I had to pick a moment that I enjoyed the most out of the premiere, it would probably be the entire first episode—which in spite of its length kept me riveted the entire time. I love the momentum, the feeling of finally getting the show that I’d always hoped we might get rather than the feeling of having to settle.

How about you? Dislikes? Loves?

Ceara Coveney as Elayne Trakand and Ayoola Smart as Aviendha, and they’re thinking about exactly what you think they’re thinking about.

Credit: Courtesy of Prime/Amazon MGM Studios


Andrew: Not a ton of dislikes, I am pretty in the tank for this at this point. But I do agree that some of the prop work is weird. The Horn of Valere in particular looks less like a legendary artifact and more like a decorative pitcher from a Crate & Barrel.

There were two particular scenes/moments that I really enjoyed. Rand and Perrin and Mat just hang out, as friends, for a while in the first episode, and it’s very charming. We’re told in the books constantly that these three boys are lifelong pals, but (to the point about Unavailable Men we were talking about earlier) we almost never get to see actual evidence of this, either because they’re physically split up or because they’re so wrapped up in their own stuff that they barely want to speak to each other.

I also really liked that brief moment in the first episode where a Black Ajah Aes Sedai’s Warder dies, and she’s like, “hell yeah, this feels awesome, this is making me horny because of how evil I am.” Sometimes you don’t want shades of gray—sometimes you just need some cartoonishly unambiguous villainy.

Lee: I thought the Black Ajah getting excited over death was just the right mix of cartoonishness and actual-for-real creepiness, yeah. These people have sold their eternal souls to the Shadow, and it probably takes a certain type. (Though, as book readers know, there are some surprising Black Ajah reveals yet to be had!)

We close out our three-episode extravaganza with Mat having his famous stick fight with Zoolander-esque male models Gawyn and Galad, Liandrin and the Black Ajah setting up shop (and tying off some loose ends) in Tanchico, Perrin meeting Faile and Lord Luc in the Two Rivers, and Rand in the Aiel Waste, preparing to do—well, something important, one can be sure.

We’ll leave things here for now. Expect us back next Friday to talk about episode four, which, based on the preview trailers already showing up online, will involve a certain city in the desert, wherein deep secrets will be revealed.

Mia dovienya nesodhin soende, Andrew!

Andrew: The Wheel weaves as the Wheel wills.



Scoop: Origami measuring spoon incites fury after 9 years of Kickstarter delay hell


The curious case of the missing Kickstarter spoons.

An attention-grabbing Kickstarter campaign attempting to reinvent the measuring spoon has turned into a mad, mad, mad, mad world for backers after years of broken promises and thousands of missing spoons.

The mind-boggling design for the measuring spoon first wowed the Internet in 2016 after a video promoting the Kickstarter campaign went viral and spawned widespread media coverage fawning over the unique design.

Known as Polygons, the three-in-one origami measuring spoons have a flat design that can be easily folded into common teaspoon and tablespoon measurements. “Regular spoons are so 3000 BC,” a tagline on the project’s website joked.

For gadget geeks, it’s a neat example of thinking outside of the box, and fans found it appealing to potentially replace a drawer full of spoons with a more futuristic-looking compact tool. Most backers signed up for a single set, paying $8–$12 each, while hundreds wanted up to 25 sets, a handful ordered 50, and just one backer signed up for 100. Delivery was initially promised by 2017, supposedly shipping to anywhere in the world.

But it’s been about nine years since more than 30,000 backers flocked to the Kickstarter campaign—raising more than $1 million and eclipsing Polygons’ $10,000 goal. More than a third of those backers still haven’t received their spoons, and after years of updates claiming that shipments were underway, some have begun to wonder whether the entire campaign is a fraud. They can see Polygons being sold on social media and suspect that the maker is using backers’ funds to chase profits, seemingly without ever seriously intending to fulfill their orders.

One Kickstarter backer, Caskey Hunsader, told Ars that he started doubting if the spoon’s designer—an inventor from India, Rahul Agarwal—was even a real person.

Ars reached out to verify Agarwal’s design background. We confirmed that, yes, Agarwal is a real designer, and, yes, he believes there is a method to the madness when it comes to his Kickstarter campaign, which he said was never intended to be a scam or fraud and is currently shipping spoons to backers. He forecasted that 2025 is likely the year that backers’ wait will finally end.

But as thousands of complaints on the Kickstarter attest, backers have heard that one before. It’s been two years since the last official update was posted, which only promised updates that never came and did not confirm that shipments were back on track. The prior update in 2022 promised that “the time has finally arrived when we begin bulk shipping to everyone!”

Hunsader told Ars that people seem mostly upset because of “bullshit,” which is widely referenced in the comments. And that anger is compounded “by the fact that they are producing, and they are selling this product, so they are operating their business using funds that all these people who were their first backers gave them, and we’re the ones who are not getting the product. I think that’s where the anger comes from.”

“It’s been years now, and [I’ve] watched as you promise good people their products and never deliver,” one commenter wrote. “Wherever you try… to sell [your] products, we will be there reminding them of the empty orders you left here.”

“Where is my item? I am beyond angry,” another fumed.

Those who did receive their spoons often comment on the substantial delays, but reviews are largely positive.

“Holy crap, folks,” a somewhat satisfied backer wrote. “Hell has frozen over. I finally got them (no BS).”

One backer was surprised to get twice as many spoons as expected, referencing an explanation blaming Chinese New Year for one delay and writing, “I can honestly say after 8 years… and an enormous amount of emails, I finally received my pledge. Except… I only ordered 3… and I received 6. I’d be inclined to ship some back to Polygons… bare with me… I’ll return them soon… I appreciate your patience… mebbe after Chinese New Years 2033…”

Agarwal agreed to meet with Ars, show us the spoon, and explain why backers still haven’t gotten their deliveries when the spoon appears widely available to purchase online.

Failing prototypes and unusable cheap knockoffs

As a designer, Agarwal is clearly a perfectionist. He was just a student when he had the idea for Polygons in 2014, winning design awards and garnering interest that encouraged him to find a way to manufacture the spoons. He felt eager to see people using them.

Agarwal told Ars that before he launched the Kickstarter, he had prototypes made in China that were about 85 percent of the quality that he and his collaborators at InventIndia required. Anticipating that the quality would soon be fully there, Agarwal launched the Kickstarter, along with marketing efforts that, he said, had to be halted because of unexpectedly high interest in the spoons.

This is when things started spiraling, as Agarwal had to switch manufacturers five times, with each partner crashing into new walls trying to execute the novel product.

Once the Kickstarter hit a million dollars, though, Agarwal committed to following through on launching the product. Eventually, cheap knockoff versions began appearing online on major retail sites like Walmart and Amazon toward the end of 2024. Because Agarwal has patents and trademarks for his design, he can get the knockoffs taken down, but they proved an important point that Agarwal had learned the hard way: that his design, while appearing simplistic, was incredibly hard to pull off.

Ars handled both a legitimate Polygons spoon and a cheap knockoff. The knockoff was a flimsy, unusable slab of rubber dotted with magnets; the companies aping Agarwal’s idea are seemingly unable to replicate the manufacturing process that Agarwal has spent years perfecting to finally be able to widely ship Polygons today.

On the other hand, Agarwal’s spoon is sturdy, uses food-grade materials, and worked just as well measuring wet and dry ingredients during an Ars test. A silicone hinge connects 19 separate plastic pieces and ensures that magnets neatly snap along indented lines indicating if the measurement is a quarter, half, or whole teaspoon or tablespoon. It took Agarwal two and a half years to finalize the design while working with InventIndia, a leading product development firm in India. Prototyping required making special molds that took a month each to iterate rather than using a 3D-printing shortcut whereby multiple prototypes could be made in a day, which Agarwal said he’d initially anticipated could be possible.

Around the time that the prototyping process concluded, Agarwal noted, COVID hit, and supply chains were disrupted, causing production setbacks. Once production could resume, costs became a factor, as estimates used to set Kickstarter backer awards were based on the early failed Chinese prototype, and the costs of producing a functioning spoon were much higher. Over time, shipping costs also rose.

As Kickstarter funds dwindled, there was no going back, so Agarwal devised a plan to sell the spoons for double the price ($25–$30 a set) by marketing them on social media, explaining this in a note to backers posted on the Polygons site. Those sales would fund ongoing manufacturing, allowing profits to be recycled so that Kickstarter backers could gradually receive shipments dependent on social media sales volumes. Orders from anyone who paid extra for expedited shipping are prioritized.

It’s a math problem at this point, with more funding needed to scale. But Agarwal told Ars that sales on Shopify and TikTok Shop have increased each quarter, most recently selling 30,000 units on TikTok, which allowed Polygons to take out a bigger line of credit to fund more manufacturing. He also brought in a more experienced partner to focus on the business side while he optimizes production.
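As a rough sketch of that math problem, here is one way the funding loop Agarwal describes could be modeled. All of the specific numbers below (the per-set margin on retail sales, the cost to produce and ship a backer set, and the share of margin recycled into backer fulfillment) are our own illustrative assumptions, not figures from Polygons:

```python
# Back-of-envelope sketch of the fulfillment loop described above.
# Every input here is an illustrative assumption, not a Polygons figure.

def backer_sets_funded(retail_sets_sold, margin_per_set=10.0,
                       cost_per_backer_set=8.0, recycle_share=0.5):
    """Estimate how many backer sets one round of retail sales can fund."""
    recycled = retail_sets_sold * margin_per_set * recycle_share
    return int(recycled // cost_per_backer_set)

# If roughly 38% of 30,000 backer orders remain unfilled, and one quarter
# of TikTok sales moves 30,000 units, check whether that quarter's
# recycled margin could cover the backlog under these assumptions.
remaining = int(30_000 * 0.38)
funded = backer_sets_funded(30_000)
print(funded, remaining, funded >= remaining)
```

Under these made-up inputs, one strong retail quarter could cover the remaining backer orders; with thinner margins or a smaller recycle share, the timeline stretches accordingly, which is consistent with the multi-quarter forecast Agarwal gave Ars.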

Agarwal told Ars that he understands trust has been broken with many Kickstarter backers and considers that totally fair. While about 38 percent of backers’ orders still need filling, he predicts that all backers could get their orders within the next six to eight months as Polygons becomes better resourced, though that still depends on social media sales.

Agarwal met Ars after attending a housewares show in Chicago, where he shopped the spoons with retailers who may also help scale the product in the coming years. He anticipates that as the business scales, the cost of the spoons will come back down. And he may even be able to move on to executing other product designs that have been on the backburner as he attempts to work his way out of the Kickstarter corner he backed himself into while obsessing over his first design.

Kickstarter problem goes beyond Polygons

Hunsader told Ars there’s a big difference “in a lie versus bad management,” suggesting that as a business owner who has managed Kickstarter campaigns, he thinks more transparency likely could’ve spared Polygons a lot of angry comments.

“I am not sitting here with a dart board with [Agarwal’s] face on it, being like, when am I going to get my damn spoons?” Hunsader joked. But the campaign’s Kickstarter messaging left many backers feeling like Polygons took backers’ money and ran, Hunsader said.

Unlike people who saw the spoons going viral on social media, Hunsader discovered Polygons just by scrolling on Kickstarter. As a fan of geeky gadgets, he used to regularly support campaigns, but his experience backing Polygons and watching other problematic Kickstarters play out has made him more hesitant to use the platform without more safeguards for backers.

“It’s not specifically a Polygons problem,” Hunsader told Ars. “The whole Kickstarter thing needs maybe just more protections in place.”

Kickstarter did not respond to Ars’ request to comment. But Kickstarter’s “accountability” policy makes clear that creators “put their reputation at risk” launching campaigns and are ultimately responsible for following through on backer promises. Kickstarter doesn’t issue refunds or guarantee projects, only providing limited support when backers report “suspicious activity.”

Redditors have flagged “shitty” Kickstarter campaigns since 2012, three years after the site’s founding, and the National Association of Attorneys General—which represents US state attorneys general—suggested in 2019 that disgruntled crowdfunding backers were increasingly turning to consumer protection laws to fight alleged fraud.

In 2015, an independent analysis by the University of Pennsylvania estimated that 9 percent of Kickstarter projects didn’t fulfill their rewards. More recently, it appeared that figure had doubled, as Fortune reported last year that an internal Kickstarter estimate put “the amount of revenue that comes from fraudulent projects as high as 18 percent.” A spokesperson disputed that estimate and told Fortune that the platform employs “extensive” measures to detect fraud.

Agarwal told Ars that he thinks it’s uncommon for a campaign to continue fulfilling backer rewards after eight years of setbacks. It would be easier to just shut down and walk away, and Kickstarter likely would not have penalized him for it. While the Kickstarter campaign allowed him to reach his dream of seeing people using his novel measuring spoon in the real world, it’s been bittersweet that the campaign has dragged out so long and kept the spoons out of the hands of his earliest supporters, he told Ars.

Hunsader told Ars that he hopes the Polygons story serves as a “cautionary tale” for both backers and creators who bite off more than they can chew when launching a Kickstarter campaign. He knows that designers like Agarwal can take a reputational hit.

“I don’t want to make somebody who has big dreams not want to dream, but you also, when you’re dealing with things like manufacturing technology, have to be realistic about what is and is not accomplishable,” Hunsader said.

Polygons collaborators at InventIndia told Ars that Agarwal is “dedicated and hard-working,” describing him as “someone deeply committed to delivering a product that meets the highest standards” and whose intentions have “always” been to “ship a perfect product.”

Agarwal’s team connected with Hunsader to schedule his Kickstarter reward shipment on Friday. Hunsader told Ars he doesn’t really care if it takes another nine years. It’s just a spoon, and “there are bigger fish to fry.”

“Listen, I can buy that narrative that he was somebody who got totally overwhelmed but handled it in the worst possible way ever,” Hunsader said.

He plans to continue patiently waiting for his spoons.

This story was updated on March 14 to update information on the Polygons Kickstarter campaign.


Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.



iPhone 16e review: The most expensive cheap iPhone yet


The iPhone 16e rethinks—and prices up—the basic iPhone.


The iPhone 16e, with a notch and an Action Button. Credit: Samuel Axon


For a long time, the cheapest iPhones were basically just iPhones that were older than the current flagship, but last week’s release of the $600 iPhone 16e marks a big change in how Apple is approaching its lineup.

Rather than a repackaging of an old iPhone, the 16e is the latest main iPhone—that is, the iPhone 16—with a bunch of stuff stripped away.

There are several potential advantages to this change. In theory, it allows Apple to support its lower-end offerings for longer with software updates, and it gives entry-level buyers access to more current technologies and features. It also simplifies the marketplace of accessories and the like.

There’s bad news, too, though: Since it replaces the much cheaper iPhone SE in Apple’s lineup, the iPhone 16e significantly raises the financial barrier to entry for iOS (the SE started at $430).

We spent a few days trying out the 16e and found that it’s a good phone—it’s just too bad it’s a little more expensive than the entry-level iPhone should ideally be. In many ways, this phone solves more problems for Apple than it does for consumers. Let’s explore why.


A beastly processor for an entry-level phone

Like the 16, the 16e has Apple’s A18 chip, the most recent in the made-for-iPhone line of Apple-designed chips. There’s only one notable difference: This variation of the A18 has just four GPU cores instead of five. That will show up in benchmarks and in a handful of 3D games, but it shouldn’t make too much of a difference for most people.

It’s a significant step up over the A15 found in the final 2022 refresh of the iPhone SE, enabling a handful of new features like AAA games and Apple Intelligence.

The A18’s inclusion is good for both Apple and the consumer; Apple gets to establish a new, higher baseline of performance when developing new features for current and future handsets, and consumers likely get many more years of software updates than they’d get on the older chip.

The key example of a feature enabled by the A18 that Apple would probably like us all to talk about the most is Apple Intelligence, a suite of features utilizing generative AI to solve some user problems or enable new capabilities across iOS. By enabling these for the cheapest iPhone, Apple is making its messaging around Apple Intelligence a lot easier; it no longer needs to put effort into clarifying that you can use X feature with this new iPhone but not that one.

We’ve written a lot about Apple Intelligence already, but here’s the gist: There are some useful features here in theory, but Apple’s models are clearly a bit behind the cutting edge, and results for things like notifications summaries or writing tools are pretty mixed. It’s fun to generate original emojis, though!

The iPhone 16e can even use Visual Intelligence, which actually is handy sometimes. On my iPhone 16 Pro Max, I can point the rear camera at an object and press the camera button a certain way to get information about it.

I wouldn’t have expected the 16e to support this, but it does, via the Action Button (which was first introduced in the iPhone 15 Pro). This is a reprogrammable button that can perform a variety of functions, albeit just one at a time. Visual Intelligence is one of the options here, which is pretty cool, even though it’s not essential.

The screen is the biggest upgrade over the SE

Also like the 16, the 16e has a 6.1-inch display. The resolution’s a bit different, though; it’s 2,532 by 1,170 pixels instead of 2,556 by 1,179. It also has a notch instead of the Dynamic Island seen in the 16. All this makes the iPhone 16e’s display seem like a very close match to the one seen in 2022’s iPhone 14—in fact, it might literally be the same display.

I really missed the Dynamic Island while using the iPhone 16e—it’s one of my favorite new features added to the iPhone in recent years, as it consolidates what was previously a mess of notification schemes in iOS. Plus, it’s nice to see things like Uber and DoorDash ETAs and sports scores at a glance.

The main problem with losing the Dynamic Island is that we’re back to the old mess of notification approaches, and I guess Apple has to keep supporting the old ways for a while yet. That genuinely surprises me; I would have thought Apple would want to unify notifications and activities with the Dynamic Island just like the A18 allows the standardization of other features.

This seems to indicate that the Dynamic Island is a fair bit more expensive to include than the good old camera notch flagship iPhones had been rocking since 2017’s iPhone X.

That compromise aside, the display on the iPhone 16e is ridiculously good for a phone at this price point, and it makes the old iPhone SE’s small LCD display look like it’s from another eon entirely by comparison. It gets brighter for both HDR content and sunny-day operation; the blacks are inky and deep, and the contrast and colors are outstanding.

It’s the best thing about the iPhone 16e, even if it isn’t quite as refined as the screens in Apple’s current flagships. Most people would never notice the difference between the screens in the 16e and the iPhone 16 Pro, though.

There is one other screen feature I miss from the higher-end iPhones you can buy in 2025: Those phones can drop the display all the way down to 1 nit, which is awesome for using the phone late at night in bed without disturbing a sleeping partner. Like earlier iPhones, the 16e can only get so dark.

It gets quite bright, though; Apple claims it typically reaches 800 nits in peak brightness but that it can stretch to 1200 when viewing certain HDR photos and videos. That means it gets about twice as bright as the SE did.

Connectivity is key

The iPhone 16e supports the core suite of connectivity options found in modern phones. There’s Wi-Fi 6, Bluetooth 5.3, and Apple’s usual limited implementation of NFC.

There are three new things of note here, though, and they’re good, neutral, and bad, respectively.

USB-C

Let’s start with the good. We’ve moved from Apple’s proprietary Lightning port found in older iPhones (including the final iPhone SE) toward USB-C, now a near-universal standard on mobile devices. It allows faster charging and more standardized charging cable support.

Sure, it’s a bummer to start over if you’ve spent years buying Lightning accessories, but it’s absolutely worth it in the long run. This change means that the entire iPhone line has now abandoned Lightning, so all iPhones and Android phones will have the same main port for years to come. Finally!

The finality of this shift solves a few problems for Apple: It greatly simplifies the accessory landscape and allows the company to move toward producing a smaller range of cables.

Satellite connectivity

Recent flagship iPhones have gradually added a small suite of features that utilize satellite connectivity to make life a little easier and safer.

Among those are crash detection and roadside assistance. The former will use the sensors in the phone to detect if you’ve been in a car crash and contact help, and roadside assistance allows you to text for help when you’re outside of cellular reception in the US and UK.

There are also Emergency SOS and Find My via satellite, which let you communicate with emergency responders from remote places and allow you to be found.

Along with a more general feature that allows Messages via satellite, these features can greatly expand your options if you’re somewhere remote, though they’re not as easy to use and responsive as using the regular cellular network.

Where’s MagSafe?

I don’t expect the 16e to have all the same features as the 16, which is $200 more expensive. In fact, it has more modern features than I think most of its target audience needs (more on that later). That said, there’s one notable omission that makes no sense to me at all.

The 16e does not support MagSafe, a standard for connecting accessories to the back of the device magnetically, often while allowing wireless charging via the Qi standard.

Qi wireless charging is still supported, albeit at a slow 7.5 W, but there are no magnets, meaning a lot of existing MagSafe accessories are a lot less useful with this phone, if they’re usable at all. To be fair, the SE didn’t support MagSafe either, but every new iPhone design since the iPhone 12 way back in 2020 has—and not just the premium flagships.

It’s not like the MagSafe accessory ecosystem was some bottomless well of innovation, but that magnetic alignment is handier than you might think, whether we’re talking about making sure the phone locks into place for the fastest wireless charging speeds or hanging the phone on a car dashboard to use GPS on the go.

It’s one of those things where folks coming from much older iPhones may not care because they don’t know what they’re missing, but it could be annoying in households with multiple generations of iPhones, and it just doesn’t make any sense.

Most of Apple’s choices in the 16e seem to serve the goal of unifying the whole iPhone lineup to simplify the message for consumers and make things easier for Apple to manage efficiently, but the dropping of MagSafe is bizarre.

It almost makes me think that Apple might plan to drop MagSafe from future flagship iPhones, too, and go toward something new, just because that’s the only explanation I can think of. That otherwise seems unlikely to me right now, but I guess we’ll see.

The first Apple-designed cellular modem

We’ve been seeing rumors that Apple planned to drop third-party modems from companies like Qualcomm for years. As far back as 2018, Apple was poaching Qualcomm employees in an adjacent office in San Diego. In 2020, Apple SVP Johny Srouji announced to employees that work had begun.

It sounds like development has been challenging, but the first Apple-designed modem has arrived here in the 16e of all places. Dubbed the C1, it’s… perfectly adequate. It’s about as fast or maybe just a smidge slower than what you get in the flagship phones, but almost no user would notice any difference at all.

That’s really a win for Apple, which has struggled with a tumultuous relationship with its partners here for years and which has long run into space problems in its phones in part because the third-party modems weren’t compact enough.

This change may not matter much for the consumer beyond freeing up just a tiny bit of space for a slightly larger battery, but it’s another step in Apple’s long journey to ultimately and fully control every component in the iPhone that it possibly can.

Bigger is better for batteries

There is one area where the 16e is actually superior to the 16, much less the SE: battery life. The 16e reportedly has a 3,961 mAh battery, the largest in any of the many iPhones with roughly this size screen. Apple says it offers up to 26 hours of video playback, which is the kind of number you expect to see in a much larger flagship phone.

I charged this phone three times in just under a week, though I wasn’t heavily hitting 5G networks, playing many 3D games, or cranking the brightness way up all the time while using it.

That’s a bit of a bump over the 16, but it’s a massive leap over the SE, which promised a measly 15 hours of video playback. Every single phone in Apple’s lineup now has excellent battery life by any standard.

Quality over quantity in the camera system

The 16e’s camera system leaves the SE in the dust, but it’s no match for the robust system found in the iPhone 16. Regardless, it’s way better than you’d typically expect from a phone at this price.

Like the 16, the 16e has a 48 MP “Fusion” wide-angle rear camera. It typically doesn’t take photos at 48 MP (though you can do that while compromising color detail). Rather, 24 MP is the target. The 48 MP camera enables 2x zoom that is nearly visually indistinguishable from optical zoom.

Based on both the specs and photo comparisons, the main camera sensor in the 16e appears to me to be exactly the same as the one found in the 16. We’re just missing the ultra-wide lens (which allows more zoomed-out photos, ideal for groups of people in small spaces, for example) and several extra features like advanced image stabilization, the newest Photographic Styles, and macro photography.

The iPhone 16e takes excellent photos in bright conditions. Credit: Samuel Axon

That’s a lot of missing features, sure, but it’s wild how good this camera is for this price point. Even something like the Pixel 8a can’t touch it (though to be fair, the Pixel 8a is $100 cheaper).

Video capture is a similar situation: The 16e shoots at the same resolutions and framerates as the 16, but it lacks a few specialized features like Cinematic and Action modes. There’s also a front-facing camera with the TrueDepth sensor for Face ID in that notch, and it has comparable specs to the front-facing cameras we’ve seen in a couple of years of iPhones at this point.

If you were buying a phone for the cameras, this wouldn’t be the one for you. It’s absolutely worth paying another $200 for the iPhone 16 (or even just $100 for the iPhone 15 for the ultra-wide lens for 0.5x zoom; the 15 is still available in the Apple Store) if that’s your priority.

The iPhone 16’s macro mode isn’t available here, so ultra-close-ups look fuzzy. Credit: Samuel Axon

But for the 16e’s target consumer (mostly folks with the iPhone 11 or older or an iPhone SE, who just want the cheapest functional iPhone they can get) it’s almost overkill. I’m not complaining, though it’s a contributing factor to the phone’s cost compared to entry-level Android phones and Apple’s old iPhone SE.

RIP small phones, once and for all

In one fell swoop, the iPhone 16e’s replacement of the iPhone SE eliminates a whole range of legacy technologies that have held on at the lower end of the iPhone lineup for years. Gone are Touch ID, the home button, LCD displays, and Lightning ports—they’re replaced by Face ID, swipe gestures, OLED, and USB-C.

Newer iPhones have had most of those things for quite some time. The latest feature was USB-C, which came in 2023’s iPhone 15. The removal of the SE from the lineup catches the bottom end of the iPhone up with the top in these respects.

That said, the SE had maintained one positive differentiator, too: It was small enough to be used one-handed by almost anyone. With the end of the SE and the release of the 16e, the one-handed iPhone is well and truly dead. Of course, most people have been clear they want big screens and batteries above almost all else, so the writing had been on the wall for a while for smaller phones.

The death of the iPhone SE ushers in a new era for the iPhone with bigger and better features—but also bigger price tags.

A more expensive cheap phone

Assessing the iPhone 16e is a challenge. It’s objectively a good phone—good enough for the vast majority of people. It has a nearly top-tier screen (though it clocks in at 60Hz, while some Android phones close to this price point manage 120Hz), a camera system that delivers on quality even if it lacks special features seen in flagships, strong connectivity, and performance far above what you’d expect at this price.

If you don’t care about extra camera features or nice-to-haves like MagSafe or the Dynamic Island, it’s easy to recommend saving a couple hundred bucks compared to the iPhone 16.

The chief criticism I have of the 16e has less to do with the phone itself than with Apple’s overall lineup. The iPhone SE retailed for $430, nearly half the price of the 16. By making the 16e the new bottom of the lineup, Apple has significantly raised the financial barrier to entry for iOS.

Now, it’s worth mentioning that a pretty big swath of the target market for the 16e will buy it subsidized through a carrier, so they might not pay that much up front. I always recommend buying a phone directly if you can, though, as carrier subsidization deals are usually worse for the consumer.

The 16e’s price might push more people to go for the subsidy. Plus, it’s just more phone than some people need. For example, I love a high-quality OLED display for watching movies, but I don’t think the typical iPhone SE customer was ever going to care about that.

That’s why I believe the iPhone 16e solves more problems for Apple than it does for the consumer. In multiple ways, it allows Apple to streamline production, software support, and marketing messaging. It also drives up the average price per unit across the whole iPhone line and will probably encourage some people who would have spent $430 to spend $600 instead, possibly improving revenue. All told, it’s a no-brainer for Apple.

It’s just a mixed bag for the sort of no-frills consumer who wants a minimum viable phone and who for one reason or another didn’t want to go the Android route. The iPhone 16e is definitely a good phone—I just wish there were more options for that consumer.

The good

  • Dramatically improved display compared to the iPhone SE
  • Likely stronger long-term software support than most previous entry-level iPhones
  • Good battery life and incredibly good performance for this price point
  • A high-quality camera, especially for the price

The bad

  • No ultra-wide camera
  • No MagSafe
  • No Dynamic Island

The ugly

  • Significantly raises the entry price point for buying an iPhone


Samuel Axon is a senior editor at Ars Technica. He covers Apple, software development, gaming, AI, entertainment, and mixed reality. He has been writing about gaming and technology for nearly two decades at Engadget, PC World, Mashable, Vice, Polygon, Wired, and others. He previously ran a marketing and PR agency in the gaming industry, led editorial for the TV network CBS, and worked on social media marketing strategy for Samsung Mobile at the creative agency SPCSHP. He also is an independent software and game developer for iOS, Windows, and other platforms, and he is a graduate of DePaul University, where he studied interactive media and software development.



AMD Radeon RX 9070 and 9070 XT review: RDNA 4 fixes a lot of AMD’s problems


For $549 and $599, AMD comes close to knocking out Nvidia’s GeForce RTX 5070.

AMD’s Radeon RX 9070 and 9070 XT are its first cards based on the RDNA 4 GPU architecture. Credit: Andrew Cunningham


AMD is a company that knows a thing or two about capitalizing on a competitor’s weaknesses. The company clawed its way out of its early-2010s nadir partly because its Ryzen CPUs arrived just as Intel’s manufacturing woes began to set in, first with somewhat-worse CPUs that were a great value for the money and later with CPUs that were better than anything Intel could offer.

Nvidia’s untrammeled dominance of the consumer graphics card market should also be an opportunity for AMD. Nvidia’s GeForce RTX 50-series graphics cards have given buyers very little to get excited about, with an unreachably expensive high-end 5090 refresh and modest-at-best gains from 5080 and 5070-series cards that are also pretty expensive by historical standards, when you can buy them at all. Tech YouTubers—both the people making the videos and the people leaving comments underneath them—have been almost uniformly unkind to the 50 series, hinting at consumer frustrations and pent-up demand for competitive products from other companies.

Enter AMD’s Radeon RX 9070 XT and RX 9070 graphics cards. These are aimed right at the middle of the current GPU market at the intersection of high sales volume and decent profit margins. They promise good 1440p and entry-level 4K gaming performance and improved power efficiency compared to previous-generation cards, with fixes for long-time shortcomings (ray-tracing performance, video encoding, and upscaling quality) that should, in theory, make them more tempting for people looking to ditch Nvidia.

Table of Contents

RX 9070 and 9070 XT specs and speeds

  • RX 9070 XT: 64 RDNA 4 CUs (4,096 stream processors), 2,970 MHz boost clock, 256-bit memory bus, 650GB/s bandwidth, 16GB GDDR6, 304 W TBP
  • RX 9070: 56 RDNA 4 CUs (3,584 stream processors), 2,520 MHz boost clock, 256-bit memory bus, 650GB/s bandwidth, 16GB GDDR6, 220 W TBP
  • RX 7900 XTX: 96 RDNA 3 CUs (6,144 stream processors), 2,498 MHz boost clock, 384-bit memory bus, 960GB/s bandwidth, 24GB GDDR6, 355 W TBP
  • RX 7900 XT: 84 RDNA 3 CUs (5,376 stream processors), 2,400 MHz boost clock, 320-bit memory bus, 800GB/s bandwidth, 20GB GDDR6, 315 W TBP
  • RX 7900 GRE: 80 RDNA 3 CUs (5,120 stream processors), 2,245 MHz boost clock, 256-bit memory bus, 576GB/s bandwidth, 16GB GDDR6, 260 W TBP
  • RX 7800 XT: 60 RDNA 3 CUs (3,840 stream processors), 2,430 MHz boost clock, 256-bit memory bus, 624GB/s bandwidth, 16GB GDDR6, 263 W TBP

AMD’s high-level performance promise for the RDNA 4 architecture revolves around big increases in performance per compute unit (CU). An RDNA 4 CU, AMD says, is nearly twice as fast as an RDNA 2 CU in rasterized performance (that is, rendering without ray-tracing effects enabled) and nearly 2.5 times as fast as RDNA 2 in games with ray-tracing effects enabled. Performance for at least some machine learning workloads also goes way up—twice as fast as RDNA 3 and four times as fast as RDNA 2.

We’ll see this in more detail when we start comparing performance, but AMD seems to have accomplished this goal. Despite having 64 or 56 compute units (for the 9070 XT and 9070, respectively), the cards’ performance often competes with AMD’s last-generation flagships, the RX 7900 XTX and 7900 XT. Those cards came with 96 and 84 compute units, respectively. The 9070 cards are specced a lot more like last generation’s RX 7800 XT—including the 16GB of GDDR6 on a 256-bit memory bus, as AMD still isn’t using GDDR6X or GDDR7—but they’re much faster than the 7800 XT was.
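The scale of that per-CU improvement is easy to ballpark from the spec table above. This is illustrative arithmetic using the published CU counts and boost clocks, not a benchmark result:

```python
# Implied per-CU gain if a 64-CU RDNA 4 card roughly matches a 96-CU RDNA 3 card.
cu_9070xt, clock_9070xt = 64, 2970     # boost clock in MHz, from the spec table
cu_7900xtx, clock_7900xtx = 96, 2498

# At equal total performance, per-CU throughput scales inversely with CU count.
per_cu_gain = cu_7900xtx / cu_9070xt
print(f"Per-CU gain at equal total performance: {per_cu_gain:.2f}x")  # 1.50x

# Normalizing for the 9070 XT's higher boost clock shrinks the gap but doesn't erase it.
per_cu_per_clock_gain = (cu_7900xtx * clock_7900xtx) / (cu_9070xt * clock_9070xt)
print(f"Per-CU, per-clock gain: {per_cu_per_clock_gain:.2f}x")  # 1.26x
```

In other words, even after accounting for the new cards’ higher clocks, each RDNA 4 CU is doing roughly a quarter more work per cycle than its RDNA 3 counterpart in this comparison.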

AMD has dramatically increased the performance-per-compute unit for RDNA 4. AMD

The 9070 series also uses a new 4 nm manufacturing process from TSMC, an upgrade from the 7000 series’ 5 nm process (and the 6 nm process used for the separate memory controller dies in higher-end RX 7000-series models that used chiplets). AMD’s GPUs are normally a bit less efficient than Nvidia’s, but the architectural improvements and the new manufacturing process allow AMD to do some important catch-up.

Both of the 9070 models we tested were ASRock Steel Legend models, and the 9070 and 9070 XT had identical designs—we’ll probably see a lot of this from AMD’s partners since the GPU dies and the 16GB RAM allotments are the same for both models. Both use two 8-pin power connectors; AMD says partners are free to use the 12-pin power connector if they want, but given Nvidia’s ongoing issues with it, most cards will likely stick with the reliable 8-pin connectors.

AMD doesn’t appear to be making and selling reference designs for the 9070 series the way it did for some RX 7000 and 6000-series GPUs or the way Nvidia does with its Founders Edition cards. From what we’ve seen, 2 or 2.5-slot, triple-fan designs will be the norm, the way they are for most midrange GPUs these days.

Testbed notes

We used the same GPU testbed for the Radeon RX 9070 series as we have for our GeForce RTX 50-series reviews.

An AMD Ryzen 7 9800X3D ensures that our graphics cards will be CPU-limited as little as possible. An ample 1050 W power supply, 32GB of DDR5-6000, and an AMD X670E motherboard with the latest BIOS installed round out the hardware. On the software side, we use an up-to-date installation of Windows 11 24H2 and recent GPU drivers for older cards, ensuring that our tests reflect whatever optimizations Microsoft, AMD, Nvidia, and game developers have made since the last generation of GPUs launched.

We have numbers for all of Nvidia’s RTX 50-series GPUs so far, plus most of the 40-series cards, most of AMD’s RX 7000-series cards, and a handful of older GPUs from the RTX 30-series and RX 6000 series. We’ll focus on comparing the 9070 XT and 9070 to other 1440p-to-4K graphics cards since those are the resolutions AMD is aiming at.

Performance

At $549 and $599, the 9070 series is priced to match Nvidia’s $549 RTX 5070 and undercut the $749 RTX 5070 Ti. So we’ll focus on comparing the 9070 series to those cards, plus the top tier of GPUs from the outgoing RX 7000-series.

Some 4K rasterized benchmarks.

Starting at the top with rasterized benchmarks with no ray-tracing effects, the 9070 XT does a good job of standing up to Nvidia’s RTX 5070 Ti, coming within a few frames per second of its performance in all the games we tested (and scoring very similarly in the 3DMark Time Spy Extreme benchmark).

Both cards are considerably faster than the RTX 5070—between 15 and 28 percent for the 9070 XT and between 5 and 13 percent for the regular 9070 (our 5070 scored weirdly low in Horizon Zero Dawn Remastered, so we’d treat those numbers as outliers for now). Both 9070 cards also stack up well next to the RX 7000 series here—the 9070 can usually just about match the performance of the 7900 XT, and the 9070 XT usually beats it by a little. Both cards thoroughly outrun the old RX 7900 GRE, which was AMD’s $549 GPU offering just a year ago.

The 7900 XT does have 20GB of RAM instead of 16GB, which might help its performance in some edge cases. But 16GB is still perfectly generous for a 1440p-to-4K graphics card—the 5070 only offers 12GB, which could end up limiting its performance in some games as RAM requirements continue to rise.

On ray-tracing improvements

Nvidia got a jump on AMD when it introduced hardware-accelerated ray-tracing in the RTX 20-series in 2018. And while these effects were only supported in a few games at the time, many modern games offer at least some kind of ray-traced lighting effects.

AMD caught up a little when it began shipping its own ray-tracing support in the RDNA2 architecture in late 2020, but the issue since then has always been that AMD cards have taken a larger performance hit than GeForce GPUs when these effects are turned on. RDNA3 promised improvements, but our tests still generally showed the same deficit as before.

So we’re looking for two things with RDNA4’s ray-tracing performance. First, we want the numbers to be higher than they were for comparably priced RX 7000-series GPUs, the same thing we look for in non-ray-traced (or rasterized) rendering performance. Second, we want the size of the performance hit to go down. To pick an example: the RX 7900 GRE could compete with Nvidia’s RTX 4070 Ti Super in games without ray tracing, but it was closer to a non-Super RTX 4070 in ray-traced games. That deficit has helped keep AMD’s cards from being across-the-board competitive with Nvidia’s—is that any different now?

Benchmarks for games with ray-tracing effects enabled. Both AMD cards generally keep pace with the 5070 in these tests thanks to RDNA 4’s improvements.

The picture our tests paint is mixed but tentatively positive. The 9070 series and RDNA4 post solid improvements in the Cyberpunk 2077 benchmarks, substantially closing the performance gap with Nvidia. In games where AMD’s cards performed well enough before—here represented by Returnal—performance goes up, but roughly proportionately with rasterized performance. And both 9070 cards still punch below their weight in Black Myth: Wukong, falling substantially behind the 5070 under the punishing Cinematic graphics preset.

So the benefits you see, as with any GPU update, will depend a bit on the game you’re playing. There’s also a possibility that game optimizations and driver updates made with RDNA4 in mind could boost performance further. We can’t say that AMD has caught all the way up to Nvidia here—in ray-traced games, the 9070 and 9070 XT are both closer to the GeForce RTX 5070 than the 5070 Ti, even though they stay closer to the 5070 Ti in rasterized tests—but there is real, measurable improvement here, which is what we were looking for.

Power usage

The 9070 series’ performance increases are particularly impressive when you look at the power-consumption numbers. The 9070 comes close to the 7900 XT’s performance while using 90 W less power under load, and it beats the RTX 5070 most of the time while using around 30 W less power.

The 9070 XT is a little less impressive on this front—AMD has set clock speeds pretty high, and this can increase power use disproportionately. The 9070 XT is usually 10 or 15 percent faster than the 9070 but uses 38 percent more power. The XT’s power consumption is similar to the RTX 5070 Ti’s (a GPU it often matches) and the 7900 XT’s (a GPU it always beats), so it’s not too egregious, but it’s not as standout as the 9070’s.
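A back-of-the-envelope calculation from the figures quoted above makes the efficiency gap concrete. The relative-performance number for the 9070 XT here is an illustrative midpoint of the 10-to-15-percent range, not a measured average:

```python
# Rough perf-per-watt comparison using the TBP figures quoted above.
# relative_perf for the XT is an assumed midpoint of the 10-15% range.
cards = {
    "RX 9070":    {"relative_perf": 1.000, "tbp_watts": 220},
    "RX 9070 XT": {"relative_perf": 1.125, "tbp_watts": 304},
}

extra_power = cards["RX 9070 XT"]["tbp_watts"] / cards["RX 9070"]["tbp_watts"] - 1
print(f"9070 XT draws {extra_power:.0%} more power")  # 38% more

for name, c in cards.items():
    print(f"{name}: {c['relative_perf'] / c['tbp_watts'] * 100:.3f} perf units per 100 W")
```

The ratio is lopsided enough that the vanilla 9070 comes out ahead on perf-per-watt no matter where in the 10-to-15-percent range the XT’s real advantage lands.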

AMD gives 9070 owners a couple of new toggles for power limits, though, which we’ll talk about in the next section.

Experimenting with “Total Board Power”

We don’t normally dabble much with overclocking when we review CPUs or GPUs—we’re happy to leave that to folks at other outlets. But when we review CPUs, we do usually test them with multiple power limits in place. Playing with power limits is easier (and occasionally safer) than actually overclocking, and it often comes with large gains to either performance (a chip that performs much better when given more power to work with) or efficiency (a chip that can run at nearly full speed without using as much power).

Initially, I experimented with the RX 9070’s power limits by accident. AMD sent me one version of the 9070 but exchanged it because of a minor problem the OEM identified with some units early in the production run. I had, of course, already run most of our tests on it, but that’s the way these things go sometimes.

By bumping the regular RX 9070’s TBP up just a bit, you can nudge it closer to 9070 XT-level performance.

The replacement RX 9070 card, an ASRock Steel Legend model, was performing significantly better in our tests, sometimes nearly closing the gap between the 9070 and the XT. It wasn’t until I tested power consumption that I discovered the explanation—by default, it was using a 245 W power limit rather than the AMD-defined 220 W limit. Usually, these kinds of factory tweaks don’t make much of a difference, but for the 9070, this power bump gave it a nice performance boost while still keeping it close to the 250 W power limit of the GeForce RTX 5070.

The 90-series cards we tested both add some power presets to AMD’s Adrenalin app in the Performance tab under Tuning. These replace and/or complement some of the automated overclocking and undervolting buttons that exist here for older Radeon cards. Clicking Favor Efficiency or Favor Performance can ratchet the card’s Total Board Power (TBP) up or down, limiting performance so that the card runs cooler and quieter or allowing the card to consume more power so it can run a bit faster.

The 9070 cards get slightly different performance tuning options in the Adrenalin software. These buttons mostly change the card’s Total Board Power (TBP), making it simple to either improve efficiency or boost performance a bit. Credit: Andrew Cunningham

For this particular ASRock 9070 card, the default TBP is set to 245 W. Selecting “Favor Efficiency” sets it to the default 220 W. You can double-check these values using an app like HWInfo, which displays both the current TBP and the maximum TBP in its Sensors Status window. Clicking the Custom button in the Adrenalin software gives you access to a Power Tuning slider, which for our card allowed us to ratchet the TBP up by up to 10 percent or down by as much as 30 percent.
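Those slider limits translate into a fairly wide wattage window. A minimal sketch, using the +10/-30 percent bounds we saw on this particular ASRock card (other models may expose different ranges):

```python
def tbp_range(default_tbp_watts, up_pct=10, down_pct=30):
    """Min/max board power reachable with Adrenalin's Power Tuning slider.

    The +10%/-30% defaults reflect what we saw on this ASRock card;
    other models may expose different limits.
    """
    return (default_tbp_watts * (1 - down_pct / 100),
            default_tbp_watts * (1 + up_pct / 100))

low, high = tbp_range(245)  # this card's 245 W factory default
print(f"Adjustable TBP window: {low:.1f} W to {high:.1f} W")  # 171.5 W to 269.5 W
```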

This is all the firsthand testing we did with the power limits of the 9070 series, though I would assume that adding a bit more power also adds more overclocking headroom (bumping up the power limits is common for GPU overclockers no matter who makes your card). AMD says that some of its partners will ship 9070 XT models set to a roughly 340 W power limit out of the box but acknowledges that “you start seeing diminishing returns as you approach the top of that [power efficiency] curve.”

But it’s worth noting that the driver has another automated set-it-and-forget-it power setting you can easily use to find your preferred balance of performance and power efficiency.

A quick look at FSR4 performance

There’s a toggle in the driver for enabling FSR 4 in FSR 3.1-supporting games. Credit: Andrew Cunningham

One of AMD’s headlining improvements to the RX 90-series is the introduction of FSR 4, a new version of its FidelityFX Super Resolution upscaling algorithm. Like Nvidia’s DLSS and Intel’s XeSS, FSR 4 can take advantage of RDNA 4’s machine learning processing power to do hardware-backed upscaling instead of taking a hardware-agnostic approach as the older FSR versions did. AMD says this will improve upscaling quality, but it also means FSR4 will only work on RDNA 4 GPUs.

The good news is that FSR 3.1 and FSR 4 are forward- and backward-compatible. Games that have already added FSR 3.1 support can automatically take advantage of FSR 4, and games that support FSR 4 on the 90-series can just run FSR 3.1 on older and non-AMD GPUs.

FSR 4 comes with a small performance hit compared to FSR 3.1 at the same settings, but better overall quality can let you drop to a faster preset like Balanced or Performance and end up with more frames-per-second overall. Credit: Andrew Cunningham

The only game in our current test suite to be compatible with FSR 4 is Horizon Zero Dawn Remastered, and we tested its performance using both FSR 3.1 and FSR 4. In general, we found that FSR 4 improved visual quality at the cost of just a few frames per second when run at the same settings—not unlike using Nvidia’s recently released “transformer model” for DLSS upscaling.

Many games will let you choose which version of FSR you want to use. But for FSR 3.1 games that don’t have a built-in FSR 4 option, there’s a toggle in AMD’s Adrenalin driver you can hit to switch to the better upscaling algorithm.

Even if they come with a performance hit, new upscaling algorithms can still improve performance by making the lower-resolution presets look better. We run all of our testing in “Quality” mode, which generally renders at two-thirds of native resolution and scales up. But if FSR 4 running in Balanced or Performance mode looks the same to your eyes as FSR 3.1 running in Quality mode, you can still end up with a net performance improvement in the end.
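The render resolutions behind those presets are easy to work out. The scale factors below are AMD’s published per-axis ratios for FSR 2/3; we’re assuming FSR 4 keeps them:

```python
# Per-axis render resolution for FSR presets. Scale factors are AMD's published
# FSR 2/3 ratios -- assumed (not confirmed) to carry over unchanged to FSR 4.
PRESET_SCALE = {
    "Quality": 1.5,            # 1/1.5 = two-thirds of native, as noted above
    "Balanced": 1.7,
    "Performance": 2.0,
    "Ultra Performance": 3.0,
}

def render_resolution(out_w, out_h, preset):
    s = PRESET_SCALE[preset]
    return round(out_w / s), round(out_h / s)

print(render_resolution(3840, 2160, "Quality"))      # (2560, 1440)
print(render_resolution(3840, 2160, "Performance"))  # (1920, 1080)
```

So at 4K output, dropping from Quality to Performance mode roughly halves the number of pixels the GPU actually renders, which is where the extra frames come from.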

RX 9070 or 9070 XT?

Just $50 separates the advertised price of the 9070 from that of the 9070 XT, a pricing gap both Nvidia and AMD have used in the past and one I find a bit annoying. If you have $549 to spend on a graphics card, you can almost certainly scrape together $599. All else being equal, I’d tell most people trying to choose one of these to just spring for the 9070 XT.

That said, availability and retail pricing for these might be all over the place. If your choices are a regular RX 9070 or nothing, or an RX 9070 at $549 and an RX 9070 XT at any price higher than $599, I would just grab a 9070 and not sweat it too much. The two cards aren’t that far apart in performance, especially if you bump the 9070’s TBP up a little bit, and games that are playable on one will be playable at similar settings on the other.

Pretty close to great

If you’re building a 1440p or 4K gaming box, the 9070 series might be the ones to beat right now. Credit: Andrew Cunningham

We’ve got plenty of objective data in here, so I don’t mind saying that I came into this review kind of wanting to like the 9070 and 9070 XT. Nvidia’s 50-series cards have mostly upheld the status quo, and for the last couple of years, the status quo has been sustained high prices and very modest generational upgrades. And who doesn’t like an underdog story?

I think our test results mostly justify my priors. The RX 9070 and 9070 XT are very competitive graphics cards, helped along by a particularly mediocre RTX 5070 refresh from Nvidia. In non-ray-traced games, both cards wipe the floor with the 5070 and come close to competing with the $749 RTX 5070 Ti. In games and synthetic benchmarks with ray-tracing effects on, both cards can usually match or slightly beat the similarly priced 5070, partially (if not entirely) addressing AMD’s longstanding performance deficit here. Neither card comes close to the 5070 Ti in these games, but they’re also not priced like a 5070 Ti.

Just as impressively, the Radeon cards compete with the GeForce cards while consuming similar amounts of power. At stock settings, the RX 9070 uses roughly the same amount of power under load as a 4070 Super but with better performance. The 9070 XT uses about as much power as a 5070 Ti, with similar performance before you turn ray-tracing on. Power efficiency was a small but consistent drawback for the RX 7000 series compared to GeForce cards, and the 9070 cards mostly erase that disadvantage. AMD is also less stingy with the RAM, giving you 16GB for the price Nvidia charges for 12GB.

Some of the old caveats still apply. Radeons take a bigger performance hit, proportionally, than GeForce cards when ray-tracing effects are enabled. DLSS already looks pretty good and is widely supported, while FSR 3.1/FSR 4 adoption is still relatively low. Nvidia has a nearly monopolistic grip on the dedicated GPU market, which means many apps, AI workloads, and games support its GPUs best/first/exclusively. AMD is always playing catch-up to Nvidia in some respect, and Nvidia keeps progressing quickly enough that it feels like AMD never quite has the opportunity to close the gap.

AMD also doesn’t have an answer for DLSS Multi-Frame Generation. The benefits of that technology are fairly narrow, and you already get most of those benefits with single-frame generation. But it’s still a thing that Nvidia does that AMDon’t.

Overall, the RX 9070 cards are both awfully tempting competitors to the GeForce RTX 5070—and occasionally even the 5070 Ti. They’re great at 1440p and decent at 4K. Sure, I’d like to see them priced another $50 or $100 cheaper to well and truly undercut the 5070 and bring 1440p-to-4K performance to a sub-$500 graphics card. It would be nice to see AMD undercut Nvidia’s GPUs as ruthlessly as it undercut Intel’s CPUs nearly a decade ago. But these RDNA4 GPUs have way fewer downsides than previous-generation cards, and they come at a moment of relative weakness for Nvidia. We’ll see if the sales follow.

The good

  • Great 1440p performance and solid 4K performance
  • 16GB of RAM
  • Decisively beats Nvidia’s RTX 5070, including in most ray-traced games
  • RX 9070 XT is competitive with RTX 5070 Ti in non-ray-traced games for less money
  • Both cards match or beat the RX 7900 XT, AMD’s second-fastest card from the last generation
  • Decent power efficiency for the 9070 XT and great power efficiency for the 9070
  • Automated options for tuning overall power use to prioritize either efficiency or performance
  • Reliable 8-pin power connectors available in many cards

The bad

  • Nvidia’s ray-tracing performance is still usually better
  • At $549 and $599, pricing matches but doesn’t undercut the RTX 5070
  • FSR 4 isn’t as widely supported as DLSS and may not be for a while

The ugly

  • Playing the “can you actually buy these for AMD’s advertised prices” game

Photo of Andrew Cunningham

Andrew is a Senior Technology Reporter at Ars Technica, with a focus on consumer tech including computer hardware and in-depth reviews of operating systems like Windows and macOS. Andrew lives in Philadelphia and co-hosts a weekly book podcast called Overdue.

reddit-mods-are-fighting-to-keep-ai-slop-off-subreddits-they-could-use-help.

Reddit mods are fighting to keep AI slop off subreddits. They could use help.


Mods ask Reddit for tools as generative AI gets more popular and inconspicuous.

Redditors in a treehouse with a NO AI ALLOWED sign

Credit: Aurich Lawson (based on a still from Getty Images)

Like it or not, generative AI is carving out its place in the world. And some Reddit users are definitely in the “don’t like it” category. While some subreddits openly welcome AI-generated images, videos, and text, others have responded to the growing trend by banning most or all posts made with the technology.

To better understand the reasoning and obstacles associated with these bans, Ars Technica spoke with moderators of subreddits that totally or partially ban generative AI. Almost all these volunteers described moderating against generative AI as a time-consuming challenge they expect to get more difficult as time goes on. And most are hoping that Reddit will release a tool to help their efforts.

It’s hard to know how much AI-generated content is actually on Reddit, and getting an estimate would be a large undertaking. Image library Freepik has analyzed the use of AI-generated content on social media but leaves Reddit out of its research because “it would take loads of time to manually comb through thousands of threads within the platform,” spokesperson Bella Valentini told me. For its part, Reddit doesn’t publicly disclose how many Reddit posts involve generative AI use.

To be clear, we’re not suggesting that Reddit has a large problem with generative AI use. By now, many subreddits seem to have agreed on their approach to AI-generated posts, and generative AI has not superseded the real, human voices that have made Reddit popular.

Still, mods largely agree that generative AI will likely get more popular on Reddit over the next few years, making generative AI modding increasingly important to both moderators and general users. Generative AI’s rising popularity has also had implications for Reddit the company, which in 2024 started licensing Reddit posts to train the large language models (LLMs) powering generative AI.

(Note: All the moderators I spoke with for this story requested that I use their Reddit usernames instead of their real names due to privacy concerns.)

No generative AI allowed

When it comes to anti-generative AI rules, numerous subreddits have zero-tolerance policies, while others permit posts that use generative AI if it’s combined with human elements or is executed very well. These rules task mods with identifying posts using generative AI and determining if they fit the criteria to be permitted on the subreddit.

Many subreddits have rules against posts made with generative AI because their mod teams or members consider such posts “low effort” or believe AI runs counter to the subreddit’s mission of providing real human expertise and creations.

“At a basic level, generative AI removes the human element from the Internet; if we allowed it, then it would undermine the very point of r/AskHistorians, which is engagement with experts,” the mods of r/AskHistorians told me in a collective statement.

The subreddit’s goal is to provide historical information, and its mods think generative AI could make information shared on the subreddit less accurate. “[Generative AI] is likely to hallucinate facts, generate non-existent references, or otherwise provide misleading content,” the mods said. “Someone getting answers from an LLM can’t respond to follow-ups because they aren’t an expert. We have built a reputation as a reliable source of historical information, and the use of [generative AI], especially without oversight, puts that at risk.”

Similarly, Halaku, a mod of r/wheeloftime, told me that the subreddit’s mods banned generative AI because “we focus on genuine discussion.” Halaku believes AI content can’t facilitate “organic, genuine discussion” and “can drown out actual artwork being done by actual artists.”

The r/lego subreddit banned AI-generated art because it caused confusion in online fan communities and retail stores selling Lego products, r/lego mod Mescad said. “People would see AI-generated art that looked like Lego on [I]nstagram or [F]acebook and then go into the store to ask to buy it,” they explained. “We decided that our community’s dedication to authentic Lego products doesn’t include AI-generated art.”

Not all of Reddit is against generative AI, of course. Subreddits dedicated to the technology exist, and some general subreddits permit the use of generative AI in some or all forms.

“When it comes to bans, I would rather focus on hate speech, Nazi salutes, and things that actually harm the subreddits,” said 3rdusernameiveused, who moderates r/consoom and r/TeamBuilder25, which don’t ban generative AI. “AI art does not do that… If I was going to ban [something] for ‘moral’ reasons, it probably won’t be AI art.”

“Overwhelmingly low-effort slop”

Some generative AI bans are reflective of concerns that people are not being properly compensated for the content they create, which is then fed into LLM training.

Mod Mathgeek007 told me that r/DeadlockTheGame bans generative AI because its members consider it “a form of uncredited theft,” adding:

You aren’t allowed to sell/advertise the work of others, and AI in a sense is using patterns derived from the work of others to create mockeries. I’d personally have less of an issue with it if the artists involved were credited and compensated—and there are some niche AI tools that do this.

Other moderators simply think generative AI reduces the quality of a subreddit’s content.

“It often just doesn’t look good… the art can often look subpar,” Mathgeek007 said.

Similarly, r/videos bans most AI-generated content because, according to its announcement, the videos are “annoying” and “just bad video” 99 percent of the time. In an online interview, r/videos mod Abrownn told me:

It’s overwhelmingly low-effort slop thrown together simply for views/ad revenue. The creators rarely care enough to put real effort into post-generation [or] editing of the content [and] rarely have coherent narratives [in] the videos, etc. It seems like they just throw the generated content into a video, export it, and call it a day.

An r/fakemon mod told me, “I can’t think of anything more low-effort in terms of art creation than just typing words and having it generated for you.”

Some moderators say generative AI helps people spam unwanted content on a subreddit, including posts that are irrelevant to the subreddit and posts that attack users.

“[Generative AI] content is almost entirely posted for purely self promotional/monetary reasons, and we as mods on Reddit are constantly dealing with abusive users just spamming their content without regard for the rules,” Abrownn said.

A moderator of the r/wallpaper subreddit, which permits generative AI, disagrees. The mod told me that generative AI “provides new routes for novel content” in the subreddit and questioned concerns about generative AI stealing from human artists or offering lower-quality work, saying those problems aren’t unique to generative AI:

Even in our community, we observe human-generated content that is subjectively low quality (poor camera/[P]hotoshopping skills, low-resolution source material, intentional “shitposting”). It can be argued that AI-generated content amplifies this behavior, but our experience (which we haven’t quantified) is that the rate of such behavior (whether human-generated or AI-generated content) has not changed much within our own community.

But we’re not a very active community—[about] 13 posts per day … so it very well could be a “frog in boiling water” situation.

Generative AI “wastes our time”

Many mods are confident in their ability to effectively identify posts that use generative AI. A bigger problem is how much time it takes to identify these posts and remove them.

The r/AskHistorians mods, for example, noted that all bans on the subreddit (including bans unrelated to AI) have “an appeals process,” and “making these assessments and reviewing AI appeals means we’re spending a considerable amount of time on something we didn’t have to worry about a few years ago.”

They added:

Frankly, the biggest challenge with [generative AI] usage is that it wastes our time. The time spent evaluating responses for AI use, responding to AI evangelists who try to flood our subreddit with inaccurate slop and then argue with us in modmail [direct messages sent to a subreddit’s mod team], and discussing edge cases could better be spent on other subreddit projects, like our podcast, newsletter, and AMAs, … providing feedback to users, or moderating input from users who intend to positively contribute to the community.

Several other mods I spoke with agree. Mathgeek007, for example, named “fighting AI bros” as a common obstacle. And for r/wheeloftime moderator Halaku, the biggest challenge in moderating against generative AI is “a generational one.”

“Some of the current generation don’t have a problem with it being AI because content is content, and [they think] we’re being elitist by arguing otherwise, and they want to argue about it,” they said.

A couple of mods noted that it’s less time-consuming to moderate subreddits that ban generative AI than it is to moderate those that allow posts using generative AI, depending on the context.

“On subreddits where we allowed AI, I often take a bit longer time to actually go into each post where I feel like… it’s been AI-generated to actually look at it and make a decision,” explained N3DSdude, a mod of several subreddits with rules against generative AI, including r/DeadlockTheGame.

MyarinTime, a moderator for r/lewdgames, which allows generative AI images, highlighted the challenges of identifying human-prompted generative AI content versus AI-generated content prompted by a bot:

When the AI bomb started, most of those bots started using AI content to work around our filters. Most of those bots started showing some random AI render, so it looks like you’re actually talking about a game when you’re not. There’s no way to know when those posts are legit games unless [you check] them one by one. I honestly believe it would be easier if we kick any post with [AI-]generated image… instead of checking if a button was pressed by a human or not.

Mods expect things to get worse

Most mods told me it’s pretty easy for them to detect posts made with generative AI, pointing to the distinct tone and favored phrases of AI-generated text. A few said that AI-generated video is harder to spot but still detectable. But as generative AI gets more advanced, moderators are expecting their work to get harder.

In a joint statement, r/dune mods Blue_Three and Herbalhippie said, “AI used to have a problem making hands—i.e., too many fingers, etc.—but as time goes on, this is less and less of an issue.”

R/videos’ Abrownn also wonders how easy it will be to detect AI-generated Reddit content “as AI tools advance and content becomes more lifelike.”

Mathgeek007 added:

AI is becoming tougher to spot and is being propagated at a larger rate. When AI style becomes normalized, it becomes tougher to fight. I expect generative AI to get significantly worse—until it becomes indistinguishable from ordinary art.

Moderators currently use various methods to fight generative AI, but they’re not perfect. r/AskHistorians mods, for example, use “AI detectors, which are unreliable, problematic, and sometimes require paid subscriptions, as well as our own ability to detect AI through experience and expertise,” while N3DSdude pointed to tools like Quid and GPTZero.

To manage current and future work around blocking generative AI, most of the mods I spoke with said they’d like Reddit to release a proprietary tool to help them.

“I’ve yet to see a reliable tool that can detect AI-generated video content,” Abrownn said. “Even if we did have such a tool, we’d be putting hundreds of hours of content through the tool daily, which would get rather expensive rather quickly. And we’re unpaid volunteer moderators, so we will be outgunned shortly when it comes to detecting this type of content at scale. We can only hope that Reddit will offer us a tool at some point in the near future that can help deal with this issue.”

A Reddit spokesperson told me that the company is evaluating what such a tool could look like. But Reddit doesn’t have a rule banning generative AI overall, and the spokesperson said the company doesn’t want to release a tool that would hinder expression or creativity.

For now, Reddit seems content to rely on moderators to remove AI-generated content when appropriate. Reddit’s spokesperson added:

Our moderation approach helps ensure that content on Reddit is curated by real humans. Moderators are quick to remove content that doesn’t follow community rules, including harmful or irrelevant AI-generated content—we don’t see this changing in the near future.

Making a generative AI Reddit tool wouldn’t be easy

Reddit is handling the evolving concerns around generative AI as it has handled other content issues, including by leveraging AI and machine learning tools. Reddit’s spokesperson said that this includes testing tools that can identify AI-generated media, such as images of politicians.

But making a proprietary tool that lets moderators detect AI-generated posts won’t be easy, if it happens at all. Current generative AI detection tools are limited, and as the technology advances, Reddit would need to offer something more capable than what’s available today.

That would require a good deal of technical resources and would also likely present notable economic challenges for the social media platform, which only became profitable last year. And as noted by r/videos moderator Abrownn, tools for detecting AI-generated video still have a long way to go, making a Reddit-specific system especially challenging to create.

But even with a hypothetical Reddit tool, moderators would still have their work cut out for them. And because Reddit’s popularity is largely due to its content from real humans, that work is important.

Since Reddit’s inception, that has meant relying on moderators, which Reddit has said it intends to keep doing. As r/dune mods Blue_Three and Herbalhippie put it, it’s in Reddit’s “best interest that much/most content remains organic in nature.” After all, Reddit’s profitability has a lot to do with how much AI companies are willing to pay to access Reddit data. That value would likely decline if Reddit posts became largely AI-generated themselves.

But providing the technology to ensure that generative AI isn’t abused on Reddit would be a large challenge. For now, volunteer laborers will continue to bear the brunt of generative AI moderation.

Advance Publications, which owns Ars Technica parent Condé Nast, is the largest shareholder of Reddit.


Scharon is a Senior Technology Reporter at Ars Technica writing news, reviews, and analysis on consumer gadgets and services. She’s been reporting on technology for over 10 years, with bylines at Tom’s Hardware, Channelnomics, and CRN UK.



After 50 years, Ars staffers pick their favorite Saturday Night Live sketches


“Do not taunt Happy Fun Ball.”

American musician Stevie Wonder (left) appears on an episode of ‘Saturday Night Live’ with comedian and actor Eddie Murphy, New York, New York, May 6, 1983. Credit: Anthony Barboza/Getty Images


The venerable late-night sketch comedy show Saturday Night Live is celebrating its 50th anniversary season this year. NBC will air a special on Sunday evening featuring current and former cast members.

I’ve long been a big fan of the show, since I was a kid in the late 1980s watching cast members such as Phil Hartman, Dana Carvey, and Jan Hooks. By then, the show was more than a decade old. It had already spawned huge Hollywood stars like Chevy Chase and Eddie Murphy and had gone through some near-death experiences as it struggled to find its footing.

The show most definitely does not appeal to some people. When I asked the Ars editorial team to share their favorite sketches, a few writers told me they had never found Saturday Night Live funny, hadn’t watched it in decades, or just did not get the premise of the show. Others, of course, love the show’s ability to poke fun at the cultural and political zeitgeist of the moment.

With the rise of the Internet, Saturday Night Live has become much more accessible. If you don’t care to watch live on Saturday night or record the show, its sketches are available on YouTube within a day or two. Not all of the show’s 10,000-odd sketches from the last five decades are available online, but many of them are.

With that said, here are some of our favorites!

Celebrity Hot Tub Party (Season 9)

Saturday Night Live has a thing for hot tubs, and it starts here, with the greatest of all hot tub parties.

Should you get in the water? Will it make you sweat?

Good god!

Celebrity Hot Tub.

—Ken Fisher

Papyrus (Season 43)

Some of SNL’s best skits satirize cultural touchstones that seem like they’d be way too niche but actually resonate broadly with its audience—like Font Snobs, i.e., those people who sneer at fonts like Comic Sans (you know who you are) in favor of more serious options like the all-time favorite Helvetica. (Seriously, Helvetica has its own documentary.)

In “Papyrus,” host Ryan Gosling played Steven, a man who becomes obsessed with the fact that the person who designed the Avatar logo chose to use Papyrus. “Was it laziness? Was it cruelty?” Why would any self-respecting graphic designer select the same font one sees all over in “hookah bars, Shakira merch, [and] off-brand teas”? The skit is played straight as a tense psychological thriller and ends with a frustrated Steven screaming, “I know what you did!” in front of the graphic designer’s house while the designer smirks in triumph.

There was even a sequel last year in which Gosling’s Steven is in a support group and seems to have recovered from the trauma of seeing the hated font everywhere—as long as he avoids triggers. Then he learns that the font for Avatar: The Way of Water is just Papyrus in bold.

So begins an elaborate plot to infiltrate a graphic designer awards event to confront his tormentor head-on. The twist: Steven achieves a personal epiphany instead and confronts the root of his trauma: the fact that he was never able to understand his father, Jonathan WingDings. “My dad was so hard to read,” a weeping Steven laments as he finally gets some much-needed closure. Like most sequels, it doesn’t quite capture the magic of the original, but it’s still a charming addition to the archive.

Papyrus.

—Jennifer Ouellette

Washington’s Dream (Season 49)

The only SNL skit known and loved by all my kids. Nate Bargatze is George Washington, who explains his dream of “liberty” to soldiers in his revolutionary army. Washington’s future America is heavy on bizarre weights, measures, and rules, though not quite so concerned about things like slavery.

Washington’s Dream.

—Nate Anderson

Commercial parodies

I’ve always been partial to SNL‘s commercial parodies, probably because I saw way too many similar (but earnest) commercials while watching terrestrial TV growing up.

The other good thing about the commercial format is that it’s hard to make them longer than about two minutes, so they don’t outstay their welcome like some other SNL sketches.

It’s hard to pick just one, so I’ll give a trio, along with the bits I think about and/or quote regularly.

Old Glory Insurance: “I don’t even know why the scientists make them!” (Season 21)

Old Glory Insurance.

First Citywide Change Bank: “All the time, our customers ask us, ‘How do you make money doing this?’ The answer is simple: volume.” (Season 14)

First CityWide Change Bank.

Happy Fun Ball: “Do not taunt Happy Fun Ball” (Season 16)

Happy Fun Ball.

—Kyle Orland

Anything with Phil Hartman (Seasons 12 to 20)

Phil Hartman was a regular on Saturday Night Live throughout my high school and college years, and it was nice to know that on the rare Saturday night when I did not have a date or plans, he and the cast would be on television to provide entertainment. He was the “glue” guy during his time on the show, playing a variety of roles and holding the show together.

Here are some of his most memorable sketches, at least to me.

Anal Retentive Chef. Hartman plays Gene, who is… well, anal retentive. The character appeared in five different skits over the years. This is the first one. (Season 14)

The Anal Retentive Chef.

Hartman had incredible range. During his first year on the show, he played President Reagan, who at the time had acquired the reputation of becoming doddering and forgetful. However, as Hartman clearly shows us in this sketch, that is far from reality. (Season 12)

President Reagan, Mastermind.

And here he is a few years later, during the first year of President Clinton’s term in office. This skit also features Chris Farley, who was memorable in almost everything he appeared in. “Do you mind if I wash it down?” (Season 18)

President Bill Clinton at McDonald’s.

Kyle has noted commercial parodies above, and there are many good ones. Hartman often appeared in these because he did such a good job of playing the “straight man” character in comedy, the generally normal person in contrast to all of the wackiness happening in a scene. One of Hartman’s most famous commercials is for Colon Blow cereal. However, my favorite is this zany commercial for Jiffy Pop… Airbags. (Season 17)

Jiffy Pop Airbag.

—Eric Berger

Motherlover (Season 34)

The Lonely Island (the American comedy trio of Andy Samberg, Jorma Taccone, and Akiva Schaffer, known for its comedy music videos) had bigger, more viral hits, but nothing surpasses the subversiveness of “to me, you’re like a brother, so be my motherlover.”

Motherlover.

—Jacob May

More Cowbell (Season 25)

This classic sketch gets featured on almost all SNL “best of” lists; “more cowbell” even made it into the dictionary. It’s a sendup of VH1’s “Behind the Music,” focused on the recording of Blue Öyster Cult’s 1975 hit “Don’t Fear the Reaper,” which features a distinctive percussive cowbell in the background. Will Ferrell is perfection as fictional cowbell player Gene Frenkle, whose overly enthusiastic playing is a distraction to his bandmates. But Christopher Walken’s “legendary” (and fictional) producer Bruce Dickinson loves the cowbell, encouraging Gene to “really explore the studio space” with each successive take. “I gotta have more cowbell, baby!”

Things escalate as Gene’s playing becomes first too flamboyant, then passive-aggressive, until the band works through its tensions and decides to embrace the cowbell after all. The comic timing is spot on, and the cast doesn’t let the joke run too long (a common flaw in lesser SNL skits). Ferrell’s physical antics and Walken’s brilliantly deadpan delivery—“I got a fever and the only prescription is more cowbell!”—have the cast on the verge of breaking character throughout. It deserves its place in the pantheon of SNL‘s best.

More Cowbell.

—Jennifer Ouellette

The Californians (Season 37–present)

I was going to go with Old Glory Insurance as my favorite SNL skit, but since Kyle already grabbed that one, I have to fall back on some of my runners-up. And although the Microsoft Robots and Career Day and even good ol’ Jingleheimer Junction almost topped my list, ultimately, I have to give it up to the recurring SNL skit that has probably given me more joy than anything the show has done since John Belushi’s samurai librarian. I am speaking of The Californians.

This fake soap opera, featuring a cast of perpetually blonde, perpetually unfaithful, perpetually directions-obsessed California stereotypes, hits me just right. The elements that get repeated in every skit (including and especially Fred Armisen’s inevitable “WHATAREYUUUUDUUUUUUUINGHERE” or the locally produced furniture that everyone makes a point of using in the second act) are the kind of absurdities that get funnier over time, and it’s awesome to see guest stars try on the hyper-SoCal accent that is mandatory for all characters in the Californians’ universe.

Special props to Kristen Wiig, too—she’s inevitably hilarious, but her incredulous line reading when Mick Jagger shows up as Stuart’s long-absent father (“STUART! You never told me you had a dad!”) can and will fully send me into doubled-over hysterics every single time.

The Californians.

—Lee Hutchinson

What’s the fuss about?

In more than 20 years of living in the United States, few things still remain as far outside my cultural frame of reference as SNL. Whenever someone makes an unintelligible joke in Slack (or IRC before it) and everyone laughs, it invariably turns out to be some SNL thing that anyone who grew up here instinctively understands.

To me, it was always just *crickets*.

—Jonathan Gitlin

Black Jeopardy (Season 42)

Kenan Thompson was the show’s first cast member born after SNL‘s premiere in 1975, and after joining the show in 2003, he has become its longest-running cast member. Whenever he is on screen, you know you’re about to see something hilarious. One of his best roles on SNL has become the “game show host,” with long-running bits on Family Feud and the absurdly hilarious Black Jeopardy. The most famous of these latter skits occurred in 2016, when Tom Hanks appeared. If you haven’t watched it, you really must.

Black Jeopardy.

—Eric Berger

Josh Acid (Season 15)

One of my favorite SNL sketches (and perhaps one of the most underrated) is an Old West send-up featuring a sheriff named “Josh Acid” (played by Mel Gibson during his hosting appearance in 1989), who keeps two bottles of acid in holsters instead of the standard six-shooter revolvers.

The character is a hero in his town, but when he throws acid on people, their skin melts, and they die a horrible, gruesome death. The townspeople witness one such death and say it’s “gross.” In response, the main character cites Jim Bowie using a Bowie knife and says, “I use acid because that’s my name.” At one point, Kevin Nealon, as the bartender, says the town is grateful he’s cleaned up the place, but “it’s just that we’re not sure which is worse: lawlessness, or having to watch people die horribly from acid.”

Later, when a woman asks Josh to choose between her or acid, he says, “Frida, I took a job, and that job’s not done until every criminal in this territory is either behind bars or melted down.”

The sketch is just absurdly ridiculous in a delightful way, and it gleefully subverts the stoic nobility of the stereotypical Western hero, which is a trope baby boomers grew up with on TV. If I were to stretch, I’d also say it works because it lampoons the idea that some methods of legally or rightfully killing someone are more honorable and socially acceptable than others.

It’s not on YouTube that I can find, but I found a copy on TikTok.

—Benj Edwards

Hidden Camera Commercials (Season 17)

For me—and, I suspect, most people—there are several “golden ages” of SNL. But if I had to pick just one, it would be the Chris Farley era. The crown jewel of Farley’s SNL tenure was certainly the Bob Odenkirk-penned “Van Down by the River.” Today, though, I’d like to highlight a deeper cut: a coffee commercial in which Farley’s character is told he is drinking decaf coffee instead of regular. Instead of being delighted that he can’t tell the difference in taste, he gets… ANGRY.

Farley’s incredulous “what?” and dawning rage at being deceived never fail to make me laugh.

Hidden Camera Commercials.

—Aaron Zimmerman

Wake Up and Smile (Season 21)

SNL loves to take a simple idea and repeat it—sometimes without enough progression. But “Wake Up and Smile” stands out by following its simple idea (perky morning show hosts are lost without their teleprompters) into an incredibly dark place. In six minutes, you can watch the polished veneer of civilization collapse into tribal violence, all within the absurdist confines of a vapid TV show. In the end, everyone wakes from their temporary dystopian dreamland. Well, except for the weatherman.

Wake Up and Smile

—Nate Anderson

Thanks, Nate, and everyone who contributed. Indeed, one of the joys of watching the show live is that you never know when a sketch is going to get dark or very, very dark.


Eric Berger is the senior space editor at Ars Technica, covering everything from astronomy to private space to NASA policy, and author of two books: Liftoff, about the rise of SpaceX; and Reentry, on the development of the Falcon 9 rocket and Dragon. A certified meteorologist, Eric lives in Houston.



CenturyLink nightmares: Users keep asking Ars for help with multi-month outages


More CenturyLink horror stories

Three more tales of CenturyLink failing to fix outages until hearing from Ars.

Horror poster take on the classic White Zombie about CenturyLink rendering the internet powerless

Credit: Aurich Lawson | White Zombie (Public Domain)


CenturyLink hasn’t broken its annoying habit of leaving customers without service for weeks or months and repeatedly failing to show up for repair appointments.

We’ve written about CenturyLink’s failure to fix long outages several times in the past year and a half. In each case, desperate customers contacted Ars because the telecom provider didn’t reconnect their service. And each time, CenturyLink finally sprang into action and fixed the problems shortly after hearing from an Ars reporter.

Unfortunately, it keeps happening, and CenturyLink (also known as Lumen) can’t seem to explain why. In just the last two months, we heard from CenturyLink customers in three states who were without service for anywhere from three weeks to over four months.

In early December, we heard from John in Boulder, Colorado, who preferred that we not publish his last name. John said he and his wife had been without CenturyLink phone and DSL Internet service for over three weeks.

“There’s no cell service where we live, so we have to drive to find service… We’ve scheduled repairs [with CenturyLink] three different times, but each time nobody showed up, emailed, or called,” he told us. They pay $113 a month for phone and DSL service, he said.

John also told us his elderly neighbors were without service. He read our February 2024 article about a 39-day outage in Oregon and wondered if we could help. We also published an August 2023 article about CenturyLink leaving an 86-year-old woman in Minnesota with no Internet service for a month and a May 2024 article about CenturyLink leaving a couple in Oregon with no service for two months, then billing them for $239.

We contacted CenturyLink about the outages affecting John and his neighbor, providing both addresses to the company. Service for both was fixed several hours later. Suddenly, a CenturyLink “repair person showed up today, replaced both the modem and the phone card in the nearest pedestal, and we are reconnected to the rest of the world,” John told us.

John said he also messaged a CenturyLink technician whose contact information he saved from a previous visit for a different matter. It turned out this technician had been promoted to area supervisor, so John’s outreach to him may also have contributed to the belated fix. However it happened, CenturyLink confirmed to Ars that service was restored for both John and his neighbor on the same day.

“Good news, we were able to restore service to both customers today,” a company spokesperson told us. “One had a modem issue, which needed to be replaced, and the other had a problem with their line.”

What were you waiting for?

After getting confirmation that the outages were fixed, we asked the CenturyLink spokesperson whether the company has “a plan to make sure that customer outages are always fixed when a customer contacts the company instead of waiting for a reporter to contact the company on the customer’s behalf weeks later.”

Here is the answer we got from CenturyLink: “Restoring customer service is a priority, and we apologized for the delay. We’re looking at why there was a repair delay.”

It appears that nothing has changed. Even as John’s problem was fixed, CenturyLink users in other states suffered even longer outages, and no one showed up for scheduled repair appointments. These outages weren’t fixed until late January—and only after the customers contacted us to ask for help.

Karen Kurt, a resident of Sheridan, Oregon, emailed us on January 23 to report that she had been without CenturyLink DSL Internet service since November 4, 2024. One of her neighbors was also suffering through the months-long outage.

“We have set up repair tickets only to have them voided and/or canceled,” Kurt told us. “We have sat at home on the designated repair day from 8–5 pm, and no one shows up.” Kurt’s CenturyLink phone and Internet service costs $172.04 a month, according to a recent bill she provided us. Kurt said she also has frequent CenturyLink phone outages, including some stretches that occurred during the three-month Internet outage.

Separately, a CenturyLink customer named David Stromberg in Bellevue, Washington, told us that his phone service had been out since September 16. He repeatedly scheduled repair appointments, but the scheduled days went by with no repairs. “Every couple weeks, they do this and the tech doesn’t show up,” he said.

“Quick” fixes

As far as we can tell, there weren’t any complex technical problems preventing CenturyLink from ending these outages. Once the public relations department heard from Ars, CenturyLink sent technicians to each area, and the customers had their services restored.

On the afternoon of January 24, we contacted CenturyLink about the outage affecting Kurt and her neighbor. CenturyLink restored service for both houses less than three hours later, finally ending outages that lasted over 11 weeks.

On Sunday, January 26, we informed CenturyLink’s public relations team about the outage affecting Stromberg in Washington. Service was restored about 48 hours later, ending the phone outage that lasted well over four months.

As we’ve done in previous cases, we asked CenturyLink why the outages lasted so long and why the company repeatedly failed to show up for repair appointments. We did not receive any substantive answer. “Services have been restored, and appropriate credits will be provided,” the CenturyLink spokesperson replied.

Stromberg said getting the credit wasn’t so simple. “We contacted them after service was restored. They credited the full amount, but it took a few phone calls. They also gave us a verbal apology,” he told us. He said they pay $80.67 a month for CenturyLink phone service and that they get Internet access from Comcast.

Kurt said she had to call CenturyLink each month the outage dragged on to obtain a bill credit. Though the outage is over, she said her Internet access has been unreliable since the fix, with webpages often taking painfully long times to load.

Kurt has only a 1.5Mbps DSL connection, so it’s not a modern Internet connection even on a good day. CenturyLink told us it found no further problems on its end, so it appears that Kurt is stuck with what she has for now.

Desperation

“We are just desperate,” Kurt told us when she first reached out. Kurt, a retired teacher, said she and her husband were driving to a library to access the Internet and help grandchildren with schoolwork. She said there’s no reliable cell service in the area and that they are on a waiting list for Starlink satellite service.

Kurt said her husband once suggested they switch to a different Internet provider, and she pointed out that there aren’t any better options. On the Starlink website, entering their address shows they are in an area labeled as sold out.

Although repair appointments came and went without a fix, Kurt said she received emails from CenturyLink falsely claiming that service had been restored. Kurt said she spoke with technicians doing work nearby and asked if CenturyLink is trying to force people to drop the service because it doesn’t want to serve the area anymore.

Kurt said a technician replied that there are some areas CenturyLink doesn’t want to serve anymore but that her address isn’t on that list. A technician explained that they have too much work, she said.

CenturyLink has touted its investments in modern fiber networks but hasn’t upgraded the old copper lines in Kurt’s area and many others.

“This is DSL. No fiber here!” Kurt told us. “Sometimes when things are congested, you can make a sandwich while things download. I have been told that is because this area is like a glass of water. At first, there were only a few of us drinking out of the glass. Now, CenturyLink has many more customers drinking out of that same glass, and so things are slower/congested at various times of the day.”

Kurt said the service tends to work better in mid-morning, early afternoon, after 9 pm on weeknights, and on weekends. “Sometimes pages take a bit of time to load. That is especially frustrating while doing school work with my grandson and granddaughter,” she said.

CenturyLink Internet even slower than expected

After the nearly three-month outage ended, Kurt told us on January 27 that “many times, we will get Internet back for two or three days, only to lose it again.” This seemed to be what happened on Sunday, February 2, when Kurt told us her Internet stopped working again and that she couldn’t reach a human at CenturyLink. She restarted the router but could not open webpages.

We followed up with CenturyLink’s public relations department again, but this time, the company said its network was performing as expected. “We ran a check and called Karen regarding her service,” CenturyLink told us on February 3. “Everything looks good on our end, with no problems reported since the 24th. She mentioned that she could access some sites, but the speed seemed really slow. We reminded her that she has a 1.5Mbps service. Karen acknowledged this but felt it was slower than expected.”

Kurt told us that her Internet is currently slower than it was before the outage. “Before October, at least the webpages loaded,” she said. Now, “the pages either do not load, continue to attempt to load, or finally time out.”

While Kurt is suffering from a lack of broadband competition, municipalities sometimes build public broadband networks when private companies fail to adequately serve their residents. ISPs such as CenturyLink have lobbied against these efforts to expand broadband access.

In May 2024, we wrote about how public broadband advocates say they’ve seen a big increase in opposition from “dark money” groups that don’t have to reveal their donors. At the time, CenturyLink did not answer questions about specific donations but defended its opposition to government-operated networks.

“We know it will take everyone working together to close the digital divide,” CenturyLink told us then. “That’s why we partner with municipalities on their digital inclusion efforts by providing middle-mile infrastructure that supports last-mile networks. We have and will continue to raise legitimate concerns when government-owned networks create an anti-competitive environment. There needs to be a level playing field when it comes to permitting, right-of-way fees, and cross subsidization of costs.”

Stuck with CenturyLink

Kurt said that CenturyLink has set a “low bar” for its service, and it isn’t even meeting that low standard. “I do not use the Internet a lot. I do not use the Internet for gaming or streaming things. The Internet here would never be able to do that. But I do expect the pages to load properly and fully,” she said.

Kurt said she and her husband live in a house they built in 2007 and originally were led to believe that Verizon service would be available. “Prior to purchasing the property, we did our due diligence and sought out all utility providers… Verizon insisted it was their territory on at least two occasions,” she said.

But when it was time to install phone and Internet lines, it turned out Verizon didn’t serve the location, she said. This is another problem we’ve written about multiple times—ISPs incorrectly claiming to offer service in an area, only to admit they don’t after a resident moves in. (Verizon sold its Oregon wireline operations to Frontier in 2010.)

“We were stuck with CenturyLink,” and “CenturyLink did not offer Internet when we first built this home,” Kurt said. They subscribed to satellite Internet offered by WildBlue, which was acquired by ViaSat in 2009. They used satellite for several years until they could get CenturyLink’s DSL Internet.

Now they’re hoping to replace CenturyLink with Starlink, which uses low-Earth orbit satellites that offer faster service than older satellite services. They’re on the waiting list for Starlink and are interested in Amazon’s Kuiper satellite service, which isn’t available yet.

“We are hoping one of these two vendors will open up a spot for us and we can move our Internet over to satellite,” Kurt said. “We have also heard that Starlink and Amazon are going to be starting up phone service as well as Internet. That would truly be a gift to us. If we could move all of our services over to something reliable, our life would be made so much easier.”

Not enough technicians for copper network

John, the Colorado resident who had a three-week CenturyLink outage, said his default DSL speed is 10Mbps downstream and 2Mbps upstream. He doubled that by getting a second dedicated line to create a bonded connection, he said.

When John set up repair appointments during the outage, the “dates came and went without the typical ‘your tech’s on their way’ email, without anyone showing up,” he said. John said he repeatedly called CenturyLink and was told there was a bad cable that was being fixed.

“Every time I called, I’d get somebody who said that it was a bad cable and it was being fixed. Every single time, they’d say it would be fixed by 11 pm the following day,” he said. “It wasn’t, so I’d call again. I asked to talk with a supervisor, but that was always denied. Every time, they said they’d expedite the request. The people I talked with were all very nice and very apologetic about our outage, but they clearly stayed in their box.”

John still had the contact information for the CenturyLink technician who set up his bonded connection and messaged him around the same time he contacted Ars. When a CenturyLink employee finally showed up to fix the problem, he “found that our DSL was out because our modem was bad, and the phone was out because there was a bad dial-tone card in the closest pedestal. It took this guy less than an hour to get us back working—and it wasn’t a broken cable,” John said.

John praised CenturyLink’s local repair team but said his requests for repairs apparently weren’t routed to the right people. A CenturyLink manager told John that the local crew never got the repair ticket from the phone-based customer service team, he said.

The technician who fixed the service offered some insight into the local problems, John told us. “He said that in the mountains of western Boulder County, there are a total of five techs who know how to work with copper wire,” John told us. “All the other employees only work with fiber. CenturyLink is losing the people familiar with copper and not replacing them, even though copper is what the west half of the county depends on.”

Lumen says it has 1.08 million fiber broadband subscribers and 1.47 million “other broadband subscribers,” defined as “customers that primarily subscribe to lower speed copper-based broadband services marketed under the CenturyLink brand.”

John doesn’t know whether his copper line will ever be upgraded to fiber. His house is 1.25 miles from the nearest fiber box. “I wonder if they’ll eventually replace lines like the one to our house or if they’ll drop us as customers when the copper line eventually degrades to the point it’s not usable,” he said.

Jon is a Senior IT Reporter for Ars Technica. He covers the telecom industry, Federal Communications Commission rulemakings, broadband consumer affairs, court cases, and government regulation of the tech industry.


The Severance writer and cast on corporate cults, sci-fi, and more

The following story contains light spoilers for season one of Severance but none for season 2.

The first season of Severance walked the line between science-fiction thriller and Office Space-like satire, using a clever conceit (characters can’t remember what happens at work while at home, and vice versa) to open up new storytelling possibilities.

It hinted at additional depths, but it’s really season 2’s expanded worldbuilding that begins to uncover additional themes and ideas.

After watching the first six episodes of season two and speaking with the series’ showrunner and lead writer, Dan Erickson, as well as a couple of members of the cast (Adam Scott and Patricia Arquette), I see a show that’s about more than critiquing corporate life. It’s about all sorts of social mechanisms of control. It’s also a show with a tremendous sense of style and deep influences in science fiction.

Corporation or cult?

When I started watching season 2, I had just finished watching two documentaries about cults—The Vow, about a multi-level marketing and training company that turned out to be a sex cult, and Love Has Won: The Cult of Mother God, about a small, Internet-based religious movement that believed its founder was the latest human form of God.

There were hints of cult influences in the Lumon corporate structure in season 1, but without spoiling anything, season 2 goes much deeper into them. As someone who has worked at a couple of very large media corporations, I enjoyed Severance’s send-up of corporate culture. And as someone who has worked in tech startups—both good and dysfunctional ones—and who grew up in a radical religious environment, I now enjoy its send-up of cult social dynamics and power plays.

Employees watch a corporate propaganda video

Lumon controls what information is presented to its employees to keep them in line. Credit: Apple

When I spoke with showrunner Dan Erickson and actor Patricia Arquette, I wasn’t surprised to learn that it wasn’t just me—the influence of stories about cults on season 2 was intentional.

Erickson explained:

I watched all the cult documentaries that I could find, as did the other writers, as did Ben, as did the actors. What we found as we were developing it is that there’s this weird crossover. There’s this weird gray zone between a cult and a company, or any system of power, especially one where there is sort of a charismatic personality at the top of it like Kier Eagan. You see that in companies that have sort of a reverence for their founder.

Arquette also did some research on cults. “Very early on when I got the pilot, I was pretty fascinated at that time with a lot of cult documentaries—Wild Wild Country, and I don’t know if you could call it a cult, but watching things about Scientology, but also different military schools—all kinds of things like that with that kind of structure, even certain religions,” she recalled.


Weight saving and aero optimization feature in the 2025 Porsche 911 GT3


Among the changes are better aero, shorter gearing, and the return of the Touring.

The Porsche 911 GT3 is to other 911s as other 911s are to regular cars. Credit: Jonathan Gitlin

VALENCIA, SPAIN—A Porsche 911 is rather special compared to most “normal” cars. The rear-engined sports car might be bigger and less likely to swap ends than the 1960s version, but it remains one of the more nimble and engaging four-wheeled vehicles you can buy. The 911 comes in a multitude of variants, but among driving enthusiasts, few are better regarded than the GT3. And Porsche has just treated the current 911 GT3 to its midlife refresh, which it will build in regular and Touring flavors.

The GT3 is a 911 you can drive to the track, spend the day lapping, and drive home again. It’s come a long way since the 1999 original—that car made less power than a base 911 does now. Now, the recipe is a bit more involved, with a naturally aspirated flat-six engine mounted behind the rear axle that generates 502 hp (375 kW) and 331 lb-ft (450 Nm) and a redline that doesn’t interrupt play until 9,000 rpm. You’ll need to exercise it to reach those outputs—peak power arrives at 8,500, although peak torque happens a bit sooner at around 6,000 revs.

It’s a mighty engine indeed, derived from the racing version of the 911, with some tweaks for road legality. So there are things like individual throttle valves, dry sump lubrication, solid cam finger followers (instead of hydraulic valve lifters), titanium con rods, and forged pistons.

I’ve always liked GT3s in white.

For this car, Porsche has also worked on reducing its emissions, fitting four catalytic converters to the exhaust, plus a pair of particulate filters, which together help cut NOx emissions on the US test cycle by 44 percent. This adds 3 lbs (1.4 kg) of mass and increases exhaust back pressure by 17 percent. But there are also new cylinder heads and reprofiled camshafts (from the even more focused, even more expensive GT3 RS), which increase drivability and power delivery in the upper rev range by keeping the valves open for longer.

Those tweaks might not be immediately noticeable when you look at last year’s GT3, but the shorter gearing definitely will be. The final drive ratios for both the standard seven-speed PDK dual-clutch gearbox and the six-speed manual have been reduced by 8 percent. This lowers the top speed a little—a mostly academic thing anyway outside of the German Autobahn and some very long runways—but it increases the pulling force on the rear wheels in each gear across the entire rev range. In practical terms, it means you can take a corner in a gear higher than you would in the old car.

There have been suspension tweaks, too. The GT3 moved to double front wishbone suspension (replacing the regular car’s MacPherson struts) in 2021, but now the front pivot point has been lowered to reduce the car diving under braking, and the trailing arms have a new teardrop profile that improves brake cooling and reduces drag a little. Porsche has altered the bump stops, giving the suspension an inch (24 mm) more travel at the front axle and slightly more (27 mm) at the rear axle, which in turn means more body control on bumpy roads.

A white Porsche 911 GT3 seen in profile

Credit: Porsche

New software governs the power steering. Because factors like manufacturing tolerances, wear, and even temperature can alter how steering components interact with each other, the software automatically tailors friction compensation to axle friction. Consequently, the steering is more precise and more linear in its behavior, particularly in the dead-ahead position.

The GT3 also has new front and rear fascias, again derived from the racing GT3. There are more cooling inlets, vents, and ducts, plus a new front diffuser that reduces lift at the front axle at speed. Porsche has tuned the GT3’s aerodynamics to be constant across the speed range, and like the old model, it generates around 309 lbs (140 kg) of downforce at 125 mph (200 km/h). Under the car, there are diffusers on the rear lower wishbones, and Porsche has improved brake and driveshaft cooling.

Finally, Porsche has made some changes to the interior. For instance, the GT3 now gains the same digital display seen on other facelifted 911s (the 992.2 generation if you’re a Porsche nerd), similar to the one you’d find in a Taycan, Macan, or Panamera.

Some people may mourn the loss of the big physical tachometer, but I’m not one of them. The car has a trio of UI settings: a traditional five-dial display, a more reduced three-dial display, and a track mode with just the big central tach, which you can reorient so the red line is at 12 o’clock, as was the case with many an old Porsche racing car, rather than its normal position down around 5 o’clock. And instead of a push button to start the car, there’s a twister—if a driver spins on track, it’s more intuitive to restart the car by twisting the control the way you would a key.

You can see the starter switch on the left of the steering wheel. Porsche

Finally, there are new carbon fiber seats, which now have folding backrests for better access to the rear. (However, unless I’m mistaken, you can’t adjust the angle of the backrest.) In a very clever and welcome touch, the headrest padding is removable so that your head isn’t forced forward when wearing a helmet on track. Such is the attention to detail here. (Customers can also spec the car with Porsche’s 18-way sports seats instead.)

Regular, Touring, Lightweight, Weissach

In fact, the new GT3 is available in two different versions. There’s the standard car, with its massive rear wing (complete with gooseneck mounts), which is the one you’d pick if your diet included plenty of track days. For those who want a 911 that revs to 9 but don’t plan on spending every weekend chasing lap times, Porsche has reintroduced the GT3 Touring. This version ditches the rear wing for the regular 911 rear deck, the six-speed manual is standard (with PDK as an option), and you can even specify rear seats—traditionally, the GT3 has eliminated those items in favor of weight saving.

Of course, it’s possible to cut even more weight from the GT3 with the Weissach Pack for the winged car or a lightweight package for the Touring. These options involve lots of carbon fiber bits for the interior and the rear axle, a carbon fiber roof for the Touring, and even the option of a carbon fiber roll cage for the GT3. The lightweight package for the Touring also includes an extra-short gear lever with a shorter throw.

The track mode display might be too minimalist for road driving—I tend to like being able to see my directions as well as the rpm and speed—but it’s perfect for track work. Note the redline at 12 o’clock. Porsche

Although Porsche had to add some weight to the 992.2 compared to the 992.1 thanks to thicker front brake discs and more door-side impact protection, the standard car still weighs just 3,172 lbs (1,439 kg), which you can reduce to 3,131 lbs (1,420 kg) if you fit all the lightweight goodies, including the ultra-lightweight magnesium wheels.

Behind the wheel

I began my day with a road drive in the GT3 Touring—a PDK model. Porsche wasn’t kidding about the steering. I hesitate to call it telepathic, as that’s a bit of a cliché, but it’s extremely direct, particularly the initial turn-in. There’s also plenty of welcome feedback from the front tires. In an age when far too many cars have essentially numb steering, the GT3 is something of a revelation. And it’s proof that electronic power steering can be designed and tuned to deliver a rewarding experience.

The cockpit ergonomics are spot-on, with plenty of physical controls rather than relegating everything to a touchscreen. If you’re short like me and you buy a GT3, you’ll want to have the buckets set for your driving position—while the seat adjusts for height, as you raise it up, it also pitches forward a little, making the seat back more vertical than I’d like. (The seats slide fore and aft, so they’re not quite fixed buckets as they would be in a racing car.)

The anti-dive effect of that front suspension is quite noticeable under braking, and in either Normal or Sport mode, the damper settings are well-calibrated for bumpy back roads. It’s a supple ride, if not quite a magic carpet. On the highway, the Touring cruises well, although the engine can start to sound a little droning at a constant rpm. But the highway is not what the GT3 is optimized for.

On a dusty or wet road, you need to be alert if you’re going to use a lot of throttle at low speed. Jonathan Gitlin

On winding mountain roads, again in Normal or Sport, the car comes alive. Second and third gears are perfect for these conditions, allowing you to keep the car within its power band. And boy, does it sound good as it howls between 7,000 and 9,000 rpm. Porsche’s naturally aspirated flat-sixes have a hard edge to them—the 911 RSR was always the loudest race car in the pack—and the GT3 is no exception. Even with the sports exhaust in fruity mode, there are few of the pops, bangs, and crackles you might hear in other sports cars; instead, the drama comes from the 9,000 rpm redline.

Porsche asked us to keep traction control and ESC enabled during our drive—there are one-touch buttons to disable them—and given the muddy and dusty state of the roads, this was a wise idea. (The region was beset by severe flooding recently, and there was plenty of evidence of that on the route.) Even with TC on, the rear wheels would break traction if you were injudicious with the throttle, and presumably that would be the same in the wet. But it’s very easy to catch, even if you are only of moderate driving ability, like your humble correspondent.

After lunch, it was time to try the winged car, this time on the confines of the Ricardo Tormo circuit just outside the city. On track, the handling was very neutral around most of the corners, with some understeer through the very slow turn 2. While a low curb weight and more than 500 hp made for a very fast accelerating car, the braking performance was probably even more impressive, allowing you to stand on the pedal and shed speed with no fade and little disturbance to the body control. Again, I am no driving god, but the GT3 was immensely flattering on track, and unlike much older 911s, it won’t try to swap ends on you when trail-braking or the like.

The landing was not nearly as jarring as you might think. Porsche

After some time behind the wheel, I was treated to some passenger laps by one of my favorite racing drivers, the inimitable Jörg Bergmeister. Unlike us journalists, he was not required to stay off the high curbs, and he demonstrated how well the car settles after launching its right-side wheels into the air over one of them. It settles down very quickly! He also demonstrated that the GT3 can be plenty oversteer-y on the exit of corners if you know what you’re doing, aided by the rear-wheel steering. It’s a testament to his driving that I emerged from two passenger laps far sweatier than I was after lapping the track myself.

The GT3 and GT3 Touring should be available from this summer in the US, with a starting price of $222,500. Were I looking for a 911 for road driving, I think I might be more tempted by the much cheaper 911 Carrera T, which is also pared to the bone weight-wise but uses the standard 380 hp (283 kW) turbocharged engine (which is still more power than the original GT3 of 1999). That car delivers plenty of fun at lower speeds, so it’s probably more usable on back roads.

A green Porsche 911 GT3 seen at sunset

Credit: Porsche

But if you want a 911 for track work, this new GT3 is simply perfect.

Jonathan is the Automotive Editor at Ars Technica. He has a BSc and PhD in Pharmacology. In 2014 he decided to indulge his lifelong passion for the car by leaving the National Human Genome Research Institute and launching Ars Technica’s automotive coverage. He lives in Washington, DC.


AI haters build tarpits to trap and trick AI scrapers that ignore robots.txt


Making AI crawlers squirm

Attackers explain how an anti-spam defense became an AI weapon.

Last summer, Anthropic inspired backlash when its ClaudeBot AI crawler was accused of hammering websites a million or more times a day.

And it wasn’t the only artificial intelligence company making headlines for supposedly ignoring instructions in robots.txt files to avoid scraping web content on certain sites. Around the same time, Reddit’s CEO called out all AI companies whose crawlers he said were “a pain in the ass to block,” despite the tech industry otherwise agreeing to respect “no scraping” robots.txt rules.
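For context, robots.txt is voluntary by design: it’s just a plain-text file served from a site’s root, and nothing enforces it. A site asking OpenAI’s crawler (which identifies itself as GPTBot) to stay out while admitting everyone else would publish something like:

```
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow:
```

An empty Disallow line means “nothing is off-limits.” Crawlers that honor the protocol skip the blocked paths; the ones in this story allegedly didn’t.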

Watching the controversy unfold was a software developer whom Ars has granted anonymity to discuss his development of malware (we’ll call him Aaron). Shortly after he noticed Facebook’s crawler exceeding 30 million hits on his site, Aaron began plotting a new kind of attack on crawlers “clobbering” websites that he told Ars he hoped would give “teeth” to robots.txt.

Building on an anti-spam cybersecurity tactic known as tarpitting, he created Nepenthes, malicious software named after a carnivorous plant that will “eat just about anything that finds its way inside.”

Aaron clearly warns users that Nepenthes is aggressive malware. It’s not to be deployed by site owners uncomfortable with trapping AI crawlers and sending them down an “infinite maze” of static files with no exit links, where they “get stuck” and “thrash around” for months, he tells users. Once trapped, the crawlers can be fed gibberish data, aka Markov babble, which is designed to poison AI models. That’s likely an appealing bonus feature for any site owners who, like Aaron, are fed up with paying for AI scraping and just want to watch AI burn.
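Nepenthes’ own code isn’t reproduced here, but “Markov babble” is a standard technique: sample real text into a word-transition table, then walk the table to emit prose that is statistically plausible yet meaningless. A minimal word-level sketch, with all names my own:

```python
import random

def build_chain(text, order=2):
    """Map each run of `order` consecutive words to the words seen after it."""
    words = text.split()
    chain = {}
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        chain.setdefault(key, []).append(words[i + order])
    return chain

def babble(chain, length=50, seed=0):
    """Walk the chain to produce `length` words of fluent-looking nonsense."""
    rng = random.Random(seed)
    key = rng.choice(list(chain))
    out = list(key)
    for _ in range(length - len(out)):
        followers = chain.get(tuple(out[-len(key):]))
        if not followers:  # dead end: restart from a random state
            key = rng.choice(list(chain))
            out.extend(key)
            continue
        out.append(rng.choice(followers))
    return " ".join(out[:length])
```

Because the output preserves local word statistics from the seed text, simple quality filters pass it, which is what makes it attractive as training-data poison.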

Tarpits were originally designed to waste spammers’ time and resources, but creators like Aaron have now evolved the tactic into an anti-AI weapon. As of this writing, Aaron confirmed that Nepenthes can effectively trap all but one of the major web crawlers; so far, only OpenAI’s has managed to escape.

It’s unclear how much damage tarpits or other AI attacks can ultimately do. Last May, Laxmi Korada, Microsoft’s director of partner technology, published a report detailing how leading AI companies were coping with poisoning, one of the earliest AI defense tactics deployed. He noted that all companies have developed poisoning countermeasures, while OpenAI “has been quite vigilant” and excels at detecting the “first signs of data poisoning attempts.”

Despite these efforts, he concluded that data poisoning was “a serious threat to machine learning models.” And in 2025, tarpitting represents a new threat, potentially increasing the costs of fresh data at a moment when AI companies are heavily investing and competing to innovate quickly while rarely turning significant profits.

“A link to a Nepenthes location from your site will flood out valid URLs within your site’s domain name, making it unlikely the crawler will access real content,” a Nepenthes explainer reads.
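The “flooding” works because every maze page links only to more maze pages. A hypothetical generator (names and structure are my illustration, not Nepenthes’ code) shows the trick: seed a random generator with the URL itself, so each page is stable on revisit and looks like a static file, while its links fan out to URLs that exist nowhere else:

```python
import random

def maze_page(path: str, n_links: int = 10) -> str:
    """Render a static-looking HTML page whose links all lead deeper into the maze."""
    rng = random.Random(path)  # seeded by URL: same page every visit, no exit links
    links = "\n".join(
        f'<a href="/maze/{rng.getrandbits(32):08x}">page {i}</a>'
        for i in range(n_links)
    )
    return f"<html><body>\n{links}\n</body></html>"
```

With ten links per page, the crawler’s frontier grows tenfold at every depth, crowding real URLs out of its queue.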

The only AI company that responded to Ars’ request to comment was OpenAI, whose spokesperson confirmed that OpenAI is already working on a way to fight tarpitting.

“We’re aware of efforts to disrupt AI web crawlers,” OpenAI’s spokesperson said. “We design our systems to be resilient while respecting robots.txt and standard web practices.”

But to Aaron, the fight is not about winning. Instead, it’s about resisting the AI industry further decaying the Internet with tech that no one asked for, like chatbots that replace customer service agents or the rise of inaccurate AI search summaries. By releasing Nepenthes, he hopes to do as much damage as possible, perhaps spiking companies’ AI training costs, dragging out training efforts, or even accelerating model collapse, with tarpits helping to delay the next wave of enshittification.

“Ultimately, it’s like the Internet that I grew up on and loved is long gone,” Aaron told Ars. “I’m just fed up, and you know what? Let’s fight back, even if it’s not successful. Be indigestible. Grow spikes.”

Nepenthes instantly inspires another tarpit

Nepenthes was released in mid-January and was popularized beyond Aaron’s expectations almost immediately, after tech journalist Cory Doctorow boosted a Mastodon post from tech commentator Jürgen Geuter praising the novel AI attack method. Aaron watched engagement with Nepenthes skyrocket.

“That’s when I realized, ‘oh this is going to be something,'” Aaron told Ars. “I’m kind of shocked by how much it’s blown up.”

It’s hard to tell how widely Nepenthes has been deployed. Site owners are discouraged from advertising that the malware is running, so crawlers that ignore robots.txt instructions face unknown “consequences.”

Aaron told Ars that while “a handful” of site owners have reached out and “most people are being quiet about it,” his web server logs indicate that people are already deploying the tool. Likely, site owners want to protect their content, deter scraping, or mess with AI companies.

When software developer and hacker Gergely Nagy, who goes by the handle “algernon” online, saw Nepenthes, he was delighted. At that time, Nagy told Ars that nearly all of his server’s bandwidth was being “eaten” by AI crawlers.

Already blocking scraping and attempting to poison AI models through a simpler method, Nagy took his defense method further and created his own tarpit, Iocaine. He told Ars the tarpit immediately killed off about 94 percent of bot traffic to his site, which was primarily from AI crawlers. Soon, social media discussion drove users to inquire about Iocaine deployment, including not just individuals but also organizations wanting to take stronger steps to block scraping.

Iocaine takes ideas (not code) from Nepenthes, but it’s more intent on using the tarpit to poison AI models. Nagy used a reverse proxy to trap crawlers in an “infinite maze of garbage” in an attempt to slowly poison their data collection as much as possible for daring to ignore robots.txt.
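Iocaine’s actual configuration isn’t shown here, but the reverse-proxy idea is simple: inspect each request’s User-Agent header and rewrite matches to the tarpit while passing everything else to the real site. A sketch of that decision, with an illustrative signature list (real deployments maintain longer, updated ones):

```python
import zlib

# Illustrative user-agent substrings for known AI crawlers.
CRAWLER_SIGNATURES = ("GPTBot", "ClaudeBot", "CCBot", "Bytespider")

def route(request_path: str, user_agent: str) -> str:
    """Rewrite AI-crawler requests to the tarpit; pass everything else through.

    A reverse proxy (nginx, Caddy, etc.) expresses the same rule in its own
    config language; this function just makes the decision explicit.
    """
    if any(sig in user_agent for sig in CRAWLER_SIGNATURES):
        # Derive a stable maze entry point from the request itself.
        return f"/maze/{zlib.crc32((user_agent + request_path).encode()):08x}"
    return request_path
```

Ordinary visitors never see the maze; only clients announcing themselves as crawlers get diverted, which is why the tactic targets bots that ignore robots.txt rather than the whole web.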

Taking its name from “one of the deadliest poisons known to man” from The Princess Bride, Iocaine is jokingly depicted as the “deadliest poison known to AI.” While there’s no way of validating that claim, Nagy’s motto is that the more poisoning attacks that are out there, “the merrier.” He told Ars that his primary reasons for building Iocaine were to help rights holders wall off valuable content and stop AI crawlers from crawling with abandon.

Tarpits aren’t perfect weapons against AI

Running malware like Nepenthes can burden servers, too. Aaron likened the cost of running Nepenthes to running a cheap virtual machine on a Raspberry Pi, and Nagy said that serving crawlers Iocaine costs about the same as serving his website.

But Aaron told Ars that Nepenthes wasting resources is the chief objection he’s seen preventing its deployment. Critics fear that deploying Nepenthes widely will not only burden their servers but also increase the costs of powering all that AI crawling for nothing.

“That seems to be what they’re worried about more than anything,” Aaron told Ars. “The amount of power that AI models require is already astronomical, and I’m making it worse. And my view of that is, OK, so if I do nothing, AI models, they boil the planet. If I switch this on, they boil the planet. How is that my fault?”

Aaron also defends against this criticism by suggesting that a broader impact could slow down AI investment enough to possibly curb some of that energy consumption. Perhaps due to the resistance, AI companies will be pushed to seek permission first to scrape or agree to pay more content creators for training on their data.

“Any time one of these crawlers pulls from my tarpit, it’s resources they’ve consumed and will have to pay hard cash for, but, being bullshit, the money [they] have spent to get it won’t be paid back by revenue,” Aaron posted, explaining his tactic online. “It effectively raises their costs. And seeing how none of them have turned a profit yet, that’s a big problem for them. The investor money will not continue forever without the investors getting paid.”

Nagy agrees that the more anti-AI attacks there are, the greater the potential is for them to have an impact. And by releasing Iocaine, Nagy showed that social media chatter about new attacks can inspire new tools within a few days. Marcus Butler, an independent software developer, similarly built his poisoning attack called Quixotic over a few days, he told Ars. Soon afterward, he received messages from others who built their own versions of his tool.

Butler is not in the camp of wanting to destroy AI. He told Ars that he doesn’t think “tools like Quixotic (or Nepenthes) will ‘burn AI to the ground.'” Instead, he takes a more measured stance, suggesting that “these tools provide a little protection (a very little protection) against scrapers taking content and, say, reposting it or using it for training purposes.”

But for a certain sect of Internet users, every little bit of protection seemingly helps. Geuter linked Ars to a list of tools bent on sabotaging AI. Ultimately, he expects that tools like Nepenthes are “probably not gonna be useful in the long run” because AI companies can likely detect and drop gibberish from training data. But Nepenthes represents a sea change, Geuter told Ars, providing a useful tool for people who “feel helpless” in the face of endless scraping and showing that “the story of there being no alternative or choice is false.”

Criticism of tarpits as AI weapons

Critics debating Nepenthes’ utility on Hacker News suggested that most AI crawlers could easily avoid tarpits like Nepenthes, with one commenter describing the attack as being “very crawler 101.” Aaron said that was his “favorite comment” because if tarpits are considered elementary attacks, he has “2 million lines of access log that show that Google didn’t graduate.”

But efforts to poison AI or waste AI resources don’t just mess with the tech industry. Governments globally are seeking to leverage AI to solve societal problems, and attacks on AI’s resilience seemingly threaten to disrupt that progress.

Nathan VanHoudnos is a senior AI security research scientist in the federally funded CERT Division of the Carnegie Mellon University Software Engineering Institute, which partners with academia, industry, law enforcement, and government to “improve the security and resilience of computer systems and networks.” He told Ars that new threats like tarpits seem to replicate a problem that AI companies are already well aware of: “that some of the stuff that you’re going to download from the Internet might not be good for you.”

“It sounds like these tarpit creators just mainly want to cause a little bit of trouble,” VanHoudnos said. “They want to make it a little harder for these folks to get” the “better or different” data “that they’re looking for.”

VanHoudnos co-authored a paper on “Counter AI” last August, pointing out that attackers like Aaron and Nagy are limited in how much they can mess with AI models. They may have “influence over what training data is collected but may not be able to control how the data are labeled, have access to the trained model, or have access to the AI system,” the paper said.

Further, AI companies are increasingly turning to the deep web for unique data, so any efforts to wall off valuable content with tarpits may be coming right when crawling on the surface web starts to slow, VanHoudnos suggested.

But according to VanHoudnos, AI crawlers are also “relatively cheap,” and companies may deprioritize fighting against new attacks on crawlers if “there are higher-priority assets” under attack. And tarpitting “does need to be taken seriously because it is a tool in a toolkit throughout the whole life cycle of these systems. There is no silver bullet, but this is an interesting tool in a toolkit,” he said.

Offering a choice to abstain from AI training

Aaron told Ars that he never intended Nepenthes to be a major project but that he occasionally puts in work to fix bugs or add new features. He said he’d consider working on integrations for real-time reactions to crawlers if there was enough demand.

Currently, Aaron predicts that Nepenthes might be most attractive to rights holders who want AI companies to pay to scrape their data. And many people seem enthusiastic about using it to reinforce robots.txt. But “some of the most exciting people are in the ‘let it burn’ category,” Aaron said. These people are drawn to tools like Nepenthes as an act of rebellion against AI making the Internet less useful and enjoyable for users.

Geuter told Ars that he considers Nepenthes “more of a sociopolitical statement than really a technological solution (because the problem it’s trying to address isn’t purely technical, it’s social, political, legal, and needs way bigger levers).”

To Geuter, a computer scientist who has been writing about the social, political, and structural impact of tech for two decades, AI is the “most aggressive” example of “technologies that are not done ‘for us’ but ‘to us.'”

“It feels a bit like the social contract that society and the tech sector/engineering have had (you build useful things, and we’re OK with you being well-off) has been canceled from one side,” Geuter said. “And that side now wants to have its toy eat the world. People feel threatened and want the threats to stop.”

As AI evolves, so do attacks, with one 2021 study showing that increasingly stronger data poisoning attacks, for example, were able to break data sanitization defenses. Whether these attacks can ever do meaningful destruction or not, Geuter sees tarpits as a “powerful symbol” of the resistance that Aaron and Nagy readily joined.

“It’s a great sign to see that people are challenging the notion that we all have to do AI now,” Geuter said. “Because we don’t. It’s a choice. A choice that mostly benefits monopolists.”

Tarpit creators like Nagy will likely be watching to see if poisoning attacks continue growing in sophistication. On the Iocaine site—which, yes, is protected from scraping by Iocaine—he posted this call to action: “Let’s make AI poisoning the norm. If we all do it, they won’t have anything to crawl.”

Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.


Nvidia GeForce RTX 5090 costs as much as a whole gaming PC—but it sure is fast


Even setting aside Frame Generation, this is a fast, power-hungry $2,000 GPU.

Credit: Andrew Cunningham

Nvidia’s GeForce RTX 5090 starts at $1,999 before you factor in upsells from the company’s partners or price increases driven by scalpers and/or genuine demand. It costs more than my entire gaming PC.

The new GPU is so expensive that you could build an entire well-specced gaming PC with Nvidia’s next-fastest GPU in it—the $999 RTX 5080, which we don’t have in hand yet—for the same money, or maybe even a little less with judicious component selection. It’s not the most expensive GPU that Nvidia has ever launched—2018’s $2,499 Titan RTX has it beat, and 2022’s RTX 3090 Ti also cost $2,000—but it’s safe to say it’s not really a GPU intended for the masses.

At least as far as gaming is concerned, the 5090 is the very definition of a halo product; it’s for people who demand the best and newest thing regardless of what it costs (the calculus is probably different for deep-pocketed people and companies who want to use them as some kind of generative AI accelerator). And on this front, at least, the 5090 is successful. It’s the newest and fastest GPU you can buy, and the competition is not particularly close. It’s also a showcase for DLSS Multi-Frame Generation, a new feature unique to the 50-series cards that Nvidia is leaning on heavily to make its new GPUs look better than they already are.

Founders Edition cards: Design and cooling

|  | RTX 5090 | RTX 4090 | RTX 5080 | RTX 4080 Super |
|---|---|---|---|---|
| CUDA cores | 21,760 | 16,384 | 10,752 | 10,240 |
| Boost clock | 2,410 MHz | 2,520 MHz | 2,617 MHz | 2,550 MHz |
| Memory bus width | 512-bit | 384-bit | 256-bit | 256-bit |
| Memory bandwidth | 1,792 GB/s | 1,008 GB/s | 960 GB/s | 736 GB/s |
| Memory size | 32GB GDDR7 | 24GB GDDR6X | 16GB GDDR7 | 16GB GDDR6X |
| TGP | 575 W | 450 W | 360 W | 320 W |

We won’t spend too long talking about the specific designs of Nvidia’s Founders Edition cards since many buyers will experience the Blackwell GPUs with cards from Nvidia’s partners instead (the cards we’ve seen so far mostly look like the expected fare: gargantuan triple-slot triple-fan coolers, with varying degrees of RGB). But it’s worth noting that Nvidia has addressed a couple of my functional gripes with the 4090/4080-series design.

The first was the sheer dimensions of each card—not an issue unique to Nvidia, but one that frequently caused problems for me as someone who tends toward ITX-based PCs and smaller builds. The 5090 and 5080 FE designs are the same length and height as the 4090 and 4080 FE designs, but they only take up two slots instead of three, which will make them an easier fit for many cases.

Nvidia has also tweaked the cards’ 12VHPWR connector, recessing it into the card and mounting it at a slight angle instead of having it sticking straight out of the top edge. The height of the 4090/4080 FE design made some cases hard to close up once you factored in the additional height of a 12VHPWR cable or Nvidia’s many-tentacled 8-pin-to-12VHPWR adapter. The angled connector still extends a bit beyond the top of the card, but it’s easier to tuck the cable away so you can put the side back on your case.

Finally, Nvidia has changed its cooler—whereas most OEM GPUs mount all their fans on the top of the GPU, Nvidia has historically placed one fan on each side of the card. In a standard ATX case with the GPU mounted parallel to the bottom of the case, this wasn’t a huge deal—there’s plenty of room for that air to circulate inside the case and to be expelled by whatever case fans you have installed.

But in “sandwich-style” ITX cases, where a riser cable wraps around so the GPU can be mounted parallel to the motherboard, the fan on the bottom side of the GPU was poorly placed. In many sandwich-style cases, the GPU fan will dump heat against the back of the motherboard, making it harder to keep the GPU cool and creating heat problems elsewhere besides. The new GPUs mount both fans on the top of the cards.

Nvidia’s Founders Edition cards have had heat issues in the past—most notably the 30-series GPUs—and that was my first question going in. A smaller cooler plus a dramatically higher peak power draw seems like a recipe for overheating.

Temperatures for the various cards we re-tested for this review. The 5090 FE is the toastiest of all of them, but it still has a safe operating temperature.

At least for the 5090, the smaller cooler does mean higher temperatures—around 10 to 12 degrees Celsius higher when running the same benchmarks as the RTX 4090 Founders Edition. And while temperatures of around 77 degrees aren’t hugely concerning, this is sort of a best-case scenario, with an adequately cooled testbed case with the side panel totally removed and ambient temperatures at around 21° or 22° Celsius. You’ll just want to make sure you have a good amount of airflow in your case if you buy one of these.

Testbed notes

A new high-end Nvidia GPU is a good reason to tweak our test bed and suite of games, and we’ve done both here. Mainly, we added a 1050 W Thermaltake Toughpower GF A3 power supply—Nvidia recommends at least 1000 W for the 5090, and this one has a native 12VHPWR connector for convenience. We’ve also swapped the Ryzen 7 7800X3D for a slightly faster Ryzen 7 9800X3D to reduce the odds that the CPU will bottleneck performance as we try to hit high frame rates.

As for the suite of games, we’ve removed a couple of older titles and added some with built-in benchmarks that will tax these GPUs a bit more, especially at 4K with all the settings turned up. Those games include the RT Overdrive preset in the perennially punishing Cyberpunk 2077 and Black Myth: Wukong in Cinematic mode, both games where even the RTX 4090 struggles to hit 60 fps without an assist from DLSS. We’ve also added Horizon Zero Dawn Remastered, a recent release that doesn’t include ray-tracing effects but does support most DLSS 3 and FSR 3 features (including FSR Frame Generation).

We’ve tried to strike a balance between games with ray-tracing effects and games without it, though most AAA games these days include it, and modern GPUs should be able to handle it well (best of luck to AMD with its upcoming RDNA 4 cards).

For the 5090, we’ve run all tests in 4K—if you don’t care about running games in 4K, even if you want super-high frame rates at 1440p or on some kind of ultrawide monitor, the 5090 is probably overkill. When we run upscaling tests, we use the newest DLSS version available for Nvidia cards, the newest FSR version available for AMD cards, and the newest XeSS version available for Intel cards (not relevant here, just stating for the record), and we use the “Quality” setting (at 4K, that equates to an actual rendering resolution of 1440p).

Rendering performance: A lot faster, a lot more power-hungry

Before we talk about Frame Generation or “fake frames,” let’s compare apples to apples and just examine the 5090’s rendering performance.

The card mainly benefits from four things compared to the 4090: the updated Blackwell GPU architecture, a nearly 33 percent increase in the number of CUDA cores, an upgrade from GDDR6X to GDDR7, and a move from a 384-bit memory bus to a 512-bit bus. It also jumps from 24GB of RAM to 32GB, but games generally aren’t butting up against a 24GB limit yet, so the capacity increase by itself shouldn’t really change performance if all you’re focused on is gaming.

And for people who prioritize performance over all else, the 5090 is a big deal—it’s the first consumer graphics card from any company that is faster than a 4090, as Nvidia never spruced up the 4090 last year when it did its mid-generation Super refreshes of the 4080, 4070 Ti, and 4070.

Comparing natively rendered games at 4K, the 5090 is between 17 percent and 40 percent faster than the 4090, with most of the games we tested landing somewhere in the low to high 30 percent range. That’s an undeniably big bump, one that’s roughly commensurate with the increase in the number of CUDA cores. Tests run with DLSS enabled (both upscaling-only and with Frame Generation running in 2x mode) improve by roughly the same amount.

You could find things to be disappointed about if you went looking for them. That 30-something-percent performance increase comes with a 35 percent increase in power use in our testing under load with punishing 4K games—the 4090 tops out around 420 W, whereas the 5090 went all the way up to 573 W, with the 5090 coming closer to its 575 W TDP than the 4090 does to its theoretical 450 W maximum. The 50-series cards use the same TSMC 4N manufacturing process as the 40-series cards, and increasing the number of transistors without changing the process results in a chip that uses more power (though it should be said that capping frame rates, running at lower resolutions, or running less-demanding games can rein in that power use a bit).

Power draw under load goes up by an amount roughly commensurate with performance. The 4090 was already power-hungry; the 5090 is dramatically more so. Credit: Andrew Cunningham

The 5090’s 30-something percent increase over the 4090 might also seem underwhelming if you recall that the 4090 was around 55 percent faster than the previous-generation 3090 Ti while consuming about the same amount of power. To be even faster than a 4090 is no small feat—AMD’s fastest GPU is more in line with Nvidia’s 4080 Super—but if you’re comparing the two cards using the exact same tests, the relative leap is less seismic.
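Another way to frame the generational comparison is performance per watt, which the figures above suggest barely moved. A quick back-of-the-envelope sketch using this review's approximate measured numbers (illustrative only; real results vary by game and settings):

```python
# Rough efficiency comparison using the review's measured figures.
perf_4090 = 1.00          # normalized 4K rendering performance
perf_5090 = 1.33          # ~33 percent faster on average
power_4090 = 420          # watts under load, measured
power_5090 = 573          # watts under load, measured

eff_4090 = perf_4090 / power_4090
eff_5090 = perf_5090 / power_5090

# Efficiency is essentially flat: same TSMC 4N process, just more
# transistors drawing more power.
print(f"5090 perf/W relative to 4090: {eff_5090 / eff_4090:.2f}x")
```

That near-1.0x ratio is consistent with the point about the unchanged manufacturing process: the extra performance is mostly bought with extra power.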

That brings us to Nvidia’s answer for that problem: DLSS 4 and its Multi-Frame Generation feature.

DLSS 4 and Multi-Frame Generation

As a refresher, Nvidia’s DLSS Frame Generation feature, as introduced in the GeForce 40-series, takes DLSS upscaling one step further. The upscaling feature inserted interpolated pixels into a rendered image to make it look like a sharper, higher-resolution image without having to do all the work of rendering all those pixels. DLSS FG would interpolate an entire frame between rendered frames, boosting your FPS without dramatically boosting the amount of work your GPU was doing. If you used DLSS upscaling and FG at the same time, Nvidia could claim that seven out of eight pixels on your screen were generated by AI.

DLSS Multi-Frame Generation (hereafter MFG, for simplicity’s sake) does the same thing, but it can generate one to three interpolated frames for every rendered frame. The marketing numbers have gone up, too; now, 15 out of every 16 pixels on your screen can be generated by AI.
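The arithmetic behind those marketing pixel counts is simple, assuming DLSS Performance-mode upscaling (which renders at half resolution on each axis, i.e., one quarter of the output pixels). A minimal sketch:

```python
from fractions import Fraction

def generated_pixel_share(upscale_per_axis: int, frames_per_rendered: int) -> Fraction:
    """Fraction of on-screen pixels that are AI-generated.

    upscale_per_axis: 2 for DLSS Performance mode (half resolution in
        each dimension, so 1/4 of the pixels are natively rendered).
    frames_per_rendered: total frames shown per rendered frame
        (2 for classic FG "2x", 4 for MFG "4x").
    """
    rendered = Fraction(1, upscale_per_axis ** 2) * Fraction(1, frames_per_rendered)
    return 1 - rendered

print(generated_pixel_share(2, 2))  # classic DLSS FG: 7/8 of pixels
print(generated_pixel_share(2, 4))  # DLSS MFG "4x": 15/16 of pixels
```

The function names and mode assumptions here are illustrative, but the fractions match Nvidia's "seven out of eight" and "15 out of 16" claims.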

Nvidia might point to this and say that the 5090 is over twice as fast as the 4090, but that’s not really comparing apples to apples. Expect this issue to persist over the lifetime of the 50-series. Credit: Andrew Cunningham

Nvidia provided reviewers with a preview build of Cyberpunk 2077 with DLSS MFG enabled, which gives us an example of how those settings will be exposed to users. For 40-series cards that only support the regular DLSS FG, you won’t notice a difference in games that support MFG—Frame Generation is still just one toggle you can turn on or off. For 50-series cards that support MFG, you’ll be able to choose from among a few options, just as you currently can with other DLSS quality settings.

The “2x” mode is the old version of DLSS FG and is supported by both the 50-series cards and 40-series GPUs; it promises one generated frame for every rendered frame (two frames total, hence “2x”). The “3x” and “4x” modes are new to the 50-series and promise two and three generated frames (respectively) for every rendered frame. Like the original DLSS FG, MFG can be used in concert with normal DLSS upscaling, or it can be used independently.

One problem with the original DLSS FG was latency—user input was only being sampled at the natively rendered frame rate, meaning you could be looking at 60 frames per second on your display but only having your input polled 30 times per second. Another is image quality; as good as the DLSS algorithms can be at guessing and recreating what a natively rendered pixel would look like, you’ll inevitably see errors, particularly in fine details.

Both these problems contribute to the third problem with DLSS FG: Without a decent underlying frame rate, the lag you feel and the weird visual artifacts you notice will both be more pronounced. So DLSS FG can be useful for turning 120 fps into 240 fps, or even 60 fps into 120 fps. But it’s not as helpful if you’re trying to get from 20 or 30 fps up to a smooth 60 fps.
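To put illustrative numbers on that trade-off (the frame rates below are hypothetical, not measured): the displayed frame rate scales with the generation factor, but input is still only sampled once per natively rendered frame.

```python
def fg_rates(native_fps: float, gen_factor: int) -> tuple[float, float]:
    """Display fps and input-poll interval (ms) under frame generation.

    gen_factor: total frames shown per natively rendered frame
    (2 for classic FG, up to 4 for MFG).
    """
    display_fps = native_fps * gen_factor
    input_interval_ms = 1000 / native_fps  # input sampled only per native frame
    return display_fps, input_interval_ms

# 120 fps native -> 240 fps shown, input still polled every ~8.3 ms: feels fine.
print(fg_rates(120, 2))
# 15 fps native -> 60 fps shown, but input only every ~66.7 ms: feels laggy.
print(fg_rates(15, 4))
```

Both examples show "60 fps or better" on a frame counter, but the second one responds to input like a 15 fps game, which is why a decent base frame rate matters.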

We’ll be taking a closer look at the DLSS upgrades in the next couple of weeks (including MFG and the new transformer model, which will supposedly increase upscaling quality and supports all RTX GPUs). But in our limited testing so far, the issues with DLSS MFG are basically the same as with the first version of Frame Generation, just slightly more pronounced. In the built-in Cyberpunk 2077 benchmark, the most visible issues are with some bits of barbed-wire fencing, which get smoother-looking and less detailed as you crank up the number of AI-generated frames. But the motion does look fluid and smooth, and the frame rate counts are admittedly impressive.

But as we noted in last year’s 4090 review, the xx90 cards portray FG and MFG in the best light possible since the card is already capable of natively rendering such high frame rates. It’s on lower-end cards where the shortcomings of the technology become more pronounced. Nvidia might say that the upcoming RTX 5070 is “as fast as a 4090 for $549,” and it might be right in terms of the number of frames the card can put up on your screen every second. But responsiveness and visual fidelity on the 4090 will be better every time—AI is a good augmentation for rendered frames, but it’s iffy as a replacement for rendered frames.

A 4090, amped way up

Nvidia’s GeForce RTX 5090. Credit: Andrew Cunningham

The GeForce RTX 5090 is an impressive card—it’s the only consumer graphics card released in over two years that can outperform the RTX 4090. The main caveats are its sky-high power consumption and sky-high price; by itself, it costs as much (and consumes as much power) as an entire mainstream gaming PC. The card is aimed at people who care about speed far more than they care about price, but it’s still worth putting it into context.

The main controversy, as with the 40-series, is how Nvidia talks about its Frame Generation-inflated performance numbers. Frame Generation and Multi-Frame Generation are tools in a toolbox—there will be games where they make things look great and run fast with minimal noticeable impact to visual quality or responsiveness, games where those impacts are more noticeable, and games that never add support for the features at all. (As well-supported as DLSS generally is in new releases, it is incumbent upon game developers to add it—and update it when Nvidia puts out a new version.)

But using those Multi-Frame Generation-inflated FPS numbers to make topline comparisons to last-generation graphics cards just feels disingenuous. No, an RTX 5070 will not be as fast as an RTX 4090 for just $549, because not all games support DLSS MFG, and not all games that do support it will run it well. Frame Generation still needs a good base frame rate to start with, and the slower your card is, the more issues you might notice.

Fuzzy marketing aside, Nvidia is still the undisputed leader in the GPU market, and the RTX 5090 extends that leadership for what will likely be another entire GPU generation, since both AMD and Intel are focusing their efforts on higher-volume, lower-cost cards right now. DLSS is still generally better than AMD’s FSR, and Nvidia does a good job of getting developers of new AAA game releases to support it. And if you’re buying this GPU to do some kind of rendering work or generative AI acceleration, Nvidia’s performance and software tools are still superior. The misleading performance claims are frustrating, but Nvidia still gains a lot of real advantages from being as dominant and entrenched as it is.

The good

  • Usually 30-something percent faster than an RTX 4090
  • Redesigned Founders Edition card is less unwieldy than the bricks that were the 4090/4080 design
  • Adequate cooling, despite the smaller card and higher power use
  • DLSS Multi-Frame Generation is an intriguing option if you’re trying to hit 240 or 360 fps on your high-refresh-rate gaming monitor

The bad

  • Much higher power consumption than the 4090, which already consumed more power than any other GPU on the market
  • Frame Generation is good at making a game that’s already running fast run faster; it’s not as good at bringing a slow game up to 60 fps
  • Nvidia’s misleading marketing around Multi-Frame Generation is frustrating—and will likely be more frustrating for lower-end cards since they aren’t getting the same bumps to core count and memory interface that the 5090 gets

The ugly

  • You can buy a whole lot of PC for $2,000, and we wouldn’t bet on this GPU being easy to find at MSRP


Andrew is a Senior Technology Reporter at Ars Technica, with a focus on consumer tech including computer hardware and in-depth reviews of operating systems like Windows and macOS. Andrew lives in Philadelphia and co-hosts a weekly book podcast called Overdue.

Nvidia GeForce RTX 5090 costs as much as a whole gaming PC—but it sure is fast Read More »