Why Anthropic’s Claude still hasn’t beaten Pokémon


Weeks later, Anthropic’s “reasoning” model is still struggling with a game designed for children.

Gotta subsume ’em all into the machine consciousness! Credit: Aurich Lawson

In recent months, the AI industry’s biggest boosters have started converging on a public expectation that we’re on the verge of “artificial general intelligence” (AGI)—virtual agents that can match or surpass “human-level” understanding and performance on most cognitive tasks.

OpenAI is quietly seeding expectations for a “PhD-level” AI agent that could operate autonomously at the level of a “high-income knowledge worker” in the near future. Elon Musk says that “we’ll have AI smarter than any one human probably” by the end of 2025. Anthropic CEO Dario Amodei thinks it might take a bit longer but similarly says it’s plausible that AI will be “better than humans at almost everything” by the end of 2027.

A few researchers at Anthropic have, over the past year, had a part-time obsession with a peculiar problem.

Can Claude play Pokémon?

— Anthropic (@AnthropicAI) February 25, 2025

Last month, Anthropic presented its “Claude Plays Pokémon” experiment as a waypoint on the road to that predicted AGI future. It’s a project the company said shows “glimmers of AI systems that tackle challenges with increasing competence, not just through training but with generalized reasoning.” Anthropic made headlines by trumpeting how Claude 3.7 Sonnet’s “improved reasoning capabilities” let the company’s latest model make progress in the popular old-school Game Boy RPG in ways “that older models had little hope of achieving.”

While Claude models from just a year ago struggled even to leave the game’s opening area, Claude 3.7 Sonnet was able to make progress by collecting multiple in-game Gym Badges in a relatively small number of in-game actions. That breakthrough, Anthropic wrote, was because the “extended thinking” by Claude 3.7 Sonnet means the new model “plans ahead, remembers its objectives, and adapts when initial strategies fail” in a way that its predecessors didn’t. Those things, Anthropic brags, are “critical skills for battling pixelated gym leaders. And, we posit, in solving real-world problems too.”

Over the last year, new Claude models have shown quick progress in reaching new Pokémon milestones. Credit: Anthropic

But relative success over previous models is not the same as absolute success over the game in its entirety. In the weeks since Claude Plays Pokémon was first made public, thousands of Twitch viewers have watched Claude struggle to make consistent progress in the game. Despite long “thinking” pauses between each move—during which viewers can read printouts of the system’s simulated reasoning process—Claude frequently finds itself pointlessly revisiting completed towns, getting stuck in blind corners of the map for extended periods, or fruitlessly talking to the same unhelpful NPC over and over, to cite just a few examples of distinctly sub-human in-game performance.

Watching Claude continue to struggle at a game designed for children, it’s hard to imagine we’re witnessing the genesis of some sort of computer superintelligence. But even Claude’s current sub-human level of Pokémon performance could hold significant lessons for the quest toward generalized, human-level artificial intelligence.

Smart in different ways

In some sense, it’s impressive that Claude can play Pokémon with any facility at all. When developing AI systems that find dominant strategies in games like Go and Dota 2, engineers generally start their algorithms off with deep knowledge of a game’s rules and/or basic strategies, as well as a reward function to guide them toward better performance. For Claude Plays Pokémon, though, project developer and Anthropic employee David Hershey says he started with an unmodified, generalized Claude model that wasn’t specifically trained or tuned to play Pokémon games in any way.

“This is purely the various other things that [Claude] understands about the world being used to point at video games,” Hershey told Ars. “So it has a sense of a Pokémon. If you go to claude.ai and ask about Pokémon, it knows what Pokémon is based on what it’s read… If you ask, it’ll tell you there’s eight gym badges, it’ll tell you the first one is Brock… it knows the broad structure.”

A flowchart summarizing the pieces that help Claude interact with an active game of Pokémon (click through to zoom in). Credit: Anthropic / Excalidraw

In addition to directly monitoring certain key (emulated) Game Boy RAM addresses for game state information, Claude views and interprets the game’s visual output much like a human would. But despite recent advances in AI image processing, Hershey said Claude still struggles to interpret the low-resolution, pixelated world of a Game Boy screenshot as well as a human can. “Claude’s still not particularly good at understanding what’s on the screen at all,” he said. “You will see it attempt to walk into walls all the time.”
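
Anthropic hasn’t released the harness code itself, but the two observation channels Hershey describes, direct RAM reads plus raw screenshots, are easy to sketch. Here’s a minimal illustration in Python, assuming the open source PyBoy emulator’s 2.x API and community-documented Pokémon Red RAM addresses; none of these specifics come from Anthropic’s project.

```python
# Illustrative sketch only: assumes PyBoy 2.x and community-documented
# Pokémon Red RAM addresses, not Anthropic's actual harness.
from pyboy import PyBoy

pyboy = PyBoy("pokemon_red.gb", window="null")  # headless emulation

def read_game_state():
    """Pull a few structured facts straight from emulated RAM.

    Addresses are from the community Pokémon Red RAM map (assumed):
    0xD35E is the current map ID, 0xD361/0xD362 the player's Y/X tiles.
    """
    return {
        "map_id": pyboy.memory[0xD35E],
        "player_y": pyboy.memory[0xD361],
        "player_x": pyboy.memory[0xD362],
    }

def observe():
    """Pair the RAM facts with the raw screenshot the model must interpret."""
    pyboy.tick()  # advance the emulator one frame
    return read_game_state(), pyboy.screen.image  # 160x144 PIL image
```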

Hershey said he suspects Claude’s training data probably doesn’t contain many overly detailed text descriptions of “stuff that looks like a Game Boy screen.” This means that, somewhat surprisingly, if Claude were playing a game with “more realistic imagery, I think Claude would actually be able to see a lot better,” Hershey said.

“It’s one of those funny things about humans that we can squint at these eight-by-eight pixel blobs of people and say, ‘That’s a girl with blue hair,’” Hershey continued. “People, I think, have that ability to map from our real world to understand and sort of grok that… so I’m honestly kind of surprised that Claude’s as good as it is at being able to see there’s a person on the screen.”

Even with a perfect understanding of what it’s seeing on-screen, though, Hershey said Claude would still struggle with 2D navigation challenges that would be trivial for a human. “It’s pretty easy for me to understand that [an in-game] building is a building and that I can’t walk through a building,” Hershey said. “And that’s [something] that’s pretty challenging for Claude to understand… It’s funny because it’s just kind of smart in different ways, you know?”

A sample Pokémon screen with an overlay showing how Claude characterizes the game’s grid-based map. Credit: Anthropic / X

Where Claude tends to perform better, Hershey said, is in the more text-based portions of the game. During an in-game battle, Claude will readily notice when the game tells it that an attack from an electric-type Pokémon is “not very effective” against a rock-type opponent, for instance. Claude will then squirrel that factoid away in a massive written knowledge base for future reference later in the run. Claude can also integrate multiple pieces of similar knowledge into pretty elegant battle strategies, even extending those strategies into long-term plans for catching and managing teams of multiple creatures for future battles.
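
That note-taking loop is simple to picture in code. Below is a toy Python sketch of the pattern described above: parse the battle text, record the matchup, and consult it later. The data format and function names are illustrative, not Anthropic’s implementation.

```python
# Toy sketch of the "learn from battle text" loop; names and format invented.
EFFECTIVENESS_PHRASES = {
    "It's not very effective": 0.5,
    "It's super effective": 2.0,
}

knowledge_base = {}  # (attack_type, defender_type) -> damage multiplier

def record_battle_text(text, attack_type, defender_type):
    """Squirrel away a matchup fact whenever the game's text reveals one."""
    for phrase, multiplier in EFFECTIVENESS_PHRASES.items():
        if phrase in text:
            knowledge_base[(attack_type, defender_type)] = multiplier

def pick_attack(available_types, defender_type):
    """Prefer moves the knowledge base says hit hardest; default to neutral."""
    return max(available_types,
               key=lambda t: knowledge_base.get((t, defender_type), 1.0))

record_battle_text("It's not very effective...", "electric", "rock")
assert pick_attack(["electric", "water"], "rock") == "water"
```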

Claude can even show surprising “intelligence” when Pokémon’s in-game text is intentionally misleading or incomplete. “It’s pretty funny that they tell you you need to go find Professor Oak next door and then he’s not there,” Hershey said of an early-game task. “As a 5-year-old, that was very confusing to me. But Claude actually typically goes through that same set of motions where it talks to mom, goes to the lab, doesn’t find [Oak], says, ‘I need to figure something out’… It’s sophisticated enough to sort of go through the motions of the way [humans are] actually supposed to learn it, too.”

A sample of the kind of simulated reasoning process Claude steps through during a typical Pokémon battle. Credit: Claude Plays Pokemon / Twitch

These kinds of relative strengths and weaknesses when compared to “human-level” play reflect the overall state of AI research and capabilities in general, Hershey said. “I think it’s just a sort of universal thing about these models… We built the text side of it first, and the text side is definitely… more powerful. How these models can reason about images is getting better, but I think it’s a decent bit behind.”

Forget me not

Beyond issues parsing text and images, Hershey also acknowledged that Claude can have trouble “remembering” what it has already learned. The current model has a “context window” of 200,000 tokens, limiting the amount of relational information it can store in its “memory” at any one time. When the system’s ever-expanding knowledge base fills up this context window, Claude goes through an elaborate summarization process, condensing detailed notes on what it has seen, done, and learned so far into shorter text summaries that lose some of the fine-grained details.

This can mean that Claude “has a hard time keeping track of things for a very long time and really having a great sense of what it’s tried so far,” Hershey said. “You will definitely see it occasionally delete something that it shouldn’t have. Anything that’s not in your knowledge base or not in your summary is going to be gone, so you have to think about what you want to put there.”
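
In rough pseudocode terms, the process Hershey describes looks something like the sketch below. The token budgets and the summarize() hook are placeholders for illustration, not Anthropic’s actual machinery.

```python
# Rough sketch of context-window compression; all numbers are placeholders.
CONTEXT_LIMIT_TOKENS = 200_000   # the model's stated context window
SUMMARY_BUDGET_TOKENS = 20_000   # hypothetical target size for the summary

def count_tokens(text):
    return len(text) // 4  # crude approximation: ~4 characters per token

def maybe_compress(knowledge_base_text, summarize):
    """summarize() would itself be a model call that condenses the notes."""
    if count_tokens(knowledge_base_text) < CONTEXT_LIMIT_TOKENS:
        return knowledge_base_text
    # Anything not carried into the summary is gone for good, which is why
    # the agent has to choose carefully what it writes down.
    return summarize(knowledge_base_text, max_tokens=SUMMARY_BUDGET_TOKENS)
```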

A small window into the kind of “cleaning up my context” knowledge-base update necessitated by Claude’s limited “memory.” Credit: Claude Plays Pokemon / Twitch

More than forgetting important history, though, Claude runs into bigger problems when it inadvertently inserts incorrect information into its knowledge base. Like a conspiracy theorist who builds an entire worldview from an inherently flawed premise, Claude can be incredibly slow to recognize when an error in its self-authored knowledge base is leading its Pokémon play astray.

“The things that are written down in the past, it sort of trusts pretty blindly,” Hershey said. “I have seen it become very convinced that it found the exit to [in-game location] Viridian Forest at some specific coordinates, and then it spends hours and hours exploring a little small square around those coordinates that are wrong instead of doing anything else. It takes a very long time for it to decide that that was a ‘fail.’”

Still, Hershey said Claude 3.7 Sonnet is much better than earlier models at eventually “questioning its assumptions, trying new strategies, and keeping track over long horizons of various strategies to [see] whether they work or not.” While the new model will still “struggle for really long periods of time” retrying the same thing over and over, it will ultimately tend to “get a sense of what’s going on and what it’s tried before, and it stumbles a lot of times into actual progress from that,” Hershey said.

“We’re getting pretty close…”

One of the most interesting things about observing Claude Plays Pokémon across multiple iterations and restarts, Hershey said, is seeing how the system’s progress and strategy can vary quite a bit between runs. Sometimes Claude will show it’s “capable of actually building a pretty coherent strategy” by “keeping detailed notes about the different paths to try,” for instance, he said. But “most of the time it doesn’t… most of the time, it wanders into the wall because it’s confident it sees the exit.”

Where previous models wandered aimlessly or got stuck in loops, Claude 3.7 Sonnet plans ahead, remembers its objectives, and adapts when initial strategies fail.

Critical skills for battling pixelated gym leaders. And, we posit, in solving real-world problems too.

— Anthropic (@AnthropicAI) February 25, 2025

One of the biggest things preventing the current version of Claude from getting better, Hershey said, is that “when it derives that good strategy, I don’t think it necessarily has the self-awareness to know that one strategy [it] came up with is better than another.” And that’s not a trivial problem to solve.

Still, Hershey said he sees “low-hanging fruit” for improving Claude’s Pokémon play by improving the model’s understanding of Game Boy screenshots. “I think there’s a chance it could beat the game if it had a perfect sense of what’s on the screen,” Hershey said, adding that such a model would probably still perform “a little bit short of human.”

Expanding the context window for future Claude models will also probably allow those models to “reason over longer time frames and handle things more coherently over a long period of time,” Hershey said. Future models will improve by getting “a little bit better at remembering, keeping track of a coherent set of what it needs to try to make progress,” he added.

Twitch chat responds with a flood of bouncing emojis as Claude concludes an epic 78+ hour escape from Pokémon’s Mt. Moon. Credit: Claude Plays Pokemon / Twitch

Whatever you think about impending improvements in AI models, though, Claude’s current performance at Pokémon doesn’t make it seem like it’s poised to usher in an explosion of human-level, completely generalizable artificial intelligence. And Hershey allows that watching Claude 3.7 Sonnet get stuck on Mt. Moon for 80 hours or so can make it “seem like a model that doesn’t know what it’s doing.”

But Hershey is still impressed at the way that Claude’s new reasoning model will occasionally show some glimmer of awareness and “kind of tell that it doesn’t know what it’s doing and know that it needs to be doing something different. And the difference between ‘can’t do it at all’ and ‘can kind of do it’ is a pretty big one for these AI things for me,” he continued. “You know, when something can kind of do something it typically means we’re pretty close to getting it to be able to do something really, really well.”

Kyle Orland has been the Senior Gaming Editor at Ars Technica since 2012, writing primarily about the business, tech, and culture behind video games. He has journalism and computer science degrees from University of Maryland. He once wrote a whole book about Minesweeper.


Hands-on with Frosthaven’s ambitious port from gigantic box to inviting PC game

I can say this for certain: The game’s tutorial does a lot of work in introducing you to the game’s core mechanics, which include choosing cards with sequential actions, “burning” cards for temporary boosts, positioning, teamwork, and having enough actions or options left if a fight goes longer than you think. I’m not a total newcomer to the -haven games, having played a couple of rounds of the Gloomhaven board game. But none of my friends, however patient, did as good a job as this tutorial of showing just how important it is to consider not just attacking, defending, or moving, but where each choice will place you and how it will play with your teammates.

I played as a “Banner Spear,” one of the six starting classes. Their thing is—you guessed it—having a spear, and they can throw it or lunge with it from farther away. Many of the Banner Spear’s cards are more effective with positioning, like pincer-flanking an enemy or attacking from off to the side of your more up-close melee teammate. With only two players taking on a couple of enemies, I verbally brushed off the idea of using some more advanced options. My developer partner, using a Deathwalker, interjected: “Ah, but that is what summons are for.”

Soon enough, one of the brutes was facing down two skeletons, and I was able to get a nice shot in from an adjacent hex. The next thing I wanted to do was try out being a little selfish, running for some loot left behind by a vanquished goon. I forgot that you only pick up loot if you end your turn on a hex, not just pass through it, so my Banner Spear appeared to go on a little warm-up jog, for no real reason, before re-engaging the Germinate we were facing.

The art, animations, and feel of everything I clicked on were engaging, even as the developers regularly reminded me that all of it still needs work. With many more experienced players kicking the tires in early access, I expect the systems and quality-of-life details to see even more refinement. It’s a long campaign, both for players and the developers, but there’s a good chance it will be worth it.


Developer’s GDC billboard pokes at despised former Google Stadia exec

It’s been nearly two years now since game industry veteran Phil Harrison left Google following the implosion of the company’s Stadia cloud gaming service. But the passage of time hasn’t stopped one company from taking advantage of this week’s Game Developers Conference to poke fun at the erstwhile gaming executive for his alleged mistreatment of developers.

VGC spotted a conspicuous billboard in San Francisco’s Union Square Monday featuring the overinflated, completely bald head of Gunther Harrison, the fictional Alta Interglobal CEO who was recently revealed as the blatantly satirical antagonist in the upcoming game Revenge of the Savage Planet. A large message atop the billboard asks passersby—including the tens of thousands in town for GDC—”Has a Harrison fired you lately? You might be eligible for emotional support.”

Google’s Phil Harrison talks about the Google Stadia controller at GDC 2019. Credit: Google

While Gunther Harrison probably hasn’t fired any GDC attendees, the famously bald Phil Harrison was responsible for the firing of plenty of developers when he shut down Google’s short-lived Stadia Games & Entertainment (SG&E) publishing imprint in early 2021. That shutdown surprised a lot of newly jobless game developers, perhaps none more so than those at Montreal-based Typhoon Studios, which Google had acquired in late 2019 to make what Google’s Jade Raymond said at the time would be “platform-defining exclusive content” for Stadia.

Yet on the very same day that Journey to the Savage Planet launched on Stadia, the developers at Typhoon found themselves jobless, alongside the rest of SG&E. By early 2023, Google would shut down Stadia entirely, blindsiding even more game developers.

Don’t forgive, don’t forget

After being let go by Google, Typhoon Studios would reform as Raccoon Logic (thanks in large part to investment from Chinese publishing giant Tencent) and reacquire the rights to the Savage Planet franchise. And now that the next game in that series is set to launch in May, it seems the developers still haven’t fully gotten over how they were treated during Google’s brief foray into game publishing.


New Portal pinball table may be the closest we’re gonna get to Portal 3

A bargain at twice the price

The extensive Portal theming on the table seems to extend to the gameplay as well. As you might expect, launching a ball into a lit portal on one side of the playfield can lead to it (or a ball that looks a lot like it) immediately launching from another portal elsewhere. The speed of the ball as it enters one portal and exits the other seems like it might matter to the gameplay, too: A description for an “aerial portal” table feature warns that players should “make sure to build enough momentum or else your ball will land in the pit!”

The table is full of other little nods to the Portal games, from a physical Weighted Companion Cube that can travel through a portal to lock balls in place for eventual multiball to an Aerial Faith Plate that physically flings the ball up to a higher level. There’s also a turret-themed multiball, which GLaDOS reminds you is based around “the pale spherical things that are full of bullets. Oh wait, that’s you in five seconds.”

You can purchase a full Portal pinball table starting at $11,620 (plus shipping), which isn’t unreasonable as far as brand-new pinball tables are concerned these days. But if you already own the base table for Multimorphic’s P3 Pinball Platform, you can purchase a “Game Kit” upgrade—with the requisite game software and physical playfield pieces to install on your table—starting at just $3,900.

Even players who invested $1,000 or more in an Index VR headset just to play Half-Life: Alyx might balk at those kinds of prices for the closest thing we’ve got to a new, “official” Portal game. For true Valve obsessives, though, it might be a small price to pay for the ultimate company collector’s item and conversation piece.


Why SNES hardware is running faster than expected—and why it’s a problem


gotta go precisely the right speed

Cheap, unreliable ceramic APU resonators lead to “constant, pervasive, unavoidable” issues.

Sir, do you know how fast your SNES was going? Credit: Getty Images

Ideally, you’d expect any Super NES console—if properly maintained—to operate identically to any other Super NES unit ever made (in the same region, at least). Given the same base ROM file and the same set of precisely timed inputs, all those consoles should hopefully give the same gameplay output across individual hardware and across time.

The TASBot community relies on this kind of solid-state predictability when creating tool-assisted speedruns that can be executed with robotic precision on actual console hardware. But on the SNES in particular, the team has largely struggled to get emulated speedruns to sync up with demonstrated results on real consoles.

After significant research and testing on dozens of actual SNES units, the TASBot team now thinks that a cheap ceramic resonator used in the system’s Audio Processing Unit (APU) is to blame for much of this inconsistency. While Nintendo’s own documentation says the APU should run at a consistent rate of 24.576 MHz (and the associated Digital Signal Processor sample rate at a flat 32,000 Hz), in practice, that rate can vary just a bit based on heat, system age, and minor physical variations that develop in different console units over time.

Casual players would only notice this problem in the form of an almost imperceptibly higher pitch for in-game music and sounds. But for TASBot, Allan “dwangoAC” Cecil says this unreliable clock has become a “constant, pervasive, unavoidable” problem for getting frame-accurate consistency in hardware-verified speedruns.

Not to spec

Cecil testing his own SNES APU in 2016. Credit: Allan Cecil

Cecil says he first began to suspect the APU’s role in TASBot’s SNES problems back in 2016 when he broke open his own console to test it with an external frequency counter. He found that his APU clock had “degraded substantially enough to cause problems with repeatability,” causing the console to throw out unpredictable “lag frames” if and when the CPU and APU load cycles failed to line up in the expected manner. Those lag frames, in turn, are enough to “desynchronize” TASBot’s input on actual hardware from the results you’d see on a more controlled emulator.

Unlike the quartz crystals used in many electronics (including the SNES’s more consistent and differently timed CPU), the cheaper ceramic resonators in the SNES APU are “known to degrade over time,” as Cecil put it. Documentation for the resonators used in the APU also seems to suggest that excess heat may impact the clock cycle speed, meaning the APU might speed up a bit as a specific console heats up.

The APU resonator manual shows slight variations in operating thresholds based on heat and other factors. Credit: Ceralock ceramic resonator manual

The TASBot team was not the first group to notice this kind of audio inconsistency in the SNES. In the early 2000s, some emulator developers found that certain late-era SNES games don’t run correctly when the emulator’s Digital Signal Processor (DSP) sample rate is set to the Nintendo-specified value of precisely 32,000 Hz (a number derived from the speed of the APU clock). Developers tested actual hardware at the time and found that the DSP was actually running at 32,040 Hz and that setting the emulated DSP to run at that specific rate suddenly fixed the misbehaving commercial games.

That small but necessary emulator tweak implies that “the original developers who wrote those games were using hardware that… must have been running slightly faster at that point,” Cecil told Ars. “Because if they had written directly to what the spec said, it may not have worked.”

Survey says…

While research and testing confirmed the existence of these APU variations, Cecil wanted to determine just how big the problem was across actual consoles today. To do that, he ran an informal online survey last month, cryptically warning his social media followers that “SNES consoles seem to be getting faster as they age.” He asked respondents to run a DSP clock measurement ROM on any working SNES hardware they had lying around and to rerun the test after the console had time to warm up.

After receiving 143 responses and crunching the numbers, Cecil said he was surprised to find that temperature seemed to have a minimal impact on measured DSP speed; the measurement only rose an insignificant 8 Hz on average between “cold” and “warm” readings on the same console. Cecil even put his own console in a freezer to see if the DSP clock rate would change as it thawed out and found only a 32 Hz difference as it warmed back up to room temperature.

A sample result from the DSP sample test program. Credit: Allan Cecil

Those heat effects paled in comparison to the natural clock variation across different consoles, though. The slowest and fastest DSPs in Cecil’s sample showed a clock difference of 234 Hz, or about 0.7 percent of the 32,000 Hz specification.

That difference is small enough that human players probably wouldn’t notice it directly; TASBot team member Total estimated it might amount to “at most maybe a second or two [of difference] over hours of gameplay.” Skilled speedrunners could notice small differences, though, if differing CPU and APU alignments cause “carefully memorized enemy pattern changes to something else” between runs, Cecil said.

For a frame-perfect tool-assisted speedrun, though, the clock variations between consoles could cause innumerable headaches. As TASBot team member Undisbeliever explained in his detailed analysis: “On one console this might take 0.126 frames to process the music-tick, on a different console it might take 0.127 frames. It might not seem like much but it is enough to potentially delay the start of song loading by 1 frame (depending on timing, lag and game-code).”

Cecil’s survey found variation across consoles was much higher than the effects of heat on any single console. Credit: SNES SMP Speed test survey

Cecil also said the survey-reported DSP clock speeds were a bit higher than he expected, at an average rate of 32,076 Hz at room temperature. That’s quite a bit higher than both the 32,000 Hz spec set by Nintendo and the 32,040 Hz rate that emulator developers settled on after sampling actual hardware in 2003.
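
For a sense of scale, the arithmetic behind those figures is straightforward; here’s a quick check in Python using the survey numbers above (illustrative math, not TASBot code).

```python
import math

SPEC_HZ = 32_000        # Nintendo's documented DSP sample rate
SURVEY_AVG_HZ = 32_076  # average measured in Cecil's survey
SPREAD_HZ = 234         # slowest-to-fastest gap across surveyed consoles

# Deviation from spec, also expressed as a musical pitch offset in cents
# (100 cents = one semitone) -- small enough that casual players won't notice.
deviation = (SURVEY_AVG_HZ - SPEC_HZ) / SPEC_HZ        # ~0.24% fast
cents = 1200 * math.log2(SURVEY_AVG_HZ / SPEC_HZ)      # ~4.1 cents sharp
print(f"{deviation:.2%} fast, {cents:.1f} cents sharp")

# Console-to-console spread: the ~0.7 percent figure quoted above.
print(f"spread: {SPREAD_HZ / SPEC_HZ:.2%}")
```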

To some observers, this is evidence that SNES APUs originally produced in the ’90s have been speeding up slightly as they age and could continue to get faster in the coming years and decades. But Cecil says the historical data they have is too circumstantial to make such a claim for certain.

“We’re all a bunch of differently skilled geeks and nerds, and it’s in our nature to argue over what the results mean, which is fine,” Cecil said. “The only thing we can say with certainty is the statistical significance of the responses that show the current average DSP sample rate is 32,076 Hz, faster on average than the original specification. The rest of it is up to interpretation and a certain amount of educated guessing based on what we can glean.”

A first step

For the TASBot team, knowing just how much real SNES hardware timing can differ from dry specifications (and emulators) is an important step to getting more consistent results on real hardware. But that knowledge hasn’t completely solved their synchronization problems. Even when Cecil replaced the ceramic APU resonator in his Super NES with a more accurate quartz version (tuned precisely to match Nintendo’s written specification), the team “did not see perfect behavior like we expected,” he told Ars.

Beyond clock speed inconsistencies, Cecil explained to Ars that TASBot team testing has found an additional “jitter pattern” present in the APU sampling that “injects some variance in how long it takes to perform various actions” between runs. That leads to non-deterministic performance even on the same hardware, Cecil said, which means that “TASBot is likely to desync” after just a few minutes of play on most SNES games.

The order in which these components start when the SNES is reset can have a large impact on clock synchronization. Credit: Rasteri

Extensive research from Rasteri suggests that these inconsistencies across same-console runs are likely caused by a “very non-deterministic reset circuit” that changes the specific startup order and timing for a console’s individual components every time it’s powered on. That leads to essentially “infinite possibilities” for the relative place where the CPU and APU clocks start in their “synchronization cycle” for each fresh run, making it impossible to predict specifically where and when lag frames will appear, Rasteri wrote.
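
A toy model makes that unpredictability easy to see. In the Python sketch below, every reset picks a random starting phase between the two clocks, so the frame where the first lag frame lands differs run to run; all numbers are made up purely for illustration.

```python
# Toy model of reset-dependent desync; drift and threshold values are invented.
import random

def first_desync_frame(drift_per_frame=1e-4, threshold=0.01):
    """Count frames until accumulated misalignment produces a 'lag frame'."""
    phase = random.random() * threshold  # random relative phase at power-on
    frames = 0
    while phase < threshold:
        phase += drift_per_frame  # clocks slip a little further each frame
        frames += 1
    return frames

# Five simulated power-ons give five different desync points, which is why
# lag frames can't be predicted ahead of time.
print([first_desync_frame() for _ in range(5)])
```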

Cecil said these kinds of “butterfly effect” timing issues make the Super NES “a surprisingly complicated console [that has] resisted our attempts to fully model it and coerce it into behaving consistently.” But he’s still hopeful that the team will “eventually find a way to restore an SNES to the behavior game developers expected based on the documentation they were provided without making invasive changes…”

In the end, though, Cecil seems to have developed an almost grudging respect for how the SNES’s odd architecture leads to such unpredictable operation in practice. “If you want to deliberately create a source of randomness and non-deterministic behavior, having two clock sources that spinloop independently against one another is a fantastic choice,” he said.


Leaked GeForce RTX 5060 and 5050 specs suggest Nvidia will keep playing it safe

Nvidia has launched all of the GeForce RTX 50-series GPUs that it announced at CES, at least technically—whether you’re buying from Nvidia, AMD, or Intel, it’s nearly impossible to find any of these new cards at their advertised prices right now.

But hope springs eternal, and newly leaked specs for GeForce RTX 5060 and 5050-series cards suggest that Nvidia may be announcing these lower-end cards soon. These kinds of cards are rarely exciting, but Steam Hardware Survey data shows that these xx60 and xx50 cards are what the overwhelming majority of PC gamers are putting in their systems.

The specs, posted by a reliable leaker named Kopite and reported by Tom’s Hardware and others, suggest a refresh that’s in line with what Nvidia has done with most of the 50-series so far. Along with a move to the next-generation Blackwell architecture, the 5060 GPUs each come with a small increase to the number of CUDA cores, a jump from GDDR6 to GDDR7, and an increase in power consumption, but no changes to the amount of memory or the width of the memory bus. The 8GB versions, in particular, will probably continue to be marketed primarily as 1080p cards.

| | RTX 5060 Ti (leaked) | RTX 4060 Ti | RTX 5060 (leaked) | RTX 4060 | RTX 5050 (leaked) | RTX 3050 |
| --- | --- | --- | --- | --- | --- | --- |
| CUDA Cores | 4,608 | 4,352 | 3,840 | 3,072 | 2,560 | 2,560 |
| Boost Clock | Unknown | 2,535 MHz | Unknown | 2,460 MHz | Unknown | 1,777 MHz |
| Memory Bus Width | 128-bit | 128-bit | 128-bit | 128-bit | 128-bit | 128-bit |
| Memory Bandwidth | Unknown | 288 GB/s | Unknown | 272 GB/s | Unknown | 224 GB/s |
| Memory Size | 8GB or 16GB GDDR7 | 8GB or 16GB GDDR6 | 8GB GDDR7 | 8GB GDDR6 | 8GB GDDR6 | 8GB GDDR6 |
| TGP | 180 W | 160 W | 150 W | 115 W | 130 W | 130 W |

As with the 4060 Ti, the 5060 Ti is said to come in two versions, one with 8GB of RAM and one with 16GB. One of the 4060 Ti’s problems was that its relatively narrow 128-bit memory bus limited its performance at 1440p and 4K resolutions even with 16GB of RAM—the bandwidth increase from GDDR7 could help with this, but we’ll need to test to see for sure.
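
The bandwidth math behind that caveat is simple: effective bandwidth is the per-pin data rate times the bus width in bytes. The 4060 Ti’s 288 GB/s implies 18 Gbps GDDR6; the GDDR7 speed below is a hypothetical for illustration, since the leak doesn’t include memory clocks.

```python
def bandwidth_gbs(data_rate_gbps, bus_width_bits):
    """Peak memory bandwidth in GB/s from per-pin rate and bus width."""
    return data_rate_gbps * bus_width_bits / 8

print(bandwidth_gbs(18, 128))  # 288.0 GB/s -- RTX 4060 Ti's GDDR6
print(bandwidth_gbs(28, 128))  # 448.0 GB/s -- if a 5060 Ti used 28 Gbps GDDR7
```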


Six ways Microsoft’s portable Xbox could be a Steam Deck killer

Bring old Xbox games to PC

The ultimate handheld system seller. Credit: Microsoft / Bizarre Creations

Microsoft has made a lot of hay over the way recent Xbox consoles can play games dating all the way back to the original Xbox. If Microsoft wants to set its first gaming handheld apart, it should make those old console games officially available on a Windows-based system for the first time.

The ability to download previous console games dating back to the Xbox 360 era (or beyond) would be an instant “system seller” feature for any portable Xbox. While this wouldn’t be a trivial technical lift on Microsoft’s part, the same emulation layer that powers Xbox console backward compatibility could presumably be ported to Windows with some work. That process might be easier with a specific branded portable, too, since Microsoft would be working with full knowledge of what hardware was being used.

If Microsoft can give us a way to play Geometry Wars 2 on the go without having to deal with finicky third-party emulators, we’ll be eternally grateful.

Multiple hardware tiers

One size does not fit all when it comes to consoles or to handhelds. Credit: Sam Machkovech

On the console side, Microsoft’s split simultaneous release of the Xbox Series S and X showed an understanding that not everyone wants to pay more money for the most powerful possible gaming hardware. Microsoft should extend this philosophy to gaming handhelds by releasing different tiers of portable Xbox hardware for price-conscious consumers.

Raw hardware power is the most obvious differentiator that could set a more expensive tier of Xbox portables apart from any cheaper options. But Microsoft could also offer portable options that reduce the overall bulk (a la the Nintendo Switch Lite) or offer relative improvements in screen size and quality (a la the Steam Deck OLED and Switch OLED).

“Made for Xbox”

It worked for Valve, it can work for Microsoft. Credit: Valve

One of the best things about console gaming is that you can be confident any game you buy for a console will “just work” with your hardware. In the world of PC gaming handhelds, Valve has tried to replicate this with the “Deck Verified” program to highlight Steam games that are guaranteed to work in a portable setting.

Microsoft is well-positioned to work with game publishers to launch a similar program for its own Xbox-branded portable. There’s real value in offering gamers assurances that “Made for Xbox” PC games will “just work” on their Xbox-branded handheld.

This kind of verification system could also help simplify and clarify hardware requirements across different tiers of portable hardware power; any handheld marketed as “level 2” could play any games marketed as level 2 or below, for instance.
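
The rule itself is trivial to express in code; here’s a minimal sketch, with the tier scheme being purely hypothetical since Microsoft has announced no such program.

```python
# Hypothetical "Made for Xbox" tier check; Microsoft has announced no such API.
def can_run(handheld_tier: int, game_tier: int) -> bool:
    """A handheld plays any game marked at its own tier or below."""
    return game_tier <= handheld_tier

assert can_run(handheld_tier=2, game_tier=1)      # lighter game: fine
assert not can_run(handheld_tier=1, game_tier=2)  # needs beefier hardware
```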


Blood Typers is a terrifically tense, terror-filled typing tutor

When you think about it, the keyboard is the most complex video game controller in common use today, with over 100 distinct inputs arranged in a vast grid. Yet even the most complex keyboard-controlled games today tend to only use a relative handful of all those available keys for actual gameplay purposes.

The biggest exception to this rule is a typing game, which by definition asks players to send their fingers flying across every single letter on the keyboard (and then some) in quick succession. By default, though, typing games tend to take the form of extremely basic typing tutorials, where the gameplay amounts to little more than typing out words and sentences by rote as they appear on screen, maybe with a few cute accompanying animations.

Typing “gibbon” quickly has rarely felt this tense or important. Credit: Outer Brain Studios

Blood Typers adds some much-needed complexity to that basic type-the-word-you-see concept, layering its typing tests on top of a full-fledged survival horror game reminiscent of the original PlayStation era. The result is an amazingly tense and compelling action adventure that also serves as a great way to hone your touch-typing skills.

See it, type it, do it

For some, Blood Typers may bring up first-glance memories of Typing of the Dead, Sega’s campy, typing-controlled take on the House of the Dead light gun game series. But Blood Typers goes well beyond Typing of the Dead‘s on-rails shooting, offering an experience that’s more like a typing-controlled version of Resident Evil.

Practically every action in Blood Typers requires typing a word that you see on-screen. That includes basic locomotion, which is accomplished by typing any of a number of short words scattered at key points in your surroundings in order to automatically walk to that point. It’s a bit awkward at first, but it quickly becomes second nature as you memorize the names of various checkpoints and adjust to using the shift keys to turn the camera as you move.

Each of those words on the ground is a waypoint that you can type to move toward. Credit: Outer Brain Studios

When any number of undead enemies appear, a quick tap of the tab key switches you to combat mode, which asks you to type longer words that appear above those enemies to use your weapons. More difficult enemies require multiple words to take down, including some armored foes that require typing a single word repeatedly before you can move on.

While you start each scenario in Blood Typers with a handy melee weapon, you’ll end up juggling a wide variety of projectile firearms that feel uniquely tuned to the typing gameplay. The powerful shotgun, for instance, can take out larger enemies with just a single word, while the SMG lets you type only the first few letters of each word, allowing for a sort of rapid-fire feel. The flamethrower, on the other hand, can set whole groups of nearby enemies aflame, which makes each subsequent attack word that much shorter and faster.
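
Mechanically, the weapons mostly vary how much of each target word you have to type. Here’s a toy Python sketch of that idea, with names and thresholds that are illustrative rather than the game’s actual rules.

```python
# Toy version of Blood Typers' weapon feel; thresholds are invented.
def shot_fires(weapon: str, target_word: str, typed: str) -> bool:
    if weapon == "smg":
        # Rapid fire: only the first few letters of each word are required.
        return typed == target_word[:3]
    return typed == target_word  # shotgun, melee, etc. need the whole word

assert shot_fires("shotgun", "gibbon", "gibbon")
assert shot_fires("smg", "gibbon", "gib")
```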


AMD says top-tier Ryzen 9900X3D and 9950X3D CPUs arrive March 12 for $599 and $699

Like the 7950X3D and 7900X3D, these new X3D chips combine a pair of AMD’s CPU chiplets, one that has the extra 64MB of cache stacked underneath it and one that doesn’t. For the 9950X3D, you get eight cores with extra cache and eight without; for the 9900X3D, you get eight cores with extra cache and four without.

It’s up to AMD’s chipset software to decide what kinds of apps get to run on each kind of CPU core. Non-gaming workloads prioritize the normal CPU cores, which are generally capable of slightly higher peak clock speeds, while games that benefit disproportionately from the extra cache are run on those cores instead. AMD’s software can “park” the non-V-Cache CPU cores when you’re playing games to ensure they’re not accidentally being run on less-suitable CPU cores.

We didn’t have issues with this core parking technology when we initially tested the 7950X3D and 7900X3D, and AMD has steadily made improvements since then to make sure that core parking is working properly. The new 9000-series X3D chips should benefit from that work, too. To get the best results, AMD officially recommends a fresh and fully updated Windows install, along with the newest BIOS for your motherboard and the newest AMD chipset drivers; swapping out another Ryzen CPU for an X3D model (or vice versa) without reinstalling Windows can occasionally lead to cores being parked when they shouldn’t be (or not parked when they should be).


“Literally just a copy”—hit iOS game accused of unauthorized HTML5 code theft

Viral success (for someone else)

VoltekPlay writes on Reddit that it was only alerted to the existence of My Baby or Not! on iOS by “a suspicious burst of traffic on our itch.io page—all coming from Google organic search.” Only after adding a “where did you find our game?” player poll to the page were the developers made aware of some popular TikTok videos featuring the iOS version.

“Luckily, some people in the [TikTok] comments mentioned the real game name—Diapers, Please!—so a few thousand players were able to google their way to our page,” VoltekPlay writes. “I can only imagine how many more ended up on the thief’s App Store page instead.”

Earlier this week, the $2.99 iOS release of My Baby or Not! was quickly climbing iOS’s paid games charts, attracting an estimated 20,000 downloads overall, according to Sensor Tower.

Marwane Benyssef’s only previous iOS release, Kiosk Food Night Shift, also appears to be a direct copy of an itch.io release.

The App Store listing credited My Baby or Not! to “Marwane Benyssef,” a new iOS developer with no apparent history in the game development community. Benyssef’s only other iOS game, Kiosk Food Night Shift, was released last August and appears to be a direct copy of Kiosk, a pay-what-you-want title that was posted to itch.io last year (with a subsequent “full” release on Steam this year).

In a Reddit post, the team at VoltekPlay said that they had filed a DMCA copyright claim against My Baby or Not! Apple subsequently shared that claim with Benyssef, VoltekPlay writes, along with a message that “Apple encourages the parties to a dispute to work directly with one another to resolve the claim.”

This morning, Ars reached out to Apple to request a comment on the situation. Apple has yet to respond, but in the meantime, the company appears to have removed Benyssef’s developer page and all traces of their games from the iOS App Store.


AMD Radeon RX 9070 and 9070 XT review: RDNA 4 fixes a lot of AMD’s problems


For $549 and $599, AMD comes close to knocking out Nvidia’s GeForce RTX 5070.

AMD’s Radeon RX 9070 and 9070 XT are its first cards based on the RDNA 4 GPU architecture. Credit: Andrew Cunningham

AMD is a company that knows a thing or two about capitalizing on a competitor’s weaknesses. The company got through its mid-2010s nadir partially because its Ryzen CPUs struck just as Intel’s current manufacturing woes began to set in, first with somewhat-worse CPUs that were great value for the money and later with CPUs that were better than anything Intel could offer.

Nvidia’s untrammeled dominance of the consumer graphics card market should also be an opportunity for AMD. Nvidia’s GeForce RTX 50-series graphics cards have given buyers very little to get excited about, with an unreachably expensive high-end 5090 refresh and modest-at-best gains from 5080 and 5070-series cards that are also pretty expensive by historical standards, when you can buy them at all. Tech YouTubers—both the people making the videos and the people leaving comments underneath them—have been almost uniformly unkind to the 50 series, hinting at consumer frustrations and pent-up demand for competitive products from other companies.

Enter AMD’s Radeon RX 9070 XT and RX 9070 graphics cards. These are aimed right at the middle of the current GPU market at the intersection of high sales volume and decent profit margins. They promise good 1440p and entry-level 4K gaming performance and improved power efficiency compared to previous-generation cards, with fixes for long-time shortcomings (ray-tracing performance, video encoding, and upscaling quality) that should, in theory, make them more tempting for people looking to ditch Nvidia.

RX 9070 and 9070 XT specs and speeds

| | RX 9070 XT | RX 9070 | RX 7900 XTX | RX 7900 XT | RX 7900 GRE | RX 7800 XT |
| --- | --- | --- | --- | --- | --- | --- |
| Compute units (stream processors) | 64 RDNA4 (4,096) | 56 RDNA4 (3,584) | 96 RDNA3 (6,144) | 84 RDNA3 (5,376) | 80 RDNA3 (5,120) | 60 RDNA3 (3,840) |
| Boost Clock | 2,970 MHz | 2,520 MHz | 2,498 MHz | 2,400 MHz | 2,245 MHz | 2,430 MHz |
| Memory Bus Width | 256-bit | 256-bit | 384-bit | 320-bit | 256-bit | 256-bit |
| Memory Bandwidth | 650GB/s | 650GB/s | 960GB/s | 800GB/s | 576GB/s | 624GB/s |
| Memory Size | 16GB GDDR6 | 16GB GDDR6 | 24GB GDDR6 | 20GB GDDR6 | 16GB GDDR6 | 16GB GDDR6 |
| Total board power (TBP) | 304 W | 220 W | 355 W | 315 W | 260 W | 263 W |

AMD’s high-level performance promise for the RDNA 4 architecture revolves around big increases in performance per compute unit (CU). An RDNA 4 CU, AMD says, is nearly twice as fast in rasterized performance as RDNA 2 (that is, rendering without ray-tracing effects enabled) and nearly 2.5 times as fast as RDNA 2 in games with ray-tracing effects enabled. Performance for at least some machine learning workloads also goes way up—twice as fast as RDNA 3 and four times as fast as RDNA 2.

We’ll see this in more detail when we start comparing performance, but AMD seems to have accomplished this goal. Despite having 64 or 56 compute units (for the 9070 XT and 9070, respectively), the cards’ performance often competes with AMD’s last-generation flagships, the RX 7900 XTX and 7900 XT. Those cards came with 96 and 84 compute units, respectively. The 9070 cards are specced a lot more like last generation’s RX 7800 XT—including the 16GB of GDDR6 on a 256-bit memory bus, as AMD still isn’t using GDDR6X or GDDR7—but they’re much faster than the 7800 XT was.

AMD has dramatically increased the performance-per-compute unit for RDNA 4. Credit: AMD

The 9070 series also uses a new 4 nm manufacturing process from TSMC, an upgrade from the 7000 series’ 5 nm process (and the 6 nm process used for the separate memory controller dies in higher-end RX 7000-series models that used chiplets). AMD’s GPUs are normally a bit less efficient than Nvidia’s, but the architectural improvements and the new manufacturing process allow AMD to do some important catch-up.

Both of the 9070 models we tested were ASRock Steel Legend models, and the 9070 and 9070 XT had identical designs—we’ll probably see a lot of this from AMD’s partners since the GPU dies and the 16GB RAM allotments are the same for both models. Both use two 8-pin power connectors; AMD says partners are free to use the 12-pin power connector if they want, but given Nvidia’s ongoing issues with it, most cards will likely stick with the reliable 8-pin connectors.

AMD doesn’t appear to be making and selling reference designs for the 9070 series the way it did for some RX 7000 and 6000-series GPUs or the way Nvidia does with its Founders Edition cards. From what we’ve seen, 2 or 2.5-slot, triple-fan designs will be the norm, the way they are for most midrange GPUs these days.

Testbed notes

We used the same GPU testbed for the Radeon RX 9070 series as we have for our GeForce RTX 50-series reviews.

An AMD Ryzen 7 9800X3D ensures that our graphics cards will be CPU-limited as little as possible. An ample 1050 W power supply, 32GB of DDR5-6000, and an AMD X670E motherboard with the latest BIOS installed round out the hardware. On the software side, we use an up-to-date installation of Windows 11 24H2 and recent GPU drivers for older cards, ensuring that our tests reflect whatever optimizations Microsoft, AMD, Nvidia, and game developers have made since the last generation of GPUs launched.

We have numbers for all of Nvidia’s RTX 50-series GPUs so far, plus most of the 40-series cards, most of AMD’s RX 7000-series cards, and a handful of older GPUs from the RTX 30-series and RX 6000 series. We’ll focus on comparing the 9070 XT and 9070 to other 1440p-to-4K graphics cards since those are the resolutions AMD is aiming at.

Performance

At $549 and $599, the 9070 series is priced to match Nvidia’s $549 RTX 5070 and undercut the $749 RTX 5070 Ti. So we’ll focus on comparing the 9070 series to those cards, plus the top tier of GPUs from the outgoing RX 7000-series.

Some 4K rasterized benchmarks.

Starting at the top with rasterized benchmarks with no ray-tracing effects, the 9070 XT does a good job of standing up to Nvidia’s RTX 5070 Ti, coming within a few frames per second of its performance in all the games we tested (and scoring very similarly in the 3DMark Time Spy Extreme benchmark).

Both cards are considerably faster than the RTX 5070—between 15 and 28 percent for the 9070 XT and between 5 and 13 percent for the regular 9070 (our 5070 scored weirdly low in Horizon Zero Dawn Remastered, so we’d treat those numbers as outliers for now). Both 9070 cards also stack up well next to the RX 7000 series here—the 9070 can usually just about match the performance of the 7900 XT, and the 9070 XT usually beats it by a little. Both cards thoroughly outrun the old RX 7900 GRE, which was AMD’s $549 GPU offering just a year ago.

The 7900 XT does have 20GB of RAM instead of 16GB, which might help its performance in some edge cases. But 16GB is still perfectly generous for a 1440p-to-4K graphics card—the 5070 only offers 12GB, which could end up limiting its performance in some games as RAM requirements continue to rise.

On ray-tracing improvements

Nvidia got a jump on AMD when it introduced hardware-accelerated ray-tracing in the RTX 20-series in 2018. And while these effects were only supported in a few games at the time, many modern games offer at least some kind of ray-traced lighting effects.

AMD caught up a little when it began shipping its own ray-tracing support in the RDNA2 architecture in late 2020, but the issue since then has always been that AMD cards have taken a larger performance hit than GeForce GPUs when these effects are turned on. RDNA3 promised improvements, but our tests still generally showed the same deficit as before.

So we’re looking for two things with RDNA 4’s ray-tracing performance. First, we want the numbers to be higher than they were for comparably priced RX 7000-series GPUs, the same thing we look for in non-ray-traced (or rasterized) rendering performance. Second, we want the size of the performance hit to go down. To pick an example: the RX 7900 GRE could compete with Nvidia’s RTX 4070 Ti Super in games without ray tracing, but it was closer to a non-Super RTX 4070 in ray-traced games. That gap has helped keep AMD’s cards from being across-the-board competitive with Nvidia’s—is that any different now?

Benchmarks for games with ray-tracing effects enabled. Both AMD cards generally keep pace with the 5070 in these tests thanks to RDNA 4’s improvements.

The picture our tests paint is mixed but tentatively positive. The 9070 series and RDNA4 post solid improvements in the Cyberpunk 2077 benchmarks, substantially closing the performance gap with Nvidia. In games where AMD’s cards performed well enough before—here represented by Returnal—performance goes up, but roughly proportionately with rasterized performance. And both 9070 cards still punch below their weight in Black Myth: Wukong, falling substantially behind the 5070 under the punishing Cinematic graphics preset.

So the benefits you see, as with any GPU update, will depend a bit on the game you’re playing. There’s also a possibility that game optimizations and driver updates made with RDNA 4 in mind could boost performance further. We can’t say that AMD has caught all the way up to Nvidia here—the 9070 and 9070 XT are both closer to the GeForce RTX 5070 than the 5070 Ti in ray-traced games, despite coming closer to the 5070 Ti in rasterized tests—but there is real, measurable improvement here, which is what we were looking for.

Power usage

The 9070 series’ performance increases are particularly impressive when you look at the power-consumption numbers. The 9070 comes close to the 7900 XT’s performance while using 90 W less power under load, and it beats the RTX 5070 most of the time while using around 30 W less power.

The 9070 XT is a little less impressive on this front—AMD has set clock speeds pretty high, and this can increase power use disproportionately. The 9070 XT is usually 10 or 15 percent faster than the 9070 but uses 38 percent more power. The XT’s power consumption is similar to the RTX 5070 Ti’s (a GPU it often matches) and the 7900 XT’s (a GPU it always beats), so it’s not too egregious, but it’s not as standout as the 9070’s.
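
Putting those two percentages together gives a rough perf-per-watt picture; this is quick arithmetic on the figures above, not new test data.

```python
perf_ratio = 1.125   # 9070 XT is usually 10-15 percent faster than the 9070
power_ratio = 1.38   # ...while drawing 38 percent more power
print(f"{perf_ratio / power_ratio:.2f}x perf per watt")  # ~0.82x, i.e. worse
```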

AMD gives 9070 owners a couple of new toggles for power limits, though, which we’ll talk about in the next section.

Experimenting with “Total Board Power”

We don’t normally dabble much with overclocking when we review CPUs or GPUs—we’re happy to leave that to folks at other outlets. But when we review CPUs, we do usually test them with multiple power limits in place. Playing with power limits is easier (and occasionally safer) than actually overclocking, and it often comes with large gains to either performance (a chip that performs much better when given more power to work with) or efficiency (a chip that can run at nearly full speed without using as much power).

Initially, I experimented with the RX 9070’s power limits by accident. AMD sent me one version of the 9070 but exchanged it because of a minor problem the OEM identified with some units early in the production run. I had, of course, already run most of our tests on it, but that’s the way these things go sometimes.

By bumping the regular RX 9070’s TBP up just a bit, you can nudge it closer to 9070 XT-level performance.

The replacement RX 9070 card, an ASRock Steel Legend model, was performing significantly better in our tests, sometimes nearly closing the gap between the 9070 and the XT. It wasn’t until I tested power consumption that I discovered the explanation—by default, it was using a 245 W power limit rather than the AMD-defined 220 W limit. Usually, these kinds of factory tweaks don’t make much of a difference, but for the 9070, this power bump gave it a nice performance boost while still keeping it close to the 250 W power limit of the GeForce RTX 5070.

The 90-series cards we tested both add some power presets to AMD’s Adrenalin app in the Performance tab under Tuning. These replace and/or complement some of the automated overclocking and undervolting buttons that exist here for older Radeon cards. Clicking Favor Efficiency or Favor Performance can ratchet the card’s Total Board Power (TBP) up or down, limiting performance so that the card runs cooler and quieter or allowing the card to consume more power so it can run a bit faster.

The 9070 cards get slightly different performance tuning options in the Adrenalin software. These buttons mostly change the card’s Total Board Power (TBP), making it simple to either improve efficiency or boost performance a bit. Credit: Andrew Cunningham

For this particular ASRock 9070 card, the default TBP is set to 245 W. Selecting “Favor Efficiency” sets it to the AMD-standard 220 W. You can double-check these values using an app like HWInfo, which displays both the current TBP and the maximum TBP in its Sensors Status window. Clicking the Custom button in the Adrenalin software gives you access to a Power Tuning slider, which for our card allowed us to raise the TBP by up to 10 percent or lower it by as much as 30 percent.
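If you’re curious what that percentage range works out to in watts, the arithmetic is simple. Here’s a quick sketch using our ASRock card’s 245 W default; other cards will start from different baselines:

```python
# Translate the Power Tuning slider's percentage range into watts,
# assuming the 245 W default TBP of the ASRock card we tested.
default_tbp_w = 245

max_tbp_w = default_tbp_w * 1.10  # +10 percent: a 269.5 W ceiling
min_tbp_w = default_tbp_w * 0.70  # -30 percent: a 171.5 W floor

print(f"Adjustable TBP range: {min_tbp_w:.1f} W to {max_tbp_w:.1f} W")
```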

This is all the firsthand testing we did with the power limits of the 9070 series, though I would assume that adding a bit more power also adds more overclocking headroom (bumping up the power limits is common for GPU overclockers no matter who makes your card). AMD says that some of its partners will ship 9070 XT models set to a roughly 340 W power limit out of the box but acknowledges that “you start seeing diminishing returns as you approach the top of that [power efficiency] curve.”

But it’s worth noting that the driver’s automated power presets give you an easy, set-it-and-forget-it way to find your preferred balance of performance and power efficiency.

A quick look at FSR4 performance

There’s a toggle in the driver for enabling FSR 4 in FSR 3.1-supporting games. Credit: Andrew Cunningham

One of AMD’s headlining improvements to the RX 90-series is the introduction of FSR 4, a new version of its FidelityFX Super Resolution upscaling algorithm. Like Nvidia’s DLSS and Intel’s XeSS, FSR 4 can take advantage of RDNA 4’s machine learning processing power to do hardware-backed upscaling instead of taking a hardware-agnostic approach as the older FSR versions did. AMD says this will improve upscaling quality, but it also means FSR 4 will only work on RDNA 4 GPUs.

The good news is that FSR 3.1 and FSR 4 are forward- and backward-compatible. Games that have already added FSR 3.1 support can automatically take advantage of FSR 4, and games that support FSR 4 on the 90-series can just run FSR 3.1 on older and non-AMD GPUs.

FSR 4 comes with a small performance hit compared to FSR 3.1 at the same settings, but better overall quality can let you drop to a faster preset like Balanced or Performance and end up with more frames-per-second overall. Credit: Andrew Cunningham

The only game in our current test suite to be compatible with FSR 4 is Horizon Zero Dawn Remastered, and we tested its performance using both FSR 3.1 and FSR 4. In general, we found that FSR 4 improved visual quality at the cost of just a few frames per second when run at the same settings—not unlike using Nvidia’s recently released “transformer model” for DLSS upscaling.

Many games will let you choose which version of FSR you want to use. But for FSR 3.1 games that don’t have a built-in FSR 4 option, there’s a toggle in AMD’s Adrenalin driver you can hit to switch to the better upscaling algorithm.

Even if they come with a performance hit, new upscaling algorithms can still improve performance by making the lower-resolution presets look better. We run all of our testing in “Quality” mode, which generally renders at two-thirds of native resolution and scales up. But if FSR 4 running in Balanced or Performance mode looks the same to your eyes as FSR 3.1 running in Quality mode, you can still end up with a net performance improvement in the end.
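To make those presets concrete, here’s a sketch of the internal render resolutions involved. The per-axis scale factors are the commonly cited ratios for recent FSR versions (1.5x for Quality, 1.7x for Balanced, 2x for Performance), though individual games are free to use different values:

```python
# Approximate internal render resolutions for FSR's standard presets.
# The per-axis scale factors are the commonly cited FSR ratios; individual
# games can use different values.
presets = {"Quality": 1.5, "Balanced": 1.7, "Performance": 2.0}
outputs = {"1440p": (2560, 1440), "4K": (3840, 2160)}

for label, (w, h) in outputs.items():
    for preset, divisor in presets.items():
        rw, rh = int(w / divisor), int(h / divisor)
        print(f"{label} {preset}: {rw}x{rh} rendered, upscaled to {w}x{h}")
```

Note that Quality mode at 4K renders internally at 2560×1440, in line with the two-thirds figure above, which is a big part of why upscaled 4K is so much cheaper than native 4K.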

RX 9070 or 9070 XT?

Just $50 separates the advertised price of the 9070 from that of the 9070 XT, a pricing move both Nvidia and AMD have made in the past that I find a bit annoying. If you have $549 to spend on a graphics card, you can almost certainly scrape together $599. All else being equal, I’d tell most people trying to choose between these two to just spring for the 9070 XT.

That said, availability and retail pricing for these might be all over the place. If your choices are a regular RX 9070 or nothing, or an RX 9070 at $549 and an RX 9070 XT at any price higher than $599, I would just grab a 9070 and not sweat it too much. The two cards aren’t that far apart in performance, especially if you bump the 9070’s TBP up a little bit, and games that are playable on one will be playable at similar settings on the other.

Pretty close to great

If you’re building a 1440p or 4K gaming box, the 9070 series might be the ones to beat right now. Credit: Andrew Cunningham

We’ve got plenty of objective data in here, so I don’t mind saying that I came into this review kind of wanting to like the 9070 and 9070 XT. Nvidia’s 50-series cards have mostly upheld the status quo, and for the last couple of years, the status quo has been sustained high prices and very modest generational upgrades. And who doesn’t like an underdog story?

I think our test results mostly justify my priors. The RX 9070 and 9070 XT are very competitive graphics cards, helped along by a particularly mediocre RTX 5070 refresh from Nvidia. In non-ray-traced games, both cards wipe the floor with the 5070 and come close to competing with the $749 RTX 5070 Ti. In games and synthetic benchmarks with ray-tracing effects on, both cards can usually match or slightly beat the similarly priced 5070, partially (if not entirely) addressing AMD’s longstanding performance deficit here. Neither card comes close to the 5070 Ti in these games, but they’re also not priced like a 5070 Ti.

Just as impressively, the Radeon cards compete with the GeForce cards while consuming similar amounts of power. At stock settings, the RX 9070 uses roughly the same amount of power under load as a 4070 Super but with better performance. The 9070 XT uses about as much power as a 5070 Ti, with similar performance before you turn ray-tracing on. Power efficiency was a small but consistent drawback for the RX 7000 series compared to GeForce cards, and the 9070 cards mostly erase that disadvantage. AMD is also less stingy with the RAM, giving you 16GB for the price Nvidia charges for 12GB.

Some of the old caveats still apply. Radeons still take a proportionally bigger performance hit than GeForce cards when ray-tracing effects are enabled. DLSS already looks pretty good and is widely supported, while FSR 3.1/FSR 4 adoption is still relatively low. Nvidia has a nearly monopolistic grip on the dedicated GPU market, which means many apps, AI workloads, and games support its GPUs best/first/exclusively. AMD is always playing catch-up to Nvidia in some respect, and Nvidia keeps progressing quickly enough that it feels like AMD never quite has the opportunity to close the gap.

AMD also doesn’t have an answer for DLSS Multi-Frame Generation. The benefits of that technology are fairly narrow, and you already get most of those benefits with single-frame generation. But it’s still a thing that Nvidia does that AMD doesn’t.

Overall, the RX 9070 cards are both awfully tempting competitors to the GeForce RTX 5070—and occasionally even the 5070 Ti. They’re great at 1440p and decent at 4K. Sure, I’d like to see them priced another $50 or $100 cheaper to well and truly undercut the 5070 and bring 1440p-to-4K performance to a sub-$500 graphics card. It would be nice to see AMD undercut Nvidia’s GPUs as ruthlessly as it undercut Intel’s CPUs nearly a decade ago. But these RDNA 4 GPUs have way fewer downsides than previous-generation cards, and they come at a moment of relative weakness for Nvidia. We’ll see if the sales follow.

The good

  • Great 1440p performance and solid 4K performance
  • 16GB of RAM
  • Decisively beats Nvidia’s RTX 5070, including in most ray-traced games
  • RX 9070 XT is competitive with RTX 5070 Ti in non-ray-traced games for less money
  • Both cards match or beat the RX 7900 XT, AMD’s second-fastest card from the last generation
  • Decent power efficiency for the 9070 XT and great power efficiency for the 9070
  • Automated options for tuning overall power use to prioritize either efficiency or performance
  • Reliable 8-pin power connectors available in many cards

The bad

  • Nvidia’s ray-tracing performance is still usually better
  • At $549 and $599, pricing matches but doesn’t undercut the RTX 5070
  • FSR 4 isn’t as widely supported as DLSS and may not be for a while

The ugly

  • Playing the “can you actually buy these for AMD’s advertised prices” game


Andrew is a Senior Technology Reporter at Ars Technica, with a focus on consumer tech including computer hardware and in-depth reviews of operating systems like Windows and macOS. Andrew lives in Philadelphia and co-hosts a weekly book podcast called Overdue.

AMD Radeon RX 9070 and 9070 XT review: RDNA 4 fixes a lot of AMD’s problems Read More »

george-orwell’s-1984-as-a-’90s-pc-game-has-to-be-seen-to-be-believed

George Orwell’s 1984 as a ’90s PC game has to be seen to be believed

Quick, to the training sphere!

The Big Brother announcement promised the ability to “interact with everything” and “disable and destroy intrusive tele-screens and spy cameras watching the player’s every move” across “10 square blocks of Orwell’s retro-futuristic world.” But footage from the demo falls well short of that promise, instead showing some extremely basic Riven-style puzzle gameplay (flip switches to turn on the power, use a screwdriver to open a grate, and so on) played from a first-person view.

Sample gameplay from the newly unearthed Big Brother demo.

It all builds up to a sequence where (according to a walk-through included on the demo disc) you have to put on a “zero-g suit” before planting a bomb inside a “zero gravity training sphere” guarded by robots. Sounds like inhabiting the world of the novel to us!

Aside from the brief mentions of the Thought Police and Minipax, the short demo does include a few other incidental nods to its licensed source material, including a “WAR IS PEACE” propaganda banner and an animated screen with the titular Big Brother seemingly looking down on you. Still, the entire gameplay scenario is so far removed from anything in the actual 1984 novel that it makes you wonder why they bothered with the license in the first place. Of course, MediaX answers that question in the game’s announcement, predicting that “while the game stands on its own as an entirely new creation in itself and will attract the typical game audience, the ‘Big Brother’ game will undoubtedly also attract a large literary audience.”

We sadly never got the chance to see how that “large literary audience” would have reacted to a game that seemed poised to pervert both the name and themes of 1984 so radically. In any case, this demo can now sit alongside the release of 1984’s Fahrenheit 451 and 1992’s The Godfather: The Action Game on any list of the most questionable game adaptations of respected works of art.

George Orwell’s 1984 as a ’90s PC game has to be seen to be believed Read More »