As with its other 50-series announcements, Nvidia is leaning on its DLSS Multi-Frame Generation technology to make lofty performance claims—the GPUs can insert up to three AI-interpolated frames in between each pair of frames that the GPU actually renders. The 40 series could only generate a single frame, and 30-series and older GPUs don’t support DLSS Frame Generation at all. This makes apples-to-apples performance comparisons difficult.
Generally, the company says the 5060 Ti and 5060 offer double the performance of the 4060 Ti and 4060, but all of its benchmarks were run using the “max Frame Gen level supported by each GPU.” The small snippets of native performance information we do have—Hogwarts Legacy runs at 61 FPS at 1440p on a 5060 Ti, compared to 34 FPS on a 3060 Ti—suggest that it’s slightly less than twice as fast as that two-generation-old card. This would still be reasonably impressive, given the underwhelming 4060 Ti refresh. But we’ll need to wait for third-party testing before we really have a good idea of how performance will stack up without Frame Generation enabled.
As we and others have observed since the launch of the 40-series a few years ago, Frame Generation gives the best results when your base frame rate is already reasonably high; the technology is best used to make a good frame rate better and is less useful if you’re trying to make a bad frame rate good. That’s even more relevant for the slower 50-series than for the other GPUs in the lineup, which makes Nvidia’s reticence to provide native performance comparisons especially frustrating.
Rumors from earlier this year that correctly reported the specs of the 5060 series also indicated that Nvidia was planning to launch a low-end RTX 5050 GPU at some point, its first new entry-level GPU since launching the RTX 3050 in January 2022. The 5050 could still be coming, but if it is, it wasn’t part of Nvidia’s announcements today.
On Tuesday at Nvidia’s GTC 2025 conference in San Jose, California, CEO Jensen Huang revealed several new AI-accelerating GPUs the company plans to release over the coming months and years. He also revealed more specifications about previously announced chips.
The centerpiece announcement was Vera Rubin, first teased at Computex 2024 and now scheduled for release in the second half of 2026. This GPU, named after a famous astronomer, will feature tens of terabytes of memory and comes with a custom Nvidia-designed CPU called Vera.
According to Nvidia, Vera Rubin will deliver significant performance improvements over its predecessor, Grace Blackwell, particularly for AI training and inference.
Specifications for Vera Rubin, presented by Jensen Huang during his GTC 2025 keynote.
Vera Rubin features two GPUs together on one die that deliver 50 petaflops of FP4 inference performance per chip. When configured in a full NVL144 rack, the system delivers 3.6 exaflops of FP4 inference compute—3.3 times more than Blackwell Ultra’s 1.1 exaflops in a similar rack configuration.
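Those rack-level figures line up with the per-chip numbers if you assume the “NVL144” designation counts the 144 individual GPUs (two per chip, 72 chips per rack)—a quick sanity check:

```python
# Sanity check on Nvidia's Vera Rubin rack figures, assuming the "NVL144"
# name counts 144 individual GPUs (two per chip, 72 chips per rack).
FP4_PETAFLOPS_PER_CHIP = 50
CHIPS_PER_RACK = 144 // 2

rack_exaflops = FP4_PETAFLOPS_PER_CHIP * CHIPS_PER_RACK / 1000
print(rack_exaflops)        # 3.6 exaflops, matching Nvidia's figure
print(rack_exaflops / 1.1)  # ~3.3x Blackwell Ultra's 1.1 exaflops
```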
The Vera CPU features 88 custom ARM cores with 176 threads connected to Rubin GPUs via a high-speed 1.8 TB/s NVLink interface.
Huang also announced Rubin Ultra, which will follow in the second half of 2027. Rubin Ultra will use the NVL576 rack configuration and feature individual GPUs with four reticle-sized dies, delivering 100 petaflops of FP4 precision (a 4-bit floating-point format used for representing and processing numbers within AI models) per chip.
At the rack level, Rubin Ultra will provide 15 exaflops of FP4 inference compute and 5 exaflops of FP8 training performance—about four times more powerful than the Rubin NVL144 configuration. Each Rubin Ultra GPU will include 1TB of HBM4e memory, with the complete rack containing 365TB of fast memory.
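To give a sense of just how coarse 4-bit floating point is, here’s a sketch that enumerates every representable FP4 value, assuming the common E2M1 layout (1 sign bit, 2 exponent bits, 1 mantissa bit); Nvidia hasn’t spelled out the exact variant in these announcements:

```python
# Every value representable in FP4, assuming the common E2M1 layout
# (1 sign bit, 2 exponent bits, 1 mantissa bit, exponent bias 1).
def e2m1(sign: int, exp: int, man: int) -> float:
    if exp == 0:
        value = man / 2                         # subnormal: 0 or 0.5
    else:
        value = (1 + man / 2) * 2 ** (exp - 1)  # normal: implicit leading 1
    return -value if sign else value

values = sorted({e2m1(s, e, m) for s in (0, 1) for e in range(4) for m in (0, 1)})
print(values)
# [-6.0, -4.0, -3.0, -2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
```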
Nvidia has launched all of the GeForce RTX 50-series GPUs that it announced at CES, at least technically—whether you’re buying from Nvidia, AMD, or Intel, it’s nearly impossible to find any of these new cards at their advertised prices right now.
But hope springs eternal, and newly leaked specs for GeForce RTX 5060 and 5050-series cards suggest that Nvidia may be announcing these lower-end cards soon. These kinds of cards are rarely exciting, but Steam Hardware Survey data shows that these xx60 and xx50 cards are what the overwhelming majority of PC gamers are putting in their systems.
The specs, posted by a reliable leaker named Kopite and reported by Tom’s Hardware and others, suggest a refresh that’s in line with what Nvidia has done with most of the 50-series so far. Along with a move to the next-generation Blackwell architecture, the 5060 GPUs each come with a small increase to the number of CUDA cores, a jump from GDDR6 to GDDR7, and an increase in power consumption, but no changes to the amount of memory or the width of the memory bus. The 8GB versions, in particular, will probably continue to be marketed primarily as 1080p cards.
| | RTX 5060 Ti (leaked) | RTX 4060 Ti | RTX 5060 (leaked) | RTX 4060 | RTX 5050 (leaked) | RTX 3050 |
| --- | --- | --- | --- | --- | --- | --- |
| CUDA Cores | 4,608 | 4,352 | 3,840 | 3,072 | 2,560 | 2,560 |
| Boost Clock | Unknown | 2,535 MHz | Unknown | 2,460 MHz | Unknown | 1,777 MHz |
| Memory Bus Width | 128-bit | 128-bit | 128-bit | 128-bit | 128-bit | 128-bit |
| Memory bandwidth | Unknown | 288 GB/s | Unknown | 272 GB/s | Unknown | 224 GB/s |
| Memory size | 8GB or 16GB GDDR7 | 8GB or 16GB GDDR6 | 8GB GDDR7 | 8GB GDDR6 | 8GB GDDR6 | 8GB GDDR6 |
| TGP | 180 W | 160 W | 150 W | 115 W | 130 W | 130 W |
As with the 4060 Ti, the 5060 Ti is said to come in two versions, one with 8GB of RAM and one with 16GB. One of the 4060 Ti’s problems was that its relatively narrow 128-bit memory bus limited its performance at 1440p and 4K resolutions even with 16GB of RAM—the bandwidth increase from GDDR7 could help with this, but we’ll need to test to see for sure.
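As a rough guide to what that bandwidth increase could look like: peak bandwidth is just bus width times per-pin data rate. The GDDR7 speed below is hypothetical, since the leak lists the 5060 Ti’s bandwidth as unknown.

```python
# Peak memory bandwidth: bus width (bits) / 8 * per-pin data rate (Gbps) = GB/s.
def bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    return bus_width_bits / 8 * data_rate_gbps

print(bandwidth_gb_s(128, 18))  # 288.0 GB/s -- the 4060 Ti's GDDR6
print(bandwidth_gb_s(128, 28))  # 448.0 GB/s -- if the 5060 Ti used 28Gbps GDDR7
```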
AMD is releasing the first detailed specifications of its next-generation Radeon RX 9070 series GPUs and the RDNA4 graphics architecture today, almost two months after teasing them at CES.
The short version is that these are both upper-midrange graphics cards targeting resolutions of 1440p and 4K and meant to compete mainly with Nvidia’s outgoing and incoming 4070- and 5070-series GeForce GPUs, including the RTX 4070, RTX 5070, RTX 4070 Ti and Ti Super, and the RTX 5070 Ti.
AMD says the RX 9070 will start at $549, the same price as Nvidia’s RTX 5070. The slightly faster 9070 XT starts at $599, $150 less than the RTX 5070 Ti. The cards go on sale March 6, a day after Nvidia’s RTX 5070.
Neither Nvidia nor Intel has managed to keep its GPUs in stores at their announced starting prices so far, though, so how well AMD’s pricing stacks up to Nvidia in the real world may take a few weeks or months to settle out. For its part, AMD says it’s confident that it has enough supply to meet demand, but that’s as specific as the company’s reassurances got.
Specs and speeds: Radeon RX 9070 and 9070 XT
| | RX 9070 XT | RX 9070 | RX 7900 XTX | RX 7900 XT | RX 7900 GRE | RX 7800 XT |
| --- | --- | --- | --- | --- | --- | --- |
| Compute units (Stream processors) | 64 RDNA4 (4,096) | 56 RDNA4 (3,584) | 96 RDNA3 (6,144) | 84 RDNA3 (5,376) | 80 RDNA3 (5,120) | 60 RDNA3 (3,840) |
| Boost Clock | 2,970 MHz | 2,520 MHz | 2,498 MHz | 2,400 MHz | 2,245 MHz | 2,430 MHz |
| Memory Bus Width | 256-bit | 256-bit | 384-bit | 320-bit | 256-bit | 256-bit |
| Memory Bandwidth | 650 GB/s | 650 GB/s | 960 GB/s | 800 GB/s | 576 GB/s | 624 GB/s |
| Memory size | 16GB GDDR6 | 16GB GDDR6 | 24GB GDDR6 | 20GB GDDR6 | 16GB GDDR6 | 16GB GDDR6 |
| Total board power (TBP) | 304 W | 220 W | 355 W | 315 W | 260 W | 263 W |
As is implied by their similar price tags, the 9070 and 9070 XT have more in common than not. Both are based on the same GPU die—the 9070 has 56 of the chip’s compute units enabled, while the 9070 XT has 64. Both cards come with 16GB of RAM (4GB more than the 5070, the same amount as the 5070 Ti) on a 256-bit memory bus, and both use two 8-pin power connectors by default, though the 9070 XT can use significantly more power than the 9070 (304 W, compared to 220 W).
AMD says that its partners are free to make Radeon cards with the 12VHPWR or 12V-2×6 power connectors on them, though given the apparently ongoing issues with the connector, we’d expect most Radeon GPUs to stick with the known quantity that is the 8-pin connector.
AMD says that the 9070 series is made using a 4 nm TSMC manufacturing process and that the chips are monolithic rather than being split up into chiplets, as some RX 7000-series cards were. AMD’s use of memory controller chiplets was always hit or miss with the 7000-series—the high-end cards tended to use them, while the lower-end GPUs were usually monolithic—so it’s not clear whether this means AMD is giving up on chiplet-based GPUs altogether or whether it’s just not using them this time around.
It’s hard to review a product if you don’t know what it will actually cost!
The Asus Prime GeForce RTX 5070 Ti. Credit: Andrew Cunningham
Nvidia’s RTX 50-series makes its first foray below the $1,000 mark starting this week, with the $749 RTX 5070 Ti—at least in theory.
The third-fastest card in the Blackwell GPU lineup, the 5070 Ti is still far from “reasonably priced” by historical standards (the 3070 Ti was $599 at launch). But it’s also $50 cheaper and a fair bit faster than the outgoing 4070 Ti Super and the older 4070 Ti. These are steps in the right direction, if small ones.
We’ll talk more about its performance shortly, but at a high level, the 5070 Ti’s performance falls in the same general range as the 4080 Super and the original RTX 4080, a card that launched for $1,199 just over two years ago. And it’s probably your floor for consistently playable native 4K gaming for those of you out there who don’t want to rely on DLSS upscaling to hit that resolution (it’s also probably all the GPU that most people will need for high-FPS 1440p, if that’s more your speed).
But it’s a card I’m ambivalent about! It’s close to 90 percent as fast as a 5080 for 75 percent of the price, at least if you go by Nvidia’s minimum list prices, which for the 5090 and 5080 have been mostly fictional so far. If you can find it at that price—and that’s a big “if,” since every $749 model is already out of stock across the board at Newegg—and you’re desperate to upgrade or are building a brand-new 4K gaming PC, you could do worse. But I wouldn’t spend more than $749 on it, and it might be worth waiting to see what AMD’s first 90-series Radeon cards look like in a couple weeks before you jump in.
Meet the GeForce RTX 5070 Ti
| | RTX 5080 | RTX 4080 Super | RTX 5070 Ti | RTX 4070 Ti Super | RTX 4070 Ti | RTX 5070 |
| --- | --- | --- | --- | --- | --- | --- |
| CUDA Cores | 10,752 | 10,240 | 8,960 | 8,448 | 7,680 | 6,144 |
| Boost Clock | 2,617 MHz | 2,550 MHz | 2,452 MHz | 2,610 MHz | 2,610 MHz | 2,512 MHz |
| Memory Bus Width | 256-bit | 256-bit | 256-bit | 256-bit | 192-bit | 192-bit |
| Memory Bandwidth | 960 GB/s | 736 GB/s | 896 GB/s | 672 GB/s | 504 GB/s | 672 GB/s |
| Memory size | 16GB GDDR7 | 16GB GDDR6X | 16GB GDDR7 | 16GB GDDR6X | 12GB GDDR6X | 12GB GDDR7 |
| TGP | 360 W | 320 W | 300 W | 285 W | 285 W | 250 W |
Nvidia isn’t making a Founders Edition version of the 5070 Ti, so this time around our review unit is an Asus Prime GeForce RTX 5070 Ti provided by Asus and Nvidia. These third-party cards will deviate a little from the stock specs listed above, but factory overclocks tend to be exceedingly mild, done mostly so the GPU manufacturer can slap a big “overclocked” badge somewhere on the box. We tested this Asus card with its BIOS switch set to “performance” mode, which elevates the boost clock by an entire 30 MHz; you don’t need to be a math whiz to guess that a 1.2 percent overclock is not going to change performance much.
Compared to the 4070 Ti Super, the 5070 Ti brings two things to the table: a roughly 6 percent increase in CUDA cores and a 33 percent increase in memory bandwidth, courtesy of the switch from GDDR6X to GDDR7. The original 4070 Ti had even fewer CUDA cores, but most importantly for its 4K performance included just 12GB of memory on a 192-bit bus.
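Those two deltas fall straight out of the spec table above:

```python
# Generational deltas for the 5070 Ti vs. the 4070 Ti Super, from the table above.
cores_new, cores_old = 8_960, 8_448   # CUDA cores
bw_new, bw_old = 896, 672             # memory bandwidth, GB/s

print(f"CUDA cores: +{cores_new / cores_old - 1:.1%}")   # +6.1%
print(f"Memory bandwidth: +{bw_new / bw_old - 1:.1%}")   # +33.3%
```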
The 5070 Ti is based on the same GB203 GPU silicon as the 5080, but with 1,792 CUDA cores disabled. The two cards still share a lot, including the 16GB bank of GDDR7 and the 256-bit memory bus—nothing like the yawning gap between the RTX 5090 and the RTX 5080—and their similar specs meant they weren’t too far away from each other in our testing. The 5070 Ti’s 300 W power requirement is also a bit lower than the 5080’s 360 W, though it’s pretty close to the 4080 and 4080 Super’s 320 W; in practice, the 5070 Ti draws about as much under load as the 4080 cards do.
Asus’ design for its Prime RTX 5070 Ti is an inoffensive 2.5-slot, triple-fan card that should fit without a problem in most builds. Credit: Andrew Cunningham
As a Blackwell GPU, the 5070 Ti also supports Nvidia’s most-hyped addition to the 50-series: support for DLSS 4 and Multi-Frame Generation (MFG). We’ve already covered this in our 5090 and 5080 reviews, but the short version is that MFG works exactly like Frame Generation did in the 40-series, except that it can now insert up to three AI-generated frames in between natively rendered frames instead of just one.
Especially if you’re already running at a reasonably high frame rate, this can make things look a lot smoother on a high-refresh-rate monitor without introducing distractingly excessive lag or weird rendering errors. The feature is mainly controversial because Nvidia is comparing 50-series performance numbers with DLSS MFG enabled to older 40-series cards without DLSS MFG to make the 50-series cards seem a whole lot faster than they actually are.
We’ll publish some frame-generation numbers in our review, both using DLSS and (for AMD cards) FSR. But per usual, we’ll continue to focus on natively rendered performance—more relevant for all the games out there that don’t support frame generation or don’t benefit much from it, and more relevant because your base performance dictates how good your generated frames will look and feel anyway.
Testbed notes
We tested the 5070 Ti in the same updated testbed and with the same updated suite of games that we started using in our RTX 5090 review. The heart of the build is an AMD Ryzen 7 9800X3D, ensuring that our numbers are limited as little as possible by CPU speed.
Per usual, we prioritize testing GPUs at resolutions that we think most people will use them for. For the 5070 Ti, that means both 4K and 1440p—this card is arguably still overkill for 1440p, but if you’re trying to hit 144 or 240 Hz (or even more) on a monitor, there’s a good case to be made for it. We also use a mix of ray-traced and non-ray-traced games. For the games we test with upscaling enabled, we use DLSS on Nvidia cards and the newest supported version of FSR (usually 2.x or 3.x) for AMD cards.
Though we’ve tested and re-tested multiple cards with recent drivers in our updated testbed, we don’t have a 4070 Ti Super, 4070 Ti, or 3070 Ti available to test with. We’ve provided some numbers for those GPUs from past reviews; these are from a PC running older drivers and a Ryzen 7 7800X3D instead of a 9800X3D, and we’ve put asterisks next to them in our charts. They should still paint a reasonably accurate picture of the older GPUs’ relative performance, but take them with that small grain of salt.
Performance and power
Despite having fewer CUDA cores than either version of the 4080, the 5070 Ti keeps pace with both cards almost perfectly, thanks to some combination of architectural improvements and increased memory bandwidth. In most of our tests, it landed in the narrow strip right in between the 4080 and the 4080 Super, and its power consumption under load was also almost identical.
Benchmarks with DLSS/FSR and/or frame generation enabled.
In every way that matters, the 5070 Ti is essentially an RTX 4080 that also supports DLSS Multi-Frame Generation. You can see why we’d be mildly enthusiastic about it at $749 but less and less impressed the closer the price creeps to $1,000.
Being close to a 4080 also means that the performance gap between the 5070 Ti and the 5080 is usually pretty small. In most of the games we tested, the 5070 Ti hovers right around 90 percent of the 5080’s performance.
The 5070 Ti is also around 60 percent as fast as an RTX 5090. The performance is a lot lower, but the price-to-performance ratio is a lot higher, possibly reflecting the fact that the 5070 Ti actually has other GPUs it has to compete with (in non-ray-traced games, the Radeon RX 7900 XTX generally keeps pace with the 5070 Ti, though at this late date it is mostly out of stock unless you’re willing to pay way more than you ought to for one).
Compared to the old 4070 Ti, the 5070 Ti can be between 20 and 50 percent faster at 4K, depending on how limited the game is by the 4070 Ti’s narrower memory bus and 12GB bank of RAM. The performance improvement over the 4070 Ti Super is more muted, ranging from as little as 8 percent to as much as 20 percent in our 4K tests. This is better than the RTX 5080 did relative to the RTX 4080 Super, but as a generational leap, it’s still pretty modest—it’s clear why Nvidia wants everyone to look at the Multi-Frame Generation numbers when making comparisons.
Waiting to put theory into practice
Asus’ RTX 5070 Ti, replete with 12-pin power plug. Credit: Andrew Cunningham
Being able to get RTX 4080-level performance for several hundred dollars less just a couple of years after the 4080 launched is kind of exciting, though that excitement is leavened by the still high-ish $749 price tag (again, assuming it’s actually available at or anywhere near that price). That certainly makes it feel more like a next-generation GPU than the RTX 5080 did—and whatever else you can say about it, the 5070 Ti certainly feels like a better buy than the 5080.
The 5070 Ti is a fast and 4K-capable graphics card, fast enough that you should be able to get some good results from all of Blackwell’s new frame-generation trickery if that’s something you want to play with. Its price-to-performance ratio does not thrill me, but if you do the math, it’s still a much better value than the 4070 Ti series was—particularly the original 4070 Ti, with the 12GB allotment of RAM that limited its usefulness and future-proofness at 4K.
Two reasons to hold off on buying a 5070 Ti, if you’re thinking about it: We’re waiting to see how AMD’s 9070 series GPUs shake out, and Nvidia’s 50-series launch so far has been kind of a mess, with low availability and price gouging both on retail sites and in the secondhand market. Pay much more than $749 for a 5070 Ti, and its delicate value proposition fades quickly. We should know more about the AMD cards in a couple of weeks. The supply situation, at least so far, seems like a problem that Nvidia can’t (or won’t) figure out how to solve.
The good
For a starting price of $749, you get the approximate performance and power consumption of an RTX 4080, a GPU that cost $1,199 two years ago and $999 one year ago.
Good 4K performance and great 1440p performance for those with high-refresh monitors.
16GB of RAM should be reasonably future-proof.
Multi-Frame Generation is an interesting performance-boosting tool to have in your toolbox, even if it isn’t a cure-all for low framerates.
Nvidia-specific benefits like DLSS support and CUDA.
The bad
Not all that much faster than a 4070 Ti Super.
$749 looks cheap compared to a $2,000 GPU, but it’s still enough money to buy a high-end game console or an entire 1080p gaming PC.
The ugly
Pricing and availability for other 50-series GPUs to date have both been kind of a mess.
Will you actually be able to get it for $749? Because it doesn’t make a ton of sense if it costs more than $749.
Seriously, it’s been months since I reviewed a GPU that was actually widely available at its advertised price.
And it’s not just the RTX 5090 or 5080, it’s low-end stuff like the Intel Arc B580 and B570, too.
Is it high demand? Low supply? Scalpers and resellers hanging off the GPU market like the parasites they are? No one can say!
It makes these reviews very hard to do.
It also makes PC gaming, as a hobby, really difficult to get into if you aren’t into it already!
It just makes me mad is all.
If you’re reading this months from now and the GPUs actually are in stock at the list price, I hope this was helpful.
The 12VHPWR and 12V-2×6 connectors are both designed to solve a real problem: delivering hundreds of watts of power to high-end GPUs over a single cable rather than trying to fit multiple 8-pin power connectors onto these GPUs. In theory, swapping two to four 8-pin connectors for a single 12V-2×6 or 12VHPWR connector cuts down on the amount of board space OEMs must reserve for these connectors in their designs and the number of cables that users have to snake through the inside of their gaming PCs.
But while Nvidia, Intel, AMD, Qualcomm, Arm, and other companies are all PCI-SIG members and all had a hand in the design of the new standards, Nvidia is the only GPU company to use the 12VHPWR and 12V-2×6 connectors in most of its GPUs. AMD and Intel have continued to use the 8-pin power connector, and even some of Nvidia’s partners have stuck with 8-pin connectors for lower-end, lower-power cards like the RTX 4060 and 4070 series.
Both of the reported 5090 incidents involved third-party cables, one from custom PC part manufacturer MODDIY and one included with an FSP power supply, rather than the first-party 8-pin adapter that Nvidia supplies with GeForce GPUs. It’s much too early to say whether these cables (or Nvidia, or the design of the connector, or the affected users) caused the problem or whether this was just a coincidence.
We’ve contacted Nvidia to see whether it’s aware of and investigating the reports and will update this piece if we receive a response.
Even setting aside Frame Generation, this is a fast, power-hungry $2,000 GPU.
Credit: Andrew Cunningham
Nvidia’s GeForce RTX 5090 starts at $1,999 before you factor in upsells from the company’s partners or price increases driven by scalpers and/or genuine demand. It costs more than my entire gaming PC.
The new GPU is so expensive that you could build an entire well-specced gaming PC with Nvidia’s next-fastest GPU in it—the $999 RTX 5080, which we don’t have in hand yet—for the same money, or maybe even a little less with judicious component selection. It’s not the most expensive GPU that Nvidia has ever launched—2018’s $2,499 Titan RTX has it beat, and 2022’s RTX 3090 Ti also cost $2,000—but it’s safe to say it’s not really a GPU intended for the masses.
At least as far as gaming is concerned, the 5090 is the very definition of a halo product; it’s for people who demand the best and newest thing regardless of what it costs (the calculus is probably different for deep-pocketed people and companies who want to use them as some kind of generative AI accelerator). And on this front, at least, the 5090 is successful. It’s the newest and fastest GPU you can buy, and the competition is not particularly close. It’s also a showcase for DLSS Multi-Frame Generation, a new feature unique to the 50-series cards that Nvidia is leaning on heavily to make its new GPUs look better than they already are.
Founders Edition cards: Design and cooling
| | RTX 5090 | RTX 4090 | RTX 5080 | RTX 4080 Super |
| --- | --- | --- | --- | --- |
| CUDA cores | 21,760 | 16,384 | 10,752 | 10,240 |
| Boost clock | 2,410 MHz | 2,520 MHz | 2,617 MHz | 2,550 MHz |
| Memory bus width | 512-bit | 384-bit | 256-bit | 256-bit |
| Memory bandwidth | 1,792 GB/s | 1,008 GB/s | 960 GB/s | 736 GB/s |
| Memory size | 32GB GDDR7 | 24GB GDDR6X | 16GB GDDR7 | 16GB GDDR6X |
| TGP | 575 W | 450 W | 360 W | 320 W |
We won’t spend too long talking about the specific designs of Nvidia’s Founders Edition cards since many buyers will experience the Blackwell GPUs with cards from Nvidia’s partners instead (the cards we’ve seen so far mostly look like the expected fare: gargantuan triple-slot triple-fan coolers, with varying degrees of RGB). But it’s worth noting that Nvidia has addressed a couple of my functional gripes with the 4090/4080-series design.
The first was the sheer dimensions of each card—not an issue unique to Nvidia, but one that frequently caused problems for me as someone who tends toward ITX-based PCs and smaller builds. The 5090 and 5080 FE designs are the same length and height as the 4090 and 4080 FE designs, but they only take up two slots instead of three, which will make them an easier fit for many cases.
Nvidia has also tweaked the cards’ 12VHPWR connector, recessing it into the card and mounting it at a slight angle instead of having it sticking straight out of the top edge. The height of the 4090/4080 FE design made some cases hard to close up once you factored in the additional height of a 12VHPWR cable or Nvidia’s many-tentacled 8-pin-to-12VHPWR adapter. The angled connector still extends a bit beyond the top of the card, but it’s easier to tuck the cable away so you can put the side back on your case.
Finally, Nvidia has changed its cooler—whereas most OEM GPUs mount all their fans on the top of the GPU, Nvidia has historically placed one fan on each side of the card. In a standard ATX case with the GPU mounted parallel to the bottom of the case, this wasn’t a huge deal—there’s plenty of room for that air to circulate inside the case and to be expelled by whatever case fans you have installed.
But in “sandwich-style” ITX cases, where a riser cable wraps around so the GPU can be mounted parallel to the motherboard, the fan on the bottom side of the GPU was poorly placed. In many sandwich-style cases, the GPU fan will dump heat against the back of the motherboard, making it harder to keep the GPU cool and creating heat problems elsewhere besides. The new GPUs mount both fans on the top of the cards.
Nvidia’s Founders Edition cards have had heat issues in the past—most notably the 30-series GPUs—and that was my first question going in. A smaller cooler plus a dramatically higher peak power draw seems like a recipe for overheating.
Temperatures for the various cards we re-tested for this review. The 5090 FE is the toastiest of all of them, but it still has a safe operating temperature.
At least for the 5090, the smaller cooler does mean higher temperatures—around 10 to 12 degrees Celsius higher when running the same benchmarks as the RTX 4090 Founders Edition. And while temperatures of around 77 degrees aren’t hugely concerning, this is sort of a best-case scenario, with an adequately cooled testbed case with the side panel totally removed and ambient temperatures at around 21° or 22° Celsius. You’ll just want to make sure you have a good amount of airflow in your case if you buy one of these.
Testbed notes
A new high-end Nvidia GPU is a good reason to tweak our test bed and suite of games, and we’ve done both here. Mainly, we added a 1050 W Thermaltake Toughpower GF A3 power supply—Nvidia recommends at least 1000 W for the 5090, and this one has a native 12VHPWR connector for convenience. We’ve also swapped the Ryzen 7 7800X3D for a slightly faster Ryzen 7 9800X3D to reduce the odds that the CPU will bottleneck performance as we try to hit high frame rates.
As for the suite of games, we’ve removed a couple of older titles and added some with built-in benchmarks that will tax these GPUs a bit more, especially at 4K with all the settings turned up. Those games include the RT Overdrive preset in the perennially punishing Cyberpunk 2077 and Black Myth: Wukong in Cinematic mode, both games where even the RTX 4090 struggles to hit 60 fps without an assist from DLSS. We’ve also added Horizon Zero Dawn Remastered, a recent release that doesn’t include ray-tracing effects but does support most DLSS 3 and FSR 3 features (including FSR Frame Generation).
We’ve tried to strike a balance between games with ray-tracing effects and games without it, though most AAA games these days include it, and modern GPUs should be able to handle it well (best of luck to AMD with its upcoming RDNA 4 cards).
For the 5090, we’ve run all tests in 4K—if you don’t care about running games in 4K, even if you want super-high frame rates at 1440p or for some kind of ultrawide monitor, the 5090 is probably overkill. When we run upscaling tests, we use the newest DLSS version available for Nvidia cards, the newest FSR version available for AMD cards, and the newest XeSS version available for Intel cards (not relevant here, just stating for the record), and we use the “Quality” setting (at 4K, that equates to an actual rendering resolution of 1440p).
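For reference, the scaling math behind that parenthetical—DLSS’s Quality preset renders at two-thirds of the output resolution on each axis before upscaling:

```python
# DLSS "Quality" renders at 2/3 of the output resolution per axis.
def render_resolution(out_w: int, out_h: int, scale: float = 2 / 3) -> tuple[int, int]:
    return round(out_w * scale), round(out_h * scale)

print(render_resolution(3840, 2160))  # (2560, 1440) -- 4K output, 1440p render
```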
Rendering performance: A lot faster, a lot more power-hungry
Before we talk about Frame Generation or “fake frames,” let’s compare apples to apples and just examine the 5090’s rendering performance.
The card mainly benefits from four things compared to the 4090: the updated Blackwell GPU architecture, a nearly 33 percent increase in the number of CUDA cores, an upgrade from GDDR6X to GDDR7, and a move from a 384-bit memory bus to a 512-bit bus. It also jumps from 24GB of RAM to 32GB, but games generally aren’t butting up against a 24GB limit yet, so the capacity increase by itself shouldn’t really change performance if all you’re focused on is gaming.
And for people who prioritize performance over all else, the 5090 is a big deal—it’s the first consumer graphics card from any company that is faster than a 4090, as Nvidia never spruced up the 4090 last year when it did its mid-generation Super refreshes of the 4080, 4070 Ti, and 4070.
Comparing natively rendered games at 4K, the 5090 is between 17 percent and 40 percent faster than the 4090, with most of the games we tested landing somewhere in the low to high 30 percent range. That’s an undeniably big bump, one that’s roughly commensurate with the increase in the number of CUDA cores. Tests run with DLSS enabled (both upscaling-only and with Frame Generation running in 2x mode) improve by roughly the same amount.
You could find things to be disappointed about if you went looking for them. That 30-something-percent performance increase comes with a 35 percent increase in power use in our testing under load with punishing 4K games—the 4090 tops out around 420 W, whereas the 5090 went all the way up to 573 W, with the 5090 coming closer to its 575 W TDP than the 4090 does to its theoretical 450 W maximum. The 50-series cards use the same TSMC 4N manufacturing process as the 40-series cards, and increasing the number of transistors without changing the process results in a chip that uses more power (though it should be said that capping frame rates, running at lower resolutions, or running less-demanding games can rein in that power use a bit).
Power draw under load goes up by an amount roughly commensurate with performance. The 4090 was already power-hungry; the 5090 is dramatically more so. Credit: Andrew Cunningham
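Run those numbers and the efficiency picture barely moves—a rough sketch using our measured load figures and a ~33 percent average uplift:

```python
# Rough perf-per-watt comparison of the 5090 vs. the 4090 under load.
perf_ratio = 1.33                  # ~33 percent faster at 4K, mid-range of our tests
power_4090_w, power_5090_w = 420, 573

power_ratio = power_5090_w / power_4090_w                      # ~1.36x
print(f"perf/watt vs. 4090: {perf_ratio / power_ratio:.2f}x")  # ~0.97x -- essentially flat
```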
The 5090’s 30-something percent increase over the 4090 might also seem underwhelming if you recall that the 4090 was around 55 percent faster than the previous-generation 3090 Ti while consuming about the same amount of power. To be even faster than a 4090 is no small feat—AMD’s fastest GPU is more in line with Nvidia’s 4080 Super—but if you’re comparing the two cards using the exact same tests, the relative leap is less seismic.
That brings us to Nvidia’s answer for that problem: DLSS 4 and its Multi-Frame Generation feature.
DLSS 4 and Multi-Frame Generation
As a refresher, Nvidia’s DLSS Frame Generation feature, as introduced in the GeForce 40-series, takes DLSS upscaling one step further. The upscaling feature inserted interpolated pixels into a rendered image to make it look like a sharper, higher-resolution image without having to do all the work of rendering all those pixels. DLSS FG would interpolate an entire frame between rendered frames, boosting your FPS without dramatically boosting the amount of work your GPU was doing. If you used DLSS upscaling and FG at the same time, Nvidia could claim that seven out of eight pixels on your screen were generated by AI.
DLSS Multi-Frame Generation (hereafter MFG, for simplicity’s sake) does the same thing, but it can generate one to three interpolated frames for every rendered frame. The marketing numbers have gone up, too; now, 15 out of every 16 pixels on your screen can be generated by AI.
Nvidia might point to this and say that the 5090 is over twice as fast as the 4090, but that’s not really comparing apples to apples. Expect this issue to persist over the lifetime of the 50-series. Credit: Andrew Cunningham
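The arithmetic behind those pixel-count claims is simple enough to check; here’s a sketch assuming DLSS Performance-mode upscaling, which renders at half the output resolution on each axis:

```python
# Fraction of displayed pixels that are AI-generated, given an upscaling
# axis scale and the number of generated frames per rendered frame.
def ai_pixel_fraction(axis_scale: float, generated_per_rendered: int) -> float:
    rendered = axis_scale**2 / (1 + generated_per_rendered)
    return 1 - rendered

print(ai_pixel_fraction(0.5, 1))  # 0.875  -> "7 out of 8 pixels" (DLSS FG)
print(ai_pixel_fraction(0.5, 3))  # 0.9375 -> "15 out of 16 pixels" (DLSS MFG 4x)
```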
Nvidia provided reviewers with a preview build of Cyberpunk 2077 with DLSS MFG enabled, which gives us an example of how those settings will be exposed to users. For 40-series cards that only support the regular DLSS FG, you won’t notice a difference in games that support MFG—Frame Generation is still just one toggle you can turn on or off. For 50-series cards that support MFG, you’ll be able to choose from among a few options, just as you currently can with other DLSS quality settings.
The “2x” mode is the old version of DLSS FG and is supported by both the 50-series cards and 40-series GPUs; it promises one generated frame for every rendered frame (two frames total, hence “2x”). The “3x” and “4x” modes are new to the 50-series and promise two and three generated frames (respectively) for every rendered frame. Like the original DLSS FG, MFG can be used in concert with normal DLSS upscaling, or it can be used independently.
One problem with the original DLSS FG was latency—user input was only being sampled at the natively rendered frame rate, meaning you could be looking at 60 frames per second on your display but only having your input polled 30 times per second. Another is image quality; as good as the DLSS algorithms can be at guessing and recreating what a natively rendered pixel would look like, you’ll inevitably see errors, particularly in fine details.
Both these problems contribute to the third problem with DLSS FG: Without a decent underlying frame rate, the lag you feel and the weird visual artifacts you notice will both be more pronounced. So DLSS FG can be useful for turning 120 fps into 240 fps, or even 60 fps into 120 fps. But it’s not as helpful if you’re trying to get from 20 or 30 fps up to a smooth 60 fps.
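To make that concrete: the displayed frame rate scales with the generation factor, but input is still only sampled on natively rendered frames, so the interval between input samples never improves.

```python
# What MFG does (display rate) and doesn't (input sampling) change.
def mfg_numbers(base_fps: float, factor: int) -> tuple[float, float]:
    display_fps = base_fps * factor
    input_interval_ms = 1000 / base_fps   # input polled once per rendered frame
    return display_fps, input_interval_ms

print(mfg_numbers(120, 2))  # (240.0, ~8.3 ms)  -- smooth and responsive
print(mfg_numbers(30, 4))   # (120.0, ~33.3 ms) -- looks fast, feels like 30 fps
```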
We’ll be taking a closer look at the DLSS upgrades in the next couple of weeks (including MFG and the new transformer model, which will supposedly increase upscaling quality and supports all RTX GPUs). But in our limited testing so far, the issues with DLSS MFG are basically the same as with the first version of Frame Generation, just slightly more pronounced. In the built-in Cyberpunk 2077 benchmark, the most visible issues are with some bits of barbed-wire fencing, which get smoother-looking and less detailed as you crank up the number of AI-generated frames. But the motion does look fluid and smooth, and the frame rate counts are admittedly impressive.
But as we noted in last year’s 4090 review, the xx90 cards portray FG and MFG in the best light possible since the card is already capable of natively rendering such high frame rates. It’s on lower-end cards where the shortcomings of the technology become more pronounced. Nvidia might say that the upcoming RTX 5070 is “as fast as a 4090 for $549,” and it might be right in terms of the number of frames the card can put up on your screen every second. But responsiveness and visual fidelity on the 4090 will be better every time—AI is a good augmentation for rendered frames, but it’s iffy as a replacement for rendered frames.
A 4090, amped way up
Nvidia’s GeForce RTX 5090. Credit: Andrew Cunningham
The GeForce RTX 5090 is an impressive card—it’s the only consumer graphics card to be released in over two years that can outperform the RTX 4090. The main caveats are its sky-high power consumption and sky-high price; by itself, it costs as much (and consumes as much power) as an entire mainstream gaming PC. The card is aimed at people who care about speed way more than they care about price, but it’s still worth putting it into context.
The main controversy, as with the 40-series, is how Nvidia talks about its Frame Generation-inflated performance numbers. Frame Generation and Multi-Frame Generation are tools in a toolbox—there will be games where they make things look great and run fast with minimal noticeable impact to visual quality or responsiveness, games where those impacts are more noticeable, and games that never add support for the features at all. (As well-supported as DLSS generally is in new releases, it is incumbent upon game developers to add it—and update it when Nvidia puts out a new version.)
But using those Multi-Frame Generation-inflated FPS numbers to make topline comparisons to last-generation graphics cards just feels disingenuous. No, an RTX 5070 will not be as fast as an RTX 4090 for just $549, because not all games support DLSS MFG, and not all games that do support it will run it well. Frame Generation still needs a good base frame rate to start with, and the slower your card is, the more issues you might notice.
Fuzzy marketing aside, Nvidia is still the undisputed leader in the GPU market, and the RTX 5090 extends that leadership for what will likely be another entire GPU generation, since both AMD and Intel are focusing their efforts on higher-volume, lower-cost cards right now. DLSS is still generally better than AMD’s FSR, and Nvidia does a good job of getting developers of new AAA game releases to support it. And if you’re buying this GPU to do some kind of rendering work or generative AI acceleration, Nvidia’s performance and software tools are still superior. The misleading performance claims are frustrating, but Nvidia still gains a lot of real advantages from being as dominant and entrenched as it is.
The good
Usually 30-something percent faster than an RTX 4090
Redesigned Founders Edition card is less unwieldy than the bricks that were the 4090/4080 design
Adequate cooling, despite the smaller card and higher power use
DLSS Multi-Frame Generation is an intriguing option if you’re trying to hit 240 or 360 fps on your high-refresh-rate gaming monitor
The bad
Much higher power consumption than the 4090, which already consumed more power than any other GPU on the market
Frame Generation is good at making a game that’s running fast run faster; it’s not as good at bringing a slow game up to 60 Hz
Nvidia’s misleading marketing around Multi-Frame Generation is frustrating—and will likely be more frustrating for lower-end cards since they aren’t getting the same bumps to core count and memory interface that the 5090 gets
The ugly
You can buy a whole lot of PC for $2,000, and we wouldn’t bet on this GPU being easy to find at MSRP
Nvidia leans heavily on DLSS 4 and AI-generated frames for speed comparisons.
Nvidia’s RTX 5070, one of four new desktop GPUs announced this week. Credit: Nvidia
Nvidia has good news and bad news for people building or buying gaming PCs.
The good news is that three of its four new RTX 50-series GPUs are the same price or slightly cheaper than the RTX 40-series GPUs they’re replacing. The RTX 5080 is $999, the same price as the RTX 4080 Super; the 5070 Ti and 5070 are launching for $749 and $549, each $50 less than the 4070 Ti Super and 4070 Super.
The bad news for people looking for the absolute fastest card they can get is that the company is charging $1,999 for its flagship RTX 5090 GPU, significantly more than the $1,599 MSRP of the RTX 4090. If you want Nvidia’s biggest and best, it will cost at least as much as four high-end game consoles or a pair of decently specced midrange gaming PCs.
Pricing for the first batch of Blackwell-based RTX 50-series GPUs. Credit: Nvidia
Nvidia also announced a new version of its upscaling algorithm, DLSS 4. As with DLSS 3 and the RTX 40-series, DLSS 4’s flagship feature will be exclusive to the 50-series. It’s called DLSS Multi Frame Generation, and as the name implies, it takes the Frame Generation feature from DLSS 3 and allows it to generate even more frames. It’s why Nvidia CEO Jensen Huang claimed that the $549 RTX 5070 performed like the $1,599 RTX 4090; it’s also why those claims are a bit misleading.
The rollout will begin with the RTX 5090 and 5080 on January 30. The 5070 Ti and 5070 will follow at some point in February. All cards except the 5070 Ti, which isn’t getting a Founders Edition, will come in Nvidia-designed Founders Editions as well as designs made by Nvidia’s partners.
The RTX 5090 and 5080
| | RTX 5090 | RTX 4090 | RTX 5080 | RTX 4080 Super |
| --- | --- | --- | --- | --- |
| CUDA Cores | 21,760 | 16,384 | 10,752 | 10,240 |
| Boost Clock | 2,410 MHz | 2,520 MHz | 2,617 MHz | 2,550 MHz |
| Memory Bus Width | 512-bit | 384-bit | 256-bit | 256-bit |
| Memory Bandwidth | 1,792 GB/s | 1,008 GB/s | 960 GB/s | 736 GB/s |
| Memory size | 32GB GDDR7 | 24GB GDDR6X | 16GB GDDR7 | 16GB GDDR6X |
| TGP | 575 W | 450 W | 360 W | 320 W |
The RTX 5090, based on Nvidia’s new Blackwell architecture, is a gigantic chip with 92 billion transistors in it. And while it is double the price of an RTX 5080, you also get double the GPU cores and double the RAM and nearly double the memory bandwidth. Even more than the 4090, it’s being positioned head and shoulders above the rest of the GPUs in the family, and the 5080’s performance won’t come remotely close to it.
Although $1,999 is a lot to ask for a graphics card, if Nvidia can consistently make the RTX 5090 available at $2,000, it could still be an improvement over the pricing of the 4090, which regularly sold for well over $1,599 over the course of its lifetime, due in part to pandemic-fueled GPU shortages, cryptocurrency mining, and the generative AI boom. Companies and other entities buying them as AI accelerators may restrict the availability of the 5090, too, but Nvidia’s highest GPU tier has been well out of the price range of most consumers for a while now.
Despite the higher power budget—as predicted, it’s 125 W higher than the 4090 at 450 W, and Nvidia recommends a 1,000 W power supply or better—the physical size of the 5090 Founders Edition is considerably smaller than the 4090, which was large enough that it had trouble fitting into some computer cases. Thanks to a “high-density PCB” and redesigned cooling system, the 5090 Founders Edition is a dual-slot card that ought to fit into small-form-factor systems much more easily than the 4090. Of course, this won’t stop most third-party 5090 GPUs from being gigantic triple-fan monstrosities, but it is apparently possible to make a reasonably sized version of the card.
Moving on to the 5080, it looks like more of a mild update from last year’s RTX 4080 Super, with a few hundred more CUDA cores, more memory bandwidth (thanks to the use of GDDR7, since the two GPUs share the same 256-bit interface), and a slightly higher power budget of 360 W (compared to 320 W for the 4080 Super).
Having more cores and faster memory, in addition to whatever improvements and optimizations come with the Blackwell architecture, should help the 5080 easily beat the 4080 Super. But it’s an open question as to whether it will be able to beat the 4090, at least before you consider any DLSS-related frame rate increases. The 4090 has 52 percent more GPU cores, a wider memory bus, and 8GB more memory.
5070 Ti and 5070
| | RTX 5070 Ti | RTX 4070 Ti Super | RTX 5070 | RTX 4070 Super |
| --- | --- | --- | --- | --- |
| CUDA Cores | 8,960 | 8,448 | 6,144 | 7,168 |
| Boost Clock | 2,452 MHz | 2,610 MHz | 2,512 MHz | 2,475 MHz |
| Memory Bus Width | 256-bit | 256-bit | 192-bit | 192-bit |
| Memory Bandwidth | 896 GB/s | 672 GB/s | 672 GB/s | 504 GB/s |
| Memory size | 16GB GDDR7 | 16GB GDDR6X | 12GB GDDR7 | 12GB GDDR6X |
| TGP | 300 W | 285 W | 250 W | 220 W |
At $749 and $549, the 5070 Ti and 5070 are slightly more within reach for someone who’s trying to spend less than $2,000 on a new gaming PC. Both cards hew relatively closely to the specs of the 4070 Ti Super and 4070 Super, both of which are already solid 1440p and 4K graphics cards for many titles.
Like the 5080, the 5070 Ti includes a few hundred more CUDA cores, more memory bandwidth, and slightly higher power requirements compared to the 4070 Ti Super. That the card is $50 less than the 4070 Ti Super was at launch is a nice bonus—if it can come close to or beat the RTX 4080 for $250 less, it could be an appealing high-end option.
The RTX 5070 is alone in having fewer CUDA cores than its immediate predecessor—6,144, down from 7,168. It is an upgrade from the original 4070, which had 5,888 CUDA cores, and GDDR7 and slightly faster clock speeds may still help it outrun the 4070 Super; like the other 50-series cards, it also comes with a higher power budget. But right now this card is looking like the closest thing to a lateral move in the lineup, at least before you consider the additional frame-generation capabilities of DLSS 4.
DLSS 4 and fudging the numbers
Many of Nvidia’s most ostentatious performance claims—including the one that the RTX 5070 is as fast as a 4090—factor in DLSS 4’s additional AI-generated frames. Credit: Nvidia
When launching new 40-series cards over the last two years, it was common for Nvidia to publish a couple of different performance comparisons to last-gen cards: one with DLSS turned off and one with DLSS and the 40-series-exclusive Frame Generation feature turned on. Nvidia would then lean on the DLSS-enabled numbers when making broad proclamations about a GPU’s performance, as it does in its official press release when it says the 5090 is twice as fast as the 4090, or as Huang did during his CES keynote when he claimed that an RTX 5070 offered RTX 4090 performance for $549.
DLSS Frame Generation is an AI feature that builds on what DLSS is already doing. Where DLSS uses AI to fill in gaps and make a lower-resolution image look like a higher-resolution image, DLSS Frame Generation creates entirely new frames and inserts them in between the frames that your GPU is actually rendering.
DLSS 4 now generates up to three frames for every frame the GPU is actually rendering. Used in concert with DLSS image upscaling, Nvidia says that “15 out of every 16 pixels” you see on your screen are being generated by its AI models. Credit: Nvidia
The RTX 50-series one-ups the 40-series with DLSS 4, another new revision that’s exclusive to its just-launched GPUs: DLSS Multi Frame Generation. Instead of generating one extra frame for every traditionally rendered frame, DLSS 4 generates “up to three additional frames” to slide in between the ones your graphics card is actually rendering—based on Nvidia’s slides, it looks like users ought to be able to control how many extra frames are being generated, just as they can control the quality settings for DLSS upscaling. Nvidia is leaning on the Blackwell architecture’s faster Tensor Cores, which it says are up to 2.5 times faster than the Tensor Cores in the RTX 40-series, to do the AI processing necessary to upscale rendered frames and to generate new ones.
Nvidia’s performance comparisons aren’t indefensible; with DLSS FG enabled, the cards can put out a lot of frames per second. It’s just dependent on game support (Nvidia says that 75 titles will support it at launch), and going off of our experience with the original iteration of Frame Generation, there will likely be scenarios where image quality is noticeably worse or just “off-looking” compared to actual rendered frames. DLSS FG also needed a solid base frame rate to get the best results, which may or may not be the case for Multi-FG.
Enhanced versions of older DLSS features can benefit all RTX cards, including the 20-, 30-, and 40-series. Multi-Frame Generation is restricted to the 50-series, though. Credit: Nvidia
Though the practice of restricting the biggest DLSS upgrades to all-new hardware is a bit frustrating, Nvidia did announce that it’s releasing a new transformer model for the DLSS Ray Reconstruction, Super Resolution, and Anti-Aliasing features. These are DLSS features that are available on all RTX GPUs going all the way back to the RTX 20-series, and games that are upgraded to use the newer models should benefit from improved upscaling quality even on older GPUs.
GeForce 50-series: Also for laptops!
Nvidia’s projected pricing for laptops with each of its new mobile GPUs. Credit: Nvidia
Nvidia’s laptop GPU announcements sometimes trail the desktop announcements by a few weeks or months. But the company has already announced mobile versions of the 5090, 5080, 5070 Ti, and 5070 that Nvidia says will begin shipping in laptops priced between $1,299 and $2,899 when they launch in March.
All of these GPUs share names, the Blackwell architecture, and DLSS 4 support with their desktop counterparts, but per usual they’re significantly cut down to fit on a laptop motherboard and within a laptop’s cooling capacity. The mobile version of the 5090 includes 10,496 GPU cores, less than half the number of the desktop version, and just 24GB of GDDR7 memory on a 256-bit interface instead of 32GB on a 512-bit interface. But it also can operate with a power budget between 95 and 150 W, a fraction of what the desktop 5090 needs.
| | RTX 5090 (mobile) | RTX 5080 (mobile) | RTX 5070 Ti (mobile) | RTX 5070 (mobile) |
| --- | --- | --- | --- | --- |
| CUDA Cores | 10,496 | 7,680 | 5,888 | 4,608 |
| Memory Bus Width | 256-bit | 256-bit | 192-bit | 128-bit |
| Memory size | 24GB GDDR7 | 16GB GDDR7 | 12GB GDDR7 | 8GB GDDR7 |
| TGP | 95-150 W | 80-150 W | 60-115 W | 50-100 W |
The other three GPUs are mostly cut down in similar ways, and all of them have fewer GPU cores and lower power requirements than their desktop counterparts. The 5070 GPUs both have less RAM and narrowed memory buses, too, but the mobile RTX 5080 at least comes closer to its desktop iteration, with the same 256-bit bus width and 16GB of RAM.
Nvidia is reportedly gearing up to launch the first few cards in its RTX 50-series at CES next week, including an RTX 5090, RTX 5080, RTX 5070 Ti, and RTX 5070. The 5090 will be of particular interest to performance-obsessed, money-is-no-object PC gaming fanatics since it’s the first new GPU in over two years that can beat the performance of 2022’s RTX 4090.
But boosted performance and slower advancements in chip manufacturing technology mean that the 5090’s maximum power draw will far outstrip the 4090’s, according to leakers. VideoCardz reports that the 5090’s thermal design power (TDP) will be set at 575 W, up from 450 W for the already power-hungry RTX 4090. The RTX 5080’s TDP is also increasing to 360 W, up from 320 W for the RTX 4080 Super.
That also puts the RTX 5090 close to the maximum power draw available over a single 12VHPWR connector, which is capable of delivering up to 600 W of power (though once you include the 75 W available via the PCI Express slot on your motherboard, the actual maximum possible power draw for a GPU with a single 12VHPWR connector is a slightly higher 675 W).
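The arithmetic on total available power, for reference:

```python
# Maximum possible power delivery for a GPU with one 12VHPWR connector.
CONNECTOR_W = 600   # 12VHPWR cable limit
PCIE_SLOT_W = 75    # additional power available through the PCIe slot

print(CONNECTOR_W + PCIE_SLOT_W)  # 675 W -- a 575 W TDP leaves ~100 W of headroom
```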
Higher peak power consumption doesn’t necessarily mean that these cards will always draw more power during actual gaming than their 40-series counterparts. And their performance could be good enough that they could still be very efficient cards in terms of performance per watt.
But if you’re considering an upgrade to an RTX 5090 and these power specs are accurate, you may need to consider an upgraded power supply along with your new graphics card. Nvidia recommends at least an 850 W power supply for the RTX 4090 to accommodate what the GPU needs while leaving enough power left over for the rest of the system. An additional 125 W bump suggests that Nvidia will recommend a 1,000 W power supply as the minimum for the 5090.
We’ll probably know more about Nvidia’s next-gen cards after its CES keynote, currently scheduled for 9:30 pm Eastern/6:30 pm Pacific on Monday, January 6.
Rumors have suggested that Nvidia will be taking the wraps off of some next-generation RTX 50-series graphics cards at CES in January. And as we get closer to that date, Nvidia’s partners and some of the PC makers have begun to inadvertently leak details of the cards.
According to recent leaks from both Zotac and Acer, it looks like Nvidia is planning to announce four new GPUs next month, all at the high end of its lineup: The RTX 5090, RTX 5080, RTX 5070 Ti, and RTX 5070 were all briefly listed on Zotac’s website, as spotted by VideoCardz. There’s also an RTX 5090D variant for the Chinese market, which will presumably have its specs tweaked to conform with current US export restrictions on high-performance GPUs.
Though the website leak didn’t confirm many specs, it did list the RTX 5090 as including 32GB of GDDR7, an upgrade from the 4090’s 24GB of GDDR6X. An Acer spec sheet for new Predator Orion desktops also lists 32GB of GDDR7 for the 5090, as well as 16GB of GDDR7 for the RTX 5080. This is the same amount of RAM included with the RTX 4080 and 4080 Super.
The 5090 will be a big deal when it launches because no graphics card released since October 2022 has come close to beating the 4090’s performance. Nvidia’s early 2024 Super refresh for some 40-series cards didn’t include a 4090 Super, and AMD’s flagship RX 7900 XTX card is more comfortable competing with the likes of the 4080 and 4080 Super. The 5090 isn’t a card that most people are going to buy, but for the performance-obsessed, it’s the first high-end performance upgrade the GPU market has seen in more than two years.
Nvidia’s CEO Jensen Huang delivers his keynote speech ahead of Computex 2024 in Taipei on June 2, 2024.
On Sunday, Nvidia CEO Jensen Huang reached beyond Blackwell and revealed the company’s next-generation AI-accelerating GPU platform during his keynote at Computex 2024 in Taiwan. Huang also detailed plans for an annual tick-tock-style upgrade cycle of its AI acceleration platforms, mentioning an upcoming Blackwell Ultra chip slated for 2025 and a subsequent platform called “Rubin” set for 2026.
Nvidia’s data center GPUs currently power a large majority of cloud-based AI models, such as ChatGPT, in both development (training) and deployment (inference) phases, and investors are keeping a close watch on the company, with expectations to keep that run going.
During the keynote, Huang seemed somewhat hesitant to make the Rubin announcement, perhaps wary of invoking the so-called Osborne effect, whereby a company’s premature announcement of the next iteration of a tech product eats into the current iteration’s sales. “This is the very first time that this next click has been made,” Huang said, holding up his presentation remote just before the Rubin announcement. “And I’m not sure yet whether I’m going to regret this or not.”
The Rubin AI platform, expected in 2026, will use HBM4 (a new form of high-bandwidth memory) and NVLink 6 Switch, operating at 3,600 GB/s. Following that launch, Nvidia will release a tick-tock iteration called “Rubin Ultra.” While Huang did not provide extensive specifications for the upcoming products, he promised cost and energy savings related to the new chipsets.
During the keynote, Huang also introduced a new ARM-based CPU called “Vera,” which will be featured on a new accelerator board called “Vera Rubin,” alongside one of the Rubin GPUs.
Much like Nvidia’s Grace Hopper architecture, which combines a “Grace” CPU and a “Hopper” GPU to pay tribute to the pioneering computer scientist of the same name, Vera Rubin refers to Vera Florence Cooper Rubin (1928–2016), an American astronomer who made discoveries in the field of deep space astronomy. She is best known for her pioneering work on galaxy rotation rates, which provided strong evidence for the existence of dark matter.
A calculated risk
Nvidia CEO Jensen Huang reveals the “Rubin” AI platform for the first time during his keynote at Computex 2024 on June 2, 2024.
Nvidia’s reveal of Rubin is not a surprise in the sense that most big tech companies are continuously working on follow-up products well in advance of release, but it’s notable because it comes just three months after the company revealed Blackwell, which is barely out of the gate and not yet widely shipping.
At the moment, the company seems to be comfortable leapfrogging itself with new announcements and catching up later; Nvidia just announced that its GH200 Grace Hopper “Superchip,” unveiled one year ago at Computex 2023, is now in full production.
With Nvidia stock rising and the company possessing an estimated 70–95 percent of the data center GPU market share, the Rubin reveal is a calculated risk that seems to come from a place of confidence. That confidence could turn out to be misplaced if a so-called “AI bubble” pops or if Nvidia misjudges the capabilities of its competitors. The announcement may also stem from pressure to continue Nvidia’s astronomical growth in market cap with nonstop promises of improving technology.
Accordingly, Huang has been eager to showcase the company’s plans to continue pushing silicon fabrication tech to its limits and widely broadcast that Nvidia plans to keep releasing new AI chips at a steady cadence.
“Our company has a one-year rhythm. Our basic philosophy is very simple: build the entire data center scale, disaggregate and sell to you parts on a one-year rhythm, and we push everything to technology limits,” Huang said during Sunday’s Computex keynote.
Despite Nvidia’s recent market performance, the company’s run may not continue indefinitely. With ample money pouring into the data center AI space, Nvidia isn’t alone in developing accelerator chips. Competitors like AMD (with the Instinct series) and Intel (with Gaudi 3) also want to win a slice of the data center GPU market away from Nvidia’s current command of the AI-accelerator space. And OpenAI’s Sam Altman is trying to encourage diversified production of GPU hardware that will power the company’s next generation of AI models in the years ahead.
An Intel handout photo of the Gaudi 3 AI accelerator.
On Tuesday, Intel revealed a new AI accelerator chip called Gaudi 3 at its Vision 2024 event in Phoenix. With strong claimed performance while running large language models (like those that power ChatGPT), the company has positioned Gaudi 3 as an alternative to Nvidia’s H100, a popular data center GPU that has been subject to shortages, though apparently that is easing somewhat.
Compared to Nvidia’s H100 chip, Intel projects a 50 percent faster training time on Gaudi 3 for both OpenAI’s GPT-3 175B LLM and the 7-billion parameter version of Meta’s Llama 2. In terms of inference (running the trained model to get outputs), Intel claims that its new AI chip delivers 50 percent faster performance than H100 for Llama 2 and Falcon 180B, which are both relatively popular open-weights models.
Intel is targeting the H100 because of its high market share, but the chip isn’t Nvidia’s most powerful AI accelerator chip in the pipeline. Announcements of the H200 and the Blackwell B200 have since surpassed the H100 on paper, but neither of those chips is out yet (the H200 is expected in the second quarter of 2024—basically any day now).
Meanwhile, the aforementioned H100 supply issues have been a major headache for tech companies and AI researchers who have to fight for access to any chips that can train AI models. This has led several tech companies like Microsoft, Meta, and OpenAI (rumor has it) to seek their own AI-accelerator chip designs, although that custom silicon is typically manufactured by either Intel or TSMC. Google has its own line of tensor processing units (TPUs) that it has been using internally since 2015.
Given those issues, Intel’s Gaudi 3 may be a potentially attractive alternative to the H100 if Intel can hit an ideal price (which Intel has not provided, but an H100 reportedly costs around $30,000–$40,000) and maintain adequate production. AMD also manufactures a competitive range of AI chips, such as the AMD Instinct MI300 Series, that sell for around $10,000–$15,000.
Gaudi 3 performance
An Intel handout featuring specifications of the Gaudi 3 AI accelerator.
Intel says the new chip builds upon the architecture of its predecessor, Gaudi 2, by featuring two identical silicon dies connected by a high-bandwidth connection. Each die contains a central cache memory of 48 megabytes, surrounded by four matrix multiplication engines and 32 programmable tensor processor cores, bringing the total cores to 64.
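Tallying those per-die resources across both dies gives the totals Intel quotes:

```python
# Gaudi 3 compute resources, summed across its two identical dies.
DIES = 2
TENSOR_CORES_PER_DIE = 32   # programmable tensor processor cores
MATRIX_ENGINES_PER_DIE = 4  # matrix multiplication engines
CACHE_PER_DIE_MB = 48       # central cache memory

print(DIES * TENSOR_CORES_PER_DIE)    # 64 tensor cores total
print(DIES * MATRIX_ENGINES_PER_DIE)  # 8 matrix engines total
print(DIES * CACHE_PER_DIE_MB)        # 96 MB of cache total
```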
The chipmaking giant claims that Gaudi 3 delivers double the AI compute performance of Gaudi 2 using 8-bit floating-point infrastructure, which has become crucial for training transformer models. The chip also offers a fourfold boost for computations using the BFloat16 number format. Gaudi 3 also features 128GB of the less expensive HBM2e memory (which may contribute to price competitiveness) and 3.7 TB/s of memory bandwidth.
Since data centers are well-known to be power-hungry, Intel emphasizes the power efficiency of Gaudi 3, claiming 40 percent greater inference power efficiency for the Llama 2 7B and 70B models and the Falcon 180B model compared to Nvidia’s H100. Eitan Medina, chief operating officer of Intel’s Habana Labs, attributes this advantage to Gaudi’s large-matrix math engines, which he claims require significantly less memory bandwidth than other architectures.
Gaudi vs. Blackwell
An Intel handout photo of the Gaudi 3 AI accelerator.
Last month, we covered the splashy launch of Nvidia’s Blackwell architecture, including the B200 GPU, which Nvidia claims will be the world’s most powerful AI chip. It seems natural, then, to compare what we know about Nvidia’s highest-performing AI chip to the best of what Intel can currently produce.
For starters, Gaudi 3 is being manufactured using TSMC’s N5 process technology, according to IEEE Spectrum, narrowing the gap between Intel and Nvidia in terms of semiconductor fabrication technology. The upcoming Nvidia Blackwell chip will use a custom N4P process, which reportedly offers modest performance and efficiency improvements over N5.
Gaudi 3’s use of HBM2e memory (as we mentioned above) is notable compared to the more expensive HBM3 or HBM3e used in competing chips, offering a balance of performance and cost-efficiency. This choice seems to emphasize Intel’s strategy to compete not only on performance but also on price.
As far as raw performance comparisons between Gaudi 3 and the B200, that can’t be known until the chips have been released and benchmarked by a third party.
As the race to power the tech industry’s thirst for AI computation heats up, IEEE Spectrum notes that the next generation of Intel’s Gaudi chip, code-named Falcon Shores, remains a point of interest. It also remains to be seen whether Intel will continue to rely on TSMC’s technology or leverage its own foundry business and upcoming nanosheet transistor technology to gain a competitive edge in the AI accelerator market.