GPU


Nvidia jumps ahead of itself and reveals next-gen “Rubin” AI chips in keynote tease

Swing beat —

“I’m not sure yet whether I’m going to regret this,” says CEO Jensen Huang at Computex 2024.

Nvidia CEO Jensen Huang delivers his keynote speech ahead of Computex 2024 in Taipei on June 2, 2024.

On Sunday, Nvidia CEO Jensen Huang reached beyond Blackwell and revealed the company’s next-generation AI-accelerating GPU platform during his keynote at Computex 2024 in Taiwan. Huang also detailed plans for an annual tick-tock-style upgrade cycle of its AI acceleration platforms, mentioning an upcoming Blackwell Ultra chip slated for 2025 and a subsequent platform called “Rubin” set for 2026.

Nvidia’s data center GPUs currently power a large majority of cloud-based AI models, such as ChatGPT, in both development (training) and deployment (inference) phases, and investors are keeping a close watch on the company, with expectations to keep that run going.

During the keynote, Huang seemed somewhat hesitant to make the Rubin announcement, perhaps wary of invoking the so-called Osborne effect, whereby a company’s premature announcement of the next iteration of a tech product eats into the current iteration’s sales. “This is the very first time that this next click has been made,” Huang said, holding up his presentation remote just before the Rubin announcement. “And I’m not sure yet whether I’m going to regret this or not.”


The Rubin AI platform, expected in 2026, will use HBM4 (a new form of high-bandwidth memory) and NVLink 6 Switch, operating at 3,600 GB/s. Following that launch, Nvidia will release a tick-tock iteration called “Rubin Ultra.” While Huang did not provide extensive specifications for the upcoming products, he promised cost and energy savings related to the new chipsets.

During the keynote, Huang also introduced a new ARM-based CPU called “Vera,” which will be featured on a new accelerator board called “Vera Rubin,” alongside one of the Rubin GPUs.

Much like Nvidia’s Grace Hopper architecture, which combines a “Grace” CPU and a “Hopper” GPU to pay tribute to the pioneering computer scientist of the same name, Vera Rubin refers to Vera Florence Cooper Rubin (1928–2016), an American astronomer who made discoveries in the field of deep space astronomy. She is best known for her pioneering work on galaxy rotation rates, which provided strong evidence for the existence of dark matter.

A calculated risk

Nvidia CEO Jensen Huang reveals the “Rubin” AI platform for the first time during his keynote at Computex 2024 on June 2, 2024.

Nvidia’s reveal of Rubin is not a surprise in the sense that most big tech companies are continuously working on follow-up products well in advance of release, but it’s notable because it comes just three months after the company revealed Blackwell, which is barely out of the gate and not yet widely shipping.

At the moment, the company seems to be comfortable leapfrogging itself with new announcements and catching up later; Nvidia just announced that its GH200 Grace Hopper “Superchip,” unveiled one year ago at Computex 2023, is now in full production.

With Nvidia stock rising and the company possessing an estimated 70–95 percent of the data center GPU market share, the Rubin reveal is a calculated risk that seems to come from a place of confidence. That confidence could turn out to be misplaced if a so-called “AI bubble” pops or if Nvidia misjudges the capabilities of its competitors. The announcement may also stem from pressure to continue Nvidia’s astronomical growth in market cap with nonstop promises of improving technology.

Accordingly, Huang has been eager to showcase the company’s plans to continue pushing silicon fabrication tech to its limits and widely broadcast that Nvidia plans to keep releasing new AI chips at a steady cadence.

“Our company has a one-year rhythm. Our basic philosophy is very simple: build the entire data center scale, disaggregate and sell to you parts on a one-year rhythm, and we push everything to technology limits,” Huang said during Sunday’s Computex keynote.

Despite Nvidia’s recent market performance, the company’s run may not continue indefinitely. With ample money pouring into the data center AI space, Nvidia isn’t alone in developing accelerator chips. Competitors like AMD (with the Instinct series) and Intel (with Gaudi 3) also want to carve a slice of the data center GPU market away from Nvidia’s current command of the AI-accelerator space. And OpenAI’s Sam Altman is trying to encourage diversified production of GPU hardware that will power the company’s next generation of AI models in the years ahead.


AMD promises big upscaling improvements and a future-proof API in FSR 3.1

upscale upscaling —

API should help more games get future FSR improvements without a game update.


Last summer, AMD debuted the latest version of its FidelityFX Super Resolution (FSR) upscaling technology. While version 2.x focused mostly on making lower-resolution images look better at higher resolutions, version 3.0 focused on AMD’s “Fluid Motion Frames,” which attempt to boost FPS by generating interpolated frames to insert between the ones that your GPU is actually rendering.

Today, the company is announcing FSR 3.1, which among other improvements decouples the upscaling improvements in FSR 3.x from the Fluid Motion Frames feature. FSR 3.1 will be available “later this year” in games whose developers choose to implement it.

Fluid Motion Frames and Nvidia’s equivalent DLSS Frame Generation usually work best when a game is already running at a high frame rate, and even then can be more prone to mistakes and odd visual artifacts than regular FSR or DLSS upscaling. FSR 3.0 was an all-or-nothing proposition, but version 3.1 should let you pick and choose what features you want to enable.

It also means you can use FSR 3.0 frame generation with other upscalers like DLSS, especially useful for 20- and 30-series Nvidia GeForce GPUs that support DLSS upscaling but not DLSS Frame Generation.

“When using FSR 3 Frame Generation with any upscaling quality mode OR with the new ‘Native AA’ mode, it is highly recommended to be always running at a minimum of ~60 FPS before Frame Generation is applied for an optimal high-quality gaming experience and to mitigate any latency introduced by the technology,” wrote AMD’s Alexander Blake-Davies in the post announcing FSR 3.1.

Generally, FSR’s upscaling image quality falls a little short of Nvidia’s DLSS, but FSR 2 closed that gap a bit, and FSR 3.1 goes further. AMD highlights two specific improvements: one for “temporal stability,” which will help reduce the flickering and shimmering effect that FSR sometimes introduces, and one for ghosting reduction, which will reduce unintentional blurring effects for fast-moving objects.

The biggest issue with these new FSR improvements is that they need to be implemented on a game-by-game basis. FSR 3.0 was announced in August 2023, and AMD now trumpets that there are 40 “available and upcoming” games that support the technology, of which just 19 are currently available. There are a lot of big-name AAA titles in the list, but that’s still not many compared to the sum total of all PC games or even the 183 titles that currently support FSR 2.x.

AMD wants to help solve this problem in FSR 3.1 by introducing a stable FSR API for developers, which AMD says “makes it easier for developers to debug and allows forward compatibility with updated versions of FSR.” This may eventually lead to more games getting future FSR improvements for “free,” without the developer’s effort.

AMD didn’t mention any hardware requirements for FSR 3.1, though presumably, the company will still support a reasonably wide range of recent GPUs from AMD, Nvidia, and Intel. FSR 3.0 is formally supported on Radeon RX 5000, 6000, and 7000 cards, Nvidia’s RTX 20-series and newer, and Intel Arc GPUs. FSR 3.1 will also bring FSR 3.x features to games that use the Vulkan API, not just DirectX 12, and to the Xbox Game Development Kit (GDK), so the technology can be used in console titles as well as PC games.


Nvidia unveils Blackwell B200, the “world’s most powerful chip” designed for AI

There’s no knowing where we’re rowing —

208B transistor chip can reportedly reduce AI cost and energy consumption by up to 25x.

The GB200 “superchip” covered with a fanciful blue explosion.

Nvidia / Benj Edwards

On Monday, Nvidia unveiled the Blackwell B200 tensor core chip—the company’s most powerful single-chip GPU, with 208 billion transistors—which Nvidia claims can reduce AI inference operating costs (such as running ChatGPT) and energy consumption by up to 25 times compared to the H100. The company also unveiled the GB200, a “superchip” that combines two B200 chips and a Grace CPU for even more performance.

The news came as part of Nvidia’s annual GTC conference, which is taking place this week at the San Jose Convention Center. Nvidia CEO Jensen Huang delivered the keynote Monday afternoon. “We need bigger GPUs,” Huang said during his keynote. The Blackwell platform will allow the training of trillion-parameter AI models that will make today’s generative AI models look rudimentary in comparison, he said. For reference, OpenAI’s GPT-3, launched in 2020, included 175 billion parameters. Parameter count is a rough indicator of AI model complexity.

Nvidia named the Blackwell architecture after David Harold Blackwell, a mathematician who specialized in game theory and statistics and was the first Black scholar inducted into the National Academy of Sciences. The platform introduces six technologies for accelerated computing, including a second-generation Transformer Engine, fifth-generation NVLink, RAS Engine, secure AI capabilities, and a decompression engine for accelerated database queries.

Press photo of the Grace Blackwell GB200 chip, which combines two B200 GPUs with a Grace CPU into one chip.

Several major organizations, such as Amazon Web Services, Dell Technologies, Google, Meta, Microsoft, OpenAI, Oracle, Tesla, and xAI, are expected to adopt the Blackwell platform, and Nvidia’s press release is replete with canned quotes from tech CEOs (key Nvidia customers) like Mark Zuckerberg and Sam Altman praising the platform.

GPUs, once only designed for gaming acceleration, are especially well suited for AI tasks because their massively parallel architecture accelerates the immense number of matrix multiplication tasks necessary to run today’s neural networks. With the dawn of new deep learning architectures in the 2010s, Nvidia found itself in an ideal position to capitalize on the AI revolution and began designing specialized GPUs just for the task of accelerating AI models.
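To make that parallelism concrete, here is a minimal, hedged sketch in OpenCL C of the kind of kernel that underlies neural-network math: a naive matrix multiply in which each work-item computes one output element independently, so the GPU can spread thousands of them across its compute units at once. The kernel name, buffer names, and square-matrix assumption are illustrative, not any vendor’s actual implementation.

```c
// Illustrative OpenCL C kernel (not vendor code): C = A * B for square N x N
// matrices. Every work-item owns exactly one output element, which is why a
// GPU's thousands of parallel threads map so naturally onto this workload.
__kernel void matmul_naive(__global const float *A,
                           __global const float *B,
                           __global float *C,
                           const int N)
{
    int row = get_global_id(1);   // output row handled by this work-item
    int col = get_global_id(0);   // output column handled by this work-item
    if (row >= N || col >= N)
        return;

    float acc = 0.0f;
    for (int k = 0; k < N; k++)
        acc += A[row * N + k] * B[k * N + col];   // one multiply-accumulate chain

    C[row * N + col] = acc;
}
```

Production AI libraries use heavily tiled, tensor-core-accelerated versions of this idea, but the basic shape of the work, independent per-element arithmetic fanned out across thousands of threads, is why GPUs outrun CPUs on these models.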

Nvidia’s data center focus has made the company wildly rich and valuable, and these new chips continue the trend. Nvidia’s gaming GPU revenue ($2.9 billion in the last quarter) is dwarfed by its data center revenue ($18.4 billion), and that trend shows no signs of stopping.

A beast within a beast

Press photo of the Nvidia GB200 NVL72 data center computer system.

The aforementioned Grace Blackwell GB200 chip arrives as a key part of the new NVIDIA GB200 NVL72, a multi-node, liquid-cooled data center computer system designed specifically for AI training and inference tasks. It combines 36 GB200s (that’s 72 B200 GPUs and 36 Grace CPUs total), interconnected by fifth-generation NVLink, which links chips together to multiply performance.

A specification chart for the Nvidia GB200 NVL72 system.

“The GB200 NVL72 provides up to a 30x performance increase compared to the same number of NVIDIA H100 Tensor Core GPUs for LLM inference workloads and reduces cost and energy consumption by up to 25x,” Nvidia said.

That kind of speed-up could potentially save money and time while running today’s AI models, but it will also allow for more complex AI models to be built. Generative AI models—like the kind that power Google Gemini and AI image generators—are famously computationally hungry. Shortages of compute power have widely been cited as holding back progress and research in the AI field, and the search for more compute has led to figures like OpenAI CEO Sam Altman trying to broker deals to create new chip foundries.

While Nvidia’s claims about the Blackwell platform’s capabilities are significant, it’s worth noting that real-world performance and adoption remain to be seen as organizations begin to implement and deploy the platform themselves. Competitors like Intel and AMD are also looking to grab a piece of Nvidia’s AI pie.

Nvidia says that Blackwell-based products will be available from various partners starting later this year.


Review: AMD Radeon RX 7900 GRE GPU doesn’t quite earn its “7900” label

rabbit season —

New $549 graphics card is the more logical successor to the RX 6800 XT.

ASRock’s take on AMD’s Radeon RX 7900 GRE.

Andrew Cunningham

In July 2023, AMD released a new GPU called the “Radeon RX 7900 GRE” in China. GRE stands for “Golden Rabbit Edition,” a reference to the Chinese zodiac, and while the card was available outside of China in a handful of pre-built OEM systems, AMD didn’t make it widely available at retail.

That changes today—AMD is launching the RX 7900 GRE at US retail for a suggested starting price of $549. This throws it right into the middle of the busy upper-mid-range graphics card market, where it will compete with Nvidia’s $549 RTX 4070 and the $599 RTX 4070 Super, as well as AMD’s own $500 Radeon RX 7800 XT.

We’ve run our typical set of GPU tests on the 7900 GRE to see how it stacks up to the cards AMD and Nvidia are already offering. Is it worth buying a new card relatively late in this GPU generation, when rumors point to new next-gen GPUs from Nvidia, AMD, and Intel before the end of the year? Can the “Golden Rabbit Edition” still offer a good value, even though it’s currently the year of the dragon?

Meet the 7900 GRE

|  | RX 7900 XT | RX 7900 GRE | RX 7800 XT | RX 6800 XT | RX 6800 | RX 7700 XT | RX 6700 XT | RX 6750 XT |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Compute units (Stream processors) | 84 (5,376) | 80 (5,120) | 60 (3,840) | 72 (4,608) | 60 (3,840) | 54 (3,456) | 40 (2,560) | 40 (2,560) |
| Boost Clock | 2,400 MHz | 2,245 MHz | 2,430 MHz | 2,250 MHz | 2,105 MHz | 2,544 MHz | 2,581 MHz | 2,600 MHz |
| Memory Bus Width | 320-bit | 256-bit | 256-bit | 256-bit | 256-bit | 192-bit | 192-bit | 192-bit |
| Memory Clock | 2,500 MHz | 2,250 MHz | 2,438 MHz | 2,000 MHz | 2,000 MHz | 2,250 MHz | 2,000 MHz | 2,250 MHz |
| Memory size | 20GB GDDR6 | 16GB GDDR6 | 16GB GDDR6 | 16GB GDDR6 | 16GB GDDR6 | 12GB GDDR6 | 12GB GDDR6 | 12GB GDDR6 |
| Total board power (TBP) | 315 W | 260 W | 263 W | 300 W | 250 W | 245 W | 230 W | 250 W |

The 7900 GRE slots into AMD’s existing lineup above the RX 7800 XT (currently $500-ish) and below the RX 7900 XT (around $750). Technologically, we’re looking at the same Navi 31 GPU silicon as the 7900 XT and XTX, but with just 80 of the compute units enabled, down from 84 and 96, respectively. The normal benefits of the RDNA3 graphics architecture apply, including hardware-accelerated AV1 video encoding and DisplayPort 2.1 support.

The 7900 GRE also includes four active memory controller die (MCD) chiplets, giving it a narrower 256-bit memory bus and 16GB of memory instead of 20GB—still plenty for modern games, though possibly not quite as future-proof as the 7900 XT. The card uses significantly less power than the 7900 XT and about the same amount as the 7800 XT. That feels a bit weird, intuitively, since slower cards almost always consume less power than faster ones. But it does make some sense; pushing the 7800 XT’s smaller Navi 32 GPU to get higher clock speeds out of it is probably making it run a bit less efficiently than a larger Navi 31 GPU die that isn’t being pushed as hard.


When we reviewed the 7800 XT last year, we noted that its hardware configuration and performance made it seem more like a successor to the (non-XT) Radeon RX 6800, while it just barely managed to match or beat the 6800 XT in our tests. Same deal with the 7900 GRE, which is a more logical successor to the 6800 XT. Bear that in mind when doing generation-over-generation comparisons.


Ryzen 8000G review: An integrated GPU that can beat a graphics card, for a price

The most interesting thing about AMD’s Ryzen 7 8700G CPU is the Radeon 780M GPU that’s attached to it.

Andrew Cunningham

Put me on the short list of people who can get excited about the humble, much-derided integrated GPU.

Yes, most of them are afterthoughts, designed for office desktops and laptops that will spend most of their lives rendering 2D images to a single monitor. But when integrated graphics push forward, it can open up possibilities for people who want to play games but can only afford a cheap desktop (or who have to make do with whatever their parents will pay for, which was the big limiter on my PC gaming experience as a kid).

That, plus an unrelated but accordant interest in building small mini-ITX-based desktops, has kept me interested in AMD’s G-series Ryzen desktop chips (which it sometimes calls “APUs,” to distinguish them from the Ryzen CPUs). And the Ryzen 8000G chips are a big upgrade from the 5000G series that immediately preceded them (this makes sense, because as we all know the number 8 immediately follows the number 5).

We’re jumping up an entire processor socket, one CPU architecture, three GPU architectures, and up to a new generation of much faster memory; especially for graphics, it’s a pretty dramatic leap. It’s an integrated GPU that can credibly beat the lowest tier of currently available graphics cards, replacing a $100–$200 part with something a lot more energy-efficient.

As with so many current-gen Ryzen chips, still-elevated pricing for the socket AM5 platform and the DDR5 memory it requires limits the 8000G series’ appeal, at least for now.

From laptop to desktop

AMD’s first Ryzen 8000 desktop processors are what the company used to call “APUs,” a combination of a fast integrated GPU and a reasonably capable CPU.

AMD

The 8000G chips use the same Zen 4 CPU architecture as the Ryzen 7000 desktop chips, but the way the rest of the chip is put together is pretty different. Like past APUs, these are actually laptop silicon (in this case, the Ryzen 7040/8040 series, codenamed Phoenix and Phoenix 2) repackaged for a desktop processor socket.

Generally, the real-world impact of this is pretty mild; in most ways, the 8700G and 8600G will perform a lot like any other Zen 4 CPU with the same number of cores (our benchmarks mostly bear this out). But to the extent that there is a difference, the Phoenix silicon will consistently perform just a little worse, because it has half as much L3 cache. AMD’s Ryzen X3D chips revolve around the performance benefits of tons of cache, so you can see why having less would be detrimental.

The other missing feature from the Ryzen 7000 desktop chips is PCI Express 5.0 support—Ryzen 8000G tops out at PCIe 4.0. This might, maybe, one day in the distant future, eventually lead to some kind of user-observable performance difference. Some recent GPUs use an 8-lane PCIe 4.0 interface instead of the typical 16 lanes, which limits performance slightly. But PCIe 5.0 SSDs remain rare (and PCIe 4.0 peripherals remain extremely fast), so it probably shouldn’t top your list of concerns.

The Ryzen 5 8500G is a lot different from the 8700G and 8600G, since some of the CPU cores in the Phoenix 2 chips are based on Zen 4c rather than Zen 4. These cores have all the same capabilities as regular Zen 4 ones—unlike Intel’s E-cores—but they’re optimized to take up less space rather than hit high clock speeds. They were initially made for servers, where cramming lots of cores into a small amount of space is more important than having a smaller number of faster cores, but AMD is also using them to make some of its low-end consumer chips physically smaller and presumably cheaper to produce. AMD didn’t send us a Ryzen 8500G for review, so we can’t see exactly how Phoenix 2 stacks up in a desktop.

The 8700G and 8600G chips are also the only ones that come with AMD’s “Ryzen AI” feature, the brand AMD is using to refer to processors with a neural processing unit (NPU) included. Sort of like GPUs or video encoding/decoding blocks, these are additional bits built into the chip that handle things that CPUs can’t do very efficiently—in this case, machine learning and AI workloads.

Most PCs still don’t have NPUs, and as such they are only barely used in current versions of Windows (Windows 11 offers some webcam effects that will take advantage of NPU acceleration, but for now that’s mostly it). But expect this to change as NPUs become more common and as more AI-accelerated text, image, and video creation and editing capabilities are built into modern operating systems.

The last major difference is the GPU. Ryzen 7000 includes a pair of RDNA2 compute units that perform more or less like Intel’s desktop integrated graphics: good enough to render your desktop on a monitor or two, but not much else. The Ryzen 8000G chips include up to 12 RDNA3 CUs, which—as we’ve already seen in laptops and portable gaming systems like the Asus ROG Ally that use the same silicon—is enough to run most games, if just barely in some cases.

That gives AMD’s desktop APUs a unique niche. You can use them in cases where you can’t afford a dedicated GPU—for a time during the big graphics card shortage in 2020 and 2021, a Ryzen 5700G was actually one of the only ways to build a budget gaming PC. Or you can use them in cases where a dedicated GPU won’t fit, like super-small mini ITX-based desktops.

The main argument that AMD makes is the affordability one, comparing the price of a Ryzen 8700G to the price of an Intel Core i5-13400F and a GeForce GTX 1650 GPU (this card is nearly five years old, but it remains Nvidia’s newest and best GPU available for less than $200).

Let’s check on performance first, and then we’ll revisit pricing.


Just 10 lines of code can steal AI secrets from Apple, AMD, and Qualcomm GPUs

massive leakage —

Patching all affected devices, which include some Macs and iPhones, may be tough.


As more companies ramp up development of artificial intelligence systems, they are increasingly turning to graphics processing unit (GPU) chips for the computing power they need to run large language models (LLMs) and to crunch data quickly at massive scale. Between video game processing and AI, demand for GPUs has never been higher, and chipmakers are rushing to bolster supply. In new findings released today, though, researchers are highlighting a vulnerability in multiple brands and models of mainstream GPUs—including Apple, Qualcomm, and AMD chips—that could allow an attacker to steal large quantities of data from a GPU’s memory.

The silicon industry has spent years refining the security of central processing units, or CPUs, so they don’t leak data in memory even when they are built to optimize for speed. However, since GPUs were designed for raw graphics processing power, they haven’t been architected to the same degree with data privacy as a priority. As generative AI and other machine learning applications expand the uses of these chips, though, researchers from New York-based security firm Trail of Bits say that vulnerabilities in GPUs are an increasingly urgent concern.

“There is a broader security concern about these GPUs not being as secure as they should be and leaking a significant amount of data,” Heidy Khlaaf, Trail of Bits’ engineering director for AI and machine learning assurance, tells WIRED. “We’re looking at anywhere from 5 megabytes to 180 megabytes. In the CPU world, even a bit is too much to reveal.”

To exploit the vulnerability, which the researchers call LeftoverLocals, attackers would need to already have established some amount of operating system access on a target’s device. Modern computers and servers are specifically designed to silo data so multiple users can share the same processing resources without being able to access each other’s data. But a LeftoverLocals attack breaks down these walls. Exploiting the vulnerability would allow a hacker to exfiltrate data they shouldn’t be able to access from the local memory of vulnerable GPUs, exposing whatever data happens to be there for the taking, which could include queries and responses generated by LLMs as well as the weights driving the response.

In their proof of concept, as seen in the GIF below, the researchers demonstrate an attack where a target—shown on the left—asks the open source LLM Llama.cpp to provide details about WIRED magazine. Within seconds, the attacker’s device—shown on the right—collects the majority of the response provided by the LLM by carrying out a LeftoverLocals attack on vulnerable GPU memory. The attack program the researchers created uses less than 10 lines of code.

An attacker (right) exploits the LeftoverLocals vulnerability to listen to LLM conversations.
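The researchers’ proof of concept is their own code; the snippet below is only a hedged sketch of the general idea, written in OpenCL C with made-up names, to show why such an attack can fit in so few lines. A “listener” kernel declares workgroup-local memory, never initializes it, and copies whatever stale contents it finds into a buffer the attacker can read back from the host.

```c
// Conceptual sketch of a LeftoverLocals-style "listener" kernel (not the
// researchers' code). On a vulnerable GPU, local memory is not zeroed between
// kernel launches, so scratch[] may still hold another process's leftover
// data, such as LLM activations, rather than zeros.
#define SCRATCH_WORDS 4096                    // 16 KB of local memory

__kernel void listener(__global float *dump)
{
    __local float scratch[SCRATCH_WORDS];     // deliberately left uninitialized

    uint lid   = get_local_id(0);
    uint group = get_group_id(0);

    // Each work-item copies a stride of the stale local memory out to a
    // globally visible buffer that the attacking process can map and inspect.
    for (uint i = lid; i < SCRATCH_WORDS; i += get_local_size(0)) {
        dump[group * SCRATCH_WORDS + i] = scratch[i];
    }
}
```

This only works for an attacker who can already run code on the target machine (or a shared GPU server), which is why the researchers stress that some amount of operating system access is a prerequisite.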

Last summer, the researchers tested 11 chips from seven GPU makers and multiple corresponding programming frameworks. They found the LeftoverLocals vulnerability in GPUs from Apple, AMD, and Qualcomm and launched a far-reaching coordinated disclosure of the vulnerability in September in collaboration with the US-CERT Coordination Center and the Khronos Group, a standards body focused on 3D graphics, machine learning, and virtual and augmented reality.

The researchers did not find evidence that Nvidia, Intel, or Arm GPUs contain the LeftoverLocals vulnerability, but Apple, Qualcomm, and AMD all confirmed to WIRED that they are impacted. This means that well-known chips like the AMD Radeon RX 7900 XT and devices like Apple’s iPhone 12 Pro and M2 MacBook Air are vulnerable. The researchers did not find the flaw in the Imagination GPUs they tested, but others may be vulnerable.


They’re not cheap, but Nvidia’s new Super GPUs are a step in the right direction

supersize me —

RTX 4080, 4070 Ti, and 4070 Super arrive with price cuts and/or spec bumps.

Nvidia’s latest GPUs, apparently dropping out of hyperspace.

Nvidia

Image gallery (all images: Nvidia):

  • The RTX 4080 Super.

  • Comparing it to the last couple of xx80 GPUs (but not the original 4080).

  • The 4070 Ti Super.

  • Comparing to past xx70 Ti generations.

  • The 4070 Super.

  • Compared to past xx70 generations.

If there’s been one consistent criticism of Nvidia’s RTX 40-series graphics cards, it’s been pricing. All of Nvidia’s product tiers have seen their prices creep up over the last few years, but cards like the 4090 raised prices to new heights, while lower-end models like the 4060 and 4060 Ti kept pricing the same but didn’t improve performance much.

Today, Nvidia is sprucing up its 4070 and 4080 tiers with a mid-generation “Super” refresh that at least partially addresses some of these pricing problems. Like older Super GPUs, the 4070 Super, 4070 Ti Super, and 4080 Super use the same architecture and support all the same features as their non-Super versions, but with bumped specs and tweaked prices that might make them more appealing to people who skipped the originals.

The 4070 Super will launch first, on January 17, for $599. The $799 RTX 4070 Ti Super launches on January 24, and the $999 4080 Super follows on January 31.

|  | RTX 4090 | RTX 4080 | RTX 4080 Super | RTX 4070 Ti | RTX 4070 Ti Super | RTX 4070 | RTX 4070 Super |
| --- | --- | --- | --- | --- | --- | --- | --- |
| CUDA Cores | 16,384 | 9,728 | 10,240 | 7,680 | 8,448 | 5,888 | 7,168 |
| Boost Clock | 2,520 MHz | 2,505 MHz | 2,550 MHz | 2,610 MHz | 2,610 MHz | 2,475 MHz | 2,475 MHz |
| Memory Bus Width | 384-bit | 256-bit | 256-bit | 192-bit | 256-bit | 192-bit | 192-bit |
| Memory Clock | 1,313 MHz | 1,400 MHz | 1,437 MHz | 1,313 MHz | 1,313 MHz | 1,313 MHz | 1,313 MHz |
| Memory size | 24GB GDDR6X | 16GB GDDR6X | 16GB GDDR6X | 12GB GDDR6X | 16GB GDDR6X | 12GB GDDR6X | 12GB GDDR6X |
| TGP | 450 W | 320 W | 320 W | 285 W | 285 W | 200 W | 220 W |

Of the three cards, the 4080 Super probably brings the least significant spec bump, with a handful of extra CUDA cores and small clock speed increases but the same amount of memory and the same 256-bit memory interface. Its main innovation is its price, which at $999 is $200 lower than the original 4080’s $1,199 launch price. This doesn’t make it a bargain—we’re still talking about a $1,000 graphics card—but the 4080 Super feels like a more proportionate step down from the 4090 and a good competitor to AMD’s flagship Radeon RX 7900 XTX.

The 4070 Ti Super stays at the same $799 price as the 4070 Ti (which, if you’ll recall, was nearly launched at $899 as the “RTX 4080 12GB“) but addresses two major gripes with the original by stepping up to a 256-bit memory interface and 16GB of RAM. It also picks up some extra CUDA cores, while staying within the same power envelope as the original 4070 Ti. These changes should help it keep up with modern 4K games, where the smaller pool of memory and narrower memory interface of the original 4070 Ti could sometimes be a drag on performance.

Most of the RTX 40-series lineup. The original 4080 and 4070 Ti are going away, while the original 4070 now slots in at $549. It’s not shown here, but Nvidia confirmed that the 16GB 4060 Ti is also sticking around at $449.

Nvidia

Finally, we get to the RTX 4070 Super, which also keeps the 4070’s $599 price tag but sees a substantial uptick in processing hardware, from 5,888 CUDA cores to 7,168 (the power envelope also increases, from 200 W to 220 W). The memory system remains unchanged. The original 4070 was already a decent baseline for entry-level 4K gaming and very good 1440p gaming, and the 4070 Super should make 60 FPS 4K attainable in even more games.

Nvidia says that the original 4070 Ti and 4080 will be phased out. The original 4070 will stick around at a new $549 price, $50 less than before, but not particularly appealing compared to the $599 4070 Super. The 4090, 4060, and the 8GB and 16GB versions of the 4060 Ti all remain available for the same prices as before.

Image gallery (all images: Nvidia):

  • The Super cards’ high-level average performance compared to some past generations of GPU, without DLSS 3 frame generation numbers muddying the waters. The 4070 should be a bit faster than an RTX 3090 most of the time.

  • Some RTX 4080 performance comparisons. Note that the games at the top all have DLSS 3 frame generation enabled for the 4080 Super, while the older cards don’t support it.

  • The 4070 Ti Super vs the 3070 Ti and 2070 Super.

  • The 4070 Super versus the 3070 and the 2070.

Nvidia’s performance comparisons focus mostly on older-generation cards rather than the non-Super versions, and per usual for 40-series GPU announcements, they lean heavily on performance numbers that are inflated by DLSS 3 frame generation. In terms of pure rendering performance, Nvidia says the 4070 Super should outperform an RTX 3090—impressive, given that the original 4070 was closer to an RTX 3080. The RTX 4080 Super is said to be roughly twice as fast as an RTX 3080, and Nvidia says the RTX 4070 Ti Super will be roughly 2.5 times faster than a 3070 Ti.

Though all three of these cards provide substantially more value than their non-Super predecessors at the same prices, the fact remains that prices have still gone up compared to past generations. Nvidia last released a Super refresh during the RTX 20-series back in 2019; the RTX 2080 Super went for $699 and the 2070 Super for $499. But the 4080 Super, 4070 Ti Super, and 4070 Super will give you more for your money than you could get before, which is at least a move in the right direction.


$329 Radeon 7600 XT brings 16GB of memory to AMD’s latest midrange GPU

more rams —

Updated 7600 XT also bumps up clock speeds and power requirements.

The new Radeon RX 7600 XT mostly just adds extra memory, though clock speeds and power requirements have also increased somewhat.

AMD

Graphics card buyers seem anxious about buying a GPU with enough memory installed, even in midrange graphics cards that aren’t otherwise equipped to play games at super-high resolutions. And while this anxiety tends to be a bit overblown—lots of first- and third-party testing of cards like the GeForce 4060 Ti shows that just a handful of games benefit when all you do is boost GPU memory from 8GB to 16GB—there’s still a market for less-expensive GPUs with big pools of memory, whether you’re playing games that need it or running compute tasks that benefit from it.

That’s the apparent impetus behind AMD’s sole GPU announcement from its slate of CES news today: the $329 Radeon RX 7600 XT, a version of last year’s $269 RX 7600 with twice as much memory, slightly higher clock speeds, and higher power use to go with it.

|  | RX 7700 XT | RX 7600 | RX 7600 XT | RX 6600 | RX 6600 XT | RX 6650 XT | RX 6750 XT |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Compute units (Stream processors) | 54 (3,456) | 32 (2,048) | 32 (2,048) | 28 (1,792) | 32 (2,048) | 32 (2,048) | 40 (2,560) |
| Boost Clock | 2,544 MHz | 2,600 MHz | 2,760 MHz | 2,490 MHz | 2,589 MHz | 2,635 MHz | 2,600 MHz |
| Memory Bus Width | 192-bit | 128-bit | 128-bit | 128-bit | 128-bit | 128-bit | 192-bit |
| Memory Clock | 2,250 MHz | 2,250 MHz | 2,250 MHz | 1,750 MHz | 2,000 MHz | 2,190 MHz | 2,250 MHz |
| Memory size | 12GB GDDR6 | 8GB GDDR6 | 16GB GDDR6 | 8GB GDDR6 | 8GB GDDR6 | 8GB GDDR6 | 12GB GDDR6 |
| Total board power (TBP) | 245 W | 165 W | 190 W | 132 W | 160 W | 180 W | 250 W |

The core specifications of the 7600 XT remain the same as the regular 7600: 32 of AMD’s compute units (CUs) based on the RDNA3 GPU architecture and the same memory clock speed attached to the same 128-bit memory bus. But RAM has been boosted from 8GB to 16GB, and the GPU’s clock speeds have been boosted a little, ensuring that the card runs games a little faster than the regular 7600, even in games that don’t care about the extra memory.

Images of AMD’s reference design show a slightly larger card than the regular 7600, with a second 8-pin power connector to provide the extra power (total board power increases from 165 W to 190 W). The only other difference between the cards is DisplayPort 2.1 support—it was optional in the regular RX 7600, but all 7600 XTs will have it. That brings it in line with all the other RX 7000-series GPUs.

Image gallery (all images: AMD):

  • AMD’s hand-picked benchmarks generally show a mild performance improvement over the RX 7600, though Forza is an outlier.

  • The 7600 XT’s performance relative to Nvidia’s RTX 4060 is also a little better than the RX 7600’s, thanks to added RAM and higher clocks. But Nvidia should continue to benefit from superior ray-tracing performance in a lot of games.

  • Testing against the 4060 at 1440p. Note that the longest bars are coming from games with FSR 3 frame-generation enabled and that Nvidia’s cards also support DLSS 3.

  • The complete RX 7000-series lineup.

AMD’s provided performance figures show the 7600 XT outrunning the regular 7600 by between 5 and 10 percent in most titles, with one—Forza Horizon 5 with ray-tracing turned all the way up—showing a more significant jump of around 40 percent at 1080p and 1440p. Whether that kind of performance jump is worth the extra $60 depends on the games you play and how worried you are about the system requirements in future games.

AMD says the RX 7600 XT will be available starting on January 24. Pricing and availability for other RX 7000-series GPUs, including the regular RX 7600, aren’t changing.


2023 was the year that GPUs stood still


Andrew Cunningham

In many ways, 2023 was a long-awaited return to normalcy for people who build their own gaming and/or workstation PCs. For the entire year, most mainstream components have been available at or a little under their official retail prices, making it possible to build all kinds of PCs at relatively reasonable prices without worrying about restocks or waiting for discounts. It was a welcome continuation of some GPU trends that started in 2022. Nvidia, AMD, and Intel could release a new GPU, and you could consistently buy that GPU for roughly what it was supposed to cost.

That’s where we get into how frustrating 2023 was for GPU buyers, though. Cards like the GeForce RTX 4090 and Radeon RX 7900 series launched in late 2022 and boosted performance beyond what any last-generation cards could achieve. But 2023’s midrange GPU launches were less ambitious. Not only did they offer the performance of a last-generation GPU, but most of them did it for around the same price as the last-gen GPUs whose performance they matched.

The midrange runs in place

Not every midrange GPU launch will get us a GTX 1060—a card that was roughly 50 percent faster than its immediate predecessor and that beat the previous-generation GTX 980 despite costing just a bit over half as much money. But even if your expectations were low, this year’s midrange GPU launches have been underwhelming.

The worst was probably the GeForce RTX 4060 Ti, which sometimes struggled to beat the card it replaced at around the same price. The 16GB version of the card was particularly maligned since it was $100 more expensive but was only faster than the 8GB version in a handful of games.

The regular RTX 4060 was slightly better news, thanks partly to a $30 price drop from where the RTX 3060 started. The performance gains were small, and a drop from 12GB to 8GB of RAM isn’t the direction we prefer to see things move, but it was still a slightly faster and more efficient card at around the same price. AMD’s Radeon RX 7600, RX 7700 XT, and RX 7800 XT all belong in this same broad category—some improvements, but generally similar performance to previous-generation parts at similar or slightly lower prices. Not an exciting leap for people with aging GPUs who waited out the GPU shortage to get an upgrade.

The best midrange card of the generation—and at $600, we’re definitely stretching the definition of “midrange”—might be the GeForce RTX 4070, which can generally match or slightly beat the RTX 3080 while using much less power and costing $100 less than the RTX 3080’s suggested retail price. That seems like a solid deal once you consider that the RTX 3080 was essentially unavailable at its suggested retail price for most of its life span. But $600 is still a $100 increase from the 2070 and a $220 increase from the 1070, making it tougher to swallow.

In all, 2023 wasn’t the worst time to buy a $300 GPU; that dubious honor belongs to the depths of 2021, when you’d be lucky to snag a GTX 1650 for that price. But “consistently available, basically competent GPUs” are harder to be thankful for the further we get from the GPU shortage.

Marketing gets more misleading

1.7 times faster than the last-gen GPU? Sure, under exactly the right conditions in specific games.

Nvidia

If you just looked at Nvidia’s early performance claims for each of these GPUs, you might think that the RTX 40-series was an exciting jump forward.

But these numbers were only possible in games that supported these GPUs’ newest software gimmick, DLSS Frame Generation (FG). The original DLSS and DLSS 2 improve performance by upsampling the images generated by your GPU, creating interpolated pixels that turn lower-res images into higher-res ones without the blurriness and loss of image quality you’d get from simple upscaling. DLSS FG generates entire frames in between the ones being rendered by your GPU, theoretically providing big frame rate boosts without requiring a powerful GPU.

The technology is impressive when it works, and it’s been successful enough to spawn hardware-agnostic imitators like the AMD-backed FSR 3 and an alternate implementation from Intel that’s still in early stages. But it has notable limitations—mainly, it needs a reasonably high base frame rate to have enough data to generate convincing extra frames, something that these midrange cards may struggle to do. Even when performance is good, it can introduce weird visual artifacts or lose fine detail. The technology isn’t available in all games. And DLSS FG also adds a bit of latency, though this can be offset with latency-reducing technologies like Nvidia Reflex.

As another tool in the performance-enhancing toolbox, DLSS FG is nice to have. But to put it front-and-center in comparisons with previous-generation graphics cards is, at best, painting an overly rosy picture of what upgraders can actually expect.
