GPUs

doj-subpoenas-nvidia-in-deepening-ai-antitrust-probe,-report-says

DOJ subpoenas Nvidia in deepening AI antitrust probe, report says

DOJ subpoenas Nvidia in deepening AI antitrust probe, report says

The Department of Justice is reportedly deepening its probe into Nvidia. Officials have moved on from merely questioning competitors to subpoenaing Nvidia and other tech companies for evidence that could substantiate allegations that Nvidia is abusing its “dominant position in AI computing,” Bloomberg reported.

When news of the DOJ’s probe into the trillion-dollar company was first reported in June, Fast Company reported that scrutiny was intensifying merely because Nvidia was estimated to control “as much as 90 percent of the market for chips” capable of powering AI models. Experts told Fast Company that the DOJ probe might even be good for Nvidia’s business, noting that the market barely moved when the probe was first announced.

But the market’s confidence seemed to be shaken a little more on Tuesday, when Nvidia lost a “record-setting $279 billion” in market value following Bloomberg’s report. Nvidia’s losses became “the biggest single-day market-cap decline on record,” TheStreet reported.

People close to the DOJ’s investigation told Bloomberg that the DOJ’s “legally binding requests” require competitors “to provide information” on Nvidia’s suspected anticompetitive behaviors as a “dominant provider of AI processors.”

One concern is that Nvidia may be giving “preferential supply and pricing to customers who use its technology exclusively or buy its complete systems,” sources told Bloomberg. The DOJ is also reportedly probing Nvidia’s acquisition of RunAI—suspecting the deal may lock RunAI customers into using Nvidia chips.

Bloomberg’s report builds on a report last month from The Information that said that Advanced Micro Devices Inc. (AMD) and other Nvidia rivals were questioned by the DOJ—as well as third parties who could shed light on whether Nvidia potentially abused its market dominance in AI chips to pressure customers into buying more products.

According to Bloomberg’s sources, the DOJ is worried that “Nvidia is making it harder to switch to other suppliers and penalizes buyers that don’t exclusively use its artificial intelligence chips.”

In a statement to Bloomberg, Nvidia insisted that “Nvidia wins on merit, as reflected in our benchmark results and value to customers, who can choose whatever solution is best for them.” Additionally, Bloomberg noted that following a chip shortage in 2022, Nvidia CEO Jensen Huang has said that his company strives to prevent stockpiling of Nvidia’s coveted AI chips by prioritizing customers “who can make use of his products in ready-to-go data centers.”

Potential threats to Nvidia’s dominance

Despite the slump in shares, Nvidia’s market dominance seems unlikely to wane any time soon after its stock more than doubled this year. In an SEC filing this year, Nvidia bragged that its “accelerated computing ecosystem is bringing AI to every enterprise” with an “ecosystem” spanning “nearly 5 million developers and 40,000 companies.” Nvidia specifically highlighted that “more than 1,600 generative AI companies are building on Nvidia,” and according to Bloomberg, Nvidia will close out 2024 with more profits than the total sales of its closest competitor, AMD.

After the DOJ’s most recent big win, which successfully proved that Google has a monopoly on search, the DOJ appears intent on getting ahead of any tech companies’ ambitions to seize monopoly power and essentially become the Google of the AI industry. In June, DOJ antitrust chief Jonathan Kanter confirmed to the Financial Times that the DOJ is examining “monopoly choke points and the competitive landscape” in AI beyond just scrutinizing Nvidia.

According to Kanter, the DOJ is scrutinizing all aspects of the AI industry—”everything from computing power and the data used to train large language models, to cloud service providers, engineering talent and access to essential hardware such as graphics processing unit chips.” But in particular, the DOJ appears concerned that GPUs like Nvidia’s advanced AI chips remain a “scarce resource.” Kanter told the Financial Times that an “intervention” in “real time” to block a potential monopoly could be “the most meaningful intervention” and the least “invasive” as the AI industry grows.

DOJ subpoenas Nvidia in deepening AI antitrust probe, report says Read More »

elon-musk-claims-he-is-training-“the-world’s-most-powerful-ai-by-every-metric”

Elon Musk claims he is training “the world’s most powerful AI by every metric”

the biggest, most powerful —

One snag: xAI might not have the electrical power contracts to do it.

Elon Musk, chief executive officer of Tesla Inc., during a fireside discussion on artificial intelligence risks with Rishi Sunak, UK prime minister, in London, UK, on Thursday, Nov. 2, 2023.

Enlarge / Elon Musk, chief executive officer of Tesla Inc., during a fireside discussion on artificial intelligence risks with Rishi Sunak, UK prime minister, in London, UK, on Thursday, Nov. 2, 2023.

On Monday, Elon Musk announced the start of training for what he calls “the world’s most powerful AI training cluster” at xAI’s new supercomputer facility in Memphis, Tennessee. The billionaire entrepreneur and CEO of multiple tech companies took to X (formerly Twitter) to share that the so-called “Memphis Supercluster” began operations at approximately 4: 20 am local time that day.

Musk’s xAI team, in collaboration with X and Nvidia, launched the supercomputer cluster featuring 100,000 liquid-cooled H100 GPUs on a single RDMA fabric. This setup, according to Musk, gives xAI “a significant advantage in training the world’s most powerful AI by every metric by December this year.”

Given issues with xAI’s Grok chatbot throughout the year, skeptics would be justified in questioning whether those claims will match reality, especially given Musk’s tendency for grandiose, off-the-cuff remarks on the social media platform he runs.

Power issues

According to a report by News Channel 3 WREG Memphis, the startup of the massive AI training facility marks a milestone for the city. WREG reports that xAI’s investment represents the largest capital investment by a new company in Memphis’s history. However, the project has raised questions among local residents and officials about its impact on the area’s power grid and infrastructure.

WREG reports that Doug McGowen, president of Memphis Light, Gas and Water (MLGW), previously stated that xAI could consume up to 150 megawatts of power at peak times. This substantial power requirement has prompted discussions with the Tennessee Valley Authority (TVA) regarding the project’s electricity demands and connection to the power system.

The TVA told the local news station, “TVA does not have a contract in place with xAI. We are working with xAI and our partners at MLGW on the details of the proposal and electricity demand needs.”

The local news outlet confirms that MLGW has stated that xAI moved into an existing building with already existing utility services, but the full extent of the company’s power usage and its potential effects on local utilities remain unclear. To address community concerns, WREG reports that MLGW plans to host public forums in the coming days to provide more information about the project and its implications for the city.

For now, Tom’s Hardware reports that Musk is side-stepping power issues by installing a fleet of 14 VoltaGrid natural gas generators that provide supplementary power to the Memphis computer cluster while his company works out an agreement with the local power utility.

As training at the Memphis Supercluster gets underway, all eyes are on xAI and Musk’s ambitious goal of developing the world’s most powerful AI by the end of the year (by which metric, we are uncertain), given the competitive landscape in AI at the moment between OpenAI/Microsoft, Amazon, Apple, Anthropic, and Google. If such an AI model emerges from xAI, we’ll be ready to write about it.

This article was updated on July 24, 2024 at 1: 11 pm to mention Musk installing natural gas generators onsite in Memphis.

Elon Musk claims he is training “the world’s most powerful AI by every metric” Read More »

nvidia-jumps-ahead-of-itself-and-reveals-next-gen-“rubin”-ai-chips-in-keynote-tease

Nvidia jumps ahead of itself and reveals next-gen “Rubin” AI chips in keynote tease

Swing beat —

“I’m not sure yet whether I’m going to regret this,” says CEO Jensen Huang at Computex 2024.

Nvidia's CEO Jensen Huang delivers his keystone speech ahead of Computex 2024 in Taipei on June 2, 2024.

Enlarge / Nvidia’s CEO Jensen Huang delivers his keystone speech ahead of Computex 2024 in Taipei on June 2, 2024.

On Sunday, Nvidia CEO Jensen Huang reached beyond Blackwell and revealed the company’s next-generation AI-accelerating GPU platform during his keynote at Computex 2024 in Taiwan. Huang also detailed plans for an annual tick-tock-style upgrade cycle of its AI acceleration platforms, mentioning an upcoming Blackwell Ultra chip slated for 2025 and a subsequent platform called “Rubin” set for 2026.

Nvidia’s data center GPUs currently power a large majority of cloud-based AI models, such as ChatGPT, in both development (training) and deployment (inference) phases, and investors are keeping a close watch on the company, with expectations to keep that run going.

During the keynote, Huang seemed somewhat hesitant to make the Rubin announcement, perhaps wary of invoking the so-called Osborne effect, whereby a company’s premature announcement of the next iteration of a tech product eats into the current iteration’s sales. “This is the very first time that this next click as been made,” Huang said, holding up his presentation remote just before the Rubin announcement. “And I’m not sure yet whether I’m going to regret this or not.”

Nvidia Keynote at Computex 2023.

The Rubin AI platform, expected in 2026, will use HBM4 (a new form of high-bandwidth memory) and NVLink 6 Switch, operating at 3,600GBps. Following that launch, Nvidia will release a tick-tock iteration called “Rubin Ultra.” While Huang did not provide extensive specifications for the upcoming products, he promised cost and energy savings related to the new chipsets.

During the keynote, Huang also introduced a new ARM-based CPU called “Vera,” which will be featured on a new accelerator board called “Vera Rubin,” alongside one of the Rubin GPUs.

Much like Nvidia’s Grace Hopper architecture, which combines a “Grace” CPU and a “Hopper” GPU to pay tribute to the pioneering computer scientist of the same name, Vera Rubin refers to Vera Florence Cooper Rubin (1928–2016), an American astronomer who made discoveries in the field of deep space astronomy. She is best known for her pioneering work on galaxy rotation rates, which provided strong evidence for the existence of dark matter.

A calculated risk

Nvidia CEO Jensen Huang reveals the

Enlarge / Nvidia CEO Jensen Huang reveals the “Rubin” AI platform for the first time during his keynote at Computex 2024 on June 2, 2024.

Nvidia’s reveal of Rubin is not a surprise in the sense that most big tech companies are continuously working on follow-up products well in advance of release, but it’s notable because it comes just three months after the company revealed Blackwell, which is barely out of the gate and not yet widely shipping.

At the moment, the company seems to be comfortable leapfrogging itself with new announcements and catching up later; Nvidia just announced that its GH200 Grace Hopper “Superchip,” unveiled one year ago at Computex 2023, is now in full production.

With Nvidia stock rising and the company possessing an estimated 70–95 percent of the data center GPU market share, the Rubin reveal is a calculated risk that seems to come from a place of confidence. That confidence could turn out to be misplaced if a so-called “AI bubble” pops or if Nvidia misjudges the capabilities of its competitors. The announcement may also stem from pressure to continue Nvidia’s astronomical growth in market cap with nonstop promises of improving technology.

Accordingly, Huang has been eager to showcase the company’s plans to continue pushing silicon fabrication tech to its limits and widely broadcast that Nvidia plans to keep releasing new AI chips at a steady cadence.

“Our company has a one-year rhythm. Our basic philosophy is very simple: build the entire data center scale, disaggregate and sell to you parts on a one-year rhythm, and we push everything to technology limits,” Huang said during Sunday’s Computex keynote.

Despite Nvidia’s recent market performance, the company’s run may not continue indefinitely. With ample money pouring into the data center AI space, Nvidia isn’t alone in developing accelerator chips. Competitors like AMD (with the Instinct series) and Intel (with Guadi 3) also want to win a slice of the data center GPU market away from Nvidia’s current command of the AI-accelerator space. And OpenAI’s Sam Altman is trying to encourage diversified production of GPU hardware that will power the company’s next generation of AI models in the years ahead.

Nvidia jumps ahead of itself and reveals next-gen “Rubin” AI chips in keynote tease Read More »

nvidia-unveils-blackwell-b200,-the-“world’s-most-powerful-chip”-designed-for-ai

Nvidia unveils Blackwell B200, the “world’s most powerful chip” designed for AI

There’s no knowing where we’re rowing —

208B transistor chip can reportedly reduce AI cost and energy consumption by up to 25x.

The GB200

Enlarge / The GB200 “superchip” covered with a fanciful blue explosion.

Nvidia / Benj Edwards

On Monday, Nvidia unveiled the Blackwell B200 tensor core chip—the company’s most powerful single-chip GPU, with 208 billion transistors—which Nvidia claims can reduce AI inference operating costs (such as running ChatGPT) and energy consumption by up to 25 times compared to the H100. The company also unveiled the GB200, a “superchip” that combines two B200 chips and a Grace CPU for even more performance.

The news came as part of Nvidia’s annual GTC conference, which is taking place this week at the San Jose Convention Center. Nvidia CEO Jensen Huang delivered the keynote Monday afternoon. “We need bigger GPUs,” Huang said during his keynote. The Blackwell platform will allow the training of trillion-parameter AI models that will make today’s generative AI models look rudimentary in comparison, he said. For reference, OpenAI’s GPT-3, launched in 2020, included 175 billion parameters. Parameter count is a rough indicator of AI model complexity.

Nvidia named the Blackwell architecture after David Harold Blackwell, a mathematician who specialized in game theory and statistics and was the first Black scholar inducted into the National Academy of Sciences. The platform introduces six technologies for accelerated computing, including a second-generation Transformer Engine, fifth-generation NVLink, RAS Engine, secure AI capabilities, and a decompression engine for accelerated database queries.

Press photo of the Grace Blackwell GB200 chip, which combines two B200 GPUs with a Grace CPU into one chip.

Enlarge / Press photo of the Grace Blackwell GB200 chip, which combines two B200 GPUs with a Grace CPU into one chip.

Several major organizations, such as Amazon Web Services, Dell Technologies, Google, Meta, Microsoft, OpenAI, Oracle, Tesla, and xAI, are expected to adopt the Blackwell platform, and Nvidia’s press release is replete with canned quotes from tech CEOs (key Nvidia customers) like Mark Zuckerberg and Sam Altman praising the platform.

GPUs, once only designed for gaming acceleration, are especially well suited for AI tasks because their massively parallel architecture accelerates the immense number of matrix multiplication tasks necessary to run today’s neural networks. With the dawn of new deep learning architectures in the 2010s, Nvidia found itself in an ideal position to capitalize on the AI revolution and began designing specialized GPUs just for the task of accelerating AI models.

Nvidia’s data center focus has made the company wildly rich and valuable, and these new chips continue the trend. Nvidia’s gaming GPU revenue ($2.9 billion in the last quarter) is dwarfed in comparison to data center revenue (at $18.4 billion), and that shows no signs of stopping.

A beast within a beast

Press photo of the Nvidia GB200 NVL72 data center computer system.

Enlarge / Press photo of the Nvidia GB200 NVL72 data center computer system.

The aforementioned Grace Blackwell GB200 chip arrives as a key part of the new NVIDIA GB200 NVL72, a multi-node, liquid-cooled data center computer system designed specifically for AI training and inference tasks. It combines 36 GB200s (that’s 72 B200 GPUs and 36 Grace CPUs total), interconnected by fifth-generation NVLink, which links chips together to multiply performance.

A specification chart for the Nvidia GB200 NVL72 system.

Enlarge / A specification chart for the Nvidia GB200 NVL72 system.

“The GB200 NVL72 provides up to a 30x performance increase compared to the same number of NVIDIA H100 Tensor Core GPUs for LLM inference workloads and reduces cost and energy consumption by up to 25x,” Nvidia said.

That kind of speed-up could potentially save money and time while running today’s AI models, but it will also allow for more complex AI models to be built. Generative AI models—like the kind that power Google Gemini and AI image generators—are famously computationally hungry. Shortages of compute power have widely been cited as holding back progress and research in the AI field, and the search for more compute has led to figures like OpenAI CEO Sam Altman trying to broker deals to create new chip foundries.

While Nvidia’s claims about the Blackwell platform’s capabilities are significant, it’s worth noting that its real-world performance and adoption of the technology remain to be seen as organizations begin to implement and utilize the platform themselves. Competitors like Intel and AMD are also looking to grab a piece of Nvidia’s AI pie.

Nvidia says that Blackwell-based products will be available from various partners starting later this year.

Nvidia unveils Blackwell B200, the “world’s most powerful chip” designed for AI Read More »

2023-was-the-year-that-gpus-stood-still

2023 was the year that GPUs stood still

2023 was the year that GPUs stood still

Andrew Cunningham

In many ways, 2023 was a long-awaited return to normalcy for people who build their own gaming and/or workstation PCs. For the entire year, most mainstream components have been available at or a little under their official retail prices, making it possible to build all kinds of PCs at relatively reasonable prices without worrying about restocks or waiting for discounts. It was a welcome continuation of some GPU trends that started in 2022. Nvidia, AMD, and Intel could release a new GPU, and you could consistently buy that GPU for roughly what it was supposed to cost.

That’s where we get into how frustrating 2023 was for GPU buyers, though. Cards like the GeForce RTX 4090 and Radeon RX 7900 series launched in late 2022 and boosted performance beyond what any last-generation cards could achieve. But 2023’s midrange GPU launches were less ambitious. Not only did they offer the performance of a last-generation GPU, but most of them did it for around the same price as the last-gen GPUs whose performance they matched.

The midrange runs in place

Not every midrange GPU launch will get us a GTX 1060—a card roughly 50 percent faster than its immediate predecessor and beat the previous-generation GTX 980 despite costing just a bit over half as much money. But even if your expectations were low, this year’s midrange GPU launches have been underwhelming.

The worst was probably the GeForce RTX 4060 Ti, which sometimes struggled to beat the card it replaced at around the same price. The 16GB version of the card was particularly maligned since it was $100 more expensive but was only faster than the 8GB version in a handful of games.

The regular RTX 4060 was slightly better news, thanks partly to a $30 price drop from where the RTX 3060 started. The performance gains were small, and a drop from 12GB to 8GB of RAM isn’t the direction we prefer to see things move, but it was still a slightly faster and more efficient card at around the same price. AMD’s Radeon RX 7600, RX 7700 XT, and RX 7800 XT all belong in this same broad category—some improvements, but generally similar performance to previous-generation parts at similar or slightly lower prices. Not an exciting leap for people with aging GPUs who waited out the GPU shortage to get an upgrade.

The best midrange card of the generation—and at $600, we’re definitely stretching the definition of “midrange”—might be the GeForce RTX 4070, which can generally match or slightly beat the RTX 3080 while using much less power and costing $100 less than the RTX 3080’s suggested retail price. That seems like a solid deal once you consider that the RTX 3080 was essentially unavailable at its suggested retail price for most of its life span. But $600 is still a $100 increase from the 2070 and a $220 increase from the 1070, making it tougher to swallow.

In all, 2023 wasn’t the worst time to buy a $300 GPU; that dubious honor belongs to the depths of 2021, when you’d be lucky to snag a GTX 1650 for that price. But “consistently available, basically competent GPUs” are harder to be thankful for the further we get from the GPU shortage.

Marketing gets more misleading

1.7 times faster than the last-gen GPU? Sure, under exactly the right conditions in specific games.

Enlarge / 1.7 times faster than the last-gen GPU? Sure, under exactly the right conditions in specific games.

Nvidia

If you just looked at Nvidia’s early performance claims for each of these GPUs, you might think that the RTX 40-series was an exciting jump forward.

But these numbers were only possible in games that supported these GPUs’ newest software gimmick, DLSS Frame Generation (FG). The original DLSS and DLSS 2 improve performance by upsampling the images generated by your GPU, generating interpolated pixels that make lower-res image into higher-res ones without the blurriness and loss of image quality you’d get from simple upscaling. DLSS FG generates entire frames in between the ones being rendered by your GPU, theoretically providing big frame rate boosts without requiring a powerful GPU.

The technology is impressive when it works, and it’s been successful enough to spawn hardware-agnostic imitators like the AMD-backed FSR 3 and an alternate implementation from Intel that’s still in early stages. But it has notable limitations—mainly, it needs a reasonably high base frame rate to have enough data to generate convincing extra frames, something that these midrange cards may struggle to do. Even when performance is good, it can introduce weird visual artifacts or lose fine detail. The technology isn’t available in all games. And DLSS FG also adds a bit of latency, though this can be offset with latency-reducing technologies like Nvidia Reflex.

As another tool in the performance-enhancing toolbox, DLSS FG is nice to have. But to put it front-and-center in comparisons with previous-generation graphics cards is, at best, painting an overly rosy picture of what upgraders can actually expect.

2023 was the year that GPUs stood still Read More »