

How a stubborn computer scientist accidentally launched the deep learning boom


“You’ve taken this idea way too far,” a mentor told Prof. Fei-Fei Li.

Credit: Aurich Lawson | Getty Images


During my first semester as a computer science graduate student at Princeton, I took COS 402: Artificial Intelligence. Toward the end of the semester, there was a lecture about neural networks. This was in the fall of 2008, and I got the distinct impression—both from that lecture and the textbook—that neural networks had become a backwater.

Neural networks had delivered some impressive results in the late 1980s and early 1990s. But then progress stalled. By 2008, many researchers had moved on to mathematically elegant approaches such as support vector machines.

I didn’t know it at the time, but a team at Princeton—in the same computer science building where I was attending lectures—was working on a project that would upend the conventional wisdom and demonstrate the power of neural networks. That team, led by Prof. Fei-Fei Li, wasn’t working on a better version of neural networks. They were hardly thinking about neural networks at all.

Rather, they were creating a new image dataset that would be far larger than any that had come before: 14 million images, each labeled with one of nearly 22,000 categories.

Li tells the story of ImageNet in her recent memoir, The Worlds I See. As she worked on the project, she faced plenty of skepticism from friends and colleagues.

“I think you’ve taken this idea way too far,” a mentor told her a few months into the project in 2007. “The trick is to grow with your field. Not to leap so far ahead of it.”

It wasn’t just that building such a large dataset was a massive logistical challenge. People doubted that the machine learning algorithms of the day would benefit from such a vast collection of images.

“Pre-ImageNet, people did not believe in data,” Li said in a September interview at the Computer History Museum. “Everyone was working on completely different paradigms in AI with a tiny bit of data.”

Ignoring negative feedback, Li pursued the project for more than two years. It strained her research budget and the patience of her graduate students. When she took a new job at Stanford in 2009, she took several of those students—and the ImageNet project—with her to California.

ImageNet received little attention for the first couple of years after its release in 2009. But in 2012, a team from the University of Toronto trained a neural network on the ImageNet dataset, achieving unprecedented performance in image recognition. That groundbreaking AI model, dubbed AlexNet after lead author Alex Krizhevsky, kicked off the deep learning boom that has continued to the present day.

AlexNet would not have succeeded without the ImageNet dataset. AlexNet also would not have been possible without a platform called CUDA, which allowed Nvidia’s graphics processing units (GPUs) to be used in non-graphics applications. Many people were skeptical when Nvidia announced CUDA in 2006.

So the AI boom of the last 12 years was made possible by three visionaries who pursued unorthodox ideas in the face of widespread criticism. One was Geoffrey Hinton, a University of Toronto computer scientist who spent decades promoting neural networks despite near-universal skepticism. The second was Jensen Huang, the CEO of Nvidia, who recognized early that GPUs could be useful for more than just graphics.

The third was Fei-Fei Li. She created an image dataset that seemed ludicrously large to most of her colleagues. But it turned out to be essential for demonstrating the potential of neural networks trained on GPUs.

Geoffrey Hinton

A neural network is a network of thousands, millions, or even billions of neurons. Each neuron is a mathematical function that produces an output based on a weighted sum of its inputs.

Suppose you want to create a network that can identify handwritten decimal digits, such as a handwritten 2. Such a network would take in an intensity value for each pixel in an image and output a probability distribution over the ten possible digits—0, 1, 2, and so forth.

To train such a network, you first initialize it with random weights. You then run it on a sequence of example images. For each image, you train the network by strengthening the connections that push the network toward the right answer (in this case, a high-probability value for the “2” output) and weakening connections that push toward a wrong answer (a low probability for “2” and high probabilities for other digits). If trained on enough example images, the model should start to predict a high probability for “2” when shown a two—and not otherwise.
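
If you want to see that training loop spelled out, here is a minimal sketch in Python using NumPy. It uses randomly generated stand-in "images" and a single layer of weights rather than anything resembling a real model; the array sizes, learning rate, and synthetic data are illustrative assumptions, not details from any actual system.

import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a digit dataset: random 28x28 "images" with random labels.
# A real experiment would load something like MNIST here instead.
n_samples, n_pixels, n_classes = 1000, 28 * 28, 10
images = rng.random((n_samples, n_pixels))
labels = rng.integers(0, n_classes, size=n_samples)

# One layer of weights: each of the ten outputs takes a weighted sum of all pixels.
weights = rng.normal(scale=0.01, size=(n_pixels, n_classes))
biases = np.zeros(n_classes)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

learning_rate = 0.1
for epoch in range(20):
    # Forward pass: a probability distribution over the ten digits for each image.
    probs = softmax(images @ weights + biases)

    # One-hot targets: probability 1 for the correct digit, 0 for the rest.
    targets = np.eye(n_classes)[labels]

    # Update step: strengthen connections that favor the right answer and
    # weaken connections that favor wrong answers.
    error = (probs - targets) / n_samples
    weights -= learning_rate * images.T @ error
    biases -= learning_rate * error.sum(axis=0)

The update at the bottom is the whole trick: every weight gets nudged in the direction that raises the probability of the correct digit and lowers the probability of the others, one small step per pass over the data.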

In the late 1950s, scientists started to experiment with basic networks that had a single layer of neurons. However, their initial enthusiasm cooled as they realized that such simple networks lacked the expressive power required for complex computations.

Deeper networks—those with multiple layers—had the potential to be more versatile. But in the 1960s, no one knew how to train them efficiently. This was because changing a parameter somewhere in the middle of a multi-layer network could have complex and unpredictable effects on the output.

So by the time Hinton began his career in the 1970s, neural networks had fallen out of favor. Hinton wanted to study them, but he struggled to find an academic home in which to do so. Between 1976 and 1986, Hinton spent time at four different research institutions: Sussex University, the University of California San Diego (UCSD), a branch of the UK Medical Research Council, and finally Carnegie Mellon, where he became a professor in 1982.

Geoffrey Hinton speaking in Toronto in June.

Credit: Photo by Mert Alper Dervis/Anadolu via Getty Images


In a landmark 1986 paper, Hinton teamed up with two of his former colleagues at UCSD, David Rumelhart and Ronald Williams, to describe a technique called backpropagation for efficiently training deep neural networks.

Their idea was to start with the final layer of the network and work backward. For each connection in the final layer, the algorithm computes a gradient—a mathematical estimate of whether increasing the strength of that connection would push the network toward the right answer. Based on these gradients, the algorithm adjusts each parameter in the model’s final layer.

The algorithm then propagates these gradients backward to the second-to-last layer. A key innovation here is a formula—based on the chain rule from high school calculus—for computing the gradients in one layer based on gradients in the following layer. Using these new gradients, the algorithm updates each parameter in the second-to-last layer of the model. The gradients then get propagated backward to the third-to-last layer, and the whole process repeats once again.

The algorithm only makes small changes to the model in each round of training. But as the process is repeated over thousands, millions, billions, or even trillions of training examples, the model gradually becomes more accurate.
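
Here is a minimal numerical sketch of that backward sweep, written in Python with NumPy for a toy two-layer network. The layer sizes, the ReLU activation, and the squared-error loss are my own illustrative choices rather than details from the 1986 paper; the point is just to show a gradient being computed at the final layer and then pushed back to the earlier layer with the chain rule.

import numpy as np

rng = np.random.default_rng(0)

# A tiny two-layer network: 4 inputs -> 8 hidden units -> 3 outputs.
W1, b1 = rng.normal(size=(4, 8)) * 0.1, np.zeros(8)
W2, b2 = rng.normal(size=(8, 3)) * 0.1, np.zeros(3)

x = rng.random((16, 4))                # a batch of 16 example inputs
y = np.eye(3)[rng.integers(0, 3, 16)]  # one-hot targets

for step in range(100):
    # Forward pass, saving intermediate values for the backward pass.
    h = np.maximum(0, x @ W1 + b1)     # hidden layer (ReLU)
    out = h @ W2 + b2                  # final layer (linear, squared-error loss)

    # Backward pass: start with the gradient at the final layer...
    d_out = 2 * (out - y) / len(x)
    grad_W2 = h.T @ d_out
    grad_b2 = d_out.sum(axis=0)

    # ...then use the chain rule to carry the gradient back to the earlier layer.
    d_h = (d_out @ W2.T) * (h > 0)     # chain rule through W2 and the ReLU
    grad_W1 = x.T @ d_h
    grad_b1 = d_h.sum(axis=0)

    # Small parameter updates in the direction that reduces the error.
    for param, grad in [(W1, grad_W1), (b1, grad_b1), (W2, grad_W2), (b2, grad_b2)]:
        param -= 0.1 * grad

Each pass computes gradients for the last layer first, reuses them to get gradients for the layer before it, and then makes small updates everywhere, which is the same procedure the paper describes at much larger scale.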

Hinton and his colleagues weren’t the first to discover the basic idea of backpropagation. But their paper popularized the method. As people realized it was now possible to train deeper networks, it triggered a new wave of enthusiasm for neural networks.

Hinton moved to the University of Toronto in 1987 and began attracting young researchers who wanted to study neural networks. One of the first was the French computer scientist Yann LeCun, who did a year-long postdoc with Hinton before moving to Bell Labs in 1988.

Hinton’s backpropagation algorithm allowed LeCun to train models deep enough to perform well on real-world tasks like handwriting recognition. By the mid-1990s, LeCun’s technology was working so well that banks started to use it for processing checks.

“At one point, LeCun’s creation read more than 10 percent of all checks deposited in the United States,” wrote Cade Metz in his 2022 book Genius Makers.

But when LeCun and other researchers tried to apply neural networks to larger and more complex images, it didn’t go well. Neural networks once again fell out of fashion, and some researchers who had focused on neural networks moved on to other projects.

Hinton never stopped believing that neural networks could outperform other machine learning methods. But it would be many years before he’d have access to enough data and computing power to prove his case.

Jensen Huang

Jensen Huang speaking in Denmark in October.

Credit: Photo by MADS CLAUS RASMUSSEN/Ritzau Scanpix/AFP via Getty Images


The brain of every personal computer is a central processing unit (CPU). These chips are designed to perform calculations in order, one step at a time. This works fine for conventional software like Windows and Office. But some video games require so many calculations that they strain the capabilities of CPUs. This is especially true of games like Quake, Call of Duty, and Grand Theft Auto, which render three-dimensional worlds many times per second.

So gamers rely on GPUs to accelerate performance. Inside a GPU are many execution units—essentially tiny CPUs—packaged together on a single chip. During gameplay, different execution units draw different areas of the screen. This parallelism enables better image quality and higher frame rates than would be possible with a CPU alone.

Nvidia invented the GPU in 1999 and has dominated the market ever since. By the mid-2000s, Nvidia CEO Jensen Huang suspected that the massive computing power inside a GPU would be useful for applications beyond gaming. He hoped scientists could use it for compute-intensive tasks like weather simulation or oil exploration.

So in 2006, Nvidia announced the CUDA platform. CUDA allows programmers to write “kernels,” short programs designed to run on a single execution unit. Kernels allow a big computing task to be split up into bite-sized chunks that can be processed in parallel. This allows certain kinds of calculations to be completed far faster than with a CPU alone.
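
To make the kernel idea concrete, here is a minimal sketch using the Numba library's CUDA support in Python. It assumes an Nvidia GPU and the numba package are available; the element-wise function itself is purely illustrative.

import numpy as np
from numba import cuda

@cuda.jit
def scale_and_add(a, b, out):
    # Each thread handles a single element of the arrays.
    i = cuda.grid(1)
    if i < out.shape[0]:
        out[i] = 2.0 * a[i] + b[i]

n = 1_000_000
a = np.random.rand(n).astype(np.float32)
b = np.random.rand(n).astype(np.float32)
out = np.zeros_like(a)

# Launch enough threads to cover every element; the GPU runs them in parallel.
threads_per_block = 256
blocks = (n + threads_per_block - 1) // threads_per_block
scale_and_add[blocks, threads_per_block](a, b, out)

print(out[:5])  # should equal 2*a + b for the first five elements

The work is split into a million bite-sized chunks, one per element, and the GPU's execution units churn through them in parallel. That same division of labor is what CUDA applies to scientific simulations and, later, to neural network training.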

But there was little interest in CUDA when it was first introduced, wrote Steven Witt in The New Yorker last year:

When CUDA was released, in late 2006, Wall Street reacted with dismay. Huang was bringing supercomputing to the masses, but the masses had shown no indication that they wanted such a thing.

“They were spending a fortune on this new chip architecture,” Ben Gilbert, the co-host of “Acquired,” a popular Silicon Valley podcast, said. “They were spending many billions targeting an obscure corner of academic and scientific computing, which was not a large market at the time—certainly less than the billions they were pouring in.”

Huang argued that the simple existence of CUDA would enlarge the supercomputing sector. This view was not widely held, and by the end of 2008, Nvidia’s stock price had declined by seventy percent…

Downloads of CUDA hit a peak in 2009, then declined for three years. Board members worried that Nvidia’s depressed stock price would make it a target for corporate raiders.

Huang wasn’t specifically thinking about AI or neural networks when he created the CUDA platform. But it turned out that Hinton’s backpropagation algorithm could easily be split up into bite-sized chunks. So training neural networks turned out to be a killer app for CUDA.

According to Witt, Hinton was quick to recognize the potential of CUDA:

In 2009, Hinton’s research group used Nvidia’s CUDA platform to train a neural network to recognize human speech. He was surprised by the quality of the results, which he presented at a conference later that year. He then reached out to Nvidia. “I sent an e-mail saying, ‘Look, I just told a thousand machine-learning researchers they should go and buy Nvidia cards. Can you send me a free one?’ ” Hinton told me. “They said no.”

Despite the snub, Hinton and his graduate students, Alex Krizhevsky and Ilya Sutskever, obtained a pair of Nvidia GTX 580 GPUs for the AlexNet project. Each GPU had 512 execution units, allowing Krizhevsky and Sutskever to train a neural network hundreds of times faster than would be possible with a CPU. This speed allowed them to train a larger model—and to train it on many more training images. And they would need all that extra computing power to tackle the massive ImageNet dataset.

Fei-Fei Li

Fei-Fei Li at the SXSW conference in 2018.

Credit: Photo by Hubert Vestil/Getty Images for SXSW


Fei-Fei Li wasn’t thinking about either neural networks or GPUs as she began a new job as a computer science professor at Princeton in January of 2007. While earning her PhD at Caltech, she had built a dataset called Caltech 101 that had 9,000 images across 101 categories.

That experience had taught her that computer vision algorithms tended to perform better with larger and more diverse training datasets. Not only had Li found her own algorithms performed better when trained on Caltech 101, but other researchers also started training their models using Li’s dataset and comparing their performance to one another. This turned Caltech 101 into a benchmark for the field of computer vision.

So when she got to Princeton, Li decided to go much bigger. She became obsessed with an estimate by vision scientist Irving Biederman that the average person recognizes roughly 30,000 different kinds of objects. Li started to wonder if it would be possible to build a truly comprehensive image dataset—one that included every kind of object people commonly encounter in the physical world.

A Princeton colleague told Li about WordNet, a massive database that attempted to catalog and organize 140,000 words. Li called her new dataset ImageNet, and she used WordNet as a starting point for choosing categories. She eliminated verbs and adjectives, as well as intangible nouns like “truth.” That left a list of 22,000 countable objects ranging from “ambulance” to “zucchini.”

She planned to take the same approach she’d taken with the Caltech 101 dataset: use Google’s image search to find candidate images, then have a human being verify them. For the Caltech 101 dataset, Li had done this herself over the course of a few months. This time she would need more help. She planned to hire dozens of Princeton undergraduates to help her choose and label images.

But even after heavily optimizing the labeling process—for example, pre-downloading candidate images so they’re instantly available for students to review—Li and her graduate student Jia Deng calculated that it would take more than 18 years to select and label millions of images.

The project was saved when Li learned about Amazon Mechanical Turk, a crowdsourcing platform Amazon had launched a couple of years earlier. Not only was AMT’s international workforce more affordable than Princeton undergraduates, but the platform was also far more flexible and scalable. Li’s team could hire as many people as they needed, on demand, and pay them only as long as they had work available.

AMT cut the time needed to complete ImageNet from a projected 18 years down to about two. Li writes that her lab spent two years “on the knife-edge of our finances” as the team struggled to complete the ImageNet project. But they had enough funds to pay three people to look at each of the 14 million images in the final data set.

ImageNet was ready for publication in 2009, and Li submitted it to the Conference on Computer Vision and Pattern Recognition, which was held in Miami that year. Their paper was accepted, but it didn’t get the kind of recognition Li hoped for.

“ImageNet was relegated to a poster session,” Li writes. “This meant that we wouldn’t be presenting our work in a lecture hall to an audience at a predetermined time but would instead be given space on the conference floor to prop up a large-format print summarizing the project in hopes that passersby might stop and ask questions… After so many years of effort, this just felt anticlimactic.”

To generate public interest, Li turned ImageNet into a competition. Realizing that the full dataset might be too unwieldy to distribute to dozens of contestants, she created a much smaller (but still massive) dataset with 1,000 categories and 1.4 million images.

The first year’s competition in 2010 generated a healthy amount of interest, with 11 teams participating. The winning entry was based on support vector machines. Unfortunately, Li writes, it was “only a slight improvement over cutting-edge work found elsewhere in our field.”

The second year of the ImageNet competition attracted fewer entries than the first. The winning entry in 2011 was another support vector machine, and it just barely improved on the performance of the 2010 winner. Li started to wonder if the critics had been right. Maybe “ImageNet was too much for most algorithms to handle.”

“For two years running, well-worn algorithms had exhibited only incremental gains in capabilities, while true progress seemed all but absent,” Li writes. “If ImageNet was a bet, it was time to start wondering if we’d lost.”

But when Li reluctantly staged the competition a third time in 2012, the results were totally different. Geoff Hinton’s team was the first to submit a model based on a deep neural network. And its top-5 accuracy was 85 percent—10 percentage points better than the 2011 winner.

Li’s initial reaction was incredulity: “Most of us saw the neural network as a dusty artifact encased in glass and protected by velvet ropes.”

“This is proof”

Yann LeCun testifies before the US Senate in September.

Credit: Photo by Kevin Dietsch/Getty Images


The ImageNet winners were scheduled to be announced at the European Conference on Computer Vision in Florence, Italy. Li, who had a baby at home in California, was planning to skip the event. But when she saw how well AlexNet had done on her dataset, she realized this moment would be too important to miss: “I settled reluctantly on a twenty-hour slog of sleep deprivation and cramped elbow room.”

On an October day in Florence, Alex Krizhevsky presented his results to a standing-room-only crowd of computer vision researchers. Fei-Fei Li was in the audience. So was Yann LeCun.

Cade Metz reports that after the presentation, LeCun stood up and called AlexNet “an unequivocal turning point in the history of computer vision. This is proof.”

The success of AlexNet vindicated Hinton’s faith in neural networks, but it was arguably an even bigger vindication for LeCun.

AlexNet was a convolutional neural network, a type of neural network that LeCun had developed 20 years earlier to recognize handwritten digits on checks. (For more details on how CNNs work, see the in-depth explainer I wrote for Ars in 2018.) Indeed, there were few architectural differences between AlexNet and LeCun’s image recognition networks from the 1990s.

AlexNet was simply far larger. In a 1998 paper, LeCun described a document-recognition network with seven layers and 60,000 trainable parameters. AlexNet had eight layers, but these layers had 60 million trainable parameters.
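
A quick back-of-the-envelope calculation shows where numbers like that come from. The helper functions below are just illustrative, but the layer sizes are taken from the published AlexNet architecture: a convolutional layer reuses one small set of weights across the whole image, while a fully connected layer needs a separate weight for every input-output pair.

def conv_params(kernel_size, in_channels, out_channels):
    # Each output channel learns one kernel spanning all input channels, plus a bias.
    return (kernel_size * kernel_size * in_channels + 1) * out_channels

def dense_params(n_inputs, n_outputs):
    # A fully connected layer has one weight per input-output pair, plus biases.
    return n_inputs * n_outputs + n_outputs

# AlexNet's first convolutional layer: 96 filters of size 11x11 over 3 color channels.
print(conv_params(11, 3, 96))      # roughly 35,000 parameters

# One of AlexNet's fully connected layers: 4,096 inputs feeding 4,096 outputs.
print(dense_params(4096, 4096))    # roughly 16.8 million parameters

Most of AlexNet's 60 million parameters sit in those final fully connected layers, capacity that is only useful when there is enough training data, and enough computing power, to fit it.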

LeCun could not have trained a model that large in the early 1990s because there were no computer chips with as much processing power as a 2012-era GPU. Even if LeCun had managed to build a big enough supercomputer, he would not have had enough images to train it properly. Collecting those images would have been hugely expensive in the years before Google and Amazon Mechanical Turk.

And this is why Fei-Fei Li’s work on ImageNet was so consequential. She didn’t invent convolutional networks or figure out how to make them run efficiently on GPUs. But she provided the training data that large neural networks needed to reach their full potential.

The technology world immediately recognized the importance of AlexNet. Hinton and his students formed a shell company with the goal to be “acquihired” by a big tech company. Within months, Google purchased the company for $44 million. Hinton worked at Google for the next decade while retaining his academic post in Toronto. Ilya Sutskever spent a few years at Google before becoming a cofounder of OpenAI.

AlexNet also made Nvidia GPUs the industry standard for training neural networks. In 2012, the market valued Nvidia at less than $10 billion. Today, Nvidia is one of the most valuable companies in the world, with a market capitalization north of $3 trillion. That high valuation is driven mainly by overwhelming demand for GPUs like the H100 that are optimized for training neural networks.

Sometimes the conventional wisdom is wrong

“That moment was pretty symbolic to the world of AI because three fundamental elements of modern AI converged for the first time,” Li said in a September interview at the Computer History Museum. “The first element was neural networks. The second element was big data, using ImageNet. And the third element was GPU computing.”

Today, leading AI labs believe the key to progress in AI is to train huge models on vast data sets. Big technology companies are in such a hurry to build the data centers required to train larger models that they’ve started to lease entire nuclear power plants to provide the necessary power.

You can view this as a straightforward application of the lessons of AlexNet. But I wonder if we ought to draw the opposite lesson from AlexNet: that it’s a mistake to become too wedded to conventional wisdom.

“Scaling laws” have had a remarkable run in the 12 years since AlexNet, and perhaps we’ll see another generation or two of impressive results as the leading labs scale up their foundation models even more.

But we should be careful not to let the lessons of AlexNet harden into dogma. I think there’s at least a chance that scaling laws will run out of steam in the next few years. And if that happens, we’ll need a new generation of stubborn nonconformists to notice that the old approach isn’t working and try something different.

Tim Lee was on staff at Ars from 2017 to 2021. Last year, he launched a newsletter, Understanding AI, that explores how AI works and how it’s changing our world. You can subscribe here.


Timothy is a senior reporter covering tech policy and the future of transportation. He lives in Washington DC.



DOJ subpoenas Nvidia in deepening AI antitrust probe, report says


The Department of Justice is reportedly deepening its probe into Nvidia. Officials have moved on from merely questioning competitors to subpoenaing Nvidia and other tech companies for evidence that could substantiate allegations that Nvidia is abusing its “dominant position in AI computing,” Bloomberg reported.

When news of the DOJ’s probe into the trillion-dollar company was first reported in June, Fast Company reported that scrutiny was intensifying merely because Nvidia was estimated to control “as much as 90 percent of the market for chips” capable of powering AI models. Experts told Fast Company that the DOJ probe might even be good for Nvidia’s business, noting that the market barely moved when the probe was first announced.

But the market’s confidence seemed to be shaken a little more on Tuesday, when Nvidia lost a “record-setting $279 billion” in market value following Bloomberg’s report. Nvidia’s losses became “the biggest single-day market-cap decline on record,” TheStreet reported.

People close to the DOJ’s investigation told Bloomberg that the DOJ’s “legally binding requests” require competitors “to provide information” on Nvidia’s suspected anticompetitive behaviors as a “dominant provider of AI processors.”

One concern is that Nvidia may be giving “preferential supply and pricing to customers who use its technology exclusively or buy its complete systems,” sources told Bloomberg. The DOJ is also reportedly probing Nvidia’s acquisition of RunAI—suspecting the deal may lock RunAI customers into using Nvidia chips.

Bloomberg’s report builds on a report last month from The Information that said that Advanced Micro Devices Inc. (AMD) and other Nvidia rivals were questioned by the DOJ—as well as third parties who could shed light on whether Nvidia potentially abused its market dominance in AI chips to pressure customers into buying more products.

According to Bloomberg’s sources, the DOJ is worried that “Nvidia is making it harder to switch to other suppliers and penalizes buyers that don’t exclusively use its artificial intelligence chips.”

In a statement to Bloomberg, Nvidia insisted that “Nvidia wins on merit, as reflected in our benchmark results and value to customers, who can choose whatever solution is best for them.” Additionally, Bloomberg noted that following a chip shortage in 2022, Nvidia CEO Jensen Huang has said that his company strives to prevent stockpiling of Nvidia’s coveted AI chips by prioritizing customers “who can make use of his products in ready-to-go data centers.”

Potential threats to Nvidia’s dominance

Despite the slump in shares, Nvidia’s market dominance seems unlikely to wane any time soon after its stock more than doubled this year. In an SEC filing this year, Nvidia bragged that its “accelerated computing ecosystem is bringing AI to every enterprise” with an “ecosystem” spanning “nearly 5 million developers and 40,000 companies.” Nvidia specifically highlighted that “more than 1,600 generative AI companies are building on Nvidia,” and according to Bloomberg, Nvidia will close out 2024 with more profits than the total sales of its closest competitor, AMD.

After the DOJ’s most recent big win, which successfully proved that Google has a monopoly on search, the DOJ appears intent on getting ahead of any tech companies’ ambitions to seize monopoly power and essentially become the Google of the AI industry. In June, DOJ antitrust chief Jonathan Kanter confirmed to the Financial Times that the DOJ is examining “monopoly choke points and the competitive landscape” in AI beyond just scrutinizing Nvidia.

According to Kanter, the DOJ is scrutinizing all aspects of the AI industry—”everything from computing power and the data used to train large language models, to cloud service providers, engineering talent and access to essential hardware such as graphics processing unit chips.” But in particular, the DOJ appears concerned that GPUs like Nvidia’s advanced AI chips remain a “scarce resource.” Kanter told the Financial Times that an “intervention” in “real time” to block a potential monopoly could be “the most meaningful intervention” and the least “invasive” as the AI industry grows.



AI’s future in grave danger from Nvidia’s chokehold on chips, groups warn

Controlling “the world’s computing destiny” —

Anti-monopoly groups want DOJ to probe Nvidia’s AI chip bundling, alleged price-fixing.


Sen. Elizabeth Warren (D-Mass.) has joined progressive groups—including Demand Progress, Open Markets Institute, and the Tech Oversight Project—pressuring the US Department of Justice to investigate Nvidia’s dominance in the AI chip market due to alleged antitrust concerns, Reuters reported.

In a letter to the DOJ’s chief antitrust enforcer, Jonathan Kanter, groups demanding more Big Tech oversight raised alarms that Nvidia’s top rivals apparently “are struggling to gain traction” because “Nvidia’s near-absolute dominance of the market is difficult to counter” and “funders are wary of backing its rivals.”

Nvidia is currently “the world’s most valuable public company,” their letter said, worth more than $3 trillion after taking near-total control of the high-performance AI chip market. Particularly “astonishing,” the letter said, was Nvidia’s dominance in the market for GPU accelerator chips, which are at the heart of today’s leading AI. Groups urged Kanter to probe Nvidia’s business practices to ensure that rivals aren’t permanently blocked from competing.

According to the advocacy groups that strongly oppose Big Tech monopolies, Nvidia “now holds an 80 percent overall global market share in GPU chips and a 98 percent share in the data center market.” This “puts it in a position to crowd out competitors and set global pricing and the terms of trade,” the letter warned.

Earlier this year, inside sources reported that the DOJ and the Federal Trade Commission reached a deal where the DOJ would probe Nvidia’s alleged anti-competitive behavior in the booming AI industry, and the FTC would probe OpenAI and Microsoft. But there has been no official Nvidia probe announced, prompting progressive groups to push harder for the DOJ to recognize what they view as a “dire danger to the open market” that “well deserves DOJ scrutiny.”

Ultimately, the advocacy groups told Kanter that they fear Nvidia wielding “control over the world’s computing destiny,” noting that Nvidia’s cloud computing data centers don’t just power “Big Tech’s consumer products” but also “underpin every aspect of contemporary society, including the financial system, logistics, healthcare, and defense.”

They claimed that Nvidia is “leveraging” its “scarce chips” to force customers to buy its “chips, networking, and programming software as a package.” Such bundling and “price-fixing,” their letter warned, appear to be “the same kinds of anti-competitive tactics that the courts, in response to actions brought by the Department of Justice against other companies, have found to be illegal” and could perhaps “stifle innovation.”

Although data from TechInsights suggested that Nvidia’s chip shortage and cost actually helped companies like AMD and Intel sell chips in 2023, both Nvidia rivals reported losses in market share earlier this year, Yahoo Finance reported.

Perhaps most closely monitoring Nvidia’s dominance, French antitrust authorities launched an investigation into Nvidia last month over antitrust concerns, the letter said, “making it the first enforcer to act against the computer chip maker,” Reuters reported.

Since then, the European Union and the United Kingdom, as well as the US, have heightened scrutiny, but their apparent reluctance to follow through with an official investigation may only embolden Nvidia, as the company allegedly “believes its market behavior is above the law,” the progressive groups wrote. Suspicious behavior includes allegations that “Nvidia has continued to sell chips to Chinese customers and provide them computing access” despite a “Department of Commerce ban on trading with Chinese companies due to national security and human rights concerns.”

“Its chips have been confirmed to be reaching blacklisted Chinese entities,” their letter warned, citing a Wall Street Journal report.

Nvidia’s dominance apparently impacts everyone involved with AI. According to the letter, Nvidia seemingly “determining who receives inventory from a limited supply, setting premium pricing, and contractually blocking customers from doing business with competitors” is “alarming” the entire AI industry. That includes “both small companies (who find their supply choked off) and the Big Tech AI giants.”

Kanter will likely be receptive to the letter. In June, Fast Company reported that Kanter told an audience at an AI conference that there are “structures and trends in AI that should give us pause.” He further suggested that any technology that “relies on massive amounts of data and computing power” can “give already dominant firms a substantial advantage,” according to Fast Company’s summary of his remarks.



Nvidia jumps ahead of itself and reveals next-gen “Rubin” AI chips in keynote tease

Swing beat —

“I’m not sure yet whether I’m going to regret this,” says CEO Jensen Huang at Computex 2024.

Nvidia’s CEO Jensen Huang delivers his keynote speech ahead of Computex 2024 in Taipei on June 2, 2024.

On Sunday, Nvidia CEO Jensen Huang reached beyond Blackwell and revealed the company’s next-generation AI-accelerating GPU platform during his keynote at Computex 2024 in Taiwan. Huang also detailed plans for an annual tick-tock-style upgrade cycle of its AI acceleration platforms, mentioning an upcoming Blackwell Ultra chip slated for 2025 and a subsequent platform called “Rubin” set for 2026.

Nvidia’s data center GPUs currently power a large majority of cloud-based AI models, such as ChatGPT, in both development (training) and deployment (inference) phases, and investors are keeping a close watch on the company, with expectations to keep that run going.

During the keynote, Huang seemed somewhat hesitant to make the Rubin announcement, perhaps wary of invoking the so-called Osborne effect, whereby a company’s premature announcement of the next iteration of a tech product eats into the current iteration’s sales. “This is the very first time that this next click has been made,” Huang said, holding up his presentation remote just before the Rubin announcement. “And I’m not sure yet whether I’m going to regret this or not.”


The Rubin AI platform, expected in 2026, will use HBM4 (a new form of high-bandwidth memory) and NVLink 6 Switch, operating at 3,600GBps. Following that launch, Nvidia will release a tick-tock iteration called “Rubin Ultra.” While Huang did not provide extensive specifications for the upcoming products, he promised cost and energy savings related to the new chipsets.

During the keynote, Huang also introduced a new ARM-based CPU called “Vera,” which will be featured on a new accelerator board called “Vera Rubin,” alongside one of the Rubin GPUs.

Much like Nvidia’s Grace Hopper architecture, which combines a “Grace” CPU and a “Hopper” GPU to pay tribute to the pioneering computer scientist of the same name, Vera Rubin refers to Vera Florence Cooper Rubin (1928–2016), an American astronomer who made discoveries in the field of deep space astronomy. She is best known for her pioneering work on galaxy rotation rates, which provided strong evidence for the existence of dark matter.

A calculated risk

Nvidia CEO Jensen Huang reveals the “Rubin” AI platform for the first time during his keynote at Computex 2024 on June 2, 2024.

Nvidia’s reveal of Rubin is not a surprise in the sense that most big tech companies are continuously working on follow-up products well in advance of release, but it’s notable because it comes just three months after the company revealed Blackwell, which is barely out of the gate and not yet widely shipping.

At the moment, the company seems to be comfortable leapfrogging itself with new announcements and catching up later; Nvidia just announced that its GH200 Grace Hopper “Superchip,” unveiled one year ago at Computex 2023, is now in full production.

With Nvidia stock rising and the company possessing an estimated 70–95 percent of the data center GPU market share, the Rubin reveal is a calculated risk that seems to come from a place of confidence. That confidence could turn out to be misplaced if a so-called “AI bubble” pops or if Nvidia misjudges the capabilities of its competitors. The announcement may also stem from pressure to continue Nvidia’s astronomical growth in market cap with nonstop promises of improving technology.

Accordingly, Huang has been eager to showcase the company’s plans to continue pushing silicon fabrication tech to its limits and widely broadcast that Nvidia plans to keep releasing new AI chips at a steady cadence.

“Our company has a one-year rhythm. Our basic philosophy is very simple: build the entire data center scale, disaggregate and sell to you parts on a one-year rhythm, and we push everything to technology limits,” Huang said during Sunday’s Computex keynote.

Despite Nvidia’s recent market performance, the company’s run may not continue indefinitely. With ample money pouring into the data center AI space, Nvidia isn’t alone in developing accelerator chips. Competitors like AMD (with the Instinct series) and Intel (with Gaudi 3) also want to win a slice of the data center GPU market away from Nvidia’s current command of the AI-accelerator space. And OpenAI’s Sam Altman is trying to encourage diversified production of GPU hardware that will power the company’s next generation of AI models in the years ahead.



Nvidia announces “moonshot” to create embodied human-level AI in robot form

Here come the robots —

As companies race to pair AI with general-purpose humanoid robots, Nvidia’s GR00T emerges.

An illustration of a humanoid robot created by Nvidia.


Nvidia

In sci-fi films, the rise of humanlike artificial intelligence often comes hand in hand with a physical platform, such as an android or robot. While the most advanced AI language models so far seem mostly like disembodied voices echoing from an anonymous data center, they might not remain that way for long. Some companies like Google, Figure, Microsoft, Tesla, Boston Dynamics, and others are working toward giving AI models a body. This is called “embodiment,” and AI chipmaker Nvidia wants to accelerate the process.

“Building foundation models for general humanoid robots is one of the most exciting problems to solve in AI today,” said Nvidia CEO Jensen Huang in a statement. Huang spent a portion of Nvidia’s annual GTC conference keynote on Monday going over Nvidia’s robotics efforts. “The next generation of robotics will likely be humanoid robotics,” Huang said. “We now have the necessary technology to imagine generalized human robotics.”

To that end, Nvidia announced Project GR00T, a general-purpose foundation model for humanoid robots. As a type of AI model itself, Nvidia hopes GR00T (which stands for “Generalist Robot 00 Technology” but sounds a lot like a famous Marvel character) will serve as an AI mind for robots, enabling them to learn skills and solve various tasks on the fly. In a tweet, Nvidia researcher Linxi “Jim” Fan called the project “our moonshot to solve embodied AGI in the physical world.”

AGI, or artificial general intelligence, is a poorly defined term that usually refers to hypothetical human-level AI (or beyond) that can learn any task a human could without specialized training. Given a capable enough humanoid body driven by AGI, one could imagine fully autonomous robotic assistants or workers. Of course, some experts think that true AGI is a long way off, so it’s possible that Nvidia’s goal is more aspirational than realistic. But that’s also what makes Nvidia’s plan a moonshot.


“The GR00T model will enable a robot to understand multimodal instructions, such as language, video, and demonstration, and perform a variety of useful tasks,” wrote Fan on X. “We are collaborating with many leading humanoid companies around the world, so that GR00T may transfer across embodiments and help the ecosystem thrive.” We reached out to Nvidia researchers, including Fan, for comment but did not hear back by press time.

Nvidia is designing GR00T to understand natural language and emulate human movements, potentially allowing robots to learn coordination, dexterity, and other skills necessary for navigating and interacting with the real world like a person. And as it turns out, Nvidia says that making robots shaped like humans might be the key to creating functional robot assistants.

The humanoid key

Robotics startup Figure, an Nvidia partner, recently showed off its humanoid “Figure 01” robot.

Figure

So far, we’ve seen plenty of robotics platforms that aren’t human-shaped, including robot vacuum cleaners, autonomous weed pullers, industrial units used in automobile manufacturing, and even research arms that can fold laundry. So why focus on imitating the human form? “In a way, human robotics is likely easier,” said Huang in his GTC keynote. “And the reason for that is because we have a lot more imitation training data that we can provide robots, because we are constructed in a very similar way.”

That means that researchers can feed samples of training data captured from human movement into AI models that control robot movement, teaching them how to better move and balance themselves. Also, humanoid robots are particularly convenient because they can fit anywhere a person can, and we’ve designed a world of physical objects and interfaces (such as tools, furniture, stairs, and appliances) to be used or manipulated by the human form.

Along with GR00T, Nvidia also debuted a new computer platform called Jetson Thor, based on NVIDIA’s Thor system-on-a-chip (SoC), as part of the new Blackwell GPU architecture, which it hopes will power this new generation of humanoid robots. The SoC reportedly includes a transformer engine capable of 800 teraflops of 8-bit floating point AI computation for running models like GR00T.



Nvidia unveils Blackwell B200, the “world’s most powerful chip” designed for AI

There’s no knowing where we’re rowing —

208B transistor chip can reportedly reduce AI cost and energy consumption by up to 25x.

The GB200 “superchip” covered with a fanciful blue explosion.

Nvidia / Benj Edwards

On Monday, Nvidia unveiled the Blackwell B200 tensor core chip—the company’s most powerful single-chip GPU, with 208 billion transistors—which Nvidia claims can reduce AI inference operating costs (such as running ChatGPT) and energy consumption by up to 25 times compared to the H100. The company also unveiled the GB200, a “superchip” that combines two B200 chips and a Grace CPU for even more performance.

The news came as part of Nvidia’s annual GTC conference, which is taking place this week at the San Jose Convention Center. Nvidia CEO Jensen Huang delivered the keynote Monday afternoon. “We need bigger GPUs,” Huang said during his keynote. The Blackwell platform will allow the training of trillion-parameter AI models that will make today’s generative AI models look rudimentary in comparison, he said. For reference, OpenAI’s GPT-3, launched in 2020, included 175 billion parameters. Parameter count is a rough indicator of AI model complexity.

Nvidia named the Blackwell architecture after David Harold Blackwell, a mathematician who specialized in game theory and statistics and was the first Black scholar inducted into the National Academy of Sciences. The platform introduces six technologies for accelerated computing, including a second-generation Transformer Engine, fifth-generation NVLink, RAS Engine, secure AI capabilities, and a decompression engine for accelerated database queries.

Press photo of the Grace Blackwell GB200 chip, which combines two B200 GPUs with a Grace CPU into one chip.


Several major organizations, such as Amazon Web Services, Dell Technologies, Google, Meta, Microsoft, OpenAI, Oracle, Tesla, and xAI, are expected to adopt the Blackwell platform, and Nvidia’s press release is replete with canned quotes from tech CEOs (key Nvidia customers) like Mark Zuckerberg and Sam Altman praising the platform.

GPUs, once only designed for gaming acceleration, are especially well suited for AI tasks because their massively parallel architecture accelerates the immense number of matrix multiplication tasks necessary to run today’s neural networks. With the dawn of new deep learning architectures in the 2010s, Nvidia found itself in an ideal position to capitalize on the AI revolution and began designing specialized GPUs just for the task of accelerating AI models.

Nvidia’s data center focus has made the company wildly rich and valuable, and these new chips continue the trend. Nvidia’s gaming GPU revenue ($2.9 billion in the last quarter) is dwarfed in comparison to data center revenue (at $18.4 billion), and that shows no signs of stopping.

A beast within a beast

Press photo of the Nvidia GB200 NVL72 data center computer system.


The aforementioned Grace Blackwell GB200 chip arrives as a key part of the new NVIDIA GB200 NVL72, a multi-node, liquid-cooled data center computer system designed specifically for AI training and inference tasks. It combines 36 GB200s (that’s 72 B200 GPUs and 36 Grace CPUs total), interconnected by fifth-generation NVLink, which links chips together to multiply performance.

A specification chart for the Nvidia GB200 NVL72 system.


“The GB200 NVL72 provides up to a 30x performance increase compared to the same number of NVIDIA H100 Tensor Core GPUs for LLM inference workloads and reduces cost and energy consumption by up to 25x,” Nvidia said.

That kind of speed-up could potentially save money and time while running today’s AI models, but it will also allow for more complex AI models to be built. Generative AI models—like the kind that power Google Gemini and AI image generators—are famously computationally hungry. Shortages of compute power have widely been cited as holding back progress and research in the AI field, and the search for more compute has led to figures like OpenAI CEO Sam Altman trying to broker deals to create new chip foundries.

While Nvidia’s claims about the Blackwell platform’s capabilities are significant, it’s worth noting that its real-world performance and adoption of the technology remain to be seen as organizations begin to implement and utilize the platform themselves. Competitors like Intel and AMD are also looking to grab a piece of Nvidia’s AI pie.

Nvidia says that Blackwell-based products will be available from various partners starting later this year.
