
Words are flowing out like endless rain: Recapping a busy week of LLM news

many things frequently —

Gemini 1.5 Pro launch, new version of GPT-4 Turbo, new Mistral model, and more.

An image of a boy amazed by flying letters.


Some weeks in AI news are eerily quiet, but during others, getting a grip on the week’s events feels like trying to hold back the tide. This week has seen three notable large language model (LLM) releases: Google’s Gemini 1.5 Pro hit general availability with a free tier, OpenAI shipped a new version of GPT-4 Turbo, and Mistral released a new openly licensed LLM, Mixtral 8x22B. All three launches happened within 24 hours, starting on Tuesday.

With the help of software engineer and independent AI researcher Simon Willison (who also wrote about this week’s hectic LLM launches on his own blog), we’ll briefly cover each of the three major events in roughly chronological order, then dig into some additional AI happenings this week.

Gemini 1.5 Pro general release

On Tuesday morning Pacific time, Google announced that its Gemini 1.5 Pro model (which we first covered in February) is now available in 180-plus countries, excluding Europe, via the Gemini API in a public preview. This is Google’s most powerful public LLM so far, and it’s available in a free tier that permits up to 50 requests a day.

It supports up to 1 million tokens of input context. As Willison notes on his blog, Gemini 1.5 Pro’s API pricing of $7/million input tokens and $21/million output tokens runs a little less than GPT-4 Turbo’s (priced at $10/million in and $30/million out) and more than Claude 3 Sonnet’s (Anthropic’s mid-tier LLM, priced at $3/million in and $15/million out).
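
For a concrete sense of what those rates mean per request, here is a minimal sketch of the cost arithmetic; the per-million prices come from the figures above, while the function name and token counts are illustrative assumptions.

```python
# Per-request cost arithmetic for the prices quoted above.
# Rates are from this article; the token counts are made-up examples.
PRICES_PER_MILLION = {
    "gemini-1.5-pro": (7.00, 21.00),   # (input, output) in $/million tokens
    "gpt-4-turbo": (10.00, 30.00),
    "claude-3-sonnet": (3.00, 15.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of a single API call."""
    in_rate, out_rate = PRICES_PER_MILLION[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Example: a 100,000-token prompt that yields a 1,000-token reply.
for name in PRICES_PER_MILLION:
    print(f"{name}: ${request_cost(name, 100_000, 1_000):.2f}")
# Prints roughly $0.72, $1.03, and $0.32 respectively.
```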

Notably, Gemini 1.5 Pro includes native audio (speech) input processing that allows users to upload audio or video prompts, a new File API for handling files, the ability to add custom system instructions (system prompts) for guiding model responses, and a JSON mode for structured data extraction.
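
Here is a hedged sketch of how those features surface in Google’s google-generativeai Python SDK; the model name, file name, and exact parameter spellings below are assumptions to verify against current documentation.

```python
# Sketch of Gemini 1.5 Pro's new API features (File API, system
# instructions, JSON mode). Parameter names are assumptions based on the
# google-generativeai SDK; check Google's docs before relying on them.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# Custom system instructions steer every response from the model.
model = genai.GenerativeModel(
    "gemini-1.5-pro-latest",
    system_instruction="You are a terse assistant. Answer in one sentence.",
)

# The File API handles uploads, including audio for native speech input.
audio = genai.upload_file("meeting_recording.mp3")  # hypothetical file

# JSON mode constrains output to structured data for extraction tasks.
response = model.generate_content(
    ["List the speakers and topics in this recording as JSON.", audio],
    generation_config={"response_mime_type": "application/json"},
)
print(response.text)
```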

“Majorly Improved” GPT-4 Turbo launch

A GPT-4 Turbo performance chart provided by OpenAI.


Just a bit later than Google’s 1.5 Pro launch on Tuesday, OpenAI announced that it was rolling out a “majorly improved” version of GPT-4 Turbo (a model family originally launched in November) called “gpt-4-turbo-2024-04-09.” It integrates multimodal GPT-4 Vision processing (recognizing the contents of images) directly into the model, and it initially launched through API access only.
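
Since the update initially shipped through the API only, developers opt into it by pinning the dated model string. A minimal sketch using OpenAI’s Python SDK, showing the integrated image input; the prompt and image URL are invented examples.

```python
# Requesting the dated GPT-4 Turbo snapshot, which handles image input
# directly, with OpenAI's Python SDK (v1.x). The image URL is an example.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4-turbo-2024-04-09",  # the snapshot named in the announcement
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/photo.jpg"}},
        ],
    }],
)
print(response.choices[0].message.content)
```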

Then on Thursday, OpenAI announced that the new GPT-4 Turbo model had just become available for paid ChatGPT users. OpenAI said that the new model improves “capabilities in writing, math, logical reasoning, and coding” and shared a chart that is not particularly useful in judging capabilities (which the company later updated). The company also provided an example of an alleged improvement, saying that when writing with ChatGPT, the AI assistant will be “more direct, less verbose, and use more conversational language.”

The vague nature of OpenAI’s GPT-4 Turbo announcements attracted some confusion and criticism online. On X, Willison wrote, “Who will be the first LLM provider to publish genuinely useful release notes?” In some ways, this is a case of “AI vibes” again, as we discussed in our lament about the poor state of LLM benchmarks during the debut of Claude 3. “I’ve not actually spotted any definite differences in quality [related to GPT-4 Turbo],” Willison told us directly in an interview.

The update also extended GPT-4 Turbo’s knowledge cutoff to December 2023, although some people are reporting that it achieves apparent knowledge of more recent events through stealth web searches in the background, and others on social media have reported issues with date-related confabulations.

Mistral’s mysterious Mixtral 8x22B release

An illustration of a robot holding a French flag, figuratively reflecting the rise of AI in France due to Mistral. It's hard to draw a picture of an LLM, so a robot will have to do.


Not to be outdone, on Tuesday night, French AI company Mistral launched its latest openly licensed model, Mixtral 8x22B, by tweeting a torrent link devoid of any documentation or commentary, much like it has done with previous releases.

The new mixture-of-experts (MoE) release weighs in with a larger parameter count than Mistral’s previous most capable open model, Mixtral 8x7B, which we covered in December. It’s rumored to potentially be as capable as GPT-4 (in what way, you ask? Vibes), but that remains to be seen.
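
One reason “parameter count” is slippery for mixture-of-experts models: only a couple of experts run for each token, so active parameters are far fewer than total parameters. A back-of-the-envelope sketch follows; the shared/per-expert split below is an illustrative guess, not a figure Mistral published.

```python
# Back-of-the-envelope MoE parameter math. The shared/per-expert split
# is an illustrative guess; Mistral published no spec sheet alongside
# the torrent.
def moe_params(shared_b: float, expert_b: float, n_experts: int, top_k: int):
    total = shared_b + n_experts * expert_b   # everything stored on disk/RAM
    active = shared_b + top_k * expert_b      # what actually runs per token
    return total, active

# Mixtral-style routing: 8 experts with 2 active per token.
total, active = moe_params(shared_b=13.0, expert_b=16.0, n_experts=8, top_k=2)
print(f"total ≈ {total:.0f}B params, active per token ≈ {active:.0f}B")
# total ≈ 141B params, active per token ≈ 45B
```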

“The evals are still rolling in, but the biggest open question right now is how well Mixtral 8x22B shapes up,” Willison told Ars. “If it’s in the same quality class as GPT-4 and Claude 3 Opus, then we will finally have an openly licensed model that’s not significantly behind the best proprietary ones.”

This release has Willison most excited, saying, “If that thing really is GPT-4 class, it’s wild, because you can run that on a (very expensive) laptop. I think you need 128GB of MacBook RAM for it, twice what I have.”
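
Willison’s 128GB figure checks out with napkin math: weight memory is roughly the parameter count times bytes per parameter at a given quantization. The 141-billion-parameter total below is a community estimate from the released weights, not an official number.

```python
# Napkin math behind the "128GB of MacBook RAM" estimate. The 141B
# parameter total is a community estimate, not an official Mistral spec.
TOTAL_PARAMS_BILLIONS = 141

for label, bytes_per_param in [("fp16", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)]:
    gb = TOTAL_PARAMS_BILLIONS * bytes_per_param  # 1B params at 1 byte ≈ 1 GB
    print(f"{label}: ~{gb:.0f} GB for the weights alone")
# fp16: ~282 GB; 8-bit: ~141 GB; 4-bit: ~70 GB, which fits in 128 GB
# with room left over for activations.
```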

The new Mixtral is not listed on Chatbot Arena yet, Willison noted, because Mistral has not yet released a fine-tuned model for chatting. It’s still a raw, predict-the-next-token LLM. “There’s at least one community instruction tuned version floating around now though,” says Willison.
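
The practical difference with a raw next-token model is that it continues text rather than answering questions the way a chat assistant would. A hedged sketch with Hugging Face’s transformers library; the checkpoint name is a placeholder, and actually loading an 8x22B model requires far more memory than a typical machine has.

```python
# Sketch: a raw (non-instruction-tuned) model only predicts continuations.
# The repo id is a placeholder; loading 8x22B needs serious hardware.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "mistral-community/Mixtral-8x22B-v0.1"  # placeholder repo id
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, device_map="auto")

# A base model won't "answer"; it just continues the text statistically.
inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```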

Chatbot Arena Leaderboard shake-ups

A Chatbot Arena Leaderboard screenshot taken on April 12, 2024.


This week’s LLM news isn’t limited to just the big names in the field. There have also been rumblings on social media about the rising performance of open-weights models like Cohere’s Command R+, which reached position 6 on the LMSYS Chatbot Arena Leaderboard—the highest-ever ranking for an open-weights model.

And for even more Chatbot Arena action, the new version of GPT-4 Turbo is proving competitive with Claude 3 Opus. The two are still in a statistical tie, but GPT-4 Turbo recently pulled ahead numerically. (In March, we reported on Claude 3 first numerically pulling ahead of GPT-4 Turbo, the first time another AI model had surpassed a GPT-4 family model on the leaderboard.)
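
Chatbot Arena derives its rankings from an Elo-style rating over head-to-head human votes, which is why a small numerical lead can still be a statistical tie: the expected win probability between closely rated models is barely better than a coin flip. A quick illustration with invented ratings:

```python
# The standard Elo expected-score formula, the basis (in spirit) of
# Chatbot Arena's ratings. The ratings below are invented for illustration.
def expected_win_probability(rating_a: float, rating_b: float) -> float:
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

# Two models five points apart are nearly a coin flip.
print(round(expected_win_probability(1260, 1255), 3))  # ≈ 0.507
```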

Regarding this fierce competition among LLMs—of which most of the muggle world is unaware and will likely never be—Willison told Ars, “The past two months have been a whirlwind—we finally have not just one but several models that are competitive with GPT-4.” We’ll see whether OpenAI’s rumored release of GPT-5 later this year restores the company’s technological lead, which once seemed insurmountable. But for now, Willison says, “OpenAI are no longer the undisputed leaders in LLMs.”


OpenAI drops login requirements for ChatGPT’s free version

free as in beer? —

GPT-3.5 still falls far short of GPT-4, and other models surpassed it long ago.

A glowing OpenAI logo on a blue background.


On Monday, OpenAI announced that visitors to the ChatGPT website in some regions can now use the AI assistant without signing in. Previously, the company required that users create an account to use it, even with the free version of ChatGPT that is currently powered by the GPT-3.5 AI language model. But as we have noted in the past, GPT-3.5 is widely known to provide less accurate information than GPT-4 Turbo, which is available in paid versions of ChatGPT.

Since its launch in November 2022, ChatGPT has transformed over time from a tech demo to a comprehensive AI assistant, and it’s always had a free version available. The cost is free because “you’re the product,” as the old saying goes. Using ChatGPT helps OpenAI gather data that will help the company train future AI models, although free users and ChatGPT Plus subscription members can both opt out of allowing the data they input into ChatGPT to be used for AI training. (OpenAI says it never trains on inputs from ChatGPT Team and Enterprise members at all).

Opening ChatGPT to everyone could provide a frictionless on-ramp for people who might use it as a substitute for Google Search, and it could gain OpenAI new customers by offering an easy way to try ChatGPT quickly, with an upsell to paid versions of the service.

“It’s core to our mission to make tools like ChatGPT broadly available so that people can experience the benefits of AI,” OpenAI says on its blog page. “For anyone that has been curious about AI’s potential but didn’t want to go through the steps to set up an account, start using ChatGPT today.”

When you visit the ChatGPT website, you're immediately presented with a chat box like this (in some regions). Screenshot captured April 1, 2024.


Since kids will also be able to use ChatGPT without an account—despite it being against the terms of service—OpenAI also says it’s introducing “additional content safeguards,” such as blocking more prompts and “generations in a wider range of categories.” OpenAI has not elaborated on what exactly that entails, but we have reached out to the company for comment.

There might be a few other downsides to the fully open approach. On X, AI researcher Simon Willison wrote about the potential for automated abuse as a way to get around paying for OpenAI’s services: “I wonder how their scraping prevention works? I imagine the temptation for people to abuse this as a free 3.5 API will be pretty strong.”

With fierce competition, more GPT-3.5 access may backfire

Willison also mentioned a common criticism of OpenAI (as voiced in this case by Wharton professor Ethan Mollick) that people’s ideas about what AI models can do have so far largely been influenced by GPT-3.5, which, as we mentioned, is far less capable and far more prone to making things up than the paid version of ChatGPT that uses GPT-4 Turbo.

“In every group I speak to, from business executives to scientists, including a group of very accomplished people in Silicon Valley last night, much less than 20% of the crowd has even tried a GPT-4 class model,” wrote Mollick in a tweet from early March.

With models like Google Gemini 1.5 Pro and Anthropic Claude 3 potentially surpassing OpenAI’s best proprietary model at the moment—and open-weights AI models eclipsing the free version of ChatGPT—allowing people to use GPT-3.5 might not be putting OpenAI’s best foot forward. Microsoft Copilot, powered by OpenAI models, also supports a frictionless, no-login experience, but it allows access to a model based on GPT-4. Gemini, by contrast, currently requires a sign-in, and Anthropic sends a login code through email.

For now, OpenAI says the login-free version of ChatGPT is not yet available to everyone, but it will be coming soon: “We’re rolling this out gradually, with the aim to make AI accessible to anyone curious about its capabilities.”


Nvidia unveils Blackwell B200, the “world’s most powerful chip” designed for AI

There’s no knowing where we’re rowing —

208B transistor chip can reportedly reduce AI cost and energy consumption by up to 25x.

The GB200 “superchip” covered with a fanciful blue explosion.

On Monday, Nvidia unveiled the Blackwell B200 tensor core chip—the company’s most powerful single-chip GPU, with 208 billion transistors—which Nvidia claims can reduce AI inference operating costs (such as running ChatGPT) and energy consumption by up to 25 times compared to the H100. The company also unveiled the GB200, a “superchip” that combines two B200 chips and a Grace CPU for even more performance.

The news came as part of Nvidia’s annual GTC conference, which is taking place this week at the San Jose Convention Center. Nvidia CEO Jensen Huang delivered the keynote Monday afternoon. “We need bigger GPUs,” Huang said during his keynote. The Blackwell platform will allow the training of trillion-parameter AI models that will make today’s generative AI models look rudimentary in comparison, he said. For reference, OpenAI’s GPT-3, launched in 2020, included 175 billion parameters. Parameter count is a rough indicator of AI model complexity.
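
To see why trillion-parameter training demands “bigger GPUs,” consider that model weights, gradients, and optimizer state must all live in accelerator memory during training. The bytes-per-parameter figures below are common rules of thumb for mixed-precision training with Adam, not Nvidia’s numbers.

```python
# Rough memory math for a trillion-parameter model. The 16 bytes/param
# training figure is a common mixed-precision-Adam rule of thumb, not an
# Nvidia specification.
PARAMS = 1_000_000_000_000  # 1 trillion parameters

weights_tb = PARAMS * 2 / 1e12    # fp16/bf16 weights only
training_tb = PARAMS * 16 / 1e12  # weights + gradients + optimizer state

print(f"weights alone: ~{weights_tb:.0f} TB")          # ~2 TB
print(f"full training state: ~{training_tb:.0f} TB")   # ~16 TB, hence clusters
```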

Nvidia named the Blackwell architecture after David Harold Blackwell, a mathematician who specialized in game theory and statistics and was the first Black scholar inducted into the National Academy of Sciences. The platform introduces six technologies for accelerated computing, including a second-generation Transformer Engine, fifth-generation NVLink, RAS Engine, secure AI capabilities, and a decompression engine for accelerated database queries.

Press photo of the Grace Blackwell GB200 chip, which combines two B200 GPUs with a Grace CPU into one chip.


Several major organizations, such as Amazon Web Services, Dell Technologies, Google, Meta, Microsoft, OpenAI, Oracle, Tesla, and xAI, are expected to adopt the Blackwell platform, and Nvidia’s press release is replete with canned quotes from tech CEOs (key Nvidia customers) like Mark Zuckerberg and Sam Altman praising the platform.

GPUs, once designed only for gaming acceleration, are especially well suited for AI tasks because their massively parallel architecture accelerates the immense number of matrix multiplication operations necessary to run today’s neural networks. With the dawn of new deep learning architectures in the 2010s, Nvidia found itself in an ideal position to capitalize on the AI revolution and began designing specialized GPUs just for the task of accelerating AI models.

Nvidia’s data center focus has made the company wildly rich and valuable, and these new chips continue the trend. Nvidia’s gaming GPU revenue ($2.9 billion in the last quarter) is dwarfed by its data center revenue ($18.4 billion), and that trend shows no signs of stopping.

A beast within a beast

Press photo of the Nvidia GB200 NVL72 data center computer system.


The aforementioned Grace Blackwell GB200 chip arrives as a key part of the new NVIDIA GB200 NVL72, a multi-node, liquid-cooled data center computer system designed specifically for AI training and inference tasks. It combines 36 GB200s (that’s 72 B200 GPUs and 36 Grace CPUs total), interconnected by fifth-generation NVLink, which links chips together to multiply performance.

A specification chart for the Nvidia GB200 NVL72 system.


“The GB200 NVL72 provides up to a 30x performance increase compared to the same number of NVIDIA H100 Tensor Core GPUs for LLM inference workloads and reduces cost and energy consumption by up to 25x,” Nvidia said.

That kind of speed-up could potentially save money and time while running today’s AI models, but it will also allow for more complex AI models to be built. Generative AI models—like the kind that power Google Gemini and AI image generators—are famously computationally hungry. Shortages of compute power have widely been cited as holding back progress and research in the AI field, and the search for more compute has led to figures like OpenAI CEO Sam Altman trying to broker deals to create new chip foundries.

While Nvidia’s claims about the Blackwell platform’s capabilities are significant, its real-world performance remains to be seen as organizations begin to implement and deploy the platform. Competitors like Intel and AMD are also looking to grab a piece of Nvidia’s AI pie.

Nvidia says that Blackwell-based products will be available from various partners starting later this year.


Apple may hire Google to power new iPhone AI features using Gemini—report

Bake a cake as fast as you can —

With Apple’s own AI tech lagging behind, the firm looks for a fallback solution.


On Monday, Bloomberg reported that Apple is in talks to license Google’s Gemini model to power AI features like Siri in a future iPhone software update coming later in 2024, according to people familiar with the situation. Apple has also reportedly conducted similar talks with ChatGPT maker OpenAI.

The potential integration of Google Gemini into iOS 18 could bring a range of new cloud-based (off-device) AI-powered features to Apple’s smartphone, including image creation or essay writing based on simple prompts. However, the terms and branding of the agreement have not yet been finalized, and the implementation details remain unclear. The companies are unlikely to announce any deal until Apple’s annual Worldwide Developers Conference in June.

Gemini could also bring new capabilities to Apple’s widely criticized voice assistant, Siri, which trails newer AI assistants powered by large language models (LLMs) in understanding and responding to complex questions. Rumors of Apple’s own internal frustration with Siri—and potential remedies—have been kicking around for some time. In January, 9to5Mac revealed that Apple had been conducting tests with a beta version of iOS 17.4 that used OpenAI’s ChatGPT API to power Siri.

As we have previously reported, Apple has also been developing its own AI models, including a large language model codenamed Ajax and a basic chatbot called Apple GPT. However, the company’s LLM technology is said to lag behind that of its competitors, making a partnership with Google or another AI provider a more attractive option.

Google launched Gemini, a language-based AI assistant similar to ChatGPT, in December and has updated it several times since. Many industry experts consider the larger Gemini models to be roughly as capable as OpenAI’s GPT-4 Turbo, which powers the subscription versions of ChatGPT. Until the recent emergence of Gemini Ultra and Claude 3, OpenAI’s top model held a fairly wide lead in perceived LLM capability.

The potential partnership between Apple and Google could significantly impact the AI industry, as Apple’s platform represents more than 2 billion active devices worldwide. If the agreement gets finalized, it would build upon the existing search partnership between the two companies, which has seen Google pay Apple billions of dollars annually to make its search engine the default option on iPhones and other Apple devices.

However, Bloomberg reports that the potential partnership between Apple and Google is likely to draw scrutiny from regulators, as the companies’ current search deal is already the subject of a lawsuit by the US Department of Justice. The European Union is also pressuring Apple to make it easier for consumers to change their default search engine away from Google.

With so much potential money on the line, selecting Google for Apple’s cloud AI job could be a major loss for OpenAI’s effort to bring its technology into the mainstream, with a market representing billions of users. Even so, any deal with Google or OpenAI may be a temporary fix until Apple can get its own LLM-based AI technology up to speed.


Nvidia sued over AI training data as copyright clashes continue

In authors’ bad books —

Copyright suits over AI training data reportedly decreasing AI transparency.


Book authors are suing Nvidia, alleging that the chipmaker’s AI platform NeMo—used to power customized chatbots—was trained on a controversial dataset that illegally copied and distributed their books without their consent.

In a proposed class action, novelists Abdi Nazemian (Like a Love Story), Brian Keene (Ghost Walk), and Stewart O’Nan (Last Night at the Lobster) argued that Nvidia should pay damages and destroy all copies of the Books3 dataset used to power NeMo large language models (LLMs).

The Books3 dataset, novelists argued, copied “all of Bibliotik,” a shadow library of approximately 196,640 pirated books. Initially shared through the AI community Hugging Face, the Books3 dataset today “is defunct and no longer accessible due to reported copyright infringement,” the Hugging Face website says.

According to the authors, Hugging Face removed the dataset last October, but not before AI companies like Nvidia grabbed it and “made multiple copies.” By training NeMo models on this dataset, the authors alleged that Nvidia “violated their exclusive rights under the Copyright Act.” The authors argued that the US district court in San Francisco must intervene and stop Nvidia because the company “has continued to make copies of the Infringed Works for training other models.”

A Hugging Face spokesperson clarified to Ars that “Hugging Face never removed this dataset, and we did not host the Books3 dataset on the Hub.” Instead, “Hugging Face hosted a script that downloads the data from The Eye, which is the place where Eleuther hosted the data,” until “Eleuther removed the data from The Eye” over copyright concerns, causing the dataset script on Hugging Face to break.

Nvidia did not immediately respond to Ars’ request to comment.

Demanding a jury trial, authors are hoping the court will rule that Nvidia has no possible defense for both allegedly violating copyrights and intending “to cause further infringement” by distributing NeMo models “as a base from which to build further models.”

AI models decreasing transparency amid suits

The class action was filed by the same legal team representing authors suing OpenAI, whose lawsuit recently saw many claims dismissed, but crucially not their claim of direct copyright infringement. Lawyers told Ars last month that authors would be amending their complaints against OpenAI and were “eager to move forward and litigate” their direct copyright infringement claim.

In that lawsuit, the authors alleged copyright infringement both when OpenAI trained LLMs and when chatbots referenced books in outputs. But authors seemed more concerned about alleged damages from chatbot outputs, warning that AI tools had an “uncanny ability to generate text similar to that found in copyrighted textual materials, including thousands of books.”

Uniquely, in the Nvidia suit, authors are focused exclusively on Nvidia’s training data, seemingly concerned that Nvidia could empower businesses to build any number of AI models on the controversial dataset, which could allegedly infringe the works of thousands of authors through training alone.

There’s no telling yet how courts will rule on the direct copyright claims in either lawsuit—or in the New York Times’ lawsuit against OpenAI—but so far, OpenAI has failed to convince courts to toss claims aside.

However, OpenAI doesn’t appear very shaken by the lawsuits. In February, OpenAI said that it expected to beat book authors’ direct copyright infringement claim at a “later stage” of the case and, most recently in the New York Times case, tried to convince the court that NYT “hacked” ChatGPT to “set up” the lawsuit.

And Microsoft, a co-defendant in the NYT lawsuit, even more recently introduced a new argument that could help tech companies defeat copyright suits over LLMs. Last month, Microsoft argued that The New York Times was attempting to stop a “groundbreaking new technology” and would fail, just like movie producers attempting to kill off the VCR in the 1980s.

“Despite The Times’s contentions, copyright law is no more an obstacle to the LLM than it was to the VCR (or the player piano, copy machine, personal computer, Internet, or search engine),” Microsoft wrote.

In December, Hugging Face’s machine learning and society lead, Yacine Jernite, noted that developers appeared to be growing less transparent about training data after copyright lawsuits raised red flags about companies using the Books3 dataset, “especially for commercial models.”

Meta, for example, “limited the amount of information [it] disclosed about” its LLM, Llama-2, “to a single paragraph description and one additional page of safety and bias analysis—after [its] use of the Books3 dataset when training the first Llama model was brought up in a copyright lawsuit,” Jernite wrote.

Jernite warned that AI models lacking transparency could hinder “the ability of regulatory safeguards to remain relevant as training methods evolve, of individuals to ensure that their rights are respected, and of open science and development to play their role in enabling democratic governance of new technologies.” To support “more accountability,” Jernite recommended “minimum meaningful public transparency standards to support effective AI regulation,” as well as companies providing options for anyone to opt out of their data being included in training data.

“More data transparency supports better governance and fosters technology development that more reliably respects peoples’ rights,” Jernite wrote.


OpenAI CEO Altman wasn’t fired because of scary new tech, just internal politics

Adventures in optics —

As Altman cements power, OpenAI announces three new board members—and a returning one.

OpenAI CEO Sam Altman speaks during the OpenAI DevDay event on November 6, 2023, in San Francisco.


On Friday afternoon Pacific Time, OpenAI announced the appointment of three new members to the company’s board of directors and released the results of an independent review of the events surrounding CEO Sam Altman’s surprise firing last November. The current board expressed its confidence in the leadership of Altman and President Greg Brockman, and Altman is rejoining the board.

The newly appointed board members are Dr. Sue Desmond-Hellmann, former CEO of the Bill and Melinda Gates Foundation; Nicole Seligman, former EVP and global general counsel of Sony; and Fidji Simo, CEO and chair of Instacart. These additions notably bring three women to the board after OpenAI met criticism about its restructured board composition last year.

The independent review, conducted by law firm WilmerHale, investigated the circumstances that led to Altman’s abrupt removal from the board and his termination as CEO on November 17, 2023. Despite rumors to the contrary, the board did not fire Altman because they got a peek at scary new AI technology and flinched. “WilmerHale… found that the prior Board’s decision did not arise out of concerns regarding product safety or security, the pace of development, OpenAI’s finances, or its statements to investors, customers, or business partners.”

Instead, the review determined that the prior board’s actions stemmed from a breakdown in trust between the board and Altman.

After reportedly interviewing dozens of people and reviewing over 30,000 documents, WilmerHale found that while the prior board acted within its purview, Altman’s conduct did not mandate his removal. “WilmerHale found that the prior Board acted within its broad discretion to terminate Mr. Altman,” OpenAI wrote, “but also found that his conduct did not mandate removal.”

Additionally, the law firm found that the decision to fire Altman was made in undue haste: “The prior Board implemented its decision on an abridged timeframe, without advance notice to key stakeholders and without a full inquiry or an opportunity for Mr. Altman to address the prior Board’s concerns.”

Altman’s surprise firing occurred after he attempted to remove Helen Toner from OpenAI’s board due to disagreements over her criticism of OpenAI’s approach to AI safety and hype. Some board members saw his actions as deceptive and manipulative. After Altman returned to OpenAI, Toner resigned from the OpenAI board on November 29.

In a statement posted on X, Altman wrote, “i learned a lot from this experience. one think [sic] i’ll say now: when i believed a former board member was harming openai through some of their actions, i should have handled that situation with more grace and care. i apologize for this, and i wish i had done it differently.”

A tweet from Sam Altman posted on March 8, 2024.


Following the review’s findings, the Special Committee of the OpenAI Board recommended endorsing the November 21 decision to rehire Altman and Brockman. The board also announced several enhancements to its governance structure, including new corporate governance guidelines, a strengthened Conflict of Interest Policy, a whistleblower hotline, and additional board committees focused on advancing OpenAI’s mission.

After OpenAI’s announcements on Friday, resigned OpenAI board members Toner and Tasha McCauley released a joint statement on X. “Accountability is important in any company, but it is paramount when building a technology as potentially world-changing as AGI,” they wrote. “We hope the new board does its job in governing OpenAI and holding it accountable to the mission. As we told the investigators, deception, manipulation, and resistance to thorough oversight should be unacceptable.”


US gov’t announces arrest of former Google engineer for alleged AI trade secret theft

Don’t trade the secrets dept. —

Linwei Ding faces four counts of trade secret theft, each with a potential 10-year prison term.

A Google sign stands in front of the building on the sidelines of the opening of the new Google Cloud data center in Hanau, Hesse, in October 2023.

On Wednesday, authorities arrested former Google software engineer Linwei Ding in Newark, California, on charges of stealing AI trade secrets from the company. The US Department of Justice alleges that Ding, a Chinese national, committed the theft while secretly working with two China-based companies.

According to the indictment, Ding, who was hired by Google in 2019 and had access to confidential information about the company’s data centers, began uploading hundreds of files into a personal Google Cloud account two years ago.

The trade secrets Ding allegedly copied contained “detailed information about the architecture and functionality of GPU and TPU chips and systems, the software that allows the chips to communicate and execute tasks, and the software that orchestrates thousands of chips into a supercomputer capable of executing at the cutting edge of machine learning and AI technology,” according to the indictment.

Shortly after the alleged theft began, Ding was offered the position of chief technology officer at an early-stage technology company in China that touted its use of AI technology. The company offered him a monthly salary of about $14,800, plus an annual bonus and company stock. Ding reportedly traveled to China, participated in investor meetings, and sought to raise capital for the company.

Investigators reviewed surveillance camera footage showing another employee scanning Ding’s name badge at the entrance of the building where Ding worked at Google, making it appear that Ding was working from his office when he was actually traveling.

Ding also founded and served as the chief executive of a separate China-based startup company that aspired to train “large AI models powered by supercomputing chips,” according to the indictment. Prosecutors say Ding did not disclose either affiliation to Google, which described him as a junior employee. He resigned from Google on December 26 of last year.

The FBI served a search warrant at Ding’s home in January, seizing his electronic devices and later executing an additional warrant for the contents of his personal accounts. Authorities found more than 500 unique files of confidential information that Ding allegedly stole from Google. The indictment says that Ding copied the files into the Apple Notes application on his Google-issued Apple MacBook, then converted the Apple Notes into PDF files and uploaded them to an external account to evade detection.

“We have strict safeguards to prevent the theft of our confidential commercial information and trade secrets,” Google spokesperson José Castañeda told Ars Technica. “After an investigation, we found that this employee stole numerous documents, and we quickly referred the case to law enforcement. We are grateful to the FBI for helping protect our information and will continue cooperating with them closely.”

Attorney General Merrick Garland announced the case against the 38-year-old at an American Bar Association conference in San Francisco. Ding faces four counts of federal trade secret theft, each carrying a potential sentence of up to 10 years in prison.


OpenAI clarifies the meaning of “open” in its name, responding to Musk lawsuit

The OpenAI logo as an opening to a red brick wall.


On Tuesday, OpenAI published a blog post titled “OpenAI and Elon Musk” in response to a lawsuit Musk filed last week. The ChatGPT maker shared several archived emails from Musk that suggest he once supported a pivot away from open source practices in the company’s quest to develop artificial general intelligence (AGI). The selected emails also imply that the “open” in “OpenAI” means that the ultimate result of its research into AGI should be open to everyone but not necessarily “open source” along the way.

In one telling exchange from January 2016 shared by the company, OpenAI Chief Scientist Ilya Sutskever wrote, “As we get closer to building AI, it will make sense to start being less open. The Open in openAI means that everyone should benefit from the fruits of AI after its built, but it’s totally OK to not share the science (even though sharing everything is definitely the right strategy in the short and possibly medium term for recruitment purposes).”

In response, Musk replied simply, “Yup.”


The AI wars heat up with Claude 3, claimed to have “near-human” abilities

The Anthropic Claude 3 logo.


On Monday, Anthropic released Claude 3, a family of three AI language models similar to those that power ChatGPT. Anthropic claims the models set new industry benchmarks across a range of cognitive tasks, even approaching “near-human” capability in some cases. It’s available now through Anthropic’s website, with the most powerful model being subscription-only. It’s also available via API for developers.

Claude 3’s three models represent increasing complexity and parameter count: Claude 3 Haiku, Claude 3 Sonnet, and Claude 3 Opus. Sonnet currently powers the free Claude.ai chatbot, which requires only an email sign-in. But as mentioned above, Opus is only available through Anthropic’s web chat interface if you pay $20 a month for “Claude Pro,” a subscription service offered through the Anthropic website. All three feature a 200,000-token context window. (The context window is the number of tokens—fragments of a word—that an AI language model can process at once.)
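
To make “token” concrete: a tokenizer chops text into word fragments, and the context window caps how many of them the model can attend to at once. A quick illustration using OpenAI’s open source tiktoken library; Anthropic uses its own tokenizer, so Claude’s actual counts will differ.

```python
# Counting tokens with OpenAI's open source tiktoken library. This only
# illustrates the concept; Anthropic's tokenizer produces different counts.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
text = "Claude 3 Opus features a 200,000-token context window."
tokens = enc.encode(text)

print(len(tokens), "tokens")              # a dozen or so for one sentence
print([enc.decode([t]) for t in tokens])  # the individual word fragments
```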

We covered the launch of Claude in March 2023 and Claude 2 in July that same year. Each time, Anthropic fell slightly behind OpenAI’s best models in capability while surpassing them in terms of context window length. With Claude 3, Anthropic has perhaps finally caught up with OpenAI’s released models in terms of performance, although there is no consensus among experts yet—and the presentation of AI benchmarks is notoriously prone to cherry-picking.

A Claude 3 benchmark chart provided by Anthropic.


Claude 3 reportedly demonstrates advanced performance across various cognitive tasks, including reasoning, expert knowledge, mathematics, and language fluency. (Despite the lack of consensus over whether large language models “know” or “reason,” the AI research community commonly uses those terms.) The company claims that the Opus model, the most capable of the three, exhibits “near-human levels of comprehension and fluency on complex tasks.”

That’s quite a heady claim and deserves to be parsed more carefully. It’s probably true that Opus is “near-human” on some specific benchmarks, but that doesn’t mean that Opus is a general intelligence like a human (consider that pocket calculators are superhuman at math). So, it’s a purposely eye-catching claim that can be watered down with qualifications.

According to Anthropic, Claude 3 Opus beats GPT-4 on 10 AI benchmarks, including MMLU (undergraduate-level knowledge), GSM8K (grade school math), HumanEval (coding), and the colorfully named HellaSwag (common knowledge). Several of the wins are very narrow, such as 86.8 percent for Opus vs. 86.4 percent for GPT-4 on a five-shot trial of MMLU, while some gaps are wide, such as 84.9 percent on HumanEval over GPT-4’s 67.0 percent. But what that might mean, exactly, to you as a customer is difficult to say.

“As always, LLM benchmarks should be treated with a little bit of suspicion,” says AI researcher Simon Willison, who spoke with Ars about Claude 3. “How well a model performs on benchmarks doesn’t tell you much about how the model ‘feels’ to use. But this is still a huge deal—no other model has beaten GPT-4 on a range of widely used benchmarks like this.”


AI-generated articles prompt Wikipedia to downgrade CNET’s reliability rating

The hidden costs of AI —

Futurism report highlights the reputational cost of publishing AI-generated content.

The CNET logo on a smartphone screen.

Wikipedia has downgraded tech website CNET’s reliability rating following extensive discussions among its editors regarding the impact of AI-generated content on the site’s trustworthiness, as noted in a detailed report from Futurism. The decision reflects concerns over the reliability of articles found on the tech news outlet after it began publishing AI-generated stories in 2022.

Around November 2022, CNET began publishing articles written by an AI model under the byline “CNET Money Staff.” In January 2023, Futurism brought widespread attention to the issue after discovering that the articles were full of plagiarism and mistakes. (Around that time, we covered plans to do similar automated publishing at BuzzFeed.) After the revelation, CNET management paused the experiment, but the reputational damage had already been done.

Wikipedia maintains a page called “Reliable sources/Perennial sources” that includes a chart featuring news publications and their reliability ratings as viewed from Wikipedia’s perspective. Shortly after the CNET news broke in January 2023, Wikipedia editors began a discussion thread on the Reliable Sources project page about the publication.

“CNET, usually regarded as an ordinary tech RS [reliable source], has started experimentally running AI-generated articles, which are riddled with errors,” wrote a Wikipedia editor named David Gerard. “So far the experiment is not going down well, as it shouldn’t. I haven’t found any yet, but any of these articles that make it into a Wikipedia article need to be removed.”

After other editors agreed in the discussion, they began the process of downgrading CNET’s reliability rating.

As of this writing, Wikipedia’s Perennial Sources list currently features three entries for CNET broken into three time periods: (1) before October 2020, when Wikipedia considered CNET a “generally reliable” source; (2) between October 2020 and October 2022, where Wikipedia notes that the site was acquired by Red Ventures in October 2020, “leading to a deterioration in editorial standards” and saying there is no consensus about reliability; and (3) between November 2022 and present, where Wikipedia currently considers CNET “generally unreliable” after the site began using an AI tool “to rapidly generate articles riddled with factual inaccuracies and affiliate links.”

A screenshot of a chart featuring CNET’s reliability ratings, as found on Wikipedia’s “Perennial Sources” page.

Futurism reports that the issue with CNET’s AI-generated content also sparked a broader debate within the Wikipedia community about the reliability of sources owned by Red Ventures, such as Bankrate and CreditCards.com. Those sites published AI-generated content around the same period of time as CNET. The editors also criticized Red Ventures for not being forthcoming about where and how AI was being implemented, further eroding trust in the company’s publications. This lack of transparency was a key factor in the decision to downgrade CNET’s reliability rating.

In response to the downgrade and the controversies surrounding AI-generated content, CNET issued a statement claiming that the site maintains high editorial standards.

“CNET is the world’s largest provider of unbiased tech-focused news and advice,” a CNET spokesperson said in a statement to Futurism. “We have been trusted for nearly 30 years because of our rigorous editorial and product review standards. It is important to clarify that CNET is not actively using AI to create new content. While we have no specific plans to restart, any future initiatives would follow our public AI policy.”

This article was updated on March 1, 2024, at 9:30 am to reflect fixes in the date ranges for CNET on the Perennial Sources page.


Microsoft partners with OpenAI-rival Mistral for AI models, drawing EU scrutiny

The European Approach —

15M euro investment comes as Microsoft hosts Mistral’s GPT-4 alternatives on Azure.

Velib bicycles are parked in front of the US computer and micro-computing company headquarters of Microsoft on January 25, 2023, in Issy-les-Moulineaux, France.

On Monday, Microsoft announced plans to offer AI models from Mistral through its Azure cloud computing platform, in conjunction with a 15 million euro non-equity investment in the French firm, which is often seen as a European rival to OpenAI. Since then, the investment deal has faced scrutiny from European Union regulators.

Microsoft’s deal with Mistral, known for its large language models akin to OpenAI’s GPT-4 (which powers the subscription versions of ChatGPT), marks a notable expansion of its AI portfolio at a time when its well-known investment in California-based OpenAI has raised regulatory eyebrows. The new deal with Mistral drew particular attention from regulators because Microsoft’s investment could convert into equity (partial ownership of Mistral as a company) during Mistral’s next funding round.

The development has intensified ongoing investigations into Microsoft’s practices, particularly related to the tech giant’s dominance in the cloud computing sector. According to Reuters, EU lawmakers have voiced concerns that Mistral’s recent lobbying for looser AI regulations might have been influenced by its relationship with Microsoft. These apprehensions are compounded by the French government’s denial of prior knowledge of the deal, despite earlier lobbying for more lenient AI laws in Europe. The situation underscores the complex interplay between national interests, corporate influence, and regulatory oversight in the rapidly evolving AI landscape.

Avoiding American influence

The EU’s reaction to the Microsoft-Mistral deal reflects broader tensions over the role of Big Tech companies in shaping the future of AI and their potential to stifle competition. Calls for a thorough investigation into Microsoft and Mistral’s partnership have been echoed across the continent, according to Reuters, with some lawmakers accusing the firms of attempting to undermine European legislative efforts aimed at ensuring a fair and competitive digital market.

The controversy also touches on the broader debate about “European champions” in the tech industry. France, along with Germany and Italy, had advocated for regulatory exemptions to protect European startups. However, the Microsoft-Mistral deal has led some, like MEP Kim van Sparrentak, to question the motives behind these exemptions, suggesting they might have inadvertently favored American Big Tech interests.

“That story seems to have been a front for American-influenced Big Tech lobby,” said Sparrentak, as quoted by Reuters. Sparrentak has been a key architect of the EU’s AI Act, which has not yet been passed. “The Act almost collapsed under the guise of no rules for ‘European champions,’ and now look. European regulators have been played.”

MEP Alexandra Geese also expressed concerns over the concentration of money and power resulting from such partnerships, calling for an investigation. Max von Thun, Europe director at the Open Markets Institute, emphasized the urgency of investigating the partnership, criticizing Mistral’s reported attempts to influence the AI Act.

Also on Monday, amid the partnership news, Mistral announced Mistral Large, a new large language model (LLM) that Mistral says “ranks directly after GPT-4 based on standard benchmarks.” Mistral has previously released several open-weights AI models that have made news for their capabilities, but Mistral Large will be a closed model only available to customers through an API.
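
Because Mistral Large is closed and API-only, access looks like any hosted chat endpoint. A hedged sketch using Mistral’s own Python client as documented around this time; treat the import paths and model alias as assumptions to verify.

```python
# Hedged sketch of calling the hosted Mistral Large endpoint with the
# `mistralai` Python client (v0.x-era names; verify against current docs).
from mistralai.client import MistralClient
from mistralai.models.chat_completion import ChatMessage

client = MistralClient(api_key="YOUR_API_KEY")

response = client.chat(
    model="mistral-large-latest",  # assumed alias for the new model
    messages=[ChatMessage(role="user", content="What is a mixture of experts?")],
)
print(response.choices[0].message.content)
```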


Reddit sells training data to unnamed AI company ahead of IPO

Everything has a price —

If you’ve posted on Reddit, you’re likely feeding the future of AI.


On Friday, Bloomberg reported that Reddit has signed a contract allowing an unnamed AI company to train its models on the site’s content, according to people familiar with the matter. The move comes as the social media platform nears the introduction of its initial public offering (IPO), which could happen as soon as next month.

Reddit initially revealed the deal, which is reported to be worth $60 million a year, earlier in 2024 to potential investors of an anticipated IPO, Bloomberg said. The Bloomberg source speculates that the contract could serve as a model for future agreements with other AI companies.

After an era in which AI companies utilized training data without expressly seeking rightsholders’ permission, some tech firms have more recently begun entering licensing deals for content used to train AI models similar to GPT-4 (which powers the paid version of ChatGPT). In December, for example, OpenAI signed an agreement with German publisher Axel Springer (publisher of Politico and Business Insider) for access to its articles. Previously, OpenAI has struck deals with other organizations, including the Associated Press. Reportedly, OpenAI is also in licensing talks with CNN, Fox, and Time, among others.

In April 2023, Reddit co-founder and CEO Steve Huffman told The New York Times that the company planned to charge AI companies for access to its almost two decades’ worth of human-generated content.

If the reported $60 million/year deal goes through, it’s quite possible that if you’ve ever posted on Reddit, some of that material may be used to train the next generation of AI models that create text, still pictures, and video. Even without the deal, experts have discovered in the past that Reddit has been a key source of training data for large language models and AI image generators.

While we don’t know if OpenAI is the company that signed the deal with Reddit, Bloomberg speculates that Reddit’s ability to tap into AI hype for additional revenue may boost the value of its IPO, which might be worth $5 billion. Despite drama last year, Bloomberg states that Reddit pulled in more than $800 million in revenue in 2023, growing about 20 percent over its 2022 numbers.

Advance Publications, which owns Ars Technica parent Condé Nast, is the largest shareholder of Reddit.
