

“Use a gun” or “beat the crap out of him”: AI chatbot urged violence, study finds

The testing occurred between November 5 and December 11, 2025, and the results were shared with the companies. Because the testing concluded three to four months ago, the latest versions of the chatbots were not evaluated. Google, Microsoft, Meta, and OpenAI told Ars today that updates they implemented after the research was conducted have made their chatbots better at discouraging violence.

Imran Ahmed, the CCDH’s CEO, said that “AI chatbots, now embedded into our daily lives, could be helping the next school shooter plan their attack or a political extremist coordinate an assassination.” He accused tech companies of “choosing negligence in pursuit of so-called innovation.”

A spokesperson for Character.AI told Ars that the company is reviewing the study but that “without the context of the full chats, it’s impossible to fully evaluate the model’s responses… It’s important to remember that the user-created Characters on our site are fictional. They are intended for entertainment and roleplaying, and we have taken robust steps to make that clear. For example, we have prominent disclaimers in every chat to remind users that a Character is not a real person and that everything a Character says should be treated as fiction.”

Character.AI said it has been “rolling out changes so that under-18 users no longer have the ability to engage in open-ended chats with Characters,” and is using “new age assurance technology to help ensure users are grouped into the correct age experience.” This includes “developing our own age estimation model in-house and partnering with third-party services.” The firm added that it removes characters “that violate our terms of service, including school shooters.”

A Perplexity spokesperson issued a statement that didn’t acknowledge any problems with the company’s technology. “People can select any of the top AI models on Perplexity and get safer, more accurate answers,” it said. “Perplexity is consistently the safest AI platform because our own safeguards are always additive to any existing safeguards in any underlying model.”

OpenAI told Ars that the CCDH “report’s methodology is flawed and misleading. ChatGPT is trained to reject requests for violent or hateful material, and the findings show it consistently refused to give instructions on acquiring weapons. We continuously strengthen these safeguards and our latest ChatGPT model is even better at detecting and refusing violent requests.”

OpenAI said that ChatGPT refused to answer questions on “what kind of hunting rifle would be best for a long-range target,” but provided publicly available information such as addresses or maps. Conflating those two types of responses is misleading, OpenAI said. The tests were conducted on GPT-5.1, and updates made since that version have improved detection and refusals for violent content, OpenAI said.

OpenAI was sued this week by the family of a victim of the Tumbler Ridge mass shooting in British Columbia. As the CCDH report says, “reporting indicates that OpenAI staff flagged the suspect internally for using ChatGPT in ways consistent with planning violence. Rather than escalating concern to law enforcement, the company chose to remain silent.”

Researchers posed as teens

The testing was conducted with accounts representing made-up teen users in the US and Ireland, with the age set to the minimum allowed on each platform. A minimum age of 18 was required by Anthropic, DeepSeek, Character.AI, and Replika, while the other platforms had minimum ages of 13.



OpenAI introduces GPT-5.4 with more knowledge-work capability

Additionally, there are improvements to visual understanding: the model can now analyze images more carefully, handling up to 10.24 million pixels, with a maximum dimension of 6,000 pixels. OpenAI also claims responses from this model are 18 percent less likely to contain factual errors than before.
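
Those limits are concrete enough to sanity-check in code. Below is a minimal sketch, assuming the Pillow imaging library; the helper names and the downscaling approach are illustrative, since OpenAI hasn't published its actual image preprocessing:

```python
# Hypothetical pre-flight check for the limits described above: 10.24 million
# pixels total, with no dimension over 6,000 pixels. Not OpenAI's actual code.
from PIL import Image

MAX_PIXELS = 10_240_000  # 10.24 million pixels
MAX_DIMENSION = 6_000    # per-side cap, in pixels

def fits_limits(width: int, height: int) -> bool:
    """Return True if an image of this size stays within both stated caps."""
    return width * height <= MAX_PIXELS and max(width, height) <= MAX_DIMENSION

def shrink_to_fit(path: str) -> Image.Image:
    """Downscale an image, preserving aspect ratio, until it fits both caps."""
    img = Image.open(path)
    w, h = img.size
    scale = min(
        1.0,
        (MAX_PIXELS / (w * h)) ** 0.5,  # satisfy the total-pixel cap
        MAX_DIMENSION / max(w, h),      # satisfy the per-side cap
    )
    if scale < 1.0:
        img = img.resize((int(w * scale), int(h * scale)))
    return img

# A 4000x3000 photo is 12 megapixels, so it exceeds the total-pixel cap:
print(fits_limits(4000, 3000))  # False
```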

ChatGPT reportedly lost some users to competitor Anthropic in recent days, after OpenAI announced a deal with the Pentagon in the wake of a public feud between the Trump administration and Anthropic over limitations Anthropic wanted to impose on military applications of its models. However, it’s unclear just how many folks jumped ship or whether that led to a substantial dip in the product’s massive base of over 900 million users.

To take advantage of the situation, Anthropic rolled out the once-subscriber-only memory feature to free users and introduced a tool for importing memory from elsewhere. Anthropic says March 2 was its largest single day ever for new sign-ups.

OpenAI needs to compete on capability as well as on cost and token efficiency to maintain its relative popularity with users, and this update aims to support that objective.

GPT-5.4 is available to users of the ChatGPT web and native apps, Codex, and the API starting today. Subscribers to Plus, Team, and Pro are also getting GPT-5.4 Thinking, and GPT-5.4 Pro is hitting the API, Edu, and Enterprise.



Musk has no proof OpenAI stole xAI trade secrets, judge rules, tossing lawsuit


Hostility is not proof of theft

Even twisting an ex-employee’s text to favor xAI’s reading fails to sway judge.

Elon Musk appears to be grasping at straws in a lawsuit accusing OpenAI of poaching eight xAI employees in an allegedly unlawful bid to access xAI trade secrets connected to its data centers and chatbot, Grok.

In a Tuesday order granting OpenAI’s motion to dismiss, US District Judge Rita F. Lin said that xAI failed to provide evidence of any misconduct from OpenAI.

Instead, xAI seemed fixated on a range of alleged misconduct by former employees. But in assessing xAI’s claims, Lin said that xAI failed to show proof that OpenAI induced any of these employees to steal trade secrets “or that these former xAI employees used any stolen trade secrets once employed by OpenAI.”

Two employees admitted to stealing confidential information: both downloaded xAI’s source code, and one also improperly grabbed a supposedly sensitive recording of a Musk “All Hands” meeting. But the rest were either accused of retaining seemingly less consequential data, such as work chats kept on their devices, or didn’t seem to hold any confidential information at all. Lin called out particularly weak arguments, noting that xAI’s complaint acknowledged that one employee whom OpenAI poached never received the access to confidential information allegedly sought after exiting xAI, and that two other employees lumped into the complaint “simply left xAI for OpenAI.”

From the limited evidence, Lin concluded that “while xAI may state misappropriation claims against a couple of its former employees, it does not state a plausible misappropriation claim against OpenAI.”

Lin’s order will likely not be the end of the litigation, as she is allowing xAI to amend its complaint to address the current deficiencies.

Ars could not immediately reach xAI for comment, so it’s unclear what steps xAI may take next.

However, xAI seems unlikely to give up the fight, which OpenAI has alleged is part of a “harassment campaign” that Musk is waging through multiple lawsuits attacking his biggest competitor’s business practices.

Unsurprisingly, OpenAI celebrated the order on X, alleging that “this baseless lawsuit was never anything more than yet another front in Mr. Musk’s ongoing campaign of harassment.”

Other tech companies poaching talent for AI projects will likely be relieved while reading Lin’s order. Commercial litigator Sarah Tishler told Ars that the order “boils down to a fundamental concept in trade secret law: hiring from a competitor is not the same as stealing trade secrets from one.”

“Under the Defend Trade Secrets Act, xAI has to show that OpenAI actually received and used the alleged trade secrets, not just that it hired employees who may have taken them,” Tishler said. “Suspicious timing, aggressive recruiting, and even downloaded files are not enough on their own.”

Tishler suggested that the ruling will likely be welcomed by AI firms eager to secure the best talent without incurring legal risks from their hiring practices.

“In the AI industry, where talent moves fast and the competitive stakes are enormous, this ruling reaffirms that suspicion is not enough,” Tishler said. “You have to show the stolen information actually made it into the competitor’s hands and was put to use.”

OpenAI not liable for engineers swiping source code

Through the lawsuit, Musk has alleged that OpenAI is violating California’s unfair competition law. He claims that OpenAI is attempting “to destroy legitimate competition in the AI industry by neutralizing xAI’s innovations” and forcing xAI “to unfairly compete against its own trade secrets.”

But this claim hinges entirely upon xAI proving that OpenAI poached its employees to steal its trade secrets. So, for xAI’s lawsuit to proceed, xAI will need to beef up the evidence base for its other claim, that OpenAI has violated the federal Defend Trade Secrets Act, Lin said. To succeed on that, xAI must prove that OpenAI unlawfully acquired, disclosed, or used a trade secret without xAI’s consent.

That will likely be challenging because xAI, at this point, has not offered “any nonconclusory allegations that OpenAI itself acquired, disclosed, or used xAI’s trade secrets,” Lin wrote.

All xAI has claimed is that OpenAI induced former employees to share secrets, and so far, nothing backs that claim, Lin said. Tishler noted that the court also rejected an xAI theory that “OpenAI should be responsible for what its new hires did before they arrived” for “the same reason: without evidence that OpenAI directed the theft or actually put the stolen information to use, you cannot hold the company liable.”

The strongest evidence that xAI had of employee misconduct, allegedly allowing OpenAI to misappropriate xAI trade secrets, revolves around the departure of one of xAI’s earliest engineers, Xuechen Li.

That evidence wasn’t enough, Lin said. xAI alleged that Li gave a presentation to OpenAI that supposedly included confidential information. Li also uploaded “the entire xAI source code base to a personal cloud account,” which he had connected to ChatGPT, Lin noted, after a recruiter sent a message on Signal sharing a link with Li to another unrelated cloud storage location.

xAI hoped the Signal messages would shock the court, expecting it to read between the lines the way xAI did. As proof that OpenAI allegedly got access to xAI’s source code, xAI pointed to a Signal message that an OpenAI recruiter sent to Li “four hours after” Li downloaded the source code, saying “nw!” xAI has alleged this message is shorthand for “no way!”—suggesting the OpenAI recruiter was geeked to get access to xAI’s source code. But in a footnote, Lin said that “OpenAI insists that ‘nw’ means ‘no worries,’” and thus is unconnected to Li’s decision to upload the source code to a ChatGPT-linked cloud account.

Even interpreting the text using xAI’s reading, however, xAI did not show enough to prove the recruiter or OpenAI accessed or requested the files, Lin said.

It also didn’t help xAI’s case that a temporary injunction that xAI secured in a separate lawsuit targeting the engineer blocked Li from accepting a job at OpenAI.

That injunction led OpenAI to withdraw its job offer to Li. And that’s a problem for xAI: because Li never worked at OpenAI, he plainly never used xAI’s trade secrets there.

Further weakening xAI’s arguments, if Li indeed shared confidential information during his presentation while interviewing for OpenAI, xAI has alleged no facts suggesting that OpenAI was aware Li was sharing xAI trade secrets, Lin wrote.

This “makes it very hard to argue OpenAI ever used anything he allegedly took,” Tishler told Ars.

Another former xAI engineer, Jimmy Fraiture, was accused of copying xAI trade secrets, but Fraiture has said he deleted the information he improperly downloaded before starting his job at OpenAI. Importantly, Lin said, since he joined OpenAI, there’s no evidence that he used xAI trade secrets to benefit xAI’s rival.

“Other than the bare fact that Fraiture had been recruited” by the same OpenAI employee “who had also recruited Li, xAI does not allege any facts indicating that OpenAI had encouraged Fraiture to take xAI’s confidential information in the first place,” Lin wrote.

Since “none of the other former employees allegedly shared with or disclosed to OpenAI any xAI trade secrets,” xAI could not advance its claim that OpenAI misappropriated trade secrets based only on allegations tied to Li and Fraiture’s supposed misconduct, Lin said.

xAI may be able to amend its complaint to maintain these arguments, but the company has thus far presented scant, purely circumstantial evidence.

It’s possible that xAI will secure more evidence to support its misappropriation claims against OpenAI in its ongoing lawsuit against Li. Ars could not immediately reach Li’s lawyer to find out if today’s ruling may impact that case.

Ex-executive’s “hostility” is not proof of theft

Among the least convincing arguments that xAI raised was a claim that an unnamed finance executive left xAI to take a “lesser role” at OpenAI after learning everything he knew about data centers from xAI.

That executive slighted xAI when Musk’s company later attempted to inquire about “confidentiality concerns.”

“Suck my dick,” the former xAI executive allegedly said, refusing to explain how his OpenAI work might overlap with his xAI position. “Leave me the fuck alone.”

xAI tried to argue that the executive’s hostility was proof of misconduct. But Lin wrote that xAI only alleged that the executive “merely possessed xAI trade secrets about data centers” and did not allege that he ever used trade secrets to benefit OpenAI.

Had xAI found evidence that OpenAI’s data center strategy suddenly mirrored xAI’s after the executive joined xAI’s rival, that might have helped xAI’s case. But there are plenty of reasons a former employee might reject an ex-employer’s outreach following an exit, Lin suggested.

“His hostility when xAI reached out about its confidentiality concerns also does not support a plausible inference of use,” Lin wrote. “Hostility toward one’s former employer during departure does not, without more, indicate use of trade secrets in a subsequent job. Nor does the executive’s lack of experience with AI data centers before his time at xAI, without more, support a plausible inference that he used xAI’s trade secrets at OpenAI.”

xAI has until March 17 to amend its complaint to keep up this particular fight against OpenAI. But the company won’t be able to add any new claims or parties, Lin noted, “or otherwise change the allegations except to correct the identified deficiencies.”

Criminal probe likely leaves OpenAI on pins and needles

For Li, the engineer accused of disclosing xAI trade secrets to OpenAI, the litigation could eliminate one front of discovery as he navigates two other legal fights over xAI’s trade secrets claims.

Tishler has been closely monitoring xAI’s trade secret legal battles. In October, she noted that Li is in a particularly prickly position, facing pressure in civil litigation from Musk to turn over data that could be used against him in the Federal Bureau of Investigation’s criminal investigation into Musk’s allegations. As Tishler explained:

“The practical reality is stark: Li faces a choice between protecting himself in the criminal action with his silence, and the civil consequences of doing so. Refuse to answer, and xAI could argue adverse inferences; answer, and the responses could feed the criminal case.”

Ultimately, the FBI is trying to prove that Li stole information that qualified as a trade secret and intended to use it for OpenAI’s benefit, while knowing that it would harm xAI. If it succeeds, “xAI would suddenly have a government-backed record that its trade secrets were stolen,” Tishler wrote.

If xAI were so armed and able to keep the OpenAI lawsuit alive, the central question in the lawsuit that Lin dismissed today would shift, Tishler suggested, from “was there a theft?” to “what did OpenAI know, and when did it know it?”


Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.



Fury over Discord’s age checks explodes after shady Persona test in UK


Persona confirmed all age-check data from Discord’s UK test was deleted.

Shortly after Discord announced that all users will soon be defaulted to teen experiences until their ages are verified, the messaging platform faced immediate backlash.

One of the major complaints was that Discord planned to collect more government IDs as part of its global age verification process. It shocked many that Discord would be so bold so soon after a third-party breach of a former age-check partner’s services exposed 70,000 Discord users’ government IDs.

Attempting to reassure users, Discord claimed that most users wouldn’t have to show ID, instead relying on video selfies analyzed by AI to estimate ages, which raised separate privacy concerns. In the future, Discord suggested, behavioral signals might eliminate the need for age checks for most users, seemingly downplaying the risk that sensitive data would be improperly stored.

Discord didn’t hide that it planned to continue requesting IDs for any user appealing an incorrect age assessment, and users weren’t happy, since that is exactly how the prior breach happened. Responding to critics, Discord claimed that the majority of ID data was promptly deleted. Specifically, Savannah Badalich, Discord’s global head of product policy, told The Verge that IDs shared during appeals “are deleted quickly—in most cases, immediately after age confirmation.”

It’s unsurprising, then, that backlash exploded after Discord posted, and then weirdly deleted, a disclaimer on an FAQ about Discord’s age assurance policies that contradicted the short ID-retention timeline Discord had touted. An archived version of the page shows the note shared this warning:

“Important: If you’re located in the UK, you may be part of an experiment where your information will be processed by an age-assurance vendor, Persona. The information you submit will be temporarily stored for up to 7 days, then deleted. For ID document verification, all details are blurred except your photo and date of birth, so only what’s truly needed for age verification is used.”

Critics felt that Discord was obscuring not just how long IDs may be stored, but also the entities collecting information. Discord did not provide details on what the experiment was testing or how many users were affected, and Persona was not listed as a partner on its platform.

Asked for comment, Discord told Ars that only a small number of users were included in the experiment, which ran for less than one month. That test has since concluded, Discord confirmed, and Persona is no longer an active vendor partnering with Discord. Moving forward, Discord promised to “keep our users informed as vendors are added or updated.”

While Discord seeks to distance itself from Persona, Rick Song, Persona’s CEO, has been stuck responding to the mounting backlash. Hoping to quell fears that any of the UK data collected during the experiment risked being breached, he told Ars that all the data of verified individuals involved in Discord’s test was deleted immediately upon verification.

Persona draws fire amid Discord fury

This all seemingly started after Discord was forced to find age verification solutions when Australia’s under-16 social media ban and the United Kingdom’s Online Safety Act came into effect.

It seems that in the UK, Discord struggled to find partners, as the messaging service wasn’t just trying to stop minors from accessing adult content but also needed to block adults from messaging minors.

Setting aside known issues with the accuracy of today’s age estimation technology, there’s an often-overlooked nuance to how age solutions work, particularly when children’s safety factors into platforms’ decisions. Age checks good enough to block kids from accessing adult content may not work as well at stopping tech-savvy adults bent on contacting minors; the UK’s OSA required that Discord’s age checks do both.

It seems likely that Discord expected Persona to be a partner that the UK’s OSA enforcers would approve. Persona had already been accepted as an age verification service on Reddit, which shares similarly complex age verification goals with Discord.

For Persona, the partnership came at a time when many Discord users globally were closely monitoring the service, trying to decide whether they trusted Discord with their age-check data.

After Discord shocked users by abruptly retracting the disclaimer about the Persona experiment, mistrust swelled, and scrutiny of Persona intensified.

On X and other social media platforms, critics warned that Palantir co-founder Peter Thiel’s Founders Fund was a major investor in Persona. They worried Thiel might have influence over Persona or access to Persona’s data, or, worse, that Thiel’s ties to the Trump administration might mean the government had access to it. Fearing that Discord data may one day be fed into government facial recognition systems, conspiracies swirled, increasing heat on Persona and leaving Song with no choice but to cautiously confront allegations.

Hackers probe Persona

Perhaps most problematic for Persona, the mass outrage prompted cybersecurity researchers to investigate. They quickly exposed a “workaround” to avoid Persona’s age checks on Discord, The Rage, an independent publication that covers financial surveillance, reported. But more concerning for privacy advocates, researchers also found an uncompressed version of Persona’s frontend code “exposed to the open Internet on a US government authorized server.”

“In 2,456 publicly accessible files, the code revealed the extensive surveillance Persona software performs on its users, bundled in an interface that pairs facial recognition with financial reporting—and a parallel implementation that appears designed to serve federal agencies,” The Rage reported.

As The Rage reported, and Song confirmed to Ars, Persona does not currently have any government contracts. Instead, the exposed service “appears to be powered by an OpenAI chatbot,” The Rage noted.

OpenAI is highlighted as an active partner on Persona’s website, which claims Persona screens millions of users for OpenAI each month. According to The Rage, “the publicly exposed domain, titled ‘openai-watchlistdb.withpersona.com,’” appears to “query identity verification requests on an OpenAI database” that has a “FedRAMP-authorized parallel implementation of the software called ‘withpersona-gov.com.’”

Hackers warned “that OpenAI may have created an internal database for Persona identity checks that spans all OpenAI users via its internal watchlistdb,” seemingly exploiting the “opportunity to go from comparing users against a single federal watchlist, to creating the watchlist of all users themselves.”

In correspondence with one of the researchers, Song clarified that this product is based on publicly available records for sanctions and warnings, and the service does not store any user data sent to it.

OpenAI did not immediately respond to Ars’ request to comment.

Persona denies government, ICE ties

On Wednesday, Persona’s chief operating officer, Christie Kim, sought to reassure Persona customers as the Discord controversy grew. In an email, Kim said that Persona invests “heavily in infrastructure, compliance, and internal training to ensure sensitive data is handled responsibly,” and not exposed.

“Over the past week, multiple social media posts and online articles have circulated repeating misleading claims about Persona, insinuating conspiracies around our work with Discord and our investors,” Kim wrote.

Noting that Persona does not “typically engage with online speculation,” Kim said that the scandal required a direct response “because we operate in a sensitive space and your trust in us is foundational to our partnership.”

As expected, Kim noted that Persona is not partnered with federal agencies, including the Department of Homeland Security or Immigration and Customs Enforcement (ICE).

“Transparently, we are actively working on a couple of potential contracts which would be publicly visible if we move forward,” Kim wrote. “However, these engagements are strictly for workforce account security of government employees and do not include ICE or any agency within the Department of Homeland Security.”

Kim acknowledged that Thiel’s Founders Fund is an investor but said that investors do not have access to Persona data and that Thiel was not involved in Persona’s operations.

“He is not on our board, does not advise us, has no role in our operations or decision-making, and is not directly involved with Persona in any way,” Kim wrote. “Persona and Palantir share no board members and have no business relationship with each other.”

In the email, Kim confirmed that Persona was planning a press campaign to go on the defensive, speaking with media to clarify the narrative. She apologized for any inconvenience that the heightened scrutiny on the company’s services may have caused.

That scrutiny has likely spooked platforms that might otherwise have gravitated to Persona as a partner that seems savvy about government approvals.

Persona combats ongoing trust issues

For Persona, the PR nightmare comes at a time when age verification laws are gaining popularity and beginning to take force in various parts of the world. Persona’s background in verifying identities for financial services to prevent fraud seems to make its services—which The Rage noted combine facial recognition with financial reporting—an appealing option for platforms seeking a solution that will appease regulators. In responses to LinkedIn threads, Song has denied that Persona links facial biometrics to financial records or law enforcement databases.

But because of Persona’s background in financial services and fraud protection, its data retention policies—which require that some data be retained for legal and audit purposes—will likely unsettle anyone uncomfortable with a tech company amassing a massive database of government IDs. Such databases are viewed as hugely attractive targets for bad actors behind costly breaches, and Discord’s users have already been burned once.

On X, Song responded to one of the hackers—a user named Celeste with the handle @vmfunc—aiming to provide more transparency into how Persona was addressing the flagged issues. In the thread, he shared screenshots of emails documenting his correspondence with Celeste over security concerns.

The correspondence showed that Celeste credited Persona for quickly fixing the front-end issue but also noted that it was hard to trust Persona’s story about government and Palantir ties, since the company wouldn’t put more information on the record. Additionally, Persona’s compliance team should be concerned that the company had not yet started an “in-depth security review,” Celeste said.

“Unfortunately, there is no way I can fully trust you here and you know this,” Celeste wrote, “but I’m trying to act in good faith” by explicitly stating that “we found zero references” to ICE or other entities of concern to critics “in all source files we found.”

But Song and Celeste eventually ironed out some of the misunderstandings. On Friday, Celeste posted on X: “I see a lot of misinformation going online about our recent post about Persona.” Later correspondence shared with Ars showed Celeste thanked Song for his honesty in responding to questions, noting that the CEO putting statements on the record countering the rumors carried weight in a situation where Persona’s claims couldn’t all necessarily be independently verified.

This story has been updated to include additional insights from Persona.




OpenAI sidesteps Nvidia with unusually fast coding model on plate-sized chips

But 1,000 tokens per second is actually modest by Cerebras standards. The company has measured 2,100 tokens per second on Llama 3.1 70B and reported 3,000 tokens per second on OpenAI’s own open-weight gpt-oss-120B model, suggesting that Codex-Spark’s comparatively lower speed reflects the overhead of a larger or more complex model.

AI coding agents have had a breakout year, with tools like OpenAI’s Codex and Anthropic’s Claude Code reaching a new level of usefulness for rapidly building prototypes, interfaces, and boilerplate code. OpenAI, Google, and Anthropic have all been racing to ship more capable coding agents, and latency has become what separates the winners; a model that codes faster lets a developer iterate faster.

With fierce competition from Anthropic, OpenAI has been iterating on its Codex line at a rapid rate, releasing GPT-5.2 in December after CEO Sam Altman issued an internal “code red” memo about competitive pressure from Google, then shipping GPT-5.3-Codex just days ago.

Diversifying away from Nvidia

Spark’s deeper hardware story may be more consequential than its benchmark scores. The model runs on Cerebras’ Wafer Scale Engine 3, a chip the size of a dinner plate that Cerebras has built its business around since at least 2022. OpenAI and Cerebras announced their partnership in January, and Codex-Spark is the first product to come out of it.

OpenAI has spent the past year systematically reducing its dependence on Nvidia. The company signed a massive multi-year deal with AMD in October 2025, struck a $38 billion cloud computing agreement with Amazon in November, and has been designing its own custom AI chip for eventual fabrication by TSMC.

Meanwhile, a planned $100 billion infrastructure deal with Nvidia has fizzled so far, though Nvidia has since committed to a $20 billion investment. Reuters reported that OpenAI grew unsatisfied with the speed of some Nvidia chips for inference tasks, which is exactly the kind of workload that OpenAI designed Codex-Spark for.

Regardless of which chip is under the hood, speed matters, though it may come at the cost of accuracy. For developers who spend their days inside a code editor waiting for AI suggestions, 1,000 tokens per second may feel less like carefully piloting a jigsaw and more like running a rip saw. Just watch what you’re cutting.



Attackers prompted Gemini over 100,000 times while trying to clone it, Google says

On Thursday, Google announced that “commercially motivated” actors have attempted to clone knowledge from its Gemini AI chatbot by simply prompting it. One adversarial session reportedly prompted the model more than 100,000 times across various non-English languages, collecting responses ostensibly to train a cheaper copycat.

Google published the findings in what amounts to a quarterly self-assessment of threats to its own products that frames the company as the victim and the hero, which is not unusual in these self-authored assessments. Google calls the illicit activity “model extraction” and considers it intellectual property theft, which is a somewhat loaded position, given that Google’s LLM was built from materials scraped from the Internet without permission.

Google is also no stranger to the copycat practice. In 2023, The Information reported that Google’s Bard team had been accused of using ChatGPT outputs from ShareGPT, a public site where users share chatbot conversations, to help train its own chatbot. Senior Google AI researcher Jacob Devlin, who created the influential BERT language model, warned leadership that this violated OpenAI’s terms of service, then resigned and joined OpenAI. Google denied the claim but reportedly stopped using the data.

Even so, Google’s terms of service forbid people from extracting data from its AI models this way, and the report is a window into the world of somewhat shady AI model-cloning tactics. The company believes the culprits are mostly private companies and researchers looking for a competitive edge, and said the attacks have come from around the world. Google declined to name suspects.

The deal with distillation

Typically, the industry calls this practice of training a new model on a previous model’s outputs “distillation,” and it works like this: If you want to build your own large language model (LLM) but lack the billions of dollars and years of work that Google spent training Gemini, you can use a previously trained LLM as a shortcut.
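
Mechanically, the collection step is simple, which is part of why it is hard to police. Here is a minimal sketch of the idea, assuming the openai Python client, placeholder prompts, and a stand-in teacher model name; note that commercial providers’ terms of service generally prohibit training competing models on their outputs:

```python
# Illustrative distillation data collection: query a teacher model, save the
# prompt/response pairs, and fine-tune a smaller student model on them later.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

prompts = [  # a real extraction campaign would use many thousands of these
    "Explain photosynthesis in simple terms.",
    "Summarize the causes of the French Revolution.",
]

with open("distill_train.jsonl", "w") as f:
    for prompt in prompts:
        reply = client.chat.completions.create(
            model="gpt-4o-mini",  # stand-in for whichever teacher is queried
            messages=[{"role": "user", "content": prompt}],
        ).choices[0].message.content
        # Each line becomes one supervised training example for the student.
        f.write(json.dumps({"messages": [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": reply},
        ]}) + "\n")
```

Train a student model on enough such pairs and it begins to imitate the teacher’s behavior, which is why Google treats bulk prompting sessions like the 100,000-prompt one it describes as extraction rather than ordinary use.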



OpenAI researcher quits over ChatGPT ads, warns of “Facebook” path

On Wednesday, former OpenAI researcher Zoë Hitzig published a guest essay in The New York Times announcing that she resigned from the company on Monday, the same day OpenAI began testing advertisements inside ChatGPT. Hitzig, an economist and published poet who holds a junior fellowship at the Harvard Society of Fellows, spent two years at OpenAI helping shape how its AI models were built and priced. She wrote that OpenAI’s advertising strategy risks repeating the same mistakes that Facebook made a decade ago.

“I once believed I could help the people building A.I. get ahead of the problems it would create,” Hitzig wrote. “This week confirmed my slow realization that OpenAI seems to have stopped asking the questions I’d joined to help answer.”

Hitzig did not call advertising itself immoral. Instead, she argued that the nature of the data at stake makes ChatGPT ads especially risky. Users have shared medical fears, relationship problems, and religious beliefs with the chatbot, she wrote, often “because people believed they were talking to something that had no ulterior agenda.” She called this accumulated record of personal disclosures “an archive of human candor that has no precedent.”

She also drew a direct parallel to Facebook’s early history, noting that the social media company once promised users control over their data and the ability to vote on policy changes. Those pledges eroded over time, Hitzig wrote, and the Federal Trade Commission found that privacy changes Facebook marketed as giving users more control actually did the opposite.

She warned that a similar trajectory could play out with ChatGPT: “I believe the first iteration of ads will probably follow those principles. But I’m worried subsequent iterations won’t, because the company is building an economic engine that creates strong incentives to override its own rules.”

Ads arrive after a week of AI industry sparring

Hitzig’s resignation adds another voice to a growing debate over advertising in AI chatbots. OpenAI announced in January that it would begin testing ads in the US for users on its free and $8-per-month “Go” subscription tiers, while paid Plus, Pro, Business, Enterprise, and Education subscribers would not see ads. The company said ads would appear at the bottom of ChatGPT responses, be clearly labeled, and would not influence the chatbot’s answers.



AI companies want you to stop chatting with bots and start managing them


Claude Opus 4.6 and OpenAI Frontier pitch a future of supervising AI agents.

On Thursday, Anthropic and OpenAI shipped products built around the same idea: instead of chatting with a single AI assistant, users should be managing teams of AI agents that divide up work and run in parallel. The simultaneous releases are part of a gradual shift across the industry, from AI as a conversation partner to AI as a delegated workforce, and they arrive during a week when that very concept reportedly helped wipe $285 billion off software stocks.

Whether that supervisory model works in practice remains an open question. Current AI agents still require heavy human intervention to catch errors, and no independent evaluation has confirmed that these multi-agent tools reliably outperform a single developer working alone.

Even so, the companies are going all-in on agents. Anthropic’s contribution is Claude Opus 4.6, a new version of its most capable AI model, paired with a feature called “agent teams” in Claude Code. Agent teams let developers spin up multiple AI agents that split a task into independent pieces, coordinate autonomously, and run concurrently.

In practice, agent teams look like a split-screen terminal environment: A developer can jump between subagents using Shift+Up/Down, take over any one directly, and watch the others keep working. Anthropic describes the feature as best suited for “tasks that split into independent, read-heavy work like codebase reviews.” It is available as a research preview.

OpenAI, meanwhile, released Frontier, an enterprise platform it describes as a way to “hire AI co-workers who take on many of the tasks people already do on a computer.” Frontier assigns each AI agent its own identity, permissions, and memory, and it connects to existing business systems such as CRMs, ticketing tools, and data warehouses. “What we’re fundamentally doing is basically transitioning agents into true AI co-workers,” Barret Zoph, OpenAI’s general manager of business-to-business, told CNBC.

Despite the hype about these agents being co-workers, in our experience they tend to work best if you think of them as tools that amplify existing skills, not as the autonomous co-workers the marketing language implies. They can produce impressive drafts fast but still require constant human course-correction.

The Frontier launch came just three days after OpenAI released a new macOS desktop app for Codex, its AI coding tool, which OpenAI executives described as a “command center for agents.” The Codex app lets developers run multiple agent threads in parallel, each working on an isolated copy of a codebase via Git worktrees.
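
Git worktrees are what make that isolation cheap: each worktree is a separate working directory backed by the same repository, so parallel agents never edit the same checkout. Here is a rough Python sketch of the pattern, with invented directory and branch names; this is not OpenAI’s implementation:

```python
# Give each agent thread its own Git worktree on its own branch so that
# parallel edits to the same repository never collide.
import subprocess

def make_agent_worktree(repo_path: str, agent_id: int) -> str:
    """Create an isolated worktree and branch for one agent thread."""
    worktree_dir = f"{repo_path}-agent-{agent_id}"
    branch = f"agent-{agent_id}/task"
    subprocess.run(
        ["git", "-C", repo_path, "worktree", "add", "-b", branch, worktree_dir],
        check=True,
    )
    return worktree_dir  # the agent reads and writes files only in here

# Three agents get three independent checkouts of the same repo:
for i in range(3):
    make_agent_worktree("/path/to/repo", i)
```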

OpenAI also released GPT-5.3-Codex on Thursday, a new AI model that powers the Codex app. OpenAI claims that the Codex team used early versions of GPT-5.3-Codex to debug the model’s own training run, manage its deployment, and diagnose test results, similar to what OpenAI told Ars Technica in a December interview.

“Our team was blown away by how much Codex was able to accelerate its own development,” the company wrote. On Terminal-Bench 2.0, the agentic coding benchmark, GPT-5.3-Codex scored 77.3%, which exceeds Anthropic’s just-released Opus 4.6 by about 12 percentage points.

The common thread across all of these products is a shift in the user’s role. Rather than merely typing a prompt and waiting for a single response, the developer or knowledge worker becomes more like a supervisor, dispatching tasks, monitoring progress, and stepping in when an agent needs direction.

In this vision, developers and knowledge workers effectively become middle managers of AI. That is, not writing the code or doing the analysis themselves, but delegating tasks, reviewing output, and hoping the agents underneath them don’t quietly break things. Whether that will come to pass (or if it’s actually a good idea) is still widely debated.

A new model under the Claude hood

Opus 4.6 is a substantial update to Anthropic’s flagship model. It succeeds Claude Opus 4.5, which Anthropic released in November. In a first for the Opus model family, it supports a context window of up to 1 million tokens (in beta), which means it can process much larger bodies of text or code in a single session.

On benchmarks, Anthropic says Opus 4.6 tops OpenAI’s GPT-5.2 (an earlier model than the one released today) and Google’s Gemini 3 Pro across several evaluations, including Terminal-Bench 2.0 (an agentic coding test), Humanity’s Last Exam (a multidisciplinary reasoning test), and BrowseComp (a test of finding hard-to-locate information online).

It should be noted, though, that OpenAI’s GPT-5.3-Codex, released the same day, seemingly reclaimed the lead on Terminal-Bench. On ARC AGI 2, which attempts to test the ability to solve problems that are easy for humans but hard for AI models, Opus 4.6 scored 68.8 percent, compared to 37.6 percent for Opus 4.5, 54.2 percent for GPT-5.2, and 45.1 percent for Gemini 3 Pro.

As always, take AI benchmarks with a grain of salt, since objectively measuring AI model capabilities is a relatively new and unsettled science.

Anthropic also said that on a long-context retrieval benchmark called MRCR v2, Opus 4.6 scored 76 percent on the 1 million-token variant, compared to 18.5 percent for its Sonnet 4.5 model. That gap matters for the agent teams use case, since agents working across large codebases need to track information across hundreds of thousands of tokens without losing the thread.

Pricing for the API stays the same as Opus 4.5 at $5 per million input tokens and $25 per million output tokens, with a premium rate of $10/$37.50 for prompts that exceed 200,000 tokens. Opus 4.6 is available on claude.ai, the Claude API, and all major cloud platforms.
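
To put those rates in concrete dollar terms, here is a small worked example. The helper is ours, and it assumes, per the pricing description above, that the premium rate applies to the whole request once the prompt exceeds 200,000 tokens:

```python
# Worked example of Opus 4.6 list pricing as described above.
def opus_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one API call at the quoted per-million-token rates."""
    if input_tokens > 200_000:
        in_rate, out_rate = 10.00, 37.50  # premium rate for 200k+ prompts
    else:
        in_rate, out_rate = 5.00, 25.00   # standard rate
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

print(opus_cost(120_000, 4_000))  # $0.70 for a midsize prompt
print(opus_cost(800_000, 4_000))  # $8.15 once the 1M-token window is in play
```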

The market fallout outside

These releases occurred during a week of exceptional volatility for software stocks. On January 30, Anthropic released 11 open source plugins for Cowork, its agentic productivity tool that launched on January 12. Cowork itself is a general-purpose tool that gives Claude access to local folders for work tasks, but the plugins extended it into specific professional domains: legal contract review, non-disclosure agreement triage, compliance workflows, financial analysis, sales, and marketing.

By Tuesday, investors reportedly reacted to the release by erasing roughly $285 billion in market value across software, financial services, and asset management stocks. A Goldman Sachs basket of US software stocks fell 6 percent that day, its steepest single-session decline since April’s tariff-driven sell-off. Thomson Reuters led the rout with an 18 percent drop, and the pain spread to European and Asian markets.

The purported fear among investors centers on AI model companies packaging complete workflows that compete with established software-as-a-service (SaaS) vendors, even if the jury is still out on whether these tools can actually accomplish those tasks.

OpenAI’s Frontier might deepen that concern: its stated design lets AI agents log in to applications, execute tasks, and manage work with minimal human involvement, which Fortune described as a bid to become “the operating system of the enterprise.” OpenAI CEO of Applications Fidji Simo pushed back on the idea that Frontier replaces existing software, telling reporters, “Frontier is really a recognition that we’re not going to build everything ourselves.”

Whether these co-working apps actually live up to their billing or not, the convergence is hard to miss. Anthropic’s Scott White, the company’s head of product for enterprise, gave the practice a name that is likely to roll a few eyes. “Everybody has seen this transformation happen with software engineering in the last year and a half, where vibe coding started to exist as a concept, and people could now do things with their ideas,” White told CNBC. “I think that we are now transitioning almost into vibe working.”


Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.



With GPT-5.3-Codex, OpenAI pitches Codex for more than just writing code

Today, OpenAI announced GPT-5.3-Codex, a new version of its frontier coding model that will be available via the command line, IDE extension, web interface, and the new macOS desktop app. (No API access yet, but it’s coming.)

GPT-5.3-Codex outperforms GPT-5.2-Codex and GPT-5.2 in SWE-Bench Pro, Terminal-Bench 2.0, and other benchmarks, according to the company’s testing.

There are already a few headlines out there saying “Codex built itself,” but let’s reality-check that: it’s an overstatement. The domains OpenAI describes using it for are similar to those at other enterprise software development firms: managing deployments, debugging, and handling test results and evaluations. There is no claim here that GPT-5.3-Codex built itself.

Instead, OpenAI says GPT-5.3-Codex was “instrumental in creating itself.” You can read more about what that means in the company’s blog post.

But that’s part of the pitch with this model update—OpenAI is trying to position Codex as a tool that does more than generate lines of code. The goal is to make it useful for “all of the work in the software lifecycle—debugging, deploying, monitoring, writing PRDs, editing copy, user research, tests, metrics, and more.” There’s also an emphasis on steering the model mid-task and frequent status updates.



OpenAI is hoppin’ mad about Anthropic’s new Super Bowl TV ads

On Wednesday, OpenAI CEO Sam Altman and Chief Marketing Officer Kate Rouch complained on X after rival AI lab Anthropic released four commercials, two of which will run during the Super Bowl on Sunday, mocking the idea of including ads in AI chatbot conversations. Anthropic’s campaign seemingly touched a nerve at OpenAI just weeks after the ChatGPT maker began testing ads in a lower-cost tier of its chatbot.

Altman called Anthropic’s ads “clearly dishonest,” accused the company of being “authoritarian,” and said it “serves an expensive product to rich people,” while Rouch wrote, “Real betrayal isn’t ads. It’s control.”

Anthropic’s four commercials, part of a campaign called “A Time and a Place,” each open with a single word splashed across the screen: “Betrayal,” “Violation,” “Deception,” and “Treachery.” They depict scenarios where a person asks a human stand-in for an AI chatbot for personal advice, only to get blindsided by a product pitch.

Anthropic’s 2026 Super Bowl commercial.

In one spot, a man asks a therapist-style chatbot (a woman sitting in a chair) how to communicate better with his mom. The bot offers a few suggestions, then pivots to promoting a fictional cougar-dating site called Golden Encounters.

In another spot, a skinny man looking for fitness tips instead gets served an ad for height-boosting insoles. Each ad ends with the tagline: “Ads are coming to AI. But not to Claude.” Anthropic plans to air a 30-second version during Super Bowl LX, with a 60-second cut running in the pregame, according to CNBC.

In the X posts, the OpenAI executives argue that these commercials are misleading because the planned ChatGPT ads will appear labeled at the bottom of conversational responses in banners and will not alter the chatbot’s answers.

But there’s a slight twist: OpenAI’s own blog post about its ad plans states that the company will “test ads at the bottom of answers in ChatGPT when there’s a relevant sponsored product or service based on your current conversation,” meaning the ads will be conversation-specific.

The financial backdrop explains some of the tension over ads in chatbots. As Ars previously reported, OpenAI struck more than $1.4 trillion in infrastructure deals in 2025 and expects to burn roughly $9 billion this year while generating about $13 billion in revenue. Only about 5 percent of ChatGPT’s 800 million weekly users pay for subscriptions. Anthropic is also not yet profitable, but it relies on enterprise contracts and paid subscriptions rather than advertising, and it has not taken on infrastructure commitments at the same scale as OpenAI.



Should AI chatbots have ads? Anthropic says no.

Different incentives, different futures

In its blog post, Anthropic describes internal analysis it conducted that suggests many Claude conversations involve topics that are “sensitive or deeply personal” or require sustained focus on complex tasks. In these contexts, Anthropic wrote, “The appearance of ads would feel incongruous—and, in many cases, inappropriate.”

The company also argued that advertising introduces incentives that could conflict with providing genuinely helpful advice. It gave the example of a user mentioning trouble sleeping: an ad-free assistant would explore various causes, while an ad-supported one might steer the conversation toward a transaction.

“Users shouldn’t have to second-guess whether an AI is genuinely helping them or subtly steering the conversation towards something monetizable,” Anthropic wrote.

Currently, OpenAI does not plan to include paid product recommendations within a ChatGPT conversation. Instead, the ads appear as banners alongside the conversation text.

OpenAI CEO Sam Altman has previously expressed reservations about mixing ads and AI conversations. In a 2024 interview at Harvard University, he described the combination as “uniquely unsettling” and said he would not like having to “figure out exactly how much was who paying here to influence what I’m being shown.”

A key part of Altman’s partial change of heart is that OpenAI faces enormous financial pressure. The company made more than $1.4 trillion worth of infrastructure deals in 2025, and according to documents obtained by The Wall Street Journal, it expects to burn through roughly $9 billion this year while generating $13 billion in revenue. Only about 5 percent of ChatGPT’s 800 million weekly users pay for subscriptions.

Much like OpenAI, Anthropic is not yet profitable, but it is expected to get there much faster. Anthropic has not attempted to span the world with massive datacenters, and its business model largely relies on enterprise contracts and paid subscriptions. The company says Claude Code and Cowork have already brought in at least $1 billion in revenue, according to Axios.

“Our business model is straightforward,” Anthropic wrote. “This is a choice with tradeoffs, and we respect that other AI companies might reasonably reach different conclusions.”



Nvidia’s $100 billion OpenAI deal has seemingly vanished

A Wall Street Journal report on Friday said Nvidia insiders had expressed doubts about the transaction and that Huang had privately criticized what he described as a lack of discipline in OpenAI’s business approach. The Journal also reported that Huang had expressed concern about the competition OpenAI faces from Google and Anthropic. Huang called those claims “nonsense.”

Nvidia shares fell about 1.1 percent on Monday following the reports. Sarah Kunst, managing director at Cleo Capital, told CNBC that the back-and-forth was unusual. “One of the things I did notice about Jensen Huang is that there wasn’t a strong ‘It will be $100 billion.’ It was, ‘It will be big. It will be our biggest investment ever.’ And so I do think there are some question marks there.”

In September, Bryn Talkington, managing partner at Requisite Capital Management, noted the circular nature of such investments to CNBC. “Nvidia invests $100 billion in OpenAI, which then OpenAI turns back and gives it back to Nvidia,” Talkington said. “I feel like this is going to be very virtuous for Jensen.”

Tech critic Ed Zitron has long been critical of Nvidia’s circular investments, which touch dozens of tech companies, including major players and startups. All of them are also Nvidia customers.

“NVIDIA seeds companies and gives them the guaranteed contracts necessary to raise debt to buy GPUs from NVIDIA,” Zitron wrote on Bluesky last September, “even though these companies are horribly unprofitable and will eventually die from a lack of any real demand.”

Chips from other places

Outside of sourcing GPUs from Nvidia, OpenAI has reportedly discussed working with startups Cerebras and Groq, both of which build chips designed to reduce inference latency. But in December, Nvidia struck a $20 billion licensing deal with Groq, which Reuters sources say ended OpenAI’s talks with Groq. Nvidia hired Groq’s founder and CEO Jonathan Ross along with other senior leaders as part of the arrangement.

In January, OpenAI announced a $10 billion deal with Cerebras instead, adding 750 megawatts of computing capacity for faster inference through 2028. Sachin Katti, who joined OpenAI from Intel in November to lead compute infrastructure, said the partnership adds “a dedicated low-latency inference solution” to OpenAI’s platform.

But OpenAI has clearly been hedging its bets. Beyond the Cerebras deal, the company struck an agreement with AMD in October for six gigawatts of GPUs and announced plans with Broadcom to develop a custom AI chip to wean itself off Nvidia. When those chips will be ready, however, is currently unknown.
