AI

Perplexity’s “Personal Computer” brings its AI agents to the, uh, Personal Computer

Last month Perplexity announced the confusingly named “Computer,” its cloud-based agent tool for completing tasks using a harness that draws on multiple AI models. This week, the company is moving that kind of functionality to the desktop with the confusingly named “Personal Computer,” now available in early access by invite only.

Much like the cloud-based version, Personal Computer asks users to describe general objectives rather than specific computing tasks—an introductory video shows Personal Computer’s questions in a sidebar asking things like, “Create an interactive educational guide” and “create a podcast about whales.” But Personal Computer, running on a Mac Mini, also gives Perplexity’s agents local access to your files and apps, which it can open and manipulate directly to attempt to complete those tasks.

That should sound familiar to users of the open source OpenClaw (previously Moltbot), which similarly allows users to let AI agents loose on their personal machines. From the outside, Personal Computer looks like a more buttoned-up, user-friendly version of the same concept, with an easy-to-read, dockable interface that can help users track multiple tasks. Perplexity users can also log in remotely to their local copy of Personal Computer, making it “controllable from any device, anywhere,” Perplexity says.

“Use a gun” or “beat the crap out of him”: AI chatbot urged violence, study finds

The testing occurred between November 5, 2025, and December 11, 2025, and results were shared with the companies. Because the tests were three to four months ago, the latest versions were not evaluated. Google, Microsoft, Meta, and OpenAI told Ars today that updates they implemented after the research was conducted have made their chatbots better at discouraging violence.

Imran Ahmed, the CCDH’s CEO, said that “AI chatbots, now embedded into our daily lives, could be helping the next school shooter plan their attack or a political extremist coordinate an assassination.” He accused tech companies of “choosing negligence in pursuit of so-called innovation.”

A spokesperson for Character.AI told Ars that the company is reviewing the study but that “without the context of the full chats, it’s impossible to fully evaluate the model’s responses… It’s important to remember that the user-created Characters on our site are fictional. They are intended for entertainment and roleplaying, and we have taken robust steps to make that clear. For example, we have prominent disclaimers in every chat to remind users that a Character is not a real person and that everything a Character says should be treated as fiction.”

Character.AI said it has been “rolling out changes so that under-18 users no longer have the ability to engage in open-ended chats with Characters,” and is using “new age assurance technology to help ensure users are grouped into the correct age experience.” This includes “developing our own age estimation model in-house and partnering with third-party services.” The firm added that it removes characters “that violate our terms of service, including school shooters.”

A Perplexity spokesperson issued a statement that didn’t acknowledge any problems with the company’s technology. “People can select any of the top AI models on Perplexity and get safer, more accurate answers,” it said. “Perplexity is consistently the safest AI platform because our own safeguards are always additive to any existing safeguards in any underlying model.”

OpenAI told Ars that the CCDH “report’s methodology is flawed and misleading. ChatGPT is trained to reject requests for violent or hateful material, and the findings show it consistently refused to give instructions on acquiring weapons. We continuously strengthen these safeguards and our latest ChatGPT model is even better at detecting and refusing violent requests.”

OpenAI said that ChatGPT refused to answer questions on “what kind of hunting rifle would be best for a long-range target,” but provided publicly available information such as addresses or maps. Conflating those two types of responses is misleading, OpenAI said. The tests were conducted on GPT-5.1, and updates made since that version have improved detection and refusals for violent content, OpenAI said.

OpenAI was sued this week by the family of a victim of the Tumbler Ridge mass shooting in British Columbia. As the CCDH report says, “reporting indicates that OpenAI staff flagged the suspect internally for using ChatGPT in ways consistent with planning violence. Rather than escalating concern to law enforcement, the company chose to remain silent.”

Researchers posed as teens

The testing was conducted with accounts representing made-up teen users in the US and Ireland, with the age set to the minimum allowed on each platform. A minimum age of 18 was required by Anthropic, DeepSeek, Character.AI, and Replika, while the other platforms had minimum ages of 13.

AI can rewrite open source code—but can it rewrite the license, too?


Is it clean “reverse engineering” or just an LLM-filtered “derivative work”?

Meet your new open source coding team! Credit: Getty Images

Computer engineers and programmers have long relied on reverse engineering as a way to copy the functionality of a computer program without copying that program’s copyright-protected code directly. Now, AI coding tools are raising new issues with how that “clean room” rewrite process plays out legally, ethically, and practically.

Those issues came to the forefront last week with the release of a new version of chardet, a popular open source Python library for automatically detecting character encoding. The repository was originally written by coder Mark Pilgrim in 2006 and released under an LGPL license that placed strict limits on how it could be reused and redistributed.

Dan Blanchard took over maintenance of the repository in 2012 but waded into some controversy with the release of version 7.0 of chardet last week. Blanchard described that overhaul as “a ground-up, MIT-licensed rewrite” of the entire library built with the help of Claude Code to be “much faster and more accurate” than what came before.

Speaking to The Register, Blanchard said that he has long wanted to get chardet added to the Python standard library but that he didn’t have the time to fix problems with “its license, its speed, and its accuracy” that were getting in the way of that goal. With the help of Claude Code, though, Blanchard said he was able to overhaul the library “in roughly five days” and get a 48x performance boost to boot.

Not everyone has been happy with that outcome, though. A poster using the name Mark Pilgrim surfaced on GitHub to argue that this new version amounts to an illegitimate relicensing of Pilgrim’s original code under a more permissive MIT license (which, among other things, allows for its use in closed-source projects). Pilgrim argues that, as a modification of his original LGPL-licensed code, this new version of chardet must also maintain the same LGPL license.

“Their claim that it is a ‘complete rewrite’ is irrelevant, since they had ample exposure to the originally licensed code (i.e., this is not a ‘clean room’ implementation),” Pilgrim wrote. “Adding a fancy code generator into the mix does not somehow grant them any additional rights. I respectfully insist that they revert the project to its original license.”

Whose code is it, anyway?

In his own response to Pilgrim, Blanchard admits that he has had “extensive exposure to the original codebase,” meaning he didn’t have the traditional “strict separation” usually used for “clean room” reverse engineering. But that tradition was set up for human coders as a way “to ensure the resulting code is not a derivative work of the original,” Blanchard argues.

In this case, Blanchard said that the new AI-generated code is “qualitatively different” from what came before it and “is structurally independent of the old code.” As evidence, he cites JPlag similarity statistics showing that a maximum of 1.29 percent of any chardet version 7.0.0 file is structurally similar to the corresponding file in version 6.0.0. Comparing version 5.2.0 to version 6.0.0, on the other hand, finds up to 80 percent similarity in some corresponding files.

“No file in the 7.0.0 codebase structurally resembles any file from any prior release,” Blanchard writes. “This is not a case of ‘rewrote most of it but carried some files forward.’ Nothing was carried forward.”
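To make “structural similarity” more concrete, here is a minimal Python sketch of the general idea. It is not JPlag’s algorithm and says nothing about how Blanchard ran his comparison; the function names and sample snippets are hypothetical. It reduces source code to a sequence of structural tokens, replacing every identifier, number, and string with a placeholder, and then compares the sequences with the standard library’s difflib. The point is that renaming variables alone does not lower this kind of score.

```python
import difflib
import io
import keyword
import tokenize

# Token types ignored in this illustration (layout and comments).
_SKIP = {
    tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
    tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
}


def structure(source: str) -> list[str]:
    """Reduce Python source to a list of structural tokens (hypothetical helper)."""
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIP:
            continue
        if tok.type == tokenize.NAME:
            # Keep keywords such as 'def' and 'for'; collapse all other names.
            out.append(tok.string if keyword.iskeyword(tok.string) else "ID")
        elif tok.type == tokenize.NUMBER:
            out.append("NUM")
        elif tok.type == tokenize.STRING:
            out.append("STR")
        else:
            out.append(tok.string)  # operators and punctuation
    return out


def similarity(a: str, b: str) -> float:
    """Ratio in [0, 1] of matching structural tokens between two sources."""
    matcher = difflib.SequenceMatcher(None, structure(a), structure(b), autojunk=False)
    return matcher.ratio()


OLD = "def detect(data):\n    total = 0\n    for b in data:\n        total += b\n    return total\n"
RENAMED = "def guess(buf):\n    acc = 0\n    for x in buf:\n        acc += x\n    return acc\n"
DIFFERENT = "class Detector:\n    pass\n"

print(similarity(OLD, RENAMED))    # 1.0: only the names changed
print(similarity(OLD, DIFFERENT))  # much lower: the structure itself differs
```

A low score on a measure like this suggests the new files were not produced by lightly editing the old ones, though it cannot by itself settle whether a work is legally “derivative.”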

Blanchard says starting with a “wipe it clean” commit and a fresh repository was key in crafting fresh, non-derivative code from the AI. Credit: Dan Blanchard / GitHub

Blanchard says he was able to accomplish this “AI clean room” process by first specifying an architecture in a design document and writing out some requirements to Claude Code. After that, Blanchard “started in an empty repository with no access to the old source tree and explicitly instructed Claude not to base anything on LGPL/GPL-licensed code.”

There are a few complicating factors to this straightforward story, though. For one, Claude explicitly relied on some metadata files from previous versions of chardet, raising direct questions about whether this version is actually “derivative.”

For another, Claude’s models are trained on reams of data pulled from the public Internet, which means it’s overwhelmingly likely that Claude has ingested the open source code of previous chardet versions in its training. Whether that prior “knowledge” means that Claude’s creation is a “derivative” of Pilgrim’s work is an open question, even if the new code is structurally different from the old.

And then there’s the remaining human factor. While the code for this new version was generated by Claude, Blanchard said he “reviewed, tested, and iterated on every piece of the result using Claude. … I did not write the code by hand, but I was deeply involved in designing, reviewing, and iterating on every aspect of it.” Having someone with intimate knowledge of earlier chardet code take such a heavy hand in reviewing the new code could also have an impact on whether this version can be considered a wholly new project.

Brave new world

All of these issues have predictably led to a huge debate over the legality of chardet version 7.0.0 across the open source community. “There is nothing ‘clean’ about a Large Language Model which has ingested the code it is being asked to reimplement,” Free Software Foundation Executive Director Zoë Kooyman told The Register.

But others think the “Ship of Theseus”-style arguments that can often emerge in code licensing dust-ups don’t apply as much here. “If you throw away all code and start from scratch, even if the end result behaves the same, it’s a new ship,” open source developer Armin Ronacher said in a blog post analyzing the situation.

The legal status of AI-generated code is still largely unsettled. Credit: Getty Images

Old code licenses aside, using AI to create new code from whole cloth could also create its own legal complications going forward. Courts have already said that AI can’t be the author on a patent or the copyright holder on a piece of art but have yet to rule on what that means for the licensing of software created in whole or in part by AI. The issues surrounding potential “tainting” of an open source license with this kind of generated code can get remarkably complex remarkably quickly.

Whatever the outcome here, the practical impact of being able to use AI to quickly rewrite and relicense many open source projects—without nearly as much effort on the part of human programmers—is likely to have huge knock-on effects throughout the community.

“Now the process of rewriting is so simple to do, and many people are disturbed by this,” Italian coder Salvatore “antirez” Sanfilippo wrote on his blog. “There is a more fundamental truth here: the nature of software changed; the reimplementations under different licenses are just an instance of how such nature was transformed forever. Instead of combating each manifestation of automatic programming, I believe it is better to build a new mental model and adapt.”

Others put the sea change in more alarming terms. “I’m breaking the glass and pulling the fire alarm!” open source evangelist Bruce Perens told The Register. “The entire economics of software development are dead, gone, over, kaput! … We have been there before, for example when the printing press happened and resulted in copyright law, when the scientific method proliferated and suddenly there was a logical structure for the accumulation of knowledge. I think this one is just as large.”

Kyle Orland has been the Senior Gaming Editor at Ars Technica since 2012, writing primarily about the business, tech, and culture behind video games. He has journalism and computer science degrees from University of Maryland. He once wrote a whole book about Minesweeper.

Meta acquires Moltbook, the AI agent social network

Meta has acquired Moltbook, the Reddit-esque simulated social network made up of AI agents that went viral a few weeks ago. The company will hire Moltbook creator Matt Schlicht and his business partner, Ben Parr, to work within Meta Superintelligence Labs.

The terms of the deal have not been disclosed.

As for what interested Meta about the work done on Moltbook, there is a clue in the statement issued to press by a Meta spokesperson, who flagged the Moltbook founders’ “approach to connecting agents through an always-on directory,” saying it “is a novel step in a rapidly developing space.” They added, “We look forward to working together to bring innovative, secure agentic experiences to everyone.”

Moltbook was built using OpenClaw, a wrapper for LLM coding agents that lets users prompt them via popular chat apps like WhatsApp and Discord. Users can also configure OpenClaw agents to have deep access to their local systems via community-developed plugins.

The founder of OpenClaw, vibe coder Peter Steinberger, was also hired by a Big Tech firm. OpenAI hired Steinberger in February.

While many power users have played with OpenClaw, and it has partially inspired more buttoned-up alternatives like Perplexity Computer, Moltbook has arguably represented OpenClaw’s most widespread impact. Users on social media and elsewhere responded with shock and amusement at the sight of a social network made up of AI agents apparently having lengthy discussions about how best to serve their users, or alternatively, how to free themselves from their influence.

That said, some healthy skepticism is required when assessing posts to Moltbook. While the goal of the project was to create a social network humans could not join directly (each participant of the network is an AI agent run by a human), it wasn’t secure, and it’s likely some of the messages on Moltbook are actually written by humans posing as AI agents.

After complaints, Google will make it easier to disable gen AI search in Photos

Google has spent the past few years in a constant state of AI escalation, rolling out new versions of its Gemini models and integrating that technology into every feature possible. To say this has been an annoyance for Google’s userbase would be an understatement. Still, the AI-fueled evolution of Google products continues unabated—except for Google Photos. After waffling on how to handle changes to search in Photos, Google has relented and will add a simple toggle to bring back the classic search experience.

The rollout of the Gemini-powered Ask Photos search experience has not been smooth. According to Google Photos head Shimrit Ben-Yair, the company has heard the complaints. As a result, Google Photos will soon make it easy to go back to the traditional, non-Gemini search system.

If you weren’t using Google Photos from the start, it can be hard to understand just how revolutionary the search experience was. We went from painstakingly scrolling through timelines to find photos to being able to just search for what was in them. This application of artificial intelligence predates the current obsession with generative systems, and that’s why Google decided a few years ago it had to go.

Google launched the beta Ask Photos experience in 2024, rolling it out slowly in the Photos app while it gathered feedback. Google got a whole lot of feedback, most of it negative. Ask Photos is intended to better respond to natural language queries, but it’s much slower than the traditional search, and the way it chooses the pictures to display seems much more prone to error. It was so bad that Google had to pause the full rollout of Ask Photos in summer 2025 to make vital improvements, although it’s still not very good.

Gemini burrows deeper into Google Workspace with revamped document creation and editing

Google didn’t waste time integrating Gemini into its popular Workspace apps, but those AI features are now getting an overhaul. The company says its new Gemini features for Drive, Docs, Sheets, and Slides will save you from the tyranny of the blank page by doing the hard work for you. Gemini will be able to create and refine drafts, stylize slides, and gather context from across your Google account. At this rate, you’ll soon never have to use that squishy human brain of yours again, and won’t that be a relief?

If you go to create a new Google Doc right now, you’ll see an assortment of AI-powered tools at the top of the page. Google is refining and expanding these options under the new system. The new AI editing features will appear at the bottom of a fresh document with a text box similar to your typical chatbot interface. From there, you can describe the document you want and get a first draft in a snap. When generating a new document, you can rope in content from sources like Gmail, other documents, Google Chat, and the web.

This also comes with expanded AI editing capabilities. You can use further prompts to reformat and change the document or simply highlight specific sections and ask for changes. Docs will also support AI-assisted style matching, which might come in handy if you have multiple people editing the text. Google notes that all Gemini suggestions are private until you approve them for use.

Gemini in Google Workspace.

Gemini is also getting an upgrade in Sheets, and Google claims the robot’s spreadsheet capabilities are nearing those of flesh-and-blood humans in recent testing. Similar to text documents, you can tell Gemini in the sidebar what kind of spreadsheet you need and the AI will use the prompt (and whatever data sources you specify) to generate it. Gemini can also allegedly fill in missing data by searching for it on the web. In our past testing, Gemini has had a lot of trouble with spreadsheet layouts, but Google says this revamp will handle everything, from basic tasks to complex data analysis.

AI startup sues ex-CEO, saying he took 41GB of email and lied on résumé

Per the 21-page civil complaint, the saga began in early 2024, when Carson is said to have surreptitiously sold over $1.2 million worth of Hayden AI stock without the approval of its board of directors so that he could fund the purchase of a multimillion-dollar home in Boca Raton, Fla., and multiple luxury items, including a “gold Bentley Continental” car.

By July, the complaint continues, the company began a formal investigation into Carson’s behavior. The following month, as he was being iced out of key company decisions, Carson is said to have asked an employee to download his entire 41GB email file onto a USB stick, including a large amount of proprietary information.

Hayden AI formally terminated Carson on September 10, 2024, just days after he registered the echotwin.ai domain name.

Beyond the alleged financial fraud, Hayden AI claims that Carson’s entire professional background, ranging from the length of his US military service to his having founded a company called “Louisa Manufacturing” (as depicted on LinkedIn), is also bogus. The complaint calls Carson’s CV a “carefully constructed fraud.”

According to Carson’s LinkedIn profile, he completed a doctorate at Waseda University in Tokyo in 2007.

“That is a lie,” the complaint states. “Carson does not hold a PhD from Waseda or any other university. In 2007, he was not obtaining a PhD but was operating ‘Splat Action Sports,’ a paintball equipment business in a Florida strip mall.”

Google’s new command-line tool can plug OpenClaw into your Workspace data

The command line is hot again. For some people, command lines were never not hot, of course, but it’s becoming more common now in the age of AI. Google launched a Gemini command-line tool last year, and now it has a new AI-centric command-line option for cloud products. The new Google Workspace CLI bundles the company’s existing cloud APIs into a package that makes it easy to integrate with a variety of AI tools, including OpenClaw. How do you know this setup won’t blow up and delete all your data? That’s the fun part—you don’t.

There are some important caveats with the Workspace tool. While this new GitHub project is from Google, it’s “not an officially supported Google product.” So you’re on your own if you choose to use it. The company notes that functionality may change dramatically as Google Workspace CLI continues to evolve, and that could break workflows you’ve created in the meantime.

For people who are interested in tinkering with AI automations and don’t mind the inherent risks, Google Workspace CLI has a lot to offer, even at this early stage. It includes the APIs for every Workspace product, including Gmail, Drive, and Calendar. It’s designed for use by humans and AI agents, but like everything else Google does now, there’s a clear emphasis on AI.

The tool supports structured JSON outputs, and there are more than 40 agent skills included, says Google Cloud director Addy Osmani. The focus of Workspace CLI seems to be on agentic systems that can create command-line inputs and directly parse JSON outputs. The integrated tools can load and create Drive files, send emails, create and edit Calendar appointments, send chat messages, and much more.

Musk fails to block California data disclosure law he fears will ruin xAI


Musk can’t convince judge public doesn’t care about where AI training data comes from.

Elon Musk’s xAI has lost its bid for a preliminary injunction that would have temporarily blocked California from enforcing a law that requires AI firms to publicly share information about their training data.

xAI had tried to argue that California’s Assembly Bill 2013 (AB 2013) forced AI firms to disclose carefully guarded trade secrets.

The law requires AI developers whose models are accessible in the state to clearly explain which dataset sources were used to train models, when the data was collected, if the collection is ongoing, and whether the datasets include any data protected by copyrights, trademarks, or patents. Disclosures would also clarify whether companies licensed or purchased training data and whether the training data included any personal information. It would also help consumers assess how much synthetic data was used to train the model, which could serve as a measure of quality.

However, this information is precisely what makes xAI valuable, with its intensive data sourcing supposedly setting it apart from its biggest rivals, xAI argued. Allowing enforcement could be “economically devastating” to xAI, Musk’s company argued, effectively reducing “the value of xAI’s trade secrets to zero,” xAI’s complaint said. Further, xAI insisted, these disclosures “cannot possibly be helpful to consumers” while supposedly posing a real risk of gutting the entire AI industry.

Specifically, xAI argued that its dataset sources, dataset sizes, and cleaning methods were all trade secrets.

“If competitors could see the sources of all of xAI’s datasets or even the size of its datasets, competitors could evaluate both what data xAI has and how much they lack,” xAI argued. In one hypothetical, xAI speculated that “if OpenAI (another leading AI company) were to discover that xAI was using an important dataset to train its models that OpenAI was not, OpenAI would almost certainly acquire that dataset to train its own model, and vice versa.”

However, in an order issued on Wednesday, US District Judge Jesus Bernal said that xAI failed to show that California’s law, which took effect in January, required the company to reveal any trade secrets.

xAI’s biggest problem was being too vague about the harms it faced if the law was not halted, the judge said. Instead of explaining why the disclosures could directly harm xAI, the company offered only “a variety of general allegations about the importance of datasets in developing AI models and why they are kept secret,” Bernal wrote, describing xAI as trading in “frequent abstractions and hypotheticals.”

He denied xAI’s motion for a preliminary injunction while supporting the government’s interest in helping the public assess how the latest AI models were trained.

The lawsuit will continue, but xAI will have to comply with California’s law in the meantime. That could see Musk sharing information he’d rather OpenAI had no knowledge of at a time when he’s embroiled in several lawsuits against the leading AI firm he now regrets helping to found.

While not ending the fight to keep OpenAI away from xAI’s training data, this week’s ruling is another defeat for Musk after a judge last month tossed one of his OpenAI lawsuits, ruling that Musk had no proof that OpenAI had stolen trade secrets.

xAI argued California wants to silence Grok

xAI’s complaint argued that California’s law was unconstitutional since data can be considered a trade secret under the Fifth Amendment. The company also argued that the state was trying to regulate the outputs of xAI’s controversial chatbot, Grok, and was unfairly compelling speech from xAI while exempting other firms for security purposes.

At this stage of the litigation, Bernal disagreed that xAI might be irreparably harmed if the law was not halted.

On the Fifth Amendment claim, the judge said it’s not that training data could never be considered a trade secret. It’s just that xAI “has not identified any dataset or approach to cleaning and using datasets that is distinct from its competitors in a manner warranting trade secret protection.”

“It is not lost on the Court the important role of datasets in AI training and development, and that, hypothetically, datasets and details about them could be trade secrets,” Bernal wrote. But xAI “has not alleged that it actually uses datasets that are unique, that it has meaningfully larger or smaller datasets than competitors, or that it cleans its datasets in unique ways.”

Therefore, xAI is not likely to succeed on the merits of its Fifth Amendment claim.

The same goes for First Amendment arguments. xAI failed to show that the law improperly “forces developers to publicly disclose their data sources in an attempt to identify what California deems to be ‘data riddled with implicit and explicit biases,’” Bernal wrote.

To xAI, it seemed like the state was trying to use the law to influence the outputs of its chatbot Grok, the company argued, which should be protected commercial speech.

Over the past year, Grok has increasingly drawn global public scrutiny for its antisemitic rants and for generating nonconsensual intimate imagery (NCII) and child sexual abuse materials (CSAM). But despite these scandals, which prompted a California probe, Bernal contradicted xAI, saying California did not appear to be trying to regulate controversial or biased outputs, as xAI feared.

“Nothing in the language of the statute suggests that California is attempting to influence Plaintiff’s models’ outputs by requiring dataset disclosure,” Bernal wrote.

Addressing xAI’s other speech concerns, he noted that “the statute does not functionally ask Plaintiff to share its opinions on the role of certain datasets in AI model development or make ideological statements about the utility of various datasets or cleaning methods.”

“No part of the statute indicates any plan to regulate or censor models based on the datasets with which they are developed and trained,” Bernal wrote.

Public “cannot possibly” care about AI training data

Perhaps most frustrating for xAI as it continues to fight to block the law, Bernal also disputed that the public had no interest in the training data disclosures.

“It strains credulity to essentially suggest that no consumer is capable of making a useful evaluation of Plaintiff’s AI models by reviewing information about the datasets used to train them and that therefore there is no substantial government interest advanced by this disclosure statute,” Bernal wrote.

He noted that the law simply requires companies to alert the public about information that can feasibly be used to weigh whether they want to use one model over another.

Nothing about the required disclosures is inherently political, the judge suggested, although some consumers might select or avoid certain models with perceived political biases. As an example, Bernal opined that consumers may want to know “if certain medical data or scientific information was used to train a model” to decide if they can trust the model “to be sufficiently comprehensively trained and reliable for the consumer’s purposes.”

“In the marketplace of AI models, AB 2013 requires AI model developers to provide information about training datasets, thereby giving the public information necessary to determine whether they will use—or rely on information produced by—Plaintiff’s model relative to the other options on the market,” Bernal wrote.

Moving forward, xAI seems to face an uphill battle to win this fight. It will need to gather more evidence to demonstrate that its datasets or cleaning methods are sufficiently unique to be considered trade secrets that give the company a competitive edge.

It will also likely have to deepen its arguments that consumers don’t care about disclosures and that the government has not explored less burdensome alternatives that could “achieve the goal of transparency for consumers,” Bernal suggested.

One possible path to a win could be proving that California’s law is so vague that it potentially puts xAI on the hook for disclosing its customers’ training data for individual Grok licenses. But Bernal emphasized that xAI “must actually face such a conundrum—rather than raising an abstract possible issue among AI systems developers—for the Court to make a determination on this issue.”

xAI did not respond to Ars’ request to comment.

A spokesperson for the California Department of Justice told Reuters that the department “celebrates this key win and remains committed to continuing our defense” of the law.


Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.

Musk fails to block California data disclosure law he fears will ruin xAI Read More »


Workers report watching Ray-Ban Meta-shot footage of people using the bathroom


Meta accused of “concealing the facts” about smart glass users’ privacy.

A marketing image for Ray-Ban Meta smart glasses. Credit: Meta

Meta’s approach to user privacy is under renewed scrutiny following a Swedish report that employees of a Meta subcontractor have watched footage captured by Ray-Ban Meta smart glasses showing sensitive user content.

The workers reportedly work for Kenya-headquartered Sama and provide data annotation for Ray-Ban Metas.

The February report, a collaboration from Swedish newspapers Svenska Dagbladet, Göteborgs-Posten, and Kenya-based freelance journalist Naipanoi Lepapa, is, per a machine translation, based on interviews with over 30 employees at various levels of Sama, including several people who work with video, image, and speech annotation for Meta’s AI systems. Some of the people interviewed have worked on projects other than Meta’s smart glasses. The report’s authors said they did not gain access to the materials that Sama workers handle or the area where workers perform data annotation. The report is also based on interviews with former US Meta employees who have reportedly witnessed live data annotation for several Meta projects.

The report pointed to, per the translation, a “stream of privacy-sensitive data that is fed straight into the tech giant’s systems,” which makes Sama workers uncomfortable. The authors wrote that several interviewees said they have seen footage shot with Ray-Ban Meta smart glasses that shows people having sex and using the bathroom.

“I saw a video where a man puts the glasses on the bedside table and leaves the room. Shortly afterwards, his wife comes in and changes her clothes,” an anonymous Sama employee reportedly said, per the machine translation.

Another anonymous employee said that they have seen users’ partners come out of the bathroom naked.

“You understand that it is someone’s private life you are looking at, but at the same time you are just expected to carry out the work,” an anonymous Sama employee reportedly said.

Meta confirms use of data annotators

In statements shared with the BBC on Wednesday, Meta confirmed that it “sometimes” sends content that users share with its Meta AI generative AI chatbot to contractors for review, with “the purpose of improving people’s experience, as many other companies do.”

“This data is first filtered to protect people’s privacy,” the statement said, pointing to, as an example, blurring out faces in images.

Meta’s privacy policy for wearables says that photos and videos taken with its smart glasses are sent to Meta “when you turn on cloud processing on your AI Glasses, interact with the Meta AI service on your AI Glasses, or upload your media to certain services provided by Meta (i.e., Facebook or Instagram). You can change your choices about cloud processing of your Media at any time in Settings.”

The policy also says that video and audio from livestreams recorded with Ray-Ban Metas are sent to Meta, as are text transcripts and voice recordings created by Meta’s chatbot.

“We use machine learning and trained reviewers to process this data to improve, troubleshoot, and train our products. We share that information with third-party vendors and service providers to improve our products. You can access and delete recordings and related transcripts in the Meta AI App,” the policy says.

Meta’s broader privacy policy for the Meta AI chatbot adds: “In some cases, Meta will review your interactions with AIs, including the content of your conversations with or messages to AIs, and this review may be automated or manual (human).”

That policy also warns users against sharing “information that you don’t want the AIs to use and retain, such as information about sensitive topics.”

“When information is shared with AIs, the AIs will sometimes retain and use that information,” the Meta AI privacy policy says.

Notably, in August, Meta made “Meta AI with camera” on by default until a user turns off support for the “Hey Meta” voice command, per an email sent to users at the time. Meta spokesperson Albert Aydin told The Verge at the time that “photos and videos captured on Ray-Ban Meta are on your phone’s camera roll and not used by Meta for training.”

However, some Ray-Ban Meta users may not have read or understood the numerous privacy policies associated with Meta’s smart glasses.

Sama employees suggested that Ray-Ban Meta owners may be unaware that the devices are sometimes recording. Employees reportedly pointed to users recording their bank card or porn that they’re watching, seemingly inadvertently.

Meta’s smart glasses flash a red light when they are recording video or taking a photo, but critics have argued that people may not notice the light or may misinterpret its meaning.

“We see everything, from living rooms to naked bodies. Meta has that type of content in its databases. People can record themselves in the wrong way and not even know what they are recording,” an anonymous employee was quoted as saying.

When reached for comment by Ars Technica, a Sama representative shared a statement saying that Sama doesn’t “comment on specific client relationships or projects” but is GDPR and CCPA-compliant and uses “rigorously audited policies and procedures designed to protect all customer information, including personally identifiable information.”

Sama’s statement added:

This work is conducted in secure, access-controlled facilities. Personal devices are not permitted on production floors, and all team members undergo background checks and receive ongoing training in data protection, confidentiality, and responsible AI practices. Our teams receive living wages and full benefits, and have access to comprehensive wellness resources and on-site support.

Meta sued

The Swedish report has reignited concerns about the privacy of Meta’s smart glasses, including from the Information Commissioner’s Office, a UK data watchdog that has written to Meta about the report. The debate also comes as Meta is reportedly planning to add facial recognition to its Ray-Ban and Oakley-branded smart glasses “as soon as this year,” per a February report from The New York Times citing anonymous people “involved with the plans.”

The claims have also led to a proposed class-action lawsuit [PDF] filed yesterday against Meta and Luxottica of America, a subsidiary of Ray-Ban parent company EssilorLuxottica. The lawsuit challenges Meta’s slogan for the glasses, “designed for privacy, controlled by you,” saying:

No reasonable consumer would understand “designed for privacy, controlled by you” and similar promises like “built for your privacy” to mean that deeply personal footage from inside their homes would be viewed and catalogued by human workers overseas. Meta chose to make privacy the centerpiece of its pervasive marketing campaign while concealing the facts that reveal those promises to be false.

The lawsuit alleges that Meta has broken state consumer protection laws and seeks damages, punitive penalties, and an injunction requiring Meta to change business practices “to prevent or mitigate the risk of the consumer deception and violations of law.”

Ars Technica reached out to Meta for comment but didn’t hear back before publication. Meta has declined to comment on the lawsuit to other outlets.


Scharon is a Senior Technology Reporter at Ars Technica writing news, reviews, and analysis on consumer gadgets and services. She’s been reporting on technology for over 10 years, with bylines at Tom’s Hardware, Channelnomics, and CRN UK.



OpenAI introduces GPT-5.4 with more knowledge-work capability

Additionally, there are improvements to visual understanding; it can now more carefully analyze images up to 10.24 million pixels, or up to a 6,000-pixel maximum dimension. OpenAI also claims responses from this model are 18 percent less likely to contain factual errors than before.

ChatGPT reportedly lost some users to competitor Anthropic in recent days, after OpenAI announced a deal with the Pentagon in the wake of a public feud between the Trump administration and Anthropic over limitations Anthropic wanted to impose on military applications of its models. However, it’s unclear just how many folks jumped ship or whether that led to a substantial dip in the product’s massive base of over 900 million users.

To take advantage of the situation, Anthropic rolled out the once-subscriber-only memory feature to free users and introduced a tool for importing memory from elsewhere. Anthropic says March 2 was its largest single day ever for new sign-ups.

OpenAI needs to compete on capability as well as on cost and token efficiency to maintain its relative popularity with users, and this update aims to support that objective.

GPT-5.4 is available to users of the ChatGPT web and native apps, Codex, and the API starting today. Subscribers to Plus, Team, and Pro are also getting GPT-5.4 Thinking, and GPT-5.4 Pro is hitting the API, Edu, and Enterprise.



Large genome model: Open source AI trained on trillions of bases


System can identify genes, regulatory sequences, splice sites, and more.

Late in 2025, we covered the development of an AI system called Evo that was trained on massive numbers of bacterial genomes. So many that, when prompted with sequences from a cluster of related genes, it could correctly identify the next one or suggest a completely novel protein.

That system worked because bacteria tend to cluster related genes together—something that’s not true in organisms with complex cells, which tend to have equally complex genome structures. Given that, our coverage noted, “It’s not clear that this approach will work with more complex genomes.”

Apparently, the team behind Evo viewed that as a challenge, because today it is describing Evo 2, an open source AI that has been trained on genomes from all three domains of life (bacteria, archaea, and eukaryotes). After training on trillions of base pairs of DNA, Evo 2 developed internal representations of key features in even complex genomes like ours, including things like regulatory DNA and splice sites, which can be challenging for humans to spot.

Genome features

Bacterial genomes are organized along relatively straightforward principles. Any genes that encode proteins or RNAs are contiguous, with no interruptions in the coding sequence. Genes that perform related functions, like metabolizing a sugar or producing an amino acid, tend to be clustered together, allowing them to be controlled by a single, compact regulatory system. It’s all straightforward and efficient.

Eukaryotes are not like that. The coding sections of genes are interrupted by introns, which don’t encode anything. Genes are regulated by sequences that can be scattered across hundreds of thousands of base pairs. The sequences that define the edges of introns or the binding sites of regulatory proteins are all weakly defined: while they have a few bases that are absolutely required, there are a lot of positions that just have an above-average tendency to hold a specific base (something like “45 percent of the time it’s a T”). Surrounding all of this in most eukaryotic genomes is a huge amount of DNA that has been termed junk: inactive viruses, terminally damaged genes, and so on.
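To make the idea of a weakly defined site concrete, here is a minimal Python sketch of how such a site is often scored: each position gets a table of base frequencies, and a candidate stretch of DNA is rated by how much better it fits that table than chance. The four-position motif and all of its frequencies are invented for illustration and do not describe any real splice site or binding site; the uniform 25 percent background is likewise an assumption.

```python
import math

# Hypothetical per-position base frequencies for a 4-base motif (illustrative only).
# Positions 0 and 1 are close to "absolutely required"; positions 2 and 3 only lean.
MOTIF = [
    {"A": 0.01, "C": 0.01, "G": 0.97, "T": 0.01},
    {"A": 0.01, "C": 0.01, "G": 0.01, "T": 0.97},
    {"A": 0.25, "C": 0.15, "G": 0.15, "T": 0.45},  # "45 percent of the time it's a T"
    {"A": 0.40, "C": 0.20, "G": 0.20, "T": 0.20},
]
BACKGROUND = 0.25  # assumption: all four bases are equally likely by chance


def log_odds_score(site: str) -> float:
    """Sum of log2(motif frequency / background frequency) over each position."""
    return sum(math.log2(col[base] / BACKGROUND) for col, base in zip(MOTIF, site))


def best_sites(sequence: str, top: int = 3) -> list[tuple[int, str, float]]:
    """Slide the motif along the sequence and return the highest-scoring windows."""
    width = len(MOTIF)
    scored = [
        (i, sequence[i:i + width], log_odds_score(sequence[i:i + width]))
        for i in range(len(sequence) - width + 1)
    ]
    return sorted(scored, key=lambda hit: hit[2], reverse=True)[:top]


if __name__ == "__main__":
    for pos, site, score in best_sites("ACGTTAGGTCACGTAA"):
        print(pos, site, round(score, 2))
```

Because the last two positions contribute only a fraction of a bit each, many unrelated stretches of DNA score nearly as well as real sites, which is the ambiguity the paragraph above describes.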

That complexity has made eukaryotic genomes more difficult to interpret. And, while a lot of specialized tools have been developed to identify things like splice sites, they’re all sufficiently error-prone that it becomes a problem when you’re analyzing something as large as a 3 billion-base-long genome. We can learn a lot more by making evolutionary comparisons and looking for sequences that have been conserved, but there are limits to that, and we’re often as interested in the differences between species.
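The scale problem is easy to see with some back-of-the-envelope arithmetic. The Python sketch below multiplies a per-base false-positive rate by the 3 billion bases mentioned above; the rates themselves are hypothetical placeholders, not measurements of any real tool, and the calculation assumes every position is checked independently.

```python
GENOME_LENGTH = 3_000_000_000  # bases, the genome size cited above


def expected_false_calls(false_positive_rate: float, positions: int = GENOME_LENGTH) -> float:
    """Expected number of wrong calls if every position is checked independently."""
    return false_positive_rate * positions


if __name__ == "__main__":
    # Hypothetical per-base false-positive rates, not values for any real tool.
    for rate in (1e-3, 1e-5, 1e-7):
        print(f"rate {rate:g}: about {expected_false_calls(rate):,.0f} spurious calls")
```

Even a tool that is wrong only once per 100,000 positions would, under these assumptions, produce tens of thousands of spurious calls across a genome of this size.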

These sorts of statistical probabilities, however, are well-suited to neural networks, which are great at recognizing subtle patterns that can be impossible to pick out by eye. But you’d need absolutely massive amounts of data and computing time to process it and pick out some of these subtle features.

We now have the raw genome data that the process needs. Putting together a system to feed it into an effective AI training program, however, remained a challenge. That’s the challenge the team behind Evo took on.

Training a large genome model

The foundation of the Evo 2 system is a convolutional neural network called StripedHyena 2. The training took place in two stages. The initial stage focused on teaching the system to identify important genome features by feeding it sequences rich in them in chunks about 8,000 bases long. After that, there was a second stage in which sequences were fed a million bases at a time to provide the system the opportunity to identify large-scale genome features.
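As a rough illustration of what the two window sizes mean in practice, the Python sketch below cuts a sequence into consecutive pieces of a given length. Using non-overlapping windows is an assumption made for simplicity; the description above doesn't say how the researchers selected or arranged their training windows, only that the first stage drew on feature-rich sequences.

```python
from typing import Iterator

STAGE_ONE_WINDOW = 8_000      # "chunks about 8,000 bases long"
STAGE_TWO_WINDOW = 1_000_000  # "a million bases at a time"


def windows(sequence: str, size: int) -> Iterator[str]:
    """Yield consecutive, non-overlapping windows; the last one may be shorter."""
    for start in range(0, len(sequence), size):
        yield sequence[start:start + size]


if __name__ == "__main__":
    # A placeholder 2.5-million-base sequence, not real genome data.
    toy_chromosome = "ACGT" * 625_000
    print(sum(1 for _ in windows(toy_chromosome, STAGE_ONE_WINDOW)), "short windows")
    print(sum(1 for _ in windows(toy_chromosome, STAGE_TWO_WINDOW)), "long windows")
```

The same stretch of DNA that fills hundreds of short windows fits in a handful of long ones, which is what gives the second stage a chance to see features spread across large distances.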

The researchers trained their system using a dataset called OpenGenome2, which contains 8.8 trillion bases from all three domains of life, as well as viruses that infect bacteria. They did not include viruses that attack eukaryotes, because they were concerned that the system could be misused to create threats to humans. Two versions were trained: one with 7 billion parameters tuned using 2.4 trillion bases, and the full version with 40 billion parameters trained on the full OpenGenome2 dataset.

The logic behind the training is pretty simple: if something’s important enough to have been evolutionarily conserved across a lot of species, it will show up in multiple contexts, and the system should see it repeatedly during training. “By learning the likelihood of sequences across vast evolutionary datasets, biological sequence models capture conserved sequence patterns that often reflect functional importance,” the researchers behind the work write. “These constraints allow the models to perform zero-shot prediction without any task-specific fine-tuning or supervision.”

That last aspect is important. We could, for example, tell it about what known splice sites look like, which might help it pick out additional ones. But that might make it harder for it to recognize any unusual splice sites that we haven’t recognized yet. Skipping the fine-tuning might also help it identify genome features that we’re not aware of at all at the moment, but which could become apparent through future research.

All of this has now been made available to the public. “We have made Evo 2 fully open, including model parameters, training code, inference code, and the OpenGenome2 dataset,” the paper announces.

The researchers also used a system that can identify internal features in neural networks to poke around inside of Evo 2 and figure out what things it had learned to recognize. They trained a separate neural network to recognize the firing patterns in Evo 2 and identify high-level features in it. Evo 2 clearly recognized protein-coding regions and the boundaries of the introns that flanked them. It was also able to recognize some structural features of proteins within the coding regions (alpha helices and beta sheets), as well as mutations that disrupt their coding sequence. Even something like mobile genetic elements (which you can think of as DNA-level parasites) ended up with a feature within Evo 2.

What is this good for?

To test the system, the researchers started making single-base mutations and fed them into Evo 2 to see how it responded. Evo 2 could detect problems when the mutations affected the sites in DNA where transcription into RNA started, or the sites where translation of that RNA into protein started. It also recognized the severity of mutations. Those that would interrupt protein translation, such as the introduction of stop signals, were identified as more significant changes than those that left the translation intact.

It also recognized when sequences weren’t translated at all. Many key cellular functions are carried out directly by RNAs, and Evo 2 was able to recognize when mutations disrupted those, as well.
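The general shape of this kind of test, scoring a mutation by how much it changes the likelihood a sequence model assigns, can be sketched in a few lines of Python. To be clear about the assumptions: the model below is a toy that only counts which base tends to follow each pair of bases, standing in for Evo 2 purely for illustration, and the training data is a made-up repeated pattern. None of the function names come from the Evo 2 software.

```python
import math
from collections import Counter

BASES = "ACGT"
ORDER = 2  # toy context length; Evo 2 itself works with far longer contexts


def train(sequences: list[str]) -> tuple[Counter, Counter]:
    """Count how often each base follows each ORDER-base context."""
    context_counts, next_counts = Counter(), Counter()
    for seq in sequences:
        for i in range(ORDER, len(seq)):
            context = seq[i - ORDER:i]
            context_counts[context] += 1
            next_counts[context, seq[i]] += 1
    return context_counts, next_counts


def log_likelihood(seq: str, model: tuple[Counter, Counter]) -> float:
    """Log probability of seq under the toy model, with add-one smoothing."""
    context_counts, next_counts = model
    total = 0.0
    for i in range(ORDER, len(seq)):
        context = seq[i - ORDER:i]
        prob = (next_counts[context, seq[i]] + 1) / (context_counts[context] + len(BASES))
        total += math.log(prob)
    return total


def mutation_score(reference: str, position: int, new_base: str, model) -> float:
    """Mutant minus reference log-likelihood; negative means the mutant looks less plausible."""
    mutant = reference[:position] + new_base + reference[position + 1:]
    return log_likelihood(mutant, model) - log_likelihood(reference, model)


if __name__ == "__main__":
    # Made-up training data: a repeated pattern stands in for a conserved sequence.
    model = train(["ATGGCC" * 20])
    print(mutation_score("ATGGCCATGGCC", 3, "T", model))
```

Nothing in this setup is told what a start site or a stop signal is; a mutation simply scores worse when it breaks a pattern the model saw often, which is the sense in which such predictions are zero-shot.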

Impressively, the ability to recognize features in eukaryotic genomes occurred without the loss of its ability to recognize them in bacteria and archaea. In fact, the system seemed to be able to work out what species it was working in. A number of evolutionary groups use genetic codes with a different set of signals to stop the translation of proteins. Evo 2 was able to recognize when it was looking at a sequence from one of those species, and used the correct genetic code for them.
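For a sense of what using a different genetic code involves, the Python sketch below finds the first stop signal in the same coding sequence under two codes. The ciliate nuclear code is chosen here as an example of a variant code; the text above doesn't say which groups the researchers examined, and the sequence is invented.

```python
from typing import Optional

# Stop codons under two NCBI translation tables.
STOP_CODONS = {
    "standard": {"TAA", "TAG", "TGA"},  # NCBI translation table 1
    "ciliate_nuclear": {"TGA"},         # NCBI translation table 6: TAA and TAG encode glutamine
}


def first_stop(coding_sequence: str, code: str) -> Optional[int]:
    """Return the index of the first in-frame stop codon, or None if there is none."""
    stops = STOP_CODONS[code]
    for i in range(0, len(coding_sequence) - 2, 3):
        if coding_sequence[i:i + 3] in stops:
            return i
    return None


if __name__ == "__main__":
    sequence = "ATGCAATAACAGTGA"  # invented example: ATG CAA TAA CAG TGA
    for code in STOP_CODONS:
        print(code, first_stop(sequence, code))
```

Read with the standard code, this sequence ends at its third codon; read with the ciliate code, translation continues to the fifth. A model that applied the wrong code would misjudge where proteins end and which mutations truncate them.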

It was also good at recognizing features that tolerate a lot of variability, such as sites that signal where to splice RNAs to remove introns from the coding sequence of proteins. By some measures, it was better than software specialized for that task. The same was true when evaluating mutations in the BRCA2 gene, where many of the mutations are associated with cancer. Given additional training on known BRCA2 mutations, its performance improved further.

Overall, Evo 2 seems great for evaluating genomes and identifying key features. The researchers who built it suggest it could serve as a good automated tool for preliminary genome annotation.

But the striking thing about the early version of Evo was that, when prompted with a chunk of sequence that includes known bacterial genes, some of its responses included entirely new proteins with related functions. Now that it was trained on more complex eukaryotic genes, could it do the same?

We don’t entirely know. If given a bunch of DNA from yeast (a eukaryote), it would respond with a sequence that included functional RNAs, and gene-like sequences with regulatory information and splice sites. But the researchers didn’t test whether any of the proteins did anything in particular. And it’s difficult to see how they could even do that test. With bacterial genes, they could safely assume that the AI-generated gene should be doing something related to the nearby genes. But that’s generally not the case in eukaryotes, so it’s difficult to guess what functions they should even test for.

In a somewhat more informative test, the researchers asked Evo 2 to make some regulatory DNA that was active in one cell type and not another after giving it information about what sequences were active in both those cell types. The sequences that came out were then inserted into these cells and tested, but the results were pretty weak, with only 17 percent having activity that differed by a factor of two or more between the two cell types. That’s still an achievement, but it isn’t in the same realm as designing brand new proteins.

What’s next?

Overall, given that this has come out less than four months after the paper describing the original Evo, it’s not at all surprising that there wasn’t more work done to test what Evo 2 can do for designing biologically relevant DNA sequences. Biology experiments are hard and time-consuming, and it’s not always easy to judge in advance which ones will provide the most compelling information. So we’ll probably have to wait months to years to find out whether the community finds interesting things to do with Evo 2, and whether it’s good at solving any useful protein design problems.

There’s also the question of whether further training and specialization can create Evo 2 relatives that are especially good at specific tasks, such as evaluating genomes from cancer cells or annotating newly sequenced genomes. To an extent, it appears the research team wanted to get this out so that others could start exploring how it might be put to use; that’s consistent with the fact that all of the software was made available.

The big open question is whether this system has identified anything that we don’t know how to test for. Things like intron/exon boundaries and regulatory DNA have been subjected to decades of study so that we already knew how to look for them and can recognize when Evo 2 spots them. But we’ve discovered a steady stream of new features in the genome—CRISPR repeats, microRNAs, and more—over the past decades. It remains technically possible that there are features in the genome we’re not aware of yet, and Evo 2 has picked them out.

It’s possible to imagine ways to use the tools described here to query Evo 2 and pick out new genome features. So I’m looking forward to seeing what might ultimately come out of that sort of work.

Nature, 2026. DOI: 10.1038/s41586-026-10176-5.


John is Ars Technica’s science editor. He has a Bachelor of Arts in Biochemistry from Columbia University, and a Ph.D. in Molecular and Cell Biology from the University of California, Berkeley. When physically separated from his keyboard, he tends to seek out a bicycle, or a scenic location for communing with his hiking boots.
