AI


OpenAI launches Operator, an AI agent that can operate your computer

While it’s working, Operator shows a miniature browser window of its actions.

However, the technology behind Operator is still relatively new and far from perfect. The model reportedly performs best at repetitive web tasks like creating shopping lists or playlists. It struggles more with unfamiliar interfaces like tables and calendars, and does poorly with complex text editing (with a 40 percent success rate), according to OpenAI’s internal testing data.

OpenAI reported the system achieved an 87 percent success rate on the WebVoyager benchmark, which tests live sites like Amazon and Google Maps. On WebArena, which uses offline test sites for training autonomous agents, Operator’s success rate dropped to 58.1 percent. For computer operating system tasks, CUA (the Computer-Using Agent model that powers Operator) set an apparent record of 38.1 percent success on the OSWorld benchmark, surpassing previous models but still falling short of human performance at 72.4 percent.

With this imperfect research preview, OpenAI hopes to gather user feedback and refine the system’s capabilities. The company acknowledges CUA won’t perform reliably in all scenarios but plans to improve its reliability across a wider range of tasks through user testing.

Safety and privacy concerns

For any AI model that can see how you operate your computer and even control some aspects of it, privacy and safety are very important. OpenAI says it built multiple safety controls into Operator, requiring user confirmation before completing sensitive actions like sending emails or making purchases. Operator also has limits on what it can browse, set by OpenAI. It cannot access certain website categories, including gambling and adult content.

Traditionally, AI models like Operator that are built on large language model-style Transformer technology have been relatively easy to fool with jailbreaks and prompt injections.

To catch attempts at subverting Operator, which might hypothetically be embedded in websites that the AI model browses, OpenAI says it has implemented real-time moderation and detection systems. OpenAI reports the system recognized all but one case of prompt injection attempts during an early internal red-teaming session.


Anthropic chief says AI could surpass “almost all humans at almost everything” shortly after 2027

He then shared his concerns about how human-level AI models and robotics that are capable of replacing all human labor may require a complete re-think of how humans value both labor and themselves.

“We’ve recognized that we’ve reached the point as a technological civilization where the idea, there’s huge abundance and huge economic value, but the idea that the way to distribute that value is for humans to produce economic labor, and this is where they feel their sense of self worth,” he added. “Once that idea gets invalidated, we’re all going to have to sit down and figure it out.”

The eye-catching comments, similar to comments about AGI made recently by OpenAI CEO Sam Altman, come as Anthropic negotiates a $2 billion funding round that would value the company at $60 billion. Amodei disclosed that Anthropic’s revenue multiplied tenfold in 2024.

Amodei distances himself from “AGI” term

Even with his dramatic predictions, Amodei distanced himself from “artificial general intelligence” (AGI), the term Altman favors for this advanced labor-replacing AI. In a separate CNBC interview from the same event in Switzerland, he called it a marketing term.

Instead, he prefers to describe future AI systems as a “country of geniuses in a data center,” he told CNBC. Amodei wrote in an October 2024 essay that such systems would need to be “smarter than a Nobel Prize winner across most relevant fields.”

On Monday, Google announced an additional $1 billion investment in Anthropic, bringing its total commitment to $3 billion. This follows Amazon’s $8 billion investment over the past 18 months. Amazon plans to integrate Claude models into future versions of its Alexa speaker.


Trump announces $500B “Stargate” AI infrastructure project with AGI aims

Video of the Stargate announcement conference at the White House.

Despite optimism from the companies involved, as CNN reports, past presidential investment announcements have yielded mixed results. In 2017, Trump and Foxconn unveiled plans for a $10 billion Wisconsin electronics factory promising 13,000 jobs. The project later scaled back to a $672 million investment with fewer than 1,500 positions. The facility now operates as a Microsoft AI data center.

The Stargate announcement wasn’t Trump’s only major AI move announced this week. It follows the newly inaugurated US president’s reversal of a 2023 Biden executive order on AI risk monitoring and regulation.

Altman speaks, Musk responds

On Tuesday, OpenAI CEO Sam Altman appeared at a White House press conference alongside President Trump, Oracle CEO Larry Ellison, and SoftBank CEO Masayoshi Son to announce Stargate.

Altman said he thinks Stargate represents “the most important project of this era,” allowing AGI to emerge in the United States. He believes that future AI technology could create hundreds of thousands of jobs. “We wouldn’t be able to do this without you, Mr. President,” Altman added.

Responding to off-camera questions from Trump about AI’s potential to spur scientific development, Altman said he believes AI will accelerate the discoveries for cures of diseases like cancer and heart disease.

Screenshots of Elon Musk challenging the Stargate announcement on X.


Meanwhile on X, Trump ally and frequent Altman foe Elon Musk immediately attacked the Stargate plan, writing, “They don’t actually have the money,” and following up with a claim that we cannot yet substantiate, saying, “SoftBank has well under $10B secured. I have that on good authority.”

Musk’s criticism has complex implications given his very close ties to Trump, his history of litigating against OpenAI (which he co-founded and later left), and his own goals with his xAI company.


Apple Intelligence, previously opt-in by default, enabled automatically in iOS 18.3

Apple has sent out release candidate builds of the upcoming iOS 18.3, iPadOS 18.3, and macOS 15.3 updates to developers today. But they come with one tweak that hasn’t been reported on, per MacRumors: They enable all of the AI-powered Apple Intelligence features by default during setup. When Apple Intelligence was initially released in iOS 18.1, the features were off by default unless users chose to opt in and enable them.

Those who still wish to opt out of Apple Intelligence features will now have to do it after their devices are set up by navigating to the Apple Intelligence & Siri section in the Settings app.

Apple Intelligence will only be enabled by default for hardware that supports it. For the iPhone, that’s just the iPhone 15 Pro series, iPhone 16 series, and iPhone 16 Pro series. It goes further back on the iPad and Mac—Apple Intelligence works on any model with an M1 processor or newer.

Apple is following in the footsteps of Microsoft and Google here, rolling out new generative AI features to its user base as quickly as possible and enabling some or all of them by default while still labeling everything as a “beta” and pointing to that label when things go wrong. Case in point: The iOS 18.3 update also temporarily disables all notification summaries for apps in the App Store’s “news and entertainment” category, because some of those summaries contained major factual inaccuracies.


Cutting-edge Chinese “reasoning” model rivals OpenAI o1—and it’s free to download

Unlike conventional LLMs, these simulated reasoning (SR) models take extra time to produce responses, and this extra time often increases performance on tasks involving math, physics, and science. And this latest open model is turning heads for apparently quickly catching up to OpenAI.

For example, DeepSeek reports that R1 outperformed OpenAI’s o1 on several benchmarks and tests, including AIME (a mathematical reasoning test), MATH-500 (a collection of word problems), and SWE-bench Verified (a programming assessment tool). As we usually mention, AI benchmarks need to be taken with a grain of salt, and these results have yet to be independently verified.


A chart of DeepSeek R1 benchmark results, created by DeepSeek. Credit: DeepSeek

TechCrunch reports that three Chinese labs—DeepSeek, Alibaba, and Moonshot AI’s Kimi—have now released models they say match o1’s capabilities, with DeepSeek first previewing R1 in November.

But the new DeepSeek model comes with a catch if run in the cloud-hosted version—being Chinese in origin, R1 will not generate responses about certain topics like Tiananmen Square or Taiwan’s autonomy, as it must “embody core socialist values,” according to Chinese Internet regulations. This filtering comes from an additional moderation layer that isn’t an issue if the model is run locally outside of China.

Even with the potential censorship, Dean Ball, an AI researcher at George Mason University, wrote on X, “The impressive performance of DeepSeek’s distilled models (smaller versions of r1) means that very capable reasoners will continue to proliferate widely and be runnable on local hardware, far from the eyes of any top-down control regime.”


Report: Apple Mail is getting automatic categories on iPadOS and macOS

Unlike numerous other new and recent OS-level features from Apple, mail sorting does not require a device capable of supporting Apple Intelligence (generally M-series Macs or iPads), and it happens entirely on the device. It’s an optional feature and is available only for English-language emails.

Apple released a third beta of macOS 15.3 just days ago, indicating that early, developer-oriented builds of macOS 15.4 with the sorting feature should be weeks away. While Gurman’s newsletter suggests mail sorting will also arrive in the Mail app for iPadOS, he did not specify which version, though the timing would suggest the roughly simultaneous release of iPadOS 18.4.

Also slated to arrive in the same update for Apple-Intelligence-ready devices is the version of Siri that understands more context about questions, from what’s on your screen and in your apps. “Add this address to Rick’s contact information,” “When is my mom’s flight landing,” and “What time do I have dinner with her” are the sorts of examples Apple highlighted in its June unveiling of iOS 18.

Since then, Apple has divvied up certain aspects of Intelligence into different OS point updates. General ChatGPT access and image generation have arrived in iOS 18.2 (and related Mac and iPad updates), while notification summaries, which can be pretty rough, are being rethought and better labeled and will be removed from certain news notifications in iOS 18.3.


Under new law, cops bust famous cartoonist for AI-generated child sex abuse images

Late last year, California passed a law against the possession or distribution of child sex abuse material (CSAM) that has been generated by AI. The law went into effect on January 1, and Sacramento police announced yesterday that they have already arrested their first suspect—a 49-year-old Pulitzer-prize-winning cartoonist named Darrin Bell.

The new law, which you can read here, declares that AI-generated CSAM is harmful, even without an actual victim. In part, says the law, this is because all kinds of CSAM can be used to groom children into thinking sexual activity with adults is normal. But the law singles out AI-generated CSAM for special criticism due to the way that generative AI systems work.

“The creation of CSAM using AI is inherently harmful to children because the machine-learning models utilized by AI have been trained on datasets containing thousands of depictions of known CSAM victims,” it says, “revictimizing these real children by using their likeness to generate AI CSAM images into perpetuity.”

The law defines “artificial intelligence” as “an engineered or machine-based system that varies in its level of autonomy and that can, for explicit or implicit objectives, infer from the input it receives how to generate outputs that can influence physical or virtual environments.”


Google is about to make Gemini a core part of Workspaces—with price changes

Google has added AI features to its regular Workspace accounts for business while slightly raising the baseline prices of Workspace plans.

Previously, AI tools in the Gemini Business plan were a $20 per seat add-on to existing Workspace accounts, which had a base cost of $12 per seat without the add-on. Now, the AI tools are included for all Workspace users, but the per-seat base price is increasing from $12 to $14.

That means that those who were already paying extra for Gemini are going to pay less than half of what they were—effectively $14 per seat instead of $32. But those who never used or wanted Gemini or any other newer features under the AI umbrella from Workspace are going to pay a little bit more than before.
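The arithmetic behind that comparison can be checked with a few lines of Python. This is only a sketch using the per-seat prices quoted above; the variable names are ours, not Google’s.

```python
# Per-seat prices quoted in the article, in US dollars.
OLD_BASE = 12       # Workspace base price before the change
GEMINI_ADDON = 20   # former Gemini Business add-on
NEW_BASE = 14       # new base price with the AI tools included

# What a Gemini customer used to pay per seat.
old_with_gemini = OLD_BASE + GEMINI_ADDON

# Change for customers who already paid for Gemini (negative means cheaper).
change_with_gemini = NEW_BASE - old_with_gemini

# Change for customers who never bought the add-on.
change_without_gemini = NEW_BASE - OLD_BASE

# Share of the old combined price that Gemini customers now pay.
share_of_old_price = NEW_BASE / old_with_gemini

print(old_with_gemini, change_with_gemini, change_without_gemini)
print(f"{share_of_old_price:.1%} of the old combined price")
```

The ratio comes out below one half, which is where the “less than half” figure comes from.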

Features covered here include access to Gemini Advanced, the NotebookLM research assistant, email and document summaries in Gmail and Docs, adaptive audio and additional transcription languages for Meet, and “help me write” and Gemini in the side panel across a variety of applications.

Google says that it plans “to roll out even more AI features previously available in Gemini add-ons only.”


Home Microsoft 365 plans use Copilot AI features as pretext for a price hike

Microsoft hasn’t said for how long this “limited time” offer will last, but presumably it will only last for a year or two to help ease the transition between the old pricing and the new pricing. New subscribers won’t be offered the option to pay for the Classic plans.

Subscribers on the Personal and Family plans can’t use Copilot indiscriminately; they get 60 AI credits per month to use across all the Office apps, credits that can also be used to generate images or text in Windows apps like Designer, Paint, and Notepad. It’s not clear how these will stack with the 15 credits that Microsoft offers for free for apps like Designer, or the 50 credits per month Microsoft is handing out for Image Cocreator in Paint.

Those who want unlimited usage and access to the newest AI models are still asked to pay $20 per month for a Copilot Pro subscription.

As Microsoft notes, this is the first price increase it has ever implemented for the personal Microsoft 365 subscriptions in the US, which have stayed at the same levels since being introduced as Office 365 over a decade ago. Pricing for the business plans and pricing in other countries has increased before. Pricing for Office Home 2024 ($150) and Office Home & Business 2024 ($250), which can’t access Copilot or other Microsoft 365 features, is also the same as it was before.


Researchers use AI to design proteins that block snake venom toxins

Since these two toxicities work through entirely different mechanisms, the researchers tackled them separately.

Blocking a neurotoxin

The neurotoxic three-fingered proteins are a subgroup of the larger protein family that specializes in binding to and blocking the receptors for acetylcholine, a major neurotransmitter. Their three-dimensional structure, which is key to their ability to bind these receptors, is based on three strings of amino acids within the protein that nestle against each other (for those that have taken a sufficiently advanced biology class, these are anti-parallel beta sheets). So to interfere with these toxins, the researchers targeted these strings.

They relied on an AI package called RFdiffusion (the RF denotes its relation to the RoseTTAFold protein-folding software). RFdiffusion can be directed to design protein structures that are complements to specific chemicals; in this case, it identified new strands that could line up along the edge of the ones in the three-fingered toxins. Once those were identified, a separate AI package, called ProteinMPNN, was used to identify the amino acid sequence of a full-length protein that would form the newly identified strands.

But we’re not done with the AI tools yet. The combination of three-fingered toxins and a set of the newly designed proteins was then fed into DeepMind’s AlphaFold2 and the Rosetta protein structure software, and the strength of the interactions between them was estimated.

It’s only at this point that the researchers started making actual proteins, focusing on the candidates that the software suggested would interact the best with the three-fingered toxins. Forty-four of the computer-designed proteins were tested for their ability to interact with the three-fingered toxin, and the single protein that had the strongest interaction was used for further studies.

At this point, it was back to the AI, where RFdiffusion was used to suggest variants of this protein that might bind more effectively. About 15 percent of its suggestions did, in fact, interact more strongly with the toxin. The researchers then made both the toxin and the strongest inhibitor in bacteria and obtained the structure of their interactions. This confirmed that the software’s predictions were highly accurate.


Meta takes us a step closer to Star Trek’s universal translator


The computer science behind translating speech from 100 source languages.

In 2023, AI researchers at Meta interviewed 34 native Spanish and Mandarin speakers who lived in the US but didn’t speak English. The goal was to find out what people who constantly rely on translation in their day-to-day activities expect from an AI translation tool. What those participants wanted was basically a Star Trek universal translator or the Babel Fish from the Hitchhiker’s Guide to the Galaxy: an AI that could not only translate speech to speech in real time across multiple languages, but also preserve their voice, tone, mannerisms, and emotions. So, Meta assembled a team of over 50 people and got busy building it.

What this team came up with was a next-gen translation system called Seamless. The first building block of this system is described in Wednesday’s issue of Nature; it can translate speech among 36 different languages.

Language data problems

AI translation systems today are mostly focused on text, because huge amounts of text are available in a wide range of languages thanks to digitization and the Internet. Institutions like the United Nations or European Parliament routinely translate all their proceedings into the languages of all their member states, which means there are enormous databases comprising aligned documents prepared by professional human translators. You just needed to feed those huge, aligned text corpora into neural nets (or hidden Markov models before neural nets became all the rage) and you ended up with a reasonably good machine translation system. But there were two problems with that.

The first issue was that those databases comprised formal documents, which made the AI translators default to the same boring legalese in the target language even if you tried to translate comedy. The second problem was speech: none of this included audio data.

The problem of language formality was mostly solved by including less formal sources like books, Wikipedia, and similar material in AI training databases. The scarcity of aligned audio data, however, remained. Both issues were at least theoretically manageable in high-resource languages like English or Spanish, but they got dramatically worse in low-resource languages like Icelandic or Zulu.

As a result, the AI translators we have today support an impressive number of languages in text, but things are complicated when it comes to translating speech. There are cascading systems that simply do this trick in stages. An utterance is first converted to text just as it would be in any dictation service. Then comes text-to-text translation, and finally the resulting text in the target language is synthesized into speech. Because errors accumulate at each of those stages, the performance you get this way is usually poor, and it doesn’t work in real time.
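To see why errors pile up in a cascade, here is a minimal Python sketch. It assumes, purely for illustration, that each stage fails independently of the others; the per-stage accuracy figures are made up and are not measurements from any real system.

```python
from math import prod


def cascade_accuracy(stage_accuracies):
    """Chance that every stage handles an utterance correctly,
    assuming the stages fail independently (a simplifying assumption)."""
    return prod(stage_accuracies)


# Hypothetical per-stage accuracies for a three-stage cascade.
stages = {
    "speech_to_text": 0.95,
    "text_to_text": 0.90,
    "text_to_speech": 0.97,
}

end_to_end = cascade_accuracy(stages.values())
print(f"end-to-end accuracy: {end_to_end:.3f}")
```

Under these assumptions the chain as a whole is less reliable than its weakest stage, which is the intuition behind the complaint about cascading systems.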

A few systems that can translate speech-to-speech directly do exist, but in most cases they only translate into English and not in the opposite direction. Your foreign language interlocutor can say something to you in one of the languages supported by tools like Google’s AudioPaLM, and they will translate that to English speech, but you can’t have a conversation going both ways.

So, to pull off the Star Trek universal translator thing Meta’s interviewees dreamt about, the Seamless team started with sorting out the data scarcity problem. And they did it in a quite creative way.

Building a universal language

Warren Weaver, a mathematician and pioneer of machine translation, argued in 1949 that there might be a yet undiscovered universal language working as a common base of human communication. This common base of all our communication was exactly what the Seamless team went for in its search for data more than 70 years later. Weaver’s universal language turned out to be math—more precisely, multidimensional vectors.

Machines do not understand words as humans do. To make sense of them, they need to first turn them into sequences of numbers that represent their meaning. Those sequences of numbers are numerical vectors that are termed word embeddings. When you vectorize tens of millions of documents this way, you’ll end up with a huge multidimensional space where words with similar meaning that often go together, like “tea” and “coffee,” are placed close to each other. When you vectorize aligned text in two languages like those European Parliament proceedings, you end up with two separate vector spaces, and then you can run a neural net to learn how those two spaces map onto each other.
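A toy Python example shows what “close to each other” means in practice. The three-dimensional vectors below are invented for illustration (real embeddings have hundreds of dimensions and are learned from data), and cosine similarity is one common way to measure closeness, not necessarily the measure any particular system uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical hand-made embeddings, not output from a real model.
embeddings = {
    "tea": [0.9, 0.8, 0.1],
    "coffee": [0.8, 0.9, 0.2],
    "tractor": [0.1, 0.2, 0.9],
}

tea_coffee = cosine_similarity(embeddings["tea"], embeddings["coffee"])
tea_tractor = cosine_similarity(embeddings["tea"], embeddings["tractor"])

print(f"tea vs. coffee:  {tea_coffee:.2f}")
print(f"tea vs. tractor: {tea_tractor:.2f}")
```

In this made-up space, “tea” scores far higher against “coffee” than against “tractor,” mirroring the clustering described above.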

But the Meta team didn’t have those nicely aligned texts for all the languages they wanted to cover. So, they vectorized all texts in all languages as if they were just a single language and dumped them into one embedding space called SONAR (Sentence-level Multimodal and Language-Agnostic Representations). Once the text part was done, they went to speech data, which was vectorized using a popular W2v (word to vector) tool and added to the same massive multilingual, multimodal space. Of course, each embedding carried metadata identifying its source language and whether it was text or speech before vectorization.

The team just used huge amounts of raw data—no fancy human labeling, no human-aligned translations. And then, the data mining magic happened.

SONAR embeddings represented entire sentences instead of single words. Part of the reason behind that was to control for differences between morphologically rich languages, where a single word may correspond to multiple words in morphologically simple languages. But the most important thing was that it ensured that sentences with similar meaning in multiple languages ended up close to each other in the vector space.

It was the same story with speech, too—a spoken sentence in one language was close to spoken sentences in other languages with similar meaning. It even worked between text and speech. So, the team simply assumed that embeddings in two different languages or two different modalities (speech or text) that are at a sufficiently close distance to each other are equivalent to the manually aligned texts of translated documents.
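The mining idea can be sketched in a few lines of Python. This is a deliberately simplified illustration, not Meta’s actual pipeline: the records, vectors, and the 0.98 threshold are all hypothetical, and a plain cosine-similarity cutoff stands in for whatever scoring the Seamless team really used.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical records: (language, modality, sentence, toy embedding).
records = [
    ("eng", "text", "The cat sleeps.", [0.90, 0.10, 0.05]),
    ("spa", "text", "El gato duerme.", [0.88, 0.12, 0.07]),
    ("spa", "speech", "<audio: el gato duerme>", [0.86, 0.15, 0.05]),
    ("eng", "text", "Stocks fell today.", [0.05, 0.10, 0.95]),
]


def mine_pairs(records, threshold=0.98):
    """Treat embeddings that differ in language or modality and sit
    close enough together as if they were an aligned pair."""
    pairs = []
    for a, b in combinations(records, 2):
        same_language_and_modality = a[0] == b[0] and a[1] == b[1]
        if same_language_and_modality:
            continue  # only cross-language or cross-modality pairs are useful
        if cosine_similarity(a[3], b[3]) >= threshold:
            pairs.append((a[2], b[2]))
    return pairs


pairs = mine_pairs(records)
for left, right in pairs:
    print(left, "<->", right)
```

Here the English sentence about the cat is paired with both the Spanish text and the Spanish audio clip, while the unrelated sentence about stocks is left unpaired. Metadata about language and modality is what lets the miner skip pairs that would not help train a translator.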

This produced huge amounts of automatically aligned data. The Seamless team suddenly got access to millions of aligned texts, even in low-resource languages, along with thousands of hours of transcribed audio. And they used all this data to train their next-gen translator.

Seamless translation

The automatically generated data set was augmented with human-curated texts and speech samples where possible and used to train multiple AI translation models. The largest one was called SEAMLESSM4T v2. It could translate speech to speech from 101 source languages into any of 36 output languages, and translate text to text. It would also work as an automatic speech recognition system in 96 languages, translate speech to text from 101 into 96 languages, and translate text to speech from 96 into 36 languages, all from a single unified model. It also outperformed state-of-the-art cascading systems by 8 percent in speech-to-text and by 23 percent in speech-to-speech translations, based on the scores in Bilingual Evaluation Understudy (an algorithm commonly used to evaluate the quality of machine translation).

But it can now do even more than that. The Nature paper published by Meta’s Seamless team ends at the SEAMLESSM4T models, but Nature has a long editorial process to ensure scientific accuracy. The paper published on January 15, 2025, was submitted in late November 2023. But with a quick search of arXiv.org, a repository of not-yet-peer-reviewed papers, you can find the details of two other models that the Seamless team has already integrated on top of SEAMLESSM4T: SeamlessStreaming and SeamlessExpressive, which take this AI even closer to making a Star Trek universal translator a reality.

SeamlessStreaming is meant to solve the translation latency problem. The baseline SEAMLESSM4T, despite all the bells and whistles, worked as a standard AI translation tool. You had to say what you wanted to say, push “translate,” and it spat out the translation. SeamlessStreaming was designed to take this experience a bit closer to what human simultaneous translators do: it translates what you’re saying as you speak, in a streaming fashion. SeamlessExpressive, on the other hand, is aimed at preserving the way you express yourself in translations. When you whisper or say something in a cheerful manner or shout out with anger, SeamlessExpressive will encode the features of your voice, like tone, prosody, volume, tempo, and so on, and transfer those into the output speech in the target language.

Sadly, it still can’t do both at the same time; you can only choose to go for either streaming or expressivity, at least at the moment. Also, the expressivity variant is very limited in supported languages—it only works in English, Spanish, French, and German. But at least it’s online so you can go ahead and give it a spin.

Nature, 2025. DOI: 10.1038/s41586-024-08359-z


Jacek Krywko is a freelance science and technology writer who covers space exploration, artificial intelligence research, computer science, and all sorts of engineering wizardry.


ChatGPT becomes more Siri-like with new scheduled tasks feature

OpenAI is making ChatGPT work a little more like older digital assistants with a new feature called Tasks, as reported by TechCrunch and others.

Currently in beta, Tasks allows users to direct the chatbot to send reminders or to generate responses to specific prompts at certain times; recurring tasks are also supported.

The feature is available to Plus, Team, and Pro subscribers starting today, while free users don’t have access.

To create a task, users need to select “4o with scheduled tasks” from the model picker and then direct ChatGPT using the same kind of plain language text prompts that drive everything else it does. ChatGPT will sometimes suggest tasks, too, but they won’t go into effect unless the user approves them.

The user can then make changes to assigned tasks through the same chat conversation, or they can use a new Tasks section of the ChatGPT apps to manage all currently assigned items. There’s currently a 10-task limit.

When the time comes to perform an assigned task, the ChatGPT mobile or desktop app will send a notification on schedule.

This update can be seen as OpenAI’s first step into the agentic AI space, where applications built using deep learning can operate relatively independently within certain boundaries, either replacing or easing the day-to-day responsibilities of information workers.
