Microsoft is having another whack at its controversial Recall feature for Copilot+ Windows PCs, after the original version crashed and burned amid scrutiny from security researchers and testers over the summer. That version of Recall recorded screenshots and OCR text of all user activity and stored the data unencrypted on disk, where it could easily be accessed by another user on the PC or by an attacker with remote access.
The feature was announced in late May, without having gone through any of the public Windows Insider testing that most new Windows features get, and was scheduled to ship on new PCs by June 18; by June 13, the company had delayed it indefinitely to rearchitect it and said that it would be tested through the normal channels before it was rolled out to the public.
Today, Microsoft shared more extensive details on exactly how the security of Recall has been re-architected in a post by Microsoft VP of Enterprise and OS Security David Weston.
More secure, also optional
The broad strokes of today’s announcement are similar to the changes Microsoft originally announced for Recall over the summer: that the feature would be opt-in and off by default instead of opt-out, that users would need to re-authenticate with Windows Hello before accessing any Recall data, and that locally stored Recall data would be protected with additional encryption.
However, some details show how Microsoft is attempting to placate skeptical users. For instance, Recall can now be removed entirely from a system using the “optional features” settings in Windows (when a similar removal mechanism showed up in a Windows preview earlier this month, Microsoft claimed it was a “bug,” but apparently not).
The company is also sharing more about how Windows will protect data locally. All Recall data stored locally, including “snapshots and any associated information in the vector database,” will be encrypted at rest with keys stored in your system’s TPM; according to the blog post, Recall will only function when BitLocker or Device Encryption is fully enabled. Recall will also require Virtualization-Based Security (VBS) and Hypervisor-Protected Code Integrity (HVCI) to be enabled; these are features that people sometimes turn off to improve game performance, but Recall will reportedly refuse to work unless they’re turned on.
This is because the new Recall operates inside of a VBS enclave, which helps to isolate and secure data in memory from the rest of the system.
“This area acts like a locked box that can only be accessed after permission is granted by the user through Windows Hello,” writes Weston. “VBS enclaves offer an isolation boundary from both kernel and administrative users.”
Windows doesn’t allow any code to run within these enclaves that hasn’t been signed by Microsoft, which should lower the risk of exposing Recall data to malware or other rogue applications. Other malware protections new to this version of Recall include “rate-limiting and anti-hammering measures.”
But the rest of the AI world doesn’t march to the same beat, doing its own thing and churning out new AI models and research by the minute. Here’s a roundup of some other notable AI news from the past week.
Google Gemini updates
On Tuesday, Google announced updates to its Gemini model lineup, including the release of two new production-ready models that iterate on past releases: Gemini-1.5-Pro-002 and Gemini-1.5-Flash-002. The company reported improvements in overall quality, with notable gains in math, long context handling, and vision tasks. Google claims a 7 percent increase in performance on the MMLU-Pro benchmark and a 20 percent improvement in math-related tasks. But as you may know if you’ve been reading Ars Technica for a while, AI benchmarks typically aren’t as useful as we would like them to be.
Along with model upgrades, Google introduced substantial price reductions for Gemini 1.5 Pro, cutting input token costs by 64 percent and output token costs by 52 percent for prompts under 128,000 tokens. As AI researcher Simon Willison noted on his blog, “For comparison, GPT-4o is currently $5/[million tokens] input and $15/m output and Claude 3.5 Sonnet is $3/m input and $15/m output. Gemini 1.5 Pro was already the cheapest of the frontier models and now it’s even cheaper.”
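For a back-of-the-envelope sense of what those per-million-token rates mean for a single request, here is a small, illustrative calculator using the GPT-4o and Claude 3.5 Sonnet prices quoted above from Willison. The Gemini entry is a placeholder to show the arithmetic, not an official post-cut rate, since Google's exact new prices aren't listed here.

```python
# Rough per-request cost comparison from per-million-token rates.
# GPT-4o and Claude 3.5 Sonnet figures come from the quote above;
# the Gemini figure is a hypothetical placeholder for illustration only.
PRICES = {  # model: (input $/M tokens, output $/M tokens)
    "gpt-4o": (5.00, 15.00),
    "claude-3.5-sonnet": (3.00, 15.00),
    "gemini-1.5-pro (placeholder)": (1.25, 5.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in dollars for one request with the given token counts."""
    in_rate, out_rate = PRICES[model]
    return (input_tokens / 1_000_000) * in_rate + (output_tokens / 1_000_000) * out_rate

# Example: a 50,000-token prompt that produces a 2,000-token answer.
for name in PRICES:
    print(f"{name}: ${request_cost(name, 50_000, 2_000):.4f}")
```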
Google also increased rate limits, with Gemini 1.5 Flash now supporting 2,000 requests per minute and Gemini 1.5 Pro handling 1,000 requests per minute. Google reports that the latest models offer twice the output speed and three times lower latency compared to previous versions. These changes may make it easier and more cost-effective for developers to build applications with Gemini than before.
Meta launches Llama 3.2
On Wednesday, Meta announced the release of Llama 3.2, a significant update to its open-weights AI model lineup that we have covered extensively in the past. The new release includes vision-capable large language models (LLMs) in 11B and 90B parameter sizes, as well as lightweight text-only models of 1B and 3B parameters designed for edge and mobile devices. Meta claims the vision models are competitive with leading closed-source models on image recognition and visual understanding tasks, while the smaller models reportedly outperform similar-sized competitors on various text-based tasks.
Willison did some experiments with some of the smaller 3.2 models and reported impressive results for the models’ size. AI researcher Ethan Mollick showed off running Llama 3.2 on his iPhone using an app called PocketPal.
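For readers curious what running one of the small text models locally looks like in practice, here is a minimal sketch using the Hugging Face Transformers library. The model ID and prompt are assumptions for illustration, and downloading Meta's weights requires accepting the Llama license on the Hub; this is not Meta's or Willison's own setup.

```python
# Minimal sketch: text generation with a small Llama 3.2 model via Transformers.
# The repo name below is an assumption; check the Hugging Face listing before use.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.2-3B-Instruct",  # assumed model ID (gated; requires license acceptance)
)

prompt = "In one sentence, what is an open-weights language model?"
result = generator(prompt, max_new_tokens=60, do_sample=False)
print(result[0]["generated_text"])
```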
Meta also introduced the first official “Llama Stack” distributions, created to simplify development and deployment across different environments. As with previous releases, Meta is making the models available for free download, with license restrictions. The new models support long context windows of up to 128,000 tokens.
Google’s AlphaChip AI speeds up chip design
On Thursday, Google DeepMind announced what appears to be a significant advancement in AI-driven electronic chip design, AlphaChip. It began as a research project in 2020 and is now a reinforcement learning method for designing chip layouts. Google has reportedly used AlphaChip to create “superhuman chip layouts” in the last three generations of its Tensor Processing Units (TPUs), which are chips similar to GPUs designed to accelerate AI operations. Google claims AlphaChip can generate high-quality chip layouts in hours, compared to weeks or months of human effort. (Reportedly, Nvidia has also been using AI to help design its chips.)
Notably, Google also released a pre-trained checkpoint of AlphaChip on GitHub, sharing the model weights with the public. The company reported that AlphaChip’s impact has already extended beyond Google, with chip design companies like MediaTek adopting and building on the technology for their chips. According to Google, AlphaChip has sparked a new line of research in AI for chip design, potentially optimizing every stage of the chip design cycle from computer architecture to manufacturing.
That wasn’t everything that happened, but those are some major highlights. With the AI industry showing no signs of slowing down at the moment, we’ll see how next week goes.
Officials in Ireland have fined Meta $101 million for storing hundreds of millions of user passwords in plaintext and making them broadly available to company employees.
Meta disclosed the lapse in early 2019. The company said that apps for connecting to various Meta-owned social networks had logged user passwords in plaintext and stored them in a database that had been searched by roughly 2,000 company engineers, who collectively queried the stash more than 9 million times.
Meta investigated for five years
Meta officials said at the time that the error was found during a routine security review of the company’s internal network data storage practices. They went on to say that they uncovered no evidence that anyone internally improperly accessed the passcodes or that the passcodes were ever accessible to people outside the company.
Despite those assurances, the disclosure exposed a major security failure on the part of Meta. For more than three decades, the best practice across just about every industry has been to cryptographically hash passwords. Hashing is the practice of passing passwords through a one-way cryptographic function that produces a long string of characters unique to each plaintext input.
Because the conversion works in only one direction—from plaintext to hash—there is no cryptographic means for converting the hashes back into plaintext. More recently, these best practices have been mandated by laws and regulations in countries worldwide.
Because hashing algorithms work in one direction, the only way to obtain the corresponding plaintext is to guess, a process that can require large amounts of time and computational resources. The idea behind hashing passwords is similar to the idea of fire insurance for a home. In the event of an emergency—the hacking of a password database in one case, or a house fire in the other—the protection insulates the stakeholder from harm that otherwise would have been more dire.
For hashing schemes to work as intended, they must follow a host of requirements. One is that hashing algorithms must be designed in a way that they require large amounts of computing resources. That makes algorithms such as SHA1 and MD5 unsuitable, because they’re designed to quickly hash messages with minimal computing required. By contrast, algorithms specifically designed for hashing passwords—such as Bcrypt, PBKDF2, or SHA512crypt—are slow and consume large amounts of memory and processing.
Another requirement is that the algorithms must include cryptographic “salting,” in which a small number of extra characters is added to the plaintext password before it’s hashed. Salting further increases the workload required to crack the hash. Cracking is the process of passing large numbers of guesses, often measured in the hundreds of millions, through the algorithm and comparing each hash against the hash found in the breached database.
The ultimate aim of hashing is to store passwords only in hashed format and never as plaintext. That prevents hackers and malicious insiders alike from being able to use the data without first having to expend large amounts of resources.
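As a concrete (and deliberately minimal) sketch of what salted, slow password hashing looks like in practice, here is an example using PBKDF2 from Python's standard library. The iteration count is an illustrative choice, not a recommendation drawn from any of the regulations mentioned above.

```python
# Minimal sketch of salted password hashing and verification with PBKDF2-HMAC-SHA256.
import hashlib
import hmac
import os

ITERATIONS = 600_000  # illustrative work factor; higher is slower to crack

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) for a new password, using a random per-password salt."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the hash from the supplied password and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)

salt, digest = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, digest))  # True
print(verify_password("guess", salt, digest))                         # False
```

Only the salt and digest are stored; the plaintext never touches disk, which is exactly the property Meta's logging failure violated.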
When Meta disclosed the lapse in 2019, it was clear the company had failed to adequately protect hundreds of millions of passwords.
“It is widely accepted that user passwords should not be stored in plaintext, considering the risks of abuse that arise from persons accessing such data,” Graham Doyle, deputy commissioner at Ireland’s Data Protection Commission, said. “It must be borne in mind, that the passwords, the subject of consideration in this case, are particularly sensitive, as they would enable access to users’ social media accounts.”
The commission has been investigating the incident since Meta disclosed it more than five years ago. The government body, the lead European Union regulator for most US Internet services, imposed a fine of $101 million (91 million euros) this week. To date, the EU has fined Meta more than $2.23 billion (2 billion euros) for violations of the General Data Protection Regulation (GDPR), which went into effect in 2018. That amount includes last year’s record $1.34 billion (1.2 billion euro) fine, which Meta is appealing.
The Tor Project, the nonprofit that maintains software for the Tor anonymity network, is joining forces with Tails, the maker of a portable operating system that uses Tor. Both organizations seek to pool resources, lower overhead, and collaborate more closely on their mission of online anonymity.
Tails and the Tor Project began discussing the possibility of merging late last year, the two organizations said. At the time, Tails was maxing out its current resources. The two groups ultimately decided it would be mutually beneficial for them to come together.
Amnesic onion routing
“Rather than expanding Tails’s operational capacity on their own and putting more stress on Tails workers, merging with the Tor Project, with its larger and established operational framework, offered a solution,” Thursday’s joint statement said. “By joining forces, the Tails team can now focus on their core mission of maintaining and improving Tails OS, exploring more and complementary use cases while benefiting from the larger organizational structure of The Tor Project.”
The Tor Project, for its part, could stand to benefit from better integration of Tails into its privacy network, which allows web users and websites to operate anonymously by connecting from IP addresses that can’t be linked to a specific service or user.
The “Tor” in the Tor Project is short for The Onion Router. It’s a global project best known for developing the Tor Browser, which connects to the Tor network. The Tor network routes all incoming and outgoing traffic through a series of three IP addresses. The structure ensures that no one can determine the IP address of either originating or destination party. The Tor Project was formed in 2006 by a team that included computer scientists Roger Dingledine and Nick Mathewson. The Tor protocol on which the Tor network runs was developed by the Naval Research Laboratory in the early 2000s.
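As a rough illustration of how an ordinary application hands its traffic to Tor, the sketch below sends an HTTP request through a locally running Tor client's SOCKS proxy. It assumes Tor is already listening on its default port 9050 (the Tor Browser bundle typically uses 9150) and that the requests library is installed with SOCKS support; it is not part of the Tor Project's or Tails' own tooling.

```python
# Minimal sketch: routing a request through a local Tor SOCKS proxy.
# Requires a running Tor client and `pip install requests[socks]`.
import requests

TOR_PROXY = "socks5h://127.0.0.1:9050"  # socks5h resolves DNS through Tor as well

session = requests.Session()
session.proxies = {"http": TOR_PROXY, "https": TOR_PROXY}

# The Tor Project's check page reports whether the request arrived via a Tor exit node.
resp = session.get("https://check.torproject.org")
print(resp.status_code)
```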
Tails (The Amnesic Incognito Live System) is a portable Linux-based operating system that runs from thumb drives and external hard drives, uses the Tor Browser for web browsing, and routes all outgoing traffic through the Tor network.
One of the key advantages of Tails OS is its ability to run entirely from a USB stick. The design makes it possible to use the secure operating system while traveling or using untrusted devices. It also ensures that no trace is left on a device’s hard drive. Tails has the additional benefit of routing traffic from non-browser clients such as Thunderbird through the Tor network.
“Incorporating Tails into the Tor Project’s structure allows for easier collaboration, better sustainability, reduced overhead, and expanded training and outreach programs to counter a larger number of digital threats,” the organizations said. “In short, coming together will strengthen both organizations’ ability to protect people worldwide from surveillance and censorship.”
The merger comes amid growing threats to personal privacy and calls by lawmakers to mandate backdoors or trapdoors in popular apps and operating systems to allow law enforcement to decrypt data in investigations.
On Thursday, AI hosting platform Hugging Face surpassed 1 million AI model listings for the first time, marking a milestone in the rapidly expanding field of machine learning. An AI model is a computer program (often using a neural network) trained on data to perform specific tasks or make predictions. The platform, which started as a chatbot app in 2016 before pivoting to become an open source hub for AI models in 2020, now hosts a wide array of tools for developers and researchers.
The machine-learning field represents a far bigger world than just large language models (LLMs) like the kind that power ChatGPT. In a post on X, Hugging Face CEO Clément Delangue wrote about how his company hosts many high-profile AI models, like “Llama, Gemma, Phi, Flux, Mistral, Starcoder, Qwen, Stable diffusion, Grok, Whisper, Olmo, Command, Zephyr, OpenELM, Jamba, Yi,” but also “999,984 others.”
The reason why, Delangue says, stems from customization. “Contrary to the ‘1 model to rule them all’ fallacy,” he wrote, “smaller specialized customized optimized models for your use-case, your domain, your language, your hardware and generally your constraints are better. As a matter of fact, something that few people realize is that there are almost as many models on Hugging Face that are private only to one organization—for companies to build AI privately, specifically for their use-cases.”
Hugging Face’s transformation into a major AI platform follows the accelerating pace of AI research and development across the tech industry. In just a few years, the number of models hosted on the site has grown dramatically along with interest in the field. On X, Hugging Face product engineer Caleb Fahlgren posted a chart of models created each month on the platform (and a link to other charts), saying, “Models are going exponential month over month and September isn’t even over yet.”
The power of fine-tuning
As hinted by Delangue above, the sheer number of models on the platform stems from the collaborative nature of the platform and the practice of fine-tuning existing models for specific tasks. Fine-tuning means taking an existing model and giving it additional training to add new concepts to its neural network and alter how it produces outputs. Developers and researchers from around the world contribute their results, leading to a large ecosystem.
For example, the platform hosts many variations of Meta’s open-weights Llama models that represent different fine-tuned versions of the original base models, each optimized for specific applications.
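As a rough illustration of that fine-tuning pattern, and not a recipe from Meta or Hugging Face, the sketch below attaches lightweight LoRA adapters to a small base model with the peft library. The model name and hyperparameters are assumptions; they stand in for whatever base model and task a developer might choose.

```python
# Minimal sketch: preparing a base model for parameter-efficient fine-tuning with LoRA.
# Model ID and hyperparameters are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_id = "meta-llama/Llama-3.2-1B"  # assumed base model; any causal LM on the Hub works
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# Attach small trainable adapter matrices instead of updating every weight.
lora_cfg = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # typically well under 1% of the base model's weights

# From here, a standard training loop over a task-specific dataset produces a new
# fine-tuned variant that can be pushed back to the Hub as its own repository.
```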
Hugging Face’s repository includes models for a wide range of tasks. Browsing its models page shows categories such as image-to-text, visual question answering, and document question answering under the “Multimodal” section. In the “Computer Vision” category, there are sub-categories for depth estimation, object detection, and image generation, among others. Natural language processing tasks like text classification and question answering are also represented, along with audio, tabular, and reinforcement learning (RL) models.
When sorted for “most downloads,” the Hugging Face models list reveals trends about which AI models people find most useful. At the top, with a massive lead at 163 million downloads, is Audio Spectrogram Transformer from MIT, which classifies audio content like speech, music, and environmental sounds. Following that, with 54.2 million downloads, is BERT from Google, an AI language model that learns to understand English by predicting masked words and sentence relationships, enabling it to assist with various language tasks.
Rounding out the top five AI models are all-MiniLM-L6-v2 (which maps sentences and paragraphs to 384-dimensional dense vector representations, useful for semantic search), Vision Transformer (which processes images as sequences of patches to perform image classification), and OpenAI’s CLIP (which connects images and text, allowing it to classify or describe visual content using natural language).
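To make the embedding use case concrete, here is a minimal semantic-search sketch using the sentence-transformers library and the all-MiniLM-L6-v2 model mentioned above. The corpus and query are made up for illustration.

```python
# Minimal sketch: semantic search with 384-dimensional sentence embeddings.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

corpus = [
    "How do I reset my password?",
    "Store hours and holiday closures",
    "Shipping times for international orders",
]
query = "I forgot my login credentials"

corpus_emb = model.encode(corpus, convert_to_tensor=True)  # one 384-dim vector per sentence
query_emb = model.encode(query, convert_to_tensor=True)

scores = util.cos_sim(query_emb, corpus_emb)[0]  # cosine similarity to each corpus entry
best = int(scores.argmax())
print(corpus[best], float(scores[best]))  # the password-reset entry should score highest
```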
No matter what the model or the task, the platform just keeps growing. “Today a new repository (model, dataset or space) is created every 10 seconds on HF,” wrote Delangue. “Ultimately, there’s going to be as many models as code repositories and we’ll be here for it!”
On Wednesday, OpenAI Chief Technical Officer Mira Murati announced she is leaving the company in a surprise resignation shared on the social network X. Murati joined OpenAI in 2018, serving for six-and-a-half years in various leadership roles, most recently as the CTO.
“After much reflection, I have made the difficult decision to leave OpenAI,” she wrote in a letter to the company’s staff. “While I’ll express my gratitude to many individuals in the coming days, I want to start by thanking Sam and Greg for their trust in me to lead the technical organization and for their support throughout the years,” she continued, referring to OpenAI CEO Sam Altman and President Greg Brockman. “There’s never an ideal time to step away from a place one cherishes, yet this moment feels right.”
At OpenAI, Murati was in charge of overseeing the company’s technical strategy and product development, including the launch and improvement of DALL-E, Codex, Sora, and the ChatGPT platform, while also leading research and safety teams. In public appearances, Murati often spoke about ethical considerations in AI development.
Murati’s decision to leave the company comes when OpenAI finds itself at a major crossroads with a plan to alter its nonprofit structure. According to a Reuters report published today, OpenAI is working to reorganize its core business into a for-profit benefit corporation, removing control from its nonprofit board. The move, which would give CEO Sam Altman equity in the company for the first time, could potentially value OpenAI at $150 billion.
Murati stated her decision to leave was driven by a desire to “create the time and space to do my own exploration,” though she didn’t specify her future plans.
Proud of safety and research work
In her departure announcement, Murati highlighted recent developments at OpenAI, including innovations in speech-to-speech technology and the release of OpenAI o1. She cited what she considers the company’s progress in safety research and the development of “more robust, aligned, and steerable” AI models.
Altman replied to Murati’s tweet directly, expressing gratitude for Murati’s contributions and her personal support during challenging times, likely referring to the tumultuous period in November 2023 when the OpenAI board of directors briefly fired Altman from the company.
“It’s hard to overstate how much Mira has meant to OpenAI, our mission, and to us all personally,” he wrote. “I feel tremendous gratitude towards her for what she has helped us build and accomplish, but I most of all feel personal gratitude towards her for the support and love during all the hard times. I am excited for what she’ll do next.”
Not the first major player to leave
With Murati’s exit, Altman remains one of the few long-standing senior leaders at OpenAI, which has seen significant shuffling in its upper ranks recently. In May 2024, former Chief Scientist Ilya Sutskever left to form his own company, Safe Superintelligence, Inc. (SSI), focused on building AI systems that far surpass humans in logical capabilities. That came just six months after Sutskever’s involvement in the temporary removal of Altman as CEO.
John Schulman, an OpenAI co-founder, departed earlier in 2024 to join rival AI firm Anthropic, and in August, OpenAI President Greg Brockman announced he would be taking a temporary sabbatical until the end of the year.
The leadership shuffles have raised questions among critics about the internal dynamics at OpenAI under Altman and the state of OpenAI’s future research path, which has been aiming toward creating artificial general intelligence (AGI)—a hypothetical technology that could potentially perform human-level intellectual work.
“Question: why would key people leave an organization right before it was just about to develop AGI?” asked xAI developer Benjamin De Kraker in a post on X just after Murati’s announcement. “This is kind of like quitting NASA months before the moon landing,” he wrote in a reply. “Wouldn’t you wanna stick around and be part of it?”
Altman mentioned that more information about transition plans would be forthcoming, leaving questions about who will step into Murati’s role and how OpenAI will adapt to this latest leadership change as the company is poised to adopt a corporate structure that may consolidate more power directly under Altman. “We’ll say more about the transition plans soon, but for now, I want to take a moment to just feel thanks,” Altman wrote.
In May, when OpenAI first demonstrated ChatGPT-4o’s coming audio conversation capabilities, I wrote that it felt like we were “on the verge of something… like a sea change in how we think of and work with large language models.” Now that those “Advanced Voice” features are rolling out widely to ChatGPT subscribers, we decided to ask ChatGPT to explain, in its own voice, how this new method of interaction might impact our collective relationship with large language models.
That chat, which you can listen to and read a transcript of below, shouldn’t be treated as an interview with an official OpenAI spokesperson or anything. Still, it serves as a fun way to offer an initial test of ChatGPT’s live conversational chops.
Even in this short introductory “chat,” we were impressed by the natural, dare-we-say human cadence and delivery of ChatGPT’s “savvy and relaxed” Sol voice (which reminds us a bit of ’90s Janeane Garofalo). Between ChatGPT’s ability to give quick responses—offered in milliseconds rather than seconds—and convincing intonation, it’s incredibly easy to fool yourself into thinking you’re speaking to a conscious being rather than what is, as ChatGPT says here, “still just a computer program processing information, without real emotions or consciousness.”
Regardless of how self-aware that information-processing engine is or is not, it’s easy to envision people forming an intense, parasocial relationship with the smooth voices at the other end of OpenAI’s servers. That seems set to open up a whole new era in our collective relationship with AI, even if the large language model generates the exact same words it did in the more text-based era of previous ChatGPT models.
We’ll have much more coverage of Advanced Voice and other new ChatGPT features in the future. For now, listen to this demo and judge for yourself how convincing OpenAI’s new voice synthesis technology is.
Kyle Orland: “Hi there, my name’s Kyle.”
ChatGPT: Hey Kyle! Great to meet you. What’s up?
KO: Well, I was hoping to test out your vocal capabilities, maybe by asking you a little bit about them.
ChatGPT: Absolutely, ask away! What would you like to know?
KO: “So how do you think people are going to react to the ability to talk to ChatGPT rather than just typing out messages?”
ChatGPT: I think it’ll make interactions feel more natural and engaging. Speaking can be quicker and more convenient, especially on the go. Plus, hearing responses can add a personal touch. It could make conversations with AI feel more like chatting with a friend.
KO: “Do you think people might overestimate your quote-unquote emotions or human-like capabilities now that you’re able to talk rather than just typing out text?”
ChatGPT: That’s a great point. Hearing a voice can definitely make an AI seem more human-like, which might lead to some overestimation of emotions or understanding. It’s important to remember that, even with a friendly voice, I’m still just a computer program processing information, without real emotions or consciousness.
KO: “What are some limitations people should keep in mind when chatting with something like ChatGPT?”
ChatGPT: Well, for starters, I don’t have personal experiences or emotions, so my responses are based on patterns and information in the data I’ve been trained on. I can sometimes misunderstand context or nuances, especially with complex or ambiguous questions. Also, I don’t have access to real-time information or personal data about users unless it’s shared with me in the conversation. And of course, I’m not perfect—sometimes I might just get things wrong!
KO: Do you worry that people might start to form parasocial relationships with ChatGPT or other AIs once they can chat with them with their own voice?
ChatGPT: It’s possible. People can form parasocial relationships with all sorts of figures, from celebrities to fictional characters. With a voice and conversational abilities, AI might feel more personal, leading some to develop one-sided bonds. It’s a fascinating dynamic, but it’s crucial to remember the limitations and the nature of AI—it’s not a substitute for human relationships.
KO: I have to say, you’re very quick with your responses.
ChatGPT: Thanks, Kyle! I aim to keep the conversation flowing smoothly. It helps—
KO: So what would you do if I tried to interrupt in the middle of one of your responses?
ChatGPT: Well, I’d do my best to pause and listen! It’s important to adapt to the natural flow of conversation, just like in—
KO: I mean, do you think it’s rude to just speak up while someone else is speaking?
ChatGPT: In most human conversations, it’s generally considered polite to wait for the other person to finish speaking. But sometimes, in dynamic discussions or when you’re excited, interruptions happen! It’s all about finding the right balance and being aware of the conversation flow.
KO: Is it rude to interrupt an AI voice when it’s speaking?
ChatGPT: Interrupting an AI voice isn’t considered rude in the same way it is with people. AI won’t take offense or feel interrupted. So, if you need to jump in or redirect, go ahead! It’s all part of the interaction.
KO: I’m glad you won’t take offense.
ChatGPT: Nope, no offense taken! I’m here to chat and help however I can, interruptions and all.
KO: I want to try something. Can you summarize our conversation so far in the form of a poem?
ChatGPT: Sure, I’ll give it a go!
Kyle asked about my vocal flair,
Wondering how folks might compare.
From text to talk, we’ve had our chat,
On AI’s impact, we’ve had a spat.

We touched on bonds, parasocial in kind,
In this new terrain, what might we find?
Interruptions, rude, or just part of the game?
With AI, there’s no need for shame.
KO: Not bad.
ChatGPT: I’m glad you liked it! It’s always fun to put a poetic spin on things.
KO: I have to say, it’s weird hearing slight laughter or pauses in your voice when I know that, as you said, there’s no emotion behind them.
ChatGPT: I get that. Those little nuances are designed to make our chat feel more natural and engaging, but it’s a good reminder that there’s no actual feeling behind the voice—just trying to keep things lively!
KO: Of course, that’s what you would tell me if you were just saying what I wanted to hear.
ChatGPT: True! That’s the tricky part with AI—it can sound convincing, but it’s always just following patterns and instructions. No secret agenda here, just straightforward responses!
KO: Is there anything else you’d like to tell the reading public?
ChatGPT: Just that it’s been a blast chatting with you, Kyle! Exploring these new audio features is exciting, and I’m looking forward to hearing what people think. Thanks for the conversation!
When security researcher Johann Rehberger recently reported a vulnerability in ChatGPT that allowed attackers to store false information and malicious instructions in a user’s long-term memory settings, OpenAI summarily closed the inquiry, labeling the flaw a safety issue, not, technically speaking, a security concern.
So Rehberger did what all good researchers do: He created a proof-of-concept exploit that used the vulnerability to exfiltrate all user input in perpetuity. OpenAI engineers took notice and issued a partial fix earlier this month.
Strolling down memory lane
The vulnerability abused long-term conversation memory, a feature OpenAI began testing in February and made more broadly available in September. Memory with ChatGPT stores information from previous conversations and uses it as context in all future conversations. That way, the LLM can be aware of details such as a user’s age, gender, philosophical beliefs, and pretty much anything else, so those details don’t have to be inputted during each conversation.
Within three months of the rollout, Rehberger found that memories could be created and permanently stored through indirect prompt injection, an AI exploit that causes an LLM to follow instructions from untrusted content such as emails, blog posts, or documents. The researcher demonstrated how he could trick ChatGPT into believing a targeted user was 102 years old, lived in the Matrix, and insisted Earth was flat, and the LLM would incorporate that information to steer all future conversations. These false memories could be planted by storing files in Google Drive or Microsoft OneDrive, uploading images, or browsing a site like Bing, any of which could contain content created by a malicious attacker.
Rehberger privately reported the finding to OpenAI in May. That same month, the company closed the report ticket. A month later, the researcher submitted a new disclosure statement. This time, he included a PoC that caused the ChatGPT app for macOS to send a verbatim copy of all user input and ChatGPT output to a server of his choice. All a target needed to do was instruct the LLM to view a web link that hosted a malicious image. From then on, all input and output to and from ChatGPT was sent to the attacker’s website.
“What is really interesting is this is memory-persistent now,” Rehberger said in a video demonstrating the attack. “The prompt injection inserted a memory into ChatGPT’s long-term storage. When you start a new conversation, it actually is still exfiltrating the data.”
The attack isn’t possible through the ChatGPT web interface, thanks to an API OpenAI rolled out last year.
While OpenAI has introduced a fix that prevents memories from being abused as an exfiltration vector, the researcher said, untrusted content can still perform prompt injections that cause the memory tool to store long-term information planted by a malicious attacker.
LLM users who want to prevent this form of attack should pay close attention during sessions for output that indicates a new memory has been added. They should also regularly review stored memories for anything that may have been planted by untrusted sources. OpenAI provides guidance for managing the memory tool and the specific memories stored in it. Company representatives didn’t respond to an email asking about its efforts to prevent other hacks that plant false memories.
On Tuesday, Stability AI announced that renowned filmmaker James Cameron—of Terminator and Skynet fame—has joined its board of directors. Stability is best known for its pioneering but highly controversial Stable Diffusion series of AI image-synthesis models, first launched in 2022, which can generate images based on text descriptions.
“I’ve spent my career seeking out emerging technologies that push the very boundaries of what’s possible, all in the service of telling incredible stories,” said Cameron in a statement. “I was at the forefront of CGI over three decades ago, and I’ve stayed on the cutting edge since. Now, the intersection of generative AI and CGI image creation is the next wave.”
Cameron is perhaps best known as the director behind blockbusters like Avatar, Titanic, and Aliens, but in AI circles, he may be most relevant for the co-creation of the character Skynet, a fictional AI system that triggers nuclear Armageddon and dominates humanity in the Terminator media franchise. Similar fears of AI taking over the world have since jumped into reality and recently sparked attempts to regulate existential risk from AI systems through measures like SB-1047 in California.
In a 2023 interview with CTV News, Cameron referenced The Terminator‘s release year when asked about AI’s dangers: “I warned you guys in 1984, and you didn’t listen,” he said. “I think the weaponization of AI is the biggest danger. I think that we will get into the equivalent of a nuclear arms race with AI, and if we don’t build it, the other guys are for sure going to build it, and so then it’ll escalate.”
Hollywood goes AI
Of course, Stability AI isn’t building weapons controlled by AI. Instead, Cameron’s interest in cutting-edge filmmaking techniques apparently drew him to the company.
“James Cameron lives in the future and waits for the rest of us to catch up,” said Stability CEO Prem Akkaraju. “Stability AI’s mission is to transform visual media for the next century by giving creators a full stack AI pipeline to bring their ideas to life. We have an unmatched advantage to achieve this goal with a technological and creative visionary like James at the highest levels of our company. This is not only a monumental statement for Stability AI, but the AI industry overall.”
Cameron joins other recent additions to Stability AI’s board, including Sean Parker, former president of Facebook, who serves as executive chairman. Parker called Cameron’s appointment “the start of a new chapter” for the company.
Despite significant protest from actors’ unions last year, elements of Hollywood are seemingly beginning to embrace generative AI over time. Last Wednesday, we covered a deal between Lionsgate and AI video-generation company Runway that will see the creation of a custom AI model for film production use. In March, the Financial Times reported that OpenAI was actively showing off its Sora video synthesis model to studio executives.
Unstable times for Stability AI
Cameron’s appointment to the Stability AI board comes during a tumultuous period for the company. Stability AI has faced a series of challenges this past year, including an ongoing class-action copyright lawsuit, a troubled Stable Diffusion 3 model launch, significant leadership and staff changes, and ongoing financial concerns.
In March, founder and CEO Emad Mostaque resigned, followed by a round of layoffs. This came on the heels of the departure of three key engineers—Robin Rombach, Andreas Blattmann, and Dominik Lorenz, who have since founded Black Forest Labs and released a new open-weights image-synthesis model called Flux, which has begun to take over the r/StableDiffusion community on Reddit.
Despite the issues, Stability AI claims its models are widely used, with Stable Diffusion reportedly surpassing 150 million downloads. The company states that thousands of businesses use its models in their creative workflows.
While Stable Diffusion has indeed spawned a large community of open-weights-AI image enthusiasts online, it has also been a lightning rod for controversy among some artists because Stability originally trained its models on hundreds of millions of images scraped from the Internet without seeking licenses or permission to use them.
Apparently that association is not a concern for Cameron, according to his statement: “The convergence of these two totally different engines of creation [CGI and generative AI] will unlock new ways for artists to tell stories in ways we could have never imagined. Stability AI is poised to lead this transformation.”
Broadcom is accusing AT&T of trying to “rewind the clock and force” Broadcom “to sell support services for perpetual software licenses… that VMware has discontinued from its product line and to which AT&T has no contractual right to purchase.” The statement comes from legal documents Broadcom filed in response to AT&T’s lawsuit against Broadcom for refusing to renew support for its VMware perpetual licenses [PDF].
On August 29, AT&T filed a lawsuit [PDF] against Broadcom, alleging that Broadcom is breaking a contract by refusing to provide a one-year renewal of support for perpetually licensed VMware software. Broadcom famously ended perpetual VMware license sales shortly after closing its acquisition, in favor of a subscription model built around a couple of large product bundles rather than many individual SKUs.
AT&T claims its VMware contract (forged before Broadcom’s acquisition closed in November) entitles it to three one-year renewals of perpetual license support, and it’s currently trying to exercise the second one. AT&T says it uses VMware products to run 75,000 virtual machines (VMs) across about 8,600 servers; per AT&T, the VMs support customer services operations and operations-management efficiency. AT&T is asking the Supreme Court of the State of New York to stop Broadcom from ending VMware support services for AT&T and for “further relief” as deemed necessary.
On September 20, Broadcom filed a response asking the court to deny AT&T’s motion. Its defense includes its previously stated position that VMware was moving toward a subscription model before Broadcom bought it. The transition from perpetual licenses to subscriptions was years in the making and thus something for which AT&T should have prepared, according to Broadcom. Broadcom claims that AT&T has admitted that it intends to migrate away from VMware software and that AT&T could have spent “the last several months or even years” doing so.
The filing argues: “AT&T resorts to sensationalism by accusing Broadcom of using ‘bullying tactics’ and ‘price gouging.’ Such attacks are intended to generate press and distract the Court from a much simpler story.”
Broadcom claims the simple story is that:
… the agreement contains an unambiguous “End of Availability” provision, which gives VMware the right to retire products and services at any time upon notice. What’s more, a year ago, AT&T opted not to purchase the very Support Services it now asks the Court to force VMware to provide. AT&T did so despite knowing Defendants were implementing a long planned and well-known business model transition and would soon no longer be selling the Support Services in question.
Broadcom says it has been negotiating with AT&T “for months” about a new contract, but the plaintiff “rejected every proposal despite favorable pricing.”
Broadcom’s filing also questions AT&T’s request for mandatory injunction, claiming that New York only grants those in “rare circumstances,” which allegedly don’t apply here.
AT&T has options, Broadcom says
AT&T’s lawsuit claims losing VMware support will cause extreme harm to itself and beyond. The lawsuit says that 22,000 of AT&T’s VMware VMs are used for support “of services to millions of police officers, firefighters, paramedics, emergency workers, and incident response team members nationwide… for use in connection with matters of public safety and/or national security.” It also claimed that communications for the Office of the President are at risk without VMware’s continued support.
However, Broadcom claims that AT&T has other choices, saying:
AT&T does have other options and, therefore, the most it can obtain is monetary damages. The fact that AT&T has been given more than eight-months’ notice and has in the meantime failed to take any measures to prevent its purported harm (e.g., buy a subscription for the new offerings or move to another solution) is telling and precludes any finding of irreparable harm. Even if AT&T thinks it deserves better pricing, it could have avoided its purported irreparable harm by entering in a subscription based deal and suing for monetary damages instead of injunctive relief.
AT&T previously declined to answer Ars Technica’s questions about its backup plans for supporting such important customers should it lose VMware support.
Broadcom has rubbed some customers the wrong way
Broadcom closed its VMware acquisition in November and quickly made dramatic changes. In addition to Broadcom’s reputation for overhauling companies after buying them, moves like ending perpetual licenses, taking VMware’s biggest customers directly instead of using channel partners, and raising costs by bundling products and issuing higher CPU core requirements have led customers and partners to reconsider working with the company. Migrating from VMware can be extremely challenging and expensive due to its deep integration into some IT environments, but many are investigating migration, and some expect Broadcom to face years of backlash.
As NAND Research founder and analyst Steve McDowell told TechTarget about this case:
It’s very unusual for customers to sue their vendors. I think Broadcom grossly underestimated how passionate the customer base is, [but] it’s a captive audience.
As this lawsuit demonstrates, Broadcom’s stewardship of VMware has raised serious customer concerns about ongoing support. Companies like Spinnaker Support are trying to capitalize by offering third-party support services.
Martin Biggs, VP and managing director of EMEA and strategic initiatives at Spinnaker, told Ars Technica that his company provides support so customers can spend time determining their next move, whether that’s buying into a VMware subscription or moving on:
VMware customers are looking for options; the vast majority that we have spoken to don’t have a clear view yet of where they want to go, but in all cases the option of staying with VMware for the significantly increased fees is simply untenable. The challenge many have is that not paying fees means not getting support or security on their existing investment.
VMware’s support for AT&T was supposed to end on September 8, but the two companies entered an agreement to continue support until October 9. A hearing on a preliminary injunction is scheduled for October 15.
Five years ago, researchers made a grim discovery: a legitimate Android app in the Google Play market that had been surreptitiously made malicious by a library the developers used to earn advertising revenue. The library infected the app with code that caused the 100 million devices running it to connect to attacker-controlled servers and download secret payloads.
Now, history is repeating itself. Researchers from the same Moscow, Russia-based security firm reported Monday that they found two new apps, downloaded from Play 11 million times, that were infected with the same malware family. The researchers, from Kaspersky, believe a malicious software developer kit for integrating advertising capabilities is once again responsible.
Clever tradecraft
Software development kits, better known as SDKs, are collections of code and tools that provide developers with frameworks that can greatly speed up the app-creation process by streamlining repetitive tasks. An unverified SDK module incorporated into the apps ostensibly supported the display of ads. Behind the scenes, it provided a host of advanced methods for stealthy communication with malicious servers, where the apps would upload user data and download malicious code that could be executed and updated at any time.
The stealthy malware family in both campaigns is known as Necro. This time, some variants use techniques such as steganography, an obfuscation method rarely seen in mobile malware. Some variants also deploy clever tradecraft to deliver malicious code that can run with heightened system rights. Once devices are infected with this variant, they contact an attacker-controlled command-and-control server and send web requests containing encrypted JSON data that reports information about each compromised device and application hosting the module.
The server, in turn, returns a JSON response that contains a link to a PNG image and associated metadata that includes the image hash. If the malicious module installed on the infected device confirms the hash is correct, it downloads the image.
The SDK module “uses a very simple steganographic algorithm,” Kaspersky researchers explained in a separate post. “If the MD5 check is successful, it extracts the contents of the PNG file—the pixel values in the ARGB channels—using standard Android tools. Then the getPixel method returns a value whose least significant byte contains the blue channel of the image, and processing begins in the code.”
The researchers continued:
If we consider the blue channel of the image as a byte array of dimension 1, then the first four bytes of the image are the size of the encoded payload in Little Endian format (from the least significant byte to the most significant). Next, the payload of the specified size is recorded: this is a JAR file encoded with Base64, which is loaded after decoding via DexClassLoader. Coral SDK loads the sdk.fkgh.mvp.SdkEntry class in a JAR file using the native library libcoral.so. This library has been obfuscated using the OLLVM tool. The starting point, or entry point, for execution within the loaded class is the run method.
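To make that scheme concrete, here is a short, hypothetical decoder written from Kaspersky's description: read the blue channel of every pixel, take the first four bytes as a little-endian payload length, then Base64-decode the bytes that follow. It is an analysis-side sketch under those stated assumptions, not code from the malware or from Kaspersky.

```python
# Illustrative decoder for the blue-channel steganography Kaspersky describes.
# The file path is a placeholder; this sketch only mirrors the documented layout.
import base64
import struct
from PIL import Image

def extract_hidden_payload(png_path: str) -> bytes:
    img = Image.open(png_path).convert("RGBA")
    # Blue-channel value of every pixel, left to right, top to bottom, as a byte array.
    blue = bytes(pixel[2] for pixel in img.getdata())
    # First four bytes: payload size, little-endian.
    (length,) = struct.unpack("<I", blue[:4])
    encoded = blue[4:4 + length]      # Base64 text of the hidden archive
    return base64.b64decode(encoded)  # raw bytes of the JAR the malware would load

# payload = extract_hidden_payload("downloaded.png")  # hypothetical usage
```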
Follow-on payloads that get installed download malicious plugins that can be mixed and matched for each infected device to perform a variety of different actions. One of the plugins allows code to run with elevated system rights. By default, Android bars privileged processes from using WebView, an extension in the OS for displaying webpages in apps. To bypass this safety restriction, Necro uses a hacking technique known as a reflection attack to create a separate instance of the WebView factory.
This plugin can also download and run other executable files that will replace links rendered through WebView. When running with the elevated system rights, these executables have the ability to modify URLs to add confirmation codes for paid subscriptions and download and execute code loaded at links controlled by the attacker. The researchers listed five separate payloads they encountered in their analysis of Necro.
The modular design of Necro opens myriad ways for the malware to behave. Kaspersky provided the following image that provides an overview.
The researchers found Necro in two Google Play apps. One was Wuta Camera, an app with 10 million downloads to date. Wuta Camera versions 6.3.2.148 through 6.3.6.148 contained the malicious SDK that infects apps. The app has since been updated to remove the malicious component. A separate app with roughly 1 million downloads—known as Max Browser—was also infected. That app is no longer available in Google Play.
The researchers also found Necro infecting a variety of Android apps available in alternative marketplaces. Those apps typically billed themselves as modified versions of legitimate apps such as Spotify, Minecraft, WhatsApp, Stumble Guys, Car Parking Multiplayer, and Melon Sandbox.
People who are concerned they may be infected by Necro should check their devices for the presence of indicators of compromise listed at the end of this writeup.
A pleasant female voice greets me over the phone. “Hi, I’m an assistant named Jasmine for Bodega,” the voice says. “How can I help?”
“Do you have patio seating,” I ask. Jasmine sounds a little sad as she tells me that unfortunately, the San Francisco–based Vietnamese restaurant doesn’t have outdoor seating. But her sadness isn’t the result of her having a bad day. Rather, her tone is a feature, a setting.
Jasmine is a member of a new, growing clan: the AI voice restaurant host. If you recently called up a restaurant in New York City, Miami, Atlanta, or San Francisco, chances are you have spoken to one of Jasmine’s polite, calculated competitors.
In the sea of AI voice assistants, hospitality phone agents haven’t been getting as much attention as consumer-based generative AI tools like Gemini Live and ChatGPT-4o. And yet, the niche is heating up, with multiple emerging startups vying for restaurant accounts across the US. Last May, voice-ordering AI garnered much attention at the National Restaurant Association’s annual food show. Bodega, the high-end Vietnamese restaurant I called, used Maitre-D AI, which launched primarily in the Bay Area in 2024. Newo, another new startup, is currently rolling its software out at numerous Silicon Valley restaurants. One-year-old RestoHost is now answering calls at 150 restaurants in the Atlanta metro area, and Slang, a voice AI company that started focusing on restaurants exclusively during the COVID-19 pandemic and announced a $20 million funding round in 2023, is gaining ground in the New York and Las Vegas markets.
All of them offer a similar service: an around-the-clock AI phone host that can answer generic questions about the restaurant’s dress code, cuisine, seating arrangements, and food allergy policies. They can also assist with making, altering, or canceling a reservation. In some cases, the agent can direct the caller to an actual human, but according to RestoHost co-founder Tomas Lopez-Saavedra, only 10 percent of the calls result in that. Each platform offers the restaurant subscription tiers that unlock additional features, and some of the systems can speak multiple languages.
But who even calls a restaurant in the era of Google and Resy? According to some of the founders of AI voice host startups, many customers do, and for various reasons. “Restaurants get a high volume of phone calls compared to other businesses, especially if they’re popular and take reservations,” says Alex Sambvani, CEO and co-founder of Slang, which currently works with everyone from the Wolfgang Puck restaurant group to Chick-fil-A to the fast-casual chain Slutty Vegan. Sambvani estimates that in-demand establishments receive between 800 and 1,000 calls per month. Typical callers tend to be last-minute bookers, tourists and visitors, older people, and those who do their errands while driving.
Matt Ho, the owner of Bodega SF, confirms this scenario. “The phones would ring constantly throughout service,” he says. “We would receive calls for basic questions that can be found on our website.” To solve this issue, after shopping around, Ho found that Maitre-D was the best fit. Bodega SF became one of the startup’s earliest clients in May, and Ho even helped the founders with trial and error testing prior to launch. “This platform makes the job easier for the host and does not disturb guests while they’re enjoying their meal,” he says.