Biz & IT


Apple releases eight small AI language models aimed at on-device use

Inside the Apple core —

OpenELM mirrors efforts by Microsoft to make useful small AI language models that run locally.

An illustration of a robot hand tossing an apple to a human hand. (Credit: Getty Images)

In the world of AI, what might be called “small language models” have been growing in popularity recently because they can be run on a local device instead of requiring data center-grade computers in the cloud. On Wednesday, Apple introduced a set of tiny source-available AI language models called OpenELM that are small enough to run directly on a smartphone. They’re mostly proof-of-concept research models for now, but they could form the basis of future on-device AI offerings from Apple.

Apple’s new AI models, collectively named OpenELM for “Open-source Efficient Language Models,” are currently available on Hugging Face under an Apple Sample Code License. Since there are some restrictions in the license, it may not fit the commonly accepted definition of “open source,” but the source code for OpenELM is available.

On Tuesday, we covered Microsoft’s Phi-3 models, which aim to achieve something similar: a useful level of language understanding and processing performance in small AI models that can run locally. Phi-3-mini features 3.8 billion parameters, but some of Apple’s OpenELM models are much smaller, ranging from 270 million to 3 billion parameters in eight distinct models.

In comparison, the largest model yet released in Meta’s Llama 3 family includes 70 billion parameters (with a 400 billion version on the way), and OpenAI’s GPT-3 from 2020 shipped with 175 billion parameters. Parameter count serves as a rough measure of AI model capability and complexity, but recent research has focused on making smaller AI language models as capable as larger ones were a few years ago.

The eight OpenELM models come in two flavors: four “pretrained” (basically a raw, next-token-prediction version of the model) and four instruction-tuned (fine-tuned for instruction following, which is better suited to developing AI assistants and chatbots). Each flavor comes in four sizes: 270 million, 450 million, 1.1 billion, and 3 billion parameters.
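
The checkpoints can be pulled straight from Hugging Face. Below is a minimal sketch (not Apple’s sample code) of loading the smallest instruction-tuned variant with the transformers library; the repo id, the paired Llama-2 tokenizer, and the trust_remote_code flag reflect Apple’s model cards but should be treated as assumptions to verify.

```python
# A minimal sketch (not Apple's sample code) of loading an OpenELM checkpoint.
# Assumptions to verify against the model card: the repo id, the paired
# Llama-2 tokenizer, and the trust_remote_code flag (OpenELM ships custom
# modeling code).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "apple/OpenELM-270M-Instruct"   # smallest instruction-tuned variant
tokenizer_id = "meta-llama/Llama-2-7b-hf"  # tokenizer the model card pairs with OpenELM

tokenizer = AutoTokenizer.from_pretrained(tokenizer_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("Explain what a small language model is.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)  # well inside the 2,048-token window
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```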

OpenELM features a 2048-token maximum context window. The models were trained on the publicly available datasets RefinedWeb, a version of PILE with duplications removed, a subset of RedPajama, and a subset of Dolma v1.6, which Apple says totals around 1.8 trillion tokens of data. Tokens are fragmented representations of data used by AI language models for processing.

Apple says its approach with OpenELM includes a “layer-wise scaling strategy” that reportedly allocates parameters more efficiently across each layer, saving not only computational resources but also improving the model’s performance while being trained on fewer tokens. According to Apple’s released white paper, this strategy has enabled OpenELM to achieve a 2.36 percent improvement in accuracy over Allen AI’s OLMo 1B (another small language model) while requiring half as many pre-training tokens.
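
Apple’s paper specifies the exact scaling rules; the toy sketch below, with made-up interpolation bounds, only illustrates the underlying idea of letting each layer’s attention-head count and feed-forward width grow with depth instead of staying uniform.

```python
# A toy illustration (not Apple's code) of layer-wise scaling: rather than
# giving every transformer layer identical width, allocate fewer attention
# heads and a smaller feed-forward expansion near the input and more near
# the output. The interpolation bounds here are made-up values chosen only
# to show the mechanics.
def layer_wise_scaling(num_layers, min_heads=4, max_heads=16,
                       min_ffn_mult=1.0, max_ffn_mult=4.0):
    configs = []
    for i in range(num_layers):
        t = i / (num_layers - 1)  # 0.0 at the first layer, 1.0 at the last
        heads = round(min_heads + t * (max_heads - min_heads))
        ffn_mult = round(min_ffn_mult + t * (max_ffn_mult - min_ffn_mult), 2)
        configs.append({"layer": i, "heads": heads, "ffn_mult": ffn_mult})
    return configs

for cfg in layer_wise_scaling(num_layers=8):
    print(cfg)  # per-layer capacity grows gradually with depth
```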

A table comparing OpenELM with other small AI language models in a similar class, taken from the OpenELM research paper by Apple. (Credit: Apple)

Apple also released the code for CoreNet, a library it used to train OpenELM—and it included reproducible training recipes that allow the weights (neural network files) to be replicated, which is so far unusual for a major tech company. As Apple says in its OpenELM paper abstract, transparency is a key goal for the company: “The reproducibility and transparency of large language models are crucial for advancing open research, ensuring the trustworthiness of results, and enabling investigations into data and model biases, as well as potential risks.”

By releasing the source code, model weights, and training materials, Apple says it aims to “empower and enrich the open research community.” However, it also cautions that since the models were trained on publicly sourced datasets, “there exists the possibility of these models producing outputs that are inaccurate, harmful, biased, or objectionable in response to user prompts.”

While Apple has not yet integrated this new wave of AI language model capabilities into its consumer devices, the upcoming iOS 18 update (expected to be revealed in June at WWDC) is rumored to include new AI features that utilize on-device processing to ensure user privacy—though the company may tap Google or OpenAI to handle more complex, off-device AI processing to give Siri a long-overdue boost.



Millions of IPs remain infected by USB worm years after its creators left it for dead

I’M NOT DEAD YET —

Ability of PlugX worm to live on presents a vexing dilemma: Delete it or leave it be.


A now-abandoned USB worm that backdoors connected devices has continued to self-replicate for years since its creators lost control of it and remains active on thousands, possibly millions, of machines, researchers said Thursday.

The worm—which first came to light in a 2023 post published by security firm Sophos—became active in 2019 when a variant of malware known as PlugX added functionality that allowed it to infect USB drives automatically. In turn, those drives would infect any new machine they connected to, a capability that allowed the malware to spread without requiring any end-user interaction. Researchers who have tracked PlugX since at least 2008 have said that the malware has origins in China and has been used by various groups tied to the country’s Ministry of State Security.

Still active after all these years

For reasons that aren’t clear, the worm creator abandoned the one and only IP address that was designated as its command-and-control channel. With no one controlling the infected machines anymore, the PlugX worm was effectively dead, or at least one might have presumed so. The worm, it turns out, has continued to live on in an undetermined number of machines that possibly reaches into the millions, researchers from security firm Sekoia reported.

The researchers purchased the IP address and connected their own server infrastructure to “sinkhole” traffic connecting to it, meaning they intercepted the traffic to prevent it from being used maliciously. Since then, their server has continued to receive PlugX traffic from 90,000 to 100,000 unique IP addresses every day. Over the span of six months, the researchers counted requests from nearly 2.5 million unique IPs. These sorts of check-in requests are standard for virtually all forms of malware and typically happen at regular intervals that span from minutes to days. While the number of affected IPs doesn’t directly indicate the number of infected machines, the volume nonetheless suggests the worm remains active on thousands, possibly millions, of devices.
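
Sekoia hasn’t published its sinkhole code, but the general shape of such a setup is simple. Here is a minimal sketch, with an assumed port and logging cadence: a bare web server that accepts the worm’s check-ins, answers with an empty success response rather than commands, and tallies unique source IPs.

```python
# A rough sketch of a malware sinkhole (not Sekoia's code): accept the
# worm's HTTP check-ins, reply with an empty 200 (never a command), and
# count unique source IPs. Port and logging cadence are assumptions.
from http.server import BaseHTTPRequestHandler, HTTPServer

unique_ips = set()

class SinkholeHandler(BaseHTTPRequestHandler):
    def _handle(self):
        unique_ips.add(self.client_address[0])  # record the infected host's source IP
        self.send_response(200)                 # benign reply; no C2 payload
        self.end_headers()
        if len(unique_ips) % 1000 == 0:
            print(f"unique IPs seen: {len(unique_ips)}")

    do_GET = _handle
    do_POST = _handle

    def log_message(self, format, *args):
        pass  # silence per-request logging; the periodic tally above suffices

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), SinkholeHandler).serve_forever()
```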

“We initially thought that we will have a few thousand victims connected to it, as what we can have on our regular sinkholes,” Sekoia researchers Felix Aimé and Charles M wrote. “However, by setting up a simple web server we saw a continuous flow of HTTP requests varying through the time of the day.”

They went on to say that other variants of the worm remain active through at least three other command-and-control channels known in security circles. There are indications that one of them may also have been sinkholed, however.

As the image below shows, the machines reporting to the sinkhole have broad geographic dispersion:

A world map showing country IPs reporting to the sinkhole. (Credit: Sekoia)

A sample of incoming traffic over a single day appeared to show that Nigeria hosted the largest concentration of infected machines, followed by India, Indonesia, and the UK.

Graph showing the countries with the most affected IPs. (Credit: Sekoia)

The researchers wrote:

Based on that data, it’s notable that around 15 countries account for over 80% of the total infections. It’s also intriguing to note that the leading infected countries don’t share many similarities, a pattern observed with previous USB worms such as RETADUP which has the highest infection rates in Spanish spelling countries. This suggests the possibility that this worm might have originated from multiple patient zeros in different countries.

One explanation is that most of the biggest concentrations are in coastal countries where China’s government has made significant infrastructure investments. Additionally, many of the most affected countries have strategic importance to Chinese military objectives. The researchers speculated that the purpose of the campaign was to collect intelligence the Chinese government could use to achieve those objectives.

The researchers noted that the zombie worm has remained susceptible to takeover by any threat actor who gains control of the IP address or manages to insert itself into the pathway between the server at that address and an infected device. That threat poses interesting dilemmas for the governments of affected countries. They could choose to preserve the status quo by taking no action, or they could activate a self-delete command built into the worm that would disinfect infected machines. Additionally, if they choose the latter option, they could elect to disinfect only the infected machine or add new functionality to disinfect any infected USB drives that happen to be connected.

Because of how the worm infects drives, disinfecting them risks deleting the legitimate data stored on them. On the other hand, allowing drives to remain infected makes it possible for the worm to start its proliferation all over again. Further complicating the decision-making process, the researchers noted that even if someone issues commands that disinfect any infected drives that happen to be plugged in, it’s inevitable that the worm will live on in drives that aren’t connected when a remote disinfect command is issued.

“Given the potential legal challenges that could arise from conducting a widespread disinfection campaign, which involves sending an arbitrary command to workstations we do not own, we have resolved to defer the decision on whether to disinfect workstations in their respective countries to the discretion of national Computer Emergency Response Teams (CERTs), Law Enforcement Agencies (LEAs), and cybersecurity authorities,” the researchers wrote. “Once in possession of the disinfection list, we can provide them an access to start the disinfection for a period of three months. During this time, any PlugX request from an Autonomous System marked for disinfection will be responded to with a removal command or a removal payload.”



Nation-state hackers exploit Cisco firewall 0-days to backdoor government networks

A stylized skull and crossbones made out of ones and zeroes.

Hackers backed by a powerful nation-state have been exploiting two zero-day vulnerabilities in Cisco firewalls in a five-month-long campaign that breaks into government networks around the world, researchers reported Wednesday.

The attacks against Cisco’s Adaptive Security Appliances firewalls are the latest in a rash of network compromises that target firewalls, VPNs, and network-perimeter devices, which are designed to provide a moated gate of sorts that keeps remote hackers out. Over the past 18 months, threat actors—mainly backed by the Chinese government—have turned this security paradigm on its head in attacks that exploit previously unknown vulnerabilities in security appliances from the likes of Ivanti, Atlassian, Citrix, and Progress. These devices are ideal targets because they sit at the edge of a network, provide a direct pipeline to its most sensitive resources, and interact with virtually all incoming communications.

Cisco ASA likely one of several targets

On Wednesday, it was Cisco’s turn to warn that its ASA products have received such treatment. Since November, a previously unknown actor tracked as UAT4356 by Cisco and STORM-1849 by Microsoft has been exploiting two zero-days in attacks that go on to install two pieces of never-before-seen malware, researchers with Cisco’s Talos security team said. Notable traits in the attacks include:

  • An advanced exploit chain that targeted multiple vulnerabilities, at least two of which were zero-days
  • Two mature, full-feature backdoors that have never been seen before, one of which resided solely in memory to prevent detection
  • Meticulous attention to hiding footprints by wiping any artifacts the backdoors may leave behind. In many cases, the wiping was customized based on characteristics of a specific target.

Those characteristics, combined with a small cast of selected targets all in government, have led Talos to assess that the attacks are the work of government-backed hackers motivated by espionage objectives.

“Our attribution assessment is based on the victimology, the significant level of tradecraft employed in terms of capability development and anti-forensic measures, and the identification and subsequent chaining together of 0-day vulnerabilities,” Talos researchers wrote. “For these reasons, we assess with high confidence that these actions were performed by a state-sponsored actor.”

The researchers also warned that the hacking campaign is likely targeting other devices besides the ASA. Notably, the researchers said they still don’t know how UAT4356 gained initial access, meaning the ASA vulnerabilities could be exploited only after one or more other currently unknown vulnerabilities—likely in network wares from Microsoft and others—were exploited.

“Regardless of your network equipment provider, now is the time to ensure that the devices are properly patched, logging to a central, secure location, and configured to have strong, multi-factor authentication (MFA),” the researchers wrote. Cisco has released security updates that patch the vulnerabilities and is urging all ASA users to install them promptly.

UAT4356 started work on the campaign no later than last July, when it was developing and testing the exploits. By November, the threat group had set up the dedicated server infrastructure for the attacks, which began in earnest in January. A timeline graphic published by Cisco details those phases. (Credit: Cisco)

One of the vulnerabilities, tracked as CVE-2024-20359, resides in a now-retired capability allowing for the preloading of VPN clients and plug-ins in ASA. It stems from improper validation of files when they’re read from the flash memory of a vulnerable device and, when exploited, allows remote code execution with root system privileges. UAT4356 is exploiting it to install backdoors that Cisco tracks under the names Line Dancer and Line Runner. In at least one case, the threat actor is installing the backdoors by exploiting CVE-2024-20353, a separate ASA vulnerability with a severity rating of 8.6 out of a possible 10.



Deepfakes in the courtroom: US judicial panel debates new AI evidence rules

adventures in 21st-century justice —

Panel of eight judges confronts deep-faking AI tech that may undermine legal trials.

An illustration of a man with a very long nose holding up the scales of justice.

On Friday, a federal judicial panel convened in Washington, DC, to discuss the challenges of policing AI-generated evidence in court trials, according to a Reuters report. The US Judicial Conference’s Advisory Committee on Evidence Rules, an eight-member panel responsible for drafting evidence-related amendments to the Federal Rules of Evidence, heard from computer scientists and academics about the potential risks of AI being used to manipulate images and videos or create deepfakes that could disrupt a trial.

The meeting took place amid broader efforts by federal and state courts nationwide to address the rise of generative AI models (such as those that power OpenAI’s ChatGPT or Stability AI’s Stable Diffusion), which can be trained on large datasets with the aim of producing realistic text, images, audio, or videos.

In the published 358-page agenda for the meeting, the committee offers up this definition of a deepfake and the problems AI-generated media may pose in legal trials:

A deepfake is an inauthentic audiovisual presentation prepared by software programs using artificial intelligence. Of course, photos and videos have always been subject to forgery, but developments in AI make deepfakes much more difficult to detect. Software for creating deepfakes is already freely available online and fairly easy for anyone to use. As the software’s usability and the videos’ apparent genuineness keep improving over time, it will become harder for computer systems, much less lay jurors, to tell real from fake.

During Friday’s three-hour hearing, the panel wrestled with the question of whether existing rules, which predate the rise of generative AI, are sufficient to ensure the reliability and authenticity of evidence presented in court.

Some judges on the panel, such as US Circuit Judge Richard Sullivan and US District Judge Valerie Caproni, reportedly expressed skepticism about the urgency of the issue, noting that there have been few instances so far of judges being asked to exclude AI-generated evidence.

“I’m not sure that this is the crisis that it’s been painted as, and I’m not sure that judges don’t have the tools already to deal with this,” said Judge Sullivan, as quoted by Reuters.

Last year, Chief US Supreme Court Justice John Roberts acknowledged the potential benefits of AI for litigants and judges, while emphasizing the need for the judiciary to consider its proper uses in litigation. US District Judge Patrick Schiltz, the evidence committee’s chair, said that determining how the judiciary can best react to AI is one of Roberts’ priorities.

In Friday’s meeting, the committee considered several deepfake-related rule changes. In the agenda for the meeting, US District Judge Paul Grimm and attorney Maura Grossman proposed modifying Federal Rule 901(b)(9) (see page 5), which involves authenticating or identifying evidence. They also recommended the addition of a new rule, 901(c), which might read:

901(c): Potentially Fabricated or Altered Electronic Evidence. If a party challenging the authenticity of computer-generated or other electronic evidence demonstrates to the court that it is more likely than not either fabricated, or altered in whole or in part, the evidence is admissible only if the proponent demonstrates that its probative value outweighs its prejudicial effect on the party challenging the evidence.

The panel agreed during the meeting that this proposal, intended to address concerns about litigants challenging evidence as deepfakes, did not work as written and would be reworked before being reconsidered later.

Another proposal by Andrea Roth, a law professor at the University of California, Berkeley, suggested subjecting machine-generated evidence to the same reliability requirements as expert witnesses. However, Judge Schiltz cautioned that such a rule could hamper prosecutions by allowing defense lawyers to challenge any digital evidence without establishing a reason to question it.

For now, no definitive rule changes have been made, and the process continues. But we’re witnessing the first steps of how the US justice system will adapt to an entirely new class of media-generating technology.

Putting aside risks from AI-generated evidence, generative AI has already led to embarrassing moments for lawyers in court over the past two years. In May 2023, US lawyer Steven Schwartz of the firm Levidow, Levidow, & Oberman apologized to a judge for using ChatGPT to help write court filings that cited six nonexistent cases, leading to serious questions about the reliability of AI in legal research. And in November, a lawyer for Michael Cohen cited three nonexistent cases apparently invented by a confabulating AI assistant.



Hackers infect users of antivirus service that delivered updates over HTTP

GOT HTTPS? —

eScan AV updates were delivered over HTTP for five years.


Hackers abused an antivirus service for five years in order to infect end users with malware. The attack worked because the service delivered updates over HTTP, a protocol vulnerable to attacks that corrupt or tamper with data as it travels over the Internet.

The unknown hackers, who may have ties to the North Korean government, pulled off this feat by performing a man-in-the-middle (MitM) attack that replaced the genuine update with a file that installed an advanced backdoor instead, researchers from security firm Avast said today.

eScan, an AV service headquartered in India, has delivered updates over HTTP since at least 2019, Avast researchers reported. This protocol presented a valuable opportunity for installing the malware, which is tracked in security circles under the name GuptiMiner.

“This sophisticated operation has been performing MitM attacks targeting an update mechanism of the eScan antivirus vendor,” Avast researchers Jan Rubín and Milánek wrote. “We disclosed the security vulnerability to both eScan and the India CERT and received confirmation on 2023-07-31 from eScan that the issue was fixed and successfully resolved.”

Complex infection chain

The complex infection chain started when eScan applications checked in with the eScan update system. The threat actors then performed a MitM attack that allowed them to intercept the package sent by the update server and replace it with a corrupted one that contained code to install GuptiMiner. The Avast researchers still don’t know precisely how the attackers were able to perform the interception. They suspect targeted networks may already have been compromised somehow to route traffic to a malicious intermediary.

To lower the chances of detection, the infection file used DLL hijacking, a technique that replaces legitimate dynamic link library files used by most Microsoft apps with maliciously crafted ones that use the same file name. For added stealth, the infection chain also relied on a custom domain name system (DNS) server that allowed it to use legitimate domain names when connecting to attacker-controlled channels.

Last year, the attackers abandoned the DNS technique and replaced it with another obfuscation technique known as IP address masking (sketched in code after the list below). This involved the following steps:

  1. Obtain an IP address of a hardcoded server name registered to the attacker by standard use of the gethostbyname API function
  2. For that server, two IP addresses are returned—the first is an IP address which is a masked address, and the second one denotes an available payload version and starts with 23.195. as its first two octets
  3. If the version is newer than the current one, the masked IP address is de-masked, resulting in a real command-and-control (C&C) IP address
  4. The real C&C IP address is used along with a hardcoded constant string (part of a URL path) to download a file containing malicious shellcode
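
Avast describes the flow only at the level of the steps above. The sketch below makes the sequence concrete, but the server name, the XOR key, and the version encoding are invented stand-ins rather than the malware’s real constants.

```python
# A sketch of the four-step flow above. The server name, XOR key, and the
# exact version/de-masking arithmetic are invented for illustration; the
# real malware's constants and encoding differ.
import socket

MASK_KEY = (0x11, 0x22, 0x33, 0x44)  # hypothetical per-octet XOR key

def resolve_c2(server_name: str, current_version: int):
    # Step 1: a standard lookup returns two A records for the attacker's name.
    _, _, addresses = socket.gethostbyname_ex(server_name)
    masked_ip, version_ip = addresses[0], addresses[1]

    # Step 2: the second address flags the payload version and starts with 23.195.
    octets = [int(o) for o in version_ip.split(".")]
    if octets[:2] != [23, 195]:
        raise ValueError("unexpected version marker")
    available_version = octets[2] * 256 + octets[3]  # assumed encoding

    # Step 3: de-mask only if a newer payload is on offer.
    if available_version <= current_version:
        return None
    real = [int(o) ^ k for o, k in zip(masked_ip.split("."), MASK_KEY)]

    # Step 4: combine the real C&C IP with a hardcoded URL path.
    return "http://{}/{}".format(".".join(map(str, real)), "hardcoded/path/payload.bin")
```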

Some variants of the infection chain stashed the malicious code inside an image file to make it harder to detect. The variants also installed a custom root TLS certificate to satisfy the requirement on some targeted systems that all apps be digitally signed before installation.

The payload contained multiple backdoors that were activated when installed on large networks. Curiously, the update also delivered XMRig, an open-source package for mining cryptocurrency.

The GuptiMiner infection chain. (Credit: Avast)

GuptiMiner has circulated since at least 2018 and has undergone multiple revisions. One searched compromised networks for systems running Windows 7 and Windows Server 2008, presumably to deliver exploits that worked on those earlier versions. Another provided an interface for installing special-purpose modules that could be customized for different victims. (This version also scanned the local system for stored private keys and cryptocurrency wallets.)

The researchers were surprised that malware that took such pains to fly under the radar would also install a cryptocurrency miner, which by nature is usually easy to detect. One possibility is the attackers’ possible connection to Kimsuky, the tracking name for a group backed by the North Korean government. Over the years, North Korea’s government has generated billions of dollars in cryptocurrency through malware installed on the devices of unwitting victims. The researchers made the possible connection after finding similarities between a known Kimsuky keylogger and code fragments used during the GuptiMiner operation.

The GuptiMiner attack is notable for exposing major shortcomings in eScan that went unnoticed for at least five years. Besides not delivering updates over HTTPS, a protocol that protects against MitM tampering, eScan also failed to enforce digital signing to ensure updates hadn’t been tampered with before being installed. Representatives of eScan didn’t respond to an email asking why engineers designed the update process this way.
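
For contrast, the missing safeguards are easy to express in code. The sketch below is our illustration, not eScan’s updater: it fetches over HTTPS and rejects any update whose SHA-256 digest doesn’t match a value published out-of-band. The URL and digest are placeholders, and a production updater would verify a cryptographic signature against a pinned vendor key rather than a bare hash.

```python
# Our illustration of the safeguards eScan lacked: fetch updates over HTTPS
# and verify integrity before install. The URL and expected digest are
# placeholders; a real updater would check a signature against a pinned key.
import hashlib
import urllib.request

UPDATE_URL = "https://updates.example-av.test/defs.bin"  # TLS resists MitM tampering
EXPECTED_SHA256 = "<digest published by the vendor out-of-band>"

def fetch_and_verify() -> bytes:
    with urllib.request.urlopen(UPDATE_URL) as resp:
        payload = resp.read()
    digest = hashlib.sha256(payload).hexdigest()
    if digest != EXPECTED_SHA256:
        raise RuntimeError("update rejected: digest mismatch, possible tampering")
    return payload  # only now is it safe to hand the file to the installer
```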

People who use or have used eScan should check the Avast post for details on whether their systems are infected. It’s likely that most reputable AV scanners will also detect this infection.



Microsoft’s Phi-3 shows the surprising power of small, locally run AI language models

small packages —

Microsoft’s 3.8B parameter Phi-3 may rival GPT-3.5, signaling a new era of “small language models.”

An illustration of lots of information being compressed into a smartphone with a funnel. (Credit: Getty Images)

On Tuesday, Microsoft announced a new, freely available lightweight AI language model named Phi-3-mini, which is simpler and less expensive to operate than traditional large language models (LLMs) like OpenAI’s GPT-4 Turbo. Its small size is ideal for running locally, which could bring an AI model of similar capability to the free version of ChatGPT to a smartphone without needing an Internet connection to run it.

The AI field typically measures AI language model size by parameter count. Parameters are numerical values in a neural network that determine how the language model processes and generates text. They are learned during training on large datasets and essentially encode the model’s knowledge into quantified form. More parameters generally allow the model to capture more nuanced and complex language-generation capabilities but also require more computational resources to train and run.

Some of the largest language models today, like Google’s PaLM 2, have hundreds of billions of parameters. OpenAI’s GPT-4 is rumored to have over a trillion parameters but spread over eight 220-billion parameter models in a mixture-of-experts configuration. Both models require heavy-duty data center GPUs (and supporting systems) to run properly.

In contrast, Microsoft aimed small with Phi-3-mini, which contains only 3.8 billion parameters and was trained on 3.3 trillion tokens. That makes it ideal to run on the consumer GPUs or AI-acceleration hardware found in smartphones and laptops. It’s a follow-up to two previous small language models from Microsoft: Phi-2, released in December, and Phi-1, released in June 2023.
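
Some back-of-the-envelope arithmetic (ours, not Microsoft’s) shows why that parameter count matters for local use: the memory needed just to hold the weights scales with parameter count times bytes per parameter.

```python
# Rough weight-memory math for a 3.8-billion-parameter model at common precisions.
params = 3.8e9
for name, bytes_per_param in [("fp32", 4), ("fp16", 2), ("int4", 0.5)]:
    print(f"{name}: ~{params * bytes_per_param / 2**30:.1f} GiB")
# fp32: ~14.2 GiB, fp16: ~7.1 GiB, int4: ~1.8 GiB — quantized builds are
# what make running in well under 8GB of RAM plausible.
```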

A chart provided by Microsoft showing Phi-3 performance on various benchmarks.

Phi-3-mini features a 4,000-token context window, but Microsoft also introduced a 128K-token version called “Phi-3-mini-128K.” Microsoft has also created 7-billion and 14-billion parameter versions of Phi-3 that it plans to release later and that it claims are “significantly more capable” than Phi-3-mini.

Microsoft says that Phi-3 features overall performance that “rivals that of models such as Mixtral 8x7B and GPT-3.5,” as detailed in a paper titled “Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone.” Mixtral 8x7B, from French AI company Mistral, utilizes a mixture-of-experts model, and GPT-3.5 powers the free version of ChatGPT.

“[Phi-3] looks like it’s going to be a shockingly good small model if their benchmarks are reflective of what it can actually do,” said AI researcher Simon Willison in an interview with Ars. Shortly after providing that quote, Willison downloaded Phi-3 to his MacBook and said, “I got it working, and it’s GOOD” in a text message sent to Ars.

A screenshot of Phi-3-mini running locally on Simon Willison’s MacBook. (Credit: Simon Willison)

“Most models that run on a local device still need hefty hardware,” says Willison. “Phi-3-mini runs comfortably with less than 8GB of RAM, and can churn out tokens at a reasonable speed even on just a regular CPU. It’s licensed MIT and should work well on a $55 Raspberry Pi—and the quality of results I’ve seen from it so far are comparable to models 4x larger.”

How did Microsoft cram a capability potentially similar to GPT-3.5, which has at least 175 billion parameters, into such a small model? Its researchers found the answer by using carefully curated, high-quality training data they initially pulled from textbooks. “The innovation lies entirely in our dataset for training, a scaled-up version of the one used for phi-2, composed of heavily filtered web data and synthetic data,” writes Microsoft. “The model is also further aligned for robustness, safety, and chat format.”

Much has been written about the potential environmental impact of AI models and datacenters themselves, including on Ars. With new techniques and research, it’s possible that machine learning experts may continue to increase the capability of smaller AI models, replacing the need for larger ones—at least for everyday tasks. That would theoretically not only save money in the long run but also require far less energy in aggregate, dramatically decreasing AI’s environmental footprint. AI models like Phi-3 may be a step toward that future if the benchmark results hold up to scrutiny.

Phi-3 is immediately available on Microsoft’s cloud service platform Azure, as well as through partnerships with machine learning model platform Hugging Face and Ollama, a framework that allows models to run locally on Macs and PCs.
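
For a sense of how simple local use has become, here is a minimal sketch using Ollama’s Python client (installed with pip install ollama); it assumes the Ollama daemon is running and that the model is published under the tag “phi3.”

```python
# A minimal sketch of chatting with Phi-3-mini through a local Ollama daemon.
# Assumes `ollama pull phi3` has been run and the daemon is listening; the
# "phi3" tag is an assumption to check against Ollama's model library.
import ollama

response = ollama.chat(
    model="phi3",
    messages=[{"role": "user", "content": "In one sentence, why do small models matter?"}],
)
print(response["message"]["content"])
```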



Windows vulnerability reported by the NSA exploited to install Russian malware


Kremlin-backed hackers have been exploiting a critical Microsoft vulnerability for four years in attacks that targeted a vast array of organizations with a previously undocumented tool, the software maker disclosed Monday.

When Microsoft patched the vulnerability in October 2022—at least two years after it came under attack by the Russian hackers—the company made no mention that it was under active exploitation. As of publication, the company’s advisory still made no mention of the in-the-wild targeting. Windows users frequently prioritize the installation of patches based on whether a vulnerability is likely to be exploited in real-world attacks.

Exploiting CVE-2022-38028, as the vulnerability is tracked, allows attackers to gain system privileges, the highest available in Windows, when combined with a separate exploit. Exploiting the flaw, which carries a 7.8 severity rating out of a possible 10, requires low existing privileges and little complexity. It resides in the Windows print spooler, a printer-management component that has harbored previous critical zero-days. Microsoft said at the time that it learned of the vulnerability from the US National Security Agency.

On Monday, Microsoft revealed that a hacking group tracked under the name Forest Blizzard has been exploiting CVE-2022-38028 since at least June 2020—and possibly as early as April 2019. The threat group—which is also tracked under names including APT28, Sednit, Sofacy, GRU Unit 26165, and Fancy Bear—has been linked by the US and the UK governments to Unit 26165 of the Main Intelligence Directorate, a Russian military intelligence arm better known as the GRU. Forest Blizzard focuses on intelligence gathering through the hacking of a wide array of organizations, mainly in the US, Europe, and the Middle East.

Since as early as April 2019, Forest Blizzard has been exploiting CVE-2022-38028 in attacks that, once system privileges are acquired, use a previously undocumented tool that Microsoft calls GooseEgg. The post-exploitation malware elevates privileges within a compromised system and goes on to provide a simple interface for installing additional pieces of malware that also run with system privileges. This additional malware, which includes credential stealers and tools for moving laterally through a compromised network, can be customized for each target.

“While a simple launcher application, GooseEgg is capable of spawning other applications specified at the command line with elevated permissions, allowing threat actors to support any follow-on objectives such as remote code execution, installing a backdoor, and moving laterally through compromised networks,” Microsoft officials wrote.

GooseEgg is typically installed using a simple batch script, which is executed following the successful exploitation of CVE-2022-38028 or another vulnerability, such as CVE-2023-23397, which Monday’s advisory said has also been exploited by Forest Blizzard. The script is responsible for installing the GooseEgg binary, often named justice.exe or DefragmentSrv.exe, and for ensuring that it runs each time the infected machine is rebooted.



Microsoft’s VASA-1 can deepfake a person with one photo and one audio track

pics and it didn’t happen —

YouTube videos of 6K celebrities helped train AI model to animate photos in real time.

A sample image from Microsoft for “VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time.”

On Tuesday, Microsoft Research Asia unveiled VASA-1, an AI model that can create a synchronized animated video of a person talking or singing from a single photo and an existing audio track. In the future, it could power virtual avatars that render locally and don’t require video feeds—or allow anyone with similar tools to take a photo of a person found online and make them appear to say whatever they want.

“It paves the way for real-time engagements with lifelike avatars that emulate human conversational behaviors,” reads the abstract of the accompanying research paper titled, “VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time.” It’s the work of Sicheng Xu, Guojun Chen, Yu-Xiao Guo, Jiaolong Yang, Chong Li, Zhenyu Zang, Yizhong Zhang, Xin Tong, and Baining Guo.

The VASA framework (short for “Visual Affective Skills Animator”) uses machine learning to analyze a static image along with a speech audio clip. It is then able to generate a realistic video with precise facial expressions, head movements, and lip-syncing to the audio. It does not clone or simulate voices (like other Microsoft research) but relies on an existing audio input that could be specially recorded or spoken for a particular purpose.

Microsoft claims the model significantly outperforms previous speech animation methods in terms of realism, expressiveness, and efficiency. To our eyes, it does seem like an improvement over single-image animating models that have come before.

AI research efforts to animate a single photo of a person or character extend back at least a few years, but more recently, researchers have been working on automatically synchronizing a generated video to an audio track. In February, an AI model called EMO: Emote Portrait Alive from Alibaba’s Institute for Intelligent Computing research group made waves with a similar approach to VASA-1 that can automatically sync an animated photo to a provided audio track (they call it “Audio2Video”).

Trained on YouTube clips

Microsoft researchers trained VASA-1 on the VoxCeleb2 dataset created in 2018 by three researchers from the University of Oxford. That dataset contains “over 1 million utterances for 6,112 celebrities,” according to the VoxCeleb2 website, extracted from videos uploaded to YouTube. VASA-1 can reportedly generate videos at 512×512-pixel resolution and up to 40 frames per second with minimal latency, which means it could potentially be used for real-time applications like video conferencing.

To show off the model, Microsoft created a VASA-1 research page featuring many sample videos of the tool in action, including people singing and speaking in sync with pre-recorded audio tracks. They show how the model can be controlled to express different moods or change its eye gaze. The examples also include some more fanciful generations, such as Mona Lisa rapping to an audio track of Anne Hathaway performing a “Paparazzi” song on Conan O’Brien.

The researchers say that, for privacy reasons, each example photo on their page was AI-generated by StyleGAN2 or DALL-E 3 (aside from the Mona Lisa). But it’s obvious that the technique could be applied to photos of real people as well, although it will likely work better if a person appears similar to a celebrity present in the training dataset. Still, the researchers say that deepfaking real humans is not their intention.

“We are exploring visual affective skill generation for virtual, interactive charactors [sic], NOT impersonating any person in the real world. This is only a research demonstration and there’s no product or API release plan,” reads the site.

While the Microsoft researchers tout potential positive applications like enhancing educational equity, improving accessibility, and providing therapeutic companionship, the technology could also easily be misused. For example, it could allow people to fake video chats, make real people appear to say things they never actually said (especially when paired with a cloned voice track), or allow harassment from a single social media photo.

Right now, the generated video still looks imperfect in some ways, but it could be fairly convincing for some people if they did not know to expect an AI-generated animation. The researchers say they are aware of this, which is why they are not openly releasing the code that powers the model.

“We are opposed to any behavior to create misleading or harmful contents of real persons, and are interested in applying our technique for advancing forgery detection,” write the researchers. “Currently, the videos generated by this method still contain identifiable artifacts, and the numerical analysis shows that there’s still a gap to achieve the authenticity of real videos.”

VASA-1 is only a research demonstration, but Microsoft is far from the only group developing similar technology. If the recent history of generative AI is any guide, it’s potentially only a matter of time before similar technology becomes open source and freely available—and such models will very likely continue to improve in realism over time.



LLMs keep leaping with Llama 3, Meta’s newest open-weights AI model

computer-powered word generator —

Zuckerberg says new AI model “was still learning” when Meta stopped training.

A group of pink llamas on a pixelated background.

On Thursday, Meta unveiled early versions of its Llama 3 open-weights AI model that can be used to power text composition, code generation, or chatbots. It also announced that its Meta AI Assistant is now available on a website and is going to be integrated into its major social media apps, intensifying the company’s efforts to position its products against other AI assistants like OpenAI’s ChatGPT, Microsoft’s Copilot, and Google’s Gemini.

Like its predecessor, Llama 2, Llama 3 is notable for being a freely available, open-weights large language model (LLM) provided by a major AI company. Llama 3 technically does not qualify as “open source” because that term has a specific meaning in software (as we have mentioned in other coverage), and the industry has not yet settled on terminology for AI model releases that ship either code or weights with restrictions (you can read Llama 3’s license here) or that ship without providing training data. We typically call these releases “open weights” instead.

At the moment, Llama 3 is available in two parameter sizes: 8 billion (8B) and 70 billion (70B), both of which are available as free downloads through Meta’s website with a sign-up. Llama 3 comes in two versions: pre-trained (basically the raw, next-token-prediction model) and instruction-tuned (fine-tuned to follow user instructions). Each has an 8,192-token context limit.
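
For those who download the weights, a minimal sketch of loading the 8B instruction-tuned model with Hugging Face transformers looks like the following; the repo id reflects Meta’s gated Hugging Face release, so accepting the license and authenticating are assumed.

```python
# A sketch of loading Llama 3 8B Instruct via transformers. The repo is
# gated: accepting Meta's license and `huggingface-cli login` are assumed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

messages = [{"role": "user", "content": "Write a haiku about open weights."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=64)  # context tops out at 8,192 tokens
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```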

A screenshot of the Meta AI Assistant website on April 18, 2024. (Credit: Benj Edwards)

Meta trained both models on two custom-built, 24,000-GPU clusters. In a podcast interview with Dwarkesh Patel, Meta CEO Mark Zuckerberg said that the company trained the 70B model with around 15 trillion tokens of data. Throughout the process, the model never reached “saturation” (that is, it never hit a wall in terms of capability increases). Eventually, Meta pulled the plug and moved on to training other models.

“I guess our prediction going in was that it was going to asymptote more, but even by the end it was still learning. We probably could have fed it more tokens, and it would have gotten somewhat better,” Zuckerberg said on the podcast.

Meta also announced that it is currently training a 400B parameter version of Llama 3, which some experts like Nvidia’s Jim Fan think may perform in the same league as GPT-4 Turbo, Claude 3 Opus, and Gemini Ultra on benchmarks like MMLU, GPQA, HumanEval, and MATH.

Speaking of benchmarks, we have devoted many words in the past to explaining how frustratingly imprecise benchmarks can be when applied to large language models due to issues like training contamination (that is, including benchmark test questions in the training dataset), cherry-picking on the part of vendors, and an inability to capture AI’s general usefulness in an interactive session with chat-tuned models.

But, as expected, Meta provided some benchmarks for Llama 3 that list results from MMLU (undergraduate-level knowledge), GSM-8K (grade-school math), HumanEval (coding), GPQA (graduate-level questions), and MATH (math word problems). These show the 8B model performing well compared to open-weights models like Google’s Gemma 7B and Mistral 7B Instruct, and the 70B model holding its own against Gemini Pro 1.5 and Claude 3 Sonnet.

A chart of instruction-tuned Llama 3 8B and 70B benchmarks provided by Meta.

Meta says that the Llama 3 model has been enhanced with capabilities to understand coding (like Llama 2) and, for the first time, has been trained with both images and text—though it currently outputs only text. According to Reuters, Meta Chief Product Officer Chris Cox noted in an interview that more complex processing abilities (like executing multi-step plans) are expected in future updates to Llama 3, which will also support multimodal outputs—that is, both text and images.

Meta plans to host the Llama 3 models on a range of cloud platforms, making them accessible through AWS, Databricks, Google Cloud, and other major providers.

Also on Thursday, Meta announced that Llama 3 will become the new basis of the Meta AI virtual assistant, which the company first announced in September. The assistant will appear prominently in search features for Facebook, Instagram, WhatsApp, Messenger, and the aforementioned dedicated website that features a design similar to ChatGPT, including the ability to generate images in the same interface. The company also announced a partnership with Google to integrate real-time search results into the Meta AI assistant, adding to an existing partnership with Microsoft’s Bing.



Kremlin-backed actors spread disinformation ahead of US elections

MANUFACTURING DIVISION —

To a lesser extent, China and Iran also peddle disinfo in hopes of influencing voters.


Kremlin-backed actors have stepped up efforts to interfere with the US presidential election by planting disinformation and false narratives on social media and fake news sites, analysts with Microsoft reported Wednesday.

The analysts have identified several unique influence-peddling groups affiliated with the Russian government seeking to influence the election outcome, with the objective, in large part, of reducing US support for Ukraine and sowing domestic infighting. These groups have so far been less active during the current election cycle than they were during previous ones, likely because of a less contested primary season.

Stoking divisions

Over the past 45 days, the groups have seeded a growing number of social media posts and fake news articles that attempt to foment opposition to US support of Ukraine and stoke divisions over hot-button issues such as election fraud. The influence campaigns also promote questions about President Biden’s mental health and push claims of corrupt judges. In all, Microsoft has tracked scores of such operations in recent weeks.

In a report published Wednesday, the Microsoft analysts wrote:

The deteriorated geopolitical relationship between the United States and Russia leaves the Kremlin with little to lose and much to gain by targeting the US 2024 presidential election. In doing so, Kremlin-backed actors attempt to influence American policy regarding the war in Ukraine, reduce social and political support to NATO, and ensnare the United States in domestic infighting to distract from the world stage. Russia’s efforts thus far in 2024 are not novel, but rather a continuation of a decade-long strategy to “win through the force of politics, rather than the politics of force,” or active measures. Messaging regarding Ukraine—via traditional media and social media—picked up steam over the last two months with a mix of covert and overt campaigns from at least 70 Russia-affiliated activity sets we track.

The most prolific of the influence-peddling groups, Microsoft said, is tied to the Russian Presidential Administration, which, according to the Marshall Center think tank, is a secretive institution that acts as the main gatekeeper for President Vladimir Putin. The affiliation highlights “the increasingly centralized nature of Russian influence campaigns,” a departure from campaigns in previous years that primarily relied on intelligence services and a group known as the Internet Research Agency.

“Each Russian actor has shown the capability and willingness to target English-speaking—and in some cases Spanish-speaking—audiences in the US, pushing social and political disinformation meant to portray Ukrainian President Volodymyr Zelensky as unethical and incompetent, Ukraine as a puppet or failed state, and any American aid to Ukraine as directly supporting a corrupt and conspiratorial regime,” the analysts wrote.

An example is Storm-1516, the name Microsoft uses to track a group seeding anti-Ukraine narratives through US Internet and media sources. Content, published in English, Russian, French, Arabic, and Finnish, frequently originates through disinformation seeded by a purported whistleblower or citizen journalist over a purpose-built video channel and then picked up by a network of Storm-1516-controlled websites posing as independent news sources. These fake news sites reside in the Middle East and Africa as well as in the US, with DC Weekly, Miami Chronicle, and the Intel Drop among them.

Eventually, once the disinformation has circulated in subsequent days, US audiences begin amplifying it, in many cases without being aware of the original source. The following graphic illustrates the flow.

Storm-1516 process for laundering anti-Ukraine disinformation. (Credit: Microsoft)

Wednesday’s report also referred to another group tracked as Storm-1099, which is best known for a campaign called Doppelganger. According to the disinformation research group Disinfo Research Lab, the campaign has targeted multiple countries since 2022 with content designed to undermine support for Ukraine and sow divisions among audiences. Two US outlets tied to Storm-1099 are Election Watch and 50 States of Lie, Microsoft said. The image below shows content recently published by the outlets:

Storm-1099 sites. (Credit: Microsoft)

Wednesday’s report also touched on two other Kremlin-tied operations. One attempts to revive a campaign perpetuated by NABU Leaks, a website that published content alleging then-Vice President Joe Biden colluded with former Ukrainian leader Petro Poroshenko, according to Reuters. In January, Andrii Derkach—the former Ukrainian parliamentarian and US-sanctioned Russian agent responsible for NABU Leaks—reemerged on social media for the first time in two years. In an interview, Derkach propagated both old and new claims about Biden and other US political figures.

The other operation follows a playbook known as hack and leak, in which operatives obtain private information through hacking and leak it to news outlets.



Broadcom says “many” VMware perpetual licenses got support extensions

Conveniently timed blog post —

Broadcom reportedly accused of changing VMware licensing and support conditions.

The logo of American cloud computing and virtualization technology company VMware is seen at the Mobile World Congress (MWC), the telecom industry's biggest annual gathering, in Barcelona on March 2, 2023.

Broadcom CEO Hock Tan this week publicized some concessions aimed at helping customers and partners ease into VMware’s recent business model changes. Tan reiterated that the controversial changes, like the end of perpetual licensing, aren’t going away. But amid questioning from antitrust officials in the European Union (EU), Tan announced that the company has already given support extensions for some VMware perpetual license holders.

Broadcom closed its $69 billion VMware acquisition in November. One of its first moves was ending VMware perpetual license sales in favor of subscriptions. Since December, Broadcom also hasn’t sold Support and Subscription renewals for VMware perpetual licenses.

In a blog post on Monday, Tan admitted that this shift requires “a change in the timing of customers’ expenditures and the balance of those expenditures between capital and operating spending.” As a result, Broadcom has “given support extensions to many customers who came up for renewal while these changes were rolling out.” Tan didn’t specify how Broadcom determined who is eligible for an extension or for how long. However, the post marks the first time Broadcom has announced such extensions, and it opens the door to more extension requests.

Tan also announced free access to zero-day security patches for supported versions of vSphere to “ensure that customers whose maintenance and support contracts have expired and choose to not continue on one of our subscription offerings are able to use perpetual licenses in a safe and secure fashion.” Tan said other VMware offerings would also receive this concession but didn’t say which or when.

Antitrust concerns in the EU

The news follows Broadcom being questioned by EU antitrust regulators. In late March, MLex said that a European Commission spokesperson had contacted Broadcom for questioning because the commission “received information suggesting that Broadcom is changing the conditions of VMware’s software licensing and support.” Reuters confirmed the news on Monday, the same day Tan posted his blog. Tan didn’t specify if his blog post was related to the EU probing. Broadcom moving VMware to a subscription model was one of the allegations that led to EU officials’ probe, MLex said last month. It’s unclear what, if anything, will follow the questioning.

Tan said this week that VMware’s plan to move to a subscription model started in 2018 (he previously said the plans started to “accelerate in 2019”) before Broadcom’s acquisition. He has argued that the transition ultimately occurred later than most competitors.

The Commission previously approved Broadcom’s VMware purchase in July after a separate antitrust investigation.

However, various European trade groups, including Beltug, a Belgian CIO trade group, and the CIO Platform Nederland association for CIOs and CDOs, wrote a letter (PDF) to the European Commission on March 28, requesting that the Commission “take appropriate action” against Broadcom, which they accused of implementing VMware business practices that resulted in “steeply increased prices,” “non-fulfillment of previous contractual agreements,” and Broadcom “refusing to maintain security conditions for perpetual licenses.”

Partner worries

VMware channel partners and customers have also criticized Broadcom’s VMware for seemingly having less interest in doing business with smaller businesses. The company previously announced that it is killing the VMware Cloud Services Provider (CSP) partner program. The Palo Alto-headquartered firm originally said that CSPs may be invited to the Broadcom Expert Advantage Partner Program. However, reported minimum core requirements seemed to outprice small firms; in February, some small managed service providers claimed that the price of doing VMware business would increase tenfold under the new structure.

Small CSPs will be able to white-label offerings from larger CSPs that qualified for Broadcom’s Premier or Pinnacle partner program tiers as of April 30, when VMware’s CSP partner program shutters. But in the meantime, Broadcom “will continue existing operations” for small CSPs “under modified monthly billing arrangements until the white-label offers are available,” Tan said, adding that the move is about ensuring that “there is continuity of service for this smaller partner group.”

However, some channel partners accessing VMware offerings through larger partners remain worried about the future. CRN spoke with an anonymous channel partner selling VMware through Hewlett Packard Enterprise (HPE), which said that more than half of its VMware customers “have reached out to say they are concerned and they want to be aware of alternatives.”

Another unnamed HPE partner told CRN that Broadcom’s perceived prioritization of “bigger, more profitable customers” is sensible but “leaves a lot of people in the lurch.”

Broadcom didn’t respond to Ars’ request for comment.



Linus Torvalds reiterates his tabs-versus-spaces stance with a kernel trap

Tabs Versus Space 2024: The Sabotage —

One does not simply suggest changing a kernel line to help out a parsing tool.

Updated

Cans of Tab diet soda on display in 2011. Tab was discontinued in 2020. There has never been a soda named “Spaces” that had a cult following. (Credit: Getty Images)

Anybody can contribute to the Linux kernel, but any person’s commit suggestion can draw the scrutiny of the kernel’s maintainer and namesake, Linus Torvalds. Torvalds is famously not overly committed to niceness, though he has been working on it since 2018. You can see glimpses of this newer, less curse-laden approach in how Torvalds recently addressed a commit with which he vehemently disagreed. It involves tabs.

The commit last week changed exactly one thing on one line, replacing a tab character with a space: “It helps Kconfig parsers to read file without error.” Torvalds responded with a commit of his own, as spotted by The Register, which would “add some hidden tabs on purpose.” Trying to smooth over a tabs-versus-spaces matter seemed to awaken Torvalds to the need to have tab-detecting failures be “more obvious.” Torvalds would have added more, he wrote, but didn’t “want to make things uglier than necessary. But it *might* be necessary if it turns out we see more of this kind of silly tooling.”

If you’ve read this far and don’t understand what’s happening, please allow me, a failed CS minor, to offer a quick explanation: Tabs Versus Spaces will never be truly resolved, codified, or set right by standards, and the energy spent on the issue over time could, if harnessed, likely power one or more small nations. Still, the Linux kernel has its own coding style, and it directly cites “K&R,” or Kernighan & Ritchie, the authors of the coding bible The C Programming Language, which is a tabs book. If you are submitting kernel code, it had better use tabs (eight-character tabs, ideally, though that is tied in part to teletype and line-printer history).

By attempting to smooth over one tiny part of the kernel so that a parsing tool could see a space character as a delineating whitespace, Prasad Pandit inadvertently spurred a robust rebuttal:

It wasn’t clear what tool it was, but let’s make sure it gets fixed. Because if you can’t parse tabs as whitespace, you should not be parsing the kernel Kconfig files.

In fact, let’s make such breakage more obvious than some esoteric ftrace record size option. If you can’t parse tabs, you can’t have page sizes.

Yes, tab-vs-space confusion is sadly a traditional Unix thing, and ‘make’ is famous for being broken in this regard. But no, that does not mean that it’s ok.
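
The parsing point is easy to demonstrate with a toy example (ours, not from any real Kconfig tool): a parser that only recognizes the space character mangles a tab-indented line, while one that treats all whitespace alike handles it fine.

```python
# A toy demonstration of the whitespace point: Kconfig lines are often
# tab-indented, and a parser that only recognizes spaces mis-tokenizes them.
line = "\tdefault y"        # a tab-indented Kconfig-style line

print(line.split(" "))      # ['\tdefault', 'y'] -> the tab clings to the token
print(line.split())         # ['default', 'y']   -> tabs treated as whitespace
```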

Torvalds’ hidden tabs appear in the fourth release candidate for Linux kernel 6.9, which Torvalds wrote had “nothing particularly unusual going on” the week of its release.

Disclosure: The author is a tab person insofar as he has any idea what he’s doing.

This post was updated at 6:33 pm Eastern to fix some line-break issues in the Torvalds blockquote. The irony was duly noted. A better link regarding the tabs-vs-spaces debate was also swapped in.
