Biz & IT


Journalists “deeply troubled” by OpenAI’s content deals with Vox, The Atlantic

adventures in training data —

“Alarmed” writers unions question transparency of AI training deals with ChatGPT maker.


On Wednesday, Axios broke the news that OpenAI had signed deals with The Atlantic and Vox Media that will allow the ChatGPT maker to license their editorial content to further train its language models. But some of the publications’ writers—and the unions that represent them—were surprised by the announcements and aren’t happy about it. Already, two unions have released statements expressing “alarm” and “concern.”

“The unionized members of The Atlantic Editorial and Business and Technology units are deeply troubled by the opaque agreement The Atlantic has made with OpenAI,” reads a statement from the Atlantic union. “And especially by management’s complete lack of transparency about what the agreement entails and how it will affect our work.”

The Vox Union—which represents The Verge, SB Nation, and Vulture, among other publications—reacted in similar fashion, writing in a statement, “Today, members of the Vox Media Union … were informed without warning that Vox Media entered into a ‘strategic content and product partnership’ with OpenAI. As both journalists and workers, we have serious concerns about this partnership, which we believe could adversely impact members of our union, not to mention the well-documented ethical and environmental concerns surrounding the use of generative AI.”

  • A statement from The Atlantic Union about the OpenAI deal, released May 30, 2024.

  • A statement from the Vox Media Union about the OpenAI deal, released May 29, 2024.

OpenAI has previously admitted to using copyrighted information scraped from publications like the ones that just inked licensing deals to train AI models like GPT-4, which powers its ChatGPT AI assistant. While the company maintains the practice is fair use, it has simultaneously licensed training content from publishing groups like Axel Springer and social media sites like Reddit and Stack Overflow, sparking protests from users of those platforms.

As part of the multi-year agreements with The Atlantic and Vox, OpenAI will be able to openly and officially utilize the publishers’ archived materials—dating back to 1857 in The Atlantic’s case—as well as current articles to train responses generated by ChatGPT and other AI language models. In exchange, the publishers will receive undisclosed sums of money and be able to use OpenAI’s technology “to power new journalism products,” according to Axios.

Reporters react

News of the deals took both journalists and unions by surprise. On X, Vox reporter Kelsey Piper, who recently penned an exposé about OpenAI’s restrictive non-disclosure agreements that prompted a change in policy from the company, wrote, “I’m very frustrated they announced this without consulting their writers, but I have very strong assurances in writing from our editor in chief that they want more coverage like the last two weeks and will never interfere in it. If that’s false I’ll quit..”

Journalists also reacted to news of the deals through the publications themselves. On Wednesday, The Atlantic Senior Editor Damon Beres wrote a piece titled “A Devil’s Bargain With OpenAI,” in which he expressed skepticism about the partnership, likening it to making a deal with the devil that may backfire. He highlighted concerns about AI’s use of copyrighted material without permission and its potential to spread disinformation at a time when publications have seen a recent string of layoffs. He drew parallels to the pursuit of audiences on social media leading to clickbait and SEO tactics that degraded media quality. While acknowledging the financial benefits and potential reach, Beres cautioned against relying on inaccurate, opaque AI models and questioned the implications of journalism companies being complicit in potentially destroying the internet as we know it, even as they try to be part of the solution by partnering with OpenAI.

Similarly, over at Vox, Editorial Director Bryan Walsh penned a piece titled “This article is OpenAI training data,” in which he expressed apprehension about the licensing deal, drawing parallels between AI companies’ relentless pursuit of data and Bostrom’s classic “paperclip maximizer” thought experiment and cautioning that a single-minded focus on market share and profits could ultimately destroy the very ecosystem AI companies rely on for training data. He also worried that the growth of AI chatbots and generative AI search products might lead to a significant decline in search engine traffic to publishers, potentially threatening the livelihoods of content creators and the richness of the Internet itself.

Meanwhile, OpenAI still battles over “fair use”

Not every publication is eager to step up to the licensing plate with OpenAI. The San Francisco-based company is currently in the middle of a lawsuit with The New York Times in which OpenAI claims that scraping data from publications for AI training purposes is fair use. The New York Times has tried to block AI companies from such scraping by updating its terms of service to prohibit AI training, arguing in its lawsuit that ChatGPT could easily become a substitute for NYT.

The Times has accused OpenAI of copying millions of its works to train AI models, finding 100 examples where ChatGPT regurgitated articles. In response, OpenAI accused NYT of “hacking” ChatGPT with deceptive prompts simply to set up a lawsuit. NYT’s counsel Ian Crosby previously told Ars that OpenAI’s decision “to enter into deals with news publishers only confirms that they know their unauthorized use of copyrighted work is far from ‘fair.'”

While that issue has yet to be resolved in the courts, for now, The Atlantic Union seeks transparency.

“The Atlantic has defended the values of transparency and intellectual honesty for more than 160 years. Its legacy is built on integrity, derived from the work of its writers, editors, producers, and business staff,” it wrote. “OpenAI, on the other hand, has used news articles to train AI technologies like ChatGPT without permission. The people who continue to maintain and serve The Atlantic deserve to know what precisely management has licensed to an outside firm and how, specifically, they plan to use the archive of our creative output and our work product.”



Google’s AI Overview is flawed by design, and a new company blog post hints at why

guided by voices —

Google: “There are bound to be some oddities and errors” in system that told people to eat rocks.

The Google “G” logo surrounded by whimsical company mascot characters, all of which look stunned and surprised.

On Thursday, Google capped off a rough week of providing inaccurate and sometimes dangerous answers through its experimental AI Overview feature by publishing a follow-up blog post titled “AI Overviews: About last week.” In the post, attributed to Google VP Liz Reid, head of Google Search, the company formally acknowledged issues with the feature and outlined the steps it has taken to improve a system that appears flawed by design, even if Google doesn’t seem to realize it is admitting as much.

To recap, the AI Overview feature—which the company showed off at Google I/O a few weeks ago—aims to provide search users with summarized answers to questions by using an AI model integrated with Google’s web ranking systems. Right now, it’s an experimental feature that is not active for everyone, but when a participating user searches for a topic, they might see an AI-generated answer at the top of the results, pulled from highly ranked web content and summarized by an AI model.

While Google claims this approach is “highly effective” and on par with its Featured Snippets in terms of accuracy, the past week has seen numerous examples of the AI system generating bizarre, incorrect, or even potentially harmful responses, as we detailed in a recent feature where Ars reporter Kyle Orland replicated many of the unusual outputs.

Drawing inaccurate conclusions from the web

On Wednesday morning, Google’s AI Overview was erroneously telling us the Sony PlayStation and Sega Saturn were available in 1993.

Kyle Orland / Google

Given the circulating AI Overview examples, Google almost apologizes in the post and says, “We hold ourselves to a high standard, as do our users, so we expect and appreciate the feedback, and take it seriously.” But Reid, in an attempt to justify the errors, then goes into some very revealing detail about why AI Overviews provides erroneous information:

AI Overviews work very differently than chatbots and other LLM products that people may have tried out. They’re not simply generating an output based on training data. While AI Overviews are powered by a customized language model, the model is integrated with our core web ranking systems and designed to carry out traditional “search” tasks, like identifying relevant, high-quality results from our index. That’s why AI Overviews don’t just provide text output, but include relevant links so people can explore further. Because accuracy is paramount in Search, AI Overviews are built to only show information that is backed up by top web results.

This means that AI Overviews generally don’t “hallucinate” or make things up in the ways that other LLM products might.

Here we see the fundamental flaw of the system: “AI Overviews are built to only show information that is backed up by top web results.” The design is based on the false assumption that Google’s page-ranking algorithm favors accurate results and not SEO-gamed garbage. Google Search has been broken for some time, and now the company is relying on those gamed and spam-filled results to feed its new AI model.

Even when the AI model draws from a more accurate source, as with the 1993 game console search seen above, Google’s language model can still reach inaccurate conclusions about that “accurate” data, confabulating erroneous information in a flawed summary of what is actually available.

Generally ignoring the folly of basing its AI results on a broken page-ranking algorithm, Google’s blog post instead attributes the commonly circulated errors to several other factors, including users making nonsensical searches “aimed at producing erroneous results.” Google does admit faults with the AI model, like misinterpreting queries, misinterpreting “a nuance of language on the web,” and lacking sufficient high-quality information on certain topics. It also suggests that some of the more egregious examples circulating on social media are fake screenshots.

“Some of these faked results have been obvious and silly,” Reid writes. “Others have implied that we returned dangerous results for topics like leaving dogs in cars, smoking while pregnant, and depression. Those AI Overviews never appeared. So we’d encourage anyone encountering these screenshots to do a search themselves to check.”

(No doubt some of the social media examples are fake, but it’s worth noting that any attempts to replicate those early examples now will likely fail because Google will have manually blocked the results. And it is potentially a testament to how broken Google Search is if people believed extreme fake examples in the first place.)

While addressing the “nonsensical searches” angle in the post, Reid uses the example search, “How many rocks should I eat each day,” which went viral in a tweet on May 23. Reid says, “Prior to these screenshots going viral, practically no one asked Google that question.” And since there isn’t much data on the web that answers it, she says there is a “data void” or “information gap” that was filled by satirical content found on the web, and the AI model found it and pushed it as an answer, much like Featured Snippets might. So basically, it was working exactly as designed.

A screenshot of an AI Overview query, “How many rocks should I eat each day,” that went viral on X last week.



Federal agency warns critical Linux vulnerability being actively exploited

NETFILTER FLAW —

Cybersecurity and Infrastructure Security Agency urges affected users to update ASAP.


Getty Images

The US Cybersecurity and Infrastructure Security Agency has added a critical security bug in Linux to its list of vulnerabilities known to be actively exploited in the wild.

The vulnerability, tracked as CVE-2024-1086 and carrying a severity rating of 7.8 out of a possible 10, allows people who have already gained a foothold inside an affected system to escalate their system privileges. It’s the result of a use-after-free error, a class of vulnerability that occurs in software written in the C and C++ languages when a process continues to access a memory location after it has been freed or deallocated. Use-after-free vulnerabilities can result in remote code or privilege escalation.
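
To make that class of bug concrete, here is a deliberately contrived userspace C sketch of a use-after-free. It only illustrates the error class; it is not the kernel code involved in CVE-2024-1086:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        char *session = malloc(32);     /* allocate a buffer */
        strcpy(session, "admin=0");

        free(session);                  /* memory is handed back to the allocator */

        /* Bug: the pointer is still dereferenced after the free. The allocator
         * may already have reused this chunk for another allocation, so the
         * program is now reading memory it no longer owns. In the kernel, an
         * attacker who controls what object gets reallocated into the freed
         * slot can hijack it and escalate privileges. */
        printf("%s\n", session);        /* use-after-free */
        return 0;
    }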

The vulnerability, which affects Linux kernel versions 5.14 through 6.6, resides in nf_tables, a kernel component that enables Netfilter, which in turn facilitates a variety of network operations, including packet filtering, network address [and port] translation (NA[P]T), packet logging, userspace packet queueing, and other packet mangling. The flaw was patched in January, but as the CISA advisory indicates, some production systems have yet to install the update. At the time this Ars post went live, there were no publicly known details about the active exploitation.

A deep-dive write-up of the vulnerability reveals that exploiting it provides “a very powerful double-free primitive when the correct code paths are hit.” Double-free vulnerabilities are a subclass of use-after-free errors that occur when the free() function for releasing memory is called more than once on the same location. The write-up lists multiple ways to exploit the vulnerability, along with code for doing so.
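
In the same spirit, a double free looks like this in a simplified userspace sketch; the kernel exploit described in the write-up abuses allocator internals far more carefully, but the underlying mistake is the same call made twice:

    #include <stdlib.h>

    int main(void)
    {
        void *buf = malloc(64);

        free(buf);      /* first free: legitimate */

        /* ... other allocations may reuse this chunk in the meantime ... */

        free(buf);      /* second free: corrupts allocator bookkeeping; with
                           careful heap grooming, an attacker can turn this
                           into overlapping allocations and arbitrary writes */
        return 0;
    }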

The double-free error stems from a failure to sanitize input in netfilter verdicts when nf_tables and unprivileged user namespaces are enabled. Some of the most effective exploitation techniques allow for arbitrary code execution in the kernel and can be fashioned to drop a universal root shell.

The author offered the following graphic providing a conceptual illustration:

pwning tech

CISA has given federal agencies under its authority until June 20 to issue a patch. The agency is urging all organizations that have yet to apply an update to do so as soon as possible.



Mystery malware destroys 600,000 routers from a single ISP during 72-hour span

PUMPKIN ECLIPSE —

An unknown threat actor with equally unknown motives forces ISP to replace routers.


Getty Images

One day last October, subscribers to an ISP known as Windstream began flooding message boards with reports their routers had suddenly stopped working and remained unresponsive to reboots and all other attempts to revive them.

“The routers now just sit there with a steady red light on the front,” one user wrote, referring to the ActionTec T3200 router models Windstream provided to both them and a next door neighbor. “They won’t even respond to a RESET.”

In the messages—which appeared over a few days beginning on October 25—many Windstream users blamed the ISP for the mass bricking. They said it was the result of the company pushing updates that poisoned the devices. Windstream’s Kinetic broadband service has about 1.6 million subscribers in 18 states, including Iowa, Alabama, Arkansas, Georgia, and Kentucky. For many customers, Kinetic provides an essential link to the outside world.

“We have 3 kids and both work from home,” another subscriber wrote in the same forum. “This has easily cost us $1,500+ in lost business, no tv, WiFi, hours on the phone, etc. So sad that a company can treat customers like this and not care.”

After eventually determining that the routers were permanently unusable, Windstream sent new routers to affected customers. Researchers at Lumen Technologies’ Black Lotus Labs have since named the event Pumpkin Eclipse.

A deliberate act

A report published Thursday by security firm Lumen Technologies’ Black Lotus Labs may shed new light on the incident, which Windstream has yet to explain. Black Lotus Labs researchers said that over a 72-hour period beginning on October 25, malware took out more than 600,000 routers connected to a single autonomous system number, or ASN, belonging to an unnamed ISP.

While the researchers aren’t identifying the ISP, the particulars they report match almost perfectly with those detailed in the October messages from Windstream subscribers. Specifically, the date the mass bricking started, the router models affected, the description of the ISP, and the displaying of a static red light by the out-of-commission ActionTec routers. Windstream representatives declined to answer questions sent by email.

According to Black Lotus, the routers—conservatively estimated at a minimum of 600,000—were taken out by an unknown threat actor with equally unknown motivations. The actor took deliberate steps to cover their tracks by using commodity malware known as Chalubo, rather than a custom-developed toolkit. A feature built into Chalubo allowed the actor to execute custom Lua scripts on the infected devices. The researchers believe the malware downloaded and ran code that permanently overwrote the router firmware.

“We assess with high confidence that the malicious firmware update was a deliberate act intended to cause an outage, and though we expected to see a number of router make and models affected across the internet, this event was confined to the single ASN,” Thursday’s report stated before going on to note the troubling implications of a single piece of malware suddenly severing the connections of 600,000 routers.

The researchers wrote:

Destructive attacks of this nature are highly concerning, especially so in this case. A sizeable portion of this ISP’s service area covers rural or underserved communities; places where residents may have lost access to emergency services, farming concerns may have lost critical information from remote monitoring of crops during the harvest, and health care providers cut off from telehealth or patients’ records. Needless to say, recovery from any supply chain disruption takes longer in isolated or vulnerable communities.

After learning of the mass router outage, Black Lotus began querying the Censys search engine for the affected router models. A one-week snapshot soon revealed that one specific ASN experienced a 49 percent drop in those models just as the reports began. This amounted to the disconnection of at least 179,000 ActionTec routers and more than 480,000 routers sold by Sagemcom.

Black Lotus Labs

The constant connecting and disconnecting of routers to any ISP complicates the tracking process, because it’s impossible to know whether a disappearance is the result of normal churn or something more complicated. Black Lotus said that a conservative estimate is that at least 600,000 of the disconnections it tracked were the result of Chalubo infecting the devices and, from there, permanently wiping the firmware they ran on.

After identifying the ASN, Black Lotus discovered a complex multi-path infection mechanism for installing Chalubo on the routers. The following graphic provides a logical overview.

Black Lotus Labs

There aren’t many known precedents for malware that wipes routers en masse in the way witnessed by the researchers. Perhaps the closest was the discovery in 2022 of AcidRain, the name given to malware that knocked out 10,000 modems for satellite Internet provider Viasat. The outage, hitting Ukraine and other parts of Europe, was timed to Russia’s invasion of the smaller neighboring country.

A Black Lotus representative said in an interview that researchers can’t rule out that a nation-state is behind the router-wiping incident affecting the ISP. But so far, the researchers say they aren’t aware of any overlap between the attacks and any known nation-state groups they track.

The researchers have yet to determine the initial means of infecting the routers. It’s possible the threat actors exploited a vulnerability, although the researchers said they aren’t aware of any known vulnerabilities in the affected routers. Other possibilities are that the threat actor abused weak credentials or accessed an exposed administrative panel.

An attack unlike any other

While the researchers have analyzed attacks on home and small office routers before, they said two things make this latest one stand out. They explained:

First, this campaign resulted in a hardware-based replacement of the affected devices, which likely indicates that the attacker corrupted the firmware on specific models. The event was unprecedented due to the number of units affected—no attack that we can recall has required the replacement of over 600,000 devices. In addition, this type of attack has only ever happened once before, with AcidRain used as a precursor to an active military invasion.

They continued:

The second unique aspect is that this campaign was confined to a particular ASN. Most previous campaigns we’ve seen target a specific router model or common vulnerability and have effects across multiple providers’ networks. In this instance, we observed that both Sagemcom and ActionTec devices were impacted at the same time, both within the same provider’s network. This led us to assess it was not the result of a faulty firmware update by a single manufacturer, which would normally be confined to one device model or models from a given company. Our analysis of the Censys data shows the impact was only for the two in question. This combination of factors led us to conclude the event was likely a deliberate action taken by an unattributed malicious cyber actor, even if we were not able to recover the destructive module.

With no clear idea how the routers came to be infected, the researchers can only offer the usual generic advice for keeping such devices free of malware. That includes installing security updates, replacing default passwords with strong ones, and regular rebooting. ISPs and other organizations that manage routers should follow additional advice for securing the management interfaces for administering the devices.

Thursday’s report includes IP addresses, domain names, and other indicators that people can use to determine if their devices have been targeted or compromised in the attacks.



OpenAI board first learned about ChatGPT from Twitter, according to former member

It’s a secret to everybody —

Helen Toner, center of struggle with Altman, suggests CEO fostered “toxic atmosphere” at company.

Helen Toner, former OpenAI board member, speaks onstage during Vox Media’s 2023 Code Conference at The Ritz-Carlton, Laguna Niguel on September 27, 2023.

In a recent interview on “The Ted AI Show” podcast, former OpenAI board member Helen Toner said the OpenAI board was unaware of the existence of ChatGPT until they saw it on Twitter. She also revealed details about the company’s internal dynamics and the events surrounding CEO Sam Altman’s surprise firing and subsequent rehiring last November.

OpenAI released ChatGPT publicly on November 30, 2022, and its massive surprise popularity set OpenAI on a new trajectory, shifting focus from being an AI research lab to a more consumer-facing tech company.

“When ChatGPT came out in November 2022, the board was not informed in advance about that. We learned about ChatGPT on Twitter,” Toner said on the podcast.

Toner’s revelation about ChatGPT seems to highlight a significant disconnect between the board and the company’s day-to-day operations, bringing new light to accusations that Altman was “not consistently candid in his communications with the board” upon his firing on November 17, 2023. Altman and OpenAI’s new board later said that the CEO’s mismanagement of attempts to remove Toner from the OpenAI board following her criticism of the company’s release of ChatGPT played a key role in Altman’s firing.

“Sam didn’t inform the board that he owned the OpenAI startup fund, even though he constantly was claiming to be an independent board member with no financial interest in the company on multiple occasions,” she said. “He gave us inaccurate information about the small number of formal safety processes that the company did have in place, meaning that it was basically impossible for the board to know how well those safety processes were working or what might need to change.”

Toner also shed light on the circumstances that led to Altman’s temporary ousting. She mentioned that two OpenAI executives had reported instances of “psychological abuse” to the board, providing screenshots and documentation to support their claims. The allegations made by the former OpenAI executives, as relayed by Toner, suggest that Altman’s leadership style fostered a “toxic atmosphere” at the company:

In October of last year, we had this series of conversations with these executives, where the two of them suddenly started telling us about their own experiences with Sam, which they hadn’t felt comfortable sharing before, but telling us how they couldn’t trust him, about the toxic atmosphere it was creating. They use the phrase “psychological abuse,” telling us they didn’t think he was the right person to lead the company, telling us they had no belief that he could or would change, there’s no point in giving him feedback, no point in trying to work through these issues.

Despite the board’s decision to fire him, Altman began the process of returning to his position just five days later, after more than 700 OpenAI employees signed a letter to the board. Toner attributed this swift comeback to employees who believed the company would collapse without him, saying they also feared retaliation from Altman if they did not support his return.

“The second thing I think is really important to know, that has really gone under reported is how scared people are to go against Sam,” Toner said. “They experienced him retaliate against people retaliating… for past instances of being critical.”

“They were really afraid of what might happen to them,” she continued. “So some employees started to say, you know, wait, I don’t want the company to fall apart. Like, let’s bring back Sam. It was very hard for those people who had had terrible experiences to actually say that… if Sam did stay in power, as he ultimately did, that would make their lives miserable.”

In response to Toner’s statements, current OpenAI board chair Bret Taylor provided a statement to the podcast: “We are disappointed that Miss Toner continues to revisit these issues… The review concluded that the prior board’s decision was not based on concerns regarding product safety or security, the pace of development, OpenAI’s finances, or its statements to investors, customers, or business partners.”

Even given that review, Toner’s main argument is that OpenAI hasn’t been able to police itself despite claims to the contrary. “The OpenAI saga shows that trying to do good and regulating yourself isn’t enough,” she said.



Researchers crack 11-year-old password, recover $3 million in bitcoin

Illustration of a wallet

Flavio Coelho/Getty Images

Two years ago, when “Michael,” an owner of cryptocurrency, contacted Joe Grand to help recover access to about $2 million worth of bitcoin he had stored in encrypted format on his computer, Grand turned him down.

Michael, who is based in Europe and asked to remain anonymous, stored the cryptocurrency in a password-protected digital wallet. He generated the password using the RoboForm password manager and stored it in a file encrypted with a tool called TrueCrypt, but he deliberately didn’t save the password in RoboForm itself, worried that someone would hack his computer and obtain it. At some point, the encrypted file got corrupted, and Michael lost access to the 20-character password he had generated to secure his 43.6 BTC (worth a total of about 4,000 euros, or $5,300, in 2013).

“At [that] time, I was really paranoid with my security,” he laughs.

Grand is a famed hardware hacker who in 2022 helped another crypto wallet owner recover access to $2 million in cryptocurrency he thought he’d lost forever after forgetting the PIN to his Trezor wallet. Since then, dozens of people have contacted Grand to help them recover their treasure. But Grand, known by the hacker handle “Kingpin,” turns down most of them, for various reasons.

Grand is an electrical engineer who began hacking computing hardware at age 10 and in 2008 cohosted the Discovery Channel’s Prototype This show. He now consults with companies that build complex digital systems to help them understand how hardware hackers like him might subvert their systems. He cracked the Trezor wallet in 2022 using complex hardware techniques that forced the USB-style wallet to reveal its password.

But Michael stored his cryptocurrency in a software-based wallet, which meant none of Grand’s hardware skills were relevant this time. He considered brute-forcing Michael’s password—writing a script to automatically guess millions of possible passwords to find the correct one—but determined this wasn’t feasible. He briefly considered that the RoboForm password manager Michael used to generate his password might have a flaw in the way it generated passwords, which would allow him to guess the password more easily. Grand, however, doubted such a flaw existed.

Michael contacted multiple people who specialize in cracking cryptography; they all told him “there’s no chance” of retrieving his money. But last June he approached Grand again, hoping to convince him to help, and this time Grand agreed to give it a try, working with a friend named Bruno in Germany who also hacks digital wallets.



OpenAI training its next major AI model, forms new safety committee

now with 200% more safety —

GPT-5 might be farther off than we thought, but OpenAI wants to make sure it is safe.


On Monday, OpenAI announced the formation of a new “Safety and Security Committee” to oversee risk management for its projects and operations. The announcement comes as the company says it has “recently begun” training its next frontier model, which it expects to bring the company closer to its goal of achieving artificial general intelligence (AGI), though some critics say AGI is farther off than we might think. It also comes as a reaction to a terrible two weeks in the press for the company.

Whether the aforementioned new frontier model is intended to be GPT-5 or a step beyond that is currently unknown. In the AI industry, “frontier model” is a term for a new AI system designed to push the boundaries of current capabilities. And “AGI” refers to a hypothetical AI system with human-level abilities to perform novel, general tasks beyond its training data (unlike narrow AI, which is trained for specific tasks).

Meanwhile, the new Safety and Security Committee, led by OpenAI directors Bret Taylor (chair), Adam D’Angelo, Nicole Seligman, and Sam Altman (CEO), will be responsible for making recommendations about AI safety to the full company board of directors. In this case, “safety” partially means the usual “we won’t let the AI go rogue and take over the world,” but it also includes a broader set of “processes and safeguards” that the company spelled out in a May 21 safety update related to alignment research, protecting children, upholding election integrity, assessing societal impacts, and implementing security measures.

OpenAI says the committee’s first task will be to evaluate and further develop those processes and safeguards over the next 90 days. At the end of this period, the committee will share its recommendations with the full board, and OpenAI will publicly share an update on adopted recommendations.

OpenAI says that multiple technical and policy experts, including Aleksander Madry (head of preparedness), Lilian Weng (head of safety systems), John Schulman (head of alignment science), Matt Knight (head of security), and Jakub Pachocki (chief scientist), will also serve on its new committee.

The announcement is notable in a few ways. First, it’s a reaction to the negative press that came from OpenAI Superalignment team members Ilya Sutskever and Jan Leike resigning two weeks ago. That team was tasked with “steer[ing] and control[ling] AI systems much smarter than us,” and their departure has led to criticism from some within the AI community (and Leike himself) that OpenAI lacks a commitment to developing highly capable AI safely. Other critics, like Meta Chief AI Scientist Yann LeCun, think the company is nowhere near developing AGI, so the concern over a lack of safety for superintelligent AI may be overblown.

Second, there have been persistent rumors that progress in large language models (LLMs) has plateaued recently around capabilities similar to GPT-4. Two major competing models, Anthropic’s Claude Opus and Google’s Gemini 1.5 Pro, are roughly equivalent to the GPT-4 family in capability despite every competitive incentive to surpass it. And recently, when many expected OpenAI to release a new AI model that would clearly surpass GPT-4 Turbo, it instead released GPT-4o, which is roughly equivalent in ability but faster. During that launch, the company relied on a flashy new conversational interface rather than a major under-the-hood upgrade.

We’ve previously reported on a rumor of GPT-5 coming this summer, but with this recent announcement, it seems the rumors may have been referring to GPT-4o instead. It’s quite possible that OpenAI is nowhere near releasing a model that can significantly surpass GPT-4. But with the company quiet on the details, we’ll have to wait and see.



Newly discovered ransomware uses BitLocker to encrypt victim data

GOING NATIVE —

ShrinkLocker is the latest ransomware to use Windows’ full-disk encryption.

A previously unknown piece of ransomware, dubbed ShrinkLocker, encrypts victim data using the BitLocker feature built into the Windows operating system.

BitLocker is a full-volume encryptor that debuted in 2007 with the release of Windows Vista. Users employ it to encrypt entire hard drives to prevent people from reading or modifying data in the event they get physical access to the disk. Starting with the rollout of Windows 10, BitLocker by default has used the 128-bit and 256-bit XTS-AES encryption algorithm, giving the feature extra protection from attacks that rely on manipulating cipher text to cause predictable changes in plain text.

Recently, researchers from security firm Kaspersky found a threat actor using BitLocker to encrypt data on systems located in Mexico, Indonesia, and Jordan. The researchers named the new ransomware ShrinkLocker, both for its use of BitLocker and because it shrinks the size of each non-boot partition by 100 MB and splits the newly unallocated space into new primary partitions of the same size.

“Our incident response and malware analysis are evidence that attackers are constantly refining their tactics to evade detection,” the researchers wrote Friday. “In this incident, we observed the abuse of the native BitLocker feature for unauthorized data encryption.”

ShrinkLocker isn’t the first malware to leverage BitLocker. In 2022, Microsoft reported that ransomware attackers with a nexus to Iran also used the tool to encrypt files. That same year, the Russian agricultural business Miratorg was attacked by ransomware that used BitLocker to encrypt files residing in the system storage of infected devices.

Once installed on a device, ShrinkLocker runs a VisualBasic script that first invokes the Windows Management Instrumentation and Win32_OperatingSystem class to obtain information about the operating system.

“For each object within the query results, the script checks if the current domain is different from the target,” the Kaspersky researchers wrote. “If it is, the script finishes automatically. After that, it checks if the name of the operating system contains ‘xp,’ ‘2000,’ ‘2003,’ or ‘vista,’ and if the Windows version matches any one of these, the script finishes automatically and deletes itself.”

A screenshot showing initial conditions for execution.

Kaspersky

The script then continues to use the WMI for querying information about the OS. It goes on to perform the disk resizing operations, which can vary depending on the OS version detected. The ransomware performs these operations only on local, fixed drives. The decision to leave network drives alone is likely motivated by the desire not to trigger network detection protections.

Eventually, ShrinkLocker disables the protectors designed to secure the BitLocker encryption key and goes on to delete them. It then enables the use of a numerical password, both as a protector against anyone else taking back control of BitLocker and as an encryptor for system data. The reason for deleting the default protectors is to prevent the device owner from using BitLocker’s key recovery features. ShrinkLocker then goes on to generate a 64-character encryption key (a rough sketch of the approach follows the list below) using random multiplication and replacement of:

  • A variable with the numbers 0–9;
  • The famous pangram, “The quick brown fox jumps over the lazy dog,” in lowercase and uppercase, which contains every letter of the English alphabet;
  • Special characters.
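
Based on Kaspersky’s description, the character pool is assembled from those three sources and then sampled 64 times. The C sketch below only illustrates that idea; the actual ransomware does this in its Visual Basic script, its exact sampling and seeding logic isn’t public, and the pool contents and use of rand() here are assumptions made for illustration:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>

    int main(void)
    {
        /* Pool built from the sources Kaspersky describes: digits, the pangram
         * in lowercase and uppercase (spaces omitted), and special characters. */
        const char *pool =
            "0123456789"
            "thequickbrownfoxjumpsoverthelazydog"
            "THEQUICKBROWNFOXJUMPSOVERTHELAZYDOG"
            "!@#$%^&*()-_=+";

        char key[65];
        size_t pool_len = strlen(pool);

        srand((unsigned)time(NULL));            /* placeholder randomness only */
        for (int i = 0; i < 64; i++)
            key[i] = pool[rand() % pool_len];
        key[64] = '\0';

        /* ShrinkLocker locks the drive with a key generated along these lines;
         * per the article, only the attacker retains it. */
        printf("generated key: %s\n", key);
        return 0;
    }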

After several additional steps, data is encrypted. The next time the device reboots, the display looks like this:

Screenshot showing the BitLocker recovery screen.

Kaspersky

Decrypting drives without the attacker-supplied key is difficult and likely impossible in many cases. While it is possible to recover some of the passphrases and fixed values used to generate the keys, the script uses variable values that are different on each infected device. These variable values aren’t easy to recover.

There are no protections specific to ShrinkLocker for preventing successful attacks. Kaspersky advises the following:

  • Use robust, properly configured endpoint protection to detect threats that try to abuse BitLocker;
  • Implement Managed Detection and Response (MDR) to proactively scan for threats;
  • If BitLocker is enabled, make sure it uses a strong password and that the recovery keys are stored in a secure location;
  • Ensure that users have only minimal privileges. This prevents them from enabling encryption features or changing registry keys on their own;
  • Enable network traffic logging and monitoring. Configure the logging of both GET and POST requests. In case of infection, the requests made to the attacker’s domain may contain passwords or keys;
  • Monitor for events associated with VBS execution and PowerShell, then save the logged scripts and commands to an external repository storing activity that may be deleted locally;
  • Make backups frequently, store them offline, and test them.

Friday’s report also includes indicators that organizations can use to determine if they have been targeted by ShrinkLocker.

Listing image by Getty Images



Google’s “AI Overview” can give false, misleading, and dangerous answers

This is fine.

Getty Images

If you use Google regularly, you may have noticed the company’s new AI Overviews providing summarized answers to some of your questions in recent days. If you use social media regularly, you may have come across many examples of those AI Overviews being hilariously or even dangerously wrong.

Factual errors can pop up in existing LLM chatbots as well, of course. But the potential damage that can be caused by AI inaccuracy gets multiplied when those errors appear atop the ultra-valuable web real estate of the Google search results page.

“The examples we’ve seen are generally very uncommon queries and aren’t representative of most people’s experiences,” a Google spokesperson told Ars. “The vast majority of AI Overviews provide high quality information, with links to dig deeper on the web.”

After looking through dozens of examples of Google AI Overview mistakes (and replicating many ourselves for the galleries below), we’ve noticed a few broad categories of errors that seemed to show up again and again. Consider this a crash course in some of the current weak points of Google’s AI Overviews and a look at areas of concern for the company to improve as the system continues to roll out.

Treating jokes as facts

  • The bit about using glue on pizza can be traced back to an 11-year-old troll post on Reddit. (via)

    Kyle Orland / Google

  • This wasn’t funny when the guys at Pep Boys said it, either. (via)

    Kyle Orland / Google

  • Weird Al recommends “running with scissors” as well! (via)

    Kyle Orland / Google

Some of the funniest examples of Google’s AI Overview failing come, ironically enough, when the system doesn’t realize a source online was trying to be funny. An AI answer that suggested using “1/8 cup of non-toxic glue” to stop cheese from sliding off pizza can be traced back to someone who was obviously trying to troll an ongoing thread. A response recommending “blinker fluid” for a turn signal that doesn’t make noise can similarly be traced back to a troll on the Good Sam advice forums, which Google’s AI Overview apparently trusts as a reliable source.

In regular Google searches, these jokey posts from random Internet users probably wouldn’t be among the first answers someone saw when clicking through a list of web links. But with AI Overviews, those trolls were integrated into the authoritative-sounding data summary presented right at the top of the results page.

What’s more, there’s nothing in the tiny “source link” boxes below Google’s AI summary to suggest either of these forum trolls are anything other than good sources of information. Sometimes, though, glancing at the source can save you some grief, such as when you see a response calling running with scissors “cardio exercise that some say is effective” (that came from a 2022 post from Little Old Lady Comedy).

Bad sourcing

  • Washington University in St. Louis says this ratio is accurate, but others disagree. (via)

    Kyle Orland / Google

  • Man, we wish this fantasy remake was real. (via)

    Kyle Orland / Google

Sometimes Google’s AI Overview offers an accurate summary of a non-joke source that happens to be wrong. When asking about how many Declaration of Independence signers owned slaves, for instance, Google’s AI Overview accurately summarizes a Washington University in St. Louis library page saying that one-third “were personally enslavers.” But the response ignores contradictory sources like a Chicago Sun-Times article saying the real answer is closer to three-quarters. I’m not enough of a history expert to judge which authoritative-seeming source is right, but at least one historian online took issue with the Google AI’s answer sourcing.

Other times, a source that Google trusts as authoritative is really just fan fiction. That’s the case for a response that imagined a 2022 remake of 2001: A Space Odyssey, directed by Steven Spielberg and produced by George Lucas. A savvy web user would probably do a double-take before citing Fandom’s “Idea Wiki” as a reliable source, but a careless AI Overview user might not notice where the AI got its information.



Crooks plant backdoor in software used by courtrooms around the world

DISORDER IN THE COURT —

It’s unclear how the malicious version of JAVS Viewer came to be.


JAVS

A software maker serving more than 10,000 courtrooms throughout the world hosted an application update containing a hidden backdoor that maintained persistent communication with a malicious website, researchers reported Thursday, in the latest episode of a supply-chain attack.

The software, known as the JAVS Viewer 8, is a component of the JAVS Suite 8, an application package courtrooms use to record, play back, and manage audio and video from proceedings. Its maker, Louisville, Kentucky-based Justice AV Solutions, says its products are used in more than 10,000 courtrooms throughout the US and 11 other countries. The company has been in business for 35 years.

JAVS Viewer users at high risk

Researchers from security firm Rapid7 reported that a version of the JAVS Viewer 8 available for download on javs.com contained a backdoor that gave an unknown threat actor persistent access to infected devices. The malicious download, planted inside an executable file that installs the JAVS Viewer version 8.3.7, was available no later than April 1, when a post on X (formerly Twitter) reported it. It’s unclear when the backdoored version was removed from the company’s download page. JAVS representatives didn’t immediately respond to questions sent by email.

“Users who have version 8.3.7 of the JAVS Viewer executable installed are at high risk and should take immediate action,” Rapid7 researchers Ipek Solak, Thomas Elkins, Evan McCann, Matthew Smith, Jake McMahon, Tyler McGraw, Ryan Emmons, Stephen Fewer, and John Fenninger wrote. “This version contains a backdoored installer that allows attackers to gain full control of affected systems.”

The installer file was titled JAVS Viewer Setup 8.3.7.250-1.exe. When executed, it copied the binary file fffmpeg.exe to the file path C:\Program Files (x86)\JAVS\Viewer 8. To bypass security warnings, the installer was digitally signed, but with a signature issued to an entity called “Vanguard Tech Limited” rather than to “Justice AV Solutions Inc.,” the signing entity used to authenticate legitimate JAVS software.

fffmpeg.exe, in turn, used Windows Sockets and WinHTTP to establish communications with a command-and-control server. Once successfully connected, fffmpeg.exe sent the server passwords harvested from browsers and data about the compromised host, including hostname, operating system details, processor architecture, program working directory, and the user name.

The researchers said fffmpeg.exe also downloaded the file chrome_installer.exe from the IP address 45.120.177.178. chrome_installer.exe went on to execute a binary and several Python scripts that were responsible for stealing the passwords saved in browsers. fffmpeg.exe is associated with a known malware family called GateDoor/Rustdoor. The exe file was already flagged by 30 endpoint protection engines.

A screenshot from VirusTotal showing detections from 30 endpoint protection engines.

Rapid7

The number of detections had grown to 38 at the time this post went live.

The researchers warned that the process of disinfecting infected devices will require care. They wrote:

To remediate this issue, affected users should:

  • Reimage any endpoints where JAVS Viewer 8.3.7 was installed. Simply uninstalling the software is insufficient, as attackers may have implanted additional backdoors or malware. Re-imaging provides a clean slate.
  • Reset credentials for any accounts that were logged into affected endpoints. This includes local accounts on the endpoint itself as well as any remote accounts accessed during the period when JAVS Viewer 8.3.7 was installed. Attackers may have stolen credentials from compromised systems.
  • Reset credentials used in web browsers on affected endpoints. Browser sessions may have been hijacked to steal cookies, stored passwords, or other sensitive information.
  • Install the latest version of JAVS Viewer (8.3.8 or higher) after re-imaging affected systems. The new version does not contain the backdoor present in 8.3.7.

Completely re-imaging affected endpoints and resetting associated credentials is critical to ensure attackers have not persisted through backdoors or stolen credentials. All organizations running JAVS Viewer 8.3.7 should take these steps immediately to address the compromise.

The Rapid7 post included a statement from JAVS that confirmed that the installer for version 8.3.7 of the JAVS viewer was malicious.

“We pulled all versions of Viewer 8.3.7 from the JAVS website, reset all passwords, and conducted a full internal audit of all JAVS systems,” the statement read. “We confirmed all currently available files on the JAVS.com website are genuine and malware-free. We further verified that no JAVS Source code, certificates, systems, or other software releases were compromised in this incident.”

The statement didn’t explain how the installer became available for download on its site. It also didn’t say if the company retained an outside firm to investigate.

The incident is the latest example of a supply-chain attack, a technique that tampers with a legitimate service or piece of software with the aim of infecting all downstream users. These sorts of attacks are usually carried out by first hacking the provider of the service or software. There’s no sure way to prevent falling victim to supply-chain attacks, but one potentially useful measure is to vet a file using VirusTotal before executing it. That advice would have served JAVS users well.
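
As a rough illustration of that advice, a file can be checked against VirusTotal before it is run by looking up its SHA-256 hash via the service’s v3 REST API. The sketch below assumes libcurl is installed and that YOUR_API_KEY is replaced with a real key; it simply prints the JSON verdict, which includes per-engine detection counts:

    /* Build with: cc vt_lookup.c -lcurl */
    #include <stdio.h>
    #include <curl/curl.h>

    int main(int argc, char **argv)
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s <sha256-of-file>\n", argv[0]);
            return 1;
        }

        char url[512];
        snprintf(url, sizeof url,
                 "https://www.virustotal.com/api/v3/files/%s", argv[1]);

        curl_global_init(CURL_GLOBAL_DEFAULT);
        CURL *curl = curl_easy_init();
        if (!curl)
            return 1;

        /* VirusTotal's v3 API authenticates with an x-apikey header. */
        struct curl_slist *headers =
            curl_slist_append(NULL, "x-apikey: YOUR_API_KEY");

        curl_easy_setopt(curl, CURLOPT_URL, url);
        curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);

        /* libcurl writes the response body to stdout by default; the JSON
         * includes last_analysis_stats, i.e. how many engines flag the file. */
        CURLcode res = curl_easy_perform(curl);
        if (res != CURLE_OK)
            fprintf(stderr, "request failed: %s\n", curl_easy_strerror(res));

        curl_slist_free_all(headers);
        curl_easy_cleanup(curl);
        curl_global_cleanup();
        return res == CURLE_OK ? 0 : 1;
    }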



Bing outage shows just how little competition Google search really has

Searching for new search —

Opinion: Actively searching without Google or Bing is harder than it looks.

Google logo on a phone in front of a Bing logo in the background

Getty Images

Bing, Microsoft’s search engine platform, went down early this morning. That meant that searches from Microsoft’s Edge browser for users who hadn’t yet changed their default provider didn’t work. It also meant that services relying on Bing’s search API—Microsoft’s own Copilot, ChatGPT search, Yahoo, Ecosia, and DuckDuckGo—similarly failed.

Services were largely restored by the morning Eastern work hours, but the timing feels apt, concerning, or some combination of the two. Google, the consistently dominating search platform, just last week announced and debuted AI Overviews as a default addition to all searches. If you don’t want an AI response but still want to use Google, you can hunt down the new “Web” option in a menu, or you can, per Ernie Smith, tack “&udm=14” onto your search or use Smith’s own “Konami code” shortcut page.

If AI’s hallucinations, power draw, or pizza recipes concern you—along with perhaps broader Google issues involving privacy, tracking, news, SEO, or monopoly power—consider that most of your other major options were brought down by a single API outage this morning. Moving past that kind of single point of vulnerability will take some work, both by the industry and by you, the person wondering if there’s a real alternative.

Search engine market share, as measured by StatCounter, April 2023–April 2024.

StatCounter

Upward of a billion dollars a year

The overwhelming majority of search tools offering an “alternative” to Google are using Google, Bing, or Yandex, the three major search engines that maintain massive global indexes. Yandex, being based in Russia, is a non-starter for many people around the world at the moment. Bing offers its services widely, most notably to DuckDuckGo, but its ad-based revenue model and privacy particulars have caused some friction there in the past. Before his company was able to block more of Microsoft’s own tracking scripts, DuckDuckGo CEO and founder Gabriel Weinberg explained in a Reddit reply why firms like his weren’t going the full DIY route:

… [W]e source most of our traditional links and images privately from Bing … Really only two companies (Google and Microsoft) have a high-quality global web link index (because I believe it costs upwards of a billion dollars a year to do), and so literally every other global search engine needs to bootstrap with one or both of them to provide a mainstream search product. The same is true for maps btw — only the biggest companies can similarly afford to put satellites up and send ground cars to take streetview pictures of every neighborhood.

Bing makes Microsoft money, if not quite profit yet. It’s in Microsoft’s interest to keep its search index stocked and API open, even if its focus is almost entirely on its own AI chatbot version of Bing. Yet if Microsoft decided to pull API access, or it became unreliable, Google’s default position gets even stronger. What would non-conformists have to choose from then?



A root-server at the Internet’s core lost touch with its peers. We still don’t know why.


For more than four days, a server at the very core of the Internet’s domain name system was out of sync with its 12 root server peers due to an unexplained glitch that could have caused stability and security problems worldwide. This server, maintained by Internet carrier Cogent Communications, is one of the 13 root servers that provision the Internet’s root zone, which sits at the top of the hierarchical distributed database known as the domain name system, or DNS.

Here’s a simplified recap of the way the domain name system works and how root servers fit in:

When someone enters wikipedia.org in their browser, the servers handling the request first must translate the human-friendly domain name into an IP address. This is where the domain name system comes in. The first step in the DNS process is for the browser to query a stub resolver built into the local operating system. The stub resolver forwards the query to a recursive resolver, which may be provided by the user’s ISP or by a service such as 1.1.1.1 or 8.8.8.8 from Cloudflare and Google, respectively.
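
In code, that first hop is what an application triggers when it asks the operating system to resolve a name. The minimal C sketch below calls getaddrinfo(), which hands the lookup to the stub resolver; everything downstream (the recursive resolver, the root servers, the .org servers) happens out of the program’s sight:

    #include <stdio.h>
    #include <string.h>
    #include <netdb.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    int main(void)
    {
        struct addrinfo hints, *res;
        memset(&hints, 0, sizeof hints);
        hints.ai_family = AF_INET;          /* IPv4 only, for brevity */
        hints.ai_socktype = SOCK_STREAM;

        /* getaddrinfo() passes the name to the OS stub resolver, which
         * forwards it to the configured recursive resolver; the recursion
         * through the root, .org, and wikipedia.org name servers is
         * invisible to this program. */
        int err = getaddrinfo("wikipedia.org", "https", &hints, &res);
        if (err != 0) {
            fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(err));
            return 1;
        }

        char ip[INET_ADDRSTRLEN];
        struct sockaddr_in *addr = (struct sockaddr_in *)res->ai_addr;
        inet_ntop(AF_INET, &addr->sin_addr, ip, sizeof ip);
        printf("wikipedia.org -> %s\n", ip);

        freeaddrinfo(res);
        return 0;
    }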

If it needs to, the recursive resolver contacts the c-root server or one of its 12 peers to determine the authoritative name server for the .org top level domain. The .org name server then refers the request to the Wikipedia name server, which then returns the IP address. In the following diagram, the recursive server is labeled “iterator.”

Given the crucial role a root server provides in ensuring one device can find any other device on the Internet, there are 13 of them geographically dispersed all over the world. Each root server is, in fact, a cluster of servers that are also geographically dispersed, providing even more redundancy. Normally, the 13 root servers—each operated by a different entity—march in lockstep. When a change is made to the contents they host, it generally occurs on all of them within a few seconds or minutes at most.

Strange events at the C-root name server

This tight synchronization is crucial for ensuring stability. If one root server directs traffic lookups to one intermediate server and another root server sends lookups to a different intermediate server, the Internet as we know it could collapse. More important still, root servers store the cryptographic keys necessary to authenticate some of the intermediate servers under a mechanism known as DNSSEC. If keys aren’t identical across all 13 root servers, there’s an increased risk of attacks such as DNS cache poisoning.

For reasons that remain unclear outside of Cogent—which declined to comment for this post—all 12 instances of the c-root it’s responsible for maintaining suddenly stopped updating on Saturday. Stéphane Bortzmeyer, a French engineer who was among the first to flag the problem in a Tuesday post, noted then that the c-root was three days behind the rest of the root servers.

A mismatch in what’s known as the zone serials shows root-c is three days behind.

The lag was further noted on Mastodon.

By mid-day Wednesday, the lag was shortened to about one day.

By late Wednesday, the c-root was finally up to date.
