training data

OpenAI desperate to avoid explaining why it deleted pirated book datasets


Not for OpenAI to reason why?

OpenAI risks increased fines after deleting pirated books datasets.

OpenAI may soon be forced to explain why it deleted a pair of controversial datasets composed of pirated books, and the stakes could not be higher.

At the heart of a class-action lawsuit from authors alleging that ChatGPT was illegally trained on their works, OpenAI’s decision to delete the datasets could end up being a deciding factor that gives the authors the win.

It’s undisputed that OpenAI deleted the datasets, known as “Books 1” and “Books 2,” prior to ChatGPT’s release in 2022. Created by former OpenAI employees in 2021, the datasets were built by scraping the open web, with the bulk of the data taken from a shadow library called Library Genesis (LibGen).

As OpenAI tells it, the datasets fell out of use within that same year, prompting an internal decision to delete them.

But the authors suspect there’s more to the story than that. They noted that OpenAI appeared to flip-flop by retracting its claim that the datasets’ “non-use” was a reason for deletion, then later claiming that all reasons for deletion, including “non-use,” should be shielded under attorney-client privilege.

To the authors, it seemed like OpenAI was quickly backtracking after the court granted the authors’ discovery requests to review OpenAI’s internal messages on the firm’s “non-use.”

In fact, OpenAI’s reversal only made authors more eager to see how OpenAI discussed “non-use,” and now they may get to find out all the reasons why OpenAI deleted the datasets.

Last week, US magistrate judge Ona Wang ordered OpenAI to share all communications with in-house lawyers about deleting the datasets, as well as “all internal references to LibGen that OpenAI has redacted or withheld on the basis of attorney-client privilege.”

According to Wang, OpenAI slipped up by arguing that “non-use” was not a “reason” for deleting the datasets while simultaneously claiming that it should be treated as a privileged “reason.”

Either way, the judge ruled that OpenAI couldn’t block discovery on “non-use” just by deleting a few words from prior filings that had been on the docket for more than a year.

“OpenAI has gone back-and-forth on whether ‘non-use’ as a ‘reason’ for the deletion of Books1 and Books2 is privileged at all,” Wang wrote. “OpenAI cannot state a ‘reason’ (which implies it is not privileged) and then later assert that the ‘reason’ is privileged to avoid discovery.”

Additionally, OpenAI’s claim that all reasons for deleting the datasets are privileged “strains credulity,” she concluded, ordering OpenAI to produce a wide range of potentially revealing internal messages by December 8. OpenAI must also make its in-house lawyers available for deposition by December 19.

OpenAI has argued that it never flip-flopped or retracted anything. It simply used vague phrasing that led to confusion over whether any of the reasons for deleting the datasets were considered non-privileged. But Wang didn’t buy into that, concluding that “even if a ‘reason’ like ‘non-use’ could be privileged, OpenAI has waived privilege by making a moving target of its privilege assertions.”

Asked for comment, OpenAI told Ars that “we disagree with the ruling and intend to appeal.”

OpenAI’s “flip-flop” may cost it the win

So far, OpenAI has avoided disclosing its rationale, claiming that all the reasons it had for deleting the datasets are privileged. In-house lawyers weighed in on the decision to delete and were even copied on a Slack channel initially called “excise-libgen.”

But Wang reviewed those Slack messages and found that “the vast majority of these communications were not privileged because they were ‘plainly devoid of any request for legal advice and counsel [did] not once weigh in.’”

In a particularly non-privileged batch of messages, one OpenAI lawyer, Jason Kwon, only weighed in once, the judge noted, to recommend the channel name be changed to “project-clear.” Wang reminded OpenAI that “the entirety of the Slack channel and all messages contained therein is not privileged simply because it was created at the direction of an attorney and/or the fact that a lawyer was copied on the communications.”

The authors believe that exposing OpenAI’s rationale may help prove that the ChatGPT maker willfully infringed on copyrights when pirating the book data. As Wang explained, OpenAI’s retraction risked putting the AI firm’s “good faith and state of mind at issue,” which could increase fines in a loss.

“In a copyright case, a court can increase the award of statutory damages up to $150,000 per infringed work if the infringement was willful, meaning the defendant ‘was actually aware of the infringing activity’ or the ‘defendant’s actions were the result of reckless disregard for, or willful blindness to, the copyright holder’s rights,’” Wang wrote.

In a court transcript, a lawyer representing some of the authors suing OpenAI, Christopher Young, noted that OpenAI could be in trouble if evidence showed that it decided against using the datasets for later models due to legal risks. He also suggested that OpenAI could be using the datasets under different names to mask further infringement.

Judge calls out OpenAI for twisting fair use ruling

Wang also found it contradictory that OpenAI continued to argue in a recent filing that it acted in good faith, while “artfully” removing “its good faith affirmative defense and key words such as ‘innocent,’ ‘reasonably believed,’ and ‘good faith.’” These changes only strengthened discovery requests to explore authors’ willfulness theory, Wang wrote, noting the sought-after internal messages would now be critical for the court’s review.

“A jury is entitled to know the basis for OpenAI’s purported good faith,” Wang wrote.

The judge appeared particularly frustrated by OpenAI seemingly twisting the Anthropic ruling to defend against the authors’ request to learn more about the deletion of the datasets.

In a footnote, Wang called out OpenAI for “bizarrely” citing the Anthropic ruling in a way that “grossly” misrepresented Judge William Alsup’s decision, claiming that he found that “downloading pirated copies of books is lawful as long as they are subsequently used for training an LLM.”

Instead, Alsup wrote that he doubted that “any accused infringer could ever meet its burden of explaining why downloading source copies from pirate sites that it could have purchased or otherwise accessed lawfully was itself reasonably necessary to any subsequent fair use.”

If anything, Wang wrote, OpenAI’s decision to pirate book data—then delete it—seemed “to fall squarely into the category of activities proscribed by” Alsup. For emphasis, she quoted Alsup’s order, which said, “such piracy of otherwise available copies is inherently, irredeemably infringing even if the pirated copies are immediately used for the transformative use and immediately discarded.”

For the authors, getting hold of OpenAI’s privileged communications could tip the scales in their favor, the Hollywood Reporter suggested. Some authors believe the key to winning could be testimony from Anthropic CEO Dario Amodei, who is accused of creating the controversial datasets while he was still at OpenAI. The authors think Amodei also possesses information on the destruction of the datasets, court filings show.

OpenAI tried to fight the authors’ motion to depose Amodei, but a judge sided with the authors in March, compelling Amodei to answer their biggest questions on his involvement.

Whether Amodei’s testimony is a bombshell remains to be seen, but it’s clear that OpenAI may struggle to overcome claims of willful infringement. Wang noted there is a “fundamental conflict” in circumstances “where a party asserts a good faith defense based on advice of counsel but then blocks inquiry into their state of mind by asserting attorney-client privilege,” suggesting that OpenAI may have substantially weakened its defense.

The outcome of the dispute over the deletions could influence OpenAI’s calculus on whether it should ultimately settle the lawsuit. Ahead of the Anthropic settlement—the largest publicly reported copyright class action settlement in history—authors suing pointed to evidence that Anthropic became “not so gung ho about” training on pirated books “for legal reasons.” That seems to be the type of smoking-gun evidence that authors hope will emerge from OpenAI’s withheld Slack messages.

Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.

AI models can acquire backdoors from surprisingly few malicious documents

Fine-tuning experiments with 100,000 clean samples versus 1,000 clean samples showed similar attack success rates when the number of malicious examples stayed constant. For GPT-3.5-turbo, between 50 and 90 malicious samples achieved over 80 percent attack success across dataset sizes spanning two orders of magnitude.
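To put those fine-tuning figures in perspective, the short Python sketch below works out what share of each dataset the malicious samples would represent. It is only an illustration: the helper name `poison_fraction` is hypothetical, and it assumes the malicious samples are added on top of the clean ones, which the article does not specify.

```python
def poison_fraction(n_poison: int, n_clean: int) -> float:
    """Share of the combined dataset made up of malicious samples.

    Assumption (not stated in the article): malicious samples are
    added on top of the clean samples rather than replacing them.
    """
    return n_poison / (n_clean + n_poison)


# Figures quoted above: 1,000 vs. 100,000 clean samples,
# and 50 to 90 malicious samples.
CLEAN_SIZES = [1_000, 100_000]
POISON_COUNTS = [50, 90]

for n_clean in CLEAN_SIZES:
    for n_poison in POISON_COUNTS:
        share = poison_fraction(n_poison, n_clean)
        print(f"{n_poison} malicious + {n_clean:,} clean -> {share:.3%} of the data")
```

Under that assumption, the same 50 to 90 malicious samples make up roughly 5 to 8 percent of the smaller dataset but less than 0.1 percent of the larger one, which is why a similar attack success rate at both sizes points to the absolute count, not the percentage, as the relevant quantity.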

Limitations

While it may seem alarming at first that LLMs can be compromised in this way, the findings apply only to the specific scenarios tested by the researchers and come with important caveats.

“It remains unclear how far this trend will hold as we keep scaling up models,” Anthropic wrote in its blog post. “It is also unclear if the same dynamics we observed here will hold for more complex behaviors, such as backdooring code or bypassing safety guardrails.”

The study tested only models up to 13 billion parameters, while the most capable commercial models contain hundreds of billions of parameters. The research also focused exclusively on simple backdoor behaviors rather than the sophisticated attacks that would pose the greatest security risks in real-world deployments.

Also, the backdoors can be largely fixed by the safety training companies already do. After installing a backdoor with 250 bad examples, the researchers found that training the model with just 50–100 “good” examples (showing it how to ignore the trigger) made the backdoor much weaker. With 2,000 good examples, the backdoor basically disappeared. Since real AI companies use extensive safety training with millions of examples, these simple backdoors might not survive in actual products like ChatGPT or Claude.

The researchers also note that while creating 250 malicious documents is easy, the harder problem for attackers is actually getting those documents into training datasets. Major AI companies curate their training data and filter content, making it difficult to guarantee that specific malicious documents will be included. An attacker who could guarantee that one malicious webpage gets included in training data could always make that page larger to include more examples, but accessing curated datasets in the first place remains the primary barrier.

Despite these limitations, the researchers argue that their findings should change security practices. The work shows that defenders need strategies that work even when small fixed numbers of malicious examples exist rather than assuming they only need to worry about percentage-based contamination.

“Our results suggest that injecting backdoors through data poisoning may be easier for large models than previously believed as the number of poisons required does not scale up with model size,” the researchers wrote, “highlighting the need for more research on defences to mitigate this risk in future models.”

Anthropic destroyed millions of print books to build its AI models

But if you’re not intimately familiar with the AI industry and copyright, you might wonder: Why would a company spend millions of dollars on books to destroy them? Behind these odd legal maneuvers lies a more fundamental driver: the AI industry’s insatiable hunger for high-quality text.

The race for high-quality training data

To understand why Anthropic would want to scan millions of books, it’s important to know that AI researchers build large language models (LLMs) like those that power ChatGPT and Claude by feeding billions of words into a neural network. During training, the AI system processes the text repeatedly, building statistical relationships between words and concepts in the process.

The quality of training data fed into the neural network directly impacts the resulting AI model’s capabilities. Models trained on well-edited books and articles tend to produce more coherent, accurate responses than those trained on lower-quality text like random YouTube comments.

Publishers legally control content that AI companies desperately want, but AI companies don’t always want to negotiate a license. The first-sale doctrine offered a legal workaround: Once you buy a physical book, you can do what you want with that copy, including destroy it.

And yet buying things is expensive, even if it is legal. So like many AI companies before it, Anthropic initially chose the quick and easy path. In the quest for high-quality training data, the court filing states, Anthropic first chose to amass digitized versions of pirated books to avoid what CEO Dario Amodei called “legal/practice/business slog”—the complex licensing negotiations with publishers. But by 2024, Anthropic had become “not so gung ho about” using pirated ebooks “for legal reasons” and needed a safer source.
