Conde Nast

Ars Technica content is now available in OpenAI services

Adventures in capitalism —

Condé Nast joins other publishers in allowing OpenAI to access its content.

The OpenAI and Conde Nast logos on a gradient background.

Ars Technica

On Tuesday, OpenAI announced a partnership with Ars Technica parent company Condé Nast to display content from prominent publications within its AI products, including ChatGPT and a new SearchGPT prototype. It also allows OpenAI to use Condé content to train future AI language models. The deal covers well-known Condé brands such as Vogue, The New Yorker, GQ, Wired, Ars Technica, and others. Financial details were not disclosed.

One immediate effect of the deal will be that users of ChatGPT or SearchGPT will now be able to see information from Condé Nast publications pulled from those assistants’ live views of the web. For example, a user could ask ChatGPT, “What’s the latest Ars Technica article about Space?” and ChatGPT can browse the web and pull up the result, attribute it, and summarize it for users while also linking to the site.

In the longer term, the deal also means that OpenAI can openly and officially utilize Condé Nast articles to train future AI language models, which includes successors to GPT-4o. In this case, “training” means feeding content into an AI model’s neural network so the AI model can better process conceptual relationships.

AI training is an expensive and computationally intense process that happens rarely, usually prior to the launch of a major new AI model, although a secondary process called “fine-tuning” can continue over time. Having access to high-quality training data, such as vetted journalism, improves AI language models’ ability to provide accurate answers to user questions.

It’s worth noting that Condé Nast internal policy still forbids its publications from using text created by generative AI, which is consistent with its AI rules before the deal.

Not waiting on fair use

With the deal, Condé Nast joins a growing list of publishers partnering with OpenAI, including Associated Press, Axel Springer, The Atlantic, and others. Some publications, such as The New York Times, have chosen to sue OpenAI over content use, and there’s reason to think they could win.

In an internal email to Condé Nast staff, CEO Roger Lynch framed the multi-year partnership as a strategic move to expand the reach of the company’s content, adapt to changing audience behaviors, and ensure proper compensation and attribution for using the company’s IP. “This partnership recognizes that the exceptional content produced by Condé Nast and our many titles cannot be replaced,” Lynch wrote in the email, “and is a step toward making sure our technology-enabled future is one that is created responsibly.”

The move also brings additional revenue to Condé Nast, Lynch added, at a time when “many technology companies eroded publishers’ ability to monetize content, most recently with traditional search.” The deal will allow Condé to “continue to protect and invest in our journalism and creative endeavors,” Lynch wrote.

OpenAI COO Brad Lightcap said in a statement, “We’re committed to working with Condé Nast and other news publishers to ensure that as AI plays a larger role in news discovery and delivery, it maintains accuracy, integrity, and respect for quality reporting.”

AI search engine accused of plagiarism announces publisher revenue-sharing plan

Beg, borrow, or license —

Perplexity says WordPress.com, TIME, Der Spiegel, and Fortune have already signed up.

Robot caught in a flashlight vector illustration

On Tuesday, AI-powered search engine Perplexity unveiled a new revenue-sharing program for publishers, marking a significant shift in its approach to third-party content use, reports CNBC. The move comes after plagiarism allegations from major media outlets, including Forbes, Wired, and Ars parent company Condé Nast. Perplexity, valued at over $1 billion, aims to compete with search giant Google.

“To further support the vital work of media organizations and online creators, we need to ensure publishers can thrive as Perplexity grows,” writes the company in a blog post announcing the program. “That’s why we’re excited to announce the Perplexity Publishers Program and our first batch of partners: TIME, Der Spiegel, Fortune, Entrepreneur, The Texas Tribune, and WordPress.com.”

Under the program, Perplexity will share a percentage of ad revenue with publishers when their content is cited in AI-generated answers. The revenue share applies on a per-article basis and potentially multiplies if multiple articles from a single publisher are used in one response. Some content providers, such as WordPress.com, plan to pass some of that revenue on to content creators.

A press release from WordPress.com states that joining Perplexity’s Publishers Program allows WordPress.com content to appear in Perplexity’s “Keep Exploring” section on their Discover pages. “That means your articles will be included in their search index and your articles can be surfaced as an answer on their answer engine and Discover feed,” the blog company writes. “If your website is referenced in a Perplexity search result where the company earns advertising revenue, you’ll be eligible for revenue share.”

A screenshot of the Perplexity.ai website taken on July 30, 2024.

Benj Edwards

Dmitry Shevelenko, Perplexity’s chief business officer, told CNBC that the company began discussions with publishers in January, with program details solidified in early 2024. He reported strong initial interest, with over a dozen publishers reaching out within hours of the announcement.

As part of the program, publishers will also receive access to Perplexity APIs that can be used to create custom “answer engines,” along with “Enterprise Pro” accounts that provide “enhanced data privacy and security capabilities” for all employees of publishers in the program for one year.

Accusations of plagiarism

The revenue-sharing announcement follows a rocky month for the AI startup. In mid-June, Forbes reported finding its content within Perplexity’s Pages tool with minimal attribution. Pages allows Perplexity users to curate content and share it with others. Ars Technica sister publication Wired later made similar claims, also noting suspicious traffic patterns from IP addresses likely linked to Perplexity that were ignoring robots.txt exclusions. Perplexity was also found to be manipulating its crawling bots’ ID string to get around website blocks.
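
For readers unfamiliar with robots.txt, the following is a minimal sketch of how a well-behaved crawler might check a site's exclusion rules before fetching a page, using Python's standard urllib.robotparser module. The bot name ExampleAIBot, the rules, and the URL are hypothetical and chosen only for illustration; they do not describe Perplexity's crawler or any real site's configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only; not any real site's file.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/article"
    # The named bot is excluded from the whole site.
    print(may_fetch("ExampleAIBot", url))
    # Any other user-agent string falls through to the wildcard rule.
    print(may_fetch("Mozilla/5.0", url))
```

Because the check depends entirely on the user-agent string the crawler chooses to present, and compliance is voluntary, a crawler that ignores the file or identifies itself under a different name would not be stopped by rules like these.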

As part of company policy, Ars Technica parent Condé Nast disallows AI-based content scrapers, and its CEO Roger Lynch testified in the US Senate earlier this year that generative AI has been built with “stolen goods.” Condé sent a cease-and-desist letter to Perplexity earlier this month.

But publisher trouble might not be Perplexity’s only problem. In some tests of the search we performed in February, Perplexity badly confabulated certain answers, even when citations were readily available. Since our initial tests, the accuracy of Perplexity’s results seems to have improved, but providing inaccurate answers (which also plagued Google’s AI Overviews search feature) is still a potential issue.

Compared to the free tier of service, Perplexity users who pay $20 per month can access more capable LLMs such as GPT-4o and Claude 3, so the quality and accuracy of the output can vary dramatically depending on whether a user subscribes or not. The addition of citations to every Perplexity answer allows users to check accuracy—if they take the time to do it.

The move by Perplexity occurs against a backdrop of tensions between AI companies and content creators. Some media outlets, such as The New York Times, have filed lawsuits against AI vendors like OpenAI and Microsoft, alleging copyright infringement in the training of large language models. OpenAI has struck media licensing deals with many publishers as a way to secure access to high-quality training data and avoid future lawsuits.

In this case, Perplexity is not using the licensed articles and content to train AI models but is seeking legal permission to reproduce content from publishers on its website.

At Senate AI hearing, news executives fight against “fair use” claims for AI training data

All’s fair in love and AI —

Media orgs want AI firms to license content for training, and Congress is sympathetic.

Danielle Coffey, president and CEO of News Media Alliance; Professor Jeff Jarvis, CUNY Graduate School of Journalism; Curtis LeGeyt, president and CEO of National Association of Broadcasters; and Roger Lynch, CEO of Condé Nast, are sworn in during a Senate Judiciary Subcommittee on Privacy, Technology, and the Law hearing on “Artificial Intelligence and The Future Of Journalism” at the US Capitol on January 10, 2024, in Washington, DC. Lawmakers continue to hear testimony from experts and business leaders about artificial intelligence and its impact on democracy, elections, privacy, liability, and news. (Photo by Kent Nishimura/Getty Images)

Getty Images

On Wednesday, news industry executives urged Congress to clarify in law that using journalism to train AI assistants like ChatGPT is not fair use, contrary to what companies such as OpenAI claim. Instead, they would prefer a licensing regime for AI training content that would force Big Tech companies to pay for content, similar to the rights clearinghouses used for music.

The plea for action came during a US Senate Judiciary Committee hearing titled “Oversight of A.I.: The Future of Journalism,” chaired by Sen. Richard Blumenthal of Connecticut, with Sen. Josh Hawley of Missouri also playing a large role in the proceedings. Last year, the pair of senators introduced a bipartisan framework for AI legislation and held a series of hearings on the impact of AI.

Blumenthal described the situation as an “existential crisis” for the news industry and cited social media as a cautionary tale for legislative inaction about AI. “We need to move more quickly than we did on social media and learn from our mistakes in the delay there,” he said.

Companies like OpenAI have admitted that vast amounts of copyrighted material are necessary to train AI large language models, but they claim their use is transformational and covered under fair use precedents of US copyright law. Currently, OpenAI is negotiating licensing content from some news providers and striking deals, but the executives in the hearing said those efforts are not enough, highlighting closing newsrooms across the US and dropping media revenues while Big Tech’s profits soar.

“Gen AI cannot replace journalism,” said Condé Nast CEO Roger Lynch in his opening statement. (Condé Nast is the parent company of Ars Technica.) “Journalism is fundamentally a human pursuit, and it plays an essential and irreplaceable role in our society and our democracy.” Lynch said that generative AI has been built with “stolen goods,” referring to the use of AI training content from news outlets without authorization. “Gen AI companies copy and display our content without permission or compensation in order to build massive commercial businesses that directly compete with us.”

Roger Lynch, CEO of Condé Nast, testifies before the Senate Judiciary Subcommittee on Privacy, Technology, and the Law during a hearing on “Artificial Intelligence and The Future Of Journalism.”

Getty Images

In addition to Lynch, the hearing featured three other witnesses: Jeff Jarvis, a veteran journalism professor and pundit; Danielle Coffey, the president and CEO of News Media Alliance; and Curtis LeGeyt, president and CEO of the National Association of Broadcasters.

Coffey also shared concerns about generative AI using news material to create competitive products. “These outputs compete in the same market, with the same audience, and serve the same purpose as the original articles that feed the algorithms in the first place,” she said.

When Sen. Hawley asked Lynch what kind of legislation might be needed to fix the problem, Lynch replied, “I think quite simply, if Congress could clarify that the use of our content and other publisher content for training and output of AI models is not fair use, then the free market will take care of the rest.”

Lynch used the music industry as a model: “You think about millions of artists, millions of ultimate consumers consuming that content, there have been models that have been set up, ASCAP, BMI, SESAC, GMR, these collective rights organizations to simplify the content that’s being used.”

Curtis LeGeyt, CEO of the National Association of Broadcasters, said that TV broadcast journalists are also affected by generative AI. “The use of broadcasters’ news content in AI models without authorization diminishes our audience’s trust and our reinvestment in local news,” he said. “Broadcasters have already seen numerous examples where content created by our journalists has been ingested and regurgitated by AI bots with little or no attribution.”
