Reddit sells training data to unnamed AI company ahead of IPO

Everything has a price —

If you’ve posted on Reddit, you’re likely feeding the future of AI.

On Friday, Bloomberg reported that Reddit has signed a contract allowing an unnamed AI company to train its models on the site’s content, according to people familiar with the matter. The move comes as the social media platform nears its initial public offering (IPO), which could happen as soon as next month.

Reddit initially revealed the deal, which is reported to be worth $60 million a year, to potential IPO investors earlier in 2024, Bloomberg said. The Bloomberg source speculates that the contract could serve as a model for future agreements with other AI companies.

After an era in which AI companies used training data without expressly seeking rightsholder permission, some tech firms have more recently begun entering deals in which some of the content used to train AI models similar to GPT-4 (which runs the paid version of ChatGPT) comes under license. In December, for example, OpenAI signed an agreement with German publisher Axel Springer (publisher of Politico and Business Insider) for access to its articles. Previously, OpenAI had struck deals with other organizations, including the Associated Press. Reportedly, OpenAI is also in licensing talks with CNN, Fox, and Time, among others.

In April 2023, Reddit founder and CEO Steve Huffman told The New York Times that the company planned to charge AI companies for access to its almost two decades’ worth of human-generated content.

If the reported $60 million/year deal goes through and you’ve ever posted on Reddit, some of that material may well be used to train the next generation of AI models that create text, still pictures, and video. Even without the deal, experts have previously found that Reddit has been a key source of training data for large language models and AI image generators.

While we don’t know if OpenAI is the company that signed the deal with Reddit, Bloomberg speculates that Reddit’s ability to tap into AI hype for additional revenue may boost the value of its IPO, which might be worth $5 billion. Despite drama last year, Bloomberg states that Reddit pulled in more than $800 million in revenue in 2023, growing about 20 percent over its 2022 numbers.

Advance Publications, which owns Ars Technica parent Condé Nast, is the largest shareholder of Reddit.

Reddit must share IP addresses of piracy-discussing users, film studios say

For the third time in less than a year, film studios with copyright infringement complaints against a cable Internet provider are trying to force Reddit to share information about users who have discussed piracy on the site.

In 2023, film companies lost two attempts to have Reddit unmask its users. In the first instance, US Magistrate Judge Laurel Beeler ruled in the US District Court for the Northern District of California that the First Amendment right to anonymous speech meant Reddit didn’t have to disclose the names, email addresses, and other account registration information for nine Reddit users. Film companies, including Bodyguard Productions and Millennium Media, had subpoenaed Reddit in relation to a copyright infringement lawsuit against Astound Broadband-owned RCN about subscribers allegedly pirating 34 movie titles, including Hellboy (2019), Rambo V: Last Blood, and Tesla.

In the second instance, the same companies sued Astound Broadband-owned ISP Grande, again for alleged copyright infringement occurring over the ISP’s network. The studios subpoenaed Reddit for user account information, including “IP address registration and logs from 1/1/2016 to present, name, email address, and other account registration information” for six Reddit users, per a July 2023 court filing.

In August, a federal court again quashed that subpoena, citing First Amendment rights. In her ruling, Beeler noted that while the First Amendment right to anonymous speech is not absolute, the film producers had already received the names of 118 Grande subscribers. She also said the film producers had failed to prove that “the identifying information is directly or materially relevant or unavailable from another source.”

Third piracy-related subpoena

This week, as reported by TorrentFreak, film companies Voltage Holdings, which was involved in the previous two subpoenas, and Screen Media Ventures, another film studio with litigation against RCN, filed a motion to compel [PDF] Reddit to respond to the subpoena in the US District Court for the Northern District of California. The studios said they’re seeking the information in connection with claims they’ve made that the “ability to pirate content efficiently without any consequences is a draw for becoming a Frontier subscriber” and that Frontier Communications “does not have an effective policy for terminating repeat infringers.” The film studios are claimants against Frontier in its bankruptcy case. The studios are represented by the same lawyers used in the two aforementioned cases.

The studios are asking that the court require Reddit to provide “IP address log information from 1/1/2017 to present” for six anonymous Reddit users who talked about piracy on Reddit. However, the Reddit posts shared in the court filing date back only to 2021.

Reddit responded to the studios’ subpoena with a letter [PDF] on January 2 stating that the subpoena “does not satisfy the First Amendment standard for disclosure of identifying information regarding an anonymous speaker.” Reddit also noted the two previously quashed subpoenas and suggested that it did not have to comply with the new request because the studios could acquire equivalent or better information elsewhere.

As with the previously mentioned litigation against ISPs, Reddit is a non-party. However, since the film companies claimed that Frontier had refused to produce customer identifying information and Reddit declined the requests, the film companies filed their motion to compel.

The studios argue that the information requests do not implicate the First Amendment and that the rulings around the two aforementioned subpoenas are not applicable because the new subpoena is only about IP address logs and not other user-identifying information.

“The Reddit users do not have a recognized privacy interest in their IP addresses,” the motion says.
