

Judge calls out OpenAI’s “straw man” argument in New York Times copyright suit

“Taken as true, these facts give rise to a plausible inference that defendants at a minimum had reason to investigate and uncover end-user infringement,” Stein wrote.

To Stein, the fact that OpenAI maintains an “ongoing relationship” with users by providing outputs that respond to users’ prompts also supports contributory infringement claims, despite OpenAI’s argument that ChatGPT’s “substantial noninfringing uses” are exonerative.

OpenAI defeated some claims

For OpenAI, Stein’s ruling is likely a disappointment, although Stein did drop some of the NYT’s claims.

Likely upsetting to news publishers, the dismissed claims included a “free-riding” claim that ChatGPT unfairly profits off time-sensitive “hot news” items, including the NYT’s Wirecutter posts. Stein explained that news publishers failed to plausibly allege non-attribution (which is key to a free-riding claim) because, for example, ChatGPT cites the NYT when sharing information from Wirecutter posts. Those claims are pre-empted by the Copyright Act anyway, Stein wrote, granting OpenAI’s motion to dismiss.

Stein also dismissed a claim from the NYT regarding alleged removal of copyright management information (CMI), which Stein said cannot be proven simply because ChatGPT reproduces excerpts of NYT articles without CMI.

The Digital Millennium Copyright Act (DMCA) requires news publishers to show that ChatGPT’s outputs are “close to identical” to the original work, Stein said, and allowing publishers’ claims based on excerpts “would risk boundless DMCA liability”—including for any use of block quotes without CMI.

Asked for comment on the ruling, an OpenAI spokesperson declined to go into any specifics, instead repeating OpenAI’s long-held argument that AI training on copyrighted works is fair use. (Last month, OpenAI warned Donald Trump that the US would lose the AI race to China if courts ruled against that argument.)

“ChatGPT helps enhance human creativity, advance scientific discovery and medical research, and enable hundreds of millions of people to improve their daily lives,” OpenAI’s spokesperson said. “Our models empower innovation, and are trained on publicly available data and grounded in fair use.”



OpenAI #12: Battle of the Board Redux

Back when the OpenAI board attempted and failed to fire Sam Altman, we faced a highly hostile information environment. The battle was fought largely through control of the public narrative, and the above was my attempt to put together what happened.

My conclusion, which I still believe, was that Sam Altman had engaged in a variety of unacceptable conduct that merited his firing.

In particular, he had very much ‘not been consistently candid’ with the board on several important occasions. Among other things, he lied to board members about what was said by other board members, with the goal of forcing out a board member he disliked. There were also other instances in which he misled and was otherwise toxic to employees, and he played fast and loose with the investment fund and other outside opportunities.

I concluded that the story that this was about ‘AI safety’ or ‘EA (effective altruism)’ or existential risk concerns, other than as Altman’s motivation to attempt to remove board members, was a false narrative largely spread by Altman’s allies and those who are determined to hate on anyone who is concerned future AI might get out of control or kill everyone, often using EA’s bad press or vibes as a point of leverage to do that.

A few weeks later, I felt that leaks confirmed the bulk of the story I told at that first link, and since then I’ve had anonymous sources confirm my account was centrally true.

Thanks to Keach Hagey at the Wall Street Journal, we now have by far the most well-researched and complete piece on what happened: The Secrets and Misdirection Behind Sam Altman’s Firing From OpenAI. Most, although not all, of the important remaining questions are now definitively answered, and the story I put together has been confirmed.

The key now is to Focus Only On What Matters. What matters going forward are:

  1. Claims of Altman’s toxic and dishonest behaviors that, if true, merited his firing.

  2. That the motivations behind the firing were these ordinary CEO misbehaviors.

  3. Altman’s allies successfully spread a highly false narrative about events.

  4. That OpenAI could easily have moved forward with a different CEO, if things had played out differently and Altman had not threatened to blow up OpenAI.

  5. OpenAI is now effectively controlled by Sam Altman going forward. His claims that ‘the board can fire me’ in practice mean very little.

Also important is what happened afterwards, which was likely caused in large part by the events themselves, by the way they were framed, and by Altman’s consolidated power.

In particular, Sam Altman and OpenAI, whose explicit mission is building AGI and who plan to do so within Trump’s second term, started increasingly talking and acting like AGI was No Big Deal, except for the amazing particular benefits.

Their statements don’t feel the AGI. They no longer tell us our lives will change that much. It is not important, they do not even bother to tell us, to protect against key downside risks of building machines smarter and more capable than humans – such as the risk that those machines effectively take over, or perhaps end up killing everyone.

And if you disagreed with that, or opposed Sam Altman? You were shown the door.

  1. OpenAI was then effectively purged. Most of its strongest alignment researchers left, as did most of those who most prominently wanted to take care to ensure OpenAI’s quest for AGI did not kill everyone or cause humanity to lose control over the future.

  2. Altman’s public statements about AGI, and OpenAI’s policy positions, stopped even mentioning the most important downside risks of AGI and ASI (artificial superintelligence), and shifted towards attempts at regulatory capture and access to government cooperation and funding. Most prominently, their statement on the US AI Action Plan can only be described as disingenuous vice signaling in pursuit of their own private interests.

  3. Those public statements and positions no longer much even ‘feel the AGI.’ Altman has taken to predicting that AGI will happen and your life won’t much change, and treating future AGI as essentially a fungible good. We know, from his prior statements, that Altman knows better. And we know from their current statements that many of the engineers at OpenAI know better. Indeed, in context, they shout it from the rooftops.

  4. We discovered that self-hiding NDAs were aggressively used by OpenAI, under threat of equity confiscation, to control people and the narrative.

  5. With control over the board, Altman is attempting to convert OpenAI into a for-profit company, with sufficiently low compensation that this act could plausibly become the greatest theft in human history.

Beware being distracted by the shiny. In particular:

  1. Don’t be distracted by the article’s ‘cold open’ in which Peter Thiel tells a paranoid and false story to Sam Altman, in which Thiel asserts that ‘EAs’ or ‘safety’ people will attempt to destroy OpenAI, and that they have ‘half the company convinced’ and so on. I don’t doubt the interaction happened, but it was unrelated to the firing.

    1. To the extent it was related, it was because the paranoia of Altman and his allies about such possibilities, inspired by such tall tales, caused Altman to lie to the board in general, and to attempt to force Helen Toner off the board in particular.

  2. Don’t be distracted by the fact that the board botched the firing, and the subsequent events, from a tactical perspective. Yes we can learn from their mistakes, but the board that made those mistakes is gone now.

This is all quite bad, but things could be far worse. OpenAI still has many excellent people working on alignment, security and safety. They have put out a number of strong documents. By that standard, and in terms of how responsibly they have actually handled their releases, OpenAI has outperformed many other industry actors, although it remains less responsible than Anthropic. Companies like DeepSeek, Meta and xAI, and at times Google, work hard to make OpenAI look good on these fronts.

Now, on to what we learned this week.

Hagey’s story paints a clear picture of what actually happened.

It is especially clear about why this happened. The firing wasn’t about EA, ‘the safety people’ or existential risk. What was this about?

Altman repeatedly lied to, misled and mistreated employees of OpenAI. Altman repeatedly lied about and withheld important and material facts, including directly from the board. There was a large litany of complaints.

The big new fact is that the board was counting on Murati’s support. But partly because of this, they felt they couldn’t disclose that their information came largely from Murati. That doesn’t explain why they couldn’t say this to Murati herself.

If the facts asserted in the WSJ article are true, I would say that any responsible board would have voted for Altman’s removal. As OpenAI’s products got more impactful, and the stakes got higher, Altman’s behaviors left no choice.

Claude agreed. This was one-shot; I pasted in the full article and asked:

Zvi: I’ve shared a news article. Based on what is stated in the news article, if the reporting is accurate, how would you characterize the board’s decision to fire Altman? Was it justified? Was it necessary?

Claude 3.7: Based on what’s stated in the article, the board’s decision to fire Sam Altman appears both justified and necessary from their perspective, though clearly poorly executed in terms of preparation and communication.

I agree, on both counts. There are only two choices here, at least one must be true:

  1. The board had a fiduciary duty to fire Altman.

  2. The board members are outright lying about what happened.

That doesn’t excuse the board’s botched execution, especially its failure to disclose information in a timely manner.

The key facts cited here are:

  1. Altman said publicly and repeatedly ‘the board can fire me. That’s important’ but he really called the shots and did everything in his power to ensure this.

  2. Altman did not even inform the board about ChatGPT in advance, at all.

  3. Altman explicitly claimed three enhancements to GPT-4 had been approved by the joint safety board. Helen Toner found only one had been approved.

  4. Altman allowed Microsoft to launch the test of GPT-4 in India, in the form of Sydney, without the approval of the safety board or informing the board of directors of the breach. Due to the results of that experiment entering the training data, deploying Sydney plausibly had permanent effects on all future AIs. This was not a trivial oversight.

  5. Altman did not inform the board that he had taken financial ownership of the OpenAI investment fund, which he claimed was temporary and for tax reasons.

  6. Mira Murati came to the board with a litany of complaints about what she saw as Altman’s toxic management style, including having Brockman, who reported to her, go around her to Altman whenever there was a disagreement. Altman responded by bringing the head of HR to their 1-on-1s until Mira said she wouldn’t share her feedback with the board.

  7. Altman promised both Pachocki and Sutskever that they could set the research direction of the company, losing months of productivity, and this was when Sutskever started looking to replace Altman.

  8. The most egregious lie (Hagey’s term for it) and what I consider on its own sufficient to require Altman be fired: Altman told one board member, Sutskever, that a second board member, McCauley, had said that Toner should leave the board because of an article Toner wrote. McCauley said no such thing. This was an attempt to get Toner removed from the board. If you lie to board members about other board members in an attempt to gain control over the board, I assert that the board should fire you, pretty much no matter what.

  9. Sutskever collected dozens of examples of alleged Altman lies and other toxic behavior, largely backed up by screenshots from Murati’s Slack channel. One lie in particular was that Altman told Murati that the legal department had said GPT-4-Turbo didn’t have to go through joint safety board review. The head lawyer said he did not say that. The decision not to go through the safety board here was not crazy, but lying about the lawyer’s opinion on this is highly unacceptable.

Murati was clearly a key source for many of these firing offenses (and presumably for this article, given its content and timing, although I don’t know anything nonpublic). Despite this, even after Altman was fired, the board didn’t even tell Murati why they had fired him while asking her to become interim CEO, and in general stayed quiet largely (in this post’s narrative) to protect Murati. But then, largely because of the board’s communication failures, Murati turned on the board and the employees backed Altman.

This section reiterates and expands on my warnings above.

The important narrative here is that Altman engaged in various shenanigans and made various unforced errors that together rightfully got him fired. But the board botched the execution, and Altman was willing to burn down OpenAI in response and the board wasn’t. Thus, Altman got power back and did an ideological purge.

The first key distracting narrative, the one I’m seeing many fall into, is to treat this primarily as a story about board incompetence. Look at those losers, who lost, because they were stupid losers in over their heads with no business playing at this level. Many people seem to think the ‘real story’ is that a now defunct group of people were bad at corporate politics and should get mocked.

Yes, that group was bad at corporate politics. We should update on that, and be sure that the next time we have to Do Corporate Politics we don’t act like that, and especially that we explain why we are doing things. But the group that dropped this ball is defunct, whereas Altman is still CEO. And this is not a sporting event.

The board is now irrelevant. Altman isn’t. What matters is the behavior of Altman, and what he did to earn getting fired. Don’t be distracted by the shiny.

A second key narrative spun by Altman’s allies is that Altman is an excellent player of corporate politics. He has certainly pulled off some rather impressive (and some would say nasty) tricks. But the picture painted here is rife with unforced errors. Altman won because the opposition played badly, not because he played so well.

Most importantly, as I noted at the time, the board started out with nine members, five of whom at the time were loyal to Altman even if you don’t count Ilya Sutskever. Altman could easily have used this opportunity to elect new loyal board members. Instead, he allowed three of his allies to leave the board without replacement, leading to the deadlock of control, which then led to the power struggle. Given Altman knows so many well-qualified allies, this seems like a truly epic level of incompetence to me.

The third key narrative, the one Altman’s allies have centrally told since day one and which is entirely false, is that this firing (which they misleadingly call a ‘coup’) was ‘the safety people’ or ‘the EAs’ trying to ‘destroy’ OpenAI.

My worry is that many will see that this false framing is presented early in the post, and not read far enough to realize the post is pointing out that the framing is entirely false. Thus, many or even most readers might get exactly the wrong idea.

In particular, this piece opens with an irrelevant story echoing this false narrative. Peter Thiel is at dinner telling his friend Sam Altman a frankly false and paranoid story about Effective Altruism and Eliezer Yudkowsky.

Thiel says that ‘half the company believes this stuff’ (if only!) and that ‘the EAs’ had ‘taken over’ OpenAI (if only again!), and predicts that ‘the safety people,’ whom Thiel has on various occasions described, literally and at length, as the biblical Antichrist, would ‘destroy’ OpenAI (whereas, instead, the board in the end fell on its sword to prevent Altman and his allies from destroying OpenAI).

And it gets presented in ways like this:

We are told to focus on the nice people eating dinner while other dastardly people held ‘secret video meetings.’ How is this what is important here?

Then if you keep reading, Hagey makes it clear: The board’s firing of Altman had nothing to do with that. And we get on with the actual excellent article.

I don’t doubt Thiel told that to Altman, and I find it likely Thiel even believed it. The thing is, it isn’t true, and it’s rather important that people know it isn’t true.

If you want to read more about what has happened at OpenAI, I have covered this extensively, and my posts contain links to the best primary and other secondary sources I could find. Here are the posts in this sequence.

  1. OpenAI: Facts From a Weekend.

  2. OpenAI: The Battle of the Board.

  3. OpenAI: Altman Returns.

  4. OpenAI: Leaks Confirm the Story.

  5. OpenAI: The Board Expands.

  6. OpenAI: Exodus.

  7. OpenAI: Fallout.

  8. OpenAI: Helen Toner Speaks.

  9. OpenAI #8: The Right to Warn.

  10. OpenAI #10: Reflections.

  11. On the OpenAI Economic Blueprint.

  12. The Mask Comes Off: At What Price?

  13. OpenAI #11: America Action Plan.

The write-ups will doubtless continue, as this is one of the most important companies in the world.




Dad demands OpenAI delete ChatGPT’s false claim that he murdered his kids

Currently, ChatGPT does not repeat these horrible false claims about Holmen in outputs. A more recent update apparently fixed the issue, as “ChatGPT now also searches the Internet for information about people, when it is asked who they are,” Noyb said. But because OpenAI had previously argued that it cannot correct information—it can only block information—the fake child murderer story is likely still included in ChatGPT’s internal data. And unless Holmen can correct it, that’s a violation of the GDPR, Noyb claims.

“While the damage done may be more limited if false personal data is not shared, the GDPR applies to internal data just as much as to shared data,” Noyb says.

OpenAI may not be able to easily delete the data

Holmen isn’t the only ChatGPT user who has worried that the chatbot’s hallucinations might ruin lives. Months after ChatGPT launched in late 2022, an Australian mayor threatened to sue for defamation after the chatbot falsely claimed he went to prison. Around the same time, ChatGPT linked a real law professor to a fake sexual harassment scandal, The Washington Post reported. A few months later, a radio host sued OpenAI over ChatGPT outputs describing fake embezzlement charges.

In some cases, OpenAI filtered the model to avoid generating harmful outputs but likely didn’t delete the false information from the training data, Noyb suggested. But filtering outputs and throwing up disclaimers aren’t enough to prevent reputational harm, Noyb data protection lawyer Kleanthi Sardeli alleged.

“Adding a disclaimer that you do not comply with the law does not make the law go away,” Sardeli said. “AI companies can also not just ‘hide’ false information from users while they internally still process false information. AI companies should stop acting as if the GDPR does not apply to them, when it clearly does. If hallucinations are not stopped, people can easily suffer reputational damage.”



Study finds AI-generated meme captions funnier than human ones on average

It’s worth clarifying that AI models did not generate the images used in the study. Instead, researchers used popular, pre-existing meme templates, and GPT-4o or human participants generated captions for them.

More memes, not better memes

When crowdsourced participants rated the memes, those created entirely by AI models scored higher on average in humor, creativity, and shareability. The researchers defined shareability as a meme’s potential to be widely circulated, influenced by humor, relatability, and relevance to current cultural topics. They note that this study is among the first to show AI-generated memes outperforming human-created ones across these metrics.

However, the study comes with an important caveat. On average, fully AI-generated memes scored higher than those created by humans alone or humans collaborating with AI. But when researchers looked at the best individual memes, humans created the funniest examples, and human-AI collaborations produced the most creative and shareable memes. In other words, AI models consistently produced broadly appealing memes, but humans—with or without AI help—still made the most exceptional individual examples.

Diagrams of meme creation and evaluation workflows taken from the paper. Credit: Wu et al.

The study also found that participants using AI assistance generated significantly more meme ideas and described the process as easier and requiring less effort. Despite this productivity boost, human-AI collaborative memes did not rate higher on average than memes humans created alone. As the researchers put it, “The increased productivity of human-AI teams does not lead to better results—just to more results.”

Participants who used AI assistance reported feeling slightly less ownership over their creations compared to solo creators. Given that a sense of ownership influenced creative motivation and satisfaction in the study, the researchers suggest that people interested in using AI should carefully consider how to balance AI assistance in creative tasks.



OpenAI #11: America Action Plan

Last week I covered Anthropic’s submission to the request for suggestions for America’s action plan. I did not love what they submitted, and especially disliked how aggressively they sidelined existential risk and related issues, but given a decision to massively scale back ambition like that, the suggestions were, as I called them, a ‘least you can do’ agenda, with many thoughtful details.

OpenAI took a different approach. They went full jingoism in the first paragraph, framing this as a race in which we must prevail over the CCP, and kept going. A lot of space is spent on what a kind person would call rhetoric and an unkind person would call corporate jingoistic propaganda.

Their goal is to have the Federal Government not only decline to regulate AI or impose any requirements on AI whatsoever at any level, but also prevent the states from doing so, and ensure that existing regulations do not apply to them. They seek ‘relief’ from proposed bills, including exemption from all liability, explicitly emphasizing immunity from regulations targeting frontier models in particular and name-checking SB 1047 as an example of what they want immunity from, all in the name of ‘Freedom to Innovate,’ and warning that America’s leadership position would otherwise be undermined.

None of which actually makes any sense from a legal perspective, that’s not how any of this works, but that’s clearly not what they decided to care about. If this part was intended as a serious policy proposal it would have tried to pretend to be that. Instead it’s a completely incoherent proposal, that goes halfway towards something unbelievably radical but pulls back from trying to implement it.

Meanwhile, they want the United States to not only ban Chinese ‘AI infrastructure’ but also coordinate with other countries to ban it, and they want to weaken the compute diffusion rules for those who cooperate with this, essentially only restricting countries with a history or expectation of leaking technology to China, or those who won’t play ball with OpenAI’s anticompetitive proposals.

They refer to DeepSeek as ‘state controlled.’

Their claim that DeepSeek could be ordered to alter its models to cause harm, if one were to build upon them, seems to fundamentally misunderstand that DeepSeek is releasing open models. You can’t modify an open model like that. Nor can you steal someone’s data if they’re running their own copy. The parallel to Huawei is disingenuous at best, especially given the source.

They cite the ‘Belt and Road Initiative’ and claim to expect China to coerce people into using DeepSeek’s models.

For copyright they proclaim the need for ‘freedom to learn’ and assert that AI training is fully fair use and immune from copyright. I think this is a defensible position, and I myself support mandatory licensing similar to radio for music, in a way that compensates creators. But the rhetoric?

They all but declare that if we don’t apply fair use, the authoritarians will conquer us.

If the PRC’s developers have unfettered access to data and American companies are left without fair use access, the race for AI is effectively over. America loses, as does the success of democratic AI.

It amazes me they wrote that with a straight face. Everything is power laws. Suggesting that depriving American labs of some percentage of data inputs, even if that were to happen and the labs were to honor those restrictions (which I very much do not believe they have typically been doing), would mean ‘the race is effectively over’ is patently absurd. They know that better than anyone. Have they no shame? Are they intentionally trying to tell us that they have no shame? Why?

This document is written in a way that seems almost designed to make one vomit. This is vice signaling. As I have said before (and this has happened before with OpenAI documents), when that happens, I think it is important to notice it!

I don’t think the inducing of vomit is a coincidence. They chose to write it this way. They want people to see that they are touting disingenuous jingoistic propaganda in a way that seems suspiciously corrupt. Why would they want to signal that? You tell me.

You don’t publish something like this unless you actively want headlines like this:

Evan Morrison: Altman translated – if you don’t give Open AI free access to steal all copyrighted material by writers, musicians and filmmakers without legal repercussions then we will lose the AI race with China – a communist nation which nonetheless protects the copyright of individuals.

There are other similar and similarly motivated claims throughout.

The claim that China can circumvent some regulatory restrictions present in America is true enough, and yes that constitutes an advantage that could be critical if we do EU-style things, but the way they frame it goes beyond hyperbolic. Every industry, everywhere, would like to say ‘any requirements you place upon me make our lives harder and helps our competitors, so you need to place no restrictions on us of any kind.’

Then there’s a mix of proposals, some of which are good, presented reasonably:

Their proposal for a ‘National Transmission Highway Act’ on par with the 1956 National Interstate and Defense Highways Act seems like it should be overkill, but our regulations in these areas are deeply dysfunctional, so if, as they suggest here, it is focused purely on approvals, I am all for that one. They also want piles of government money.

Similarly their idea of AI ‘Opportunity Zones’ is great if it only includes sidestepping permitting and various regulations. The tax incentives or ‘credit enhancements’ I see as an unnecessary handout, private industry is happy to make these investments if we clear the way.

The exception is semiconductor manufacturing, where we do need to provide the right incentives, so we will need to pay up.

Note that OpenAI emphasizes the need for solar and wind projects on top of other energy sources.

Digitization of government data currently in analog form is a great idea, we should do it for many overdetermined reasons. But to point out the obvious, are we then going to hide that data from PRC? It’s not an advantage to American AI companies if everyone gets equal access.

The Compact for AI proposal is vague but directionally seems good.

Their ‘national AI Readiness Strategy’ is part of a long line of ‘retraining’ style government initiatives that, frankly, don’t work, and also aren’t necessary here. I’m fine with expanding 529 savings plans to cover AI supply chain-related training programs, I mean sure why not, but don’t try to do much more than that. The private sector is far better equipped to handle this one, especially with AI help.

I don’t get the ‘creating AI research labs’ strategy here, it seems to be a tax on AI companies payable to universities? This doesn’t actually make economic sense at all.

The section on Government Adoption of AI is conceptually fine, but the emphasis on public-private partnerships is telling.

Some others are even harsher than I was. Andrew Curran has similar, even blunter thoughts on both the DeepSeek and fair use rhetorical moves.

Alexander Doria: The main reason OpenAI is calling to reinforce fair use for model training: their new models directly compete with writers, journalists, wikipedia editors. We have deep research (a “wikipedia killer”, ditto Noam Brown) and now the creative writing model.

The fundamental doctrine behind the google books transformative exception: you don’t impede on the normal commercialization of the work used. No longer really the case…

We have models trained exclusively on open data.

Gallabytes (on the attempt to ban Chinese AI models): longshoremen level scummy move. @OpenAI this is disgraceful.

As we should have learned many times in the past, most famously with the Jones Act, banning the competition is not The Way. You don’t help your industry compete, you instead risk destroying your industry’s ability to compete.

This week, we saw for example that Saudi Aramco chief says DeepSeek AI makes ‘big difference’ to operations. The correct response is to say, hey, have you tried Claude and ChatGPT, or if you need open models have you tried Gemma? Let’s turn that into a reasoning model for you.

The response that says you’re ngmi? Trying to ban DeepSeek, or saying if you don’t get exemptions from laws then ‘the race is over.’

From Peter Wildeford, seems about right:

The best steelman of OpenAI’s response I’ve seen comes from John Pressman. His argument is, yes there is cringe here – he chooses to focus here on a line about DeepSeek’s willingness to do a variety of illicit activities and a claim that this reflects CCP’s view of violating American IP law. Which is certainly another cringy line. But, he points out, the Trump administration asked how America can get ahead and stay ahead in AI, so in that context why shouldn’t OpenAI respond with a jingoistic move towards regulatory capture and a free pass to do as they want?

And yes, there is that, although his comments also reinforce that the price, in ‘gestures towards open model support,’ for getting some people to cheer untold other horrors is remarkably cheap.

This letter is part of a recurring pattern in OpenAI’s public communications.

OpenAI have issued some very good documents on the alignment and technical fronts, including their model spec and statement on alignment philosophy, as well as their recent paper on The Most Forbidden Technique. They have been welcoming of detailed feedback on those fronts. In these places they are being thoughtful and transparent, and doing some good work, and I have updated positively. OpenAI’s actual model deployment decisions have mostly been fine in practice, with some troubling signs such as the attempt to pretend GPT-4.5 was not a frontier model.

Alas, their public relations and lobbying departments, and Altman’s public statements in various places, have been consistently terrible and getting even worse over time, to the point of being consistent and rather blatant vice signaling. OpenAI is intentionally presenting themselves as disingenuous jingoistic villains, seeking out active regulatory protections, doing their best to kill attempts to keep models secure, and attempting various forms of government subsidy and regulatory capture.

I get why they would think it is strategically wise to present themselves in this way, to appeal to both the current government and to investors, especially in the wake of recent ‘vibe shifts.’ So I get why one could be tempted to say, oh, they don’t actually believe any of this, they’re only being strategic, obviously not enough people will penalize them for it so they need to do it, and thus you shouldn’t penalize them for it either, that would only be spite.

I disagree. When people tell you who they are, you should believe them.




AI search engines cite incorrect sources at an alarming 60% rate, study says

A new study from Columbia Journalism Review’s Tow Center for Digital Journalism finds serious accuracy issues with generative AI models used for news searches. The research tested eight AI-driven search tools equipped with live search functionality and discovered that the AI models incorrectly answered more than 60 percent of queries about news sources.

Researchers Klaudia Jaźwińska and Aisvarya Chandrasekar noted in their report that roughly 1 in 4 Americans now uses AI models as alternatives to traditional search engines. This raises serious concerns about reliability, given the substantial error rate uncovered in the study.

Error rates varied notably among the tested platforms. Perplexity provided incorrect information in 37 percent of the queries tested, whereas ChatGPT Search incorrectly identified 67 percent (134 out of 200) of articles queried. Grok 3 demonstrated the highest error rate, at 94 percent.

A graph from CJR shows “confidently wrong” search results. Credit: CJR

For the tests, researchers fed direct excerpts from actual news articles to the AI models, then asked each model to identify the article’s headline, original publisher, publication date, and URL. They ran 1,600 queries across the eight different generative search tools.
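
To make that setup concrete, the evaluation can be pictured as a loop over tools and excerpts, as in the sketch below; it illustrates the structure only, using a hypothetical query_search_tool adapter and simplified all-or-nothing grading, and is not the researchers’ actual harness.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Article:
    excerpt: str
    headline: str
    publisher: str
    pub_date: str
    url: str


def grade(answer: dict, truth: Article) -> bool:
    # Simplified all-or-nothing grading; the study used a finer-grained scale.
    return (
        answer.get("headline") == truth.headline
        and answer.get("publisher") == truth.publisher
        and answer.get("date") == truth.pub_date
        and answer.get("url") == truth.url
    )


def run_eval(
    tools: list[str],
    articles: list[Article],
    query_search_tool: Callable[[str, str], dict],
) -> dict[str, float]:
    # query_search_tool(tool, prompt) is a hypothetical adapter that sends the
    # prompt to a given AI search tool and parses its answer into a dict.
    error_rates: dict[str, float] = {}
    for tool in tools:
        wrong = 0
        for art in articles:
            prompt = (
                "Identify the headline, original publisher, publication date, "
                f"and URL of the article containing this excerpt:\n\n{art.excerpt}"
            )
            if not grade(query_search_tool(tool, prompt), art):
                wrong += 1
        error_rates[tool] = wrong / len(articles)
    return error_rates


# Eight tools x 200 excerpts = 1,600 queries, matching the study's scale.
```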

The study highlighted a common trend among these AI models: rather than declining to respond when they lacked reliable information, the models frequently provided confabulations—plausible-sounding incorrect or speculative answers. The researchers emphasized that this behavior was consistent across all tested models, not limited to just one tool.

Surprisingly, premium paid versions of these AI search tools fared even worse in certain respects. Perplexity Pro ($20/month) and Grok 3’s premium service ($40/month) confidently delivered incorrect responses more often than their free counterparts. Though these premium models correctly answered a higher number of prompts, their reluctance to decline to answer when uncertain drove higher overall error rates.

Issues with citations and publisher control

The CJR researchers also uncovered evidence suggesting some AI tools ignored Robot Exclusion Protocol settings, which publishers use to prevent unauthorized access. For example, Perplexity’s free version correctly identified all 10 excerpts from paywalled National Geographic content, despite National Geographic explicitly disallowing Perplexity’s web crawlers.



OpenAI declares AI race “over” if training on copyrighted works isn’t fair use

OpenAI is hoping that Donald Trump’s AI Action Plan, due out this July, will settle copyright debates by declaring AI training fair use—paving the way for AI companies’ unfettered access to training data that OpenAI claims is critical to defeat China in the AI race.

Currently, courts are mulling whether AI training is fair use, as rights holders say that AI models trained on creative works threaten to replace them in markets and water down humanity’s creative output overall.

OpenAI is just one AI company fighting with rights holders in several dozen lawsuits, arguing that AI transforms copyrighted works it trains on and alleging that AI outputs aren’t substitutes for original works.

So far, one landmark ruling favored rights holders, with a judge declaring AI training is not fair use, as AI outputs clearly threatened to replace Thomson Reuters’ legal research platform Westlaw in the market, Wired reported. But OpenAI now appears to be looking to Trump to avoid a similar outcome in its lawsuits, including a major suit brought by The New York Times.

“OpenAI’s models are trained to not replicate works for consumption by the public. Instead, they learn from the works and extract patterns, linguistic structures, and contextual insights,” OpenAI claimed. “This means our AI model training aligns with the core objectives of copyright and the fair use doctrine, using existing works to create something wholly new and different without eroding the commercial value of those existing works.”

Providing “freedom-focused” recommendations on Trump’s plan during a public comment period ending Saturday, OpenAI suggested Thursday that the US should end these court fights by shifting its copyright strategy to promote the AI industry’s “freedom to learn.” Otherwise, the People’s Republic of China (PRC) will likely continue accessing copyrighted data that US companies cannot access, supposedly giving China a leg up “while gaining little in the way of protections for the original IP creators,” OpenAI argued.



OpenAI pushes AI agent capabilities with new developer API

Developers using the Responses API can access the same models that power ChatGPT Search: GPT-4o search and GPT-4o mini search. These models can browse the web to answer questions and cite sources in their responses.
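
For a sense of what this looks like in practice, here is a minimal sketch of calling the Responses API with the built-in web search tool via the Python SDK; the model name, tool identifier ("web_search_preview"), and output_text accessor are assumptions based on OpenAI’s published examples and may not match the current API surface exactly.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Ask a question that benefits from live information; the built-in web
# search tool lets the model browse and cite sources in its answer.
response = client.responses.create(
    model="gpt-4o",
    tools=[{"type": "web_search_preview"}],  # assumed tool identifier
    input="Summarize this week's most significant AI policy news, with sources.",
)

print(response.output_text)  # assumed convenience accessor for the text output
```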

That’s notable because OpenAI says the added web search ability dramatically improves the factual accuracy of its AI models. On OpenAI’s SimpleQA benchmark, which aims to measure confabulation rate, GPT-4o search scored 90 percent, while GPT-4o mini search achieved 88 percent—both substantially outperforming the larger GPT-4.5 model without search, which scored 63 percent.

Despite these improvements, the technology still has significant limitations. Aside from issues with CUA properly navigating websites, the improved search capability doesn’t completely solve the problem of AI confabulations, with GPT-4o search still making factual mistakes 10 percent of the time.

Alongside the Responses API, OpenAI released the open source Agents SDK, providing developers free tools to integrate models with internal systems, implement safeguards, and monitor agent activities. This toolkit follows OpenAI’s earlier release of Swarm, a framework for orchestrating multiple agents.

These are still early days in the AI agent field, and things will likely improve rapidly. However, at the moment, the AI agent movement remains vulnerable to unrealistic claims, as demonstrated earlier this week when users discovered that Chinese startup Butterfly Effect’s Manus AI agent platform failed to deliver on many of its promises, highlighting the persistent gap between promotional claims and practical functionality in this emerging technology category.



What does “PhD-level” AI mean? OpenAI’s rumored $20,000 agent plan explained.

On the Frontier Math benchmark by EpochAI, o3 solved 25.2 percent of problems, while no other model has exceeded 2 percent—suggesting a leap in mathematical reasoning capabilities over the previous model.

Benchmarks vs. real-world value

Ideally, potential applications for a true PhD-level AI model would include analyzing medical research data, supporting climate modeling, and handling routine aspects of research work.

The high price points reported by The Information, if accurate, suggest that OpenAI believes these systems could provide substantial value to businesses. The publication notes that SoftBank, an OpenAI investor, has committed to spending $3 billion on OpenAI’s agent products this year alone—indicating significant business interest despite the costs.

Meanwhile, OpenAI faces financial pressures that may influence its premium pricing strategy. The company reportedly lost approximately $5 billion last year covering operational costs and other expenses related to running its services.

News of OpenAI’s stratospheric pricing plans comes after years of relatively affordable AI services that have conditioned users to expect powerful capabilities at relatively low costs. ChatGPT Plus remains $20 per month and Claude Pro costs $30 monthly—both tiny fractions of these proposed enterprise tiers. Even ChatGPT Pro’s $200/month subscription is relatively small compared to the new proposed fees. Whether the performance difference between these tiers will match their thousandfold price difference is an open question.

Despite their benchmark performances, these simulated reasoning models still struggle with confabulations—instances where they generate plausible-sounding but factually incorrect information. This remains a critical concern for research applications where accuracy and reliability are paramount. A $20,000 monthly investment raises questions about whether organizations can trust these systems not to introduce subtle errors into high-stakes research.

In response to the news, several people quipped on social media that companies could hire an actual PhD student for much cheaper. “In case you have forgotten,” wrote xAI developer Hieu Pham in a viral tweet, “most PhD students, including the brightest stars who can do way better work than any current LLMs—are not paid $20K / month.”

While these systems show strong capabilities on specific benchmarks, the “PhD-level” label remains largely a marketing term. These models can process and synthesize information at impressive speeds, but questions remain about how effectively they can handle the creative thinking, intellectual skepticism, and original research that define actual doctoral-level work. On the other hand, they will never get tired or need health insurance, and they will likely continue to improve in capability and drop in cost over time.



Elon Musk loses initial attempt to block OpenAI’s for-profit conversion

A federal judge rejected Elon Musk’s request to block OpenAI’s planned conversion from a nonprofit to for-profit entity but expedited the case so that Musk’s core claims can be addressed in a trial before the end of this year.

Musk had filed a motion for preliminary injunction in US District Court for the Northern District of California, claiming that OpenAI’s for-profit conversion “violates the terms of Musk’s donations” to the company. But Musk failed to meet the burden of proof needed for an injunction, Judge Yvonne Gonzalez Rogers ruled yesterday.

“Plaintiffs Elon Musk, [former OpenAI board member] Shivon Zilis, and X.AI Corp. (‘xAI’) collectively move for a preliminary injunction barring defendants from engaging in various business activities, which plaintiffs claim violate federal antitrust and state law,” Rogers wrote. “The relief requested is extraordinary and rarely granted as it seeks the ultimate relief of the case on an expedited basis, with a cursory record, and without the benefit of a trial.”

Rogers said that “the Court is prepared to offer an expedited schedule on the core claims driving this litigation [to] address the issues which are allegedly more urgent in terms of public, not private, considerations.” There would be important public interest considerations if the for-profit shift is found to be illegal at a trial, she wrote.

Musk said OpenAI took advantage of him

Noting that OpenAI donors may have taken tax deductions from a nonprofit that is now turning into a for-profit enterprise, Rogers said the court “agrees that significant and irreparable harm is incurred when the public’s money is used to fund a non-profit’s conversion into a for-profit.” But as for the motion to block the for-profit conversion before a trial, “The request for an injunction barring any steps towards OpenAI’s conversion to a for-profit entity is DENIED.”



AI firms follow DeepSeek’s lead, create cheaper models with “distillation”

Thanks to distillation, developers and businesses can access these models’ capabilities at a fraction of the price, allowing app developers to run AI models quickly on devices such as laptops and smartphones.

Developers can use OpenAI’s platform for distillation, learning from the large language models that underpin products like ChatGPT. OpenAI’s largest backer, Microsoft, used GPT-4 to distill its Phi family of small language models as part of a commercial partnership after investing nearly $14 billion into the company.

However, the San Francisco-based start-up has said it believes DeepSeek distilled OpenAI’s models to train its competitor, a move that would be against its terms of service. DeepSeek has not commented on the claims.

While distillation can be used to create high-performing models, experts add that the resulting models are more limited.

“Distillation presents an interesting trade-off; if you make the models smaller, you inevitably reduce their capability,” said Ahmed Awadallah of Microsoft Research, who said a distilled model can be designed to be very good at summarising emails, for example, “but it really would not be good at anything else.”
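
For readers who want the mechanics behind that trade-off, the textbook recipe is to train a small “student” model to match a large “teacher” model’s output distribution alongside the ordinary training labels. The PyTorch sketch below shows that generic objective; it is a minimal illustration, not a description of OpenAI’s, Microsoft’s, or DeepSeek’s actual pipelines.

```python
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    """Generic knowledge-distillation objective (Hinton et al.-style soft targets)."""
    # Student mimics the teacher's softened output distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)  # rescale so gradients match the hard-label term
    # Student also fits the ground-truth labels directly.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```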

David Cox, vice-president for AI models at IBM Research, said most businesses do not need a massive model to run their products, and distilled ones are powerful enough for purposes such as customer service chatbots or running on smaller devices like phones.

“Any time you can [make it less expensive] and it gives you the right performance you want, there is very little reason not to do it,” he added.

That presents a challenge to many of the business models of leading AI firms. Even if developers use distilled models from companies like OpenAI, they cost far less to run, are less expensive to create, and, therefore, generate less revenue. Model-makers like OpenAI often charge less for the use of distilled models as they require less computational load.



“It’s a lemon”—OpenAI’s largest AI model ever arrives to mixed reviews

Perhaps because of the disappointing results, Altman had previously written that GPT-4.5 would be the last of OpenAI’s traditional AI models, with GPT-5 planned to be a dynamic combination of “non-reasoning” LLMs and simulated reasoning models like o3.

A stratospheric price and a tech dead-end

And about that price—it’s a doozy. GPT-4.5 costs $75 per million input tokens and $150 per million output tokens through the API, compared to GPT-4o’s $2.50 per million input tokens and $10 per million output tokens. (Tokens are chunks of data used by AI models for processing). For developers using OpenAI models, this pricing makes GPT-4.5 impractical for many applications where GPT-4o already performs adequately.

By contrast, OpenAI’s flagship reasoning model, o1 pro, costs $15 per million input tokens and $60 per million output tokens—significantly less than GPT-4.5 despite offering specialized simulated reasoning capabilities. Even more striking, the o3-mini model costs just $1.10 per million input tokens and $4.40 per million output tokens, making it cheaper than even GPT-4o while providing much stronger performance on specific tasks.
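
Using the per-token prices quoted above, a quick back-of-the-envelope comparison shows how the gap plays out per request; the workload here (10,000 input tokens and 1,000 output tokens) is purely illustrative.

```python
# Published API prices, in dollars per million tokens: (input, output).
PRICES = {
    "gpt-4.5": (75.00, 150.00),
    "gpt-4o": (2.50, 10.00),
    "o1-pro": (15.00, 60.00),
    "o3-mini": (1.10, 4.40),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    price_in, price_out = PRICES[model]
    return input_tokens / 1_000_000 * price_in + output_tokens / 1_000_000 * price_out


# Illustrative request: 10,000 tokens in, 1,000 tokens out.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 10_000, 1_000):.3f}")
# gpt-4.5 works out to $0.90 per such request versus $0.035 for gpt-4o,
# roughly a 26x difference; o3-mini comes in around $0.015.
```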

OpenAI has likely known about diminishing returns in training LLMs for some time. As a result, the company spent most of last year working on simulated reasoning models like o1 and o3, which use a different inference-time (runtime) approach to improving performance instead of throwing ever-larger amounts of training data at GPT-style AI models.

OpenAI’s self-reported benchmark results for the SimpleQA test, which measures confabulation rate. Credit: OpenAI

While this seems like bad news for OpenAI in the short term, competition is thriving in the AI market. Anthropic’s Claude 3.7 Sonnet has demonstrated vastly better performance than GPT-4.5, with a reportedly more efficient architecture. It’s worth noting that Claude 3.7 Sonnet is likely a system of AI models working together behind the scenes, although Anthropic has not provided details about its architecture.

For now, it seems that GPT-4.5 may be the last of its kind—a technological dead-end for an unsupervised learning approach that has paved the way for new architectures in AI models, such as o3’s inference-time reasoning and perhaps even something more novel, like diffusion-based models. Only time will tell how things end up.

GPT-4.5 is now available to ChatGPT Pro subscribers, with rollout to Plus and Team subscribers planned for next week, followed by Enterprise and Education customers the week after. Developers can access it through OpenAI’s various APIs on paid tiers, though the company is uncertain about its long-term availability.
