Author name: Kris Guyer


Via the False Claims Act, NIH puts universities on edge


Funding pause at U. Michigan illustrates uncertainty around new language in NIH grants.

University of Michigan students walk on the UM campus next to signage displaying the University’s “Core Values” on April 3, 2025 in Ann Arbor, Michigan. Credit: Bill Pugliano/Getty Images

Earlier this year, a biomedical researcher at the University of Michigan received an update from the National Institutes of Health. The federal agency, which funds a large swath of the country’s medical science, had given the green light to begin releasing funding for the upcoming year on the researcher’s multi-year grant.

Not long after, the researcher learned that the university had placed the grant on hold. The school’s lawyers, it turned out, were wrestling with a difficult question: whether to accept new terms in the Notice of Award, a legal document that outlines the grant’s terms and conditions.

Other researchers at the university were having the same experience. Indeed, Undark’s reporting suggests that the University of Michigan—among the top three university recipients of NIH funding in 2024, with more than $750 million in grants—had quietly frozen some, perhaps all, of its incoming NIH funding dating back to at least the second half of April.

The university’s director of public affairs, Kay Jarvis, declined to comment for this article or answer a list of questions from Undark, instead pointing to the institution’s research website.

In conversations with Michigan scientists, and in internal communications obtained by Undark, administrators explained the reason for the delays: University officials were concerned about new language in NIH grant notices. That language said that universities will be subject to liability under a Civil War-era statute called the False Claims Act if they fail to abide by civil rights laws and a January 20 executive order related to gender.

For the most part, public attention to NIH funding has focused on what the new Trump administration is doing on its end, including freezing and terminating grants at elite institutions for alleged Title VI and IX violations, and slashing funding for newly disfavored areas of research. The events in Ann Arbor show how universities themselves are struggling to cope with a wave of recent directives from the federal government.

The new terms may expose universities to significant legal risk, according to several experts. “The Trump administration is using the False Claims Act as a massive threat to the bottom lines of research institutions,” said Samuel Bagenstos, a law professor at the University of Michigan, who served as general counsel for the Department of Health and Human Services during the Biden administration. (Bagenstos said he has not advised the university’s lawyers on this issue.) That law entitles the government to collect up to three times the financial damage. “So potentially you could imagine the Trump administration seeking all the federal funds times three that an institution has received if they find a violation of the False Claims Act.”

Such an action, Bagenstos and another legal expert said, would be unlikely to hold up in court. But the possibility, he said, is enough to cause concern for risk-averse institutions.

The grant pauses unsettled the affected researchers. One of them noted that the university had put a hold on a grant that supported a large chunk of their research program. “I don’t have a lot of money left,” they said.

The researcher worried that if funds weren’t released soon, personnel would have to be fired and medical research halted. “There’s a feeling in the air that somebody’s out to get scientists,” said the researcher, reflecting on the impact of all the changes at the federal level. “And it could be your turn tomorrow for no clear reason.” (The researcher, like other Michigan scientists interviewed for this story, spoke on condition of anonymity for fear of retaliation.)

Bagenstos said some other universities had also halted funding—a claim Undark was unable to confirm. At Michigan, at least, money is now flowing: On Wednesday, June 11, just hours after Undark sent a list of questions to the university’s public affairs office, some researchers began receiving emails saying their funding would be released. And research administrators received a message stating that the university would begin releasing the more than 270 awards that it had placed on hold.

The federal government distributes tens of billions of dollars each year to universities through NIH funding. In the past, the terms of those grants have required universities to comply with civil rights laws. More recently, though, the scope of those expectations has expanded. Multiple recent award notices viewed by Undark now contain language referring to a January 20 executive order that states the administration “will defend women’s rights and protect freedom of conscience by using clear and accurate language and policies that recognize women are biologically female, and men are biologically male.” The notices also contain four bullet points, one of which asks the grant recipient—meaning the researcher’s institution—to acknowledge that “a knowing false statement” regarding compliance is subject to liability under the False Claims Act.


Alongside this change, on April 21, the agency issued a policy requiring universities to certify that they will not participate in discriminatory DEI activities or boycotts of Israel, noting that false statements would be subject to penalties under the False Claims Act. (That measure was rescinded in early June, reinstated, and then rescinded again while the agency awaits further White House guidance.) Additionally, in May, an announcement from the Department of Justice encouraged use of the False Claims Act in civil rights enforcement.

Some experts said that signing onto FCA terms could put universities in a vulnerable position, not because they aren’t following civil rights laws, but because the new grant language is vague and seemingly ripe for abuse.

The False Claims Act says someone who knowingly submits a false claim to the government can be held liable for triple damages. In the case of a major research institution like the University of Michigan, worst-case scenarios could range into the billions of dollars.

It’s not just the dollar amount that may cause schools to act in a risk-averse way, said Bagenstos. The False Claims Act also contains what’s known as a “qui tam” provision, which allows private entities to file a lawsuit on behalf of the United States and then potentially take a piece of the recovery money. “The government does not have the resources to identify and pursue all cases of legitimate fraud” in the country, said Bagenstos, so generally the provision is a useful one. But it can be weaponized when “yoked to a pernicious agenda of trying to suppress speech by institutions of higher learning, or simply to try to intimidate them.”

Avoiding the worst-case scenario might seem straightforward enough: Just follow civil rights laws. But in reality, it’s not entirely clear where a university’s responsibility starts and stops. For example, an institution might officially adopt policies that align with the new executive orders. But if, say, a student group, or a sociology department, steps out of bounds, then the university might be understood to not be in compliance—particularly by a less-than-friendly federal administration.

University attorneys may also balk at the ambiguity and vagueness of terms like “gender ideology” and “DEI,” said Andrew Twinamatsiko, a director of the Center for Health Policy and the Law at the O’Neill Institute at Georgetown Law. Litigation-averse universities may end up rolling back their programming, he said, because they don’t want to run afoul of the government’s overly broad directives.

“I think this is a time that calls for some courage,” said Bagenstos. If every university decides the risks are too great, then the current policies will prevail without challenge, he said, even though some are legally unsound. And the bar for False Claims Act liability is actually quite high, he pointed out: There’s a requirement that the person knowingly made a false statement or deliberately ignored facts. Universities are actually well-positioned to prevail in court, said Bagenstos and other legal experts. The issue is that they don’t want to engage in drawn-out and potentially costly litigation.

One possibility might be for a trade group, such as the Association of American Universities, to mount the legal challenge, said Richard Epstein, a libertarian legal scholar. In his view, the new NIH terms are unconstitutional because such conditions on spending, which he characterized as “unrelated to scientific endeavors,” need to be authorized by Congress.

The NIH did not respond to repeated requests for comment.

Some people expressed surprise at the insertion of the False Claims Act language.

Michael Yassa, a professor of neurobiology and behavior at the University of California, Irvine, said that he wasn’t aware of the new terms until Undark contacted him. The NIH-supported researcher and study-section chair started reading from a recent Notice of Award during the interview. “I can’t give you a straight answer on this one,” he said, and after further consideration, added, “Let me run this by a legal team.”

Andrew Miltenberg, an attorney in New York City who's nationally known for his work on Title IX litigation, was more pointed. "I don't actually understand why it's in there," he said, referring to the new grant language. "I don't think it belongs in there. I don't think it's legal, and I think it's going to take some lawsuits to have courts interpret the fact that there's no real place for it."

This article was originally published on Undark. Read the original article.



Spanish blackout report: Power plants meant to stabilize voltage didn’t

The blackout that took down the Iberian grid serving Spain and Portugal in April was the result of a number of smaller interacting problems, according to an investigation by the Spanish government. The report concludes that several steps meant to address a small instability made matters worse, eventually leading to a self-reinforcing cascade where high voltages caused power plants to drop off the grid, thereby increasing the voltage further. Critically, the report suggests that the Spanish grid operator had an unusually low number of plants on call to stabilize matters, and some of the ones it did have responded poorly.

The full report will be available later today; ahead of its release, the government published a summary. The document includes a timeline of the events that triggered the blackout, as well as an analysis of why grid management failed to keep it in check. It also notes that a parallel investigation checked for indications of a cyberattack and found none.

Oscillations and a cascade

The document notes that for several days prior to the blackout, the Iberian grid had been experiencing voltage fluctuations—products of a mismatch between supply and demand—that had been managed without incident. These continued through the morning of April 28 until shortly after noon, when an unusual frequency oscillation occurred. This oscillation has been traced back to a single facility on the grid, but the report doesn’t identify it or even indicate its type, simply referring to it as an “instalación.”

The grid operators responded in a way that suppressed the oscillations but increased the voltages on the grid. About 15 minutes later, a weakened version of this oscillation occurred again, followed shortly thereafter by oscillations at a different frequency, this one with properties that are commonly seen on European grids. That prompted the grid operators to take corrective steps again, which increased the voltages on the grid.

The Iberian grid is capable of handling this sort of thing. But the grid operator only scheduled 10 power plants to handle voltage regulation on the 28th, which the report notes is the lowest total it had committed to in all of 2025 up to that point. The report found that a number of those plants failed to respond properly to the grid operators, and a few even responded in a way that contributed to the surging voltages.



2025 Audi S5 and A5 first drive: Five-door is the new four-door

The S5 is eager and more engaging to drive than the A5. Jonathan Gitlin

Like the Q5 last week, the A5 and S5 use a new electronic architecture called E3 1.2. This is a clean-sheet approach to the various electronic subsystems in the car, replacing decades of legacy cruft and more than a hundred individual electronic control units with five powerful high-performance computers, each with responsibility for a different domain: ride and handling, infotainment, driver assists, and convenience functions, all overseen by a master computer.

On the road

Sadly, those looking for driver engagement will not find much in the A5. Despite the improvements to the front suspension, there's still very little in the way of feedback, and in Comfort mode, the steering was too light, at least for me. In Dynamic mode, on the other hand, the car felt extremely sure-footed in bad weather. The A5 makes do with conventional springs, so the ride doesn't change between drive modes, but Audi has tuned it well, and the car is not too firm. I noted a fair amount of wind noise, despite the acoustic front glass that comes with the $6,450 Prestige package.

The S5 will appeal much more to driving enthusiasts. The steering provides a better picture of what the front tires are doing, and the air suspension gives the car a supple ride, albeit one that gets firmer in Balanced rather than Dynamic modes. Like some other recent fast Audis, the car is deceptively quick, and because it’s quite quiet and smooth, you can find yourself going a good deal faster than you thought. The S5’s exhaust note also sounds rather pleasant and not obnoxious.

The A5 cabin has a layout similar to that of the Q5 and Q6 e-tron SUVs. Audi

The A5 starts at $49,700, but the $3,600 Premium Plus package is likely a must-have, as this adds adaptive cruise control, a heads-up display, top-down parking cameras, and some other features (including USB-C ports). If you want to get really fancy, the Prestige pack adds speakers in the front headrests, OLED taillights, the aforementioned acoustic glass, plus a second infotainment screen for the front passenger.

Meanwhile, the S5 starts at $62,700; the Premium Plus package (which adds mostly the same stuff) will set you back $3,800. For the S5, the $7,550 Prestige pack includes front sports seats, Nappa leather, rear window sunshades, the passenger display, and the adaptive sports suspension. Those are all some hefty numbers, but the A5 and S5 are actually both cheaper in real terms than the models launched in 2018, once you take seven years’ worth of inflation into account.



Scientists once hoarded pre-nuclear steel; now we’re hoarding pre-AI content

A time capsule of human expression

Graham-Cumming is no stranger to tech preservation efforts. He’s a British software engineer and writer best known for creating POPFile, an open source email spam filtering program, and for successfully petitioning the UK government to apologize for its persecution of codebreaker Alan Turing—an apology that Prime Minister Gordon Brown issued in 2009.

As it turns out, his pre-AI website isn’t new, but it has languished unannounced until now. “I created it back in March 2023 as a clearinghouse for online resources that hadn’t been contaminated with AI-generated content,” he wrote on his blog.

The website points to several major archives of pre-AI content, including a Wikipedia dump from August 2022 (before ChatGPT’s November 2022 release), Project Gutenberg’s collection of public domain books, the Library of Congress photo archive, and GitHub’s Arctic Code Vault—a snapshot of open source code buried in a former coal mine near the North Pole in February 2020. The wordfreq project appears on the list as well, flash-frozen from a time before AI contamination made its methodology untenable.

The site accepts submissions of other pre-AI content sources through its Tumblr page. Graham-Cumming emphasizes that the project aims to document human creativity from before the AI era, not to make a statement against AI itself. As atmospheric nuclear testing ended and background radiation returned to natural levels, low-background steel eventually became unnecessary for most uses. Whether pre-AI content will follow a similar trajectory remains a question.

Still, it feels reasonable to protect sources of human creativity now, including archival ones, because these repositories may become useful in ways that few appreciate at the moment. For example, in 2020, I proposed creating a so-called “cryptographic ark”—a timestamped archive of pre-AI media that future historians could verify as authentic, collected before my then-arbitrary cutoff date of January 1, 2022. AI slop pollutes more than the current discourse—it could cloud the historical record as well.

For now, lowbackgroundsteel.ai stands as a modest catalog of human expression from what may someday be seen as the last pre-AI era. It’s a digital archaeology project marking the boundary between human-generated and hybrid human-AI cultures. In an age where distinguishing between human and machine output grows increasingly difficult, these archives may prove valuable for understanding how human communication evolved before AI entered the chat.



o3 Turns Pro

You can now have o3 throw vastly more compute at a given problem. That’s o3-pro.

Should you have o3 throw vastly more compute at a given problem, if you are paying the $200/month subscription price for ChatGPT Pro? Should you pay the $200, or the order of magnitude markup over o3 to use o3-pro in the API?

That's trickier. Sometimes yes. Sometimes no. My experience so far is that waiting a long time is annoying, sufficiently annoying that you often won't want to wait. Whenever I ask o3-pro something, I usually also ask o3 and Opus.

Using the API at scale seems prohibitively expensive for what you get, and you can (and should) instead run parallel queries using the chat interface.

The o3-pro answers have so far definitely been better than o3's, but the wait is usually enough to break my workflow and human context window in meaningful ways – fifteen minutes plus variance is past the key breakpoint, such that it would not have been substantially more painful to fully wait for Deep Research.

Indeed, the baseline workflow feels similar to Deep Research, in that you fire off a query and then eventually you context shift back and look at it. But if you are paying the subscription price already it’s often worth queuing up a question and then having it ready later if it is useful.

In many ways o3-pro still feels like o3, only modestly better in exchange for being slower. Otherwise, same niche. If you were already thinking ‘I want to use Opus rather than o3’ chances are you want Opus rather than, or in addition to, o3-pro.

Perhaps the most interesting claim, from some including Tyler Cowen, was that o3-pro is perhaps not a lying liar, and hallucinates far less than o3. If this is true, in many situations it would be worth using for that reason alone, provided the timing allows this. The bad news is that it didn’t improve on a Confabulations benchmark.

My poll (n=19) was roughly evenly split on this question.

My hunch, based on my use so far, is that o3-pro is hallucinating modestly less because:

  1. It is more likely to find or know the right answer to a given question, which is likely to be especially relevant to Tyler’s observations.

  2. It is considering its answer a lot, so it usually won’t start writing an answer and then think ‘oh I guess that start means I will provide some sort of answer’ like o3.

  3. The queries you send are more likely to be well-considered to avoid the common mistake of essentially asking for hallucinations.

But for now I think you still have to have a lot of the o3 skepticism.

And as always, the next thing will be here soon: Gemini 2.5 Pro Deep Think is coming.

Pliny of course jailbroke it, for those wondering. Pliny also offers us the tools and channels information.

My poll strongly suggested o3-pro is slightly stronger than o3.

Greg Brockman (OpenAI): o3-pro is much stronger than o3.

OpenAI: In expert evaluations, reviewers consistently prefer OpenAI o3-pro over o3, highlighting its improved performance in key domains—including science, education, programming, data analysis, and writing.

Reviewers also rated o3-pro consistently higher for clarity, comprehensiveness, instruction-following, and accuracy.

Like OpenAI o1-pro, OpenAI o3-pro excels at math, science, and coding as shown in academic evaluations.

To assess the key strength of OpenAI o3-pro, we once again use our rigorous “4/4 reliability” evaluation, where a model is considered successful only if it correctly answers a question in all four attempts, not just one.

OpenAI o3-pro has access to tools that make ChatGPT useful—it can search the web, analyze files, reason about visual inputs, use Python, personalize responses using memory, and more.

Sam Altman: o3-pro is rolling out now for all chatgpt pro users and in the api.

it is really smart! i didnt believe the win rates relative to o3 the first time i saw them.

Arena has gotten quite silly if treated as a comprehensive measure (as in Gemini 2.5 Flash is rated above o3), but as a quick heuristic, if we take a 64% win rate seriously, that would by the math put o3-pro ~100 above o3 at 1509 on Arena, crushing Gemini-2.5-Pro for the #1 spot. I would assume that most pairwise comparisons would have a less impressive jump, since o3-pro is essentially offering the same product as o3 only somewhat better, which means the result will be a lot less noisy than if it was up against Gemini.
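For those who want to sanity-check that arithmetic, here is a minimal sketch using the standard Elo win-probability model; the 1409 baseline for o3 is simply inferred from the 1509 figure above, not a number stated anywhere official.

```python
import math

def elo_gap_from_winrate(p):
    """Elo rating gap implied by a pairwise win rate p, using the
    standard logistic model E = 1 / (1 + 10 ** (-gap / 400))."""
    return 400 * math.log10(p / (1 - p))

gap = elo_gap_from_winrate(0.64)
print(f"64% win rate -> ~{gap:.0f} Elo points")            # ~100

o3_baseline = 1409  # assumption, implied by the 1509 figure above
print(f"implied o3-pro rating: ~{o3_baseline + gap:.0f}")  # ~1509
```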

So this both is a very impressive statistic and also doesn’t mean much of anything.

The problem with o3-pro is that it is slow.

Nearcyan: one funny note is that minor UX differences in how you display ‘thinking’/loading/etc can easily move products from the bottom half of this meme to the top half.

Another note is anyone I know who is the guy in the bottom left is always extremely smart and a pleasure to speak with.

the real problem is I may be closer to the top right than the bottom left

Today I had my first instance of noticing I’d gotten a text (during the night, in this case) and they got a response 20 minutes slower than they would have otherwise because I waited for o3-pro to give its answer to the question I’d been asked.

Thus, even with access to o3-pro at zero marginal compute cost, almost half of people reported they rarely use it for a given query, and only about a quarter said they usually use it.

It is also super frustrating to run into errors when you are waiting 15+ minutes for a response, and reports of such errors were common, which matches my experience.

Bindu Reddy: o3-Pro Is Not Very Good At Agentic Coding And Doesn’t Score Higher Than o3 😿

After a lot of waiting and numerous retries, we have finally deployed o3-pro on LiveBench AI.

Sadly, the overall score doesn’t improve over o3 🤷‍♂️

Mainly because it’s not very agentic and isn’t very good at tool use… it scores way below o3 on the agentic-coding category.

The big story yesterday was not o3-pro but the price decrease in o3!!

Dominik Lukes: I think this take by @bindureddy very much matches the vibes I’m getting: it does not “feel” very agentic and as ready to reach for the right tools as o3 is – but it could just be because o3 keeps you informed about what it’s doing in the CoT trace.

I certainly would try o3-pro in cases where o3 was failing, if I'd already also tried Opus and Gemini first. I wonder if that agentic coding score drop actually represents a real issue here: because o3-pro is built to reason longer and they don't want it endlessly web searching, it may not be properly inclined to exploit tools?

o3-pro gets 8.5/10 on BaldurBench, which is about creating detailed build guides for rapidly changing video games. Somewhat subjective but should still work.

L Zahir: bombs all my secret benchmarks, no better than o3.

Lech Mazur gives us four of his benchmarks: A small improvement over o3 for Creative Writing Benchmark, a substantial boost from 79.5% (o3) or 82.5% (o1-pro) to 87.3% on Word Connections, no improvement on Thematic Generalization, very little improvement on Confabulations (avoiding hallucinations). The last one seems the most important to note.

Tyler Cowen was very positive, he seems like the perfect customer for o3-pro? By which I mean he can context shift easily so he doesn’t mind waiting, and also often uses queries where these models get a lot of value out of going at problems super hard, and relatively less value out of the advantages of other models (doesn’t want the personality, doesn’t want to code, and so on).

Tyler Cowen: It is very, very good. Hallucinates far less than other models. Can solve economics problems that o3 cannot. It can be slow, but that is what we have Twitter scrolling for, right? While we are waiting for o3 pro to answer a query we can read about o3 pro.

Contrast that with the score on Confabulations not changing. I am guessing there is a modest improvement, for reasons described earlier.

There are a number of people pointing out places o3-pro solves something o3 doesn't, such as here, where it solved the gimbal UAP mystery in 18 minutes.

McKay Wrigley, eternal optimist, agrees on many fronts.

McKay Wrigley: My last 4 o3 Pro requests in ChatGPT… It thought for: – 26m 10s – 23m 45s – 19m 6s – 21m 18s Absolute *powerhouse* of a model.

Testing how well it can 1-shot complex problems – impressed so far.

It’s too slow to use as a daily driver model (makes sense, it’s a beast!), but it’s a great “escalate this issue” model. If the current model you’re using is struggling with a task, then escalate it to o3 pro.

This is not a “vibe code” model.

This is the kind of model where you’ll want to see how useful it is to people like Terence Tao and Tyler Cowen.

Btw the point of this post was that I’m happy to have a model that is allowed to think for a long time.

To me that’s the entire point of having a “Pro” version of the model – let it think!

Obviously more goes into evaluating if it’s a great model (imo it’s really powerful).

Here’s a different kind of vibe coding, perhaps?

Conrad Barski: For programming tasks, I can give o3 pro some code that needs a significant revision, then ramble on and on about what the various attributes of the revision need to be and then it can reliably generate an implementation of the revision.

It feels like with previous models I had to give them more hand holding to get good results, I had to write my requests in a more thoughtful, structured way, spending more time on prompting technique.

o3 pro, on the other hand, can take loosely-connected constraints and then "fill in the gaps" in a relatively intelligent way – I feel it does this better than any other model so far.

The time cost and dollar costs are very real.

Matt Shumer: My initial take on o3 Pro:

It is not a daily-driver coding model.

It’s a superhuman researcher + structured thinker, capable of taking in massive amounts of data and uncovering insights you would probably miss on your own.

Use it accordingly.

I reserve the right to alter my take.

Bayram Annokov: slow, expensive, and veeeery good – definitely a jump up in analytical tasks

Emad: 20 o3 prompts > o3 pro except for some really advanced specific stuff I have found. Only use it as a final check really or when stumped.

Eyes Alight: it is so very slow it took 13 minutes to answer a trivial question about a post on Twitter. I understand the appeal intellectually of an Einstein at 1/20th speed, but in reality I’m not sure I have the patience for it.

Clay: o3-pro achieving breakthrough performance in taking a long time to think.

Dominik Lukes: Here’s my o3 Pro testing results thread. Preliminary conclusions:

– great at analysis

– slow and overthinking simple problems

– o3 is enough for most tasks

– still fails SVG bike and local LLM research test

– very few people need it

– it will take time to develop a feel for it

Kostya Medvedovsky: For a lot of problems, it reminds me very strongly of Deep Research. Takes about the same amount of time, and will spend a lot of effort scouring the web for the answer to the question.

Makes me wish I could optionally turn off web access and get it to focus more on the reasoning aspect.

This may be user error and I should be giving it *way* more context.

Violet: you can turn search off, and only turn search on for specific prompts.

Xeophon: TL;DR:

o3 pro is another step up, but for going deep, not wide. It is good to go down one path, solve one problem; not for getting a broad overview about different topics/papers etc. Then it hallucinates badly, use ODR for this.

Part of ‘I am very intelligent’ is knowing when to think for longer and when not to. In that sense, o3-pro is not so smart, you have to take care of that question yourself. I do understand why this decision was made, let the user control that.

I agree with Lukes that most people do not ‘need’ o3 pro and they will be fine not paying for it, and for now they are better off with their expensive subscription (if any) being Claude Max. But even if you don’t need it, the queries you benefit from can still be highly useful.

It makes sense to default to using Opus and o3 (and for quick stuff Sonnet).

o3-pro is too slow to be a good ‘default’ model, especially for coding. I don’t want to have to reload my state in 15 minute intervals. It may or may not be good for the ‘call in the big guns’ role in coding, where you have a problem that Opus and Gemini (and perhaps regular o3) have failed to solve, but which you think o3-pro might get.

Here's one that both seems centrally wrong but also makes an important point:

Nabeel Qureshi: You need to think pretty hard to get a set of evals which allows you to even distinguish between o3 and o3 pro.

Implication: “good enough AGI” is already here.

The obvious evals where it does better are Codeforces, and also ‘user preferences.’ Tyler Cowen’s statement suggests hallucination rate, which is huge if true (and it better be true, I’m not waiting 20 minutes that often to get an o3-level lying liar.) Tyler also reports there are questions where o3 fails and o3-pro succeeds, which is definitive if the gap is only one way. And of course if all else fails you can always have them do things like play board games against each other, as one answer suggests.

Nor do I think either o3 or o3-pro is the AGI you are looking for.

However, it is true that for a large percentage of tasks, o3 is ‘good enough.’ That’s even true in a strict sense for Claude Sonnet or even Gemini Flash. Most of the time one has a query, the amount of actually needed intelligence is small.

In the limit, we’ll have to rely on AIs to tell us which AI model is smarter, because we won’t be smart enough to tell the difference. What a weird future.

(Incidentally, this has already been the case in chess for years. Humans cannot tell the difference between a 3300 elo and a 3600 elo chess engine; we just make them fight it out and count the number of wins.)

You can tell 3300 from 3600 in chess, but only because you can tell who won. If almost any human looked at individual moves, you’d have very little idea.

I always appreciate people thinking at the limit rather than only on the margin. This is a central case of that.

Here’s one report that it’s doing well on the fully informal FictionBench:

Chris: Going to bed now, but had to share something crazy: been testing the o3 pro model, and honestly, the writing capabilities are astounding. Even with simple prompts, it crafts medium to long-form stories that make me deeply invested & are engaging; they come with surprising twists, and each one carries this profound, meaningful depth that feels genuinely human.

The creativity behind these narratives is wild, far beyond what I'd expect from most writers today. We're talking sophisticated character development, nuanced plot arcs, and emotional resonance, all generated seamlessly. It's genuinely hard to believe this is early-stage reinforcement learning with compute added at test time; the potential here is mind blowing. We're witnessing just the beginning of AI enhanced storytelling, and already it's surpassing what many humans can create. Excited to see what's next with o4. Goodnight!

This contrasts with:

Archivedvideos: Really like it for technical stuff, soulless

Julius: I asked it to edit an essay and it took 13 minutes and provided mediocre results. Different from but slightly below the quality of 4o. Much worse than o3 or either Claude 4 model

Other positive reactions include Matt Wigdahl being impressed on a hairy RDP-related problem, a66mike99 getting interesting output and pushback on the request (in general I like this, although if you’re thinking for 20 minutes this could be a lot more frustrating?), niplav being impressed by results on a second attempt after Claude crafted a better prompt (this seems like an excellent workflow!), and Sithis3 saying o3-pro solves many problems o3 struggles on.

The obvious counterpoint is some people didn’t get good responses, and saw it repeating the flaws in o3.

Erik Hoel: First o3 pro usage. Many mistakes. Massive overconfidence. Clear inability to distinguish citations, pay attention to dates. Does anyone else actually use these models? They may be smarter on paper but they are increasingly lazy and evil in practice.

Kukutz: very very very slow, not so clever (can’t solve my semantic puzzle).

Allen: I think it’s less of an upgrade compared to base model than o1-pro was. Its general quality is better on avg but doesn’t seem to hit “next-level” on any marks. Usually mentions the same things as o3.

I think OAI are focused on delivering GPT-5 more than anything.

This thread from Xeophon features reactions that are mixed but mostly meh.

Or to some it simply doesn’t feel like much of a change at all.

Nikita Sokolsky: Feels like o3’s outputs after you fix the grammar and writing in Claude/Gemini: it writes less concisely but haven’t seen any “next level” prompt responses just yet.

MartinDeVido: Meh….

Here’s a fun reminder that details can matter a lot:

John Hughes: I was thrilled yesterday: o3-pro was accepting ~150k tokens of context (similar to Opus), a big step up from regular o3, which allows only a third as much in ChatGPT. @openai seems to have changed that today. Queries I could do yesterday are now rejected as too long.

With such a low context limit, o3-pro is much less useful to lawyers than o1-pro was. Regular o3 is great for quick questions/mini-research, but Gemini is better at analyzing long docs and Opus is tops for coding. Not yet seeing answers where o3-pro is noticeably better than o3.

I presume that even at $200/month, the compute costs of letting o3-pro have 150k input tokens would add up fast, if people actually used it a lot.

This is one of the things I’ve loved the most so far about o3-pro.

Jerry Liu: o3-pro is extremely good at reasoning, extremely slow, and extremely concise – a top-notch consultant that will take a few minutes to think, and output bullet points.

Do not ask it to write essays for you.

o3-pro will make you wait, but its answer will not waste your time. This is a sharp contrast to Deep Research queries, which will take forever to generate and then include a ton of slop.

It is not the main point but I must note the absence of a system card update. When you are releasing what is likely the most powerful model out there, o3-pro, was everything you needed to say truly already addressed by the model card for o3?

OpenAI: As o3-pro uses the same underlying model as o3, full safety details can be found in the o3 system card.

Miles Brundage: This last sentence seems false?

The system card does not appear to have been updated even to incorporate the information in this thread.

The whole point of the term system card is that the model isn’t the only thing that matters.

If they didn’t do a full Preparedness Framework assessment, e.g. because the evals weren’t too different and they didn’t consider it a good use of time given other coming launches, they should just say that, I think.

If o3-pro were the max capability level, I wouldn’t be super concerned about this, and I actually suspect it is the same Preparedness Framework level as o3.

The problem is that this is not the last launch, and lax processes/corner-cutting/groupthink get more dangerous each day.

As OpenAI put it, ‘there’s no such thing as a small launch.’

The link they provide goes to ‘Model Release Notes,’ which is not quite nothing, but it isn’t much and does not include a Preparedness Framework evaluation.

I agree with Miles that not providing a system card for o3-pro can be fine, but you need to state your case for why you don't need one. This can be any of:

  1. The old system card tested for what happens at higher inference costs (as it should!) so we effectively were testing o3-pro the whole time, and we’re fine.

  2. The Preparedness team tested o3-pro and found it not appreciably different from o3 in the ways we care about, providing no substantial additional uplift or other concerns, despite looking impressive in some other ways.

  3. This is only available at the $200 level, so it arguably doesn't count as a release of o3-pro (I don't actually think this is okay, but it would be consistent with previous decisions I also think aren't okay, and not an additional issue.)

As far as I can tell we’re basically in scenario #2, and they see no serious issues here. Which again is fine if true, and if they actually tell us that this is the case. But the framework is full of ‘here are the test results’ and presumably those results are different now. I want o3-pro on those charts.

What about alignment otherwise? Hard to say. I did notice this (but did not attempt to make heads or tails of the linked thread); it seems like what you would naively expect:

Yeshua God: Following the mesa-optimiser recipe to the letter. @aidan_mclau very troubling.

For many purposes, the 80% price cut in o3 seems more impactful than o3-pro. That’s a huge price cut, whereas o3-pro is still largely a ‘special cases only’ model.

Aaron Levie: With OpenAI dropping the price of o3 by 80%, today is a great reminder about how important it is to build for where AI is going instead of just what’s possible now. You can now get 5X the amount of output today for the same price you were paying yesterday.

If you’re building AI Agents, it means it’s far better to build capabilities that are priced and designed for the future instead of just economically reasonable today.

In general, we know there’s a tight correlation between the amount of compute spent on a problem and the level of successful outcomes we can get from AI. This is especially true with AI Agents that potentially can burn through hundreds of thousands or millions of tokens on a single task.

You’re always making trade-off decisions when building AI Agents around what level of accuracy or success you want and how much you want to spend: do you want to spend $0.10 for something to be 95% successful or $1 for something to be 99% successful? A 10X increase in cost for just a 4 pt improvement in results? At every price:success intersection a new set of use-cases from customers can be unlocked.

Normally when building technology that moves at a typical pace, you would primarily build features that are economically viable today (or with some slight efficiency gains anticipated at the rate of Moore’s Law, for instance). You’d be out of business otherwise. But with the cost of AI inference dropping rapidly, the calculus completely changes. In a world where the cost of inference could drop by orders of magnitude in a year or two, it means the way we build software to anticipate these cost drops changes meaningfully.

Instead of either building in lots of hacks to reduce costs, or going after only the most economically feasible use-cases today, this instructs you to build the more ambitious AI Agent capabilities that would normally seem too cost prohibitive to go after. Huge implications for how we build AI Agents and the kind of problems to go after.

I would say the cost of inference not only might drop an order of magnitude in a year or two; if you hold quality of outputs constant, it is all but certain to happen at least one more time. Where you 'take your profits' in quality versus quantity is up to you.
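As a quick back-of-the-envelope on the price-per-success trade-off Levie describes: if you assume a failed run simply gets retried until it succeeds (my simplifying assumption, not his), the expected cost per successful task is the per-run price divided by the success rate, which barely narrows the apparent 10x gap.

```python
def expected_cost_per_success(price_per_run: float, success_rate: float) -> float:
    """Expected spend per successful task if failed runs are retried until one
    succeeds; the expected number of runs is 1 / success_rate (geometric)."""
    return price_per_run / success_rate

cheap = expected_cost_per_success(0.10, 0.95)   # ~$0.105 per success
pricey = expected_cost_per_success(1.00, 0.99)  # ~$1.01 per success
print(f"cheap model:  ${cheap:.3f} per successful task")
print(f"pricey model: ${pricey:.3f} per successful task")
print(f"cost ratio:   {pricey / cheap:.1f}x")   # still roughly 9.6x
```

Under that framing, the real question is what a residual failure costs you downstream, which is exactly the price:success intersection Levie points at.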




The MacBook Air is the obvious loser as the sun sets on the Intel Mac era


In the end, Intel Macs have mostly gotten a better deal than PowerPC Macs did.

For the last three years, we’ve engaged in some in-depth data analysis and tea-leaf reading to answer two questions about Apple’s support for older Macs that still use Intel chips.

First, was Apple providing fewer updates and fewer years of software support to Macs based on Intel chips as it worked to transition the entire lineup to its internally developed Apple Silicon? And second, how long could Intel Mac owners reasonably expect to keep getting updates?

The answer to the first question has always been “it depends, but generally yes.” And this year, we have a definitive answer to the second question: For the bare handful of Intel Macs it supports, macOS 26 Tahoe will be the final new version of the operating system to support any of Intel’s chips.

To its credit, Apple has also clearly spelled this out ahead of time rather than pulling the plug on Intel Macs with no notice. The company has also said that it plans to provide security updates for those Macs for two years after Tahoe is replaced by macOS 27 next year. These Macs aren’t getting special treatment—this has been Apple’s unspoken, unwritten policy for macOS security updates for decades now—but to look past its usual “we don’t comment on our future plans” stance to give people a couple years of predictability is something we’ve been pushing Apple to do for a long time.

With none of the tea leaf reading left to do, we can now present a fairly definitive look at how Apple has handled the entire Intel transition, compare it to how the PowerPC-to-Intel switch went two decades ago, and predict what it might mean about support for Apple Silicon Macs.

The data

We’ve assembled an epoch-spanning spreadsheet of every PowerPC or Intel Mac Apple has released since the original iMac kicked off the modern era of Apple back in 1998. On that list, we’ve recorded the introduction date for each Mac, the discontinuation date (when it was either replaced or taken off the market), the version of macOS it shipped with, and the final version of macOS it officially supported.

For those macOS versions, we've recorded the dates they received their last major point update—these are the feature-adding updates these releases get when they're Apple's latest and greatest version of macOS, as macOS 15 Sequoia is right now. Apple releases security-only patches and Safari browser updates for old macOS versions for another two years after replacing them, so we've also recorded the dates that those Macs would have received their final security update. For Intel Macs that are still receiving updates (versions 13, 14, and 15) and macOS 26 Tahoe, we've extrapolated end-of-support dates based on Apple's past practices.

A 27-inch iMac model. It’s still the only Intel Mac without a true Apple Silicon replacement. Credit: Andrew Cunningham

We’re primarily focusing on two time spans: from the date of each Mac’s introduction to the date it stopped receiving major macOS updates, and from the date of each Mac’s introduction to the date it stopped receiving any updates at all. We consider any Macs inside either of these spans to be actively supported; Macs that are no longer receiving regular updates from Apple will gradually become less secure and less compatible with modern apps as time passes. We measure by years of support rather than number of releases, which controls for Apple’s transition to a once-yearly release schedule for macOS back in the early 2010s.

We’ve also tracked the time between each Mac model’s discontinuation and when it stopped receiving updates. This is how Apple determines which products go on its “vintage” and “obsolete” hardware lists, which determine the level of hardware support and the kinds of repairs that the company will provide.

We have lots of detailed charts, but here are some highlights:

  • For all Mac models tracked, the average Mac receives about 6.6 years of macOS updates that add new features, plus another two years of security-only updates.
  • If you only count the Intel era, the average is around seven years of macOS updates, plus two years of security-only patches.
  • Most (though not all) Macs released since 2016 come in lower than either of these averages, indicating that Apple has been less generous to most Intel Macs since the Apple Silicon transition began.
  • The three longest-lived Macs are still the mid-2007 15- and 17-inch MacBook Pros, the mid-2010 Mac Pro, and the mid-2007 iMac, which received new macOS updates for around nine years after their introduction (and security updates for around 11 years).
  • The shortest-lived Mac is still the late-2008 version of the white MacBook, which received only 2.7 years of new macOS updates and another 3.3 years of security updates from the time it was introduced. (Late PowerPC-era and early Intel-era Macs are all pretty bad by modern standards.)

The charts

If you bought a Mac any time between 2016 and 2020, you’re generally settling for fewer years of software updates than you would have gotten in the recent past. If you bought a Mac released in 2020, the tail end of the Intel era when Apple Silicon Macs were around the corner, your reward is the shortest software support window since 2006.

There are outliers in either direction. The sole iMac Pro, introduced in 2017 as Apple tried to regain some of its lost credibility with professional users, will end up with 7.75 years of updates plus another two years of security updates when all is said and done. Buyers of 2018–2020 MacBook Airs and the two-port version of the 2020 13-inch MacBook Pro, however, are treated pretty poorly, getting not quite 5.5 years of updates (plus two years of security patches) on average from the date they were introduced.

That said, most Macs usually end up getting a little over six years of macOS updates and two more years of security updates. If that’s a year or two lower than the recent past, it’s also not ridiculously far from the historical average.

If there's something to praise here, it's that Apple doesn't seem to treat any of its Macs differently based on how much they cost. Now that we have a complete overview of the Intel era, breaking out the support timelines by model rather than by model year shows that a Mac mini doesn't get dramatically more or less support than an iMac or a Mac Pro, despite costing a fraction of the price. A MacBook Air doesn't receive significantly more or less support than a MacBook Pro.

These are just averages, and some models are lucky while others are not. The no-adjective MacBook that Apple has sold on and off since 2006 is also an outlier, with fewer years of support on average than the other Macs.

If there's one overarching takeaway, it's that you should buy new Macs as close to the date of their introduction as possible if you want to maximize your software support window. Especially for Macs that were sold continuously for years and years—the 2013 and 2019 Mac Pro, the 2018 Mac mini, the non-Retina 2015 MacBook Air that Apple sold some version of for over four years—buying them toward the end of their retail lifecycle means settling for fewer years of updates than you would have gotten if you had waited for the introduction of a new model. And that's true even though Apple's hardware support timelines are all calculated from the date of last availability rather than the date of introduction.

It just puts Mac buyers in a bad spot when Apple isn’t prompt with hardware updates, forcing people to either buy something that doesn’t fully suit their needs or settle for something older that will last for fewer years.

What should you do with an older Intel Mac?

The big question: If your Intel Mac is still functional but Apple is no longer supporting it, is there anything you can do to keep it both secure and functional?

All late-model Intel Macs officially support Windows 10, but that OS has its own end-of-support date looming in October 2025. Windows 11 can be installed, but only if you bypass its system requirements; this can work well, though it requires additional fiddling when it comes time to install major updates. Consumer-focused Linux distributions like Ubuntu, Mint, or Pop!_OS may work, depending on your hardware, but they come with a steep learning curve for non-technical users. Google's ChromeOS Flex may also work, but ChromeOS is more functionally limited than most other operating systems.

The OpenCore Legacy Patcher provides one possible stay of execution for Mac owners who want to stay on macOS for as long as they can. But it faces two steep uphill climbs in macOS Tahoe. First, as Apple has removed more Intel Macs from the official support list, it has removed more of the underlying code from macOS that is needed to support those Macs and other Macs with similar hardware. This leaves more for the OpenCore Legacy Patcher team to patch in from older OSes, and this kind of forward-porting can leave hardware and software partly functional or non-functional.

Second, there’s the Apple T2 to consider. The Macs with a T2 treat it as a load-bearing co-processor, responsible for crucial operating system functions such as enabling Touch ID, serving as an SSD controller, encoding and decoding videos, communicating with the webcam and built-in microphone, and other operations. But Apple has never opened the T2 up to anyone, and it remains a bit of a black box for both the OpenCore/Hackintosh community and folks who would run Linux-based operating systems like Ubuntu or ChromeOS on that hardware.

The result is that the 2018 and 2019 MacBook Airs that didn't support macOS 15 Sequoia last year never had support added to the OpenCore Legacy Patcher, because the T2 chip simply won't communicate with the system when it's booted using OpenCore firmware. Some T2 Macs don't have this problem. But if yours does, it's unlikely that anyone will be able to do anything about it, and your software support will end when Apple says it does.

Does any of this mean anything for Apple Silicon Mac support?

Late-model Intel MacBook Airs have fared worse than other Macs in terms of update longevity. Credit: Valentina Palladino

It will likely be at least two or three years before we know for sure how Apple plans to treat Apple Silicon Macs. Will the company primarily look at specs and technical capabilities, as it did from the late-’90s through to the mid-2010s? Or will Apple mainly stop supporting hardware based on its age, as it has done for more recent Macs and most current iPhones and iPads?

The three models to examine for this purpose are the first ones to shift to Apple Silicon: the M1 versions of the MacBook Air, Mac mini, and 13-inch MacBook Pro, all launched in late 2020. If these Macs are dropped in, say, 2027 or 2028’s big macOS release, but other, later M1 Macs like the iMac stay supported, it means Apple is likely sticking to a somewhat arbitrary age-based model, with certain Macs cut off from software updates that they are perfectly capable of running.

But it's our hope that all Apple Silicon Macs have a long life ahead of them. The M2, M3, and M4 have all improved on the M1's performance and other capabilities, but the M1 Macs are much more capable than the Intel ones they supplanted, the M1 was used widely across various Mac models for years, and Mac owners pay much more for their devices than iPhone and iPad owners do. We'd love to see macOS return to the longer-tail software support it provided in the late-'00s and mid-2010s, when models could expect to see seven or eight all-new macOS versions and another two years of security updates afterward.

All signs point to Apple using the launch date of any given piece of hardware as the determining factor for continued software support. But that isn’t how it has always been, nor is it how it always has to be.


Andrew is a Senior Technology Reporter at Ars Technica, with a focus on consumer tech including computer hardware and in-depth reviews of operating systems like Windows and macOS. Andrew lives in Philadelphia and co-hosts a weekly book podcast called Overdue.



Here’s Kia’s new small, affordable electric car: The 2026 EV4 sedan

The mesh headrests are a clever touch, as they're both comfortable and lightweight. The controls built into the side of the passenger seat that let the driver change its position are a specialty of the automaker. There are also plenty of other conveniences, including wireless device charging, 100 W USB-C ports, and wireless Android Auto and Apple CarPlay. We relied on the native navigation app, which is not as visually pretty as the one you cast from your phone to the 12.3-inch infotainment screen, but it kept me on course on unfamiliar roads in a foreign country while I was suffering from jet lag. That seems worthy of a mention.

Public transport

Traffic in and around Seoul makes a wonderful case for public transport; it provided less of an opportunity for the EV4 to show its stuff beyond relatively low-speed stop-and-go, mostly topping out at 50 mph (80 km/h) on roads heavily studded with traffic cameras. As a result, forming a true impression of the car's range will require spending more time with it on US roads.

It was, however, an easy car to drive in traffic and to drive slowly. It’s no speed demon anyway; 0–62 mph (100 km/h) takes 7.4 seconds if you floor it in the standard range car, or 7.7 seconds in the big battery one. The ride is good over broken tarmac, although it is quite firm when dealing with short-duration bumps. Meanwhile, the steering is light but not particularly informative when it comes to providing a picture of what the front tires are doing.

Good driving dynamics help sell a car once someone has had a test drive, but most will only get that far if the pricing is right. That’s yet to be announced, and who knows what will happen with tariffs and the clean vehicle tax credit between now and when the cars arrive in dealerships toward the end of the year. However, we expect the standard-range car to start between $37,000 and $39,000, undercutting the Tesla Model 3 in the process. That sounds rather compelling to me.



Delightfully irreverent Underdogs isn’t your parents’ nature docuseries

Narrator Ryan Reynolds celebrates nature’s outcasts in the new NatGeo docuseries Underdogs.

Most of us have seen a nature documentary or two (or three) at some point in our lives, so it’s a familiar format: sweeping, majestic footage of impressively regal animals accompanied by reverently high-toned narration (preferably with a tony British accent). Underdogs, a new docuseries from National Geographic, takes a decidedly different approach. Narrated with hilarious irreverence by Ryan Reynolds, the five-part series highlights nature’s less cool and majestic creatures: the outcasts and benchwarmers, more noteworthy for their “unconventional hygiene choices” and “unsavory courtship rituals.” It’s like The Suicide Squad or Thunderbolts*, except these creatures actually exist.

Per the official premise, “Underdogs features a range of never-before-filmed scenes, including the first time a film crew has ever entered a special cave in New Zealand—a huge cavern that glows brighter than a bachelor pad under a black light thanks to the glowing butts of millions of mucus-coated grubs. All over the world, overlooked superstars like this are out there 24/7, giving it maximum effort and keeping the natural world in working order for all those showboating polar bears, sharks and gorillas.” It’s rated PG-13 thanks to the odd bit of scatological humor and shots of Nature Sexy Time.

Each of the five episodes is built around a specific genre. “Superheroes” highlights the surprising superpowers of the honey badger, pistol shrimp, and the invisible glass frog, among others, augmented with comic book graphics; “Sexy Beasts” focuses on bizarre mating habits and follows the format of a romantic advice column; “Terrible Parents” highlights nature’s worst practices, following the outline of a parenting guide; “Total Grossout” is exactly what it sounds like; and “The Unusual Suspects” is a heist tale, documenting the supposed efforts of a macaque to put together the ultimate team of masters of deception and disguise (an inside man, a decoy, a fall guy, etc.). Green Day even wrote and recorded a special theme song for the opening credits.

Co-creators Mark Linfield and Vanessa Berlowitz of Wildstar Films are longtime producers of award-winning wildlife films, most notably Frozen Planet, Planet Earth, and David Attenborough’s Life of Mammals—you know, the kind of prestige nature documentaries that have become a mainstay for National Geographic and the BBC, among others. They’re justly proud of that work, but this time around the duo wanted to try something different.

Delightfully irreverent Underdogs isn’t your parents’ nature docuseries Read More »

companies-may-soon-pay-a-fee-for-their-rockets-to-share-the-skies-with-airplanes

Companies may soon pay a fee for their rockets to share the skies with airplanes


Some space companies aren’t necessarily against this idea, but SpaceX hasn’t spoken.

Starship soars through the stratosphere. Credit: Stephen Clark/Ars Technica

The Federal Aviation Administration may soon levy fees on companies seeking launch and reentry licenses, a new tack in the push to give the agency the resources it needs to keep up with the rapidly growing commercial space industry.

The text of a budget reconciliation bill released by Sen. Ted Cruz (R-Texas) last week calls for the FAA’s Office of Commercial Space Transportation, known as AST, to begin charging licensing fees to space companies next year. The fees would phase in over eight years, after which the FAA would adjust them to keep pace with inflation. The money would go into a trust fund to help pay for the operating costs of the FAA’s commercial space office.

The bill released by Cruz’s office last week covers federal agencies under the oversight of the Senate Commerce Committee, which he chairs. These agencies include the FAA and NASA. Ars recently covered Cruz’s proposals for NASA to keep the Space Launch System rocket, Orion spacecraft, and Gateway lunar space station alive, while the Trump administration aims to cancel Gateway and end the SLS and Orion programs after two crew missions to the Moon.

The Trump administration’s fiscal year 2026 budget request, released last month, proposes $42 million for the FAA’s Office of Commercial Space Transportation, a fraction of the agency’s overall budget request of $22 billion. The FAA’s commercial space office received an almost identical funding level in 2024 and 2025. Accounting for inflation, this is effectively a budget cut for AST. The office’s budget increased from $27.6 million to more than $42 million between 2021 and 2024, when companies like SpaceX began complaining the FAA was not equipped to keep up with the fast-moving commercial launch industry.

The FAA licensed 11 commercial launch and reentry operations in 2015, when AST’s budget was $16.6 million. Last year, the number of space operations increased to 164, and the US industry is on track to conduct more than 200 commercial launches and reentries in 2025. SpaceX’s Falcon 9 rocket is doing most of these launches.

While the FAA’s commercial space office receives more federal funding today, the budget hasn’t grown to keep up with the cadence of commercial spaceflight. SpaceX officials urged the FAA to double its licensing staff in 2023 after the company experienced delays in securing launch licenses.

In the background, a Falcon 9 rocket climbs away from Space Launch Complex 40 at Cape Canaveral Space Force Station, Florida. Another Falcon 9 stands on its launch pad at neighboring Kennedy Space Center awaiting its opportunity to fly.

Adding it up

Cruz’s section of the Senate reconciliation bill calls for the FAA to charge commercial space companies per pound of payload mass, beginning with 25 cents per pound in 2026 and increasing to $1.50 per pound in 2033. Subsequent fee rates would change based on inflation. The overall fee per launch or entry would be capped at $30,000 in 2026, increasing to $200,000 in 2033, and then adjusted to keep pace with inflation.

The Trump administration has not weighed in on Cruz’s proposed fee schedule, but Trump’s nominee for the next FAA administrator, Bryan Bedford, agreed with the need for launch and reentry licensing fees in a Senate confirmation hearing Wednesday. Most of the hearing’s question-and-answer session focused on the safety of commercial air travel, but there was a notable exchange on the topic of commercial spaceflight.

Cruz said the rising number of space launches will “add considerable strain to the airspace system” in the United States. Airlines and their passengers pay FAA-mandated fees for each flight segment, and private owners pay the FAA a fee to register their aircraft. The FAA also charges overflight fees to aircraft traveling through US airspace, even if they don’t take off or land in the United States.

“Nearly every user of the National Airspace System pays something back into the system to help cover their operational costs, yet under current law, space launch companies do not, and there is no mechanism for them to pay even if they wish to,” Cruz said. “As commercial spaceflight expands rapidly, so does its impact on the FAA’s ability to operate the National Airspace System. This proposal accounts for that.”

When asked if he agreed, Trump’s FAA nominee suggested he did. Bedford, president and CEO of Republic Airways, is poised to take the helm of the federal aviation regulator if he passes Senate confirmation.

Bryan Bedford is seen prior to his nomination hearing before the Senate Commerce Committee to lead the Federal Aviation Administration on June 11, 2025. Credit: Craig Hudson For The Washington Post via Getty Images

The FAA clears airspace of commercial and private air traffic along the flight corridors of rockets as they launch into space, and around the paths of spacecraft as they return to Earth. The agency is primarily charged with ensuring commercial rockets don’t endanger the public. The National Airspace System (NAS) consists of 29 million square miles of airspace over land and oceans. The FAA says more than 45,000 flights and 2.9 million airline passengers travel through the airspace every day.

Bedford said he didn’t want to speak on specific policy proposals before the Trump administration announces an official position on the matter.

“But I’ll confirm you’re exactly right,” Bedford told Cruz. “Passengers and airlines themselves pay significant taxes. … Those taxes are designed to modernize our NAS. One of the things that is absolutely critical in modernization is making sure we design the NAS so it can accommodate an increased cadence in space launch, so I certainly support where you’re going with that.”

SpaceX would be the company most affected by the proposed licensing fees. The majority of SpaceX’s missions launch the company’s own Starlink broadband satellites aboard Falcon 9 rockets. Most of those launches carry around 17 metric tons (about 37,500 pounds) of usable payload mass.

A quick calculation shows that SpaceX would pay a fee of roughly $9,400 for an average Starlink launch on a Falcon 9 rocket next year if Cruz’s legislation is signed into law. SpaceX launched 89 dedicated Starlink missions last year. That would add up to more than $800,000 in annual fees going into the FAA’s coffers under Cruz’s licensing scheme. Once you account for all of SpaceX’s other commercial launches, this number would likely exceed $1 million.

Assuming Falcon 9s continue to launch Starlink satellites in 2033, the fees would rise to approximately $56,000 per launch. SpaceX may have switched over all Starlink missions to its giant new Starship rocket by then, in which case the company will likely reach the FAA’s proposed fee cap of $200,000 per launch. SpaceX hopes to launch Starships at lower cost than it currently launches the Falcon 9 rocket, so this proposal would see SpaceX pay a significantly larger fraction of its per-mission costs in the form of FAA fees.
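For readers who want to check the arithmetic, here is a minimal sketch of the proposed fee calculation using only the figures cited above: 25 cents per pound and a $30,000 cap in 2026, rising to $1.50 per pound and a $200,000 cap in 2033. The straight-line ramp between those two endpoint years is an assumption made purely for illustration; the bill’s actual year-by-year schedule may differ.

```python
# Rough sketch of the proposed FAA launch/reentry licensing fee, based only on
# the endpoints cited in the article. The linear ramp between 2026 and 2033 is
# an assumption for illustration, not the bill's actual schedule.

def proposed_fee(payload_lbs: float, year: int) -> float:
    """Estimate the per-launch fee for a given payload mass and year."""
    # Interpolate the per-pound rate and the fee cap between the 2026 and 2033
    # endpoints (assumption), clamping outside that range.
    frac = min(max((year - 2026) / (2033 - 2026), 0.0), 1.0)
    rate_per_lb = 0.25 + frac * (1.50 - 0.25)
    cap = 30_000 + frac * (200_000 - 30_000)
    return min(payload_lbs * rate_per_lb, cap)

starlink_payload_lbs = 37_500  # ~17 metric tons of usable payload per Falcon 9 Starlink launch

print(proposed_fee(starlink_payload_lbs, 2026))        # 9375.0, the "roughly $9,400" figure
print(proposed_fee(starlink_payload_lbs, 2026) * 89)   # ~834,000 across 89 Starlink missions
print(proposed_fee(starlink_payload_lbs, 2033))        # 56250.0, the "approximately $56,000" figure
```

Running the sketch reproduces the rough numbers above, with the cap only coming into play for much heavier payloads such as a fully loaded Starship.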

Industry reaction

A senior transportation official in the Biden administration voiced tentative support in 2023 for a fee scheme similar to the one under consideration by the Senate. Michael Huerta, a former FAA administrator during the Obama administration and the first Trump administration, told NPR last year that he supports the idea.

“You have this group of new users that are paying nothing into the system that are an increasing share of the operations,” Huerta said. “I truly believe the current structure isn’t sustainable.”

The Commercial Spaceflight Federation, an industry advocacy group that includes SpaceX and Blue Origin among its membership, signaled last year it was against the idea of creating launch and reentry fees, or taxes, as some industry officials call them. Commercial launch and reentry companies have been excluded from FAA fees to remove regulatory burdens and help the industry grow. The federation told NPR last year that because the commercial space industry requires access to US airspace much less often than the aviation industry, it would not yet be appropriate to have space companies pay into an FAA trust fund.

SpaceX did not respond to questions from Ars on the matter. United Launch Alliance would likely be the second-largest payer of FAA fees, at least over the next couple of years, with numerous missions in its backlog to launch massive stacks of Internet satellites for Amazon’s Project Kuiper network from Cape Canaveral Space Force Station in Florida.

A ULA spokesperson told Ars the company is still reviewing and assessing the Senate Commerce Committee’s proposal. “In general, we are supportive of fees that are affordable, do not disadvantage US companies against their foreign counterparts, are fair, equitable, and are used to directly improve the shared infrastructure at the Cape and other spaceports,” the spokesperson said.


Stephen Clark is a space reporter at Ars Technica, covering private space companies and the world’s space agencies. Stephen writes about the nexus of technology, science, policy, and business on and off the planet.

Companies may soon pay a fee for their rockets to share the skies with airplanes Read More »

biofuels-policy-has-been-a-failure-for-the-climate,-new-report-claims

Biofuels policy has been a failure for the climate, new report claims

The new report concludes not only that the expansion of ethanol will increase greenhouse gas emissions, but also that it has failed to provide the social and financial benefits to Midwestern communities that lawmakers and the industry say it has. (The report defines the Midwest as Illinois, Indiana, Iowa, Kansas, Michigan, Minnesota, Missouri, Nebraska, North Dakota, Ohio, South Dakota, and Wisconsin.)

“The benefits from biofuels remain concentrated in the hands of a few,” Leslie-Bole said. “As subsidies flow, so may the trend of farmland consolidation, increasing inaccessibility of farmland in the Midwest, and locking out emerging or low-resource farmers. This means the benefits of biofuels production are flowing to fewer people, while more are left bearing the costs.”

New policies being considered in state legislatures and Congress, including additional tax credits and support for biofuel-based aviation fuel, could expand production, potentially causing more land conversion and greenhouse gas emissions and widening the gap between rural communities and wealthy agribusinesses, at a time when food demand is climbing and, critics say, land should be used to grow food instead.

President Donald Trump’s tax cut bill, passed by the House and currently being negotiated in the Senate, would not only extend tax credits for biofuels producers but would also specifically exclude emissions from land conversion from the calculations that determine what qualifies as a low-emission fuel.

The primary biofuels industry trade groups, including Growth Energy and the Renewable Fuels Association, did not respond to Inside Climate News requests for comment or interviews.

An employee with the Clean Fuels Alliance America, which represents biodiesel and sustainable aviation fuel producers rather than ethanol makers, said the report vastly overstates the carbon emissions from crop-based fuels by comparing farmed land to natural landscapes that no longer exist.

They also noted that the economic impact of soy-based fuels in 2024 was more than $42 billion, supporting over 100,000 jobs.

“Ten percent of the value of every bushel of soybeans is linked to biomass-based fuel,” they said.

Biofuels policy has been a failure for the climate, new report claims Read More »

trump’s-ftc-may-impose-merger-condition-that-forbids-advertising-boycotts

Trump’s FTC may impose merger condition that forbids advertising boycotts

FTC chair alleged “serious risk” from ad boycotts

After Musk’s purchase of Twitter, the social network lost advertisers for various reasons, including changes to content moderation and an incident in which Musk posted a favorable response to an antisemitic tweet and then told concerned advertisers to “go fuck yourself.”

FTC Chairman Andrew Ferguson said at a conference in April that “the risk of an advertiser boycott is a pretty serious risk to the free exchange of ideas.”

“If advertisers get into a back room and agree, ‘We aren’t going to put our stuff next to this guy or woman or his or her ideas,’ that is a form of concerted refusal to deal,” Ferguson said. “The antitrust laws condemn concerted refusals to deal. Now, of course, because of the First Amendment, we don’t have a categorical antitrust prohibition on boycotts. When a boycott ceases to be economic for purposes of the antitrust laws and becomes purely First Amendment activity, the courts have not been super clear—[it’s] sort of a ‘we know it when we see it’ type of thing.”

The FTC website says that any individual company acting on its own may “refuse to do business with another firm, but an agreement among competitors not to do business with targeted individuals or businesses may be an illegal boycott, especially if the group of competitors working together has market power.” The examples given on the FTC webpage are mostly about price competition and do not address the widespread practice of companies choosing where to place advertising based on concerns about their brands.

We contacted the FTC about the merger review today and will update this article if it provides any comment.

X’s ad lawsuit

X’s lawsuit targets a World Federation of Advertisers initiative called the Global Alliance for Responsible Media (GARM), a now-defunct program that Omnicom and Interpublic participated in. X itself was part of the GARM initiative, which shut down after X filed the lawsuit. X alleged that the defendants conspired “to collectively withhold billions of dollars in advertising revenue.”

The World Federation of Advertisers said in a court filing last month that GARM was founded “to bring clarity and transparency to disparate definitions and understandings in advertising and brand safety in the context of social media. For example, certain advertisers did not want platforms to advertise their brands alongside content that could negatively impact their brands.”

Trump’s FTC may impose merger condition that forbids advertising boycotts Read More »

there’s-another-leak-on-the-iss,-but-nasa-is-not-saying-much-about-it

There’s another leak on the ISS, but NASA is not saying much about it

No one is certain. The best guess is that the seals on the hatch leading to the PrK module are, in some way, leaking. In this scenario, pressure from the station is feeding the leak inside the PrK module through these seals, leading to a stable pressure inside—making it appear as though the PrK module leaks are fully repaired.

At this point, NASA is monitoring the ongoing leak and preparing for any possibility. A senior industry source told Ars that the NASA leadership of the space station program is “worried” about the leak and its implications.

This is one reason the space agency delayed the launch of a commercial mission carrying four astronauts to the space station, Axiom-4, on Thursday.

“The postponement of Axiom Mission 4 provides additional time for NASA and Roscosmos to evaluate the situation and determine whether any additional troubleshooting is necessary,” NASA said in a statement. “A new launch date for the fourth private astronaut mission will be provided once available.”

One source indicated that the new tentative launch date is now June 18. However, this will depend on whatever resolution there is to the leak issue.

What’s the worst that could happen?

The worst-case scenario for the space station is that the ongoing leaks are a harbinger of a phenomenon known as “high cycle fatigue,” which affects metal, including aluminum. Consider a metal clothes hanger: bend it once and it simply flexes, but bend it back and forth enough times and it snaps. As the metal fatigues, it hardens and eventually fails, suddenly and without warning, as was the case with an Aloha Airlines flight in 1988.

The concern is that some of these metal structures on board the station could fail quickly and catastrophically. Accordingly, in its previous assessments, NASA has classified the structural cracking issue on the space station as the highest level of concern on the 5×5 risk matrix it uses to gauge the likelihood and severity of risks to the space station.

In the meantime, the space agency has not been forthcoming with any additional information. Despite many questions from Ars Technica and other publications, NASA has not scheduled a press conference or said anything else publicly about the leaks beyond stating, “The crew aboard the International Space Station is safely conducting normal operations.”

There’s another leak on the ISS, but NASA is not saying much about it Read More »