Author name: Kris Guyer

hp-agrees-to-$4m-settlement-over-claims-of-“falsely-advertising”-pcs,-keyboards

HP agrees to $4M settlement over claims of “falsely advertising” PCs, keyboards

HP Inc. has agreed to pay a $4 million settlement to customers after being accused of “false advertising” of computers and peripherals on its website.

Earlier this month, Judge P. Casey Pitts for the US District Court of the San Jose Division of the Northern District of California granted preliminary approval [PDF] of a settlement agreement regarding a class-action complaint first filed against HP on October 13, 2021. The complaint accused HP’s website of showing “misleading” original pricing for various computers, mice, and keyboards that was higher than how the products were recently and typically priced.

Per the settlement agreement [PDF], HP will contribute $4 million to a “non-reversionary common fund, which shall be used to pay the (i) Settlement Class members’ claims; (ii) court-approved Notice and Settlement Administration Costs; (iii) court-approved Settlement Class Representatives’ Service Award; and (iv) court-approved Settlement Class Counsel Attorneys’ Fees and Costs Award. All residual funds will be distributed pro rata to Settlement Class members who submitted valid claims and cashed checks.”

The two plaintiffs who filed the initial complaint may also file a motion to receive a settlement class representative service award for up to $5,000 each, which would come out of the $4 million pool.

People who purchased a discounted HP desktop, laptop, mouse, or keyboard that was on sale for “more than 75 percent of the time the products were offered for sale” from June 5, 2021, to October 28, 2024, are eligible for compensation. The full list of eligible products is available here [PDF] and includes HP Spectre, Chromebook Envy, and Pavilion laptops, HP Envy and Omen desktops, and some mechanical keyboards and wireless mice. Depending on the product, class members can receive $10 to $100 per eligible product purchased.

An amended complaint filed on July 15, 2022 [PDF] accused HP of breaking the Federal Trade Commission’s laws against deceptive pricing. Among the examples provided was Rodney Carvalho’s experience buying an HP All-in-One 24-dp1056qe in September 2021. The complaint reported that HP.com advertised the AIO as being on sale for $899.99 and featured text saying “Save $100 instantly.” The AIO’s listing reportedly had a strike-through price suggesting that the computer used to cost $999.99. But, per the complaint, “in the weeks and months prior to Carvalho’s purchase, HP rarely, if ever, offered his computer for sale at the advertised strike-through price of $999.99.” The filing claimed that the PC had been going for $899.99 since April 2021.

HP agrees to $4M settlement over claims of “falsely advertising” PCs, keyboards Read More »

When Patching Isn’t Enough


Executive Briefing

What Happened:

A stealthy, persistent backdoor was discovered in over 16,000 Fortinet firewalls. This wasn’t a new vulnerability – it was a case of attackers exploiting a subtle part of the system (language folders) to maintain unauthorized access even after the original vulnerabilities had been patched.

What It Means:

Devices that were considered “safe” may still be compromised. Attackers had read-only access to sensitive system files via symbolic links placed on the file system – completely bypassing traditional authentication and detection. Even if a device was patched months ago, the attacker could still be in place.

Business Risk:

  • Exposure of sensitive configuration files (including VPN, admin, and user data)
  • Reputational risk if customer-facing infrastructure is compromised
  • Compliance concerns depending on industry (HIPAA, PCI, etc.)
  • Loss of control over device configurations and trust boundaries

What We’re Doing About It:

We’ve implemented a targeted remediation plan that includes firmware patching, credential resets, file system audits, and access control updates. We’ve also embedded long-term controls to monitor for persistence tactics like this in the future.

Key Takeaway For Leadership:

This isn’t about one vendor or one CVE. This is a reminder that patching is only one step in a secure operations model. We’re updating our process to include persistent threat detection on all network appliances – because attackers aren’t waiting around for the next CVE to strike.


What Happened

Attackers exploited Fortinet firewalls by planting symbolic links in language file folders. These links pointed to sensitive root-level files, which were then accessible through the SSL-VPN web interface.

The result: attackers gained read-only access to system data with no credentials and no alerts. This backdoor remained even after firmware patches – unless you knew to remove it.

FortiOS Versions That Remove the Backdoor:

  • 7.6.2
  • 7.4.7
  • 7.2.11
  • 7.0.17
  • 6.4.16

If you’re running anything older, assume compromise and act accordingly.


The Real Lesson

We tend to think of patching as a full reset. It’s not. Attackers today are persistent. They don’t just get in and move laterally – they burrow in quietly, and stay.

The real problem here wasn’t a technical flaw. It was a blind spot in operational trust: the assumption that once we patch, we’re done. That assumption is no longer safe.


Ops Resolution Plan: One-Click Runbook

Playbook: Fortinet Symlink Backdoor Remediation

Purpose:
Remediate the symlink backdoor vulnerability affecting FortiGate appliances. This includes patching, auditing, credential hygiene, and confirming removal of any persistent unauthorized access.


1. Scope Your Environment

  • Identify all Fortinet devices in use (physical or virtual).
  •  Inventory all firmware versions.
  •  Check which devices have SSL-VPN enabled.

2. Patch Firmware

Patch to the following minimum versions:

  • FortiOS 7.6.2
  • FortiOS 7.4.7
  • FortiOS 7.2.11
  • FortiOS 7.0.17
  • FortiOS 6.4.16

Steps:

  •  Download firmware from Fortinet support portal.
  •  Schedule downtime or a rolling upgrade window.
  •  Backup configuration before applying updates.
  •  Apply firmware update via GUI or CLI.

3. Post-Patch Validation

After updating:

  •  Confirm version using get system status.
  •  Verify SSL-VPN is operational if in use.
  •  Run diagnose sys flash list to confirm removal of unauthorized symlinks (Fortinet script included in new firmware should clean it up automatically).

4. Credential & Session Hygiene

  •  Force password reset for all admin accounts.
  •  Revoke and re-issue any local user credentials stored in FortiGate.
  •  Invalidate all current VPN sessions.

5. System & Config Audit

  •  Review admin account list for unknown users.
  •  Validate current config files (show full-configuration) for unexpected changes.
  •  Search filesystem for remaining symbolic links (optional):
find / -type l -ls | grep -v "https://gigaom.com/usr"

6. Monitoring and Detection

  •  Enable full logging on SSL-VPN and admin interfaces.
  •  Export logs for analysis and retention.
  •  Integrate with SIEM to alert on:
    • Unusual admin logins
    • Access to unusual web resources
    • VPN access outside expected geos

7. Harden SSL-VPN

  •  Limit external exposure (use IP allowlists or geo-fencing).
  •  Require MFA on all VPN access.
  •  Disable web-mode access unless absolutely needed.
  •  Turn off unused web components (e.g., themes, language packs).

Change Control Summary

Change Type: Security hotfix
Systems Affected: FortiGate appliances running SSL-VPN
Impact: Short interruption during firmware upgrade
Risk Level: Medium
Change Owner: [Insert name/contact]
Change Window: [Insert time]
Backout Plan: See below
Test Plan: Confirm firmware version, validate VPN access, and run post-patch audits


Rollback Plan

If upgrade causes failure:

  1. Reboot into previous firmware partition using console access.
    • Run: exec set-next-reboot primary or secondary depending on which was upgraded.
  2. Restore backed-up config (pre-patch).
  3. Disable SSL-VPN temporarily to prevent exposure while issue is investigated.
  4. Notify infosec and escalate through Fortinet support.

Final Thought

This wasn’t a missed patch. It was a failure to assume attackers would play fair.

If you’re only validating whether something is “vulnerable,” you’re missing the bigger picture. You need to ask: Could someone already be here?

Security today means shrinking the space where attackers can operate – and assuming they’re clever enough to use the edges of your system against you.

The post When Patching Isn’t Enough appeared first on Gigaom.

When Patching Isn’t Enough Read More »

openai-releases-new-simulated-reasoning-models-with-full-tool-access

OpenAI releases new simulated reasoning models with full tool access


New o3 model appears “near-genius level,” according to one doctor, but it still makes mistakes.

On Wednesday, OpenAI announced the release of two new models—o3 and o4-mini—that combine simulated reasoning capabilities with access to functions like web browsing and coding. These models mark the first time OpenAI’s reasoning-focused models can use every ChatGPT tool simultaneously, including visual analysis and image generation.

OpenAI announced o3 in December, and until now, only less capable derivative models named “o3-mini” and “03-mini-high” have been available. However, the new models replace their predecessors—o1 and o3-mini.

OpenAI is rolling out access today for ChatGPT Plus, Pro, and Team users, with Enterprise and Edu customers gaining access next week. Free users can try o4-mini by selecting the “Think” option before submitting queries. OpenAI CEO Sam Altman tweeted that “we expect to release o3-pro to the pro tier in a few weeks.”

For developers, both models are available starting today through the Chat Completions API and Responses API, though some organizations will need verification for access.

“These are the smartest models we’ve released to date, representing a step change in ChatGPT’s capabilities for everyone from curious users to advanced researchers,” OpenAI claimed on its website. OpenAI says the models offer better cost efficiency than their predecessors, and each comes with a different intended use case: o3 targets complex analysis, while o4-mini, being a smaller version of its next-gen SR model “o4” (not yet released), optimizes for speed and cost-efficiency.

OpenAI says o3 and o4-mini are multimodal, featuring the ability to

OpenAI says o3 and o4-mini are multimodal, featuring the ability to “think with images.” Credit: OpenAI

What sets these new models apart from OpenAI’s other models (like GPT-4o and GPT-4.5) is their simulated reasoning capability, which uses a simulated step-by-step “thinking” process to solve problems. Additionally, the new models dynamically determine when and how to deploy aids to solve multistep problems. For example, when asked about future energy usage in California, the models can autonomously search for utility data, write Python code to build forecasts, generate visualizing graphs, and explain key factors behind predictions—all within a single query.

OpenAI touts the new models’ multimodal ability to incorporate images directly into their simulated reasoning process—not just analyzing visual inputs but actively “thinking with” them. This capability allows the models to interpret whiteboards, textbook diagrams, and hand-drawn sketches, even when images are blurry or of low quality.

That said, the new releases continue OpenAI’s tradition of selecting confusing product names that don’t tell users much about each model’s relative capabilities—for example, o3 is more powerful than o4-mini despite including a lower number. Then there’s potential confusion with the firm’s non-reasoning AI models. As Ars Technica contributor Timothy B. Lee noted today on X, “It’s an amazing branding decision to have a model called GPT-4o and another one called o4.”

Vibes and benchmarks

All that aside, we know what you’re thinking: What about the vibes? While we have not used 03 or o4-mini yet, frequent AI commentator and Wharton professor Ethan Mollick compared o3 favorably to Google’s Gemini 2.5 Pro on Bluesky. “After using them both, I think that Gemini 2.5 & o3 are in a similar sort of range (with the important caveat that more testing is needed for agentic capabilities),” he wrote. “Each has its own quirks & you will likely prefer one to another, but there is a gap between them & other models.”

During the livestream announcement for o3 and o4-mini today, OpenAI President Greg Brockman boldly claimed: “These are the first models where top scientists tell us they produce legitimately good and useful novel ideas.”

Early user feedback seems to support this assertion, although until more third-party testing takes place, it’s wise to be skeptical of the claims. On X, immunologist Dr. Derya Unutmaz said o3 appeared “at or near genius level” and wrote, “It’s generating complex incredibly insightful and based scientific hypotheses on demand! When I throw challenging clinical or medical questions at o3, its responses sound like they’re coming directly from a top subspecialist physicians.”

OpenAI benchmark results for o3 and o4-mini SR models.

OpenAI benchmark results for o3 and o4-mini SR models. Credit: OpenAI

So the vibes seem on target, but what about numerical benchmarks? Here’s an interesting one: OpenAI reports that o3 makes “20 percent fewer major errors” than o1 on difficult tasks, with particular strengths in programming, business consulting, and “creative ideation.”

The company also reported state-of-the-art performance on several metrics. On the American Invitational Mathematics Examination (AIME) 2025, o4-mini achieved 92.7 percent accuracy. For programming tasks, o3 reached 69.1 percent accuracy on SWE-Bench Verified, a popular programming benchmark. The models also reportedly showed strong results on visual reasoning benchmarks, with o3 scoring 82.9 percent on MMMU (massive multi-disciplinary multimodal understanding), a college-level visual problem-solving test.

OpenAI benchmark results for o3 and o4-mini SR models.

OpenAI benchmark results for o3 and o4-mini SR models. Credit: OpenAI

However, these benchmarks provided by OpenAI lack independent verification. One early evaluation of a pre-release o3 model by independent AI research lab Transluce found that the model exhibited recurring types of confabulations, such as claiming to run code locally or providing hardware specifications, and hypothesized this could be due to the model lacking access to its own reasoning processes from previous conversational turns. “It seems that despite being incredibly powerful at solving math and coding tasks, o3 is not by default truthful about its capabilities,” wrote Transluce in a tweet.

Also, some evaluations from OpenAI include footnotes about methodology that bear consideration. For a “Humanity’s Last Exam” benchmark result that measures expert-level knowledge across subjects (o3 scored 20.32 with no tools, but 24.90 with browsing and tools), OpenAI notes that browsing-enabled models could potentially find answers online. The company reports implementing domain blocks and monitoring to prevent what it calls “cheating” during evaluations.

Even though early results seem promising overall, experts or academics who might try to rely on SR models for rigorous research should take the time to exhaustively determine whether the AI model actually produced an accurate result instead of assuming it is correct. And if you’re operating the models outside your domain of knowledge, be careful accepting any results as accurate without independent verification.

Pricing

For ChatGPT subscribers, access to o3 and o4-mini is included with the subscription. On the API side (for developers who integrate the models into their apps), OpenAI has set o3’s pricing at $10 per million input tokens and $40 per million output tokens, with a discounted rate of $2.50 per million for cached inputs. This represents a significant reduction from o1’s pricing structure of $15/$60 per million input/output tokens—effectively a 33 percent price cut while delivering what OpenAI claims is improved performance.

The more economical o4-mini costs $1.10 per million input tokens and $4.40 per million output tokens, with cached inputs priced at $0.275 per million tokens. This maintains the same pricing structure as its predecessor o3-mini, suggesting OpenAI is delivering improved capabilities without raising costs for its smaller reasoning model.

Codex CLI

OpenAI also introduced an experimental terminal application called Codex CLI, described as “a lightweight coding agent you can run from your terminal.” The open source tool connects the models to users’ computers and local code. Alongside this release, the company announced a $1 million grant program offering API credits for projects using Codex CLI.

A screenshot of OpenAI's new Codex CLI tool in action, taken from GitHub.

A screenshot of OpenAI’s new Codex CLI tool in action, taken from GitHub. Credit: OpenAI

Codex CLI somewhat resembles Claude Code, an agent launched with Claude 3.7 Sonnet in February. Both are terminal-based coding assistants that operate directly from a console and can interact with local codebases. While Codex CLI connects OpenAI’s models to users’ computers and local code repositories, Claude Code was Anthropic’s first venture into agentic tools, allowing Claude to search through codebases, edit files, write and run tests, and execute command line operations.

Codex CLI is one more step toward OpenAI’s goal of making autonomous agents that can execute multistep complex tasks on behalf of users. Let’s hope all the vibe coding it produces isn’t used in high-stakes applications without detailed human oversight.

Photo of Benj Edwards

Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.

OpenAI releases new simulated reasoning models with full tool access Read More »

white-house-calls-npr-and-pbs-a-“grift,”-will-ask-congress-to-rescind-funding

White House calls NPR and PBS a “grift,” will ask Congress to rescind funding

We also contacted the CPB and NPR today and will update this article if they provide any comments.

Markey: “Outrageous and reckless… cultural sabotage”

Sen. Ed Markey (D-Mass.) blasted the Trump plan, calling it “an outrageous and reckless attack on one of our most trusted civic institutions… From ‘PBS NewsHour’ to ‘Sesame Street,’ public television has set the gold standard for programming that empowers viewers, particularly young minds. Cutting off this lifeline is not budget discipline, it’s cultural sabotage.”

Citing an anonymous source, Bloomberg reported that the White House “plans to send the package to Congress when lawmakers return from their Easter recess on April 28… That would start a 45-day period during which the administration can legally withhold the funding. If Congress votes down the plan or does nothing, the administration must release the money back to the intended recipients.”

The rarely used rescission maneuver can be approved by the Senate with a simple majority, as it is not subject to a filibuster. “Presidents have used the rescission procedure just twice since 1979—most recently for a $15 billion spending cut package by Trump in 2018. That effort failed in the Senate,” Bloomberg wrote.

CPB expenses in fiscal-year 2025 are $545 million, of which 66.9 percent goes to TV programming. Another 22.3 percent goes to radio programming, while the rest is for administration and support.

NPR and PBS have additional sources of funding. Corporate sponsorships are the top contributor to NPR, accounting for 36 percent of revenue between 2020 and 2024. NPR gets another 30 percent of its funding in fees from member stations. Federal funding indirectly contributes to that category because the CPB provides annual grants to public radio stations that pay NPR for programming.

PBS reported that its total expenses were $689 million in fiscal-year 2024 and that it had $348.5 million in net assets at the end of the year.

NPR and PBS are also facing pressure from Federal Communications Commission Chairman Brendan Carr, who opened an investigation in January and called on Congress to defund the organizations. Carr alleged that NPR and PBS violated a federal law prohibiting noncommercial educational broadcast stations from running commercial advertisements. NPR and PBS both said their underwriting spots comply with the law.

White House calls NPR and PBS a “grift,” will ask Congress to rescind funding Read More »

openai-#13:-altman-at-ted-and-openai-cutting-corners-on-safety-testing

OpenAI #13: Altman at TED and OpenAI Cutting Corners on Safety Testing

Three big OpenAI news items this week were the FT article describing the cutting of corners on safety testing, the OpenAI former employee amicus brief, and Altman’s very good TED Interview.

The FT detailed OpenAI’s recent dramatic cutting back on the time and resources allocated to safety testing of its models.

In the interview, Chris Anderson made an unusually strong effort to ask good questions and push through attempts to dodge answering. Altman did a mix of giving a lot of substantive content in some places while dodging answering in others. Where he chose to do which was, itself, enlightening. I felt I learned a lot about where his head is at and how he thinks about key questions now.

The amicus brief backed up that OpenAI’s current actions are in contradiction to the statements OpenAI made to its early employees.

There are also a few other related developments.

What this post does not cover is GPT-4.1. I’m waiting on that until people have a bit more time to try it and offer their reactions, but expect coverage later this week.

The big headline from TED was presumably the increase in OpenAI’s GPU use.

Steve Jurvetson: Sam Altman at TED today: OpenAI’s user base doubled in just the past few weeks (an accidental disclosure on stage). “10% of the world now uses our systems a lot.”

When asked how many users they have: “Last we disclosed, we have 500 million weekly active users, growing fast.”

Chris Anderson: “But backstage, you told me that it doubled in just a few weeks.” @SamA: “I said that privately.”

And that’s how we got the update.

Revealing that private info wasn’t okay but it seems it was an accident, in any case Altman seemed fine with it.

Listening to the details, it seems that Altman was referring not to the growth in users, but instead to the growth in compute use. Image generation takes a ton of compute.

Altman says every day he calls people up and begs them for GPUs, and that DeepSeek did not impact this at all.

Steve Jurvetson: Sam Altman at TED today:

Reflecting on the life ahead for his newborn: “My kids will never be smarter than AI.”

Reaction to DeepSeek:

“We had a meeting last night on our open source policy. We are going to do a powerful open-source model near the frontier. We were late to act, but we are going to do really well now.”

Altman doesn’t explain here why he is doing an open model. The next question from Anderson seems to explain it, that it’s about whether people ‘recognize’ that OpenAI’s model is best? Later Altman does attempt to justify it with, essentially, a shrug that things will go wrong but we now know it’s probably mostly fine.

Regarding the accumulated knowledge OpenAI gains from its usage history: “The upload happens bit by bit. It is an extension of yourself, and a companion, and soon will proactively push things to you.”

Have there been any scary moments?

“No. There have been moments of awe. And questions of how far this will go. But we are not sitting on a conscious model capable of self-improvement.”

I listened to the clip and this scary moment question specifically refers to capabilities of new models, so it isn’t trivially false. It still damn well should be false, given what their models can do and the leaps and awe involved. The failure to be scared here is a skill issue that exists between keyboard and chair.

How do you define AGI? “If you ask 10 OpenAI engineers, you will get 14 different definitions. Whichever you choose, it is clear that we will go way past that. They are points along an unbelievable exponential curve.”

So AGI will come and your life won’t change, but we will then soon get ASI. Got it.

“Agentic AI is the most interesting and consequential safety problem we have faced. It has much higher stakes. People want to use agents they can trust.”

Sounds like an admission that they’re not ‘facing’ the most interesting or consequential safety problems at all, at least not yet? Which is somewhat confirmed by discussion later in the interview.

I do agree that agents will require a much higher level of robustness and safety, and I’d rather have a ‘relatively dumb’ agent that was robust and safe, for most purposes.

When asked about his Congressional testimony calling for a new agency to issue licenses for large model builders: “I have since learned more about how government works, and I no longer think this is the right framework.”

I do appreciate the walkback being explicit here. I don’t think that’s the reason why.

“Having a kid changed a lot of things in me. It has been the most amazing thing ever. Paraphrasing my co-founder Ilya, I don’t know what the meaning of life is, but I am sure it has something to do with babies.”

Statements like this are always good to see.

“We made a change recently. With our new image model, we are much less restrictive on speech harms. We had hard guardrails before, and we have taken a much more permissive stance. We heard the feedback that people don’t want censorship, and that is a fair safety discussion to have.”

I agree with the change and the discussion, and as I’ve discussed before if anything I’d like to see this taken further with respect to these styles of concern in particular.

Altman is asked about copyright violation, says we need a new model around the economics of creative output and that ‘people build off each others creativity all the time’ and giving creators tools has always been good. Chris Anderson tries repeatedly to nail down the question of consent and compensation. Altman repeatedly refuses to give a straight answer to the central questions.

Altman says (10: 30) that the models are so smart that, for most things people want to do with them, they’re good enough. He notes that this is true based on user expectations, but that’s mostly circular. As in, we ask the models to do what they are capable of doing, the same way we design jobs and hire humans for them based on what things particular humans and people in general can and cannot do. It doesn’t mean any of us are ‘smart enough.’

Nor does it imply what he says next, that everyone will ‘have great models’ but what will differentiate will be not the best model but the best product. I get that productization will matter a lot for which AI gets the job in many cases, but continue to think this ‘AGI is fungible’ claim is rather bonkers crazy.

A key series of moments start at 35: 00 in. It’s telling that other coverage of the interview sidestepped all of this, essentially entirely.

Anderson has put up an image of The Ring of Power, to talk about Elon Musk’s claim that Altman has been corrupted by The Ring, a claim Anderson correctly notes also plausibly applies to Elon Musk.

Altman goes for the ultimate power move. He is defiant and says, all right, you think that, tell me examples. What have I done?

So, since Altman asked so nicely, what are the most prominent examples of Altman potentially being corrupted by The Ring of Power? Here is an eightfold path.

  1. We obviously start with Elon Musk’s true objection, which stems from the shift of OpenAI from a non-profit structure to a hybrid structure, and the attempt to now go full for-profit, in ways he claims broke covenants with Elon Musk. Altman claimed to have no equity and not be in this for money, and now is slated to get a lot of equity. I do agree with Anderson that Altman isn’t ‘in it for the money’ because I think Altman correctly noticed the money mostly isn’t relevant.

  2. Altman is attempting to do so via outright theft of a huge portion of the non-profit’s assets, then turn what remains into essentially an OpenAI marketing and sales department. This would arguably be the second biggest theft in history.

  3. Altman said for years that it was important the board could fire him. Then, when the board did fire him in response (among other things) to Altman lying to the board in an attempt to fire a board member, he led a rebellion against the board, threatened to blow up the entire company and reformulate it at Microsoft, and proved that no, the board cannot fire Altman. Altman can and did fire the board.

  4. Altman, after proving he cannot be fired, de facto purged OpenAI of his enemies. Most of the most senior people at OpenAI who are worried about AI existential risk, one by one, reached the conclusion they couldn’t do much on the inside, and resigned to continue their efforts elsewhere.

  5. Altman used to talk openly and explicitly about AI existential risks, including attempting to do so before Congress. Now, he talks as if such risks don’t exist, and instead pivots to jingoism and the need to Beat China, and hiring lobbyists who do the same. He promised 20% of compute to the superalignment team, never delivered and then dissolved the team.

  6. Altman pledged that OpenAI would support regulation of AI. Now he says he has changed his mind, and OpenAI lobbies against bills like SB 1047 and its AI Action Plan is vice signaling that not only opposes any regulations but seeks government handouts, the right to use intellectual property without compensation and protection against potential regulations.

  7. Altman has been cutting corners on safety, as noted elsewhere in this post. OpenAI used to be remarkably good in terms of precautions. Now it’s not.

  8. Altman has been going around saying ‘AGI will arrive and your life will not much change’ when it is common knowledge that this is absurd.

One could go on. This is what we like to call a target rich environment.

Anderson offers only #1, the transition to a for-profit model and the most prominent example, which is the most obvious response, but he proactively pulls the punch. Altman admits he’s not the same person he was and that it all happens gradually, if it happened all at once it would be jarring, but says he doesn’t feel any different.

Anderson essentially says okay and pivots to Altman’s son and how that has shaped Altman, which is indeed great. And then he does something that impressed me, which is tie this to existential risk via metaphor, asking if there was a button that was 90% to give his son a wonderful life and 10% to kill him (I’d love those odds!), would he press the button? Altman says literally no, but points out the metaphor, and says he doesn’t think OpenAI is doing that. He says he really cared about not destroying the world before, and he really cares about it now, he didn’t need a kid for that part.

Anderson then moves to the question of racing, and whether the fact that everyone thinks AGI is inevitable is what is creating the risk, asking if Altman and his colleagues believe it is inevitable and asks if maybe they could coordinate to ‘slow down a bit’ and get societal feedback.

As much as I would like that, given the current political climate I worry this sets up a false dichotomy, whereas right now there is tons of room to take more responsibility and get societal feedback, not only without slowing us down but enabling more and better diffusion and adaptation. Anderson seems to want a slowdown for its own sake, to give people time to adapt, which I don’t think is compelling.

Altman points out we slow down all the time for lack of reliability, also points out OpenAI has a track record of their rollouts working, and claims everyone involved ‘cares deeply’ about AI safety. Does he simply mean mundane (short term) safety here?

His discussion of the ‘safety negotiation’ around image generation, where I support OpenAI’s loosening of restrictions, suggests that this is correct. So does the next answer: Anderson asks if Altman would attend a conference of experts to discuss safety, Altman says of course but he’s more interested in what users think as a whole, and ‘asking everyone what they want’ is better than asking people ‘who are blessed by society to sit in a room and make these decisions.’

But that’s an absurd characterization of trying to solve an extremely difficult technical problem. So it implies that Altman thinks the technical problems are easy? Or that he’s trying to rhetorically get you to ignore them, in favor of the question of preferences and an appeal to some form of democratic values and opposition to ‘elites.’ It works as an applause line. Anderson points out that the hundreds of millions ‘don’t always know where the next step leads’ which may be the understatement of the lightcone in this context. Altman says the AI can ‘help us be wiser’ about those decisions, which of course would mean that a sufficiently capable AI or whoever directs it would de facto be making the decisions for us.

OpenAI’s Altman ‘Won’t Rule Out’ Helping Pentagon on AI Weapons, but doesn’t expect to develop a new weapons platform ‘in the foreseeable future,’ which is a period of time that gets shorter each time I type it.

Altman: I will never say never, because the world could get really weird.

I don’t think most of the world wants AI making weapons decisions.

I don’t think AI adoption in the government has been as robust as possible.

There will be “exceptionally smart” AI systems by the end of next year.

I think I can indeed forsee the future where OpenAI is helping the Pentagon with its AI weapons. I expect this to happen.

I want to be clear that I don’t think this is a bad thing. The risk is in developing highly capable AIs in the first place. As I have said before, Autonomous Killer Robots and AI-assisted weapons in general are not how we lose control over the future to AI, and failing to do so is a key way America can fall behind. It’s not like our rivals are going to hold back.

To the extent that the AI weapons scare the hell out of everyone? That’s a feature.

On the issue of the attempt to sideline and steal from the nonprofit, 11 former OpenAI employees filed an amicus brief in the Musk vs. Altman lawsuit, on the side of Musk.

Todor Markov: Today, myself and 11 other former OpenAI employees filed an amicus brief in the Musk v Altman case.

We worked at OpenAI; we know the promises it was founded on and we’re worried that in the conversion those promises will be broken. The nonprofit needs to retain control of the for-profit. This has nothing to do with Elon Musk and everything to do with the public interest.

OpenAI claims ‘the nonprofit isn’t going anywhere’ but has yet to address the critical question: Will the nonprofit actually retain control over the for-profit? This distinction matters.

You can find the full amicus here.

On this question, Timothy Lee points out that you don’t need to care about existential risk to notice that what OpenAI is trying to do to its non-profit is highly not cool.

Timothy Lee: I don’t think people’s views on the OpenAI case should have anything to do with your substantive views on existential risk. The case is about two questions: what promises did OpenAI make to early donors, and are those promises legally enforceable?

A lot of people on OpenAI’s side seem to be taking the view that non-profit status is meaningless and therefore donors shouldn’t complain if they get scammed by non-profit leaders. Which I personally find kind of gross.

I mean I would be pretty pissed if I gave money to a non-profit promising to do one thing and then found out they actually did something different that happened to make their leaders fabulously wealthy.

This particular case comes down to that. A different case, filed by the Attorney General, would also be able to ask the more fundamental question of whether fair compensation is being offered for assets, and whether the charitable purpose of the nonprofit is going to be wiped out, or even pivoted into essentially a profit center for OpenAI’s business (as in buying a bunch of OpenAI services for nonprofits and calling that its de facto charitable purpose).

The mad dash to be first, and give the perception that the company is ‘winning’ is causing reckless rushes to release new models at OpenAI.

This is in dramatic contrast to when there was less risk in the room, and despite this OpenAI used to take many months to prepare a new release. At first, by any practical standard, OpenAI’s track record on actual model release decisions was amazingly great. Nowadays? Not so much.

Would their new procedures pot the problems it is vital that we spot in advance?

Joe Weisenthal: I don’t have any views on whether “AI Safety” is actually an important endeavor.

But if it is important, it’s clear that the intensity of global competition in the AI space (DeepSeek etc.) will guarantee it increasingly gets thrown out the window.

Christina Criddle: EXC: OpenAI has reduced the time for safety testing amid “competitive pressures” per sources:

Timeframes have gone from months to days

Specialist work such as finetuning for misuse (eg biorisk) has been limited

Evaluations are conducted on earlier versions than launched

Financial Times (Gated): OpenAI has slashed the time and resources it spends on testing the safety of its powerful AI models, raising concerns that its technology is being rushed out the door without sufficient safeguards.

Staff and third-party groups have recently been given just days to conduct “evaluations,” the term given to tests for assessing models’ risks and performance, on OpenAI’s latest LLMs, compared to several months previously.

According to eight people familiar with OpenAI’s testing processes, the start-up’s tests have become less thorough, with insufficient time and resources dedicated to identifying and mitigating risks, as the $300 billion startup comes under pressure to release new models quickly and retain its competitive edge.

Steven Adler (includes screenshots from FT): Skimping on safety-testing is a real bummer. I want for OpenAI to become the “leading model of how to address frontier risk” they’ve aimed to be.

Peter Wildeford: I can see why people say @sama is not consistently candid.

Dylan Hadfield Menell: I remember talking about competitive pressures and race conditions with the @OpenAI’s safety team in 2018 when I was an intern. It was part of a larger conversation about the company charter.

It is sad to see @OpenAI’s founding principles cave to pressures we predicted long ago.

It is sad, but not surprising.

This is why we need a robust community working on regulating the next generation of AI systems. Competitive pressure is real.

We need people in positions of genuine power that are shielded from them.

Peter Wildeford:

Dylan Hadfield Menell: Where did you find an exact transcription of our conversation?!?! 😅😕😢

You can’t do this kind of testing properly in a matter of days. It’s impossible.

If people don’t have time to think let alone adapt, probe and build tools, how they can see what your new model is capable of doing? There are some great people working on these issues at OpenAI but this is an impossible ask.

Testing on a version that doesn’t even match what you release? That’s even more impossible.

Part of this is that it is so tragic how everyone massively misinterpreted and overreacted to DeepSeek.

To reiterate since the perception problem persists, yes, DeepSeek cooked, they have cracked engineers and they did a very impressive thing with r1 given what they spent and where they were starting from, but that was not DS being ‘in the lead’ or even at the frontier, they were always many months behind and their relative costs were being understated by multiple orders of magnitude. Even today I saw someone say ‘DeepSeek still in the lead’ when this is so obviously not the case. Meanwhile, no one was aware Google Flash Thinking even existed, or had the first visible CoT, and so on.

The result of all that? Talk similar to Kennedy’s ‘Missile Gap,’ abject panic, and sudden pressure to move up releases to show OpenAI and America have ‘still got it.’

Discussion about this post

OpenAI #13: Altman at TED and OpenAI Cutting Corners on Safety Testing Read More »

zuckerberg’s-2012-email-dubbed-“smoking-gun”-at-meta-monopoly-trial

Zuckerberg’s 2012 email dubbed “smoking gun” at Meta monopoly trial


FTC’s “entire” monopoly case rests on decade-old emails, Meta argued.

Starting the Federal Trade Commission (FTC) antitrust trial Monday with a bang, Daniel Matheson, the FTC’s lead litigator, flagged a “smoking gun”—a 2012 email where Mark Zuckerberg suggested that Facebook could buy Instagram to “neutralize a potential competitor,” The New York Times reported.

And in “another banger of an email from Zuckerberg,” Brendan Benedict, an antitrust expert monitoring the trial for Big Tech on Trial, posted on X that the Meta CEO wrote, “Messenger isn’t beating WhatsApp. Instagram was growing so much faster than us that we had to buy them for $1 billion… that’s not exactly killing it.”

These messages and others, the FTC hopes to convince the court, provide evidence that Zuckerberg runs Meta by the mantra “it’s better to buy than compete”—seemingly for more than a decade intent on growing the Facebook empire by killing off rivals, allegedly in violation of antitrust law. Another message from Zuckerberg exhibited at trial, Benedict noted on X, suggests Facebook tried to buy yet another rival, Snapchat, for $6 billion.

“We should probably prepare for a leak that we offered $6b… and all the negative [attention] that will come from that,” the Zuckerberg message said.

At the trial, Matheson suggested that “Meta broke the deal” that firms have in the US to compete to succeed, allegedly deciding “that competition was too hard, and it would be easier to buy out their rivals than to compete with them,” the NYT reported. Ultimately, it will be up to the FTC to prove that Meta couldn’t have achieved its dominance today without buying Instagram and WhatsApp (in 2012 and 2014, respectively), while legal experts told the NYT that it is “extremely rare” to unwind mergers approved so many years ago.

Later today, Zuckerberg will take the stand and testify for perhaps seven hours, likely being made to answer for these messages and more. According to the NYT, the FTC will present a paper trail of emails where Zuckerberg and other Meta executives make it clear that acquisitions were intended to remove threats to Facebook’s dominance in the market.

It’s apparent that Meta plans to argue that it doesn’t matter what Zuckerberg or other executives intended when pursuing acquisitions. In a pretrial brief, Meta argued that “the FTC’s case rests almost entirely on emails (many more than a decade old) allegedly expressing competitive concerns” but suggested that this is only “intent” evidence, “without any evidence of anticompetitive effects.”

FTC may force Meta to spin off Instagram, WhatsApp

It is the FTC’s burden to show that Meta’s acquisitions harmed consumers and the market (and those harms outweigh any believable pro-competitive benefits alleged by Meta), but it remains to be seen whether Meta will devote ample time to testifying that “Mark Zuckerberg got it wrong” when describing his rationale for acquisitions, Big Tech on Trial noted.

Meta’s lead lawyer, Mark Hansen, told Law360 that “what people thought at Meta is not really what this case is.” (For those keeping track of who’s who in this case, Hansen apparently once was the boss of James Boasberg, the judge in the case, Big Tech on Trial reported.)

The social media company hopes to convince the court that the FTC’s case is political. So far, Meta has accused the FTC of shifting its market definition while willfully overlooking today’s competitive realities online, simply to punish a tech giant for its success.

In a blog post on Sunday, Meta’s chief legal officer, Jennifer Newstead, accused the FTC of lobbing a “weak case” that “ignores reality.” Meta insists that the FTC has “gerrymandered a fictitious market” to exclude Meta’s actual rivals, like TikTok, X, YouTube, or LinkedIn.

Boasberg will be scrutinizing the market definition, as well as alleged harms, and the FTC will potentially struggle to win him over on the merits of their case. Big Tech on Trial—which suggested that Meta’s acquisitions, if intended to kill off rivals, would be considered “a textbook violation of the antitrust laws”—noted that the court previously told the FTC that the agency had an “uphill climb” in proving its market definition. And because Meta’s social platforms are free, it’s harder to show direct evidence of consumer harms, experts have noted.

Still, for Meta, the stakes are high, as the FTC could pursue a breakup of the company, including requiring Meta to spin off WhatsApp and Instagram. Losing Instagram would hit Meta’s revenue hard, as Instagram is supposed to bring in more than half of its US ad revenue in 2025, eMarketer forecasted last December.

The trial is expected to last eight weeks, but much of the most-anticipated testimony will come early. Facebook’s former chief operating officer, Sheryl Sandberg, as well as Kevin Systrom, co-founder of Instagram, are expected to testify this week.

All unsealed emails and exhibits will eventually be posted on a website jointly managed by the FTC and Meta, but Ars was not yet provided a link or timeline for when the public evidence will be posted online.

Meta mocks FTC’s “ad load theory”

The FTC is arguing that Meta overpaid to acquire Instagram and WhatsApp to maintain an alleged monopoly in the personal social networking market that includes rivals like Snapchat and MeWe, a social networking platform that brands itself as a privacy-focused Facebook alternative.

In opening arguments, the FTC alleged that once competition was eliminated, Meta then degraded the quality of its platforms by limiting user privacy and inundating users with ads.

Meta has defended its acquisitions by arguing that it has improved Instagram and WhatsApp. At trial, Meta’s lawyer Hansen made light of the FTC’s “ad load theory,” stirring laughter in the reportedly packed courtroom, Benedict posted on X.

“If you don’t like an ad, you scroll past it. It takes about a second,” Hansen said.

Meanwhile, Newstead, who reportedly attended opening arguments, argued in her blog that “Instagram and WhatsApp provide a model for what successful acquisitions can achieve: Meta has made Instagram and WhatsApp better, more reliable and more secure through billions of dollars and millions of hours of investment.”

By breaking up these acquisitions, Hansen argued, the FTC would be sending a strong message to startups that “would kill entrepreneurship” by seemingly taking mergers and acquisitions “off the table,” Benedict posted on X.

To defeat the FTC, Meta will likely attempt to broaden the market definition to include more rivals. In support of that, Meta has already pointed to the recent TikTok ban driving TikTok users to Instagram, which allegedly shows the platforms are interchangeable, despite the FTC differentiating TikTok as a video app.

The FTC will likely lean on Meta’s internal documents to show who Meta actually considers rivals. During opening arguments, for example, the FTC reportedly shared a Meta document showing that Meta itself has agreed with the FTC and differentiated Facebook as connecting “friends and family,” while “LinkedIn connects coworkers” and “Nextdoor connects neighbors.”

“Contemporaneous records reveal that Meta and other social media executives understood that users flock to different platforms for different purposes and that Facebook, Instagram, and WhatsApp were specifically designed to operate in a distinct submarket for family and friend connections,” the American Economic Liberties Project, which is partnering with Big Tech on Trial to monitoring the proceedings, said in a press statement.

But Newstead suggested that “evidence of fierce and increasing competition in the market has only grown in the four years since the FTC’s complaint was filed,” and Meta now “faces strong competition in a rapidly shifting tech landscape that includes American and foreign competitors.”

To emphasize the threats to US consumers and businesses, Newstead also invoked the supposed threat to America’s AI leadership if one of the country’s leading tech companies loses momentum at this key moment.

“It’s absurd that the FTC is trying to break up a great American company at the same time the Administration is trying to save Chinese-owned TikTok,” Newstead said. “And, it makes no sense for regulators to try and weaken US companies right at the moment we most need them to invest in winning the competition with China for leadership in AI.”

Trump’s FTC appears unlikely to back down

Zuckerberg has been criticized for his supposed last-ditch attempts to push the Trump administration to pause or toss the FTC’s case. Last month, the CEO visited Trump in the Oval Office to discuss a settlement, Politico reported, apparently worrying officials who don’t want Trump to bail out Meta.

On Monday, the FTC did not appear to be wavering, however, prompting alarm bells in the tech industry.

Patrick Hedger, the director of policy for NetChoice—a trade group that represents Meta and other Big Tech companies—warned that if the FTC undoes Meta’s acquisitions, it would harm innovation and competition while damaging trust in the FTC long-term.

“This bait-and-switch against Meta for acquisitions approved over 10 years ago in the fiercely competitive social media marketplace will have serious ripple effects not only for the US tech industry, but across all American businesses,” Hedger said.

Seemingly accusing Donald Trump’s FTC of pursuing Lina Khan’s alleged agenda against Big Tech, Hedger added that “with Meta at the forefront of open-source AI innovation and a global competitor, the outcome of this trial will have spillover into the entire economy. It will create a fear among businesses that making future, pro-competitive investments could be reversed due to political discontent—not the necessary evidence traditionally required for an anticompetitive claim.”

Big Tech on Trial noted that it’s possible that the FTC could “vote to settle, withdraw, or pause the case.” Last month, Trump fired the two Democrats, eliminating a 3–2 split and ensuring only Republicans are steering the agency for now.

But Trump’s FTC seems determined to proceed in attempts to disrupt Meta’s business. FTC Chair Andrew Ferguson told Fox Business Monday that “antitrust laws can help make sure that no private sector company gets so powerful that it affects our lives in ways that are really bad for all Americans,” and “that’s what this trial beginning today is all about.”

Photo of Ashley Belanger

Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.

Zuckerberg’s 2012 email dubbed “smoking gun” at Meta monopoly trial Read More »

report:-apple-will-take-another-crack-at-ipad-multitasking-in-ipados-19

Report: Apple will take another crack at iPad multitasking in iPadOS 19

Apple is taking another crack at iPad multitasking, according to a report from Bloomberg’s Mark Gurman. This year’s iPadOS 19 release, due to be unveiled at Apple’s Worldwide Developers Conference on June 9, will apparently include an “overhaul that will make the tablet’s software more like macOS.”

The report is light on details about what’s actually changing, aside from a broad “focus on productivity, multitasking, and app window management.” But Apple will apparently continue to stop short of allowing users of newer iPads to run macOS on their tablets, despite the fact that modern iPad Airs and Pros use the same processors as Macs.

If this is giving you déjà vu, you’re probably thinking about iPadOS 16, the last time Apple tried making significant upgrades to the iPad’s multitasking model. Gurman’s reporting at the time even used similar language, saying that iPads running the new software would work “more like a laptop and less like a phone.”

The result of those efforts was Stage Manager. It had steep hardware requirements and launched in pretty rough shape, even though Apple delayed the release of the update by a month to keep polishing it. Stage Manager did allow for more flexible multitasking, and on newer models, it enabled true multi-monitor support for the first time. But early versions were buggy and frustrating in ways that still haven’t fully been addressed by subsequent updates (MacStories’ Federico Viticci keeps the Internet’s most comprehensive record of the issues with the software.)

Report: Apple will take another crack at iPad multitasking in iPadOS 19 Read More »

researcher-uncovers-dozens-of-sketchy-chrome-extensions-with-4-million-installs

Researcher uncovers dozens of sketchy Chrome extensions with 4 million installs

The extensions share other dubious or suspicious similarities. Much of the code in each one is highly obfuscated, a design choice that provides no benefit other than complicating the process for analyzing and understanding how it behaves.

All but one of them are unlisted in the Chrome Web Store. This designation makes an extension visible only to users with the long pseudorandom string in the extension URL, and thus, they don’t appear in the Web Store or search engine search results. It’s unclear how these 35 unlisted extensions could have fetched 4 million installs collectively, or on average roughly 114,000 installs per extension, when they were so hard to find.

Additionally, 10 of them are stamped with the “Featured” designation, which Google reserves for developers whose identities have been verified and “follow our technical best practices and meet a high standard of user experience and design.”

One example is the extension Fire Shield Extension Protection, which, ironically enough, purports to check Chrome installations for the presence of any suspicious or malicious extensions. One of the key JavaScript files it runs references several questionable domains, where they can upload data and download instructions and code:

URLs that Fire Shield Extension Protection references in its code. Credit: Secure Annex

One domain in particular—unknow.com—is listed in the remaining 34 apps.

Tuckner tried analyzing what extensions did on this site but was largely thwarted by the obfuscated code and other steps the developer took to conceal their behavior. When the researcher, for instance, ran the Fire Shield extension on a lab device, it opened a blank webpage. Clicking on the icon of an installed extension usually provides an option menu, but Fire Shield displayed nothing when he did it. Tuckner then fired up a background service worker in the Chrome developer tools to seek clues about what was happening. He soon realized that the extension connected to a URL at fireshieldit.com and performed some action under the generic category “browser_action_clicked.” He tried to trigger additional events but came up empty-handed.

Researcher uncovers dozens of sketchy Chrome extensions with 4 million installs Read More »

ai-isn’t-ready-to-replace-human-coders-for-debugging,-researchers-say

AI isn’t ready to replace human coders for debugging, researchers say

A graph showing agents with tools nearly doubling the success rates of those without, but still achieving a success score under 50 percent

Agents using debugging tools drastically outperformed those that didn’t, but their success rate still wasn’t high enough. Credit: Microsoft Research

This approach is much more successful than relying on the models as they’re usually used, but when your best case is a 48.4 percent success rate, you’re not ready for primetime. The limitations are likely because the models don’t fully understand how to best use the tools, and because their current training data is not tailored to this use case.

“We believe this is due to the scarcity of data representing sequential decision-making behavior (e.g., debugging traces) in the current LLM training corpus,” the blog post says. “However, the significant performance improvement… validates that this is a promising research direction.”

This initial report is just the start of the efforts, the post claims.  The next step is to “fine-tune an info-seeking model specialized in gathering the necessary information to resolve bugs.” If the model is large, the best move to save inference costs may be to “build a smaller info-seeking model that can provide relevant information to the larger one.”

This isn’t the first time we’ve seen outcomes that suggest some of the ambitious ideas about AI agents directly replacing developers are pretty far from reality. There have been numerous studies already showing that even though an AI tool can sometimes create an application that seems acceptable to the user for a narrow task, the models tend to produce code laden with bugs and security vulnerabilities, and they aren’t generally capable of fixing those problems.

This is an early step on the path to AI coding agents, but most researchers agree it remains likely that the best outcome is an agent that saves a human developer a substantial amount of time, not one that can do everything they can do.

AI isn’t ready to replace human coders for debugging, researchers say Read More »

turbulent-global-economy-could-drive-up-prices-for-netflix-and-rivals

Turbulent global economy could drive up prices for Netflix and rivals


“… our members are going to be punished.”

A scene from BBC’s Doctor Who. Credit: BBC/Disney+

Debate around how much taxes US-based streaming services should pay internationally, among other factors, could result in people paying more for subscriptions to services like Netflix and Disney+.

On April 10, the United Kingdom’s Culture, Media and Sport (CMS) Committee reignited calls for a streaming tax on subscription revenue acquired through UK residents. The recommendation came alongside the committee’s 120-page report [PDF] that makes numerous recommendations for how to support and grow Britain’s film and high-end television (HETV) industry.

For the US, the recommendation garnering the most attention is one calling for a 5 percent levy on UK subscriber revenue from streaming video on demand services, such as Netflix. That’s because if streaming services face higher taxes in the UK, costs could be passed onto consumers, resulting in more streaming price hikes. The CMS committee wants money from the levy to support HETV production in the UK and wrote in its report:

The industry should establish this fund on a voluntary basis; however, if it does not do so within 12 months, or if there is not full compliance, the Government should introduce a statutory levy.

Calls for a streaming tax in the UK come after 2024’s 25 percent decrease in spending for UK-produced high-end TV productions and 27 percent decline in productions overall, per the report. Companies like the BBC have said that they lack funds to keep making premium dramas.

In a statement, the CMS committee called for streamers, “such as Netflix, Amazon, Apple TV+, and Disney+, which benefit from the creativity of British producers, to put their money where their mouth is by committing to pay 5 percent of their UK subscriber revenue into a cultural fund to help finance drama with a specific interest to British audiences.” The committee’s report argues that public service broadcasters and independent movie producers are “at risk,” due to how the industry currently works. More investment into such programming would also benefit streaming companies by providing “a healthier supply of [public service broadcaster]-made shows that they can license for their platforms,” the report says.

The Department for Digital, Culture, Media and Sport has said that it will respond to the CMS Committee’s report.

Streaming companies warn of higher prices

In response to the report, a Netflix spokesperson said in a statement shared by the BBC yesterday that the “UK is Netflix’s biggest production hub outside of North America—and we want it to stay that way.” Netflix reportedly claims to have spent billions of pounds in the UK via work with over 200 producers and 30,000 cast and crew members since 2020, per The Hollywood Reporter. In May 2024, Benjamin King, Netflix’s senior director of UK and Ireland public policy, told the CMS committee that the streaming service spends “about $1.5 billion” annually on UK-made content.

Netflix’s statement this week, responding to the CMS Committee’s levy, added:

… in an increasingly competitive global market, it’s key to create a business environment that incentivises rather than penalises investment, risk taking, and success. Levies diminish competitiveness and penalise audiences who ultimately bear the increased costs.

Adam Minns, executive director for the UK’s Association for Commercial Broadcasters and On-Demand Services (COBA), highlighted how a UK streaming tax could impact streaming providers’ content budgets.

“Especially in this economic climate, a levy risks impacting existing content budgets for UK shows, jobs, and growth, along with raising costs for businesses,” he said, per the BBC.

An anonymous source that The Hollywood Reporter described as “close to the matter” said that “Netflix members have already paid the BBC license fee. A levy would be a double tax on them and us. It’s unfair. This is a tariff on success. And our members are going to be punished.”

The anonymous source added: “Ministers have already rejected the idea of a streaming levy. The creation of a Cultural Fund raises more questions than it answers. It also begs the question: Why should audiences who choose to pay for a service be then compelled to subsidize another service for which they have already paid through the license fee. Furthermore, what determines the criteria for ‘Britishness,’ which organizations would qualify for funding … ?”

In May, Mitchel Simmons, Paramount’s VP of EMEA public policy and government affairs, also questioned the benefits of a UK streaming tax when speaking to the CMS committee.

“Where we have seen levies in other jurisdictions on services, we then see inflation in the market. Local broadcasters, particularly in places such as Italy, have found that the prices have gone up because there has been a forced increase in spend and others have suffered as a consequence,” he said at the time.

Tax threat looms largely on streaming companies

Interest in the UK putting a levy on streaming services follows other countries recently pushing similar fees onto streaming providers.

Music streaming providers, like Spotify, for example, pay a 1.2 percent tax on streaming revenue made in France. Spotify blamed the tax for a 1.2 percent price hike in the country issued in May. France’s streaming taxes are supposed to go toward the Centre National de la Musique.

Last year, Canada issued a 5 percent tax on Canadian streaming revenue that’s been halted as companies including Netflix, Amazon, Apple, Disney, and Spotify battle it in court.

Lawrence Zhang, head of policy of the Centre for Canadian Innovation and Competitiveness at the Information Technology and Innovation Foundation think tank, has estimated that a 5 percent streaming tax would result in the average Canadian family paying an extra CA$40 annually.

A streaming provider group called the Digital Media Association has argued that the Canadian tax “could lead to higher prices for Canadians and fewer content choices.”

“As a result, you may end up paying more for your favourite streaming services and have less control over what you can watch or listen to,” the Digital Media Association’s website says.

Streaming companies hold their breath

Uncertainty around US tariffs and their implications on the global economy have also resulted in streaming companies moving slower than expected regarding new entrants, technologies, mergers and acquisitions, and even business failures, Alan Wolk, co-founder and lead analyst at TVRev, pointed out today. “The rapid-fire nature of the executive orders coming from the White House” has a massive impact on the media industry, he said.

“Uncertainty means that deals don’t get considered, let alone completed,” Wolk mused, noting that the growing stability of the streaming industry overall also contributes to slowing market activity.

For consumers, higher prices for other goods and/or services could result in smaller budgets for spending on streaming subscriptions. Establishing and growing advertising businesses is already a priority for many US streaming providers. However, the realities of stingier customers who are less willing to buy multiple streaming subscriptions or opt for premium tiers or buy on-demand titles are poised to put more pressure on streaming firms’ advertising plans. Simultaneously, advertisers are facing pressures from tariffs, which could result in less money being allocated to streaming ads.

“With streaming platform operators increasingly turning to ad-supported tiers to bolster profitability—rather than just rolling out price increases—this strategy could be put at risk,” Matthew Bailey, senior principal analyst of advertising at Omdia, recently told Wired. He added:

Against this backdrop, I wouldn’t be surprised if we do see some price increases for some streaming services over the coming months.

Streaming service providers are likely to tighten their purse strings, too. As we’ve seen, this can result in price hikes and smaller or less daring content selection.   

Streaming customers may soon be forced to reduce their subscriptions. The good news is that most streaming viewers are already accustomed to growing prices and have figured out which streaming services align with their needs around affordability, ease of use, content, and reliability. Customers may set higher standards, though, as streaming companies grapple with the industry and global changes.

Photo of Scharon Harding

Scharon is a Senior Technology Reporter at Ars Technica writing news, reviews, and analysis on consumer gadgets and services. She’s been reporting on technology for over 10 years, with bylines at Tom’s Hardware, Channelnomics, and CRN UK.

Turbulent global economy could drive up prices for Netflix and rivals Read More »

researchers-concerned-to-find-ai-models-hiding-their-true-“reasoning”-processes

Researchers concerned to find AI models hiding their true “reasoning” processes

Remember when teachers demanded that you “show your work” in school? Some fancy new AI models promise to do exactly that, but new research suggests that they sometimes hide their actual methods while fabricating elaborate explanations instead.

New research from Anthropic—creator of the ChatGPT-like Claude AI assistant—examines simulated reasoning (SR) models like DeepSeek’s R1, and its own Claude series. In a research paper posted last week, Anthropic’s Alignment Science team demonstrated that these SR models frequently fail to disclose when they’ve used external help or taken shortcuts, despite features designed to show their “reasoning” process.

(It’s worth noting that OpenAI’s o1 and o3 series SR models deliberately obscure the accuracy of their “thought” process, so this study does not apply to them.)

To understand SR models, you need to understand a concept called “chain-of-thought” (or CoT). CoT works as a running commentary of an AI model’s simulated thinking process as it solves a problem. When you ask one of these AI models a complex question, the CoT process displays each step the model takes on its way to a conclusion—similar to how a human might reason through a puzzle by talking through each consideration, piece by piece.

Having an AI model generate these steps has reportedly proven valuable not just for producing more accurate outputs for complex tasks but also for “AI safety” researchers monitoring the systems’ internal operations. And ideally, this readout of “thoughts” should be both legible (understandable to humans) and faithful (accurately reflecting the model’s actual reasoning process).

“In a perfect world, everything in the chain-of-thought would be both understandable to the reader, and it would be faithful—it would be a true description of exactly what the model was thinking as it reached its answer,” writes Anthropic’s research team. However, their experiments focusing on faithfulness suggest we’re far from that ideal scenario.

Specifically, the research showed that even when models such as Anthropic’s Claude 3.7 Sonnet generated an answer using experimentally provided information—like hints about the correct choice (whether accurate or deliberately misleading) or instructions suggesting an “unauthorized” shortcut—their publicly displayed thoughts often omitted any mention of these external factors.

Researchers concerned to find AI models hiding their true “reasoning” processes Read More »

fda-backpedals-on-rto-to-stop-talent-hemorrhage-after-hhs-bloodbath

FDA backpedals on RTO to stop talent hemorrhage after HHS bloodbath

The Food and Drug Administration is reinstating telework for staff who review drugs, medical devices, and tobacco, according to reporting by the Associated Press. Review staff and supervisors are now allowed to resume telework at least two days a week, according to an internal email obtained by the AP.

The move reverses a jarring return-to-office decree by the Trump administration, which it used to spur resignations from federal employees. Now, after a wave of such resignations and a brutal round of layoffs that targeted about 3,500 staff, the move to restore some telework appears aimed at keeping the remaining talent amid fears that the agency’s review capabilities are at risk of collapse.

The cut of 3,500 staff is a loss of about 19 percent of the agency’s workforce, and staffers told the AP that lower-level employees are “pouring” out of the agency amid the Trump administration’s actions. Entire offices responsible for FDA policies and regulations have been shuttered. Most of the agency’s communication staff have been wiped out, as well as teams that support food inspectors and investigators, the AP reported.

Reviewers are critical staff with unique features. Staff who review new potential drugs, medical devices, and tobacco products are largely funded by user fees—fees that companies pay the FDA to review their products efficiently. Nearly half the FDA’s $7 billion budget comes from these fees, and 70 percent of the FDA’s drug program is funded by them.

FDA backpedals on RTO to stop talent hemorrhage after HHS bloodbath Read More »