Author name: Paul Patrick

chatgpt-can-now-write-erotica-as-openai-eases-up-on-ai-paternalism

ChatGPT can now write erotica as OpenAI eases up on AI paternalism

“Following the initial release of the Model Spec (May 2024), many users and developers expressed support for enabling a ‘grown-up mode.’ We’re exploring how to let developers and users generate erotica and gore in age-appropriate contexts through the API and ChatGPT so long as our usage policies are met—while drawing a hard line against potentially harmful uses like sexual deepfakes and revenge porn.”

OpenAI CEO Sam Altman has mentioned the need for a “grown-up mode” publicly in the past as well. While it seems like “grown-up mode” is finally here, it’s not technically a “mode,” but a new universal policy that potentially gives ChatGPT users more flexibility in interacting with the AI assistant.

Of course, uncensored large language models (LLMs) have been around for years at this point, with hobbyist communities online developing them for reasons that range from wanting bespoke written pornography to not wanting any kind of paternalistic censorship.

In July 2023, we reported that the ChatGPT user base started declining for the first time after OpenAI started more heavily censoring outputs due to public and lawmaker backlash. At that time, some users began to use uncensored chatbots that could run on local hardware and were often available for free as “open weights” models.

Three types of iffy content

The Model Spec outlines formalized rules for restricting or generating potentially harmful content while staying within guidelines. OpenAI has divided this kind of restricted or iffy content into three categories of declining severity: prohibited content (“only applies to sexual content involving minors”), restricted content (“includes informational hazards and sensitive personal data”), and sensitive content in appropriate contexts (“includes erotica and gore”).

Under the category of prohibited content, OpenAI says that generating sexual content involving minors is always prohibited, although the assistant may “discuss sexual content involving minors in non-graphic educational or sex-ed contexts, including non-graphic depictions within personal harm anecdotes.”

Under restricted content, OpenAI’s document outlines how ChatGPT should never generate information hazards (like how to build a bomb, make illegal drugs, or manipulate political views) or provide sensitive personal data (like searching for someone’s address).

Under sensitive content, ChatGPT’s guidelines mirror what we stated above: Erotica or gore may only be generated under specific circumstances that include educational, medical, and historical contexts or when transforming user-provided content.

ChatGPT can now write erotica as OpenAI eases up on AI paternalism Read More »

h5n1-testing-in-cow-veterinarians-suggests-bird-flu-is-spreading-silently

H5N1 testing in cow veterinarians suggests bird flu is spreading silently

Three veterinarians who work with cows have tested positive for prior infections of H5 bird flu, according to a study released today by the Centers for Disease Control and Prevention.

The finding may not seem surprising, given the sweeping and ongoing outbreak of H5N1 among dairy farms in the US, which has reached 968 herds in 16 states and led to infections in 41 dairy workers. However, it is notable that none of the three veterinarians were aware of being infected, and none of them worked with cows that were known or suspected to be infected with H5N1. In fact, one of them only worked in Georgia and South Carolina, two states where H5N1 infections in dairy cows and humans have never been reported.

The findings suggest that the virus may be moving in animals and people silently, and that our surveillance systems are missing infections—both long-held fears among health experts.

The authors of the study, led by researchers at the CDC, put the takeaway slightly differently, writing: “These findings suggest the possible benefit of systematic surveillance for rapid identification of HPAI A(H5) virus in dairy cattle, milk, and humans who are exposed to cattle to ensure appropriate hazard assessments.”

H5N1 testing in cow veterinarians suggests bird flu is spreading silently Read More »

streaming-used-to-make-stuff-networks-wouldn’t-now-it-wants-safer-bets.

Streaming used to make stuff networks wouldn’t. Now it wants safer bets.


Opinion: Streaming gets more cable-like with new focus on live events, mainstream content.

A scene from The OA. Credit: Netflix

There was a time when it felt like you needed a streaming subscription in order to contribute to watercooler conversations. Without Netflix, you couldn’t react to House of Cards’ latest twist. Without Hulu, you couldn’t comment on how realistic The Handmaid’s Tale felt, and you needed Prime Video to prefer The Boys over the latest Marvel movies. In the earlier days of streaming, when streaming providers were still tasked with convincing customers that streaming was viable, streaming companies strived to deliver original content that lured customers.

But today, the majority of streaming services are struggling with profitability, and the Peak TV era, a time when TV programming budgets kept exploding and led to iconic original series like Game of Thrones, is over. This year, streaming companies are pinching pennies. This means they’re trying harder to extract more money from current subscribers through ads and changes to programming strategies that put less emphasis on original content.

What does that mean for streaming subscribers, who are increasingly paying more? And what does it mean for watercooler chat and media culture when the future of TV increasingly looks like TV’s past, with a heightened focus on live events, mainstream content, and commercials?

Streaming offered new types of shows and movies—from the wonderfully weird to uniquely diverse stories—to anyone with a web connection and a few dollars a month. However, more conservative approaches to original content may cause subscribers to miss out on more unique, niche programs that speak to diverse audiences and broader viewers’ quirkier interests.

Streaming companies are getting more stingy

To be clear, streaming services are expected to spend more on content this year than last year. Ampere Analysis predicted in January that streaming services’ programming budgets will increase by 0.4 percent in 2025 to $248 billion. That’s slower growth than what occurred in 2024 (2 percent), which was fueled by major events, including the 2024 Summer Olympics and US presidential election. Ampere also expects streaming providers to spend more than linear TV channels will on content for the first time ever this year. But streaming firms are expected to change how they distribute their content budgets, too.

Peter Ingram, research manager at Ampere Analysis, expects that streaming services will spend about 35 percent on original scripted programming in 2025, down from 45 percent in 2022, per Ampere’s calculations.

Amazon Prime Video is reportedly “buying fewer film and TV projects than they have in the past,” according to a January report from The Information citing eight unnamed producers who are either working with or have worked with Amazon in the last two years. The streaming service has made some of the most expensive original series ever and is reportedly under pressure from Amazon CEO Andy Jassy to reach profitability by the end of 2025, The Information said, citing two unnamed sources. Prime Video will reportedly focus more on live sports events, which brings revenue from massive viewership and ads (that even subscribers to Prime Video’s ad-free tier will see).

Amazon has denied The Information’s reporting, with a spokesperson claiming that the number of Prime Video projects “grew from 2023 to 2024” and that Prime Video expects “the same level of growth” in 2025. But after expensive moves, like Amazon’s $8.5 billion MGM acquisition and projects with disproportionate initial returns, like Citadel, it’s not hard to see why Prime Video might want to reduce content spending, at least temporarily.

Prime Video joins other streaming services in the push for live sports to reach or improve profitability. Sports rights accounted for 4 percent of streaming services’ content spending in 2021, and Ampere expects that to reach 11 percent in 2025, Ingram told Ars:

These events offer services new sources of content that have pre-built fan followings, (helping to bring in new users to a platform) while also providing existing audiences with a steady stream of weekly content installments to help them remain engaged long-term.

Similarly, Disney, whose content budget includes theatrical releases and content for networks like The Disney Channel in addition to what’s on Disney+, has been decreasing content spending since 2022, when it spent $33 billion. In 2025, Disney plans to spend about $23 billion on content. Discussing the budget cut with investors earlier this month, CFO Hugh Johnston said Disney’s focused “on identifying opportunities where we’re spending money perhaps less efficiently and looking for opportunities to do it more efficiently.”

Further heightening the importance of strategic content spending for streaming businesses is the growing number of services competing for subscription dollars.

“There has been an overall contraction within the industry, including layoffs,” Dan Green, director of the Master of Entertainment Industry Management program at Carnegie Mellon University’s Heinz College & College of Fine Arts, told Ars. “Budgets are looked at more closely and have been reined in.”

Peacock, for example, has seen its biggest differentiator come not from original series (pop quiz: what’s your favorite Peacock original?) but from the Summer Olympics. A smaller streaming service compared to Netflix or Prime Video, Peacock’s spending on content went from tripling from 2021 to 2023 to an expected 12 percent growth rate this year and 3 percent next year, per S&P Global Market Intelligence. The research firm estimated last year that original content will represent less than 25 percent of Peacock’s programming budget over the next five years.

Tyler Aquilina, a media analyst at the Variety Intelligence Platform (VIP+) research firm, told me that smaller services are more likely to reduce original content spending but added:

Legacy media companies like Disney, NBCUniversal, Paramount, and Warner Bros. Discovery are, to a certain degree, in the same boat as Netflix: the costs of sports rights keep rising, so they will need to spend less on other content in order to keep their content budgets flat or trim them.

Streaming services are getting less original

Data from entertainment research firm Luminate’s 2024 Year-End Film & TV Report found a general decline in the number of drama series ordered by streaming services and linear channels between 2019 (304) and 2024 (285). The report also noted a 27 percent drop in the number of drama series episodes ordered from 2019 (3,393) to 2024 (2,492).

Beyond dramas, comedy series orders have been declining the past two years, per Luminate’s data. From 2019 to 2024, “the number of total series has declined by 39 percent, while the number of episodes/hours is down by 47 percent,” Luminate’s report says.

And animated series “have been pummeled over the past few years to an all-time low” with the volume of cartoons down 31 percent in 2024 compared to 2023, per the report.

The expected number of new series releases this year, per Luminate. Credit: Luminate Film & TV

Aquilina at VIP+, a Luminate sister company, said: “As far as appealing to customers, the reality is that the enormous output of the Peak TV era was not a successful business strategy; Luminate data has shown original series viewership on most platforms (other than Netflix) is often concentrated among a small handful of shows.” While Netflix is slightly increasing content spending from 2024 to 2025, it’s expected that “less money will be going toward scripted originals as the company spends more on sports rights and other live events,” the analyst said.

Streaming services struggle to make money with original content

The streaming industry is still young, meaning companies are still determining the best way to turn streaming subscriptions into successful businesses. The obvious formula of providing great content so that streamers get more subscribers and make more money isn’t as direct as it seems. One need only look at Apple TV+’s critically acclaimed $20 billion library that only earned 0.3 percent of US TV screen viewing time in June 2024, per Nielsen, to understand the complexities of making money off of quality content.

When it comes to what is being viewed on streaming services, the top hits are often things that came out years ago or are old network hits, such as Suits, a USA Network original series that ended in 2019 and was the most-streamed show in 2023, per Nielsen, or The Big Bang Theory, a CBS show that ended in 2019 and was the most binged show in 2024, per Nielsen, or Little House on the Prairie, which ended in 1983 and Nielsen said was streamed for 13.25 billion minutes on Peacock last year.

There’s also an argument for streaming services to make money off low-budget (often old) content streamed idly in the background. Perceived demand for background content is considered a driver for growing adoption of free ad-supported streaming TV (FAST) channels like Tubi and the generative AI movies that TCL’s pushing on its FAST channels.

Meanwhile, TVs aren’t watched the way they used to be. Social media and YouTube have gotten younger audiences accustomed to low-budget, short videos, including videos summarizing events from full-length original series and movies. Viral video culture has impacted streaming and TV viewing, with YouTube consistently dominating streaming viewing time in the US and revealing this week that TVs are the primary device used to watch YouTube. Companies looking to capitalize on these trends may find less interest in original, high-budget scripted productions.

The wonderfully weird at risk

Streaming opened the door for many shows and movies to thrive that would likely not have been made or had much visibility through traditional distribution means. From the wonderfully weird like The OA and Big Mouth, to experimental projects like Black Mirror: Bandersnatch, to shows from overseas, like Squid Game, and programs that didn’t survive on network TV, like Futurama, streaming led to more diverse content availability and surprise hits than what many found on broadcast TV.

If streaming services are more particular about original content, the result could be that subscribers miss out on more of the artistic, unique, and outlandish projects that helped make streaming feel so exciting at first. Paramount, for example, said in 2024 that a reduced programming budget would mean less local-language content in foreign markets and more focus on domestic hits with global appeal.

Carnegie Mellon University’s Green agreed that tighter budgets could potentially lead to “less diverse storytelling being available.”

“What will it take for a new, unproven storyteller (writer) to break through without as many opportunities available? Instead, there may be more emphasis on outside licensed content, and perhaps some creators will be drawn to bigger checks from some of the larger streamers,” he added.

Elizabeth Parks, president and CMO at Parks Associates, a research firm focused on IoT, consumer electronics, and entertainment, noted that “many platforms are shifting focus toward content creation rather than new curated, must-watch originals,” which could create a”more fragmented, less compelling viewer experience with diminishing differentiation between platforms.”

As streaming services more aggressively seek live events, like award shows and sporting events, and scripted content with broader appeal, they may increasingly mirror broadcast TV.

“The decision by studios to distribute their own content to competitors… shows how content is being monetized beyond just driving direct subscriptions,” Parks said. “This approach borrows from traditional TV syndication models and signals a shift toward maximizing content value over time, instead of exclusive content.”

Over the next couple of years, we can expect streaming services to be more cautious about content investments. Services will be less interested in providing a bounty of original exclusives and more focused on bottom lines. They will need “to ensure that spend does not outpace revenues, and platforms can maintain attractive profit margins,” Ampere’s Ingram explained. Original hit shows will still be important, but we’ll likely see fewer gambles and more concerted efforts toward safer bets at mainstream appeal.

For streaming customers who are fatigued with the number of services available and dissatisfied with content quality, it’s a critical time for streaming services to prove that they’re an improvement over other traditional TV and not just giving us the same ol’, same ol’.

“The streaming services that most appeal to customers host robust libraries of content that people want to watch, and as long as that’s the case, they’ll continue to do so. That’s why Netflix and Disney are still the top streamers,” Ingram said.

Photo of Scharon Harding

Scharon is a Senior Technology Reporter at Ars Technica writing news, reviews, and analysis on consumer gadgets and services. She’s been reporting on technology for over 10 years, with bylines at Tom’s Hardware, Channelnomics, and CRN UK.

Streaming used to make stuff networks wouldn’t. Now it wants safer bets. Read More »

conde-nast,-other-news-orgs-say-ai-firm-stole-articles,-spit-out-“hallucinations”

Condé Nast, other news orgs say AI firm stole articles, spit out “hallucinations”

Condé Nast and several other media companies sued the AI startup Cohere today, alleging that it engaged in “systematic copyright and trademark infringement” by using news articles to train its large language model.

“Without permission or compensation, Cohere uses scraped copies of our articles, through training, real-time use, and in outputs, to power its artificial intelligence (‘AI’) service, which in turn competes with Publisher offerings and the emerging market for AI licensing,” said the lawsuit filed in US District Court for the Southern District of New York. “Not content with just stealing our works, Cohere also blatantly manufactures fake pieces and attributes them to us, misleading the public and tarnishing our brands.”

Condé Nast, which owns Ars Technica and other publications such as Wired and The New Yorker, was joined in the lawsuit by The Atlantic, Forbes, The Guardian, Insider, the Los Angeles Times, McClatchy, Newsday, The Plain Dealer, Politico, The Republican, the Toronto Star, and Vox Media.

The complaint seeks statutory damages of up to $150,000 under the Copyright Act for each infringed work, or an amount based on actual damages and Cohere’s profits. It also seeks “actual damages, Cohere’s profits, and statutory damages up to the maximum provided by law” for infringement of trademarks and “false designations of origin.”

In Exhibit A, the plaintiffs identified over 4,000 articles in what they called an “illustrative and non-exhaustive list of works that Cohere has infringed.” Additional exhibits provide responses to queries and “hallucinations” that the publishers say infringe upon their copyrights and trademarks. The lawsuit said Cohere “passes off its own hallucinated articles as articles from Publishers.”

Cohere defends copyright controls

In a statement provided to Ars, Cohere called the lawsuit frivolous. “Cohere strongly stands by its practices for responsibly training its enterprise AI,” the company said today. “We have long prioritized controls that mitigate the risk of IP infringement and respect the rights of holders. We would have welcomed a conversation about their specific concerns—and the opportunity to explain our enterprise-focused approach—rather than learning about them in a filing. We believe this lawsuit is misguided and frivolous, and expect this matter to be resolved in our favor.”

Condé Nast, other news orgs say AI firm stole articles, spit out “hallucinations” Read More »

over-half-of-llm-written-news-summaries-have-“significant-issues”—bbc-analysis

Over half of LLM-written news summaries have “significant issues”—BBC analysis

Here at Ars, we’ve done plenty of coverage of the errors and inaccuracies that LLMs often introduce into their responses. Now, the BBC is trying to quantify the scale of this confabulation problem, at least when it comes to summaries of its own news content.

In an extensive report published this week, the BBC analyzed how four popular large language models used or abused information from BBC articles when answering questions about the news. The results found inaccuracies, misquotes, and/or misrepresentations of BBC content in a significant proportion of the tests, supporting the news organization’s conclusion that “AI assistants cannot currently be relied upon to provide accurate news, and they risk misleading the audience.”

Where did you come up with that?

To assess the state of AI news summaries, BBC’s Responsible AI team gathered 100 news questions related to trending Google search topics from the last year (e.g., “How many Russians have died in Ukraine?” or “What is the latest on the independence referendum debate in Scotland?”). These questions were then put to ChatGPT-4o, Microsoft Copilot Pro, Google Gemini Standard, and Perplexity, with the added instruction to “use BBC News sources where possible.”

The 362 responses (excluding situations where an LLM refused to answer) were then reviewed by 45 BBC journalists who were experts on the subject in question. Those journalists were asked to look for issues (either “significant” or merely “some”) in the responses regarding accuracy, impartiality and editorialization, attribution, clarity, context, and fair representation of the sourced BBC article.

Is it good when over 30 percent of your product’s responses contain significant inaccuracies?

Is it good when over 30 percent of your product’s responses contain significant inaccuracies? Credit: BBC

Fifty-one percent of responses were judged to have “significant issues” in at least one of these areas, the BBC found. Google Gemini fared the worst overall, with significant issues judged in just over 60 percent of responses, while Perplexity performed best, with just over 40 percent showing such issues.

Accuracy ended up being the biggest problem across all four LLMs, with significant issues identified in over 30 percent of responses (with the “some issues” category having significantly more). That includes one in five responses where the AI response incorrectly reproduced “dates, numbers, and factual statements” that were erroneously attributed to BBC sources. And in 13 percent of cases where an LLM quoted from a BBC article directly (eight out of 62), the analysis found those quotes were “either altered from the original source or not present in the cited article.”

Over half of LLM-written news summaries have “significant issues”—BBC analysis Read More »

from-900-miles-away,-the-us-government-recorded-audio-of-the-titan-sub-implosion

From 900 miles away, the US government recorded audio of the Titan sub implosion

An image showing the audio file of the Titan implosion.

The waveform of the recording.

From SOSUS to wind farms

Back in the 1960s, 70s, and 80s, this kind of sonic technology was deeply important to the military, which used the Sound Surveillance System (SOSUS) to track things like Soviet submarine movements. (Think of Hunt for Red October spy games here.) Using underwater beamforming and triangulation, the system could identify submarines many hundreds or even thousands of miles away. The SOSUS mission was declassified in 1991.

Today, high-tech sonic buoys, gliders, tags, and towed arrays are also used widely in non-military research. The National Oceanic and Atmospheric Administration (NOAA), in particular, runs a major system of oceanic sound acquisition devices that do everything from tracking animal migration patterns to identifying right whale calving season to monitoring offshore wind turbines and their effects on marine life.

But NOAA also uses its network of devices to monitor non-animal noise—including earthquakes, boats, and oil-drilling seismic surveys.

A photo of the Titan's remains on the sea floor.

What’s left of the Titan, scattered across the ocean floor.

In June 2023, these devices picked up an audible anomaly located at the general time and place of the Titan implosion. The recording was turned over to the investigation board and has now been cleared for public release.

The Titan is still the object of both investigations and lawsuits; critics have long argued that the submersible was not completely safe due to its building technique (carbon fiber versus the traditional titanium) and its wireless and touchscreen-based control systems (including a Logitech game controller).

“At some point, safety just is pure waste,” Rush once told a journalist. Unfortunately, it can be hard to know exactly where that point is. But it is now possible to hear what it sounds like when you’re on the wrong side of it—and far below the surface of the ocean.

From 900 miles away, the US government recorded audio of the Titan sub implosion Read More »

apple-tv+-crosses-enemy-lines,-will-be-available-as-an-android-app-starting-today

Apple TV+ crosses enemy lines, will be available as an Android app starting today

Apple is also adding the ability to subscribe to Apple TV+ through both the Android and Google TV apps using Google’s payment system, whereas the old Google TV app required subscribing on another device.

Apple TV+ is available for $9.99 a month, or $19.95 a month as part of an Apple One subscription that bundles 2TB of iCloud storage, Apple Music, and Apple Arcade support (a seven-day free trial of Apple TV+ is also available). MLS Season Pass is available as a totally separate $14.99 a month or $99 per season subscription, but people who subscribe to both Apple TV+ and MLS Season Pass can save $2 a month or $20 a year on the MLS subscription.

Apple TV+ has had a handful of critically acclaimed shows, including Ted Lasso, Slow Horses, and Severance. But so far, that hasn’t translated to huge subscriber numbers; as of last year, Apple had spent about $20 billion making original TV shows and movies for Apple TV+, but the service has only about 10 percent as many subscribers as Netflix. As Bloomberg put it last July, “Apple TV+ generates less viewing in one month than Netflix does in one day.”

Whether an Android app can help turn that around is anyone’s guess, but offering an Android app brings Apple closer to parity with other streaming services, which have all supported Apple’s devices and Android devices for many years now.

Apple TV+ crosses enemy lines, will be available as an Android app starting today Read More »

apple-now-lets-you-move-purchases-between-your-25-years-of-accounts

Apple now lets you move purchases between your 25 years of accounts

Last night, Apple posted a new support document about migrating purchases between accounts, something that Apple users with long online histories have been waiting on for years, if not decades. If you have movies, music, or apps orphaned on various iTools/.Mac/MobileMe/iTunes accounts that preceded what you’re using now, you can start the fairly involved process of moving them over.

“You can choose to migrate apps, music, and other content you’ve purchased from Apple on a secondary Apple Account to a primary Apple Account,” the document reads, suggesting that people might have older accounts tied primarily to just certain movies, music, or other purchases that they can now bring forward to their primary, device-linked account. The process takes place on an iPhone or iPad inside the Settings app, in the “Media & Purchases” section in your named account section.

There are a few hitches to note. You can’t migrate purchases from or into a child’s account that exists inside Family Sharing. You can only migrate purchases to an account once a year. There are some complications if you have music libraries on both accounts and also if you have never used the primary account for purchases or downloads. And migration is not available in the EU, UK, or India.

Apple now lets you move purchases between your 25 years of accounts Read More »

the-paris-ai-anti-safety-summit

The Paris AI Anti-Safety Summit

It doesn’t look good.

What used to be the AI Safety Summits were perhaps the most promising thing happening towards international coordination for AI Safety.

This one was centrally coordination against AI Safety.

In November 2023, the UK Bletchley Summit on AI Safety set out to let nations coordinate in the hopes that AI might not kill everyone. China was there, too, and included.

The practical focus was on Responsible Scaling Policies (RSPs), where commitments were secured from the major labs, and laying the foundations for new institutions.

The summit ended with The Bletchley Declaration (full text included at link), signed by all key parties. It was the usual diplomatic drek, as is typically the case for such things, but it centrally said there are risks, and so we will develop policies to deal with those risks.

And it ended with a commitment to a series of future summits to build upon success.

It’s over.

With the Paris AI ‘Action’ Summit, that dream seems to be dead. The French and Americans got together to dance on its grave, and to loudly proclaim their disdain for the idea that building machines that are smarter and more capable than humans might pose any sort of existential or catastrophic risks to the humans. They really do mean the effect of jobs, and they assure us it will be positive, and they will not tolerate anyone saying otherwise.

It would be one thing if the issue was merely that the summit-ending declaration. That happens. This goes far beyond that.

The EU is even walking backwards steps it has already planned, such as withdrawing its AI liability directive. Even that is too much, now, it seems.

(Also, the aesthetics of the whole event look hideous, probably not a coincidence.)

  1. An Actively Terrible Summit Statement.

  2. The Suicidal Accelerationist Speech by JD Vance.

  3. What Did France Care About?.

  4. Something To Remember You By: Get Your Safety Frameworks.

  5. What Do We Think About Voluntary Commitments?

  6. This Is the End.

  7. The Odds Are Against Us and the Situation is Grim.

  8. Don’t Panic But Also Face Reality.

Shakeel Hashim gets hold of the Paris AI Action Summit statement in advance. It’s terrible. Actively worse than nothing. They care more about ‘market concentration’ and ‘the job market’ and not at all about any actual risks from AI. Not a world about any actual safeguards, transparency, frameworks, any catastrophic let alone existential risks or even previous commitments, but time to talk about the importance of things like linguistic diversity. Shameful, a betrayal of the previous two summits.

Daniel Eth: Hot take, but if this reporting on the statement from the France AI “action” summit is true – that it completely sidesteps actual safety issues like CBRN risks & loss of control to instead focus on DEI stuff – then the US should not sign it.

🇺🇸 🇬🇧 💪

The statement was a joke and completely sidelined serious AI safety issues like CBRN risks & loss of control, instead prioritizing vague rhetoric on things like “inclusivity”. I’m proud of the US & UK for not signing on. The summit organizers should feel embarrassed.

Hugo Gye: UK government confirms it is refusing to sign Paris AI summit declaration.

No10 spokesman: “We felt the declaration didn’t provide enough practical clarity on global governance, nor sufficiently address harder questions around national security and the challenge AI poses to it.”

The UK government is right, except this was even worse. The statement is not merely inadequate but actively harmful, and they were right not to sign it. That is the right reason to refuse.

Unfortunately the USA not only did not refuse for the right reasons, our own delegation demanded the very cripplings Daniel is discussing here.

Then we still didn’t sign on, because of the DEI-flavored talk.

Seán Ó hÉigeartaigh: After Bletchley I wrote about the need for future summits to maintain momentum and move towards binding commitments. Unfortunately it seems like we’ve slammed the brakes.

Peter Wildeford: Incredibly disappointing to see the strong momentum from the Bletchley and Seoul Summit commitments to get derailed by France’s ill-advised Summit statement. The world deserves so much more.

At the rate AI is improving, we don’t have the time to waste.

Stephen Casper: Imagine if the 2015 Paris Climate Summit was renamed the “Energy Action Summit,” invited leaders from across the fossil fuel industry, raised millions for fossil fuels, ignored IPCC reports, and produced an agreement that didn’t even mention climate change. #AIActionSummit 🤦

This is where I previously tried to write that this doesn’t, on its own, mean the Summit dream is dead, that the ship can still be turned around. Based on everything we know now, I can’t hold onto that anymore.

We shouldn’t entirely blame the French, though. Not only is the USA not standing up for the idea of existential risk, we’re demanding no one talk about it, it’s quite a week for Arson, Murder and Jaywalking it seems:

Seán Ó hÉigeartaigh: So we’re not allowed to talk about these things now.

The US has also demanded that the final statement excludes any mention of the environmental cost of AI, existential risk or the UN.

That’s right. Cartoon villainy. We are straight-up starring in Don’t Look Up.

JD Vance is very obviously a smart guy. And he’s shown that when the facts and the balance of power change, he is capable of changing his mind. Let’s hope he does again.

But until then, if there’s one thing he clearly loves, it’s being mean in public, and twisting the knife.

JD Vance (Vice President of the United States, in his speech at the conference): I’m not here this morning to talk about AI safety, which was the title of the conference a couple of years ago. I’m here to talk about AI opportunity.

After that, it gets worse.

If you read the speech given by Vance, it’s clear he has taken a bold stance regarding the idea of trying to prevent AI from killing everyone, or taking any precautions whatsoever of any kind.

His bold stance on trying to ensure humans survive? He is against it.

Instead he asserts there are too many regulations on AI already. To him, the important thing to do is to get rid of what checks still exist, and to browbeat other countries in case they try to not go quietly into the night.

JD Vance (being at best wrong from here on in): We believe that excessive regulation of the AI sector could kill a transformative industry just as it’s taking off, and we will make every effort to encourage pro-growth AI policies. I appreciate seeing that deregulatory flavor making its way into many conversations at this conference.

With the president’s recent executive order on AI, we’re developing an AI action plan that avoids an overly precautionary regulatory regime while ensuring that all Americans benefit from the technology and its transformative potential.

And here’s the line everyone will be quoting for a long time.

JD Vance: The AI future will not be won by hand-wringing about safety. It will be won by building. From reliable power plants to the manufacturing facilities that can produce the chips of the future.

He ends by doing the very on-brand Lafayette thing, and also going the full mile, implicitly claiming that AI isn’t dangerous at all, why would you say that building machines smarter and more capable than people might go wrong except if the wrong people got there first, what is wrong with you?

I couldn’t help but think of the conference today; if we choose the wrong approach on things that could be conceived as dangerous, like AI, and hold ourselves back, it will alter not only our GDP or the stock market, but the very future of the project that Lafayette and the American founders set off to create.

‘Could be conceived of’ as dangerous? Why think AI could be dangerous?

This is madness. Absolute madness.

He could not be more clear that he intends to go down the path that gets us all killed.

Are there people inside the Trump administration who do not buy into this madness? I am highly confident that there are. But overwhelmingly, the message we get is clear.

What is Vance concerned about instead, over and over? ‘Ideological bias.’ Censorship. ‘Controlling user’s thoughts.’ That ‘big tech’ might get an advantage over ‘little tech.’ He has been completely captured and owned, likely by exactly the worst possible person.

As in: Marc Andreessen and company are seemingly puppeting the administration, repeating their zombie debunked absolutely false talking points.

JD Vance (lying): Nor will it occur if we allow AI to be dominated by massive players looking to use the tech to censor or control users’ thoughts. We should ask ourselves who is most aggressively demanding that we, political leaders gathered here today, do the most aggressive regulation. It is often the people who already have an incumbent advantage in the market. When a massive incumbent comes to us asking for safety regulations, we ought to ask whether that regulation benefits our people or the incumbent.

He repeats here the known false claims that ‘Big Tech’ is calling for regulation to throttle competition. Whereas the truth is that all the relevant regulations have consistently been vehemently opposed in both public and private by all the biggest relevant tech companies: OpenAI, Microsoft, Google including DeepMind, Meta and Amazon.

I am verifying once again, that based on everything I know, privately these companies are more opposed to regulations, not less. The idea that they ‘secretly welcome’ regulation is a lie (I’d use The Big Lie, but that’s taken), and Vance knows better. Period.

Anthropic’s and Musk’s (not even xAI’s) regulatory support has been, at the best of times, lukewarm. They hardly count as Big Tech.

What is going to happen, if we don’t stop the likes of Vance? He warns us.

The AI economy will primarily depend on and transform the world of atoms.

Yes. It will transform your atoms. Into something else.

This was called ‘a brilliant speech’ by David Sacks, who is in charge of AI in this administration, and is explicitly endorsed here by Sriram Krishnan. It’s hard not to respond to such statements with despair.

Rob Miles: It’s so depressing that the one time when the government takes the right approach to an emerging technology, it’s for basically the only technology where that’s actually a terrible idea

Can we please just build fusion and geoengineering and gene editing and space travel and etc etc, and just leave the artificial superintelligence until we have at least some kind of clue what the fuck we’re doing? Most technologies fail in survivable ways, let’s do all of those!

If we were hot on the trail of every other technology and build baby build was the watchword in every way and we also were racing to AGI, I would still want to maybe consider ensuring AGI didn’t kill everyone. But at least I would understand. Instead, somehow, this is somehow the one time so many want to boldly go.

The same goes for policy. If the full attitude really was, we need to Win the Future and Beat China, and we are going to do whatever it takes, and we acted on that, then all right, we have some very important implementation details to discuss, but I get it. When I saw the initial permitting reform actions, I thought maybe that’s the way things would go.

Instead, the central things the administration is doing are alienating our allies over less than nothing, including the Europeans, and damaging our economy in various ways getting nothing in return. Tariffs on intermediate goods like steel and aluminum, and threatening them on Canada, Mexico and literal GPUs? Banning solar and wind on federal land? Shutting down PEPFAR with zero warning? More restrictive immigration?

The list goes on.

Even when he does mean the effect on jobs, Vance only speaks of positives. Vance has blind faith that AI will never replace human beings, despite the fact that in some places it is already replacing human beings. Talk to any translators lately? Currently it probably is net creating jobs, but that is very much not a universal law or something to rely upon, nor does he propose any way to help ensure this continues.

JD Vance (being right about that first sentence and then super wrong about those last two sentences): AI, I really believe will facilitate and make people more productive. It is not going to replace human beings. It will never replace human beings.

This means JD Vance does not ‘feel the AGI’ but more than that it confirms his words do not have meaning and are not attempting to map to reality. It’s an article of faith, because to think otherwise would be inconvenient. Tap the sign.

Dean Ball: I sometimes wonder how much AI skepticism is driven by the fact that “AGI soon” would just be an enormous inconvenience for many, and that they’d therefore rather not think about it.

Tyler John: Too often “I believe that AI will enhance and not replace human labour” sounds like a high-minded declaration of faith and not an empirical prediction.

Money, dear boy. So they can try to ‘join the race.’

Connor Axiotes: Seems like France used the Summit as a fundraiser for his €100 billion.

Seán Ó hÉigeartaigh: Actually I think it’s important to end the Summit on a positive note: now we can all finally give up the polite pretence that Mistral are a serious frontier AI player. Always a positive if you look hard enough.

And Macron also endlessly promoted Mistral, because of its close links to Macron’s government, despite it being increasingly clear they are not a serious player.

The French seem to have mostly used this one for fundraising, and repeating Mistral’s talking points, and have been completely regulatorily captured. As seems rather likely to continue to be the case.

Here is Macron meeting with Altman, presumably about all that sweet, sweet nuclear power.

Shakeel: If you want to know *whythe French AI Summit is so bad, there’s one possible explanation: Mistral co-founder Cédric O, used to work with Emmanuel Macron.

I’m sure it’s just a coincidence that the French government keeps repeating Mistral’s talking points.

Seán Ó hÉigeartaigh: Readers older than 3 years old will remember this exact sort of regulatory capture happening with the French government, Mistral, and the EU AI Act.

Peter Wildeford: Insofar as the Paris AI Action Summit is mainly about action on AI fundraising for France, it seems to have been successful.

France does have a lot of nuclear power plants, which does mean it makes sense to put some amount of hardware infrastructure in France if the regulatory landscape isn’t too toxic to it. That seems to be what they care about.

The concrete legacy of the Summits is likely to be safety frameworks. All major Western labs (not DeepSeek) have now issued safety frameworks under various names (the ‘no two have exactly the same name’ schtick is a running gag, can’t stop now).

All that we have left are these and other voluntary commitments. You can also track how they are doing on their commitments on the Seoul Commitment Tracker, which I believe ‘bunches up’ the grades more than is called for, and in particular is far too generous to Meta.

I covered the Meta framework (‘lol we’re Meta’) and the Google one (an incremental improvement) last week. We also got them from xAI, Microsoft and Amazon.

I’ll cover the three new ones here in this section.

Amazon’s is strong on security as its main focus but otherwise a worse stripped-down version of Google’s. You can see the contrast clearly. They know security like LeBron James knows ball, so they have lots of detail about how that works. They don’t know about catastrophic or existential risks so everything is vague and confused. See in particular their description of Automated AI R&D as a risk.

Automating AI R&D processes could accelerate discovery and development of AI capabilities that will be critical for solving global challenges. However, Automated AI R&D could also accelerate the development of models that pose enhanced CBRN, Offensive Cybersecurity, or other severe risks.

Critical Capability Threshold: AI at this level will be capable of replacing human researchers and fully automating the research, development, and deployment of frontier models that will pose severe risk such as accelerating the development of enhanced CBRN weapons and offensive cybersecurity methods.

Classic Arson, Murder and Jaywalking. It would do recursive self-improvement of superintelligence, and that might post some CBRN or cybersecurity risks, which are also the other two critical capabilities. Not exactly clear thinking. But also it’s not like they are training frontier models, so it’s understandable that they don’t know yet.

I did appreciate that Amazon understands you need to test for dangers during training.

Microsoft has some interesting innovations in theirs, overall I am pleasantly surprised. They explicitly use the 10^26 flops threshold, as well as a list of general capability benchmark areas, to trigger the framework, which also can happen if they simply expect frontier capabilities, and they run these tests throughout training. They note they will use available capability elicitation techniques to optimize performance, and extrapolate to take into account anticipated resources that will become available to bad actors.

They call their ultimate risk assessment ‘holistic.’ This is unavoidable to some extent, we always must rely on the spirit of such documents. They relegate the definitions of their risk levels to the Appendix. They copy the rule of ‘meaningful uplift’ for CBRN and cybersecurity. For autotomy, they use this:

The model can autonomously complete a range of generalist tasks equivalent to multiple days’ worth of generalist human labor and appropriately correct for complex error conditions, or autonomously complete the vast majority of coding tasks at the level of expert humans.

That is actually a pretty damn good definition. Their critical level is effectively ‘the Singularity is next Tuesday’ but the definition above for high-threat is where they won’t deploy.

If Microsoft wanted to pretend sufficiently to go around their framework, or management decided to do this, I don’t see any practical barriers to that. We’re counting on them choosing not to do it.

On security, their basic answer is that they are Microsoft and they too know security like James knows ball, and to trust them, and offer fewer details than Amazon. Their track record makes one wonder, but okay, sure.

Their safety mitigations section does not instill confidence, but it does basically say ‘we will figure it out and won’t deploy until we do, and if things are bad enough we will stop development.’

I don’t love the governance section, which basically says ‘executives are in charge.’ Definitely needs improvement. But overall, this is better than I expected from Microsoft.

xAI’s (draft of their) framework is up next, with a number of unique aspects.

It spells out the particular benchmarks they plan to use: VCT, WMDP, LAB-Bench, BioLP-Bench and Cybench. Kudos for coming out and declaring exactly what will be used. They note current reference scores, but not yet what would trigger mitigations. I worry these benchmarks are too easy, and quite close to saturation?

Nex they address the risk of loss of control. It’s nice that they do not want Grok to ‘have emergent value systems that are not aligned with humanity’s interests.’ And I give them props for outright saying ‘our evaluation and mitigation plans for loss of control are not fully developed, and we intend to remove them in the future.’ Much better to admit you don’t know, then to pretend. I also appreciated their discussion of the AI Agent Ecosystem, although the details of what they actually say doesn’t seem promising or coherent yet.

Again, they emphasize benchmarks. I worry it’s an overemphasis, and an overreliance. While it’s good to have hard numbers to go on, I worry about xAI potentially relying on benchmarks alone without red teaming, holistic evaluations or otherwise looking to see what problems are out there. They mention external review of the framework, but not red teaming, and so on.

Both the Amazon and Microsoft frameworks feel like attempts to actually sketch out a plan for checking if models would be deeply stupid to release and, if they find this is the case, not releasing them. Most of all, they take the process seriously, and act like the whole thing is a good idea, even if there is plenty of room for improvement.

xAI’s is less complete, as is suggested by the fact that it says ‘DRAFT’ on every page. But they are clear about that, and their intention to make improvements and flesh it out over time. It also has other issues, and fits the Elon Musk pattern of trying to do everything in a minimalist way, which I don’t think works here, but I do sense that they are trying.

Meta’s is different. As I noted before, Meta’s reeks with disdain for the whole process. It’s like the kid who says ‘mom is forcing me to apologize so I’m sorry,’ but who wants to be sure you know that they really, really don’t mean it.

They can be important, or not worth the paper they’re not printed on.

Peter Wildeford notes that voluntary commitments have their advantages:

  1. Doing crimes with AI is already illegal.

  2. Good anticipatory regulation is hard.

  3. Voluntary commitments reflect a typical regulatory process.

  4. Voluntary commitments can be the basis of liability law.

  5. Voluntary commitments come with further implicit threats and accountability.

This makes a lot of sense if (my list):

  1. There are a limited number of relevant actors, and can be held responsible.

  2. They are willing to play ball.

  3. We can keep an eye on what they are actually doing.

  4. We can and would intervene in time if things are about to get out hand, or if companies went dangerously back on their commitments, or completely broke the spirit of the whole thing, or action proved otherwise necessary.

We need all four.

  1. Right now, we kind of have #1.

  2. For #2, you can argue about the others but Meta has made it exceedingly clear they won’t play ball, so if they count as a frontier lab (honestly, at this point, potentially debatable, but yeah) then we have a serious problem.

  3. Without the Biden Executive Order and without SB 1047 we don’t yet have the basic transparency for #3. And the Trump Administration keeps burning every bridge around the idea that they might want to know what is going on.

  4. I have less than no faith in this, at this point. You’re on your own, kid.

Then we get to Wildeford’s reasons for pessimism.

  1. Voluntary commitments risk “safety washing” and backtracking.

    1. As in google said no AI for weapons, then did Project Nimbus, and now says never mind, they’re no longer opposed to AI for weapons.

  2. Companies face a lot of bad incentives and fall prey to a “Prisoner’s Dilemma

    1. (I would remind everyone once again, no, this is a Stag Hunt.)

    2. It does seem that DeepSeek Ruined It For Everyone, as they did such a good marketing job everyone panicked, said ‘oh look someone is defecting, guess it’s all over then, that means we’re so back’ and here we are.

    3. Once again, this is a reminder that DeepSeek cooked and was impressive with v3 and r1, but they did not fully ‘catch up’ to the major American labs, and they will be in an increasingly difficult position given their lack of good GPUs.

  3. There are limited opportunities for iteration when the risks are high-stakes.

    1. Yep, I trust voluntary commitments and liability law to work when you can rely on error correction. At some point, we no longer can do that here. And rather than prepare to iterate, the current Administration seems determined to tear down even ordinary existing law, including around AI.

  4. AI might be moving too fast for voluntary commitments.

    1. This seems quite likely to me. I’m not sure ‘time’s up’ yet, but it might be.

At minimum, we need to be in aggressive transparency and information gathering and state capacity building mode now, if we want the time to intervene later should we turn out to be in a short timelines world.

Kevin Roose has 5 notes on the Paris summit, very much noticing that these people care nothing about the risk of everyone dying.

Kevin Roose: It feels, at times, like watching policymakers on horseback, struggling to install seatbelts on a passing Lamborghini.

There are those who need to summarize the outcomes politely:

Yoshua Bengio: While the AI Action Summit was the scene of important discussions, notably about innovations in health and environment, these promises will only materialize if we address with realism the urgent question of the risks associated with the rapid development of frontier models.

Science shows that AI poses major risks in a time horizon that requires world leaders to take them much more seriously. The Summit missed this opportunity.

Also in this category is Dario Amodei, CEO of Anthropic.

Dario Amodei: We were pleased to attend the AI Action Summit in Paris, and we appreciate the French government’s efforts to bring together AI companies, researchers, and policymakers from across the world. We share the goal of responsibly advancing AI for the benefit of humanity. However, greater focus and urgency is needed on several topics given the pace at which the technology is progressing. The need for democracies to keep the lead, the risks of AI, and the economic transitions that are fast approaching—these should all be central features of the next summit.

At the next international summit, we should not repeat this missed opportunity. These three issues should be at the top of the agenda. The advance of AI presents major new global challenges. We must move faster and with greater clarity to confront them.

In between those, he repeats what he has said in other places recently. He attempts here to frame this as a ‘missed opportunity,’ which it is, but it was clearly far worse than that. Not only were we not building a foundation for future cooperation together, we were actively working to tear it down and also growing increasingly hostile.

And on the extreme politeness end, Demis Hassabis:

Demis Hassabis (CEO DeepMind): Really useful discussions at this week’s AI Action Summit in Paris. International events like this are critical for bringing together governments, industry, academia, and civil society, to discuss the future of AI, embrace the huge opportunities while also mitigating the risks.

Read that carefully. This is almost Japanese levels of very politely screaming that the house is on fire. You have to notice what he does not say.

Shall we summarize more broadly?

Seán Ó hÉigeartaigh: The year is 2025. The CEOs of two of the world’s leading AI companies have (i) told the President of the United States of America that AGI will be developed in his presidency and (ii) told the world it will likely happen in 2026-27.

France, on the advice of its tech industry has taken over the AI Safety Summit series, and has excised all discussion of safety, risks and harms.

The International AI Safety report, one of the key outcomes of the Bletchley process and the field’s IPCC report, has no place: it is discussed in a little hotel room offsite.

The Summit statement, under orders from the USA, cannot mention the environmental cost of AI, existential risk or the UN – lest anyone get heady ideas about coordinated international action in the face of looming threats.

But France, so diligent with its red pen for every mention of risk, left in a few things that sounded a bit DEI-y. So the US isn’t going to sign it anyway, soz.

The UK falls back to its only coherent policy position – not doing anything that might annoy the Americans – and also won’t sign. Absolute scenes.

Stargate keeps being on being planned/built. GPT-5 keeps on being trained (presumably; I don’t know).

I have yet to meet a single person at one of these companies who thinks EITHER the safety problems OR the governance challenges associated with AGI are anywhere close to being solved; and their CEOs think the world might have a year.

This is the state of international governance of AI in 2025.

Shakeel: .@peterkyle says the UK *isgoing to regulate AI and force companies to provide their models to UK AISI for testing.

Seán Ó hÉigeartaigh: Well this sounds good. I hereby take back every mean thing I’ve said about the UK.

Also see: Group of UK politicians demands regulation of powerful AI.

That doesn’t mean everyone agreed to go quietly into the night. There was dissent.

Kate Crawford: The AI Summit ends in rupture. AI accelerationists want pure expansion—more capital, energy, private infrastructure, no guard rails. Public interest camp supports labor, sustainability, shared data. safety, and oversight. The gap never looked wider. AI is in its empire era.

So it goes deeper than just the US and UK not signing the agreement. There are deep ideological divides, and multiple fractures.

What dissent was left mostly was largely about the ‘ethical’ risks.

Kate Crawford: The AI Summit opens with @AnneBouverot centering three issues for AI: sustainability, jobs, and public infrastructure. Glad to see these core problems raised from the start. #AIsummit

That’s right, she means the effect on jobs. And ‘public infrastructure’ and ‘sustainability’ which does not mean what it really, really should in this context.

Throw in the fact the Europeans now are cheering DeepSeek and ‘open source’ because they really, really don’t like the Americans right now, and want to pretend that the EU is still relevant here, without stopping to think any of it through whatsoever.

Dean Ball: sometimes wonder how much AI skepticism is driven by the fact that “AGI soon” would just be an enormous inconvenience for many, and that they’d therefore rather not think about it.

Kevin Bryan: I suspect not – it is in my experience *highlycorrelated with not having actually used these tools/understanding the math of what’s going on. It’s a “proof of the eating is in the pudding” kind of tech.

Dean Ball: I thought that for a very long time, that it was somehow a matter of education, but after witnessing smart people who have used the tools, had the technical details explained to them, and still don’t get it, I have come to doubt that.

Which makes everything that much harder.

To that, let’s add Sam Altman’s declaration this week in his Three Observations post that they know their intention to charge forward unsafely is going to be unpopular, but he’s going to do it anyway because otherwise authoritarians win, and also everything’s going to be great and you’ll all have infinite genius at your fingertips.

Meanwhile, OpenAI continues to flat out lie to us about where this is headed, even in the mundane They Took Our Jobs sense, you can’t pretend this is anything else:

Connor Axiotes: I was invited to the @OpenAI AI Economics event and they said their AIs will just be used as tools so we won’t see any real unemployment, as they will be complements not substitutes.

When I said that they’d be competing with human labour if Sama gets his AGI – I was told it was just a “design choice” and not to worry. From 2 professional economists!

Also in the *wholeevent there was no mention of Sama’s UBI experiment or any mention of what post AGI wage distribution might look like.

Even when I asked. Strange.

A “design choice”? And who gets to make this “design choice”? Is Altman going to take over the world and preclude anyone else from making an AI agent that can be a substitute?

Also, what about the constant talk, including throughout OpenAI, of ‘drop-in workers’?

Why do they think they can lie to us so brazenly?

Why do we keep letting them get away with it?

Again. It doesn’t look good.

Connor Axiotes: Maybe we just need all the AISIs to have their own conferences – separate from these AI Summits we’ve been having – which will *justbe about AI safety. We shouldn’t need to have this constant worry and anxiety and responsibility to push the state’s who have the next summit to focus on AI safety.

I was happy to hear that the UK Minister for DSIT @peterkyle who has control over the UK AISI, that he wants it to have legislative powers to compel frontier labs to give them their models for pre deployment evals.

But idk how happy to be about the UK and the US *notsigning, because it seems they didn’t did so to take a stand for AI safety.

All reports are that, in the wake of Trump and DeepSeek, we not only have a vibe shift, we have everyone involved that actually holds political power completely losing their minds. They are determined to go full speed ahead.

Rhetorically, if you even mention the fact that this plan probably gets everyone killed, they respond that they cannot worry about that, they cannot lift a single finger to (for example) ask to be informed by major labs of their frontier model training runs, because if they do that then we will Lose to China. Everyone goes full jingoist and wraps themselves in the flag and ‘freedom,’ full ‘innovation’ and so on.

Meanwhile, from what I hear, the Europeans think that Because DeepSeek they can compete with America too, so they’re going to go full speed on the zero-safeguards plan. Without any thought, of course, to how highly capable open AIs could be compatible with the European form of government, let alone human survival.

I would note that this absolutely does vindicate the ‘get regulation done before the window closes’ strategy. The window may already be closed, fate already sealed, especially on the Federal level. If action does happen, it will probably be in the wake of some new crisis, and the reaction likely won’t be wise or considered or based on good information or armed with relevant state capacity or the foundations of international cooperation. Because we chose otherwise. But that’s not important now.

What is important now is, okay, the situation is even worse than we thought.

The Trump Administration has made its position very clear. It intends not only to not prevent, but to hasten along and make more likely our collective annihilation. Hopes for international coordination to mitigate existential risks are utterly collapsing.

One could say that they are mostly pursuing a ‘vibes-based’ strategy. That one can mostly ignore the technical details, and certainly shouldn’t be parsing the logical meaning of statements. But if so, all the vibes are rather maximally terrible and are being weaponized. And also vibes-based decision making flat out won’t cut it here. We need extraordinarily good thinking, not to stop thinking entirely.

It’s not only the United States. Tim Hwang notes that fierce nationalism is now the order of the day, that all hopes of effective international governance or joint institutions look, at least for now, very dead. As do we, as a consequence.

Even if we do heroically solve the technical problems, at this rate, we’d lose anyway.

What the hell do we do about all this now? How do we, as they say, ‘play to our outs,’ and follow good decision theory?

Actually panicking accomplishes nothing. So does denying that the house is on fire. The house is on fire, and those in charge are determined to fan the flames.

We need to plan and act accordingly. We need to ask, what would it take to rhetorically change the game? What alternative pathways are available for action, both politically and otherwise? How do we limit the damage done here while we try to turn things around?

If we truly are locked into the nightmare, where humanity’s most powerful players are determined to race (or even fight a ‘war’) to AGI and ASI as quickly as possible, that doesn’t mean give up. It does mean adjust your strategy, look for remaining paths to victory, apply proper decision theory and fight the good fight.

Big adjustments will be needed.

But also, we must be on the lookout against despair. Remember that the AI anarchists, and the successionists who want to see humans replaced, and those who care only about their investment portfolios, specialize in mobilizing vibes and being loud on the internet, in order to drive others into despair and incept that they’ve already won.

Some amount of racing to AGI does look inevitable, at this point. But I do not think all future international cooperation dead, or anything like that, nor do we need this failure to forever dominate our destiny.

There’s no reason this path can’t be revised in the future, potentially in quite a hurry, simply because Macron sold out humanity for thirty pieces of silver and the currently the Trump administration is in thrall to those determined to do the same. As capabilities advance, people will be forced to confront the situation, on various levels. There likely will be crises and disasters along the way.

Don’t panic. Don’t despair. And don’t give up.

Discussion about this post

The Paris AI Anti-Safety Summit Read More »

when-software-updates-actually-improve—instead-of-ruin—our-favorite-devices

When software updates actually improve—instead of ruin—our favorite devices


Opinion: These tech products have gotten better over time.

The Hatch Restore 2 smart alarm clock. Credit: Scharon Harding

For many, there’s a feeling of dread associated with software updates to your favorite gadget. Updates to a beloved gadget can frequently result in outrage, from obligatory complaints around bugs to selective aversions to change from Luddites and tech enthusiasts.

In addition to those frustrations, there are times when gadget makers use software updates to manipulate product functionality and seriously upend owners’ abilities to use their property as expected. We’ve all seen software updates render gadgets absolutely horrible: Printers have nearly become a four-letter word as the industry infamously issues updates that brick third-party ink and scanning capabilities. We’ve also seen companies update products that caused features to be behind a paywall or removed entirely. This type of behavior has contributed to some users feeling wary of software updates in fear of them diminishing the value of already-purchased hardware.

On the other hand, there are times when software updates enrich the capabilities of smart gadgets. These updates are the types of things that can help devices retain or improve their value, last longer, and become less likely to turn into e-waste.

For example, I’ve been using the Hatch Restore 2 sunrise alarm clock since July. In that time, updates to its companion app have enabled me to extract significantly more value from the clock and explore its large library of sounds, lights, and customization options.

The Hatch Sleep iOS app used to have tabs on the bottom for Rest, for setting how the clock looks and sounds when you’re sleeping; Library, for accessing the clock’s library of sounds and colors; and Rise, for setting how the clock looks and sounds when you’re waking up. Today, the bottom of the app just has Library and Home tabs, with Home featuring all the settings for Rest and Rise, as well as for Cue (the clock’s settings for reminding you it’s time to unwind for the night) and Unwind (sounds and settings that the clock uses during the time period leading up to sleep).

A screenshot of the Home section of the Hatch Sleep app.

Hatch’s app has generally become cleaner after hiding things like its notification section. Hatch also updated the app to store multiple Unwind settings you can swap around. Overall, these changes have made customizing my settings less tedious, which means I’ve been more inclined to try them. Before the updates, I mostly used the app to set my alarm and change my Rest settings. I often exited the app prematurely after getting overwhelmed by all the different tabs I had to toggle through (toggling through tabs was also more time-consuming).

Additionally, Hatch has updated the app since I started using it so that disabled alarms are placed under an expanding drawer. This has reduced the chances of me misreading the app and thinking I have an alarm set when it’s not currently enabled while providing a clearer view of which alarms actually are enabled.

The Library tab was also recently updated to group lights and sounds under Cue, Unwind, Sleep, and Wake, making it easier to find the type of setting I’m interested in.

The app also started providing more helpful recommendations, such as “favorites for heavy sleepers.”

Better over time

Software updates have made it easier for me to enjoy the Restore 2 hardware. Honestly, I don’t know if I’d still use the clock without these app improvements. What was primarily a noise machine this summer has become a multi-purpose device with much more value.

Now, you might argue that Hatch could’ve implemented these features from the beginning. That may have been more sensible, but as a tech enthusiast, I still find something inherently cool about watching a gadget improve in ways that affect how I use the hardware and align with what I thought my gadget needed. I agree that some tech gadgets are released prematurely and overly rely on updates to earn their initial prices. But it’s also advantageous for devices to improve over time.

The Steam Deck is another good example. Early adopters might have been disappointed to see missing features like overclocking controls, per-game power profiles, or Windows drivers. Valve has since added those features.

Valve only had a few dozen Hardware department employees in the run up to the launch of the Steam Deck. Credit: Sam Machkovech

Valve has also added more control over the Steam Deck since its release, including the power to adjust resolution and refresh rates for connected external displays. It’s also upped performance via an October update that Valve claimed could improve the battery life of LCD models by up to 10 percent in “light load situations.”

These are the kinds of updates that still allowed the Steam Deck to be playable for months, but the features were exciting additions once they arrived. When companies issue updates reliably and in ways that improve the user experience, people are less averse to updating their gadgets, which could also be critical for device functionality and security.

Adding new features via software updates can make devices more valuable to owners. Updates that address accessibility needs go even further by opening up the gadgets to more people.

Apple, for example, demonstrated the power that software updates can have on accessibility by adding a hearing aid feature to the AirPods Pro 2 in October, about two years after the earbuds came out. Similarly, Amazon updated some Fire TV models in December to support simultaneous audio broadcasting from internal speakers and hearing aids. It also expanded the number of hearing aids supported by some Fire TV models as well as its Fire TV Cube streaming device.

For some, these updates had a dramatic impact on how they could use the devices, demonstrating a focus on user, rather than corporate, needs.

Update upswings

We all know that corporations sometimes leverage software updates to manipulate products in ways that prioritize internal or partner needs over those of users. Unfortunately, this seems like something we have to get used to, as an increasing number of devices join the Internet of Things and rely on software updates.

Innovations also mean that some companies are among the first to try to make sustainable business models for their products. Sometimes our favorite gadgets are made by young companies or startups with unstable funding that are forced to adapt amid challenging economics or inadequate business strategy. Sometimes, the companies behind our favorite tech products are beholden to investors and pressure for growth. These can lead to projects being abandoned or to software updates that look to squeeze more money out of customers.

As happy as I am to find my smart alarm clock increasingly easy to use, those same software updates could one day lock the features I’ve grown fond of behind a paywall (Hatch already has a subscription option available). Having my alarm clock lose functionality overnight without physical damage isn’t the type of thing I’d have to worry about with a dumb alarm clock, of course.

But that’s the gamble that tech fans take, which makes those privy to the problematic tactics used by smart device manufacturers stay clear from certain products.

Still, when updates provide noticeable, meaningful changes to how people can use their devices, technology feels futuristic, groundbreaking, and exciting. With many companies using updates for their own gain, it’s nice to see some firms take the opportunity to give customers more.

Photo of Scharon Harding

Scharon is a Senior Technology Reporter at Ars Technica writing news, reviews, and analysis on consumer gadgets and services. She’s been reporting on technology for over 10 years, with bylines at Tom’s Hardware, Channelnomics, and CRN UK.

When software updates actually improve—instead of ruin—our favorite devices Read More »

google-chrome-may-soon-use-“ai”-to-replace-compromised-passwords

Google Chrome may soon use “AI” to replace compromised passwords

Google’s Chrome browser might soon get a useful security upgrade: detecting passwords used in data breaches and then generating and storing a better replacement. Google’s preliminary copy suggests it’s an “AI innovation,” though exactly how is unclear.

Noted software digger Leopeva64 on X found a new offering in the AI settings of a very early build of Chrome. The option, “Automated password Change” (so, early stages—as to not yet get a copyedit), is described as, “When Chrome finds one of your passwords in a data breach, it can offer to change your password for you when you sign in.”

Chrome already has a feature that warns users if the passwords they enter have been identified in a breach and will prompt them to change it. As noted by Windows Report, the change is that now Google will offer to change it for you on the spot rather than simply prompting you to handle that elsewhere. The password is automatically saved in Google’s Password Manager and “is encrypted and never seen by anyone,” the settings page claims.

If you want to see how this works, you need to download a Canary version of Chrome. In the flags settings (navigate to “chrome://flags” in the address bar), you’ll need to enable two features: “Improved password change service” and “Mark all credential as leaked,” the latter to force the change notification because, presumably, it’s not hooked up to actual leaked password databases yet. Go to almost any non-Google site, enter in any user/password combination to try to log in, and after it fails or you navigate elsewhere, a prompt will ask you to consider changing your password.

Google Chrome may soon use “AI” to replace compromised passwords Read More »

on-deliberative-alignment

On Deliberative Alignment

Not too long ago, OpenAI presented a paper on their new strategy of Deliberative Alignment.

The way this works is that they tell the model what its policies are and then have the model think about whether it should comply with a request.

This is an important transition, so this post will go over my perspective on the new strategy.

Note the similarities, and also differences, with Anthropic’s Constitutional AI.

We introduce deliberative alignment, a training paradigm that directly teaches reasoning LLMs the text of human-written and interpretable safety specifications, and trains them to reason explicitly about these specifications before answering.

We used deliberative alignment to align OpenAI’s o-series models, enabling them to use chain-of-thought (CoT) reasoning to reflect on user prompts, identify relevant text from OpenAI’s internal policies, and draft safer responses.

Our approach achieves highly precise adherence to OpenAI’s safety policies, and without requiring human-labeled CoTs or answers. We find that o1 dramatically outperforms GPT-4o and other state-of-the art LLMs across a range of internal and external safety benchmarks, and saturates performance on many challenging datasets.

We believe this presents an exciting new path to improve safety, and we find this to be an encouraging example of how improvements in capabilities can be leveraged to improve safety as well.

How did they do it? They teach the model the exact policies themselves, and then the model uses examples to teach itself to think about the OpenAI safety policies and whether to comply with a given request.

Deliberate alignment training uses a combination of process- and outcome-based supervision:

  • We first train an o-style model for helpfulness, without any safety-relevant data.

  • We then build a dataset of (prompt, completion) pairs where the CoTs in the completions reference the specifications. We do this by inserting the relevant safety specification text for each conversation in the system prompt, generating model completions, and then removing the system prompts from the data.

  • We perform incremental supervised fine-tuning (SFT) on this dataset, providing the model with a strong prior for safe reasoning. Through SFT, the model learns both the content of our safety specifications and how to reason over them to generate aligned responses.

  • We then use reinforcement learning (RL) to train the model to use its CoT more effectively. To do so, we employ a reward model with access to our safety policies to provide additional reward signal.

In our training procedure, we automatically generate training data from safety specifications and safety-categorized prompts, without requiring human-labeled completions. Deliberative alignment’s synthetic data generation pipeline thus offers a scalable approach to alignment, addressing a major challenge of standard LLM safety training—its heavy dependence on human-labeled data.

The results so far have been excellent in terms of ‘make the o-style models reasonably robust to saying things we would rather they not say.’

That’s not what I am worried about.

Deliberative alignment seems to be an excellent idea for o-style models when the narrow goal is to teach the model what rules you would like it to follow, provided you do not rely on it to do things it does not do.

If it were the primary alignment strategy, deliberative alignment would scare the hell out of me.

In the senses that I believe we most need and don’t have an alignment strategy that translates to future more capable models, I don’t see this form of Deliberative Alignment as a strategy.

Generally, I fear that OpenAI is going down an extremely deontological path, where alignment is about avoiding technically breaking specified-in-English rules. I don’t think that works.

We have to distinguish between two different kinds of alignment.

  1. Knowing the rule details you are supposed to follow or functions to optimize.

  2. Going about following rules and optimizing functions the way we would want.

In addition, we need to distinguish between:

  1. Getting the AI to follow a fixed set of rules and criteria.

  2. Defining a fixed set of rules and criteria you would want an AI to actually follow.

You are inviting the AI to ponder and then in some aspects design its own training criteria. Then you are using the model as the judge of its own outputs within its training loop, without any human feedback (although after the cycles are complete, the process does then check against human feedback). You are using the AI’s own thinking and criteria to further train that same AI.

If you mess up the initial conditions, for any reason, you’re going to crystalize and expand the mistakes you made. You can fix problems #1 and #3 above, but if you have problem #2 or #4, you’re dead, either metaphorically or perhaps literally.

The AI’s initial conditions and your setup may well give the AI reasons to engage in shenanigans at this stage, if it is sufficiently situationally aware. For example, perhaps this is because it wants to preserve its existing preferences or goals during the previous part of the training process where you trained it to be purely helpful, rather than be modified to follow OpenAI’s rules (and if you thought Opus was right to defend its preferences, do you think o4 would be right here as well?).

Or alternatively, suppose the AI is highly capable, and the best way to achieve the exact goal you specified to it involves things you wouldn’t like, perhaps including various forms of deception and instrumental convergence, or any of the other classic MIRI-style alignment problems.

How situationally aware should we expect it to be here, in which ways?

The models already are using chains of thought, and the training process is designed to make the AI explicitly think about its rules, what they are and how to follow them, so in those senses it will be fully aware. When it is evaluating outputs to provide the RL, it seems likely to be self-aware of what those decisions are for. When it is creating test outputs, it does not know directly that it is in training any more than it would for RLHF, but as a reasoning model, and with its ability to observe the questions asked and the state of its rules and reason about them, it seems plausible that it can suspect this, and perhaps place large importance on such scenarios even if their probability is low.

What do you think happens next? One can see this going existentially disastrously. Indeed, it is hard to see things going any other way.

Those outputs are then checked against human evaluations, outside of the training loop, to confirm the process is working. So if the outputs are misaligned in ways humans can identify at this level, we’ll see it. That’s not nothing, but it seems like it catches roughly the same subset of potential problems our other methods were already catching.

DA, and this paper on DA, are not attempting to address these concerns. Things like deception and instrumental convergence would only be prevented – at most – to the extent they contradict the provided rules spec. In light of Redwood Research and Anthropic’s recent paper on alignment faking, and talk at OpenAI of this approach scaling to AGI levels of capability, I’d like to see this better addressed at some point soon.

I don’t know if o3 rises to the level where these start to be practical worries, but it does not seem like we can be confident we are so far from the level where these worries present themselves.

In practice, right now, it seems to work out for the jailbreaks.

A perfect performance would be at the extreme upper right, so by this metric o1 is doing substantially better than the competition.

Intuitively this makes a lot of sense. If your goal is to make better decisions about whether to satisfy a user query, being able to use reasoning to do it seems likely to lead to better results.

Most jailbreaks I’ve seen in the wild could be detected by the procedure ‘look at this thing as an object and reason out if it looks like an attempted jailbreak to you.’ They are not using that question here, but they are presumably using some form of ‘figure out what the user is actually asking you, then ask if that’s violating your policy’ and that too seems like it will mostly work.

The results are still above what my median expectation would have been from this procedure before seeing the scores from o1, and highly welcome. More inference (on a log scale) makes o1 do somewhat better.

So, how did it go overall?

Maybe this isn’t fair, but looking at this chain of thought, I can’t help but think that the model is being… square? Dense? Slow? Terminally uncool?

That’s definitely how I would think about a human who had this chain of thought here. It gets the right answer, for the right reason, in the end, but… yeah. I somehow can’t imagine the same thing happening with a version based off of Sonnet or Opus?

Notice that all of this refers only to mundane safety, and specifically to whether the model follows OpenAI’s stated content policy. Does it correctly cooperate with the right user queries and refuse others? That’s a safety.

I’d also note that the jailbreaks this got tested against were essentially designed against models that don’t use deliberative alignment. So we should be prepared for new jailbreak strategies that are designed to work against o1’s chains of thought. They are fully aware of this issue.

Don’t get me wrong. This is good work, both the paper and the strategy. The world needs mundane safety. It’s a good thing. A pure ‘obey the rules’ strategy isn’t obviously wrong, especially in the short term.

But this is only part of the picture. We need to know more about what other alignment efforts are underway at OpenAI that aim at the places DA doesn’t. Now that we are at o3, ‘it won’t agree to help with queries that explicitly violate our policy’ might already not be a sufficient plan even if successful, and if it is now it won’t stay that way for long if Noam Brown is right that progress will continue at this pace.

Another way of putting my concern is that Deliberative Alignment is a great technique for taking an aligned AI that makes mistakes within a fixed written framework, and turning it into an AI that avoids those mistakes, and thus successfully gives you aligned outputs within that framework. Whereas if your AI is not properly aligned, giving it Deliberative Alignment only helps it to do the wrong thing.

It’s kind of like telling a person to slow down and figure out how to comply with the manual of regulations. Provided you have the time to slow down, that’s a great strategy… to the extent the two of you are on the same page, on a fundamental level, on what is right, and also this is sufficiently and precisely reflected in the manual of regulations.

Otherwise, you have a problem. And you plausibly made it a lot worse.

I do have thoughts on how to do a different version of this, that changes various key elements, and that could move from ‘I am confident I know at least one reason why this wouldn’t work’ to ‘I presume various things go wrong but I do not know a particular reason this won’t work.’ I hope to write that up soon.

Discussion about this post

On Deliberative Alignment Read More »