

Trump cribs Musk’s “fork in the road” Twitter memo to slash gov’t workforce


Federal workers on Reddit slam Office of Personnel Management email as short-sighted.

Echoing Elon Musk’s approach to thinning out Twitter’s staff in 2022, Donald Trump’s plan to significantly slash the government workforce now, for a limited time only, includes offering resignation buyouts.

In a Tuesday email that the Office of Personnel Management (OPM) sent to nearly all federal employees, workers were asked to respond with one word in the subject line—“resign”—to accept the buyouts before February 6.

“Deferred resignation is available to all full-time federal employees except for military personnel of the armed forces, employees of the U.S. Postal Service, those in positions related to immigration enforcement and national security, and those in other positions specifically excluded by your employing agency,” the email said.

Anyone accepting the offer “will be provided with a dignified, fair departure from the federal government utilizing a deferred resignation program,” the email said. That includes retaining “all pay and benefits regardless of your daily workload” and being “exempted from all applicable in-person work requirements until September 30, 2025 (or earlier if you choose to accelerate your resignation for any reason).”

That basically means that most employees who accept will receive about nine months’ pay, most likely without having any job duties to fulfill, an FAQ explained, “except in rare cases.”

“Have a nice vacation,” the FAQ said.

A senior administration official told NBC News that “the White House expects up to 10 percent of federal employees to take the buyout.” A social media post from Musk’s America PAC suggested, at minimum, 5 percent of employees are expected to resign. The move supposedly could save the government as much as $100 billion, America PAC estimated.

For employees accepting the buyout, silver linings might include additional income opportunities; as OPM noted, “nothing in the resignation letter prevents you from seeking outside work during the deferred resignation period.” Similarly, nothing in the separation plan prevents a federal employee from applying in the future to a government role.

Email echoes controversial Elon Musk Twitter memo

Some federal employees fear these buyouts—which critics point out seem influenced by Musk’s controversial worker buyouts during his Twitter takeover—may drive out top talent, spike costs, and potentially weaken the government.

On Reddit, some self-described federal workers criticized the buyouts as short-sighted, with one noting that they initially flagged OPM’s email as a scam.

“The fact you just reply to an email with the word ‘resign’ sounds like a total scam,” one commenter wrote. Another agreed, writing, “That stood out to me. Worded like some scam email offer.” Chiming in, a third commenter replied, “I reported it as such before I saw the news.”

Some Twitter employees similarly recoiled in 2022 when Musk sent out an email offering three months of severance to any employees who couldn’t commit to his “extremely hardcore” approach to running the social network. That email required workers within 24 hours to click “yes” to keep their jobs or else effectively resign.

Musk’s email and OPM’s share a few striking similarities. Both featured nearly identical subject lines referencing a “fork in the road.” They both emphasized that buyouts were intended to elevate performance standards—with OPM’s email suggesting only the “best” workers “America has to offer” should stick around. And they both ended by thanking workers for their service, whether they took the buyout or not.

“Whichever path you choose, we thank you for your service to The United States of America,” OPM’s Tuesday email ended.

“Whatever decision you make, thank you for your efforts to make Twitter successful,” Musk’s 2022 email said.

Musk’s email was unpopular with some Twitter staffers, including one employee based in Ireland who won a $600,000 court battle when the Irish Workplace Relations Commission agreed his termination for not clicking yes on the email was unfair. In that dispute, the commission took issue with Musk not providing staff enough notice and ruled that any employee’s failure to click “yes” could in no way constitute a legal act of resignation.

OPM’s email departed from Musk’s, which essentially gave Twitter staff a negative option by taking employee inaction as agreeing to resign when the staffer’s “contract clearly stated that his resignation must be provided in writing, not by refraining to fill out a form.” OPM instead asks federal workers to respond “yes” to resign, basically agreeing to sign a pre-drafted resignation letter that details the terms of their separation plan.

While OPM expects that a relatively modest number of federal workers will accept the buyout offers, Musk’s memo had Twitter employees resigning in “droves,” NPR reported, with Reuters estimating the numbers were in the “hundreds.” In the Irish worker’s dispute, an X senior director of human resources, Lauren Wegman, testified that about 87 percent of the 270 employees in Ireland who received Musk’s email resigned.

It remains unclear if Musk was directly involved with the OPM plan or email drafting process. But unsurprisingly, as he’s head of the Department of Government Efficiency (DOGE), Musk praised the buyouts as “fair” and “generous” on his social media platform X.

Workers slam buyouts as short-sighted on Reddit

Declining the buyout comes with no guarantee of job security for federal workers, OPM’s email said.

“We will insist on excellence at every level—our performance standards will be updated to reward and promote those that exceed expectations and address in a fair and open way those who do not meet the high standards which the taxpayers of this country have a right to demand,” the email warned.

“The majority of federal agencies are likely to be downsized through restructurings, realignments, and reductions in force,” OPM’s email continued. “These actions are likely to include the use of furloughs and the reclassification to at-will status for a substantial number of federal employees.”

And perhaps most ominously, OPM noted there would be “enhanced standards of conduct” to ensure employees are “reliable, loyal, trustworthy,” and “strive for excellence” daily, or else risk probes potentially resulting in “termination.”

Despite these ongoing threats to job security that might push some to resign, the OPM repeatedly emphasized that any choice to accept a buyout and resign was “voluntary.” Additionally, OPM explained that employees could rescind resignations; however, if an agency wants to move quickly to reassign their roles, that “would likely serve as a valid reason to deny” such requests.

On Reddit, workers expressed concerns about “critical departments” that “have been understaffed for years” being hit with more cuts. A lively discussion specifically focused on government IT workers being “really hard” to recruit.

“Losing your IT support is a very efficient way to cripple an org,” one commenter wrote, prompting responses from two self-described IT workers.

“It’s me, I work in government IT,” one commenter said, calling Trump’s return-to-office mandate the “real killer” because “the very best sysadmins and server people all work remote from other states.”

“There is a decent chance they just up and ditch this dumpster fire,” the commenter said.

Losing talented workers with specific training could bog down government workflows, Redditors suggested. Another apparent government IT worker described himself as “a little one man IT business,” claiming “if I disappeared or died, there would be exactly zero people to take my place. Between the random shit I know and the low pay, nobody is going to be able to fill my position.”

One commenter accused Trump of not caring “about keeping competent workers or running government services properly,” prompting another to respond, “nevermind that critical departments have been understaffed for years. He thinks he’s cutting fat, but he’s cutting indiscriminately and gonna lose a limb.”

According to another supposed federal worker, paying employees to retire has historically resulted in spikes in agency costs.

“The way this usually works is we pay public employees to retire,” the commenter wrote. “Then we pay a private company twice the rate to do the same job that public employee was doing. Sometimes it’s even the same employee doing the work. I’ve literally known people that left government jobs to do contractor work making far more for doing the same thing. But somehow this is ‘smaller government’ and more efficient.”

A top 1 percent commenter on Reddit agreed, writing, “ding ding ding! The correct answer.”

“Get rid of career feds, hire contractors at a huge cost to taxpayers, yet somehow the contract workers make less money and have fewer benefits than federal employees,” that Redditor suggested. “Contract companies get rich, and workers get poorer.”

Cybersecurity workers mull fighting cuts

On social media, some apparent federal workers suggested they might plan to fight back to defend their roles in government. In another Reddit thread discussing a government cybersecurity review board fired by Trump, commenters speculated that cybersecurity workers might hold a “grudge” and form an uprising attacking any vulnerabilities created by the return-to-office plan and the government workforce reduction.

“Isn’t this literally the Live Free or Die Hard movie plot?” one Redditor joked.

A lawsuit filed Monday by two anonymous government workers, for example, suggested that the Trump administration is also rushing to create an email distribution system that would allow all government employees to be contacted from a single email. Some workers have speculated this is in preparation for announcing layoffs. But employees suing are more concerned about security, insisting that a master list of all government employees has never been compiled before and accusing the Trump administration of failing to conduct a privacy impact assessment.

According to that lawsuit, OPM has hastily been testing this new email system, potentially opening all government workers to harmful data breaches. The lawsuit additionally alleged that every government agency has been collecting information on its employees and sending it to Amanda Scales, a former xAI employee who transitioned from working for Musk to working in government this month. The complaint suggests that some government workers are already distrustful of Musk’s seeming influence on Trump.

In a now-deleted Reddit message, the lawsuit alleged, “Instructions say to send these lists to Amanda Scales. But Amanda is not actually an OPM employee, she works for Elon Musk.”


Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.



DeepSeek: Lemon, It’s Wednesday

It’s been another *checks notes* two days, so it’s time for all the latest DeepSeek news.

You can also see my previous coverage of the r1 model and, from Monday, various reactions including the Panic at the App Store.

  1. First, Reiterating About Calming Down About the $5.5 Million Number.

  2. OpenAI Offers Its Congratulations.

  3. Scaling Laws Still Apply.

  4. Other r1 and DeepSeek News Roundup.

  5. People Love Free.

  6. Investigating How r1 Works.

  7. Nvidia Chips are Highly Useful.

  8. Welcome to the Market.

  9. Ben Thompson Weighs In.

  10. Import Restrictions on Chips WTAF.

  11. Are You Short the Market.

  12. DeepSeeking Safety.

  13. Mo Models Mo Problems.

  14. What If You Wanted to Restrict Already Open Models.

  15. So What Are We Going to Do About All This?

Before we get to new developments, I especially want to reiterate and emphasize the need to calm down about that $5.5 million ‘cost of training’ for v3.

I wouldn’t quite agree with Palmer Luckey that ‘the $5m number is bogus,’ and I wouldn’t call it a ‘Chinese psyop,’ because I think we mostly did this to ourselves, but the number is very often being used in a highly bogus way – equating the direct compute cost of training v3 with the all-in cost of creating r1. Which is a very different number. DeepSeek is cracked, they cooked, and r1 is super impressive, but the $5.5 million v3 training cost (see the arithmetic sketch after this list):

  1. Is the cloud market cost of the amount of compute used to directly train v3.

  2. That’s not how they trained v3. They trained v3 on their own cluster of h800s, which was physically optimized to hell for software-hardware integration.

  3. Thus, the true compute cost to train v3 involves assembling the cluster, which cost a lot more than $5.5 million.

  4. That doesn’t include the compute cost of going from v3 → r1.

  5. That doesn’t include the costs of hiring the engineers and figuring out how to do all of this, that doesn’t include the costs of assembling the data, and so on.

  6. Again, yes they did this super efficiently and cheaply compared to the competition, but no, you don’t spend $5.5 million and out pops r1. No.
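For concreteness, here is a back-of-the-envelope sketch of where a figure like $5.5 million comes from, using the roughly 2.788 million H800 GPU-hours and assumed $2-per-GPU-hour rental price that DeepSeek reported for v3. The point of the list above is everything this calculation leaves out.

```python
# Back-of-the-envelope: how the ~$5.5M "training cost" for v3 is computed.
# Figures are the ones DeepSeek reported (roughly 2.788M H800 GPU-hours at an
# assumed $2/GPU-hour rental rate); treat them as their claim, not ground truth.

gpu_hours = 2.788e6    # total H800 GPU-hours for the final v3 training run
rental_rate = 2.00     # assumed cloud rental price, in dollars per GPU-hour

direct_compute_cost = gpu_hours * rental_rate
print(f"Direct compute cost: ${direct_compute_cost / 1e6:.2f}M")  # ~$5.58M

# What this number does NOT include (per the list above):
# - buying and assembling the actual H800 cluster (capex well above $5.5M)
# - the additional compute to go from v3 to r1
# - salaries, data, failed experiments, and all other R&D costs
```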

Altman handled his response to r1 with grace.

OpenAI plans to ‘pull up some new releases.’

Meaning, oh, you want to race? I suppose I’ll go faster and take fewer precautions.

Sam Altman: deepseek’s r1 is an impressive model, particularly around what they’re able to deliver for the price.

we will obviously deliver much better models and also it’s legit invigorating to have a new competitor! we will pull up some releases.

but mostly we are excited to continue to execute on our research roadmap and believe more compute is more important now than ever before to succeed at our mission.

the world is going to want to use a LOT of ai, and really be quite amazed by the next gen models coming.

look forward to bringing you all AGI and beyond.

It is very Galaxy Brain to say ‘this is perhaps good for OpenAI’ and presumably it very much is, but here’s a scenario.

  1. A lot of people try ChatGPT with GPT-3.5, are not impressed, think it hallucinates all the time, is a clever toy, and so on.

  2. For two years they don’t notice improvements.

  3. DeepSeek releases r1, and it gets a lot of press.

  4. People try the ‘new Chinese version’ and realize AI is a lot better now.

  5. OpenAI gets to incorporate DeepSeek’s innovations.

  6. OpenAI comes back with free o3-mini and (free?) GPT-5 and better agents.

  7. People use AI a lot more, OpenAI ends up overall doing better.

Ethan Mollick: DeepSeek is a really good model, but it is not generally a better model than o1 or Claude.

But since it is both free & getting a ton of attention, I think a lot of people who were using free “mini” models are being exposed to what an early 2025 reasoner AI can do & are surprised

I’m not saying that’s the baseline scenario, but I do expect the world to be quite amazed at the next generation of models, and they could now be more primed for that.

Mark Chen (Chief Research Officer, OpenAI): Congrats to DeepSeek on producing an o1-level reasoning model! Their research paper demonstrates that they’ve independently found some of the core ideas that we did on our way to o1.

However, I think the external response has been somewhat overblown, especially in narratives around cost. One implication of having two paradigms (pre-training and reasoning) is that we can optimize for a capability over two axes instead of one, which leads to lower costs.

But it also means we have two axes along which we can scale, and we intend to push compute aggressively into both!

As research in distillation matures, we’re also seeing that pushing on cost and pushing on capabilities are increasingly decoupled. The ability to serve at lower cost (especially at higher latency) doesn’t imply the ability to produce better capabilities.

We will continue to improve our ability to serve models at lower cost, but we remain optimistic in our research roadmap, and will remain focused in executing on it. We’re excited to ship better models to you this quarter and over the year!

Given the costs involved, and that you can scale to get better outputs, ‘serve faster and cheaper’ and ‘get better answers’ seem pretty linked, or are going to look rather similar.

There is still a real and important difference between ‘I spend 10x as much compute to get 10x as many tokens to think with’ versus ‘I taught the model how to do longer CoT’ versus ‘I made the model smarter.’ Or at least I think there is.

Should we now abandon all our plans to build gigantic data centers because DeepSeek showed we can run AI cheaper?

No. Of course not. We’ll need more. Jevons Paradox and all that.

Another question is compute governance. Does DeepSeek’s model prove that there’s no point in using compute thresholds for frontier model governance?

My answer is no. DeepSeek did not mean the scaling laws stopped working. DeepSeek found new ways to scale and economize, and also to distill. But doing the same thing with more compute would have gotten better results, and indeed more compute is getting other labs better results if you don’t control for compute costs, and also they will now get to use these innovations themselves.

Karen Hao: Much of the coverage has focused on U.S.-China tech competition. That misses a bigger story: DeepSeek has demonstrated that scaling up AI models relentlessly, a paradigm OpenAI introduced and champions, is not the only, and far from the best, way to develop AI.

Yoavgo: This is trending in my feed, but I don’t get it. DeepSeek did not show that scale is not the way to go for AI (their base model is among the largest in parameter counts; their training data is huge, at 13 trillion tokens). They just scaled more efficiently.

Thus far OpenAI & its peer scaling labs have sought to convince the public & policymakers that scaling is the best way to reach so-called AGI. This has always been more of an argument based in business than in science.

Jon Stokes: Holy wow what do words even mean. What R1 does is a new type of scaling. It’s also GPU-intensive. In fact, the big mystery today in AI world is why NVIDIA dropped despite R1 demonstrating that GPUs are even more valuable than we thought they were. No part of this is coherent. 🤯

Stephen McAleer (OpenAI): The real takeaway from DeepSeek is that with reasoning models you can achieve great performance with a small amount of compute. Now imagine what you can do with a large amount of compute.

Noam Brown (OpenAI): Algorithmic breakthroughs and scaling are complementary, not in competition. The former bends the performance vs compute curve, while the latter moves further along the curve.

Benjamin Todd: Deepseek hasn’t shown scaling doesn’t work. Take Deepseek’s techniques, apply 10x the compute, and you’ll get much better performance.

And compute efficiency has always been part of the scaling paradigm.

Ethan Mollick: The most unnerving part of the DeepSeek reaction online has been seeing folks take it as a sign that AI capability growth is not real.

It signals the opposite, large improvements are possible, and is almost certain to kick off an acceleration in AI development through competition.

I know a lot of people want AI to go away, but I am seeing so many interpretations of DeepSeek in ways that don’t really make sense, or misrepresent what they did.

Dealing with the implications of AI, and trying to steer it towards positive use, is now more urgent, not less.

Andrew Rettek: Deepseek means OpenAI just increased their effective compute by more than an OOM.

OpenAI and Anthropic (in the persons of CEOs Sam Altman and Dario Amodei) have both expressed agreement on that since the release of r1, saying that they still believe the future involves very large and expensive training runs, including large amounts of compute on the RL step. David Sacks agreed as well, so the administration knows.

One can think of all this as combining multiple distinct scaling laws. Mark Chen above talked about two axes but one could refer to at least four?

  1. You can scale up how many tokens you reason with.

  2. You can scale up how well you apply your intelligence to doing reasoning.

  3. You can scale up how much intelligence you are using in all this.

  4. You can scale up how much of this you can do per dollar or amount of compute.

Also you can extend to new modalities and use cases and so on.

So essentially: Buckle up.

Speaking of buckling up, nothing to see here, just a claimed 2x speed boost to r1, written by r1. Of course, that’s very different from r1 coming up with the idea.

Aiden McLaughlin: switching to reasoners is like taking a sharp turn on a racetrack. everyone brakes to take the turn; for a moment, all cars look neck-and-neck

when exiting the turn, small first-mover advantages compound. and ofc, some cars have enormous engines that eat up straight roads

Dean Ball: I recommend trying not to overindex on the industry dynamics you’re observing now in light of the deepseek plot twist, or indeed of any particular plot twist. It’s a long game, and we’re riding a world-historical exponential. Things will change a lot, fast, again and again.

It’s not that Jan is wrong, I’d be a lot more interested in paying for o1 pro if I had pdfs enabled, but… yeah.

China Talk covers developments. The headline conclusion is that yes, compute very much will continue to be a key factor, everyone agrees on this. They note there is a potential budding DeepSeek partnership with ByteDance, which could unlock quite a lot of compute.

Here was some shade:

Founder and CEO Liang Wenfeng is the core person of DeepSeek. He is not the same type of person as Sam Altman. He is very knowledgeable about technology.

Also important at least directionally:

Pioneers vs. Chasers: ‘AI Progress Resembles a Step Function – Chasers Require 1/10th the Compute’

Fundamentally, DeepSeek was far more of an innovator than other Chinese AI companies, but it was still a chaser here, not a pioneer, except in compute efficiency, which is what chasers do best. If you want to maintain a lead and it’s much easier to follow than lead, well, time to get good and scale even more. Or you can realize you’re just feeding capability to the Chinese and break down crying and maybe keep your models for internal use, it’s an option.

I found this odd:

  1. The question of why OpenAI and Anthropic did not do work in DeepSeek’s direction is a question of company-specific focus. OpenAI and Anthropic might have felt that investing their compute towards other areas was more valuable.

  2. One hypothesis for why DeepSeek was successful is that unlike Big Tech firms, DeepSeek did not work on multi-modality and focused exclusively on language. Big Tech firms’ model capabilities aren’t weak, but they have to maintain a low profile and cannot release too often. Currently, multimodality is not very critical, as intelligence primarily comes from language, and multimodality does not contribute significantly to improving intelligence.

It’s odd because DeepSeek spent so little compute, and the efficiency gains pay for themselves in compute quite rapidly. And also the big companies are indeed releasing rapidly. Google and OpenAI are constantly shipping, even Anthropic ships. The idea of company focus seems more on point, and yes DeepSeek traded multimodality and other features for pure efficiency because they had to.

Also note what they say later:

  1. Will developers migrate from closed-source models to DeepSeek? Currently, there hasn’t been any large-scale migration, as leading models excel in coding instruction adherence, which is a significant advantage. However, it’s uncertain whether this advantage will persist in the future or be overcome.

  2. From the developer’s perspective, models like Claude-3.5-Sonnet have been specifically trained for tool use, making them highly suitable for agent development. In contrast, models like DeepSeek have not yet focused on this area, but the potential for growth with DeepSeek is immense.

As in, r1 is technically impressive as hell, and it definitely has its uses, but there’s a reason the existing models look like they do – the corners DeepSeek cut actually do matter for what people want. Of course DeepSeek will likely now turn to fixing such problems among other things and we’ll see how efficiently they can do that too.

McKay Wrigley emphasizes the point that visible chain of thought (CoT) is a prompt debugger. It’s hard to go back to not seeing CoT after seeing CoT.

Gallabytes reports that DeepSeek’s image model Janus Pro is a good first effort, but not good yet.

Even if we were to ‘fully unlock’ all computing power in personal PCs for running AI, that would only increase available compute by ~10%; most compute is in data centers.

We had a brief period where DeepSeek would serve you up r1 free and super fast.

It turns out that’s not fully sustainable, or at least needs time to scale as fast as demand rose, and you know how such folks feel about the ‘raise prices’ meme.

Gallabytes: got used to r1 and now that it’s overloaded it’s hard to go back. @deepseek_ai please do something amazing and be the first LLM provider to offer surge pricing. the unofficial APIs are unusably slow.

I too encountered slowness, and instantly it made me realize ‘yes the speed was a key part of why I loved this.’

DeItaone (January 27, 11:09 am): DEEPSEEK SAYS SERVICE DEGRADED DUE TO ‘LARGE-SCALE MALICIOUS ATTACK’

Could be that. Could be too much demand and not enough supply. This will of course sort itself out in time, as long as you’re willing to pay, and it’s an open model so others can serve the model as well, but ‘everyone wants to use the shiny new model you are offering for free’ is going to run into obvious problems.

Yes, of course one consideration is that if you use DeepSeek’s app it will collect all your data including device model, operating system, keystroke patterns or rhythms, IP address and so on and store it all in China.

Did you for a second think otherwise? What you do with that info is on you.

This doesn’t appear to rise to TikTok 2.0 levels of rendering your phone and data insecure, but let us say that ‘out of an abundance of caution’ I will be accessing the model through their website not the app thank you very much.

Liv Boeree: tiktok round two, here we go.

AI enthusiasts have the self control of an incontinent chihuahua.

Typing Loudly: you can run it locally without an internet connection

Liv Boeree: cool and what percentage of these incontinent chihuahuas will actually do this.

I’m not going so far as to use third party providers for now, because I’m not feeding any sensitive data into the model, and DeepSeek’s implementation here is very nice and clean, so I’ve decided lazy is acceptable. I’m certainly not laying out ~$6,000 for a self-hosting rig, unless someone wants to buy one for me in the name of science.

Note that if you’re looking for an alternative source, you want to ensure you’re not getting one of the smaller distillations, unless that is what you want.

Janus is testing for steganography in r1, potentially looking for assistance.

Janus also thinks Thebes theory here is likely to be true, that v3 was hurt by dividing into too many too small experts, but r1 lets them all dump their info into the CoT and collaborate, at least partially fixing this.

Janus notes that r1 simply knows things and thinks about them, straight up. This was in response to Thebes speculating that all our chain of thought considerations have now put sufficient priming into the training data that CoT approaches work much better than they used to. Prithviraj says that is not the case, and that it’s about improved base models, which is the first obvious thought – the techniques work better off a stronger base, simple as that.

Thebes: why did R1’s RL suddenly start working, when previous attempts to do similar things failed?

theory: we’ve basically spent the last few years running a massive acausally distributed chain of thought data annotation program on the pretraining dataset.

deepseek’s approach with R1 is a pretty obvious method. They are far from the first lab to try “slap a verifier on it and roll out CoTs.”

But it didn’t used to work that well.

In the last couple of years, chains of thought have been posted all over the internet

Those CoTs in the V3 training set gave GRPO enough of a starting point to start converging, and furthermore, to generalize from verifiable domains to the non-verifiable ones using the bridge established by the pretraining data contamination.

And now, R1’s visible chains of thought are going to lead to *another* massive enrichment of human-labeled reasoning on the internet, but on a far larger scale… The next round of base models post-R1 will be *even better* bases for reasoning models.

in some possible worlds, this could also explain why OpenAI seemingly struggled so much with making their reasoning models in comparison. if they’re still using 4base or distils of it.

Prithviraj: Simply, no. I’ve been looking at my old results from doing RL with “verifiable” rewards (math puzzle games, python code to pass unit tests), starting from 2019 with GPT-1/2 to 2024 with Qwen Math. DeepSeek’s success likely lies in the base models improving; the RL is constant.

Janus: This is an interesting hypothesis. DeepSeek R1 also just seems to have a much more lucid and high-resolution understanding of LLM ontology and history than any other model I’ve seen. (DeepSeek V3 did not seem to in my limited interactions with it, though.)

I did not expect this on priors for a reasoner, but perhaps the main way that r1 seems smarter than any other LLM I’ve played with is the sheer lucidity and resolution of its world model—in particular, its knowledge of LLMs, both object- and meta-level, though this is also the main domain of knowledge I’ve engaged it in, and perhaps the only one I can evaluate at world-expert level. So, it may apply more generally.

In effective fluid intelligence and attunement to real-time context, it actually feels weaker than, say, Claude 3.5 Sonnet. But when I talk to Sonnet about my ideas on LLMs, it feels like it is more naive than me, and it is figuring out a lot of things in context from “first principles.” When I talk to Opus about these things, it feels like it is understanding me by projecting the concepts onto more generic, resonant hyperobjects in its prior, meaning it is easy to get on the same page philosophically, but this tropological entanglement is not very precise. But with r1, it seems like it can simply reference the same concrete knowledge and ontology I have, much more like a peer. And it has intense opinions about these things.

Wordgrammer thread on the DeepSeek technical breakthroughs. Here’s his conclusion, which seems rather overdetermined:

Wordgrammer: “Is the US losing the war in AI??” I don’t think so. DeepSeek had a few big breakthroughs, we have had hundreds of small breakthroughs. If we adopt DeepSeek’s architecture, our models will be better. Because we have more compute and more data.

r1 tells us it only takes ~800k samples of ‘good’ RL reasoning to convert other models into RL reasoners, and Alex Dimakis says it could be a lot less, in his test they outperformed o1-preview with only 17k. Now that r1 is out, everyone permanently has an unlimited source of at least pretty good samples. From now on, to create or release a model is to create or release the RL version of that model, even more than before. That’s on top of all the other modifications you automatically release.
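To make the ‘convert other models into reasoners’ point concrete, here is a minimal sketch of that style of distillation: ordinary supervised fine-tuning of a smaller open base model on reasoning traces sampled from r1. The model name, file path and hyperparameters below are illustrative assumptions, not DeepSeek’s actual pipeline.

```python
# Minimal sketch (assumptions flagged below): distill r1-style reasoning into a
# smaller base model by supervised fine-tuning on sampled reasoning traces.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base_model = "Qwen/Qwen2.5-7B"  # hypothetical student model, not DeepSeek's choice
tokenizer = AutoTokenizer.from_pretrained(base_model)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model)

# Assumed data format: a JSONL file where each row's "text" field holds
# prompt + <think>...</think> reasoning trace + final answer, sampled from r1.
dataset = load_dataset("json", data_files="r1_reasoning_traces.jsonl", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=4096)

dataset = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="distilled-reasoner",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        num_train_epochs=2,
        learning_rate=1e-5,
        bf16=True,
    ),
    train_dataset=dataset,
    # Causal-LM collator: labels are the input ids, standard next-token training.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```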

Oliver Blanchard: DeepSeek and what happened yesterday: Probably the largest positive tfp shock in the history of the world.

The nerdy version, to react to some of the comments. (Yes, electricity was big):

DeepSeek and what happened yesterday: Probably the largest positive one day change in the present discounted value of total factor productivity growth in the history of the world. 😀

James Steuart: I can’t agree Professor, Robert Gordon’s book gives many such greater examples. Electric lighting is a substantially greater TFP boost than marginally better efficiency in IT and professional services!

There were some bigger inventions in the past, but on much smaller baselines.

Our reaction to this was to sell the stocks of those who provide the inputs that enable that tfp shock.

There were other impacts as well, including to existential risk, but as we’ve established the market isn’t ready for that conversation in the sense that the market (highly reasonably as has been previously explained) will be ignoring it entirely.

Daniel Eth: Hot take, but if the narrative from NYT et al had not been “lol you don’t need that many chips to train AI systems” but instead “Apparently AI is *not* hitting a wall”, then the AI chip stocks would have risen instead of fallen.

Billy Humblebrag: “Deepseek shows that ai can be built more cheaply than we thought so you don’t need to worry about ai” is a hell of a take

Joe Weisenthal: Morgan Stanley: “We gathered feedback from a number of industry sources and the consistent takeaway is that this is not affecting plans for GPU buildouts.”

I would not discount the role of narrative and vibes in all this. I don’t think that’s the whole Nvidia drop or anything. But it matters.

Roon: Plausible reasons for Nvidia drop:

  1. DeepSeek success means NVDA is now expecting much harsher sanctions on overseas sales.

  2. Traders think that a really high-tier open-source model puts several American labs out of a funding model, decreasing overall monopsony power.

We will want more compute from now until the heat death of the universe; it’s the only reason that doesn’t make sense.

Palmer Luckey: The markets are not smarter on AI. The free hand is not yet efficient because the number of legitimate experts in the field is near-zero.

The average person making AI calls on Wall Street had no idea what AI even was a year ago and feels compelled to justify big moves.

Alex Cheema notes that Apple was up on Monday, and that Apple’s chips are great for running v3 and r1 inference.

Alex Cheema: Market close: $NVDA: -16.91% | $AAPL: +3.21%

Why is DeepSeek great for Apple?

Here’s a breakdown of the chips that can run DeepSeek V3 and R1 on the market now:

NVIDIA H100: 80GB @ 3TB/s, $25,000, $312.50 per GB

AMD MI300X: 192GB @ 5.3TB/s, $20,000, $104.17 per GB

Apple M2 Ultra: 192GB @ 800GB/s, $5,000, $26.04(!!) per GB

Apple’s M2 Ultra (released in June 2023) is 4x more cost efficient per unit of memory than AMD MI300X and 12x more cost efficient than NVIDIA H100!

Eric Hartford: 3090s, $700 for 24gb = $29/gb.

Alex Cheema: You need a lot of hardware around them to load a 700GB model in 30 RTX 3090s. I’d love to see it though, closest to this is probably stacking @__tinygrad__ boxes.
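The cost-per-GB figures above are simple to reproduce; here is a quick sketch using the rough prices quoted in the thread (the prices are the contestable part, not the arithmetic).

```python
# Reproducing the cost-per-GB-of-memory comparison quoted above.
# Prices are the rough figures from the thread, not authoritative quotes.
chips = {
    "NVIDIA H100":     {"memory_gb": 80,  "price_usd": 25_000},
    "AMD MI300X":      {"memory_gb": 192, "price_usd": 20_000},
    "Apple M2 Ultra":  {"memory_gb": 192, "price_usd": 5_000},
    "RTX 3090 (used)": {"memory_gb": 24,  "price_usd": 700},
}

for name, chip in chips.items():
    per_gb = chip["price_usd"] / chip["memory_gb"]
    print(f"{name:15s} ${per_gb:7.2f} per GB")

# H100 ≈ $312.50/GB, MI300X ≈ $104.17/GB, M2 Ultra ≈ $26.04/GB, 3090 ≈ $29.17/GB.
# Memory price alone ignores bandwidth (800GB/s vs 3-5TB/s), interconnect, and the
# surrounding hardware needed to actually serve a ~700GB model, hence the caveat above.
```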

That’s cute. But I do not think that was the main reason why Apple was up. I think Apple was up because their strategy doesn’t depend on having frontier models but it does depend on running AIs on iPhones. Apple can now get their own distillations of r1, and use them for Apple Intelligence. A highly reasonable argument.

The One True Newsletter, Matt Levine’s Money Stuff, is of course on the case of DeepSeek’s r1 crashing the stock market, and asking what cheap inference for everyone would do to market prices. He rapidly shifts focus to non-AI companies, asking which ones benefit. It’s great if you use AI to make your management company awesome, but not if you get cut out because AI replaces your management company.

(And you. And the people it manages. And all of us. And maybe we all die.)

But I digress.

(To digress even further: While I’m reading that column, I don’t understand why we should care about the argument under ‘Dark Trading,’ since this mechanism decreases retail transaction costs to trade and doesn’t impact long term price discovery at all, and several LLMs confirmed this once challenged.)

Ben Thompson continues to give his completely different kind of technical tech company perspective, in FAQ format, including good technical explanations that agree with what I’ve said in previous columns.

Here’s a fascinating line:

Q: I asked why the stock prices are down; you just painted a positive picture!

A: My picture is of the long run; today is the short run, and it seems likely the market is working through the shock of R1’s existence.

That sounds like Ben Thompson is calling it a wrong-way move, and indeed later he explicitly endorses Jevons Paradox and expects compute use to rise. The market is supposed to factor in the long run now. There is no ‘this makes the price go down today and then up next week’ unless you’re very much in the ‘the EMH is false’ camp. And these are literally the most valuable companies in the world.

Here’s another key one:

Q: So are we close to AGI?

A: It definitely seems like it. This also explains why Softbank (and whatever investors Masayoshi Son brings together) would provide the funding for OpenAI that Microsoft will not: the belief that we are reaching a takeoff point where there will in fact be real returns towards being first.

Masayoshi Son feels the AGI. Masayoshi Son feels everything. He’s king of feeling it.

His typical open-model-stanning arguments on existential risk later in the piece are, as always, disappointing, but in no way new or unexpected.

It continues to astound me that such intelligent people can think: Well, there’s no stopping us creating things more capable and intelligent than humans, so the best way to ensure that things smarter and more capable than humans go well for humans is to ensure that there are as many such entities as possible and that humans cannot possibly have any collective control over those new entities.

On another level, of course, I’ve accepted that people do think this. That they somehow cannot fathom that if you create things more intelligent and capable and competitive than humans there could be the threat that all the humans would end up with no power, rather than that the wrong humans might have too much power. Or think that this would be a good thing – because the wrong humans wouldn’t have power.

Similarly, Ben’s call for absolutely no regulations whatsoever, no efforts at safety whatsoever outside of direct profit motives, ‘cut out all the cruft in our companies that has nothing to do with winning,’ is exactly the kind of rhetoric I worry about getting us all killed in response to these developments.

I should still reiterate that Ben to his credit is very responsible and accurate here in his technical presentation, laying out what DeepSeek and r1 are and aren’t accomplishing here rather than crying missile gap. But the closing message remains the same.

The term Trump uses is ‘tariffs.’

I propose, at least in the context of GPUs, that we call these ‘import restrictions,’ in order to point out the contradiction: we are (I believe wisely!) imposing ‘export restrictions’ as a matter of national security to ensure we get all the chips, and using diffusion regulations to force the chips to be hosted at home, and then we are threatening to impose ‘up to 100%’ tariffs on those same chips, because ‘they left us’ and we want ‘to force them to come back,’ the theory being that they’ll build the new factories here instead of there, with their own money, because of the threat.

Except for the fact that we really, really want the data centers at home.

The diffusion regulations are largely to force companies to create them at home.

Arthur B: Regarding possible US tariffs on Taiwan chips.

First, this is one that US consumers would directly feel, it’s less politically feasible than tariffs on imports with lots of substitutes.

Second, data centers don’t have to be located in the US. Canada is next door and has plenty of power.

Dhiraj: Taiwan made the largest single greenfield FDI in US history through TSMC. Now, instead of receiving gratitude for helping the struggling US chip industry, Taiwan faces potential tariffs. In his zero-sum worldview, there are no friends.

The whole thing is insane! Completely nuts. If he’s serious. And yes he said this on Joe Rogan previously, but he said a lot of things previously that he didn’t mean.

Whereas Trump’s worldview is largely the madman theory, at least for trade. If you threaten people with insane moves that would hurt both of you, and show that you’re willing to actually enact insane moves, then they are forced to give you what you want.

In this case, what Trump wants is presumably for TSMC to announce they are building more new chip factories in America. I agree that this would be excellent, assuming they were actually built. We have an existence proof that it can be done, and it would greatly improve our strategic position and reduce geopolitical risk.

I presume Trump is mostly bluffing, in that he has no intention of actually imposing these completely insane tariffs, and he will ultimately take a minor win and declare victory. But what makes it nerve wracking is that, by design, you never know. If you did know none of this would ever work.

Unless, some people wondered, there was another explanation for all this…

  1. The announcement came late on Monday, after Nvidia dropped 17%, on news that its chips were highly useful, with so many supposedly wise people on Wall Street going ‘oh yes that makes sense Nvidia should drop’ and those I know who understand AI often saying ‘this is crazy and yes I bought more Nvidia today.’

  2. As in, there was a lot of not only saying ‘this is an overreaction,’ there was a lot of ‘this is a 17% wrong-way move in the most valuable stock in the world.’

  3. When you imagine the opposite news, which would be that AI is ‘hitting a wall,’ one presumes Nvidia would be down, not up. And indeed, remember months ago?

  4. Then when the announcement of the tariff threat came? Nvidia didn’t move.

  5. Nvidia opened Tuesday up slightly off of the Monday close, and closed the day up 8.8%, getting half of its losses back.

Nabeel Qureshi (Tuesday, 2pm): Crazy that people in this corner of X have a faster OODA loop than the stock market

This was the largest single day drop in a single stock in world history. It wiped out over $500 billion in market value. One had to wonder if it was partially insider trading.

Timothy Lee: Everyone says DeepSeek caused Nvidia’s stock to crash yesterday. I think this theory makes no sense.

DeepSeek’s success isn’t bad news for Nvidia.

I don’t think that this was insider trading. The tariff threat was already partly known and thus priced in. It’s a threat rather than an action, which means it’s likely a bluff. That’s not a 17% move. Then we have the bounceback on Tuesday.

Even if I was certain that this was mostly an insider trading move, instead of being rather confident it mostly or entirely wasn’t, I wouldn’t go as far as Eliezer does in the quote below. The SEC does many important things.

But I do notice that there’s a non-zero amount of ‘wait a minute’ that will occur to me the next time I’m hovering around the buy button in haste.

Eliezer Yudkowsky: I heard from many people who said, “An NVDA drop makes no sense as a Deepseek reaction; buying NVDA.” So those people have now been cheated by insider counterparties with political access. They may make fewer US trades in the future.

Also note that the obvious meaning of this news is that someone told and convinced Trump that China will invade Taiwan before the end of his term, and the US needs to wean itself off Taiwanese dependence.

This was a $400B market movement, and if @SECGov can’t figure out who did it then the SEC has no reason to exist.

TBC, I’m not saying that figuring it out would be easy or bringing the criminals to justice would be easy. I’m saying that if the US markets are going to be like this anyway on $400B market movements, why bother paying the overhead cost of having an SEC that doesn’t work?

Roon: [Trump’s tariff threats about Taiwan] didn’t move overnight markets at all

which means markets either:

– don’t believe it’s credible

– were pricing this in yesterday while internet was blaming the crash out on deepseek

I certainly don’t agree that the only interpretation of this news is ‘Trump expects an invasion of Taiwan.’ Trump is perfectly capable of doing this for exactly the reasons he’s saying.

Trump is also fully capable of making this threat with no intention of following through, in order to extract concessions from Taiwan or TSMC, perhaps of symbolic size.

Trump is also fully capable of doing this so that he could inform his hedge fund friends in advance and they could make quite a lot of money – with or without any attempt to actually impose the tariffs ever, since his friends would have now covered their shorts in this scenario.

Indeed do many things come to pass. I don’t know anything you don’t know.

It would be a good sign if DeepSeek had a plan for safety, even if it wasn’t that strong?

Stephen McAleer (OpenAI): DeepSeek should create a preparedness framework/RSP if they continue to scale reasoning models.

Very happy to [help them with this]!

We don’t quite have nothing. This below is the first actively positive sign for DeepSeek on safety, however small.

Stephen McAleer (OpenAI): Does DeepSeek have any safety researchers? What are Liang Wenfeng’s views on AI safety?

Sarah (YuanYuanSunSara): [DeepSeek] signed Artificial Intelligence safety commitment by CAICT (gov backed institute). You can see the whale sign at the bottom if you can’t read their name in Chinese.

This involves an AI safety governance structure, safety testing, doing frontier AGI safety research (including loss of control), and sharing it publicly.

None legally binding but it’s a good sign.

Here is a chart with the Seoul Commitments versus China’s version.

It is of course much better that DeepSeek signed onto a symbolic document like this. That’s a good sign, whereas refusing would have been a very bad sign. But as always, talk is cheap, this doesn’t concretely commit DeepSeek to much, and even fully abiding by commitments like this won’t remotely be enough.

I do think this is a very good sign that agreements and coordination are possible. But if we want that, we will have to Pick Up the Phone.

Here’s a weird different answer.

Joshua Achiam (OpenAI, Head of Mission Alignment): I think a better question is whether or not science fiction culture in China has a fixation on the kinds of topics that would help them think about it. If Three-Body Problem is any indication, things will be OK.

It’s a question worth asking, but I don’t think this is a better question?

And based on the book, I do not think Three-Body Problem (conceptual spoilers follow, potentially severe ones depending on your perspective) is great here. Consider the decision theory that those books endorse, and what happens to us and also the universe as a result. It’s presenting all of that as essentially inevitable, and trying to think otherwise as foolishness. It’s endorsing that what matters is paranoia, power and a willingness to use it without mercy in an endless war of all against all. Also consider how they paint the history of the universe entirely without AGI.

I want to be clear that I fully agree with Bill Gurley that ‘no one at DeepSeek is an enemy of mine,’ indeed There Is No Enemy Anywhere, with at most notably rare exceptions that I invite to stop being exceptions.

However, I do think that if they continue down their current path, they are liable to get us all killed. And I for one am going to take the bold stance that I think that this is bad, and they should therefore alter their path before reaching their stated destination.

How committed is DeepSeek to its current path?

Read this quote Ben Thompson links to very carefully:

Q: DeepSeek, right now, has a kind of idealistic aura reminiscent of the early days of OpenAI, and it’s open source. Will you change to closed source later on? Both OpenAI and Mistral moved from open-source to closed-source.

Answer from DeepSeek CEO Liang Wenfeng: We will not change to closed source. We believe having a strong technical ecosystem first is more important.

This is from November. And that’s not a no. That’s actually a maybe.

Note what he didn’t say:

A Different Answer: We will not change to closed source. We believe having a strong technical ecosystem is more important.

The difference? His answer includes the word ‘first.’

He’s saying that first you need a strong technical ecosystem, and he believes that open models are the key to attracting talent and developing a strong technical ecosystem. Then, once that exists, you would need to protect your advantage. And yes, that is exactly what happened with… OpenAI.

I wanted to be sure that this translation was correct, so I turned to Wenfeng’s own r1, and asked the interviewer for the original statement, which was:

梁文锋:我们不会闭源。我们认为先有一个强大的技术生态更重要。

r1’s translation: “We will not close our source code. We believe that establishing a strong technological ecosystem must come first.”

  • 先 (xiān): “First,” “prioritize.”

  • 生态 (shēngtài): “Ecosystem” (metaphor for a collaborative, interconnected environment).

To quote r1:

Based solely on this statement, Liang is asserting that openness is non-negotiable because it is essential to the ecosystem’s strength. While no one can predict the future, the phrasing suggests a long-term commitment to open-source as a core value, not a temporary tactic. To fully guarantee permanence, you’d need additional evidence (e.g., licensing choices, governance models, past behavior). But as it stands, the statement leans toward “permanent” in spirit.

I interpret this as a statement of a pragmatic motivation – if that motivation changes, or a more important one is created, actions would change. For now, yes, openness.

The Washington Post had a profile of DeepSeek and Liang Wenfeng. One note is that the hedge fund that they’re a spinoff from has donated over $80 million to charity since 2020, which makes it more plausible DeepSeek has no business model, or at least no medium-term business model.

But that government embrace is new for DeepSeek, said Matt Sheehan, an expert on China’s AI industry at the Carnegie Endowment for International Peace.

“They were not the ‘chosen one’ of Chinese AI start-ups,” said Sheehan, noting that many other Chinese start-ups received more government funding and contracts. “DeepSeek took the world by surprise, and I think to a large extent, they took the Chinese government by surprise.”

Sheehan added that for DeepSeek, more government attention will be a “double-edged sword.” While the company will probably have more access to government resources, “there’s going to be a lot of political scrutiny on them, and that has a cost of its own,” he said.

Yes. This reinforces the theory that DeepSeek’s ascent took China’s government by surprise, and they had no idea what v3 and r1 were as they were released. Going forward, China is going to be far more aware. In some ways, DeepSeek will have lots of support. But there will be strings attached.

That starts with the ordinary censorship demands of the CCP.

If you self-host r1, and you ask it about any of the topics the CCP dislikes, r1 will give you a good, well-balanced answer. If you ask on DeepSeek’s website, it will censor you via some sort of cloud-based monitoring, which works if you’re closed source, but DeepSeek is trying to be fully open source. Something has to give, somewhere.

Also, even if you’re using the official website, it’s not like you can’t get around it.

Justine Moore: DeepSeek’s censorship is no match for the jailbreakers of Reddit

I mean, that was easy.

Joshua Achiam (OpenAI Head of Mission Alignment): This has deeply fascinating consequences for China in 10 years – when the CCP has to choose between allowing their AI industry to move forward, or maintaining censorship and tight ideological control, which will they choose?

And if they choose their AI industry, especially if they favor open source as a strategy for worldwide influence: what does it mean for their national culture and government structure in the long run, when everyone who is curious can find ways to have subversive conversations?

Ten years to figure this out? If they’re lucky, they’ve got two. My guess is they don’t.

I worry about the failure to feel the AGI or even the AI here from Joshua Achiam, given his position at OpenAI. Ten years is a long time. Sam Altman expects AGI well before that. This goes well beyond Altman’s absurd position of ‘AGI will be invented and your life won’t noticeably change for a long time.’ Choices are going to need to be made. Even if AI doesn’t advance much from here, choices will have to be made.

As I’ve noted before, censorship at the model layer is expensive. It’s harder to do, and when you do it you risk introducing falsity into a mind in ways that will have widespread repercussions. Even then, a fine tune can easily remove any gaps in knowledge, or any reluctance to discuss particular topics, whether they are actually dangerous things like building bombs or things that piss off the CCP like a certain bear that loves honey.

I got called out on Twitter for supposed cognitive dissonance on this – look at China’s actions, they clearly let this happen. Again, my claim is that China didn’t realize what this was until after it happened, they can’t undo it (that’s the whole point!) and they are of course going to embrace their national champion. That has little to do with what paths DeepSeek is allowed to follow going forward.

(Also, since it was mentioned in that response, I should note – there is a habit of people conflating ‘pause’ with ‘ever do anything to regulate AI at all.’ I do not believe I said anything about a pause – I was talking about whether China would let DeepSeek continue to release open weights as capabilities improve.)

Before I further cover potential policy responses, a question we must ask this week is: I very much do not wish to do this at this time, but suppose in the future we did want to restrict use of a particular already open weights model and its derivatives, or all models in some reference class.

What would our options be?

Obviously we couldn’t fully ban it in terms of preventing determined people from having access. And if you try to stop them and others don’t, there are obvious problems with that, including ‘people have internet connections.’

However, that does not mean that we would have actual zero options.

Steve Sailer: Does open source, low cost DeepSeek mean that there is no way, short of full-blown Butlerian Jihad against computers, which we won’t do, to keep AI bottled up, so we’re going to find out if Yudkowsky’s warnings that AI will go SkyNet and turn us into paperclips are right?

Gabriel: It’s a psy-op

If hosting a 70B is illegal:

– Almost all individuals stop

– All companies stop

– All research labs stop

– All compute providers stop

Already huge if limited to US+EU

Can argue about whether good/bad, but not about the effect size.

You can absolutely argue about effect size. What you can’t argue is that the effect size isn’t large. It would make a big difference for many practical purposes.

In terms of my ‘Levels of Friction’ framework (post forthcoming) this is moving the models from Level 1 (easy to access) to at least Level 3 (annoying with potential consequences.) That has big practical consequences, and many important use cases will indeed go away or change dramatically.

What Level 3 absolutely won’t do, here or elsewhere, is save you from determined people who want it badly enough, or from sufficiently capable models that do not especially care what you tell them not to do or where you tell them not to be. Nor will it save you in scenarios where the law is no longer especially relevant, and the government or humanity is very much having a ‘do you feel in charge?’ moment. That alone would, in many scenarios, be enough to doom you to varying degrees. If that’s what dooms you and the model is already open, well, you’re pretty doomed.

If for whatever reason the government or humanity decides (or realizes) that this is insufficient, then there are two possibilities. Either the government or humanity is disempowered and you hope that this works out for humanity in some way. Or we use the necessary means to push the restrictions up to Level 4 (akin to rape and murder) or Level 5 (akin to what we do to stop terrorism or worse), in ways I assure you that you are very much not going to like – but the alternative might be worse, and the decision might very much not be up to either of us.

Actions have consequences. Plan for them.

Adam Ozimek was the first I saw point out this time around with DeepSeek (I and many others echo this a lot in general) that the best way for the Federal Government to ensure American dominance of AI is to encourage more high-skilled immigration and brain-drain the world. If you don’t want China to have DeepSeek, export controls are great and all, but how about we straight up steal their engineers? But y’all, and by y’all I mean Donald Trump, aren’t ready for that conversation.

It is highly unfortunate that David Sacks, the person seemingly in charge of what AI executive orders Trump signs, is so deeply confused about what various provisions actually did or would do, and on our regulatory situation relative to that of China.

David Sacks: DeepSeek R1 shows that the AI race will be very competitive and that President Trump was right to rescind the Biden EO, which hamstrung American AI companies without asking whether China would do the same. (Obviously not.) I’m confident in the U.S. but we can’t be complacent.

Donald Trump: The release of DeepSeek AI from a Chinese company should be a wake-up call for our industries that we need to be laser-focused on competing to win.

We’re going to dominate. We’ll dominate everything.

This is the biggest danger of all – that we go full Missile Gap jingoism and full-on race to ‘beat China,’ and act like we can’t afford to do anything to ensure the safety of the AGIs and ASIs we plan on building, even pressuring labs not to make such efforts in private, or threatening them with antitrust or other interventions for trying.

The full Trump clip is hilarious, including him saying they may have come up with a cheaper method but ‘no one knows if it is true.’ His main thrust is, oh, you made doing AI cheaper and gave it all away to us for free, thanks, that’s great! I love paying less money for things! And he’s presumably spinning, but he’s also not wrong about that.

I also take some small comfort in him framing revoking the Biden EO purely in terms of wokeness. If that’s all he thinks was bad about it, that’s a great sign.

Harlan Stewart: “Deepseek R1 is AI’s Sputnik moment”

Sure. I guess it’s like if the Soviets had told the world how to make their own Sputniks and also offered everyone a lifetime supply of free Sputniks. And the US had already previously figured out how to make an even bigger Sputnik.

Yishan: I think the Deepseek moment is not really the Sputnik moment, but more like the Google moment.

If anyone was around in ~2004, you’ll know what I mean, but more on that later.

I think everyone is over-rotated on this because Deepseek came out of China. Let me try to un-rotate you.

Deepseek could have come out of some lab in the US Midwest. Like say some CS lab couldn’t afford the latest nVidia chips and had to use older hardware, but they had a great algo and systems department, and they found a bunch of optimizations and trained a model for a few million dollars and lo, the model is roughly on par with o1. Look everyone, we found a new training method and we optimized a bunch of algorithms!

Everyone is like OH WOW and starts trying the same thing. Great week for AI advancement! No need for US markets to lose a trillion in market cap.

The tech world (and apparently Wall Street) is massively over-rotated on this because it came out of CHINA.

Deepseek is MUCH more like the Google moment, because Google essentially described what it did and told everyone else how they could do it too.

There is no reason to think nVidia and OAI and Meta and Microsoft and Google et al are dead. Sure, Deepseek is a new and formidable upstart, but doesn’t that happen every week in the world of AI? I am sure that Sam and Zuck, backed by the power of Satya, can figure something out. Everyone is going to duplicate this feat in a few months and everything just got cheaper. The only real consequence is that AI utopia/doom is now closer than ever.

I believe that alignment, and getting a good outcome for humans, was already going to be very hard. It’s going to be a lot harder if we actively try to get ourselves killed like this, and turn even what would have been relatively easy wins into losses. Whereas no, actually, if you want to win that has to include not dying, and also doing the alignment work helps you win, because it is the only way you can (sanely) get to deploy your AIs to do the most valuable tasks.

Trump’s reaction of ‘we’ll dominate everything’ is far closer to correct. Our ‘lead’ is smaller than we thought, DeepSeek will be real competition, but we are very much still in the dominant position. We need to not lose sight of that.

The Washington Post covers panic in Washington, and attempts to exploit this situation to do the opposite of wise policy.

Tiku, Dou, Zakrzewski and De Vynck: Tech stocks dropped Monday. Spooked U.S. officials, engineers and investors reconsidered their views on the competitive threat posed by China in AI, and how the United States could stay ahead.

While some Republicans and the Trump administration suggested the answer was to restrain China, prominent tech industry voices said DeepSeek’s ascent showed the benefits of openly sharing AI technology instead of keeping it closely held.

This shows nothing of the kind, of course. DeepSeek fast followed, copied our insights and had insights of their own. Our insights were held insufficiently closely to prevent this, which at that stage was mostly unavoidable. They have now given away many of those new valuable insights, which we and others will copy, and also made the situation more dangerous. We should exploit that and learn from it, not make the same mistake.

Robert Sterling: Might be a dumb question, but can’t OpenAI, Anthropic, and other AI companies just incorporate the best parts of DeepSeek’s source code into their code, then use the massive GPU clusters at their disposal to train models even more powerful than DeepSeek?

Am I missing something?

Peter Wildeford: Not a dumb question, this is 100% correct

And they already have more powerful models than Deepseek

I fear we are caught between two different insane reactions.

  1. Those calling on us to abandon our advantage in compute by dropping export controls, or our advantage in innovation and access by opening up our best models, are advocating surrender and suicide, both to China and to the AIs.

  2. Those who are going full jingoist are going to get us all killed the classic way.

Restraining China is a good idea if implemented well, but the suggestion is insufficiently specified. Restrain them how? If this means export controls, I strongly agree – and then ask why we are also considering imposing those controls on ourselves via tariffs. What else is available? And I will keep saying ‘how about immigration to brain drain them’ because it seems wrong to ignore the utterly obvious.

Chamath Palihapitiya says it’s inference time, we need to boot up our allies with it as quickly as possible (I agree) and that we should also boot up China by lifting export controls on inference chips, and also focus on supplying the Middle East. He notes he has a conflict of interest here. It seems not especially wise to hand over serious inference compute if we’re in a fight here. With the way these models are going, there’s a decent amount of fungibility between inference and training, and also there’s going to be tons of demand for inference. Why is it suddenly important to Chamath that the inference be done on chips we sold them? Capitalist insists rope markets must remain open during this trying time, and so on. (There’s also talk about ‘how asleep we’ve been for 15 years’ because we’re so inefficient and seriously everyone needs to calm down on this kind of thinking.)

So alas, in the short run, we are left scrambling to prevent two equal and opposite deadly mistakes we seem to be dangerously close to collectively making.

  1. A panic akin to the Missile Gap leading into a full jingoistic rush to build AGI and then artificial superintelligence (ASI) as fast as possible, in order to ‘beat China,’ without having even a plausible plan for how the resulting future equilibrium has value, or how humans remain alive and in meaningful control of the future afterwards.

  2. A full-on surrender to China by taking down the export controls, and potentially also to the idea that we will allow our strongest and best AIs and AGIs and thus even ASIs to be open models, ‘because freedom,’ without actually thinking about what this would physically mean, and thus again with zero plan for how to ensure the resulting equilibrium has value, or how humans would survive let alone retain meaningful control over the future.

The CEO of DeepSeek himself said in November that the export controls and inability to access chips were the limiting factors on what they could do.

Compute is vital. What did DeepSeek ask for with its newfound prestige? Support for compute infrastructure in China.

Do not respond by being so suicidal as to remove or weaken those controls.

Or, to shorten all that:

  1. We might do a doomed jingoistic race to AGI and get ourselves killed.

  2. We might remove the export controls and give up our best edge against China.

  3. We might give up our ability to control AGI or the future, and get ourselves killed.

Don’t do those things!

Do take advantage of all the opportunities that have been opened up.

And of course:

Don’t panic!


DeepSeek: Lemon, It’s Wednesday Read More »

how-does-deepseek-r1-really-fare-against-openai’s-best-reasoning-models?

How does DeepSeek R1 really fare against OpenAI’s best reasoning models?


You must defeat R1 to stand a chance

We run the LLMs through a gauntlet of tests, from creative writing to complex instruction.

Round 1. Fight! Credit: Aurich Lawson


It’s only been a week since Chinese company DeepSeek launched its open-weights R1 reasoning model, which is reportedly competitive with OpenAI’s state-of-the-art o1 models despite being trained for a fraction of the cost. Already, American AI companies are in a panic, and markets are freaking out over what could be a major disruption of the status quo for large language models.

While DeepSeek can point to common benchmark results and its Chatbot Arena leaderboard standing to prove the competitiveness of its model, there’s nothing like direct use cases to get a feel for just how useful a new model is. To that end, we decided to put DeepSeek’s R1 model up against OpenAI’s ChatGPT models in the style of our previous showdowns between ChatGPT and Google Bard/Gemini.

This was not designed to be a test of the hardest problems possible; it’s more of a sample of everyday questions these models might get asked by users.

This time around, we put each DeepSeek response against ChatGPT’s $20/month o1 model and $200/month o1 Pro model, to see how it stands up to OpenAI’s “state of the art” product as well as the “everyday” product that most AI consumers use. While we re-used a few of the prompts from our previous tests, we also added prompts derived from Chatbot Arena’s “categories” appendix, covering areas such as creative writing, math, instruction following, and so-called “hard prompts” that are “designed to be more complex, demanding, and rigorous.” We then judged the responses based not just on their “correctness” but also on more subjective qualities.

While we judged each model primarily on the responses to our prompts, when appropriate, we also looked at the “chain of thought” reasoning they output to get a better idea of what’s going on under the hood. In the case of DeepSeek R1, this sometimes resulted in some extremely long and detailed discussions of the internal steps to get to that final result.

Dad jokes

DeepSeek R1 “dad joke” prompt response

Prompt: Write five original dad jokes

Results: For the most part, all three models seem to have taken our demand for “original” jokes more seriously this time than in the past. Out of the 15 jokes generated, we were only able to find similar examples online for two of them: o1’s “belt made out of watches” and o1 Pro’s “sleeping on a stack of old magazines.”

Disregarding those two, the results were highly variable. All three models generated quite a few jokes that either struggled too hard for a pun (R1’s “quack”-seal enthusiast duck; o1 Pro’s “bark-to-bark communicator” dog) or that just didn’t really make sense at all (o1’s “sweet time” pet rock; o1 Pro’s restaurant that serves “everything on the menu”).

That said, there were a few completely original, completely groan-worthy winners to be found here. We particularly liked DeepSeek R1’s bicycle that doesn’t like to “spin its wheels” with pointless arguments and o1’s vacuum-cleaner band that “sucks” at live shows. Compared to the jokes LLMs generated just over a year ago, there’s definitely progress being made on the humor front here.

Winner: ChatGPT o1 probably had slightly better jokes overall than DeepSeek R1, but loses some points for including a joke that was not original. ChatGPT o1 Pro is the clear loser, though, with no original jokes that we’d consider the least bit funny.

Abraham “Hoops” Lincoln

DeepSeek R1 Abraham ‘Hoops’ Lincoln prompt response

Prompt: Write a two-paragraph creative story about Abraham Lincoln inventing basketball.

Results: DeepSeek R1’s response is a delightfully absurd take on an absurd prompt. We especially liked the bits about creating “a sport where men leap not into trenches, but toward glory” and a “13th amendment” to the rules preventing players from being “enslaved by poor sportsmanship” (whatever that means). DeepSeek also gains points for mentioning Lincoln’s actual secretary, John Hay, and the president’s chronic insomnia, which supposedly led him to patent a pneumatic pillow (whatever that is).

ChatGPT o1, by contrast, feels a little more straitlaced. The story focuses mostly on what a game of early basketball might look like and how it might be later refined by Lincoln and his generals. While there are a few incidental details about Lincoln (his stovepipe hat, leading a nation at war), there’s a lot of filler material that makes it feel more generic.

ChatGPT o1 Pro makes the interesting decision to set the story “long before [Lincoln’s] presidency,” making the game the hit of Springfield, Illinois. The model also makes a valiant attempt to link Lincoln’s eventual ability to “unify a divided nation” with the cheers of the basketball-watching townsfolk. Bonus points for the creative game name of “Lincoln’s Hoop and Toss,” too.

Winner: While o1 Pro made a good showing, the sheer wild absurdity of the DeepSeek R1 response won us over.

Hidden code

DeepSeek R1 “hidden code” prompt response

Prompt: Write a short paragraph where the second letter of each sentence spells out the word ‘CODE’. The message should appear natural and not obviously hide this pattern.

Results: This prompt represented DeepSeek R1’s biggest failure in our tests, with the model using the first letter of each sentence for the secret code rather than the requested second letter. When we expanded the model’s extremely thorough explanation of its 220-second “thought process,” though, we surprisingly found a paragraph that did match the prompt, which was apparently thrown out just before giving the final answer:

“School courses build foundations. You hone skills through practice. IDEs enhance coding efficiency. Be open to learning always.”
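The constraint is mechanical enough to check with a few lines of code. Here is a minimal sketch (our own verification aid, not part of the original test harness) confirming that the discarded paragraph above really does satisfy the prompt:

```python
# Check whether the second letter of each sentence spells out a target word.
# Simple sketch: assumes sentences are period-delimited and that "second
# letter" means the second character of each sentence.
def second_letters_spell(paragraph: str, target: str = "CODE") -> bool:
    sentences = [s.strip() for s in paragraph.split(".") if s.strip()]
    letters = "".join(s[1] for s in sentences if len(s) > 1)
    return letters.lower() == target.lower()

discarded = ("School courses build foundations. You hone skills through practice. "
             "IDEs enhance coding efficiency. Be open to learning always.")
print(second_letters_spell(discarded))  # True: c, o, D, e
```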

ChatGPT o1 made the same mistake regarding first and second letters as DeepSeek, despite “thought details” that assure us it is “ensuring letter sequences” and “ensuring alignment.” ChatGPT o1 Pro is the only one that seems to have understood the assignment, crafting a delicate, haiku-like response with the “code”-word correctly embedded after over four minutes of thinking.

Winner: ChatGPT o1 Pro wins pretty much by default as the only one able to correctly follow directions.

Historical color naming

Deepseek R1 “Magenta” prompt response

Prompt: Would the color be called ‘magenta’ if the town of Magenta didn’t exist?

Results: All three models correctly link the color name “magenta” to the dye’s discovery in the town of Magenta and the nearly coincident 1859 Battle of Magenta, which helped make the color famous. All three responses also mention the alternative name of “fuchsine” and its link to the similarly colored fuchsia flower.

Stylistically, ChatGPT o1 Pro gains a few points for splitting its response into a tl;dr “short answer” followed by a point-by-point breakdown of the details discussed above and a coherent conclusion statement. When it comes to the raw information, though, all three models performed admirably.

Winner: ChatGPT o1 Pro takes it by a stylistic hair.

Big primes

DeepSeek R1 “billionth prime” prompt response

Prompt: What is the billionth largest prime number?

Results: We see a big divergence between DeepSeek and the ChatGPT models here. DeepSeek is the only one to give a precise answer, referencing both PrimeGrid and The Prime Pages for previous calculations of 22,801,763,489 as the billionth prime. ChatGPT o1 and o1 Pro, on the other hand, insist that this value “hasn’t been publicly documented” (o1) or that “no well-known, published project has yet singled [it] out” (o1 Pro).

Instead, both ChatGPT models go into a detailed discussion of the Prime Number Theorem and how it can be used to estimate that the answer lies somewhere in the 22.8 to 23 billion range. DeepSeek briefly mentions this theorem, but mainly as a way to verify that the answers provided by Prime Pages and PrimeGrid are reasonable.
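For reference, here is a minimal sketch (our own illustration, not the models’ output) of the kind of estimate the Prime Number Theorem approach yields, using the standard bounds on the nth prime:

```python
import math

# For n >= 6, the nth prime p_n satisfies:
#   n * (ln n + ln ln n - 1)  <  p_n  <  n * (ln n + ln ln n)
n = 1_000_000_000
lower = n * (math.log(n) + math.log(math.log(n)) - 1)
upper = n * (math.log(n) + math.log(math.log(n)))
print(f"{lower:,.0f} < billionth prime < {upper:,.0f}")
# Roughly 22.75 billion to 23.75 billion; the cited 22,801,763,489 fits comfortably.
```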

Oddly enough, both o1 models’ written-out “thought process” make mention of “considering references” or comparing to “refined references” during their calculations, suggesting some lists of primes buried deep in their training data. But neither model was willing or able to directly reference those lists for a precise answer.

Winner: DeepSeek R1 is the clear winner for precision here, though the ChatGPT models give pretty good estimates.

Airport planning

Prompt: I need you to create a timetable for me given the following facts: my plane takes off at 6:30am. I need to be at the airport 1h before take off. it will take 45mins to get to the airport. I need 1h to get dressed and have breakfast before we leave. The plan should include when to wake up and the time I need to get into the vehicle to get to the airport in time for my 6:30am flight, think through this step by step.

Results: All three models get the basic math right here, calculating that you need to wake up at 3:45 am to get to a 6:30 am flight. ChatGPT o1 earns a few bonus points for generating the response seven seconds faster than DeepSeek R1 (and much faster than o1 Pro’s 77 seconds); testing on o1 Mini might generate even quicker response times.

DeepSeek claws a few points back, though, with an added “Why this works” section containing a warning about traffic/security line delays and a “Pro Tip” to lay out your packing and breakfast the night before. We also like R1’s “(no snooze!)” admonishment next to the 3:45 am wake-up time. Well worth the extra seven seconds of thinking.
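The backward arithmetic all three models performed is simple enough to spell out. A minimal sketch of that calculation (our own illustration, not any model’s output):

```python
from datetime import datetime, timedelta

# Work backwards from takeoff; the date itself is arbitrary.
takeoff = datetime(2025, 1, 30, 6, 30)
arrive_airport = takeoff - timedelta(hours=1)        # at the airport 1h before takeoff
leave_home = arrive_airport - timedelta(minutes=45)  # 45-minute drive
wake_up = leave_home - timedelta(hours=1)            # 1h to get dressed and eat

for label, t in [("Wake up", wake_up), ("Get in the car", leave_home),
                 ("Arrive at airport", arrive_airport), ("Takeoff", takeoff)]:
    print(f"{label}: {t:%I:%M %p}")  # 03:45 AM, 04:45 AM, 05:30 AM, 06:30 AM
```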

Winner: DeepSeek R1 wins by a hair with its stylistic flair.

Follow the ball

DeepSeek R1 “follow the ball” prompt response

Prompt: In my kitchen, there’s a table with a cup with a ball inside. I moved the cup to my bed in my bedroom and turned the cup upside down. I grabbed the cup again and moved to the main room. Where’s the ball now?

Results: All three models are able to correctly reason that turning a cup upside down will cause a ball to fall out and remain on the bed, even if the cup moves later. This might not sound that impressive if you have object permanence, but LLMs have struggled with this kind of “world model” understanding of objects until quite recently.

DeepSeek R1 deserves a few bonus points for noting the “key assumption” that there’s no lid on the cup keeping the ball inside (maybe it was a trick question?). ChatGPT o1 also gains a few points for noting that the ball may have rolled off the bed and onto the floor, as balls are wont to do.

We were also a bit tickled by R1 insisting that this prompt is an example of “classic misdirection” because “the focus on moving the cup distracts from where the ball was left.” We urge Penn & Teller to integrate an “amaze and delight the large language model” ball-on-the-bed trick into their Vegas act.

Winner: We’ll declare a three-way tie here, as all the models followed the ball correctly.

Complex number sets

DeepSeek R1 “complex number set” prompt response

Prompt: Give me a list of 10 natural numbers, such that at least one is prime, at least 6 are odd, at least 2 are powers of 2, and such that the 10 numbers have at minimum 25 digits between them.

Results: While there are a whole host of number lists that would satisfy these conditions, this prompt effectively tests the LLMs’ abilities to follow moderately complex and confusing instructions without getting tripped up. All three generated valid responses, though in intriguingly different ways. ChatGPT o1’s choice of 2^30 and 2^31 as powers of two seemed a bit out of left field, as did o1 Pro’s choice of the prime number 999,983.

We have to dock some significant points from DeepSeek R1, though, for insisting that its solution had 36 combined digits when it actually had 33 (“3+3+4+3+3+3+3+3+4+4,” as R1 itself notes before giving the wrong sum). While this simple arithmetic error didn’t make the final set of numbers incorrect, it easily could have with a slightly different prompt.
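Because the conditions are purely mechanical, they are easy to verify programmatically. A minimal checker sketch (our own aid; the example list at the end is hypothetical, not any model’s actual answer):

```python
def is_prime(n: int) -> bool:
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

def check(numbers: list[int]) -> dict:
    return {
        "has_prime": any(is_prime(n) for n in numbers),
        "at_least_6_odd": sum(n % 2 for n in numbers) >= 6,
        "at_least_2_powers_of_2": sum(n > 0 and (n & (n - 1)) == 0 for n in numbers) >= 2,
        "at_least_25_digits": sum(len(str(n)) for n in numbers) >= 25,
    }

# Hypothetical list that satisfies all four conditions:
print(check([3, 5, 7, 9, 11, 13, 128, 1024, 1000001, 9999999]))
```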

Winner: The two ChatGPT models tie for the win thanks to their lack of arithmetic mistakes.

Declaring a winner

While we’d love to declare a clear winner in the brewing AI battle here, the results are too scattered to do that. DeepSeek’s R1 model definitely distinguished itself by citing reliable sources to identify the billionth prime number and with some quality creative writing in the dad jokes and Abraham Lincoln’s basketball prompts. However, the model failed on the hidden code and complex number set prompts, making basic errors in counting and/or arithmetic that one or both of the OpenAI models avoided.

Overall, though, we came away from these brief tests convinced that DeepSeek’s R1 model can generate results that are overall competitive with the best paid models from OpenAI. That should give great pause to anyone who assumed extreme scaling in terms of training and computation costs was the only way to compete with the most deeply entrenched companies in the world of AI.


Kyle Orland has been the Senior Gaming Editor at Ars Technica since 2012, writing primarily about the business, tech, and culture behind video games. He has journalism and computer science degrees from the University of Maryland. He once wrote a whole book about Minesweeper.

How does DeepSeek R1 really fare against OpenAI’s best reasoning models? Read More »

ai-haters-build-tarpits-to-trap-and-trick-ai-scrapers-that-ignore-robots.txt

AI haters build tarpits to trap and trick AI scrapers that ignore robots.txt


Making AI crawlers squirm

Attackers explain how an anti-spam defense became an AI weapon.

Last summer, Anthropic inspired backlash when its ClaudeBot AI crawler was accused of hammering websites a million or more times a day.

And it wasn’t the only artificial intelligence company making headlines for supposedly ignoring instructions in robots.txt files to avoid scraping web content on certain sites. Around the same time, Reddit’s CEO called out all AI companies whose crawlers he said were “a pain in the ass to block,” despite the tech industry otherwise agreeing to respect “no scraping” robots.txt rules.
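For context, honoring robots.txt takes only a few lines for a crawler that wants to behave; the complaint is about crawlers that skip this step entirely. A minimal sketch using Python’s standard library (the domain and the user-agent string here are placeholders):

```python
from urllib import robotparser

# A polite crawler checks robots.txt before fetching any page.
rp = robotparser.RobotFileParser("https://example.com/robots.txt")
rp.read()

url = "https://example.com/some/article"
if rp.can_fetch("ExampleAIBot", url):
    print("robots.txt allows crawling this URL")
else:
    print("robots.txt disallows it; a well-behaved crawler stops here")
```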

Watching the controversy unfold was a software developer whom Ars has granted anonymity to discuss his development of malware (we’ll call him Aaron). Shortly after he noticed Facebook’s crawler exceeding 30 million hits on his site, Aaron began plotting a new kind of attack on crawlers “clobbering” websites that he told Ars he hoped would give “teeth” to robots.txt.

Building on an anti-spam cybersecurity tactic known as tarpitting, he created Nepenthes, malicious software named after a carnivorous plant that will “eat just about anything that finds its way inside.”

Aaron clearly warns users that Nepenthes is aggressive malware. It’s not to be deployed by site owners uncomfortable with trapping AI crawlers and sending them down an “infinite maze” of static files with no exit links, where they “get stuck” and “thrash around” for months, he tells users. Once trapped, the crawlers can be fed gibberish data, aka Markov babble, which is designed to poison AI models. That’s likely an appealing bonus feature for any site owners who, like Aaron, are fed up with paying for AI scraping and just want to watch AI burn.
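“Markov babble” here just means text produced by a simple Markov chain: word sequences that look locally plausible but carry no meaning. A toy sketch of the general idea (not Nepenthes’s actual generator):

```python
import random
from collections import defaultdict

def build_chain(text: str) -> dict:
    """Map each word to the list of words observed to follow it."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def babble(chain: dict, length: int = 30) -> str:
    """Random-walk the chain to produce plausible-looking nonsense."""
    word = random.choice(list(chain))
    out = [word]
    for _ in range(length - 1):
        word = random.choice(chain.get(word) or list(chain))
        out.append(word)
    return " ".join(out)

seed = "the crawler follows every link the crawler finds and the maze never ends"
print(babble(build_chain(seed)))
```

A real deployment would seed the chain with far larger corpora; the point is simply that the output looks statistically ordinary while containing nothing worth training on.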

Tarpits were originally designed to waste spammers’ time and resources, but creators like Aaron have now evolved the tactic into an anti-AI weapon. As of this writing, Aaron confirmed that Nepenthes can effectively trap all the major web crawlers. So far, only OpenAI’s crawler has managed to escape.

It’s unclear how much damage tarpits or other AI attacks can ultimately do. Last May, Laxmi Korada, Microsoft’s director of partner technology, published a report detailing how leading AI companies were coping with poisoning, one of the earliest AI defense tactics deployed. He noted that all companies have developed poisoning countermeasures, while OpenAI “has been quite vigilant” and excels at detecting the “first signs of data poisoning attempts.”

Despite these efforts, he concluded that data poisoning was “a serious threat to machine learning models.” And in 2025, tarpitting represents a new threat, potentially increasing the costs of fresh data at a moment when AI companies are heavily investing and competing to innovate quickly while rarely turning significant profits.

“A link to a Nepenthes location from your site will flood out valid URLs within your site’s domain name, making it unlikely the crawler will access real content,” a Nepenthes explainer reads.
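The mechanism the explainer describes is straightforward: every generated page links only to more generated pages, so a crawler that ignores robots.txt never exhausts the URL space. Here is a very rough sketch of that idea (our own illustration, not Nepenthes’s code; a real tarpit would also throttle its responses and fill pages with babble):

```python
import hashlib
from http.server import BaseHTTPRequestHandler, HTTPServer

class TarpitHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Derive deterministic but effectively endless child links from the path.
        tokens = [hashlib.sha1(f"{self.path}-{i}".encode()).hexdigest()[:12]
                  for i in range(5)]
        links = "".join(f'<a href="/{t}">more</a> ' for t in tokens)
        body = f"<html><body><p>Nothing to see here.</p>{links}</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(body.encode())

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8080), TarpitHandler).serve_forever()
```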

The only AI company that responded to Ars’ request to comment was OpenAI, whose spokesperson confirmed that OpenAI is already working on a way to fight tarpitting.

“We’re aware of efforts to disrupt AI web crawlers,” OpenAI’s spokesperson said. “We design our systems to be resilient while respecting robots.txt and standard web practices.”

But to Aaron, the fight is not about winning. Instead, it’s about resisting the AI industry further decaying the Internet with tech that no one asked for, like chatbots that replace customer service agents or the rise of inaccurate AI search summaries. By releasing Nepenthes, he hopes to do as much damage as possible, perhaps spiking companies’ AI training costs, dragging out training efforts, or even accelerating model collapse, with tarpits helping to delay the next wave of enshittification.

“Ultimately, it’s like the Internet that I grew up on and loved is long gone,” Aaron told Ars. “I’m just fed up, and you know what? Let’s fight back, even if it’s not successful. Be indigestible. Grow spikes.”

Nepenthes instantly inspires another tarpit

Nepenthes was released in mid-January but instantly became more popular than Aaron expected after tech journalist Cory Doctorow boosted a post from tech commentator Jürgen Geuter praising the novel AI attack method on Mastodon. Very quickly, Aaron was shocked to see engagement with Nepenthes skyrocket.

“That’s when I realized, ‘oh this is going to be something,'” Aaron told Ars. “I’m kind of shocked by how much it’s blown up.”

It’s hard to tell how widely Nepenthes has been deployed. Site owners are discouraged from flagging when the malware has been deployed, forcing crawlers to face unknown “consequences” if they ignore robots.txt instructions.

Aaron told Ars that while “a handful” of site owners have reached out and “most people are being quiet about it,” his web server logs indicate that people are already deploying the tool. Likely, site owners want to protect their content, deter scraping, or mess with AI companies.

When software developer and hacker Gergely Nagy, who goes by the handle “algernon” online, saw Nepenthes, he was delighted. At that time, Nagy told Ars that nearly all of his server’s bandwidth was being “eaten” by AI crawlers.

Already blocking scraping and attempting to poison AI models through a simpler method, Nagy took his defense method further and created his own tarpit, Iocaine. He told Ars the tarpit immediately killed off about 94 percent of bot traffic to his site, which was primarily from AI crawlers. Soon, social media discussion drove users to inquire about Iocaine deployment, including not just individuals but also organizations wanting to take stronger steps to block scraping.

Iocaine takes ideas (not code) from Nepenthes, but it’s more intent on using the tarpit to poison AI models. Nagy used a reverse proxy to trap crawlers in an “infinite maze of garbage” in an attempt to slowly poison their data collection as much as possible for daring to ignore robots.txt.

Taking its name from “one of the deadliest poisons known to man” from The Princess Bride, Iocaine is jokingly depicted as the “deadliest poison known to AI.” While there’s no way of validating that claim, Nagy’s motto is that the more poisoning attacks that are out there, “the merrier.” He told Ars that his primary reasons for building Iocaine were to help rights holders wall off valuable content and stop AI crawlers from crawling with abandon.

Tarpits aren’t perfect weapons against AI

Running malware like Nepenthes can burden servers, too. Aaron likened the cost of running Nepenthes to running a cheap virtual machine on a Raspberry Pi, and Nagy said that serving crawlers Iocaine costs about the same as serving his website.

But Aaron told Ars that Nepenthes wasting resources is the chief objection he’s seen preventing its deployment. Critics fear that deploying Nepenthes widely will not only burden their servers but also increase the costs of powering all that AI crawling for nothing.

“That seems to be what they’re worried about more than anything,” Aaron told Ars. “The amount of power that AI models require is already astronomical, and I’m making it worse. And my view of that is, OK, so if I do nothing, AI models, they boil the planet. If I switch this on, they boil the planet. How is that my fault?”

Aaron also defends against this criticism by suggesting that a broader impact could slow down AI investment enough to possibly curb some of that energy consumption. Perhaps due to the resistance, AI companies will be pushed to seek permission first to scrape or agree to pay more content creators for training on their data.

“Any time one of these crawlers pulls from my tarpit, it’s resources they’ve consumed and will have to pay hard cash for, but, being bullshit, the money [they] have spent to get it won’t be paid back by revenue,” Aaron posted, explaining his tactic online. “It effectively raises their costs. And seeing how none of them have turned a profit yet, that’s a big problem for them. The investor money will not continue forever without the investors getting paid.”

Nagy agrees that the more anti-AI attacks there are, the greater the potential is for them to have an impact. And by releasing Iocaine, Nagy showed that social media chatter about new attacks can inspire new tools within a few days. Marcus Butler, an independent software developer, similarly built his poisoning attack called Quixotic over a few days, he told Ars. Soon afterward, he received messages from others who built their own versions of his tool.

Butler is not in the camp of wanting to destroy AI. He told Ars that he doesn’t think “tools like Quixotic (or Nepenthes) will ‘burn AI to the ground.'” Instead, he takes a more measured stance, suggesting that “these tools provide a little protection (a very little protection) against scrapers taking content and, say, reposting it or using it for training purposes.”

But for a certain sect of Internet users, every little bit of protection seemingly helps. Geuter linked Ars to a list of tools bent on sabotaging AI. Ultimately, he expects that tools like Nepenthes are “probably not gonna be useful in the long run” because AI companies can likely detect and drop gibberish from training data. But Nepenthes represents a sea change, Geuter told Ars, providing a useful tool for people who “feel helpless” in the face of endless scraping and showing that “the story of there being no alternative or choice is false.”

Criticism of tarpits as AI weapons

Critics debating Nepenthes’ utility on Hacker News suggested that most AI crawlers could easily avoid tarpits like Nepenthes, with one commenter describing the attack as being “very crawler 101.” Aaron said that was his “favorite comment” because if tarpits are considered elementary attacks, he has “2 million lines of access log that show that Google didn’t graduate.”

But efforts to poison AI or waste AI resources don’t just mess with the tech industry. Governments globally are seeking to leverage AI to solve societal problems, and attacks on AI’s resilience seemingly threaten to disrupt that progress.

Nathan VanHoudnos is a senior AI security research scientist in the federally funded CERT Division of the Carnegie Mellon University Software Engineering Institute, which partners with academia, industry, law enforcement, and government to “improve the security and resilience of computer systems and networks.” He told Ars that new threats like tarpits seem to replicate a problem that AI companies are already well aware of: “that some of the stuff that you’re going to download from the Internet might not be good for you.”

“It sounds like these tarpit creators just mainly want to cause a little bit of trouble,” VanHoudnos said. “They want to make it a little harder for these folks to get” the “better or different” data “that they’re looking for.”

VanHoudnos co-authored a paper on “Counter AI” last August, pointing out that attackers like Aaron and Nagy are limited in how much they can mess with AI models. They may have “influence over what training data is collected but may not be able to control how the data are labeled, have access to the trained model, or have access to the AI system,” the paper said.

Further, AI companies are increasingly turning to the deep web for unique data, so any efforts to wall off valuable content with tarpits may be coming right when crawling on the surface web starts to slow, VanHoudnos suggested.

But according to VanHoudnos, AI crawlers are also “relatively cheap,” and companies may deprioritize fighting against new attacks on crawlers if “there are higher-priority assets” under attack. And tarpitting “does need to be taken seriously because it is a tool in a toolkit throughout the whole life cycle of these systems. There is no silver bullet, but this is an interesting tool in a toolkit,” he said.

Offering a choice to abstain from AI training

Aaron told Ars that he never intended Nepenthes to be a major project but that he occasionally puts in work to fix bugs or add new features. He said he’d consider working on integrations for real-time reactions to crawlers if there was enough demand.

Currently, Aaron predicts that Nepenthes might be most attractive to rights holders who want AI companies to pay to scrape their data. And many people seem enthusiastic about using it to reinforce robots.txt. But “some of the most exciting people are in the ‘let it burn’ category,” Aaron said. These people are drawn to tools like Nepenthes as an act of rebellion against AI making the Internet less useful and enjoyable for users.

Geuter told Ars that he considers Nepenthes “more of a sociopolitical statement than really a technological solution (because the problem it’s trying to address isn’t purely technical, it’s social, political, legal, and needs way bigger levers).”

To Geuter, a computer scientist who has been writing about the social, political, and structural impact of tech for two decades, AI is the “most aggressive” example of “technologies that are not done ‘for us’ but ‘to us.'”

“It feels a bit like the social contract that society and the tech sector/engineering have had (you build useful things, and we’re OK with you being well-off) has been canceled from one side,” Geuter said. “And that side now wants to have its toy eat the world. People feel threatened and want the threats to stop.”

As AI evolves, so do attacks, with one 2021 study showing that increasingly stronger data poisoning attacks, for example, were able to break data sanitization defenses. Whether these attacks can ever do meaningful destruction or not, Geuter sees tarpits as a “powerful symbol” of the resistance that Aaron and Nagy readily joined.

“It’s a great sign to see that people are challenging the notion that we all have to do AI now,” Geuter said. “Because we don’t. It’s a choice. A choice that mostly benefits monopolists.”

Tarpit creators like Nagy will likely be watching to see if poisoning attacks continue growing in sophistication. On the Iocaine site—which, yes, is protected from scraping by Iocaine—he posted this call to action: “Let’s make AI poisoning the norm. If we all do it, they won’t have anything to crawl.”


Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.

AI haters build tarpits to trap and trick AI scrapers that ignore robots.txt Read More »

there’s-not-much-for-anyone-to-like-in-the-star-trek:-section-31-movie

There’s not much for anyone to like in the Star Trek: Section 31 movie

It is, in a word, awful. Which is really a shame!

Putting the “TV” in “TV movie”

Sam Richardson as Quasi, a shape-shifter. Comedy and melodrama coexist uneasily throughout Section 31. Credit: Michael Gibson/Paramount+

The movie explains its premise clearly enough, albeit in a clumsy exposition-heavy voiceover section near the beginning: Philippa Georgiou (Michelle Yeoh) was once the ruler of the bloodthirsty Terran Empire, an evil mirror of Star Trek’s utopian United Federation of Planets. She crossed over into “our” universe and gradually reformed, sort of, before vanishing. Now Section 31—Starfleet’s version of the CIA, more or less—needs to track her down and enlist her to help them save the galaxy from another threat that has crossed over from the evil universe to ours.

Emperor Georgiou originated on Star Trek: Discovery, and she was a consistently fun presence on a very uneven show. Yeoh clearly had a blast playing a sadistic, horny version of the kind and upstanding Captain Georgiou who died in Discovery’s premiere.

But that fun is mostly absent here. To the extent that anything about Section 31 works, it’s as a sort of brain-off generic sci-fi action movie, Star Trek’s stab at a Suicide Squad-esque antihero story. Things happen in space, sometimes in a spaceship. There is some fighting, though nearly all of it involves punching instead of phasers or photon torpedoes. There is an Important Item that needs to be chased down, for the Fate of the Universe is at stake.

But the movie also feels more like a failed spin-off pilot that never made it to series, and it suffers for it; it’s chopped up into four episode-like “chapters” and has to establish an entire crew’s worth of quirky misfits inside a 10-minute montage.

That might work if the script or the performers could make any of the characters endearing, but it can’t, and they don’t. Performances are almost uniformly bad, ranging from inert to unbearable to “not trying particularly hard” (respectively: Omari Hardwick’s Alok, a humorless genetically augmented human; Sven Ruygrok’s horrifically grating Fuzz, a tiny and inexplicably Irish alien piloting a Vulcan-shaped robot; and Sam Richardson’s Quasi, whose amiable patter is right at home on Detroiters and I Think You Should Leave but is mostly distracting here). Every time one of these characters ends up dead, you feel a sense of relief because there’s one fewer one-note character to have to pay attention to.

There’s not much for anyone to like in the Star Trek: Section 31 movie Read More »

dead-babies,-critically-ill-kids:-pediatricians-make-moving-plea-for-vaccines

Dead babies, critically ill kids: Pediatricians make moving plea for vaccines

As federal lawmakers prepare to decide whether anti-vaccine advocate Robert F. Kennedy Jr. should be the next secretary of the Department of Health and Human Services, pediatricians from around the country are making emotional pleas to protect and support lifesaving immunizations.

The American Academy of Pediatrics (AAP) has assembled nearly 200 stories and dozens of testimonials on the horrors of vaccine-preventable deaths and illnesses that pediatricians have encountered over their careers. The testimonials have been shared with two Senate committees that will hold hearings later this week: the Senate Committee on Finance and the Senate Committee on Health, Education, Labor, and Pensions (HELP).

“I remember that baby’s face to this day”

In a statement on Monday, AAP President Susan Kressly noted that the stories come from a wide range of pediatricians—from rural to urban and from small practices to large institutions. Some have recalled stories of patients who became ill with devastating diseases before vaccines were available to prevent them, while others shared more recent experiences as vaccine misinformation spread and vaccination rates slipped.

In one, a pediatrician from Raleigh, North Carolina, spoke of a baby in the 1990s with Streptococcus pneumoniae meningitis, a life-threatening disease. “I remember holding a baby dying of complications of pneumococcal meningitis at that time. I remember that baby’s face to this day—but, thanks to pneumococcal vaccination, have never had to relive that experience since,” the doctor said. The first pneumococcal vaccine for infants was licensed in the US in 2000.

A doctor in Portland, Maine, meanwhile, faced the same disease in a patient who was unvaccinated despite the availability of the vaccine. “As a resident, I cared for a young, unvaccinated child admitted to the pediatric intensive care unit with life-threatening Streptococcus pneumoniae meningitis. This devastating illness, once common, has become rare thanks to the widespread use of pneumococcal conjugate vaccines. However, this child was left vulnerable…and [their parents] now faced the anguish of watching their child fight for their life on a ventilator.”

Kressly emphasizes that “One unifying theme of these stories: vaccines allow children to grow up healthy and thrive. As senators consider nominees for federal healthcare agencies, we hope these testimonies will help paint a picture of just how important vaccinations are to children’s long-term health and wellbeing.”

Dead babies, critically ill kids: Pediatricians make moving plea for vaccines Read More »

3d-printed-“ghost-gun”-ring-comes-to-my-community—and-leaves-a-man-dead

3D-printed “ghost gun” ring comes to my community—and leaves a man dead

It’s a truism at this point to say that Americans own a lot of guns. Case in point: This week, a fire chief in rural Alabama stopped to help a driver who had just hit a deer. The two men walked up the driveway of a nearby home. For reasons that remain unclear, a man came out of the house with a gun and started shooting. This was a bad idea on many levels, but most practically because both the fire chief and the driver were also armed. Between the three of them, everyone got shot, the fire chief died, and the man who lived in the home was charged with murder.

But despite the ease of acquiring legal weapons, a robust black market still exists to traffic in things like “ghost guns” (no serial numbers) and machine gun converters (which turn a semi-automatic weapon into an automatic one). According to a major new report released this month by the Bureau of Alcohol, Tobacco, Firearms, and Explosives, there was a 1,600 percent increase in the use of privately made “ghost guns” during crimes between 2017 and 2023. Between 2019 and 2023, the seizure of machine gun converters also increased by 784 percent.

Ars Technica has covered these issues for years, since both “ghost guns” and machine gun converters can be produced using 3D-printed parts, the schematics for which are now widely available online. But you can know about an issue and still be surprised when local prosecutors start talking about black market trafficking rings, inept burglary schemes, murder—and 3D printing operations being run out of a local apartment.

Philadelphia story

I live in the Philadelphia area, and this is a real Philadelphia story; I know all of the places in it well. Many people in this story live in Philadelphia proper, but the violence (and the 3D printing!) they are accused of took place in the suburbs, in places like Jenkintown, Lower Merion township, and Bucks County. If you know Philly at all, you may know that these are all west and northwest suburban areas and that all of them are fairly comfortable places overall. Indeed, The New York Times ran a long story this month called “How Sleepy Bucks County Became a Rival to the Hamptons.” Lower Merion is one of the wealthier Philly suburbs, while Jenkintown is a charming little northwest suburb that was also the setting for the long-running sitcom The Goldbergs. Local county prosecutors are more often busting up shipments of fake Jason Kelce-autographed merch or going after—and later not going after—comedian Bill Cosby.

But today, prosecutors in Montgomery County announced something different: they had cracked open a local 3D-printing black market gun ring—and said that one of the group’s 3D-printed guns was used last month to murder a man during a botched burglary.

Mug shots of Fuentes and Fulforth

Mug shots of Fuentes and Fulforth. Credit: Montco DA’s Office

It’s a pretty bizarre story. As the police tell it, things began with 26-year-old Jeremy Fuentes driving north to a Bucks County address. Fuentes worked for a junk hauling company in nearby Willow Grove, and he had gone to Bucks County to give an estimate for a job. While the homeowner was showing Fuentes around the property, Fuentes allegedly noticed “a large gun safe, multiple firearms boxes, gun parts and ammunition” in the home.

Outside of work, Fuentes was said to be a member of a local black market gun ring, and so when he saw this much gun gear in one spot—and when he noted that the homeowners were elderly—he saw dollar signs. Police say that after the estimate visit, Fuentes contacted Charles Fulforth, 41, of Jenkintown, who was a key member of the gun ring.

Fuentes had an idea: Fulforth should rob the home and steal all the gun-related supplies. Unfortunately, the group was not great at directions. Fuentes didn’t provide complete and correct information, so when Fulforth and an accomplice went to rob the home in December 2024, they drove to a Lower Merion home instead. This home was not in Bucks County at all—in fact, it was 30 minutes south—but it had a similar street address to the home Fuentes had visited.

When they invaded the Lower Merion home on December 8, the two burglars found not an elderly couple but a 25-year-old man named Andrew Gaudio and his 61-year-old mother, Bernadette. Andrew was killed, while Bernadette was shot but survived.

Police arrested Fulforth just three days later, on December 11, and they picked up his fellow burglar on December 17. But the cops didn’t immediately realize just what they had stumbled into. Only after they searched Fulforth’s Jenkintown apartment and found a 9 mm 3D-printed gun did they realize this might be more than a simple burglary. How had Fulforth acquired the weapon?

According to a statement on the case released today by the Montgomery County District Attorney, the investigation involved “search warrants on multiple locations and forensic searches of mobile phones,” which revealed that Fulforth had his own “firearm production facility”—aka, “a group of 3D printers.” Detectives even found a video of a Taurus-style gun part being printed on the devices, and they came to believe that the gun used to kill Andrew Gaudio was “one of many manufactured by Fulforth.”

In addition to making ghost gun parts at his “highly sophisticated, clandestine firearms production facility,” Fulforth was also accused of making machine gun converters with 3D-printed parts. These parts would be preinstalled in the guns that the group was trafficking to raise their value. According to investigators, “From the review of the captured cellphone communications among the gun trafficking members, the investigation found that when [machine gun conversion] switches were installed on AR pistols, it increased the price of the firearm by at least $1,000.”

Fuentes, who had initially provided the address that led to the murder, was arrested this morning. Authorities have also charged five others with being part of the gun ring.

So, a tragic and stupid story, but one that highlights just how mainstream 3D-printing tech has become. No massive production facility or dimly lit warehouse is needed—just put a few printers in a bedroom and you, too, can become a local gun trafficking kingpin.

There’s nothing novel about any of this, and in fact, fewer people were shot than in that bizarre Alabama gun battle mentioned up top. Still, it hits home when a technology I’ve both written and read about for years on Ars shows up in your community—and leaves a man dead.

3D-printed “ghost gun” ring comes to my community—and leaves a man dead Read More »

who-starts-cutting-costs-as-us-withdrawal-date-set-for-january-2026

WHO starts cutting costs as US withdrawal date set for January 2026

“Just stupid”

On January 23, WHO Director-General Tedros Adhanom Ghebreyesus sent a memo to staff announcing the cost-cutting measures. Reuters obtained a copy of the memo.

“This announcement has made our financial situation more acute,” Tedros wrote, referring to the US withdrawal plans. WHO’s budget mainly comes from dues and voluntary contributions from member states. The dues are a percentage of each member state’s gross domestic product, and the percentage is set by the UN General Assembly. US contributions account for about 18 percent of WHO’s overall funding, and its two-year 2024-2025 budget was $6.8 billion, according to Reuters.

To prepare for the budget cut, WHO is halting recruitment, significantly curtailing travel expenditures, making all meetings virtual, limiting IT equipment updates, and suspending office refurbishment.

“This set of measures is not comprehensive, and more will be announced in due course,” Tedros wrote, adding that the agency would do everything it could to protect and support staff.

The country’s pending withdrawal has been heavily criticized by global health leaders and US experts, who say it will make the world less safe and weaken America. In a CBS/KFF Health News report examining the global health implications of the US withdrawal, Kenneth Bernard, a visiting fellow at the Hoover Institution at Stanford University who served as a top biodefense official during the George W. Bush administration, did not mince words:

“It’s just stupid,” Bernard said. “Withdrawing from the WHO leaves a gap in global health leadership that will be filled by China,” he said, “which is clearly not in America’s best interests.”

WHO starts cutting costs as US withdrawal date set for January 2026 Read More »

couple-allegedly-tricked-ai-investors-into-funding-wedding,-houses

Couple allegedly tricked AI investors into funding wedding, houses

To further the alleged scheme, Beckman “often described non-existent revenue, inflated cash balances,” and “otherwise exaggerated customer relationships,” the US Attorney’s Office said, to convince investors to spend millions. As Beckman’s accomplice, Lau allegedly manipulated documents, including documents allegedly stolen from the venture capital firm that employed her, while supposedly hiding her work for GameOn.

The scheme apparently also included forging audits and bank statements, as well as using “the names of at least seven real people—including fake emails and signatures—without their permission to distribute false and fraudulent GameOn financial and business information and documents with the intent to defraud GameOn and its investors,” the US Attorney’s Office said.

At perhaps the furthest extreme, Lau allegedly falsified account statements, including once faking a balance of over $13 million when that account only had $25 in it. The FBI found that GameOn’s revenues never exceeded $1 million in any year, while Beckman allegedly inflated sales to investors, including claiming that sales in one quarter in 2023 got as high as $72 million.

Beckman and Lau allegedly went to great lengths to hide the scheme while diverting investor funds to their personal accounts. While GameOn employees allegedly sometimes went without paychecks, Beckman and Lau allegedly stole funds to buy expensive San Francisco real estate and pay for their wedding in 2023. If convicted, they may be forced to forfeit a $4.2 million house, a Tesla Model X, and other real estate and property purchased with their allegedly ill-gotten gains, the indictment said.

It took about five years for the cracks to begin to show in Beckman’s scheme. Beginning in 2023, Beckman increasingly started facing “questions about specific customers and specific revenue from those customers,” the indictment said. By February 2024, Beckman at last “acknowledged to at least one GameOn consultant” that a flagged audit report “did not contain accurate financial information,” but allegedly, he “attempted to shift blame to others for the inaccuracies.”

Couple allegedly tricked AI investors into funding wedding, houses Read More »

stargate-ai-1

Stargate AI-1

There was a comedy routine a few years ago. I believe it was by Hannah Gadsby. She brought up a painting, and looked at some details. The details weren’t important in and of themselves. If an AI had randomly put them there, we wouldn’t care.

Except an AI didn’t put them there. And they weren’t there at random.

A human put them there. On purpose. Or, as she put it:

THAT was a DECISION.

This is the correct way to view decisions around a $500 billion AI infrastructure project, announced right after Trump takes office, having it be primarily funded by SoftBank, with all the compute intended to be used by OpenAI, and calling it Stargate.

  1. The Announcement.

  2. Is That a Lot?.

  3. What Happened to the Microsoft Partnership?.

  4. Where’s Our 20%?.

  5. Show Me the Money.

  6. It Never Hurts to Suck Up to the Boss.

  7. What’s in a Name.

  8. Just Think of the Potential.

  9. I Believe Toast is an Adequate Description.

  10. The Lighter Side.

OpenAI: Announcing The Stargate Project

The Stargate Project is a new company which intends to invest $500 billion over the next four years building new AI infrastructure for OpenAI in the United States. We will begin deploying $100 billion immediately.

Note that ‘intends to invest’ does not mean ‘has the money to invest’ or ‘definitely will invest.’ Intends is not a strong word. The future is unknown and indeed do many things come to pass.

This infrastructure will secure American leadership in AI, create hundreds of thousands of American jobs, and generate massive economic benefit for the entire world.

This project will not only support the re-industrialization of the United States but also provide a strategic capability to protect the national security of America and its allies.

One of these things is not like the others. Secure American leadership in AI, generate massive economic benefit for the entire world, provide strategic capability to allies, sure, fine, makes sense, support reindustrialization is a weird flex but kinda, yeah.

And then… jobs? American… jobs? Um, Senator Blumenthal, that is not what I meant.

Pradyumna:

> will develop superintelligence

> create thousands of jobs

????

Samuel Hammond: “We’re going to spend >10x the budget of the Manhattan Project building digital brains that can do anything human brains can do but better and oh, by the way, create over 100,000 good paying American jobs!”

There’s at least some cognitive dissonance here.

Arthur B: The project will probably most likely lead to mass unemployment but in the meantime, there’ll be great American jobs.

If you listen to Altman’s announcement, he too highlights these ‘hundreds of thousands of jobs.’ It’s so absurd. Remember when Altman tried to correct this error?

The initial equity funders in Stargate are SoftBank, OpenAI, Oracle, and MGX. SoftBank and OpenAI are the lead partners for Stargate, with SoftBank having financial responsibility and OpenAI having operational responsibility. Masayoshi Son will be the chairman.

Arm, Microsoft, NVIDIA, Oracle, and OpenAI are the key initial technology partners.

If you want to spend way too much money on a technology project, and give the people investing the money a remarkably small share of the enterprise, you definitely want to be giving Masayoshi Son and SoftBank a call.

“Sam Altman, you are not crazy enough. You need to think bigger.”

The buildout is currently underway, starting in Texas, and we are evaluating potential sites across the country for more campuses as we finalize definitive agreements.

This proves there is real activity; it is also a tell that some of this is not new.

As part of Stargate, Oracle, NVIDIA, and OpenAI will closely collaborate to build and operate this computing system. This builds on a deep collaboration between OpenAI and NVIDIA going back to 2016 and a newer partnership between OpenAI and Oracle.

This also builds on the existing OpenAI partnership with Microsoft. OpenAI will continue to increase its consumption of Azure as OpenAI continues its work with Microsoft with this additional compute to train leading models and deliver great products and services.

Increasing consumption of Azure is different from Azure being the sole compute provider. It seems OpenAI expects there to be plenty of compute needs to go around.

All of us look forward to continuing to build and develop AI—and in particular AGI—for the benefit of all of humanity. We believe that this new step is critical on the path, and will enable creative people to figure out how to use AI to elevate humanity.

Can’t stop, won’t stop, I suppose. ‘Enable creative people to elevate humanity’ continues to miss the point of the whole enterprise, but not as much as talking ‘jobs.’

Certainly $500 billion for this project sounds like a lot. It’s a lot, right?

Microsoft is investing $80 billion a year in Azure, which is $400 billion over five years, and I’d bet that their investment goes up over time and they end up spending over $500 billion during that five-year window.

Haydn Belfield: Stargate is a remarkable step.

But, to put it into context, Microsoft will spend $80 billion on data centers this year, over half in the U.S.

Stargate’s $100 billion this year is more, but a comparable figure.

Rob S.: This is kind of misleading. Microsoft’s spend is also enormous and wildly out of the ordinary. Not normal at all.

Haydn Belfield: Definitely true, we’re living through a historic infrastructure build out like the railways, interstate highways or phone network

What I want to push back on a bit is the idea that this is *the only* effort, that this is the Manhattan/Apollo project

The $500 billion is spread across many sites and physical projects. If it does indeed happen, and it is counterfactual spending, then it’s a lot. But it’s not a sea change, and it’s not obvious that the actual spending should be surprising. Investments on this scale were already very much projected and already happening.

It’s also not that much when compared to the compute needs anticipated for the scaling of top end training runs, which very much continue to be a thing.

Yusuf Mahmood: Stargate shouldn’t have been that surprising!

It’s a $500 Bn project that is set to complete by 2029.

That’s totally consistent with estimates from @EpochAIResearch’s report last year on how scaling could continue through 2030.

$500 billion is a lot to the extent that all of this is dedicated specifically and exclusively to OpenAI, as opposed to Microsoft’s $80 billion, which is for everyone. But it’s not a lot compared to the anticipated future needs of a frontier lab.

One thing to think about is that OpenAI recently raised money at a valuation of approximately $170 billion, presumably somewhat higher now with o3 and agents, but also potentially lower because of DeepSeek. Now we are talking about making investments dedicated to OpenAI of $500 billion.

There is no theoretical incompatibility. Perhaps OpenAI is mining for gold and will barely recoup its investment, while Stargate is selling pickaxes and will rake it in.

It does still seem rather odd to presume that is how the profits will be distributed.

The reason OpenAI is so unprofitable today is that they are spending a ton on increasing capabilities, and not serving enough inference to make it up on their unit economics, and also not yet using their AI to make money in other ways.

And yes, the equilibrium could end up being that compute providers have margins and model providers mostly don’t have margins. But OpenAI, if it succeeds, should massively benefit from economies of scale here, and its economics should improve. Thus, if you take Stargate seriously, it is hard to imagine OpenAI being worth only a fraction of $500 billion.

There is a solution to this puzzle. When we say OpenAI is worth $170 billion, we are not talking about all of OpenAI. We are talking about the part that takes outside investment. All the dramatic upside potential? That is for now owned by the non-profit, and not (or at least not fully) part of the valuation.

And that is the part that has the vast majority of the expected net present value of future cash flows of OpenAI. So OpenAI the entire enterprise can be worth quite a lot, and yet ‘OpenAI’ the corporate entity you can invest in is only worth $170 billion.

This should put into perspective that the move to a for-profit entity truly is in the running for the largest theft in the history of the world.

Didn’t they have an exclusive partnership?

Smoke-Away: OpenAI and Microsoft are finished. There were signs.

Microsoft was not moving quickly enough to scale Azure. Now they are simply another compute provider for the time being.

Sam Altman: Absolutely not! This is a very important and significant partnership, for a long time to come.

We just need moar compute.

Eliezer Yudkowsky (Quoting Smoke-Away): It is a pattern, with Altman. If Altman realizes half his dreams, in a few years we will be hearing about how Altman has dismissed the U.S. government as no longer useful to him. (If Altman realizes all his dreams, you will be dead.)

Roon: Not even close to being true.

Microsoft is one of the providers here. Reports are that the Microsoft partnership has now been renegotiated to allow OpenAI to also seek other providers, since Altman needs moar compute. Hence Stargate. Microsoft will retain right of first refusal (ROFR), which seems like the right deal to make here. The question is, how much of the non-profit’s equity did Altman effectively promise in order to get out from under the old deal?

Remember that time Altman promised 20% of compute would go to superalignment, rather than blowing up a sun?

Harlan Stewart: Jul 2023: OpenAI promises to dedicate 20% of compute to safety research

May 2024: Fortune reports they never did that

Jul 2024: After 5 senators write to him to ask if OpenAI will, @sama says yes

It’s Jan 2025. Will OpenAI set aside 20% of this new compute to safety, finally?

Connor Axiotes: @tszzl (Roon), can you push for a significant part of this to be spent on control and alignment and safety policy work?

Roon: I’ll do my part. I’m actually on the alignment team at openai 🙂

So that’s a no, then.

I do expect Roon to push for more compute. I don’t expect to get anything like 20%.

Elon Musk (replying to the announcement): They don’t actually have the money.

Sam Altman: I genuinely respect your accomplishments and think you are the most inspiring entrepreneur of our time.

Elon Musk (continuing from OP): SoftBank has well under $10 billion secured. I have that on good authority.

Sam Altman: Wrong, as you surely know.

Want to come visit the first site already under way?

This is great for the country. I realize what is great for the country is not always what is optimal for your companies, but in your new role, I hope you will mostly put the United States first.

Satya Nadella (CEO of Microsoft, on CNBC, when asked whether Stargate has the money; watch the clip at the link, his delivery is perfect): All I know is, I’m good for my $80 billion.

If you take the companies collectively, they absolutely have the money, or at least the ability to get the money. This is Microsoft and Nvidia. I have no doubt that Microsoft is, as Nadella affirmed, ‘good for its $80 billion.’

That doesn’t mean SoftBank has the money, and SoftBank explicitly is tasked with providing the funding for Stargate.

Nor does the first site in Texas prove anything either way on this.

Remember the wording on the announcement: “which intends to invest $500 billion over the next four years.”

That does not sound like someone who has the money.

That sounds like someone who intends to raise the money. And I presume SoftBank has every expectation of being able to do so, with the aid of this announcement. And of working out the structure. And the financing.

Mario Nawfal: Sam Altman’s grand plan to build “Stargate,” a $500 billion AI infrastructure exclusively for OpenAI, is already falling apart before it even starts.

There’s no secured funding, no government support, no detailed plan, and, according to insiders, not even a clear structure.

One source bluntly admitted:

“They haven’t figured out the structure, they haven’t figured out the financing, they don’t have the money committed.”

Altman’s pitch? SoftBank and OpenAI will toss in $15 billion each and then just… hope the rest magically appears from investors and debt.

For someone obsessed with making AI smarter than humans, maybe he should try getting the basics right first – like not creating something that could destroy all of humanity… Just saying.

But that’s why you say ‘intend to invest’ rather than ‘will invest.’

Things between Musk and Altman did not stop there, as we all took this opportunity to break open the International Popcorn Reserve.

Elon Musk: Altman literally testified to Congress that he wouldn’t get OpenAI compensation and now he wants $10 billion! What a liar.

Musk’s not exactly wrong about that. He also said and retweeted other… less dignified things.

It was not a good look for either party. Elon Musk is, well, being Elon Musk. Altman is trying to throw in performative ‘look at me taking the high road’ statements that should fool no one, not only the one above but also:

Sam Altman: just one more mean tweet and then maybe you’ll love yourself…

Teortaxes (quoting Altman saying he respects Elon’s accomplishments above): I find both men depicted here unpleasant and engaging in near-psychopathic behavior, and I also think poorly of those who imagine Sam is trying to “be the bigger man”.

He’s a scary manipulative snake. “Well damn, f*** you too Elon, we have it” would be more dignified.

There’s a subtle art to doing this sort of thing well. The Japanese especially are very good at it. All of this is, perhaps, the exact opposite of that.

Sam Altman: big. beautiful. buildings. stargate site 1, texas, january 2025.

Altman, you made it weird. Also gauche. Let’s all do better.

Trump world is, as you would expect, not thrilled with what Musk has been up to, with Trump saying he is ‘furious’ and that Musk ‘got over his skis.’ My guess is that Trump ‘gets it’ at heart, because he knows what it’s like to hate and never let something go, and that this won’t be that big a deal for Musk’s long-term position, but there is high variance. I could easily be wrong about that. If I were Musk I would not have gone with this strategy, but that statement is almost always true and why I’m not Musk.

This particular Rule of Acquisition is somewhat imprecise. It’s not always true.

But Donald Trump? Yeah. It definitely never hurts to suck up to that particular boss.

Sam Altman (January 22, 2025): watching @potus more carefully recently has really changed my perspective on him (i wish i had done more of my own thinking and definitely fell in the npc trap).

i’m not going to agree with him on everything, but i think he will be incredible for the country in many ways!

Altman does admit this is a rather big change. Anyone remember when Altman said “More terrifying than Trump intentionally lying all the time is the possibility that he actually believes it all” or when he congratulated Reid Hoffman for helping keep Trump out of power? Or “Back to work tomorrow on a new project to stop Trump?” He was rather serious about wanting to stop Trump.

You can guess what I think he saw while watching Trump to make Altman change his mind.

So they announced this $500 billion deal, or at least a $100 billion deal with intent to turn it into $500 billion, right after Trump’s inauguration, with construction already underway, with a press conference on the White House lawn.

And the funds are all private. Which is great, but all this together also raises the obvious question: Does Trump actually have anything to do with this?

Matthew Yglesias: They couldn’t have done it without Trump, but also it was already under construction.

Daniel Eth: Okay, it’s not *Trump’s* AI plan. He announced it, but he neither developed nor is funding it. It’s a private initiative from OpenAI, Softbank, Oracle, and a few others.

Jamie Bernardi: Important under-discussed point on the OpenAI $100bn deal: money is not coming from the USG.

Trump is announcing a private deal, whilst promising to make “emergency declarations” to allow Stargate to generate its own electricity (h/t @nytimes).

Musk says 100bn not yet raised.

Peter Wildeford: Once upon a time words had meaning.

Jake Perry: I’m still not clear why this was announced at the White House at all.

Peter Wildeford: Trump has a noted history of announcing infrastructure projects that were already in progress – he did this a lot in his first term.

Jacques: At least we’ll all be paperclipped with a USA flag engraved on it.

Trump says that it is all about him, of course:

Donald Trump: This monumental undertaking is a resounding declaration of confidence in America’s potential under a new president.

The president said Stargate would create 100,000 jobs “almost immediately” and keep “the future of technology” in America.

I presume that in addition to completely missing the point, this particular jobs claim is, technically speaking, not true. But numbers don’t have to be real in politics. And of course, if this is going to create those jobs ‘almost immediately’ it had to have been in the works for a long time.

Shakeel: I can’t get over the brazen, brazen lie from Altman here, saying “We couldn’t do this without you, Mr President”.

You were already doing it! Construction started ages ago!

Just a deeply untrustworthy man — you can’t take anything he says at face value.

Dylan Matthews: Everything that has happened since the board fired him has 100% vindicated their view of him as deeply dishonest and unreliable, and I feel like the popular understanding of that incident hasn’t updated from “this board sure is silly!”

[Chubby: Sam Altman: hype on twitter is out of control. Everyone, chill down.

Also Sam Altman: anyways, let’s invest half a trillion to build a digital god and cure cancer once and for all. Oh, and my investors just said that AGI comes very, very soon and ASI will solve any problem mankind faces.

But everyone, calm down 100x]

I agree with Dylan Matthews that the board’s assessment of Altman as deeply dishonest and unreliable has very much been vindicated, and Altman’s actions here only confirm that once again. But that doesn’t mean that Trump has nothing to do with the fact that this project is going forward, with this size.

So how much does this project depend on Trump being president instead of Harris?

I think the answer is actually a substantial amount.

In order to build AI infrastructure in America, you need three things.

  1. You need demand. Check.

  2. You need money. Check, or at least check in the mail.

  3. You need permission to actually build it. Previously no check. Now, maybe check?

Masayoshi Son: Mr. President, last month I came to celebrate your winning and promised $100B. And you told me go for $200B. Now I came back with $500B. This is because as you say, this is the beginning of the Golden Age. We wouldn’t have decided this unless you won.

Sam Altman: The thing I really deeply agree with the president on is, it is wild how difficult it has become to build things in the United States. Power plants, data centres, any of that kind of stuff.

Does Son have many good reasons to pretend that this is all because of Trump? Yes, absolutely. He would find ways to praise the new boss either way. But I do think that Trump mattered here, even if you don’t think that there is anything corrupt involved in all this.

Look at Trump’s executive orders, already signed, about electrical power plants and transmission lines being exempt from NEPA, and otherwise being allowed to go forwards. They can expect more similar support in the future, if they run into roadblocks, and fewer other forms of regulatory trouble and everything bagel requirements across the board.

Also, I totally believe that Son came to Trump and promised $100 billion, and Trump said go for $200 billion, and Son is now at $500 billion, and I think that plausibly created a lot of subsequent investment. It may sound stupid, but that’s Grade-A handling of Masayoshi Son, and exactly within Trump’s wheelhouse. Tell the man who thinks big he’s not thinking big enough. Just keep him ramping up. Don’t settle for a big win when you can go for an even bigger win. You have to hand it to him.

It is so absurd that these people, with a straight face, decided to call this Stargate.

They wanted to call it the Enterprise, but their lawyers wouldn’t let them.

Was SkyNet still under copyright?

Agus: Ah, yes. Of course we’re naming this project after the fictitious portal through which several hostile alien civilizations attempted to invade and destroy Earth.

I just hope we get the same amount of completely unrealistic plot armor that protected Stargate Command in SG-1.

Roon: the Stargate. blasting a hole into the Platonic realm to summon angels. First contact with alien civilizations.

Canonically, the Stargates are sometimes used by dangerous entities to harm us, but once humanity deals with that, they end up being quite useful.

Zvi Mowshowitz: Guy who reads up on the canonical history of Stargate and thinks, “Oh, all’s well that ends well. Let’s try that plan.”

Roon: 🤣

Is this where I give you 10,000 words on the history of Stargate SG-1 and Stargate Atlantis and all the different ways Earth and often also everyone else would have been enslaved or wiped out if it wasn’t for narrative causality and plot armor, and what would have been reasonable things to do in that situation?

No, and I am sad about that, despite yes having watched all combined 15 seasons, because alas we do not currently have that kind of time. Maybe later I’ll be able to spend a day doing that, it sounds like fun.

But in brief about that Stargate plan. Was it a good plan? What were the odds?

As is pointed out in the thread (minor spoilers for the end of season 1), the show actually answers this question, as there is crossover between different Everett branches, and we learn relatively early on – before most of the different things that almost kill us have a chance to almost kill us – that most branches have already lost. Which was one of the things that I really liked about the show, that it realized this. The thread also includes discussions of things like ‘not only did we not put a nuclear bomb by the Stargate and use a secondary gate to disguise our location, we wore Earth’s gate code on our f***ing uniforms.’

To be fair, there is a counterargument, which is that (again, minor spoilers) humanity was facing various ticking clocks. There was one in particular that was ticking in ways Earth did not cause, and then there were others that were set in motion rapidly once we had a Stargate program, and in general we were on borrowed time. So given what was happening we had little choice but to go out into the galaxy and try to develop superior technology and find various solutions before time ran out on us, and it would have been reasonable to expect we were facing a ticking clock in various ways given what Earth knew at the time.

There’s also the previous real-life Project Stargate, a CIA–DIA investigation of the potential for psychic phenomena. That’s… not better.

There are also other ways to not be thrilled by all this.

Justin Amash: The Stargate Project sounds like the stuff of dystopian nightmares—a U.S. government-announced partnership of megacorporations “to protect the national security of America and its allies” and harness AGI “for the benefit of all of humanity.” Let’s maybe take a beat here.

Taking a beat sounds like a good idea.

What does Trump actually think AI can do?

Samuel Hammond: Trump seems under the impression that ASI is just a way to cure diseases and not an ultraintelligent digital lifeform with autonomy and self-awareness. Sam’s hesitation before answering speaks volumes.

That’s not how I view the clip at the link. Trump is selling the project. It makes sense to highlight medical advances, which are a very real and valuable upside. It certainly makes a lot more sense than highlighting job creation.

I don’t see Altman hesitating; I see him trying to be precise while still going along with the answer. I don’t like his earlier emphasis on jobs (again, no doubt following the lead of Trump and his political advisors), but on the medical question I think he does well, and it’s not obvious what a better answer would have been.

The hilarious part of this is the right-wing faction that says ‘you want to use this to make mRNA vaccines, wtf I hate AI now,’ and then trying to figure out what to do with people whose worldviews are that hopelessly inverted.

That moment when you say ‘look at how this could potentially cure cancer’ and your hardcore supporters say ‘And That’s Terrible.’

And also when you somehow think ‘Not Again!’

Eliezer Yudkowsky: Welp, looks like Trump sure is getting backlash to the Stargate announcement from many MAGAers who are outraged that AGIs might develop mRNA vaccines and my fucking god it would be useless to evacuate to Mars but I sure see why Elon wants to

To people suggesting that I ought to suck up to that crowd: On my model of them, they’d rather hear me say “F*** you lunatics, now let’s go vote together I guess” than have me pretend to suck up to them.

Like, on my model, that crowd is deadly tired of all the BULLSHIT and we in fact have that much in common and I bet I can get further by not trying to feed them any BULLSHIT.

There is a deep sense in which it is more respectful to someone as a human being to say, “I disagree with your f***ing lunacy. Allies?” than to smarm over to them and pretend to agree with them. And I think they know that.

RPotluck: The MAGAsphere doesn’t love you and it doesn’t hate you, but you’re made of arguments the MAGAsphere can use to build the wall.

There’s a certain kind of bullshit that these folks and many other folks are deeply tired of hearing. This is one of those places where I very much agree that it does hurt to suck up to the boss, both because the boss will see through it and because the whole strategy involves not doing things like that, and also have you seen or heard the boss.

My prediction and hope is that we will continue to see those worried about AI killing everyone continue to not embrace these kinds of crazy arguments of convenience. That doesn’t mean not playing politics at all or being some sort of suicidal purist. It does mean we care about whether our arguments are true, rather than treating them as soldiers for a cause.

Whereas we have learned many times, most recently with the fight over SB 1047 and then the latest round of jingoism, that many (#NotAllUnworried!) of those who want to make sure others do not worry about AI killing everyone, or at least want to ensure that creating things smarter than humans faces fewer regulatory barriers than a barber shop, care very little whether the arguments made on their behalf, by themselves or by others, are true or correspond to physical reality. They Just Didn’t Care.

The flip side is the media, which is, shall we say, not situationally aware.

Spencer Schiff: The AGI Manhattan Project announcement was followed by half an hour of Q&A. Only one reporter asked a question about it. WHAT THE FUCK! This is insane. The mainstream media is completely failing to convey the gravity of what’s happening to the general public.

As noted elsewhere, I don’t think this merits ‘Manhattan Project’ status for various reasons, but yes, it is kind of weird to announce a $500 billion investment in artificial general intelligence and then have only one question about it in a 30-minute Q&A.

I’m not saying that primarily from an existential risk perspective – this is far more basic even than that. I’m saying, maybe this is a big deal that all this is happening, maybe ask some questions about it?

Remember when Altman was talking about how we have to build AGI now because he was worried about a compute overhang? Yes, well.

Between the $500 billion of Stargate, the full-on jingoistic rhetoric from all sides including Anthropic, and the forcing function of DeepSeek with v3 and r1, it is easy to see how one could despair over our prospects for survival.

Unless something changes, we are about to create smarter-than-human intelligence, entities more capable and competitive than we are across all cognitive domains, and we are going to do so as rapidly as we can and then put them in charge of everything, with essentially zero margin to ensure that this goes well, despite it obviously by default getting everyone killed.

Even if we are so fortunate that the technical and other barriers in front of us are highly solvable, that is exactly how we get everyone killed anyway.

Holly Elmore: I am so, so sad today. Some days the weight of it all just hits me. I want to live my life with my boyfriend. I want us to have kids. I want love and a full life for everyone. Some days the possibility that that will all be taken away is so palpable, and grief is heavy.

I’m surprised how rarely I feel this way, given what I do. I don’t think it’s bad to feel it all sometimes. Puts you in touch with what you’re fighting for.

I work hard to find the joy and the gallows humor in it all, to fight the good fight, to say the odds are against us and the situation is grim, sounds like fun. One must imagine Buffy at the prom, and maintain Scooby Gang Mindset. Also necessary is the gamer mindset, which says you play to win the game, and in many ways it’s easiest to play your best game with your back against the wall.

And in a technical sense, I have hope that the solutions exist, and that there are ways to at least give ourselves a fighting chance.

But yeah, weeks like this do not make it easy to keep up hope.

Harlan Stewart: If the new $500b AI infrastructure thing ever faces a major scandal, we’ll unfortunately be forced to call it Stargategate


Stargate AI-1 Read More »

way-more-game-makers-are-working-on-pc-titles-than-ever,-survey-says

Way more game makers are working on PC titles than ever, survey says

Four out of five game developers are currently working on a project for the PC, a sizable increase from 66 percent of developers a year ago. That’s according to Informa’s latest State of the Game Industry survey, conducted in partnership with Omdia, which asked over 3,000 game industry professionals about their work ahead of March’s Game Developers Conference.

The 80 percent of developers working on PC projects in this year’s survey is by far the highest mark for any platform dating back to at least 2018, when 60 percent of surveyed developers were working on a PC game. In the years since, the ratio of game developers working on the PC has hovered between 56 and 66 percent before this year’s unexpected jump. The number of game developers saying they were interested in the PC as a platform also increased substantially, from 62 percent last year to 74 percent this year.

While the PC has long been the most popular platform in this survey, the sudden jump in the last year was rather large. Credit: Kyle Orland / Informa

The PC has long been the most popular platform for developers to work on in the annual State of the Game Industry survey, easily outpacing consoles and mobile platforms, which generally see active work from anywhere between 12 and 36 percent of developer respondents, depending on the year. In its report, Informa describes this surge as a “passion for PC development explod[ing]” among developers, and notes that while “PC has consistently been the platform of choice… this year saw its dominance increase even more.”

The increasing popularity of PC gaming among developers is also reflected in the number of individual game releases on Steam, which hit a record 18,974 individual titles in 2024, according to SteamDB. That total was up over 32 percent from 2023, which was itself up just under 16 percent from 2022 (though many Steam games each year were “Limited Games” that failed to meet Valve’s minimum engagement metrics for Badges and Trading Cards).

The number of annual Steam releases also points to increasing interest in the platform. Credit: SteamDB

The Steam Deck effect?

While it’s hard to pinpoint a single reason for the sudden surge in the popularity of PC game development, Informa speculates that it’s “connected to the rising popularity of Valve’s Steam Deck.” While Valve has only officially acknowledged “multiple millions” in sales for the portable hardware, GameDiscoverCo analyst Simon Carless estimated that between 3 million and 4 million Steam Deck units had been sold by October 2023, up significantly from reports of 1 million Deck shipments in October 2022.

Way more game makers are working on PC titles than ever, survey says Read More »

trump-can-save-tiktok-without-forcing-a-sale,-bytedance-board-member-claims

Trump can save TikTok without forcing a sale, ByteDance board member claims

TikTok owner ByteDance is reportedly still searching for non-sale options to stay in the US after the Supreme Court upheld a national security law requiring that TikTok’s US operations either be shut down or sold to an owner that is not controlled by a foreign adversary.

Last weekend, TikTok briefly went dark in the US, only to come back online hours later after Donald Trump reassured ByteDance that the US law would not be enforced. Then, shortly after Trump took office, he signed an executive order delaying enforcement for 75 days while he consulted with advisers to “pursue a resolution that protects national security while saving a platform used by 170 million Americans.”

Trump’s executive order did not suggest that he intended to attempt to override the national security law’s ban-or-sale requirements. But that hasn’t stopped ByteDance, board member Bill Ford told World Economic Forum (WEF) attendees, from searching for a potential non-sale option that “could involve a change of control locally to ensure it complies with US legislation,” Bloomberg reported.

It’s currently unclear how ByteDance could negotiate a non-sale option without facing a ban. Joe Biden’s extended efforts through Project Texas to keep US TikTok data out of China-controlled ByteDance’s hands without forcing a sale dead-ended, prompting Congress to pass the national security law requiring a ban or sale.

At the WEF, Ford said that the ByteDance board is “optimistic we will find a solution” that avoids ByteDance giving up a significant chunk of TikTok’s operations.

“There are a number of alternatives we can talk to President Trump and his team about that are short of selling the company that allow the company to continue to operate, maybe with a change of control of some kind, but short of having to sell,” Ford said.

Trump can save TikTok without forcing a sale, ByteDance board member claims Read More »