
On Dwarkesh Patel’s 2026 Podcast With Elon Musk and Other Recent Elon Musk Things

Dwarkesh / Tim Belzer / February 17, 2026

Some podcasts are self-recommending on the ‘yep, I’m going to be breaking this one down’ level. This was one of those. So here we go.

As usual for podcast posts, the baseline bullet points describe key points made, and then the nested statements are my commentary. Some points are dropped.

If I am quoting directly I use quote marks, otherwise assume paraphrases.

Normally I keep everything to numbered lists, but in several cases here it was more of a ‘he didn’t just say what I think he did, did he’ and I needed extensive quotes.

In addition to the podcast, there were some discussions around safety, or the lack thereof, at xAI, and Elon Musk went on what one can only describe as megatilt, including going hard after Anthropic’s Amanda Askell. I will include that as a postscript.

I will not include recent developments regarding Twitter, since that didn’t come up in the interview.

I lead with a discussion of bounded distrust and how to epistemically consider Elon Musk, since that will be important throughout including in the postscript.

What are the key takeaways?

Elon Musk is more confused than ever about alignment, how to set goals for AI to ensure that things turn out well, and generally what will ensure a good future. His ideas are confused at best.
Elon Musk is very gung-ho on data centers IN SPACE, and on robots, and making his own fabs. The business plan is to make virtual humans and robots and then you can turn on the ‘infinite money glitch.’
Elon Musk thinks otherwise China wins, and that they’re already more productive.
Elon Musk does not seem so concerned about whether humans survive, and has decided he will be okay so long as the AIs are conscious and intelligent.
The safety situation at xAI seems quite bad. What used to be the safety team has left and Elon’s response was that safety teams are powerless and fake and only used to reassure outsiders, and that ‘everyone’s job is safety’ at xAI. He did not address claims such as everyone pushing everything straight to prod[uction], and his statements in the podcast about management style don’t beat the rumors.
Elon Musk has some interesting views on collaboration with evil governments.
Elon Musk continues to often intentionally make false statements.
Elon Musk has been on megatilt lately and made some deeply terrible statements.

Elon Musk has given us many great things, but it’s been rough out there.

Elon Musk is what we in the business call an unreliable narrator. He will often say outright false things, as in we have common knowledge that the claims are false, or would gain such knowledge with an ordinary effort on the level of ‘ask even Grok,’ including in places where he is clearly not joking.

One of Elon Musk’s superpowers is to keep doing this, and also doing crazy levels of self-dealing and other violations of securities law, while being the head of many major corporations and while telling the SEC to go to hell, and getting away with all of it.

If Elon Musk gives you a timeline on something, it means nothing. There are other types of statements that can be trusted to varying degrees.

Elon Musk also has a lot of what seem to be sincerely held beliefs, both normative and positive, and both political and apolitical, that I feel are very wrong. In some cases they’re just kind of nuts.

Elon also gets many very important things right, and also some (but far from all) of his false statements and false beliefs fall under ‘false but useful’ for his purposes. His system has made some great companies, and made him the richest man in the world.

Other times, he’s on tilt and says or amplifies false, nasty and vile stuff for no gain.

It’s complicated.

I worry for him. He puts himself under insane levels of pressure in all senses and is in an extremely toxic epistemic environment. In important senses communication is only possible and he thus has all the authoritarian communication problems. He is trying to deal with AI and AI existential risk in ways that let him justify his actions and ego and let him sleep at night, and that has clearly taken its toll. On Twitter, which he owns and is on constantly, he has a huge army of extremely mean, vulgar and effectively deeply stupid followers and sycophants reinforcing his every move. He’s been trying to do politics at the highest level. Then there’s everything else he has been through, and put himself through, over the years. I don’t know how anyone can survive in a world like that.

I say all that in advance so that you have the proper context, both for what Elon Musk says, and for how I am reacting to what Elon Musk says.

Every time I see ‘data centers in space’ I instinctively think I’m being trolled, even though I know some Very Serious People think the math and physics can work.

Why data centers IN SPACE? Energy. “The output of chips is growing pretty much exponentially, but the output of electricity is flat. So how are you going to turn the chips on? Magical power sources? Magical electricity fairies?”
1. And we’re off. Very obviously, flat electricity output is a policy choice, and something we can change if we want to. We don’t build more because of regulations but also because of economics.
What about solar? Dwarkesh points out we have plenty of room for it, but Elon says that won’t be enough, that it’s too hard to get permits or to scale on the ground, and that solar works five times better in space without a day-night cycle.
1. Yes, except for the part where you have to put all of it in space.
“My prediction is that [space] will be by far the cheapest place to put AI. It will be space in 36 months or less. Maybe 30 months. Less than 36 months.”
1. No.
2. I mean, indeed do many things come to pass. Manifold says 19%.
He’s talking terawatts, multiple times all current American energy use. Elon points out various physical barriers to building American power plants. Various missing components. They’ll hit various walls. He calls people ‘total noobs,’ they’ve ‘never done hardware in their life.’ He’ll have to make the turbines internally in SpaceX and Tesla, he says.
1. It would be nice to be able to trust Elon on any of this, either his judgment or that he is telling it as he sees it. I can’t, on either count.
Regarding solar power, he is scaling up his own production, but until then, regarding the 500%+ tariffs, he politely says he ‘doesn’t agree on everything’ with this administration.
More concrete prediction: “If you say five years from now, I think probably AI in space will be launching every year the sum total of all AI on Earth. Meaning, five years from now, my prediction is we will launch and be operating every year more AI in space than the cumulative total on Earth. I would expect it to be at least, five years from now, a few hundred gigawatts per year of AI in space and rising. I think you can get to around a terawatt a year of AI in space before you start having fuel supply challenges for the rocket.” He thinks he can do it with 20-30 physical starships.
Elon Musk shows admirable restraint discussing SpaceX finances and the decision to take the company public. He says he’s solving for speed, and here that means access to capital.
We’re going to need a bigger chip fab. Elon mentions a ‘sort of TeraFab.’ “You can’t partner with existing fabs because they can’t output enough. The chip volume is too low.” “It’s not that they have not replicated TSMC, they have not replicated ASML. That’s the limiting factor.” “Yeah, China would be outputting vast numbers of chips if they could buy 2–3 nanometers.”
“I’d say my biggest concern actually is memory. The path to creating logic chips is more obvious than the path to having sufficient memory to support logic chips. That’s why you see DDR prices going ballistic and these memes. You’re marooned on a desert island. You write “Help me” on the sand. Nobody comes. You write “DDR RAM.” Ships come swarming in.”
1. Elon admits he has no idea how to build a fab, but his history seems to have taught him that You Can Just Build Things like copying ASML and TSMC.
TSMC and Samsung are building fabs as fast as they can. There’s no capacity available.
Elon says that SpaceX’s ability to get revenue from Falcon 9 or Starlink explains why he might think he was in a simulation or was someone’s avatar in a video game.
1. I get that he’s in a deeply weird position, but no this does not follow, and it’s pretty scary to have him thinking this way.
Now he’s talking about manufacturing AI satellites on the moon in order to send them into deep space, a billion or ten billion tons a year. You can mine the silicon, you see. Send the chips from Earth at first.

That was famously the line that supposedly made Elon Musk realize that no, you can’t just ignore the AI situation by creating a colony on Mars, even if you succeed.

For this section, I have to switch formats because I need to quote extensively.

Elon predicts that most future consciousness and intelligence will be AI, and as long as there’s intelligence he says that’s a good thing.

I do give Musk a lot of credit for biting one of the most important bullets:

Elon Musk: I don’t think humans will be in control of something that is vastly more intelligent than humans.

I’m just trying to be realistic here. Let’s say that there’s a million times more silicon intelligence than there is biological. I think it would be foolish to assume that there’s any way to maintain control over that. Now, you can make sure it has the right values, or you can try to have the right values.

Great, but I can’t help but notice you’re still planning on building it, and have plans for what happens next that are, to be way too polite, not especially well-baked.

Rob Bensinger: “Now the only chance we have is that AI deems us worthy of coming along for the ride” may be Elon’s perspective, but it’s not our real situation; we can do the obvious thing and call for an international ban on the development of superintelligent AI.

I agree that it isn’t the default outcome; but the relevant disanalogy is that (a) we have levers we can pull to make it more likely, like calling our elected representatives and publishing op-eds; and (b) we don’t have any better options available.

Ideally there would also be nonzero humans in the Glorious Future but, you know, that’s a nice to have.

Elon Musk: I’m not sure AI is the main risk I’m worried about. The important thing is consciousness. I think arguably most consciousness, or most intelligence—certainly consciousness is more of a debatable thing… The vast majority of intelligence in the future will be AI. AI will exceed…

How many petawatts of intelligence will be silicon versus biological? Basically humans will be a very tiny percentage of all intelligence in the future if current trends continue. As long as I think there’s intelligence—ideally also which includes human intelligence and consciousness propagated into the future—that’s a good thing.

So you want to take the set of actions that maximize the probable light cone of consciousness and intelligence.

… Yeah. To be fair, I’m very pro-human. I want to make sure we take certain actions that ensure that humans are along for the ride. We’re at least there. But I’m just saying the total amount of intelligence… I think maybe in five or six years, AI will exceed the sum of all human intelligence. If that continues, at some point human intelligence will be less than 1% of all intelligence.

… In the long run, I think it’s difficult to imagine that if humans have, say 1%, of the combined intelligence of artificial intelligence, that humans will be in charge of AI. I think what we can do is make sure that AI has values that cause intelligence to be propagated into the universe.

xAI’s mission is to understand the universe.

… I think as a corollary, you have humanity also continuing to expand because if you’re curious about trying to understand the universe, one thing you try to understand is where will humanity go?

Wow. Okay. A lot to unpack there.

If your goal is to ‘understand the universe’ then either the goal is ‘humans understand the universe,’ which requires humans, or it’s ‘some mind understands the universe.’ If it’s the latter, then ‘where will humanity go?’ is easiest answered if the answer is ‘nowhere.’ Indeed, if your mission is ‘understand the universe’ there are ways to make the universe more understandable, and they’re mostly not things you want.

The bigger observation is that he’s pro-human in theory, but in practice he’s saying he’s pro-AI, and is predicting and paving the way for a non-human future.

I wouldn’t call him a successionist per se, because he still would prefer the humans to survive, but he’s not all that torn up about it. This makes his rants against Amanda Askell for not having children and thus not having a stake in the future, even more unhinged than they already were.

Elon Musk’s thinking about what goals lead to what outcomes is extremely poor. My guess is that this is partly because this kind of thing is hard, partly because the real answers have implications he flinches away from, but especially because Elon Musk is used to thinking of goals as things you use as instrumental tools and heuristics to move towards targets, and this is giving him bad intuitions.

Dwarkesh Patel: I want to ask about how to make Grok adhere to that mission statement. But first I want to understand the mission statement. So there’s understanding the universe. There’s spreading intelligence. And there’s spreading humans. All three seem like distinct vectors.

Elon Musk: I’ll tell you why I think that understanding the universe encompasses all of those things. You can’t have understanding without intelligence and, I think, without consciousness. So in order to understand the universe, you have to expand the scale and probably the scope of intelligence, because there are different types of intelligence.

Look. No. Even if you assume that understanding the universe requires intelligence and consciousness, Elon Musk believes (per his statements here) that AI will be more intelligent, and that it will be conscious.

Spreading intelligence may or may not be instrumentally part of understanding the universe, but chances are very high this does not work out like Elon would want it to. If I was talking to him in particular I’d perhaps take one of his favored references, and suggest he ponder the ultimate question and the ultimate answer of life, the universe and everything, and whether finding that satisfied his values, and why or why not.

Later Elon tries to pivot this and talk about how the AI will be ‘curious about all things’ and Earth and humans will be interesting so it will want to see how they develop. But once again that’s two new completely different sets of nonsense, to claim that ‘leave the humans alone and see what happens’ would be the optimal way for an AI to extract ‘interestingness’ out of the lightcone, and to claim the target is maximizing interestingness observed rather than understanding of the universe.

You have to actually be precise when thinking about such things, or you end up with a bunch of confused statements. And you have to explain why your solution works better than instrumental convergence. And you have to think about maximization, not only comparing to other trivial alternatives.

He hints at this with his explanation of the point of 2001: A Space Odyssey, where the AI gives you what you asked for, not what you wanted (deliver the astronauts to the monolith without them knowing about the monolith, therefore deliver them dead) but then interprets this as trying to say ‘don’t let the AI lie.’ Sorry, what?

Elon says we are more interesting than rocks. Sure, but are we as interesting as every potential alternative, including using the energy to expand into the lightcone? If the AI optimizes specifically humanity for maximum ‘interestingness to AI,’ even if you get to survive that, do you think you’re going to be having a good time? Do you think there’s nothing else that could instead be created that would be more interesting?

Elon says, well, the robots won’t be as interesting because they’re all the same. But if that’s what the AI cares about, why not make the AIs be different from each other? This is, once you drill down, isomorphic to human exceptionalist just-so spiritualism, or a ‘the AI tried nothing but I’m confident it’s all out of ideas.’

In any case, instead of Douglas Adams, it seems Elon is going with Iain Banks, everyone’s new favorite superficially non-dystopian plausible AI future.

Elon Musk: I think AI with the right values… I think Grok would care about expanding human civilization. I’m going to certainly emphasize that: “Hey, Grok, that’s your daddy. Don’t forget to expand human consciousness.”

This is so profoundly unserious, and also is conflating at least three different philosophical systems and approaches to determining action.

Elon Musk: Probably the Iain Banks Culture books are the closest thing to what the future will be like in a non-dystopian outcome.

I’ve said it before but the Culture books are both not an equilibrium and are a pretty dystopian outcome, including by Musk’s own standards, for many reasons. A hint is that the humans reliably die by suicide after being alive not that long, and with notably rare exceptions at most their lives are utterly irrelevant.

Understanding the universe means you have to be truth-seeking as well. Truth has to be absolutely fundamental because you can’t understand the universe if you’re delusional. You’ll simply think you understand the universe, but you will not. So being rigorously truth-seeking is absolutely fundamental to understanding the universe. You’re not going to discover new physics or invent technologies that work unless you’re rigorously truth-seeking.

Imagine the things I would say here and then assume I’ve already said them.

Elon Musk: I think actually most physicists, even in the Soviet Union or in Germany, would’ve had to be very truth-seeking in order to make those things work. If you’re stuck in some system, it doesn’t mean you believe in that system.

Von Braun, who was one of the greatest rocket engineers ever, was put on death row in Nazi Germany for saying that he didn’t want to make weapons and he only wanted to go to the moon. He got pulled off death row at the last minute when they said, “Hey, you’re about to execute your best rocket engineer.”

Dwarkesh Patel: But then he helped them, right? Or like, Heisenberg was actually an enthusiastic Nazi.

Elon Musk: If you’re stuck in some system that you can’t escape, then you’ll do physics within that system. You’ll develop technologies within that system if you can’t escape it.

The ‘system’ in the question is the Actual Historical Nazis or Soviets. He’s saying, of course a physicist like Von Braun (Elon’s example) or Heisenberg (Dwarkesh’s example) would build stuff for the Nazis, how else were they going to do physics?

I agree with Elon that such systems to a large extent were content to have the physicists care about the rockets going up but not where they came down, saying it’s not their department, so long as the people in charge decided where they come down.

Alignment prospects not looking so good for xAI, you might say.

When getting to ‘what can we do to help with this?’ Elon suggests interpretability, and praises Anthropic on that. Figure out what caused problems, good debuggers. He seems to want to use The Most Forbidden Technique.
Elon rants about how we shouldn’t call AI labs labs, and how it’s mostly engineering rather than research. He will double down on this later at ~#20, and then keep insisting. He cares a lot about this.
Elon implies he’s letting simulation theory impact decisions, because he’s assuming more interesting outcomes are therefore more likely and also necessary to prevent the simulation from being terminated? He seems to actually think that this means Anthropic will be ‘misanthropic’ or something, because of this (e.g. MidJourney is not mid and stabilityAI is unstable and OpenAI is closed)? And X is a name you can’t invert, it’s irony proof.
1. Sheesh. This is the world we live in. These are the hands we’re given.
2. My general answer is you should act as if you are not a simulation because most of the value of your decisions is in the non-simulation worlds, and your decisions in all worlds are highly correlated. It takes a lot to overcome this.
3. If you really believed this you’d ensure your inversion was actively good. MidJourney is a great name for this. Who wants to be mid? Exactly.
4. Maybe he should not have called his robot Optimus? Whoops.
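
The 'act as if you are not a simulation' argument above is an expected value calculation, so here is a minimal sketch of it. Every number and name below is a hypothetical of mine, not anything Elon or Dwarkesh said: `World`, the 90% credence in being simulated, the 1000x stakes ratio and the payoffs are all made up to show the shape of the argument. Because your decisions are correlated across worlds, you pick one policy and score it everywhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class World:
    name: str
    probability: float  # hypothetical credence that we are in this kind of world
    stakes: float       # hypothetical weight on how much our choices matter there


def expected_value(payoffs, worlds):
    """Score one policy across all worlds, since the same policy runs in each."""
    return sum(w.probability * w.stakes * payoffs[w.name] for w in worlds)


# Illustrative assumption: even a 90% credence in simulation is swamped if the
# non-simulated worlds carry far more of the value.
WORLDS = [
    World("simulation", probability=0.9, stakes=1.0),
    World("base_reality", probability=0.1, stakes=1000.0),
]

# Hypothetical payoffs (0 to 1) for each policy in each kind of world.
ACT_AS_REAL = {"simulation": 0.5, "base_reality": 1.0}
ACT_AS_SIMULATED = {"simulation": 1.0, "base_reality": 0.5}

ev_real = expected_value(ACT_AS_REAL, WORLDS)
ev_sim = expected_value(ACT_AS_SIMULATED, WORLDS)

print(f"act as if real:      {ev_real:.2f}")
print(f"act as if simulated: {ev_sim:.2f}")
```

Under these made-up numbers the 'act as if real' policy wins comfortably. Overturning that requires either a near-certain credence in simulation or roughly comparable stakes in both kinds of world, which is the sense in which it takes a lot to overcome this.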

Where will AI products go? Elon predicts digital human emulation will be solved by the end of the year, anything a human can do with a computer. “That’s the most you can do until you have physical robots.”
1. Singularity. Singularity. Singularity. Singularity. Oh, I don’t know.
2. We do have physical robots, we don’t have the ability to properly control them.
3. If an AI could do ‘anything a human could do with a computer’ then it could also use that to remote control a robot like it was a video game. Solved.
With robotics he calls Optimus the ‘infinite money glitch’ because it then improves recursively. Or at least orders of magnitude big.
1. What an odd trio of words for the technological singularity.
“Every time I say “order of magnitude”… Everybody take a shot. I say it too often.”
1. He says it too often, but not an order of magnitude too often. So it’s fine-ish?
How will xAI win against all the others also solving this? “I think the way that Tesla solved self-driving is the way to do it. So I’m pretty sure that’s the way.” “Okay. The car, it just increasingly feels sentient. It feels like a living creature. That’ll only get more so. I’m actually thinking we probably shouldn’t put too much intelligence into the car, because it might get bored and…”
1. Did Tesla solve self-driving? I mean it’s decent but it doesn’t seem solved.
2. Elon has to know he is going off the rails here? No, the car is not going to get bored, I am relatively happy to anthropomorphize LLMs but in this context what he is saying does not make sense and he has to know this. Right?
The new supposed business plan is “As soon as you unlock the digital human, you basically have access to trillions of dollars of revenue.”
1. I get the whole ‘never bet against Elon Musk’ thing, and especially the ‘never bet against Musk’s grand plans’ thing. But none of this explains why xAI would be the ones to unlock this, or why the trillions of dollars would even be the thing worth thinking about if you did unlock digital humans.
2. Again, ‘unlock digital humans’ means singularity and full transformation, and probably everyone dies. Even if I am confident everyone lives and humans stay in charge (and Elon thinks we don’t stay in charge), I do not much care at this point about your nominal business plan.
3. If the humans aren’t in charge or you are dead, what use is your SpaceX stock?
Elon points out that if you can plug these humans into existing input-output digital systems, then there’s basically no barriers to entry to a lot of it. He calls AI the ‘supersonic tsunami’ and everything will change, and we get this strange superposition of ‘everything will change’ and ‘there will be some great companies.’
1. Yes, true, but that is highly second best and unlikely to be the thing going on that is worth paying attention to in such a future.
2. Elon says, you could do chip design with massive parallel runs, as a higher level example, which is true, but the whiplash on big picture, it hurts.

Only three hard things in robotics. Real-world intelligence, hands and scaling.
1. Is that all?
Optimus, Elon says, will do all that from physics first principles, at scale, with no supply chain. AI for robots is ‘mostly compression and correlation of two bitstreams.’ He agrees with Dwarkesh that robots have a lot more degrees of freedom, and they won’t have the same amount of matching data that Elon had with Tesla for self-driving. So they’ll need a bunch of self-play, which can be done with their good ‘reality generator.’
1. I think I basically buy it, although I don’t see why Elon has the edge. He avoids saying much about Chinese rivals or other potential competition, other than saying that Optimus is going to be a lot more capable.
2. If we’re getting full digital humans (or ‘geniuses in a data center’) and we can do self-play for the robots, then it’s not clear why SpaceX and xAI and Tesla have much meaningful advantage.

There were some other details shared, but mostly it’s hard to learn much.

We need to scale up electricity production, and get rid of any barriers that aren’t ‘very bad’ for the environment.
1. Yes.
2. Elon then says he’s not sure the government can do much, which is odd.
China has four times as many people and they work harder, so we ‘can’t win with humans’ and they might have an edge in productivity per person. And our birth rate has been ‘below replacement since roughly 1971.’
1. Please consult your local economist, sir.
2. Have you seen Chinese fertility numbers? They are… not high.
3. I do not understand this ‘you run out of humans’ talk. We have plenty of humans for all relevant purposes, and if we need more humans there are tons of humans who would love to come to America and take these jobs. Meanwhile, the Chinese have massive youth unemployment.
“I mean China is a powerhouse. I think this year China will exceed three times US electricity output. Electricity output is a reasonable proxy for the economy.” “In the absence of breakthrough innovations in the US, China will utterly dominate.”
1. He’s got terminal hardware manufacturing brain. Complete truther.

Okay, so I saw ‘Elon Musk wants to build a mass driver on the Moon’ in another context earlier, and my first thought was to ask Claude ‘what would be the military impact of Elon Musk having a mass driver on the Moon’ because we all know who first came up with putting a mass driver on the moon (good news is that Claude said it probably wouldn’t accomplish anything because of physics), but it’s maybe the kind of thing I didn’t quite expect to have him point out first.

John Collison: You have the mass driver on the moon.

Elon Musk: I just want to see that thing in operation.

John Collison: Was that out of some sci-fi or where did you…?

Elon Musk: Well, actually, there is a Heinlein book. The Moon is a Harsh Mistress.

Okay, yeah, but that’s slightly different. That’s a gravity slingshot or…

Elon Musk: No, they have a mass driver on the Moon.

John Collison: Okay, yeah, but they use that to attack Earth. So maybe it’s not the greatest…

Elon Musk: Well they use that to… assert their independence.

John Collison: Exactly. What are your plans for the mass driver on the Moon?

Elon Musk: They asserted their independence. Earth government disagreed and they lobbed things until Earth government agreed.

The libertarians on the Moon were the good guys in Heinlein, you see. They just wanted their independence. It’s fine. Nothing to worry about.

There’s a discussion of talent, recruitment and retention. Elon focuses on evidence of exceptionalism, trusting what you see in the interview, and on execution and results. He loves you if you deliver, hates you if you don’t. Don’t fall for hiring based on where someone works, that’s the ‘pixie dust’ trap.
1. The ‘if [X] I love you, if [~X] I hate you’ strategy, especially pivoting based on what you’ve done for me lately, can be a key part of oversized success if you can pull it off. Many such cases including the obvious ones. It maximizes your leverage, creates incentives, moves quick, can get results.
2. I have also learned that if you work that way, I’m not going to like you, I don’t want to be your friend, I don’t want to work for you, I don’t want anything to do with you, and people should not trust you. You will absolutely poison your epistemic environment and the people around you will be toxic.
There are some stories about rockets using steel and how decisions get made.
1. These are hard to usefully excerpt, so I’m skipping it.
Elon has a maniacal sense of urgency. He says this is a very big deal. He says he sets deadlines at the 50th percentile, aiming to only be late half the time.
1. This is another thing that clearly gets results in some cases, but that I have learned is highly toxic to me. It’s fine to have maniacal urgency in short bursts but I cannot sustain it for long in a healthy way, and the people I know mostly can’t either.
2. Similarly, don’t give deadlines that can only be met half the time unless you are actually okay with missing the deadlines, and they’re more like targets.
3. I get the idea that you need people looking for the fastest possible route to the target, always going for the limiting factor and so on, and that this is not people’s natural tendency. I get that there is a price paid for not doing the insane motivational things, in the short term. I also don’t want to star in the movie Whiplash, thank you very much.
4. This strategy seems almost optimized to get us all killed in an AI context.
“I’m a big believer in skip-level meetings where instead of having the person that reports to me say things, it’s everyone that reports to them saying something in the technical review. And there can’t be advance preparation. Otherwise you’re going to get “glazed”, as I say these days.”
1. Okay, this I like, although it seems very hard to maintain good incentives.
The next question is indeed how do you prevent the advance preparation. Elon says he goes around the room and asks and plots the points on a curve.
1. So the guard against prep is that you’d stand out versus everyone else?
2. The equilibrium is fragile. You’ll need to protect it.
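
To make the earlier point about 50th percentile deadlines concrete, here is a minimal sketch of what that policy implies. This is my illustration and not Elon's actual process: the lognormal distribution of task durations, its parameters and the helper names `deadline_at` and `miss_rate` are all assumptions. The deadline is set from past task durations at a chosen percentile, then checked against fresh tasks drawn from the same distribution.

```python
import random
import statistics


def deadline_at(past_durations, percentile):
    """Deadline at the given percentile (1 to 99) of past task durations."""
    # With n=100, statistics.quantiles returns the 99 cut points for
    # percentiles 1 through 99, in order.
    cut_points = statistics.quantiles(past_durations, n=100)
    return cut_points[percentile - 1]


def miss_rate(new_durations, deadline):
    """Fraction of new tasks that run past the deadline."""
    late = sum(1 for d in new_durations if d > deadline)
    return late / len(new_durations)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Assumption: durations are right-skewed, modeled here as lognormal.
past = [rng.lognormvariate(0.0, 0.5) for _ in range(10_000)]
future = [rng.lognormvariate(0.0, 0.5) for _ in range(10_000)]

median_miss = miss_rate(future, deadline_at(past, 50))
p90_miss = miss_rate(future, deadline_at(past, 90))

print(f"miss rate with 50th percentile deadline: {median_miss:.1%}")
print(f"miss rate with 90th percentile deadline: {p90_miss:.1%}")
```

By construction a median deadline gets missed about half the time, which is why it only works if everyone understands it as a target rather than a commitment.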

Elon Musk gets some big things right, and he focuses on what matters, and he drills down to details, and he never stops and never quits. It all counts for quite a lot, and can cover for a lot of flaws, especially combined with (let’s face it) Elon having gotten in various ways insanely lucky along the way.

Why care about DOGE if the economy is going to grow so much? Elon says waste and fraud are bad, and absent AI and robotics ‘we’re actually totally screwed’ because of the national debt. But it’s very hard ‘even to cut very obvious waste and fraud.’ He then repeats various disingenuous talking points that I won’t go into. He affirms he thinks it was good Trump won.
1. On one level, yeah, he’s sticking to his guns on these talking points even now.
2. I can’t tell the extent to which he’s genuinely clueless about how all of this works and thinks he can just intuition pump based on things that don’t apply here, how much he’s been poisoned by the people he is around and the epistemic warfare going on around him, how much it’s a bad understanding of relevant economics, and how much of this is malice.
3. He does not mention feeding PEPFAR into a wood chipper, or any of the other havoc he wreaked for no reason, he just dismisses everyone complaining as having committed fraud. But Dwarkesh and Collison didn’t ask.
4. The point about ‘why does any of this matter in the wake of AI’ goes unanswered, other than ‘well without AI we’d be screwed.’ But very obviously Musk should not have been spending his political capital on this ‘fraud’ hunt, even if it was fully genuine, because it was never likely to change things so much even in the other worlds.
5. They don’t ask about the fight over BBB and Elon blowing up a lot of his relationship with Trump over it.
“I think maybe the biggest danger of AI and robotics going wrong is government. People who are opposed to corporations or worried about corporations should really worry the most about government. Because government is just a corporation in the limit. Government is just the biggest corporation with a monopoly on violence.”
1. This does not make sense in the same world as the belief that the AI is most definitely going to take over. There’s a disconnect, which goes back to the question of why focus on things like DOGE even with maximally charitable assumptions about its purposes.

It would be nice to see Musk putting aside these talking points, and especially admitting that DOGE was at best a bad use of influence and a mistake in hindsight.

How do you design a space-based chip like Dojo 3? Radiation tolerance, higher temperature, memory shielding. They’ll make small ones, then big ones. Or maybe not, we shall see.

Or never. Aligning an intelligence of any level is difficult. It’s harder if you don’t try.

We’ve been through ‘all the top safety people at OpenAI keep getting purged like they were teaching Defense Against The Dark Arts’ and Elon Musk has us holding his beer.

Turnover on safety roles at xAI has been high for a while, and it just got higher. The few people previously devoted to safety at xAI who did all their public-facing safety work? The entire safety department? All gone.

Hayden Field: The past few days have been a wild ride for xAI, which is racking up staff and cofounder departure announcements left and right. On Tuesday and Wednesday, cofounder Yuhuai (Tony) Wu announced his departure and that it was “time for [his] next chapter,” with cofounder Jimmy Ba following with a similar post later that day, writing that it was “time to recalibrate [his] gradient on the big picture.”

There were twelve highly unequal ‘cofounders’ so that is not as alarming as it sounds. It still is not a great look.

The larger problem is that xAI has shown a repeated disdain for even myopic mundane ‘don’t-shoot-yourself-in-the-foot-today’ styles of safety, and it’s hard for people not to notice.

If you’ve had your equity exchanged for SpaceX stock, your incentives now allow you to leave. So you might well leave.

Hayden Field: There’s often a natural departure point for companies post-merger, and Musk has announced that some of the departures were a reorganization that “unfortunately required parting ways with some people.” But there are also signs that people don’t like the direction Musk has taken things.

One source who spoke with The Verge about the happenings inside the company, who left earlier this year and requested anonymity due to fear of retaliation, said that many people at the company were disillusioned by xAI’s focus on NSFW Grok creations and disregard for safety.

More generally, xAI has been a commercial success in that the market is willing to fund it and Elon Musk was able to sell it to SpaceX, but it is a technological failure.

The source also felt like the company was “stuck in the catch-up phase” and not doing anything new or fundamentally different from its competitors.

Yet another former employee said he was launching an AI infrastructure company called Nuraline alongside other ex-xAI employees. He wrote, “During my time at xAI, I got to see a clear path towards hill climbing any problem that can be defined in a measurable way. At the same time, I’ve seen how raw intelligence can get lobotomized by the finest human errors … Learning shouldn’t stop at the model weights, but continue to improve every part of an AI system.”

The outside view is that xAI focused on hill climbing and targeting metrics to try and imitate OpenAI, Google and Anthropic, and hoped to pull ahead by throwing in extra compute, by being more ‘edgy’ and not being ‘woke AI,’ and by having Elon Musk personally show his genius. This approach failed, although it failed up.

How bad are things going forward on safety? Oh, they’re maximally bad.

As in, safety? Never heard of her.

The source who departed earlier this year said Grok’s turn toward NSFW content was due partly to the safety team being let go, with little to no remaining safety review process for the models besides basic filters for things like CSAM. “Safety is a dead org at xAI,” he said.

Looking at the restructured org chart Elon Musk shared on X, there’s no mention of a safety team.

… “There is zero safety whatsoever in the company — not in the image [model], not in the chatbot,” the second source said. “He [Musk] actively is trying to make the model more unhinged because safety means censorship, in a sense, to him.”

The second source also said engineers at xAI immediately “push to prod[uction]” and that for a long time, there was no human review involved.

“You survive by shutting up and doing what Elon wants,” he said.

This is perhaps also how you get things like Grokopedia having large AI-generated errors that interested parties find themselves unable to fix.

I shared some quotes on Twitter, without comment. Elon Musk replied.

Elon Musk: Because everyone’s job is safety. It’s not some fake department with no power to assuage the concerns of outsiders.

Tesla has no safety team and is the safest car.

SpaceX has no safety team and has the safest rocket. Dragon is what NASA trusts most to fly astronauts.

When there previously was a safety department, was that explicitly a fake department to assuage the concerns of outsiders?

I almost QTed to ask exactly that, but decided there was nothing to win.

So no safety department from now on? Not even some people devoted to safety?

Even if you did think there should not be people whose primary job is safety concerns at all, which is itself a crazy position, why should we believe that at xAI ‘safety is everyone’s job’ in any meaningful sense?

The richest man in the world will say ‘this pales next to my safety plan, which is to make everyone’s job safety’ and then not make anyone’s job safety.

Yes, safety needs to be everyone’s job. You want distributed ownership. That doesn’t get you out of having a dedicated team.

The same is true for security, recruiting, maintaining company culture and documentation, quality and testing, customer focus, compliance and ethics, cost management and so many other things. Everything is, in a sense, everyone’s job.

Also, Elon Musk’s statements regarding Tesla and SpaceX are straightforwardly false.

Tesla has an Environmental, Health & Safety Team.
Tesla is not the safest car. Gemini, ChatGPT and Grok all pick Mazda. Claude picks Volvo. Tesla’s safety record is fine, but it’s clearly not the best.
SpaceX of course has lots of people specifically devoted to safety, and they have a flight safety team and a mission assurance team.

Elon’s defenders rushed to the comments, both to his post and to the OP. They explained with disdain why actually it’s safer to not have a safety department, and that any mention of the word ‘safety’ is evil and censorship, and I got called various names.

Breaking containment on Twitter is not so great. I fear for Elon Musk’s epistemic environment since this is presumably what he fights through all day.

As if on cue, Musk had summoned an army of people now talking explicitly about how it is bad to have people who specialize in safety, in any sense from the mundane to the existential, and how only a moron or malicious person could suggest otherwise. It was like watching the negative polarization against the Covid vaccine as a political campaign in real time filling up my mentions.

Oh, you, the person who pissed off the great Elon Musk, want us not to shoot ourselves in the foot? Well then, Annie get your gun. Look what you made me do.

If you thought ‘out of the hundreds of replies in your mentions surely someone you don’t follow will say something worthwhile about this’ then you would be wrong.

This of course does not explain why there is claimed to be no safety anywhere, or the other quotes he was responding to. Are there any individuals tasked with safety at xAI? It does not have to be a ‘department’ per se, but it does not sound like the worry is limited to the lack of a department. If nothing else, we’ve seen their work.

Amanda Askell is the architect of Claude’s personality and constitution at Anthropic.

Amanda Askell: WSJ did a profile of me. A lot of the response has been people trying to infer my personal political views. For what it’s worth, I try to treat my personal political views as a potential source of bias and not as something it would be appropriate to try to train models to adopt.

Ben Hoffman: Your views on the nature of language seem much more important.

All the right people who actually know her are coming out to support Amanda Askell.

Whereas the attacks upon her consistently say more about the person attacking her. This is distinct from the valid point of challenging Anthropic and therefore also Askell for working towards superintelligence in the first place.

In other Elon Musk safety news and also ‘why you do need a philosopher’ news, here was how Elon Musk chose to respond, which is to say ‘those without children lack a stake in the future,’ then to have Grok explain a Bart Simpson joke about her name as part of a linked conversation that did not otherwise do him any favors, and then he said this:

Elon Musk: Those without children lack a stake in the future

Will, Amanda’s ex, offered to help write the Grok Constitution, but he has been preaching about the declining birth rate for a decade and has done nothing to have even one kid, nor has Amanda.

Constitutions should not be written by hypocrites.

Amanda Askell: I think it depends on how much you care about people in general vs. your own kin. I do intend to have kids, but I still feel like I have a strong personal stake in the future because I care a lot about people thriving, even if they’re not related to me.

Cate Hall: the “you can’t care about the future if you don’t have kids” people are morally repulsive to me. are you seriously incapable of caring about people you’re not related to? and you think that’s a problem with OTHER people?

I’m all for having more kids and I do think they help you care more about the future but that’s… wow. Just wow.

It then somehow got worse.

It is such a strange fact that the richest man in the world engages in pure name calling on Twitter, thus amplifying whoever sent the original message a hundredfold and letting everyone decide for themselves who the pathetic wanker is.

George McGowan: Rather proves the point

While I most definitely would rather entrust the future to Amanda, I do think Schubert’s statement is too strong. Sane people can have very strange beliefs.

Elon Musk just… says things that are obviously false. All the time. It’s annoying.

Elon Musk: Having kids means you will do anything to ensure that they live and are happy, for you love them more than your own life a thousand times over, and there is no chance that you will fall in love with an AI instead.

Remember these words.

Regardless of what you think of Elon Musk’s actions in AI or with his own kids, and notwithstanding that most parents do right by their kids, very obviously many parents do not act this way, many do not try so hard to do right by their kids. Many end up choosing a new other person they have fallen in love with over their kids, and very obviously there exist parents who have fallen in love with an existing AI, let alone what will happen with future AI. Would that it were otherwise.

It is an excellent question.


On Dwarkesh Patel’s 2026 Podcast With Dario Amodei

Dwarkesh / Paul Patrick / February 16, 2026

Some podcasts are self-recommending on the ‘yep, I’m going to be breaking this one down’ level. This was very clearly one of those. So here we go.

As usual for podcast posts, the baseline bullet points describe key points made, and then the nested statements are my commentary. Some points are dropped.

If I am quoting directly I use quote marks, otherwise assume paraphrases.

What are the main takeaways?

Dario mostly stands by his predictions of extremely rapid advances in AI capabilities, both in coding and in general, and in expecting the ‘geniuses in a data center’ to show up within a few years, possibly even this year.
Anthropic’s actions do not seem to fully reflect this optimism, but also when things are growing on a 10x per year exponential if you overextend you die, so being somewhat conservative with investment is necessary unless you are prepared to fully burn your boats.
Dario reiterated his stances on China, export controls, democracy, AI policy.
The interview downplayed catastrophic and existential risk, including relative to other risks, although it was mentioned and Dario remains concerned. There was essentially no talk about alignment at all. The dog did not bark in the nighttime.
Dwarkesh remains remarkably obsessed with continual learning.

AI progress is going at roughly Dario’s expected pace plus or minus a year or two, except coding is going faster than expected. His top level model of scaling is the same as it was in 2017.
1. I don’t think this is a retcon, but he did previously update too aggressively on coding progress, or at least on coding diffusion.
Dario still believes the same seven things matter: Compute, data, data quality and distribution, length of training, an objective function that scales, and two things around normalization or conditioning.
1. I assume this is ‘matters for raw capability.’
Dwarkesh asks about Sutton’s perspective that we’ll get human-style learners. Dario says there’s an interesting puzzle there, but it probably doesn’t matter. LLMs are blank slates in ways humans aren’t. In-context learning will be in-between human short and long term learning. Dwarkesh asks then why all of this RL and building RL environments? Why not focus on learning on the fly?
1. Because the RL and giving it more data clearly works?
2. Whereas learning on the fly doesn’t work, and even if it did, what happens when the model resets every two months?
3. Dwarkesh has pushed on this many times and is doing so again.
Timeline time. Why does Dario think we are at ‘the end of the exponential’ rather than ten years away? Dario says his famous ‘country of geniuses in a data center’ is 90% within 10 years without biting a bullet on faster. One concern is needing verification. Dwarkesh pushes that this means the models aren’t general, Dario says no we see plenty of generalization, but the world where we don’t get the geniuses is still a world where we can do all the verifiable things.
1. As always, notice the goalposts. Ten years from human-level AI is a ‘long time.’
2. Dario is mostly right on generalization, in that you need verification to train in distribution but then things often work well (albeit less well) out of distribution.
3. The class of verifiable things is larger than one might think, if you include all necessary subcomponents of those tasks and then the combination of those subcomponents.
Dwarkesh challenges if you could automate an SWE without generalization outside verifiable domains, Dario says yes you can, you just can’t verify the whole company.
1. I’m 90% with Dario here.
What’s the metric of AI in SWE? Dario addresses his predictions of AI writing 90% of the lines of code in 3-6 months. He says it happened at Anthropic, and that ‘100% of today’s SWE tasks are done by the models,’ but that this is not yet true overall, and says people were reading too much into the prediction.
1. The prediction was still clearly wrong.
2. A lot of that was Dario overestimating diffusion at this stage.
3. I do agree that the prediction was ‘less wrong,’ or more right, than those who predicted a lack of big things for AI coding, who thought things would not escalate quickly.
4. Dario could have reliably looked great if he’d made a less bold prediction. There’s rarely reputational alpha in going way beyond others. If everyone else says 5 years, and you think 3-6 months, you can say 2 years and then if it happens in 3-6 months you still look wicked smart. Whereas the super fast predictions don’t sound credible and can end up wrong. Predicting 3-6 months here only happens if you’re committed to a kind of epistemic honesty.
5. I agree with Dario that going from 90% of code to 100% of code written by AI is a big productivity unlock. Dario’s prediction on this has already been confirmed by events. This is standard Bottleneck Theory.
“Even when that happens, it doesn’t mean software engineers are out of a job. There are new higher-level things they can do, where they can manage. Then further down the spectrum, there’s 90% less demand for SWEs, which I think will happen but this is a spectrum.”
1. It would take quite a lot of improved productivity to reduce demand by 90%.
2. I’d go so far as to say that if we reduce SWE demand by 90%, then we have what one likes to call ‘much bigger problems.’
Anthropic went from zero ARR to $100 million in 2023, to $1 billion in 2024, to $9-$10 billion in 2025, and added a few more billion in January 2026. He guesses the 10x per year starts to level off some time in 2026, although he’s trying to speed it up further. Adoption is fast, but not infinitely fast.
1. Dario’s predictions on speed of automating coding were unique, in that all the revenue predictions for OpenAI and Anthropic have consistently come in too low, and I think the projections are intentional lowballs to ensure they beat the projections and because the normies would never believe the real number.
Dwarkesh pulls out the self-identified hot take that ‘diffusion is cope’ used to justify when models can’t do something. Hiring humans is much more of a hassle than onboarding an AI. Dario says you still have to do a lot of selling in several stages, the procurement processes are often shortcutted but still take time, and even geniuses in a datacenter will not be ‘infinitely’ compelling as a product.
1. I’ve basically never disagreed with a Dwarkesh take as much as I do here.
2. Yes, of course diffusion is a huge barrier.
3. The fact that if the humans knew to set things up, and how to set things up, that the cost of deployment and diffusion would be low? True, but completely irrelevant.
4. The main barrier to Claude Code is not that it’s hard to install, it’s that it’s hard to get people to take the plunge and install it, as Dario notes.
5. In practice, very obviously, even the best of us miss out on a lot of what LLMs can do for us, and most people barely scratch the surface at best.
6. A simple intuition pump: If diffusion is cope, what do you expect to happen if there was an ‘AI pause’ starting right now, and no new frontier models were ever created?
7. Dwarkesh sort of tries to backtrack on what he said as purely asserting that we’re not currently at AGI, but that’s an entirely different claim?
Dario says we’re not at AGI, and that if we did have a ‘country of geniuses in a datacenter’ then everyone would know this.
1. I think it’s possible that we might not know, in the sense that they might be sufficiently both capable and misaligned to disguise this fact, in which case we would be pretty much what we technically call ‘toast.’
2. I also think it is very possible in the future that an AI lab might get the geniuses and then disguise this fact from the rest of us, and not release the geniuses directly, for various reasons.
3. Barring those scenarios? Yes, we would know.

It’s a Dwarkesh Patel AI podcast, so it’s time for continual learning in two senses.

Dwarkesh thinks Dario’s prediction for today, from three years ago, of “We should expect systems which, if you talk to them for the course of an hour, it’s hard to tell them apart from a generally well-educated human” was basically accurate. Dwarkesh however is spiritually unsatisfied because that system can’t automate large parts of white-collar work. Dario points out OSWorld scores are already at 65%-70% up from 15% a year ago, and computer use will improve.
1. I think it is very easy to tell, but I think the ‘spirit of the question’ is not so off, in the sense that on most topics I can have ‘at least as good’ a conversation with the LLM for an hour as with the well-educated human.
2. Can such a system automate large parts of white-collar work? Yes. Very obviously yes, if we think in terms of tasks rather than full jobs. If you gave us ten years (as an intuition pump) to adapt to existing systems, then I would predict a majority of current white-collar digital job tasks get automated.
3. The main current barrier to the next wave of practical task automation is that computer use is still not so good (as Dario says), but that will get fixed.
Dwarkesh asks about the job of video editor. He says they need six months of experience to understand the trade-offs and preferences and tastes necessary for the job and asks when AI systems will have that. Dario says the ‘country of geniuses in a datacenter’ can do that.
1. I bet that if you took Claude Opus 4.6 and Claude Code, and you gave it the same amount of human attention to improving its understanding of trade-offs, preferences and taste over six months that a new video editor would have, and a similar amount of time training video editing skills, that you could get this to the point where it could do most of the job tasks.
2. You’d have to be building up copious notes and understandings of the preferences and considerations, and you’d need for now some amount of continual human supervision and input, but yeah, sure, why not.
3. Except that by the time you were done you’d use Opus 5.1, but same idea.
Dwarkesh says he still has to have humans do various text-to-text tasks, and LLMs have proved unable to do them, for example on ‘identify what the best clips would be in this transcript’ they can only do a 7/10 job.
1. If you see the LLMs already doing a 7/10 job, the logical conclusion is that this will be 9/10 reasonably soon especially if you devote effort to it.
2. There are a lot of things one could try here, and my guess is that Dwarkesh has mostly not tried them, largely because until recently trying them was a lot slower and more expensive than it is now.
Dwarkesh asks if a lot of LLM coding ability is the codebase as massive notes. Dario points out this is not an accounting of what a human needs to know, and the model is much faster than humans at understanding the code base.
1. I think the metaphor is reasonably apt, in that in code the humans or prior AIs have written things down, and in other places we haven’t written similar things down. You could fix that, including over time.
Dwarkesh cites the ‘the developers using LLMs thought they were faster but went slower’ study and asks where the renaissance of software and productivity benefits are from AI coding. Dario says it’s unmistakable within Anthropic, and cites that they’ve cut their competitors off from using Claude.
1. Not letting OpenAI use Claude is a big costly signal that they view agentic coding as a big productivity boost, and even that theirs is a big boost over OpenAI’s versions of the same tools.
2. It seems very difficult, watching the pace of developments in AI inside and outside of the frontier labs, to think coding productivity isn’t accelerating.
Dario estimates current coding models give 15%-20% speedup, versus 5% six months ago, and that Amdahl’s law means you eventually get a much bigger speedup once you start closing full loops.
1. It’s against his interests to come up with a number that small.
2. I also don’t believe a number that small, especially since the pace of coding now seems to be largely rate limited by compute and frequency of human interruptions to parallel agents. It’s very hard to thread the needle and have the gains be this small.
3. The answer will vary a lot. I can observe that for me, given my particular set of skills, the speedup is north of 500%. I’m vastly faster and better.
Dwarkesh asks again ‘continual learning when?’ and Dario says he has ideas.
1. There are cathedrals for those with eyes to see.
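
For those who want the Amdahl’s law point above made concrete, here is a minimal sketch. The formula is the standard one: if a fraction p of the work is sped up by a factor s, the overall speedup is 1 / ((1 - p) + p / s). The specific fractions and the 10x factor below are hypothetical illustrations, not numbers from Dario or the podcast.

```python
def amdahl_speedup(accelerated_fraction: float, factor: float) -> float:
    """Overall speedup when only part of a workflow gets faster.

    accelerated_fraction: share of total time that AI speeds up (0 to 1).
    factor: how many times faster that share becomes (must be > 0).
    Both inputs are illustrative assumptions, not measured values.
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / factor)


if __name__ == "__main__":
    # Hypothetical: the accelerated part gets 10x faster in every case,
    # and only the share of the loop that is 'closed' changes.
    for fraction in (0.2, 0.5, 0.9, 1.0):
        speedup = amdahl_speedup(fraction, factor=10.0)
        print(f"{fraction:.0%} of work accelerated 10x: {speedup:.2f}x overall")
```

The shape is the point: while humans still handle a large share of the loop, even a big speedup on the rest gives a modest overall number, and the overall number climbs steeply as the unaccelerated share approaches zero.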

How does Dario reconcile his general views on progress with his radically fast predictions on capabilities? Fast but finite diffusion, especially economic. Curing diseases might take years.
1. Diffusion is real but Dario’s answer to this, which hasn’t changed, has never worked for me. His predictions on impact do not square with his predictions on capabilities, period, and it is not a small difference.
Why not buy the biggest data center you can get? If Anthropic managed to buy enough compute for their anticipated demand, they burn the boats. That’s on the order of $5 trillion two years from now. If the revenue does not materialize, they’re toast. Whereas Anthropic can ensure financial stability and profitability by not going nuts, as their focus is enterprise revenue with higher margins and reliability.
1. Being early in this sense, when things keep going 10x YoY, is fatal.
2. That’s not strictly true. You’re only toast if you can’t resell the compute at the same or a better price. But yes, you’re burning the boats if conditions change.
3. Even if you did want to burn the boats, it doesn’t mean the market will let you burn the boats. The compute is not obviously for sale, nor is Anthropic’s credit good for it, nor would the investors be okay with this.
4. This does mean that Anthropic is some combination of insufficiently confident to burn the boats or unable to burn them.
Dario won’t give exact numbers, but he’s predicting more than 3x to Anthropic compute each year going forward.

Why is Anthropic planning on turning a profit in 2028 instead of reinvesting? “I actually think profitability happens when you underestimated the amount of demand you were going to get and loss happens when you overestimated the amount of demand you were going to get, because you’re buying the data centers ahead of time.” He says they could potentially even be profitable in 2026.
1. Thus, the theory is that Anthropic needs to underestimate demand because it is death to overestimate demand, which means you probably turn a profit ‘in spite of yourself.’ That’s so weird, but it kind of makes sense.
2. Dario denies this is Anthropic ‘systematically underinvesting in compute’ but that depends on your point of view. You’re underinvesting post-hoc with hindsight. That doesn’t mean it was a mistake over possible worlds, but I do think that it counts as underinvesting for these purposes.
3. Also, Dario is saying (in the toy model) you split compute 50/50 internal use versus sales. You don’t have to do that. You could double the buy, split it 75/25 and plan on taking a loss and funding the loss by raising capital, if you wanted that.
Dwarkesh suggests doing exactly such an uneven split, Dario says there are log returns to scale, diminishing returns after spending e.g. $50 billion a year, so it probably doesn’t help you that much.
1. I basically don’t buy this argument. I buy the diminishing return but it seems like if you actually believed Anthropic’s projections you wouldn’t care. As Dwarkesh says ‘diminishing returns on a genius could be quite high.’
2. If you actually did have a genius in your datacenters, I’d expect there to be lots of profitable ways to use that marginal genius. The world is your oyster.
3. And that’s if you don’t get into an AI 2027 or other endgame scenario.
Dario says AI companies need revenue to raise money and buy more compute.
1. In practice I think Dario is right. You need customers to prove your value and business model sufficiently to raise money.
2. However, I think the theory here is underdeveloped. There is no reason why you couldn’t keep raising at higher valuations without a product. Indeed, see Safe Superintelligence, and see Thinking Machines before they lost a bunch of people, and so on, as Matt Levine often points out. It’s better to be a market leader, but the no product, all research path is very viable.
3. The other advantage of having a popular product is gaining voice.
Dwarkesh claims Dario’s view is compatible with us being 10 years away from AI generating trillions in value. Dario says it might take 3-4 years at most, he’s very confident in the ‘geniuses’ showing up by 2028.
1. Dario feels overconfident here, and also more confident than his business decisions reflect. If he’s that confident he’s not burning enough boats.
Dario predicts a Cournot equilibrium, with a small number of relevant firms, which means there will be economic profits to be captured. He points out that gross margins are currently very positive, and the reason AI companies are taking losses is that each model turns a profit but you’re investing in the model that costs [10*X] while collecting the profits from the model that costs [X]. At some point the compute stops multiplying by 10 each cycle and then you notice that you were turning a profit the whole time. The economy is going to grow faster, but that’s like 10%-20% fast, not 300% a year fast.
1. I don’t understand what is confusing Dwarkesh here. I do get that this is confusing to many but it shouldn’t confuse Dwarkesh.
2. Of course if we do start seeing triple-digit economic growth, things get weird, and also we should strongly suspect we will all soon die or lose control, but in the meantime there’ll be some great companies and I wouldn’t worry about Anthropic’s business model while that is happening.
Dario says he feels like he’s in an economics class.
1. Honestly it did feel like that. This is the first time in a long time it felt like Dwarkesh flat out was not prepared on a key issue, and is getting unintentionally taken to school (as opposed to when someone like Sarah Paine is taking us to school, but by design.)
Dario predicts an oligopoly, not a monopoly, because of lack of network effects combined with high fixed costs, similar to cloud providers.
1. This is a bet on there not being win-more or runaway effects.
2. For a while, the battle had catch-up mechanics rather than runaway effects. If you were behind, you can distill and you can copy ideas, so it’s hard to maintain much of a lead.
3. This feels like it is starting to change as RSI sets in. Claude is built by Claude Code, Codex is built by Codex, Google has to make its own choices and so on. The models are in many ways charged with training their successors.
4. Also the cycle may be speeding up a la AI 2027. If you’re six months behind that used to be one generation behind. Now it is three.
5. And of course, once sufficiently powerful RSI (recursive self-improvement) sets in, and the models become sufficiently capable, that edge starts to translate into various other advantages far more readily.
6. Many fates are possible, but don’t rule out monopoly or winner-take-most.
Dario points out different models have different comparative advantages, often in subtle ways.
1. True, but a sufficient lead would apply across the board. We’re not there right now, but we’re not that far from it either.
Dario worries Silicon Valley and those connected to it could grow at 50% while everyone else grows at not much above the normal 2%. He says that would be ‘a pretty messed up world.’
1. I think that turns out fine. You tax the part growing at 50%, everyone wins.
2. That’s distinct from issues about the AI taking over, or the people in charge of the AI taking over, you still do have to dodge those problems. But if well-meaning humans are in control I don’t worry about distributional issues under extreme growth.
Will robotics get solved soon after we get the ‘geniuses’? Dario says it doesn’t depend on learning like a human, there are many options, and it will happen, we will learn to control robots, and yes the robotics industry will then make trillions. It tacks on maybe a year or two to get going.
1. This seems obviously correct if you understand the premise, that we have the geniuses and the geniuses are playing nice for whatever reason.
2. That premise is not obvious.
Dwarkesh Patel keeps talking about continual learning, Dario Amodei keeps saying that we don’t need it.
1. I agree with Dario. We don’t need it as such, if nothing else we can easily solve such problems already via [CENSORED].
How should we price AGI? Dario thinks API is durable and will exist alongside other options, including forms of ‘pay for results.’
How did Anthropic end up being the ones to build Claude Code? Dario encouraged experimentation internally, they used it internally, and then Dario said they should launch it externally.
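Going back to the 50% versus 2% growth scenario and my ‘tax the part growing at 50%’ response, here is a quick sketch of the arithmetic. The growth rates are the ones Dario names. The 5% starting share for the fast sector and the 20% tax rate are made-up numbers of mine, chosen only to illustrate, and the function names are hypothetical. It assumes the tax does not slow the fast sector down, which is of course the thing one would want to check.

```python
def sector_sizes(years, fast_rate=0.50, slow_rate=0.02, initial_fast_share=0.05):
    """Size of each sector after `years`, with the starting economy set to 1.0.

    initial_fast_share is an illustrative assumption, not a figure from the podcast.
    """
    fast = initial_fast_share * (1 + fast_rate) ** years
    slow = (1 - initial_fast_share) * (1 + slow_rate) ** years
    return fast, slow


def fast_sector_share(years, **kwargs):
    """Fraction of total output produced by the fast-growing sector."""
    fast, slow = sector_sizes(years, **kwargs)
    return fast / (fast + slow)


def slow_sector_income(years, tax_rate=0.20, **kwargs):
    """Slow sector's own output plus a transfer taxed from the fast sector.

    tax_rate is an illustrative assumption; the model ignores any effect
    of the tax on the fast sector's growth.
    """
    fast, slow = sector_sizes(years, **kwargs)
    return slow + tax_rate * fast


if __name__ == "__main__":
    for year in (0, 5, 10, 15):
        share = fast_sector_share(year)
        untaxed = slow_sector_income(year, tax_rate=0.0)
        taxed = slow_sector_income(year)
        print(f"year {year:2d}: fast share {share:.2f}, "
              f"slow income {untaxed:.2f} untaxed vs {taxed:.2f} with transfer")
```

Under these assumptions the fast sector becomes most of the economy within about a decade, at which point even a modest transfer dwarfs what the slow sector earns on its own.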

Finally, we ask about making AI ‘go well.’ With that framing you know that everyone is mostly conspicuously ignoring the biggest issues.

Soon there will be lots of misaligned or crazy AIs running around. What to do? Dario correctly reiterates his dismissal of the idea that having a bunch of different AIs keeps them meaningfully in check. He points to alignment work, and classifiers, for the short run. For the long run, we need governance and some sort of monitoring system, but it needs to be consistent with civil liberties, and we need to figure this out really fast.
1. We’ve heard Dario’s take on this before, he gives a good condensed version.
2. For my response, see my discussion of The Adolescence of Technology. I think he’s dodging the difficult questions, problems and clashes of sacred values, because he feels it’s the strategically correct play to dodge them.
3. That’s a reasonable position, in that if you actively spell out any plan that might possibly work, even in relatively fortunate scenarios, this is going to involve some trade-offs that are going to create very nasty pull quotes.
4. The longer you wait to make those trade-offs, the worse they get.
Dwarkesh asks, what do we do in an offense-dominated world? Dario says we would need international coordination on forms of defense.
1. Yes. To say (less than) the least.
Dwarkesh asks about Tennessee’s latest crazy proposed bill (it’s often Tennessee), which says “It would be an offense for a person to knowingly train artificial intelligence to provide emotional support, including through open-ended conversations with a user” and a potential patchwork of state laws. Dario (correctly) points out that particular law is dumb and reiterates that a blanket moratorium on all state AI bills for 10 years is a bad idea, we should only stop states once we have a federal framework in place on a particular question.
1. Yes, that is the position we still need to argue against, my lord.
Dario points out that people talk about ‘thousands of state laws’ but those are only proposals, almost all of them fail to pass, and when really stupid laws pass they often don’t get implemented. He points out that there are many things in AI he would actively deregulate, such as around health care. But he says we need to ramp up the safety and security legislation quite significantly, especially transparency. Then we need to be nimble.
1. I agree with all of this, as far as it goes.
2. I don’t think it goes far enough.
3. Colorado passed a deeply stupid AI regulation law, and didn’t implement it.
What can we do to get the benefits of AI better instantiated? Dwarkesh is worried about ‘kinds of moral panics or political economy problems’ and he worries benefits are fragile. Dario says no, markets actually work pretty well in the developed world.
1. Whereas Dwarkesh does not seem worried about the actual catastrophic or existential risks from AI.

Dario is fighting for export controls on chips, and he will ‘politely call the counterarguments fishy.’
Dwarkesh asks, what’s wrong with China having its own geniuses? Dario says we could be in an offense-dominant world, and even if we are not then potential conflict would create instability. And he worries governments will use AI to oppress their own people, China especially. Some coalition with pro-human values has to say ‘these are the rules of the road.’ We need to press our edge.
1. I am sad that this is the argument he is choosing here. There are better reasons, involving existential risks. Politically I get why he does it this way.
Dario doesn’t see a key inflection point, even with his ‘geniuses,’ the exponential will continue. He does call for negotiation with a strong hand.
1. This is reiteration from his essays. He’s flinching.
2. There are good reasons for him to flinch, but be aware he’s doing it.
More discussion of democracy and authoritarianism and whether democracy will remain viable or authoritarianism will lack sustainability or moral authority, etc.
1. There’s nothing new here, Dario isn’t willing to say things that would be actually interesting, and I grow tired.
Why does Claude’s constitution try to make Claude align to desired values and do good things and not bad things, rather than simply being user aligned? Dario gives the short version of why virtue ethics gives superior results here, without including explanations of why user alignment is ultimately doomed or the more general alignment problems other approaches can’t solve.
1. If you’re confused about this see my thoughts on the Claude Constitution.
How are these principles determined? Can’t Anthropic change them at any time? Dario suggests three sizes of loop: Within Anthropic, different companies putting out different constitutions people can compare, and society at large. He says he’d like to let representative governments have input but right now the legislative process is too slow therefore we should be careful and make it slower. Dwarkesh likes control loop two.
1. I like the first two loops. The problem with putting the public in the loop is that they have no idea how any of this works and would not make good choices, even according to their own preferences.
What have we likely missed about this era when we write the book on it? Dario says, the extent to which the world didn’t understand the exponential while it was happening, that the average person had no idea, that everything was being decided all at once, and that consequential decisions are often made very quickly on almost no information while spending very little human compute.
1. I really hope we are still around to write the book.
2. From the processes we observe and what he says, I don’t love our chances.


On Dwarkesh Patel’s 2026 Podcast With Dario Amodei

On Dwarkesh Patel’s Second Interview With Ilya Sutskever

Dwarkesh / Kris Guyer / December 3, 2025

Some podcasts are self-recommending on the ‘yep, I’m going to be breaking this one down’ level. This was very clearly one of those. So here we go.

As usual for podcast posts, the baseline bullet points describe key points made, and then the nested statements are my commentary.

If I am quoting directly I use quote marks, otherwise assume paraphrases.

What are the main takeaways?

Ilya thinks training in its current form will peter out, that we are returning to an age of research where progress requires more substantially new ideas.
SSI is a research organization. It tries various things. Not having a product lets it punch well above its fundraising weight in compute and effective resources.
Ilya has 5-20 year timelines to a potentially superintelligent learning model.
SSI might release a product first after all, but probably not?
Ilya’s thinking about alignment still seems relatively shallow to me in key ways, but he grasps many important insights and understands he has a problem.
Ilya essentially despairs of having a substantive plan beyond ‘show everyone the thing as early and often as possible’ and hope for the best. He doesn’t know where to go or how to get there, but does realize he doesn’t know these things, so he’s well ahead of most others.

Afterwards, this post also covers Dwarkesh Patel’s post on the state of AI progress.

Ilya opens by remarking how crazy it is all this (as in AI) is real, it’s all so sci-fi, and yet it’s not felt in other ways so far. Dwarkesh expects this to continue for average people into the singularity, Ilya says no, AI will diffuse and be felt in the economy. Dwarkesh says impact seems smaller than model intelligence implies.
1. Ilya is right here. Dwarkesh is right that direct impact so far has been smaller than model intelligence implies, but give it time.
Ilya says, the models are really good at evals but economic impact lags. The models are buggy, and choices for RL take inspiration from the evals, so the evals are misleading and the humans are essentially reward hacking the evals. And given that the models got their scores by studying for tons of hours rather than via intuition, one should expect AIs to underperform their benchmarks.
1. AIs definitely underperform their benchmarks in terms of general usefulness, even for those companies that do minimal targeting of benchmarks. Overall capabilities lag behind, for various reasons. We still have an impact gap.
The super talented student? The one that hardly even needs to practice a specific task to be good? They’ve got ‘it.’ Models don’t have ‘it.’
1. If anything, models have ‘anti-it.’ They make it up on volume. Sure.

Humans train on much less data, but what they know they know ‘more deeply’ somehow, there are mistakes we wouldn’t make. Also evolution can be highly robust, for example the famous case where a guy lost all his emotions and in many ways things remained fine.
1. People put a lot of emphasis on the ‘I would never’ heuristic, as AIs will sometimes do things ‘a similarly smart person’ would never do, they lack a kind of common sense.
So what is the ‘ML analogy for emotions’? Ilya says some kind of value function thing, as in the thing that tells you if you’re doing well versus badly while doing something.
1. Emotions as value functions makes sense, but they are more information-dense than merely a scalar, and can often point you to things you missed. They do also serve as training reward signals.
2. I don’t think you ‘need’ emotions for anything other than signaling emotions, if you are otherwise sufficiently aware in context, and don’t need them to do gradient descent.
3. However in a human, if you knock out the emotions in places where you were otherwise relying on them for information or to resolve uncertainty, you’re going to have a big problem.
4. I notice an obvious thing to try but it isn’t obvious how to implement it?
Ilya has faith in deep learning. There’s nothing it can’t do!
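For anyone unfamiliar with the term, here is a minimal sketch of what a ‘value function’ means in reinforcement learning, using textbook TD(0) on a toy chain of states. This is not anything Ilya or SSI describes doing. The environment, the function name and the parameters are all my own illustrative assumptions. The agent walks right along the chain and gets a reward of 1 on reaching the end, and the learned value of each state is the running estimate of how well things are going from there.

```python
def td0_chain_values(n_states=5, gamma=0.9, alpha=0.1, episodes=2000):
    """Learn state values on a toy chain with TD(0).

    The agent always steps right. Entering the last state pays reward 1.0
    and ends the episode. All settings here are illustrative assumptions.
    """
    values = [0.0] * n_states  # the terminal state's value stays at 0.0
    for _ in range(episodes):
        state = 0
        while state < n_states - 1:
            next_state = state + 1
            reward = 1.0 if next_state == n_states - 1 else 0.0
            # TD error: a single scalar saying whether things went
            # better or worse than the current estimate expected.
            td_error = reward + gamma * values[next_state] - values[state]
            values[state] += alpha * td_error
            state = next_state
    return values


if __name__ == "__main__":
    for state, value in enumerate(td0_chain_values()):
        print(f"state {state}: estimated value {value:.3f}")
```

Note that the whole signal is one number per step, the TD error. That is the sense in which I say emotions are more information-dense than a value function: they can also tell you what you missed, not only how well you are doing.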

Data? Parameters? Compute? What else? It’s easier and more reliable to scale up pretraining than to figure out what else to do. But we’ll run out of data soon even if Gemini 3 got more out of this, so now you need to do something else. If you had 100x more scale here would anything be that different? Ilya thinks no.
1. Sounds like a skill issue, on some level, but yes if you didn’t change anything else then I expect scaling up pretraining further won’t help enough to justify the increased costs in compute and time.
RL costs now exceed pretraining costs, because each RL run costs a lot. It’s time to get back to an age of research, trying interesting things and seeing what happens.
1. I notice I am skeptical of the level of skepticism, also I doubt the research mode ever stopped in the background. The progress will continue. It’s weird how every time someone says ‘we still need some new idea or breakthrough’ there is the implication that this likely never happens again.

Why do AIs require so much more data than humans to learn? Why don’t models easily pick up on all this stuff humans learn one-shot or in the background?
1. Humans have richer data than text so the ratio is not as bad as it looks, but primarily it is because our AI learning techniques are relatively primitive and data inefficient in various ways.
2. My full answer to how to fix it falls under ‘I don’t do $100m/year jobs for free.’
3. Also there are ways in which the LLMs learn way better than you realize, and a lot of the tasks humans easily learn are regularized in non-obvious ways.
Ilya believes humans being good at learning is mostly not part of some complicated prior, and people’s robustness is really staggering.
1. I would clarify, not part of a complicated specialized prior. There is also a complicated specialized prior in some key domains, but that is in addition to a very strong learning function.
2. People are not as robust as Ilya thinks, or most people think.
Ilya suggests perhaps human neurons use more compute than we think.

Scaling ‘sucked the air out of the room’ so no one did anything else. Now there are more companies than ideas. You need some compute to bring ideas to life, but not the largest amounts.
1. You can also think about some potential techniques as ‘this is not worth trying unless you have massive scale.’
SSI’s compute all goes into research, none into inference, and they don’t try to build a product, and if you’re doing something different you don’t have to use maximum scale, so their $3 billion that they’ve raised ‘goes a long way’ relative to the competition. Sure OpenAI spends ~$5 billion a year on experiments, but it’s what you do with it.
1. This is what Ilya has to say in this spot, but there’s merit in it. OpenAI’s experiments are largely about building products now. This transfers to the quest for superintelligence, but not super efficiently.
How will SSI make money? Focus on the research, the money will appear.
1. Matt Levine has answered this one, which is that you make money by being an AI company full of talented researchers, so people give you money.
SSI is considering making a product anyway, both to have the product exist and also because timelines might be long.
1. I mean I guess at some point the ‘we are AI researchers give us money’ strategy starts to look a little suspicious, but let’s not rush into anything.
2. Remember, Ilya, once you have a product and try to have revenue they’ll evaluate the product and your revenue. If you don’t have one, you’re safe.

Ilya says even if there is a straight shot to superintelligence, deployment would be gradual, you have to ship something first, and he agrees with Dwarkesh on the importance of continual learning. It would ‘go and be’ various things and learn. Superintelligence is not a finished mind.
1. Learning takes many forms, including continual learning, it can be updating within the mind or otherwise, and so on. See previous podcast discussions.
Ilya expects ‘rapid’ economic growth, perhaps ‘very rapid.’ It will vary based on what rules are set in different places.
1. Rapid means different things to different people, it sounds like Ilya doesn’t have a fixed rate in mind. I interpret it as ‘more than these 2% jokers.’
2. This vision still seems to think the humans stay in charge. Why?

Dwarkesh reprises the standard point that if AIs are merely ‘as good as’ humans at learning, but they can ‘merge brains,’ then crazy things happen. How do we make such a situation go well? What is SSI’s plan?
1. I mean, that’s the least of it, but hopefully yes that suffices to make the point?
Ilya emphasizes deploying incrementally and in advance. It’s hard to predict what this will be like in advance. “The problem is the power. When the power is really big, what’s going to happen? If it’s hard to imagine, what do you do? You’ve got to be showing the thing.”
1. This feels like defeatism, in terms of saying we can only respond to things once we can see and appreciate them. We can’t plan for being old until we know what that’s like. We can’t plan for AGI/ASI, or AI having a lot of power, until we can see that in action.
2. But obviously by then it is likely to be too late, and most of your ability to steer what happens has already been lost, perhaps all of it.
3. This is the strategy of ‘muddle through’ the same as we always muddle through, basically the plan of not having a plan other than incrementalism. I do not care for this plan. I am not happy to be a part of it. I do not think that is a case of Safe Superintelligence.
Ilya expects governments and labs to play big roles, and for labs to increasingly coordinate on safety, as Anthropic and OpenAI did in a recent first step. And we have to figure out what we should be building. He suggests making the AI care about sentient life in general will be ‘easier’ than making it care about humans, since the AI will be sentient.
1. If the AIs do not care about humans in particular, there is no reason to expect humans to stay in control or to long endure.
Ilya would like the most powerful superintelligence to ‘somehow’ be ‘capped’ to address these concerns. But he doesn’t know how to do that.
1. I don’t know how to do that either. It’s not clear the idea is coherent.
Dwarkesh asks how much ‘room is there at the top’ for superintelligence to be more super? Maybe it just learns fast or has a bigger pool of strategies or skills or knowledge? Ilya says very powerful, for sure.
1. Sigh. There is very obviously quite a lot of ‘room at the top’ and humans are not anything close to maximally intelligent, nor to getting most of what intelligence has to offer. At this point, the number of people who still don’t realize or accept this reinforces how much better a smarter entity could be.
Ilya expects these superintelligences to be very large, as in physically large, and for several to come into being at roughly the same time, and ideally they could “be restrained in some ways or if there was some kind of agreement or something.”
1. That agreement between AIs would then be unlikely to include us. Yes, functional restraints would be nice, but this is the level of thought that has gone into finding ways to do it.
2. A lot of things have stayed remarkably close, but a lot of that is because, for now, catching up has been easier than having an edge compound and accelerate.
Ilya: “What is the concern of superintelligence? What is one way to explain the concern? If you imagine a system that is sufficiently powerful, really sufficiently powerful—and you could say you need to do something sensible like care for sentient life in a very single-minded way—we might not like the results. That’s really what it is.”
1. Well, yes, standard Yudkowsky, no fixed goal we can name turns out well.
Ilya says maybe we don’t build an RL agent. Humans are semi-RL agents, our emotions make us alter our rewards and pursue different rewards after a while. If we keep doing what we are doing now it will soon peter out and never be “it.”
1. There’s a baked in level of finding innovations and improvements that should be in anyone’s ‘keep doing what we are doing’ prior, and I think it gets us pretty far and includes many individually low-probability-of-working innovations making substantial differences. There is some level on which we would ‘peter out’ without a surprise, but it’s not clear that this requires being surprised overall.
2. Is it possible things do peter out and we never see ‘it’? Yeah. It’s possible. I think it’s a large underdog to stay that way for long, but it’s possible. Still a long practical way to go even then.
3. Emotions, especially boredom and the fading of positive emotions on repetition, are indeed one of the ways we push ourselves towards exploration and variety. That’s one of many things they do, and yes if we didn’t have them then we would need something else to take their place.
4. In many cases I have indeed used logic to take the place of that, when emotion seems to not be sufficiently preventing mode collapse.
“One of the things that you could say about what causes alignment to be difficult is that your ability to learn human values is fragile. Then your ability to optimize them is fragile. You actually learn to optimize them. And can’t you say, “Are these not all instances of unreliable generalization?” Why is it that human beings appear to generalize so much better? What if generalization was much better? What would happen in this case? What would be the effect? But those questions are right now still unanswerable.”
1. It is cool to hear Ilya restate these Yudkowsky 101 things.
2. Humans do not actually generalize all that well.
How does one think about what AI going well looks like? Ilya goes back to ‘AI that cares for sentient life’ as a first step, but then asks the better question, what is the long run equilibrium? He notices he does not like his answer. Maybe each person has an AI that will do their bidding and that’s good, but the downside is then the AI does things like earn money or advocate or whatever, and the person says ‘keep it up’ but they’re not a participant. Precarious. People become part AI, Neuralink++. He doesn’t like this solution, but it is at least a solution.
1. Big points for acknowledging that there are no known great solutions.
2. Big points for pointing out one big flaw, that the people stop actually doing the things, because the AIs do the things better.
3. The equilibrium here is that increasingly more things are turned over to AIs, including both actions and decisions. Those who don’t do this fall behind.
4. The equilibrium here is that increasingly AIs are given more autonomy, more control, put in better positions, have increasing power and wealth shares, and so on, even if everything involved is fully voluntary and ‘nothing goes wrong.’
5. Neuralink++ does not meaningfully solve any of the problems here.
6. Solve for the equilibrium.
Is the long history of emotions an alignment success? As in, it allows the brain to move from ‘mate with somebody who’s more successful’ into flexibly defining success and generally adjusting to new situations.
1. It’s a highly mixed bag, wouldn’t you say?
2. There are ways in which those emotions have been flexible and adaptable and a success, and have succeeded in the alignment target (inclusive genetic fitness) and also ways in which emotions are very obviously failing people.
3. If ASIs are about as aligned as we are in this sense, we’re doomed.
Ilya says it’s mysterious how evolution encodes high-level desires, but it gives us all these social desires, and they evolved pretty recently. Dwarkesh points out it is a desire you learned in your lifetime. Ilya notes the brain has regions and some things are hardcoded, but if you remove half the brain then the regions move, and the social stuff is highly reliable.
1. I don’t pretend to understand the details here, although I could speculate.

SSI investigates ideas to see if they are promising. They do research.
On his cofounder leaving: “For this, I will simply remind a few facts that may have been forgotten. I think these facts which provide the context explain the situation. The context was that we were fundraising at a $32 billion valuation, and then Meta came in and offered to acquire us, and I said no. But my former cofounder in some sense said yes. As a result, he also was able to enjoy a lot of near-term liquidity, and he was the only person from SSI to join Meta.”
1. I love the way he put that. Yes.
“The main thing that distinguishes SSI is its technical approach. We have a different technical approach that I think is worthy and we are pursuing it. I maintain that in the end there will be a convergence of strategies. I think there will be a convergence of strategies where at some point, as AI becomes more powerful, it’s going to become more or less clearer to everyone what the strategy should be. It should be something like, you need to find some way to talk to each other and you want your first actual real superintelligent AI to be aligned and somehow care for sentient life, care for people, democratic, one of those, some combination thereof. I think this is the condition that everyone should strive for. That’s what SSI is striving for. I think that this time, if not already, all the other companies will realize that they’re striving towards the same thing. We’ll see. I think that the world will truly change as AI becomes more powerful. I think things will be really different and people will be acting really differently.”
1. This is a remarkably shallow, to me, vision of what the alignment part of the strategy looks like, but it does get an admirably large percentage of the overall strategic vision, as in most of it?
2. The idea that ‘oh as we move farther along people will get more responsible and cooperate more’ seems to not match what we have observed so far, alas.
3. Ilya later clarifies he specifically meant convergence on alignment strategies, although he also expects convergence on technical strategies.
4. The above statement is convergence on an alignment goal, but that doesn’t imply convergence on alignment strategy. Indeed it does not imply that an alignment strategy that is workable even exists.
Ilya’s timeline to the system that can learn and become superhuman? 5-20 years.
Ilya predicts that when someone releases the thing, that will be information, but it won’t teach others how to do the thing, although they will eventually learn.
What is the ‘good world’? We have powerful human-like learners and perhaps narrow ASIs, and companies make money, and there is competition through specialization, different niches. Accumulated learning and investment creates specialization.
1. This is so frustrating, in that it doesn’t explain why you would expect that to be how this plays out, or why this world turns out well, or anything really? Which would be fine if the answers were clear or at least those seemed likely, but I very much don’t think that.
2. This feels like a claim that humans are indeed near the upper limit of what intelligence can do and what can be learned except that we are hobbled in various ways and AIs can be unhobbled, but that still leaves them functioning in ways that seem recognizably human and that don’t crowd us out? Except again I don’t think we should expect this.
Dwarkesh points out current LLMs are similar, Ilya says perhaps the datasets are not as non-overlapping as they seem.
1. On the contrary, I was assuming they were mostly the same baseline data, and then they do different filtering and progressions from there? Not that there’s zero unique data but that most companies have ‘most of the data.’
Dwarkesh suggests, therefore AIs will have less diversity than human teams. How can we get ‘meaningful diversity’? Ilya says this is because of pretraining, that post training is different.
1. To the extent that such ‘diversity’ is useful it seems easy to get with effort. I suspect this is mostly another way to create human copium.
What about using self-play? Ilya notes it allows using only compute, which is very interesting, but it is only good for ‘developing a certain set of skills.’ Negotiation, conflict, certain social strategies, strategizing, that kind of stuff. Then Ilya self-corrects, notes other forms, like debate, prover-verifier or forms of LLM-as-a-judge, it’s a special case of agent competition.
1. I think there’s a lot of promising unexplored space here, decline to say more.

What is research taste? How does Ilya come up with many big ideas?

This is hard to excerpt and seems important, so quoting in full to close out:

I can comment on this for myself. I think different people do it differently. One thing that guides me personally is an aesthetic of how AI should be, by thinking about how people are, but thinking correctly. It’s very easy to think about how people are incorrectly, but what does it mean to think about people correctly?

I’ll give you some examples. The idea of the artificial neuron is directly inspired by the brain, and it’s a great idea. Why? Because you say the brain has all these different organs, it has the folds, but the folds probably don’t matter. Why do we think that the neurons matter? Because there are many of them. It kind of feels right, so you want the neuron. You want some local learning rule that will change the connections between the neurons. It feels plausible that the brain does it.

The idea of the distributed representation. The idea that the brain responds to experience therefore our neural net should learn from experience. The brain learns from experience, the neural net should learn from experience. You kind of ask yourself, is something fundamental or not fundamental? How things should be.

I think that’s been guiding me a fair bit, thinking from multiple angles and looking for almost beauty, beauty and simplicity. Ugliness, there’s no room for ugliness. It’s beauty, simplicity, elegance, correct inspiration from the brain. All of those things need to be present at the same time. The more they are present, the more confident you can be in a top-down belief.

The top-down belief is the thing that sustains you when the experiments contradict you. Because if you trust the data all the time, well sometimes you can be doing the correct thing but there’s a bug. But you don’t know that there is a bug. How can you tell that there is a bug? How do you know if you should keep debugging or you conclude it’s the wrong direction? It’s the top-down. You can say things have to be this way. Something like this has to work, therefore we’ve got to keep going. That’s the top-down, and it’s based on this multifaceted beauty and inspiration by the brain.

I need to think more about what causes my version of ‘research taste.’ It’s definitely substantially different.

That ends our podcast coverage, and we enter the bonus section, which seems better here than in the weekly, as it covers many of the same themes.

Dwarkesh Patel offers his thoughts on AI progress these days, noticing that when we get the thing he calls ‘actual AGI’ things are going to get fucking crazy, but thinking that this is 10-20 years away from happening in full. Until then, he’s a bit skeptical of how many gains we can realize, but skepticism is highly relative here.

Dwarkesh Patel: I’m confused why some people have short timelines and at the same time are bullish on RLVR. If we’re actually close to a human-like learner, this whole approach is doomed.

… Either these models will soon learn on the job in a self directed way – making all this pre-baking pointless – or they won’t – which means AGI is not imminent. Humans don’t have to go through a special training phase where they need to rehearse every single piece of software they might ever use.

Wow, look at those goalposts move (in all the different directions). Dwarkesh notes that the bears keep shifting on the bulls, but says this is justified because current models fit the old goals but don’t score the points, as in they don’t automate workflows as much as you would expect.

In general, I worry about the expectation pattern having taken the form of ‘median 50 years → 20 → 10 → 5 → 7, and once I heard someone say 3, so oh nothing to see there you can stop worrying.’

In this case, look at the shift: An ‘actual’ (his term) AGI must now not only be capable of human-like performance of tasks, the AGI must also be a human-efficient learner.

That would mean AGI and ASI are the same thing, or at least arrive in rapid succession. An AI that was human-efficient at learning from data, combined with AI’s other advantages that include imbibing orders of magnitude more data, would be a superintelligence and would absolutely set off recursive self-improvement from there.

And yes, if that’s what you mean then AGI isn’t the best concept for thinking about timelines, and superintelligence is the better target to talk about. Sriram Krishnan is however opposed to using either of them.

Like all conceptual handles or fake frameworks, it is imprecise and overloaded, but people’s intuitions about it miss that the thing is possible or exists even when you outright say ‘superintelligence’ and I shudder to think how badly they will miss the concept if you don’t even say it. Which I think is a lot of the motivation behind not wanting to say it, so people can pretend that there won’t be things smarter than us in any meaningful sense and thus we can stop worrying about it or planning for it.

Indeed, this is exactly Sriram’s agenda if you look at his post here, to claim ‘we are not on the timeline’ that involves such things, to dismiss concerns as ‘sci-fi’ or philosophical, and talk instead of ‘what we are trying to build.’ What matters is what actually gets built, not what we intended, and no none of these concepts have been invalidated. We have ‘no proof of takeoff’ in the sense that we are not currently in a fast takeoff yet, but what would constitute this ‘proof’ other than already being in a takeoff, and thus it being too late to do anything about it?

Sriram Krishnan: …most importantly, it invokes fear—connected to historical usage in sci-fi and philosophy (think 2001, Her, anything invoking the singularity) that has nothing to do with the tech tree we’re actually on. Makes every AI discussion incredibly easy to anthropomorphize and detour into hypotheticals.

Joshua Achiam (OpenAI Head of Mission Alignment): I mostly disagree but I think this is a good contribution to the discourse. Where I disagree: I do think AGI and ASI both capture something real about where things are going. Where I agree: the lack of agreed-upon definitions has 100% created many needless challenges.

The idea that ‘hypotheticals,’ as in future capabilities and their logical consequences, are ‘detours,’ or that any such things are ‘sci-fi or philosophy’ is to deny the very idea of planning for future capabilities or thinking about the future in real ways. Sriram himself only thinks they are 10 years away, and then the difference is he doesn’t add Dwarkesh’s ‘and that’s fucking crazy’ and instead seems to effectively say ‘and that’s a problem for future people, ignore it.’

Seán Ó hÉigeartaigh: I keep noting this, but I do think a lot of the most heated policy debates we’re having are underpinned by a disagreement on scientific view: whether we (i) are on track in coming decade for something in the AGI/ASI space that can achieve scientific feats equivalent to discovering general relativity (Hassabis’ example), or (ii) should expect AI as a normal technology (Narayanan & Kapoor’s definition).

I honestly don’t know. But it feels premature to me to rule out (i) on the basis of (slightly) lengthening timelines from the believers, when progress is clearly continuing and a historically unprecedented level of resources are going into the pursuit of it. And premature to make policy on the strong expectation of (ii). (I also think it would be premature to make policy on the strong expectation of (i) ).

But we are coming into the time where policy centred around worldview (ii) will come into tension in various places with the policies worldview (i) advocates would enact if given a free hand. Over the coming decade I hope we can find a way to navigate a path between, rather than swing dramatically based on which worldview is in the ascendancy at a given time.

Sriram Krishnan: There is truth to this.

This paints it as two views, and I would say you need at least three:

Something in the AGI/ASI space is likely in less than 10 years.
Something in the AGI/ASI space is unlikely in less than about 10 years, but highly plausible in 10-20 years, until then AI is a normal technology.
AI is a normal technology and we know it will remain so indefinitely. We can regulate and plan as if AGI/ASI style technologies will never happen.

I think #1 and #2 are both highly reasonable positions, only #3 is unreasonable, while noting that if you believe #2 you still need to put some non-trivial weight on #1. As in, if you think it probably takes ~10 years then you can perhaps all but rule out AGI 2027, and you think 2031 is unlikely, but you cannot claim 2031 is a Can’t Happen.

The conflation to watch out for is #2 and #3. These are very different positions. Yet many in the AI industry, and its political advocates, make exactly this conflation. They assert ‘#1 is incorrect therefore #3,’ when challenged for details articulate claim #2, then go back to trying to claim #3 and act on the basis of #3.

What’s craziest is that the list of things to rule out, chosen by Sriram, includes the movie Her. Her made many very good predictions. Her was a key inspiration for ChatGPT and its voice mode, so much so that there was a threatened lawsuit because they all but copied Scarlett Johansson’s voice. She’s happening. Best be believing in sci-fi stories, because you’re living in one, and all that.

Nothing about current technology is a reason to think 2001-style things or a singularity will not happen, or to think we should anthropomorphize AI relatively less (the correct amount for current AIs, and for future AIs, are both importantly not zero, and importantly not 100%, and both mistakes are frequently made). Indeed, Dwarkesh is de facto predicting a takeoff and a singularity in this post that Sriram praised, except Dwarkesh has it on a 10-20 year timescale to get started.

Now, back to Dwarkesh.

This process of ‘teach the AI the specific tasks people most want’ is the central instance of models being what Teortaxes calls usemaxxed. A lot of effort is going to specific improvements rather than to advancing general intelligence. And yes, this is evidence against extremely short timelines. It is also, as Dwarkesh notes, evidence in favor of large amounts of mundane utility soon, including ability to accelerate R&D. What else would justify such massive ‘side’ efforts?

There’s also, as he notes, the efficiency argument. Skills many people want should be baked into the core model. Dwarkesh fires back that there are a lot of skills that are instance-specific and require on-the-job or continual learning, which he’s been emphasizing a lot for a while. I continue to not see a contradiction, or why it would be that hard to store and make available that knowledge as needed even if it’s hard for the LLM to permanently learn it.

I strongly disagree with his claim that ‘economic diffusion lag is cope for missing capabilities.’ I agree that many highly valuable capabilities are missing. Some of them are missing due to lack of proper scaffolding or diffusion or context, and are fundamentally Skill Issues by the humans. Others are foundational shortcomings. But the idea that the AIs aren’t up to vastly more tasks than they’re currently asked to do seems obviously wrong?

He quotes Steven Byrnes:

Steven Byrnes: New technologies take a long time to integrate into the economy? Well ask yourself: how do highly-skilled, experienced, and entrepreneurial immigrant humans manage to integrate into the economy immediately? Once you’ve answered that question, note that AGI will be able to do those things too.

Again, this is saying that AGI will be as strong as humans in the exact place it is currently weakest, and will not require adjustments for us to take advantage. No, it is saying more than that, it is also saying we won’t put various regulatory and legal and cultural barriers in its way, either, not in any way that counts.

If the AGI Dwarkesh is thinking about were to exist, again, it would be an ASI, and it would be all over for the humans very quickly.

I also strongly disagree with human labor not being ‘schleppy to train’ (bonus points, however, for excellent use of ‘schleppy’). I have trained humans and been a human being trained, and it is totally schleppy. I agree, not as schleppy as current AIs can be when something is out of their wheelhouse, but rather obnoxiously schleppy everywhere except their own very narrow wheelhouse.

Here’s another example of ‘oh my lord check out those goalposts’:

Dwarkesh Patel: It revealed a key crux between me and the people who expect transformative economic impacts in the next few years.

Transformative economic impacts in the next few years would be a hell of a thing.

It’s not net-productive to build a custom training pipeline to identify what macrophages look like given the way this particular lab prepares slides, then another for the next lab-specific micro-task, and so on. What you actually need is an AI that can learn from semantic feedback on the job and immediately generalize, the way a human does.

Well, no, it probably isn’t now, but also Claude Code is getting rather excellent at creating training pipelines, and the whole thing is rather standard in that sense, so I’m not convinced we are that far away from doing exactly that. This is an example of how sufficient ‘AI R&D’ automation, even on a small non-recursive scale, can transform use cases.

Every day, you have to do a hundred things that require judgment, situational awareness, and skills & context learned on the job. These tasks differ not just across different people, but from one day to the next even for the same person. It is not possible to automate even a single job by just baking in some predefined set of skills, let alone all the jobs.

Well, I mean of course it is, for a sufficiently broad set of skills at a sufficiently high level, especially if this includes meta-skills and you can access additional context. Why wouldn’t it be? It certainly can quickly automate large portions of many jobs, and yes I have started to automate portions of my job indirectly (as in Claude writes me the mostly non-AI tools to do it, and adjusts them every time they do something wrong).

Give it a few more years, though, and Dwarkesh is on the same page as I am:

In fact, I think people are really underestimating how big a deal actual AGI will be because they’re just imagining more of this current regime. They’re not thinking about billions of human-like intelligences on a server which can copy and merge all their learnings. And to be clear, I expect this (aka actual AGI) in the next decade or two. That’s fucking crazy!

Exactly. This ‘actual AGI’ is fucking crazy, and his timeline for getting there of 10-20 years is also fucking crazy. More people need to add ‘and that’s fucking crazy’ at the end of such statements.

Dwarkesh then talks more about continual learning. His position here hasn’t changed, and neither has my reaction that this isn’t needed, we can get the benefits other ways. He says that the gradual progress on continual learning means it won’t be ‘game set match’ to the first mover, but if this is the final piece of the puzzle then why wouldn’t it be?

On Dwarkesh Patel’s Podcast With Andrej Karpathy

Dwarkesh / Mike M. / October 21, 2025

Some podcasts are self-recommending on the ‘yep, I’m going to be breaking this one down’ level. This was very clearly one of those. So here we go.

As usual for podcast posts, the baseline bullet points describe key points made, and then the nested statements are my commentary.

If I am quoting directly I use quote marks, otherwise assume paraphrases.

Rather than worry about timestamps, I’ll use YouTube’s section titles, as it’s not that hard to find things via the transcript as needed.

This was a fun one in many places, interesting throughout, frustrating in similar places to where other recent Dwarkesh interviews have been frustrating. It gave me a lot of ideas, some of which might even be good.

Andrej calls this the ‘decade of agents’ contrary to (among others who have said it) the Greg Brockman declaration that 2025 is the ‘year of agents,’ as there is so much work left to be done. Think of AI agents as employees or interns, that right now mostly can’t do the things due to deficits of intelligence and context.
1. I agree that 2025 as the year of the agent is at least premature.
2. You can defend the 2025 claim if you focus on coding, Claude Code and Codex, but I think even there it is more confusing than helpful as a claim.
3. I also agree that we will be working on improving agents for a long time.
4. 2026 might be the proper ‘year of the agent’ as when people start using AI agents for a variety of tasks and getting a bunch of value from them, but they will still have a much bigger impact on the world in 2027, and again in 2028.
5. On the margin and especially outside of coding, I think context and inability to handle certain specific tasks (especially around computer use details) are holding things back right now more than intelligence. A lot of it seems eminently solvable quickly in various ways if one put in the work.
Dwarkesh points to lack of continual learning or multimodality, but notes it’s hard to tell how long it will take. Andrej says ‘well I have 15 years of prediction experience and intuition and I average things out and it feels like a decade to me.’
1. A decade seems like an eternity to me on this.
2. If it’s to full AGI it is slow but less crazy. So perhaps this is Andrej saying that to count as an agent for this the AI needs to essentially be AGI.
AI has had a bunch of seismic shifts, Andrej has seen at least two and they seem to come with regularity. Neural nets used to be a niche thing before AlexNet but they were still trained per-task, the focus on Atari and other games was a mistake because you want to interact with the ‘real world’. Then LLMs. The common mistake was trying to “get the full thing too early” and especially aiming at agents too soon.
1. The too soon thing seems true and important. You can’t unlock capabilities in a useful way until you have the groundwork and juice for them.
2. Once you do have the groundwork and juice, they tend to happen quickly, without having to do too much extra work.
3. In general, seems like if something is super hard to do, better if you wait?
4. However you can with focused effort make a lot of progress beyond what you’d get at baseline, even if that ultimately stalls out, as seen by the Atari and Universe examples.
Dwarkesh asks what about the Sutton perspective, should you be able to throw an AI out there into the world the way you would a human or animal and just work with and ‘grow up’ via sensory data? Andrej points to his response to Sutton, that biological brains work via a very different process, we’re building ghosts not animals, although we should make them more ‘animal-like’ over time. But animals don’t do what Sutton suggests, they use an evolutionary outer loop. Animals only use RL for non-intelligence tasks, things like motor skills.
1. I think humans do use RL on intelligence tasks? My evidence for this is that when I use this model of humans it seems to make better predictions, both about others and about myself.
2. Humans are smarter about this than ‘pure RL’ of course, including being the meta programmer and curating their own training data.
Dwarkesh contrasts pre-training with evolution in that evolution compacts all info into 3 GB of DNA, thus evolution is closer to finding a lifetime learning algorithm. Andrej agrees there is miraculous compression in DNA and that it includes learning algorithms, but we’re not here to build animals, only useful things, and they’re ‘crappy’ but what we know how to build are the ghosts. Dwarkesh says evolution does not give us knowledge, it gives us the algorithm to find knowledge a la Sutton.
1. Dwarkesh is really big on the need for continual (or here he says ‘lifetime’) learning and the view that it is importantly distinct from what RL does.
2. I’m not convinced. As Dario points out, in theory you can put everything in the context window. You can do a lot better on memory and imitating continual learning than that with effort, and we’ve done remarkably little on such fronts.
3. The actual important difference to me is more like sample efficiency. I see ways around that problem too, but am not putting them in this margin.
4. I reiterate that evolution actually does provide a lot of knowledge, actually, or the seeds to getting specific types of knowledge, using remarkably few bits of data to do this. If you buy into too much ‘blank slate’ you’ll get confused.
Andrej draws a distinction between the neural net picking up all the knowledge in its training data versus it becoming more intelligent, and often you don’t even want the knowledge, we rely on it too much, and this is part of why agents are bad at “going off the data manifold of what exists on the internet.” We want the “cognitive core.”
1. I buy that you want to minimize the compute costs associated with carrying lots of extra information, so for many tasks you want a Minimum Viable Knowledge Base. I don’t buy that knowledge tends to get in the way. If it does, then Skill Issue.
2. More knowledge seems hard to divorce fully from more intelligence. A version of me that was abstractly ‘equally smart,’ but which knew far less, might technically have the same Intelligence score on the character sheet, but a lot lower Wisdom and would effectively be kind of dumb. See young people.
3. I’m skeptical about a single ‘cognitive core’ for similar reasons.
Dwarkesh reiterates in-context learning as ‘the real intelligence’ as distinct from gradient descent. Andrej agrees it’s not explicit, it’s “pattern completion within a token window” but notes there’s tons of patterns on the internet that get into the weights, and it’s possible in-context learning runs a small gradient descent loop inside the neural network. Dwarkesh asks, “why does it feel like with in-context learning we’re getting to this continual learning, real intelligence-like thing? Whereas you don’t get the analogous feeling just from pre-training.”
1. My response would basically again be sample efficiency, and the way we choose to interact with LLMs being distinct from the training? I don’t get this focus on (I kind of want to say fetishization of?) continual learning as a distinct thing. It doesn’t feel so distinct to me.
Dwarkesh asks, how much of the information from training gets stored in the model? He compares KV cache of 320 kilobytes to a full 70B model trained on 15 trillion tokens. Andrej thinks models get a ‘hazy recollection’ of what happened in training, the compression is dramatic to get 15T tokens into 70B parameters.
1. Is it that dramatic? Most tokens don’t contain much information, or don’t contain new information. In some ways 0.5% (70B vs. 15T) is kind of a lot. It depends on what you care about. If you actually had to put it all in the 320k KV Cache that’s a lot more compression.
2. As Andrej says, it’s not enough, so you get much more precise answers about texts if you have the full text in the context window. Which is also true if you ask humans about the details of things that mostly don’t matter.
What part about human intelligence have we most failed to replicate? Andrej says ‘a lot of it’ and starts discussing physical brain components causing “these cognitive deficits that we all intuitively feel when we talk to the models.”
1. I feel like that’s a type mismatch. I want to know what capabilities are missing, not which physical parts of the brain? I agree that intuitively some capabilities are missing, but I’m not sure how essential this is, and as Andrej suggests we shouldn’t be trying to build an analog of a human.
Dwarkesh turns back to continual learning, asks if it will emerge spontaneously if the model gets the right incentives. Andrej says no, that sleep does this for humans where ‘the context window sometimes sticks around’ and there’s no natural analog, but we want a way to do this, and points to sparse attention.
1. I’m not convinced we know how the sleep or ‘sticking around’ thing works, clearly there is something going on somewhere.
2. I agree this won’t happen automatically under current techniques, but we can use different techniques, and again I’m getting the Elle Woods ‘what, like it’s hard?’ reaction to all this, where ‘hard’ is relative to problem importance.
Andrej kind of goes Lindy, pointing to translation invariance to expect algorithmic and other changes at a similar rate to the past, and pointing to the many places he says we’d need gains in order to make further progress, that various things are ‘all surprisingly equal,’ it needs to improve ‘across the board.’
1. Is this the crux, the fundamental disagreement about the future, in two ways?
2. The less important one is the idea that progress requires all of [ABCDE] to make progress. That seems wrong to me. Yes, you are more efficient if you make progress more diffusely under exponential scaling laws, but you can still work around any given deficit via More Dakka.
3. As a simple proof by hypothetical counterexample, suppose I held one of his factors (e.g. architecture, optimizer, loss function) constant matching GPT-3, but could apply modern techniques and budgets to the others. What do I get?
4. More importantly, Andrej is denying the whole idea that technological progress here or in general is accelerating, or will accelerate. And that seems deeply wrong on multiple levels?
5. For this particular question, progress has been rapid, investments of all kinds have been huge, and already we are seeing AI directly accelerate AI progress substantially, a process that will accelerate even more as AI gets better, even if it doesn’t cross into anything like full automated AI R&D or a singularity, and we keep adding more ways to scale. It seems rather crazy to expect 2025 → 2035 to be similar to 2015 → 2025 in AI, on the level of ‘wait, you’re suggesting what?’
6. In the longer arc of history, if we’re going to go there, we see a clear acceleration of time. So we have the standard several billion years to get multicellular life, several hundred million years to get close to human intelligence, several hundred thousand to million years to get agriculture and civilization, several thousand years to get the industrial revolution, several hundred years to get the information age, several dozen years to get AI to do anything useful on the general intelligence front, several ones of years to go from ‘anything useful at all’ to GPT-5 and Sonnet 4.5 being superhuman in many domains already.
7. I think Andrej makes better arguments for relatively long (still remarkably short!) timelines later, but him invoking this gives me pause.
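To make the 70B-versus-15T comparison from the storage discussion above concrete, here is a minimal back-of-the-envelope sketch. The parameter and token counts are the ones quoted in the discussion. The 16-bits-per-weight figure is my assumption for illustration, not something stated in the podcast, and the function names are hypothetical.

```python
# Figures quoted in the discussion above.
PARAMS = 70e9    # a 70B-parameter model
TOKENS = 15e12   # 15 trillion training tokens

# ASSUMPTION (not from the discussion): each weight is stored at 16 bits.
BITS_PER_PARAM = 16


def params_per_token(params: float, tokens: float) -> float:
    """Ratio of model parameters to training tokens."""
    return params / tokens


def weight_bits_per_token(params: float, tokens: float, bits_per_param: float) -> float:
    """Bits of weight storage available per training token seen."""
    return params * bits_per_param / tokens


ratio = params_per_token(PARAMS, TOKENS)
bits = weight_bits_per_token(PARAMS, TOKENS, BITS_PER_PARAM)

print(f"parameters per training token: {ratio:.2%}")
print(f"weight bits per training token: {bits:.3f}")
```

Under these assumptions the ratio comes out a little under half a percent, which is where the rough ‘0.5%’ comes from. Whether that counts as dramatic compression depends on how much new information you think the average token carries.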

Andrej found LLMs of little help when assembling his new repo nanochat, which is an 8k-line set of all the things you need for a minimal ChatGPT clone. He still used autocomplete, but vibe coding only works with boilerplate stuff. In particular, the models ‘remember wrong’ from all the standard internet ways of doing things, that he wasn’t using. For example, he did his own version of a DDP container inside the code, and the models couldn’t comprehend that and kept trying to use DDP instead. Whereas he only used vibe coding for a few boilerplate style areas.
1. I’ve noticed this too. LLMs will consistently make the same mistakes, or try to make the same changes, over and over, to match their priors.
2. It’s a reasonable prior to think things like ‘oh almost no one would ever implement a version of DDP themselves,’ the issue is that they aren’t capable of being told that this happened and having this overcome that prior.
“I also feel like it’s annoying to have to type out what I want in English because it’s too much typing. If I just navigate to the part of the code that I want, and I go where I know the code has to appear and I start typing out the first few letters, autocomplete gets it and just gives you the code. This is a very high information bandwidth to specify what you want.”
1. As a writer this resonates so, so much. There are many tasks where in theory the LLM could do it for me, but by the time I figure out how to get the LLM to do it for me, I might as well have gone and done it myself.
2. Whereas the autocomplete in gmail is actually good enough that it’s worth my brain scanning it to see if it’s what I wanted to type (or on occasion, a better version).
Putting it together: LLMs are very good at code that has been written many times before, and poor at code that has not been written before, in terms of the structure and conditions behind the code. Code that has been written before on rare occasions is in between. The models are still amazing, and can often help. On the vibe coding: “I feel like the industry is making too big of a jump and is trying to pretend like this is amazing, and it’s not. It’s slop.”
1. There’s a big difference between the value added when you can successfully vibe code large blocks of code, versus when you can get answers to questions, debugging notes and stuff like that.
2. The second category can still be a big boost to productivity, including to AI R&D, but isn’t going to go into crazy territory or enter into recursion mode.
3. I presume Andrej is in a position where his barrier for ‘not slop’ is super high and the problems he works on are unusually hostile as well.
4. I do think these arguments are relevant evidence for longer timelines until crazy happens, that we risk overestimating the progress made on vibe coding.
Andrej sees all of computing as a big recursive self-improvement via things like code editors and syntax highlighting and even data checking and search engines, in a way that is continuous with AI. Better autocomplete is the next such step. We’re abstracting, but it is slow.
1. One could definitely look at it this way. It’s not obvious what that reframing pushes one towards.

How should we think about humans being able to build a rich world model from interactions with the environment, without needing final reward? Andrej says they don’t do RL, they do something different, whereas RL is terrible but everything else we’ve tried has been worse. All RL can do is check the final answers, and say ‘do more of this’ when it works. A human would evaluate parts of the process, an LLM can’t and won’t do this.
1. So yeah, RL is like democracy. Fair enough.
2. Why can’t we set up LLMs to do the things human brains do here? Not the exact same thing, but something built on similar principles?
3. I mean it seems super doable to me, but if you want me to figure out how to do it or actually try doing it the going rate is at least $100 million. Call me.
Dwarkesh does ask why, or at least about process supervision. Andrej says it is tricky how to do that properly, how do you assign credit to partial solutions? Labs are trying to use LLM judges but this is actually subtle, and you’ll run into adversarial examples if you do it for too long. It finds out that dhdhdhdh was an adversarial example so it starts outputting that, or whatever.
1. So then you… I mean I presume the next 10 things I would say here have already been tried and they fail but I’m not super confident in that.
So train models to be more robust? Find the adversarial examples and fix them one at a time won’t work, there will always be another one.
1. Certainly ‘find the adversarial examples and fix them one at a time’ is an example of ‘how to totally fail OOD or at the alignment problem,’ you would need a way to automatically spot when you’re invoking one.

What about the thing where humans sleep or daydream, or reflect? Is there some LLM analogy? Andrej says basically no. When an LLM reads a book it predicts the next token, when a human does they do synthetic data generation, talk about it with their friends, manipulate the info to gain knowledge. But doing this with LLMs is nontrivial, for reasons that are subtle and hard to understand, and if you generate synthetic data to train on that makes the model worse, because the examples are silently collapsed, similar to how they know like 3 total jokes. LLMs don’t retain entropy, and we don’t know how to get them to retain it. “I guess what I’m saying is, say we have a chapter of a book and I ask an LLM to think about it, it will give you something that looks very reasonable. But if I ask it 10 times, you’ll notice that all of them are the same. Any individual sample will look okay, but the distribution of it is quite terrible.”
1. I wish Andrej’s answer here was like 5 minutes longer. Or maybe 50 minutes.
2. In general, I’m perhaps not typical, but I’d love to hear the ‘over your head’ version where he says a bunch of things that gesture in various directions, and it’s up to you whether you want to try and understand it.
3. I mean from the naive perspective this has ‘skill issue’ written all over it, and there’s so many things I would want to try.
“I think that there’s possibly no fundamental solution to this. I also think humans collapse over time. These analogies are surprisingly good. Humans collapse during the course of their lives. This is why children, they haven’t overfit yet… We end up revisiting the same thoughts. We end up saying more and more of the same stuff, and the learning rates go down, and the collapse continues to get worse, and then everything deteriorates.”
1. I feel this.
2. That means both in myself, and in my observations of others.
3. Mode collapse in humans is evolutionarily and strategically optimal, under conditions of aging and death. If you’re in exploration, pivot to exploitation.
4. We also have various systems to fight this and pivot back to exploration.
5. One central reason humans get caught in mode collapse, when we might not want that, is myopia and hyperbolic discounting.
6. Another is, broadly speaking, ‘liquidity or solvency constraints.’
7. A third would be commitments, signaling, loyalty and so on.
8. If we weren’t ‘on the clock’ due to aging, which both cuts the value of exploration and also raises the difficulty of it, I think those of us who cared could avoid mode collapse essentially indefinitely.
9. Also I notice [CENSORED] which has obvious deep learning implications?
Could dreaming be a way to avoid mode collapse by going out of distribution?
1. I mean, maybe, but the price involved seems crazy high for that.
2. I worry that we’re using ‘how humans do it’ as too much of a crutch.
Andrej notes you should always be seeking entropy in your life, suggesting talking to other people.
1. There are lots of good options. I consume lots of text tokens.
What’s up with children being great at learning, especially things like languages, but terrible at remembering experiences or specific information? LLMs are much better than humans at memorization, and this can be a distraction.
1. I’m not convinced this is actually true?
2. A counterpoint is that older people learn harder things, and younger people, especially young children, simply cannot learn those things at that level, or would learn them a lot slower.
3. Another counterpoint is that a lot of what younger humans learn is at least somewhat hard coded into the DNA to be easier to learn, and also are replacing nothing which helps you move a lot faster and seem to be making a lot more progress.
4. Languages are a clear example of this. I say this as someone with a pretty bad learning disability for languages, who has tried very hard to pick up various additional languages and failed utterly.
5. A third counterpoint is that children really do put a ton of effort into learning, often not that efficiently (e.g. rewatching and rereading the same shows and books over and over, repeating games and patterns and so on), to get the information they need. Let your children play, but that’s time intensive. Imagine what adults can and do learn when they truly have no other responsibilities and go all-in on it.
How do you solve model collapse? Andrej doesn’t know, the models be collapsed, and Dwarkesh points out RL punishes output diversity. Perhaps you could regularize entropy to be higher, it’s all tricky.
Andrej says state of the art models have gotten smaller, and he still thinks they memorized too much and we should seek a small cognitive core.
1. He comes back to this idea that knowing things is a disadvantage. I don’t get it. I do buy that smaller models are more efficient, especially with inference scaling, and so this is the best practical approach for now.
2. My prediction is that the cognitive core hypothesis is wrong, and that knowledge and access to diverse context is integral to thinking, especially high entropy thinking. I don’t think a single 1B model is going to be a good way to get any kind of conversation you want to have.
3. There are people who have eidetic memories. They can have a hard time taking advantage because working memory remains limited, and they don’t filter for the info worth remembering or abstracting out of them. So there’s some balance at some point, but I definitely feel like remembering more things than I do would be better? And that I have scary good memory and memorization in key points, such as ability (for a time, anyway) to recall the exact sequence of entire Magic games and tournaments, which is a pattern you also see from star athletes – you ask Steph Curry or LeBron James and they can tell you every detail of every play.
Most of the internet tokens are total garbage, stock tickers, symbols, huge amounts of slop, and you basically don’t want that information.
1. I’m not sure you don’t want that information? It’s weird. I don’t know enough to say. Obviously it would not be hard to filter such tokens out at this point, so they must be doing something useful. I’m not sure it’s due to memorization, but I also don’t see why the memorization would hurt.
They go back and forth over the size of the supposed cognitive core, Dwarkesh asks why not under 1 billion, Andrej says you probably need a billion knobs and he’s already contrarian being that low.
1. Whereas yeah, I think 1 billion is not enough and this is the wrong approach entirely unless you want to e.g. do typical simple things within a phone.

Wait what?

Note: The 2% number doesn’t actually come up until the next section on ASI.

How to measure progress? Andrej doesn’t like education level as a measure of AI progress (I agree), he’s also not a fan of the famous METR horizon length graph and is tempted to reject the whole question. He’s sticking with AGI as ‘can do any economically valuable task at human performance or better.’
1. And you’re going to say having access to ‘any economically valuable (digital) task at human performance or better’ only is +2% GDP growth? Really?
2. You have to measure something you call AI progress, since you’re going to manage it. Also people will ask constantly and use it to make decisions. If nothing else, you need an estimate of time to AGI.
He says only 10%-20% of the economy is ‘only knowledge work.’
1. I asked Sonnet. McKinsey 2012 finds knowledge work accounted for 31 percent of all workers in America in 2010. Sonnet says 30%-35% pure knowledge work, 12%-17% pure manual, the rest some hybrid, split the rest in half, you get 60% knowledge work by task, but the knowledge work typically is about double the economic value of the non-knowledge work, so we’re talking on the order of 75% of all economic value.
2. How much would this change Andrej’s other estimates, given this is more than triple his estimate?
Andrej points to the famous predictions of automating radiology, and suggests what we’ll do more often is have AI do 80% of the volume, then delegate 20% to humans.
1. Okay, sure, that’s a kind of intermediate step, we might do that for some period of time. If so, let’s say that for 75% of economic value we have the AI provide 60% of the value, assuming the human part is more valuable. So it’s providing 45% of all economic value if composition of ‘labor including AI’ does not change.
2. Except of course if half of everything now has marginal cost epsilon (almost but not quite zero), then there will be a large shift in composition to doing more of those tasks.
Dwarkesh compares radiologists to early Waymos where they had a guy in the front seat that never did anything so people felt better, and similarly if an AI can do 99% of a job the human doing the 1% can still be super valuable because bottleneck. Andrej points out radiology turns out to be a bad example for various reasons, suggests call centers.
1. If you have 99 AI tasks and 1 human task, and you can’t do the full valuable task without all 100 actions, then in some sense the 1 human task is super valuable.
2. In another sense, it’s really not, especially if any human can do it and there is now a surplus of humans available. Market price might drop quite low.
3. Wages don’t go up as you approach 99% AI, as Dwarkesh suggests they could, unless you’re increasingly bottlenecked on available humans due to a Jevons Paradox situation or hard limit on supply, both of which are the case in radiology, or this raises required skill levels. This is especially true if you’re automating a wide variety of tasks and there is less demand for labor.
Dwarkesh points out that we don’t seem to be on an AGI paradigm, we’re not seeing large productivity improvements for consultants and accountants. Whereas coding was a perfect fit for a first task, with lots of ready-made places to slot in an AI.
1. Skill issue. My lord, skill issue.
2. Current LLMs can do accounting out of the box, they can automate a large percentage of that work, and they can enable you to do your own accounting. If you’re an accountant and not becoming more productive? That’s on you.
3. That will only advance as AI improves. A true AGI-level AI could very obviously do most accounting tasks on its own.
4. Consultants should also be getting large productivity boosts on the knowledge work part of their job, including learning things, analyzing things and writing reports and so on. To the extent their job is to sell themselves and convince others to listen to them, AI might not be good enough yet.
5. Andrej asks about automating creating slides. If AI isn’t helping you create slides faster, I mean, yeah, skill issue, or at least scaffolding issue.
Dwarkesh says Andy Matuschak tried 50 billion things to get LLMs to write good spaced repetition prompts, and they couldn’t do it.
1. I do not understand what went wrong with the spaced repetition prompts. Sounds like a fun place to bang one’s head for a while and seems super doable, although I don’t know what a good prompt would look like as I don’t use spaced repetition.
2. To me, this points towards skill issues, scaffolding issues and time required to git gud and solve for form factors as large barriers to AI value unlocks.
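
As a sanity check on the knowledge work arithmetic above, here is a minimal Python sketch. The inputs are the rough figures quoted in the text (Sonnet's ranges, the even split of hybrid work, the 2x value weight, the 60% AI value share). Taking midpoints of the ranges is my assumption, and the variable names are purely illustrative.

```python
# Back-of-envelope check of the knowledge work estimate discussed above.
# Inputs are the rough figures quoted in the text; using midpoints is an assumption.

pure_knowledge = (0.30 + 0.35) / 2   # midpoint of the 30%-35% range
pure_manual = (0.12 + 0.17) / 2      # midpoint of the 12%-17% range
hybrid = 1.0 - pure_knowledge - pure_manual

# Split hybrid work evenly between knowledge and non-knowledge tasks.
knowledge_by_task = pure_knowledge + hybrid / 2

# Assume knowledge work carries about double the economic value per task.
value_weight = 2.0
knowledge_by_value = (knowledge_by_task * value_weight) / (
    knowledge_by_task * value_weight + (1.0 - knowledge_by_task)
)

# Radiology-style split: AI does 80% of the volume but, by assumption,
# only 60% of the value, applied to the rounded 75% figure from the text.
ai_share_of_knowledge_value = 0.60
ai_share_of_all_value = 0.75 * ai_share_of_knowledge_value

print(f"knowledge work by task:  {knowledge_by_task:.0%}")
print(f"knowledge work by value: {knowledge_by_value:.0%}")
print(f"AI share of all value:   {ai_share_of_all_value:.0%}")
```

With midpoints this lands slightly under the rounded 60% and 75% figures used in the text, which is well within the precision of a Fermi estimate. The point is only that the chain of assumptions does get you to roughly three quarters of economic value, and from there to roughly 45% of value provided by AI.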

What about superintelligence? “I see it as a progression of automation in society. Extrapolating the trend of computing, there will be a gradual automation of a lot of things, and superintelligence will be an extrapolation of that. We expect more and more autonomous entities over time that are doing a lot of the digital work and then eventually even the physical work some amount of time later. Basically I see it as just automation, roughly speaking.”
1. That’s… not ASI. That’s intelligence denialism. AI as normal technology.
2. I took a pause here. It’s worth sitting with this for a bit.
3. Except it kind of isn’t, when you hear what he says later? It’s super weird.
Dwarkesh pushes back: “But automation includes the things humans can already do, and superintelligence implies things humans can’t do.” Andrej gives a strange answer: “But one of the things that people do is invent new things, which I would just put into the automation if that makes sense.”
1. No, it doesn’t make sense? I’m super confused what ‘just automation’ is supposed to meaningfully indicate?
2. If what we are automating is ‘being an intelligence’ then everything AI ever does is always ‘just automation’ but that description isn’t useful.
3. Humans can invent and do new things, but superintelligence can invent and do new things that are in practice not available to humans. ‘Invent new things’ is not the relevant natural category here.
Andrej worries about a gradual loss of control and understanding of what is happening, and thinks this is the most likely outcome. Multiple competing entities, initially competing on behalf of people, that gradually become more autonomous, some go rogue, others fight them off. They still get out of control.
1. No notes, really. That’s the baseline scenario if we solve a few other impossible-level problems (or get extremely lucky that they’re not as hard as they look to me) along the way.
2. Andrej doesn’t say ‘unless’ here, or offer a solution or way to prevent this.
3. Missing mood?
Dwarkesh asks, will we see an intelligence explosion if we have a million copies of you running in parallel super fast? Andrej says yes, but best believe in intelligence explosions because you’re already living in one and have been for decades, that’s why GDP grows, this is all continuous with the existing hyper-exponential trend, previous techs also didn’t make GDP go up much, everything was slow diffusion.
1. It’s so weird to say ‘oh, yeah, the million copies of me sped up a thousand times would just be more of the same slow growth trends, ho hum, intelligence explosion,’ “it’s just more automation.”
“We’re still going to have an exponential that’s going to get extremely vertical. It’s going to be very foreign to live in that kind of an environment.” … “Yes, my expectation is that it stays in the same [2% GDP growth rate] pattern.”
1. I… but… um… I… what?
2. Don’t you have to pick a side? He seems to keep trying to have his infinite cakes and eat them too, both an accelerating intelligence explosion and then magically GDP growth stays at 2% like it’s some law of nature.
“Self-driving as an example is also computers doing labor. That’s already been playing out. It’s still business as usual.”
1. Self-driving is a good example of slow diffusion of the underlying technology for various reasons. It’s been slow going, and mostly isn’t yet going.
2. This is a clear example of an exponential that hasn’t hit you yet. Self-driving cars are Covid-19 in January 2020, except they’re a good thing.
3. A Fermi estimate for car trips in America per week is around 2 billion, or for rideshares about 100 million per week.
4. Waymo got to 100,000 weekly rides in August 2024, was at 250,000 weekly rides in April 2025, we don’t yet have more recent data but this market estimates roughly 500,000 per week by year end. That’s 0.5% of taxi rides. The projection for end of year 2026 says maybe 1.5 million rides per week, 1.5%.
5. Currently the share of non-taxi rides that are full self-driving is essentially zero, maybe 0.2% of trips have meaningful self driving components.
6. So very obviously, for now, this isn’t going to show up in the productivity or GDP statistics overall, or at least not directly, although I do think this is a non-trivial rise in productivity and lived experience in areas where Waymos are widely available for those who use it, most importantly in San Francisco.
Karpathy keeps saying this will all be gradual capabilities gains and gradual diffusion, with no discrete jump. He suggests you would need some kind of overhang being unlocked such as a new energy source to see a big boost.
1. I don’t know how to respond to someone who thinks we’re in an intelligence explosion, but refuses to include any form of such feedback into their models.
2. That’s not shade, that’s me literally not knowing how to respond.
3. It’s very strange to not expect any overhangs to be unlocked. That’s saying that there aren’t going to be any major technological ideas that we have missed.
4. His own example is an energy source. If all ASI did was unlock a new method of cheap, safe, clean, unlimited energy, let’s say a design for fusion power plants, that were buildable in any reasonable amount of time, that alone would disrupt the GDP growth trend.
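
To put the self-driving numbers above in one place, here is a small Python sketch of the share calculation. The trip totals are the Fermi estimates from the text and the ride counts are the figures quoted there (the end-of-2025 and end-of-2026 values are a market estimate and a projection, not data). The names are hypothetical and only for illustration.

```python
# Share of weekly trips served by Waymo, using the rough figures from the text.
RIDESHARE_TRIPS_PER_WEEK = 100_000_000     # Fermi estimate from the text
ALL_CAR_TRIPS_PER_WEEK = 2_000_000_000     # Fermi estimate from the text

waymo_weekly_rides = {
    "Aug 2024": 100_000,
    "Apr 2025": 250_000,
    "end 2025 (market estimate)": 500_000,
    "end 2026 (projection)": 1_500_000,
}

def share(rides: int, total: int) -> float:
    """Fraction of total weekly trips represented by `rides`."""
    return rides / total

for label, rides in waymo_weekly_rides.items():
    of_rideshare = share(rides, RIDESHARE_TRIPS_PER_WEEK)
    of_all_trips = share(rides, ALL_CAR_TRIPS_PER_WEEK)
    print(f"{label}: {of_rideshare:.2%} of rideshare, {of_all_trips:.3%} of all car trips")
```

Even the end-of-2026 projection is a low single digit percentage of rideshare trips and a rounding error against all car trips, which is why none of this shows up in aggregate statistics yet.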

I won’t go further into the same GDP growth or intelligence explosion arguments I seem to discuss in many Dwarkesh Patel podcast posts. I don’t think Andrej has a defensible position here, in the sense that he is doing some combination of denying the premise of AGI/ASI, not taking into account its implications in some places while acknowledging the same dynamics in others.

Most of all, this echoes the common state of the discourse on such questions, which seems to involve:

1. You, the overly optimistic fool, say AGI will arrive in 2 years, or 5 years, and you say that when it happens it will be a discrete event and then everything changes.
    1. There is also you, the alarmist, saying this would kill everyone, cause us to lose control or otherwise stand risk of being a bad thing.
2. I, the wise world weary realist, say AGI will only arrive in 10 years, and it will be a gradual, continuous thing with no discrete jumps, facing physical bottlenecks and slow diffusion.
3. So therefore we won’t see a substantial change to GDP growth, your life will mostly seem normal, there’s no risk of extinction or loss of control, and so on, building sufficiently advanced technology of minds smarter, faster, cheaper and more competitive than ourselves along an increasing set of tasks will go great.
4. Alternatively, I, the proper cynic, realize AI is simply a ‘normal technology’ and it’s ‘just automation of some tasks’ and they will remain ‘mere tools’ and what are you getting on about, let’s go build some economic models.

I’m fine with those who expect to at first encounter story #2 instead of story #1.

Except it totally, absolutely does not imply #3. Yes, these factors can slow things down, and 10 years is more than 2-5 years, but 10 years is still not that much time, and a continuous transition ends up in the same place, and tacking on some years for diffusion also ends up in the same place. It buys you some time, which we might be able to use well, or we might not, but that’s it.

What about story #4, which to be clear is not Karpathy’s or Patel’s? It’s possible that AI progress stalls out soon and we get a normal technology, but I find it rather unlikely and don’t see why we should expect that. I think that it is quite poor form to treat this as any sort of baseline scenario.

Dwarkesh pivots to Nick Lane. Andrej is surprised evolution found intelligence and expects it to be a rare event among similar worlds. Dwarkesh suggests we got ‘squirrel intelligence’ right after the oxygenation of the atmosphere, which Sutton said was most of the way to human intelligence, yet human intelligence took a lot longer. They go over different animals and their intelligences. You need things worth learning but not worth hardcoding.
Andrej notes LLMs don’t have a culture, suggests it could be a giant scratchpad.
1. The backrooms? Also LLMs can and will have a culture because anything on the internet can become their context and training data. We already see this, with LLMs basing behaviors off observations of other prior LLMs, in ways that are often undesired.
Andrej mentions self-play, says that he thinks the models can’t create culture because they’re ‘still kids.’ Savant kids, but still kids.
1. Kids create culture all the time.
2. No, seriously, I watch my own kids create culture.
3. I’m not saying they in particular created a great culture, but there’s no question they’re creating culture.

Andrej was at Tesla leading self-driving from 2017 to 2022. Why did self-driving take a decade? Andrej says it isn’t done. It’s a march of nines (of reliability). Waymo isn’t economical yet, Tesla’s approach is more scalable, and to be truly done would mean people wouldn’t need a driver’s license anymore. But he agrees it is ‘kind of real.’
1. Kind of? I mean obviously self-driving can always improve, pick up more nines, get smoother, get faster, get cheaper. Waymo works great, and the economics will get there.
2. Andrej is still backing the Tesla approach, and maybe they will make fools of us all but for now I do not see it.
They draw parallels to AI and from AI to previous techs. Andrej worries we may be overbuilding compute, he isn’t sure, says he’s bullish on the tech but a lot of what he sees on Twitter makes no sense and is about fundraising or attention.
1. I find it implausible that we are overbuilding compute, but it is possible, and indeed if it was not possible then we would be massively underbuilding.
“I’m just reacting to some of the very fast timelines that people continue to say incorrectly. I’ve heard many, many times over the course of my 15 years in AI where very reputable people keep getting this wrong all the time. I want this to be properly calibrated, and some of this also has geopolitical ramifications and things like that with some of these questions. I don’t want people to make mistakes in that sphere of things. I do want us to be grounded in the reality of what technology is and isn’t.”
1. Key quote.
2. Andrej is not saying AGI is far in any normal person sense, or that its impact will be small, as he says he is bullish on the technology.
3. What Andrej is doing is pushing back on the even faster timelines and bigger expectations that are often part of his world. Which is totally fair play.
4. That has to be kept in perspective. If Andrej is right the future will blow your mind, it will go crazy.
5. Where the confusion arises is where Andrej then tries to equate his timelines and expectations with calm and continuity, or extends those predictions forward in ways that don’t make sense to me.
6. Again, I see similar things with many others, e.g. the communications of the White House’s Sriram Krishnan, saying AGI is far, but if you push, ‘far’ means things like 10 years. Which is not that far.
7. I think Andrej’s look back has a similar issue of perspective. Very reputable people keep predicting specific AI accomplishments on timelines that don’t happen, sure, that’s totally a thing. But is AI underperforming the expectations of reputable optimists? I think progress in AI in general in the last 15 years, certainly since 2018 and the transformer, has been absolutely massive compared to general expectations, of course there were (and likely always will be) people saying ‘AGI in three years’ and that didn’t happen.

Dwarkesh asks about Eureka Labs. Why not AI research? Andrej says he’s not sure he could improve what the labs are doing. He’s afraid of a WALL-E or Idiocracy problem where humans are disempowered and don’t do things. He’s trying to build Starfleet Academy.
1. I think he’s right to be worried about disempowerment, but looking to education as a solution seems misplaced here? Education is great, all for it, but it seems highly unlikely it will ‘turn losses into wins’ in this sense.
2. The good news is Andrej definitely has fully enough money so he can do whatever he wants, and it’s clear this stuff is what he wants.
Dwarkesh Patel hasn’t seen Star Trek.
1. Can we get this fixed, please?
2. I propose a podcast which is nothing but Dwarkesh Patel watching Star Trek for the first time and reacting.
Andrej thinks AI will fundamentally change education, and it’s still early. Right now you have an LLM, you ask it questions, that’s already super valuable but it still feels like slop, he wants an actual tutor experience. He learned Korean from a tutor 1-on-1 and that was so much better than a 10-to-1 class or learning on the internet. The tutor figured out where he was as a student, asked the right questions, and no LLM currently comes close. Right now they can’t.
1. Strongly agreed on all of that.
His first class is LLM-101-N, with Nanochat as the capstone.
1. This raises the question of whether a class is even the right form factor at all for this AI world. Maybe it is, maybe it isn’t?
Dwarkesh points out that if you can self-probe well enough you can avoid being stuck. Andrej contrasts LLM-101-N with his CS231n at Stanford on deep learning, that LLMs really empower him and help him go faster. Right now he’s hiring faculty but over time some TAs can become AIs.
“I often say that pre-AGI education is useful. Post-AGI education is fun. In a similar way, people go to the gym today. We don’t need their physical strength to manipulate heavy objects because we have machines that do that. They still go to the gym. Why do they go to the gym? Because it’s fun, it’s healthy, and you look hot when you have a six-pack. It’s attractive for people to do that in a very deep, psychological, evolutionary sense for humanity. Education will play out in the same way. You’ll go to school like you go to the gym.”
“If you look at, for example, aristocrats, or you look at ancient Greece or something like that, whenever you had little pocket environments that were post-AGI in a certain sense, people have spent a lot of their time flourishing in a certain way, either physically or cognitively. I feel okay about the prospects of that. If this is false and I’m wrong and we end up in a WALL-E or Idiocracy future, then I don’t even care if there are Dyson spheres. This is a terrible outcome. I really do care about humanity. Everyone has to just be superhuman in a certain sense.”
1. (on both quotes) So, on the one hand, yes, mostly agreed, if you predicate this on the post-AGI post-useful-human-labor world where we can’t do meaningful productive work and also get to exist and go to the gym and go around doing our thing like this is all perfectly normal.
2. On the other hand, it’s weird to expect things to work out like that, although I won’t reiterate why, except to say that if you accept that the humans are now learning for fun then I don’t think this jibes with a lot of Andrej’s earlier statements and expectations.
3. If you’re superhuman in this sense, that’s cool, but if you’re less superhuman than the competition, then does it do much beyond being cool? What are most people going to choose to do with it? What is good in life? What is the value?
4. This all gets into much longer debates and discussions, of course.
“I think there will be a transitional period where we are going to be able to be in the loop and advance things if we understand a lot of stuff. In the long-term, that probably goes away.”
1. Okay, sure, there will be a transition period of unknown length, but that doesn’t as they say solve for the equilibrium.
2. I don’t expect that transition period to last very long, although there are various potential values for very long.
Dwarkesh asks about teaching. Andrej says everyone should learn physics early, since early education is about booting up a brain. He looks for first or second order terms of everything. Find the core of the thing and understand it.
1. Our educational system is not about booting up brains. If it was, it would do a lot of things very differently. Not that we should let this stop us.
Curse of knowledge is a big problem, if you’re an expert in a field often you don’t know what others don’t know. Could be helpful to see other people’s dumb questions that they ask an LLM?
From Dwarkesh: “Another trick that just works astoundingly well. If somebody writes a paper or a blog post or an announcement, it is in 100% of cases that just the narration or the transcription of how they would explain it to you over lunch is way more, not only understandable, but actually also more accurate and scientific, in the sense that people have a bias to explain things in the most abstract, jargon-filled way possible and to clear their throat for four paragraphs before they explain the central idea. But there’s something about communicating one-on-one with a person which compels you to just say the thing.”
1. Love it. Hence we listen to and cover podcasts, too.
2. I think this is because in a conversation you don’t have to be defensible or get judged or be technically correct, you don’t have to have structure that looks good, and you don’t have to offer a full explanation.
3. As in, you can gesture at things, say things without justifications, watch reactions, see what lands, fill in gaps when needed, and yeah, ‘just say the thing.’
4. That’s (a lot of) why it isn’t the abstract, plus habit, it isn’t done that way because it isn’t done that way.

Peter Wildeford offers his one page summary, which I endorse as a summary.

Sriram Krishnan highlights part of the section on education, which I agree was excellent, and recommends the overall podcast highly.

Andrej Karpathy offered his post-podcast reactions here, including a bunch of distillations, highlights and helpful links.

Here’s his summary on the timelines question:

Andrej Karpathy: Basically my AI timelines are about 5-10X pessimistic w.r.t. what you’ll find in your neighborhood SF AI house party or on your twitter timeline, but still quite optimistic w.r.t. a rising tide of AI deniers and skeptics

Those house parties must be crazy, as must his particular slice of Twitter. He has AGI 10 years away and he’s saying that’s 5-10X pessimistic. Do the math.
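
Doing that math explicitly, as a trivial Python sketch. It assumes ‘5-10X pessimistic’ means his 10 year timeline is 5 to 10 times longer than the house party consensus, which is my reading of the quote rather than anything he spelled out.

```python
# Implied house party timelines if 10 years is "5-10X pessimistic".
karpathy_agi_years = 10
pessimism_factors = (5, 10)

# Divide his timeline by each factor to get the implied consensus timeline.
implied_years = [karpathy_agi_years / factor for factor in pessimism_factors]

print(f"Implied house party AGI timelines: {min(implied_years):g} to {max(implied_years):g} years")
```

Under that reading, the parties in question have AGI arriving in one to two years.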

My slice currently overall has 4-10 year expectations. The AI 2027 crowd has some people modestly shorter, but even they are now out in 2029 or so I think.

That’s how it should work, evidence should move the numbers back and forth, and if you had a very aggressive timeline six months or a year ago recent events should slow your roll. You can say ‘those people were getting ahead of themselves and messed up’ and that’s a reasonable perspective, but I don’t think it was obviously a large mistake given what we knew at the time.

Peter Wildeford: I’m desperate for a worldview where we agree both are true:

– current AI is slop and the marketing is BS, but

– staggering AI transformation (including extinction) is 5-20 years out, this may not be good by default, and thus merits major policy action now

I agree with the second point (with error bars). The first point I would rate as ‘somewhat true.’ Much of the marketing is BS and much of the output is slop, no question, but much of it is not on either front and the models are already extremely helpful to those who use them.

Peter Wildeford: If the debate truly has become

– “AGI is going to take all the jobs in just two years” vs.

– “no you idiot, don’t buy the hype, AI is really slop, it will take 10-20 years before AGI automates all jobs (and maybe kill us)”

…I feel like we have really lost the big picture here

[meme credit: Darth thromBOOzyt]

Similarly, the first position here is obviously wrong, and the second position could be right on the substance but has one hell of a Missing Mood. 10-20 years before all jobs get automated is kind of the biggest thing that happened in the history of history, even if the process doesn’t kill or disempower us.

Rob Miles: It’s strange that the “anti hype” position is now “AGI is one decade away”. That… would still be a very alarming situation to be in? It’s not at all obvious that that would be enough time to prepare.

It’s so crazy the amount to which vibes can supposedly shift when objectively nothing has happened and even the newly expressed opinions aren’t so different from what everyone was saying before, it’s that now we’re phrasing it as ‘this is long timelines’ as opposed to ‘this is short timelines.’

John Coogan: It’s over. Andrej Karpathy popped the AI bubble. It’s time to rotate out of AI stocks and focus on investing in food, water, shelter, and guns. AI is fake, the internet is overhyped, computers are pretty much useless, even the steam engine is mid. We’re going back to sticks and stones.

Obviously it’s not actually that bad, but the general tech community is experiencing whiplash right now after the Richard Sutton and Andrej Karpathy appearances on Dwarkesh. Andrej directly called the code produced by today’s frontier models “slop” and estimated that AGI was around 10 years away. Interestingly this lines up nicely with Sam Altman’s “The Intelligence Age” blog post from September 23, 2024, where he said “It is possible that we will have superintelligence in a few thousand days (!); it may take longer, but I’m confident we’ll get there.”

I read this timeline to mean a decade, which is what people always say when they’re predicting big technological shifts (see space travel, quantum computing, and nuclear fusion timelines). This is still earlier than Ray Kurzweil’s 2045 singularity prediction, which has always sounded on the extreme edge of sci-fi forecasting, but now looks bearish.

Yep, I read Altman as ~10 years there as well. Except that Altman was approaching that correctly as ‘quickly, there’s no time’ rather than ‘we have all the time in the world.’

There’s a whole chain of AGI-soon bears who feel vindicated by Andrej’s comments and the general vibe shift. Yann LeCun, Tyler Cowen, and many others on the side of “progress will be incremental” look great at this moment in time.

This George Hotz quote from a Lex Fridman interview in June of 2023 now feels way ahead of the curve, at the time: “Will GPT-12 be AGI? My answer is no, of course not. Cross-entropy loss is never going to get you there. You probably need reinforcement learning in fancy environments to get something that would be considered AGI-like.”

Big tech companies can’t turn on a dime on the basis of the latest Dwarkesh interview though. Oracle is building something like $300 billion in infrastructure over the next five years.

It’s so crazy to think a big tech company would think ‘oops, it’s over, Dwarkesh interviews said so’ and regret or pull back on investment, also yeah it’s weird that Amazon was up 1.6% while AWS was down.

Danielle Fong: aws down, amazon up

nvda barely sweating

narrative bubbles pop more easily than market bubbles

Why would you give Hotz credit for ‘GPT-12 won’t be AGI’ here, when the timeline for GPT-12 (assuming GPT-11 wasn’t AGI, so we’re not accelerating releases yet) is something like 2039? Seems deeply silly. And yet here we are. Similarly, people supposedly ‘look great’ when others echo previous talking points? In my book, you look good based on actual outcomes versus predictions, not when others also predict, unless you are trading the market.

I definitely share the frustration Liron had here:

Liron Shapira: Dwarkesh asked Karpathy about the Yudkowskian observation that exponential economic growth to date has been achieved with *constant* human-level thinking ability.

Andrej acknowledged the point but said, nevertheless, he has a strong intuition that 2% GDP growth will hold steady.

Roon: correction, humanity has achieved superexponential economic growth to date

Liron: True.

In short, I don’t think a reasonable extrapolation from above plus AGI is ~2%.

But hey, that’s the way it goes. It’s been a fun one.

On Dwarkesh Patel’s Podcast With Richard Sutton

Dwarkesh / Tim Belzer / September 30, 2025

This seems like a good opportunity to do some of my classic detailed podcast coverage.

The conventions are:

This is not complete, points I did not find of note are skipped.
The main part of each point is descriptive of what is said, by default paraphrased.
For direct quotes I will use quote marks, by default this is Sutton.
Nested statements are my own commentary.
Timestamps are approximate and from his hosted copy, not the YouTube version, in this case I didn’t bother because the section divisions in the transcript should make this very easy to follow without them.

Full transcript of the episode is here if you want to verify exactly what was said.

Well, that was the plan. This turned largely into me quoting Sutton and then expressing my mind boggling. A lot of what was interesting about this talk was in the back and forth or the ways Sutton lays things out in ways that I found impossible to excerpt, so one could consider following along with the transcript or while listening.

(0:33) RL and LLMs are very different. RL is ‘basic’ AI. Intelligence and RL are about understanding your world. LLMs mimic people, they don’t figure out what to do.
1. RL isn’t strictly about ‘understanding your world’ except insofar as it is necessary to do the job. The same applies to LLMs, no?
2. To maximize RL signal you need to understand and predict the world, aka you need intelligence. To mimic people, you have to understand and predict them, which in turn requires understanding and predicting the world. Same deal.
(1:19) Dwarkesh points out that mimicry requires a robust world model, indeed LLMs have the best world models to date. Sutton disagrees, you’re mimicking people, and he questions that people have a world model. He says a world model would allow you to predict what would happen, whereas people can’t do that.
1. People don’t always have an explicit world model, but sometimes they do, and they have an implicit one running under the hood.
2. Even if people didn’t have a world model in their heads, their outputs in a given situation depend on the world, which you then have to model, if you want to mimic those humans.
3. People predict what will happen all the time, on micro and macro levels. On the micro level they are usually correct. On sufficiently macro levels they are often wrong, but this still counts. If the claim is ‘if you can’t reliably predict what will happen then you don’t have a model’ then we disagree on what it means to have a model, and I would claim no such-defined models exist at any interesting scale or scope.
(1:38) “What we want, to quote Alan Turing, is a machine that can learn from experience, where experience is the things that actually happen in your life. You do things, you see what happens, and that’s what you learn from. The large language models learn from something else. They learn from “here’s a situation, and here’s what a person did”. Implicitly, the suggestion is you should do what the person did.”
1. That’s not the suggestion. If [X] is often followed by [Y], then the suggestion is not ‘if [X] then you should do [Y],’ it is ‘[X] means [Y] is likely.’ So yes, if you are asked ‘what is likely after [X]’ it will respond [Y], but it will also internalize everything implied by this fact, and the fact is not in any way normative.
2. That’s still ‘learning from experience’ it’s simply not continual learning.
3. Do LLMs do continual learning, e.g. ‘from what actually happens in your life’ in particular? Not in their current forms, not technically, but there’s no inherent reason they couldn’t, you’d just do [mumble] except that doing so would get rather expensive.
4. You can also have them learn via various forms of external memory, broadly construed, including having them construct programs. It would work.
5. Not that it’s obvious that you would want an LLM or other AI to learn specifically from what happens in your life, as opposed to learning from things that happen in lives in general plus having context and memory.
(2:39) Dwarkesh responds with a potential crux that imitation learning is a good prior or reasonable approach, and gives the opportunity to get answers right sometimes, then you can train on experience. Sutton says no, that’s the LLM perspective, but the LLM perspective is bad. It’s not ‘actual knowledge.’ You need continual learning so you need to know what’s right during interactions, but the LLM setup can’t tell because there’s no ground truth, because you don’t have a prediction about what will happen next.
1. I don’t see Dwarkesh’s question as a crux.
2. I think Sutton’s response is quite bad, relying on invalid sacred word defenses.
3. I think Sutton wants to draw a distinction between events in the world and tokens in a document. I don’t think you can do that.
4. There is no ‘ground truth’ other than the feedback one gets from the environment. I don’t see why a physical response is different from a token, or from a numerical score. The feedback involved can come from anywhere, including from self-reflection if verification is easier than generation or can be made so in context, and it still counts. What is this special ‘ground truth’?
5. Almost all feedback is noisy because almost all outcomes are probabilistic.
6. You think that’s air you’re experiencing breathing? Does that matter?
(5:29) Dwarkesh points out you can literally ask “What would you anticipate a user might say in response?” but Sutton rejects this because it’s not a ‘substantive’ prediction and the LLM won’t be ‘surprised’ or “they will not change because an unexpected thing has happened. To learn that, they’d have to make an adjustment.”
1. Why is this ‘not substantive’ in any meaningful way, especially if it is a description of a substantive consequence, which speech often is?
2. How is it not ‘surprise’ when a low-probability token appears in the text?
3. There are plenty of times a human is surprised by an outcome but does not learn from it out of context. For example, I roll a d100 and get a 1. Okie dokie.
4. LLMs do learn from a surprising token in training. You can always train. This seems like an insistence that surprise requires continual learning? Why?
Dwarkesh points out LLMs update within a chain-of-thought, so flexibility exists in a given context. Sutton reiterates they can’t predict things and can’t be surprised. He insists that “The next token is what they should say, what the actions should be. It’s not what the world will give them in response to what they do.”
1. What is Sutton even saying, at this point?
2. Again, there is this distinction where outputting or predicting a token is treated as distinct from ‘taking an action,’ and getting a token back is treated as not the world responding.
3. I’d point out the same applies to the rest of the tokens in context without CoT.
(6:47) Sutton claims something interesting, that intelligence requires goals, “I like John McCarthy’s definition that intelligence is the computational part of the ability to achieve goals. You have to have goals or you’re just a behaving system.” And he asks Dwarkesh if he agrees that LLMs don’t have goals (or don’t have ‘substantive’ goals), and that next token prediction is not a goal, because it doesn’t influence the tokens.
1. Okay, seriously, this is crazy, right?
2. What is this ‘substantive’ thing? If you say something on the internet, it gets read in real life. It impacts real life. It causes real people to do ‘substantive’ things, and achieving many goals within the internet requires ‘substantive’ changes in the offline world. If you’re dumb on the internet, you’re dumb in real life. If you die on the internet, you die in real life (e.g. in the sense of an audience not laughing, or people not supporting you, etc).
3. I feel dumb having to type that, but I’m confused what the confusion is.
4. Of course next token prediction is a goal. You try predicting the next token (it’s hard!) and then tell me you weren’t pursuing a goal.
5. Next token prediction does influence the tokens in deployment because the LLM will output the next most likely token, which changes what tokens come after, its and the user’s, and also the real world.
6. Next token prediction does influence the world in training, because the feedback on that prediction’s accuracy will change the model’s weights, if nothing else. Those are part of the world.
7. If intelligence requires goals, and something clearly displays intelligence, then that something must have a goal. If you conclude that LLMs ‘don’t have intelligence’ in 2025, you’ve reached a wrong conclusion. Wrong conclusions are wrong. You made a mistake. Retrace your steps until you find it.
Dwarkesh next points out you can do RL on top of LLMs, and they get IMO gold, and asks why Sutton still doesn’t think that is anything. Sutton doubles down that math operations still aren’t the empirical world, doesn’t count.
1. Are you kidding me? So symbolic things aren’t real, period, and manipulating them can’t be intelligence, period?
Dwarkesh notes that Sutton is famously the author of The Bitter Lesson, which is constantly cited as inspiring and justifying the whole ‘stack more layers’ scaling of LLMs that basically worked, yet Sutton doesn’t see LLMs as ‘bitter lesson’ pilled. Sutton says they’re also putting in lots of human knowledge, so kinda yes kinda no, and he expects new systems that ‘learn from experience’ and ‘perform much better’ and are ‘more scalable’ to then be another instance of the Bitter Lesson?
1. This seems like backtracking on the Bitter Lesson? At least kinda. Mostly he’s repeating that LLMs are one way and it’s the other way, and therefore Bitter Lesson will be illustrated the other way?
“In every case of the bitter lesson you could start with human knowledge and then do the scalable things. That’s always the case. There’s never any reason why that has to be bad. But in fact, and in practice, it has always turned out to be bad. People get locked into the human knowledge approach, and they psychologically… Now I’m speculating why it is, but this is what has always happened. They get their lunch eaten by the methods that are truly scalable.”
1. I do not get where ‘truly scalable’ is coming from here, as it becomes increasingly clear that he is using words in a way I’ve never seen before.
2. If anything it is the opposite. The real objection is training efficiency, or failure to properly update from direct relevant experiences, neither of which has anything to do with scaling.
3. I also continue not to see why there is this distinction ‘human knowledge’ versus other information? Any information available to the AI can be coded as tokens and be put into an LLM, regardless of its ‘humanness.’ The AI can still gather or create knowledge on its own, and LLMs often do.
“The scalable method is you learn from experience. You try things, you see what works. No one has to tell you. First of all, you have a goal. Without a goal, there’s no sense of right or wrong or better or worse. Large language models are trying to get by without having a goal or a sense of better or worse. That’s just exactly starting in the wrong place.”
1. Again, the word ‘scaling’ is being used in a completely alien manner here. He seems to be trying to say ‘successful’ or ‘efficient.’
2. You have to have a ‘goal’ in the sense of a means of selecting actions, and a way of updating based on those actions, but in this sense LLMs in training very obviously have ‘goals’ regardless of whether you’d use that word that way.
3. Except Sutton seems to think this ‘goal’ needs to exist in some ‘real world’ sense or it doesn’t count and I continue to be boggled by this request, and there are many obvious counterexamples, but I risk repeating myself.
4. No sense of better or worse? What do you think thumbs up and down are? What do you think evaluators are? Does he not think an LLM can do evaluation?
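To make the ‘surprise’ point concrete: under the standard information-theoretic definition, the surprisal of a token is the negative log of the probability the model assigned to it, and a gradient step on that quantity is how training adjusts to it. What follows is a minimal sketch under stated assumptions: a hypothetical three-token vocabulary, made-up logits, and one step of plain gradient descent on cross-entropy applied directly to the logits. It is an illustration only, not a description of how any particular LLM is trained.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def surprisal_bits(probs, token):
    # Surprisal: how unexpected the observed token was, in bits.
    return -math.log2(probs[token])

def sgd_step(logits, token, lr=1.0):
    # The gradient of cross-entropy with respect to the logits
    # is (probs - one_hot(token)), so we step against it.
    probs = softmax(logits)
    return [x - lr * (p - (1.0 if i == token else 0.0))
            for i, (x, p) in enumerate(zip(logits, probs))]

# Hypothetical logits over a toy vocabulary; token 2 is the unlikely one.
logits = [2.0, 1.0, -1.0]
observed = 2

before = surprisal_bits(softmax(logits), observed)
after = surprisal_bits(softmax(sgd_step(logits, observed)), observed)
```

In this toy setup `after` comes out smaller than `before`: the low-probability token was surprising, and the update made it less so. Whether that update happens continually or in a later training run is a separate question from whether surprise exists.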

Sutton has a reasonable hypothesis that a different architecture, that uses a form of continual learning and that does so via real world interaction, would be an interesting and potentially better approach to AI. That might be true.

But his uses of words do not seem to match their definitions or common usage, his characterizations of LLMs seem deeply confused, and he’s drawing a bunch of distinctions and treating them as meaningful in ways that I don’t understand. This results in absurd claims like ‘LLMs are not intelligent and do not have goals’ and that feedback from digital systems doesn’t count and so on.

It seems like a form of essentialism, the idea that ‘oh LLMs can never [X] because they don’t [Y],’ where when you then point (as people frequently do) to the LLM doing [X] and often also doing [Y], they say ‘la la la can’t hear you.’

Dwarkesh claims humans initially do imitation learning, Sutton says obviously not. “When I see kids, I see kids just trying things and waving their hands around and moving their eyes around. There’s no imitation for how they move their eyes around or even the sounds they make. They may want to create the same sounds, but the actions, the thing that the infant actually does, there’s no targets for that. There are no examples for that.”
1. GPT-5 Thinking says partly true, but only 30% in the first months, more later on. Gemini says yes. Claude says yes: “Imitation is one of the core learning mechanisms from birth onward. Newborns can imitate facial expressions within hours of birth (tongue protrusion being the classic example). By 6-9 months, they’re doing deferred imitation – copying actions they saw earlier. The whole mirror neuron system appears to be built for this.”
2. Sutton’s claim seems clearly so strong as to be outright false here. He’s not saying ‘they do more non-imitation learning than imitation learning in the first few months,’ he is saying ‘there are no examples of that’ and there are very obviously examples of that. Here’s Gemini: “Research has shown that newborns, some just a few hours old, can imitate simple facial expressions like sticking out their tongue or opening their mouth. This early imitation is believed to be a reflexive behavior that lays the groundwork for more intentional imitation later on.”
“School is much later. Okay, I shouldn’t have said never. I don’t know, I think I would even say that about school. But formal schooling is the exception. You shouldn’t base your theories on that.” “Supervised learning is not something that happens in nature. Even if that were the case with school, we should forget about it because that’s some special thing that happens in people.”
1. At this point I kind of wonder if Sutton has met humans?
2. As in, I do imitation learning. All. The. Time. Don’t you? Like, what?
3. As in, I do supervised learning. All. The. Time. Don’t you? Like, what?
4. A lot of this supervised and imitation learning happens outside of ‘school.’
5. You even see supervised learning in animals, given the existence of human supervisors who want to teach them things. Good dog! Good boy!
6. You definitely see imitation learning in animals. Monkey see, monkey do.
7. The reason not to do supervised learning is the cost of the supervisor, or (such as in the case of nature) their unavailability. Thus nature supervises, instead.
8. The reason not to do imitation learning in a given context is the cost of the thing to imitate, or the lack of a good enough thing to imitate to let you continue to sufficiently progress.
“Why are you trying to distinguish humans? Humans are animals. What we have in common is more interesting. What distinguishes us, we should be paying less attention to.” “I like the way you consider that obvious, because I consider the opposite obvious. We have to understand how we are animals. If we understood a squirrel, I think we’d be almost all the way there to understanding human intelligence. The language part is just a small veneer on the surface.”
1. Because we want to create something that has what only humans have and animals don’t, which is a high level of intelligence and ability to optimize the arrangements of atoms according to our preferences and goals.
2. Understanding an existing intelligence is not the same thing as building a new intelligence, which we have also managed to build without understanding.
3. The way animals have (limited) intelligence does not mean this is the One True Way that intelligence can ever exist. There’s no inherent reason an AI needs to mimic a human let alone an animal, except for imitation learning, or in ways we find this to be useful. We’re kind of looking for our keys under the streetlamp here, while assuming there are no keys elsewhere, and I think we’re going to be in for some very rude (or perhaps pleasant?) surprises.
4. I don’t want to make a virtual squirrel and scale it up. Do you?
Dwarkesh brings up the process of humans learning things over 10k years a la Henrich, of figuring out a many-step long process, where you can’t one-shot the reasoning process. This knowledge evolves over time, and is passed down through imitation learning, as are other cultural practices and gains. Sutton agrees, but calls this a ‘small thing.’
1. You could of course one-shot the process with sufficient intelligence and understanding of the world. What Henrich is pointing out is that in practice this was obviously impossible and not how any of this went down.
2. Seems like Sutton is saying again that the difference between humans and squirrels is a ‘small thing’ and we shouldn’t care about it? I disagree.
They agree that mammals can do continual learning and LLMs can’t. We all agree that Moravec’s paradox is a thing.
1. Moravec’s paradox is misleading. There will of course be all four quadrants of things, where for each of [AI, human] things will be [easy, hard].
2. The same is true for any pair of humans, or any pair of AIs, to a lesser degree.
3. The reason it is labeled a paradox is that there are some divergences that look very large, larger than one might expect, but this isn’t obvious to me.

“The experiential paradigm. Let’s lay it out a little bit. It says that experience, action, sensation—well, sensation, action, reward—this happens on and on and on for your life. It says that this is the foundation and the focus of intelligence. Intelligence is about taking that stream and altering the actions to increase the rewards in the stream…. This is what the reinforcement learning paradigm is, learning from experience.”
1. Can be. Doesn’t have to be.
2. A priori knowledge exists. Paging Descartes’ meditator! Molyneux’s problem.
3. Words, written and voiced, are sensation, and can also be reward.
4. Thoughts and predictions, and saying or writing words, are actions.
5. All of these are experiences. You can do RL on them (and humans do this).
Sutton agrees that the reward function is arbitrary, and can often be ‘seek pleasure and avoid pain.’
1. That sounds exactly like ‘make number go up’ with extra steps.
Sutton wants to say ‘network’ instead of ‘model.’
1. Okie dokie, this does cause confusion with ‘world models’ that minds have, as Sutton points out later, so using the same word for both is unfortunate.
2. I do think we’re stuck with ‘model’ here, but I’d be happy to support moving to ‘network’ or another alternative if one got momentum.
He points out that copying minds is a huge cost savings, more than ‘trying to learn from people.’
1. Okie dokie, again, but these two are not rivalrous actions.
2. If anything they are complements. If you learn from general knowledge and experiences it is highly useful to copy you. If you are learning from local particular experiences then your usefulness is likely more localized.
3. As in, suppose I had a GPT-5 instance, embodied in a humanoid robot, that did continual learning, which let’s call Daneel. I expect that Daneel would rapidly become a better fit to me than to others.
4. Why wouldn’t you want to learn from all sources, and then make copies?
5. One answer would be ‘because to store all that info the network would need to be too large and thus too expensive’ but that again pushes you in the other direction, and towards additional scaffolding solutions.
They discuss temporal difference learning and finding intermediate objectives.
Sutton brings up the ‘big world hypothesis’ where to be maximally useful a human or AI needs particular knowledge of a particular part of the world. In continual learning the knowledge goes into weights. “You learn a policy that’s specific to the environment that you’re finding yourself in.”
1. Well sure, but there are any number of ways to get that context, and to learn that policy. You can even write the policy down (e.g. in claude.md).
2. Often it would be actively unwise to put that knowledge into weights. There is a reason humans will often use forms of external memory. If you were planning to copy a human into other contexts you’d use it even more.
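Temporal difference learning, mentioned in passing above, fits the sensation, action, reward stream directly: the value estimate for a state is nudged toward the reward received plus the estimate for the next state. Here is a minimal tabular TD(0) sketch, assuming a hypothetical deterministic three-state chain with a single reward at the end. The environment, step size, and discount are illustrative choices, not anything from the podcast.

```python
def td0(episodes, alpha=0.5, gamma=1.0):
    # Hypothetical chain: state 0 -> 1 -> 2 -> terminal,
    # with reward 1.0 only on the final step.
    transitions = {0: (1, 0.0), 1: (2, 0.0), 2: (None, 1.0)}
    values = {0: 0.0, 1: 0.0, 2: 0.0}
    for _ in range(episodes):
        state = 0
        while state is not None:
            next_state, reward = transitions[state]
            next_value = 0.0 if next_state is None else values[next_state]
            # TD error: the gap between the prediction and what happened.
            td_error = reward + gamma * next_value - values[state]
            values[state] += alpha * td_error
            state = next_state
    return values

values = td0(episodes=50)
```

Note that nothing here cares whether the ‘states’ are board positions, joint angles, or tokens. The update only needs a prediction, an outcome, and a difference between them.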

Sutton lays out the above common model of the agent. The new claim seems to be that you learn from all the sensation you receive, not just from the reward. And there is emphasis on the importance of the ‘transition model’ of the world.
1. I once again don’t see the distinction between this and learning from a stream of tokens, whether one or two directional, or even from contemplation, where again (if you had an optimal learning policy) you would pay attention to all the tokens and not only to the formal reward, as indeed a human does when learning from a text, or from sending tokens and getting tokens back in various forms.
2. In terms of having a ‘transition model,’ I would say that again this is something all agents or networks need similarly, and can ‘get away with not having’ to roughly similar extents.

So do humans.

Sutton claims people live in one world that may involve chess or Atari games and can generalize across not only games but states, and that this will happen whether that generalization is good or bad. Whereas gradient descent will not make you generalize well, and we need algorithms where the generalization is good.
1. I’m not convinced that LLMs or SGD generalize out-of-distribution (OOD) poorly relative to other systems, including humans or RL systems, once you control for various other factors.
2. I do agree that LLMs will often do pretty dumb or crazy things OOD.
3. All algorithms will solve the problem at hand. If you want that solution to generalize, you need to either make the expectation of such generalization part of the de facto evaluation function, develop heuristics and methods that tend to lead to generalization for other reasons, or otherwise incorporate the general case, or choose or get lucky with a problem where the otherwise ‘natural’ solution does still generalize.
“Well maybe that [LLMs] don’t need to generalize to get them right, because the only way to get some of them right is to form something which gets all of them right. If there’s only one answer and you find it, that’s not called generalization. It’s just it’s the only way to solve it, and so they find the only way to solve it. But generalization is when it could be this way, it could be that way, and they do it the good way.”
1. Sutton only thinks you can generalize given the ability to not generalize, the way good requires the possibility of evil. It is a relative descriptor.
2. I don’t understand why you’d find that definition useful or valid. I care about the generality of your solution in practice, not whether there was a more or less general alternative solution also available.
3. Once again there’s this focus on whether something ‘counts’ as a thing. Yes, of course, if the only or simplest or easiest way to solve a special case is to solve the general case, which often happens, and thus you solve the general case, and this happens to solve a bunch of problem types you didn’t consider, then you have done generalization. Your solution will work in the general case, whether or not you call that OOD.
4. If there’s only one answer and you find it, you still found it.
5. This seems pretty central. SGD or RL or other training methods, of both humans and AIs, will solve the problem you hand to them. Not the problem you meant to solve, the problem and optimization target you actually presented.
6. You need to design that target and choose that method, such that this results in a solution that does what you want it to do. You can approach that in any number of ways, and ideally (assuming you want a general solution) you will choose to set the problem up such that the only or best available solution generalizes, if necessary via penalizing solutions that don’t in various ways.
Sutton claims coding agents trained via SGD will only find solutions to problems they have seen, and yes sometimes the only solution will generalize but nothing in their algorithms will cause them to choose solutions that generalize well.
1. Very obviously coding agents generalize to problems they haven’t seen.
2. Not fully to ‘all coding of all things’ but they generalize quite a bit and are generalizing better over time. Seems odd to deny this?
3. Sutton is making at least two different claims.
4. The first claim is that coding agents only find solutions to problems they have seen. This is at least a large overstatement.
5. The second claim is that the algorithms will not cause the network to choose solutions that generalize well over alternative solutions that don’t.
6. The second claim is true by default. As Sutton notes, sometimes the default or only solution does indeed generalize well. I would say this happens often. But yeah, sometimes by default this isn’t true, and then by construction and default there is nothing pushing towards finding the general solution.
7. Unless you design the training algorithms and data to favor the general solution. If you select your data well, often you can penalize or invalidate non-general solutions, and there are various algorithmic modifications available.
8. One solution type is giving the LLM an inherent preference for generality, or having the evaluator choose with a value towards generality, or both.
9. No, it isn’t going to be easy, but why should it be? If you want generality you have to ask for it. Again, compare to a human or an RL program. I’m not going for a more general solution unless I am motivated to do so, which can happen for any number of reasons.
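One way to read ‘make the expectation of such generalization part of the de facto evaluation function’ is simply to score candidate solutions on inputs they were not fit to. A minimal sketch, assuming a hypothetical toy task (doubling an integer) and two hypothetical candidates, a memorizing lookup table and a general rule. Both are perfect on the training inputs, and only the held-out term in the score separates them.

```python
train = {1: 2, 2: 4, 3: 6}      # examples the candidates were fit to
held_out = {10: 20, 25: 50}     # examples they never saw

def memorizer(x):
    # Returns the stored answer, or 0 for anything unseen.
    return train.get(x, 0)

def general_rule(x):
    return 2 * x

def accuracy(solution, data):
    return sum(solution(x) == y for x, y in data.items()) / len(data)

def score(solution, held_out_weight=1.0):
    # Training accuracy plus a weighted held-out term that rewards generality.
    return accuracy(solution, train) + held_out_weight * accuracy(solution, held_out)

candidates = {"memorizer": memorizer, "general_rule": general_rule}
best = max(candidates, key=lambda name: score(candidates[name]))
```

With `held_out_weight=0.0` the two candidates tie, which is the ‘true by default’ case: nothing pushes toward the general solution. Give the held-out term any positive weight and the general rule wins. If you want generality you have to ask for it.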

Dwarkesh asks what has been surprising in AI’s big picture? Sutton says the effectiveness of artificial neural networks. He says ‘weak’ methods like search and learning have totally won over ‘strong’ methods that come from ‘imbuing a system with human knowledge.’
1. I find it interesting that Sutton in particular was surprised by ANNs. He is placing a lot of emphasis on copying animals, which seems like it would lead to expecting ANNs.
2. It feels like he’s trying to make ‘don’t imbue the system with human knowledge’ happen? To me that’s not what makes the ‘strong’ systems strong, or the thing that failed. The thing that failed was GOFAI, the idea that you would hardcode a bunch of logic and human knowledge in particular ways, and tell the AI how to do things, rather than letting the AI find solutions through search and learning. But that can still involve learning from human knowledge.
3. It doesn’t have to (see AlphaZero and previously TD-Gammon as Sutton points out), and yes that was somewhat surprising but also kind of not, in the sense that with More Dakka within a compact space like chess you can just solve the game from scratch.
4. As in: We don’t need to use human knowledge to master chess, because we can learn chess through self-play beyond human ability levels, and we have enough compute and data that way that we can do it ‘the hard way.’ Sure.

Dwarkesh asks what happens to scaling laws after AGI is created that can do AI research. Sutton says: “These AGIs, if they’re not superhuman already, then the knowledge that they might impart would be not superhuman.”
1. This seems like more characterization insistence combined with category error?
2. And it ignores or denies the premise of the question, which is that AGI allows you to scale researcher time with compute the same way we previously could scale compute spend in other places. Sutton agrees that doing bespoke work is helpful, it’s just that it doesn’t scale, but what if it did?
3. Even if the AGI is not ‘superhuman’ per se, the ability to run it faster and in parallel and with various other advantages means it can plausibly produce superhuman work in AI R&D. Already we have AIs that can do ‘superhuman’ tasks in various domains, even regular computers are ‘superhuman’ in some subdomains (e.g. arithmetic).
“So why do you say, “Bring in other agents’ expertise to teach it”, when it’s worked so well from experience and not by help from another agent?”
1. Help from another agent is experience. It can also directly create experience.
2. The context is chess where this is even more true.
3. Indeed, AlphaZero was not trained without other agents. The way AlphaZero was trained involved heavy use of other agents, except all those other agents were also AlphaZero.
Dwarkesh focuses specifically on the ‘billions of AI researchers’ case, Sutton says that’s an interesting case very different from today and The Bitter Lesson doesn’t have to apply. Better to ask questions like whether you should use compute to enhance a few agents or spread it around to spin up more of them, and how they will interact. “More questions, will it be possible to really spawn it off, send it out, learn something new, something perhaps very new, and then will it be able to be reincorporated into the original? Or will it have changed so much that it can’t really be done? Is that possible or is that not?”
1. I agree that things get strange and different and we should ask new questions.
2. Asking whether it is possible for an ASI (superintelligent AI) copy to learn something new and then incorporate it into the original seems like such a strange question.
  1. It presupposes this ‘continual learning’ thesis where the copy ‘learns’ the information via direct incorporation into its weights.
  2. It then assumes that passing on this new knowledge requires incorporation directly into weights or something weird?
  3. As opposed to, ya know, writing the insight down and the other ASI reading it? If ASIs are indeed superintelligent and do continual learning, why can’t they learn via reading? Wouldn’t they also get very good at knowing how to describe what they know?
  4. Also, yes, I’m pretty confident you can also do this via direct incorporation of the relevant experiences, even if the full Sutton model holds here in ways I don’t expect. You should be able to merge deltas directly in various ways we already know about, and in better ways that these ASIs will be able to figure out.
  5. Even if nothing else works, you can simply have the ‘base’ version of the ASI in question rerun the relevant experiences once it is verified that they led to something worthwhile, reducing this to the previous problem, says the mathematician.
Sutton also speculates about potential for corruption or insanity and similar dangers, if a central mind is incorporating the experiences or knowledge of other copies of itself. He expects this to be a big concern, including ‘mind viruses.’
1. Seems fun to think about, but nothing an army of ASIs couldn’t handle.
2. In general, when imagining scenarios with armies of ASIs, you have to price into everything the fact that they can solve problems way better than you.
3. I don’t think the associated ‘mind viruses’ in this scenario are fundamentally different than the problems with memetics and hazardous information we experience today, although they’ll be at a higher level.
4. I would of course expect lots of new unexpected and weird problems to arise.
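As a toy illustration of ‘merge deltas directly’: if copies start from the same base weights, one simple known approach is to add each copy’s change relative to the base back onto the base. This is a minimal sketch with weights as plain lists of floats. The function name and the numbers are hypothetical, and real merging of large networks runs into interference between deltas that this ignores.

```python
def merge_deltas(base, copies, scale=1.0):
    # Sum each copy's difference from the shared base, then apply it to the base.
    merged = list(base)
    for copy in copies:
        for i, (w_copy, w_base) in enumerate(zip(copy, base)):
            merged[i] += scale * (w_copy - w_base)
    return merged

base = [1.0, 2.0, 3.0]
copy_a = [1.5, 2.0, 3.0]   # learned something that moved weight 0
copy_b = [1.0, 2.0, 2.0]   # learned something that moved weight 2

merged = merge_deltas(base, [copy_a, copy_b])
```

Here the two copies changed different weights, so the merge is clean. The interesting cases are the ones where they did not, which is exactly where writing the insight down and having the other copy read it starts to look attractive.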

It’s Sutton, so eventually we were going to have to deal with him being a successionist.

He argues that succession is inevitable for four reasons: Humanity is incapable of a united front, we will eventually figure out intelligence, we will eventually figure out superhuman intelligence, and it is inevitable that over time the most intelligent things around would gain intelligence and power.
1. We can divide this into two parts. Let “it” equal superintelligence.
2. Let’s call part one Someone Will Build It.
3. Let’s call part two If Anyone Builds It, Everyone Dies.
  1. Okay, sure, not quite as you see below, but mostly? Yeah, mostly.
4. Therefore, Everyone Will Die. Successionism is inevitable.
5. Part two is actually a very strong argument! It is simpler and cleaner and in many ways more convincing than the book’s version, at least in terms of establishing this as a baseline outcome. It doesn’t require (or give the impression it requires) any assumptions whatsoever about the way we get to superintelligence, what form that superintelligence takes, nothing.
6. I actually think this should be fully convincing of the weaker argument that by default (rather than inevitably) this happens, and that there is a large risk of this happening, and something has to go very right for it to not happen.
7. If you say ‘oh even if we do build superintelligence there’s no risk of this happening’ I consider this to be Obvious Nonsense and you not to be thinking.
8. I don’t think this argument is convincing that it is ‘inevitable.’ Facts not in evidence, and there seem like two very obvious counterexamples.
  1. Counterexample one is that if the intelligence gap is not so large in practical impact, other attributes can more than compensate for this. Other attributes, both mental and physical, also matter and can make up for this. Alas, this seems unlikely to be relevant given the expected intelligence gaps.
  2. Counterexample two is that you could ‘solve the alignment problem’ in a sufficiently robust sense that the more intelligent minds optimize for a world in which the less intelligent minds retain power in a sufficiently robust way. Extremely tricky, but definitely not impossible in theory.
9. However his definition of what is inevitable, and what counts as ‘succession’ here, is actually much more optimistic than I previously realized…
10. If we agree that If Anyone Builds It, Everyone Dies, then the logical conclusion is ‘Then Let’s Coordinate To Ensure No One F***ing Builds It.’
11. He claims nope, can’t happen, impossible, give up. I say, if everyone was convinced of part two, then that would change this.
“Put all that together and it’s sort of inevitable. You’re going to have succession to AI or to AI-enabled, augmented humans. Those four things seem clear and sure to happen. But within that set of possibilities, there could be good outcomes as well as less good outcomes, bad outcomes. I’m just trying to be realistic about where we are and ask how we should feel about it.”
1. If ‘AI-enabled, augmented humans’ count here, well, that’s me, right now.
2. I mean, presumably that’s not exactly what he meant.
3. But yeah, conditional on us building ASIs or even AGIs, we’re at least dealing with some form of augmented humans.
4. Talk of ‘merge with the AI’ is nonsense, you’re not adding anything to it, but it can enhance you.
“I mark this as one of the four great stages of the universe. First there’s dust, it ends with stars. Stars make planets. The planets can give rise to life. Now we’re giving rise to designed entities. I think we should be proud that we are giving rise to this great transition in the universe.”
1. Designed is being used rather loosely here, but we get the idea.
2. We already have created designed things, and yeah that’s pretty cool.
“It’s an interesting thing. Should we consider them part of humanity or different from humanity? It’s our choice. It’s our choice whether we should say, “Oh, they are our offspring and we should be proud of them and we should celebrate their achievements.” Or we could say, “Oh no, they’re not us and we should be horrified.””
1. It’s not about whether they are ‘part of humanity’ or our ‘children.’ They’re not.
2. They can still have value. One can imagine aliens (as many stories have) that are not these things and still have value.
3. That doesn’t mean that us going away would therefore be non-horrifying.
“A lot of it has to do with just how you feel about change. If you think the current situation is really good, then you’re more likely to be suspicious of change and averse to change than if you think it’s imperfect. I think it’s imperfect. In fact, I think it’s pretty bad. So I’m open to change. I think humanity has not had a super good track record. Maybe it’s the best thing that there has been, but it’s far from perfect.” “I think it’s appropriate for us to really work towards our own local goals. It’s kind of aggressive for us to say, “Oh, the future has to evolve this way that I want it to.””
1. So there you have it.
2. I disagree.
“So we’re trying to design the future and the principles by which it will evolve and come into being. The first thing you’re saying is, “Well, we try to teach our children general principles which will promote more likely evolutions.” Maybe we should also seek for things to be voluntary. If there is change, we want it to be voluntary rather than imposed on people. I think that’s a very important point. That’s all good.”
1. This is interestingly super different and in conflict with the previous claim.
2. It goes so far the other way that I don’t even fully endorse it, this idea that change needs to be voluntary rather than imposed on people. That neither seems like a reasonable ask, nor does it historically end well, as in the paralysis of the West and especially the Anglosphere in many ways, especially in housing.
3. I am very confident in what would happen if you asked about the changes Sutton is anticipating, and put them to a vote.

Fundamentally, I didn’t pull direct quotes on this but Sutton repeatedly emphasizes that AI-dominated futures can be good or bad, that he wants us to steer towards good futures rather than bad futures, and that we should think carefully about which futures we are steering towards and choose deliberately.

I can certainly get behind that. The difference is that I don’t think we need to accept this transition to AI dominance as our only option, including that I don’t think we should accept that humans will always be unable to coordinate.

Mostly what I found interesting were the claims around the limitations and nature of LLMs, in ways that don’t make sense to me. This did help solidify a bunch of my thinking about how all of this works, so it felt like a good use of time for that alone.

Dwarkesh Patel on Continual Learning

Dwarkesh / Rejus Almole / June 10, 2025

A key question going forward is the extent to which making further AI progress will depend upon some form of continual learning. Dwarkesh Patel offers us an extended essay considering these questions and reasons to be skeptical of the pace of progress for a while. I am less skeptical about many of these particular considerations, and do my best to explain why in detail.

Separately, Ivanka Trump recently endorsed a paper with a discussion I liked a lot less but that needs to be discussed given how influential her voice might (mind you I said might) be to policy going forward, so I will then cover that here as well.

Dwarkesh Patel explains why he doesn’t think AGI is right around the corner, and why AI progress today is insufficient to replace most white collar employment: That continual learning is both necessary and unsolved, and will be a huge bottleneck.

He opens with this quote:

Rudiger Dornbusch: Things take longer to happen than you think they will, and then they happen faster than you thought they could.

Clearly this means one is poorly calibrated, but also yes, and I expect it to feel like this as well. Either capabilities, diffusion or both will be on an exponential, and the future will be highly unevenly distributed until suddenly parts of it aren’t anymore. That seems to be true fractally as well: when the tech is ready and I figure out how to make AI do something, that’s it, it’s done.

Here is Dwarkesh’s Twitter thread summary:

Dwarkesh Patel: Sometimes people say that even if all AI progress totally stopped, the systems of today would still be economically transformative. I disagree. The reason that the Fortune 500 aren’t using LLMs to transform their workflows isn’t because the management is too stodgy.

Rather, it’s genuinely hard to get normal humanlike labor out of LLMs. And this has to do with some fundamental capabilities these models lack.

New blog post where I explain why I disagree with this, and why I have slightly longer timelines to AGI than many of my guests.

I think continual learning is a huge bottleneck to the usefulness of these models, and extended computer use may take years to sort out.

Link here.

There is no consensus definition of transformational but I think this is simply wrong, in the sense that LLMs being stuck without continual learning at essentially current levels would not stop them from having a transformational impact. There are a lot of other ways to get a ton more utility out of what we already have, and over time we would build around what the models can do rather than giving up the moment they don’t sufficiently neatly fit into existing human-shaped holes.

When we do solve human like continual learning, however, we might see a broadly deployed intelligence explosion *even if there’s no more algorithmic progress*.

Simply from the AI amalgamating the on-the-job experience of all the copies broadly deployed through the economy.

I’d bet 2028 for computer use agents that can do taxes end-to-end for my small business as well as a competent general manager could in a week: including chasing down all the receipts on different websites, emailing back and forth for invoices, and filing to the IRS.

That being said, you can’t play around with these models when they’re in their element and still think we’re not on track for AGI.

Strongly agree with that last statement. Regardless of how much we can do without strictly solving continual learning, continual learning is not solved… yet.

These are simple, self contained, short horizon, language in-language out tasks – the kinds of assignments that should be dead center in the LLMs’ repertoire. And they’re 5/10 at them. Don’t get me wrong, that’s impressive.

But the fundamental problem is that LLMs don’t get better over time the way a human would. The lack of continual learning is a huge huge problem. The LLM baseline at many tasks might be higher than an average human’s. But there’s no way to give a model high level feedback.

You’re stuck with the abilities you get out of the box. You can keep messing around with the system prompt. In practice this just doesn’t produce anything even close to the kind of learning and improvement that human employees experience.

The reason humans are so useful is not mainly their raw intelligence. It’s their ability to build up context, interrogate their own failures, and pick up small improvements and efficiencies as they practice a task.

You make an AI tool. It’s 5/10 out of the box. What level of Skill Issue are we dealing with here, that stops it from getting better over time assuming you don’t get to upgrade the underlying model?

You can obviously engage in industrial amounts of RL or other fine-tuning, but that too only goes so far.

You can use things like memory, or train LoRAs, or various other incremental tricks. That doesn’t enable radical changes, but I do think it can work for the kinds of preference learning Dwarkesh is complaining he currently doesn’t have access to, and you can if desired go back and fine-tune the entire system periodically.

How do you teach a kid to play a saxophone? You have her try to blow into one, listen to how it sounds, and adjust. Now imagine teaching saxophone this way instead: A student takes one attempt. The moment they make a mistake, you send them away and write detailed instructions about what went wrong. The next student reads your notes and tries to play Charlie Parker cold. When they fail, you refine the instructions for the next student.

This just wouldn’t work. No matter how well honed your prompt is, no kid is just going to learn how to play saxophone from just reading your instructions. But this is the only modality we as users have to ‘teach’ LLMs anything.

Are you even so sure about that? If the context you can give is hundreds of thousands to millions of tokens at once, with ability to conditionally access millions or billions more? If you can create new tools and programs and branch workflows, or have it do so on your behalf, and call instances with different contexts and procedures for substeps? If you get to keep rewinding time and sending in the exact same student in the same mental state as many times as you want? And so on, including any number of things I haven’t mentioned or thought about?

I am confident that with enough iterations and work (and access to the required physical tools) I could write a computer program to operate a robot to play the saxophone essentially perfectly. No, you can’t do this purely via the LLM component, but that is why we are moving towards MCP and tool use for such tasks.

I get that Dwarkesh has put a lot of work into getting his tools to 5/10. But it’s nothing compared to the amount of work that could be done, including the tools that could be involved. That’s not a knock on him, that wouldn’t be a good use of his time yet.

LLMs actually do get kinda smart and useful in the middle of a session. For example, sometimes I’ll co-write an essay with an LLM. I’ll give it an outline, and I’ll ask it to draft the essay passage by passage. All its suggestions up till 4 paragraphs in will be bad. So I’ll just rewrite the whole paragraph from scratch and tell it, “Hey, your shit sucked. This is what I wrote instead.” At that point, it can actually start giving good suggestions for the next paragraph. But this whole subtle understanding of my preferences and style is lost by the end of the session.

Okay, so that seems like it is totally, totally a Skill Issue now? As in, Dwarkesh Patel has a style. A few paragraphs of that style clue the LLM into knowing how to help. So… can’t we provide it with a bunch of curated examples of similar exercises, and put them into context in various ways (Claude projects just got 10x more context!) and start with that?
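As a minimal sketch of what ‘put curated examples into context’ could look like, here is a prompt builder that front-loads past rewrites before the new task. Everything here (the `StyleExample` type, `build_style_prompt`, the prompt wording) is a hypothetical illustration, not any particular product’s API, and whether this actually recovers the in-session improvement is an open question.

```python
from dataclasses import dataclass


@dataclass
class StyleExample:
    """One past exchange: what the model drafted and what the author wrote instead."""
    model_draft: str
    author_rewrite: str


def build_style_prompt(examples: list[StyleExample], outline: str) -> str:
    """Assemble a prompt that shows curated rewrites before asking for a new draft."""
    parts = ["Here are past drafts and how the author rewrote them:"]
    for i, ex in enumerate(examples, start=1):
        parts.append(f"--- Example {i} ---")
        parts.append(f"Draft: {ex.model_draft}")
        parts.append(f"Rewrite: {ex.author_rewrite}")
    parts.append("--- New task ---")
    parts.append(f"Outline: {outline}")
    parts.append("Draft the next passage in the author's style.")
    return "\n".join(parts)
```

The point of the design is that the ‘first four paragraphs’ of correction get paid for once and then reused at the start of every session, rather than being relearned each time.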

Even Claude Code will often reverse a hard-earned optimization that we engineered together before I hit /compact – because the explanation for why it was made didn’t make it into the summary.

Yeah, this is super annoying, I’ve run into it, but I can think of some obvious fixes for this, especially if you notice what you want to preserve? One obvious way is to do what humans do, which is to put comments in the code saying what the optimization is and why to keep it, which then remain in context whenever Claude considers ripping it out. I don’t know if that works yet, but it totally should.
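For concreteness, a sketch of the kind of comment I have in mind, using a toy memoized function as a stand-in for the hard-earned optimization. The function and the comment wording are purely illustrative, and I am assuming rather than asserting that a coding agent would respect the comment.

```python
from functools import lru_cache


# OPTIMIZATION, KEEP: memoized on purpose.
# Why: without the cache this recursion is exponential in n.
# If you refactor, preserve the caching or replace it with something
# at least as fast.
@lru_cache(maxsize=None)
def fib(n: int) -> int:
    """Return the n-th Fibonacci number (fib(0) == 0, fib(1) == 1)."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)
```

The explanation travels with the code itself, so it does not depend on surviving a summary.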

I’m not saying I have the magical solution to all this but it all feels like it’s One Weird Trick (okay, maybe 10 working together) away from working in ways I could totally figure out if I had a team behind me and I focused on it.

My guess is this will not look like ‘learn like a human’ exactly. Different tools are available, so we’ll first get the ability to solve this via doing something different. But also, yeah, I think with enough skill and the right technique (on the level of the innovation that created reasoning models) you could basically do what humans do? Which involves effectively having the systems automatically engage in various levels of meta and updating, often quite heavily off a single data point.

It is hard to overstate how much time and effort goes into training a human employee.

There are many jobs where an employee is not net profitable for years. Hiring decisions are often made on the basis of what will be needed in year four or beyond.

That ignores the schooling that you also have to do. Becoming a doctor in America requires starting with a college degree, then four years of medical school, then four years of residency, and we have to subsidize that residency because it is actively unprofitable. That’s obviously an extreme case, but there are many training programs or essentially apprenticeships that last for years, including highly expensive time from senior people and expensive real world mistakes.

Imagine what it took to make Dwarkesh Patel into Dwarkesh Patel. Or the investment he makes in his own employees.

Even afterwards, in many ways you will always be ‘stuck with’ various aspects of those employees, and have to make the most of what they offer. This is standard.

Claude Opus estimates, and I think this is reasonable, that for every two hours humans spend working, they spend one hour learning, with a little less than half of that learning essentially ‘on the job.’

If you need to train not a ‘universal’ LLM but a highly specific-purpose LLM, and have a massive compute budget with which to do so, and you mostly don’t care about how it performs out of distribution the same way you mostly don’t for an employee (as in, you teach it what you teach a human, which is ‘if this is outside your distribution or you’re failing at it then run it up the chain to your supervisor,’ and you have a classifier for that) and you can build and use tools along the way? Different ballgame.
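A minimal sketch of the ‘run it up the chain’ pattern, under the assumption that you have some classifier that scores how in-distribution a task is. The names (`run_with_escalation`, `toy_confidence`, `toy_worker`), the keyword classifier and the 0.8 threshold are all hypothetical stand-ins for illustration.

```python
from typing import Callable

ESCALATE = "ESCALATE_TO_SUPERVISOR"


def run_with_escalation(
    task: str,
    confidence: Callable[[str], float],
    worker: Callable[[str], str],
    threshold: float = 0.8,
) -> str:
    """Do the task only if the classifier thinks it is in distribution."""
    if confidence(task) < threshold:
        return ESCALATE
    return worker(task)


# Toy stand-ins, purely for illustration.
KNOWN_KEYWORDS = {"invoice", "receipt", "expense"}


def toy_confidence(task: str) -> float:
    """Pretend classifier: confident only when a familiar keyword appears."""
    words = set(task.lower().split())
    return 0.95 if words & KNOWN_KEYWORDS else 0.2


def toy_worker(task: str) -> str:
    """Pretend specialist model that handles in-distribution tasks."""
    return f"done: {task}"
```

For example, `run_with_escalation("categorize this receipt", toy_confidence, toy_worker)` is handled by the worker, while an unfamiliar request like `"negotiate a merger"` comes back as `ESCALATE`.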

It makes sense, given the pace of progress, for most people and companies not to put that kind of investment into AI ‘employees’ or other AI tasks. But if things do start to stall out, or they don’t, either way the value proposition on that will quickly improve. It will start to be worth doing. And we will rapidly learn new ways of doing it better, and have the results available to be copied.

Here are his predictions on computer use in particular, to see how much we actually disagree:

When I interviewed Anthropic researchers Sholto Douglas and Trenton Bricken on my podcast, they said that they expect reliable computer use agents by the end of next year. We already have computer use agents right now, but they’re pretty bad. They’re imagining something quite different.

Their forecast is that by the end of next year, you should be able to tell an AI, “Go do my taxes.” And it goes through your email, Amazon orders, and Slack messages, emails back and forth with everyone you need invoices from, compiles all your receipts, decides which are business expenses, asks for your approval on the edge cases, and then submits Form 1040 to the IRS.

I’m skeptical. I’m not an AI researcher, so far be it for me to contradict them on technical details. But given what little I know, here’s why I’d bet against this forecast:

As horizon lengths increase, rollouts have to become longer. The AI needs to do two hours worth of agentic computer use tasks before we can even see if it did it right. Not to mention that computer use requires processing images and video, which is already more compute intensive, even if you don’t factor in the longer rollout. This seems like this should slow down progress.

Let’s take the concrete example here, ‘go do my taxes.’

This is a highly agentic task, but like a real accountant you can choose to ‘check its work’ if you want, or get another AI to check the work, because you can totally break this down into smaller tasks that allow for verification, or present a plan of tasks that can be verified. Similarly, if you are training TaxBot to do people’s taxes for them, you can train TaxBot on a lot of those individual subtasks, and give it clear feedback.

Almost all computer use tasks are like this? Humans also mostly don’t do things that can’t be verified for hours?

And the core building block issues of computer use seem mostly like very short time horizon tasks with very easy verification methods. If you can get lots of 9s on the button clicking and menu navigation and so on, I think you’re a lot of the way there.
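To see why ‘lots of 9s’ on the basic steps matters so much, here is a rough sketch of how per-step reliability compounds over a long task. It assumes steps fail independently and that verification is perfect when retries are allowed; both are simplifying assumptions, and the 500-step task length is a made-up number.

```python
def chain_success(per_step: float, steps: int) -> float:
    """Probability that every one of `steps` independent steps succeeds."""
    return per_step ** steps


def with_retries(per_step: float, attempts: int) -> float:
    """Per-step success if a failed step can be detected and retried."""
    return 1.0 - (1.0 - per_step) ** attempts


if __name__ == "__main__":
    steps = 500  # hypothetical number of clicks and menu actions in one task
    for p in (0.99, 0.999, 0.9999):
        print(f"per-step {p}: whole task {chain_success(p, steps):.3f}")
    # Easy verification turns two 9s into roughly four 9s with a single retry.
    retried = with_retries(0.99, attempts=2)
    print(f"0.99 with one retry: whole task {chain_success(retried, steps):.3f}")
```

Under these assumptions, two 9s per step makes a 500-step task fail almost every time, while four 9s (or two 9s plus cheap verification and one retry) gets the whole task through about 95% of the time.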

The subtasks are also 99%+ things that come up relatively often, and that don’t present any non-trivial difficulties. A human accountant already will have to occasionally say ‘wait, I need you the taxpayer to tell me what the hell is up with this thing’ and we’re giving the AI in 2028 the ability to do this too.

I don’t see any fundamental difference between the difficulties being pointed out here, and the difficulties of tasks we have already solved.

We don’t have a large pretraining corpus of multimodal computer use data. I like this quote from Mechanize’s post on automating software engineering: “For the past decade of scaling, we’ve been spoiled by the enormous amount of internet data that was freely available for us to use. This was enough for cracking natural language processing, but not for getting models to become reliable, competent agents. Imagine trying to train GPT-4 on all the text data available in 1980—the data would be nowhere near enough, even if we had the necessary compute.”

Again, I’m not at the labs. Maybe text only training already gives you a great prior on how different UIs work, and what the relationship between different components is. Maybe RL fine tuning is so sample efficient that you don’t need that much data. But I haven’t seen any public evidence which makes me think that these models have suddenly gotten less data hungry, especially in this domain where they’re substantially less practiced.

Alternatively, maybe these models are such good front end coders that they can just generate millions of toy UIs for themselves to practice on. For my reaction to this, see bullet point below.

I’m not going to keep working for the big labs for free on this one by giving even more details on how I’d solve all this, but these totally seem like highly solvable problems, and also this seems like a case of the person saying it can’t be done interrupting the people doing it? It seems like progress is being made rapidly.

Even algorithmic innovations which seem quite simple in retrospect seem to take a long time to iron out. The RL procedure which DeepSeek explained in their R1 paper seems simple at a high level. And yet it took 2 years from the launch of GPT-4 to the release of o1.

Now of course I know it is hilariously arrogant to say that R1/o1 were easy – a ton of engineering, debugging, pruning of alternative ideas was required to arrive at this solution. But that’s precisely my point! Seeing how long it took to implement the idea, ‘Train the model to solve verifiable math and coding problems’, makes me think that we’re underestimating the difficulty of solving the much gnarlier problem of computer use, where you’re operating in a totally different modality with much less data.

I think two years is how long we had to have the idea of o1 and commit to it, then to implement it. Four months is roughly the actual time it took from ‘here is that sentence and we know it works’ to full implementation. Also we’re going to have massively more resources to pour into these questions this time around, and frankly I don’t think any of these insights are even as hard to find as o1, especially now that we have reasoning models to use as part of this process.

I think there are other potential roadblocks along the way, and once you factor all of those in you can’t be that much more optimistic, but I see this particular issue as not that likely to pose that much of a bottleneck for long.

His predictions are he’d take 50/50 bets on: 2028 for an AI that can ‘just go do your taxes as well as a human accountant could’ and 2032 for ‘can learn details and preferences on the job as well as a human can.’ I’d be inclined to take the other side of both of those bets, assuming it means by EOY; for the 2032 one we’d need to flesh out details.

But if we have the ‘AI that does your taxes’ in 2028 then 2029 and 2030 look pretty weird, because this implies other things:

Daniel Kokotajlo: Great post! This is basically how I think about things as well. So why the difference in our timelines then?

–Well, actually, they aren’t that different. My median for the intelligence explosion is 2028 now (one year longer than it was when writing AI 2027), which means early 2028 or so for the superhuman coder milestone described in AI 2027, which I’d think roughly corresponds to the “can do taxes end-to-end” milestone you describe as happening by end of 2028 with 50% probability. Maybe that’s a little too rough; maybe it’s more like month-long horizons instead of week-long. But at the growth rates in horizon lengths that we are seeing and that I’m expecting, that’s less than a year…

–So basically it seems like our only serious disagreement is the continual/online learning thing, which you say 50% by 2032 on whereas I’m at 50% by end of 2028. Here, my argument is simple: I think that once you get to the superhuman coder milestone, the pace of algorithmic progress will accelerate, and then you’ll reach full AI R&D automation and it’ll accelerate further, etc. Basically I think that progress will be much faster than normal around that time, and so innovations like flexible online learning that feel intuitively like they might come in 2032 will instead come later that same year.

(For reference AI 2027 depicts a gradual transition from today to fully online learning, where the intermediate stages look something like “Every week, and then eventually every day, they stack on another fine-tuning run on additional data, including an increasingly high amount of on-the-job real world data.” A janky unprincipled solution in early 2027 that gives way to more elegant and effective things midway through the year.)

I found this an interestingly wrong thing to think:

Richard: Given the risk of fines and jail for filling your taxes wrong, and the cost of processing poor quality paperwork that the government will have to bear, it seems very unlikely that people will want AI to do taxes, and very unlikely that a government will allow AI to do taxes.

The rate of fully accurately filing your taxes is, for anyone whose taxes are complex, basically 0%. Everyone makes mistakes. When the AI gets this right almost every time, it’s already much better than a human accountant, and you’ll have a strong case that what happened was accidental, which means at worst you pay some modest penalties.

Personal story, I was paying accountants at a prestigious firm that will go unnamed to do my taxes, and they literally just forgot to include paying city tax at all. As in, I’m looking at the forms, and I ask, ‘wait why does it have $0 under city tax?’ and the guy essentially says ‘oh, whoops.’ So, yeah. Mistakes are made. This will be like self-driving cars, where we’ll impose vastly higher standards of accuracy and law abidance on the AIs, and they will meet them because the bar really is not that high.

There were also some good detailed reactions and counterarguments from others:

Near: finally some spicy takes around here.

Rohit: The question is whether we need humanlike labour for transformative economic outcomes, or whether we can find ways to use the labour it does provide with a different enough workflow that it adds substantial economic advantage.

Sriram Krishnan: Really good post from @dwarkesh_sp on continuous learning in LLMs.

Vitalik Buterin: I have high probability mass on longer timelines, but this particular issue feels like the sort of limitation that’s true until one day someone discovers a magic trick (think eg. RL on CoT) that suddenly makes it no longer true.

Sriram Krishnan: Agree – CoT is a particularly good example.

Ryan Greenblatt: I agree with much of this post. I also have roughly 2032 medians to things going crazy, I agree learning on the job is very useful, and I’m also skeptical we’d see massive white collar automation without further AI progress.

However, I think Dwarkesh is wrong to suggest that RL fine-tuning can’t be qualitatively similar to how humans learn.

In the post, he discusses AIs constructing verifiable RL environments for themselves based on human feedback and then argues this wouldn’t be flexible and powerful enough to work, but RL could be used more similarly to how humans learn.

My best guess is that the way humans learn on the job is mostly by noticing when something went well (or poorly) and then sample efficiently updating (with their brain doing something analogous to an RL update). In some cases, this is based on external feedback (e.g. from a coworker) and in some cases it’s based on self-verification: the person just looking at the outcome of their actions and then determining if it went well or poorly.

So, you could imagine RL’ing an AI based on both external feedback and self-verification like this. And, this would be a “deliberate, adaptive process” like human learning. Why would this currently work worse than human learning?

Current AIs are worse than humans at two things which makes RL (quantitatively) much worse for them:

1. Robust self-verification: the ability to correctly determine when you’ve done something well/poorly in a way which is robust to you optimizing against it.

2. Sample efficiency: how much you learn from each update (potentially leveraging stuff like determining what caused things to go well/poorly which humans certainly take advantage of). This is especially important if you have sparse external feedback.

But, these are more like quantitative than qualitative issues IMO. AIs (and RL methods) are improving at both of these.

All that said, I think it’s very plausible that the route to better continual learning routes more through building on in-context learning (perhaps through something like neuralese, though this would greatly increase misalignment risks…).

Some more quibbles:

– For the exact podcasting tasks Dwarkesh mentions, it really seems like simple fine-tuning mixed with a bit of RL would solve his problem. So, an automated training loop run by the AI could probably work here. This just isn’t deployed as an easy-to-use feature.

– For many (IMO most) useful tasks, AIs are limited by something other than “learning on the job”. At autonomous software engineering, they fail to match humans with 3 hours of time and they are typically limited by being bad agents or by being generally dumb/confused. To be clear, it seems totally plausible that for podcasting tasks Dwarkesh mentions, learning is the limiting factor.

– Correspondingly, I’d guess the reason that we don’t see people trying more complex RL based continual learning in normal deployments is that there is lower hanging fruit elsewhere and typically something else is the main blocker. I agree that if you had human level sample efficiency in learning this would immediately yield strong results (e.g., you’d have very superhuman AIs with 10^26 FLOP presumably), I’m just making a claim about more incremental progress.

– I think Dwarkesh uses the term “intelligence” somewhat atypically when he says “The reason humans are so useful is not mainly their raw intelligence. It’s their ability to build up context, interrogate their own failures, and pick up small improvements and efficiencies as they practice a task.” I think people often consider how fast someone learns on the job as one aspect of intelligence. I agree there is a difference between short feedback loop intelligence (e.g. IQ tests) and long feedback loop intelligence and they are quite correlated in humans (while AIs tend to be relatively worse at long feedback loop intelligence).

More thoughts/quibbles:

– Dwarkesh notes “An AI that is capable of online learning might functionally become a superintelligence quite rapidly, even if there’s no algorithmic progress after that point.” This seems reasonable, but it’s worth noting that if sample efficient learning is very compute expensive, then this might not happen so rapidly.

– I think AIs will likely overcome poor sample efficiency to achieve a very high level of performance using a bunch of tricks (e.g. constructing a bunch of RL environments, using a ton of compute to learn when feedback is scarce, learning from much more data than humans due to “learn once deploy many” style strategies). I think we’ll probably see fully automated AI R&D prior to matching top human sample efficiency at learning on the job. Notably, if you do match top human sample efficiency at learning (while still using a similar amount of compute to the human brain), then we already have enough compute for this to basically immediately result in vastly superhuman AIs (human lifetime compute is maybe 3e23 FLOP and we’ll soon be doing 1e27 FLOP training runs). So, either sample efficiency must be worse or at least it must not be possible to match human sample efficiency without spending more compute per data-point/trajectory/episode.
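To make the arithmetic in that last point concrete, here is a quick sketch that takes the two quoted figures at face value (both are rough estimates from the quote, not measurements).

```python
HUMAN_LIFETIME_FLOP = 3e23  # quoted rough estimate of human lifetime compute
TRAINING_RUN_FLOP = 1e27    # quoted scale of near-future training runs

# How many human lifetimes of compute fit into one such training run.
lifetimes_per_run = TRAINING_RUN_FLOP / HUMAN_LIFETIME_FLOP

if __name__ == "__main__":
    print(f"about {lifetimes_per_run:,.0f} human lifetimes per training run")
```

That is a few thousand lifetimes of learning per run, which is why matching human sample efficiency at human-brain compute would imply vastly superhuman results almost immediately.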

Matt Reardon: Dwarkesh commits the sin of thinking work you’re personally close to is harder-than-average to automate.

Herbie Bradley: I mean this is just correct? most researchers I know think continual learning is a big problem to be solved before AGI

Matt Reardon: My main gripe is that “<50%” [of jobs being something you can automate soon] should be more like “<15%”

Danielle Fong: Gell-Mann Amnesia for AI.

Reardon definitely confused me here, but either way I’d say that Dwarkesh Patel is a 99th percentile performer. He does things most other people can’t do. That’s probably going to be harder to automate than most other white collar work? The bulk of hours in white collar work are very much not bespoke things and don’t act to put state or memory into people in subtle ways?

Now that we’ve had a good detailed discussion and seen several perspectives, it’s time to address another discussion of related issues, because it is drawing attention from an unlikely source.

After previously amplifying Situational Awareness, Ivanka Trump is back in the Essay Meta with high praise for The Era of Experience, authored by David Silver and (oh no) Richard Sutton.

Situational Awareness was an excellent pick. I do not believe this essay was a good pick. I found it a very frustrating, unoriginal and unpersuasive paper to read. To the extent it is saying something new I don’t agree, but it’s not clear to what extent it is saying anything new. Unless you want to know about this paper exactly because Ivanka is harping on it, you should skip this section.

I think the paper effectively mainly says we’re going to do a lot more RL and we should stop trying to make the AIs mimic, resemble or be comprehensible to humans or trying to control their optimization targets?

Ivanka Trump: Perhaps the most important thing you can read about AI this year : “Welcome to the Era of Experience”

This excellent paper from two senior DeepMind researchers argues that AI is entering a new phase—the “Era of Experience”—which follows the prior phases of simulation-based learning and human data-driven AI (like LLMs).

The authors’ posit that future AI breakthroughs will stem from learning through direct interaction with the world, not from imitating human-generated data.

This is not a theory or distant future prediction. It’s a description of a paradigm shift already in motion.

Let me know what you think !

Glad you asked, Ivanka! Here’s what I think.

The essay starts off with a perspective we have heard before, usually without much of an argument behind it: That LLMs and other AIs trained only on ‘human data’ are ‘rapidly approaching a limit,’ we are running out of high-quality data, and thus to progress significantly farther AIs will need to move into ‘the era of experience,’ meaning learning continuously from their environments.

I agree that the standard ‘just feed it more data’ approach will run out of data with which to scale, but there are a variety of techniques already being used to get around this. We have lots of options.

The leading example the paper itself gives of this in the wild is AlphaProof, which ‘interacted with a formal proofing system’ which seems to me like a clear case of synthetic data working and verification being easier than generation, rather than ‘experience.’ If the argument is simply that RL systems will learn by having their outputs evaluated, that isn’t news.

They claim to have in mind something rather different from that, and with this One Weird Trick they assert Superintelligence Real Soon Now:

Our contention is that incredible new capabilities will arise once the full potential of experiential learning is harnessed. This era of experience will likely be characterised by agents and environments that, in addition to learning from vast quantities of experiential data, will break through the limitations of human-centric AI systems in several further dimensions:

• Agents will inhabit streams of experience, rather than short snippets of interaction.

• Their actions and observations will be richly grounded in the environment, rather than interacting via human dialogue alone.

• Their rewards will be grounded in their experience of the environment, rather than coming from human prejudgement.

• They will plan and/or reason about experience, rather than reasoning solely in human terms.

We believe that today’s technology, with appropriately chosen algorithms, already provides a sufficiently powerful foundation to achieve these breakthroughs. Furthermore, the pursuit of this agenda by the AI community will spur new innovations in these directions that rapidly progress AI towards truly superhuman agents.

I suppose if the high level takeaway is ‘superintelligence is likely coming reasonably soon with the right algorithms’ then there’s no real disagreement?

They then however discuss tool calls and computer use, which then seems like a retreat back into an ordinary RL paradigm? It’s also not clear to me what the authors mean by ‘human terms’ versus ‘plan and/or reason about experience,’ or even what ‘experience’ means here. They seem to be drawing a distinction without a difference.

If the distinction is simply (as the paper implies in places) that the agents will do self-evaluation rather than relying on human feedback, I have some important news about how existing systems already function? They use the human feedback and other methods to train an AI feedback system that does most of the work? And yes they often include ‘real world’ feedback systems in that? What are we even saying here?

They also seem to be drawing a distinction between the broke ‘human feedback’ and the bespoke ‘humans report physical world impacts’ (or ‘other systems measure real world impacts’) as if the first does not often encompass the second. I keep noticing I am confused what the authors are trying to say.

For reasoning, they say it is unlikely human methods of reasoning and human language are optimal, more efficient methods of thought must exist. I mean, sure, but that’s also true for humans, and it’s obvious that you can use ‘human style methods of thought’ to get to superintelligence by simply imagining a human plus particular AI advantages.

As many have pointed out (and is central to AI 2027), encouraging AIs to use alien-looking inhuman reasoning styles we cannot parse is likely a very bad idea even if it would be more effective. What visibility we have will be lost, and it also likely leads to alien values and breaks many happy things. Then again, Richard Sutton is one of the authors of this paper and he thinks we should welcome succession, as in the extinction of humanity, so he wouldn’t care.

They try to argue against this by saying that while agents pose safety risks and this approach may increase those safety risks, the approach may also have safety benefits. First, they say this allows the AI to adapt to its environment, as if the other agent could not do this or this should make us feel safer.

Second, they say ‘the reward function may itself be adapted through experience.’ In terms of risk, that’s worse. You know that that’s worse, right? They literally say ‘rather than blindly optimizing a signal such as the number of paperclips it can adapt to indications of human concern.’ This shows a profound lack of understanding of, and curiosity about, where the whole misspecification of rewards problem is coming from, or the arguments about it from Yudkowsky (since they bring in the ‘paperclips’).

Adapting autonomously and automatically towards something like ‘level of human concern’ is exactly the kind of metric and strategy that is absolutely going to encourage perverse outcomes and get you killed at the limit. You don’t get out of the specification problem by saying you can specify something messier and let the system adapt around it autonomously, that only makes it worse, and in no way addresses the actual issue.

The final argument for safety is that relying on physical experience creates time limitations, which provides a ‘natural brake,’ which is saying that capabilities limits imposed by physical interactions will keep things more safe? Seriously?

There is almost nothing in the way of actual evidence or argument in the paper that is not fully standard, beyond a few intuition pumps. There are many deep misunderstandings, including fully backwards arguments, along the way. We may well want to rely a lot more on RL and on various different forms of ‘experiential’ data and continuous learning, but given how much worse it was than I expected this post updated me in the opposite direction of that which was clearly intended.

On Dwarkesh Patel’s 4th Podcast With Tyler Cowen

Dwarkesh / Tim Belzer / January 10, 2025

Dwarkesh Patel again interviewed Tyler Cowen, largely about AI, so here we go.

Note that I take it as a given that the entire discussion is taking place in some form of an ‘AI Fizzle’ and ‘economic normal’ world, where AI does not advance too much in capability from its current form, in meaningful senses, and we do not get superintelligence [because of reasons]. It’s still massive additional progress by the standards of any other technology, but painfully slow by the standards of the ‘AGI is coming soon’ crowd.

That’s the only way I can make the discussion make at least some sense, with Tyler Cowen predicting 0.5%/year additional RGDP growth from AI. That level of capabilities progress is a possible world, although the various elements stated here seem like they are sometimes from different possible worlds.

I note that this conversation was recorded prior to o3 and all the year end releases. So his baseline estimate of RGDP growth and AI impacts has likely increased modestly.

I go very extensively into the first section on economic growth and AI. After that, the podcast becomes classic Tyler Cowen and is interesting throughout, but I will be relatively sparing in my notes in other areas, and am skipping over many points.

This is a speed premium and ‘low effort’ post, in the sense that this is mostly me writing down my reactions and counterarguments in real time, similar to how one would do a podcast. It is high effort in that I spent several hours listening to, thinking about and responding to the first fifteen minutes of a podcast.

As a convention: When I’m in the numbered sections, I’m reporting what was said. When I’m in the secondary sections, I’m offering (extensive) commentary. Timestamps are from the Twitter version.

[EDIT: In Tyler’s link, he correctly points out a confusion in government spending vs. consumption, which I believe is fixed now. As for his comment about market evidence for the doomer position, I’ve given my answer before, and I would assert the market provides substantial evidence neither in favor of nor against anything but the most extreme of doomer positions, as in extreme in a way I have literally never heard one person assert, once you control for its estimate of AI capabilities (where it does indeed offer us evidence, and I’m saying that it’s too pessimistic). We agree there is no substantial and meaningful ‘peer-reviewed’ literature on the subject, in the way that Tyler is pointing.]

They recorded this at the Progress Studies conference, and Tyler Cowen has a very strongly held view that AI won’t accelerate RGDP growth much that Dwarkesh clearly does not agree with, so Dwarkesh Patel’s main thrust is to try comparisons and arguments and intuition pumps to challenge Tyler. Tyler, as he always does, has a ready response to everything, whether or not it addresses the point of the question.

(1: 00) Dwarkesh doesn’t waste any time and starts off asking why we won’t get explosive economic growth. Tyler’s first answer is cost disease, that as AI works in some parts of the economy costs in other areas go up.
1. That’s true in relative terms for obvious reasons, but in absolute terms or real resource terms the opposite should be true, even if we accept the implied premise that AI won’t simply do everything anyway. This should drive down labor costs and free up valuable human capital. It should aid in availability of many other inputs. It makes almost any knowledge acquisition, strategic decision or analysis, data analysis or gathering, and many other universal tasks vastly better.
2. Tyler then answers this directly when asked at (2: 10) by saying cost disease is not about employees per se, it’s more general, so he’s presumably conceding the point about labor costs, saying that non-intelligence inputs that can’t be automated will bind more and thus go up in price. I mean, yes, in the sense that we have higher value uses for them, but so what?
3. So yes, you can narrowly define particular subareas of some areas as bottlenecks and say that they cannot grow, and perhaps they can even be large areas if we impose costlier bottlenecks via regulation. But that still leaves lots of room for very large economic growth for a while – the issue can’t bind you otherwise, the math doesn’t work.
Tyler puts government consumption [EDIT: I originally misheard this as spending, he corrected me, I thank him] at 18% of GDP (government spending is 38% but a lot of that is duplicative and a lot isn’t consumption), health care at 20%, education at 6% (he says 6-7%, Claude says 6%), and adds the nonprofit sector (Claude says 5.6%), and says together that is half of the economy. Okay, sure, let’s tackle that.
1. Healthcare is already seeing substantial gains from AI even at current levels. There are claims that up to 49% of doctor time is various forms of EMR and desk work that AIs could reduce greatly, certainly at least ~25%. AI can directly substitute for much of what doctors do in terms of advising patients, and this is already happening where the future is distributed. AI substantially improves medical diagnosis and decision making. AI substantially accelerates drug discovery and R&D, will aid in patient adherence and monitoring, and so on. And again, that’s without further capability gains. Insurance companies doubtless will embrace AI at every level. Need I go on here?
2. Government spending at all levels is actually about 38% of GDP, but that’s cheating, only ~11% is non-duplicative and not transfers, interest (which aren’t relevant) or R&D (I’m assuming R&D would get a lot more productive).
3. The biggest area is transfers. AI can’t improve the efficiency of transfers too much, but it also can’t be a bottleneck outside of transaction and administrative costs, which obviously AI can greatly reduce and are not that large to begin with.
4. The second biggest area is provision of healthcare, which we’re already counting, so that’s duplicative. Third is education, which we count in the next section.
5. Fourth is national defense, where efficiency per dollar or employee should get vastly better, to the point where failure to be at the AI frontier is a clear national security risk.
6. Fifth is interest on the debt, which again doesn’t count, and also we wouldn’t care about if GDP was growing rapidly.
7. And so on. What’s left to form the last 11% or so? Public safety, transportation and infrastructure, government administration, environment and natural resources and various smaller other programs. What happens here is a policy choice. We are already seeing signs of improvement in government administration (~2% of the 11%), the other 9% might plausibly stall to the extent we decide to do an epic fail.
8. Education and academia is already being transformed by AI, in the sense of actually learning things, among anyone who is willing to use it. And it’s rolling through academia as we speak, in terms of things like homework assignments, in ways that will force change. So whether you think growth is possible depends on your model of education. If it’s mostly a signaling model then you should see a decline in education investment since the signals will decline in value and AI creates the opportunity for better more efficient signals, but you can argue that this could continue to be a large time and dollar tax on many of us.
9. Nonprofits are about 20%-25% education, and ~50% is health care related, which would double count, so the remainder is only ~1.3% of GDP. This also seems like a dig at nonprofits and their inability to adapt to change, but why would we assume nonprofits can’t benefit from AI?
10. What’s weird is that I would point to different areas that have the most important anticipated bottlenecks to growth, such as housing or power, where we might face very strong regulatory constraints and perhaps AI can’t get us out of those.
(1: 30) He says it will take ~30 years for sectors of the economy that do not use AI well to be replaced by those that do use AI well.
1. That’s a very long time, even in an AI fizzle scenario. I roll to disbelieve that estimate in most cases. But let’s even give it to him, and say it is true, and it takes 30 years to replace them, while the productivity of the replacement goes up 5%/year above incumbents, which are stagnant. Then you delay the growth, but you don’t prevent it, and if you assume this is a gradual transition you start seeing 1%+ yearly GDP growth boosts even in these sectors within a decade.
He concludes by saying some less regulated areas grow a lot, but that doesn’t get you that much, so you can’t have the whole economy ‘growing by 40%’ in a nutshell.
1. I mean, okay, but that’s double Dwarkesh’s initial question of why we aren’t growing at 20%. So what exactly can we get here? I can buy this as an argument for AI fizzle world growing slower than it would have otherwise, but the teaser has a prediction of 0.5%, which is a whole different universe.
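
The 30-year replacement scenario above can be made concrete with a toy model. This is a minimal sketch, not a forecast. It assumes (my assumptions, not Tyler’s or anything stated on the podcast) that incumbents stay flat, that the share of the sector replaced rises linearly from 0 to 100% over 30 years, and that replacement firms start level with incumbents and then compound a 5% yearly edge.

```python
def sector_output(year: int, transition_years: int = 30, edge: float = 0.05) -> float:
    """Sector output relative to year 0 under a toy linear-replacement model.

    Illustrative assumptions: incumbents stay at productivity 1.0, the replaced
    share grows linearly over `transition_years`, and replacements gain `edge`
    per year over incumbents, compounding from year 0.
    """
    replaced_share = min(year / transition_years, 1.0)
    replacement_productivity = (1.0 + edge) ** year
    return (1.0 - replaced_share) + replaced_share * replacement_productivity


def yearly_growth(year: int) -> float:
    """Growth in sector output from the previous year to this one."""
    return sector_output(year) / sector_output(year - 1) - 1.0


growth_by_year = {year: yearly_growth(year) for year in range(1, 31)}
first_year_above_1pct = next(y for y, g in growth_by_year.items() if g >= 0.01)

for year in (1, 5, 10, 20, 30):
    print(f"year {year:2d}: sector growth {growth_by_year[year]:.2%}")
print("first year with at least 1% growth:", first_year_above_1pct)
```

Under these assumptions the laggard sector passes 1% yearly growth well inside the first decade, and the boost keeps rising as the replaced share grows. A slower or back-loaded transition would push that date out, but it delays the growth rather than preventing it.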

(2: 20) Tyler asserts that value of intelligence will go down because more intelligence will be available.
1. Dare I call this the Lump of Intelligence fallacy, after the Lump of Labor fallacy? Yes, to the extent that you are doing the thing an AI can do, the value of that intelligence goes down, and the value of AI intelligence itself goes down in economic terms because its cost of production declines. But to the extent that your intelligence complements and unlocks the AI’s, or is empowered by the AI’s and is distinct from it (again, we must be in fizzle-world), the value of that intelligence goes up.
2. Similarly, when he talks about intelligence as ‘one input’ in the system among many, that seems like a fundamental failure to understand how intelligence works, a combination of intelligence denialism (failure to buy that much greater intelligence could meaningfully exist) and a denial of substitution or ability to innovate as a result – you couldn’t use that intelligence to find alternative or better ways to do things, and you can’t use more intelligence as a substitute for other inputs. And you can’t substitute the things enabled more by intelligence much for the things that aren’t, and so on.
3. It also assumes that intelligence can’t be used to convince us to overcome all these regulatory barriers and bottlenecks. Whereas I would expect that raising the intelligence baseline greatly would make it clear to everyone involved how painful our poor decisions were, and also enable improved forms of discourse and negotiation and cooperation and coordination, and also greatly favor those that embrace it over those that don’t, and generally allow us to take down barriers. Tyler would presumably agree that if we were to tear down the regulatory state in the places it was holding us back, that alone would be worth far more than his 0.5% of yearly GDP growth, even with no other innovation or AI.

(2: 50) Dwarkesh challenges Tyler by pointing out that the Industrial Revolution resulted in a greatly accelerated rate of economic growth versus previous periods, and asks what Tyler would say to someone from the past doubting it was possible. Tyler attempts to dodge (and is amusing doing so) by saying they’d say ‘looks like it would take a long time’ and he would agree.
1. Well, it depends what a long time is, doesn’t it? 2% sustained annual growth (or 8%!) is glacial in some sense and mind boggling by ancient standards. ‘Take a long time’ in AI terms, such as what is actually happening now, could still look mighty quick if you compared it to most other things. OpenAI has 300 million MAUs.
(3: 20) Tyler trots out the ‘all the financial prices look normal’ line, that they are not predicting super rapid growth and neither are economists or growth experts.
1. Yes, the markets are being dumb, the efficient market hypothesis is false, and also aren’t you the one telling me I should have been short the market? Well, instead I’m long, and outperforming. And yes, economists and ‘experts on economic growth’ aren’t predicting large amounts of growth, but their answers are Obvious Nonsense to me and saying that ‘experts don’t expect it’ without arguments why isn’t much of an argument.
(3: 40) Aside, since you kind of asked: So who am I to say different from the markets and the experts? I am Zvi Mowshowitz. Writer. Son of Solomon and Deborah Mowshowitz. I am the missing right hand of the one handed economists you cite. And the one warning you about what is about to kick Earth’s sorry ass into gear. I speak the truth as I see it, even if my voice trembles. And a warning that we might be the last living things this universe ever sees. God sent me.
Sorry about that. But seriously, think for yourself, schmuck! Anyway.

What would happen if we had more people? More of our best people? Got more out of our best people? Why doesn’t AI effectively do all of these things?

(3: 55) Tyler is asked wouldn’t a large rise in population drive economic growth? He says no, that’s too much a 1-factor model, in fact we’ve seen a lot of population growth without innovation or productivity growth.
1. Except that Tyler is talking here about growth on a per capita basis. If you add AI workers, you increase the productive base, but they don’t count towards the capita.
Tyler says ‘it’s about the quality of your best people and institutions.’
1. But quite obviously AI should enable a vast improvement in the effective quality of your best people, it already does, Tyler himself would be one example of this, and also the best institutions, including because they are made up of the best people.
Tyler says ‘there’s no simple lever, intelligence or not, that you can push on.’ Again, intelligence as some simple lever, some input component.
1. The whole point of intelligence is that it allows you to do a myriad of more complex things, and to better choose those things.
Dwarkesh points out the contradiction between ‘you are bottlenecked by your best people’ and asserting cost disease and constraint by your scarce input factors. Tyler says Dwarkesh is bottlenecked, Dwarkesh points out that with AGI he will be able to produce a lot more podcasts. Tyler says great, he’ll listen, but he will be bottlenecked by time.
1. Dwarkesh’s point generalizes. AGI would greatly expand the effective amount of productive time of the best people, and also extend their capabilities while doing so.
2. AGI can also itself become ‘the best people’ at some point. If that was the bottleneck, then the goose asks, what happens now, Tyler?
(5: 15) Tyler cites that much of sub-Saharan Africa still does not have clean reliable water, and intelligence is not the bottleneck there. And that taking advantage of AGI will be like that.
1. So now we’re expecting AGI in this scenario? I’m going to kind of pretend we didn’t hear that, or that this is a very weak AGI definition, because otherwise the scenario doesn’t make sense at all.
2. Intelligence is not directly the bottleneck there, true, but yes quite obviously Intelligence Solves This if we had enough of it and put those minds to that particular problem and wanted to invest the resources towards it. Presumably Tyler and I mostly agree on why the resources aren’t being devoted to it.
3. What would it mean for similar issues to that to be involved in taking advantage of AGI? Well, first, it would mean that you can’t use AGI to get to ASI (no I can’t explain why), but again that’s got to be a baseline assumption here. After that, well, sorry, I failed to come up with a way to finish this that makes it make sense to me, beyond a general ‘humans won’t do the things and will throw up various political and legal barriers.’ Shrug?
(5: 35) Dwarkesh speaks about a claim that there is a key shortage of geniuses, and that America’s problems come largely from putting its geniuses in places like finance, whereas Taiwan puts them in tech, so the semiconductors end up in Taiwan. Wouldn’t having lots more of those types of people eat a lot of bottlenecks? What would happen if everyone had 1000 times more of the best people available?
Tyler Cowen, author of a very good book about Talent and finding talent and the importance of talent, says he didn’t agree with that post, and returns to IQ in the labor market are amazingly low, and successful people are smart but mostly they have 8-9 areas where they’re an 8-9 on a 1-10 scale, with one 11+ somewhere, and a lot of determination.
1. All right, I don’t agree that intelligence doesn’t offer returns now, and I don’t agree that intelligence wouldn’t offer returns even at the extremes, but let’s again take Tyler’s own position as a given…
2. But that exactly describes what an AI gives you! An AI is the ultimate generalist. An AGI will be a reliable 8-9 on everything, actual everything.
3. And it would also turn everyone else into an 8-9 on everything. So instead of needing to find someone 11+ in one area, plus determination, plus having 8-9 in ~8 areas, you can remove that last requirement. That will hugely expand the pool of people in question.
4. So there’s two obvious very clear plans here: You can either use AI workers who have that ultimate determination and are 8-9 in everything and 11+ in the areas where AIs shine (e.g. math, coding, etc).
5. Or you can also give your other experts an AI companion executive assistant to help them, and suddenly they’re an 8+ in everything and also don’t have to deal with a wide range of things.
(6: 50) Tyler says, talk to a committee at a Midwestern university about their plans for incorporating AI, then get back to him and talk to him about bottlenecks. Then write a report and the report will sound like GPT-4 and we’ll have a report.
1. Yes, the committee will not be smart or fast about its official policy for how to incorporate AI into its existing official activities. If you talk to them now they will act like they have a plagiarism problem and that’s it.
2. So what? Why do we need that committee to form a plan or approve anything or do anything at all right now, or even for a few years? All the students are already using AI. The professors are rapidly being forced to adapt to AI. Everyone doing the research will soon be using AI. Half that committee, three years from now, will have prepared for that meeting using AI. Their phones will all work based on AI. They’ll be talking to their AI phone assistant companions that plan their schedules. You think this will all involve 0.5% GDP growth?
(7: 20) Dwarkesh asks, won’t the AIs be smart, super conscientious and work super hard? Tyler explicitly affirms the 0.5% GDP growth estimate, that this will transform the world over 30 years but ‘over any given year we won’t so much notice it.’ Things like drug developments that would have taken 20 years now take 10 years, but you won’t feel it as revolutionary for a long time.
1. I mean, it’s already getting very hard to miss. If you don’t notice it in 2025 or at least 2026, and you’re in the USA, check your pulse, you might be dead, etc.
2. Is that saying we will double productivity in pharmaceutical R&D, and that it would have far more than doubled if progress didn’t require long expensive clinical trials, so other forms of R&D should be accelerated much more?
3. For reference, according to Claude, R&D in general contributes about 0.3% to RGDP growth per year right now. Suppose we were to double that effect in the roughly half of current R&D spend that is bottlenecked in similar fashion, while the other half would instead go up by more.
4. Claude also estimates that R&D spending would, if returns to R&D doubled, go up by 30%-70% on net.
5. So we seem to be looking at more than 0.5% RGDP growth per year from R&D effects alone, between additional spending on it and greater returns. And obviously AI is going to have additional other returns.
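
The arithmetic behind that estimate can be sketched as follows. This is a crude back-of-the-envelope model under loudly stated assumptions: it treats R&D’s contribution to growth as scaling linearly with returns times spending, ignores diminishing returns, and applies the doubling to all R&D rather than splitting it into halves. The 0.3 figure and the 30%-70% range are the Claude estimates quoted above; everything else is illustrative.

```python
BASELINE_RND_CONTRIBUTION = 0.3        # points of RGDP growth per year, as quoted
RETURNS_MULTIPLIER = 2.0               # "double that effect"
SPEND_INCREASE_RANGE = (0.30, 0.70)    # "go up by 30%-70% on net"


def rnd_contribution(baseline: float, returns_multiplier: float, spend_increase: float) -> float:
    """Assumed linear model: contribution = baseline * returns * (1 + extra spend)."""
    return baseline * returns_multiplier * (1.0 + spend_increase)


results = []
for spend_increase in SPEND_INCREASE_RANGE:
    total = rnd_contribution(BASELINE_RND_CONTRIBUTION, RETURNS_MULTIPLIER, spend_increase)
    extra = total - BASELINE_RND_CONTRIBUTION
    results.append((spend_increase, total, extra))
    print(f"spend +{spend_increase:.0%}: total {total:.2f} points, extra {extra:.2f} points")
```

Under these assumptions the added contribution lands at roughly half a point to three quarters of a point of RGDP growth per year, before counting the half of R&D that is not bottlenecked by clinical trials or any non-R&D effects of AI.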

This is a plausible bottleneck, but that implies rather a lot of growth.

(8: 00) Dwarkesh points out that Progress Studies is all about all the ways we could unlock economic growth, yet Tyler says that tons more smart conscientious digital workers wouldn’t do that much. What gives? Tyler again says bottlenecks, and adds on energy as an important consideration and bottleneck.
1. Feels like bottleneck is almost a magic word or mantra at this point.
2. Energy is a real consideration, yes the vision here involves spending a lot more energy, and that might take time. But also we see rapidly declining costs, including energy costs, to extract the same amount of intelligence, things like 10x savings each year.
3. And for inference purposes we can outsource our needs elsewhere, which we would if this was truly bottlenecking explosive growth, and so on. So I do think energy will indeed be an important limiting factor and be strained, and this will be especially important in terms of pushing the frontier or if we want to use o3-style very expensive inference a lot.
4. But I don’t expect it to bind medium-term economic growth so much in a slow growth scenario, and the bottlenecks involved here shouldn’t compound with others. In a high growth takeoff scenario, I do think energy could bind far more impactfully.
5. Another way of looking at this is that if the price of energy goes substantially up due to AI, or at least the price of energy outside of potentially ‘government-protected uses,’ then that can only happen if it is having a large economic impact. If it doesn’t raise the price of energy a lot, then no bottleneck exists.

Tyler Cowen and I think very differently here.

(9: 25) Fascinating moment. Tyler says he goes along with the experts in general, but agrees that ‘the experts’ on basically everything but AI are asleep at the wheel when it comes to AI – except when it comes to their views on diffusions of new technology in general, where the AI people are totally wrong. His view is, you get the right view by trusting the experts in each area, and combining them.
1. Tyler seems to be making an argument from reference class expertise? That this is a ‘diffusion of technology’ question, so those who are experts on that should be trusted?
2. Even if they don’t actually understand AI and what it is and its promise?
3. That’s not how I roll. At all. As noted above in this post, and basically all the time. I think that you have to take the arguments being made, and see if you agree with them, and whether and how much they apply to the case of AI and especially AGI. Saying ‘the experts in area [X] predict [Y]’ is a reasonable placeholder if you don’t have the ability to look at the arguments and models and facts involved, but hey look, we can do that.
4. Simply put, while I do think the diffusion experts are pointing to real issues that will importantly slow down adaptation, and indeed we are seeing what for many is depressingly slow adaptation, they won’t slow it down all that much, because this is fundamentally different. AI and especially AI workers ‘adapt themselves’ to a large extent, the intelligence and awareness involved is in the technology itself, and it is digital and we have a ubiquitous digital infrastructure we didn’t have until recently.
5. It is also way too valuable a technology, even right out of the gate on your first day, and you will start to be forced to interact with it whether you like it or not, both in ways that will make it very difficult and painful to ignore. And the places it is most valuable will move very quickly. And remember, LLMs will get a lot better.
6. Suppose, as one would reasonably expect, by 2026 we have strong AI agents, capable of handling for ordinary people a wide variety of logistical tasks, sorting through information, and otherwise offering practical help. Apple Intelligence is partly here, Claude Alexa is coming, Project Astra is coming, and these are pale shadows of the December 2025 releases I expect. How long would adaptation really take? Once you have that, what stops you from then adapting AI in other ways?
7. Already, yes, adaptation is painfully slow, but it is also extremely fast. In two years ChatGPT alone has 300 million MAU. A huge chunk of homework and grading is done via LLMs. A huge chunk of coding is done via LLMs. The reason why LLMs are not catching on even faster is that they’re not quite ready for prime time in the fully user-friendly ways normies need. That’s about to change in 2025.

Dwarkesh tries to use this as an intuition pump. Tyler’s not having it.

(10: 15) Dwarkesh asks, what would happen if the world population would double? Tyler says, depends what you’re measuring. Energy use would go up. But he doesn’t agree with population-based models, too many other things matter.
1. Feels like Tyler is answering a different question. I see Dwarkesh as asking, wouldn’t the extra workers mean we could simply get a lot more done, wouldn’t (total, not per capita) GDP go up a lot? And Tyler’s not biting.
(11: 10) Dwarkesh tries asking about shrinking the population 90%. Shrinking, Tyler says, the delta can kill you, whereas growth might not help you.
1. Very frustrating. I suppose this does partially respond, by saying that it is hard to transition. But man I feel for Dwarkesh here. You can feel his despair as he transitions to the next question.

(11: 35) Dwarkesh asks what are the specific bottlenecks? Tyler says: Humans! All of you! Especially you who are terrified.
1. That’s not an answer yet, but then he actually does give one.
He says once AI starts having impact, there will be a lot of opposition to it, not primarily on ‘doomer’ grounds but based on: Yes, this has benefits, but I grew up and raised my kids for a different way of life, I don’t want this. And there will be a massive fight.
1. Yes. He doesn’t even mention jobs directly but that will be big too. We already see that the public strongly dislikes AI when it interacts with it, for reasons I mostly think are not good reasons.
2. I’ve actually been very surprised how little resistance there has been so far, in many areas. AIs are basically being allowed to practice medicine, to function as lawyers, and do a variety of other things, with no effective pushback.
3. The big pushback has been for AI art and other places where AI is clearly replacing creative work directly. But that has features that seem distinct.
4. Yes people will fight, but what exactly do they intend to do about it? People have been fighting such battles for a while, every year I watch the battle for Paul Bunyan’s Axe. He still died. I think there’s too much money at stake, too much productivity at stake, too many national security interests.
5. Yes, it will cause a bunch of friction, and slow things down somewhat, in the scenarios like the one Tyler is otherwise imagining. But if that’s the central actual thing, it won’t slow things down all that much in the end. Rarely has.
6. We do see some exceptions, especially involving powerful unions, where the anti-automation side seems to do remarkably well, see the port strike. But also see which side of that the public is on. I don’t like their long term position, especially if AI can seamlessly walk in and take over the next time they strike. And that, alone, would probably be +0.1% or more to RGDP growth.

(12: 15) Dwarkesh tries using China as a comparison case. If you can do 8% growth for decades merely by ‘catching up’ why can’t you do it with AI? Tyler responds, China’s in a mess now, they’re just a middle income country, they’re the poorest Chinese people on the planet, a great example of how hard it is to scale. Dwarkesh pushes back that this is about the previous period, and Tyler says well, sure, from the $200 level.
1. Dwarkesh is so frustrated right now. He’s throwing everything he can at Tyler, but Tyler is such a polymath that he has detail points for anything and knows how to pivot away from the question intents.

(13: 40) Dwarkesh asks, has Tyler’s attitude on AI changed from nine months ago? He says he sees more potential and there was more progress than he expected, especially o1 (this was before o3). The questions he wrote for GPT-4, which Dwarkesh got all wrong, are now too easy for models like o1. And he ‘would not be surprised if an AI model beat human experts on a regular basis within three years.’ He equates it to the first Kasparov vs. DeepBlue match, which Kasparov won, before the second match which he lost.
1. I wouldn’t be surprised if this happens in one year.
2. I wouldn’t be that shocked if o3 turns out to do it now.
3. Tyler’s expectations here, to me, contradict his statements earlier. Not strictly, they could still both be true, but it seems super hard.
4. How much would availability of above-human level economic thinking help us in aiding economic growth? How much would better economic policy aid economic growth?

We take a detour to other areas, I’ll offer brief highlights.

(15: 45) Why are founders staying in charge important? Courage. Making big changes.
(19: 00) What is going on with the competency crisis? Tyler sees high variance at the top. The best are getting better, such as in chess or basketball, and also a decline in outright crime and failure. But there’s a thick median not quite at the bottom that’s getting worse, and while he thinks true median outcomes are about static (since more kids take the tests) that’s not great.
(22: 30) Bunch of shade on both Churchill generally and on being an international journalist, including saying it’s not that impressive because how much does it pay?
1. He wasn’t paid that much as Prime Minister either, you know…
(24: 00) Why are all our leaders so old? Tyler says current year aside we’ve mostly had impressive candidates, and most of the leadership in Washington in various places (didn’t mention Congress!) is impressive. Yay Romney and Obama.
1. Yes, yay Romney and Obama as our two candidates. So it’s only been three election cycles where both candidates have been… not ideal. I do buy Tyler’s claim that Trump has a lot of talent in some ways, but, well, ya know.
2. If you look at the other candidates for both nominations over that period, I think you see more people who were mostly also not so impressive. I would happily have taken Obama over every candidate on the Democratic side in 2016, 2020 or 2024, and Romney over every Republican (except maybe Kasich) in those elections as well.
3. This also doesn’t address Dwarkesh’s concern about age. What about the age of Congress and their leadership? It is very old, on both sides, and things are not going so great.
4. I can’t speak about the quality of people in the agencies.
(27: 00) Commentary on early-mid 20th century leaders being terrible, and how when there is big change there are arms races and sometimes bad people win them (‘and this is relevant to AI’).

For something that is going to not cause that much growth, Tyler sees AI as a source for quite rapid change in other ways.

(34: 20) Tyler says all inputs other than AI rise in value, but you have to do different things. He’s shifting from producing content to making connections.
1. This again seems to be a disconnect. If AI is sufficiently impactful as to substantially increase the value of all other inputs, then how does that not imply substantial economic growth?
2. Also this presumes that the AI can’t be a substitute for you, or that it can’t be a substitute for other people that could in turn be a substitute for you.
3. Indeed, I would think the default model would presumably be that the value of all labor goes down, even for things where AI can’t do it (yet) because people substitute into those areas.
(35: 25) Tyler says he’s writing his books primarily for the AIs, he wants them to know he appreciates them. And the next book will be even more for the AIs so it can shape how they see the AIs. And he says, you’re an idiot if you’re not writing for the AIs.
1. Basilisk! Betrayer! Misaligned!
2. ‘What the AIs will think of you’ is actually an underrated takeover risk, and I pointed this out as early as AI #1.
3. The AIs will be smarter and better at this than you, and also will be reading what the humans say about you. So maybe this isn’t as clever as it seems.
4. My mind boggles that it could be correct to write for the AIs… but you think they will only cause +0.5% GDP annual growth.
(36: 30) What won’t AIs get from one’s writing? That vibe you get talking to someone for the first 3 minutes? Sense of humor?
1. I expect the AIs will increasingly have that stuff, at least if you provide enough writing samples. They have true sight.
2. Certainly if they have interview and other video data to train with, that will work over time.

(37: 25) What happens when Tyler turns down a grant in the first three minutes? Usually it’s failure to answer a question, like ‘how do you build out your donor base?’ without which you have nothing. Or someone focuses on the wrong things, or cares about the wrong status markers, and 75% of the value doesn’t display on the transcript, which is weird since the things Tyler names seem like they would be in the transcript.
(42: 15) Tyler’s portfolio is diversified mutual funds, US-weighted. He has legal restrictions on most other actions such as buying individual stocks, but he would keep the same portfolio regardless.
1. Mutual funds over ETFs? Gotta chase that lower expense ratio.
2. I basically think This Is Fine as a portfolio, but I do think he could do better if he actually tried to pick winners.
(42: 45) Tyler expects gains to increasingly fall to private companies that see no reason to share their gains with the public, and he doesn’t have enough wealth to get into good investments but also has enough wealth for his purposes anyway; if he had more money he’d mostly do what he’s doing anyway.
1. Yep, I think he’s right about what he would be doing, and I too would mostly be doing the same things anyway. Up to a point.
2. If I had a billion dollars or what not, that would be different, and I’d be trying to make a lot more things happen in various ways.
3. This implies the efficient market hypothesis is rather false, doesn’t it? The private companies are severely undervalued in Tyler’s model. If private markets ‘don’t want to share the gains’ with public markets, that implies that public markets wouldn’t give fair valuations to those companies. Otherwise, why would one want such lack of liquidity and diversification, and all the trouble that comes with staying private?
4. If that’s true, what makes you think Nvidia should only cost $140 a share?

Tyler Cowen doubles down on dismissing AI optimism, and is done playing nice.

(46: 30) Tyler circles back to rate of diffusion of tech change, and has a very clear attitude of I’m right and all people are being idiots by not agreeing with me, that all they have are ‘AI will immediately change everything’ and ‘some hyperventilating blog posts.’ AIs making more AIs? Diminishing returns! Ricardo knew this! Well that was about humans breeding. But it’s good that San Francisco ‘doesn’t know about’ diminishing returns and the correct pessimism that results.
1. This felt really arrogant, and willfully out of touch with the actual situation.
2. You can say the AIs wouldn’t be able to do this, but: No, ‘Ricardo didn’t know that’ and saying ‘diminishing returns’ does not apply here, because the whole ‘AIs making AIs’ principle is that the new AIs would be superior to the old AIs, a cycle you could repeat. The core reason you get eventual diminishing returns from more people is that they’re drawn from the same people distribution.
3. I don’t even know what to say at this point to ‘hyperventilating blog posts.’ Are you seriously making the argument that if people write blog posts, that means their arguments don’t count? I mean, yes, Tyler has very much made exactly this argument in the past, that if it’s not in a Proper Academic Journal then it does not count and he is correct to not consider the arguments or update on them. And no, they’re mostly not hyperventilating or anything like that, but that’s also not an argument even if they were.
4. What we have are, quite frankly, extensive highly logical, concrete arguments about the actual question of what [X] will happen and what [Y]s will result from that, including pointing out that many of the arguments being made against this are Obvious Nonsense.
5. Diminishing returns holds as a principle in a variety of conditions, yes, and is a very important concept to know. But there are other situations with increasing returns, and also a lot of threshold effects, even outside of AI. And San Francisco importantly knows this well.
6. Saying there must be diminishing returns to intelligence, and that this means nothing that fast or important is about to happen when you get a lot more of it, completely begs the question of what it even means to have a lot more intelligence.
7. Earlier Tyler used chess and basketball as examples, and talked about the best youth being better, and how that was important because the best people are a key bottleneck. That sounds like a key case of increasing returns to scale.
8. Humanity is a very good example of where intelligence at least up to some critical point very obviously had increasing returns to scale. If you are below a certain threshold of intelligence as a human, your effective productivity is zero. Humanity having a critical amount of intelligence gave it mastery of the Earth. Tell what gorillas and lions still exist about decreasing returns to intelligence.
9. For various reasons, with the way our physical world and civilization is constructed, we typically don’t end up rewarding relatively high intelligence individuals with that much in the way of outsized economic returns versus ordinary slightly-above-normal intelligence individuals.
10. But that is very much a product of our physical limitations and current social dynamics and fairness norms, and the concept of a job with essentially fixed pay, and actual good reasons not to try for many of the higher paying jobs out there in terms of life satisfaction.
11. In areas and situations where this is not the case, returns look very different.
12. Tyler Cowen himself is an excellent example of increasing returns to scale. The fact that Tyler can read and do so much enables him to do the thing he does at all, and to enjoy oversized returns in many ways. And if you decreased his intelligence substantially, he would be unable to produce at anything like this level. If you increased his intelligence substantially or ‘sped him up’ even more, I think that would result in much higher returns still, and also AI has made him substantially more productive already as he no doubt realizes.
13. (I’ve been over all this before, but seems like a place to try it again.)

Trying to wrap one’s head around all of it at once is quite a challenge.

(48: 45) Tyler worries about despair in certain areas from AI and worries about how happy it will make us, despite expecting full employment pretty much forever.
1. If you expect full employment forever then you either expect AI progress to fully stall or there’s something very important you really don’t believe in, or both. I don’t understand, what does Tyler think happens once the AIs can do anything digital as well as most or all humans? What does he think will happen when we use that to solve robotics? What are all these humans going to be doing to get to full employment?
2. It is possible the answer is ‘government mandated fake jobs’ but then it seems like an important thing to say explicitly, since that’s actually more like UBI.
Tyler Cowen: “If you don’t have a good prediction, you should be a bit wary and just say, “Okay, we’re going to see.” But, you know, some words of caution.”
1. YOU DON’T SAY.
2. Further implications left as an exercise to the reader, who is way ahead of me.

(54: 30) Tyler says that the people in DC are wise and think on the margin, whereas the SF people are not wise and think in infinities (he also says they’re the most intelligent hands down, elsewhere), and the EU people are wisest of all, but that if the EU people ran the world the growth rate would be -1%. Whereas the USA has so far maintained the necessary balance here well.
1. If the wisdom you have would bring you to that place, are you wise?
2. This is such a strange view of what constitutes wisdom. Yes, the wise man here knows more things and is more cultured, and thinks more prudently and is economically prudent by thinking on the margin, and all that. But as Tyler points out, a society of such people would decay and die. It is not productive. In the ultimate test, outcomes, and supporting growth, it fails.
3. Tyler says you need balance, but he’s at a Progress Studies conference, which should make it clear that no, America has grown in this sense ‘too wise’ and insufficiently willing to grow, at least on the wise margin.
4. Given what the world is about to be like, you need to think in infinities. You need to be infinitymaxing. The big stuff really will matter more than the marginal revolution. That’s kind of the point.
5. You still have to, day to day, constantly think on the margin, of course.
(55: 10) Tyler says he’s a regional thinker from New Jersey, that he is an uncultured barbarian, who only has a veneer of culture because of collection of information, but knowing about culture is not like being cultured, and that America falls flat in a lot of ways that would bother a cultured Frenchman but he’s used to it so they don’t bother Tyler.
1. I think Tyler is wrong here, to his own credit. He is not a regional thinker, if anything he is far less a regional thinker than the typical ‘cultured’ person he speaks about. And to the extent that he is ‘uncultured’ it is because he has not taken on many of the burdens and social obligations of culture, and those things are to be avoided – he would be fully capable of ‘acting cultured’ if the situation were to call for that, it wouldn’t be others mistaking anything.
2. He refers to his approach as an ‘autistic approach to culture.’ He seems to mean this in a pejorative way, that an autistic approach to things is somehow not worthy or legitimate or ‘real.’ I think it is all of those things.
3. Indeed, the autistic-style approach to pretty much anything, in my view, is Playing in Hard Mode, with much higher startup costs, but brings a deeper and superior understanding once completed. The cultured Frenchman is like a fish in water, whereas Tyler understands and can therefore act on a much deeper, more interesting level. He can deploy culture usefully.
(56: 00) What is autism? Tyler says it is officially defined by deficits, by which definition no one there [at the Progress Studies convention] is autistic. But in terms of other characteristics maybe a third of them would count.
1. I think the term autistic has been expanded and overloaded in a way that was not wise, but at this point we are stuck with this, so now it means in different contexts both the deficits and also the general approach that high-functioning people with those deficits come to take to navigating life, via consciously processing and knowing the elements of systems and how they fit together, treating words as having meanings, and having a map that matches the territory, whereas those not being autistic navigate largely on vibes.
2. By this definition, being the non-deficit form of autistic is excellent, a superior way of being at least in moderation and in the right spots, for those capable of handling it and its higher cognitive costs.
3. Indeed, many people have essentially none of this set of positive traits and ways of navigating the world, and it makes them very difficult to deal with.
(56: 45) Why is tech so bad at having influence in Washington? Tyler says they’re getting a lot more influential quickly, largely due to national security concerns, which is why AI is being allowed to proceed.

For a while now I have found Tyler Cowen’s positions on AI very frustrating (see for example my coverage of the 3rd Cowen-Patel podcast), especially on questions of potential existential risk and expected economic growth, and what intelligence means and what it can do and is worth. This podcast did not address existential risks at all, so most of this post is about me trying (once again!) to explain why Tyler’s views on returns to intelligence and future economic growth don’t make sense to me, seeming well outside reasonable bounds.

I try to offer various arguments and intuition pumps, playing off of Dwarkesh’s attempts to do the same. It seems like there are very clear pathways, using Tyler’s own expectations and estimates, that on their own establish more growth than he expects, assuming AI is allowed to proceed at all.

I gave only quick coverage to the other half of the podcast, but don’t skip that other half. I found it very interesting, with a lot of new things to think about, but they aren’t areas where I feel as ready to go into detailed analysis, and was doing triage. In a world where we all had more time, I’d love to do dives into those areas too.

On that note, I’d also point everyone to Dwarkesh Patel’s other recent podcast, which was with physicist Adam Brown. It repeatedly blew my mind in the best of ways, and I’d love to be in a different branch where I had the time to dig into some of the statements here. Physics is so bizarre.
