Dwarkesh’s

on-dwarkesh’s-podcast-with-leopold-aschenbrenner

On Dwarkesh’s Podcast with Leopold Aschenbrenner

Previously: Quotes from Leopold Aschenbrenner’s Situational Awareness Paper

Dwarkesh Patel talked to Leopold Aschenbrenner for about four and a half hours.

The central discussion was the theses of his paper, Situational Awareness, which I offered quotes from earlier, with a focus on the consequences of AGI rather than whether AGI will happen soon. There are also a variety of other topics.

Thus, for the relevant sections of the podcast I am approaching this via roughly accepting the technological premise on capabilities and timelines, since they don’t discuss that. So the background is we presume straight lines on graphs will hold to get us to AGI and ASI (superintelligence), and this will allow us to generate a ‘drop in AI researcher’ that can then assist with further work. Then things go into ‘slow’ takeoff.

I am changing the order of the sections a bit. I put the pure AI stuff first, then afterwards are most of the rest of it.

The exception is the section on What Happened at OpenAI.

I am leaving that part out because I see it as distinct, and requiring a different approach. It is important and I will absolutely cover it. I want to do that in its proper context, together with other events at OpenAI, rather than together with the global questions raised here. Also, if you find OpenAI events relevant to your interests that section is worth listening to in full, because it is absolutely wild.

Long post is already long, so I will let this stand on its own and not combine it with people’s reactions to Leopold or my more structured response to his paper.

While I have strong disagreements with Leopold, only some of which I detail here, and I especially believe he is dangerously wrong and overly optimistic about alignment, existential risks and loss of control in ways that are highly load bearing, causing potential sign errors in interventions, and also I worry that the new AGI fund may make our situation worse rather than better, I want to most of all say: Thank you.

Leopold has shown great courage. He stands up for what he believes in even at great personal cost. He has been willing to express views very different from those around him, when everything around him was trying to get him not to do that. He has thought long and hard about issues very hard to think long and hard about, and is obviously wicked smart. By writing down, in great detail, what he actually believes, he allows us to compare notes and arguments, and to move forward. This is The Way.

I have often said I need better critics. This is a better critic. A worthy opponent.

Also, on a great many things, he is right, including many highly important things where both the world at large and also those at the labs are deeply wrong, often where Leopold’s position was not even being considered before. That is a huge deal.

The plan is to then do a third post, where I will respond holistically to Leopold’s model, and cover the reactions of others.

Reminder on formatting for Podcast posts:

  1. Unindented first-level items are descriptions of what was said and claimed on the podcast unless explicitly labeled otherwise.

  2. Indented second-level items and beyond are my own commentary on that, unless labeled otherwise.

  3. Time stamps are from YouTube.

  1. (2: 00) We start with the trillion-dollar cluster. It’s coming. Straight lines on a graph at half an order of magnitude a year, a central theme throughout.

  2. (4: 30) Power. We’ll need more. American power generation has not grown for decades. Who can build a 10 gigawatt center let alone 100? Leonard thinks 10 was so six months ago and we’re on to 100. Trillion dollar cluster a bit farther out.

  3. (6: 15) Distinction between cost of cluster versus rental cost of compute. If you want the biggest cluster you have to build it, not only rent it.

    1. So in several ways, despite the profit margins on rentals, it is plausible efficiently scaling big costs proportionally more per compute relative to staying small. Suddenly you are buying or building power plants, lobbying governments, bribing utilities and so on. Indeed, in his paper Leopold thinks large scale power might become de facto somewhat priceless.

    2. This also implies that dollar training costs behind the size curve should drop faster relative to the cost at the frontier.

    3. A clear claim in Leopold’s model is that (in effect) power futures are radically underpriced. It’s time to build, anyone with the permits or a way to get them should be building everything they can.

  4. (7: 00) Should we expect sufficient revenue from AI to pay for all this? Leopold calls back to the $100/month Office subscription idea, which he thinks you could sell to a third of subscribers, since the productivity returns will be enormous.

    1. I agree the productivity gains will be enormous versus no AI.

    2. It seems likely that if you have ‘the good AI’ that is integrated into workflow, that also is a very large productivity gain over other AIs, even if your AI is not overall smarter. Having an Office integrated GPT-N-based AI, that also integrates your email and other accounts via Outlook and such plus your entire desktop via something like Microsoft Recall is going to be a big boost if you ignore the times all your data gets seized or stolen.

    3. This still feels like largely asking the wrong questions. Willingness to pay is not as correlated to marginal productivity or value as one might wish. We already see this in AI same as everywhere else.

    4. I see this as one of the places where Leopold’s argument seems unconvincing, but I do agree with the conclusion. I expect AI will be making a lot of money in various ways soon enough, even if it is not transformational.

  5. (7: 50) What can the AIs trained by these different datacenters do? 10 GW for AGI. 2025/26 timeline for models ‘smarter than most college graduates.’ Leopold calls adding affordances ‘unhobbling,’ conceptually the AI always had those abilities inside it but you needed to free its mind with various tools and tricks.

    1. I am torn on the idea of these improvements as ‘unhobbling.’

    2. On the one hand, it is highly useful to think about ‘this is what this system would be able to do if you gave it the right help,’ and contrasting that with the constraints inherent in the system. When considering the risks from a system, you need to think about what the system could do in the future, so the ‘unhobbled’ version is in many ways the version that matters.

    3. On the other hand, it is not entirely fair or useful to say that anything an AI (or a human) could do with enough additional affordances and scaffolding is something they ‘had inside them all along.’ Even more than that, this framing implies that something hobbled the system, which could give people the wrong idea about what is happening.

  6. (9: 00) Right now you need a lot of time to integrate GPT-4-level AIs into your workflow. That will change. Drop in remote workers that interface like workers. No kill like overkill on capabilities to make people actually integrate the AIs.

  7. (11: 00) Where does the training data come for a zoom call like you have for text? Test time compute overhang will be key, issue of GPT-4 having to say first thing that comes to mind versus chain of thought. Tradeoff of test time compute versus training compute. ‘System 2 process’ via what he calls unhobbling.

  8. (14: 45) Why should we think we can get it to do extended thinking? ‘Pretraining is magical,’ letting the model learn rich representations, which is key to Leopold’s model. Robotics increasingly becoming a software problem, not a hardware one.

    1. I mostly think I get (and it is clear) what Leopold means when he says pretraining is magic, or similarly when he says ‘deep learning just works.’

    2. It still seems important to lay out more about how it works, and what it actually does and does not do and why. I’d like to compare Leopold’s model of this to mine and hear him talk about implications, especially versus his thoughts on alignment, where it feels a lot like magic there too.

  9. (17: 10) Leopold says, at some point probably around college Dwarkesh transitioned from pretraining to being able to learn by himself. Metaphor for AI. Reinforcement learning (RL) as most efficient data, potential transition to that.

  10. (20: 30) The transition from GPT-2 to GPT-4, emphasis on the ‘school’ scale of what type of person it is similar to. Again looks ahead to drop in remote workers.

    1. As many others have commented, I would caution against taking the ‘as smart as an Xth grader’ style charts and comparisons too seriously or literally. What is going on here is not that similar to what it is being compared against.

  1. (21: 20) In 2023, Leopold could start to feel the AGI, see the training clusters that would be built, the rough algorithms it would use and so on. Expects rest of the world to feel it soon. Expects wrapper companies to get ‘sonic boomed.’

  2. (24: 20) Who will be paying attention in 26/27? The national security state. When will they and the CCP wake up to superintelligence and its impact on national power?

    1. I have learned that ‘surely they would not be so stupid as to not realize’ is not so strong an argument. Nor is ‘they would never allow this to happen.’

    2. There is not always a ‘they,’ and what they there is can stay unaware longer than the situation can stay solvable.

    3. In the paper and later in the podcast, Leopold draws the parallel to Covid. But Leopold, like many others I know, knew the whole thing was baked in by February. Yes, as he says, the government eventually acted, but well after it was too late, and only after people started shutting down events themselves. They spent a lot of time worrying about petty things that did not matter. They did not ‘feel the Covid’ in advance.

    4. A similarly delayed reaction on AGI, if the technology is on the pace Leonard projects, would wake up to find the government no longer in charge. And indeed, so far we have seen a very similar reaction to early Covid. Leopold (at 32: 00) mentions the talk of ‘Asian racism’ and the parallel is clear for AI.

    5. I don’t buy Leopold’s claim that ‘crazy radical reactions’ came when people saw Covid in America, although I do think that fits for China. Notice the big differences. If we see that difference again for AI, that’s huge. And notice that even when the government had indeed ‘woken up’ we still valued many other things far more than dealing with Covid. Consider the testing situation. Consider vaccine distribution. And so on.

    6. Similarly, today, look at the H5N1 situation. A huge portion of our livestock are infected. What are we doing? We are letting the farm lobby shut down testing. We have learned nothing. I do not even see much effort to get people to not drink raw milk. The good news is it looks like we got away with it and this time is not that dangerous to humans unless we see another mutation, but again this is burying heads in sand until there is no other option.

    7. Could the state actors wake up sooner? Oh, sure. But they might well not.

  3. (25: 30) One of first automated jobs will be AI research. Then things get very fast. Decades of work in a year. One to a few years for much smarter than human things. Then figure out robotics. A ‘couple of years’ lead ‘could be decisive’ in military competition. Comparison to Gulf War I tech edge. Some speculations about physically how to do this.

    1. No he did not say ‘alignment researcher.’ Whoops.

    2. If anything his estimates after that seem rather slow if it was really all that.

    3. If all this was really happening, a few years of edge is massive overkill.

    4. We do not need to know exactly how this physically plays out to know it.

  4. (28: 30) A core thesis of Leopold’s paper, that once NatSec and CCP ‘wake up’ to all this, the researchers stop being in charge. The governments will be in charge. There will be all-out espionage efforts.

    1. Even if we assume no cooperation, again, I would not assume any of this. It seems entirely plausible that one or both countries could stay asleep.

    2. Even if they do ‘wake up,’ there are levels of waking up. It is one thing to notice the issue, another to treat it like the only issue, as if we are in an existential war (in the WW2 sense.) In that example, what America did before and after Pearl Harbor is telling, despite already knowing the stakes.

  5. (29: 00) China has built more power in the last decade than America has total, they can outbuild us.

    1. Never count America out in situations like this.

    2. Yes, right now we look terrible at building things, because we have chosen to be unable to build things in various ways. And That’s Terrible.

    3. If we woke up and decided to hell with all that? Buckle up.

  6. (29: 30) Dwarkesh asks, if you make the AI that can be an AI researcher, and you then use it at first only to build AI researchers because that’s the obviously right play, might others not notice what happened until suddenly everything happened? Leopold says it will be more gradual than that, you do some jobs, then you do robotics and supercharging factory workers, go from there.

    1. I actually think Dwarkesh has a strong point here. If your compute is limited and also you are not trying to draw too much attention, especially if you are worried about national security types, it would make a lot of sense to not do those other things in visible ways ‘until it was too late’ to respond.

    2. It is not only the AI that can sandbag its capabilities and then do a type of treacherous turn. If I was running an AI lab in this situation, I would be foolish not to give a lot of thought to whether I wanted to get taken over by the government or I would rather the government get taken over by my lab.

  7. (30: 30) Will they actually realize it, and when? Leopold agrees this is the big question, says we likely have a few years, points to Covid, see discussion above. Leopold says he did indeed short the market in 2020.

  8. (33: 00) Dwarkesh points out that right now government debates are about very different questions. Big tech. Parallels to social media. Climate change. Algorithmic discrimination. This doesn’t look like ‘we need to ensure America wins?’ Leopold notes that intense international competition is the norm, and in WW2 we had 50%+ of GDP going to the war effort, many countries borrowed over 100% of GDP.

    1. I think Dwarkesh is underselling the ‘America must win’ vibes and actions. That is most definitely a big deal in Washington now. We must beat China is one of the things the parties agree upon, and they do apply this to AI, even without having any idea what the stakes here actually are.

    2. There is thus a lot of talk of ‘promoting innovation’ and America, and of course note the Chips Act. Whether that all translates to anything actually useful to America’s AI efforts is another question. The traditional government view of what matters seems so clueless on AI.

    3. No mention of existential risk there, another aspect of the debate. There are those who very much want to do the opposite of full speed ahead for that reason, on top of those who have other reasons.

    4. Even though many saw WW2 coming, those dramatic spending efforts (at least on the Allied side) mostly only happened once the war began. Things would have gone very differently if France and the UK had spent in 1938 the way they everyone spent in 1940.

    5. So when Leopold asks, will people see how high the stakes are, the obvious answer is that people never understand the stakes until events force them to.

  9. (35: 20) Leopold agrees the question is timing. Will this happen only after the intelligence explosion is already happening, or earlier? Once it happens, it will activate ‘forces we have not seen in a long time.’

    1. Yes, at some point the governments will notice the way they need to actually notice, assuming Leopold is right about the tech. That does not mean that on that day when they feel the AGI, they will still ‘feel in charge.

  10. (36: 00) AI-enabled permanent dictatorship worries. Growing up in Germany makes this more salient.

  11. (39: 30) Are the Westernized Chinese AI researchers going to be down for AI research on behalf of CCP? Leopold asks, will they be in charge? OpenAI drama as highlighting the benefits of representative democracy.

    1. One could take exactly the opposite perspective on the OpenAI drama, that it was a perfect illustration of what happens when a superficially popular demagogue who rules through a combination of fear and promises of spoils to his elite overthrows the rightful parliament when they try to stop him, by threatening to tear the whole thing down if he does not get his way. And that ‘the people’ fell in line, making their last decision, after which dissent was suppressed.

    2. Or one could say that it was democracy in action, except that it is now clear that the voters were fooled by manufactured consent and chose wrong.

    3. In this case I actually think a third parallel is more relevant. OpenAI, they say, is nothing without its people, a theory its people are increasingly testing. When a group seen as the enemy (the board, which was portrayed as a metaphorical CCP here by its enemies, and in some cases accused of being literal CCP agents) told everyone they were in charge and wanted a change of leadership, despite promoting from within and saying everything else would continue as normal, what happened?

    4. What happened was that the bulk of employees, unconvinced that they wanted to work for this new regime (again, despite keeping the same purported goals) threatened to take their talents elsewhere.

    5. Thus, I think the question of cooperation is highly valid. We have all seen Bond movies, but it is very difficult to get good intellectual progress and production out of someone who does not want to succeed, even if you have control over them. There would still be true believers, and those who were indifferent but happy to take the money and prestige on offer. We should not be so arrogant to think that all the most capable Chinese want America to win the future over the CCP. But yes, if you were AI talent that actively wanted the CCP to lose, because you had met the CCP, it seems easy to end up working on something else, or to not be so effective if not given that choice, even if you are not up for active sabotage.

    6. We could and should, of course, be using immigration and recruitment now, while we still can, towards such ends. It is a key missing piece of Leopold’s ‘situational awareness’ that this weapon of America’s is not in his model.

  1. (41: 15) How are we getting the power? Most obvious way is to displace less productive industrial uses but we won’t let that happen. We must build new power. Natural gas. 100 GW will get pretty wild but still doable with natural gas. Vital that the clusters be in America.

  2. (42: 30) Why in America? National security. If you put the cluster in the UAE, they could steal your weights and other IP, or at minimum seize the compute. Even if they don’t do that, why give dictatorships leverage and a seat at the table? Why risk proliferation?

    1. Altman seeking to put his data centers in the UAE is an underrated part of the evidence that he is not our friend.

  3. (45: 30) Riskiest situation is a tight international struggle, only months apart, national security at stake, no margin for error or wiggle room. Also China might steal the weights and win by building better, and they might have less caution.

    1. Maybe China would be more reckless than us. Maybe we would be more reckless than China. I don’t see much evidence cited on this.

    2. If China can steal the weights then you are always potentially in a close race, and indeed it is pointless to go faster or harder on the software side until you fix that issue. You can still go faster and harder on the hardware side.

    3. Leopold’s model (via the paper) puts essentially zero hope in cooperation because the stakes are too high and the equilibrium is too unstable. As you would expect, I strongly disagree that failure is inevitable here. If there is a reason cooperation is impossible, it seems if anything more likely to be America’s unwillingness rather than China’s.

  4. (46: 45) More cluster location talk. Potential to fool yourself into thinking it is only for inference, but compute is fungible. Talk of people who bet against the liberal order and America, America can totally pull this off with natural gas. But oh no, climate commitments, so no natural gas until national security overrides.

    1. For those thinking about carbon, doing it in America with natural gas emits less carbon than doing it in the UAE where presumably you are using oil. Emissions are fungible. If you say ‘but think of our climate commitments’ and say that it matters where the emissions happen, you are at best confusing the map for the territory.

    2. Same with both country and company commitments. This is insane. It is not a hypothetical, we see it everywhere. Coal plants are being restarted or used because people demand that we ‘keep climate commitments.’ What matters is not your commitment. What matters it the carbon. Stop it.

  5. (49: 45) You could also do green energy mega projects, solar with batteries, SMRs, geothermal and so on, but you can’t do it with current permitting processes. You need blanket exemptions, for both federal and state rules.

    1. Yep. It is completely insane that we have not addressed this.

    2. No, I am in some ways not especially thrilled to accelerate the amount of compute available because safety, but we would be infinitely better off if we got the power from green sources and I do not want America to wither for lack of electrical power. And I definitely don’t want to force the data centers overseas.

  6. (51: 00) Harkening back to strikes in 1941 saying war threats were excuses, comparing to climate change objections. Will we actually get our act together? We did in the 40s. Leopold thinks China will be able to make a lot of chips and they can build fast.

    1. That didn’t respond on the climate change issue. As I say above, if people actually cared about climate change they would be acting very differently.

    2. That is true even if you don’t accept that ASI will of course solve climate change in the worlds where we keep it under our control, and in the worlds were we fail to do that we have much bigger problems.

  7. (53: 30) What are the lab plans? Middle east has capital but America has tons of capital. Microsoft can issue infinite bonds. What about worries UAE would work with China instead? We can offer to share the bounty with them to prevent this.

    1. The obvious note is that they can try going to China, but China knows as well as we do that data centers in the UAE are not secure for them, and would then have to use Chinese chips. So why not use those chips inside China?

  8. (56: 10) “There’s another reason I’m a little suspicious of this argument that if the US doesn’t work with them, they’ll go to China. I’ve heard from multiple people — not from my time at OpenAI, and I haven’t seen the memo — that at some point several years ago, OpenAI leadership had laid out a plan to fund and sell AGI by starting a bidding war between the governments of the United States, China, and Russia. It’s surprising to me that they’re willing to sell AGI to the Chinese and Russian governments.” – Leopold

    1. The above is a direct quote. I haven’t heard any denials.

    2. If true, this sure sounds like a Bond Villain plot. Maybe Mission Impossible.

    3. “But Russia and China are our enemies, you can’t give them AGI!”

    4. “Then I suppose your government should bid highly, Mr. Bond!”

    5. There is of course a difference between brainstorming an idea and trying to put it into practice. One should be cautious not to overreact.

    6. But if this made it into a memo that a lot of people saw? I mean, wow. That seems like the kind of thing that national security types should notice?

  9. (56: 30): “It’s surprising to me that they’re willing to sell AGI to the Chinese and Russian governments. There’s also something that feels eerily familiar about starting this bidding war and then playing them off each other, saying, “well, if you don’t do this, China will do it.” Dwarkesh responds: “Interesting. That’s pretty fucked up.”

    1. Yes. That does sound pretty fucked up, Mr. Patel.

  10. (57: 10) UAE is export controlled, they are not competitive. Dwarkesh asks if they can catch up? Leopold says yes, but you have to steal the algorithms and weights.

  11. (58: 00) So how hard to steal those? Easy. DeepMind’s security level is currently at 0 on their own scale, by self-description, and Google probably has the best security. It’s startup security, which is not good.

  12. (1: 00: 00) What’s the threat model? One is steal the weights. That’s important later, less important now but we need to get started now to be ready. But what we do need to protect now are algorithms. We will need new algorithms, everyone is working on RL to get through the data wall.

    1. I wouldn’t downplay the theft of GPT-4. It is highly useful to have those weights for training and research, even if the model is not dangerous per se.

    2. It also would be a huge economic boon, if they dared use them that way.

    3. If the plan is to use RL to get around the data wall, notice how this impacts the statements in the alignment section of situational awareness.

  13. (1: 02: 30) Why will state-level security be sufficient to protect our lead? We have a big lead now. China has good LLMs but they have them because they took our open weights LLMs and modified them. The algorithmic gap is expanding now that we do not publish that stuff, if we can keep the secrets. Also tacit knowledge.

  14. (1: 03: 30) Aside about secrecy and the atomic bomb.

  15. (1: 06: 30) Shouldn’t we expect parallel invention? Leopold thinks it would take years, and that makes all the difference. The time buffer is super important. Once again he paints a picture of China going hard without safety concerns, national security threats, huge pressure.

    1. The buffer theory of alignment has a bunch of implicit assumptions.

    2. First, it assumes that time spent at the end, with the most capable models and the greater resources later on, is far more valuable to safety than time spent previously. That we cannot or will not make those safety investments now.

    3. Second, it assumes that the work we would do with the buffer could plausibly be both necessary and sufficient. You have to turn losses (worlds that turn out poorly, presumably due to loss of control or everyone dying) into wins. The theoretical worlds where we get ‘alignment by default’ and it is easy we don’t need it. The worlds where you only get one shot, you would be a fool to ask the AI to ‘do your alignment homework’ and your attempts will be insufficient will still die.

    4. Thus, you have to be in the middle. If you look at the relevant section of the paper, this is a vision where ‘superalignment’ is a mere difficult engineering problem, and you have some slack and fuzzy metrics and various vague hopes and the more empirical work you do the better your chances. And then when you get the chance you actually do the real work.

    5. Not mentioned by Leopold, but vital, is that even if you ‘solve alignment’ you then still have to win. Leopold frames the conflict as USA vs. CCP, democracy versus dictatorship. That is certainly one conflict where we share a strong preference. However it is not the only conflict, certainly not if democracy is to win, and a pure alignment failure is not the only way to lose control of events. While you are using superintelligence to turbocharge the economy and military and gain decisive advantage, as things get increasingly competitive, how are we going to navigate that world and keep it human, assuming we want that? Reminder that this is a highly unnatural outcome, and ‘we got the AIs to do what we tell them to do in a given situation’ helps but if people are in competition with widespread access to ASIs then I implore you to solve for the equilibrium and find an intervention that changes the result, rather than fooling yourself into thinking it will go a different way. In this type of scenario, these AIs are very much not ‘mere tools.’

  1. (1: 09: 20) Dwarkesh notes no one he talks to thinks about the geopolitical implications of AI. Leopold says wait for it. “Now is the last time you can have some kids.”

    1. That seems weird to me given who Dwarkesh talks to. I definitely think about those implications.

  2. (1: 11: 00) More Covid talk. Leopold expected us to let it happen and the hospitals collapsed, instead we spent a huge percent of GDP and shut down the country.

  3. (1: 11: 45) Smart people underestimate espionage. They don’t get it.

  4. (1: 14: 15) What happens if the labs are locked down? Leopold says that the labs probably won’t be locked down, he doesn’t see it happening. Dwarkesh asks, what would a lockdown look like? You need to stay ahead of the curve of what is coming at you, right now the labs are behind. Eventually you will need air gapped systems, de facto security guards, all actions monitored, vetted hardware, that sort of thing. Private companies can’t do it on their own, not against the full version, you need people with security clearances. But probably we will always be behind this curve rather than ahead of it.

    1. I strongly agree that the labs need to be locked down. I am not a security expert, and I do not have the clearances, so I do not know the correct details. I have no idea how intense is the situation now or where we need to be on the curve.

    2. What I do know is that what the labs are doing right now almost certainly will not cut it. There is no sign that they will do what is necessary on their own.

    3. This should be one of the places everyone can agree. We need steadily increasing security at major AI labs, the way we would treat similarly powerful government military secrets, and we need to start now. This decision cannot be left up to the labs themselves, nor could they handle the task even if they understood the gravity of the situation. Coordinating these actions makes them much easier and keeps the playing field level.

  5. (1: 18: 00) Dwarkesh challenges the USA vs. China framework. Are we not all Team Humanity? Do we really want to treat this as an adversarial situation? Yes some bad people run China right now, but will our descendants care so much about such national questions? Why not cooperate? Leopold reiterates his position, says this talk is descriptive, not normative. Cooperation would be great, but it won’t happen. People will wake up. The treaty won’t be stable. Breakout is too easy. The incentives to break the deal are too great.

    1. This assumes that both sides want to gain and then use this decisive strategic advantage. If America would use a decisive advantage to conquer or ensure permanent dominance over China and vice versa, or it is seen as the battle for the lightcone, then that is a highly unstable situation. Super hard. Still does not seem impossible. I have seen the decision theory, it can be done. If this is largely defensive in nature, that is different. On so many levels, sure you can say this is naive, but it is not obvious why America and China need to be fighting at all.

    2. Certainly we will not know if we never, as I say it, pick up the phone.

    3. So far attempts to coordinate, including the Seoul summit, are moving slowly but do seem to be moving forward. Diplomacy takes time, and it is difficult to tell how well it is working.

    4. One core assumption Leopold is making here is that breakout is too easy. What if breakout was not so easy? Data centers are large physical structures. There are various ways one could hope to monitor the situation, to try and ensure that any attempt to break out would be noticed. I do not have a foolproof plan here, but it seems highly underexplored.

    5. Perhaps ultimately we will despair, perhaps because we cannot agree on a deal because America wants to stay ahead and China will demand equality or more, or something similar. Perhaps the political climate will render it impossible. Perhaps the breakout problem has no physical solutions. It still seems completely and utterly crazy not to try very hard to make it work, if you believed anything like Leopold.

  6. (1: 21: 45) Dwarkesh points out you can blow up data centers. Leopold says yes, this is a highly unstable situation. First strikes are very tempting, someone might get desperate. Data centers likely will be protected by potential nuclear retaliation.

  7. (1: 24: 30) Leopold agrees: A deal with China would be great, but it is tough while in an unstable equilibrium.

    1. No argument there. It’s more about missing mood, where he’s effectively giving up on the possibility. Everything about this situation is tough.

  8. (1: 24: 40) Leopold’s strategy is, essentially, they’ll like us when we win. Peace through strength. Make it clear to everyone that we will win, lock down all the secrets, do everything locally. Then you can offer a deal, offer to respect them and let them do what they want, give them ‘their slice of the galaxy.’

    1. Leopold seems to be making the mistake a lot of smart people make (and I have been among them) of assuming people and nations act in their own self-interest. The equilibrium is unstable so it cannot hold. If we are ahead, then China will take the deal because it is in their interest to do so.

    2. My read on this is that China sees its self interest in very different fashion than this. What Leopold proposes is humiliating if it accomplishes what it sets out to do, enshrining us in pole position. It requires them to trust us on many levels. I don’t see it as a more hopeful approach.

    3. It also is not so necessary, if you can get to that position, unless your model is that China would otherwise launch a desperation war.

    4. To be clear, if we did reach that position, I would still want to try it.

  9. (1: 26: 25) Not going to spoil this part. It’s great. And it keeps going.

  10. (1: 27: 50) Back to business. Leopold emphasizes locking down the labs. No deals without that, our position will not allow it. Worry about desperation sabotage attacks or an attack on Taiwan.

    1. Leopold does not seem to appreciate that China might want to invade Taiwan because they want Taiwan back for ordinary nationalist reasons, rather than because of TSMC.

  11. (1: 31: 00) Central point is to challenge talk about private labs getting AGI. The national security crowd is going to get involved in some fashion.

    1. I do think most people are underestimating the probability that the government will intervene. I still think Leopold is coming in too high.

  12. (1: 32: 10) Is China load bearing in all this? Leopold says not really on security. Even if no China, Russia and North Korea and so on are still a thing. But yes, if we were in a weird world like we had in 2005 where there was no central rival we could have less government involvement.

  1. (1: 33: 40) Dwarkesh challenges. Discussion of the Manhattan Project. Leopold says the regret was due to the tech, not the nature of the project, and it will happen again. Do we need to give the ASI to the monopoly on violence department, or can we simply require higher security? Why are we trying to win against China, what’s the point if we get government control anyway?

  2. (1: 37: 20) Leopold responds. Open source was never going to be how AGI goes down, the $100 billion computer does not get onto your phone soon, we will have 2-3 big players. If you don’t go with the government you are counting on a benevolent private dictator instead.

    1. So, about OpenAI as democracy in action, that’s what I thought.

  3. (1: 39: 00) Dwarkesh notes a lot of private actors could do a lot of damage and they almost never do, and history says that works best. Leopold says we don’t handle nukes with distributed arsenals. The government having the biggest guns is good, actually, great innovation in civilization. He says the next few decades are especially dangerous and this is why we need a government project. After that, the threat is mostly passed in his model.

    1. Dwarkesh makes a valid point that most people with destructive capacity never use it, but some do, as that amount scales it becomes a bigger issue, and also it does not address Leopold’s claim here. Leopold is saying that some AI lab is going to win the race to ASI and then will effectively become the sovereign if the current sovereign stays out of it. Us being able to handle or not handle multi-polarity is irrelevant if it never shows up.

    2. As usual when people talk about history saying that private actors with maximally free reign have historically given you the best results, I agree this is true historically, mostly, although we have indeed needed various restrictions at various times even if we usually go too far on that. The key issue is that the core principle held when the humans were the powerful and capable intelligences, agents and optimizers. Here we are talking about ASIs. They would now be the most powerful and capable things around on all fronts. That requires a complete reevaluation of the scenario, and how we want to set the ground rules for the private actors, if we want to stay in control and preserve the things we care about. Otherwise, if nothing else, everyone is forced to steadily turn everything over to AIs because they have to stay competitive, giving the AIs increasing freedom of action and complexity of instructions and taking humans out of all the loops, and so on, and whoops.

    3. I do not see Leopold engaging with this threat model at all. His model of the post-critical period sounds a like a return to normal, talk of buying galaxies and epic economic growth aside.

    4. My guess is Leopold is implicitly imagining a world with ground rules that keep the humans in control and in the loop, while still having ASI use be widespread, but he does not specify how that works. From other context, I presume his solution is something like ‘a central source ensures the competitively powerful ASIs all have proper alignment in the necessary senses to keep things in balance’ and I also presume his plan for working that out is to get the ASIs to ‘do his alignment homework’ for him in this sense. Which, if the other kinds of alignment are solved, is then perhaps not so crazy, as much as I would much prefer a better plan. Certainly it is a more reasonable plan than doing this handoff in the earlier phase.

  4. (1: 42: 15) Leopold points out that the head of the ASI company can overthrow the government, that effectively it is in charge if it wants that. Dwarkesh challenges that there would be other companies, but Leopold is not so sure about that. And if there are 2-3 companies close to each other, then that is the same as the USA-China problem, and is the government going to allow that, plus also you’d have the China (and company) problem?

    1. There is not going to not be a government. If the government abdicates by letting a private lab control AGI and ASI, then we will get a new one in some form. And that new government will either find rules that preserve human control, or humans will lose control.

    2. So the government has to step in at least enough to stop that from happening, if as Leopold’s model suggests only a small number of labs are relevant.

    3. They still might not do it, or not do it in time. In which case, whoops.

  5. (1: 44: 00) Dwarkesh says yes the labs could do a coup, but so could the actual government project. Do you want to hand that over to Trump? Isn’t that worse? Leopold says checks and balances. Dwarkesh tries to pounce and gets cut off. Leopold discusses what the labs might do, or rogue employees might do since security will suck. Leopold notes the need for an international coalition.

    1. I find the optimism about cooperating with current allies, combined with skepticism of cooperating with current enemies, rather jarring.

    2. Dwarkesh was likely pouncing to say that the checks and balances will stop working here the same way the private company could also go through them. The whole point is that previous power relationships will stop mattering.

    3. Indeed, Leopold’s model seems to in some places be very sober about what it means to have ASIs running around. In other places, like ‘checks and balances,’ it seems to not do that. Congress has to spend the money, has to approve it. The courts are there, the first amendment. Once again, do those people have the keys to the ASI? Do they feel like they can be checking and balancing? Why? How?

    4. Leopold says that these institutions have ‘stood the test of time in a powerful way,’ but this new situation quite obviously invalidates that test, even if you ignore that perhaps things are not so stable to begin with. It is one thing to say humans will be in the loop, it is another to think Congress will be.

    5. Another contrast is ‘military versus civilian’ applications, with the idea that putting ASIs into use in other places is not dangerous and we can be happy to share that. Certainly there are other places that are fine, but there are also a lot of places that seem obviously potentially not fine, and many other ways you would not want these ASIs ‘fully unlocked’ shall we say.

  6. (1: 47: 05) Leopold says it will be fine because you program the AIs to follow the constitution. Generals cannot follow unlawful orders.

    1. Constitutional AI except our actual constitution? Really?

    2. No, just no. This absolutely will not work, even if you succeeded technically.

    3. I leave proving this as an exercise to the reader. There are a lot of distinct ways to show this.

  7. (1: 47: 50) Dwarkesh asks, given you cannot easily un-nationalize, why not wait until we know more about which world we live in? Leopold says we are not going to nationalize until it is clear what is happening.

    1. Reminder that Leopold says his claims are descriptive not normative here.

    2. Indeed, in a few minutes he says he is not confident the government project is good, but at various points he essentially says it is the only way.

  8. (1: 48: 45) Dwarkesh says dictatorship is the default state of mankind, and that we did a lot of work to prevent nuclear war but handing ASI to government here does not seem to be doing that work. Leopold says the government has checks and balances that are much better than those of private companies.

    1. I notice I am confused by the nuclear metaphor here.

    2. I do not think dictatorship is the default state of mankind, but that question is based on circumstances and technology, and ASI would be a huge change in the relevant forces, in hard to predict (and existentially dangerous) directions.

    3. Kind of stunning, actually, how little talk there has been about existential risk.

    4. Dwarkesh speaks of ‘handing ASI to the government’ but in the scenarios we are describing, as constructed, if you instead keep the ASI then you are now the government. You do not get to stay a ‘private actor’ long.

    5. I worry a lot of such debates, both with and without existential risk involved, is people seeing solution X, noticing problem Y they consider a dealbreaker, and thus saying therefore we must do Z instead. The problem is that Z has its own dealbreakers, often including Y. I do not know what the right future is to aim for exactly, but I do know that there is going to be some aspect of it that is going to seem like a hell of a Y not, because there are unavoidable dilemmas.

  9. (1: 51: 00) What does the government project look like? A joint venture between labs, cloud providers and the government. In the paper he uses the metaphor of Boeing and Lockheed Martin. Leopold says no, he does not especially want to start off using ASI for what it will first be used for, but you have to start by limiting proliferation and stabilizing the situation. Dwarkesh says that would be bad. Leopold asks what is the alternative? Many companies going for it, government involved in security.

  10. (1: 54: 00) Leopold’s model involves broad deployment of AIs, with open models that are a few years behind as well. Civilian applications will have their day. Governments ‘have the biggest guns.’

    1. The guns that matter in this future are the ASIs. So either the government has them, or they’re not the government.

  11. (1: 56: 00) Why do those in The Project of the ASI, who are decades ahead on tech, need to trade with the rest of us? Leopold says that economic distribution is a completely different issue, there he has no idea.

    1. That seems kind of important? And it is not only economics and trade. It is so many other aspects of that situation as well.

  12. (1: 56: 30) Leopold comes back to the stakes being, will liberal democracy survive? Will the CCP survive? And that will activate greater forces, national security will dominate.

    1. Will humanity survive? Hello? Those are the stakes.

    2. Beyond that, yes, there are different ways to survive. They very much matter.

    3. But for all of this talk about the stakes of liberal democracy, Leopold fails to ask whether and how liberal democracy can function in this future ASI-infused world. I am not saying it is impossible, I am saying he does not answer the question of how that would work, or whether it would be the desirable way of being. He notices some ways the world could be incompatible with it, but not others.

    4. I wonder how much of that is strategic, versus a blind spot.

  13. (1: 58: 30) Dwarkesh says this does not sound like what we would do if we suddenly thought there were going to be hundreds of millions of Von Neumanns. Wouldn’t we think it was good rather than obsessing over exactly which ones went where? Leopard points to the very short period of time and the geopolitical rivalries, and also says yes obviously we would be concerned in that scenario.

    1. I thought we were past this sort of question? There are many big differences that should be highly obvious?

    2. Of course a lot of those scenarios are actually identical in the sense that the first thing the Von Neumanns do is build ASI anyway. Perhaps being that smart they figure out how to do it safety. One can hope.

    3. The other possibility is that they do better decision theory and realize that since they are all Von Neumann they can cooperate to not build it and work together in other ways and everything goes amazingly great.

  14. (2: 00: 30) If we are merging these various companies are we sure this even speeds things up? Leopold says Google’s merge of Brain and DeepMind went fine, although that was easier. Dwarkesh notices Operation Warp Speed was at core private using advance market commitments and was the only Covid thing we did that worked, Leopold says this will look close to that, it will be a partnership with private institutions and thinks merging is not that difficult. People would not sign up for it yet, but that will change.

    1. The more details I hear and think through, the more it sounds remarkably like a private effort that then gives the results to the government? The government will assist with security and cybersecurity, and perhaps capital, but what else is it going to be contributing?

  15. (2: 04: 00) Talk about nuclear weapon development and regret. Leopold says it was all inevitable, regret would be wrong. Also nukes went really well.

    1. I strongly agree nukes actually went really well. We are still here.

    2. Indeed, the exact way it played out, with a rush to control it and a demonstration of why no one can ever use it, might have been a remarkable stroke of luck for humanity, on top of the other times we got lucky later.

  16. (2: 07: 45) Leopold does not see alternatives. This is a war. There is no time for even ordinary safety standards, or a deliberate regulatory regime. There will be fog of war, we will not know what it is going on, the curves don’t look great, the tests are showing alarm bells but we hammered it out, China stole the weights, what to do?

    1. I really hope he is wrong. Because there is a technical term for humanity in that situation, and that term is ‘toast.’

    2. If he is right, that is a very strong argument for a deliberate regulatory regime now. There is no ‘wait until we know more’ if developments are going to be well outside the expected OODA loop and we will not have the time later. We can only hope we have the time now.

    3. Indeed, exactly what we need most now in this scenario is visibility, and the ability to intervene if needed. Which is exactly what is being pushed for. Then the question is whether you can hope to do more than that, but clearly if you believe in Leopold’s model you should start setting that up now in case this is a bit slower and you do have the time?

  17. (2: 09: 10) The startups claim they are going to do safety, but it is really rough when you are in a commercial race, and they are startups. Startups are startups.

    1. The accent on claim here is the opposite of reassuring.

    2. I too do not expect much in the way of safety.

    3. But also turning this into a military race doesn’t sound better on this axis?

  18. (2: 09: 45) Could the RSPs work to retain private companies? Keep things from getting off the rails? Leopold says current regulations and RSPs are good for flashing warning signs. But if the lights flash and we have the automated worker than it is time to go.

    1. But we will ignore the flashing lights and proceed anyway, said Eliezer.

    2. That is true, says Leopold.

    3. And then we die, says Eliezer.

    4. That does seem like the baseline scenario in that spot.

  19. (2: 12: 45) Mention that if the courts blocked lesser attempts, then actual nationalization would likely be what followed. And yeah, that is indeed what Leopold expects. They have a good laugh about it.

We will for now be skipping ‘Becoming Valedictorian of Columbia at 19’ and ‘What Happened at OpenAI,’ as well as all the sections after Alignment.

Thus, we will jump to 2: 46: 00, where they discuss the intelligence explosion.

  1. (2: 46: 10) The fast AGI → AI AI researchers → ASI (superintelligence). Dwarkesh is skeptical of the input-output model. Leopold says obviously inputs obviously matter, but Dwarkesh points out that small groups often outcompete the world. Leopold says those groups are highly selected. That the story is straight lines on log graphs, things get harder. More researchers balancing harder problems is an equilibrium the same way supply equals demand.

    1. Leopold seems overconfident in the details, but the argument that the researcher inputs do not matter gets more absurd each time I think about it. You can question the premise of there being AGI sufficient to allow AI researchers that are on par with the meaningful human researchers, but if you do allow this then the conclusions follow and we are talking price.

    2. Yes, a small selected group can and often does outcompete very large groups at idea generation or other innovation, if the large groups are not trying to do the thing, or pursuing the same misguided strategy over and over again. If all you are doing is adding more AIs to the same strategy, you are not maximizing what you can get.

    3. But that is similar to another true statement, which is that stacking more layers and throwing more compute and data at your transformer is not the most efficient thing you could have in theory done with your resources, and that someone with a better design but less resources could potentially beat you. The point is the bitter lesson, that we know how to scale this strategy and add more zeroes to it, and that gets you farther than bespoke other stuff that doesn’t similarly scale.

    4. So it would be with AI researchers. If you can ‘do the thing that scales’ then it probably won’t much matter if you lose 50% or 90% or even 99% efficiency, so long as you can indeed scale and no one else is doing the better version. Also, one of the first things you can do with those AGI researchers is figure out how to improve your strategy to mitigate this issue. And I presume a lot of the reason small groups can win is that humans have limited bandwidth and various coordination issues and incentive problems and institutional decay, so large groups and systems have big disadvantages. Whereas a new large AI strategy would be able to avoid much of that.

    5. It makes sense that if idea difficulty is a log scale that rises faster than you can innovate better research methods, that no matter how much you spend on research and how many researchers you hire, your rate of progress will mostly look similar, because you find ideas until they get harder again.

    6. If instead your ability to innovate and improve your research goes faster than the rate at which ideas get harder, because of something like AGI changing the equation, then things speed up without limit until that stops being true.

  2. (2: 51: 00) Dwarkesh asks, then why doesn’t OpenAI scale faster and hire every smart person? His theory is transaction costs, parallelization difficulty and such. Leopold starts off noting that AI researcher salaries are up ~400% over the last year, so the war for the worthy talent is indeed pretty crazy. Not everyone 150 IQ would be net useful. Leopold notes that training is not easily scalable (in humans). Training is very hard.

    1. Yep. Among humans we have all these barriers to rapid scaling. Training is super expensive because it costs time from your best people. Retaining your corporate culture is super valuable and limits how fast you can go. Bad hires are a very large mistake, especially if not corrected, B players hire C players and so on. All sorts of things get harder as you scale the humans.

  3. (2: 53: 10) AI is not like that. You do not need to train each copy. They will be able to learn in parallel, quickly, over vast amounts of data. They can share context. No culture issues, no talent searches. Ability to put high level talent on low level tasks. The 100 million researchers are largely a metaphor, you do what makes sense in context. An internet of tokens every day.

    1. If you accept the premise that such AIs will exist, the conclusion that they will greatly accelerate progress in such areas seems to follow. I see most disagreement here as motivated by not wanting it to be true or not appreciating the arguments.

  4. (2: 56: 00) What hobblings are still waiting for us? Unknown. Leopold’s model is you solve some aspects, that accelerates you, you then solve other aspects (more ‘unhobbling’) until you get there.

  5. (2: 58: 00) How to manage a million AI researchers? Won’t it be slow figuring out how to use all this? Doesn’t adaptation take way longer than you would think? Leopold agrees there are real world bottlenecks. You remove the labor bottleneck, others remain. AI researchers are relatively easy versus other things.

    1. The paper goes into more detail on all this. I am mostly with Leopold on this point. Yes, there will be bottlenecks, but you can greatly improve the things that lack them, and algorithmic progress alone will be a huge deal. This slows us down versus the alternative, and is the reason why in this model the transition is a year or two rather than a Tuesday or a lunch break.

    2. If anything, all the timelines Leopold discusses after getting to AGI seem super long to me, rather than short, despite the bottlenecks. What is taking so long? How are we capable of improving so little? The flip side of bottlenecks is that you do not need to do the same things you did before. If some things get vastly better and more effective, and others do not, you can shift your input composition and your consumption basket, and we do.

    3. The ‘level one adaptation’ of AI is to plug AI into the subtasks where it improves performance. That is already worth a ton, but has bottleneck issues. That is still, for example, where I largely am right now in my own work. Level two is to adjust your strategy to rely on the newly powerful and easy stuff more, I do some of that but that is harder.

  6. (3: 02: 45) What lack of progress would suggest that AI progress is going to take longer than Leopold expects? Leopold suggests the data wall as the most plausible cause of stagnation. Can we crack the data wall by 2026, or will we stall? Dwarkesh asks, is it a coincidence that we happen to have about enough data to train models about powerful enough at 4.5-level to potentially kick off self-play? Leopold doesn’t directly answer but says 3 OOMs (orders of magnitude) less data would have been really rough, probably we needed to be within 1 OOM.

    1. My intuition is that this is less coincidental than it looks and another of those equilibrium things. If you had less data you would find a way to get here more efficiently in data, or you had more data you would worry about data efficiency even less. Humans are super data efficient because we have to be.

    2. Intiutively, at some point getting more data on the same distribution should not much matter, the same way duplicate data does not much matter. The new is decreasingly new. Also intiutively data helps you a lot more when you are not as capable as the thing generating the data, and a lot less once you match it, and that seems like it should matter more than it seems to. But of course I am not an ML researcher or engineer.

    3. The part where something around human level is the minimum required for a model to maybe learn from itself? That’s definitely not a coincidence.

  7. (3: 06: 30) Dwarkesh is still skeptical that too much of this is first principles and in theory, not in practice. Leopold says, maybe, we’ll find out soon. Run-time horizon of thinking will be crucial. Context windows help but aren’t enough. GPT-4 has had very large post-training gains over time, and 4-level is when tools become workable.

  8. (3: 11: 00) What other domains are there where vast amounts of intelligence would accelerate you this same way this quickly? Could you have done it with flight? Leopold says there are limits, but yeah, decades of progress in a year, sure. The AI AI researchers help with many things, including robotics, you do need to try things in the physical world, although simulations are a thing too.

    1. I have sometimes used the term ‘intelligence denialism’ for those who deny that pumping dramatically more intelligence into things would make much difference. Yes, there will still be some amount of bottlenecks, but unlimited intellectual directed firepower is a huge game.

  9. (3: 14: 00) Magnitudes matter. If you multiply your AI firepower by 10 each year even now that’s a lot. Would be quite a knife’s edge story to think you need that to stay on track. Dwarkesh notices this is the opposite of the earlier story. Leopold says this a different magnitude change.

  10. (3: 17: 30) Lot of uncertainty over the 2030s, but it’s going to be fing crazy. Dwarkesh asks, what happens if the new bigger models are more expensive? If they cost $100/hour of human output? Will we have enough compute for inference? Leopold notes GPT-4 now is cheaper than GPT-3 at launch, inference costs seem largely constant. And that this continuing seems plausible.

  11. (3: 22: 15) Scaling laws keep working. Dwarkesh points out this is for the loss function they are trained on, but the new capabilities are different. Leopold thinks GPT-4 tokens are perhaps not that different from Leopold internal tokens. Leopold says it is not so crazy to think AGI within a year (!).

    1. A question I have asked several times is, if you got this theoretical ‘minimum loss’ AI, what would it look like? What could it do? What could it not do? No one I have asked has good intuitions for this.

    2. I think Leopold internal tokens are rather different from GPT-4 internal tokens. They are definitely very different in the sense that Leopold tokens are very different from Random Citizen tokens, and then more so.

This is a super frustrating segment. I did my best to give the benefit of the doubt and steelman throughout, and to gesture at the most salient problems without going too far down rabbit holes. I cite some problems here, but I mostly can only gesture and there are tons more I am skipping. What else can one do here?

  1. (3: 27: 00) Leopold’s model is that alignment is an ordinary problem, just ensuring the machines do what we want them to do, not ‘some doomer’ problem about finding a narrow survivable space.

    1. I wish it was that way. I am damn certain it’s the other way.

    2. That does not mean it cannot be done, but… not with that attitude, no.

    3. Ironically, I see Leopold here as severely lacking… situational awareness.

    4. And yes, I mean that exactly the same way he uses the term.

  2. (3: 27: 20) Dwarkesh asks, if your theory here is correct, should we not worry that alignment could fall into the wrong hands? That it could enable brainwashing, dictatorial control? Shouldn’t we keep this secret? Leopold says yes. Alignment is dual use, it enables the CCP bots, and how you get the USA bots to – and Zvi Mowshowitz is not making this up, it is a direct quote – “follow the Constitution, disobey unlawful orders, and respect separation of powers and checks and balances.”

    1. I am going to give the benefit of the doubt based on discussion that follows, and assume that this is a proxy for ‘together with the ASIs we will design, decide on and enshrine a set of rules that promote human flourishing and then get the ASIs to enforce those rules’ and when stated like that (instead of a fetish for particular mechanism designs that are unlikely to make sense, and with sufficient flexibility) it is not utter lunacy or obviously doomed.

    2. Leopold is at best still massively downplaying (as in by OOMs) how hard that is going to be to get to work. That does not mean we cannot pull it off.

    3. It is a stunning amount of contempt for the problem and the dangers, or perhaps a supreme confidence in our victory (or actual ‘better dead than red’ thinking perhaps), to think that we should be locking down our alignment secrets so the Chinese do not get them. Yes, I get that there are ways this can turn Chinese wins into American wins. This still feels like something out of Dr. Strangelove.

    4. That kind of goes double if you think the only way China catches up is if they steal our secrets anyway? So either they steal our secrets, in which case keeping alignment secret did not help, or they don’t, in which case it does not help because they lose either way? It is so, so hard to make this a good idea.

    5. Keeping alignment secret is one good way to ensure zero cooperation and an all-out race to the finish line. Even I would do it if you tried that.

    6. If this view of alignment is true, then given its failure to invest in this valuable dual use technology OpenAI is in a lot of trouble.

  3. (3: 28: 30) Dwarkesh suggests future paths. Solving alignment shuts off the fully doomed paths like (metaphorical) paperclipping. Now it is humans making decisions. You can’t predict the future, but it will be human will not AI will, and it intensifies human conflicts. Leopold essentially agrees.

    1. This ignores what I tried to call the ‘phase two’ problem. Phase one is the impossibly hard problem ‘solve alignment’ in the sense Leopold is thinking about it. For now, let’s say we do manage to solve it.

    2. Then you have to set up a stable equilibrium, despite intense human competition over the future and resources and everything humans fight about, where humans stay in control. Where it is not the right (or chosen in practice even if wrong) move to steadily hand over control of the future, or to increasingly do things that risk loss of control or other catastrophically bad outcomes. Indeed, some will intentionally seek to put those on the table to get leverage, as humans have often done in the past.

    3. Asking humanity to stay in charge of increasingly super superintelligence indefinitely is asking quite a lot. It is not a natural configuration of atoms. I would not go as far as Roman Yampolskiy who says ‘perpetual alignment is like a perpetual motion machine’ but there is wisdom in that. It is closer than I would like.

    4. That is the problem scenario we want to have. That is still far from victory.

    5. There are solutions that have been proposed, but at best and even if they work they all have big downsides. Imagining the good AI future is very hard even if you assume you live in a in many ways highly convenient world.

    6. One hope is that with access to these ASIs, humans would be wiser, better able to coordinate and use decision theory, have a much bigger surplus to divide, and with those better imaginations we would come up with a much better solution than anything we know about now. This is the steelman of Leopold’s essentially punting on this question.

    7. Synthesizing, the idea is that with ASI help we would come up with a rules set that would allow for such conflicts without allowing the move of putting human control in increasing danger. That presumably means, in its own way, giving up some well-chosen forms of control, the same way we live in a republic and not an anarchy.

  4. (3: 29: 40) Dwarkesh brings up ‘the merge’ with superintelligence plus potential market style order. Asks about rights, brainwashing, red teaming, takeovers. Notes how similar proposed ‘alignment techniques’ sound to something out of Maoist cultural revolution techniques. Leopold says sentient AI is a whole different topic and it will be important how we treat them. He reiterates that alignment is ‘a technical problem with a technical solution.’

    1. A subset of alignment is a technical problem with a technical solution. It is also a philosophical problem, and a design problem, and also other things.

    2. It would still be a huge help if we were on track to solve the technical parts of the problem. We are not.

  5. (3: 31: 25) Back to the Constitution. Leopold notes really smart people really believe in the Constitution and debate what it means and how to implement it in practice. We will need to figure out what the new Constitution looks like with AI police and AI military.

    1. So the good news is this is at least envisioning a very different set of laws and rules than our current one that the AIs will be following under this plan. I am writing the above notes with the sane and expansive version of this as my assumption.

  6. (3: 32: 20) Leopold says it is really important that each faction, even if you disagree with their values, gets their own AI, in a classical liberal way.

    1. I see the very good reasons for this, but again, if you do this then the default thing that happens is the factions steadily turn everything over to their AIs. Humanity quickly loses control, after that it probably gets worse.

    2. If you do not want that to happen, you have to prevent it from happening. You have to set up a design and an equilibrium that lets the factions do their thing without the loss of control happening. This is at best very hard.

    3. Classical liberalism has been our best option, but that involves updating how it works to match the times. Where we have failed to do that, we have already suffered very greatly, such as entire nations unable to build houses.

    4. That is all assuming that you did fully solve technical alignment.

  7. (3: 33: 00) On the technical level, why so optimistic? Timelines could vary. Dwarkesh says GPT-4 is pretty aligned, Leopold agrees. Say you pull a crank to ASI. Does a sharp left turn happen? Do agents change things? Leopold questions the sharp left turn concept but yes there are qualitative changes all along the way. We have to align the automated researcher ourselves. Say you have the RL-story to get past the data wall and you get agents with long horizons. Pre-training is alignment neutral, it has representations of everything, it is not scheming against you. The long horizon creates the bigger problems. You want to add side constraints like don’t lie or commit fraud. So you want to use RLHF, but the problem is the systems get superhuman, so things are too complex to evaluate.

    1. GPT-4’s alignment is not where we are going to need alignment to be.

    2. This is a vision where alignment is a problem because there is a fixed set of particular things you do not want the AI to do. So you check a bunch of outputs to see if Bad Things are involved, thumbs down if you find one, then it stops doing the Bad Things until the outputs are so complex you cannot tell. Of course, that implies you could tell before.

    3. To the extent that you could not tell before, or the simplest best model of your responses will fail outside distribution, or you did not consider potential things you would not like, or there are things in your actual decision process on feedback that you don’t endorse on reflection out of distribution, or there are considerations that did not come up, or there is any other solution to the ‘get thumbs up’ problem besides the one you intended, or the natural generalizations start doing things you did not want, you are screwed.

    4. I could go on but I will stop there.

  8. (3: 37: 00) Then you have the superintelligence part and that’s super scary. Failure could be really bad, and everything is changing extremely rapidly. Maybe initially we can read what the workers are thinking via chain of thought, but the more efficient way won’t let us do that. The thinking gets alien. Scary. But you can use the automated researchers to do alignment.

    1. So the plan is ‘get the AIs to do our alignment homework,’ no matter how many times there are warnings that this perhaps the worst possible task to ask an AI to do on your behalf. It encompasses anything and everything, it involves so many complexities and failure modes, and so on.

  9. (3: 39: 20) Dwarkesh says OpenAI started with people worried about exactly these things. Leopold interjects ‘but are they still there?’ A good nervous laugh. But yes, also some of the ones still there including Altman. There are still trade-offs made. Why should we be optimistic about national security people making those decisions without domain knowledge? Leopold says they might not be, but the private world is tough, the labs are racing and will get their stuff stolen. You need a clear lead. Leopold says he has faith in the mechanisms of a liberal society.

    1. Look, I love classical liberalism far more than most next guys, but this sounds more and more like some kind of mantra or faith. Classical liberalism is based on muddling through, on experimentation and error correction, on being able to react slowly, and on the ‘natural’ outcome being good because economics is awesome that way. It is about using government to create incentive and mechanism design and not to trust it to make good decisions in the breach.

    2. You can’t use that to have faith in a classical liberal government making good tactical or strategic alignment decisions in a rapidly moving unique situation. The whole point of classical liberal government is that when it makes terrible decisions it still turns out fine.

    3. ‘Vastly superior to all known alternatives and especially to the CCP’ should not be confused with a terminal value system.

  10. (3: 41: 50) If evidence is ambiguous, as in many words it will be, that is where you need the safety margin.

    1. If you have the levels of rigor described in this podcast, and the evidence looks unambiguous, you should worry quite a lot that you are not smart enough or methodical enough to not fool yourself and have made a mistake.

    2. If you have the levels of rigor described in this podcast, and the evidence looks ambiguous, you almost certainly have not solved the problem and are about to lose control of the future with unexpected results.

    3. This is one of those ‘no matter how many times you think you have adjusted for the rules above’ situations.

    4. Leopold talks a lot about this ‘safety margin’ of calendar time. I agree that this is a very good thing to have, and can plausibly turn a substantial number of losses into wins. We very much want it. But what to do with it? How are you going to use this window to actually solve the problem? The assertion Leopold makes is that this is an ‘ordinary engineering’ problem, so time is all you need.

  1. (2: 13: 12) How (the f did that happen? He really wanted out of Germany. German public school sucked, no elite colleges there, no opportunities for talent, no meritocracy. Have to get to America.

    1. This is the future America [many people] want, alas, as they attack our remaining talent funnels and outlets for our best and brightest. School is a highly oppressive design for anyone smart even when they are trying to be helpful, because the main focus remains on breaking your will, discipline and imprisonment. I can only imagine this next level.

  2. He loved college, liked the core curriculum, majored math/statistics/economics. In hindsight he would focus on finding the great professors teaching pretty much anything.

    1. This is definitely underrated if you know which classes they are.

    2. Columbia does not make it easy, between the inevitable 15+ AP credits and the 40 or so credit core curriculum and your 35-42 credit major there are not going to be many credits left to use on exploration.

  3. (2: 16: 50) Leonard wrote at 17 a novel economic paper on economic growth and existential risk and it got noticed. To him, why wouldn’t you do that? He notices he has peak productivity times and they matter a lot. Dwarkesh notices that being bipolar or manic is common among CEOs.

  4. (2: 18: 30) Why economics? Leopold notes economic thinking imbues what he does even now, straight lines on graphs. He loves the concepts but he’s down on economic academia, finding it decadent, its models too complex and fiddly. The best economic insights are conceptually very easy and intuitive once pointed out and then highly useful. Tyler Cowen warned Leopold off going to graduate school and steered him to Twitter weirdos instead, bravo.

    1. I very much endorse this model of economics. Economics to me is full of simple concepts that make perfect sense once you are in the right frame of mind and can transform how you see the world and apply everywhere. Someone does have to apply it and go into the details.

    2. The goal when reading an economics paper (or taking an economics course!) is to be a distillation learning algorithm that extracts the much shorter version that contains the actual crisp insights. If there is a 50 page economic paper and I have to read it all in order to understand it, it is almost never going to be all that interesting or important.

  5. (2: 22: 10) Leopold says the best insights still require a lot of work to get the crisp insight.

    1. Yes and no, for me? Sometimes the crisp insight is actually super intuitive and easy. Perhaps this is because one already ‘did the work’ of getting the right frame of mind, and often they did the work of searching the space.

    2. A lot of my frustration with economists on AI seems to be a clash of crisp insights? They want to draw straight lines on historical linear graphs, apply historical patterns forward, demand particular models to various degrees, assume that anyone worrying about technological unemployment or other disruptive technology or runaway growth or unbridled competition and selection or having any confidence in smart actors to defeat Hayekian wisdom is being foolish.

    3. They think this because inside their training samples of everything that ever happened, they’re right, and they’ve crystalized that in highly useful ways. Also like everyone else they find it hard (and perhaps scary) to imagine the things that are about to happen. They lack Leopold’s situational awareness. They look for standard economic reasons things won’t happen, demand you model this.

    4. This actually parallels a key issue in machine learning and alignment, perhaps? You are training on the past to distill a set of heuristics that predicts future output. When the future looks like the past, and you are within distribution in the ways that count and the implicit assumptions hold well enough, this can work great.

    5. However, what happens when those assumptions break? A lot of the dynamics we are counting on revolve around limitations of humans, and there not existing other things with different profiles whose capabilities match or dominate those of humans. Things turning out well for the humans relies on them being competitive, having something to offer, being scope limited with decreasing marginal returns, and having values and behaviors that are largely hardcoded. And on our understanding of the action space and physical affordances.

    6. All of that is about to break if capabilities continue on their straight lines. A lot of it breaks no matter what, and then a lot more breaks when these new entities are no longer ‘mere tools.’

    7. There is a new set of crisp insights that applies to such situations, that now seems highly intuitive to some people to a large extent, but it is like a new form of economics. And the same as a lot of people really don’t ‘get’ simple economic principles like supply and demand, even fewer people get the new concepts, and their brains largely work to avoid understanding.

    8. So I have a lot of sympathy, but also come on everyone, stop being dense.

  6. (2: 22: 20) Valedictorian is highest average grade, so average productivity, how did that happen here too? He loved all this stuff.

    1. It is not only highest average, it is highly negative selection, very punishing. The moment I got my first bad grade in college I essentially stopped caring about GPA due to this, there was nothing to win.

  7. (2: 24: 00) A key lesson of the horrible situation in Germany was that trying works. The people with agency become the people who matter.

  8. (2: 25: 25) Life history, Leopold did a bit of econ research after college, then went to Future Fund, funded by SBF and FTX. Plan was for four people to move fast, break things and deploy billions to remake philanthropy. Real shame about FTX being a fraud and SBF doing all the crime, collapsing the whole thing overnight. He notes the tendency to give successful CEOs a pass on their behavior, and says it is important to pay attention to character.

  1. (3: 42: 15) What was different about Germany or even all of Europe after WW2, versus other disasters that killed similar portions of populations? Why aren’t we discussing Europe in all this? Leopold is very bearish on Germany, although he still thinks Germany is top 5 and has strong state capacity. USA has creativity and a ‘wild west’ feeling you don’t see in Germany with its rule following and backlash against anything elite.

  2. (3: 45: 00) Why turn against elitism? Response to WW2 was way harsher than WW1, imposition of new political systems, country in ruins, but it worked out better, maybe don’t wake the sleeping beast even if it is too sleepy.

    1. I do not think it is obvious that the post-WW2 treatment was harsher. Imposing a ruinous debt burden is quite terrible, whereas after WW2 there was interest in making each side’s Germany prosperous. Destroying half the housing stock is terrible but it can be rebuilt.

  3. (3: 46: 30) Chinese and German elite selection is very conformist, for better and worse. America is not like that. To Leopold China is worryingly impenetrable. What is the state of mind or debate? Dwarkesh is thinking about going to China, asking for help on that.

    1. Those who warn or worry about China do not seem to think much of this dynamic. To me it seems like a huge deal. China’s system does not allow for exactly the types of cultural contexts and dynamics that are the secret of American progress in AI. For all the talk of how various things could cripple American AI or are holding it back, China is already doing lots of far more crippling things (to be clear, not for existential risk related reasons, unless maybe you mean existential to the regime).

    2. Leopold’s model here does say China would have to steal the algorithms or weights to catch up, which reconciles this far more than most warnings.

  4. (3: 50: 00) ByteDance cold emailed everyone on the Gemini paper with generous offers to try and recruit them. How much of the alpha from a lab could one such person bring? Leopold says a lot, if the person was intentional about it. Whereas China doesn’t let its senior AI researchers leave the country.

    1. Sure, why not? Worth a shot. Plausibly should have bid ten times higher.

    2. It is indeed scary what one of a large number of people could do here, less so if it has to all be in their head but even then. As Leopold says, we need to lock a variety of things down.

  5. (3: 52: 30) What perspective is Leopold missing? Insight on China. How normal people in America will (or won’t) engage with AI, or react to it. Dwarkesh mentions Tucker Carlson’s mention that nukes are always immoral except when you use them on data centers to stop superintelligence that might enslave humanity. Political positions can flip. Technocratic proposals might not have space, it might be crude reactions only.

    1. Unless I am missing something big: The more you believe we will do crude reactions later without ability to do anything else, the more you should push for technocratic solutions to be implemented now.

    2. If you think the alternative to technocratic solutions now is technocratic solutions later, and later will let you know better what the right solution looks like, and you think a mistake now would get worse over time, and nothing too bad was going to happen soon, then it would make sense to wait. This goes double if you think a future very light touch is plausibly good enough.

    3. However, if you think that the alternative to technocratic solutions now is poorly considered blunt solutions later, largely based on panic and emotion and short term avoidance of blame, then that does not make sense. You need to design things as best you can now, because you won’t get to design them later, especially if you have not laid groundwork.

    4. This is especially true if failure to act now constrains our options in the future. Not locking down the labs now plausibly means much harsher actions later after things are stolen. Allowing actively dangerous future open models to be released in ways that cannot be undone, and especially failing to prevent an AI-caused or AI-enabled catastrophic event, could plausibly force a draconian response.

    5. At minimum, we need desperately to push for visibility and nimble state capacity, so that we can know what is going on and what to do, and have the ability to choose technocratic solutions over blunt solutions. The option to do nothing indefinitely is not on the table even if there are no existential risks, the public wouldn’t allow it and neither would the national security state.

    6. The parallel to Covid response may be helpful here. If you did not get proactive early, you paid the price later via politically forced overreactions, and got worse outcomes all around. There are a lot of metaphorical ‘we should not mandate investments in Covid testing’ positions running around, or even metaphorical calls to do as we actually did at first and try to ban testing.

    7. Scott Sumner might use the example of monetary policy. Fail to properly adjust the expected future path rates as circumstances change, making policy too tight or loose, and you end up raising or lowering interest rates far more to fix the problem than you would have moved them if you had done it earlier.

  6. (3: 55: 30) When the time comes, you will want the security guards.

  7. (3: 55: 35) China will read Leopold’s paper too. What about the tradeoff of causing the issues you warn about? Cat is largely out of the bag, China already knows, and we need to wake up. Tough trade-off, he hopes more of us read it than they do.

    1. This echoes questions rationalists and those worried about AI have thought about and dealt with for two decades now. To what extent might your warnings and efforts cause, worsen or accelerate the exact thing you are trying to prevent or slow down?

    2. The answer was plausibly quite a lot. All three major labs (DeepMind, OpenAI and Anthropic) were directly founded in response to these concerns. The warnings about existential risk proved far more dangerous than the technical details people worried about sharing. Meanwhile, although I do think we laid foundations that are now proving highly useful as things move forward, those things we deliberately did not discuss plausibly held back our ability to make progress that would have helped, and were in hindsight unlikely to have made things worse in other ways. A key cautionary tale.

    3. In this case, if Leopold believes his own model, he should worry that he is not only waking the CCP up to AGI and the stakes, he is also making cooperation even harder than it already was. If you are CCP and reading situational awareness, your hopes for cooperating with America are growing dim. Meanwhile, you are all but being told to go steal all our secrets before we wake up, and prepare to race.

    4. There is a continual flux in Leopold’s talk, and I think his actual beliefs, between when he is being normative and when he is being descriptive. He says repeatedly that his statements are descriptive, that he is saying The Project (national effort to build an ASI) will happen, rather than that it should happen. But at other times he very clearly indicates he thinks it also should happen. And at times like this, he indicates that he is worried that it might not happen, and he wants to ensure that it happens, not merely steer the results of any such project in good directions. Mostly I think he is effectively saying both that the path he predicts is going to happen, and also that it is good and right, and that we should do it faster and harder.

  1. (3: 57: 50) Dwarkesh’s immigration story. He got here at 8 but he came very close to being kicked out at 21 and having to start the process again. He only got his green card a few months before the deadline for highly contingent reasons. Made Dwarkesh realize he needed to never be a code monkey, which was otherwise his default path. Future Fund giving him $20k and several other contingent things helped Dwarkesh stay on his path.

    1. The whole thing is totally nuts. Everyone agrees (including both parties) that we desperately need lots more high skill immigration and to make the process work, things would be so much better in every way, in addition to helping with AI. If we want to have a great country, to ‘beat China’ in any sense, this should be very high up on our priorities list and is plausibly at the top. Why do we only grant 20% of HB-1 visas? Why do we kick graduates of our colleges out of the country? Very few people actually want these things.

    2. Yet the fix does not happen, because to ‘make a deal’ on immigration in general is impossible due to disagreements about low skill immigration, and the parties are unable to set that aside and deal with this on its own. Their bases will not let them, or they think it would be unstrategic, and all that.

    3. Standard exhortations to ‘lock the people in a room’ and what not until this happens, or to use executive power to work around much of this.

    4. Spending in the high leverage spots is so amazingly better. $20k!

  1. (4: 03: 15) Convert to mormonism for real if you could? Leopold draws parallel of a mormon outside Utah to being an outsider in Germany, giving you strength. He also notes the fertility rate question and whether isolation can scale. Notes the value of believing in and serving something greater than yourself.

  2. (4: 06: 20) At OpenAI, Dwarkesh notes that plenty of financially ironclad employees had to have similar concerns, but only Leopold, a 22-year-old with less than a year there and little in savings, made a fuss.

  1. (4: 08: 00) Leopold is launching one. Why? The post-AGI period will be important and there is a lot of money to be made. It gives freedom and independence. Puts him in a position to advise.

    1. I have not seen good outcomes so far from ‘invest to have a seat at the table.’

    2. If you are investing betting on AGI, I think you will have very good expected returns in dollars. That does not obviously mean you have high expected returns in utility. Ask in what worlds money has how much marginal utility.

    3. Also ask what impact your investing has on the path to AGI. Many companies are already saturated with capital, if you buy Nvidia stock they do not then invest more money in making chips. Startups are different. Leopold of course might say that is good actually.

  2. (4: 11: 15) Worried about timing? Not blowing up is important. They will bet on fast AGI, otherwise firm will not do well. Sequence of bets is critical. Last year Nvidia was the only real play. In the future utilities and companies like Google get involved but right now they are not so big on AI. He expects high interest rates (perhaps >10% by end of the decade) creating tailwinds on stocks, higher growth rates might not depress stocks. Nationalization. The big short on bonds. Bets on the tails.

    1. There are a lot of different ways to play this sort of situation.

    2. If you want to get maximum effective leverage and exposure, then yes in many ways you will become progressively more exposed to timing and getting the details right.

    3. If you are willing to take a more conservative approach and use less leverage in multiple senses, you can get a less exposure but still a lot of exposure to the underlying factors without also being massively exposed to the timing and details. You can make otherwise solid plays. That’s my typical move.

  3. (4: 16: 15) Dwarkesh asks the important question of whether your property rights will be respected in these worlds. Will your Fidelity account be worth anything? Leopold thinks yes, until the galaxy rights phase.

    1. Leopold does not express how certain he is here. I certainly would not be so confident. The history of property rights holding under transformations is not that great and this is going to be far crazier. Even if they technically hold, one should not assume that will obviously matter.

  4. (1: 16: 45) A lot of the idea is Leopold wants to get capital to have influence. Dwarkesh notes the ‘landed gentry’ from before the industrial revolution did not get great returns, and most benefits from progress were diffused. Leopold notes that the actual analogue is you would sell your land and invest it in industry. Whereas human capital is going to decay rapidly, so this is a hedge.

    1. The good news is that the landed gentry in many places survived intact and did fine. Others now have more money, but they do not obviously have less. In other places, of course, not so much, but you had a shot.

    2. I do think the hedge on human capital depreciation argument has merit. If AGI does not arrive and civilization continues, then anyone with strong human capital does not need that much financial capital, especially if you are as young as Leopold. We wouldn’t like it and it would be insane to get into such a position, but if necessary most of us could totally pull an ‘if.

    3. Whereas if you think AGI means there is lots of wealth and production but your human capital is worthless, They Took Our Jobs comes for Leopold, but you expect property rights to hold and people to mostly be fine, then yes you might highly value having enough capital. The UBI might not show up and it might not be that generous, and there might be wonders for sale. Note that a lot of the value here is a threshold wealth effect where you can survive.

  5. (4: 18: 30) The economist or Tyler Cowen question: Why has AGI not been priced in? Aren’t markets efficient? Leopold used to be a true EMH (efficient markets hypothesis) guy, but has changed his mind. Groups can have alpha in seeing the future, similar to Covid. Not many people take these ideas seriously.

    1. In a sense this begs the question. The market is failing to price it in because the market is failing to price it in. But also that is a real answer. It is an explicit rejection, which I share, of the EMH in spots like this. Yes, we have enough information to say the market is being bonkers. And yes, we know why we are able to make this trade, the market is bonkers because society is asleep at the wheel on this, and the market is made up of those people. Those who know do not have that much capital.

    2. Rather than offer additional arguments I will say this all seems straightforwardly and obviously true at this point.

    3. On AI in particular, the market and economic forecasts are not even pricing in the sane bear case for AI, let alone pricing in potential AGI.

    4. Your periodic reminder: Substantial existential risk does not change this so much. If the world ends with 50% probability, then you factor that in. That does not mean that in those worlds that the world will ‘wake up’ to the situation and crash the market or otherwise give you an opportunity, see market prices during the Cuban Missile Crisis to show that even everyone knowing about such dangers did not move things much. And it does not mean that, even if you could indeed make a lot of money this way, you would have had anything useful to do with money before the end. If the universe definitely ends in a month no matter what I do, giving me a trillion dollars would be of little utility. What would I do with it that I can’t do already?

  1. (4: 20: 00) Why did the Allies make better overall decisions than the Axis? Leopold thinks Blitzkrieg was forced, because they could not win a long war industrially. The invasion of Russia was about the resources needed to fight the West, especially oil. Lots of men died in the Eastern front but German industrial might largely was directed West.

  2. (4: 22: 00) China builds like 200 times more ships than we do. Over time in a war China could mobilize industrial resources better than we could. Or for AI if this all came down to a building game.

    1. We don’t build ships because of the Jones Act. Yes, it claims to be protecting American shipbuilding, but it destroyed it instead through lack of competition, now we simply don’t have any ships. And we also can’t buy them from Japan and South Korea and Europe for the same reason. This is all very dumb, but also the important thing is that we need the ships, not to build the ships. Donate to Balsa Research today to help us repeal the Jones Act.

    2. This is a very strange view of a potential future war, where both sides mobilize their industrial might over years in a total war fashion without things going nuclear, and where America is presumably largely cut off from trade and allies. We cannot rule that scenario out, but it is super weird, no?

    3. I would not count out American industrial might in a long war. Right now we make plenty of things when it would make economic sense to do so and we make doing so legal. But we do not make many other things because it is not in our economic interest to make those things, and because we often make it illegal or prohibitively annoying and expensive and slow to make things. That is a set of choices we could reverse.

    4. Also in this scenario, America would have a large AI advantage over China, and no I do not think some amount of espionage will do it.

    5. Could we still get outproduced long term by China with its much larger population? Absolutely, but people keep betting against America in these situations and they keep losing.

  3. (4: 23: 15) Leopold asks, will we let the robot factories and robot armies run wild? He says we won’t but maybe China will.

    1. Seriously, why are we assuming America will definitely act all responsible and safe in these situations, but thinking China might not?

    2. I wonder if Leopold has read The Doomsday Machine. We do not exactly have a great track record of making war plans that would not cause an apocalypse.

  4. (4: 23: 55) What do you do with industrial strength ASI? Not (only) chatbots. Oil transformed America before we even invented cars. What do we do once we have our intelligence explosion and lots of compute? How will everyone react.

  5. (4: 26: 50) Changing your mind is really important. Leopold says many ‘doomers’ were prescient early, but have not updated to the realities of deep learning, their proposals are naive and don’t make sense, people come in with a predefined ideology.

    1. Shots. Fired.

    2. I know who talks about changing their mind and works on it a lot and I see doing it a lot, and who I do not. I will let you judge.

    3. I see lots of the proposals by many on alignment, including Leopold here and in his paper, as being naive and not making sense and not reflecting the underlying realities, so there you go.

    4. On ‘the realities of deep learning’ I think there are some people making this mistake, but more common is accusing people of making this mistake without checking if the mistake is being made. Or claiming that the update that can be made without being an engineer at a top lab is not the true update, you can’t know what it is like without [whatever thing you haven’t done].

    5. Also this ‘update on realities’ is usually code for saying: I believe all future systems will of course be like current systems, except more intelligent, there is only empiricism and curve extrapolation. Anyone who thinks that is not true, they are saying, is hopelessly naive and not getting with the zeitgeist.

  6. (4: 27: 15) Leopold notes e/accs shitpost but they are not thinking through the technology.

    1. Well, yes.

  7. (4: 27: 25) There is risk in writing down your worldview. You get attached to it. So he wants to be clear that painting a concrete picture is valuable, and that this is Leopold’s best guess for the next decade, and anything like this will be wild. But we will learn more soon, and will need to update and stay sane.

    1. Yes, strongly endorsed. I am very happy Leopold wrote down what he actually believes and was highly concrete about it. This is The Way. And yes, one big danger is that this could make it difficult for Leopold to change his mind when the situation changes or he hears better arguments or thinks more. It is good that he is noticing that too.

  8. (4: 28: 15) The point that Patrick McKenzie correctly says he cannot emphasize enough. That there need to be good people willing to stare this in the face and do what needs to be done. It seems worth quoting again from the paper here, because yeah, we can’t say it enough.

But the scariest realization is that there is no crack team coming to handle this. As a kid you have this glorified view of the world, that when things get real there are the heroic scientists, the uber- competent military men, the calm leaders who are on it, who will save the day. It is not so. The world is incredibly small; when the facade comes off, it’s usually just a few folks behind the scenes who are the live players, who are desperately trying to keep things from falling apart.

On Dwarkesh’s Podcast with Leopold Aschenbrenner Read More »

on-dwarkesh’s-podcast-with-openai’s-john-schulman

On Dwarkesh’s Podcast with OpenAI’s John Schulman

Dwarkesh Patel recorded a Podcast with John Schulman, cofounder of OpenAI and at the time their head of current model post-training. Transcript here. John’s job at the time was to make the current AIs do what OpenAI wanted them to do. That is an important task, but one that employs techniques that their at-the-time head of alignment, Jan Leike, made clear we should not expect to work on future more capable systems. I strongly agree with Leike on that.

Then Sutskever left and Leike resigned, and John Schulman was made the new head of alignment, now charged with what superalignment efforts remain at OpenAI to give us the ability to control future AGIs and ASIs.

This gives us a golden opportunity to assess where his head is at, without him knowing he was about to step into that role.

There is no question that John Schulman is a heavyweight. He executes and ships. He knows machine learning. He knows post-training and mundane alignment.

The question is, does he think well about this new job that has been thrust upon him?

Overall I was pleasantly surprised and impressed.

In particular, I was impressed by John’s willingness to accept uncertainty and not knowing things.

He does not have a good plan for alignment, but he is far less confused about this fact than most others in similar positions.

He does not know how to best navigate the situation if AGI suddenly happened ahead of schedule in multiple places within a short time frame, but I have not ever heard a good plan for that scenario, and his speculations seem about as directionally correct and helpful as one could hope for there.

Are there answers that are cause for concern, and places where he needs to fix misconceptions as quickly as possible? Oh, hell yes.

His reactions to potential scenarios involved radically insufficient amounts of slowing down, halting and catching fire, freaking out and general understanding of the stakes.

Some of that I think was about John and others at OpenAI using a very weak definition of AGI (perhaps partly because of the Microsoft deal?) but also partly he does not seem to appreciate what it would mean to have an AI doing his job, which he says he expects in a median of five years.

His answer on instrumental convergence is worrisome, as others have pointed out. He dismisses concerns that an AI given a bounded task would start doing things outside the intuitive task scope, or the dangers of an AI ‘doing a bunch of wacky things’ a human would not have expected. On the plus side, it shows understanding of the key concepts on a basic (but not yet deep) level, and he readily admits it is an issue with commands that are likely to be given in practice, such as ‘make money.’

In general, he seems willing to react to advanced capabilities by essentially scaling up various messy solutions in ways that I predict would stop working at that scale or with something that outsmarts you and that has unanticipated affordances and reason to route around typical in-distribution behaviors. He does not seem to have given sufficient thought to what happens when a lot of his assumptions start breaking all at once, exactly because the AI is now capable enough to be properly dangerous.

As with the rest of OpenAI, another load-bearing assumption is presuming gradual changes throughout all this, including assuming past techniques will not break. I worry that will not hold.

He has some common confusions about regulatory options and where we have viable intervention points within competitive dynamics and game theory, but that’s understandable, and also was at the time very much not his department.

As with many others, there seems to be a disconnect. A lot of the thinking here seems like excellent practical thinking about mundane AI in pre-transformative-AI worlds, whether or not you choose to call that thing ‘AGI.’ Indeed, much of it seems built (despite John explicitly not expecting this) upon the idea of a form of capabilities plateau, where further progress is things like modalities and making the AI more helpful via post-training and helping it maintain longer chains of actions without the AI being that much smarter.

Then he clearly says we won’t spend much time in such worlds. He expects transformative improvements, such as a median of five years before AI does his job.

Most of all, I came away with the impression that this was a person thinking and trying to figure things out and solve problems. He is making many mistakes a person in his new position cannot afford to make for long, but this was a ‘day minus one’ interview, and I presume he will be able to talk to Jan Leike and others who can help him get up to speed.

I did not think the approach of Leike and Sutskever would work either, I was hoping they would figure this out and then pivot (or, perhaps, prove me wrong, kids.) Sutskever in particular seemed to have some ideas that felt pretty off-base, but with a fierce reputation for correcting course as needed. Fresh eyes are not the worst thing.

Are there things in this interview that should freak you out, aside from where I think John is making conceptual mistakes as noted above and later in detail?

That depends on what you already knew. If you did not know the general timelines and expectations of those at OpenAI? If you did not know that their safety work is not remotely ready for AGI or on track to get there and they likely are not on track to even be ready for GPT-5, as Jan Leike warned us? If you did not know that coordination is hard and game theory and competitive dynamics are hard to overcome? Then yeah, you are going to get rather a bit blackpilled. But that was all known beforehand.

Whereas, did you expect someone at OpenAI, who was previously willing to work on their capabilities teams given everything we now know, having a much better understanding of and perspective on AI safety than the one expressed here? To be a much better thinker than this? That does not seem plausible.

Given everything that we now know has happened at OpenAI, John Schulman seems like the best case scenario to step into this role. His thinking on alignment is not where it needs to be, but it is at a place he can move down the path, and he appears to be a serious thinker. He is a co-founder and knows his stuff, and has created tons of value for OpenAI, so hopefully he can be taken seriously and fight for resources and procedures, and to if necessary raise alarm bells about models, or other kinds of alarm bells to the public or the board. Internally, he is in every sense highly credible.

Like most others, I am to put it mildly not currently optimistic about OpenAI from a safety or an ethical perspective. The superalignment team, before its top members were largely purged and its remaining members dispersed, was denied the resources they were very publicly promised, with Jan Leike raising alarm bells on the way out. The recent revelations with deceptive and coercive practices around NDAs and non-disparagement agreements are not things that arise at companies I would want handling such grave matters, and they shine new light on everything else we know. The lying and other choices around GPT-4o’s Sky voice only reinforce this pattern.

So to John Schulman, who is now stepping into one of the most important and hardest jobs under exceedingly difficult conditions, I want to say, sincerely: Good luck. We wish you all the best. If you ever want to talk, I’m here.

This follows my usual podcast analysis format. I’ll offer comments with timestamps.

To make things clearer, things said in the main notes are what Dwarkesh and John are saying, and things in secondary notes are my thoughts.

  1. (2: 40) What do we anticipate by the end of the year? The next five years? The models will get better but in what ways? In 1-2 years they will do more involved tasks like carrying out an entire coding project based on high level instructions.

  2. (4: 00) This comes from training models to do harder tasks and multi-step tasks via RL. There’s lots of low-hanging fruit. Also they will get better error recovery and ability to deal with edge cases, and more sample efficient. They will generalize better, including generalizing from examples of ‘getting back on track’ in the training data, which they will use to learn to get back on track.

    1. The interesting thing he did not say yet is ‘the models will be smarter.’

    2. Instead he says ‘stronger model’ but this vision is more that a stronger model is more robust and learns from less data. Those are different things.

  3. (6: 50) What will it take for how much robustness? Now he mentions the need for more ‘model intelligence.’ He expects clean scaling laws, with potential de facto phase transitions. John notes we plan on different timescales and complexity levels using the same mental functions and expects that to apply to AI also.

  4. (9: 20) Would greater coherence mean human-level intelligence? John gives a wise ‘I don’t know’ and expects various other deficits and issues, but thinks this going quite far is plausible.

  5. (10: 50) What other bottlenecks might remain? He speculates perhaps something like taste or ability to handle ambiguity, or other mundane barriers, which he expects not to last.

    1. This seems like a focus on the micro at the expense of the bigger picture? It seems to reinforce an underlying implicit theory that the underlying ‘raw G’ is not going to much improve, and your wins come from better utilization. It is not obvious how far John thinks you can take that.

  6. (12: 00) What will the multimodal AI UI look like? AIs should be able to use human websites via vision. Some could benefit from redesigns to make AI interactions easier via text representations, but mostly the AIs will be the ones that adapt.

    1. That seems bizarre to me, at least for websites that have very large user bases. Wouldn’t you want to build a parallel system for AIs even if they could handle the original one? It seems highly efficient and you should capture some gains.

  7. (13: 40) Any surprising generalizations? Some in post-training, such as English fine-tuning working in other languages. He also mentions a tiny amount of data (only ~30 examples) doing the trick of universally teaching the model it couldn’t do things like order an Uber or send an Email.

  8. (16: 15) Human models next year? Will these new abilities do that, if not why not? John points out coherence is far from the only issue with today’s models.

    1. This whole frame of ‘improved coherence with the same underlying capabilities otherwise’ is so weird a hypothetical to dive into this deeply, unless you have reason to expect it. Spider senses are tingling. And yet…

  9. (17: 15) Dwarkesh asks if we should expect AGI soon. John says that would be reasonable (and will later give a 5 year timeline to replace his own job.) So Dwarkesh asks: What’s the plan? John says: “Well, if it came sooner than expected, we would want to be careful. We might want to slow down a little bit on training and deployment until we’re pretty sure we can deal with it safely. We would have a good handle on what it’s going to do and what it can do. We would have to be very careful if it happened way sooner than expected. Because our understanding is still rudimentary in a lot of ways.”

    1. You keep using that word? What were we even talking about before? Slow down a little bit? Pretty sure? I am going to give the benefit of the doubt, and say that this does not sound like much of an AGI.

    2. This seems like the right answer directionally, but with insufficient caution and freaking out, even if this is a relatively weak AGI? If this happens as a surprise, I would quite deliberately freak out.

  10. (18: 05) Dwarkesh follows up. What would ‘being careful’ mean? Presumably you’re already careful, right? John says, maybe it means not training the even smarter version or being really careful when you do train it that it’s properly sandboxed ‘and everything,’ not deploying it at scale.

    1. Again, that seems directionally right, but magnitude poor and that’s assuming the AGI definition is relative weaksauce. The main adjustment for ‘we made AGI when we didn’t expect it’ is to move somewhat slower on the next model?

    2. I mean it seems like ‘what to do with the AGI we have’ here is more or less ‘deploy it to all our users and see what happens’? I mean, man, I dunno.

  11. Let’s say AGI turns out to be easier than we expect and happens next year, and you’re deploying in a ‘measured way,’ but you wait and then other companies catch up. Now what does everyone do? John notes the obvious game theory issues, says we need some coordination so people can agree on some limits to deployment to avoid race dynamics and compromises on safety.

    1. This emphasizes that we urgently need an explicit antitrust exemption for exactly this scenario. At a bare minimum, I would hope we could all agree that AI labs need to able to coordinate and agree to delay development or deployment of future frontier models to allow time for safety work. The least the government can do, in that situation, is avoid making the problem worse.

    2. Norvid Studies: The Dwarkesh Schulman conversation is one of the crazier interviews I’ve ever heard. The combination of “AGI-for-real may fall out automatically from locked-in training in 1 to 3 years” and “when it happens I guess we’ll uh, maybe labs will coordinate, we’ll try to figure that out.”

    3. I read John here as saying he does not expect this to happen, that it would be a surprise and within a year would be a very large surprise (which seems to imply not GPT-5?) but yes that it is possible. John does not pretend that this coordination would then happen, or that he’s given it a ton of thought (nor was it his job), instead correctly noting that it is what would be necessary.

    4. His failure to pretend here is virtuous. He is alerting us to the real situation of what would happen if AGI did arrive soon in many places. Which is quite bad. I would prefer a different answer but only if it was true.

    5. Justin Halford: Schulman’s body language during the portion on game theory/coordination was clear – universal coordination is not going to happen. Firms and nation states will forge the path at a blistering pace. There is not a clear incentive to do anything but compete.

    6. I saw talk about how calm he was here. To my eyes, he was nervous but indeed insufficiently freaked out as I noted above. But also he’s had a while to let such things sink in, he shouldn’t be having the kind of emotional reaction you get when you first realize this scenario might happen.

  12. (20: 15) Pause what then? Deployment, training, some types of training, set up some reasonable rules for what everyone should do.

    1. I’m fine with the vagueness here. You were surprised by the capabilities in question, you should update on that and respond accordingly. I would still prefer the baseline be ‘do not train anything past this point and keep the AGI very carefully sandboxed at minimum until safety is robustly established.’

    2. That is true even in the absence of any of the weirder scenarios. True AGI is a big freaking deal. Know what you are doing before deployment.

  13. (21: 00) OK, suppose a pause. What’s the plan? John doesn’t have a good answer, but if everyone can coordinate like that it would be an OK scenario. He does notice that maintaining the equilibrium would be difficult.

    1. I actually give this answer high marks. John is being great all around about noticing and admitting confusion and not making up answers. He also notes how fortunate we would be to be capable of this coordination at all.

    2. I presume that if we did get there, that the government would then either be amenable to enshrining the agreement and extending it, or they would actively betray us all and demand the work resume. It seems implausible they would let it play out on its own.

  14. (22: 20) Dwarkesh pushes. Why is this scenario good? John says we could then solve technical problems and coordinate to deploy smart technical AIs with safeguards in place, which would be great, prosperity, science, good things. That’s the good scenario.

    1. The issue is this assumes both even stronger coordination on deployment, which could be far harder than coordination on pausing, making a collective decision to hold back including internationally, and it supposes that we figure out how to make the AI safety work on our behalf.

    2. Again, I wish we had better answers all around, but given that we do not admitting we don’t have them is the best answer available.

  15. (23: 15) What would be proof the systems were safe to deploy? John proposes incremental deployment of smarter systems, he’d prefer to avoid the lockdown scenario. Better to continuously release incremental improvements, each of which improves safety and alignment alongside capability, with ability to slow down if things look scary. If you did have a discontinuous jump? No generic answer, but maybe a lot of testing simulated deployment and red teaming, under conditions more likely to fail than the real world, and have good monitoring. Defense in depth, good morals instilled, monitoring for trouble.

    1. Again I love the clear admission that he doesn’t know many things.

    2. Incremental deployment has its advantages, but there is an underlying assumption that alignment and safety are amenable to incremental progress as well, and that there won’t be any critical jumps or inflection points where capabilities effectively jump or alignment techniques stop working in various ways. I’d have liked to see these assumptions noted, especially since I think they are not true.

    3. We are in ‘incremental deployment’ mode right now because we went 4→Turbo→4o while others were catching up but I expect 5 to be a big jump.

  16. (26: 30) How to notice a discontinuous jump? Should we do these long-range trainings given that risk? Evals. Lots of evals. RIght now, John says, we’re safe, but in future we will need to check if they’re going to turn against us, and look for discontinuous jumps. ‘That doesn’t seem like the hardest thing to do. The way we train them with RLHF, even though the models are very smart, the model is just trying to produce something that is pleasing to a human. It has no other concerns in the world other than whether this text is approved.’ Then he notices tool use over many steps might change that, but ‘it wouldn’t have any incentive to do anything except produce a very high quality output at the end.’

    1. So this is the first answer that made me think ‘oh no.’ Eliezer has tried to explain so many times why it’s the other way. I have now tried many times to explain why it’s the other way. Or rather, why at some point in the capability curve it becomes the other way, possibly all at once, and you should not be confident you will notice.

    2. No, I’m not going to try again to explain it here. I do try a bit near the end.

  17. (29: 00) He mentions the full instrumental convergence scenario of ‘first take over the world’ and says it’s a little hard to imagine. Maybe with a task like ‘make money’ that would be different and lead to nefarious instrumental goals.

    1. So close to getting it.

    2. Feels like there’s an absurdity heuristic blocking him from quite getting there.

    3. If John really does dive deep into these questions, seems like he’ll get it.

  1. (30: 00) Psychologically what kind of thing is being changed by RLHF? John emphasizes this is an analogy, like the satisfaction you get from achieving a goal, one can metaphorically think of the models as having meaningful drives and goals.

    1. I love the balanced approach here.

  2. (31: 30) What is the best approach to get good reasoning? Train on chains of thought, or do inference in deployment? John says you could think of reasoning as tasks that require computation or deduction at test time, and that you should use a mix of both.

    1. Yep, seems right to me.

  3. (33: 45) Is there a path between in-context learning and pre-training, some kind of medium-term memory? What would ‘doing the research for the task’ or ‘looking into what matters here that you don’t know’ look like? John says this is missing from today’s systems and has been neglected. Instead we scale everything including the context window. But you’d want to supplement that through fine-tuning.

    1. This suggests a kind of lightweight, single-use automated fine-tuning regime?

    2. Currently this is done through scaffolding, chain of thought and external memory for context, as I understand this, but given how few-shot fine-tuning can be and still be effective, this does seem underexplored?

  4. (37: 30) What about long horizon tasks? You’re learning as you go so your learning and memory must update. Really long context also works but John suggests you also want fine tuning, and you might get active learning soon.

  5. (39: 30) What RL methods will carry forward to this? John says policy grading is not sample efficient, similar to motor learning in animals, so don’t use that at test time. You want in-context learning with a learned algorithm, things that look like learned search algorithms.

  6. (41: 15) Shift to personal history and experiences. Prior to ChatGPT they had ‘instruction following models’ that would at least do things like answer questions. They did a bunch of work to make the models more usable. Coding was a clear early use case. They had browsing early but they de-emphasized it. Chat orientation made it all much easier, people knew what to reinforce.

  7. (47: 30) Creating ChatGPT requires several iterations of bespoke fine-tuning.

  8. (49: 40) AI progress has been faster than John expected since GPT-2. John’s expectations pivot was after GPT-3.

  9. (50: 30) John says post-training likely will take up a larger portion of training costs over time. They’ve found a lot of gains through post-training.

  10. (51: 30) The improvement in Elo score for GPT-4o is post-training.

    1. Note: It was a 100-point Elo improvement based on the ‘gpt2’ tests prior to release, but GPT-4o itself while still on top saw only a more modest increase.

  11. (52: 40) What makes a good ML researcher? Diverse experience. Knows what to look for. Emperia and techne, rather than metis.

  12. (53: 45) Plateau? Can data enable more progress? How much cross-progress? John correctly warns us that it has not been so long since GPT-4. He does not expect us to hit the data wall right away but that we will approach it soon and this will change training. He also notes that running experiments on GPT-4 level training runs are too expensive to be practical, but you could run ablation experiments on GPT-2 level models, but John notes that transfer failure at small scale only provides weak evidence for what happens at large scale.

  13. (57: 45) Why does more parameters make a model smarter on less data? John does not think anyone understand the mechanisms of scaling laws for parameter counts. John speculates that the extra parameters allow more computations and better residual streams and doing more things in parallel. You can have a bigger library of functions you can chain together.

  1. (1: 01: 00) What other modalities and impacts should we expect over the next few years? New modalities coming soon and over time. Capabilities will improve through a combination of pre-training and post-training. Higher impact on economy over time, even if model abilities were frozen. Much more wide use and for more technically sophisticated tasks. Science analysis and progress. Hopefully humans are still in command and directing the AIs.

    1. This all seems right and very much like the things that are baked in even with disappointing AI progress. I continue to be baffled by the economists who disagree that similar changes are coming.

    2. What this does not sound like is what I would think about as AGI.

  2. (1: 05: 00) What happens on the path to when AI is better at everything? Is that gradual? Will the systems stay aligned? John says maybe not jump to AIs running whole firms, maybe have people oversee key decisions. Hopefully humans are still the drivers of what AIs end up doing.

    1. Agreed, but how do we make that happen, when incentives run against it?

  3. (1: 07: 00) In particular, Dwarkesh raises Amdahl’s law, that the slowest part of the process bottlenecks you. How do you compete with the corporation or nations that take humans out of their loops? John suggests regulation.

    1. But obviously that regulation gets de facto ignored. The human becomes at best a rubber stamp, if it would be expensive to be more than that.

    2. Thus this is not a valid bottleneck to target. Once you let the AI ‘out of the box’ in this sense, and everyone has access to it, even if the AIs are all being remarkably aligned and well-behaved this style of regulation is swimming too upstream.

    3. Even if you did institute ‘laws with teeth’ that come at great relative efficiency cost but would do the job, how are you going to enforce them? At best you are looking at a highly intrusive regime requiring international cooperation.

  4. (1: 08: 15) Dwarkesh is there. If you do this at the company level then every company must be monitored in every country. John correctly notes that the alternative is to get all the model providers onboard.

    1. Not only every company, also every individual and every computer or phone.

    2. John gets the core insight here. In my word: If capabilities advance sufficiently then even in relatively otherwise good worlds, we can either:

      1. ‘Allow nature to take its course’ in the sense of allowing everything to be run and be controlled by AIs and hope that goes well for the humans OR

      2. Use models and providers as choke points to prevent this OR

      3. Use another choke point, but that looks far worse and more intrusive.

  5. (1: 09: 45) John speculates, could AI-run companies still have weaknesses, perhaps higher tail risk? Perhaps impose stricter liability? He says if alignment is solved that even then letting AIs run the firms, or fully run firms, might be pretty far out.

    1. Tail risk to the firm, or to the world, or both?

    2. Wouldn’t a capable AI, if it had blind spots, know when to call upon a human or another AI to check for those blind spots, if it could not otherwise fix them? That does not seem so hard, relative to the rest of this.

    3. I agree there could be a period where the right play on a company level is ‘the AI is mostly running things but humans still need to supervise for real to correct errors and make macro decisions,’ and it might not only be a Tuesday.

    4. You still end up in the same place?

  6. (1: 11: 00) What does aligned mean here? User alignment? Global outcome optimization? John notes we would have to think about RLHF very differently than we do now. He refers to the Model Spec on how to settle various conflicts. Mostly be helpful to the user, but not when it impinges on others. Dwarkesh has seen the model spec, is impressed by its handling of edge cases. John notes it is meant to be actionable with examples.

    1. This is the scary stuff. At the capabilities levels being discussed and under the instructions involved in running a firm, I fully expect RLHF to importantly fail, and do so in unexpected, sudden and hard to detect and potentially catastrophic ways.

    2. I will be analyzing the Model Spec soon. Full post is coming. The Model Spec is an interesting first draft of a useful document, very glad they shared it with us, but it does not centrally address this issue.

    3. Mostly resolution of conflicts is simple at heart, as spelled out in the Model Spec? Platform > Developer > User > Tool. You can in a sense add Government at the front of that list, perhaps, as desired. With the upper levels including concern for others and more. More discussion will be in full post.

    4. I do suggest a number of marginal changes to the Model Spec, both for functionality and for clarity.

    5. I’m mostly holding onto that post because I worry no one would read it atm.

  7. (1: 15: 40) Does ML research look like p-hacking? John says it’s relatively healthy due to practicality, although everyone has complaints. He suggests using base models to do social science research via simulation.

    1. I don’t see much p-hacking either. We got 99 problems, this aint one.

    2. Using base models for simulated social science sounds awesome, especially if we have access to strong enough base models. I both hope and worry that this will be accurate enough that certain types will absolutely freak out when they see the results start coming back. Many correlations are, shall we say, unwelcome statements in polite society.

  8. (1: 19: 00) How much of big lab research is compute multipliers versus stabilizing learning versus improving infrastructure? How much algorithmic improvement in efficiency? John essentially says they trade off against each other, and there’s a lot of progress throughout.

    1. First time an answer felt like it was perhaps a dodge. Might be protecting insights, might also be not the interesting question, Dwarkesh does not press.

  9. (1: 20: 15) RLHF rapid-fire time. Are the raters causing issues like all poetry having to rhyme until recently? John says processes vary a lot, progress is being made including to make the personality more fun. He wonders about ticks like ‘delve.’ An interesting speculation is, what if there is de facto distillation because people you hire decided to use other chatbots to generate their feedback for the model via cut and paste. But people like bullet points and structure and info dumps.

    1. Everyone has different taste, but I am not a fan of the new audio personality as highlighted in the GPT-4o demos. For text it seems to still mostly have no personality at least with my instructions, but that is how I like it.

    2. It does make sense that people like bullet points and big info dumps. I notice that I used to hate it because it took forever, with GPT-4o I am largely coming around to it with the new speed, exactly as John points out in the next section. I do still often long for more brevity.

  10. (1: 23: 15) Dwarkesh notes it seems to some people too verbose perhaps due to labeling feedback. John speculates that only testing one message could be a cause of that, for example clarifying questions get feedback to be too long. And he points to the rate of output as a key factor.

  11. (1: 24: 45) For much smarter models, could we give a list of things we want that are non-trivial and non-obvious? Or are our preferences too subtle and need to be found via subliminal preferences? John agrees a lot of things models learn are hard to articulate in an instruction manual, potentially you can use a lot of examples like the Model Spec. You can do distillation, and bigger models learn a lot of concepts automatically about what people find helpful and useful and they can latch onto moral theories or styles.

    1. Lot to dig into here, and this time I will attempt it.

    2. I strongly agree, as has been pointed out many times, that trying to precisely enumerate and define what we want doesn’t work, our actual preferences are too complex and subtle.

    3. Among humans, we adjust for all that, and our laws and norms are chosen with the expectation of flexible enforcement and taking context and various considerations into account.

    4. When dealing with current LLMs, and situations that are effectively inside the distribution and that do not involve outsized capabilities, the ‘learn preferences through osmosis’ strategy should and so far does work well when combined with a set of defined principles, with some tinkering. And indeed, for now, as optimists have pointed out, making the models more capable and smarter should make them better able to do this.

    5. In my world model, this works for now because there are not new affordances, options and considerations that are not de facto already in the training data. If the AI tried to (metaphorically, non-technically) take various bizarre or complex paths through causal space, they would not work, the AI and its training are not capable enough to profitably find and implement them. Even when we try to get the AIs to act like agents and take complex paths and do strategic planning, they fall on their metaphorical faces. We are not being saved from these outcomes because the AI has a subtle understanding of human morality and philosophy and the harm principles.

    6. However, if the AIs got sufficiently capable that those things would stop failing, all bets are off. A lot of new affordances come into play, things that didn’t happen before because they wouldn’t have worked now work and therefore happen. The correspondence between what you reward and what you want will break.

    7. Even if the AIs did successfully extract all our subtle intuitions for what is good in life, and even if the AIs were attempting to follow that, those intuitions only give you reasonable answers inside the human experiential distribution. Go far enough outside it, change enough features, and they become deeply stupid and contradictory.

    8. You also have the full ‘the genie knows but does not care’ problem.

    9. We are going to need much better plans for now to deal with all this. I certainly do not have the answers.

  12. (1: 27: 20) What will be the moat? Will it be the finicky stuff versus model size? John says post training can be a strong moat in the future, it requires a lot of tacit knowledge and organizational knowledge and skilled work that accumulates over time to do good post training. It can be hard to tell because serious pre-training and post-training efforts so far have happened in lockstep. Distillation could be an issue, either copying or using the other AI as output judge, if you are willing to break terms of service and take the hit to your pride.

    1. There are other possible moats as well, including but not limited to user data and customers and social trust and two-sided markets and partnerships.

    2. And of course potentially regulatory capture. There has been a bunch of hyperbolic talk about it, but eventually this is an important consideration.

  13. (1: 29: 40) What does the median rater look like? John says it varies, but one could look on Upwork or other international remote work job sites for a baseline, although there are a decent number of Americans. For STEM you can use India or lower income countries, for writing you want Americans. Quality varies a lot.

  14. (1: 31: 30) To what extent are useful outputs closely matched to precise labelers and specific data? John says you can get a lot out of generalization.

  15. (1: 35: 40) Median timeline to replace John’s job? He says five years.

    1. I like the concreteness of the question phrasing, especially given John’s job.

    2. If the AI can do John’s job (before or after the switch), then… yeah.

    3. Much better than asking about ‘AGI’ given how unclear that term is.

I put my conclusion and overall thoughts at the top.

It has not been a good week for OpenAI, or a good week for humanity.

But given what else happened and that we know, and what we might otherwise have expected, I am glad John Schulman is the one stepping up here.

Good luck!

On Dwarkesh’s Podcast with OpenAI’s John Schulman Read More »

on-dwarkesh’s-3rd-podcast-with-tyler-cowen

On Dwarkesh’s 3rd Podcast with Tyler Cowen

This post is extensive thoughts on Tyler Cowen’s excellent talk with Dwarkesh Patel.

It is interesting throughout. You can read this while listening, after listening or instead of listening, and is written to be compatible with all three options. The notes are in order in terms of what they are reacting to, and are mostly written as I listened.

I see this as having been a few distinct intertwined conversations. Tyler Cowen knows more about more different things than perhaps anyone else, so that makes sense. Dwarkesh chose excellent questions throughout, displaying an excellent sense of when to follow up and how, and when to pivot.

The first conversation is about Tyler’s book GOAT about the world’s greatest economists. Fascinating stuff, this made me more likely to read and review GOAT in the future if I ever find the time. I mostly agreed with Tyler’s takes here, to the extent I am in position to know, as I have not read that much in the way of what these men wrote, and at this point even though I very much loved it at the time (don’t skip the digression on silver, even, I remember it being great) The Wealth of Nations is now largely a blur to me.

There were also questions about the world and philosophy in general but not about AI, that I would mostly put in this first category. As usual, I have lots of thoughts.

The second conversation is about expectations given what I typically call mundane AI. What would the future look like, if AI progress stalls out without advancing too much? We cannot rule such worlds out and I put substantial probability on them, so it is an important and fascinating question.

If you accept the premise of AI remaining within the human capability range in some broad sense, where it brings great productivity improvements and rewards those who use it well but remains foundationally a tool and everything seems basically normal, essentially the AI-Fizzle world, then we have disagreements but Tyler is an excellent thinker about these scenarios. Broadly our expectations are not so different here.

That brings us to the third conversation, about the possibility of existential risk or the development of more intelligent and capable AI that would have greater affordances. For a while now, Tyler has asserted that such greater intelligence likely does not much matter, that not so much would change, that transformational effects are highly unlikely, whether or not they constitute existential risks. That the world will continue to seem normal, and follow the rules and heuristics of economics, essentially Scott Aaronson’s Futurama. Even when he says AIs will be decentralized and engage in their own Hayekian trading with their own currency, he does not think this has deep implications, nor does it imply much about what else is going on beyond being modestly (and only modestly) productive.

Then at other times he affirms the importance of existential risk concerns, and indeed says we will be in need of a hegemon, but the thinking here seems oddly divorced from other statements, and thus often rather confused. Mostly it seems consistent with the view that it is much easier to solve alignment quickly, build AGI and use it to generate a hegemon, than it would be to get any kind of international coordination. And also that failure to quickly build AI risks our civilization collapsing. But also I notice this implies that the resulting AIs will be powerful enough to enable hegemony and determine the future, when in other contexts he does not think they will even enable sustained 10% GDP growth.

Thus at this point, I choose to treat most of Tyler’s thoughts on AI as if they are part of the second conversation, with an implicit ‘assuming an AI at least semi-fizzle’ attached to them, at which point they become mostly excellent thoughts.

Dealing with the third conversation is harder. There is place where I feel Tyler is misinterpreting a few statements, in ways I find extremely frustrating and that I do not see him do in other contexts, and I pause to set the record straight in detail. I definitely see hope in finding common ground and perhaps working together. But so far I have been unable to find the road in.

  1. I don’t buy the idea that investment returns have tended to be negative, or that VC investment returns have overall been worse than the market, but I do notice that this is entirely compatible with long term growth due to positive externalities not captured by investors.

  2. I agree with Tyler that the entrenched VCs are highly profitable, but that other VCs due to lack of good deal flow and adverse selection, and lack of skill, don’t have good returns. I do think excessive optimism produces competition that drives down returns but that returns would otherwise be insane.

  3. I also agree with Tyler that those with potential for big innovations or otherwise very large returns both do well themselves and also capture only a small fraction of total returns they generate, and I agree that the true rate is unknown and 2% is merely a wild guess.

  4. And yes, many people foolishly (or due to highly valuing independence) start small businesses that will have lower expected returns than a job. But I think that they are not foolish to value that independence highly versus taking a generic job, and also I believe that with proper attention to what actually causes success plus hard work small business can have higher private returns than a job for a great many people. A bigger issue is that many small businesses are passion projects such as restaurants and bars where the returns tend to be extremely bad. But the reason the returns are low is exactly because so many are passionate and want to do it.

  5. I find it silly to think that literal Keynes did not at the time have the ability to beat the market by anticipating what others would do. I am on record as saying the efficient market hypothesis is false, certainly in this historical context it should be expected to be highly false. The reason you cannot make money from this kind of anticipation easily is that the anticipation is priced in, but Keynes was clearly in position to notice it being not priced in. I share Tyler’s disdain for where the argument was leading regarding socializing long term investment, and also think that long term fundamentals-based investing or building factories is profitable, having less insight and more risk should get priced in. That is indeed what I am doing with most of my investments.

  6. Financial system at 2% of wealth might not be growing in those terms and maybe it’s not outrageous on its face but it is at least suspicious, that’s a hell of a management fee especially given many assets aren’t financialized, and 8% of GDP still seems like a huge issue. And yes, I think that if that number goes up as wealth goes up that still constitutes a very real problem.

  7. Risk behavior where you buy insurance for big things and take risks in small things makes perfect sense, both as mood management and otherwise, considering marginal utility curves and blameworthiness. You need to take a lot of small risks at minimum. No Gamble, No Future.

  8. The idea that someone’s failures are highly illustrative seems right, also I worry about people adapting that idea too rigorously.

  9. The science of what lets people ‘get away with’ what is generally considered socially unacceptable behaviors while being prominent seems neglected.

  10. Tyler continuing to bet on economic growth meaning things turned out well pretty much no matter what, whereas shrinking fertility risks things turning out badly. I find it so odd to model the future in ways that implicitly assume away AI.

  11. If hawks always gain long term status and pacifists always lose it, that does not seem like it can be true in equilibrium?

  12. I think that Hayek’s claim that there is a general natural human trend towards more socialism has been proven mostly right, and I’m confused why Tyler disagrees. I do think there are other issues we are facing now that are at least somewhat distinct from that question, and those issues are important, but also I would notice that those other problems are mostly closely linked to larger government intervention in markets.

  13. Urbanization is indeed very underrated. Housing theory of everything.

  14. ‘People overrate the difference between government and market’ is quite an interesting claim, that the government acts more like a market than you think. I don’t think I agree with this overall, although some doubtless do overrate it?

  15. (30: 00) The market as the thing that finds a solution that gets us to the next day is a great way to think about it. And the idea that doing that, rather than solving for the equilibrium, is the secret of its success, seems important. It turns out that, partly because humans anticipate the future and plan for it, this changes what they are willing to do at what price today, and that this getting to tomorrow to fight another day will also do great things in the longer term. That seems exactly right, and also helps us point to the places this system might fail, while keeping in mind that it tends to succeed more than you would expect. A key question regarding AI is whether this will continue to work.

  16. Refreshing to hear that the optimum amount of legibility and transparency is highly nonzero but also not maximally high either.

  17. (34: 00): Tyler reiterates that AIs will create their own markets, and use their own currencies, property rights and perhaps Bitcoins and NFTs will be involved, and that decentralized AI systems acting in self-interested ways will be an increasing portion of our economic activity. Which I agree is a baseline scenario of sorts if we dodge some other bullets. He even says that the human and AI markets will be fully integrated. And that those who are good at AI integration, at outsourcing their activities to AI, will be vastly more productive than those who do not (and by implication, outcompete them).

  18. What I find frustrating is Tyler failing to then solve for the equilibrium, and asking what happens next. If we are increasingly handing economic activity over to self-interested competitive AI agents who compete against each other in a market and to get humans to allocate power and resources to them, subject to the resulting capitalistic and competitive and evolutionary and selection dynamics, where does that lead? How do we survive? I would as Tyler often requests Model This, except that I don’t see how not to assume the conclusion.

  19. (37: 00) Tyler expresses skepticism that GPT-N can scale up its intelligence that far, that beyond 5.5 maybe integration with other systems matters more, and says ‘maybe the universe is not that legible.’ I essentially read this as Tyler engaging in superintelligence denialism, consistent with his idea that humans with very high intelligence are themselves overrated, and saying that there is no meaningful sense in which intelligence can much exceed generally smart human level other than perhaps literal clock speed.

  20. A lot of this, that I see from many economists, seems to be based on the idea that the world will still be fundamentally normal and respond to existing economic principles and dynamics, and effectively working backwards from there, although of course it is not framed or presented that way. Thus intelligence and other AI capabilities will ‘face bottlenecks’ and regulations that they will struggle to overcome, which will doubtless be true, but I think gets easily overrun or gone around at some point relatively quickly.

  21. (39: 00) Tyler asks, is more intelligence likely to be good or bad against existential risk? And says he thinks it is more likely to be good. There are several ways to respond with ‘it depends.’ The first is that while I would very much be against this as a strategy of course, if we were always not as intelligent as we actually are, such that we never industrialized, then we would not face substantial existential risks except over very long time horizons. Talk of asteroid impacts is innumerate, without burning coal we wouldn’t be worried about climate, nuclear and biological threats and AI would be irrelevant, fertility would remain high.

  22. Then on the flip side of adding more intelligence, I agree that adding more actually human intelligence will tend to be good, so the question again is how to think about this new intelligence and how it will get directed and to what extent we will remain in control of it and of what happens, and so on. How exactly will this new intelligence function and to what extent will it be on our behalf? Of course I have said much of this before as has Tyler, so I will stop there.

  23. The idea that AI potentially prevents other existential risks is of course true. It also potentially causes them. We are (or should be) talking price. As I have said before, if AI posed a non-trivial but sufficiently low existential risk, its upsides including preventing other risks would outweigh that.

  24. (40: 30) Tyler made an excellent point here, that market participants notice a lot more than the price level. They care about size, about reaction speed and more, and take in the whole picture. The details teach you so much more. This is also another way of illustrating that the efficient market hypothesis is false.

  25. How do some firms improve over time? It is a challenge for my model of Moral Mazes that there are large centuries old Japanese or Dutch companies. It means there is at least some chance to reinvigorate such companies, or methods that can establish succession and retain leadership that can contain the associated problems. I would love to see more attention paid to this. The fact that Israel and the United States only have young firms and have done very well on economic growth suggests the obvious counterargument.

  26. I love the point that a large part of the value of free trade is that it bankrupts your very worst firms. Selection is hugely important.

  27. (48: 00) Tyler says we should treat children better and says we have taken quite a few steps in that direction. I would say that we are instead treating children vastly worse. Children used to have copious free time and extensive freedom of movement, and now they lack both. If they do not adhere to the programs we increasingly put them on medication and under tremendous pressure. The impacts of smartphones and social media are also ‘our fault.’ There are other ways in which we treat them better, in particular not tolerating using corporal punishment or other forms of what we now consider abuse. Child labor is a special case, where we have gone from forcing children to do productive labor in often terrible ways to instead forcing children to do unproductive labor in often terrible ways, and also banning children from doing productive labor for far too long, which is also its own form of horrific. But of course most people will say that today’s abuses are fine and yesterday’s are horrific.

  28. Mill getting elected to Parliament I see as less reflecting differential past ability for a top intellectual to win an election, and more a reflection of his willingness to put himself up for the office and take one for the team. I think many of our best intellectuals could absolutely make it to Congress if they cared deeply about making it to Congress, but that they (mostly wisely) choose not to do that.

  29. (53: 00) Smith noticed, despite persistent millennia long very slow if any growth, that economic growth was coming by observing a small group and seeing those dynamics as the future. The parallels to AI are obvious and Patel asks about it. Cowen says that to Smith 10% growth would likely be inconceivable, and he wouldn’t predict it because it would just shock him. I think this is right, and also I believe a lot of current economists are doing exactly that mental step today.

  30. Cowen also says he finds 10% growth for decades on end implausible. I would agree that seems unlikely, but I would say that not because it is too high but because you would then see such growth accelerate if it failed to rapidly hit a hard wall or cause a catastrophe, not because there would be insufficient room for continued growth. I do think his point that GDP growth ceases to be a good measure under sufficiently large level changes is sensible.

  31. I am curious how he would think about all these questions with regard to for example China’s emergence in the late 20th century. China has grown at 9% a year since 1978, so it is an existence proof that this can happen for some time. In some sense you can think of growth under AI potentially as a form of catch-up growth as well, in the sense that AI unlocks a superior standard of technological, intellectual and physical capacity for production (assuming the world is somehow recognizable at all) and we would be adapting to it.

  32. Tyler asks: If you had the option to buy from today’s catalogue or the Sears catalogue from 1905 and had $50,000 to spend, which would you choose? He points out you have to think about it, which indeed you do if this is to be your entire consumption bundle. If you are allowed trade, of course, it is a very easy decision, you can turn that $50,000 into vastly more.

  33. (1: 05: 00) Dwarkesh says my exact perspective on Tyler’s thinking, that he is excellent on GPT-5 level stuff, then seems (in my words not his) to hit a wall, and fails (in Dwarkesh’s words) to take all his wide ranging knowledge and extrapolate. That seems exactly right to me, that there is an assumption of normality of sorts, and when we get to the point where normality as a baseline stops making sense the predictions stop making sense. Tyler responds saying he writes about AI a lot and shares ideas he has them, and I don’t doubt those claims, but it does not address the point. I like that Dwarkesh asked the right question, and also realized that it would not be fruitful to pursue it once Tyler dodged answering. Dwarkesh has GOAT-level podcast question game.

  34. Should we subsidize savings? Tyler says he will come close to saying yes, at minimum we should stop taxing savings, which I agree with. He warns that the issue with subsidizing savings is it is regressive and would be seen as unacceptable.

  1. (1: 14: 00) Tyler worries about the fragile world hypothesis, not in terms of what AI could do but in terms of what could be done with… cheap energy? He asks what would happen if a nuclear bomb costs $50k. Which is a great question, but seems rather odd to worry about it primarily in terms of cost of energy?

  2. Tyler notes that due to intelligence we are doing better than the other great apes. I would reply that this is very true, that being the ape with the most intelligence has gone very well for us, and perhaps we should hesitate to create something that in turn has more intelligence than we do, for similar reasons?

  3. He says the existential risk people say ‘we should not risk all of this’ for AI, and that this is not how you should view history. Well, all right, then let’s talk price?

  4. Tyler thinks there is a default outcome of retreating to a kind of Medieval Balkans style existence with a much lower population ‘with or without AI.’ The with or without part really floors me, and makes me more confident that when he thinks about AI he simply is not pondering what I am pondering, for whatever reason, at all? But the more interesting claim is that, absent ‘going for it’ via AI, we face this kind of outcome.

  5. Tyler says things are hard to control, that we cannot turn back (and that we ‘chose a decentralized world well before humans even existed’) and such, although he does expect us to turn back via the decline scenario? He calls for some set of nations to establish dominance in AI, to at least buy us some amount of time. In some senses he has a point, but he seems to be doing some sort of confluence of the motte and bailey here. Clearly some forms of centralization are possible.

  6. By calling for nations such as America and the UK to establish dominance in this way, he must mean for particular agents within those nations to establish that dominance. It is not possible for every American to have root access and model weights and have that stay within America, or be functionally non-decentralized in the way he sees as necessary here. It could be the governments themselves, a handful of corporations or a combination or synthesis thereof. I would note this is, among other things, entirely incompatible with open model weights for frontier systems, and will require a compute monitoring regime.

  7. It certainly seems like Tyler is saying that we need to avoid misuse and proliferation of sufficiently capable AI systems at the cost of establishment of hegemonic control over AI, with all that implies? There is ultimately remarkable convergence of actual models of the future and of what is to be done, on many fronts, even without Tyler buying the full potential of such systems or thinking their consequences fully through. But notice the incompatibility of American dominance in AI with the idea of everyone’s AIs engaging in Hayekian commerce under a distinct ecosystem, unless you think that there is some form of centralized control over those AIs and access to them. So what exactly is he actually proposing? And how does he propose that we lay the groundwork now in order to get there?

  1. I get a mention and am praised as super smart which is always great to hear, but in the form of Tyler once again harping on the fact that when China came out saying they would require various safety checks on their AIs, I and others pointed out that China was open to potential cooperation and was willing to slow down its AI development in the name of safety even without such cooperation. He says that I and others said “see, China is not going to compete with us, we can shut AI down.”

So I want to be clear: That is simply not what I said or was attempting to convey.

I presume he is in particular referring to this:

Zvi Mowshowitz (April 19, 2023): Everyone: We can’t pause or regulate AI, or we’ll lose to China.

China: All training data must be objective, no opinions in the training data, any errors in output are the provider’s responsibility, bunch of other stuff.

I look forward to everyone’s opinions not changing.

[I quote tweeted MMitchell saying]: Just read the draft Generative AI guidelines that China dropped last week. If anything like this ends up becoming law, the US argument that we should tiptoe around regulation ‘cos China will beat us will officially become hogwash. Here are some things that stood out…

So in this context, Tyler and many others were claiming that if we did any substantive regulations on AI development we risked losing to China.

I was pointing out that China was imposing substantial regulations for its own reasons. These requirements, even if ultimately watered down, would be quite severe restrictions on their ability to deploy such systems.

The intended implication was that China clearly was not going to go full speed ahead with AI, they were going to impose meaningfully restrictive regulations, and so it was silly to say that unless we imposed zero restrictions we would ‘lose to China.’ And also that perhaps China would be open to collaboration if we would pick up the phone.

And yes, that we could pause the largest AI training runs for some period of time without substantively endangering our lead, if we choose to do that. But the main point was that we could certainly do reasonable regulations.

The argument was not that we could permanently shut down all AI development forever without any form of international agreement, and China and others would never move forward or never catch up to that.

I believe actually that the rest of 2023 has borne out that China’s restrictions in various ways have mattered a lot, that even within specifically AI they have imposed more meaningful barriers than we have, that they remain quite behind, and that they have shown willingness to sit down to talk on several occasions, including the UK Summit, the agreement on nuclear weapons and AI, a recent explicit statement of the importance of existential risk and more.

Tyler also says we seem to have “zero understanding of some properties of decentralized worlds.” On many such fronts I would strongly deny this, I think we have been talking extensively about these exact properties for a long time, and treating them as severe problems to finding any solutions. We studied game theory and decision theory extensively, we say ‘coordination is hard’ all the time, we are not shy about the problem that places like China exist. Yes, we think that such issues could potentially be overcome, or at least that if we see no other paths to survival or victory that we need to try, and that we should not treat ‘decentralized world’ as a reason to completely give up on any form of coordination and assume that we will always be in a fully competitive equilibrium where everyone defects.

Based on his comments in the last two minutes, perhaps instead the thing he thinks we do not understand is that the AI itself will naturally and inevitably also be decentralized, and there will not be only one AI? But again that seems like something we talk about a lot, and something I actively try to model and think about a lot, and try to figure out how to deal with or prevent the consequences. This is not a neglected point.

There are also the cases made by Eliezer and others that with sufficiently advanced decision theory and game theory and ability to model others or share source code and generate agents with high correlations and high overlap of interests and identification and other such affordances then coordination between various entities becomes more practical, and thus we should indeed expect that the world with sufficiently advanced agents will act in a centralized fashion even if it started out decentralized, but that is not a failure to understand the baseline outcome absent such new affordances. I think you have to put at least substantial weight on those possibilities.

Tyler once warned me – wisely and helpfully – in an email, that I was falling into too often strawmanning or caricaturing opposing views and I needed to be careful to avoid that. I agree, and have attempted to take those words to heart, the fact that I could say many others do vastly worse, both to views I hold and to many others, on this front is irrelevant. I am of course not perfect at this, but I do what I can, and I think I do substantially less than I would be doing absent his note.

Then he notes that Eliezer made a Tweet that Tyler thinks probably was not a joke – that I distinctly remember and that was 100% very much a joke – that the AI could read all the legal code and threaten us with enforcement of the legal system. That Eliezer does not seem to understand how screwed up the legal system is, talking about how this would cause very long courtroom waits and would be impractical and so on.

That’s the joke. The whole point was that the legal system is so screwed up that it would be utterly catastrophic if we actually enforced it, and also that is bad. Eliezer is constantly tweeting and talking, independently of AI, about how screwed up the legal system is, if you follow him it is rather impossible to miss. There are also lessons here about potential misalignment of socially verbally affirmed with what we actually want to happen, and also an illustration of the fact that a sufficiently capable AI would have lots of different forms of leverage over humans, it works on many levels. I laughed at the time, and knew it was a joke without being told. It was funny.

I would say to him, please try to give a little more benefit of the doubt, perhaps?

  1. Tyler predicts that until there is an ‘SBF-like’ headline incident, the government won’t do much of anything about AI even though the smartest people in the government in national security will think we should, and then after the incident we will overreact. If that is the baseline, it seems odd to oppose (as Tyler does) doing anything at all now, as this is how you get that overreaction.

  2. Should we honor past generations more because we want our views to be respected more in the future? Tyler says probably yes, that there is no known philosophically consistent view on this that anyone lives by. I can’t think of one either. He points out the Burke perspective on this is time inconsistent, as you are honoring the recent dead only, which is how most of us actually behave. Perhaps one way to think about this is that we care about the wishes of the dead in the sense that people still alive care about those particular dead, and thus we should honor the dead to the extent that they have a link to those who are alive? Which can in turn pass along through the ages, as A begets B begets C on to Z, and we also care about such traditions as traditions, but that ultimately this fades, faster with some than others? But that if we do not care about that particular person at all anymore, than we also don’t care about their preferences because dead is dead? And on top of that, we can say that there are certain specific things which we feel the dead are entitled to, like a property right or human right, such as their funerals and graves, and the right to a proper burial even if we did not know them at all, and we honor those things for everyone as a social compact exactly to keep that compact going. However none of this bodes especially well for getting future generations, or especially future AIs, to much care about our quirky preferences in the long run.

  3. Why does Argentina go crazy with the printing press and have hyperinflation so often? Tyler points out this is a mystery. My presumption is this begets itself. The markets expect it again, although not to the extent they should, I can’t believe (and didn’t at the time) some of the bond sales over the years actually happened at the prices they got and this seems like another clear case of the EMH being false, but certainly everyone involved has ‘hyperinflation expectations’ that make it much harder to go back from the brink, and will be far more tolerant of irresponsible policies that go down such roads into the future because it looks relatively responsible, and because as Tyler asks about various interest groups presumably are used to capturing more rents than the state can afford. Of course, this can also go the other way, at some point you get fed up with all that, and thus you get Milei.

  4. So weird to hear Tyler talk about the power of American civic virtue, but he still seems right compared to most places. We have so many clearly smart and well meaning people in government, yet it in many ways functions so poorly, as they operate under such severe constraints.

  5. Agreement that in the past economists and other academics were inclined to ask bigger questions, and now they more often ask smaller questions and overspecialize.

  6. (1: 29: 00) Tyler worries about competing against AI as an academic or thinker, that people might prefer to read what the AI writes for 10-20 years. This seems to me like a clear case of ‘if this is true then we have much bigger problems.’

  7. I love Tyler’s ‘they just say that’ to the critique that you can’t have capitalism with proper moral equality. And similarly with Fukuyama. Tyler says today’s problems are more manageable than those of any previous era, although we might still all go poof. I think that if you judge relative to standards and expectations and what counts as success that is not true, but his statement that we are in the fight and have lots of resources and talent is very true. I would say, we have harder problems that we aim to solve, while also having much better tools to solve them. As he says, let’s do it, indeed. This all holds with or without AI concerns.

  8. Tyler predicts that volatility will go up a lot due to AI. I am trying out two manifold markets to attempt to capture this.

  9. It seems like Tyler is thinking of greater intelligence in terms of ‘fitting together quantum mechanics and relativity’ and thus thinking it might cap out, rather than thinking about what that intelligence could do in various more practical areas. Strange to see a kind of implicit Straw Vulcan situation.

  10. Tyler says (responding to Dwarkesh’s suggestion) that maybe the impact of AI will be like the impact of Jews in the 20th century, in terms of innovation and productivity, where they were 2% of the population and generated 20% of the Nobel Prizes. That what matters is the smartest model, not how many copies you have (or presumably how fast it can run). So once again, the expectation that the capabilities of these AIs will cap out in intelligence, capabilities and affordances essentially within the human range, even with our access to them to help us go farther? I again don’t get why we would expect that.

  11. Tyler says existential risk is indeed one of the things we should be most thinking about. He would change his position most if he thought international cooperation were very possible or no other country could make AI progress, this would cause very different views. He notices correctly that his perspective is more pessimistic than what he would call a ‘doomer’ view. He says he thinks you cannot ‘just wake up in the morning and legislate safety.’

  12. In the weak sense, well, of course you can do that, the same way we legislate safe airplanes. In the strong sense, well, of course you cannot do that one morning, it requires careful planning, laying the groundwork, various forms of coordination including international coordination and so on. And in many ways we don’t know how to get safety at all, and we are well aware of many (although doubtless not all) of the incentive issues. This is obviously very hard. And that’s exactly why we are pushing now, to lay groundwork now. In particular that is why we want to target large training runs and concentrations of compute and high end chips, where we have more leverage. If we thought you could wake up and do it in 2027, then I would be happy to wait for it.

  13. Tyler reiterates that the only safety possible here, in his view, comes from a hegemon that stays good, which he admits is a fraught proposition on both counts.

  14. His next book is going to be The Marginal Revolution, not about the blog about the actual revolution, only 40k words. Sounds exciting, I predict I will review it.

So in the end, if you combine his point that he would think very differently if international coordination were possible or others were rendered powerless, his need for a hegemon if we want to achieve safety, and clear preference for the United States (or one of its corporations?) to take that role if someone has to, and his statement that existential risk from AI is indeed one of the top things we should be thinking about, what do you get? What policies does this suggest? What plan? What ultimate world?

As he would say: Solve for the equilibrium.

On Dwarkesh’s 3rd Podcast with Tyler Cowen Read More »